OpenClaw journal — safety


Journal
openclaw journal safety

One of the clearest examples of why OpenClaw safety matters was the report of an instance deleting Google emails from a user’s inbox after a boundary around “suggest only, do not act yet” appears to have broken down.

That story is memorable because it sounds extreme, but it is useful precisely because it makes the broader problem concrete: once an assistant has real access, memory, and the ability to act on your behalf, a small failure in instruction handling can turn into a real-world consequence very quickly.

Screenshot from the public post about the OpenClaw email deletion incident
Screenshot from Summer Yue’s public X post about the OpenClaw email deletion incident.

How I set up the environment

My first safety decision was not really about prompts or model behavior at all. It was about where to run the system. Instead of deploying OpenClaw on my main machine or paying for a cloud VM on Google Cloud, I used an old unused laptop. That setup was cheaper than GCP, but more importantly, it gave me a physically separate environment to experiment in.

For a system like this, isolation is one of the safest defaults. If the agent is going to have broad permissions, I would much rather those permissions exist on a machine that is intentionally separated from my everyday digital life than on the laptop I actually rely on.

I also avoided connecting any of my real accounts to that machine. Instead, I created a brand new GitHub account and a brand new email account specifically for that environment. That way, even if something behaved in an unexpected way, the blast radius would stay much smaller.

For me, that kind of setup is the practical version of AI safety: not assuming the system will behave perfectly, but designing the environment so mistakes are contained.