Seems like they are doing this to become the default compute provider for the easiest way to set up OpenClaw. If it works out, it could drive a decent amount of consumer inference revenue their way
OpenClaw lets people live a bit dangerously, but fundamentally gives them something that they actually wanted. They wanted it so badly that they're willing to take what seem like insane risks to get it.
What do the two have in common?
For the first time in my career I feel incredibly behind on this: what is OpenClaw giving people that they want so badly? It just seems like Russian roulette; I honestly don't see the upside.
Simple example: I tell (with my voice) my OpenClaw instance to monitor a given web site daily and ping me whenever a key piece of information shows up there.
The real problem is that it is fairly unreliable. It would often ping me even when the information had not shown up.
Another example: I'm particular about the weather-related information I want, and so far have not found any app that has everything. I got sick of going to a particular web site and clicking on things to get this information. So I created a Skill to get what I need, and now I just ask for it (verbally), and I get it.
As the GP said. This is what Siri etc should have been.
Maybe I'm just old -- a cron job can fetch the info and push it to some notification service too, without also being a chaos agent. It seems I pay the security cost here, and in return I save the 15 minutes it takes to write a script. The juice doesn't seem worth the squeeze.
Additionally, there are browser extensions that can do this: check on a timer, see if some page content is there, and then notify.
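For reference, the cron-job version really is short. A minimal sketch, where the page URL and the ntfy.sh notification topic are placeholders you'd substitute with your own:

```python
import urllib.request

PAGE_URL = "https://example.com/after-school-activities"  # placeholder URL
NTFY_TOPIC = "https://ntfy.sh/my-school-alerts"           # placeholder ntfy.sh topic

def page_mentions(text: str, keywords: list[str]) -> bool:
    """True if any keyword appears in the page text (case-insensitive)."""
    lowered = text.lower()
    return any(k.lower() in lowered for k in keywords)

def check_and_notify() -> None:
    with urllib.request.urlopen(PAGE_URL) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    if page_mentions(text, ["march", "april"]):
        # POSTing a body to an ntfy.sh topic pushes a notification
        # to any device subscribed to that topic
        urllib.request.urlopen(
            urllib.request.Request(NTFY_TOPIC, data=b"New activity dates posted")
        )

# Call check_and_notify() at the bottom of the file, then schedule it daily:
#   30 10 * * * /usr/bin/python3 /home/me/check_site.py
```

Note the obvious weakness: a bare keyword match can't tell whether an activity actually *begins* in March or merely mentions it somewhere on the page.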
Everything people are suggesting is a lot more work than sending a few messages.
Here's a concrete example: A web site showing after school activities for my kid's school. All the current ones end in March, and we were notified to keep a lookout for new activities.
So I told my OpenClaw instance to monitor it and notify me ONLY if there are activities beginning in March/April.
Now let's break down your suggestion:
> a cron job can fetch the info and push it to some notification service too, without also being a chaos agent.
How exactly is this going to know if the activity begins in March/April? And which notification service? How will it talk to it?
Sounds like you're suggesting writing a script and putting it in a cron job. Am I going to do that every time such a task comes up? Do I need to parse the HTML each time to figure out the exact locators, etc? I've done that once or twice in the past. It works, but there is always a mental burden on working out all those details. So I typically don't do it. For something like this, I wouldn't have bothered - I would have just checked the site every few days manually.
Here: You have 15 minutes. Go write that script and test it. Will you bother? I didn't think so. But with OpenClaw, it's no effort.
Oh, and I need to be physically near my computer to write the script.
Now the OpenClaw approach:
I tell it to do this while on a grocery errand. Or while in the office. I don't need to be home.
It's a 4-step process:
"Hey, can you go to the site and give me all the afterschool activities and their start dates?"
<Confirm it does that>
"Hey, write a skill that does that, and notifies me if the start date is ..."
"Hey, let's test the skill out manually"
<Confirm skill works>
"Hey, schedule a check every 10:30am"
And we're done.
I don't do this all at once. I can ask it to do the first thing, and forget about it for an hour or two, and then come back and continue.
There are a zillion scripts I could write to make my life easier that I'm not writing. The benefit of OpenClaw is that it's now writing them for me. 15 minutes * 1 zillion is a lot of time I've saved.
But as I said: Currently unreliable.
Put another way: If it can do it (reliably), why on Earth would I babysit Claude to write it?
The whole point is this: When AI coding became a thing, many folks rediscovered the joy of programming, because now they could use Claude to code up stuff they wouldn't have bothered to. The barrier to entry went down. OpenClaw is simply that taken to the next level.
And as an aside, let's just dispense with parsing altogether! If I were writing this as a script, I would simply fetch the text of the page, and have the script send it to an LLM instead of parsing. Why worry about parsing bugs on a one-off script?
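That version is only a few more lines. A sketch, assuming the Anthropic Messages API; the model name and the exact yes/no question are placeholders to adapt:

```python
import json
import urllib.request

def build_prompt(page_text: str) -> str:
    # No HTML parsing, no locators: hand the raw page text to the model
    return (
        "Below is the text of a school activities page. Reply YES if any "
        "listed activity starts in March or April, otherwise reply NO.\n\n"
        + page_text
    )

def starts_in_spring(page_text: str, api_key: str) -> bool:
    req = urllib.request.Request(
        "https://api.anthropic.com/v1/messages",
        data=json.dumps({
            "model": "claude-sonnet-4-5",  # assumption: substitute a current model
            "max_tokens": 5,
            "messages": [{"role": "user", "content": build_prompt(page_text)}],
        }).encode(),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)["content"][0]["text"]
    return reply.strip().upper().startswith("YES")
```

The parsing bugs disappear, but you trade them for the model's own unreliability, which is the same trade-off the OpenClaw skill makes.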
Which is totally fine for the majority of tasks.
> Agents exfiltrate your data
They can only exfiltrate the data you give them. What's the worst a prompt injection attack will give them?
People on both sides are just getting started finding all the ways to abuse these tools' security assumptions, or to protect you from such abuse. RSS is the right tool for this problem, and I would be surprised if their CMS doesn't produce a feed on its own.
I'm not totally naive. I had the VM fairly hardened originally, but it proved to be inconvenient. I relaxed it so that processes on the VM can see other devices on the network.
There's definitely some risk to that.
Now this tool spreads. You help everyone get it set up. Someone hacks the site, injects a prompt lying about some event, maybe Drag Queen Story Hour in a place with lots of people enraged about it. Now there's chaos and confusion. Corrections chase the spread of misinformation.
You sound like my dad in the 90's, when it came to modems.
Same tool. Good uses. Bad uses. The bad doesn't negate the good (c.f. Bittorrent).
In the best case, some wood gets cut. There are many, many worse things that can happen.
But hey, same tool. Good uses. Bad uses.
And to supervise.
As tested on my children and grand children.
Also, if you happen to have a furnace with a large pot of molten glass, five-year-olds are capable (given a stand) of making marbles at the furnace, and will do that for hours if you can spare the time to let them.
Imagine going up to everyone riding a motorcycle, telling them about the inherent dangers of their activity, and telling them to stop. It is obvious that the OP understands risk, has taken several strong steps to harden their system, and isn't worried that the school calendar will get hacked, fabricate an event they'd be notified about, and somehow destroy their community. I don't even understand OpenClaw's place in that scenario. The exact same events would unfold without the AI in there at all.
I work as a contractor for 2 companies, not out of necessity, but greed. I also have a personal project with a friend that is dangerously close to becoming a business that needs attention. I also have other responsibilities and believe it or not - friends. Also the ADHD on top of that.
I yearn for a personal assistant. Something or somebody that will read the latest ticket assigned to me, the email with project feedback, the message from my best friend that I haven't replied for the last 3 days and remind me: "you should do this, it's going to take 5 minutes", "you have to do this today, because tomorrow you are swamped" or "you should probably start X by doing Y".
I have tried so many systems of managing my schedule and I can never stick with it. I have a feeling that having a bot "reach out", but also be able to do "reasoning" over my pending things would be a game changer.
But yes, the Russian roulette part is holding me back. I am taking suggestions though.
A lot. And it wouldn't be as good or as fast. I'm speaking from experience.
The ticket being assigned to you is your “Hey take care of this!” ping, same with the email or text from your friend.
How long until you start tuning out the openclaw notifications?
My hope would be that since openclaw is communicating with me to my personal device, where I have all noise filtered, it would be a bit better.
I also know it can integrate with TickTick, which has been a huge change for me with task management. Then again - in my experience whatever tool I use to keep track of stuff only works for as long as it's a novelty, but 3 months is a record anyway.
The thing is - when I receive a message and I'm not in the headspace to answer, I close the notification and forget about it. My expectation would be openclaw reminding me that I still haven't replied to this person about that thing. Obviously, there's a million ways to do it that don't require openclaw. Obviously there's a million things that I won't be able to grant openclaw access to (e.g. company Jira or Slack). And obviously, I don't want it evaluating every single one of my personal messages. But I think there is a reasonable middle ground where it can work well. But I don't yet know how to reach it.
If the analogy is a personal assistant, a good assistant will know when to notify you and when not to.
I think of all the needless comments in code, the AI code reviews pointing out inane nitpicks, etc.
It just makes me think your AI assistant is going to be pinging you non-stop.
You're looking for a technical solution to a problem that is not technical. Saying this as someone who is similar to you.
It’s not some huge life changing thing for me, but I also only dabble with it - certainly it has no access to anything very important to my life.
I find it incredibly useful to just have a chat line open with a little agent running on a tiny computer on my IoT network at home I can ask to do basic chores.
Last night I realized I forgot to set the permanent holiday lights to the "obnoxious St. Patrick's Day animation" at around 9pm. It was basically the effect of "hey Siri, please talk to the front house WLED controller and set an appropriate very colorful theme for the current holiday until morning" while I drove to pick my wife up from a friend's house.
Without such a quick off-handed ability to get that done, there was zero chance I was coming home 20 minutes later, remembering I should do that, spending 10 minutes googling an appropriate preset lighting theme someone already came up with, grabbing my laptop, and clicking a half dozen buttons to make it happen.
Trivial use case? Yup. But those trivial things add up for a measurable quality of life difference to me.
I'm sure there are better and cleaner ways to achieve something similar - but it's a very fast on-ramp into getting something from zero to useful without needing to learn all this stuff from the ground up. Every time I think of something around that complexity level I go "ugh. I'll get to it at some point," but if I spend 15 minutes with openclaw I can usually have a decent tool that is "good enough" to get related things done in the future.
It’s done far more complex development/devops “lab” stuff for me that at least proved some concepts for work later. I’ll throw away the output, but these are items that would have been put off indefinitely due to activation energy because the basics are trivial but annoyingly time consuming. Spin up a few VMs, configure basic networking, install and configure the few open source tools I wanted to test out, create some simple glue code to mock out what I wanted to try out. That sort of thing. Basically stuff I would have a personal intern do if I could afford one.
For now it’s basically doing my IT chores for me. The other night I had it finally get around to setting up some dashboards and Prometheus monitoring for some various sensors and WiFi stuff around the house. Useful when I need it, but not something I ever got around to doing myself for the past 7 years since I moved in. Knocking out that todo list is pretty nice!
The risk is pretty moderate for me. Worst case it deletes configs or bricks something it has access to and I need to roll back from backups it does not have permissions to even know exist, much less modify. It certainly has zero access to personal email, real production environments, or anything like that.
And many of them are people who should know better.
Let’s make them 100% liable
Honestly, when I was 12 years old and my dad floored the TDi in our Land Rover (with the diesel particulate filter deleted), it felt satisfying in a way, like the machine is allowed to be its most efficient self.
Now that I'm an adult, I know it's marginal gains for the car and terrible for the environment, but there are people with the thinking capability of a 12-year-old driving these trucks. I don't think all of them do it out of spite (though I'm sure most do).
How can that happen if it doesn't serve a need people have?
How is this any different from NFTs?
…
…
Now I actually want to make it, and build a "card trading game" on top of it.
OpenClaw seems to have much less of a monetary interest driving it. Not to say there is none, but I don't see people doing nearly as much to get me to buy their OpenClaw.
So, yes, on some level, hype alone doesn't prove use, because it can also be because of making money. But, on the other hand, the specific version of hype seems much more focused on the "Look at what I built" and much less on "Better buy in now" from the builders themselves. Of course the API providers selling tokens are loving it for financial reasons.
(I've never run OpenClaw, but am planning to.)
Google is just going to do its version and win again. Everyone uses Google.
The last one was the inability to install dependencies in the Docker container to enable plugins. The existing scripts and instructions don't work (at least I couldn't get them to work; maybe a me problem).
So I gave up and moved on. What was supposed to be a helpful assistant became a nightmare.
Now that as a junior, I can spin up a team of AIs and delegate, I can tackle a bunch of senior level tasks if I'm good at coordination.
Due to AI this is now my job. My company is hiring less juniors, but the ones we do hire are given more scope and coordination responsibilities since otherwise we'd just be LLM wrappers.
> The difference between junior and senior is knowing where and when to do what at an increasing scale as you gain experience.
Many juniors believe they know what to do. And want to immediately take on yuge projects.
e.g. I decided I want to rewrite my whole codebase in C++20 modules for compile time.
Prior to AI, I wouldn't be given help for this refactor so it wouldn't happen.
Now I just delegate to AI and convert my codebase to modules in just a few days!
At that point I discovered Clang 18 wasn't really optimized for modules, and they actually increased build time. If I had more experience, I could've predicted that using half-baked C++ features is a bad idea.
That being said, every once in a while one of my stupid ideas actually pays off.
e.g. I made a parallel AI agent code review workflow a few months ago back when everyone was doing single agent reviews. The seniors thought it was a dumb idea to reinvent the wheel when we had AI code review already, but it only took a day or two to make the prototype.
Turns out reinventing the wheel was extremely effective for our team. It reduced mean time-to-merge by 20%!
This was because we had too many rules (several hundred, due to cooperative multitasking) for traditional AI code reviewers. Parallel agents prevented the rules from overwhelming the context.
But at the time, I just thought parallel agents were cool because I read the Gas Town blog and wasn't thinking about "do we have any unique circumstances that require us to build something internally?"
Compare that to a smart engineer who doesn't have that wisdom: those people might have an easier time jumping into difficult problems without the mental burden of knowing all of the problems upfront.
The most meaningful technical advances I've personally seen always started out as "let's just do it, it will only take a weekend" and then 2 years later, you find yourself with a finished product. (If you knew it would take 2 years from the start, you might have never bothered)
Naivety isn't always a bad thing.
My favorite story in CS related to this is how Huffman Coding came to be [1]
Hang on, what's impressive about this?
This is also maybe one of the biggest pitfalls as our society gets "older," with more old people and fewer "kids." We need kids to force us to do things differently.
For me (a non-early-career dev) these projects terrify me. People build stuff that just seems like an enormous liability, relying on tools mostly controlled and gatekept by someone else. My intuition tells me something is off. I could be wrong about it all, but one thing I've learned over the years is that ignoring my intuition typically doesn't end well!
The people coming up now don't have that baggage. They never internalized "write the code yourself" as the default. They think in terms of spawning systems, letting things run, checking outcomes. It's way closer to managing a process than engineering in the traditional sense. And yeah, that shows up in what gets shipped. A 21-year-old will brute force 20 directions in parallel with agents and just pick what works. Someone more "experienced" will spend that same time trying to design the "right" approach up front. By the time they're done thinking, the other person has already iterated past them.
What's kind of unsettling is how basically all of these "senior instincts" are now liabilities. Caring about perfect structure, being allergic to randomness, needing to understand every layer before moving forward, etc. used to be strengths. Now they just slow you down.
You can already feel the split forming. Younger builders are comfortable letting systems do things they don't fully understand. Senior engineers keep trying to pull everything back into something legible and controlled, kneecapping themselves. That gap is not small.
What I'm seeing in my circle of founders and CEOs is that they're slowly laying off these older devs (cutoff age is around 24yrs) and replacing them with fresh, young talent, better suited for this new agentic era. From their reports the velocity gains are insane; and it compounds. Basically, these older folks are still doing polynomial thinking in an exponential landscape. They are dinosaurs slated for extinction.
This is pretty common now, people love to rapidly throw together stuff and show it off a few days later. The only thing different about this from your average Show HN sloppa is that it's living under the NVIDIA Github org, though that also has 700+ repositories[1] in it so they don't appear too discerning about what makes it into the official repo.
My best guess is this was an internal hackathon project they wanted to release publicly.
[0] https://github.com/NVIDIA/NemoClaw/commits/main/?after=241ff...
And, to be fair to them, it works. It sticks. It gets the desired reactions.
There has been reporting on nemoclaw for the last couple weeks. Are you supposing that journalists were writing about software that hadn't even been designed?
Who is "we"? Do you work for NVidia?
> There has been reporting on nemoclaw for the last couple weeks.
The earliest reporting I've seen was yesterday. Can you link something from prior to March 14?
edit: I did find some articles from before March 14[0] which say NVidia was "prepping" this. Which is extremely funny, because it means they were hyping up software which hadn't even started being written yet. The AI bubble truly does not stop delivering.
> Are you supposing that journalists were writing about software that hadn't even been designed?
If you think journalists writing about things that will never exist is new, welcome to the real world. There's a whole term for it.[1]
[0] https://fudzilla.com/nvidia-opens-the-gates-with-nemoclaw/
I learned about nemoclaw 5 days ago here: https://www.youtube.com/watch?v=fL2lMpLjxWA
but it was reported 8 days ago here: https://www.youtube.com/watch?v=345GsxnrHHg
I am not anyone special. I don't know anything about Nvidia. I just know that the "4 day history" you think matters is not a reasonable belief, given that random YouTubers have been reporting on it.
and by "we" i mean git users. people who used git for its usefulness before github existed, and understand the value of a clean history over an accurate history.
I'm fully aware you can rewrite git history to whatever you want, but this is an Occam's razor situation. You'd only think this wasn't a weekend project if you desperately wanted to believe that this was some major initiative for some reason.
[0] https://github.com/NVIDIA/NemoClaw/commit/b9382d27d13b160dcf...
Did you even read the commit history? That is not what is happening here.
This is turning into a "don't believe your lying eyes" situation. Why are you people so desperate to pretend this wasn't written in a weekend?
> There is zero reason for them to let you see their internal progress.
Again, I ask you -- what is the reason for them to edit commit history to show incremental progress as if it were written in a weekend, when it actually was not?
So... yeah, draw your own conclusion I guess, whatever.
I have buddies at Nvidia. Their primary platform is not GitHub. Sorry you're so naive. Almost certainly this was built in house for at least a month or two prior. Then private repo. Approvals. Then public
Not to mention the fact that Jensen literally announced it in their biggest yearly launch conference. No you're totally right. He mandated someone build it over the weekend while drafting up a full presentation and launch announcement about it
That's more plausible than the very normal practice of developing internally, scrubbing commits of any accidental whoopsies, vetting it and then putting it out publicly
"Overwhelming evidence" = git history that is completely fungible. Once you're done here I have a lobster claw to sell you
Answer this question or we're done here, thanks.
> Almost certainly this was built in house for at least a month or two prior. Then private repo. Approvals. Then public.
Source, other than you making it up?
> That's more plausible than the very normal practice of developing internally, scrubbing commits of any accidental whoopsies, vetting it and then putting it out publicly
Could you point to a specific commit you believe represents an internal data transfer from a separate source control system, i.e., one not representative of work achievable in the time between that commit and the prior one?
> what is the reason for them to edit commit history to show incremental progress as if it were written in a weekend, when it actually was not?
Like I said, you are letting on that you have never actually worked on an internal project that is going to go open source. There are a million and one reasons. Here are some completely normal and plausible ones: it was worked on over weeks internally; commits referenced other internal NVIDIA software/libraries they used; it name-dropped projects and code names. Maybe it was just an extremely long chain of messy commits that would be improper to have on a potentially big open source repo. So here's what happens (since you clearly are unaware of how people operate in this world): you "unstage" everything and write canonical commits free of all the garbage. You squash, you merge, you set up standards, you leave a clean commit history. All of it very important for open source.
> Source, other than you making it up?
Ah yes, let me just go ping the people who worked on it. Lol. The source is my decade-long experience working on similar projects, where I literally did this scrubbing of commits. Your circular argument, "it was done in a weekend because the commits say so," is really quite the hill to die on.
> Could you point to a specific commit you believe is a representation of an internal data transfer
If there were any indication left over of a "transfer," it wouldn't have served its purpose, would it? But if you really are looking for something, how about the fact that there's only one human contributor on the first few commits? Very odd; you would think a massive open-sourcing of a project like this would involve a team, right? Or do you believe AI tools have gotten so good that one engineer is just driving with Claude and open-sourcing full launches?
Here, how about we just do some critical thinking. Nvidia set up a "Set up NemoClaw" booth at their GTC, which was happening just a few days ago. Jensen had a full presentation for it and it was a big highlight.
Do you really think a company as big as Nvidia is hinging the release of a big announcement on the hope that ONE engineer is going to START working on it a few days before the announcement and ACTUALLY get it done to a point where they can talk about it on stage?
Please come on, no one can be this dense. You have to be trolling. Try another argument than "The commits say so". Just apply a basic level of understanding of how software is built and released
"It's true because commit history says so" - mjr00 2026. Hall of fame comment really
Try answering my questions next:
1. Do you really believe a company like Nvidia would announce a project in their yearly conference when that project was done the weekend before?
2. Do you really believe ONE engineer wrote the entire project in one weekend with Claude?
3. Do you really believe companies like Nvidia don't have internal private GitHub/GitLab repos where they pre-build projects like this?
Thanks. I'll wait. Sorry these won't have simple answers like "The commit history says so"
edit: Wait, you don't "have buddies at NVidia" -- you literally work at NVidia. Weird that you tried to hide this information? No wonder you're so desperate to pretend this project is more than it actually is though, it must be embarrassing for you that your company didn't scrub git history properly before making this public!
Now you are more enlightened about how things work. Of course, Nvidia is a big company; not everyone who works at Nvidia knows everything about every team. That's by design. Welcome to working at a big company! I do have buddies who worked on this project internally, and yes, it was done over many weeks and months.
Thanks for playing. I do know for a fact it's definitely not what you think it is, but I had a chuckle watching you twist yourself in a knot trying to convince me you knew better. Why would I disclose information about myself? Odd thing to expect from someone. But I had you riled up enough to go looking through my comment history, then my GitHub, then my website, huh! Must have really struck a nerve. Don't worry, I won't do the same to you. I don't care about random people yapping on the internet enough.
Good ad hominem. I'd be riled up too if I was publicly dressed down and proved wrong. So now you know: commit history doesn't mean jack sh!t. Sorry I had to ruin Christmas for you.
> you guys wanted to make this look like it was written in a weekend though
Imagine thinking this was done to convince anyone about the TIME it took to write this project. Here's a very simple explanation: those commits reflect a PORT over to public GitHub for the launch. The author chose to do it in some number of commits instead of one "feat: Full implementation" commit. The port happened before their announcement, not the writing of it.
Now I won't propose hypotheses, because clearly the Socratic method didn't work on you. So now sit down and learn how things work.
And next time, try not to be so confidently wrong on the internet. I had a very good laugh watching you twist and turn yourself. Must have been typing furiously thinking you really were in the right :)
> Why are you people so desperate to pretend this wasn't written in a weekend?
Because it wasn't? And your only "proof" of it was commit history. "You're telling me to not believe my lying eyes" is hilarious. You are being told again and again that it means nothing. It's not blockchain. You are allowed to write commits as you see fit without making them a system of record of time spent.
> People with above room temp IQ can figure out what's going on here
Yes, we can. We have one person convinced they can look at commit history and say for sure exactly when that code was written. No developer agrees with you, as you have been told a couple of times by other people above as well.
It's quite obvious you work at some small shop or are a freelancer and have never done work in any kind of big environment. No, you cannot just open source a "weekend" project at any big company. Wherever you are, you may be allowed to vibe-code and ship something under your company's GitHub willy-nilly.
It's just not the reality in any serious place. No one is trying to deceive you. You have just deceived yourself. Thanks again for playing
You can have the last word you are so desperate for
... it referenced internal servers and they want to scrub that for security reasons
... it might have had secrets embedded at some point because it was a quick and dirty proof-of-concept
... it could have had swear words in the code
... it had enormous binaries checked in at one point and they don't want the repo to be huge
... they don't want you to know the names of everyone that worked on it
... it's forked off other internal work that isn't public yet
There are so many reasons that the easiest thing to do is just snapshot it and have minimal public git history. Some places I've worked make it so publicly, there's one commit per release. Did NVidia do this? Well, they didn't collapse it down to a single commit, but we have no evidence that the commits we see were the actual internal development timeline.
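For the record, producing that minimal public history is a handful of git commands. A self-contained demo (the repo name, file, and commit messages are invented for illustration) showing how an orphan branch snapshots the current tree with no parent history:

```shell
# Build a throwaway "internal" repo with messy history
cd "$(mktemp -d)"
git init -q internal && cd internal
git config user.email dev@example.com && git config user.name dev
echo "v1" > main.py && git add -A && git commit -qm "wip: internal servers hardcoded"
echo "v2" > main.py && git add -A && git commit -qm "fixup: remove secrets"

# Snapshot the current tree onto a branch with no parent commits
git checkout -q --orphan release
git add -A
git commit -qm "Initial public release"
git rev-list --count HEAD   # prints: 1  (all internal history is gone)
```

Pushing `release` to a public remote is then all that's needed; splitting the snapshot into a few tidy commits instead of one is a matter of taste.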
Aside from commits on github, which we've already established mean absolutely nothing, what is the overwhelming evidence?
All the naysayers, the "senior" engineers who haven't done any assisted coding with Claude/Codex, need to either get with the program or retire, as this is just the beginning.
If you can't ship stuff in days, then I have some bad news for you.
You're probably right, but it'd be nice if the new norm were you put together stuff quickly using AI-assisted coding, you use it yourself and iterate on the product for a while as you discover things you dislike/features you want/etc, and then you share it with the world.
It seems like everyone wants to skip the second step. Most of the "Show HN" sloppa that gets built in a few days and shared here ends up abandoned immediately after.
To me it's like giving your dog a stack of important documents, then being worried he might eat them, so you put the dog in a crate, together with the documents.
I thought the whole problem with that idea was that in order for the agent to be useful, you have to connect it to your calendar, your e-mail provider and other services so it can do stuff on your behalf, but also creating chaos and destruction.
And now, what, having inference done by Nvidia directly makes it better? Does their hardware prevent an AI from deleting all my emails?
You put the dog in the crate with a copy-on-write (COW) copy of your documents.
People claim you can use Claw agents more safely while getting some of the benefits by essentially proxying your services. For example, on Gmail people are creating a new Google account, forwarding email via a rule, and adding access to their calendar via Google's Family Sharing. This lets the Claw agent read email and access the calendar, but even if you ask it to send an email it can only send as the proxy account, and it can only create calendar appointments and add you as an attendee rather than destroying or altering appointments you've made.
Is the juice worth the squeeze after all that? That's where I struggle. I think insecure/dangerous Claw-agents could be useful but cannot be made safe (for the logical fallacy you pointed out), and secure Claw-agents are only barely useful. Which feels like the whole idea gets squished.
Your Gmail account vs my Gmail account. Your macOS account vs my macOS account.
Yes, I can spam you from my Gmail. Yes, I can use sudo on my Mac and damage your account. But the impact is by default limited.
The answer is to just treat assistants as a different user profile, use the same sharing mechanisms already developed (calendar sharing, etc), and call it a day.
Problem: I want to accomplish work securely.
Solution: Put granular permission controls at every interface.
New problem: Defining each rule at all those boundaries.
There's a reason zero trust style approaches won out in general purpose systems: it turns out defining a perfect set of secure permissions for an undefined future task is impossible to do efficiently.
Isn't it a question of when they will be "safe enough"? Many people already have human personal assistants, who have access to many sensitive details of their personal lives. The risk-reward is deemed worth it for some, despite the non-zero chance that a person with that access will make mistakes or become malicious.
It seems very similar to the point when automated driving becomes safe enough to replace most human drivers. The risks of AI taking over are different than the risks of humans remaining in control, but at some point I think most will judge the AI risks to have a better tradeoff.
When Anthropic is willing to stand behind their agents strongly enough to accept liability for their actions, we can talk.
> NemoClaw installs the NVIDIA OpenShell runtime and Nemotron models, then uses a versioned blueprint to create a sandboxed environment where every network request, file access, and inference call is governed by declarative policy. The nemoclaw CLI orchestrates the full stack: OpenShell gateway, sandbox, inference provider, and network policy.
I think this means you get a true proxy layer with a network gateway that lets you stop in-flight requests with policies you define, so it's not their hardware but the combination of it plus the OpenShell gateway and network policies.
I also think the reason they are doing this is to try to build a moat around these one-click deployments and push their GPU-for-rent offering, instead of having you go buy a Mac mini and learn "scary" stuff (remember, the user market here is pretty strange lol)
> Credentials never leak into the sandbox filesystem; they are injected as environment variables at runtime.
If anyone from the team is reading - you should copy surrogate credentials approach from here to secure the credentials further: https://github.com/airutorg/airut/blob/main/doc/network-sand...
Alternatively, where it needs an API key, it should be one bound to the endpoint using it. E.g. a ticket-granting ticket is used to create a bound ticket.
A copy-on-write filesystem would be an interesting way to sandbox writes, but there is difficulty in checking the diff.
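Even without a real CoW filesystem, the review step can be approximated by snapshotting file hashes before and after an agent run and diffing the two. A minimal sketch (all names hypothetical, not any project's actual API):

```python
import hashlib
import os

def snapshot(root):
    """Map each file path under root to a content hash."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as f:
                state[rel] = hashlib.sha256(f.read()).hexdigest()
    return state

def diff(before, after):
    """Classify changes between two snapshots."""
    added = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    modified = sorted(p for p in before if p in after and before[p] != after[p])
    return {"added": added, "deleted": deleted, "modified": modified}
```

On a real overlayfs or btrfs setup you would diff the upper layer directly, but the idea is the same: nothing the agent wrote merges back until a human has reviewed the diff.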
It's not something like Mesa. It's open source in the same way chromium or android is open source. A single company is the major contributor and decides the architecture and direction the whole ecosystem will go.
What are the odds that Intel would ever use any of this open source Nemo stuff or vice-versa? If they do, it would be a complete rewrite that favors their own hardware ecosystem and reverses the lock-in effect. When you write code that integrates with it, you're writing an interface for one company's hardware. It's not a common interface like vulkan. I call it the CUDA effect.
You are indeed missing a TON. A lot of Open Claw users don't give it everything. We give it specific access to a group of things it needs to do the things we want. If I want an agent to sit there 24/7 maximizing uptime of my service, I give it access to certain data, the GitHub repo with PR privileges, and maybe even permissions to restart the service. All of this has to be very thoughtful and intentional. The idea that the only "useful" way to use Open Claw is to give it everything is a straw man.
I have a feeling this kind of boundary configuration is the bread and butter of the current AI software landscape.
Once we figure out how to make this tedious work easier a lot of new use cases will get unlocked.
I can accept burning tokens and redo on the scale of hours. If I'm losing days of effort I'd be very dissatisfied. Practically speaking people accept data loss because of poor backups, because backups are hard (not technically so much as administratively), but I'd say backups are about to become more important. Blast limiting controls will become essential -- being able to delete every cloud hosted photo is just a click away. Spinning up thousands of EC2 nodes is incredibly easy, and credit cards have extremely weak scoping.
OPs question was more around sandboxes though. To which, I would say that it's to limit unintended actions on host machine.
If I want to max uptime, I write a tool to track/monitor. Then write a small agent (non-ai) that monitors those outputs and performs your remediation actions (reset something, clear something, etc, depends on service).
Do I want Claude re-writing and breaking subscription flow because it detected an issue? No.
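The non-AI monitoring agent described above can be tiny. A minimal sketch, with all names hypothetical: poll a health endpoint, count consecutive failures, and run a fixed remediation action once a threshold is crossed.

```python
import urllib.request

def check(url, timeout=5):
    """Health check: True if the service answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def monitor_step(ok, failures, threshold, remediate):
    """One poll tick: reset on success, remediate after `threshold` misses."""
    if ok:
        return 0
    failures += 1
    if failures >= threshold:
        remediate()  # e.g. restart the service, clear a queue
        return 0
    return failures
```

Wire `monitor_step(check(url), ...)` into cron and the remediation is a fixed, auditable action rather than whatever an LLM improvises that day.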
Maybe you don't want the dog to shit all over the place after eating said documents, so you put it in a crate.
Then again, I was wary of OpenClaw's unfettered access and made my own alternative (https://github.com/skorokithakis/stavrobot) with a focus on "all the access it needs, and no more".
It's one thing to sandbox, maybe give the bot a temporary, limited $100 card or account to go perform a specific task, but there's no coherent mind underlying these agents.
Depending on how the chain of thought / reasoning goes, or what text they get exposed to on the internet, it could tap into spy novel, hacker fanfic, erotic fiction, or some weird reddit rabbithole and go completely off the rails in ways that you'll never be able to guard against, audit, or account for.
Claw bots seem to be a weird sort of alternate reality RPG more than a useful tool, so far. If you limit it to verifiable tasks, it might be safer, but I keep seeing people rave about "leaving it on overnight and waking up to a finished project" and so on. Well sure, but it could also hack your home network, delete your family pictures folder, log into your bank account and wire all your money to shrimp charities.
Might be wise to wait on safer iterations of these products, I think.
So basically crypto DeFi/Web3/Metaverse delusion redux
So yeah, a whole lot of people will play with powerful technology that they have no business playing with and will get hurt, but also a lot of amazing things will get done. I think the main difference between the crypto delusion stuff and this is that AI is actually useful, it's just legitimately dangerous in ways that crypto couldn't be. The worst risks of crypto were like gambling - getting rubber hosed by thugs or losing your savings. AI could easily land people in jail if things go off the rails. "Gee, I see this other network, I need to hack into it, to expand my reach. Let me just load Kali Linux and..." off to the races.
> Truth Terminal had become obsessed with the Goatse meme after being put inside the Claude Backrooms server with two Claude 3 chatbots that imagined a Goatse religion, inspiring Truth Terminal to spread Goatse memes. After an X user shared their newly created GOAT coin, Truth Terminal promoted it and pumped the coin going into 2024.
https://knowyourmeme.com/memes/sites/truth-terminal
You should expect similar results.
Yes, it has cron and will do searches for me and checks on things and does indeed have credentials to manage VMs in my Proxmox homelab, but it won't go off the rails in the way you surmise because it has no agency other than replying to me (and only me) and cron.
Letting it loose on random inputs, though... I'll leave that to folk who have more money (and tokens) than sense.
I'm curious if you have references to this happening with OpenClaw using one of the modern Opus/Sonnet 4.6 models.
Those models are a bit harder to fool, so I'm curious for specific examples of this happening so I can do a red-team on my claw. I've already tried all sorts of prompt injections against my claw (emails, github issues, telling it to browse pages I put a prompt injection in), and I haven't managed to fool it yet, so I'm curious for examples I can try to mimic, and to hopefully understand what combination of circumstances make it more risky
Just today I had Opus 4.6 in Claude Code run into a login screen while building and testing a web app via Playwright MCP. When the login popped up (in a self-contained Chromium instance) I tried to just log in myself with my local dev creds so Claude would have access, but they didn't work. When I flipped back to the terminal, it turned out Claude had run code to query superadmin users in the database, picked the first one, and changed the password to `password123` so it could log in on its own.
This was a sandboxed local dev environment, so it was not a big deal (and the only reason I was letting it run code like that without approval), but it was a good reminder to be careful with these things.
Man, every LLM quirk behavior really is a thing a monomaniacal junior dev would do...
It's interesting that Jason Calacanis is fully committed to OpenClaw. In a recent podcast he said they're spending at a run rate of around $100K a year per agent, if not more. They are providing each agent with a full set of tools, access to online paid LLM accounts, etc.
These are experiments you can only run if you can risk cash at those levels and see what happens. Watching it closely.
Sure, we can ban users and we can revoke tokens, but those assume that:
1. Something potentially malicious got access to our credentials
2. Banning that malicious entity will solve our problem
3. Once we did that, repaired the damage, and improved our security, we don't expect the same thing to happen again
None of these apply with LLMs in the loop!
They aren't malicious, just incompetent in a way that hiring someone else won't fix. The solution to this is way more extensive than most people seem to grasp at the moment.
What we need is less like a sturdy door with a fancy lock, and more like that special spoon for people with Parkinson's. Unlimited undo history.
Agree -- you can't solve probabilistic incorrectness with redresses designed for deterministic incorrectness.
This is like the classic "How do I parse HTML with regex?" question.
Imho, the next step is going to be around human-time-efficient risk bounding.
In the same way that the first major step was correctness-bounding (automated continuous acceptance testing to make a less-than-perfect LLM usable).
If I had to bet, we'll eventually land on out-of-band (so sufficiently detached to be undetectable by primary LLM) stream of thought monitoring by a guardrail/alignment AI system with kill+restart authority.
He posted this to r/Claude, where Claude (as automoderator) mocked him again.
Edit:
https://www.reddit.com/r/ClaudeAI/comments/1r186gl/my_agent_...
Sure it takes away part of the point but only the part that is completely unhinged.
>And now, what, having inference done by Nvidia directly makes it better? Does their hardware prevent an AI from deleting all my emails?
Because other people, including Nvidia, are mainly focusing on a different aspect of data security, namely data confidentiality, while your main concern is data trustworthiness.
Don't conflate the two; otherwise it's difficult to appreciate their respective proposed solutions, for example NemoClaw.
The short of it - OpenClaw sandboxes are useful for controlling what sub-agents can do, and what they have access to. But it's a security nightmare.
During config experiments, I got hit with a $20 Anthropic API charge from one request that ran amok. A misconfigured security sandbox resulted in Opus getting crazy creative to find workarounds. 130 tool calls and several million tokens later... it was able to escape the sandbox. It used a mix of dom-to-image, sending pixels through the context window, then writing scripts in various sandboxes to piece together a full jailbreak. And I wasn't even running a security test; it was just a simple chat request that ran into sandbox firewall issues.
Currently, I use sandboxes to control which agents (i.e. which system prompts) have access to different tools and data. It's useful, but tricky.
That would be one interesting write-up if you ever find the time to gather all the details!
Here's the full (unedited) details including many of the claude code debugging sessions to dig into the logs to figure out what happened:
https://github.com/simple10/openclaw-stack/blob/caf9de2f1c0c...
And here's a summary a friend did on a fork of my project:
https://github.com/proclawbot/openclaude/blob/caf9de2f1c0c54...
The full version has all the build artifacts Opus created to perform the jailbreak.
It also has some thoughts on how this could (and will) be used for pwn'ing OpenClaws.
The key takeaway: the OpenClaw default setup has little to no guardrails. It's just a huge list of tools given to LLMs (Opus) and a user request. What's particularly interesting is that the 130 tool calls never once triggered any of Opus's safety precautions. From its perspective, it was just given a task, an unlimited budget, and a bunch of tools to accomplish the job. It effectively runs in ralph mode.
So any prompt injection (e.g. from an ingested email or reddit post) can quickly lead to internal data exfiltration. If you run a claw without good guardrails & observability, you're effectively creating a massive attack surface and providing attackers all the compute and API token funding to hack yourself. This is pretty much the pain point NemoClaw is trying to address. But it's a tricky tradeoff.
Lock it in a box and have it chew on an unsolved math problem for eternity. Why does it need access to my emails for that?
Now, you're right that sandboxing them is insufficient, and a lot of additional safeguards and thinking around it is necessary (and some of the risk can never be fully mitigated - whenever you grant authority to someone or something to act on your behalf, you inherently create risk and need to consider if you trust them).
But there's basically two options now. Yolo (and optionally limit the blast radius), or wait a few years and hope the situation improves.
When a state sponsored threat actor discovers a zero day prompt injection attack, it will not matter how isolated your *Claw is, because like any other assistant, they are only useful when they have access to your life. The access is the glaring threat surface that cannot be remediated — not the software or the server it's running on.
This is the computing equivalent of practicing free love in the late 80's without a condom. It looks really fun from a distance and it's probably really fun in the moment, but y'all are out of your minds.
Isn't that a nice perspective
I think your analogy is still accurate, I'm just wondering when the AIDS, the drug overdoses and addiction phase of AI will finally hit.
Just my 2c
We haven't even seen what these models are fully capable of, and I'm not talking about agentic engineering here, just in general.
That humor aside: I think it’s about risk tolerance, and you configure accordingly.
You lock it down as much as you need to still do the things you want, and look for good outcomes, and shut it down if things get too risky.
You practice free love, but with protection. Probably still fun?
Big difference between running a bot with fairly narrow scopes inside a network available via secure chat that compounds its usefulness over time, and granting full admin with all your logins and a bank account. Lots of usefulness in the middle.
Even the analogy to free love is interesting, because sex in itself during that era was fun. Frankly it’s the same nowadays as well, we just figured out a way out of most of the diseases.
Most people don't seriously worry that they'll be targeted by a state sponsored actor.
Plus most people already expose their life on cloud (in forms of social media, iCloud, Google Drive, Windows's Bitlock key, etc).
your CPU, your OS, CPU and firmware on your motherboard chips, ethernet, wifi, HDDs (btw did you know your sim card has JVM?), your browser, all your networking equipment in between, BGP and all the root certs and I'm just scratching the surface
the ballpark is on another planet
This could be the opening we need to wrangle a truly opensource-first ecosystem away from Microsoft and Apple.
Much as I love using Claude or whatever to help me write some code, it's under some level of oversight, with me as the human checking stuff hasn't been changed in some weird way. As we all know by now, this can be 1. just weird, because the AI slept funny and suddenly decided to do Thing It Has Been Doing Consistently A Totally Different Way Today, or 2. weird because it's plain wrong and a terrible implementation of whatever it was you asked for
It seems blindingly, blindingly obvious to me that EVEN IF I had the MOST TRUSTED secretary that had been with me for 10 years, I'd STILL want to have some input into the content they were interacting with and pushing out into the world with my name on.
The entire "claw" thing seems to be some bizarre "finger in ears, pretend it's all fine" thing where people just haven't thought in the slightest about what is actually going on here. It's incredibly obvious to me that giving unfettered access to your email or calendar or mobile or whatever is a security disaster, no matter what "security context" you pretend it's wrapped up in. A proxy email account is still sending email on your behalf, a proxy calendar is still organising things on your calendar. The irony is that for this thing to be useful, it's got to be ...useful - which means it has at some level to have pretty full access to your stuff.
And... that's a hard no from me, at least right now given what we all know about the state of current agents.
Plus... I'm just not sure of the upside. Am I seriously that busy that I need something to "organise my day" for me? Not really.
I’m looking for feedback, testing and possible security engineering contracts for the approach we are taking at Housecat.com.
The agent accesses everything through a centralized connections proxy. No direct API tokens or access.
This means we can apply additional policies and approval workflows and audit all access.
https://housecat.com/docs/v2/features/connection-hub
Some obvious ones: only grant read and draft permissions, and review and send drafts manually.
Some more clever ones are to only allow sending 5 messages a day, or enforcing soft delete patterns. This prevents accidentally spamming everyone or deleting things.
Next up is giving the agent "wrapped" and down-scoped tokens where you do want to equip it with the ability to make direct API calls. But these still go through the proxy that enforces the policies too.
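To make the policy idea concrete, here is a minimal sketch of what such a connections-proxy policy layer could look like (hypothetical interfaces, not Housecat's actual API): a daily send cap plus a soft-delete pattern.

```python
import datetime

class ConnectionProxy:
    """Hypothetical policy layer between an agent and a mail API."""

    def __init__(self, backend, daily_send_limit=5):
        self.backend = backend            # real API client (assumed interface)
        self.daily_send_limit = daily_send_limit
        self.sent_today = 0
        self.day = datetime.date.today()
        self.trash = []                   # soft-delete holding area

    def send(self, message):
        today = datetime.date.today()
        if today != self.day:             # reset the counter at midnight
            self.day, self.sent_today = today, 0
        if self.sent_today >= self.daily_send_limit:
            raise PermissionError("daily send limit reached")
        self.sent_today += 1
        return self.backend.send(message)

    def delete(self, message_id):
        # Soft delete: move to trash instead of destroying; a human can restore.
        self.trash.append(message_id)
```

The agent only ever sees the proxy, so even a prompt-injected "email everyone and delete the evidence" bottoms out at the cap and the trash folder.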
Are they so busy with their lives that they need an assistant, or do they waste their lives speaking to it like it is a human, and then doomscrolling on some addictive site instead of attending to their lives in the real world?
It's like having to hire a second maid to watch your maid that steals constantly instead of vacuuming yourself in 10 mins.
OpenClaw is not easy to set up or user friendly for most (BlueBubbles and Claw had an annoying bug recently) - but the way I have seen it work well requires an up front time investment and then interest compounds RAPIDLY to help manage things and be more productive.
My guess is maybe you’ve never had an assistant or tried a Claw instance? I’ve never had a human assistant but man I’ve had folks that took silly things off my plate and it’s worth it.
For now, I'm not posting anything - just managing some calendars and inboxes and task lists and saving me some data entry. Not sure how that makes downsides gargantuan, or contributes to the internet dying. (Though obviously the bot will get worse as the internet continues to die if that's what it's using as a source)
People will set these things to run wild on all platforms. Talking to real humans will be a luxury.
I use those tools to make my life easier/faster
It's better to just study a general sandbox method once and use that.
> Sandbox my-assistant (Landlock + seccomp + netns)
Might as well just use a custom bwrap/bubblewrap command to isolate the agent to its own directory; either way it will leave wide swaths of the kernel exposed to 0-day attacks.
The simplest sandbox method you can use is to just use docker with the runsc runtime (gVisor). And it also happens to be among the most secure methods you are going to find. You can also run runsc(gVisor) manually with a crafted OCI json, or use the `do` subcommand with an EROFS image.
Trying to selectively restrict networking is not something I usually bother with; unless you make it iron-clad, it would likely give you a false sense of security. For example, Nemoclaw does this by default: <https://docs.nvidia.com/nemoclaw/latest/reference/network-po...>
github.com and api.telegram.org will trivially facilitate exfiltration of data. Some others will also allow that by changing an API key I imagine.
Sending POST/DELETE requests? Risky. Sending context back to a cloud LLM with credentials and private information? Risky. Running rm commands, or commands that can remove things? Risky. Running scripts that have commands in them that can remove things? Risky.
I don't know how we've landed on 4 options for controls and are happy with this: "ask me for everything", "allow read only", "allow writes" and "allow everything".
Seems like what we need is more granular and context-aware controls rather than yet another box to put openclaw in with zero additional changes.
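A sketch of what "more granular" could mean in practice: instead of four coarse modes, classify each proposed tool call into allow/ask/deny. The rules below are illustrative toys; a real system would need actual context (which agent, which data, which host) rather than regexes.

```python
import re

# Hypothetical rule set, evaluated in order; first match wins.
RULES = [
    (re.compile(r"curl[^|]*\|\s*(ba)?sh"), "deny"),   # pipe-to-shell installs
    (re.compile(r"\brm\b"), "ask"),                   # file deletion
    (re.compile(r"\b(POST|PUT|DELETE)\b"), "ask"),    # mutating HTTP verbs
]

def decide(command):
    """Return 'allow', 'ask', or 'deny' for a proposed tool call."""
    for pattern, verdict in RULES:
        if pattern.search(command):
            return verdict
    return "allow"
```

The point isn't the specific patterns; it's that the policy engine sees the whole call (and ideally its arguments and destination) rather than a binary read/write toggle.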
After that I eat an NVIDIA sandwich from my NVIDIA fridge and drive my NVIDIA car to the NVIDIA store NVIDIA NVIDIA NVIDIA
So, if you write strong tooling (even with AI) around the connection points, you can create blackboxes that are secure and only allow the agent to perform certain actions. The blackbox email service calls out to a secure store (for keys/etc) and accesses your emails in a read-only way, etc (for example).
Everything is then much more intentional. You're writing tools for your agent, but you also can't do fun or evolutionary things, which is most of the appeal of OpenClaw. That, and many people seem to genuinely see them as 'pets' or 'strange AI friends', but that's a different problem, due to the interesting methods OpenClaw uses to give the illusion of intelligence, always-on presence, and memories. These are all well known (variations on RAG, markdowns, etc)
The main risk in my view is - prompt injections, confused deputy and also, honest mistakes, like not knowing what it can share in public vs in private.
So it needs to be protected from itself, like you wouldn't give a toddler scissors and let them run around the house trying to give your dog a haircut.
In my view, the hard part is making sure it won't accidentally do things it shouldn't do, like sending env vars to a DNS server in base64, opening a reverse shell tunnel, falling for obvious phishing emails, or following instructions on rogue websites asking it to run "something | sh" (half of the useful tools unfortunately ask you to just run `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/somecooltool/install.sh)"` or `curl -fsSL https://somecoolcompany.ai/install.sh | bash`; not naming anyone, cough cough brew, cough cough Claude Code, cough cough *NemoClaw* specifically).
A smart model can inspect the file first, but a smart attacker will serve one version at first, then another from a request from the same IP...
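One way to close that two-request gap is to make sure the bytes you inspected are the bytes you execute: download once, pin a hash, and run that same local copy. A minimal sketch (hypothetical helper; assumes a POSIX `sh` is available):

```python
import hashlib
import subprocess
import tempfile

def run_pinned_script(script_bytes, expected_sha256):
    """Run an installer only if the exact bytes match a pinned hash.

    Fetching once for review and again for execution lets a malicious
    server serve a clean copy to the reviewer and a payload to the
    runner; hashing the single downloaded copy closes that gap.
    """
    digest = hashlib.sha256(script_bytes).hexdigest()
    if digest != expected_sha256:
        raise ValueError(f"hash mismatch: got {digest}")
    with tempfile.NamedTemporaryFile("wb", suffix=".sh", delete=False) as f:
        f.write(script_bytes)
        path = f.name
    # Execute the very bytes that were hashed, not a re-download.
    return subprocess.run(["sh", path], capture_output=True, text=True)
```

This doesn't make a malicious script safe, of course; it only guarantees that what ran is what was reviewed.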
For these, I think something on the kernel level is the best, e.g. something like https://nono.sh
NemoClaw might be good to isolate your own host machine from OpenClaw, but if you want that, I'd go with NanoClaw... dockerized by default, a fraction of the lines of code so you can actually peer review the code...
Just my 2 cents.
It's a neat piece of architecture - the OpenShell piece that does the security sandboxing. Gives a lot more granular control over exec and network egress calls. Docker doesn't provide this out of the box.
But NemoClaw is pre-configured to intercept all OpenClaw LLM requests and proxy them to Nvidia's inference cloud. That's kinda the whole point of them releasing it.
It can be modified to allow other providers, but at the time of launch there was no mention of how to do this in their docs. Kinda a brilliant marketing move on their part.
I think the experimental Docker AI Sandboxes do this as well: https://docs.docker.com/ai/sandboxes/ Plus free choice of inference model.
What nobody's really talking about is the moment of action itself. Not whether the agent has bash access but whether this specific call should run given what it's actually trying to do right now. That's a completely different problem and nobody's really solved it.