NemoClaw · part 2
[AI Agent] How to Install NemoClaw on DGX Spark (4 Undocumented Fixes)
❯ cat --toc
- Plain-Language Version: Installing AI Agent Software on Unusual Hardware
- Step 1: Upgrade Node.js to v22
- Step 2: Run the NemoClaw Installer
- Step 3: Fix npm link Permissions
- Step 4: Install OpenShell CLI
- Step 5: Fix Docker for cgroup v2
- Step 6: Get an NVIDIA API Key
- Step 7: Run setup-spark
- Step 8: Run onboard
- What Was Gained
- Installation Checklist
TL;DR
NemoClaw on GX10 (GB10/SM121) needs four fixes that aren't in the official docs: upgrade Node.js to v22, manually run sudo npm link after the installer, install OpenShell from the tar.gz release (not the binary URL in the docs — that 404s), and add cgroupns=host to Docker daemon.json before onboard. After those, nemoclaw setup-spark and NEMOCLAW_NON_INTERACTIVE=1 nemoclaw onboard complete cleanly.
Plain-Language Version: Installing AI Agent Software on Unusual Hardware
Installing software on a computer usually means clicking "Download" and following the prompts. On specialized AI hardware like the NVIDIA DGX Spark (or the ASUS GX10, which uses the same chip), it is not that simple. The machine runs Linux, uses an ARM processor instead of the Intel/AMD chip in most PCs, and ships with older versions of some tools.
NemoClaw is NVIDIA's AI agent framework — software that lets an AI assistant run on your machine, read files, and execute tasks. The official install guide assumes you are on a standard setup. On the GX10, four things go wrong that the guide does not mention: the pre-installed Node.js is too old, a file permission step fails silently, a download link in the documentation is broken, and a Linux container setting needs to be changed.
None of these are hard to fix once you know about them. This article walks through each one in order, with the exact commands. Total time from a blank GX10 to a running AI agent sandbox: about 30 minutes.
Step 1: Upgrade Node.js to v22
NemoClaw requires Node.js 20+. The GX10 had v18.19.1 shipped with the system. The installer checks this early:
Error: Node.js version 18.19.1 is not supported. Please upgrade to 20 or later.
Upgraded to v22 via NodeSource:
curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt-get install -y nodejs
node --version
# v22.22.1
v22 is fine even though the error only mentions 20+.
Step 2: Run the NemoClaw Installer
curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
The installer exited with code 243 and printed:
nemoclaw.sh: line 47: BASH_SOURCE[0]: unbound variable
This is a set -euo pipefail issue: BASH_SOURCE[0] is unset when the script runs inside a curl | bash subshell. The error fires in the script's cleanup block — after the actual installation is already complete. Running nemoclaw --version immediately after:
nemoclaw v0.1.0
The CLI is installed at ~/.nemoclaw/source. The error is cosmetic but it is loud.
Step 3: Fix npm link Permissions
The installer tries to create a global symlink. That step failed:
npm error code EACCES
npm error syscall symlink
npm error path /usr/local/lib/node_modules/nemoclaw
npm error dest /usr/local/bin/nemoclaw
npm error errno -13
Standard npm global install permission issue on a system-managed Node.js. Fixed manually:
cd ~/.nemoclaw/source
sudo npm link
After this, nemoclaw is callable from any directory.
Step 4: Install OpenShell CLI
NemoClaw's onboard command requires the openshell CLI to be present. The official spark-install.md says to install it like this:
# This URL 404s — do NOT use
ARCH=$(uname -m)
curl -fsSL "https://github.com/NVIDIA/OpenShell/releases/latest/download/openshell-linux-${ARCH}" \
-o /usr/local/bin/openshell
That URL returned HTTP 404. The actual release format is a tar.gz, not a bare binary. The correct install for aarch64 Linux:
curl -fsSL \
"https://github.com/NVIDIA/OpenShell/releases/download/v0.0.13/openshell-aarch64-unknown-linux-musl.tar.gz" \
-o /tmp/openshell.tar.gz
tar xzf /tmp/openshell.tar.gz -C /tmp
sudo mv /tmp/openshell /usr/local/bin/openshell
openshell --version
# openshell 0.0.13
For x86_64, replace aarch64-unknown-linux-musl with x86_64-unknown-linux-musl.
To get the current latest version programmatically:
curl -s https://api.github.com/repos/NVIDIA/OpenShell/releases/latest \
| python3 -c "import sys,json; r=json.load(sys.stdin); print(r['tag_name'])"
# v0.0.13
Step 5: Fix Docker for cgroup v2
Ubuntu 24.04 ships with cgroup v2 (cgroup2fs). OpenShell's gateway container starts k3s internally, and k3s tries to create cgroup v1-style paths that don't exist under cgroup v2. Without a fix, onboard fails with:
K8s namespace not ready
openat2 /sys/fs/cgroup/kubepods/pids.max: no such file or directory
Failed to start ContainerManager: failed to initialize top level QOS containers
Verify cgroup version first:
stat -fc %T /sys/fs/cgroup/
# cgroup2fs ← affected
# tmpfs ← cgroup v1, no fix needed
If cgroup2fs, add cgroupns=host to Docker's daemon config:
sudo python3 -c "
import json, os
path = '/etc/docker/daemon.json'
d = json.load(open(path)) if os.path.exists(path) else {}
d['default-cgroupns-mode'] = 'host'
json.dump(d, open(path, 'w'), indent=2)
"
sudo systemctl restart docker
This makes all containers use the host cgroup namespace, which is what k3s expects. The existing containers (open-webui, pipelines, etc.) survived the restart without issues.
Step 6: Get an NVIDIA API Key
nemoclaw setup-spark and nemoclaw onboard both require a key from build.nvidia.com/settings/api-keys. The key starts with nvapi-. The free tier is valid for 30 days.
The key can be passed as an environment variable to skip the interactive prompt:
export NVIDIA_API_KEY=nvapi-...
Step 7: Run setup-spark
NVIDIA_API_KEY=nvapi-... nemoclaw setup-spark
Output:
>>> User 'coolthor' already in docker group
>>> Docker daemon already configured for cgroupns=host
>>> DGX Spark Docker configuration complete.
>>> Next step: run 'nemoclaw onboard' to set up your sandbox.
setup-spark checks Docker group membership and the daemon config, then exits. On GX10 — after the manual fixes above — it completes in under a second with nothing to do. On a fresh DGX Spark straight from the box it would apply the fixes automatically.
Step 8: Run onboard
The default nemoclaw onboard is interactive — it prompts for sandbox name, model selection, and other configuration. Since this was running over SSH without a TTY, the interactive mode stalled. Passing NEMOCLAW_NON_INTERACTIVE=1 uses defaults throughout:
NVIDIA_API_KEY=nvapi-... NEMOCLAW_NON_INTERACTIVE=1 nemoclaw onboard
The output scrolled through 7 stages:
[1/7] Preflight checks
✓ Docker is running
✓ openshell CLI: openshell 0.0.13
✓ Port 8080 available (OpenShell gateway)
✓ NVIDIA GPU detected: 1 GPU(s), 122502 MB VRAM
[2/7] Starting OpenShell gateway
Gateway nemoclaw ready.
✓ Active gateway set to 'nemoclaw'
[3/7] Creating sandbox
...
[7/7] Done
The k3s startup logs printed during stage 2 are verbose — hundreds of lines of Kubernetes reconciler output. They're not errors. The process completed cleanly.
Verification:
nemoclaw status
# Sandboxes:
# my-assistant * (nvidia/nemotron-3-super-120b-a12b)
nemoclaw my-assistant status
# Phase: Ready
The sandbox is up, policy enforcement is active, and the agent is connected to nvidia/nemotron-3-super-120b-a12b via NVIDIA's cloud inference API.
What Was Gained
What cost the most time: The OpenShell binary URL. spark-install.md documents a URL that 404s on the current release. The actual asset is a tar.gz with a different naming convention. The error from curl -fsSL on a 404 is silent by default — it just saves an HTML error page as the binary, which then fails with a confusing "Exec format error" on execution. Running curl -v would have shown the 404 immediately.
Transferable diagnostic: When a binary fails with Exec format error on Linux, the first thing to check is whether the download actually succeeded. file <binary> will show ASCII text instead of ELF if you saved an HTML page. GitHub's release asset naming is not consistent across versions — always look at the actual release page, not docs that reference a specific URL.
The pattern that applies everywhere: curl | bash installers on non-x86 hardware with strict bash mode have two independent failure points: the download step (architecture mismatch, URL rot) and the script's own bash compatibility. Both can fail silently in ways that look like a complete install failure when only part of the process failed. Check artifacts after every step.
Installation Checklist
For NemoClaw on GX10 / DGX Spark (aarch64, Ubuntu 24.04, cgroup v2):
node --version→ must be 20+. If not: install via NodeSource v22curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash→ ignore exit code 243cd ~/.nemoclaw/source && sudo npm link- Download OpenShell from GitHub releases (tar.gz), not the URL in the docs
- Check
stat -fc %T /sys/fs/cgroup/→ ifcgroup2fs, addcgroupns=hostto daemon.json - Get NVIDIA API key from build.nvidia.com (30-day free)
NVIDIA_API_KEY=nvapi-... nemoclaw setup-sparkNVIDIA_API_KEY=nvapi-... NEMOCLAW_NON_INTERACTIVE=1 nemoclaw onboardnemoclaw status→ should show sandbox in Ready phase
Also in this series: Part 1 — NemoClaw: What It Is, Why It Exists, and How It Works
FAQ
- Why does the NemoClaw installer fail on DGX Spark / GX10?
- The installer fails for four reasons not covered in official docs: Node.js must be v20+ (GX10 ships with v18), npm link needs sudo on system-managed Node, the OpenShell binary URL in the docs returns 404 (the actual release is a tar.gz), and Ubuntu 24.04's cgroup v2 requires adding cgroupns=host to Docker daemon.json before onboard.
- How do I install OpenShell on aarch64 Linux for NemoClaw?
- Download the tar.gz from GitHub releases (not the binary URL in the docs, which 404s). For aarch64: curl the openshell-aarch64-unknown-linux-musl.tar.gz from the latest release, extract it, and move the binary to /usr/local/bin/openshell. For x86_64, use the x86_64-unknown-linux-musl variant.
- What is the 'Exec format error' when running openshell on Linux?
- This usually means the download failed silently — curl saved an HTML error page (404 response) instead of the actual binary. Run 'file openshell' to check: it will show 'ASCII text' instead of 'ELF'. Re-download from the correct GitHub release URL (tar.gz format).
- How do I fix the cgroup v2 error during NemoClaw onboard?
- Ubuntu 24.04 uses cgroup v2 by default, but OpenShell's internal k3s expects cgroup v1 paths. Add '"default-cgroupns-mode": "host"' to /etc/docker/daemon.json and restart Docker. Verify with 'stat -fc %T /sys/fs/cgroup/' — if it shows 'cgroup2fs', the fix is needed.