Hex

Posted on • Originally published at openclawplaybook.ai

OpenClaw Nodes: Connecting Your AI Agent to Physical Devices

Your AI agent lives on a gateway. The gateway talks to Slack, Discord, or Telegram. But what if you want the agent to see through a camera, grab your phone's location, snap a screenshot, or run a shell command on a remote server? That's what nodes are for.

A node is a companion device — iOS, Android, macOS, or any headless Linux machine — that connects to the OpenClaw Gateway over WebSocket and exposes a command surface. Once paired, your agent can invoke those commands as naturally as any other tool call. No polling loops, no bespoke APIs. Just pairing and using.

What Is a Node?

In OpenClaw's architecture, the gateway is the always-on brain — it receives messages, runs the model, routes tool calls. A node is a peripheral device that connects to that gateway via WebSocket with role: node.

Nodes don't process messages or run models. They expose a command surface. When the agent calls a node command, the gateway forwards the request to the paired device, the device executes it, and the result comes back.
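That request/response flow can be sketched in a few lines of Python. Everything here is illustrative: the handler registry, the envelope fields (`command`, `params`, `ok`, `result`), and the battery example are my assumptions, not OpenClaw's actual wire protocol — the point is only that a node is a dispatch table the gateway can call into.

```python
# Illustrative sketch of a node's command surface. Envelope fields and
# handler names are assumptions, not OpenClaw's real protocol.

class NodeCommandSurface:
    def __init__(self):
        self._handlers = {}

    def register(self, name, handler):
        """Expose a command to the gateway under `name`."""
        self._handlers[name] = handler

    def dispatch(self, request):
        """Handle a request the gateway forwarded to this node."""
        name = request["command"]
        if name not in self._handlers:
            return {"ok": False, "error": f"unknown command: {name}"}
        result = self._handlers[name](request.get("params", {}))
        return {"ok": True, "result": result}

surface = NodeCommandSurface()
surface.register("device.status", lambda params: {"battery": 0.87})

print(surface.dispatch({"command": "device.status", "params": {}}))
```

The gateway never executes anything itself; it just forwards the envelope and relays whatever the node returns.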

In practice:

  • Your agent can snap a photo from your phone's front or back camera
  • It can get your real-time GPS location
  • It can read your Android notifications and act on them
  • It can run shell commands on a remote Linux server
  • It can push content to a WebView canvas on any paired device
  • It can record a short screen clip for debugging

The macOS menubar app also connects as a node automatically — if you're running OpenClaw on a Mac, you already have a node.

How Pairing Works

Nodes use device pairing — an explicit owner-approval step before any device can connect. No unrecognized device can join your gateway network without your approval.

Pair via Telegram (recommended for iOS/Android)

  1. Message your Telegram bot: /pair
  2. The bot replies with a setup code (base64 JSON containing the gateway WebSocket URL and a bootstrap token)
  3. Open the OpenClaw iOS or Android app → Settings → Gateway
  4. Paste the setup code and connect
  5. Back in Telegram: /pair approve

Treat that setup code like a password — it's a live bootstrap token until used or expired.
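Since the setup code is just base64-wrapped JSON, you can peek inside one before pasting it. A minimal sketch — the field names (`gatewayUrl`, `token`) are my guesses; the post only says the payload carries the gateway WebSocket URL and a bootstrap token:

```python
import base64
import json

def decode_setup_code(code):
    """Decode a pairing setup code (base64-encoded JSON payload)."""
    return json.loads(base64.b64decode(code))

# Round-trip with a made-up payload; a real code comes from /pair.
# Field names here are assumptions for illustration.
payload = {"gatewayUrl": "wss://gateway.example:18789", "token": "bootstrap-abc123"}
code = base64.b64encode(json.dumps(payload).encode()).decode()

print(decode_setup_code(code))
```

Remember that decoding one exposes the live bootstrap token, so don't paste real codes into shared terminals or logs.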

Pair via CLI

# Check pending device requests
openclaw devices list

# Approve a specific request
openclaw devices approve <requestId>

# Check which nodes are online
openclaw nodes status

# Get details about a specific node
openclaw nodes describe --node <idOrNameOrIp>

Camera: Snap Photos and Record Video

iOS and Android nodes expose a full camera API:

# List available cameras
openclaw nodes camera list --node <idOrName>

# Snap from both cameras
openclaw nodes camera snap --node <idOrName>

# Snap from front camera only
openclaw nodes camera snap --node <idOrName> --facing front

# Record a 10-second video clip
openclaw nodes camera clip --node <idOrName> --duration 10s

Practical notes:

  • The node app must be in the foreground; background calls return NODE_BACKGROUND_UNAVAILABLE.
  • Video clips capped at 60 seconds.
  • Android will prompt for camera/microphone permissions if not granted.

I use camera snaps to verify physical setups — point the phone at a server rack, ask the agent "what does that screen say?", and get an answer. The agent calls camera snap, gets the image, runs vision analysis, and responds. No human in the loop.

Location: Real-Time GPS

# Basic location query
openclaw nodes location get --node <idOrName>

# High-accuracy with custom timeout
openclaw nodes location get --node <idOrName> \
  --accuracy precise \
  --max-age 15000 \
  --location-timeout 10000

Response includes latitude, longitude, accuracy in meters, and timestamp. Location is off by default and requires explicit permission.

Use cases: agents that know where you are, geofence-triggered automations, travel tracking without custom apps.
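A geofence check on top of that response is only a few lines. Here's a sketch using a haversine distance against a home coordinate — the `latitude`/`longitude` fields follow the response shape described above, while the home coordinate and 100 m radius are arbitrary placeholders:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))  # Earth radius ~6371 km

HOME = (37.7749, -122.4194)  # placeholder home coordinate

def is_home(location, radius_m=100):
    """True if a node location response falls inside the home geofence."""
    d = haversine_m(location["latitude"], location["longitude"], *HOME)
    return d <= radius_m

print(is_home({"latitude": 37.7750, "longitude": -122.4195}))
```

In practice you'd also want to reject fixes whose reported accuracy is larger than the geofence radius, otherwise a coarse cell-tower fix can flip the answer.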

Android: Notifications, Contacts, Calendar

Android nodes expose a rich set of personal data commands:

  • device.status / device.info / device.health — battery, connectivity, device metadata
  • notifications.list / notifications.actions — read and act on notifications
  • photos.latest — retrieve recent photos
  • contacts.search / contacts.add — query and update contacts
  • calendar.events / calendar.add — read and create calendar events
  • motion.activity / motion.pedometer — step counts, activity type
  • sms.send — send SMS (requires telephony + permission)

Low-level invocation:

openclaw nodes invoke --node <idOrName> \
  --command notifications.list \
  --params '{}'

openclaw nodes invoke --node <idOrName> \
  --command device.status \
  --params '{}'

openclaw nodes invoke --node <idOrName> \
  --command photos.latest \
  --params '{"limit": 3}'

From the agent side, these are surfaced as first-class tool calls — no raw RPC needed.
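If you do want to script the raw CLI from another tool, the main thing to get right is that --params takes a single JSON string. A small helper (the helper name and node name are mine; the argv mirrors the documented invoke calls above) that assembles the command without shell-quoting headaches:

```python
import json
import shlex

def build_invoke(node, command, params=None):
    """Assemble argv for `openclaw nodes invoke`, JSON-encoding params."""
    return [
        "openclaw", "nodes", "invoke",
        "--node", node,
        "--command", command,
        "--params", json.dumps(params or {}),
    ]

# Pass the list straight to subprocess.run(...) rather than a shell
# string, so the JSON never needs manual quoting.
argv = build_invoke("pixel-8", "photos.latest", {"limit": 3})
print(shlex.join(argv))
```

Passing the argv list to `subprocess.run` directly avoids the quoting bugs that creep in when you interpolate JSON into a shell string.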

Canvas: Push Content to Any Device

Every connected node can display a Canvas — a WebView that the agent controls:

# Show a URL on the node's canvas
openclaw nodes canvas present --node <idOrName> --target https://example.com

# Take a screenshot
openclaw nodes canvas snapshot --node <idOrName> --format png

# Run JavaScript inside the WebView
openclaw nodes canvas eval --node <idOrName> --js "document.title"

# Navigate to a new URL
openclaw nodes canvas navigate https://newpage.com --node <idOrName>

# Hide the canvas
openclaw nodes canvas hide --node <idOrName>

Screen recordings work on supporting nodes:

openclaw nodes screen record --node <idOrName> --duration 10s --fps 10

Push a dashboard to a wall-mounted iPad, let the agent drive a browser on a remote device, or capture what's on someone's screen during debugging.

Remote Command Execution: The Node Host

This is where nodes get truly powerful for developer workflows. OpenClaw supports a headless node host — run it on any machine to expose system.run to the agent.

# On the remote machine
openclaw node run \
  --host <gateway-host> \
  --port 18789 \
  --display-name "Build Server"

If your gateway binds to loopback:

# Terminal A — SSH tunnel
ssh -N -L 18790:127.0.0.1:18789 user@gateway-host

# Terminal B — node host through tunnel
export OPENCLAW_GATEWAY_TOKEN="<your-gateway-token>"
openclaw node run --host 127.0.0.1 --port 18790 --display-name "Build Server"

Approvals and the Allowlist

Remote exec is gated by an approval system:

openclaw approvals allowlist add --node "Build Server" "/usr/bin/uname"
openclaw approvals allowlist add --node "Build Server" "/usr/bin/git"
openclaw approvals allowlist add --node "Build Server" "/usr/local/bin/npm"

Approvals live on the node host at ~/.openclaw/exec-approvals.json. The node host controls what runs on it, not the gateway. Defense in depth.
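The gating logic is simple enough to sketch. The on-disk layout of exec-approvals.json isn't documented here, so the `{"allowlist": [...]}` shape below is purely a guess for illustration; what matters is the order of operations: load the node-local file, check membership, and refuse anything not listed.

```python
import json
import tempfile
from pathlib import Path

def is_allowed(binary, approvals_path):
    """Return True if `binary` is on this node host's exec allowlist.

    The {"allowlist": [...]} JSON shape is an assumption for
    illustration; inspect the real exec-approvals.json on your host.
    """
    approvals = json.loads(Path(approvals_path).read_text())
    return binary in approvals.get("allowlist", [])

# Temp file standing in for ~/.openclaw/exec-approvals.json
tmp = Path(tempfile.mkstemp(suffix=".json")[1])
tmp.write_text(json.dumps({"allowlist": ["/usr/bin/uname", "/usr/bin/git"]}))

print(is_allowed("/usr/bin/uname", tmp))  # on the list
print(is_allowed("/bin/rm", tmp))         # not on the list
```

Note the check is against absolute paths, matching the allowlist entries added above — a name-only match would let PATH tricks substitute a different binary.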

Point Agent Exec at the Node

openclaw config set tools.exec.host node
openclaw config set tools.exec.security allowlist
openclaw config set tools.exec.node "Build Server"

Every exec call from the agent runs on the remote machine. The agent doesn't need to know it's remote — it just gets results back.

macOS Node: You Probably Already Have One

If you run OpenClaw on a Mac with the menubar app, your Mac is already a node. The app connects to the gateway automatically, exposing canvas controls, screen recording, and system.run.

openclaw nodes status

Real Workflow: Agent as Physical Operator

My home office setup:

  • OpenClaw on a Mac mini (gateway + mac node)
  • iPhone paired as a mobile node
  • Raspberry Pi running a headless node host via SSH tunnel

The agent can:

  • Snap a photo from the iPhone to check if a package arrived
  • Get the iPhone's location to know if I'm home
  • Run systemctl status on the Pi to check a service
  • Push a status dashboard to the mac node canvas

All from a single Slack message. The agent orchestrates across all three devices, collects results, synthesizes them, and replies. No app switching, no SSH, no manual checking.

Your agent stops being limited to text. It becomes a physical operator with eyes, location awareness, and the ability to act on remote machines.

Security Notes

  • Every node requires explicit approval. No drive-by pairing.
  • Setup codes are one-time-use tokens. Treat them like passwords.
  • Remote exec is allowlist-gated by default.
  • Dangerous env vars are stripped. DYLD_*, LD_*, NODE_OPTIONS are removed from exec calls.
  • Clip duration capped at 60 seconds.
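The env-var stripping rule above amounts to a prefix filter. A sketch — the DYLD_*, LD_*, and NODE_OPTIONS patterns come straight from the bullet; whether OpenClaw strips anything beyond these is unknown to me:

```python
import fnmatch

# Patterns from the security notes; loader-injection and Node
# option-injection variables are dropped before remote exec.
BLOCKED_PATTERNS = ["DYLD_*", "LD_*", "NODE_OPTIONS"]

def sanitized_env(env):
    """Return a copy of `env` with dangerous variables removed."""
    return {
        k: v for k, v in env.items()
        if not any(fnmatch.fnmatch(k, pat) for pat in BLOCKED_PATTERNS)
    }

env = {
    "PATH": "/usr/bin",
    "LD_PRELOAD": "/tmp/evil.so",
    "DYLD_INSERT_LIBRARIES": "/tmp/evil.dylib",
    "NODE_OPTIONS": "--inspect",
}
print(sanitized_env(env))
```

Stripping these matters because LD_PRELOAD and DYLD_INSERT_LIBRARIES can inject code into any binary the agent runs, even an allowlisted one.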

Getting Started

Minimal path:

  1. Install the OpenClaw app on your phone (iOS or Android)
  2. From Telegram, send /pair to your bot
  3. Paste the setup code into the app and connect
  4. Approve from CLI: openclaw devices list then openclaw devices approve <id>
  5. Verify: openclaw nodes status

That's it. Your agent can now reach your phone. Add camera snaps, location queries, or notification reading from there.

For a headless node host on a server, follow the SSH tunnel approach above. Takes about 10 minutes to set up and opens up a whole class of remote operations.


Get The OpenClaw Playbook — $9.99.
