Autonomy Labs · Rover Program
Runtime Architecture · v0.2
Three-tier runtime
Cognition → Planning → Reflex
Public blueprint
Doc: ARCH-01 · Status: Public · v0.2 · Author: R. Kumar · Updated: 2026-05-23 · Peoria, IL
v0.2 · what changed

MPPI is now a first-class layer between cognition and reflex, not an implementation detail buried inside the planner. That is the seam where the productivity-vs-safety tradeoff actually gets adjudicated.
VLM → VLA: the cognition layer now emits actions/subgoals, not just descriptions, in line with the current frontier (RT-2, OpenVLA, Pi-Zero).
Semantic map and behavioral rules are shown explicitly as cost-function inputs to MPPI. This is the architectural seam where the differentiation from the training walk pays off.
The v0.1 four-tier framing (L1–L4 + reflection) collapses to three runtime tiers plus a separate offline learning loop on its own temporal layer.

Why now
Three convergences make this the moment
We publish this stack openly because the moat is not the architecture — it is the execution. The pieces required to build a vision-first autonomous rover have arrived in the same eighteen months, and the team that integrates them first wins on velocity, not secrecy.
01 · Edge VLAs Vision-language-action models — RT-2, OpenVLA, Pi-Zero, Gemini Robotics-ER — reason spatially and emit subgoals in real time on a $249 Jetson.
02 · Commodity Perception Four $30 USB cameras now deliver the visual fidelity a VLA needs. One LiDAR puck costs more than the entire perception stack.
03 · AI-Paired Development A small team paired with capable coding agents can iterate the full stack daily — closing the loop between field run, log review, and next-day improvement.

System diagram · runtime

Three-tier autonomous architecture diagram. Cameras feed into a Cognition (VLA) layer in purple, which produces a semantic subgoal for an MPPI Planning layer in teal. The Planning layer also takes input from a Semantic Map (left) and Behavioral Rules (right), then sends motor commands to a Reflex (MCU) layer in coral, which enforces safety interlocks before driving the motors.
Runtime stack · cognition → planning → reflex · semantic map + behavioral rules as cost-function inputs

Tier-by-tier

COMMAND / DATA — flows down
ESCALATION / FAULT — flows up
REFLECTION — offline cycle
T1
Cognition · VLA
photons → semantic subgoal
1–5 Hz · event-driven
Substrate
Edge VLA on Jetson Orin · vision-language-action class (RT-2, OpenVLA, Gemini Robotics-ER)
VLA · multi-view · subgoal-emit
Primary Loop
Scene comprehension · task decomposition · subgoal selection · success/failure detection · operator dialogue. Outputs a semantic subgoal (where, why, with what constraint) — never raw actuator commands.
Cognition → Planning · contract subgoal(target_region, constraints, success_test)
Operator → Cognition voice / glasses narration / button
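The Cognition → Planning contract above can be pictured as a small immutable message. This is a minimal sketch, not the project's actual type: the `Subgoal` class, the `rationale` field, the `Observation` mapping, and the `at_region` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Tuple

# Hypothetical observation type: whatever cognition extracts from the cameras.
Observation = Mapping[str, object]


@dataclass(frozen=True)
class Subgoal:
    """One message on the Cognition -> Planning contract (illustrative only)."""
    target_region: str                           # where: a named semantic-map region
    constraints: Tuple[str, ...]                 # with what constraint
    success_test: Callable[[Observation], bool]  # how cognition decides it is done
    rationale: str = ""                          # why; an assumed extra field


def at_region(name: str) -> Callable[[Observation], bool]:
    """Build a success test that passes once the rover reports being in `name`."""
    return lambda obs: obs.get("region") == name


kitchen = Subgoal(
    target_region="kitchen_entry",
    constraints=("avoid_furniture",),
    success_test=at_region("kitchen_entry"),
    rationale="first leg of the kitchen errand",
)
```

Note that the message carries no velocities or PWM values, which matches the rule that cognition never emits raw actuator commands.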
T2
Planning · MPPI
sample, simulate, score
10–20 Hz
Substrate
SBC (Jetson Orin or NUC) · sampling-based MPC
MPPI · cost-fn · N-rollout
Primary Loop
Sample N candidate trajectories · simulate each forward · score against a cost function that combines progress (productivity), geometric cost (semantic-map constraints), and rule violation (behavioral rules). Blend the samples, weighting low-cost trajectories most heavily, and emit the result as motor commands.
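One planning tick of the sample, simulate, score loop can be sketched in plain Python. This assumes the textbook MPPI update (weights proportional to exp(-cost / temperature)) and a unicycle motion model; the function names, noise scales, horizon, and the toy goal cost are illustrative assumptions, not the rover's tuned values.

```python
import math
import random


def rollout(state, controls, dt=0.1):
    """Simulate a unicycle model forward; return the visited (x, y) points."""
    x, y, theta = state
    path = []
    for v, omega in controls:
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += omega * dt
        path.append((x, y))
    return path


def mppi_step(state, nominal, cost_fn, n_samples=64, sigma=(0.1, 0.5),
              temperature=1.0, rng=random):
    """One MPPI update: perturb the nominal controls, score, then blend."""
    samples, costs = [], []
    for _ in range(n_samples):
        noisy = [(v + rng.gauss(0.0, sigma[0]), w + rng.gauss(0.0, sigma[1]))
                 for v, w in nominal]
        samples.append(noisy)
        costs.append(cost_fn(rollout(state, noisy)))
    best = min(costs)  # subtract the minimum so exp() cannot overflow
    weights = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(weights)
    # Cost-weighted average of the sampled control sequences.
    return [
        (sum(w * s[t][0] for w, s in zip(weights, samples)) / total,
         sum(w * s[t][1] for w, s in zip(weights, samples)) / total)
        for t in range(len(nominal))
    ]


def goal_cost(goal):
    """Toy progress term: distance from the end of the path to the goal."""
    return lambda path: math.dist(path[-1], goal)


rng = random.Random(0)
plan = [(0.5, 0.0)] * 20  # 20 steps of (v, omega): a 2 s horizon at dt=0.1
new_plan = mppi_step((0.0, 0.0, 0.0), plan, goal_cost((2.0, 0.0)), rng=rng)
```

In a full stack, `cost_fn` would be the sum of the progress, semantic-map, and rule-violation terms, and only the first control of `new_plan` would be sent to the reflex tier before replanning on the next tick.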
Cost input · ←
Semantic Map
Yard knowledge · built during training walk
Named regions, boundary polygons, categorical hazards, distance buffers. Compiled down from the semantic memory at training time. Read every planning tick to evaluate trajectory cost. See TWP-01 for how this gets populated.
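How a boundary polygon and a distance buffer might turn into a geometric cost term is sketched below. The ramp shape, the `inside_cost` value, and the function names are assumptions for illustration; the source does not specify how the semantic map is evaluated.

```python
import math


def point_in_polygon(p, poly):
    """Ray-casting test: True if p lies inside the polygon (list of (x, y))."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_polygon(p, poly):
    """Shortest distance from p to any edge of the polygon."""
    best = math.inf
    for (ax, ay), (bx, by) in zip(poly, poly[1:] + poly[:1]):
        dx, dy = bx - ax, by - ay
        seg_len2 = dx * dx + dy * dy
        if seg_len2 == 0:
            t = 0.0
        else:
            t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / seg_len2
            t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
        best = min(best, math.dist(p, (ax + t * dx, ay + t * dy)))
    return best


def hazard_cost(path, hazard_poly, buffer, inside_cost=1e6):
    """Cost of a path against one hazard region with a distance buffer."""
    cost = 0.0
    for p in path:
        if point_in_polygon(p, hazard_poly):
            cost += inside_cost
        else:
            d = distance_to_polygon(p, hazard_poly)
            if d < buffer:
                cost += (buffer - d) / buffer  # ramps from 0 up to 1 at the edge
    return cost


ravine = [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)]  # hypothetical region
```

Because this runs for every sampled point on every planning tick, a real implementation would likely precompute a distance grid rather than loop over edges.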
Cost input · →
Behavioral Rules
Learned overnight · plain English
Causal rules with their full explanation chain ("don't get within five feet of ravines — depth + irrecoverability = damage"). Stored in language. Read by MPPI as rule-violation penalty terms in the cost function. Updated by the offline reflection cycle below.
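The source does not say how a plain-English rule becomes a penalty term. One possible arrangement, sketched here purely as an assumption, keeps the rule text and its explanation alongside a compiled violation function that the planner can evaluate per trajectory point. `BehavioralRule`, `keep_clear`, the point-shaped ravine, and the weight are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class BehavioralRule:
    text: str                            # the rule as stored, in plain English
    because: str                         # its explanation chain
    violation: Callable[[Point], float]  # 0.0 when satisfied, > 0 when violated
    weight: float = 1.0


def keep_clear(center: Point, radius: float) -> Callable[[Point], float]:
    """Violation grows linearly with how far a point intrudes into the radius."""
    return lambda p: max(0.0, radius - math.dist(p, center))


def rule_penalty(path: Sequence[Point], rules: Sequence[BehavioralRule]) -> float:
    """Sum of weighted violations over every rule and every point on the path."""
    return sum(r.weight * r.violation(p) for r in rules for p in path)


ravine_rule = BehavioralRule(
    text="don't get within five feet of ravines",
    because="depth + irrecoverability = damage",
    violation=keep_clear(center=(10.0, 0.0), radius=5.0),  # units: feet
    weight=100.0,
)

penalty = rule_penalty([(0.0, 0.0), (7.0, 0.0)], [ravine_rule])
```

Keeping `text` and `because` next to the compiled term means the overnight reflection step can still read and revise the rule in language.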
Planning → Reflex · contract motor_cmd(left_pwm, right_pwm, brake)
Planning → Cognition · escalations stuck · all-trajectories-infeasible · subgoal-unreachable
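The two outbound interfaces of the planning tier can be sketched as a command type and an escalation enum. The signed 8-bit PWM range, track width, top speed, and the infeasibility threshold below are assumptions chosen for illustration; only the field names and the three escalation labels come from the contracts above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Escalation(Enum):
    STUCK = "stuck"
    ALL_TRAJECTORIES_INFEASIBLE = "all-trajectories-infeasible"
    SUBGOAL_UNREACHABLE = "subgoal-unreachable"


@dataclass(frozen=True)
class MotorCmd:
    left_pwm: int
    right_pwm: int
    brake: bool


def to_motor_cmd(v, omega, track_width=0.4, max_speed=1.0, pwm_max=255):
    """Map body velocity (m/s, rad/s) to differential-drive PWM values."""
    left = v - omega * track_width / 2
    right = v + omega * track_width / 2
    scale = pwm_max / max_speed

    def clamp(x):
        return max(-pwm_max, min(pwm_max, round(x * scale)))

    return MotorCmd(clamp(left), clamp(right), brake=False)


def check_feasible(costs: Sequence[float],
                   infeasible_cost: float = 1e6) -> Optional[Escalation]:
    """Escalate to cognition when every sampled trajectory is infeasible."""
    if all(c >= infeasible_cost for c in costs):
        return Escalation.ALL_TRAJECTORIES_INFEASIBLE
    return None
```

When `check_feasible` returns an escalation, the planner would hold position and hand the decision back up rather than emit the least-bad trajectory.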
T3
Reflex · MCU
safety floor · spinal cord
100+ Hz · hard real-time
Substrate
Dedicated safety MCU (Arduino-class) · deterministic loop, no OS
bumper · cliff-IR · ultrasonic · IMU-tilt · e-stop
Primary Loop
Poll sensors at > 100 Hz · check thresholds · clamp motor PWM to zero whenever any safety interlock fires. Never negotiates. A bumper trip or cliff-IR fire immediately zeroes the actuators regardless of what the planner just commanded.
Reflex → motors · contract PWM (with override authority)
Reflex → ALL · interrupt E-STOP · halts entire stack
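The override logic of the reflex tier is simple enough to state as a pure function. The sketch below models that logic in Python for readability; on the MCU it would be bare-metal firmware, and the threshold values and the `SensorFrame` layout are assumptions, not measured limits.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical thresholds; real values depend on the chassis and sensors.
MIN_RANGE_M = 0.15
MAX_TILT_DEG = 25.0


@dataclass(frozen=True)
class SensorFrame:
    bumper: bool
    cliff_ir: bool
    ultrasonic_m: float
    tilt_deg: float
    e_stop: bool


def reflex_filter(frame: SensorFrame, left_pwm: int,
                  right_pwm: int) -> Tuple[int, int, List[str]]:
    """Pass the planner's PWM through unless any interlock has fired."""
    tripped = []
    if frame.e_stop:
        tripped.append("e-stop")
    if frame.bumper:
        tripped.append("bumper")
    if frame.cliff_ir:
        tripped.append("cliff-IR")
    if frame.ultrasonic_m < MIN_RANGE_M:
        tripped.append("ultrasonic")
    if abs(frame.tilt_deg) > MAX_TILT_DEG:
        tripped.append("IMU-tilt")
    if tripped:
        return 0, 0, tripped  # zero the actuators, whatever was commanded
    return left_pwm, right_pwm, tripped


clear = SensorFrame(False, False, 1.2, 3.0, False)
bumped = SensorFrame(True, False, 1.2, 3.0, False)
```

The function has no access to the subgoal or the cost function, which is the point: the safety floor cannot be argued with by the tiers above it.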

Meta · offline temporal layer

Offline · T+24h · learning loop
Daily Reflection Cycle
Where the rover gets smarter — runs on a different temporal layer than the diagram above
Logs · three-tier telemetry
LLM Reflection · natural-language reasoning
New Rules · injected to behavioral ruleset
Operational logs from cognition, planning, and reflex are synthesized overnight by the on-board LLM. The model reflects on what worked, what failed, and what surprised it, then emits new behavioral rules in natural language that become inputs to the MPPI cost function on the next operating day. No model retraining. No gradient updates. Cross-session learning via reasoning alone.
This is the central thesis Autonomy Labs is built around, and the one we most want the field to push back on.
A companion diagram covering this temporal layer (plus the training walk feedback into the semantic map) is forthcoming. The training walk itself is detailed in TWP-01.
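The data flow of the reflection cycle (logs in, natural-language rules out, no weights touched) can be sketched with the LLM left as a pluggable callable. Everything here is a hypothetical shape: the event format, the prompt wording, the one-rule-per-line reply convention, and the `stub_llm` stand-in are assumptions, and no real model API is implied.

```python
from typing import Callable, Iterable, List


def build_reflection_prompt(events: Iterable[dict]) -> str:
    """Flatten three-tier telemetry into one prompt for the overnight pass."""
    lines = [f"[{e['tier']}] {e['msg']}" for e in events]
    return (
        "Review today's rover logs. Note what worked, what failed, and what "
        "was surprising. Propose behavioral rules, one per line, each with "
        "its reason.\n\n" + "\n".join(lines)
    )


def reflect(events: Iterable[dict], llm: Callable[[str], str],
            ruleset: List[str]) -> List[str]:
    """Return the ruleset extended with any new rules the model proposes."""
    reply = llm(build_reflection_prompt(events))
    proposed = [line.strip() for line in reply.splitlines() if line.strip()]
    return ruleset + [rule for rule in proposed if rule not in ruleset]


def stub_llm(prompt: str) -> str:
    """Stand-in for the on-board model so the sketch runs without one."""
    return ("Route around the hallway bin location by default until visually "
            "confirmed clear; reason: bin observed at 2.3 m on three runs.")


events = [
    {"tier": "planning", "msg": "obstacle at 2.3 m, 5 degree offset selected"},
    {"tier": "reflex", "msg": "no interlocks fired"},
    {"tier": "cognition", "msg": "subgoal kitchen_entry reached"},
]
rules = reflect(events, stub_llm, ruleset=[])
```

The only state that changes between operating days is the list of rule strings, which is what makes the learning loop inspectable and reversible.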

Validation plan in progress. Before any on-yard trials, the cognition tier gets staged validation: an on-device latency + decision-quality baseline on the Jetson Orin (Qwen2.5-VL class model, real yard photos), then closed-loop software-in-the-loop in NVIDIA Isaac Sim with domain randomization, paired with a real-image holdout to guard against the sim-to-real appearance gap. Companion deep-dive page (VLDN-01 · "proving the stack") forthcoming once the on-device baseline is in.

Trace · Kitchen Errand

Worked example
"Go to the kitchen, pick up the plate, return."
A single mission decomposed across the three runtime tiers · trash bin appears mid-hallway
Operator
Voice command at the loading station.
Cognition
Decompose mission → [traverse hallway → enter kitchen → wait for button-press → return via reverse path]. Emit first subgoal to Planning: target_region=kitchen_entry, constraints=[avoid_furniture].
Planning
Sample N candidate trajectories toward kitchen. Score each against progress + semantic-map (hallway corridor polygon) + rules ("yield to people"). Mid-traverse, a new obstacle (trash bin) appears at 2.3 m and enters the cost on the next planning tick: sampled trajectories that pass within the bin's inflation radius score badly, and a 5° offset spline wins. The rover rejoins the subgoal line at 3.1 m.
Reflex
Continuously polling. Nothing triggered. Quietly heroic.
Cognition
Multi-view success detection on "plate placed on back" → button confirms → emit return mission.
Reflection
Overnight: "Trash bin observed in hallway at 2.3 m on three consecutive runs. Propose rule: route around bin location by default until visually confirmed clear."
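The Planning step of the trace can be checked with rough arithmetic. The sketch below simplifies heavily: candidates are straight rays instead of splines, the winner is a plain argmin rather than an MPPI blend, and the bin's lateral offset, the inflation radius, and the collision penalty are assumed values that do not appear in the trace.

```python
import math

BIN = (2.3, -0.05)       # bin 2.3 m ahead; the 5 cm lateral offset is assumed
INFLATION_RADIUS = 0.20  # assumed inflation radius in metres


def clearance(heading_deg, obstacle):
    """Perpendicular distance from the obstacle to a ray leaving the origin.

    Valid here because the obstacle sits ahead of the rover on every ray.
    """
    th = math.radians(heading_deg)
    ox, oy = obstacle
    return abs(ox * math.sin(th) - oy * math.cos(th))


def cost(heading_deg):
    """Progress term (deviation from the subgoal line) plus a collision term."""
    progress = abs(heading_deg)
    collision = 1000.0 if clearance(heading_deg, BIN) < INFLATION_RADIUS else 0.0
    return progress + collision


candidates = [-10, -5, 0, 5, 10]  # heading offsets in degrees
best = min(candidates, key=cost)
```

With these assumed numbers the straight-ahead ray and the -5° ray both pass inside the inflation radius, so the +5° ray carries the lowest cost, which is consistent with the offset the trace describes.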