NVIDIA's ASPIRE framework distills validated coding agent fixes into reusable skills
ASPIRE is a self-improving code-as-policy framework for agents built on Claude Code. Instead of discarding a debugged fix after a single task, it distills verified repairs into a reusable skill library that transfers to new tasks.
Impact: Medium
Why it matters
Implement ASPIRE's skill-distillation pattern in your software agents to capture and persist working workflows across separate, unrelated sessions.
TL;DR
- Distills validated execution patches into compact, reusable in-context skill guidance.
- Uses per-primitive multimodal tracing instead of generic task-level success/fail metrics.
- Achieves 31% zero-shot success on long-horizon tasks, versus 4% for the baseline.
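One way to picture a persistent skill library is a small JSON file that is written after a verified fix and rendered back into the agent's context in a later session. This is a minimal sketch, not ASPIRE's storage format: the file name `skills.json`, the helper names `save_skill` and `render_skills`, and the example skill text are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_skill(path: Path, name: str, guidance: str) -> None:
    """Add or overwrite one skill in a JSON file (hypothetical format)."""
    skills = json.loads(path.read_text()) if path.exists() else {}
    skills[name] = guidance
    path.write_text(json.dumps(skills, indent=2))


def render_skills(path: Path) -> str:
    """Render stored skills as bullet lines that can be pasted into a prompt."""
    skills = json.loads(path.read_text()) if path.exists() else {}
    return "\n".join(f"- {name}: {text}" for name, text in sorted(skills.items()))


# Illustrative usage with a throwaway directory.
library = Path(tempfile.mkdtemp()) / "skills.json"
save_skill(
    library,
    "approach_blocked_target",
    "If a direct approach fails, try several approach angles at a fixed standoff distance.",
)
context_block = render_skills(library)
print(context_block)
```

Keeping the library as plain text makes it easy to review by hand and to inject into any agent's context, whichever model is driving it.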
Key facts
- Zero-shot Long Tasks Success: 31% (vs 4% baseline)
- Real-robot Token Burn Reduction: 10x
- Robosuite Handover Success: 92% (vs 20% baseline)
Granular Failure Localization
Rather than relying on rollout-level feedback, ASPIRE runs on a closed-loop execution engine that stores the inputs, outputs, and status code of every call. When a failure is detected, the agent analyzes only the flagged calls and identifies root causes such as collision-buffer violations or invalid inputs, instead of guessing from scene summaries.
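The per-call tracing idea can be sketched as a thin wrapper that records each primitive call and then surfaces only the failed ones. This is a text-only illustration under stated assumptions: `CallRecord`, `Tracer`, and the toy `move_to` primitive are hypothetical names, not part of ASPIRE, and the multimodal side of the tracing is not modeled here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class CallRecord:
    """One traced primitive call (hypothetical schema)."""
    name: str
    inputs: dict
    output: Any
    status: str  # "ok" or "error"


@dataclass
class Tracer:
    records: list = field(default_factory=list)

    def call(self, name: str, fn: Callable[..., Any], **inputs: Any) -> Any:
        """Run fn(**inputs) and log its inputs, output, and status."""
        try:
            output, status = fn(**inputs), "ok"
        except Exception as exc:
            # On failure, the recorded output is the exception text.
            output, status = repr(exc), "error"
        self.records.append(CallRecord(name, inputs, output, status))
        return output

    def flagged(self) -> list:
        """Return only the calls whose status is not "ok"."""
        return [r for r in self.records if r.status != "ok"]


def move_to(x: float, y: float, clearance: float) -> str:
    # Toy primitive: reject targets that violate a collision buffer.
    if clearance < 0.05:
        raise ValueError(f"collision buffer violated: clearance={clearance}")
    return f"moved to ({x}, {y})"


tracer = Tracer()
tracer.call("move_to", move_to, x=0.2, y=0.1, clearance=0.10)
tracer.call("move_to", move_to, x=0.4, y=0.3, clearance=0.02)
failed = tracer.flagged()
for rec in failed:
    print(rec.name, rec.inputs, rec.status, rec.output)
```

Only the second call is flagged, so a repair step would see the exact inputs that triggered the buffer violation rather than a summary of the whole rollout.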
Evolutionary Exploration Search
To keep the agent from getting trapped in local repair loops (repeatedly applying variations of the same failed patch), ASPIRE uses evolutionary search. Each round it proposes $K$ distinct candidate programs, conditioned on the top performers so far and on the remaining failure traces, which encourages structurally diverse solutions.
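The search loop described above can be illustrated with a toy version in which each "program" is a single number and a random perturbation stands in for the LLM proposer. Everything here is an assumption for illustration: `evaluate`, `propose`, `evolve`, the `TARGET` value, and the "overshoot"/"undershoot" traces are hypothetical, and the sketch does not enforce that the `k` candidates are distinct.

```python
import random

TARGET = 0.7  # toy optimum standing in for "a program that passes"


def evaluate(candidate: float):
    """Toy rollout: return (score, failure trace) for one candidate."""
    error = candidate - TARGET
    trace = "overshoot" if error > 0 else "undershoot"
    return -abs(error), trace


def propose(elites: list, rng: random.Random) -> float:
    """Toy proposer: condition on a prior top performer and its failure trace."""
    if not elites:
        return rng.uniform(0.0, 2.0)
    _, parent, trace = rng.choice(elites)
    step = rng.uniform(0.0, 0.3)
    return parent - step if trace == "overshoot" else parent + step


def evolve(k: int = 4, rounds: int = 6, top_n: int = 2, seed: int = 0):
    rng = random.Random(seed)
    elites, history = [], []
    for _round in range(rounds):
        scored = []
        for _ in range(k):  # k candidates per round, all conditioned on prior elites
            candidate = propose(elites, rng)
            score, trace = evaluate(candidate)
            scored.append((score, candidate, trace))
        # Keep the best top_n across old elites and new candidates.
        elites = sorted(elites + scored, reverse=True)[:top_n]
        history.append(elites[0][0])
    return elites[0], history


best, history = evolve()
print("best (score, candidate, trace):", best)
```

Because the elites carry over between rounds, the best score never gets worse, while sampling several candidates per round avoids committing to a single patch lineage.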
Simulated and Real Hardware Transfer
ASPIRE was evaluated in simulation using Claude Code with Claude Opus 4.6 (1M context window) writing CaP-X code on MuJoCo Playground. Transferring the skills learned in simulation to real-world hardware (a bimanual YAM station driven by OpenAI Codex GPT-5.5) cut real-robot token consumption by up to 10x while improving task completion; for example, soda-can lifting success rose from 13/20 to 19/20 (65% to 95%).
Try it in 2 minutes
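# ASPIRE in-context skill sketch: try several approach angles around the radio.
# Assumes radio_pos, face_yaw, safe_navigate, and dist_to come from the agent's runtime.
import numpy as np

for angle_deg in [180, -90, 90, -45, 45]:
    # Candidate standoff point 0.7 from the radio at this angle
    tx = radio_pos[0] + 0.7 * np.cos(np.radians(angle_deg))
    ty = radio_pos[1] + 0.7 * np.sin(np.radians(angle_deg))
    moved = safe_navigate([tx, ty, face_yaw], f"ang_{angle_deg}")
    # Stop at the first angle that ends within 0.8 of the radio
    if moved and dist_to(radio_pos[:2]) < 0.8:
        break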
✓ When to use
- When designing long-running autonomous workflows that execute physical commands or integrate complex multi-step APIs prone to runtime failures.
✕ When NOT to use
- When your agent runs on isolated, simple APIs where code execution patterns are fully predictable and do not require runtime debugging.
What to do today
- Implement a structured metadata payload to track individual call-level inputs and outputs in your agentic workflows.
- Design a prompt template to automatically summarize successful multi-step bug resolutions into concise instructions.
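As a starting point for the prompt-template item, the sketch below fills a fixed template with the details of one verified fix. The wording of the template, the field names, and the `build_skill_prompt` helper are assumptions for illustration rather than ASPIRE's actual prompt.

```python
from string import Template

# Hypothetical distillation prompt: turns one verified fix into reusable guidance.
SKILL_PROMPT = Template(
    "You fixed a failing agent task.\n"
    "Task: $task\n"
    "Failing call: $failing_call\n"
    "Root cause: $root_cause\n"
    "Verified fix: $fix\n"
    "Write a reusable skill in at most $max_lines lines. State when it applies, "
    "what to do, and what to avoid. Leave out task-specific object names.\n"
)


def build_skill_prompt(task: str, failing_call: str, root_cause: str,
                       fix: str, max_lines: int = 5) -> str:
    """Fill the template; raises KeyError if a field is missing."""
    return SKILL_PROMPT.substitute(
        task=task,
        failing_call=failing_call,
        root_cause=root_cause,
        fix=fix,
        max_lines=max_lines,
    )


prompt = build_skill_prompt(
    task="Navigate to within reach of the radio",
    failing_call="safe_navigate([tx, ty, face_yaw])",
    root_cause="Direct approach path violated the collision buffer",
    fix="Tried approach angles around the target until one succeeded",
)
print(prompt)
```

Feeding the template from call-level records, instead of free-form logs, keeps the resulting skill tied to the specific call that failed.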