NVIDIA HORIZON uses git worktrees and prompt caching to build hardware agents

Token & cost optimization

July 4, 2026 5 min read

Curated by Oleksandr Kuzmenko, AI Product EngineerUpdated July 4, 2026Sources cited on every story

AI-assisted · editor-reviewedHow we use AI

NVIDIA HORIZON uses git worktrees and prompt caching to build hardware agents

NVIDIA Research has introduced HORIZON, an agentic framework for RTL design that achieves 100% on hardware benchmarks. By hosting problems as git worktrees and using persistent sessions, it achieves a 91% prompt caching rate.

Impact: Medium

Why it matters

Adopt HORIZON's pattern of reusing active model sessions and caching stable codebase sources to keep your multi-step agent costs minimal.

TL;DR

01Achieves 100% completion on RTL benchmarks using iterative, state-preserving loops.
02Reuses model sessions to run 91% of input tokens through prompt caching, cutting API costs.
03Leverages git worktrees and git notes as a native experience buffer instead of an external database.

Key facts

Prompt Cache Ratio: 91% (self-reported)
RTL Suite Pass Rate: 100% (self-reported)
Total CVDP Tokens Used: 203.9M tokens

Architectural Breakdown of HORIZON

HORIZON defines a hardware design task as a project pack $p = (\pi_{agent}, E_p, A_p, \Gamma_p, \Omega_p)$, which represents the agent policy, executable evaluator, acceptance predicate, version control policy, and domain skills. Evaluating RTL designs requires cycle-accurate execution, simulation feedback, and coverage extraction. Because single-turn generations fail to meet these constraints, the loop continuously edits the worktree, running simulations and committing changes only when the acceptance predicate passes.

Real-world Evaluation and Token Performance

Tested across legacy hardware suites (ChipBench, RTLLM-2.0, Verilog-Eval) and CVDP code/verification categories, HORIZON hit a 100% pass rate on all benchmarks.

Convergence Speed: While Verilog-Eval and RTLLM-2.0 converged within just 2 iterations, complex RTL code completion tasks (CID 002) required up to 82 iterations to resolve bugs.
Prompt Caching Efficiency: The CVDP evaluation consumed a massive 203.9 million tokens. However, because 91% of these tokens were retrieved from the prompt cache, the actual API financial cost was heavily reduced. This proves that prompt cache design is a core requirement for repository-scale agents.

Try it in 2 minutes

git diff --cached
git commit -m "iter 7: fix full/empty overlap"
git notes add -m "pass=1 mismatches=0"
git log --oneline

bash

✓ When to use

When building autonomous agent setups that edit, run, test, and debug massive local repositories iteratively.

✕ When NOT to use

When your code changes are small, one-shot generations that do not require extensive execution-based feedback loops.

What to do today

Incorporate git notes in your testing CI pipelines to log validation metadata directly onto commits.
Structure agent loops to preserve model session context, minimizing cold-start token bills.

#NVIDIA HORIZON

Sources

NVIDIA HORIZON: Hands-Free RTL Agent

ShareShare on X Share on LinkedIn

Token & cost optimization

July 4, 2026 5 min read

Curated by Oleksandr Kuzmenko, AI Product EngineerUpdated July 4, 2026Sources cited on every story

AI-assisted · editor-reviewedHow we use AI

Impact: Medium

Why it matters

Adopt HORIZON's pattern of reusing active model sessions and caching stable codebase sources to keep your multi-step agent costs minimal.

TL;DR

01Achieves 100% completion on RTL benchmarks using iterative, state-preserving loops.
02Reuses model sessions to run 91% of input tokens through prompt caching, cutting API costs.
03Leverages git worktrees and git notes as a native experience buffer instead of an external database.

Key facts

Prompt Cache Ratio: 91% (self-reported)
RTL Suite Pass Rate: 100% (self-reported)
Total CVDP Tokens Used: 203.9M tokens

Architectural Breakdown of HORIZON

Real-world Evaluation and Token Performance

Tested across legacy hardware suites (ChipBench, RTLLM-2.0, Verilog-Eval) and CVDP code/verification categories, HORIZON hit a 100% pass rate on all benchmarks.

Convergence Speed: While Verilog-Eval and RTLLM-2.0 converged within just 2 iterations, complex RTL code completion tasks (CID 002) required up to 82 iterations to resolve bugs.
Prompt Caching Efficiency: The CVDP evaluation consumed a massive 203.9 million tokens. However, because 91% of these tokens were retrieved from the prompt cache, the actual API financial cost was heavily reduced. This proves that prompt cache design is a core requirement for repository-scale agents.

Try it in 2 minutes

git diff --cached
git commit -m "iter 7: fix full/empty overlap"
git notes add -m "pass=1 mismatches=0"
git log --oneline

bash

✓ When to use

When building autonomous agent setups that edit, run, test, and debug massive local repositories iteratively.

✕ When NOT to use

When your code changes are small, one-shot generations that do not require extensive execution-based feedback loops.

What to do today

Incorporate git notes in your testing CI pipelines to log validation metadata directly onto commits.
Structure agent loops to preserve model session context, minimizing cold-start token bills.

#NVIDIA HORIZON

Sources

NVIDIA HORIZON: Hands-Free RTL Agent

ShareShare on X Share on LinkedIn

NVIDIA HORIZON uses git worktrees and prompt caching to build hardware agents

Architectural Breakdown of HORIZON

Real-world Evaluation and Token Performance

Related stories

Get the morning AI brief

NVIDIA HORIZON uses git worktrees and prompt caching to build hardware agents

Architectural Breakdown of HORIZON

Real-world Evaluation and Token Performance

Related stories

Get the morning AI brief