TL;DR - We added a one-bit surprise gate (FEPGate) from our SIML cognitive sidecar to the frozen V-JEPA 2 planning loop.
With zero retraining and the same CEM/MPC budget, that tiny hook reshapes the latent energy surface and delivers big wins:
- Final error: 0.193 m → 0.109 m (~50% ↓)
- Monotonicity: 0.12 → 0.43 (>3× ↑)
- Per-step latency: 2.28 s → 1.13 s (≈2× faster)
- Energy/episode: 0.058 Wh → 0.029 Wh (≈2× ↓)
- EDP (energy × delay): ~4× lower
Why this matters
Even world-class latent planners waste compute when the environment isn’t surprising, and they can get stuck spiraling around spurious minima when it is. A single cognitive bit fixes both:
- Spend compute only when the world deviates from prediction (surprise ↑).
- Skip expensive re-plans when everything tracks (surprise ↓).
This is the smallest practical step toward a cognitive OS layer that governs any latent system - not just V-JEPA 2 - with a universal, model-agnostic feedback signal.
What we built
- B1 - Grip-bit: exposes the gripper open/close as a single bit to both sampling and scoring (cleaner grasp/place phases). (Ablation left to appendix/code.)
- B2 - FEPGate (the star): a SIML sidecar computes a normalized surprise score s(a; z_k) on CEM/MPC elites at latent state z_k (sketched in code after this list):
  - If s > τ → reject/penalize those elites (reshape the elite set).
  - If s ≤ τ → leave them alone (no unnecessary re-plans).
- Zero retraining: V-JEPA 2 encoder/predictor are frozen throughout.
- Same budget: horizon, samples, and iterations unchanged; we only add the bit-level hook.
- τ-calibration (obs-driven): one short warm-up per scene: sample the L1-ball at z_k, measure surprise, set τ at the 5th–10th percentile (empirically ~1e-4 … 1e-3).
- Adaptive memory (bonus): SIML schemas expand with novelty spikes and compress when surprise stays low, keeping entropy and compute in check without touching V-JEPA's weights.
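In code, the gate and its threshold calibration amount to only a few lines. Below is a minimal sketch: `surprise_fn` stands in for the SIML sidecar's s(a; z_k) call, and the L1-ball sampler mirrors the action sampling in our setup; names, shapes, and the sampling scheme are illustrative, not the exact implementation.

```python
# Minimal FEPGate sketch. `surprise_fn(a, z_k)` is a placeholder for the SIML
# sidecar call; action dimensions and sampling scheme are illustrative.
import numpy as np

def calibrate_tau(surprise_fn, z_k, action_dim=7, radius=0.075, n=256, pct=7.5, rng=None):
    """Warm-up: sample action deltas in an L1-ball, score surprise at the current
    latent z_k, and set tau at a low (5th-10th) percentile of those scores."""
    rng = rng or np.random.default_rng(0)
    dirs = rng.standard_normal((n, action_dim))
    dirs /= np.abs(dirs).sum(axis=1, keepdims=True)            # unit L1 norm
    actions = dirs * radius * rng.uniform(0, 1, size=(n, 1))   # points inside the L1-ball
    scores = np.array([surprise_fn(a, z_k) for a in actions])
    return float(np.percentile(scores, pct))  # typically lands around 1e-4 ... 1e-3

def gate_elites(surprise_fn, elites, z_k, tau):
    """Keep low-surprise elites; reject high-surprise ones before execution.
    `elites` is an (n_elites, action_dim) array of candidate actions."""
    scores = np.array([surprise_fn(a, z_k) for a in elites])
    keep = scores <= tau
    return elites[keep] if keep.any() else elites  # fall back if everything is gated
```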
How it works (30-second tour)
- Predict. V-JEPA 2-AC rolls out candidate futures ẑ_{t+1} from z_t.
- Gate elites (pre-exec). SIML computes surprise on elite candidates; high-surprise ones are gated out before execution.
- Act. Execute the best action a_t.
- Update (post-exec). Encode the new obs → z_{t+1}; SIML updates surprise/memory.
This one-bit feedback carves away bad pockets in the search space and preserves curvature where dynamics are feasible.
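In pseudocode, one control step then looks roughly like this; `planner`, `siml`, and `env` expose placeholder methods here, not the actual V-JEPA 2 or SIML APIs:

```python
# One gated control step (sketch; all method names are illustrative placeholders).
def gated_step(env, planner, siml, obs, goal_z, tau):
    z_t = planner.encode(obs)                    # frozen V-JEPA 2 encoder
    elites = planner.cem_elites(z_t, goal_z)     # predict: roll out candidate futures
    kept = [a for a in elites if siml.surprise(a, z_t) <= tau]  # pre-exec gate
    a_t = planner.best_action(kept or elites)    # act (fall back if all elites are gated)
    obs_next = env.step(a_t)                     # execute in the environment
    z_next = planner.encode(obs_next)            # post-exec: encode the new observation
    siml.update(z_t, a_t, z_next)                # update surprise / memory schemas
    return obs_next, z_next
```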
Results
Setup. 100 episodes of single-goal reaching (6–7 steps/ep); 7-DoF action deltas sampled in an L1-ball (r = 0.075). GPU power logged via NVML at 1 Hz. Single NVIDIA GPU; sidecar on CPU. No fine-tuning. Budget < $100.
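For reference, per-episode energy can be integrated from the 1 Hz NVML power samples roughly as below; this is a sketch assuming the pynvml bindings, not the exact logging script from our runs.

```python
# 1 Hz GPU power sampler via NVML (sketch; assumes the pynvml bindings).
import time
import pynvml

def sample_energy_wh(stop_event, gpu_index=0):
    """Poll GPU power at 1 Hz until `stop_event` (e.g. a threading.Event) is set,
    then return the integrated energy for the episode in Wh."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
    watts = []
    try:
        while not stop_event.is_set():
            watts.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0)  # mW -> W
            time.sleep(1.0)
    finally:
        pynvml.nvmlShutdown()
    return sum(watts) / 3600.0  # each 1 Hz sample covers ~1 s, so Wh = sum(W) * 1 s / 3600
```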
Headline wins (FEPGate vs OFF):
- Error: 0.193 m → 0.109 m
- Smooth progress: 0.12 → 0.43
- Latency: 2.28 s → 1.13 s
- Energy/ep: 0.058 Wh → 0.029 Wh
- EDP (energy × latency, product of means): OFF ≈ 0.132, FEP ≈ 0.033 → ~4× lower
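For completeness, those EDP figures are just the product of the means reported above:

$$
\mathrm{EDP}_{\mathrm{OFF}} \approx 0.058\,\mathrm{Wh} \times 2.28\,\mathrm{s} \approx 0.132\,\mathrm{Wh\cdot s},
\qquad
\mathrm{EDP}_{\mathrm{FEP}} \approx 0.029\,\mathrm{Wh} \times 1.13\,\mathrm{s} \approx 0.033\,\mathrm{Wh\cdot s},
$$

i.e. roughly a 4× reduction.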
Figure 1 - Lower error, ~2× faster decisions, and ~2× lower energy with FEPGate.
What it feels like: direct trajectories when the model is right; immediate course-corrections the instant reality diverges. No wasted search when nothing changed.
Implementation notes
- IPC: a tiny CLI (surprise, step) exchanging .npy buffers plus a small JSON payload (see the sketch below).
- Placement: gate only the elite set each CEM/MPC iteration (orders of magnitude fewer sidecar calls; same effect).
- Threshold: τ from an observation-driven warm-up; typical values in our runs land around 1e-4 … 1e-3.
- Memory: a runtime schema expand/contract policy keeps useful novelty while pruning redundancy.
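As an illustration of the IPC shape, one surprise round-trip looks roughly like this; the surprise verb comes from our CLI, but the binary name, flags, file paths, and JSON field below are hypothetical.

```python
# One sidecar round-trip over .npy buffers + JSON (binary name, flags, paths,
# and the "scores" field are illustrative, not the exact CLI contract).
import json
import os
import subprocess
import numpy as np

def elite_surprise(elites: np.ndarray, z_k: np.ndarray, workdir: str = "/tmp/siml"):
    os.makedirs(workdir, exist_ok=True)
    np.save(f"{workdir}/elites.npy", elites)   # candidate actions from CEM/MPC
    np.save(f"{workdir}/z_k.npy", z_k)         # current latent state
    out = subprocess.run(
        ["siml", "surprise",
         "--elites", f"{workdir}/elites.npy",
         "--latent", f"{workdir}/z_k.npy"],
        capture_output=True, text=True, check=True,
    )
    return np.array(json.loads(out.stdout)["scores"])  # one surprise score per elite
```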
About the Author
This research was conducted by Deep SIML Labs, an independent research lab exploring the frontiers of artificial life, cognition, and self-organizing intelligence.
To stay informed or collaborate, contact us.