Helix: A Vision-Language-Action Model for Generalist Humanoid Control

May 13, 2026

Helix

System 1:

-80M-parameter action expert, ~200 Hz
-low-level motor control, adjustments, stabilization, whole-body control

System 2:

-7B-parameter VLM, ~7-9 Hz
-scene understanding and language (generalization)
-Helix outputs continuous control signals directly, avoiding action-tokenization schemes that work for low-dimensional control setups but do not scale to humanoids
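
The two-rate split above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not Helix's implementation: the notes do not say how System 2 hands information to System 1, so the sketch assumes a latent vector that the fast loop holds and reuses until the slow loop refreshes it. `system2_stub`, `system1_stub`, `run`, the 8 Hz figure (picked from the ~7-9 Hz range), the gain, and the three-joint state are all hypothetical.

```python
from typing import List, Tuple

S1_HZ = 200  # fast action-expert rate (approximate, from the notes)
S2_HZ = 8    # slow VLM rate; assumed value inside the ~7-9 Hz range
TICKS_PER_S2_UPDATE = S1_HZ // S2_HZ  # 25 fast ticks per slow update


def system2_stub(update_index: int) -> List[float]:
    """Hypothetical stand-in for the VLM: emits a latent 'intent' vector."""
    return [float(update_index)]


def system1_stub(latent: List[float], joints: List[float]) -> List[float]:
    """Hypothetical stand-in for the action expert.

    Returns one continuous (float) command per joint, not a discrete token.
    """
    gain = 0.1  # assumed value, for illustration only
    return [gain * (latent[0] - q) for q in joints]


def run(duration_s: float, num_joints: int = 3) -> Tuple[int, int, List[float]]:
    """Simulate the fast loop, refreshing the latent at the slow rate."""
    joints = [0.0] * num_joints
    latent = [0.0]
    s1_steps = 0
    s2_updates = 0
    total_ticks = int(round(duration_s * S1_HZ))
    for tick in range(total_ticks):
        # Slow path: System 2 refreshes the latent every 25 fast ticks.
        if tick % TICKS_PER_S2_UPDATE == 0:
            latent = system2_stub(tick // TICKS_PER_S2_UPDATE)
            s2_updates += 1
        # Fast path: System 1 runs every tick on the most recent latent.
        action = system1_stub(latent, joints)
        joints = [q + a for q, a in zip(joints, action)]
        s1_steps += 1
    return s1_steps, s2_updates, joints


if __name__ == "__main__":
    steps, updates, final_joints = run(1.0)
    print(steps, updates, final_joints)
```

In this sketch the fast loop never waits on the slow one; it keeps acting on the last latent it received, which is one plausible way to combine a ~200 Hz controller with a ~7-9 Hz model.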