- Zero-Shot Capability: Explains how NEO performs unseen tasks like ironing and hair brushing without prior training.
- Caption Upsampling: How 1X used AI to turn messy internet data into high-quality "tutorials" for NEO.
- Rejection Sampling: The safety mechanism that ensures NEO’s "dreams" don't violate the laws of physics.
- Industrial Scale: Details the 10,000-robot deployment with EQT starting in 2026.
For the last few years, the robotics world has been obsessed with Vision-Language-Action (VLA) models. From Google DeepMind's RT-2 to Figure AI's Helix, we've seen machines that can "see" a cup and "understand" the command to pick it up. But despite the hype, these models have a hidden leash: they are world-class mimics. They require thousands of hours of human teleoperation before they can perform even a simple task on their own.
On January 13, 2026, 1X Technologies decided to cut the leash. With the launch of the 1X World Model (1XWM), they aren't just giving their NEO humanoid a better set of instructions. They are giving it the ability to hallucinate its own solutions.
The "Crystal Ball" vs. The "Puppet String"
To understand why this matters, you have to look at how a robot typically "thinks." Most VLAs act like a high-speed translator:
Input: "Pick up the apple" -> Output: "Move joint 4 by 12 degrees."
It’s a direct map. If the apple is upside down or the lighting changes, the "translation" often breaks.
NEO’s new World Model acts more like a chess grandmaster. Before NEO touches the apple, its internal 14-billion-parameter brain generates a "mental video" of the future. It "hallucinates" several ways the task could go, grounded in the laws of physics it learned from millions of hours of internet video.
- The World Model (WM): This is the dreamer. It visualizes the successful outcome, imagining how the apple will feel and how the table will react.
- The Inverse Dynamics Model (IDM): This is the athlete. It looks at the "dream" and instantly calculates the muscle movements needed to make it real.
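To make the dreamer/athlete split concrete, here is a minimal toy sketch in Python. It is not 1X's implementation: the "frames" are just 1-D gripper positions rather than video, and the function names `world_model`, `inverse_dynamics`, and `plan_actions` are hypothetical, chosen only for illustration. The point is the shape of the pipeline: first imagine a sequence of future states, then recover the action that links each pair of consecutive states.

```python
from typing import List


def world_model(current: float, goal: float, steps: int) -> List[float]:
    """Toy stand-in for the world model: 'imagine' future frames.

    Each frame is a 1-D gripper position, linearly interpolated from the
    current position to the goal (an assumption made for illustration).
    """
    return [current + (goal - current) * i / steps for i in range(steps + 1)]


def inverse_dynamics(frame_a: float, frame_b: float) -> float:
    """Toy stand-in for the IDM: infer the action linking two frames."""
    return frame_b - frame_a


def plan_actions(current: float, goal: float, steps: int = 4) -> List[float]:
    """Imagine the future first, then translate it into actions."""
    frames = world_model(current, goal, steps)
    return [inverse_dynamics(a, b) for a, b in zip(frames, frames[1:])]


# Executing the inferred actions should reproduce the imagined outcome.
actions = plan_actions(0.0, 1.0, steps=4)
position = 0.0
for action in actions:
    position += action
```

In this toy setup the actions are derived from the imagined frames rather than directly from the instruction, which is the key difference from the direct "input to joint command" mapping described earlier.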
Learning by Watching, Not Just Doing
This is where the game changes for 1X. Because NEO is kinematically congruent (it has the same limb proportions as you), it can watch a YouTube tutorial on how to fold a shirt and learn directly from that video. It doesn't need a human to drive it through the motion 500 times. It just needs to "watch" videos on the internet to understand the physics of fabric, then "imagine" itself doing the job.
This "Zero-Shot" capability means NEO can now handle tasks it was never explicitly programmed for, such as opening a specific type of air fryer or brushing a human’s hair, simply because it understands the concept of the task from its world-scale training.
In recent demonstrations, NEO was given prompts for tasks it had zero prior examples for:
- Operating a toilet seat (lifting and closing)
- Brushing a human's hair
- Ironing a shirt
- Packing a lunch box with unfamiliar items.
Because the model was trained on millions of hours of internet-scale video, it already has an "internal physics engine." It knows how a brush moves through hair and how a sliding door feels, even if it has never touched one.
The Self-Correction Stack
To reach this level of autonomy, 1X solved three of the biggest problems in AI robotics:
- Caption Upsampling (The Language Bridge)
Most internet videos have terrible descriptions (e.g., "Guy doing chores"). 1X used a separate AI to "upsample" these captions into rich, technical detail. This teaches NEO exactly how specific words (like "gently" or "firmly") translate into physical pressure and movement.
- Rejection Sampling (Filtering and Refining the Dreams)
When an AI "hallucinates" a future, it sometimes makes mistakes, such as an object teleporting or a hand passing through a table. To prevent NEO from trying to execute an "impossible" move, 1X uses Rejection Sampling: the IDM audits the generated video and rejects it if it fails, telling the brain to "dream again" until the path is physically possible.
- The Power of "Random Play"
While most robots only learn from "perfect" successes, NEO was trained on 400 hours of random play data. Like a child playing with blocks just to see them fall, NEO learned "common sense physics." This is how NEO stays robust in messy homes: it knows how to recover when it slips or when a cat walks in front of it.
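The "dream again" loop behind rejection sampling can be sketched in a few lines of Python. This is a toy under stated assumptions, not 1X's code: a "dream" here is a list of hand heights above a table instead of a video, the plausibility check is a pair of hand-written rules rather than an IDM audit, and the names `dream`, `is_physically_plausible`, and `rejection_sample` are hypothetical.

```python
import random
from typing import Callable, List, Optional

Trajectory = List[float]


def dream(rng: random.Random, steps: int = 5) -> Trajectory:
    """Hypothetical generator: a random walk of hand heights above a table.

    The 0.8 jump is included on purpose to mimic a 'teleporting' glitch.
    """
    height = 0.5
    trajectory = [height]
    for _ in range(steps):
        height += rng.choice([0.05, -0.05, 0.1, -0.1, 0.8])
        trajectory.append(height)
    return trajectory


def is_physically_plausible(
    trajectory: Trajectory, max_step: float = 0.2, table_height: float = 0.0
) -> bool:
    """Reject dreams with teleport-sized jumps or a hand below the table."""
    no_teleport = all(
        abs(b - a) <= max_step for a, b in zip(trajectory, trajectory[1:])
    )
    above_table = all(h >= table_height for h in trajectory)
    return no_teleport and above_table


def rejection_sample(
    generate: Callable[[], Trajectory],
    accept: Callable[[Trajectory], bool],
    max_tries: int = 100,
) -> Optional[Trajectory]:
    """Keep dreaming until a candidate passes the check, or give up."""
    for _ in range(max_tries):
        candidate = generate()
        if accept(candidate):
            return candidate
    return None


rng = random.Random(0)  # fixed seed so the run is repeatable
plan = rejection_sample(lambda: dream(rng), is_physically_plausible)
```

The `max_tries` cap is a design choice in this sketch: a real system also needs a fallback for the case where no acceptable dream turns up, which is why the function returns `None` instead of looping forever.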
The 10,000-Robot Leap
The true test of this "imagination" engine begins now. 1X has partnered with the investment giant EQT to deploy 10,000 NEO units across 300+ companies by 2030. These robots won't be coddled in labs; they’ll be heading into the "unstructured chaos" of real-world logistics and healthcare.
With an Early Access price of $20,000 (or a $499/month subscription), 1X is betting that the most valuable worker isn't the one who follows the most rules but the one who can imagine the solution.
The End of the "Puppet" Era?
We are moving from a world where we "program" robots to a world where we "prompt" them. By giving NEO a mind that can watch a YouTube tutorial and then "dream" its way through a chore, 1X has effectively ended the era of the robotic puppet.






