Intro
The AgiBot Expedition A3 stands 1,730 mm tall and weighs 55 kg. Its structure combines lightweight magnesium, titanium, and TPU materials, yielding a power-to-weight ratio of 0.218 kW/kg, which AgiBot states was the highest in its product class at launch. The legs use a lightweight, exoskeleton-inspired design that improves dynamic stability and agility during high-speed and airborne motion, and the waist joint offers a range of rotational and lateral motion close to that of the human torso, enabling the fluid trunk movements needed for martial-arts-style manoeuvres and dance performances.

AgiBot has not published a per-segment degrees-of-freedom breakdown, describing the body only as incorporating highly anthropomorphic full-body articulation across all segments. All joints are driven by high-power-density actuators capable of explosive torque output without destabilising the platform, and the control system runs high-frequency balance algorithms that adjust the centre of mass in real time during transient airborne phases.
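The headline power-to-weight figure can be sanity-checked with simple arithmetic: at 55 kg and 0.218 kW/kg, the implied aggregate actuator power is roughly 12 kW. A minimal sketch using only the figures quoted above (the function name is illustrative, not an AgiBot API):

```python
def power_to_weight(total_power_kw: float, mass_kg: float) -> float:
    """Power-to-weight ratio in kW/kg."""
    return total_power_kw / mass_kg

# Working backwards from the published ratio and mass:
mass_kg = 55.0
ratio_kw_per_kg = 0.218
total_power_kw = ratio_kw_per_kg * mass_kg  # implied aggregate output, ~12 kW
print(f"Implied total actuator power: {total_power_kw:.2f} kW")
```

This is only an inferred total; AgiBot quotes the ratio, not the per-joint or aggregate power figures.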
The A3 operates for up to 10 hours on a single battery pack, the longest runtime in AgiBot's bipedal humanoid lineup, and a hot-swap system completes a full battery exchange in 10 seconds. For fleet operation, the A3 uses Ultra-Wideband (UWB) radio positioning with centimetre-level accuracy, allowing up to 100 units to execute precisely coordinated choreography in a shared physical space.

Shoulder-mounted tactile sensors detect physical contact and trigger reactive responses, while a 360-degree multi-array microphone system captures voice input from any direction, so the speaker does not need to face the robot. The AI stack comprises three models: the GCFM (Generative Control Foundation Model), which converts text, audio, or video input into real-time physical motion sequences; the WITA Omni model, which fuses vision, audio, language, and physical action to deliver context-aware, emotionally expressive multimodal interaction; and the BFM (Behavioural Foundation Model), which lets the A3 imitate a new human movement from a single demonstration video, so new choreographic routines can be deployed without prolonged training cycles.
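AgiBot does not disclose how the A3's UWB positioning is computed, but the standard approach to turning anchor ranges into a position fix is multilateration: each fixed anchor measures its distance to the robot's tag, and the range equations are linearized and solved in a least-squares sense. The sketch below shows the classic 2D three-anchor case with an exact solve via Cramer's rule; the anchor layout and coordinates are hypothetical, purely for illustration:

```python
import math

def trilaterate_2d(anchors, ranges):
    """Position fix from three UWB anchor ranges in 2D.

    Linearizes the circle equations against the first anchor,
    then solves the resulting 2x2 linear system (Cramer's rule).
    """
    (x0, y0), (x1, y1), (x2, y2) = anchors
    r0, r1, r2 = ranges
    # Coefficients of 2*(xi - x0)*x + 2*(yi - y0)*y = b_i
    a11, a12 = 2 * (x1 - x0), 2 * (y1 - y0)
    a21, a22 = 2 * (x2 - x0), 2 * (y2 - y0)
    b1 = r0**2 - r1**2 + x1**2 - x0**2 + y1**2 - y0**2
    b2 = r0**2 - r2**2 + x2**2 - x0**2 + y2**2 - y0**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-9:
        raise ValueError("anchors are collinear; position is ambiguous")
    x = (b1 * a22 - b2 * a12) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y

# Hypothetical 10 m x 10 m stage with anchors in three corners;
# the robot's tag sits at (3, 4) and reports its ranges to each anchor.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
tag = (3.0, 4.0)
ranges = [math.dist(a, tag) for a in anchors]
print(trilaterate_2d(anchors, ranges))  # recovers approximately (3.0, 4.0)
```

A production system would use more than three anchors with a least-squares or Kalman-filtered solve, and real UWB ranges carry noise at the centimetre level quoted above; this sketch only illustrates the geometry.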
