
MiAI Environment Voice & Semantic Recognition Engine is Xiaomi’s in-house voice and semantic stack. In its announcement of August 11, 2022, Xiaomi identified it as the layer powering voice interaction on the CyberOne humanoid prototype. It consists of two components named by Xiaomi: the “MiAI environment voice engine” and the “MiAI semantic recognition engine”. According to that announcement, the system can recognise 85 environmental sound categories and 45 classifications of human emotion. Xiaomi has not released a public API or developer documentation; publicly, the engine exists only as an internal AI layer of the CyberOne prototype.
Perception Stack
A Perception Stack encompasses the software layers that process data from cameras, LiDARs, IMUs, microphones, and other sensors in order to recognise the surrounding environment, perform localisation, detect and track objects, and interpret the scene. It is typically the first processing stage in an autonomous robot's data pipeline, feeding its outputs to planning and control stacks.
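The layering described above can be illustrated with a minimal Python sketch. It is not taken from any particular robot or framework: the names `SensorFrame`, `SceneEstimate`, `localise`, `detect_objects`, and `PerceptionStack` are hypothetical, and the stage bodies are placeholders standing in for real localisation and detection algorithms. The only point shown is the shape of the pipeline: raw sensor readings go in, stages run in order, and a single scene estimate comes out for planning and control to consume.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SensorFrame:
    """Raw readings for one time step, keyed by sensor name (hypothetical layout)."""
    timestamp: float
    readings: Dict[str, object]


@dataclass
class SceneEstimate:
    """What the perception stack hands to planning and control."""
    timestamp: float
    pose: Tuple[float, float] = (0.0, 0.0)
    objects: List[str] = field(default_factory=list)


# A stage reads the raw frame and refines the running scene estimate.
Stage = Callable[[SensorFrame, SceneEstimate], SceneEstimate]


def localise(frame: SensorFrame, scene: SceneEstimate) -> SceneEstimate:
    # Placeholder: copy a position from the "imu" reading if one is present.
    scene.pose = frame.readings.get("imu", scene.pose)
    return scene


def detect_objects(frame: SensorFrame, scene: SceneEstimate) -> SceneEstimate:
    # Placeholder: treat the "camera" reading as a list of already-labelled objects.
    scene.objects = list(frame.readings.get("camera", []))
    return scene


class PerceptionStack:
    """Runs the configured stages in order on each incoming frame."""

    def __init__(self, stages: List[Stage]):
        self.stages = stages

    def process(self, frame: SensorFrame) -> SceneEstimate:
        scene = SceneEstimate(timestamp=frame.timestamp)
        for stage in self.stages:
            scene = stage(frame, scene)
        return scene


stack = PerceptionStack([localise, detect_objects])
frame = SensorFrame(
    timestamp=0.1,
    readings={"imu": (1.0, 2.0), "camera": ["person", "chair"]},
)
scene = stack.process(frame)
print(scene)
```

Keeping each stage as a plain function with the same signature is one possible design choice; it lets stages be added, removed, or reordered without changing the code that consumes the resulting `SceneEstimate`.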
Perception
No additional description for this role.