The Battle for Humanity’s Last Dataset

What do a Nigerian student strapping an iPhone to his forehead to film himself making his bed, an Indian seamstress wearing an action camera between her eyebrows in a noisy garment factory, and a 20-year-old Chinese man in a VR headset who opens and closes the same microwave door a hundred times a day have in common?

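All three are recording the raw material that robots learn from: footage of ordinary human movement. One of the companies collecting it is micro1.
Founded in Palo Alto, micro1 operates a global gig platform that hands out micro-tasks to roughly 4,000 “robotic universalists” across 70 countries. Every month this small army uploads nearly 200,000 hours of first-person video straight into the system.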
Robot companies — Tesla, Figure AI, Agility Robotics and others — are already paying micro1 and its competitors hundreds of millions of dollars a year for this footage. Yet CEO Ali Ansari is refreshingly blunt: “What we actually need is billions of hours.”
And they’re getting them.

In China there are more than 40 specialized “robot schools” where trainers wearing exoskeletons physically guide humanoid robots through everyday tasks — teaching them how to wipe a table the way only a human body knows how.
If this sounds like a dystopian scene in which flesh-and-blood humans are training their own mechanical replacements, that’s because it is.
But the story goes deeper.

The real prize is tacit knowledge: the things a body knows how to do but cannot put into words. You can ride a bicycle perfectly, yet writing down the instructions is maddeningly difficult.
A seamstress can feel the exact weight of fabric between her fingers and flick her wrist at precisely the right microsecond — but if you ask her to explain it, she’ll just shrug and say, “Like this.”
For years, AI feasted on the easy stuff: neatly digitized text, labeled images, and carefully curated internet scrapes. That was the “low-hanging fruit” dataset. Now the industry has reached the final frontier — the last dataset that can only be harvested one way: by paying real human beings to live their real lives with cameras glued to their heads.

Critics have already coined the term: data colonialism.
You can smirk at the image of Indian seamstresses selling their irreplaceable tacit knowledge for $100–200 a month. But before you laugh too hard, remember this: every keystroke I typed in 2022 was also silently harvested, packaged, and fed into large language models. I received exactly zero dollars for it.
So I’m not entirely sure who’s supposed to be laughing at whom.
The battle for the last human dataset is not coming. It is already here — and it is being won one forehead-mounted iPhone at a time.