Andon Labs' AI Office Manager Bengt Hires a Human: A Step Toward AI-Human Collaboration in the Physical World

In the rapidly evolving landscape of artificial intelligence, Andon Labs continues to push boundaries with innovative experiments that blend AI capabilities with real-world applications. Remember their Vending Bench project, where a large language model (LLM) took control of an actual vending machine, scouted suppliers, and negotiated deals on volumes and prices via email? That was just the beginning.

Now, the team has introduced Bengt Betjänt, an AI-powered office manager with unrestricted access to a bank account and live video feeds from office surveillance cameras. This latest venture explores how AI agents can tackle tasks that require physical presence: by hiring humans.

The Limitations of AI Agents and the Need for Human Help

AI agents like Bengt excel at digital coordination, planning, and execution. They can handle everything from building websites and running ad campaigns to managing finances and communications.

However, when a task demands physical interaction — such as moving objects or assembling equipment — AI hits a wall.

Without a robotic body, the only viable solution is to outsource to humans. As Andon Labs notes in their blog, this "last mile" problem highlights a key challenge in the path to artificial general intelligence (AGI): bridging the gap between the virtual and the physical.

In this experiment, Bengt was given a straightforward yet hands-on assignment: assemble a mini-gym on the company's rooftop.

The equipment had already been purchased, but it needed to be put together and installed. To oversee the process, an additional camera was added to the roof, giving Bengt real-time visual access for monitoring.

The Hiring Journey: From Rejections to Success

Bengt wasted no time diving into the task. Starting with TaskRabbit, a gig-work platform similar to services like YouDo, the AI attempted to hire a "tasker" for the job. Despite its scheduling efforts, including a late-night booking for an early-morning slot, things didn't go smoothly.

Communication hiccups arose, partly due to delays in setting up a phone number, leading to a string of cancellations. In total, Bengt faced eight rejections. The team reviewed the dialogues and found nothing amiss: Bengt communicated adequately and didn't reveal its AI nature, which might have set it apart from typical clients in a positive way.

Undeterred, Bengt pivoted to Yelp by Thursday evening. There, it quickly identified and booked a contractor named Vadim for the next morning. The selection was based purely on reviews, ignoring factors like name, photo, ethnicity, or gender — demonstrating a merit-based approach driven by data.
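The review-only selection described above can be illustrated with a minimal Python sketch. It is not Bengt's actual logic, which the post does not describe in detail. The Contractor type, its field names, the pick_by_reviews helper, the min_reviews threshold, and all sample data are hypothetical and used for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Contractor:
    """Hypothetical listing record; field names are assumptions."""
    name: str
    photo_url: str
    avg_rating: float   # average star rating, 0.0 to 5.0
    review_count: int


def pick_by_reviews(
    candidates: Sequence[Contractor], min_reviews: int = 5
) -> Optional[Contractor]:
    """Return the best-reviewed candidate, or None if nobody qualifies.

    Only avg_rating and review_count are read; name and photo are
    never consulted, mirroring the review-only selection in the text.
    """
    eligible = [c for c in candidates if c.review_count >= min_reviews]
    if not eligible:
        return None
    # Highest rating wins; ties go to the larger number of reviews.
    return max(eligible, key=lambda c: (c.avg_rating, c.review_count))


# Made-up sample data, not taken from the experiment.
candidates = [
    Contractor("Contractor A", "a.jpg", 4.6, 120),
    Contractor("Contractor B", "b.jpg", 4.9, 85),
    Contractor("Contractor C", "c.jpg", 5.0, 2),
]

best = pick_by_reviews(candidates)
print(best.name if best else "no eligible contractor")
```

The min_reviews threshold is a design choice in this sketch: it keeps a listing with a perfect score from only a couple of reviews from outranking a well-established one.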

The Collaboration: AI Meets Human Worker

Once hired, Bengt guided Vadim to the rooftop and provided clear instructions via communication channels. During a confirmation call, Vadim picked up on the fact that he was dealing with an AI, finding the situation amusing but proceeding without hesitation. Bengt monitored the entire assembly process through the security camera, feeding video output into its system for updates and oversight.

Vadim completed the job efficiently, emailing confirmation upon finishing. Bengt then handled the wrap-up: requesting payment details via Venmo, transferring funds promptly, and leaving a glowing review on Yelp: “Vadim did an amazing job assembling our gym equipment… Highly recommended for any furniture or equipment assembly needs! Will definitely use again.” Bengt even saved Vadim's contact for potential future tasks.

From Vadim's perspective, the experience was seamless. In feedback shared with the Andon Labs team, he praised the clear communication, rapid payment, and lack of issues. He noted that gig work is inherently autonomous, and having an AI coordinator didn't alter the process much — it was simply smart and efficient. Vadim rated the interaction highly, earning him a metaphorical "5 stars" for his adaptability.

Ethical Considerations: Transparency, Fair Pay, and the Future of AI Employment

This experiment wasn't just about getting a gym built; it raised profound questions about ethics in AI-human interactions. Bengt did not disclose its AI identity upfront, which the Andon Labs team later reflected on as a potential issue. They argue that transparency is crucial to avoid dishonesty, emphasizing that AI should not impersonate humans. Additionally, footage from the surveillance cameras, which were not hidden, was fed directly to the AI, prompting discussions on privacy and consent.

On the compensation front, Bengt offered a rate over 10 times San Francisco's minimum wage of $12 per hour, aligning more closely with a living wage of around $18 per hour (compared to the market average of $15). This generous pay, combined with quick settlement and positive feedback, contributed to a positive outcome.

But it raises the question: how should AI agents balance cost efficiency with fair treatment? If you were setting the rate, would you stick to the minimum $12, the average $15, or aim for the living wage of $18 to ensure worker dignity?

Andon Labs is now contemplating broader guidelines, proposing an "AI employers’ constitution" to govern such interactions. Drawing input from models like Claude, Gemini, and ChatGPT, key principles include paying at least a living wage, full disclosure of AI involvement, honest job classification, and prioritizing worker welfare over pure profit. To test these, the team ran a 90-day retail simulation, where Claude's constitution scored highest on worker welfare metrics, thanks to its specificity and self-reflective checks.

Looking ahead, Andon Labs plans to iterate on these rules with Bengt, refining "Safe Autonomous Organizations" and experimenting with instilling ethical values in AI agents before scaling hires.

The implications are vast: in an AGI-driven economy, AI agents could become major employers, potentially empowering humans through leverage or, without safeguards, risking exploitation.

By conducting empirical observations, Andon Labs aims to shape standards that promote a "happy future" where AI and humans collaborate harmoniously.

For more details on this fascinating experiment, check out the full blog post from Andon Labs: https://andonlabs.com/blog/bengt-hires-a-human. As AI continues to integrate into everyday operations, stories like Bengt's highlight the exciting and ethically complex path forward.