AI Neurodigest: The Last Two Weeks (Late May – Early June 2026)

Another packed fortnight in AI. Models keep getting smarter, smaller, and weirder, while the infrastructure money flows at absurd scale. Here’s the biggest stuff you might have missed.

LLM Highlights

Claude Opus 4.8 (Anthropic, May 28)
Anthropic dropped Opus 4.8 with a clear focus on judgment, honesty, and autonomous coding. The model is noticeably less willing to “cut corners” or hallucinate confidence, admits knowledge gaps more gracefully, and handles long-horizon agentic tasks better.

Standout features:

New “effort modes” — low-effort sometimes outperforms previous max mode.
Fast mode is now 3× cheaper while delivering 2.5× speed.
Same base pricing as 4.7, but dramatically better reliability.

MiniMax M3 (Early June)
Chinese lab MiniMax released M3 — a frontier-level model with a full 1 million token context window, native multimodality (text + image + video), and strong coding/agentic performance. They use a new Sparse Attention architecture (MSA) that makes 1M context dramatically cheaper — roughly 1/20th the compute cost of their previous generation at extreme lengths. Weights are expected to drop any day now.

Gemma 4 12B (Google, June 3)
Google open-sourced (Apache 2.0) a surprisingly capable 12B multimodal model with 256k context. The big innovation: encoder-free architecture. Instead of heavy separate vision/audio encoders, it uses simple linear projections straight into the transformer. The result runs well on laptops with 16 GB RAM, handles video + audio + images natively, and performs close to much larger models. A serious step forward for local multimodal intelligence.
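
To make the encoder-free idea concrete, here is a minimal sketch of how image patches could be turned into token embeddings with a single linear projection. It is a generic illustration, not Gemma's actual code: the patch size, embedding width, random weights, and the function names `patchify` and `project` are all assumptions.

```python
import random

def patchify(image, patch):
    """Split a 2D grayscale image (list of rows) into flattened patch vectors."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            flat = [image[top + r][left + c]
                    for r in range(patch) for c in range(patch)]
            patches.append(flat)
    return patches

def project(vectors, weights):
    """Apply one linear map; each row of `weights` yields one output dimension."""
    return [[sum(x * w for x, w in zip(vec, row)) for row in weights]
            for vec in vectors]

random.seed(0)
PATCH, D_MODEL = 4, 8  # hypothetical sizes, chosen only for the demo
image = [[random.random() for _ in range(16)] for _ in range(16)]
weights = [[random.gauss(0, 0.02) for _ in range(PATCH * PATCH)]
           for _ in range(D_MODEL)]

# No separate vision encoder: patches go straight to transformer-width tokens.
tokens = project(patchify(image, PATCH), weights)
print(len(tokens), len(tokens[0]))  # 16 tokens, each of width 8
```

The point of the design is that the only vision-specific parameters are in `weights`; everything else is left to the shared transformer.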

MAI-Thinking-1 (Microsoft)
Microsoft’s AI team published a rare, highly detailed 109-page technical report on training their new reasoning model — a 35B-active / ~1T-total sparse MoE. They’re not open-sourcing weights but will offer API fine-tuning access. Strong numbers on SWE-Bench, AIME math, and coding benchmarks. The report is a goldmine for anyone interested in modern training techniques.

Generative Models

Extreme Quantization Magic
Startup PrismML took FLUX.2 Klein 4B and crushed it down to true 1-bit weights. The resulting Bonsai Image 4B model weighs just ~930 MB and runs high-quality image generation directly in the browser or on an iPhone. This is one of the most impressive compression feats we’ve seen — real commercial viability for on-device diffusion.
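
A quick back-of-the-envelope check shows what these numbers imply. The sketch below assumes "4B" means exactly 4 billion parameters and uses decimal megabytes; the helper names are hypothetical.

```python
def weight_size_mb(num_params, bits_per_weight):
    """Storage for the weights alone, in decimal megabytes (1 MB = 1e6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6

def effective_bits(size_mb, num_params):
    """Average bits per parameter implied by a given file size."""
    return size_mb * 1e6 * 8 / num_params

PARAMS = 4e9  # assumption: "4B" taken as exactly 4 billion parameters

fp16_mb = weight_size_mb(PARAMS, 16)    # typical unquantized baseline
one_bit_mb = weight_size_mb(PARAMS, 1)  # lower bound for pure 1-bit weights
avg_bits = effective_bits(930, PARAMS)  # implied by the reported ~930 MB

print(f"16-bit weights: {fp16_mb:.0f} MB")
print(f"pure 1-bit weights: {one_bit_mb:.0f} MB")
print(f"implied average: {avg_bits:.2f} bits per parameter")
```

Pure 1-bit storage would come to about 500 MB, so the reported ~930 MB works out to roughly 1.86 bits per parameter on average. The gap may reflect scaling factors or components kept at higher precision; the source does not break this down.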

Other Notable Releases

Odysseus: PewDiePie’s Open-Source AI Launcher
The YouTuber dropped Odysseus, a slick self-hosted AI workspace that feels like a local version of ChatGPT/Claude. Clean UI, agentic mode, Deep Research tools, and a built-in Cookbook. It’s gaining massive traction among privacy-conscious users who want full control without sacrificing UX.

The Compute Deal of the Year
Google signed a monster contract to rent 110,000 NVIDIA Blackwell GPUs from SpaceX/xAI data centers for $920 million per month. That’s roughly $11 billion annualized from a single customer. xAI’s terrestrial compute business is already a cash-printing machine while they prepare the real moonshot: orbital data centers.
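
Taking the reported monthly figure and GPU count at face value, the annualized total and the implied per-GPU rate follow from simple arithmetic. The 730 hours per month (8,760 hours divided by 12) and full utilization are assumptions, not details from the deal.

```python
MONTHLY_USD = 920e6    # reported monthly payment
GPU_COUNT = 110_000    # reported number of Blackwell GPUs
HOURS_PER_MONTH = 730  # assumption: 8,760 hours / 12, GPUs rented around the clock

def annualized(monthly_usd):
    """Twelve months at the same monthly rate."""
    return monthly_usd * 12

def per_gpu_hour(monthly_usd, gpus, hours):
    """Implied hourly price of a single GPU."""
    return monthly_usd / gpus / hours

annual = annualized(MONTHLY_USD)
hourly = per_gpu_hour(MONTHLY_USD, GPU_COUNT, HOURS_PER_MONTH)

print(f"annualized: ${annual / 1e9:.2f}B")   # annualized: $11.04B
print(f"per GPU-hour: ${hourly:.2f}")        # per GPU-hour: $11.46
```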

---

Quick Takeaway

We’re in the phase where frontier capability is spreading fast — both downward in size (Gemma 4 12B, 1-bit FLUX) and upward in context/reasoning (Opus 4.8, MiniMax M3, MAI-Thinking-1). At the same time, the infrastructure layer is consolidating around a handful of players with vertical integration advantages.

The next two weeks will probably be just as wild. Stay tuned.
