OpenAI's New "Strawberry" AI Is Still Making Idiotic Mistakes

OpenAI has released its long-awaited AI model, which had been hyped up under the code name "Strawberry."

With its new "human-like" ability to "reason," the AI model can tackle even more "complex tasks" and "harder problems," according to the company.
But as early testers have already discovered firsthand, it's still miles away from replacing a human scientist or coder.
In fact, if recent posts making the rounds on social media are anything to go by, the model, released as o1-preview, still often struggles with the absolute basics.
For instance, as INSA Rennes researcher Mathieu Acher found, it's still repeatedly suggesting illegal chess moves in response to certain puzzles.
Tasks as basic as counting also remain elusive. In one example flagged by Meta AI scientist Colin Fraser, Strawberry attempts to take on a rudimentary word puzzle about a farmer transporting sheep across a river — and accidentally abandons the correct answer in favor of illogical garble at the end.

"o1-preview gives the wrong answer to this prompt 75 percent of the time," one user found.
In fact, as some users are claiming, the model is even still sometimes struggling with one of the most confounding word problems for AI language models: how many times the letter "R" appears in the word "strawberry."
In all fairness, OpenAI was clear right from the start that its latest AI is still a work in progress.
"As an early model, it doesn't yet have many of the features that make ChatGPT useful, like browsing the web for information and uploading files and images," the company wrote in its announcement. "For many common cases GPT-4o will be more capable in the near term."

The time the model spends "reasoning" before it answers can extend its response time significantly. As one user found, the new AI model took 92 seconds to come up with an answer to a word riddle, only to bungle it.
OpenAI research scientist Noam Brown, who worked on the new model, argued that having it take its time could result in some groundbreaking answers.
"OpenAI's o1 thinks for seconds, but we aim for future versions to think for hours, days, even weeks," he tweeted. "Inference costs will be higher, but what cost would you pay for a new cancer drug? For breakthrough batteries? For a proof of the Riemann Hypothesis?"
Those lofty conclusions didn't sit well with noted AI critic Gary Marcus.

"This is not realistic," he wrote in response. "As you acknowledge o1 is still unreliable even at tic-tac-toe, and in some cases no better than earlier models. Longer processing times are unlikely to reach transcendent reasoning."
(To be fair, Brown also conceded that the new model is still flubbing certain answers, including ones as fundamental as tic-tac-toe.)
Marcus is tapping into a heated debate surrounding the tremendous hype gripping the AI industry.

In short, the company's latest AI still falling for the same old traps isn't exactly confidence-inducing.
OpenAI promised that it's only the beginning, though, symbolically naming its model to reset the "counter back to 1" — which, given it's stumbling right out of the gate, might end up being an appropriate name after all.