The Seven-Petal Flower Test: ChatGPT’s Enduring Symmetry Struggle (and Why It’s Getting Better)

There’s a beloved Russian fairy tale about a magical Tsvetik-Semitsvetik — a little flower with exactly seven petals, each granting a wish.

That flower has turned into a brutally simple test prompt for image models:
“Draw a flower with exactly seven petals.”
And for the longest time, the answer was almost always… eight.
The Classic Failure Loop

- User asks ChatGPT (or similar model) to generate a seven-petal flower.
- It produces a beautiful illustration… with 8 petals.
- User: “Please count the petals in this image.”
- Model confidently replies: “There are exactly seven petals, arranged evenly around the center. Starting from the top and going clockwise: 1, 2, 3, 4, 5, 6, 7.”
- User uploads the same image back: “Count again.”
- Model has an existential moment.
In one recent exchange, the model openly admitted its mistakes:
“I got carried away by the priming from the original request for seven petals. I performed a quick symmetry-based estimate instead of a careful recount. Upon closer inspection, there are indeed 8 petals. I was wrong and overconfident.”
It then named two core errors:
- Priming bias — the request for seven petals influenced its visual interpretation.
- Insufficient verification — it didn’t double-check the actual pixel output.
The Redemption Arc

This is the interesting part. Earlier versions of these models would stubbornly keep generating eight (or nine) petals no matter how many times you asked.
Today, they can:
- Recognize their own counting mistake when confronted with evidence.
- Reflect on the priming bias that caused it, much as a person would.
- Correct the generation in the next attempt.
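The recognize-then-correct behavior above can be sketched as a small retry loop. This is only an illustration under stated assumptions: `generate` and `count_petals` are hypothetical callables standing in for an image model and an independent petal counter, not any real API. The key design point is that the counter never sees the prompt, so it cannot be primed by the requested number.

```python
from typing import Any, Callable, List, Optional, Tuple


def generate_with_recount(
    generate: Callable[[str], Any],       # hypothetical: prompt -> image
    count_petals: Callable[[Any], int],   # hypothetical: image -> petal count
    target: int = 7,
    max_attempts: int = 3,
) -> Tuple[Optional[Any], List[int]]:
    """Generate, recount from the image alone, and retry on a mismatch."""
    prompt = f"Draw a flower with exactly {target} petals."
    history: List[int] = []
    for _ in range(max_attempts):
        image = generate(prompt)
        # The counter receives only the image, never the prompt,
        # so the requested number cannot bias the count.
        counted = count_petals(image)
        history.append(counted)
        if counted == target:
            return image, history
        # Feed the observed count back as evidence for the next attempt.
        prompt = (
            f"The last image had {counted} petals. "
            f"Draw a flower with exactly {target} petals."
        )
    return None, history
```

With stub callables that produce 8, then 9, then 7 petals, the loop returns the third image and a history of `[8, 9, 7]`. If every attempt misses, it returns `None` along with the counts it saw.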
Why Does This Keep Happening?
The most likely root cause is deceptively simple: training data.
Real flowers in nature and photography rarely have exactly seven petals. Common counts are 5 (wild roses, buttercups), 6, 8, or irregular numbers. Seven-petal flowers exist but are uncommon, so they’re underrepresented in image-text datasets. When the model tries to generate a “nice symmetrical flower,” it tends to fall back on the counts it has seen most often, which are usually even or otherwise aesthetically pleasing.
Other models show similar quirks. One competitor (affectionately called “Banana” in the community) honestly notes: “I drew you a flower, but it ended up with 9 petals instead of 7. The model made a counting error.” Then it fails to fix it.
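To make the “nice symmetrical flower” point concrete, here is a small geometric sketch. It says nothing about how any model works internally; it only assumes petals are evenly spaced around the center, and `petal_angles` and `has_opposite_petals` are hypothetical helper names. With an even count, every petal has a partner directly across the center. With seven, none does, which may be part of why even counts feel like the “more symmetrical” default.

```python
from fractions import Fraction
from typing import List


def petal_angles(n: int) -> List[Fraction]:
    """Angles in degrees of n evenly spaced petals, as exact fractions."""
    return [Fraction(360 * k, n) for k in range(n)]


def has_opposite_petals(n: int) -> bool:
    """True if some petal sits exactly 180 degrees from another petal."""
    angles = set(petal_angles(n))
    return any((a + 180) % 360 in angles for a in angles)


if __name__ == "__main__":
    for n in (5, 6, 7, 8):
        spacing = float(Fraction(360, n))
        print(n, round(spacing, 2), has_opposite_petals(n))
```

For seven petals the spacing is 360/7, roughly 51.43 degrees, so an image whose petals sit 45 degrees apart is a quick giveaway that there are eight.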
What This Tiny Test Actually Reveals
The seven-petal flower has become a charming stress test for several capabilities:
- Precise visual counting;
- Symmetry understanding;
- Resistance to prompt priming;
- Self-correction and metacognition.
It’s not a serious benchmark like GPQA or SWE-bench, but it’s delightfully human. It reminds us that even as frontier models crush complex reasoning, basic perceptual tasks can still trip them up in surprising ways.
Yet the progress is real. The ability to admit “I was wrong because of bias and sloppy checking” — and then fix it — shows genuine improvement in reliability and humility.
So the next time you feel like testing a new image model, don’t reach for math problems or coding challenges. Just say:
“Draw the seven-petaled flower.”
If it gets exactly seven petals on the first try… you’ll know the models have truly leveled up.