OpenAI Launches GPT Image 1.5 as Users Question Real-World Performance
OpenAI has released GPT Image 1.5, a significant update to its flagship image generation model designed to compete directly with Google’s emerging tools. While the model, codenamed "Hazelnut," secured top rankings in initial benchmarks, early real-world testing has drawn mixed reactions regarding its practical utility compared to Google’s Nano Banana Pro.
Benchmark Success vs. Practical Application
The new model debuted with strong metrics, reportedly topping the LMArena leaderboard with an Elo score of 1264 in text-to-image tasks. In image editing, it ranked just behind the leader. OpenAI claims the system generates images up to four times faster than its predecessor, alongside improved instruction following and detail preservation.
Despite these benchmark results, user feedback points to a gap between leaderboard scores and actual performance. Independent comparisons noted by toolmesh.ai indicate that while GPT Image 1.5 produces high-fidelity visuals, it struggles with semantic accuracy in complex tasks. In tests involving the replication of handwritten notes, the model generated aesthetically pleasing results that failed to accurately transcribe the content, whereas Google’s Nano Banana Pro successfully captured the semantic information.
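For readers unfamiliar with what an Elo figure such as 1264 implies, the sketch below applies the textbook Elo expected-score formula (base 10, scale 400). It is an illustration only: LMArena's exact rating methodology may differ, and the rival ratings are hypothetical gaps chosen for the example, not figures reported in this article.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score of A against B (logistic curve, base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


GPT_IMAGE_15_ELO = 1264  # text-to-image score cited in the article

# Hypothetical rating gaps to a rival model; not taken from any leaderboard.
for gap in (10, 25, 50, 100):
    rival_elo = GPT_IMAGE_15_ELO - gap
    p = expected_win_rate(GPT_IMAGE_15_ELO, rival_elo)
    print(f"lead of {gap:>3} points -> expected win rate {p:.1%}")
```

Under this formula, even a 100-point lead corresponds to winning only about 64% of head-to-head votes, which helps explain how a top-ranked model can still lose individual real-world comparisons.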
This performance gap has led to criticism within the developer community, with some users characterizing the release as a case of "high scores, low ability."
Enhanced Editing and Consistency
A core focus of the GPT Image 1.5 update is precision editing. According to OpenAI researcher Mark Chen, the model is engineered to modify specific elements of an image—such as lighting, composition, or character attire—without altering the surrounding context.
Demonstrations released by the company highlight the model's ability to perform multi-round edits. In one example, the system successfully modified a photograph of two men and a dog to match a "2000s film" aesthetic, added background characters, and changed specific clothing items while maintaining the original subjects' facial consistency. The model also demonstrated the ability to isolate subjects for background removal and re-compositing into new environments, such as placing a subject into a live-stream interface.
The update also introduces improved grid consistency. In a "hell-level" test involving a 6x6 grid of 36 distinct elements, GPT Image 1.5 successfully rendered all items without the omissions or hallucinations observed in previous versions. Additionally, text rendering capabilities have been upgraded, allowing for the generation of legible code snippets, calorie infographics, and dense markdown text.
Technical Limitations and Style Regressions
Despite improvements in photorealism and text rendering, OpenAI has acknowledged regressions in specific artistic domains. The company admitted that the model’s ability to generate certain stylized outputs, such as Japanese anime or dark fantasy portraits, has degraded compared to previous iterations. Users seeking these specific aesthetics are currently advised to use preset filters or revert to the legacy model.
Further limitations appear in crowd processing and multilingual support. When editing large group photos, the model frequently distorts facial features. Additionally, while English text rendering has improved, the system remains unreliable with non-English scripts, failing to correctly render Chinese, Arabic, and Hebrew characters in testing.
Market Position and API Economics
The release comes as OpenAI faces intensifying competition from Google’s Gemini suite and open-source alternatives like Black Forest Labs’ Flux.2. To strengthen its position in the enterprise market, OpenAI has reduced API pricing for GPT Image 1.5 by 20% compared to the previous generation. This pricing strategy targets developers in e-commerce and marketing who require high-volume image generation for product variants and branding materials.
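To make the 20% reduction concrete for a high-volume workload, here is a minimal sketch of the arithmetic. The baseline per-image price and the monthly volume are hypothetical placeholders, not OpenAI's published rates; actual API billing may be structured differently.

```python
def discounted_cost(baseline_price_per_image: float, images: int, discount: float = 0.20) -> float:
    """Total cost for `images` generations after applying a fractional discount."""
    return baseline_price_per_image * images * (1.0 - discount)


# Hypothetical inputs, for illustration only.
BASELINE_PRICE = 0.04      # assumed previous-generation price per image, in USD
MONTHLY_IMAGES = 50_000    # assumed volume for an e-commerce catalog

before = BASELINE_PRICE * MONTHLY_IMAGES
after = discounted_cost(BASELINE_PRICE, MONTHLY_IMAGES)
print(f"before: ${before:,.2f}  after: ${after:,.2f}  saved: ${before - after:,.2f}")
```

Whatever the real baseline, the saving scales linearly with volume, which is why the cut matters most to the product-variant and branding workloads described above.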
Fidji Simo, OpenAI's CEO of Applications, positioned the update as part of a broader shift from text-based interfaces to dynamic, multimodal experiences. However, the competitive landscape remains tight. Comparisons circulating on social media suggest that for specific use cases, such as creating realistic e-commerce materials, Google’s Nano Banana Pro currently produces more natural-looking results.
With Google expected to release further updates to its Gemini 3.0 Flash and image tools, the sector is entering a period of rapid iteration as companies vie for dominance in both consumer and enterprise creative tools.
