OpenAI's Sora Android App Developed by Four Engineers in 28 Days, Aided by AI

Dr. Aurora Chen

OpenAI's Android version of the Sora application was developed by a four-person engineering team in 28 days, with approximately 85% of the code generated by an AI agent named Codex. The app, launched in November, quickly became the top download on the Google Play Store, with users generating over 1 million videos within 24 hours.

AI-Assisted Development of Sora Android

The development sprint for the Sora Android app took place between October 8 and November 5. During this period, the four-member team collaborated with Codex, consuming an estimated 5 billion tokens. The application achieved a 99.9% crash-free rate upon release, despite its rapid development. The team utilized an early version of the GPT-5.1-Codex model.

Codex, OpenAI's AI coding agent, has become integral to the company's internal development, accounting for 70% of its weekly pull requests five months after its initial release.

Agile Development and Brooks's Law

OpenAI opted for a small, agile team to develop the Sora Android app, in keeping with Brooks's Law, Fred Brooks's observation that adding people to a late software project only makes it later. Instead of adding headcount, the company equipped its four engineers with Codex to maximize individual productivity. This approach allowed them to ship an internal build of Sora Android within 18 days, followed by a public release 10 days later.

AI Iterates AI: Self-Evolution

Within OpenAI, engineers frequently use Codex, including its open-source command-line interface. Alexander Embiricos, Codex Product Lead, noted that Codex already writes much of the research harness for its own training runs and handles user feedback, effectively "deciding" its next steps; the company is also exploring having it monitor its own training process.

This recursive development model, where tools are used to create better tools, mirrors historical advancements in computing. For instance, early integrated circuits designed manually eventually led to electronic design automation (EDA) software that enabled the creation of far more complex circuits. OpenAI's use of Codex to build and improve Codex follows a similar pattern, with each generation of tool capabilities feeding into the next.

Collaboration with Codex

OpenAI engineers approach Codex as a "newly hired senior engineer," focusing their efforts on directing and reviewing code rather than writing it from scratch. This method, which AI researcher Simon Willison has termed "vibe engineering," emphasizes human oversight and iterative feedback loops.

Codex excels at understanding large codebases, generating unit tests for various edge cases, and responding to feedback, such as debugging based on CI failure logs. It also allows for massive parallelism, enabling developers to test multiple ideas simultaneously. During design discussions, Codex serves as a generative tool, identifying potential failure points and proposing solutions, such as memory optimization for video players.
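To make the memory-optimization point concrete, the sketch below shows the kind of change such a suggestion might lead to: a bounded, least-recently-used pool that releases video players as a feed scrolls. It is illustrative only; the `Player` interface and `BoundedPlayerPool` class are assumptions, not code from the Sora app.

```kotlin
// Illustrative sketch only: cap how many video players stay alive in a scrolling feed,
// releasing the least-recently-used one once the cap is exceeded.
// Player and BoundedPlayerPool are hypothetical, not Sora's actual classes.
interface Player {
    fun prepare(uri: String)
    fun release()
}

class BoundedPlayerPool(
    private val maxActive: Int,
    private val factory: () -> Player,
) {
    // LinkedHashMap with accessOrder = true gives simple LRU behaviour.
    private val active = object : LinkedHashMap<String, Player>(16, 0.75f, true) {
        override fun removeEldestEntry(eldest: MutableMap.MutableEntry<String, Player>): Boolean {
            val evict = size > maxActive
            if (evict) eldest.value.release() // free decoders and buffers before dropping the entry
            return evict
        }
    }

    fun acquire(uri: String): Player =
        active.getOrPut(uri) { factory().also { it.prepare(uri) } }

    fun releaseAll() {
        active.values.forEach { it.release() }
        active.clear()
    }
}
```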

However, Codex requires human guidance for tasks involving inferring unknowns, such as preferred architectural patterns, product strategies, or unwritten internal rules. It also cannot assess the user experience of an application on a physical device. To address this, OpenAI engineers provide Codex with clear goals, constraints, and explicit rules, often embedding AGENTS.md guidance files throughout the codebase to ensure consistent context.
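The contents of these files are not public, but a guidance file of this kind might look roughly like the following; the module names, conventions, and Gradle task are assumptions for illustration, not OpenAI's actual rules.

```markdown
# AGENTS.md — feed module (illustrative example only)

## Conventions
- UI is Jetpack Compose; do not introduce XML layouts.
- All network calls go through FeedRepository; never call the API client from composables.
- Every bug fix ships with a regression test under feed/src/test.

## Checks before proposing a pull request
- ./gradlew :feed:testDebugUnitTest must pass.
- New user-facing strings go in resources, never hard-coded literals.
```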

Structured Approach and Cross-Platform Capabilities

The OpenAI team found that providing a well-planned foundation, including defining the app's architecture, modularity, and core functionalities, was crucial for Codex's effectiveness. Attempting a "zero-shot" approach, where Codex was simply instructed to "Build the Sora Android App based on the iOS code," proved unsuccessful due to a lack of context regarding endpoints, data, and user flows.

The development workflow evolved to include a planning phase where Codex helps clarify system and code interactions before execution. This involves asking Codex to summarize related files and data flows, with manual refinement of its understanding. The resulting plan, akin to a miniature design document, guides Codex's step-by-step implementation.
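As a rough illustration of that workflow, a planning exchange might look like the example below; the prompt wording, file names, and plan items are hypothetical, not taken from the Sora project.

```text
Prompt: "Summarize the files and data flows involved in video upload on iOS
(e.g. UploadManager.swift, MediaPipeline.swift) and the matching backend
endpoints, then propose a step-by-step plan for the Android implementation."

Refined plan (mini design document):
1. Add an UploadManager to the :media module mirroring the iOS state machine
   (queued -> transcoding -> uploading -> published).
2. Reuse the existing backend upload endpoint; add retry with exponential backoff.
3. Unit-test every state transition; add an integration test against a fake server.
```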

OpenAI leveraged its existing iOS codebase by pointing Codex to both iOS and backend codebases to inform the Android development. The team noted that underlying application logic is portable across platforms, and specific examples provide powerful context for Codex. By placing iOS, backend, and Android repositories in the same environment, engineers could prompt Codex to implement equivalent behavior on Android based on iOS implementations.
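A minimal sketch of what such a cross-platform port might produce, assuming a hypothetical cursor-based pager in the iOS codebase; the names, fields, and contract below are assumptions, not Sora's actual code.

```kotlin
// Hypothetical result of asking Codex to mirror an iOS FeedPager.swift on Android.
// The API shape (cursor-based paging, empty list when exhausted) is assumed for illustration.
data class FeedPage(val videoIds: List<String>, val nextCursor: String?)

class FeedPager(private val fetch: suspend (cursor: String?) -> FeedPage) {
    private var cursor: String? = null
    private var exhausted = false

    /** Loads the next page; returns an empty list once the feed is exhausted, like the iOS pager. */
    suspend fun loadNext(): List<String> {
        if (exhausted) return emptyList()
        val page = fetch(cursor)
        cursor = page.nextCursor
        exhausted = page.nextCursor == null
        return page.videoIds
    }
}
```

In practice, the value comes less from the ported class itself than from Codex being able to read the iOS contract directly rather than having it re-specified in prose.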

Retrospective and Future Implications

At the conclusion of the 28-day sprint, AI-assisted development had become a standard practice for OpenAI. Ed Bayes, a Codex team designer, highlighted the integration of Codex with project management tools like Linear and communication platforms such as Slack, allowing team members to assign programming tasks directly to the AI agent. This integration fosters a "colleague relationship" where Codex can propose pull requests and participate in review cycles.

While Codex significantly enhances development speed, OpenAI emphasizes that human understanding of systems and long-term collaboration with AI remain essential. The company believes that AI-assisted programming empowers developers to focus on higher-leverage tasks, returning to the core aspects of software engineering. OpenAI hopes its experience will inspire broader adoption and further development of AI tools like Codex.