OpenAI's compute power grows 15x for GPT-5

PLUS: Google’s new Gemma 3, Gemini gets a memory, and Meta trains Llama on Threads

Presented by

OpenAI is undertaking a massive infrastructure expansion, increasing its compute power 15-fold this year alone.

This immense buildout is paired with a clever new architecture designed to reduce operational costs. Does this dual focus on raw power and economic efficiency represent the new playbook for winning the AI race?

Today’s AI in Brief:
  • OpenAI's 15x compute boost for GPT-5

  • Google’s new on-device Gemma 3 model

  • Gemini chatbot now has a memory

  • Meta to train Llama on real-time Threads data

  • Apple’s new AI home robot strategy

The #1 AI Newsletter for Business Leaders

Join 400,000+ executives and professionals who trust The AI Report for daily, practical AI updates.

Built for business—not engineers—this newsletter delivers expert prompts, real-world use cases, and decision-ready insights.

No hype. No jargon. Just results.

OpenAI's Planetary Scale

What’s new? OpenAI has massively scaled its infrastructure, having increased its compute 15-fold since 2024 to power GPT-5 with over 200,000 GPUs. This immense power is managed by a new, clever architecture designed to significantly cut operational costs.

In Brief: 

The buildout to launch GPT-5 is enormous, with the company adding over 60 new clusters and 200,000+ GPUs in just 60 days to serve its 700 million users. -GPT-5 operates as a collection of models managed by a router model, which directs prompts to either a lightweight model for simple queries or a heavy-duty one for complex tasks, saving compute resources. -This efficiency-first approach comes with trade-offs, including a context window that remains at 128k tokens for paid users and an initial plan to deprecate older models that the company reversed after CEO Sam Altman admitted was a mistake.

Google's Toaster-Sized AI

What’s new? Google released Gemma 3 270M, an incredibly small open-source model designed to bring AI directly onto low-power devices. This signals a major push toward powerful, private, on-device AI applications that don't need the cloud.

In Brief:

  • Its tiny size lets it run on devices like smartphones and Raspberry Pis, enabling offline functionality and greater user privacy.

  • The model shows strong instruction-following abilities for its size, though competitors like Liquid AI are already pushing the performance benchmarks even higher in this small model category.

  • It's built to be developer-friendly and ready to fine-tune for specific tasks, with pretrained versions available on hubs like Hugging Face.

Seeking impartial news? Meet 1440.

Every day, 3.5 million readers turn to 1440 for their factual news. We sift through 100+ sources to bring you a complete summary of politics, global events, business, and culture, all in a brief 5-minute email. Enjoy an impartial news experience.

Gemini Remembers You

What’s new? Google's Gemini chatbot now has a memory, thanks to a new 'Personal Context' feature that remembers details from past chats to make future conversations more personalized and efficient.

In Brief:

  • It automatically recalls preferences and context from past conversations, eliminating the need to repeat information in new chats.

  • While the feature is opt-out by default, Google gives you control to manage and delete your saved context or use a 'Temporary Chat' for private conversations.

  • Alongside memory, Google is also boosting the rate limits for Gemini's 'Deep Think' feature, allowing for more extensive use on complex problems.

Apple's AI Desktop Robot

What’s new? Apple is planning a major push into AI hardware, according to a new report, with a lineup of home devices including a desktop robot and a completely rebuilt Siri targeting a 2026-2027 launch.

In Brief:

  • The flagship device is a desktop robot featuring a motorized arm and a display that can physically track users around a room.

  • Siri is being rebuilt from the ground up with a new personality-driven character codenamed “Bubbles,” aiming to be more engaging and useful.

  • The company also plans to release a smart home display in mid-2026 and new AI-powered security cameras designed to automate household tasks.

Meta's New Data Trove

What’s new? Meta's Threads just crossed 400 million monthly active users, giving the company a massive, real-time dataset to supercharge the training of its Llama AI models.

In Brief:

  • This user growth isn't just a social media win; it's a strategic AI asset that provides a continuous stream of public conversational data.

  • The move mirrors how X’s data reportedly enhanced xAI’s Grok, suggesting Meta can now use a similar strategy to refine its own models with fresh, relevant text.

  • Access to this real-time text dataset could significantly improve Llama's ability to understand current events, slang, and evolving public discourse.

Everything else in AI

Apple is testing Anthropic's Claude model as a potential backup for its next-generation Siri, assigning it the internal codename 'Glenwood' to supplement its primary AI development.

Meta faces reports of significant culture issues within its AI unit, a factor that may be creating an opening for competitors like Microsoft to successfully poach key researchers.

OpenAI clarified that its new GPT-5 model features a 196k context window, a key technical detail shared by CEO Sam Altman amidst broader updates to the model's rollout.

Igor Babuschkin announced his exit from xAI to launch Babuschkin Ventures, a VC firm focused on backing startups that promise to make AI safe and beneficial for humanity.

DeepSeek delayed its next AI model, reportedly due to challenges with using local Chinese Huawei chips as US accelerators remain scarce.

Start learning AI in 2025

Keeping up with AI is hard – we get it!

That’s why over 1M professionals read Superhuman AI to stay ahead.

  • Get daily AI news, tools, and tutorials

  • Learn new AI skills you can use at work in 3 mins a day

  • Become 10X more productive

AI Tool Picks

  • Motion: Uses AI to auto-schedule your tasks, optimize your daily calendar, and shift priorities as work changes—perfect for busy professionals who juggle meetings, deep work, and unexpected to-dos.

  • Uplifted: AI command center for creative teams—auto-tags, organizes, analyzes, and remixes creative assets (like ads, videos, campaigns), so you can quickly reuse, version, and collaborate. Great for startups and agencies.

  • Otter.ai (Enhanced 2025 Edition): Real-time AI meeting transcription and summarization, supports video calls, in-person meetings, and auto-generates action lists. Perfect for capturing every detail without having to take notes.

  • Gamma: Instantly turns bullet points and ideas into full, professional presentations with AI-generated design, layout, and notes, eliminating “blank slide anxiety.”

Bonus Tools

  • Fluidwave: A new AI personal assistant focused on project and task management—automates tracking, updating, and prioritizing to help you get more done with less effort.

Your opinion matters!