Karpathy's agent reality check

PLUS: Alibaba's 82% GPU cut and key lessons for production RAG

A leading AI researcher has delivered a sharp reality check on the state of AI agents. Andrej Karpathy called current agentic outputs "slop," arguing the technology is still a decade away from fulfilling today’s promises.

The critique from one of AI's most respected minds provides a crucial counter-narrative to the industry's hype. But while agents may underwhelm top researchers, can they still deliver significant productivity gains for the average professional?

Today in AI:
  • Karpathy’s AI agent reality check

  • Alibaba slashes GPU costs by 82%

  • Real-world lessons for production RAG

What’s new? Leading AI researcher Andrej Karpathy delivered a sharp reality check on the current state of AI agents, calling their output “slop” and estimating it will take a decade for the technology to meet today's promises.

What matters?

  • Karpathy argues that today's agents fundamentally "don't work" due to major gaps in core intelligence, multimodal understanding, and the ability to learn continuously.

  • He described reinforcement learning—a key technique for training agents—as "terrible" and "noise," even while acknowledging it's an improvement over previous methods.

  • In response, Elon Musk challenged Karpathy on X to a competition against Grok 5, though Karpathy noted he would prefer to collaborate.

Why it matters?

Coming from one of AI's most respected minds, this critique provides a crucial technical counter-narrative to the relentless industry hype. However, systems that underwhelm a top researcher may still offer massive productivity gains for the majority of professional users.

What’s new? Alibaba Cloud unveiled a new system called Aegaeon that it claims can cut the number of GPUs needed for AI inference by 82%, promising a major reduction in operational costs.

What matters?

  • During a three-month beta test, the system reduced the number of Nvidia H20 GPUs required from 1,192 to just 213 to serve dozens of large language models.

  • The system directly targets resource inefficiency, where many GPUs are allocated to serve a large number of models that are only used sporadically.

  • Aegaeon uses a technique called GPU power pooling, which allows computing resources to be shared dynamically across multiple models instead of being dedicated to just one.
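A quick sanity check of the headline figure, using only the two GPU counts reported from the beta test. The variable names are illustrative; the snippet confirms the arithmetic and says nothing about how Aegaeon achieves the savings.

```python
# GPU counts reported for the three-month beta test.
gpus_before = 1192
gpus_after = 213

# Absolute and relative reduction in GPUs required.
gpus_saved = gpus_before - gpus_after
reduction = gpus_saved / gpus_before

print(f"GPUs saved: {gpus_saved}")
print(f"Reduction: {reduction:.1%}")
```

The result rounds to the 82% that Alibaba Cloud cites.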

Why it matters?

This development could dramatically lower the cost of serving AI models, a major barrier for many companies, and make building and deploying specialized AI applications more economically feasible at scale.

What’s new? A developer who built a production RAG system for over 5 million documents has shared critical, hard-won lessons on what actually works, open-sourcing the findings in a new project called Agentset.

What matters?

  • Instead of relying on a single user query, generating multiple queries with an LLM to cover more semantic ground significantly boosts retrieval accuracy.

  • Adding a reranker is one of the simplest, highest-impact improvements you can make, often compensating for a suboptimal setup by re-ordering retrieved chunks for relevance.

  • Many user questions don’t actually require RAG, so a simple query router can detect non-retrieval questions and answer them directly to save resources and improve speed.
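The three lessons above fit together roughly as follows. This is a toy sketch, not Agentset's code: every function name, the sample documents, and the synonym table are hypothetical stand-ins. In a real system an LLM would likely generate the query variants, a vector store would handle retrieval, and a cross-encoder or hosted reranking model would do the scoring.

```python
import re

# Tiny in-memory corpus standing in for a real document store.
DOCS = [
    "Rerankers reorder retrieved chunks by relevance to the query.",
    "A query router skips retrieval for greetings and small talk.",
    "Multi-query generation rewrites one question several ways.",
    "GPU pooling shares accelerators across many models.",
]
STOPWORDS = {"a", "the", "for", "to", "by", "and", "how", "do", "i", "one"}
SMALL_TALK = {"hi", "hello", "thanks", "thank", "bye"}
# Assumption: a synonym table stands in for LLM-generated rewrites.
SYNONYMS = {"rerank": "reorder", "docs": "chunks"}


def tokens(text: str) -> list[str]:
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def needs_retrieval(question: str) -> bool:
    """Toy router: short small-talk messages skip retrieval."""
    words = tokens(question)
    return not (len(words) <= 3 and any(w in SMALL_TALK for w in words))


def expand_query(question: str) -> list[str]:
    """Return the original question plus one variant per matching synonym."""
    variants = [question]
    for word, alt in SYNONYMS.items():
        if word in tokens(question):
            variants.append(re.sub(rf"\b{word}\b", alt, question, flags=re.IGNORECASE))
    return variants


def retrieve(query: str, k: int = 2) -> list[str]:
    """First-pass retrieval: rank documents by exact token overlap."""
    q = set(tokens(query))
    scored = [(len(q & set(tokens(d))), d) for d in DOCS]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for score, d in ranked if score > 0][:k]


def rerank(question: str, candidates: list[str]) -> list[str]:
    """Second-pass scoring with a different signal (prefix matches)."""
    q = tokens(question)

    def score(chunk: str) -> int:
        chunk_tokens = tokens(chunk)
        return sum(1 for w in q if any(c.startswith(w) for c in chunk_tokens))

    return sorted(candidates, key=score, reverse=True)


def answer(question: str) -> dict:
    """Route, expand, retrieve, deduplicate, then rerank."""
    if not needs_retrieval(question):
        return {"route": "direct", "chunks": []}
    candidates: list[str] = []
    for q in expand_query(question):
        for doc in retrieve(q):
            if doc not in candidates:
                candidates.append(doc)
    return {"route": "rag", "chunks": rerank(question, candidates)}


print(answer("Hello, thanks!"))
print(answer("How do I rerank docs for a query?"))
```

The reranker deliberately scores with a different signal than the first-pass retriever. That mirrors why reranking helps in practice: a slower, more precise scorer only has to look at the handful of chunks that survived retrieval.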

Why it matters?

These lessons provide a clear, field-tested playbook for developers moving RAG systems from proof-of-concept to production scale. Focusing on high-ROI steps like reranking and advanced query generation helps teams avoid common pitfalls and build applications that deliver reliable results.

Everything else in AI

Google gave Gemini access to its Maps API, allowing the model to ground responses with real-time location data like business hours and ratings.

Anthropic co-founder Jack Clark described modern AI systems as "real and mysterious creatures," cautioning that models are showing signs of situational self-awareness.

Webflow announced it will use the Astro framework to power its upcoming AI code generation feature, allowing users to build production-ready web apps from a single prompt.

Musk estimated the probability of xAI’s upcoming Grok 5 model achieving AGI at “10% and rising.”
