Karpathy's agent reality check
PLUS: Alibaba's 82% GPU cut and key lessons for production RAG
A leading AI researcher has delivered a sharp reality check on the state of AI agents. Andrej Karpathy called current agentic outputs "slop," arguing the technology is still a decade away from fulfilling today’s promises.
The critique from one of AI's most respected minds provides a crucial counter-narrative to the industry's hype. But while agents may underwhelm top researchers, can they still deliver significant productivity gains for the average professional?
Today in AI:
Karpathy’s AI agent reality check
Alibaba slashes GPU costs by 82%
Real-world lessons for production RAG
What’s new? Leading AI researcher Andrej Karpathy delivered a sharp reality check on the current state of AI agents, calling their output “slop” and estimating it will take a decade for the technology to meet today's promises.
What matters?
Karpathy argues that today's agents fundamentally "don't work" due to major gaps in core intelligence, multimodal understanding, and the ability to learn continuously.
He described reinforcement learning—a key technique for training agents—as "terrible" and "noise," even while acknowledging it's an improvement over previous methods.
In response, Elon Musk challenged Karpathy on X to a competition against Grok 5, though Karpathy noted he would prefer to collaborate.
You make a lot of great points, especially that children should learn the tools of physics early.
Are you down for an AI coding contest or whatever form of competition you’d like for Andrej vs Grok 5, a la Kasparov vs Deep Blue?
— Elon Musk (@elonmusk)
10:22 PM • Oct 18, 2025
Why it matters?
Coming from one of AI's most respected minds, this critique provides a crucial technical counter-narrative to the relentless industry hype. However, systems that underwhelm a top researcher may still offer massive productivity gains for the majority of professional users.
What’s new? Alibaba Cloud unveiled a new system called Aegaeon that it claims can cut the number of GPUs needed for AI inference by 82%, promising a major reduction in operational costs.
What matters?
During a three-month beta test, the system reduced the number of Nvidia H20 GPUs required from 1,192 to just 213 to serve dozens of large language models.
The system directly targets resource inefficiency, where many GPUs are allocated to serve a large number of models that are only used sporadically.
Aegaeon uses a technique called GPU power pooling, which allows computing resources to be shared dynamically across multiple models instead of being dedicated to just one.
Why it matters?
This development could dramatically lower the high cost of serving AI models, a major barrier for many companies. This efficiency gain makes building and deploying specialized AI applications more economically feasible at scale.
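The headline figure can be checked from the two GPU counts reported above. The short Python sketch below does that arithmetic, then adds a deliberately simplified toy model of why pooling helps when models are used sporadically. The toy model is an illustration only, not Aegaeon's actual scheduler: `dedicated_gpus`, `pooled_gpus`, the 5% busy fraction, and the 1.5x headroom factor are all hypothetical.

```python
import math


def gpu_reduction(before: int, after: int) -> float:
    """Fractional reduction in GPU count."""
    return (before - after) / before


# GPU counts reported for the three-month beta test.
print(f"Reported reduction: {gpu_reduction(1192, 213):.1%}")  # 82.1%


def dedicated_gpus(num_models: int) -> int:
    """Toy baseline: every model keeps one GPU reserved, busy or not."""
    return num_models


def pooled_gpus(busy_fractions: list[float], headroom: float = 1.5) -> int:
    """Toy pool: size the shared pool for average demand plus headroom.

    busy_fractions[i] is the (hypothetical) share of time model i is
    actually serving requests.
    """
    expected_demand = math.fsum(busy_fractions)
    return max(1, math.ceil(expected_demand * headroom))


# Hypothetical workload: 40 models, each busy about 5% of the time.
workload = [0.05] * 40
print("Dedicated:", dedicated_gpus(len(workload)))
print("Pooled:   ", pooled_gpus(workload))
```

The toy ignores model-loading time, memory limits, and traffic bursts, all of which a real system has to handle. It only shows why the gap between dedicated and shared allocation grows as models sit idle more of the time.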
What’s new? A developer who built a production RAG system for over 5 million documents has shared critical, hard-won lessons on what actually works, open-sourcing the findings in a new project called Agentset.
What matters?
Generating multiple queries with an LLM, rather than relying on the single user query, covers more semantic ground and significantly boosts retrieval accuracy.
Adding a reranker is one of the simplest, highest-impact improvements you can make, often compensating for a suboptimal setup by re-ordering retrieved chunks for relevance.
Many user questions don’t actually require RAG, so a simple query router can detect non-retrieval questions and answer them directly to save resources and improve speed.
Why it matters?
These lessons provide a clear, field-tested playbook for developers moving RAG systems from proof-of-concept to production scale. Focusing on high-ROI steps like reranking and advanced query generation helps teams avoid common pitfalls and build applications that deliver reliable results.
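The three lessons above fit together as one small pipeline: route, expand the query, retrieve, then rerank. The Python sketch below shows that flow and is not Agentset's code. The word-overlap retriever, the `rewrite`, `score`, and `needs_retrieval` callables, and the sample corpus are all hypothetical stand-ins for a vector store, an LLM, and a reranking model. Merging result lists with reciprocal rank fusion is one common choice made here for illustration; the source does not say how results were merged.

```python
from collections import defaultdict
from typing import Callable


def tokens(text: str) -> set[str]:
    return set(text.lower().split())


def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    scored = [(len(tokens(query) & tokens(doc)), doc) for doc in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for overlap, doc in ranked if overlap > 0][:k]


def multi_query_retrieve(query: str,
                         rewrite: Callable[[str], list[str]],
                         corpus: list[str], k: int = 3) -> list[str]:
    """Retrieve for the original query plus its rewrites, then merge
    the ranked lists with reciprocal rank fusion (constant 60)."""
    fused: dict[str, float] = defaultdict(float)
    for q in [query, *rewrite(query)]:
        for rank, doc in enumerate(retrieve(q, corpus, k)):
            fused[doc] += 1.0 / (60 + rank + 1)
    return sorted(fused, key=fused.get, reverse=True)


def rerank(query: str, docs: list[str],
           score: Callable[[str, str], float], top_n: int = 2) -> list[str]:
    """Re-order candidates by a relevance score and keep the best few."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:top_n]


def build_context(query: str,
                  needs_retrieval: Callable[[str], bool],
                  rewrite: Callable[[str], list[str]],
                  score: Callable[[str, str], float],
                  corpus: list[str]) -> list[str]:
    """Router first: an empty list means 'answer directly, skip RAG'."""
    if not needs_retrieval(query):
        return []
    candidates = multi_query_retrieve(query, rewrite, corpus)
    return rerank(query, candidates, score)


# --- Hypothetical stand-ins so the sketch runs without any services ---
corpus = [
    "refund policy allows returns within 30 days",
    "shipping takes five business days",
    "returns require the original receipt",
]


def fake_rewrite(query: str) -> list[str]:
    # A real system would ask an LLM for paraphrases of the query.
    return ["returns policy", "refund receipt"]


def overlap_score(query: str, doc: str) -> float:
    # A real system would call a reranking model here.
    return float(len(tokens(query) & tokens(doc)))


def is_lookup(query: str) -> bool:
    # A real router would be a classifier or an LLM call.
    return query.lower().split()[:1] not in (["hi"], ["hello"], ["thanks"])


print(build_context("how do I get a refund", is_lookup,
                    fake_rewrite, overlap_score, corpus))
print(build_context("hello there", is_lookup,
                    fake_rewrite, overlap_score, corpus))
```

In this toy run, the rewrites surface the receipt document, which the original wording alone would have missed. The greeting is routed away from retrieval entirely.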
Everything else in AI
Google gave Gemini access to its Maps API, allowing the model to ground responses with real-time location data like business hours and ratings.
Anthropic co-founder Jack Clark described modern AI systems as "real and mysterious creatures," cautioning that models are showing signs of situational self-awareness.
Webflow announced it will use the Astro framework to power its upcoming AI code generation feature, allowing users to build production-ready web apps from a single prompt.
Musk put the probability of xAI’s upcoming Grok 5 model achieving AGI at “10% and rising.”