Photo by Maël BALLAND
Remember that Friends episode where Phoebe tries to cover for Joey at a French-speaking audition?
She insists that he’s fluent, while he talks nonsense.
As she tells the director, “C'est mon petit frère. Il est un peu retardé.” (“He’s my little brother. He’s a bit slow.”)
What happened this week?
This week we had a new release from Anthropic, an Apple Intelligence tryout, and I finished my homework for an Agentic AI course.
The latest Anthropic tech is pretty sick, but it's really just a bit of RPA (robotic process automation) slapped on top of their Claude models.
Apple Intelligence underdelivers. It doesn’t understand half of the commands and certainly doesn’t converse like ChatGPT.
Now that I’m done dissing Apple and Anthropic, what about my very own Agentic AI course project?
I wanted to be hands-on with agentic AI and understand a bit better where things really stand. I took a 5-week course on Maven with Amir Feizpour. The course was excellent. We had to do a homework assignment with other students. Cohort learning is a great idea, but in practice my partner disappeared after a week. Out of 70 students, 6 submitted the homework project.
You can see the YouTube video of my project from demo day -
I definitely learned a lot.
I am just at the beginning of my own learning curve and there is so much to learn.
There is an adrenaline rush in getting your hands dirty at the early stages of the tech hype cycle.
You get to feel from up close how bad the situation really is.
You get to pop the hood and poke around the engine yourself.
It’s so much more fun than reading a white paper from AWS.
The good and bad news of Agentic AI
The good news about agentic AI is that it’s not science fiction, and that a bunch of really smart people are doing incredibly creative stuff and moving very quickly.
The bad news is that we’re only about 20% into the agentic AI hype curve, with another 80% ahead of us. And it’s not a flat highway ahead; it’s a steep mountain with a 10% gradient.
It’s like Mont Ventoux in the Tour De France. Mont Ventoux has crazy ascents.
The climb begins in Bédoin and is 21.4 kilometers, with an average gradient of 7.6%.
That’s crazy enough but averages are deceiving.
The first 6 km is a relatively gentle ascent, providing a good warm-up.
The next 10 km goes through a dense forest, with 9-10% gradients.
In the final 5 km, cyclists face a barren, lunar-like landscape with gradients averaging around 8% and up to 10%.
Agentic AI came out in mid-2023. We’re still in the warm-up climb. We still have a dense forest and a lunar landscape with a 10% gradient to climb ahead of us.
What is agentic AI?
Like the name says.
Agentic AI refers to autonomous systems of “AI agents” capable of setting goals, making decisions, and executing tasks with minimal human intervention.
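To make that definition concrete, here is a minimal sketch of the core agent loop (decide on the next action, execute it, repeat until the goal is met). The `plan_next_step` and `execute` functions are made-up stand-ins for an LLM call and a tool call:

```python
# Minimal sketch of an agent loop: the agent pursues a goal by repeatedly
# deciding on the next action and executing it, with no human in the loop.
# `plan_next_step` stands in for an LLM call; `execute` for a tool call.

def plan_next_step(goal, history):
    # Hypothetical "brain": a real system would prompt an LLM here.
    if "flights" not in history:
        return "search_flights"
    if "hotels" not in history:
        return "search_hotels"
    return "done"

def execute(action):
    # Hypothetical tools: a real system would call external APIs here.
    return {"search_flights": "flights", "search_hotels": "hotels"}[action]

def run_agent(goal, max_steps=10):
    history = []
    for _ in range(max_steps):  # cap steps so the agent can't loop forever
        action = plan_next_step(goal, history)
        if action == "done":
            break
        history.append(execute(action))
    return history

print(run_agent("plan a vacation in Paris"))  # ['flights', 'hotels']
```

Everything interesting in a real system hides inside those two stub functions; the loop itself is the easy part.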
This is not a new idea. The idea of autonomous software agents talking to each other has been around for decades.
The first AI chatbot was called ELIZA.
ELIZA was developed by MIT professor Joseph Weizenbaum between 1964 and 1966, and first described publicly in 1966.
ELIZA simulated conversation by using pattern matching and substitution, creating the illusion of understanding. You can try it out over here: https://www.masswerk.at/elizabot/
Its most famous script, DOCTOR, emulated a Rogerian psychotherapist, reflecting users' inputs back to them in a therapeutic manner. Like this:
ELIZA: Please tell me what's been bothering you.
YOU: I feel sad that Agentic AI is hyped so much
ELIZA: Do you enjoy feeling sad that agentic ai is hyped so much ?
YOU: Of course not. It pisses me off
ELIZA: Please go on.
YOU: Why do people use so much hype?
ELIZA: Does that question interest you ?
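ELIZA’s trick is easy to reproduce. Here is a toy sketch of the pattern-matching-and-substitution idea, with two made-up rules (not Weizenbaum’s actual DOCTOR script):

```python
import re

# Toy ELIZA-style rules: match a pattern, reflect the captured text back.
# These two rules are invented for illustration; the real DOCTOR script
# had dozens of ranked patterns plus pronoun-swapping rules.
RULES = [
    (re.compile(r"i feel (.*)", re.I), "Do you often feel {0}?"),
    (re.compile(r"why (.*)", re.I), "Does that question interest you?"),
]

def respond(text):
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return "Please go on."  # default reply when nothing matches

print(respond("I feel sad that agentic AI is hyped so much"))
# Do you often feel sad that agentic AI is hyped so much?
print(respond("It pisses me off"))
# Please go on.
```

Sixty years later, the illusion of understanding is much better, but the architecture question — what is actually going on under the hood — is the same one.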
The great-grandchildren of ELIZA are mainstream. For example:
In banking, AI agents handle customer inquiries about loan statuses, fraud alerts, and transaction disputes without human intervention.
In e-commerce systems, AI agents provide self-service support, aid in the customer journeys, make product recommendations, and assist with returns.
In healthcare, AI agents are beginning to be used for appointment management; AI agents can monitor cancellations and instantly notify patients to fill gaps.
But when we look closer, there are not a lot of Agentic AI systems in production applications, just like there are not a lot of autonomous cars on the road today.
In this essay, I will talk about the big agentic AI hoax.
The big Agentic AI hoax
Why do I say that?
High numbers of demos do not make a technology mainstream
The number of demos for planning a vacation in Paris is super suspicious. This has been a perennial demo since Netscape and the invention of JavaScript. Or was it VBScript?
High amounts of noise in social media do not increase useful signals
In social media (YouTube, X, LinkedIn, Instagram, TikTok, etc.), signal << noise. Noise attracts customers and investors, but it doesn’t increase the signal.
The amount of effort needed to develop seriously useful Agentic AI applications is underestimated by 20X
The effort required to actually develop and deploy a software product involves much more than writing code.
Writing code (what Copilot and Cursor handle) is about 20% of the development effort. On top of the 100% that goes into writing, testing, and debugging software, you need another 300% to actually package a piece of software as a product that humans like you and I can use. 20% out of 400% is 5%. This means that all the AI coding hype is about optimizing 5% of the product development life cycle. And that doesn’t include product design, market research, and everything else related to verifying and validating customer requirements and the lifetime costs of customer support.
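The arithmetic above, spelled out:

```python
# Back-of-the-envelope arithmetic from the paragraph above.
development = 100      # writing + testing + debugging, as the baseline
productization = 300   # packaging, docs, deployment, support tooling
total = development + productization  # 400

coding = 0.20 * development  # the slice that coding copilots optimize
share = coding / total
print(f"{share:.0%}")  # 5%
```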
In truth, humans are bad at developing code.
Optimizing 5% of the problem with AI code copilots would not be an acceptable value proposition in any other business area.
The speed of releases masquerades as progress
The thing is, LLMs and agent frameworks are being rolled out very quickly for shock-and-awe value, to learn from the market, and to grab market share.
It’s a lot like the late 90s. Building a polished application and integrated system, with an excellent UX and a deep understanding of what users really want, requires a lot more time.
The lack of vendor-neutral standards is already biting us
I just don't see how AI can speed up the polished-product life cycle, given the complexity of the Internet infrastructure, the complexity of humans, and the lack of vendor-neutral standards.
The expressivity of the languages we use in Agentic AI development
This is a core problem in my opinion. Most of the work around Agentic AI is done in Python or ES (ECMAScript, if you look at Vercel). Python and ES are not much more advanced than IBM S/360 assembly language was 50 years ago. You write lines of code, or have a copilot write them for you, then you debug and work your way through the SDLC. While you’re doing that, budgets, customers, and markets change. I know that low-code is supposed to solve this problem, but I’m not personally familiar with any great software written in low-code.
Systems engineering
Demos like vacation planning are single-user, in-a-box applications. They use LLMs to answer questions, and search agents to get real-time information on flights, hotels, and rental cars, which feeds another LLM agent that prepares your itinerary.
Vacation demos are single-user, read-only applications that do not run transactions with other systems.
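A sketch of what those demos amount to; the two agent functions here are hypothetical stubs standing in for a search agent and an LLM agent:

```python
# Sketch of a typical vacation-planner demo: one agent fetches data,
# another "LLM" agent turns it into an itinerary. Read-only, single-user,
# no transactions with other systems -- which is exactly the criticism.

def search_agent(query):
    # Stand-in for a real-time search tool (flights, hotels, cars).
    return {"flights": ["TLV->CDG 08:00"], "hotels": ["Hotel du Nord"]}

def itinerary_agent(search_results):
    # Stand-in for an LLM that formats the results into an itinerary.
    lines = ["Your Paris itinerary:"]
    lines += [f"- Flight: {f}" for f in search_results["flights"]]
    lines += [f"- Hotel: {h}" for h in search_results["hotels"]]
    return "\n".join(lines)

print(itinerary_agent(search_agent("Paris in April")))
```

Nothing in this pipeline books, pays, cancels, or recovers from a failed reservation; that is where the other 80% of the climb lives.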
What I’d really want is a vacation planner like this:
Me: “Hi Yael, I’m thinking about a vacation in Paris sometime in April.”
Yael: “Got you covered, Danny.”
Yael would take it from there. Yael knows that I like jazz, drink wine and that April in Paris is my dream.
The agents plan a vacation for me in Paris next April: they book a couple of visits to jazz clubs, and restaurants with good wine and food where I meet beautiful French women. Yael orders the tickets. She fits it into my schedule because she manages my Google Calendar, and she stays within my budget because she manages my bank accounts. She makes hotel reservations, orders a cab for me, and helps me navigate to the hotel from CDG. My vacation planner stays with me during the trip, confirms reservations, suggests great places to eat, and takes care of getting me back to the airport on time. She would notice that the Uber app is down, talk to a Bolt driver instead, and reassure me that the driver will pick me up at 4:30 AM, exactly at the right place on the sidewalk.
Just like a perfect human personal assistant.
There is an incredible amount of systems engineering and systems integration to do this sort of thing in a reliable manner, recover from errors, manage follow-ups with the human and make the human feel pampered and taken care of.
This is far beyond a science project like planning a trip to Paris - which you could do with GPT last year with a bit of scripting. I think planning trips to Paris is about where browser helpers were back in 2002.
State management is much harder than you think
I'm using LangChain and LangGraph. LangChain is pretty mature. LangGraph is a year old; it's a low-code development platform for Agentic AI. It's powerful, but core functionality like state management between agents is hard, maybe because state management in distributed systems is a hard problem to solve regardless of how you do it.
Ad hoc development
Agentic AI systems are being developed ad hoc today.
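To make the state-management point concrete, here is a minimal sketch of the kind of hand-rolled state passing between agents that every team currently reinvents. This is plain Python with a shared dict, not LangGraph's actual API, and the agent functions are made up:

```python
# Hand-rolled state passing between two "agents". Shared state is just a
# dict that each agent reads and mutates -- fine in a single process, but
# the hard parts (concurrent writes, retries, partial failures across
# machines) are exactly what this sketch hides.

def research_agent(state):
    # Pretend to gather facts; write them into shared state.
    state["facts"] = ["April in Paris is mild"]
    return state

def writer_agent(state):
    # Consume the researcher's output from the same shared state.
    state["summary"] = "; ".join(state["facts"])
    return state

state = {}
for agent in (research_agent, writer_agent):  # fixed, hand-coded order
    state = agent(state)

print(state["summary"])  # April in Paris is mild
```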
A lot of people are reinventing the wheel instead of learning from the data.
People think about a problem to be solved with agentic AI and write Python code, but humans are not good at writing code. The alternative is to bring data and train a neural network to run multiple language agents and tools to solve the problem, instead of writing a lot of low-level Python code. Then we would deploy the neural network, which would run the agents in a standard multi-agent architecture. Andrej Karpathy talked about this approach, calling it Software 2.0.
User requirements for agentic AI
This is more subtle, but it’s a basic problem. Until an AI can guess what a user really needs, we’re going to have to live with “What I said is not what I meant, and what you understood is not what I said.” There is no reason to believe that a machine trained on human language can understand humans better than other humans can. Correct me if I’m wrong.
As Barry Boehm wrote decades ago:
V&V - Validate that you’re solving the right problem. Verify that you’re solving it right.
The reality is that we still have a long way to go on V&V of agentic AI systems.
Conclusion
Agentic AI came out about a year ago, in mid-2023. We’re still in the warm-up climb. We still have a dense forest and a lunar landscape with a 10% gradient ahead of us.
As autonomous vehicles show, it seems to take about 10 years for a technology pilot to turn into a product.
Expect Agentic AI to reach production sometime around 2033.
Until then, unfortunately, Agentic AI is a hoax.
My agentic AI science project for a book launch:
Maya Murad from IBM on planning vacations -
Alex Finn on planning vacations with Anthropic
Andrej Karpathy on Software 2.0