AI Agents in Practice

AI agents: when to use them and when not to

Part 2 of AI Agents in Practice. What are AI agents good at? And where do things go wrong? The do's and don'ts based on practical experience.

Not the solution for everything

AI agents are pitched as an autonomous workforce. The reality is more nuanced, and most use cases fail.

Bottom-up works, top-down fails

Agents excel at repetitive tasks with clear rules. They fail at strategic decisions and ambiguity.

When to use, when not to

Use agents for data extraction, monitoring and classification, not for strategic advice, planning or unsupervised creative work.

AI agents in practice: part 2

This article is part of AI Agents in Practice, a series covering everything you need to know about AI agents. In part 1 we explain what AI agents are and how they work technically. In this part we answer the practical question: when should you use them, and when not?

Vendor promises are big: agents that replace entire departments, autonomous systems that run your business. Reality is more nuanced. After extensive experimentation, we have a clear picture of where agents excel and where they fail.

What AI agents are really good at

AI agents excel at tasks where they need to gather, combine and analyze data. They're patient, they don't make typos in API calls, and they can process, in seconds, volumes of data that would take a human hours. Their strength is combining sources and saying something meaningful about them.

1. Retrieving and analyzing data

Combining Google Analytics + LinkedIn + CRM data and spotting trends. Generating a weekly report including anomalies. That's where they shine.

2. Structured, repetitive tasks

Monitoring an inbox and automatically creating tickets. Summarizing meeting transcripts. Synchronizing data between systems. Processes with a clear beginning and end.

3. Expert support

The best use case: an agent as a tool for an expert. The marketer who lets the agent crunch the data but draws the conclusions themselves. The developer who lets it write code but reviews it themselves.

4. Creative brainstorm support

In the creative process it matters less if the agent is 'wrong'. Everything in a brainstorm can be a starting point. The human remains the one who decides.

Why does this work so well? Because these tasks are structured. The sources are known, the result is verifiable. And crucially: a human can quickly evaluate the result.
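To make this concrete, below is a minimal sketch of such a bottom-up task: a fixed weekly report with known sources and a human reviewer at the end. The function names (fetch_analytics_summary, fetch_crm_summary, call_llm) are hypothetical placeholders, not a specific library; in practice they would wrap your analytics, CRM and LLM APIs.

```python
# Minimal sketch of a bottom-up agent task: a fixed weekly report.
# fetch_analytics_summary, fetch_crm_summary and call_llm are hypothetical
# placeholders; in practice they would wrap your analytics/CRM APIs and an LLM SDK.

def fetch_analytics_summary() -> dict:
    # Placeholder: would query e.g. Google Analytics for last week's figures.
    return {"sessions": 12450, "conversions": 310, "week_over_week": -0.04}

def fetch_crm_summary() -> dict:
    # Placeholder: would query the CRM for new leads and deal changes.
    return {"new_leads": 42, "deals_won": 5, "deals_lost": 2}

def call_llm(prompt: str) -> str:
    # Placeholder for a call to your LLM provider of choice.
    return "DRAFT REPORT\n(model output would appear here)"

def weekly_report() -> str:
    data = {"analytics": fetch_analytics_summary(), "crm": fetch_crm_summary()}
    prompt = (
        "Write a short weekly marketing report based on this data. "
        "Flag anything that looks anomalous and say why.\n"
        f"{data}"
    )
    return call_llm(prompt)

if __name__ == "__main__":
    # The output is a draft: a human expert reviews it before it goes anywhere.
    print(weekly_report())
```

The point is the shape, not the details: known inputs, one clearly defined output, and a human who checks the result before it is used.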

Where AI agents are less suitable

Agents fail when the assignment becomes too open. “Improve our marketing” or “Make a Q2 plan” are goals with a thousand paths leading to them. The agent chooses one, but you have no visibility into which alternatives it considered or why it made this choice. The output looks convincing, because AI models are literally trained to be convincing.

1. Vague, open goals

Too many possible paths, and you have no visibility into which choices it makes. The output looks good, but is it what you meant?

2. Keys to the castle

Giving an agent unrestricted access to your database or CRM is asking for trouble. It's unlikely to go wrong, but it can; a sketch of how to scope that access follows after this list.

3. Replacing expertise

An agent without an expert nearby is dangerous. If you don't have the knowledge to judge if the output is correct, you have a problem.

4. Trusting 'newer is better'

Model 5.2 isn't necessarily better than 5.1 for your use case. Every model update can change how your agent behaves. Without systematic testing you don't know whether that change is for better or worse.
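One practical answer to the "keys to the castle" problem above is to never hand the agent a raw database connection, but only a small set of narrow, read-only tools. A minimal sketch, assuming a simple hypothetical tool-registry pattern (the tool names and functions are illustrative):

```python
# Sketch of scoping agent access ("keys to the castle"), assuming a simple
# tool-registry pattern. Instead of handing the agent a database connection,
# it only gets narrow, read-only functions that are on an allowlist.

from typing import Callable

def get_customer_status(customer_id: str) -> str:
    # Placeholder: a read-only lookup, no arbitrary SQL, no write access.
    return f"status of {customer_id}: active"

def list_open_tickets(customer_id: str) -> list[str]:
    # Placeholder: another narrow, read-only view on your systems.
    return [f"TICKET-1 for {customer_id}"]

# The agent can only call what is registered here; nothing else exists for it.
ALLOWED_TOOLS: dict[str, Callable[[str], object]] = {
    "get_customer_status": get_customer_status,
    "list_open_tickets": list_open_tickets,
}

def run_tool(name: str, arg: str) -> object:
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"Tool '{name}' is not allowed for this agent")
    return ALLOWED_TOOLS[name](arg)

if __name__ == "__main__":
    print(run_tool("get_customer_status", "C-123"))   # allowed
    try:
        run_tool("delete_customer", "C-123")           # refused
    except PermissionError as exc:
        print(exc)
```

Anything not on the allowlist simply does not exist for the agent, which caps the damage a bad decision can do.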

The red button problem

This problem illustrates why open goals are dangerous. Imagine: you have a fully autonomous robot with the assignment “press the red button on the other side of the room”. You stand between the robot and the button. What happens?

The robot goes around you. Or pushes you aside. Or, in the worst case, drives straight through you. Because you said: press that button. You didn’t say: press that button, but don’t drive over people, don’t push anyone over, and only take the shortest route if it’s safe.

With LLM agents this is subtler but just as real. Give an agent the goal “generate more leads” and it generates leads. Maybe it buys email lists. Maybe it spams LinkedIn. Technically it achieves its goal. Practically it’s a disaster.

Agents do what you say, not what you mean. The difference between those is where problems arise.
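To see what "say what you mean" looks like in an agent's instructions, compare a bare goal with the same goal plus the constraints you were silently assuming. This is purely illustrative prompt text, not tied to any particular framework; the specific constraints are made-up examples.

```python
# Sketch of the gap between "what you say" and "what you mean":
# the goal alone versus the goal plus the constraints you carry in your head.
# Purely illustrative prompt construction; not tied to any agent framework.

BARE_GOAL = "Generate more leads."

CONSTRAINED_GOAL = "\n".join([
    "Generate more leads.",
    "Constraints:",
    "- Only contact people who opted in to our mailing list.",
    "- Do not buy external email lists.",
    "- Do not send automated messages on LinkedIn.",
    "- If a constraint conflicts with the goal, stop and ask a human.",
])

if __name__ == "__main__":
    # An agent given BARE_GOAL is free to pick any path to the goal;
    # CONSTRAINED_GOAL rules out the paths you never intended.
    print(CONSTRAINED_GOAL)
```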

The context problem

A second fundamental problem: agents don’t know what you know. Your organization has a way of working, implicit rules, historical context. All of that lives in people’s heads, not in systems.

Take a simple example: you ask an agent to analyze Google Ads data. But does it know which account to use if you have multiple? Does it know that campaign X is experimental and campaign Y is your core business? Does it know there was an outage last month making the data unreliable? Does it know what today’s date is, or does it think it’s three years ago because that’s where its training data stops?

Every piece of missing context is a potential error. And the annoying thing: the agent doesn’t know it’s missing context. It makes assumptions and continues. The output looks complete, but is built on a foundation of gaps.
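A practical mitigation is to make that implicit context explicit and attach it to every task you give the agent. Below is a minimal sketch, with illustrative names and values, that collects the context from the Google Ads example above and prepends it to the prompt.

```python
# Sketch of making implicit context explicit (names and values are illustrative).
# The agent does not know today's date, which Ads account is yours, or that last
# month's data is unreliable, unless you tell it every single time.

from datetime import date

ORGANIZATION_CONTEXT = {
    "today": date.today().isoformat(),
    "google_ads_account": "the production account, not the sandbox",
    "campaign_notes": {
        "Campaign X": "experimental, ignore for core KPIs",
        "Campaign Y": "core business",
    },
    "data_caveats": ["Tracking outage early last month; treat that week as unreliable."],
}

def build_prompt(task: str) -> str:
    # Prepend the context to every task so the agent never has to guess it.
    return f"Context (treat as ground truth):\n{ORGANIZATION_CONTEXT}\n\nTask:\n{task}"

if __name__ == "__main__":
    print(build_prompt("Analyze last month's Google Ads performance."))
```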

Conclusion: bottom-up works, top-down fails

The common thread through all successful agent implementations we see: they’re built bottom-up. Not “the agent runs our marketing”, but “the agent creates this specific report every Monday”. Not replacement of expertise, but amplification of it.

Bottom-up: specific task, expert in control

The expert knows what the output should be and can judge if it's correct. The agent speeds up the work, the human remains responsible.

Top-down: vague goals, agent as black box

Nobody knows exactly what the agent does or why. Output looks good but can't be verified. Fails on the question: does this actually work?

Before you invest in agents, ask yourself this question: do you have the expertise in-house to judge if the agent is doing good work? If the answer is no, you have a bigger problem than “we don’t have an agent yet”.

Want to know if agents work for you?

We help companies determine where AI agents make sense and where they don’t. Honest advice, no sales pitch. In a free 1.5-hour consultation we discuss your situation and give concrete recommendations.

Let's discuss your project

From AI prototypes that need to be production-ready to strategic advice, code audits, or ongoing development support. We're happy to think along about the best approach, no strings attached.

Free Consultation

In 1.5 hours we discuss your project, challenges and goals. Honest advice from senior developers, no sales pitch.

1.5 hours with senior developer(s)
Analysis of your current situation
Written summary afterwards
Concrete next steps