Agent

Agents are a hot topic in AI. Many tools, libraries, services, and platforms claim to support building AI agents. "Agent" is a broad concept: an agent can be a chatbot that offers cooking suggestions, or a system that produces comprehensive results on a research topic.

It's hard to give "agent" a clear definition in the context of AI. The Merriam-Webster Dictionary has several definitions of agent. The ones below are relevant to agents in AI.

one that acts or exerts power
something that produces or is capable of producing an effect
a means or instrument by which a guiding intelligence achieves a result
a computer application designed to automate certain tasks (such as gathering information online)

There are different ways to create agents. Many agent platforms don't require coding skills to build agents.

Agent Components

An agent may have five components.

Profile, description of an agent. It may include elements such as background and demographics.
Tools, tools used to complete tasks or acquire information.
Knowledge and memory, annotates context with the most relevant information.
Reasoning and evaluation, enables self-reflection and internal reasoning for the completion of a task.
Planning and feedback, organizes tasks to achieve high-level goals.

Not every agent has all five components. More components make an agent more powerful but also harder to implement. When building an agent, choose the components based on the intended usage scenarios.
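The idea that components are optional can be sketched in code. The example below is a minimal illustration, not a definitive design: the names `Profile`, `AgentSpec`, and `component_names` are hypothetical, and only the profile is treated as required here as a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Profile:
    """Description of the agent (hypothetical structure)."""
    name: str
    description: str


@dataclass
class AgentSpec:
    """Hypothetical container for the five components.

    Only the profile is required in this sketch; the rest are optional.
    """
    profile: Profile
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    knowledge: list[str] = field(default_factory=list)  # knowledge and memory
    reasoning_enabled: bool = False  # reasoning and evaluation
    planning_enabled: bool = False   # planning and feedback

    def component_names(self) -> list[str]:
        """List the components this agent actually includes."""
        names = ["profile"]
        if self.tools:
            names.append("tools")
        if self.knowledge:
            names.append("knowledge and memory")
        if self.reasoning_enabled:
            names.append("reasoning and evaluation")
        if self.planning_enabled:
            names.append("planning and feedback")
        return names


# A simple cooking chatbot that only needs a profile and some knowledge.
cooking_bot = AgentSpec(
    profile=Profile("cooking-helper", "Suggests recipes"),
    knowledge=["Pasta should be cooked in salted water."],
)
print(cooking_bot.component_names())
```

Leaving a component out keeps the agent simpler; it can be added later if the usage scenario requires it.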

Agent Types

There are two types of agents: conversational agents and task execution agents.

Conversational Agent

A conversational agent presents a chatbot-like UI to end users. It usually supports multimodal input and output, including text, images, audio, and video.

A conversational agent usually has the following components:

Profile to define the specialized area of this agent.
Knowledge to provide context information and answer users' questions.
Memory to remember previous conversations.
Tools to extend the agent's capabilities.

A conversational agent may use a reasoning model for reasoning.

A conversational agent typically doesn't use planning, evaluation, or feedback.

This kind of agent usually has a hard constraint on response time. Planning will significantly increase the processing time.
Evaluation and feedback are provided by the end user. A user can send additional messages when not satisfied with previous results.
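A conversational agent built from a profile, knowledge, and memory might look like the sketch below. It is illustrative only: `ConversationalAgent`, `build_prompt`, and `echo_model` are hypothetical names, `echo_model` stands in for a real model call, and tools are omitted for brevity.

```python
from typing import Callable


def echo_model(prompt: str) -> str:
    """Stand-in for a real language model call (hypothetical)."""
    return f"[reply based on {len(prompt.splitlines())} prompt lines]"


class ConversationalAgent:
    def __init__(self, profile: str, knowledge: list[str],
                 model: Callable[[str], str] = echo_model) -> None:
        self.profile = profile      # defines the agent's specialized area
        self.knowledge = knowledge  # context used to answer questions
        self.memory: list[tuple[str, str]] = []  # (role, message) pairs
        self.model = model

    def build_prompt(self, user_message: str) -> str:
        """Combine profile, knowledge, and past conversation into one prompt."""
        lines = [f"Profile: {self.profile}"]
        lines += [f"Knowledge: {fact}" for fact in self.knowledge]
        lines += [f"{role}: {text}" for role, text in self.memory]
        lines.append(f"user: {user_message}")
        return "\n".join(lines)

    def chat(self, user_message: str) -> str:
        """Answer in a single model call: no planning or evaluation step."""
        reply = self.model(self.build_prompt(user_message))
        self.memory.append(("user", user_message))
        self.memory.append(("agent", reply))
        return reply


agent = ConversationalAgent(
    "cooking assistant",
    ["Eggs can be boiled, fried, or scrambled."],
)
agent.chat("What can I cook with eggs?")
# The follow-up message is the user's feedback on the previous answer.
agent.chat("Something quicker, please.")
```

Each call to `chat` makes one model call, which keeps response time low. The second message shows how the user, rather than the agent, supplies evaluation and feedback.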

Task Execution Agent

A task execution agent finishes a particular task when it's executed. It usually consumes structured input and produces structured output.

A task execution agent usually has the following components:

Profile to define the task to be executed by this agent.
Tools to provide capabilities for finishing the task.
Knowledge and memory to provide additional information and execution results of previous subtasks.
Reasoning and evaluation to reason about approaches to finish a task.
Planning and feedback to break down the task into smaller subtasks.
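The interplay of these components can be sketched as follows. This is a simplified illustration under stated assumptions: `TaskResult`, `plan`, and `run_task` are hypothetical names, the planner is hard-coded where a real agent would typically use a model, and the tools are plain functions.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TaskResult:
    """Structured output of a task run (hypothetical shape)."""
    task: str
    steps: list[str] = field(default_factory=list)
    success: bool = False


def plan(task: str) -> list[str]:
    """Hypothetical planner: break the task into smaller subtasks."""
    return [f"research: {task}", f"summarize: {task}"]


def run_task(task: str,
             tools: dict[str, Callable[[str, list[str]], str]]) -> TaskResult:
    result = TaskResult(task=task)
    for subtask in plan(task):
        tool_name, _, argument = subtask.partition(": ")
        tool = tools[tool_name]
        # Memory: results of earlier subtasks are passed to later ones.
        output = tool(argument, result.steps)
        result.steps.append(output)
    # Evaluation: a trivial check that every subtask produced some output.
    result.success = all(result.steps)
    return result


tools = {
    "research": lambda topic, previous: f"notes on {topic}",
    "summarize": lambda topic, previous: (
        f"summary of {len(previous)} note(s) on {topic}"
    ),
}
report = run_task("solar panels", tools)
print(report.steps)
```

Unlike a conversational agent, this kind of agent takes a structured input, runs to completion, and returns a structured result, so it can afford the extra time that planning and evaluation require.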
