Agent
Let's start with the definition of an agent. According to Merriam-Webster, an agent is:
a computer application designed to automate certain tasks (such as gathering information online)
The goal of an agent is to complete tasks.
Agent Components
An agent has five components:
- Profile, a description of the agent. It may include elements such as background and demographics.
- Tools, used to complete tasks or acquire information.
- Knowledge and memory, which annotate the context with the most relevant information.
- Reasoning and evaluation, which enable self-reflection and internal reasoning for the completion of a task.
- Planning and feedback, which organize tasks to achieve high-level goals.
Not every agent has all five components. Adding more components makes an agent more powerful but also harder to implement. When building an agent, choose the components based on the intended usage scenarios.
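The idea that an agent uses only some of the five components can be sketched as a small data structure with optional fields. This is a minimal Python illustration, not a reference design: the class name `AgentSpec`, its field names, the `enabled_components` helper, and the sample tool are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class AgentSpec:
    """Hypothetical container: each field maps to one of the five components."""
    profile: Optional[str] = None                        # profile
    tools: dict[str, Callable[..., str]] = field(default_factory=dict)  # tools
    memory: Optional[list[str]] = None                   # knowledge and memory
    reasoning: bool = False                              # reasoning and evaluation
    planning: bool = False                               # planning and feedback

    def enabled_components(self) -> list[str]:
        """List the components this agent actually uses."""
        enabled = []
        if self.profile:
            enabled.append("profile")
        if self.tools:
            enabled.append("tools")
        if self.memory is not None:
            enabled.append("knowledge and memory")
        if self.reasoning:
            enabled.append("reasoning and evaluation")
        if self.planning:
            enabled.append("planning and feedback")
        return enabled


# A simple agent that only needs a profile and one (made-up) tool.
faq_agent = AgentSpec(
    profile="Answers questions about store opening hours.",
    tools={"lookup_hours": lambda day: f"Open 9-17 on {day}"},
)
print(faq_agent.enabled_components())
```

Leaving a component unset keeps the agent simpler; a more capable agent would fill in more fields at the cost of more implementation work.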
Agent Types
There are two types of agents: conversational agents and task execution agents.
Conversational Agent
A conversational agent presents a chatbot-like UI to end users. It usually supports multimodal input and output, including text, images, audio, and video.
A conversational agent usually has the following components:
- Profile to define the specialized area of this agent.
- Knowledge to answer users' questions.
- Memory to remember previous conversations.
- Tools to extend the agent's capabilities.
A conversational agent may use a reasoning model for reasoning.
A conversational agent typically doesn't use planning, evaluation, or feedback.
- This kind of agent usually has a hard constraint on response time. Planning will significantly increase the processing time.
- Evaluation and feedback are provided by the end user. A user can send additional messages when not satisfied with previous results.
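A conversational agent built from a profile, knowledge, memory, and tools might look like the sketch below. It is a hedged illustration only: `ConversationalAgent`, `generate`, `chat`, the `/tool` message prefix, and the sample knowledge entry are hypothetical, and `generate` is a keyword-matching stand-in for a real model call. There is no planning step, and feedback arrives as the user's next message.

```python
from typing import Callable


class ConversationalAgent:
    def __init__(self, profile: str, knowledge: dict[str, str],
                 tools: dict[str, Callable[[str], str]]):
        self.profile = profile        # specialized area of the agent
        self.knowledge = knowledge    # topic keyword -> answer
        self.tools = tools            # tool name -> callable
        self.memory: list[tuple[str, str]] = []  # (role, message) history

    def generate(self, user_message: str) -> str:
        """Stand-in for a model call: try tools first, then knowledge."""
        for name, tool in self.tools.items():
            prefix = f"/{name} "
            if user_message.startswith(prefix):
                return tool(user_message[len(prefix):])
        for topic, answer in self.knowledge.items():
            if topic in user_message.lower():
                return answer
        return "I don't know yet. Could you rephrase?"

    def chat(self, user_message: str) -> str:
        """Handle one turn and record both sides in memory."""
        self.memory.append(("user", user_message))
        reply = self.generate(user_message)
        self.memory.append(("agent", reply))
        return reply


agent = ConversationalAgent(
    profile="Support assistant for a bookstore.",
    knowledge={"refund": "Refunds are accepted within 30 days."},  # made-up data
    tools={"upper": str.upper},
)
print(agent.chat("What is your refund policy?"))
print(agent.chat("/upper hello"))
print(len(agent.memory))
```

Each call to `chat` does a single pass with no planning loop, which keeps response time short.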
Task Execution Agent
A task execution agent finishes a particular task when it's executed. It usually consumes structured input and produces structured output.
A task execution agent usually has the following components:
- Profile to define the task to be executed by this agent.
- Tools to provide capabilities for finishing the task.
- Knowledge and memory to provide additional information and execution results of previous subtasks.
- Reasoning and evaluation to reason about approaches to finish a task.
- Planning and feedback to break down the task into smaller subtasks.
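The way these components fit together in a task execution agent can be sketched as follows. This is a minimal illustration under stated assumptions: `TaskExecutionAgent`, `TaskResult`, the `planner` callable, and the `clean` and `count` tools are hypothetical names, the planner is a fixed list standing in for model-driven planning, and the evaluation step is a trivial non-empty check.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TaskResult:
    """Structured output of one task run."""
    task: str
    outputs: dict[str, str] = field(default_factory=dict)
    succeeded: bool = False


class TaskExecutionAgent:
    def __init__(self, profile: str,
                 tools: dict[str, Callable[[str], str]],
                 planner: Callable[[dict], list[str]]):
        self.profile = profile      # the task this agent executes
        self.tools = tools          # capabilities for finishing the task
        self.planner = planner      # breaks a task into ordered subtasks
        self.memory: dict[str, str] = {}  # results of previous subtasks

    def evaluate(self, output: str) -> bool:
        """Trivial evaluation: a subtask passes if it produced output."""
        return bool(output.strip())

    def run(self, task: dict) -> TaskResult:
        result = TaskResult(task=task["name"])
        current = task["input"]
        for step in self.planner(task):
            output = self.tools[step](current)
            if not self.evaluate(output):
                return result       # feedback: stop on a failed subtask
            self.memory[step] = output
            result.outputs[step] = output
            current = output        # next subtask consumes this result
        result.succeeded = True
        return result


agent = TaskExecutionAgent(
    profile="Counts the words in a piece of text.",
    tools={"clean": str.strip, "count": lambda s: str(len(s.split()))},
    planner=lambda task: ["clean", "count"],
)
result = agent.run({"name": "word_count", "input": "  agents finish tasks  "})
print(result)
```

The input and output are both structured (a dictionary in, a `TaskResult` out), and each subtask's result is stored in memory so later subtasks can build on it.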