Parallelization Workflow

Parallelization Workflow pattern runs subtasks in parallel to improve the performance.

Depends on whether these tasks return the same structure of data, there are two types of parallelization workflows.

Subtasks return different types of data

This kind of parallelization workflows decompose a large task into smaller subtasks. These subtasks are run in parallel. The final result of this agent will be assembled from execution results of subtasks.

A typical example is writing reports. The agent creates different subtasks to gather information for different areas, then assembles results from subtasks to create the final report.

Subtasks return same type of data

This kind of parallelization workflows use multiple subtasks for voting or confirmation. These subtask give different results for the same input. The agent uses these results to determine its final result.

For the code generation example described in Evaluator-Optimizer pattern, the agent for evaluation can run three parallel subtasks to evaluate the code using three different models. Each subtask returns the result of passed or not passed. The agent can use the result with majority as its final result.

Implementation

This pattern consists of a main task and a flexible number of subtasks. The main task and each subtask are implemented using Task Execution pattern.

Result Types of Subtasks

Subtasks may return results of different types or the same type.

Different Types

If subtasks return different types of results, they typically require different types of inputs. In this case, before executing a subtask, the original task input needs to be transformed into the type required by a subtask.

Same Type

If all subtasks return the same type of results, then the original input can be passed to subtasks directly.

Assembling Strategy

When all subtasks finish execution, there are two strategies to assemble the results.

The first assembling strategy doesn't use an LLM. It simply takes the results of all subtasks and assemble them using code logic.

The second assembling strategy does use an LLM. Assembled result is passed to an LLM for further generation.

Rate Limits

When implementing this pattern, you should pay attention to rate limits of calling AI services. With subtasks running in parallel, it's very easy to reach rate limits. We can limit the number of parallel running tasks. We can also limit the total number of executed tasks in a certain time range.

Example

The example is an agent to write articles about algorithms. Each article has code examples written in different programming languages. A parallelization workflow agent runs parallel subtasks to generate code examples, then generates an article using these code examples.

Let's start from the input and output of this agent. The input contains two fields:

algorithm represents name of the algorithm.
languages represents the list of programming languages.

Agent input
package com.javaaidev.agenticpatterns.examples.parallelizationworkflow;

import java.util.List;

public record AlgorithmArticleGenerationRequest(String algorithm, List<String> languages) {

}

The output only contains one field.

article represents the generated article.

Agent output
package com.javaaidev.agenticpatterns.examples.parallelizationworkflow;

public record AlgorithmArticleGenerationResponse(String article) {

}

Parallel subtasks use the prompt template below to generate code samples.

Prompt template to generate code samples
Write {language} code to meet the requirement.

{description}

After subtasks finish execution, their results are collected and used to generate the article. LLM is used to assemble the results. In the prompt template below, the value of sample_code variable is generated from results of all subtasks.

Prompt template to generate the article
Goal: Write an article about {algorithm}.
        
Requirements:
- Start with a brief introduction.
- Include only sample code listed below.
- Output the article in markdown.

{sample_code}

Below is a sample input of this agent. The task is to generate an article about quick sort with code samples of Java, Python, and JavaScript.

Agent input
{
  "algorithm": "quick sort",
  "languages": [
    "java",
    "python",
    "javascript"
  ]
}

The picture below shows the trace of an execution of this agent. Trace name agent.execute means an execution of an agent. The outer trace represents the execution of the article generation agent. Those three nested agent.execute traces represent executions of subtasks to generate code samples of three languages. These three subtasks run in parallel.

Agent execution trace

The screenshot below shows the result of using Swagger UI to test the agent. The output is a generated article.

Swagger UI

Reference Implementation

See this page for reference implementation and examples.

Implementation​

Result Types of Subtasks​

Different Types​

Same Type​

Assembling Strategy​

Rate Limits​

Example​

Reference Implementation​