iteration.json

Overview

iteration.json defines a structured workflow configuration for an intelligent research assistant system. The file orchestrates a multi-step pipeline that:

- Starts with a greeting component.
- Uses a language model (LLM) agent to decompose a user query into meaningful sub-topics.
- Iterates over each sub-topic to perform targeted internet searches.
- Generates detailed reports on each sub-topic based exclusively on retrieved online information.
- Aggregates and delivers the final responses to the user.

This JSON-based configuration describes components, their parameters, and how data flows between them (upstream/downstream relationships), enabling a modular, iterative, and extensible research assistant pipeline.

Detailed Explanation of Components

The file is structured under a top-level "components" object, where each key identifies a component instance by a unique ID. Each component includes:

- obj: The component definition, including name and parameters.
- downstream: Array of component IDs that consume this component’s output.
- upstream: Array of component IDs that provide input to this component.
- parent_id (optional): Identifies the parent component for nested components.
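
To make this layout concrete, the following Python sketch models two trimmed entries of the "components" object as dictionaries and reports any linked ID that has no entry of its own. The helper name `find_dangling_ids` is hypothetical, the entries carry only the fields listed above, and the real file may contain more.

```python
# Two trimmed entries mirroring the documented fields; values are illustrative.
components = {
    "begin": {
        "obj": {"component_name": "Begin", "params": {"prologue": "Hi there!"}},
        "downstream": ["generate:0"],
        "upstream": [],
    },
    "generate:0": {
        "obj": {"component_name": "Agent", "params": {"llm_id": "deepseek-chat"}},
        "downstream": ["iteration:0"],
        "upstream": ["begin"],
    },
}


def find_dangling_ids(components: dict) -> list:
    """Return (component_id, linked_id) pairs where linked_id has no entry (hypothetical helper)."""
    dangling = []
    for component_id, entry in components.items():
        linked = entry.get("downstream", []) + entry.get("upstream", [])
        parent = entry.get("parent_id")
        if parent is not None:
            linked = linked + [parent]
        for other_id in linked:
            if other_id not in components:
                dangling.append((component_id, other_id))
    return dangling


# "iteration:0" is flagged only because this trimmed example leaves it out.
print(find_dangling_ids(components))
```

A check of this kind is one way to catch a mistyped ID before the workflow runs. It deliberately does not require upstream and downstream arrays to mirror each other.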

Component Breakdown

1. Begin

Type: Begin

Purpose: Entry point of the workflow; initializes the conversation with a greeting.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| prologue | String | Initial greeting message. |

Usage Example:

{
  "component_name": "Begin",
  "params": {
    "prologue": "Hi there!"
  }
}

Data Flow:

Upstream: None (start of pipeline)
Downstream: "generate:0"

2. generate:0

Type: Agent

Purpose: Uses an LLM (language model) to decompose the user’s query (sys.query) into sub-topics.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| llm_id | String | Identifier of the LLM to use (`deepseek-chat`). |
| sys_prompt | String | System prompt instructing the agent to decompose the query into meaningful sub-topics. |
| temperature | Number | Sampling temperature controlling randomness (0.2 = low randomness). |
| cite | Boolean | Whether to cite sources in output (false here). |
| output_structure | Array | Expected output format: list of sub-topic strings. |

Prompt Details:

The system prompt instructs the LLM to strictly output a string array of sub-topics without redundant information.

Data Flow:

Upstream: "begin"
Downstream: "iteration:0"
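
Because the rest of the pipeline depends on generate:0 returning a clean string array, a consumer may want to validate that output before iterating. The sketch below assumes the raw output arrives as JSON text. `parse_subtopics` is a hypothetical helper, not part of the configuration.

```python
import json


def parse_subtopics(raw_output: str) -> list:
    """Parse the agent's raw text as a JSON array of strings (hypothetical helper)."""
    data = json.loads(raw_output)
    if not isinstance(data, list) or not all(isinstance(item, str) for item in data):
        raise ValueError("expected a JSON array of strings")
    # Trim surrounding whitespace and drop blank entries.
    return [item.strip() for item in data if item.strip()]


subtopics = parse_subtopics('["Rising sea levels", " Extreme weather events ", " "]')
print(subtopics)
```

Rejecting malformed output at this point keeps a badly formatted LLM response from reaching the iteration step.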

3. iteration:0

Type: Iteration

Purpose: Iterates over the list of sub-topics produced by generate:0.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| items_ref | String | Reference to the array of sub-topics to iterate over (`generate:0@structured_content`). |

Data Flow:

Upstream: "generate:0"
Downstream: "message:0"
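
The items_ref value follows the component_id@output_field pattern. The Python sketch below shows one way to resolve such a reference. The `outputs` store and the `resolve_reference` helper are assumptions made for illustration. The host system's actual resolution logic is not described in this document.

```python
def resolve_reference(ref: str, outputs: dict):
    """Look up a 'component_id@output_field' reference in a dict of component outputs."""
    # Component IDs contain ':' (e.g. "generate:0"), so split on '@' only.
    component_id, sep, field = ref.partition("@")
    if not sep or not component_id or not field:
        raise ValueError(f"malformed reference: {ref!r}")
    return outputs[component_id][field]


# Hypothetical outputs store keyed by component ID.
outputs = {
    "generate:0": {
        "structured_content": ["Rising sea levels", "Extreme weather events"],
    },
}

# Each resolved item corresponds to one pass of the iteration.
for subtopic in resolve_reference("generate:0@structured_content", outputs):
    print(subtopic)
```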

4. iterationitem:0

Type: IterationItem

Purpose: Represents a single iteration over a sub-topic in the iteration component.

Parameters: None

Parent: "iteration:0"

Data Flow:

Upstream: None (iteration items are generated internally)
Downstream: "tavily:0"

5. tavily:0

Type: TavilySearch

Purpose: Performs an internet search query based on the current sub-topic.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| api_key | String | API key for authenticating with the Tavily search service. |
| query | String | Search query, dynamically set to the current iteration item’s result (`iterationitem:0@result`). |

Parent: "iteration:0"

Data Flow:

Upstream: "iterationitem:0"
Downstream: "generate:1"

6. generate:1

Type: Agent

Purpose: Uses the LLM to generate a detailed report answering the current sub-topic based exclusively on the internet search results.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| llm_id | String | LLM identifier (`deepseek-chat`). |
| sys_prompt | String | System prompt instructing the agent to generate a detailed, structured, and cited report based only on the provided search results. |
| temperature | Number | Sampling temperature (0.9 = higher creativity). |
| cite | Boolean | Whether to cite sources in output (false here, although the system prompt itself asks for APA-formatted sources). |

System Prompt Highlights:

- The agent must rely solely on the internet search results ({tavily:0@formalized_content}).
- The answer must be a detailed report (min 200 words) with markdown and APA formatted sources.
- No prior knowledge or external information should be used.

Parent: "iteration:0"

Data Flow:

Upstream: "tavily:0"
Downstream: "iterationitem:0" (feeding back to iteration)
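
The system prompt embeds the search results through a {component_id@output_field} placeholder. The sketch below shows how such placeholders could be filled in before the prompt is sent to the LLM. The template text, the `outputs` store, and the `render_prompt` helper are all illustrative assumptions. The actual prompt wording and substitution mechanism are not reproduced here.

```python
import re

# Matches placeholders such as {tavily:0@formalized_content}.
PLACEHOLDER = re.compile(r"\{([^{}@]+)@([^{}@]+)\}")


def render_prompt(template: str, outputs: dict) -> str:
    """Replace each {component_id@field} placeholder with the stored output."""
    def substitute(match):
        component_id, field = match.group(1), match.group(2)
        return str(outputs[component_id][field])

    return PLACEHOLDER.sub(substitute, template)


# Illustrative template and data, not the real system prompt.
template = "Answer using only these search results:\n{tavily:0@formalized_content}"
outputs = {"tavily:0": {"formalized_content": "[1] Example snippet about sea levels."}}
print(render_prompt(template, outputs))
```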

7. message:0

Type: Message

Purpose: Aggregates the final output messages from the iteration component and delivers them.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| content | Array | An array referencing the generated reports from iteration (`{iteration:0@generate:1}`). |

Data Flow:

Upstream: "iteration:0"
Downstream: None (terminal component)

Workflow Summary

- Begin starts the workflow with a greeting.
- generate:0 (Agent) decomposes the query into sub-topics.
- iteration:0 (Iteration) loops over the sub-topics. For each sub-topic:
  - iterationitem:0 represents the current sub-topic.
  - tavily:0 (TavilySearch) searches the internet for information.
  - generate:1 (Agent) generates a detailed report using search results only.
- message:0 (Message) collects all reports and outputs them.
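
The control flow summarized above can be sketched in Python with stand-in callables in place of the LLM and the search service. The function `run_pipeline` and its parameters are hypothetical. The sketch mirrors only the order of the steps, not how the host system executes them, and it leaves out the Begin greeting.

```python
from typing import Callable, List


def run_pipeline(
    query: str,
    decompose: Callable[[str], List[str]],
    search: Callable[[str], str],
    write_report: Callable[[str, str], str],
) -> List[str]:
    """Mirror the documented order: decompose, then search and report per sub-topic."""
    reports = []
    for subtopic in decompose(query):                    # generate:0 feeding iteration:0
        results = search(subtopic)                       # tavily:0
        reports.append(write_report(subtopic, results))  # generate:1
    return reports                                       # collected by message:0


# Stand-in callables; a real run would call the LLM and the search API here.
reports = run_pipeline(
    "Climate change impacts",
    decompose=lambda q: ["Rising sea levels", "Extreme weather events"],
    search=lambda topic: f"results for {topic}",
    write_report=lambda topic, results: f"Report on {topic} (based on: {results})",
)
for report in reports:
    print(report)
```

Passing the three steps in as callables keeps the loop testable without network access or API keys.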

Important Implementation Details

- Dynamic Data References: Components reference each other's outputs dynamically using notation like component_id@output_field. For example, generate:0@structured_content refers to the structured content output of generate:0.
- Iterative Design: The Iteration and IterationItem components process any number of sub-topics in a loop, so the pipeline does not need to know in advance how many sub-topics the decomposition step returns.
- Strict Prompt Engineering: The prompts instruct the LLM to follow strict output formats and constraints (a bare string array for generate:0; a report of at least 200 words built only from search results for generate:1), so that downstream components receive consistent input.
- Separation of Concerns: The search (TavilySearch) and answer generation (Agent) are distinct steps, enforcing information sourcing discipline.
- Parent-Child Relationships: Components like iterationitem:0, tavily:0, and generate:1 belong under the iterative parent iteration:0, reflecting nested workflows.
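
As a small illustration of the parent-child nesting, the sketch below groups component IDs by their parent_id. The `group_by_parent` helper is hypothetical, and the entries are reduced to the parent_id field alone.

```python
from collections import defaultdict


def group_by_parent(components: dict) -> dict:
    """Map each parent ID to its nested component IDs; top-level ones go under None."""
    groups = defaultdict(list)
    for component_id, entry in components.items():
        groups[entry.get("parent_id")].append(component_id)
    return dict(groups)


# Only parent_id is shown; other fields are omitted for brevity.
components = {
    "begin": {},
    "generate:0": {},
    "iteration:0": {},
    "iterationitem:0": {"parent_id": "iteration:0"},
    "tavily:0": {"parent_id": "iteration:0"},
    "generate:1": {"parent_id": "iteration:0"},
    "message:0": {},
}
print(group_by_parent(components)["iteration:0"])
# → ['iterationitem:0', 'tavily:0', 'generate:1']
```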

Interaction with Other System Parts

- Globals: The configuration references global system variables such as sys.query, sys.user_id, and conversation turns, indicating integration with a larger conversational system.
- External API: The TavilySearch component interacts with an external search API using an API key for live information retrieval.
- LLM Backend: The Agent components use the LLM identified by llm_id (deepseek-chat, which matches the model name of DeepSeek's chat API). How that identifier maps to an endpoint depends on the host system's model configuration.
- Data Flow: Upstream/downstream arrays define how components pass data along the pipeline, enabling modular and extendable workflows.

Usage Example

Suppose a user query is set in the global variable:

{
  "sys.query": "Climate change impacts"
}

The pipeline would:

- Greet the user.
- Generate sub-topics like ["Rising sea levels", "Extreme weather events", "Impact on agriculture"].
- Iterate over each sub-topic to perform internet searches.
- Generate detailed, cited reports for each.
- Aggregate and send the final messages back to the user.

Visual Diagram

flowchart TD
    Begin["Begin\n(prologue)"]
    Generate0["Agent: generate:0\n(decompose query)"]
    Iteration0["Iteration: iteration:0\n(loop over sub-topics)"]
    IterationItem0["IterationItem: iterationitem:0\n(single sub-topic)"]
    Tavily0["TavilySearch: tavily:0\n(search sub-topic)"]
    Generate1["Agent: generate:1\n(generate report)"]
    Message0["Message: message:0\n(final output)"]

    Begin --> Generate0
    Generate0 --> Iteration0
    Iteration0 --> Message0

    Iteration0 --> IterationItem0
    IterationItem0 --> Tavily0
    Tavily0 --> Generate1
    Generate1 --> IterationItem0

Summary

iteration.json configures a multi-component, iterative research assistant workflow that decomposes queries, searches the internet for each sub-topic, and generates detailed reports based strictly on retrieved data. It relies on modular components, dynamic data referencing, and tightly constrained prompts to keep the output consistently formatted and grounded in the retrieved sources.