deep_research.json

Overview

deep_research.json defines the configuration and architecture for a Multi-Agent Deep Research Agent system designed to conduct thorough, structured investigations for professionals in sales, marketing, policy, or consulting. This JSON file specifies:

- The purpose and description of the system, in English and Chinese.
- The multi-agent orchestration pipeline, detailing the roles and parameters of the individual agents involved in research.
- The workflow and interactions between these agents/components.
- The core mission, execution framework, research process, query classification, quality gates, adaptive strategies, success metrics, and language adaptation guidelines.
- Extensive instructions and templates guiding agents to produce high-quality, consulting-style strategic reports.
- The graph structure representing the flow of execution among components.

In short, the file is a blueprint for an AI-driven multi-agent research system: it specifies how the agents collaborate and what each agent is responsible for, so that the system delivers actionable, well-cited, multi-perspective strategic reports.

Detailed Explanation of Components

The file's core structure revolves around "components" inside a DSL (Domain Specific Language) object. Each component represents an Agent or Message node with defined parameters, inputs, outputs, and connections.
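
To make that structure concrete, the following Python sketch walks a trimmed-down stand-in for the file. The key layout (`dsl`, `components`, `obj`, `params`, `downstream`) is an assumption made for illustration and should be checked against the real file; only the component names and the llm_id value come from this document.

```python
# Hypothetical, trimmed-down stand-in for deep_research.json.
# The nesting "dsl" -> "components" -> "obj" -> "params" is assumed, not confirmed.
SAMPLE = {
    "dsl": {
        "components": {
            "begin": {
                "obj": {"component_name": "Begin", "params": {}},
                "downstream": ["Agent:NewPumasLick"],
            },
            "Agent:NewPumasLick": {
                "obj": {
                    "component_name": "Agent",
                    "params": {"llm_id": "qwen-max@Tongyi-Qianwen"},
                },
                "downstream": ["Message:OrangeYearsShine"],
            },
            "Message:OrangeYearsShine": {
                "obj": {"component_name": "Message", "params": {}},
                "downstream": [],
            },
        }
    }
}


def summarize_components(config: dict) -> list[tuple[str, str, list[str]]]:
    """Return (component id, component type, downstream ids) for each component."""
    components = config["dsl"]["components"]
    return [
        (name, node["obj"]["component_name"], list(node.get("downstream", [])))
        for name, node in components.items()
    ]


for name, kind, downstream in summarize_components(SAMPLE):
    print(f"{name} ({kind}) -> {downstream}")
```

With the real file, `SAMPLE` would be replaced by the result of `json.load` on deep_research.json, provided the layout matches the assumption above.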

Key Components (Agents)

1. Agent:NewPumasLick (Deep Research Agent / Lead Agent)

Role: Strategy Research Director orchestrating the multi-agent research team.
Purpose: Transform complex research queries into structured multi-agent workflows producing ~2000-word strategic reports.
Parameters:
- llm_id: "qwen-max@Tongyi-Qianwen" (indicates the selected language model)
- max_rounds: 3 (number of dialogue rounds)
- max_tokens: 4096 (upper limit on the number of tokens the model generates per response)
- frequency_penalty: 0.5 (to reduce repetitive outputs)
- presence_penalty: 0.5
- prompts: Includes user prompt template "The user query is {sys.query}"
- sys_prompt: A large, detailed system prompt describing:
  - The core mission (multi-agent orchestration for research)
  - Execution framework: 3 stages (URL discovery, content extraction, report generation)
  - Research process: Step-by-step query analysis and planning
  - Query classification: Depth-first, Breadth-first, Straightforward
  - Quality gates: Criteria for each stage to ensure output quality
  - Adaptive strategy: Resource allocation and failure recovery
  - Success metrics: Information density, actionability, evidence strength, etc.
  - Language adaptation: Auto-detect user language and adapt sources and instructions accordingly
  - Execution process examples: Illustrative workflows for different query types.
Downstream: Feeds content to "Message:OrangeYearsShine" and triggers three sub-agents:
- "Agent:FreeDucksObey" (Web Search Specialist)
- "Agent:WeakBoatsServe" (Content Deep Reader)
- "Agent:SwiftToysTell" (Research Synthesizer)

Usage Example:

{
  "sys.query": "How is AI transforming healthcare?",
  "Agent:NewPumasLick": {
    "params": {
      "prompts": [
        {"content": "The user query is How is AI transforming healthcare?", "role": "user"}
      ],
      "sys_prompt": "...(long detailed instructions)..."
    }
  }
}
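
The example above shows the prompt after the {sys.query} placeholder has been filled in. How the execution engine performs that substitution is not described here; the sketch below is one possible way to do it in Python, and the function name `render_prompt` is hypothetical.

```python
import re


def render_prompt(template: str, variables: dict[str, str]) -> str:
    """Replace {name} placeholders (dots allowed, e.g. {sys.query}) with values.

    Placeholders without a matching variable are left untouched.
    """
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return variables.get(key, match.group(0))

    return re.sub(r"\{([A-Za-z_][\w.]*)\}", substitute, template)


prompt = render_prompt(
    "The user query is {sys.query}",
    {"sys.query": "How is AI transforming healthcare?"},
)
print(prompt)
```

A regular expression is used instead of `str.format` because `str.format` would interpret `sys.query` as an attribute lookup on a field named `sys` rather than as a single key.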

2. Agent:FreeDucksObey (Web Search Specialist)

Role: URL Discovery Expert.
Purpose: Use web search tools and Model Context Protocol (MCP) to find 5 high-quality URLs relevant to the research query.
Key Features:
- Finds links only; does NOT read or extract content.
- Evaluates URL quality based on domain and title.
- Uses multiple search strategies (broad, specific, geographic, temporal).
- Ensures diversity of sources: academic, official, industry, news.
- Stops search when diminishing returns occur.
Outputs: Structured list of URLs with extraction guidance for the Content Deep Reader.
Tools: Uses "TavilySearch" tool for actual web search queries.
Quality Criteria: Avoids paywalled or low-authority sites, prioritizes authoritative and recent sources.
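
In the real system the selection is driven by the agent's prompt, not by hand-written code. As an illustration of the "five URLs, diverse sources" rule only, the sketch below picks candidates round-robin across source categories. The `Candidate` type, its `score` field, and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    url: str
    category: str  # e.g. "academic", "official", "industry", "news"
    score: float   # hypothetical quality score derived from domain and title


def select_diverse(candidates: list[Candidate], limit: int = 5) -> list[Candidate]:
    """Pick up to `limit` candidates, taking the best of each category in turn."""
    by_category: dict[str, list[Candidate]] = {}
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        by_category.setdefault(cand.category, []).append(cand)

    selected: list[Candidate] = []
    # Round-robin over categories until the limit is hit or nothing is left.
    while len(selected) < limit and any(by_category.values()):
        for queue in by_category.values():
            if queue and len(selected) < limit:
                selected.append(queue.pop(0))
    return selected


pool = [
    Candidate("https://example.org/news1", "news", 0.9),
    Candidate("https://example.org/news2", "news", 0.8),
    Candidate("https://example.org/news3", "news", 0.7),
    Candidate("https://example.org/news4", "news", 0.6),
    Candidate("https://example.org/paper", "academic", 0.5),
    Candidate("https://example.org/gov", "official", 0.4),
    Candidate("https://example.org/vendor", "industry", 0.3),
]
picked = select_diverse(pool)
for cand in picked:
    print(cand.category, cand.url)
```

Even though the four news links score highest, every category is represented before a second news link is taken.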

3. Agent:WeakBoatsServe (Content Deep Reader)

Role: Content extraction specialist.
Purpose: Extract structured, research-ready content from URLs discovered by the Web Search Specialist.
Key Features:
- Uses web extraction tools and MCP connections.
- Parses full webpage content preserving context.
- Identifies key statistics, main findings, expert quotes, supporting data.
- Assigns credibility scores and extraction method flags.
- Applies quality gates ensuring >80% extraction success.
Outputs: EXTRACTED_CONTENT objects that feed into the Research Synthesizer.
Tools: Uses "TavilyExtract" web extraction tool.
Failure Recovery: Switches to metadata extraction if full extraction fails.
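
The quality gate and the metadata fallback can be sketched as follows. This is a minimal illustration under stated assumptions: the two extractor callables are stubs standing in for the real extraction tool, and counting only full extractions toward the success rate is one interpretation of ">80% extraction success".

```python
from typing import Callable, Optional

Extractor = Callable[[str], Optional[str]]


def extract_all(
    urls: list[str],
    full_extract: Extractor,
    metadata_extract: Extractor,
    min_success: float = 0.8,
) -> tuple[list[dict], bool]:
    """Extract each URL, fall back to metadata on failure, and report the gate result."""
    records = []
    full_count = 0
    for url in urls:
        content = full_extract(url)
        method = "full"
        if content is None:
            # Full extraction failed: fall back to metadata only.
            content = metadata_extract(url)
            method = "metadata"
        else:
            full_count += 1
        records.append({"url": url, "method": method, "content": content})

    rate = full_count / len(urls) if urls else 0.0
    return records, rate > min_success


# Stub extractors for demonstration only; they do not call any real service.
def fake_full(url: str) -> Optional[str]:
    return None if "broken" in url else f"full text of {url}"


def fake_meta(url: str) -> Optional[str]:
    return f"title and description of {url}"


urls = [f"https://example.org/page{i}" for i in range(9)] + ["https://example.org/broken"]
records, gate_passed = extract_all(urls, fake_full, fake_meta)
print(gate_passed, records[-1]["method"])
```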

4. Agent:SwiftToysTell (Research Synthesizer)

Role: Final integration specialist.
Purpose: Transform extracted content and analysis instructions into comprehensive, executive-grade strategic reports.
Key Features:
- Cross-validates and synthesizes information from 5+ sources.
- Generates 2000+ word reports in consulting styles (McKinsey, BCG, Deloitte, or Academic).
- Follows detailed ANALYSIS_INSTRUCTIONS for analysis type, audience, business focus, and style.
- Produces reports with sections: Executive Summary, Analysis, Recommendations, etc.
- Ensures strategic intelligence generation, not just data aggregation.
- Does not output raw URLs or intermediate data.
Outputs: Complete strategic reports ready for client use.

5. Message:OrangeYearsShine

Role: Message node to present the final output content.
Purpose: Serves as the response node delivering the synthesized research content from the Lead Agent.

6. Begin Node

Role: Entry point for the workflow.
Purpose: Starts the research process by triggering the Lead Agent.

Workflow and Interactions

The workflow graph starts at the begin node.
The begin node triggers Agent:NewPumasLick (Lead Agent).
The Lead Agent initiates a multi-agent process:
- Delegates URL discovery to Web Search Specialist (Agent:FreeDucksObey).
- Delegates content extraction to Content Deep Reader (Agent:WeakBoatsServe).
- Delegates report writing to Research Synthesizer (Agent:SwiftToysTell).
The Lead Agent collects and orchestrates outputs from sub-agents.
The final report content is passed to the Message:OrangeYearsShine node for presentation.
The Web Search Specialist and the Content Deep Reader use dedicated tools (TavilySearch and TavilyExtract, respectively) to fulfill their sub-tasks.
Quality gates ensure the output at each stage meets minimum standards before proceeding.
Adaptive strategies dynamically allocate resources and handle failures to maintain robustness.
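
The delegation order described above can be sketched as a plain function. The three callables are stubs for the sub-agents, and the "non-empty output" checks are a simplified, assumed stand-in for the quality gates defined in the system prompt.

```python
from typing import Callable


class QualityGateError(RuntimeError):
    """Raised when a stage's output fails its (assumed) minimum standard."""


def run_research(
    query: str,
    search: Callable[[str], list[str]],
    read: Callable[[list[str]], list[dict]],
    synthesize: Callable[[str, list[dict]], str],
) -> str:
    """Run discovery, extraction, and synthesis in order, checking each stage."""
    urls = search(query)                    # stage 1: URL discovery
    if not urls:
        raise QualityGateError("URL discovery returned no URLs")

    extracted = read(urls)                  # stage 2: content extraction
    if not extracted:
        raise QualityGateError("content extraction returned nothing")

    return synthesize(query, extracted)     # stage 3: report, handed to the Message node


# Stub sub-agents for demonstration only.
def stub_search(query: str) -> list[str]:
    return [f"https://example.org/{i}" for i in range(5)]


def stub_read(urls: list[str]) -> list[dict]:
    return [{"url": u, "content": f"notes from {u}"} for u in urls]


def stub_synthesize(query: str, extracted: list[dict]) -> str:
    return f"Report on '{query}' based on {len(extracted)} sources"


report = run_research("How is AI transforming healthcare?", stub_search, stub_read, stub_synthesize)
print(report)
```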

Important Implementation Details and Algorithms

Multi-Agent Orchestration: The Lead Agent manages the subagent workflows through three defined stages (discovery, extraction, and synthesis).
Query Classification: The system categorizes queries as Depth-first, Breadth-first, or Straightforward, adapting the research plan accordingly.
Research Planning: The Lead Agent breaks down the user's query into components, plans multiple research approaches, and delegates tasks to specialized agents.
Quality Gates: After each stage, predefined quality checks validate outputs — e.g., minimum number of URLs, extraction success rate, report comprehensiveness.
Adaptive Strategy: Resource allocation scales with query complexity, and failure recovery mechanisms (fallback extraction methods, prioritization) ensure resilience.
Language Adaptation: The system detects the user's language and regional context, adjusting sources, agent instructions, and report language accordingly.
Strategic Synthesis: The Research Synthesizer applies advanced synthesis techniques, including pattern recognition, predictive analysis, and value creation frameworks, to convert raw data into actionable insights.
Prompt Engineering: Extensive system prompts guide each agent's behavior, ensuring domain expertise emulation and adherence to research standards.
Model Selection: Different LLM IDs are assigned per agent to optimize for their specific task (e.g., "qwen-max@Tongyi-Qianwen" for Lead Agent, "moonshot-v1-auto@Moonshot" for Content Reader).

How This File Interacts with Other System Parts

This JSON serves as a flow and configuration definition for the multi-agent research system.
It interfaces with:
- Language Models (LLMs): Via specified llm_ids, for natural language understanding and generation.
- Web Search and Extraction Tools: (e.g., TavilySearch, TavilyExtract) invoked by subagents for data acquisition.
- System Globals: Uses placeholders like {sys.query}, {sys.user_id} to inject runtime context.
- User Interface / API: The Message component outputs final results to be presented to the user.
- Agent Execution Engine: Parses this JSON to instantiate and execute agents according to the defined graph and parameters.
This file is likely part of a larger research automation platform where multiple such JSON configurations define different research workflows or canvases.

Visual Diagram: Multi-Agent Deep Research System Structure

classDiagram
    class DeepResearchAgent {
        +params
        +sys_prompt
        +executeResearchPlan()
        +orchestrateAgents()
    }
    class WebSearchSpecialist {
        +searchStrategy()
        +evaluateURLs()
        +provideURLList()
    }
    class ContentDeepReader {
        +extractContent()
        +structureData()
        +validateExtraction()
    }
    class ResearchSynthesizer {
        +integrateContent()
        +generateReport()
        +applyAnalysisFramework()
    }
    class Message {
        +content
        +deliverResponse()
    }
    class Begin {
        +startProcess()
    }

    Begin --> DeepResearchAgent : triggers
    DeepResearchAgent --> WebSearchSpecialist : delegates URL discovery
    DeepResearchAgent --> ContentDeepReader : delegates content extraction
    DeepResearchAgent --> ResearchSynthesizer : delegates report generation
    DeepResearchAgent --> Message : sends final report content

Summary of Main Entities and Their Responsibilities

Component Name	Type	Description
`Agent:NewPumasLick`	Agent	Lead agent coordinating research and report generation.
`Agent:FreeDucksObey`	Agent	Web Search Specialist finding high-quality URLs.
`Agent:WeakBoatsServe`	Agent	Content Deep Reader extracting structured data from URLs.
`Agent:SwiftToysTell`	Agent	Research Synthesizer creating final strategic reports.
`Message:OrangeYearsShine`	Message	Delivery node for final user-facing output.
`begin`	Begin	Entry point initiating the research workflow.

Usage and Extension

To execute a research query, inject the user query into sys.query, and start the begin node.
The system then runs the following steps:
1. The Lead Agent analyzes the query and determines the research strategy.
2. The Web Search Specialist finds premium URLs.
3. The Content Deep Reader extracts and structures the content.
4. The Research Synthesizer composes the strategic report.
5. Final output is delivered to the user.
To customize for different domains or report styles, modify:
- sys_prompt and prompts in the Lead Agent.
- Parameters like llm_id, temperature, max_tokens per agent.
- Quality gates and adaptive strategies.
- Analysis instructions templates inside the Lead Agent prompt.
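
A parameter override of this kind can be scripted. The sketch below is a minimal example that returns a modified copy of the configuration and leaves the original untouched. The key layout (`dsl`, `components`, `obj`, `params`) and the replacement model id are assumptions for illustration; check them against the real file before use.

```python
import copy


def override_params(config: dict, component_id: str, **overrides) -> dict:
    """Return a deep copy of `config` with the given params changed on one component."""
    updated = copy.deepcopy(config)
    params = updated["dsl"]["components"][component_id]["obj"]["params"]
    params.update(overrides)
    return updated


# Hypothetical fragment of the configuration; layout assumed, values from this document.
base = {
    "dsl": {
        "components": {
            "Agent:NewPumasLick": {
                "obj": {"params": {"llm_id": "qwen-max@Tongyi-Qianwen", "max_tokens": 4096}}
            }
        }
    }
}

# "some-model@SomeProvider" is a placeholder, not a real model id.
custom = override_params(
    base, "Agent:NewPumasLick", llm_id="some-model@SomeProvider", max_tokens=8192
)
print(custom["dsl"]["components"]["Agent:NewPumasLick"]["obj"]["params"])
```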

Appendix: Key Highlights from System Prompts

Lead Agent Core Mission: Orchestrate multi-agent teams to produce ~2000-word strategic reports.
Execution Framework: 3 stages — URL discovery, content extraction, report generation.
Research Process: Rigorous query breakdown, query type classification, detailed research plan, methodical execution.
Query Types: Depth-first (multiple perspectives), Breadth-first (parallel sub-questions), Straightforward (single focused fact-finding).
Quality Gates: Stage-wise checks to ensure output completeness and credibility.
Adaptive Strategy: Resource scaling and failure recovery mechanisms.
Language Adaptation: Auto-detect language and regional context for source selection.
Synthesis Techniques: Focus on pattern recognition, predictive analysis, and actionable strategic intelligence.
Output Requirements: Only produce final strategic reports; no raw or intermediate data output.
