knowledge_base_report.json

Overview

knowledge_base_report.json defines a Knowledge Base Retrieval Q&A Agent that generates research reports from a local knowledge base. The configuration covers task planning, query decomposition, multi-perspective research, iterative retrieval, evidence integration, and structured report generation.

The agent is recommended for question answering over academic research papers: it returns in-depth, verified, and structured responses grounded strictly in retrieved knowledge base content, without fabrication.

Detailed Components Explanation

This JSON file is primarily a DSL (domain-specific language) description of an agent workflow: its components, their parameters, their connections, and the execution logic.

Top-Level Fields

id: Unique numeric identifier (20).
title: Object with multilingual titles:
- en: "Report Agent Using Knowledge Base"
- zh: "知识库检索智能体"
description: Multilingual description of the agent's purpose.
canvas_type: "Agent" indicating this is an agent workflow design.
dsl: The core definition describing components, globals, graph topology, and operational parameters.
avatar: Base64-encoded PNG image representing the agent's icon.
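
As a rough illustration of the top-level layout, the sketch below parses an abbreviated stand-in for the file and checks that the six fields listed above are present. The keys inside description and dsl, and the placeholder values, are assumptions made for the example; only the id, titles, and canvas_type come from the file as described here.

```python
import json

# Abbreviated stand-in for knowledge_base_report.json. The description text,
# the dsl sub-keys, and the avatar payload are placeholders (assumptions),
# not the real values from the file.
RAW = """
{
  "id": 20,
  "title": {"en": "Report Agent Using Knowledge Base", "zh": "知识库检索智能体"},
  "description": {"en": "placeholder", "zh": "placeholder"},
  "canvas_type": "Agent",
  "dsl": {"components": {}, "globals": {}, "graph": {}},
  "avatar": "<base64 PNG placeholder>"
}
"""

REQUIRED_FIELDS = ("id", "title", "description", "canvas_type", "dsl", "avatar")


def summarize(template: dict, lang: str = "en") -> str:
    """Return a one-line summary, falling back to the English title."""
    missing = [field for field in REQUIRED_FIELDS if field not in template]
    if missing:
        raise KeyError(f"missing top-level fields: {missing}")
    title = template["title"].get(lang, template["title"]["en"])
    return f'#{template["id"]} [{template["canvas_type"]}] {title}'


template = json.loads(RAW)
print(summarize(template))
print(summarize(template, "zh"))
```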

DSL Breakdown

1. Components

The key elements of this DSL are its components, which represent the nodes in the agent's workflow graph.

Begin
- Component Type: Begin
- Purpose: Entry point for the conversational interaction.
- Parameters:
  - enablePrologue: true (enables a greeting message)
  - prologue: "你好！我是你的助理，有什么可以帮助到你的吗？" (Chinese for "Hello! I am your assistant, how can I help you?")
  - mode: "conversational" (the interaction mode)
- Connections:
  - Downstream: Agent:NewPumasLick
  - Upstream: None (start node)
Agent:NewPumasLick
- Component Type: Agent
- Purpose: The core Knowledge Base Retrieval Q&A Agent.
- Parameters: Extensive configuration for how the agent handles queries.
  - LLM Settings:
    - llm_id: "qwen3-235b-a22b-instruct-2507@Tongyi-Qianwen"
    - temperature: 0.1 (low randomness for consistency)
    - max_tokens: 128000 (very large token limit)
    - max_retries: 3
    - max_rounds: 3
  - Penalties:
    - frequency_penalty: 0.5 (value is set, but the penalty is disabled by default)
    - presence_penalty: 0.5 (value is set, but the penalty is disabled by default)
  - Message History and Prompting:
    - message_history_window_size: 12
    - prompts: User prompt template uses {sys.query} placeholder.
    - sys_prompt: Very detailed system prompt describing the agent's role, execution framework, quality checks, core principles, LaTeX usage, failure handling, and language handling.
  - Tools:
    - A retrieval tool configured to query the knowledge base with parameters like similarity threshold (0.2), top-k results (1024), and top-n results (8).
  - Outputs:
    - Produces a string output in the content field.
- Connections:
  - Upstream: begin
  - Downstream: Message:OrangeYearsShine
  - Tool: Linked to Tool:AllBirdsNail through a tool edge (the file does not elaborate on this node's purpose)
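
The three retrieval parameters on the agent's tool can be read as a ranking pipeline. The sketch below assumes one plausible interpretation: top-k bounds the candidate pool, the similarity threshold drops weak matches, and top-n caps what reaches the LLM. The function name and the sample scores are hypothetical, and the actual retrieval engine may order these steps differently.

```python
from typing import List, Tuple

# Values taken from the agent's retrieval tool configuration.
SIMILARITY_THRESHOLD = 0.2
TOP_K = 1024
TOP_N = 8


def select_chunks(scored: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Assumed semantics: rank, keep a pool of TOP_K, filter, return TOP_N."""
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    candidates = ranked[:TOP_K]  # candidate pool
    kept = [c for c in candidates if c[1] >= SIMILARITY_THRESHOLD]
    return kept[:TOP_N]  # chunks handed to the LLM


# Hypothetical (chunk id, similarity) pairs for illustration only.
scored = [(f"chunk-{i}", round(i / 20, 2)) for i in range(20)]
selected = select_chunks(scored)
print([name for name, _ in selected])
```

With a low threshold such as 0.2 and a large pool, top-n is usually the binding limit, which is consistent with a configuration that favors recall and lets the agent's own quality checks filter further.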
Message:OrangeYearsShine
- Component Type: Message
- Purpose: Delivers the agent's generated content as a message output.
- Parameters:
  - content: Dynamic field referencing the agent's output ({Agent:NewPumasLick@content})
- Connections:
  - Upstream: Agent:NewPumasLick
  - Downstream: None (end node)
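
The {Agent:NewPumasLick@content} reference can be understood as a placeholder of the form {component@field}. The following is a minimal sketch of how such a reference might be resolved, assuming component outputs are kept in a nested dictionary; the resolve_refs helper and the regular expression are illustrative and are not taken from the platform's code.

```python
import re
from typing import Dict

# Matches placeholders shaped like {Component:Name@field}.
REF = re.compile(r"\{([^{}@]+)@([^{}@]+)\}")


def resolve_refs(template: str, outputs: Dict[str, Dict[str, str]]) -> str:
    """Replace each {component@field} with the matching output value."""

    def substitute(match: "re.Match[str]") -> str:
        component, field = match.group(1), match.group(2)
        try:
            return str(outputs[component][field])
        except KeyError:
            # Unknown references are left untouched rather than guessed.
            return match.group(0)

    return REF.sub(substitute, template)


outputs = {"Agent:NewPumasLick": {"content": "Final report text"}}
print(resolve_refs("{Agent:NewPumasLick@content}", outputs))
```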
Tool:AllBirdsNail (referenced by the agent, not part of the main flow)
- Component Type: Tool
- Description: "This is an agent for a specific task."
- Purpose: Not stated beyond the generic description above. Because the only tool configured on the agent is the retrieval tool, this node most likely represents it, but the file does not confirm that.
- Connections: None in the main flow; it is attached to Agent:NewPumasLick only as a tool.

2. Globals

Global system variables used in the workflow:

sys.conversation_turns: Initialized to 0; tracks number of dialogue turns.
sys.files: Empty list; possible placeholder for user-uploaded files or references.
sys.query: Empty string; placeholder where user's question input is stored.
sys.user_id: Empty string; user identification placeholder.
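
To show how these globals might change over a turn, here is a small sketch that copies the defaults, fills in the query and user id, and renders a user prompt containing the {sys.query} placeholder. The start_turn and render_user_prompt helpers are hypothetical, and incrementing the turn counter once per user message is an assumption about the runtime.

```python
import copy
from typing import Any, Dict

# Initial values as listed in the Globals section.
DEFAULT_GLOBALS: Dict[str, Any] = {
    "sys.conversation_turns": 0,
    "sys.files": [],
    "sys.query": "",
    "sys.user_id": "",
}


def start_turn(state: Dict[str, Any], query: str, user_id: str) -> Dict[str, Any]:
    """Return a new state for the incoming turn without mutating the input."""
    updated = copy.deepcopy(state)
    updated["sys.query"] = query
    updated["sys.user_id"] = user_id
    updated["sys.conversation_turns"] += 1  # assumed: one increment per turn
    return updated


def render_user_prompt(template: str, state: Dict[str, Any]) -> str:
    """Fill the {sys.query} placeholder used by the agent's user prompt."""
    return template.replace("{sys.query}", state["sys.query"])


state = start_turn(DEFAULT_GLOBALS, "What is retrieval-augmented generation?", "user_123")
print(render_user_prompt("User question: {sys.query}", state))
```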

3. Graph Structure

The workflow graph defines nodes and edges representing the flow of data and control.

Nodes: begin, Agent:NewPumasLick, Message:OrangeYearsShine, Tool:AllBirdsNail
Edges:
- begin → Agent:NewPumasLick
- Agent:NewPumasLick → Message:OrangeYearsShine
- Agent:NewPumasLick tool edge → Tool:AllBirdsNail (auxiliary tool invocation)
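
The distinction between flow edges and the tool edge can be sketched as follows. The tuple representation and the "flow"/"tool" labels are assumptions made for the example; the file's actual edge schema is not reproduced here.

```python
from collections import deque
from typing import Dict, List, Tuple

NODES = ["begin", "Agent:NewPumasLick", "Message:OrangeYearsShine", "Tool:AllBirdsNail"]

# (source, target, kind); the kind labels are illustrative, not from the file.
EDGES: List[Tuple[str, str, str]] = [
    ("begin", "Agent:NewPumasLick", "flow"),
    ("Agent:NewPumasLick", "Message:OrangeYearsShine", "flow"),
    ("Agent:NewPumasLick", "Tool:AllBirdsNail", "tool"),
]


def execution_order(start: str = "begin") -> List[str]:
    """Breadth-first walk over flow edges only; tool edges are skipped."""
    downstream: Dict[str, List[str]] = {node: [] for node in NODES}
    for source, target, kind in EDGES:
        if kind == "flow":
            downstream[source].append(target)
    order, queue, seen = [], deque([start]), {start}
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in downstream[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def tools_of(node: str) -> List[str]:
    """Tools attached to a node; they are invoked by it, not run after it."""
    return [target for source, target, kind in EDGES if source == node and kind == "tool"]


print(execution_order())
print(tools_of("Agent:NewPumasLick"))
```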

Important Implementation Details and Algorithms

Core Algorithmic Flow Described in System Prompt

The agent's behavior is defined in the sys_prompt string as a sequence of explicit stages:

Assessment & Decomposition: Automatically extracts key elements (topics, entities, scope) and decomposes the query into 5-20 factual data points.
Query Type Determination: Rule-based decision between:
- Depth-first approach for method comparisons or multiple explanations.
- Breadth-first for 3+ independent sub-questions.
- Simple query for straightforward fact/specification questions.
Research Plan Formulation: Tailors search strategies based on query type, defining perspectives, keywords, document types, and output format.
Retrieval Execution: Uses the linked retrieval tool to fetch relevant knowledge base content, with automatic query rewriting and retry loops if quality or coverage standards are unmet.
Integration & Reasoning: Constructs responses as fact-evidence-reasoning chains, attaching the 1-2 strongest pieces of supporting evidence to each conclusion.
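
In the actual agent these stages are carried out by the LLM following the system prompt, not by hand-written code. Purely as an illustration of the rules, the sketch below expresses the query type decision and the rewrite-and-retry loop in Python. The rule precedence, the function names, the attempt limit, and the stub callbacks are all assumptions.

```python
from typing import Callable, List


def classify_query(is_comparison: bool, has_multiple_explanations: bool,
                   independent_subquestions: int) -> str:
    """Rule-based query type; checking depth-first first is an assumption."""
    if is_comparison or has_multiple_explanations:
        return "depth-first"
    if independent_subquestions >= 3:
        return "breadth-first"
    return "simple"


def retrieve_with_retries(query: str,
                          search: Callable[[str], List[str]],
                          rewrite: Callable[[str, int], str],
                          good_enough: Callable[[List[str]], bool],
                          max_attempts: int = 3) -> List[str]:
    """Search, and rewrite the query while results fail the quality check."""
    current, results = query, []
    for attempt in range(max_attempts):
        results = search(current)
        if good_enough(results):
            break
        current = rewrite(current, attempt)
    return results


# Hypothetical stand-ins for the retrieval tool and the LLM-driven rewriting.
def stub_search(q: str) -> List[str]:
    return ["hit"] if "synonym" in q else []


def stub_rewrite(q: str, attempt: int) -> str:
    return q + " synonym"


print(classify_query(True, False, 0))
print(retrieve_with_retries("llm evaluation", stub_search, stub_rewrite, bool))
```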

Quality Gate Checklist

The agent performs verification at three stages:

Decomposition: checks key concepts and data points.
Retrieval: ensures quality, diversity, and timeliness of sources.
Generation: verifies evidential support, assumptions, and user expectations.
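
One way to picture the checklist is as a mapping from stage to named checks, where a stage passes only when every check holds. The check names below paraphrase the three stages above and are hypothetical; the system prompt expresses them in natural language.

```python
from typing import Dict, List

# Check names are illustrative paraphrases, not identifiers from the file.
QUALITY_GATES: Dict[str, List[str]] = {
    "decomposition": ["key_concepts_identified", "data_points_listed"],
    "retrieval": ["source_quality", "source_diversity", "source_timeliness"],
    "generation": ["evidence_supports_claims", "assumptions_stated",
                   "meets_user_expectations"],
}


def failed_checks(stage: str, results: Dict[str, bool]) -> List[str]:
    """Return the checks of a stage that are missing or false."""
    return [check for check in QUALITY_GATES[stage] if not results.get(check, False)]


retrieval_results = {"source_quality": True, "source_diversity": True}
print(failed_checks("retrieval", retrieval_results))
```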

Core Principles

Strict reliance on explicitly retrieved knowledge base content.
No generation or fabrication of information.
Preference for accuracy, even if incomplete.
Structured, logical, professional output formatting.
Support for LaTeX in responses.

Failure and Interaction Strategy

Explicit user notification if critical facts are missing.
Time filtering and indication of retrieval date for time-sensitive queries.
Responds in user's preferred language.

Usage Example

The file itself is configuration data, so the following is only a conceptual example of the inputs for a single turn:

{
  "sys.query": "Compare the effectiveness of different AI models for academic research.",
  "sys.user_id": "user_123"
}

The begin node initiates the interaction with a greeting.
The query passes to Agent:NewPumasLick, which decomposes the question, selects the depth-first strategy (the query is a comparison), formulates a research plan, retrieves relevant documents, and synthesizes a structured report.
The final report content is passed to Message:OrangeYearsShine which outputs the answer to the user interface.
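
The three steps above can be mimicked with a stub in place of the real agent. In this sketch, run_turn and stub_agent are hypothetical names, and the returned dictionary is an invented shape for the message handed to the user interface.

```python
from typing import Callable, Dict

payload = {
    "sys.query": "Compare the effectiveness of different AI models for academic research.",
    "sys.user_id": "user_123",
}


def stub_agent(query: str) -> str:
    """Stand-in for Agent:NewPumasLick; a real run would call the LLM and retrieval."""
    return f"[stub report] {query}"


def run_turn(inputs: Dict[str, str], agent: Callable[[str], str]) -> Dict[str, str]:
    """Begin hands the query to the agent; Message wraps the agent's content."""
    content = agent(inputs["sys.query"])
    return {"user_id": inputs.get("sys.user_id", ""), "content": content}


message = run_turn(payload, stub_agent)
print(message["content"])
```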

Interaction with Other System Parts

Knowledge Base: The agent relies on a local knowledge base accessed via the Retrieval tool component.
Language Model: Uses a specific LLM instance (qwen3-235b-a22b-instruct-2507@Tongyi-Qianwen) for reasoning and text generation.
User Interface: The Begin and Message components interface with the user, receiving queries and delivering answers.
Auxiliary Tools: Tool:AllBirdsNail is attached to the agent through a tool edge. Its description is generic, so the file does not document its exact role; it plausibly corresponds to the retrieval tool configured on the agent.

This file serves as a self-contained agent workflow module within a larger AI assistant platform supporting multi-turn conversational Q&A with knowledge base grounding.

Mermaid Flowchart Diagram

The following Mermaid flowchart illustrates the main components and their relationships in this file:

flowchart LR
    Begin[Begin: Conversation Start]
    Agent[Agent: Knowledge Base Retrieval Q&A Agent]
    Message[Message: Output Response]
    Tool[Tool: Auxiliary Processing]

    Begin --> Agent
    Agent --> Message
    Agent -- uses --> Tool

Begin initiates the conversation.
Agent processes user queries with complex reasoning and retrieval.
Message delivers final content.
Tool is an auxiliary component employed by the agent.

Summary

knowledge_base_report.json is an agent configuration file that defines a multi-stage knowledge base retrieval and report generation assistant. It combines query decomposition, rule-based planning, iterative retrieval, and evidence-based synthesis for academic research use, and it constrains output strictly to what the knowledge base contains. Within a broader AI assistant platform, the file provides a self-contained module for structured, grounded Q&A.