retrieval_and_generate.json
Overview
This JSON file defines a modular pipeline configuration for a conversational AI system that performs knowledge retrieval and response generation. It specifies a sequence of components—starting with a greeting, then retrieving relevant knowledge base content, generating a response using a language model, and finally formatting the message to be sent back to the user.
The pipeline orchestrates the flow of data between components, specifying upstream and downstream relationships, allowing the system to process user queries by retrieving relevant information and generating coherent, context-aware answers.
Detailed Explanation of Components and Structure
The file is structured around a components object, where each key identifies a component instance, and each value describes the component's type, parameters, and connectivity (upstream and downstream components).
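As a rough illustration of that layout, the sketch below loads a trimmed-down config and checks that every upstream and downstream reference points at a defined component. The field names `type`, `params`, `upstream`, and `downstream` are assumptions made for this example; the actual file may name or nest them differently.

```python
import json

# Hypothetical, trimmed-down config; the field names are illustrative assumptions.
CONFIG_TEXT = """
{
  "components": {
    "begin":       {"type": "Begin",     "params": {"prologue": "Hi there!"},
                    "upstream": [],              "downstream": ["retrieval:0"]},
    "retrieval:0": {"type": "Retrieval", "params": {"top_n": 6},
                    "upstream": ["begin"],       "downstream": ["generate:0"]},
    "generate:0":  {"type": "LLM",       "params": {"temperature": 0.2},
                    "upstream": ["retrieval:0"], "downstream": ["message:0"]},
    "message:0":   {"type": "Message",   "params": {"content": ["{generate:0@content}"]},
                    "upstream": ["generate:0"],  "downstream": []}
  }
}
"""


def find_dangling_links(config: dict) -> list:
    """Return descriptions of links that point at undefined components."""
    components = config["components"]
    problems = []
    for key, spec in components.items():
        for direction in ("upstream", "downstream"):
            for target in spec.get(direction, []):
                if target not in components:
                    problems.append(f"{key}: unknown {direction} '{target}'")
    return problems


config = json.loads(CONFIG_TEXT)
dangling = find_dangling_links(config)
print(dangling or "all links resolve")
```

A check like this is cheap to run before executing a pipeline, since a misspelled component key would otherwise only surface at run time.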
Components Overview
| Component Key | Component Type | Purpose |
|---|---|---|
| begin | Begin | Starting point of the pipeline; initiates conversation with a prologue message. |
| retrieval:0 | Retrieval | Retrieves relevant knowledge base entries based on query similarity. |
| generate:0 | LLM | Uses a large language model to generate an answer based on the retrieved content and conversation history. |
| message:0 | Message | Formats and outputs the generated response to the user. |
Component Details
1. begin
Type: Begin
Parameters:
prologue (string): Initial greeting message to start the conversation. Example: "Hi there!"
Upstream: None (pipeline start)
Downstream: retrieval:0
Functionality:
Serves as the entry point of the pipeline, providing an initial greeting or prompt to the system or user. This component does not depend on any prior input and triggers the retrieval process next.
2. retrieval:0
Type: Retrieval
Parameters:
similarity_threshold (float): Minimum similarity score to consider a knowledge base entry relevant (0.2).
keywords_similarity_weight (float): Weight for keyword matching in similarity computation (0.3).
top_n (int): Number of top similar items to return (6).
top_k (int): Maximum number of tokens to consider or candidates to scan (1024).
rerank_id (string): ID for reranking model or method; empty if unused.
empty_response (string): Fallback message if no relevant data is found. Example: "Nothing found in dataset"
kb_ids (array of strings): List of knowledge base identifiers to search within. Example: ["1a3d1d7afb0611ef9866047c16ec874f"]
Upstream: begin
Downstream: generate:0
Functionality:
This component performs semantic retrieval from specified knowledge base(s). It uses configurable similarity thresholds and weights to rank and filter candidate knowledge chunks. If no relevant content is found, it returns a default empty response.
Implementation Notes:
Likely uses vector embeddings or keyword-based similarity scoring.
Retrieves top-N entries to supply to the LLM component.
Handles reranking if a rerank_id is specified (empty here).
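To make the filtering behavior concrete, here is a minimal sketch of how `similarity_threshold`, `top_n`, and `empty_response` could interact. The `Chunk` type, the `select_chunks` helper, and the inclusive threshold comparison are all assumptions for illustration, not the component's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float  # similarity to the query, assumed to lie in [0, 1]


def select_chunks(chunks, similarity_threshold=0.2, top_n=6,
                  empty_response="Nothing found in dataset"):
    """Keep chunks at or above the threshold, best first, at most top_n.

    Returns the joined chunk texts, or empty_response if nothing qualifies.
    """
    relevant = [c for c in chunks if c.score >= similarity_threshold]
    relevant.sort(key=lambda c: c.score, reverse=True)
    selected = relevant[:top_n]
    if not selected:
        return empty_response
    return "\n".join(c.text for c in selected)


candidates = [Chunk("a", 0.9), Chunk("b", 0.1), Chunk("c", 0.5)]
print(select_chunks(candidates))          # "b" falls below the 0.2 threshold
print(select_chunks([Chunk("b", 0.1)]))   # falls back to empty_response
```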
3. generate:0
Type: LLM (Large Language Model)
Parameters:
llm_id (string): Identifier of the language model used. Example: "deepseek-chat"
sys_prompt (string): System prompt template guiding the LLM's behavior. Contains instructions to summarize and answer questions based on retrieved knowledge base content, including a fallback sentence if no relevant information is found.
Template snippet: You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence "The answer you are looking for is not found in the knowledge base!" Answers need to consider chat history. Here is the knowledge base: {retrieval:0@formalized_content} The above is the knowledge base.
temperature (float): Sampling temperature controlling randomness in generation (0.2).
Upstream: retrieval:0
Downstream: message:0
Functionality:
Generates a detailed and context-aware response using the LLM based on the retrieved knowledge base content and conversation history. Uses the provided system prompt to instruct the model on how to formulate the answer, including handling cases where relevant knowledge is missing.
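The system prompt reaches the model only after its `{retrieval:0@formalized_content}` placeholder has been filled in. The sketch below shows one way such a substitution could work; the `resolve_placeholders` helper, the regular expression, and the shape of the `outputs` dictionary are hypothetical and may not match how the framework resolves references.

```python
import re

# Matches placeholders of the form {component@field},
# for example {retrieval:0@formalized_content}.
PLACEHOLDER = re.compile(r"\{([^{}@]+)@([^{}@]+)\}")


def resolve_placeholders(template: str, outputs: dict) -> str:
    """Replace each {component@field} with the matching entry in outputs."""
    def substitute(match):
        component, field = match.group(1), match.group(2)
        # Leave the placeholder untouched if no value is available yet.
        return str(outputs.get(component, {}).get(field, match.group(0)))
    return PLACEHOLDER.sub(substitute, template)


SYS_PROMPT = (
    "Here is the knowledge base: {retrieval:0@formalized_content} "
    "The above is the knowledge base."
)
# Assumed shape: component key -> {output field -> value}.
outputs = {"retrieval:0": {"formalized_content": "Paris is the capital of France."}}
prompt = resolve_placeholders(SYS_PROMPT, outputs)
print(prompt)
```

The same helper would apply to the `{generate:0@content}` reference used by the message component, since it follows the same `component@field` pattern.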
4. message:0
Type: Message
Parameters:
content (array of strings): Content to output, referencing the result of the generate:0 component. Example: ["{generate:0@content}"]
Upstream: generate:0
Downstream: None (end of pipeline)
Functionality:
Formats and delivers the final response content generated by the LLM to the user or downstream systems. This is the endpoint where the AI's answer is composed for output.
Additional Fields
history (array): Initially empty; likely tracks past interactions or conversation turns.
path (array): Empty; could be used for tracking component execution paths.
retrival (object): Contains empty chunks and doc_aggs; placeholders for retrieval results or document aggregations.
globals (object): Holds system-wide variables such as user query, user ID, conversation turn count, and file references.
Usage Example
Assuming the system receives a user query, the process flow is:
1. begin outputs "Hi there!" or initiates the conversation.
2. retrieval:0 searches the knowledge base "1a3d1d7afb0611ef9866047c16ec874f" for relevant content matching the query.
3. generate:0 uses the retrieved content and system prompt to craft a detailed answer.
4. message:0 outputs the final generated content back to the user.
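The flow above can be derived mechanically from the downstream links. The following sketch walks those links breadth-first from `begin`; the `DOWNSTREAM` mapping and the `execution_order` function are illustrative names, and a real executor may schedule components differently.

```python
from collections import deque

# Hypothetical wiring that mirrors the four components described above.
DOWNSTREAM = {
    "begin": ["retrieval:0"],
    "retrieval:0": ["generate:0"],
    "generate:0": ["message:0"],
    "message:0": [],
}


def execution_order(downstream: dict, start: str = "begin") -> list:
    """Follow downstream links from start and return the visit order."""
    order = []
    queue = deque([start])
    seen = {start}
    while queue:
        current = queue.popleft()
        order.append(current)
        for nxt in downstream[current]:
            if nxt not in seen:  # guard against cycles and repeat visits
                seen.add(nxt)
                queue.append(nxt)
    return order


order = execution_order(DOWNSTREAM)
print(" -> ".join(order))
```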
Implementation Details and Algorithms
The Retrieval component likely implements a hybrid similarity scoring algorithm combining keyword similarity and vector-based semantic similarity, controlled by keywords_similarity_weight.
The LLM is prompted with a carefully designed system prompt to encourage summarization and detailed answering, including explicit instructions for handling irrelevant or missing knowledge.
The pipeline is designed for modularity and extensibility, with upstream/downstream links defining the data flow.
Placeholders such as {retrieval:0@formalized_content} and {generate:0@content} enable dynamic content injection between components.
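If the hybrid scoring is a linear blend, which is an assumption rather than something the config confirms, `keywords_similarity_weight` would set the share given to keyword matching and the remainder would go to vector similarity. A minimal sketch under that assumption, with a hypothetical `hybrid_score` function:

```python
def hybrid_score(keyword_sim: float, vector_sim: float,
                 keywords_similarity_weight: float = 0.3) -> float:
    """Blend keyword and vector similarity with a linear weight.

    Assumes the remaining weight (1 - keywords_similarity_weight)
    is assigned to the vector similarity.
    """
    if not 0.0 <= keywords_similarity_weight <= 1.0:
        raise ValueError("keywords_similarity_weight must be in [0, 1]")
    vector_weight = 1.0 - keywords_similarity_weight
    return keywords_similarity_weight * keyword_sim + vector_weight * vector_sim


# With the configured weight of 0.3: 0.3 * 0.5 + 0.7 * 0.8
score = hybrid_score(keyword_sim=0.5, vector_sim=0.8)
print(round(score, 2))
```

Under this reading, the configured weight of 0.3 favors semantic similarity while still letting exact keyword overlap influence the ranking.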
Interaction with Other System Parts
This pipeline likely integrates within a larger conversational AI framework where user queries are fed into the pipeline and responses are delivered back to the user interface.
The kb_ids parameter links to external knowledge bases managed separately in the system.
The LLM component references a model identifier (deepseek-chat), which is managed by a model serving backend.
The globals section interacts with session-level data such as user ID and conversation turns, supporting personalized and context-aware conversations.
Other related pipeline configurations may exist for different tasks, utilizing similar components with different parameters or knowledge bases.
Visual Diagram
flowchart TD
Begin["Begin\n(prologue: \"Hi there!\")"]
Retrieval["Retrieval\n(similarity_threshold: 0.2,\nkeywords_similarity_weight: 0.3,\ntop_n: 6,\ntop_k: 1024,\nempty_response: \"Nothing found in dataset\",\nkb_ids: [\"1a3d1d7afb0611ef9866047c16ec874f\"] )"]
LLM["LLM\n(llm_id: \"deepseek-chat\",\ntemperature: 0.2)"]
Message["Message\n(content from LLM output)"]
Begin --> Retrieval
Retrieval --> LLM
LLM --> Message
Summary
This retrieval_and_generate.json file configures a sequential pipeline for conversational AI that begins with a greeting, retrieves relevant knowledge from a specified knowledge base, generates a detailed answer using an LLM with a custom prompt, and formats the answer for output. The modular design allows easy adjustment of parameters such as retrieval thresholds, knowledge bases, and generation settings while maintaining a clear data flow through upstream/downstream relationships.