headhunter_zh.json

Overview

headhunter_zh.json defines a conversational dialogue flow configuration for a Chinese-language AI-driven recruitment chatbot specialized in the AGI (Artificial General Intelligence) domain. This JSON file structures the components, their parameters, and the flow logic for interacting with candidates, handling responses, categorizing queries, generating replies, and managing conversation pathways.

The chatbot’s primary purpose is to engage with potential candidates for senior engineering positions at RAGFlow, gracefully handling various types of user input — from job-related questions to casual chit-chat and rejection responses — while guiding the conversation towards collecting contact information (e.g., WeChat ID) and delivering relevant job information.

Detailed Explanation of Components and Workflow

This file contains a set of conversational components, each represented as nodes with specific roles, parameters, and connectivity. The components are connected via upstream and downstream references, establishing the conversational flow.

Components

Each component is a node with:

obj: Describes the component type and parameters.
downstream: List of downstream component IDs triggered after this component.
upstream: List of upstream component IDs that lead to this component.

Component Types

Below are detailed descriptions of each component type and their usage in this flow:

1. Begin

Purpose: Entry point to start the conversation.
Parameters:
- prologue (string): Initial greeting message introducing the recruiter and the job opportunity.

Usage Example:

{
  "component_name": "Begin",
  "params": {
    "prologue": "您好！我是AGI方向的猎头，了解到您是这方面的大佬，..."
  }
}

Flow: Starts the conversation and passes control to answer:0.

2. Answer

Purpose: Captures user responses.
Parameters: None.
Usage: Acts as a placeholder to receive candidate input and forward it to the next categorization or processing step.
Instances: answer:0, answer:1
Flow:
- answer:0 receives initial responses, sends to categorize:0.
- answer:1 receives follow-up answers, sends to categorize:1.

3. Categorize

Purpose: Classifies user input into predefined categories to decide the next action.
Parameters:
- llm_id: Identifier for the language model used for classification (e.g., "deepseek-chat").
- category_description: Defines categories, their descriptions, example inputs, and routing targets.
- message_history_window_size (optional): Number of previous messages to consider for context.
Categories:
- about_job: Job-related queries → routed to job information retrieval.
- casual: Non-job-related chit-chat → routed to casual response generation.
- interested: Expressions of interest → routed to job introduction message.
- answer: Expressions of disinterest or rejection → routed to polite rejection message.
- wechat: Willingness to share WeChat → routed to WeChat handling responses.
- giveup: Refusal to share WeChat → routed to polite handling of refusal.

Usage Example:

{
  "component_name": "Categorize",
  "params": {
    "llm_id": "deepseek-chat",
    "category_description": {
      "about_job": {"description": "...", "examples": "...", "to": "retrieval:0"},
      "casual": {...},
      ...
    }
  }
}

Flow: Based on classification, routes to the appropriate next component, e.g., retrieval:0, generate:casual, message:introduction.

4. Message

Purpose: Sends predefined messages to users.
Parameters:
- messages: Array of string messages to send.
Instances:
- message:introduction: Introduces RAGFlow and invites further questions.
- message:reject: Polite closing messages for uninterested candidates.

Usage Example:

{
  "component_name": "Message",
  "params": {
    "messages": [
      "我简单介绍以下：\nRAGFlow 是一款基于深度文档理解构建的开源 RAG（Retrieval-Augmented Generation）引擎。",
      "您那边还有什么要了解的？"
    ]
  }
}

Flow: Sends messages and waits for candidate answers.

5. Generate

Purpose: Uses a language model (LLM) to generate dynamic responses based on prompts and context.
Parameters:
- llm_id: Language model identifier.
- prompt: Instruction and context for generating replies.
- temperature (optional): Controls randomness in generation.
- message_history_window_size (optional): Number of past messages for context.
- cite (optional): Whether to include citations.
Instances:
- generate:casual: Handles non-job related chit-chat, steering conversation back to job topics and requesting WeChat.
- generate:aboutJob: Answers job-related questions using retrieved job info.
- generate:get_wechat: Handles positive responses about WeChat sharing.
- generate:nowechat: Handles refusals to share WeChat politely and empathetically.

Usage Example:

{
  "component_name": "Generate",
  "params": {
    "llm_id": "deepseek-chat",
    "prompt": "你是AGI方向的猎头，候选人问了有关职位或公司的问题，...",
    "temperature": 0.02
  }
}

Flow: Generates replies to user queries; output typically flows back to the Answer component for further processing.

6. Retrieval

Purpose: Retrieves relevant job-related information from a knowledge base to answer candidate questions.
Parameters:
- similarity_threshold: Minimum similarity score to consider a document relevant.
- keywords_similarity_weight: Weight given to keyword matches in similarity calculations.
- top_n: Number of top documents to retrieve.
- top_k: Search scope for retrieval.
- rerank_id: Identifier for a reranking model to improve retrieval quality.
- kb_ids: Knowledge base IDs to query.

Usage Example:

{
  "component_name": "Retrieval",
  "params": {
    "similarity_threshold": 0.2,
    "keywords_similarity_weight": 0.3,
    "top_n": 6,
    "top_k": 1024,
    "rerank_id": "BAAI/bge-reranker-v2-m3",
    "kb_ids": ["869a236818b811ef91dffa163e197198"]
  }
}

Flow: Retrieves job info, then passes results to generate:aboutJob for answer generation.

Workflow Summary

Begin initiates conversation with a greeting and job introduction.
Candidate's first response is captured by answer:0.
categorize:0 classifies the response:
- If interested → message:introduction (job intro).
- If casual → generate:casual (steer back to job).
- If about job → retrieval:0 → generate:aboutJob (answers job questions).
- If reject → message:reject (polite goodbye).
answer:1 captures follow-up responses.
categorize:1 further classifies follow-ups with additional categories including WeChat sharing.
According to classification, the flow continues with generation components like generate:get_wechat or generate:nowechat to handle contact info sharing.
Conversation loops through answers and categorizations, maintaining context with message history windows.

Important Implementation Details and Algorithms

Natural Language Classification:
Uses a fine-tuned language model (deepseek-chat) to categorize candidate responses into semantic categories (job-related, casual, interested, reject, wechat willingness/refusal). This guides the branching logic dynamically.
Retrieval-Augmented Generation (RAG):
Job-related questions trigger a retrieval component that fetches relevant documents from a knowledge base using similarity search and reranking. The retrieved info is then fed into a generation prompt to produce accurate, context-aware answers.
Context Awareness:
Several components use a message_history_window_size parameter to keep context from previous exchanges, improving the coherence of generation and classification.
Politeness and Engagement:
The system includes specific prompts to maintain polite, empathetic interaction, especially when candidates reject offers or refuse to share WeChat, encouraging future contact.
Contact Information Handling:
Special logic distinguishes between candidates willing or unwilling to share WeChat, adapting responses accordingly to maintain a positive user experience.

Interaction with Other System Parts

Knowledge Base:
The retrieval:0 component interfaces with external knowledge base(s) identified by kb_ids to fetch job-related content.
Language Models:
Classification and generation components rely on LLMs (specifically "deepseek-chat") for understanding and replying.
Conversation Manager:
This JSON acts as a configuration input to a conversation orchestrator engine that executes the flow: sending messages, capturing answers, invoking LLM APIs, retrieval services, and routing messages between components.
User Interface:
The chatbot UI sends user inputs to Answer components and displays outputs from Message or generated texts.
Data Persistence:
Contextual data such as conversation history, message windows, and candidate responses are maintained externally and referenced by components for contextual understanding.

Usage Example

A typical interaction flow:

System sends the Begin prologue:
"您好！我是AGI方向的猎头，了解到您是这方面的大佬..."
Candidate replies: "请问具体工作内容是啥？"
answer:0 captures reply → categorize:0 classifies as about_job → retrieval:0 fetches job info → generate:aboutJob answers with job details and invites WeChat.
Candidate replies: "可以，微信是windblow_2231"
answer:1 → categorize:1 classifies as wechat → generate:get_wechat responds with thanks and contact info.

Visual Diagram

flowchart TD
    Begin["Begin\n(prologue)"]
    Answer0["Answer:0"]
    Categorize0["Categorize:0\n(classify initial response)"]
    MessageIntro["Message:Introduction\n(job intro message)"]
    Answer1["Answer:1"]
    Categorize1["Categorize:1\n(classify follow-up)"]
    MessageReject["Message:Reject\n(politeness on rejection)"]
    GenerateCasual["Generate:Casual\n(handle chit-chat)"]
    Retrieval["Retrieval:0\n(fetch job info)"]
    GenerateAboutJob["Generate:AboutJob\n(answer job questions)"]
    GenerateGetWeChat["Generate:Get_WeChat\n(handle positive WeChat)"]
    GenerateNoWeChat["Generate:NoWeChat\n(handle refusal WeChat)"]

    Begin --> Answer0
    Answer0 --> Categorize0

    Categorize0 -->|interested| MessageIntro
    Categorize0 -->|about_job| Retrieval
    Categorize0 -->|casual| GenerateCasual
    Categorize0 -->|answer (reject)| MessageReject

    MessageIntro --> Answer1
    Retrieval --> GenerateAboutJob
    GenerateAboutJob --> Answer1
    GenerateCasual --> Answer1
    MessageReject --> Answer0

    Answer1 --> Categorize1

    Categorize1 -->|about_job| Retrieval
    Categorize1 -->|casual| GenerateCasual
    Categorize1 -->|wechat| GenerateGetWeChat
    Categorize1 -->|giveup| GenerateNoWeChat

    GenerateGetWeChat --> Answer1
    GenerateNoWeChat --> Answer1

Diagram Explanation:
This flowchart illustrates the main components and their interactions in the conversational process. It shows the two categorization stages that direct conversation based on candidate responses, the use of retrieval to get job information, and generation components that handle dynamic replies and contact information negotiation.

Summary

headhunter_zh.json is a structured, modular dialogue flow definition for a Chinese AGI-focused recruitment chatbot. It leverages advanced classification, retrieval, and generation techniques to conduct natural, context-aware conversations with candidates, providing job information and collecting contact details with sensitivity and politeness. The file is designed to be used by a conversational AI framework that manages execution, user interaction, and integration with external knowledge bases and language models.