headhunter_zh.json

Overview

headhunter_zh.json defines a Chinese-language dialogue flow for a headhunter chatbot that recruits in the AGI (Artificial General Intelligence) field. The file encodes the workflow as a graph of nodes and edges: each node is one step in the conversation (sending a message, categorizing user input, generating a response, or retrieving information), and each edge is a possible transition between steps, chosen by the user's response or by internal logic.

The primary purpose of this file is to guide the chatbot through multi-turn conversations with potential candidates, handling different types of user inputs such as interest in the job, casual chat, job-related inquiries, acceptance or refusal to share WeChat contact information, and polite rejection. The flow integrates retrieval-augmented generation (RAG) techniques, leveraging a knowledge base about the job position and company to provide informed answers.

This JSON configuration is likely consumed by a RAGFlow-based conversational AI engine, which interprets nodes and edges to orchestrate the chatbot’s behavior.

Detailed Explanation of Components

Node Types

beginNode: Entry point of the conversation, usually initiating contact.
ragNode: General-purpose type used in this file for the answer, message, generate, and retrieval nodes.
categorizeNode: Nodes that classify user input into categories to decide the next step.

Each node has:

id: Unique identifier.
type: Type of node.
position: Coordinates for UI representation (not affecting logic).
data: Contains label, internal name, and form data such as prompts, messages, or categorization rules.
sourcePosition and targetPosition: UI-related metadata for edge connections.
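
To make the field list concrete, here is a minimal sketch of what one node object might look like once parsed. The top-level field names come from the list above; the key names inside data (label, name, form, prologue), the coordinate values, and the "left"/"right" strings are assumptions for illustration and are not copied from the file.

```python
import json

# Hypothetical node shaped after the fields listed above.
# Values and the inner key names of "data" are placeholders (assumptions).
node_json = """
{
  "id": "begin",
  "type": "beginNode",
  "position": {"x": 0, "y": 0},
  "data": {
    "label": "Begin",
    "name": "begin",
    "form": {"prologue": "..."}
  },
  "sourcePosition": "left",
  "targetPosition": "right"
}
"""

node = json.loads(node_json)

# position, sourcePosition, and targetPosition are UI layout metadata,
# so a logic-only view of the node can drop them.
LAYOUT_KEYS = {"position", "sourcePosition", "targetPosition"}
logic_view = {k: v for k, v in node.items() if k not in LAYOUT_KEYS}
print(sorted(logic_view))  # ['data', 'id', 'type']
```

Separating layout keys from the rest is only a reading aid here; whether the engine itself makes that split is not stated in the file.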

Key Nodes and Their Roles

1. `begin` (beginNode)

Purpose: Starts the conversation with a polite introductory message.
Data:
- Prologue message introducing the headhunter and the job opportunity.
Usage Example:
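您好！我是AGI方向的猎头，了解到您是这方面的大佬，然后冒昧的就联系到您。这边有个机会想和您分享，RAGFlow正在招聘您这个岗位的资深的工程师不知道您那边是不是感兴趣？
(English: Hello! I am a headhunter working in the AGI field. I heard that you are an expert in this area, so I took the liberty of contacting you. I have an opportunity I would like to share: RAGFlow is hiring a senior engineer for your role. Would you be interested?)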

2. `answer:0` and `answer:1` (ragNode)

Purpose: Receive and process candidate’s answers at different conversation stages.
Data: No specific form data; placeholders for responses.
Usage: Acts as input points after messages or generation nodes to collect user replies.

3. `categorize:0` and `categorize:1` (categorizeNode)

Purpose: Use LLM-powered categorization (llm_id: deepseek-chat) to classify candidate responses into predefined categories.
Parameters:
- category_description: Defines categories with descriptions, example user inputs, and the next node to route to.
- message_history_window_size: (Only in categorize:1) Controls how many previous messages are considered.
Categories Example for categorize:0:
- about_job: Questions related to job or company.
- casual: Casual, off-topic chat.
- interested: Shows interest in the job.
- answer: Shows lack of interest or rejects the offer. Despite its name, this category routes to message:reject.
Usage: Routes conversation flow based on candidate's intent.
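
The routing step for categorize:0 can be sketched as a lookup from the label returned by the classifier to the next node id. The category names and their targets follow this document; the "description" and "to" key names, the description strings, and the fallback behavior for an unrecognized label are assumptions, not taken from the file.

```python
# Hypothetical category table for categorize:0.
# Category names and targets follow the document; key names are assumptions.
CATEGORY_DESCRIPTION = {
    "about_job": {"description": "Questions about the job or company", "to": "retrieval:0"},
    "casual": {"description": "Casual, off-topic chat", "to": "generate:casual"},
    "interested": {"description": "Shows interest in the job", "to": "message:introduction"},
    "answer": {"description": "Shows no interest or rejects", "to": "message:reject"},
}


def route(label: str, table: dict, default: str) -> str:
    """Map a classifier label to the next node id.

    Falling back to a default category for unknown labels is an
    illustrative choice, not documented engine behavior.
    """
    entry = table.get(label.strip())
    return entry["to"] if entry else table[default]["to"]


print(route("interested", CATEGORY_DESCRIPTION, default="casual"))
print(route("something else", CATEGORY_DESCRIPTION, default="casual"))
```

In the real flow the label comes from the LLM named by llm_id; the sketch only covers what happens after a label is available.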

4. `message:introduction` and `message:reject` (ragNode)

Purpose:
- message:introduction: Provides a detailed intro about RAGFlow and invites questions.
- message:reject: Politely ends conversation when candidate is uninterested.
Data:
- messages: Array of fixed messages to send.
Usage Example:
- Introduction message explains RAGFlow’s capabilities and invites further queries.
- Reject message sends polite closing statements.

5. `generate:*` Nodes (ragNode)

Purpose: Generate dynamic responses using LLM with specific prompts based on candidate’s inputs and conversation context.
Nodes include:
- generate:casual: Handles casual chat, steers the conversation back to the job, and tries to obtain the candidate's WeChat ID.
- generate:aboutJob: Answers job-related questions using retrieval results.
- generate:get_wechat: Responds when candidate is willing to share WeChat or has already shared it.
- generate:nowechat: Politely handles a refusal to share WeChat, encourages further discussion, and tactfully revisits the WeChat request.
Parameters:
- llm_id: LLM model identifier (deepseek-chat).
- prompt: Template guiding the LLM’s response style and content.
- temperature: Controls randomness of generation.
- message_history_window_size: Limits dialogue context window.
- cite: Whether to include citations (mostly false).
Usage Example Prompt (generate:aboutJob):
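你是AGI方向的猎头，候选人问了有关职位或公司的问题，你根据以下职位信息回答。如果职位信息中不包含候选人的问题就回答不清楚、不知道、有待确认等。回答完后引导候选人加微信号...
(English: You are a headhunter in the AGI field. The candidate has asked a question about the position or the company; answer it based on the job information below. If the job information does not cover the question, reply that you are not sure, do not know, or need to confirm. After answering, guide the candidate to connect on WeChat...)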

6. `retrieval:0` (ragNode)

Purpose: Retrieve relevant information from a knowledge base to support job-related answers.
Parameters:
- similarity_threshold: Minimum similarity for retrieval.
- keywords_similarity_weight: Weight for keyword matching.
- top_n: Number of top results to consider.
- top_k: Upper limit of candidates to scan.
- rerank_id: Model for reranking results.
- kb_ids: List of knowledge base IDs to query.
Usage: Supplies job/company info for generation nodes to answer queries accurately.
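
One plausible way three of these parameters interact is sketched below: blend a keyword score with a vector score, drop chunks under the threshold, and keep the best top_n. The linear blend, the Chunk type, the select_chunks helper, and the sample chunks are all assumptions made for illustration; the file itself only names the parameters. top_k and rerank_id are left out of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    vector_sim: float   # embedding similarity, assumed to lie in [0, 1]
    keyword_sim: float  # keyword-overlap similarity, assumed to lie in [0, 1]


def select_chunks(chunks, similarity_threshold, keywords_similarity_weight, top_n):
    """Blend both similarities, drop weak matches, keep the best top_n.

    The weighted-sum formula is an assumption about how the weight is used.
    """
    w = keywords_similarity_weight
    scored = [(w * c.keyword_sim + (1 - w) * c.vector_sim, c) for c in chunks]
    kept = [(score, c) for score, c in scored if score >= similarity_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [c.text for _, c in kept[:top_n]]


# Made-up chunks, not real knowledge base content.
chunks = [
    Chunk("salary range", vector_sim=0.9, keyword_sim=0.5),
    Chunk("office location", vector_sim=0.4, keyword_sim=0.2),
    Chunk("team size", vector_sim=0.1, keyword_sim=0.0),
]
result = select_chunks(
    chunks, similarity_threshold=0.2, keywords_similarity_weight=0.3, top_n=2
)
print(result)  # ['salary range', 'office location']
```

With these toy numbers the blended scores are 0.78, 0.34, and 0.07, so the third chunk falls below the threshold before top_n is applied.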

Edges and Conversation Flow Logic

Edges connect nodes and define allowed transitions triggered by user input or internal logic:

Conversation starts at begin.
Moves to answer:0 for user response.
User input categorized at categorize:0.
Depending on category:
- interested → message:introduction → answer:1
- casual → generate:casual → answer:1
- answer (disinterest) → message:reject
- about_job → retrieval:0 → generate:aboutJob → answer:1
Subsequent user replies are categorized again at categorize:1, which uses a similar set of categories plus wechat and giveup.
wechat leads to generate:get_wechat.
giveup leads to generate:nowechat.
The flow allows looping between questioning and answering until the conversation ends or candidate refuses further engagement.
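
The transitions listed in this section can be written down as two lookup tables and walked with a small loop. This is a sketch of the routing idea only, not the engine's implementation. The walk helper is hypothetical, the about_job and casual routes for categorize:1 are assumed from "a similar set of categories", and the outgoing edges of the two WeChat nodes are omitted.

```python
# Unconditional transitions, transcribed from the list in this section.
NEXT = {
    "begin": "answer:0",
    "answer:0": "categorize:0",
    "message:introduction": "answer:1",
    "generate:casual": "answer:1",
    "retrieval:0": "generate:aboutJob",
    "generate:aboutJob": "answer:1",
    "answer:1": "categorize:1",
}

# Label-dependent transitions out of the two categorize nodes.
ROUTES = {
    "categorize:0": {
        "interested": "message:introduction",
        "casual": "generate:casual",
        "answer": "message:reject",
        "about_job": "retrieval:0",
    },
    "categorize:1": {
        "about_job": "retrieval:0",   # assumed, see lead-in
        "casual": "generate:casual",  # assumed, see lead-in
        "wechat": "generate:get_wechat",
        "giveup": "generate:nowechat",
    },
}


def walk(labels):
    """Follow the graph from begin, consuming one label per categorize node."""
    labels = iter(labels)
    node, path = "begin", ["begin"]
    while True:
        if node in ROUTES:
            label = next(labels, None)
            if label is None:
                break  # no more user input to classify
            node = ROUTES[node][label]
        elif node in NEXT:
            node = NEXT[node]
        else:
            break  # node has no outgoing edge in these tables
        path.append(node)
    return path


print(walk(["answer"]))
# ['begin', 'answer:0', 'categorize:0', 'message:reject']
```

Calling walk(["interested", "about_job", "wechat"]) passes through retrieval:0 and generate:aboutJob and stops at generate:get_wechat, because that node's outgoing edge is not modeled here.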

Important Implementation Details and Algorithms

LLM-powered Categorization: The system uses a language model labeled deepseek-chat to classify user inputs into categories to decide the next conversational step.
Retrieval-Augmented Generation (RAG): Combines retrieval from a knowledge base (kb_ids) with LLM generation to respond to job-related questions accurately.
Dynamic Prompts: The generate nodes use carefully crafted prompts to guide LLM responses to maintain a polite, professional, and engaging conversational style.
Multi-turn Context Handling: message_history_window_size controls how much prior dialogue context is fed into the LLM for more coherent and contextually aware replies.
User Intent Routing: Categorization nodes serve as decision points to route the conversation based on user intent (interest, casual chat, rejection, contact sharing willingness).
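
The effect of message_history_window_size can be illustrated with a simple slice over the dialogue history. The window helper is hypothetical, and the sketch assumes the window counts individual messages rather than question/answer pairs, which the file does not specify.

```python
def window(history, message_history_window_size):
    """Return the most recent messages that would be passed to the model.

    Assumes the size counts single messages (an assumption, see lead-in).
    """
    if message_history_window_size <= 0:
        return []  # guard: history[-0:] would return the whole list
    return history[-message_history_window_size:]


history = ["m1", "m2", "m3", "m4", "m5"]
print(window(history, 3))  # ['m3', 'm4', 'm5']
```

A smaller window keeps prompts short at the cost of forgetting earlier turns, such as a WeChat ID the candidate already shared.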

Interaction with Other System Components

RAGFlow Engine: This JSON is an input configuration for RAGFlow, which interprets nodes and edges for dialogue management.
Knowledge Base: The retrieval node queries a knowledge base (kb_ids) that contains job and company info.
LLM Service: The deepseek-chat LLM is invoked for categorization and generation tasks.
Messaging Interface: The chatbot frontend or messaging platform uses this flow to send messages and process user responses accordingly.

Usage Example Scenario

Chatbot starts with the begin node greeting and job pitch.
User responds; input is passed to answer:0.
categorize:0 classifies response as "interested".
Bot sends message:introduction with job details.
User asks about job specifics, routed through answer:1 and categorize:1 as "about_job".
retrieval:0 fetches relevant info; generate:aboutJob composes answer.
User expresses willingness to share WeChat, routed to generate:get_wechat.
Conversation continues or ends politely based on user inputs.

Visual Diagram

flowchart TD
    begin["Begin\n(Introduction Message)"]
    answer0["Answer:0\n(Candidate Response)"]
    categorize0["Categorize:0\n(Classify Candidate Input)"]
    message_intro["Message:Introduction\n(Explain RAGFlow)"]
    generate_casual["Generate:Casual\n(Handle Small Talk)"]
    message_reject["Message:Reject\n(Polite Decline)"]
    retrieval0["Retrieval:0\n(Query Knowledge Base)"]
    generate_aboutJob["Generate:AboutJob\n(Answer Job Questions)"]
    answer1["Answer:1\n(Candidate Response)"]
    categorize1["Categorize:1\n(Reclassify Response)"]
    generate_getWechat["Generate:Get_Wechat\n(Request or Confirm WeChat)"]
    generate_noWechat["Generate:NoWechat\n(Handle WeChat Refusal)"]

    begin --> answer0
    answer0 --> categorize0

    categorize0 -- interested --> message_intro
    categorize0 -- casual --> generate_casual
    categorize0 -- answer --> message_reject
    categorize0 -- about_job --> retrieval0

    message_intro --> answer1
    generate_casual --> answer1
    retrieval0 --> generate_aboutJob
    generate_aboutJob --> answer1

    answer1 --> categorize1

    categorize1 -- about_job --> retrieval0
    categorize1 -- casual --> generate_casual
    categorize1 -- wechat --> generate_getWechat
    categorize1 -- giveup --> generate_noWechat

    generate_getWechat --> answer1
    generate_noWechat --> answer1

Summary

This file configures a dialogue system for a Chinese-language headhunter chatbot that recruits for AGI roles, built on RAGFlow. It orchestrates a multi-turn conversation covering candidate interest, job inquiries, casual talk, and contact sharing, powered by LLM-based categorization, retrieval from a knowledge base, and dynamic response generation. The graph structure enables flexible routing and context-aware interaction to keep candidates engaged.