mind_map_extractor.py


Overview

mind_map_extractor.py defines a specialized extractor that converts text into structured mind maps. It sends text sections to a large language model (LLM) asynchronously, parses the LLM's Markdown-formatted output into hierarchical JSON structures, and merges the resulting partial mind maps into a single unipartite mind graph.

This module is part of the InfiniFlow system, likely within a retrieval-augmented generation (RAG) pipeline, where it leverages LLMs to extract structured knowledge (mind maps) from unstructured or semi-structured text inputs.
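To make the "Markdown to hierarchical JSON" step concrete, here is a minimal sketch of turning a heading-and-bullet outline into a nested dictionary. It is an illustration only: `outline_to_tree` is a hypothetical helper, not the parser the module uses, and it assumes `#` headings plus `-` or `*` bullets indented two spaces per level.

```python
import re


def outline_to_tree(markdown: str) -> dict:
    """Parse '#' headings and '-'/'*' bullets into a nested dict (illustrative only)."""
    root: dict = {}
    # Stack of (depth, node) pairs; the root sits at depth 0 and is never popped.
    stack: list = [(0, root)]
    for line in markdown.splitlines():
        heading = re.match(r"^(#+)\s+(.*\S)\s*$", line)
        bullet = re.match(r"^(\s*)[-*]\s+(.*\S)\s*$", line)
        if heading:
            depth, title = len(heading.group(1)), heading.group(2)
        elif bullet:
            # Assumption: bullets nest below any heading level, two spaces per step.
            depth, title = 100 + len(bullet.group(1)) // 2, bullet.group(2)
        else:
            continue  # skip blank lines and plain prose
        # Pop back to the nearest ancestor that is shallower than this line.
        while stack[-1][0] >= depth:
            stack.pop()
        node: dict = {}
        stack[-1][1][title] = node
        stack.append((depth, node))
    return root


sample = "# Neural Networks\n## Layers\n- Convolutional\n- Pooling\n## Training\n"
tree = outline_to_tree(sample)
print(tree)
```

Each heading or bullet becomes a key whose value holds its sub-items, which is the kind of hierarchy the later merge and conversion steps operate on.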


Classes and Functions

MindMapResult

from dataclasses import dataclass

@dataclass
class MindMapResult:
    output: dict

# Usage: wrap a mind map dictionary in the result container
result = MindMapResult(output={"id": "root", "children": []})
print(result.output)

MindMapExtractor

class MindMapExtractor(Extractor):

| Attribute | Type | Description |
|---|---|---|
| _llm | CompletionLLM | The language model invoker for extraction. |
| _input_text_key | str | Key name under which input text is passed into the prompt. Defaults to "input_text". |
| _mind_map_prompt | str | Prompt template for mind map extraction. Defaults to MIND_MAP_EXTRACTION_PROMPT. |
| _on_error | ErrorHandlerFn | Callback for error handling during processing. |


__init__

def __init__(
        self,
        llm_invoker: CompletionLLM,
        prompt: str | None = None,
        input_text_key: str | None = None,
        on_error: ErrorHandlerFn | None = None,
):

# Usage: my_llm and my_prompt are placeholders for a configured CompletionLLM and a custom prompt string
extractor = MindMapExtractor(llm_invoker=my_llm, prompt=my_prompt)

_key

def _key(self, k: str) -> str:

_be_children

def _be_children(self, obj: dict | list | str, keyset: set) -> list:
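The signature and the `{"id": ..., "children": [...]}` shape of `MindMapResult.output` suggest that this helper turns nested data into a list of child nodes, with `keyset` tracking ids that were already emitted. The following is a minimal sketch under that assumption; `to_children` is a hypothetical stand-in, and the real method may normalize keys or handle edge cases differently.

```python
from __future__ import annotations


def to_children(obj: dict | list | str, seen: set) -> list:
    """Turn a nested dict/list/str into a list of {"id", "children"} nodes."""
    if isinstance(obj, str):
        obj = [obj]
    if isinstance(obj, list):
        # Leaves: every non-empty string becomes a node without children.
        items = [item for item in obj if isinstance(item, str) and item]
        seen.update(items)
        return [{"id": item, "children": []} for item in items]
    nodes = []
    for key, value in obj.items():
        # Skip empty keys and ids that already appear elsewhere in the tree.
        if key and key not in seen:
            seen.add(key)
            nodes.append({"id": key, "children": to_children(value, seen)})
    return nodes


nested = {"AI": {"ML": ["Supervised", "Unsupervised"], "": "ignored"}, "ML": "duplicate"}
children = to_children(nested, set())
print(children)
```

Sharing one `seen` set across the recursion is what keeps a topic from showing up twice when the LLM repeats it under different parents.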

__call__

async def __call__(
        self, sections: list[str], prompt_variables: dict[str, Any] | None = None
) -> MindMapResult:

# Usage: call the extractor instance from inside a coroutine
result = await mind_map_extractor(sections=["Text part 1", "Text part 2"])
print(result.output)
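Because `__call__` is a coroutine that receives several sections, the sections can be processed concurrently. The sketch below shows that pattern with `asyncio.gather`. It is an assumption-laden illustration: `fake_llm` and `extract_all` are hypothetical names, and `asyncio` is a stand-in for whatever async runtime the module actually uses.

```python
import asyncio


async def fake_llm(text: str) -> str:
    """Hypothetical stand-in for an LLM call; returns a one-line Markdown outline."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"# {text.split()[0]}"


async def extract_all(sections: list) -> list:
    # One coroutine per section; gather returns results in input order.
    return await asyncio.gather(*(fake_llm(s) for s in sections))


outlines = asyncio.run(extract_all(["Neural networks intro", "Convolution layers"]))
print(outlines)
```

Keeping results in input order makes the later merge step deterministic even when the individual calls finish at different times.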

_merge

def _merge(self, d1: dict, d2: dict) -> dict:
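Merging two partial mind maps is naturally recursive: shared branches are combined, and branches that exist on only one side are copied over. The following sketch shows one possible policy. `merge_maps` is a hypothetical helper, and the choices it makes (merge into `d2` in place, de-duplicate list items, let `d1` win on type conflicts) are assumptions rather than documented behavior of `_merge`.

```python
def merge_maps(d1: dict, d2: dict) -> dict:
    """Merge d1 into d2 in place and return d2 (illustrative policy)."""
    for key, value in d1.items():
        if key not in d2:
            d2[key] = value
        elif isinstance(value, dict) and isinstance(d2[key], dict):
            merge_maps(value, d2[key])  # shared branch: recurse
        elif isinstance(value, list) and isinstance(d2[key], list):
            for item in value:
                if item not in d2[key]:
                    d2[key].append(item)  # append only items that are new
        else:
            d2[key] = value  # conflicting types: the value from d1 wins
    return d2


part_a = {"AI": {"ML": ["SVM"], "NLP": {}}, "Math": "x"}
part_b = {"AI": {"ML": ["SVM", "Trees"], "Vision": {}}, "Math": ["y"]}
merged = merge_maps(part_a, part_b)
print(merged)
```

Folding every partial result into one accumulator this way yields a single tree regardless of how many sections were processed.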

_list_to_kv

def _list_to_kv(self, data: dict) -> dict:

_todict

def _todict(self, layer: collections.OrderedDict) -> dict:
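Judging by its signature, this helper converts a `collections.OrderedDict` (a common output type for Markdown-to-JSON parsers) into a plain `dict`. A minimal recursive sketch follows; `to_plain_dict` is a hypothetical name, and the real method may apply further post-processing that is not shown here.

```python
from collections import OrderedDict


def to_plain_dict(layer):
    """Recursively replace OrderedDict instances with plain dicts."""
    if isinstance(layer, dict):  # OrderedDict is a dict subclass
        return {key: to_plain_dict(value) for key, value in layer.items()}
    if isinstance(layer, list):
        return [to_plain_dict(item) for item in layer]
    return layer  # strings, numbers, and other leaves pass through unchanged


parsed = OrderedDict([("a", OrderedDict([("b", ["x", OrderedDict(c=1)])]))])
plain = to_plain_dict(parsed)
print(plain)
```

The conversion builds new containers instead of mutating the input, so the parser's original output stays untouched.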

_process_document

async def _process_document(
        self, text: str, prompt_variables: dict[str, str], out_res: list
) -> str:

Important Implementation Details


Interaction with Other System Components


Example Usage

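from rag.llm.chat_model import Base as CompletionLLM
# Assumption: adjust this import to wherever mind_map_extractor.py lives in your checkout
from mind_map_extractor import MindMapExtractor


async def main(llm: CompletionLLM) -> None:
    # llm is a CompletionLLM instance that is already configured
    mind_map_extractor = MindMapExtractor(llm_invoker=llm)

    sections = [
        "Introduction to neural networks...",
        "Details about convolutional layers...",
        "Summary and future directions.",
    ]

    # __call__ is a coroutine, so it must be awaited inside an async function
    result = await mind_map_extractor(sections)
    print(result.output)


# Run main(llm) with the async event loop your application uses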

Visual Diagram: Class Diagram of MindMapExtractor

classDiagram
    class MindMapResult {
        +output: dict
    }

    class MindMapExtractor {
        - _llm: CompletionLLM
        - _input_text_key: str
        - _mind_map_prompt: str
        - _on_error: ErrorHandlerFn

        + __init__(llm_invoker, prompt=None, input_text_key=None, on_error=None)
        - _key(k: str) str
        - _be_children(obj: dict|list|str, keyset: set) list
        + __call__(sections: list[str], prompt_variables: dict[str, Any]|None) async MindMapResult
        - _merge(d1: dict, d2: dict) dict
        - _list_to_kv(data: dict) dict
        - _todict(layer: OrderedDict) dict
        - _process_document(text: str, prompt_variables: dict[str, str], out_res: list) async str
    }

    MindMapExtractor --> MindMapResult : returns
    MindMapExtractor ..> CompletionLLM : uses
    MindMapExtractor ..> ErrorHandlerFn : uses
    MindMapExtractor ..> MIND_MAP_EXTRACTION_PROMPT : uses
    MindMapExtractor ..> markdown_to_json : uses

Summary

mind_map_extractor.py encapsulates an asynchronous, LLM-backed extraction mechanism that converts input text into structured mind maps. It manages prompt customization, token budgeting, concurrent processing, response parsing, and merging to provide a coherent mind map representation suitable for downstream graph-based reasoning or visualization.

This module integrates tightly with the larger InfiniFlow framework, particularly the general extractor interface, prompt templates, and the RAG LLM client, ensuring modularity and extensibility within the system's knowledge extraction pipeline.
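The token budgeting mentioned in the summary can be pictured as grouping consecutive sections into batches that fit a per-request limit. The sketch below is only an illustration: `batch_by_budget` is a hypothetical helper, and counting whitespace-separated words is a crude stand-in for a real tokenizer.

```python
def batch_by_budget(sections: list, budget: int) -> list:
    """Group consecutive sections so each batch fits within `budget` tokens.

    A single section larger than the budget still gets its own batch.
    """
    batches = []
    current = []
    used = 0
    for section in sections:
        cost = len(section.split())  # crude whitespace token count (assumption)
        if current and used + cost > budget:
            batches.append(current)  # close the batch before it overflows
            current, used = [], 0
        current.append(section)
        used += cost
    if current:
        batches.append(current)
    return batches


batches = batch_by_budget(["a b c", "d e", "f g h i", "j"], budget=5)
print(batches)
```

Each batch could then be sent to the LLM as one request, which trades a little granularity for fewer calls.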