smoke.py


Overview

smoke.py is a command-line script that processes a single document from a tenant's knowledge base in the InfiniFlow system. It extracts a graph from the document's chunks, updates the graph, and then runs resolution and community detection using large language model (LLM) services.

The script runs asynchronously (via trio), so I/O-bound work such as fetching data and calling LLM APIs does not block the rest of the program. It prints the JSON-serialized graph and the community detection results to the console, mainly for inspection and debugging.
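As a rough illustration of why asynchronous execution helps with I/O-bound work, the sketch below waits on several fetches at the same time. It uses the standard library's asyncio as a stand-in for trio so that it runs without extra dependencies. `fetch_chunk`, `fetch_all`, and the chunk IDs are hypothetical and do not come from smoke.py.

```python
import asyncio
import time
from typing import Dict, List, Tuple


async def fetch_chunk(chunk_id: str, delay: float = 0.1) -> Dict[str, str]:
    """Hypothetical stand-in for an I/O-bound call (storage read, LLM request)."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return {"id": chunk_id, "text": f"content of {chunk_id}"}


async def fetch_all(chunk_ids: List[str]) -> List[Dict[str, str]]:
    # All fetches wait concurrently, so the total time is close to one
    # delay rather than the sum of all delays. Results keep input order.
    return await asyncio.gather(*(fetch_chunk(cid) for cid in chunk_ids))


def run_demo() -> Tuple[List[Dict[str, str]], float]:
    """Run the concurrent fetch and report how long it took."""
    start = time.perf_counter()
    chunks = asyncio.run(fetch_all(["c1", "c2", "c3"]))
    return chunks, time.perf_counter() - start
```

With trio, the equivalent structure would open a nursery with `trio.open_nursery()`, start each fetch with `nursery.start_soon(...)`, and use `trio.run(...)` as the entry point.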


Detailed Description

Key Functionalities


Classes and Functions

callback(prog=None, msg="Processing...")
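The signature suggests a progress callback that long-running steps can invoke with an optional progress value and a status message. A minimal sketch, assuming the callback only logs what it receives (the actual body is not shown in this document, and the interpretation of `prog` as a fraction is an assumption):

```python
import logging

logger = logging.getLogger("smoke")


def callback(prog=None, msg="Processing..."):
    """Report progress; `prog` is assumed to be a fraction in [0, 1] or None."""
    if prog is None:
        text = msg
    else:
        text = f"[{prog:.0%}] {msg}"
    logger.info(text)
    return text  # returned only to make the sketch easy to check
```

Keeping the callback free of side effects other than logging lets the same function be passed to every processing stage.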


async def main()
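Because the script is run from the command line and works on one document of one tenant, `main()` presumably starts by reading identifiers from its arguments. The sketch below shows one way to do that with argparse. The flag names `--tenant_id` and `--doc_id`, and the helpers `build_parser` and `parse_args`, are assumptions that this document does not confirm.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="smoke.py",
        description="Build and analyze the graph of one document.",
    )
    # Flag names are hypothetical; adjust them to match the real script.
    parser.add_argument("-t", "--tenant_id", required=True, help="Tenant ID")
    parser.add_argument("-d", "--doc_id", required=True, help="Document ID")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    # Passing argv explicitly makes the parser testable without touching sys.argv.
    return build_parser().parse_args(argv)
```

Under these assumptions, an invocation would look like `python smoke.py -t <tenant_id> -d <doc_id>`.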


Important Implementation Details and Algorithms


Interaction with Other System Components


Visual Diagram

classDiagram
    class smoke_py {
        +callback(prog=None, msg="Processing...")
        +main()
    }

    class DocumentService {
        +get_by_id(doc_id)
    }

    class TenantService {
        +get_by_id(tenant_id)
    }

    class KnowledgebaseService {
        +get_by_id(kb_id)
    }

    class LLMBundle {
        +__init__(tenant_id, llm_type, llm_id)
    }

    class GraphExtractor

    class update_graph {
        +__call__(...)
    }

    class with_resolution {
        +__call__(...)
    }

    class with_community {
        +__call__(...)
    }

    smoke_py --> DocumentService : calls get_by_id(doc_id)
    smoke_py --> TenantService : calls get_by_id(tenant_id)
    smoke_py --> KnowledgebaseService : calls get_by_id(kb_id)
    smoke_py --> LLMBundle : initializes chat and embedding bundles
    smoke_py --> GraphExtractor : passed as parameter to update_graph
    smoke_py --> update_graph : updates graph with document data
    smoke_py --> with_resolution : performs resolution on graph
    smoke_py --> with_community : performs community detection
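The call order implied by the diagram can be sketched with plain stand-in functions. Everything below is hypothetical: the stubs only mimic the roles of update_graph, with_resolution, and with_community, whose real signatures are not given in this document. The real steps rely on LLM calls, which are replaced here by toy rules (case-insensitive merging and connected components), and the graph is a simple dict rather than whatever graph type the system uses.

```python
import json


def update_graph_stub(relations):
    """Stand-in for graph extraction: build nodes and edges from entity pairs."""
    names = dict.fromkeys(name for pair in relations for name in pair)
    nodes = [{"id": name} for name in names]
    edges = [{"source": a, "target": b} for a, b in relations]
    return {"nodes": nodes, "edges": edges}


def with_resolution_stub(graph):
    """Stand-in for resolution: merge nodes whose IDs differ only by case."""
    canonical = {n["id"]: n["id"].lower() for n in graph["nodes"]}
    nodes = [{"id": i} for i in dict.fromkeys(canonical.values())]
    edges = []
    for e in graph["edges"]:
        edge = {"source": canonical[e["source"]], "target": canonical[e["target"]]}
        # Drop self-loops and duplicates created by the merge.
        if edge["source"] != edge["target"] and edge not in edges:
            edges.append(edge)
    return {"nodes": nodes, "edges": edges}


def with_community_stub(graph):
    """Stand-in for community detection: connected components."""
    neighbors = {n["id"]: set() for n in graph["nodes"]}
    for e in graph["edges"]:
        neighbors[e["source"]].add(e["target"])
        neighbors[e["target"]].add(e["source"])
    seen, communities = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, members = [start], []
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            members.append(node)
            stack.extend(neighbors[node] - seen)
        communities.append(sorted(members))
    return communities


def run_pipeline(relations):
    """Run the three stages in diagram order and serialize the result."""
    graph = update_graph_stub(relations)
    graph = with_resolution_stub(graph)
    communities = with_community_stub(graph)
    return json.dumps({"graph": graph, "communities": communities}, ensure_ascii=False)
```

Calling `run_pipeline([("Alice", "Bob"), ("alice", "Carol"), ("Dave", "Eve")])` merges the two spellings of "alice" and yields two communities, which mirrors the extract, resolve, and detect sequence at a toy scale.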

Summary

smoke.py is a diagnostic script that ties several system components together: it builds a graph from the chunks of one document in a tenant's knowledge base and then analyzes that graph. It relies on LLM chat and embedding services to enrich the graph and detect community structure, and it prints the results as JSON for inspection or debugging.


Notes

