smoke.py


Overview

smoke.py is a utility script that processes a single document within a tenant's knowledge base on the InfiniFlow platform. It extracts and updates a knowledge graph representation of the document’s content using language models and graph extraction utilities. The script is intended to be run as a standalone asynchronous application that takes a tenant ID and a document ID on the command line, updates the document's knowledge graph, and prints the resulting graph as JSON to standard output.

The script is primarily useful for smoke testing the graph extraction and update pipeline against one document in a tenant-specific knowledge base.


Detailed Explanation

Imports and Initialization


Functions and Methods

callback(prog=None, msg="Processing...")
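
Only the signature of callback is documented here; its body is not described. As a minimal sketch, assuming it is a progress hook that logs a message, it might look like the following. The logger name, the interpretation of prog as a fraction between 0 and 1, and the returned string are all assumptions made for illustration.

```python
import logging

# Hypothetical logger name; the real script may configure logging differently.
logger = logging.getLogger("smoke")


def callback(prog=None, msg="Processing..."):
    """Report progress. `prog` is assumed to be a fraction in [0, 1] or None."""
    if prog is None:
        line = msg
    else:
        # Format the fraction as a whole-number percentage.
        line = f"[{prog:.0%}] {msg}"
    logger.info(line)
    # Returning the line is an addition for easy inspection, not documented behavior.
    return line
```

A long-running step could call callback(0.5, "Extracting entities") to report that it is halfway through.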


async def main()


Important Implementation Details


Interaction With Other Components


Usage Example

python smoke.py --tenant_id 123 --doc_id 456

This command processes the document with ID 456 belonging to tenant 123, updates its knowledge graph representation, and prints the resulting graph JSON to standard output.
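
The command line above could be parsed roughly as sketched below. This is an illustration under stated assumptions, not the script's actual code: the use of argparse and asyncio, the help strings, and the placeholder body of main are assumptions, and the real script may use a different async runtime. Only the two flag names come from the usage example.

```python
import argparse
import asyncio


def parse_args(argv=None):
    """Parse the two flags shown in the usage example."""
    parser = argparse.ArgumentParser(prog="smoke.py")
    parser.add_argument("--tenant_id", required=True,
                        help="ID of the tenant that owns the document")
    parser.add_argument("--doc_id", required=True,
                        help="ID of the document to process")
    return parser.parse_args(argv)


async def main(argv=None):
    args = parse_args(argv)
    # Placeholder: the real script would load the document and update its graph here.
    return {"tenant_id": args.tenant_id, "doc_id": args.doc_id}


# Equivalent to: python smoke.py --tenant_id 123 --doc_id 456
result = asyncio.run(main(["--tenant_id", "123", "--doc_id", "456"]))
print(result)
```

Because no type is given to add_argument, both IDs arrive as strings; whether the real script treats them as strings or integers is not stated in this document.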


Mermaid Class Diagram

classDiagram
    class smoke {
        +callback(prog=None, msg="Processing...")
        +async main()
    }

    class DocumentService {
        +get_by_id(doc_id)
    }
    class TenantService {
        +get_by_id(tenant_id)
    }
    class KnowledgebaseService {
        +get_by_id(kb_id)
    }
    class LLMBundle {
        +__init__(tenant_id, llm_type, llm_id)
    }
    class GraphExtractor
    class update_graph {
        +__call__(...)
    }

    smoke ..> DocumentService : uses
    smoke ..> TenantService : uses
    smoke ..> KnowledgebaseService : uses
    smoke ..> LLMBundle : uses
    smoke ..> GraphExtractor : uses
    smoke ..> update_graph : calls

This diagram shows the main functions in smoke.py and the service classes and functions they depend on.
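
The flow implied by the diagram (look up the document, look up its knowledge base, update the graph, print JSON) can be sketched with in-memory stand-ins. Everything below is hypothetical: the classes only borrow names from the diagram, their return types are invented, the tenant ownership check is an assumption, and the toy update_graph replaces the real LLM-backed extraction. TenantService, LLMBundle, and GraphExtractor are omitted for brevity.

```python
import asyncio
import json
from dataclasses import dataclass


# Hypothetical stand-ins; the real services and their return types may differ.
@dataclass
class Document:
    id: str
    kb_id: str
    text: str


@dataclass
class Knowledgebase:
    id: str
    tenant_id: str


DOCS = {"456": Document(id="456", kb_id="kb1", text="Alice knows Bob")}
KBS = {"kb1": Knowledgebase(id="kb1", tenant_id="123")}


class DocumentService:
    @staticmethod
    def get_by_id(doc_id):
        return DOCS[doc_id]


class KnowledgebaseService:
    @staticmethod
    def get_by_id(kb_id):
        return KBS[kb_id]


async def update_graph(doc):
    """Toy extractor: treat capitalized words as nodes and link neighbors."""
    names = [w for w in doc.text.split() if w[0].isupper()]
    edges = [{"source": a, "target": b} for a, b in zip(names, names[1:])]
    return {"nodes": [{"id": n} for n in names], "edges": edges}


async def main(tenant_id, doc_id):
    doc = DocumentService.get_by_id(doc_id)
    kb = KnowledgebaseService.get_by_id(doc.kb_id)
    # Assumed sanity check: the document must belong to the given tenant.
    if kb.tenant_id != tenant_id:
        raise ValueError("document does not belong to this tenant")
    graph = await update_graph(doc)
    return json.dumps(graph, ensure_ascii=False, indent=2)


print(asyncio.run(main("123", "456")))
```

The sketch keeps each lookup behind a get_by_id call so that the dependency arrows in the diagram map one-to-one onto lines in main.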


Summary

smoke.py is a focused, asynchronous utility script designed to test and update the knowledge graph of a single document within a tenant's knowledge base using InfiniFlow's LLM and graph extraction infrastructure. It combines data retrieval, natural language processing, and graph update workflows, and outputs the final graph structure as JSON for further analysis or visualization.