knowledgebase_service.py

Overview

knowledgebase_service.py defines the KnowledgebaseService class, a specialized service layer for managing knowledge base entities within the InfiniFlow platform. Extending a generic CommonService, this class encapsulates business logic related to knowledge bases, including access control, document parsing status, tenant-based organization, parser configuration management, and CRUD operations.

This service interacts heavily with the database models (Knowledgebase, Document, Tenant, User, UserTenant) via the Peewee ORM and supports multi-tenant environments with permission enforcement.

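Key functionality, as covered by the methods documented below, includes access checks (accessible, accessible4deletion), parsing status checks (is_parsed_done), tenant-scoped and paginated queries (get_by_tenant_ids, get_list, get_kb_ids), parser configuration and field map management (update_parser_config, delete_field_map, get_field_map), lookups by ID or name (get_detail, get_by_name, get_kb_by_id, get_kb_by_name), and document count maintenance (atomic_increase_doc_num_by_id, update_document_number_in_init).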


Classes and Methods

Class: KnowledgebaseService

Extends CommonService. The primary service class to manage knowledge base operations.

Attributes:

model: Knowledgebase. The database model this service operates on.


Methods:

accessible4deletion(kb_id: str, user_id: str) -> bool

Checks if the specified user has permission to delete a knowledge base. Only the creator of the knowledge base has deletion rights.

if KnowledgebaseService.accessible4deletion("kb123", "user456"):
    pass  # proceed with deletion

is_parsed_done(kb_id: str) -> (bool, str|None)

Determines whether all documents in the knowledge base have finished parsing successfully.

done, msg = KnowledgebaseService.is_parsed_done("kb123")
if not done:
    print(msg)
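
The check described above can be sketched over plain dictionaries. This is a minimal illustration, assuming each document record carries a name and a status; the status labels (RUNNING, FAIL, DONE), the parsed_done helper, and the message wording are hypothetical and not taken from the service itself.

```python
from typing import List, Optional, Tuple

# Hypothetical status labels, for illustration only.
RUNNING, FAIL, DONE = "RUNNING", "FAIL", "DONE"


def parsed_done(docs: List[dict]) -> Tuple[bool, Optional[str]]:
    """Return (True, None) if every document is done, else (False, reason)."""
    for doc in docs:
        if doc["status"] == RUNNING:
            return False, f"Document '{doc['name']}' is still being parsed."
        if doc["status"] == FAIL:
            return False, f"Document '{doc['name']}' failed to parse."
    return True, None


docs = [
    {"name": "a.pdf", "status": DONE},
    {"name": "b.pdf", "status": RUNNING},
]
done, msg = parsed_done(docs)
```

Returning the reason alongside the flag lets the caller surface a specific message without a second query.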

list_documents_by_ids(kb_ids: List[str]) -> List[str]

Fetches document IDs associated with the specified knowledge base IDs.


get_by_tenant_ids(joined_tenant_ids: List[str], user_id: str, page_number: int, items_per_page: int, orderby: str, desc: bool, keywords: str, parser_id: Optional[str] = None) -> (List[dict], int)

Retrieves paginated knowledge bases owned by or shared with the user via tenants.
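
To make the (List[dict], int) return shape concrete, here is a rough in-memory sketch of filtering, sorting, and paging. The paginate_kbs helper, the record fields (tenant_id, permission, parser_id), the "team" permission value, and the 1-based page numbering are all assumptions for illustration; the real method builds a database query instead.

```python
from typing import List, Optional, Tuple


def paginate_kbs(
    kbs: List[dict],
    joined_tenant_ids: List[str],
    user_id: str,
    page_number: int,
    items_per_page: int,
    orderby: str,
    desc: bool,
    keywords: str = "",
    parser_id: Optional[str] = None,
) -> Tuple[List[dict], int]:
    """Filter, sort, and page in-memory records; return (page, total_count)."""
    visible = [
        kb for kb in kbs
        # Assumed rule: shared through a joined tenant, or owned by the user.
        if (kb["tenant_id"] in joined_tenant_ids and kb["permission"] == "team")
        or kb["tenant_id"] == user_id
    ]
    if keywords:
        visible = [kb for kb in visible if keywords.lower() in kb["name"].lower()]
    if parser_id:
        visible = [kb for kb in visible if kb["parser_id"] == parser_id]
    visible.sort(key=lambda kb: kb[orderby], reverse=desc)
    total = len(visible)  # count before slicing, so callers can render page controls
    start = (page_number - 1) * items_per_page  # pages are assumed to be 1-based
    return visible[start:start + items_per_page], total


kbs = [
    {"id": "kb1", "name": "Contracts", "tenant_id": "u1", "permission": "me", "parser_id": "naive"},
    {"id": "kb2", "name": "Manuals", "tenant_id": "t2", "permission": "team", "parser_id": "naive"},
    {"id": "kb3", "name": "Private", "tenant_id": "t2", "permission": "me", "parser_id": "naive"},
]
page, total = paginate_kbs(kbs, ["t2"], "u1", 1, 1, "name", False)
```

The total is computed before slicing, which is why the return value pairs the current page with the overall match count.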


get_kb_ids(tenant_id: str) -> List[str]

Returns all knowledge base IDs belonging to a given tenant.


get_detail(kb_id: str) -> Optional[dict]

Fetches detailed information about a knowledge base, including metadata and configuration.


update_parser_config(id: str, config: dict) -> None

Merges a new parser configuration into the existing one for a knowledge base.
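
One plausible reading of "merges" is a recursive merge, where nested dictionaries are combined key by key rather than replaced wholesale. The sketch below assumes that behavior; the merge_config helper and the configuration keys are illustrative, and the description above does not say whether the actual merge is deep or shallow.

```python
def merge_config(old: dict, new: dict) -> dict:
    """Recursively merge `new` into `old` in place and return `old`."""
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(old.get(key), dict):
            # Both sides hold a dict: descend instead of overwriting.
            merge_config(old[key], value)
        else:
            old[key] = value
    return old


# Illustrative keys only.
existing = {"chunk_token_num": 128, "raptor": {"use_raptor": False, "max_token": 256}}
merged = merge_config(existing, {"raptor": {"use_raptor": True}, "layout": "auto"})
```

With a deep merge, a partial update such as {"raptor": {"use_raptor": True}} leaves sibling keys like max_token untouched.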


delete_field_map(id: str) -> None

Removes the "field_map" key from the knowledge base's parser configuration.
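
On a plain dictionary, the removal amounts to dropping one key. A small sketch, assuming the parser configuration is held as a dict; drop_field_map and the sample keys are hypothetical names used only for illustration.

```python
def drop_field_map(parser_config: dict) -> dict:
    """Return a copy of the configuration without its "field_map" entry."""
    cleaned = dict(parser_config)
    cleaned.pop("field_map", None)  # no error if the key is already absent
    return cleaned


config = {"chunk_token_num": 128, "field_map": {"price_flt": "Price"}}
cleaned = drop_field_map(config)
```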


get_field_map(ids: List[str]) -> dict

Aggregates and returns field mappings across multiple knowledge bases.
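
The aggregation can be pictured as a union of the "field_map" entries from each knowledge base's parser configuration. The sketch below assumes that later configurations win when two maps share a key; that conflict rule, the collect_field_maps helper, and the sample field names are assumptions rather than documented behavior.

```python
from typing import Dict, List


def collect_field_maps(parser_configs: List[dict]) -> Dict[str, str]:
    """Union the "field_map" entries of several parser configurations."""
    merged: Dict[str, str] = {}
    for config in parser_configs:
        # Configurations without a field map contribute nothing.
        merged.update(config.get("field_map", {}))
    return merged


configs = [
    {"field_map": {"price_flt": "Price"}},
    {"chunk_token_num": 128},
    {"field_map": {"name_kwd": "Name", "price_flt": "Unit price"}},
]
fields = collect_field_maps(configs)
```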


get_by_name(kb_name: str, tenant_id: str) -> (bool, Optional[Knowledgebase])

Retrieves a knowledge base by name within a tenant's scope.
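
The (bool, Optional[...]) return shape lets callers branch on the flag before touching the record. A minimal sketch of that convention over in-memory records; find_by_name and the record fields are hypothetical stand-ins for the real model query.

```python
from typing import List, Optional, Tuple


def find_by_name(kbs: List[dict], kb_name: str, tenant_id: str) -> Tuple[bool, Optional[dict]]:
    """Return (True, record) for the first match in the tenant, else (False, None)."""
    for kb in kbs:
        if kb["name"] == kb_name and kb["tenant_id"] == tenant_id:
            return True, kb
    return False, None


kbs = [
    {"id": "kb1", "name": "Contracts", "tenant_id": "t1"},
    {"id": "kb2", "name": "Contracts", "tenant_id": "t2"},
]
found, kb = find_by_name(kbs, "Contracts", "t2")
missing, nothing = find_by_name(kbs, "Manuals", "t1")
```

Because the lookup is scoped to a tenant, two tenants can hold knowledge bases with the same name without colliding.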


get_all_ids() -> List[str]

Returns all knowledge base IDs in the system.


get_list(joined_tenant_ids: List[str], user_id: str, page_number: int, items_per_page: int, orderby: str, desc: bool, id: Optional[str], name: Optional[str]) -> List[dict]

Fetches knowledge bases filtered by multiple criteria with pagination.


accessible(kb_id: str, user_id: str) -> bool

Checks if a knowledge base is accessible by a user, based on tenant membership.
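
A membership-based check of this kind reduces to asking whether the user belongs to the tenant that owns the knowledge base. The sketch below models that with two lookup tables; is_accessible, kb_tenant, and memberships are illustrative names, and the real method resolves this with a joined database query.

```python
from typing import Dict, Set


def is_accessible(
    kb_tenant: Dict[str, str],
    memberships: Dict[str, Set[str]],
    kb_id: str,
    user_id: str,
) -> bool:
    """True if the user belongs to the tenant that owns the knowledge base."""
    tenant_id = kb_tenant.get(kb_id)
    if tenant_id is None:
        return False  # unknown knowledge base
    return tenant_id in memberships.get(user_id, set())


kb_tenant = {"kb123": "t1"}                                   # knowledge base -> owning tenant
memberships = {"user456": {"t1", "t2"}, "user789": {"t3"}}    # user -> joined tenants
allowed = is_accessible(kb_tenant, memberships, "kb123", "user456")
denied = is_accessible(kb_tenant, memberships, "kb123", "user789")
```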


get_kb_by_id(kb_id: str, user_id: str) -> List[dict]

Returns knowledge base info by ID if accessible by user.


get_kb_by_name(kb_name: str, user_id: str) -> List[dict]

Returns knowledge base info by name if accessible by user.


atomic_increase_doc_num_by_id(kb_id: str) -> int

Atomically increments the document count of a knowledge base by 1.
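
The point of an atomic increment is to let the database compute doc_num + 1 inside a single UPDATE, instead of reading the value into Python and writing it back, which can lose updates under concurrency. The sketch below shows the idea with the standard sqlite3 module rather than Peewee; the table layout and the increase_doc_num helper are assumptions, and returning the affected-row count is a guess at what the int return value represents.

```python
import sqlite3

# Hypothetical, minimal table standing in for the real model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE knowledgebase (id TEXT PRIMARY KEY, doc_num INTEGER NOT NULL)")
conn.execute("INSERT INTO knowledgebase VALUES ('kb123', 4)")


def increase_doc_num(conn: sqlite3.Connection, kb_id: str) -> int:
    """Increment doc_num in one UPDATE statement; return the number of rows changed."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE knowledgebase SET doc_num = doc_num + 1 WHERE id = ?",
            (kb_id,),
        )
    return cursor.rowcount


changed = increase_doc_num(conn, "kb123")
doc_num = conn.execute(
    "SELECT doc_num FROM knowledgebase WHERE id = ?", ("kb123",)
).fetchone()[0]
```

A row count of 0 signals that no knowledge base matched the given ID, which a caller can treat as a failed update.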


update_document_number_in_init(kb_id: str, doc_num: int) -> None

Sets the document number for a knowledge base during system initialization.


Implementation Details and Algorithms


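Interactions with Other System Components

As the class diagram below indicates, the service uses the Knowledgebase model, inherits from CommonService, calls DocumentService.get_by_kb_id, and joins the UserTenant, Tenant, and User models for permission checks, tenant validation, and user information in queries.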


Visual Diagram

classDiagram
    class KnowledgebaseService {
        +model: Knowledgebase
        +accessible4deletion(kb_id: str, user_id: str) bool
        +is_parsed_done(kb_id: str) (bool, str|None)
        +list_documents_by_ids(kb_ids: List[str]) List[str]
        +get_by_tenant_ids(joined_tenant_ids: List[str], user_id: str, page_number: int, items_per_page: int, orderby: str, desc: bool, keywords: str, parser_id: Optional[str]) (List[dict], int)
        +get_kb_ids(tenant_id: str) List[str]
        +get_detail(kb_id: str) Optional[dict]
        +update_parser_config(id: str, config: dict) None
        +delete_field_map(id: str) None
        +get_field_map(ids: List[str]) dict
        +get_by_name(kb_name: str, tenant_id: str) (bool, Optional[Knowledgebase])
        +get_all_ids() List[str]
        +get_list(joined_tenant_ids: List[str], user_id: str, page_number: int, items_per_page: int, orderby: str, desc: bool, id: Optional[str], name: Optional[str]) List[dict]
        +accessible(kb_id: str, user_id: str) bool
        +get_kb_by_id(kb_id: str, user_id: str) List[dict]
        +get_kb_by_name(kb_name: str, user_id: str) List[dict]
        +atomic_increase_doc_num_by_id(kb_id: str) int
        +update_document_number_in_init(kb_id: str, doc_num: int) None
    }

    KnowledgebaseService --> Knowledgebase : uses model
    KnowledgebaseService --|> CommonService : inherits
    KnowledgebaseService ..> DocumentService : calls get_by_kb_id
    KnowledgebaseService --> UserTenant : joins for permission check
    KnowledgebaseService --> Tenant : joins for tenant validation
    KnowledgebaseService --> User : joins for user info in queries

Summary

knowledgebase_service.py provides the core business logic for knowledge base management in InfiniFlow. It sits between the database layer and the higher-level application components that manipulate or display knowledge base data, enforcing tenant-aware access and keeping knowledge base state and configuration consistent. Its methods cover permission validation, querying, updating, and parse-status checking in a multi-tenant environment.