kb_app.py
Overview
kb_app.py is a Flask-based REST API module responsible for managing knowledge bases (KBs) within the InfiniFlow platform. It provides endpoints to create, update, delete, retrieve details, and list knowledge bases, as well as manage associated tags and knowledge graphs. The module enforces user authentication and authorization, ensuring that operations on knowledge bases are performed only by authorized users.
This file acts as the controller layer interfacing between HTTP requests and backend services such as KnowledgebaseService, DocumentService, and external components like settings.docStoreConn (likely an Elasticsearch or similar document store) and settings.retrievaler (a retrieval/search engine). It validates inputs, handles errors, and formats responses consistently.
Classes and Functions
This module does not define any classes but defines multiple Flask route handlers (functions) decorated for routes under the manager blueprint. Each function corresponds to an API endpoint.
1. create()
Route: POST /create
Description: Creates a new knowledge base for the authenticated user.
Parameters:
JSON body with:
name(string): Name of the new knowledge base. Must be non-empty and within the length limit defined by DATASET_NAME_LIMIT.
Returns:
JSON response with:
kb_id: The UUID of the newly created knowledge base on success.
Error messages if:
Name is invalid or duplicated.
Tenant is not found.
Other server errors.
Usage example:
POST /create
Content-Type: application/json
{
"name": "My Knowledge Base"
}
Implementation details:
Validates the name field.
Checks for duplicate names within the tenant.
Generates a UUID for the KB.
Sets tenant and creator info from current_user.
Saves the KB using KnowledgebaseService.
Uses exception handling to catch server errors.
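The steps above can be sketched in plain Python. This is a minimal illustration, not the module's actual code: the in-memory `store` dict stands in for KnowledgebaseService, the function name `create_kb`, the response shape, and the value given to `DATASET_NAME_LIMIT` are assumptions, and the sketch rejects duplicate names outright, whereas the real handler may resolve them differently.

```python
import uuid

# Assumed value for illustration; the real constant is defined by the project.
DATASET_NAME_LIMIT = 128


def create_kb(store, tenant_id, user_id, name):
    """Validate the name, reject duplicates within the tenant, and save a new KB record.

    `store` is a hypothetical dict mapping kb_id -> record.
    """
    if not isinstance(name, str) or not name.strip():
        return {"code": 1, "message": "Dataset name can't be empty."}
    name = name.strip()
    if len(name) > DATASET_NAME_LIMIT:
        return {"code": 1, "message": "Dataset name is too long."}
    # The duplicate check is scoped to the tenant, so two tenants may reuse a name.
    if any(kb["tenant_id"] == tenant_id and kb["name"] == name for kb in store.values()):
        return {"code": 1, "message": "Duplicated dataset name."}
    kb_id = uuid.uuid4().hex
    store[kb_id] = {
        "id": kb_id,
        "name": name,
        "tenant_id": tenant_id,
        "created_by": user_id,
    }
    return {"code": 0, "data": {"kb_id": kb_id}}
```

Returning a result dict instead of raising keeps the sketch close to the "JSON response with error message" behavior described for this endpoint.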
2. update()
Route: POST /update
Description: Updates metadata of an existing knowledge base.
Parameters:
JSON body with:
kb_id(string): ID of the knowledge base to update.
name(string): New name for the KB, validated similarly to create.
description(string): Description text.
parser_id(string/int): Parser identifier.
Optional pagerank(int): PageRank value for ordering relevance.
Returns:
JSON with updated knowledge base data.
Error messages for authorization issues, duplicate names, or server errors.
Important:
Disallows updates to protected fields like id, tenant_id, created_by.
Checks if the user is the owner and authorized to update.
Updates pagerank in the document store if changed.
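The ownership check and the conditional pagerank sync can be sketched as follows. This is an illustration under stated assumptions: `update_kb` is a hypothetical helper, the `index` dict stands in for settings.docStoreConn, and the choice to remove the index entry when pagerank drops to 0 is an assumption rather than confirmed behavior. It also assumes protected fields were already rejected before this point.

```python
def update_kb(kb, payload, user_id, index):
    """Return an updated copy of `kb`; sync pagerank to `index` only when it changed.

    `kb` is a record dict, `payload` the request body, and `index` a hypothetical
    dict standing in for the document store.
    """
    if kb["created_by"] != user_id:
        raise PermissionError("Only the owner of the knowledge base may update it.")

    # kb_id identifies the record and is not itself a field to update.
    changes = {k: v for k, v in payload.items() if k != "kb_id"}
    updated = {**kb, **changes}

    old_rank = kb.get("pagerank", 0)
    new_rank = updated.get("pagerank", 0)
    if new_rank != old_rank:
        if new_rank > 0:
            index[kb["id"]] = new_rank
        else:
            # Assumption: a pagerank of 0 means "no boost", so the entry is dropped.
            index.pop(kb["id"], None)
    return updated
```

Comparing the old and new values before touching the index avoids a write to the document store on every metadata edit.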
3. detail()
Route: GET /detail
Description: Retrieves detailed metadata for a specific knowledge base.
Parameters:
Query parameter:
kb_id(string): Knowledge base ID.
Returns:
JSON with knowledge base details including total document size.
Error if the user is unauthorized or KB not found.
4. list_kbs()
Route: POST /list
Description: Lists knowledge bases accessible by the user, supports pagination, filtering, and ordering.
Parameters:
Query parameters:
keywords(string): Filter by keywords.
page(int): Page number.
page_size(int): Number of items per page.
parser_id(optional): Filter by parser ID.
orderby(string): Field to order by.
desc(boolean): Descending order flag.
JSON body:
Optional owner_ids(list): Tenant IDs to filter knowledge bases by ownership.
Returns:
JSON with:
kbs: List of knowledge bases.
total: Total count.
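The filtering, ordering, and paging described for this endpoint can be sketched over an in-memory list. The function below is hypothetical: the default values, the `create_time` ordering field, and the case-insensitive match on `name` are assumptions made for illustration, and the real handler presumably delegates this work to a database query.

```python
def list_kbs(kbs, keywords="", page=1, page_size=10,
             orderby="create_time", desc=True, owner_ids=None):
    """Filter, sort, and page a list of KB records; return the page plus the total count."""
    # Keyword filter: case-insensitive substring match on the KB name (assumed).
    rows = [kb for kb in kbs if keywords.lower() in kb["name"].lower()]
    # Optional ownership filter by tenant ID.
    if owner_ids:
        rows = [kb for kb in rows if kb["tenant_id"] in owner_ids]
    rows.sort(key=lambda kb: kb[orderby], reverse=desc)
    # `total` counts all matches, not just the current page.
    total = len(rows)
    start = (page - 1) * page_size
    return {"kbs": rows[start:start + page_size], "total": total}
```

Reporting `total` before slicing lets a client compute the number of pages without a second request.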
5. rm()
Route: POST /rm
Description: Removes a knowledge base and all associated documents and files.
Parameters:
JSON body with:
kb_id(string): Knowledge base ID to delete.
Returns:
JSON indicating success or failure with appropriate error messages.
Implementation details:
Checks authorization.
Removes documents via DocumentService.
Deletes associated files and file-to-document links.
Deletes knowledge base record.
Removes entries from the document store and physical storage buckets if applicable.
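The cleanup order above can be illustrated with a small in-memory model. Everything here is hypothetical: the `state` dict and its keys stand in for the database tables, the document store, and the storage buckets, and the owner comparison stands in for the real authorization check.

```python
def remove_kb(state, kb_id, user_id):
    """Delete a KB and everything that hangs off it; return True on success."""
    kb = state["kbs"].get(kb_id)
    if kb is None or kb["created_by"] != user_id:
        return False  # not found, or the user may not delete it

    # 1. Collect the KB's documents and the files linked to them.
    doc_ids = {d for d, doc in state["documents"].items() if doc["kb_id"] == kb_id}
    file_ids = {link["file_id"] for link in state["file2document"]
                if link["document_id"] in doc_ids}

    # 2. Remove documents, files, and the file-to-document links.
    for doc_id in doc_ids:
        del state["documents"][doc_id]
    for file_id in file_ids:
        state["files"].pop(file_id, None)
    state["file2document"] = [link for link in state["file2document"]
                              if link["document_id"] not in doc_ids]

    # 3. Remove the KB record, then its index entries and storage bucket.
    del state["kbs"][kb_id]
    state["index"].pop(kb_id, None)   # stands in for the document store
    state["buckets"].discard(kb_id)   # stands in for the storage bucket
    return True
```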
6. list_tags(kb_id)
Route: GET /<kb_id>/tags
Description: Lists all tags associated with a specific knowledge base.
Parameters:
URL parameter:
kb_id(string): Knowledge base ID.
Returns:
JSON list of tags.
Authorization: User must have access to the KB.
7. list_tags_from_kbs()
Route: GET /tags
Description: Lists tags from multiple knowledge bases.
Parameters:
Query parameter:
kb_ids(string): Comma-separated KB IDs.
Returns:
JSON list of tags aggregated from all specified KBs.
Authorization: User must have access to all specified KBs.
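A sketch of the "all or nothing" authorization rule for a comma-separated ID list follows. The helper name, the `accessible` set, and the `tags_by_kb` mapping are assumptions introduced for illustration, and the sorted, de-duplicated return value is one plausible shape for the aggregated tag list rather than the endpoint's confirmed output.

```python
def tags_from_kbs(kb_ids_param, accessible, tags_by_kb):
    """Parse a comma-separated kb_ids string and merge tags from every listed KB.

    Raises PermissionError if any single KB is not accessible, so no partial
    result is ever returned.
    """
    kb_ids = [part.strip() for part in kb_ids_param.split(",") if part.strip()]
    # Check every ID up front, before reading any tags.
    for kb_id in kb_ids:
        if kb_id not in accessible:
            raise PermissionError(f"No authorization for knowledge base {kb_id}")

    merged = set()
    for kb_id in kb_ids:
        merged.update(tags_by_kb.get(kb_id, []))
    return sorted(merged)
```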
8. rm_tags(kb_id)
Route: POST /<kb_id>/rm_tags
Description: Removes specified tags from a knowledge base.
Parameters:
URL parameter:
kb_id(string): KB ID.
JSON body:
tags(list): Tags to remove.
Returns:
JSON indicating success.
9. rename_tags(kb_id)
Route: POST /<kb_id>/rename_tag
Description: Renames a tag in a knowledge base.
Parameters:
URL parameter:
kb_id(string): KB ID.
JSON body:
from_tag(string): Tag to rename.
to_tag(string): New tag name.
Returns:
JSON indicating success.
10. knowledge_graph(kb_id)
Route: GET /<kb_id>/knowledge_graph
Description: Retrieves knowledge graph and mind map data for a knowledge base.
Parameters:
URL parameter:
kb_id(string): KB ID.
Returns:
JSON object containing graph and mind_map structures.
Limits the graph to 256 nodes and 128 edges, with nodes sorted by rank and edges sorted by weight.
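The size limits can be sketched as a sort followed by a slice. The field names `pagerank`, `weight`, `id`, `source`, and `target` are assumptions about the stored graph JSON, and dropping edges whose endpoints were trimmed away is a design choice of this sketch rather than confirmed behavior.

```python
def trim_graph(graph, max_nodes=256, max_edges=128):
    """Keep the highest-ranked nodes and the heaviest edges between them."""
    nodes = sorted(graph.get("nodes", []),
                   key=lambda n: n.get("pagerank", 0),
                   reverse=True)[:max_nodes]
    kept_ids = {n["id"] for n in nodes}

    # Assumption: an edge is only useful if both endpoints survived the node cut.
    edges = [e for e in graph.get("edges", [])
             if e["source"] in kept_ids and e["target"] in kept_ids]
    edges = sorted(edges, key=lambda e: e.get("weight", 0), reverse=True)[:max_edges]
    return {"nodes": nodes, "edges": edges}
```

Filtering edges before slicing means the edge budget is spent only on edges a client can actually draw.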
11. delete_knowledge_graph(kb_id)
Route: DELETE /<kb_id>/knowledge_graph
Description: Deletes knowledge graph data from the document store for a knowledge base.
Parameters:
URL parameter:
kb_id(string): KB ID.
Returns:
JSON indicating success.
12. get_meta()
Route: GET /get_meta
Description: Retrieves metadata of documents for multiple knowledge bases.
Parameters:
Query parameter:
kb_ids(string): Comma-separated KB IDs.
Returns:
JSON metadata aggregated from specified KBs.
Implementation Details and Algorithms
Authorization Checks: Most endpoints verify if the current user is authorized to access or modify the specified knowledge base(s) via KnowledgebaseService.accessible or accessible4deletion.
Duplicate Name Handling: When creating or updating a KB, the system checks for duplicate names within the user's tenant using duplicate_name() and queries with KnowledgebaseService.query(...).
Document Store Integration: The file interacts heavily with an external document store connection, settings.docStoreConn, which stores indexed representations of knowledge bases, documents, tags, and graphs. Updates to pagerank fields and tags are propagated here.
Data Validation: Uses decorators like @validate_request and @not_allowed_parameters to ensure request payloads meet API expectations, enforcing security and data integrity.
Paging and Filtering: The list_kbs endpoint implements flexible paging and filtering parameters and supports querying knowledge bases owned by specific tenants.
Knowledge Graph Management: The knowledge graph retrieval endpoint extracts JSON content from stored documents, filtering and sorting graph nodes and edges to manageable sizes for client consumption.
Storage Cleanup: When deleting a KB, the code removes related files, documents, and storage buckets (if supported by the storage implementation).
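The decorator-based validation mentioned above can be illustrated with a framework-free sketch. The decorator names mirror those in the text, but the bodies, error codes, and messages are assumptions: the real decorators presumably read the payload from the Flask request, whereas these take it as the first argument so the example runs on its own.

```python
from functools import wraps


def validate_request(*required):
    """Reject payloads that lack any of the required keys."""
    def decorator(func):
        @wraps(func)
        def wrapper(payload, *args, **kwargs):
            missing = [key for key in required if key not in payload]
            if missing:
                return {"code": 101, "message": "Missing arguments: " + ", ".join(missing)}
            return func(payload, *args, **kwargs)
        return wrapper
    return decorator


def not_allowed_parameters(*forbidden):
    """Reject payloads that try to set any of the forbidden keys."""
    def decorator(func):
        @wraps(func)
        def wrapper(payload, *args, **kwargs):
            present = [key for key in forbidden if key in payload]
            if present:
                return {"code": 101, "message": "Not allowed: " + ", ".join(present)}
            return func(payload, *args, **kwargs)
        return wrapper
    return decorator


# Hypothetical handler showing how the two checks stack.
@validate_request("kb_id", "name")
@not_allowed_parameters("id", "tenant_id", "created_by")
def update_handler(payload):
    return {"code": 0, "data": {"kb_id": payload["kb_id"], "name": payload["name"]}}
```

Because decorators apply from the bottom up and run from the top down, the required-key check here executes before the forbidden-key check.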
Interaction with Other System Components
Flask and Flask-Login: Provide HTTP routing and user session management.
Database Services: Uses various services (KnowledgebaseService, DocumentService, FileService, etc.) for CRUD operations on knowledge bases, documents, files, and tenant/user info.
Document Store (settings.docStoreConn): An external datastore (likely Elasticsearch or similar) where knowledge bases and documents are indexed for fast retrieval. Used for tag management, pagerank updates, and knowledge graph storage.
Retrieval Engine (settings.retrievaler): Provides search and retrieval capabilities, including fetching tags and knowledge graph data.
Storage Factory (STORAGE_IMPL): Abstracts physical storage buckets; used during KB deletion to remove buckets.
Constants and Settings: Uses constants like DATASET_NAME_LIMIT and settings enums/codes for response codes and configuration.
Visual Diagram
classDiagram
class kb_app {
+create()
+update()
+detail()
+list_kbs()
+rm()
+list_tags(kb_id)
+list_tags_from_kbs()
+rm_tags(kb_id)
+rename_tags(kb_id)
+knowledge_graph(kb_id)
+delete_knowledge_graph(kb_id)
+get_meta()
}
kb_app ..> KnowledgebaseService : uses
kb_app ..> DocumentService : uses
kb_app ..> FileService : uses
kb_app ..> TenantService : uses
kb_app ..> UserTenantService : uses
kb_app ..> settings.docStoreConn : uses for indexing/search
kb_app ..> settings.retrievaler : uses for search/tags
kb_app ..> STORAGE_IMPL : uses for storage bucket management
Summary
kb_app.py is the primary API controller for knowledge base management in InfiniFlow. It provides comprehensive endpoints to create, update, delete, retrieve, and list knowledge bases and their associated metadata, tags, and knowledge graphs. The file enforces strict authorization, validates inputs, and coordinates with backend services and external document stores to maintain consistency and performance. It is a critical component bridging user requests with persistent storage and search infrastructure.