file2document_app.py


Overview

The file2document_app.py file provides RESTful API endpoints for managing the conversion of files into documents within the InfiniFlow system. It handles associating files (or folders containing files) with documents in one or more knowledgebases, as well as removing these associations. This process involves querying file metadata, knowledgebase configurations, and document records, then performing create and delete operations on documents and their mappings to files.

This file is part of the backend API layer. It is built with Flask, uses Flask-Login for authentication, and relies on service-layer modules for files, documents, knowledgebases, and the mappings between them.


Classes and Functions

This file does not define any classes but contains two main Flask route handler functions:

1. convert()

Purpose

Convert one or more files (or folders) into documents linked to the specified knowledgebases. A folder ID stands for the files it contains.

Decorators

Parameters

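Expected JSON body keys: file_ids (a list of file or folder IDs to convert) and kb_ids (a list of target knowledgebase IDs).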

Returns

Usage Example

POST /convert
Content-Type: application/json
Authorization: Bearer <token>

{
  "file_ids": ["file123", "folder456"],
  "kb_ids": ["kb789", "kb101"]
}
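The request above could be assembled from Python roughly as follows. This is a minimal sketch using only the standard library; BASE_URL and build_convert_request are hypothetical names introduced for illustration, and the real host, port, and route prefix depend on the deployment.

```python
import json
import urllib.request

# Assumption: hypothetical base URL; replace with the actual API address.
BASE_URL = "http://localhost:8000"


def build_convert_request(file_ids, kb_ids, token):
    """Build (but do not send) a POST request for the convert endpoint."""
    payload = json.dumps({"file_ids": file_ids, "kb_ids": kb_ids}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/convert",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


req = build_convert_request(["file123", "folder456"], ["kb789", "kb101"], "<token>")
# Sending is left to the caller, for example: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.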

Implementation Details

Errors are checked at each step: if a file, knowledgebase, or document cannot be found, or a create or delete operation fails, the handler returns an error result instead of continuing.
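The overall flow can be sketched with in-memory stand-ins for the service layer. Everything below is illustrative: the dictionaries, the helper innermost_file_ids, the result shape, and the choice to drop earlier documents for a file before creating new ones are assumptions, not the module's actual code.

```python
import uuid

# In-memory stand-ins for the service layer; all data here is hypothetical.
FILES = {
    "file123": {"type": "pdf", "name": "a.pdf"},
    "folder456": {"type": "folder", "children": ["file1", "file2"]},
    "file1": {"type": "doc", "name": "b.docx"},
    "file2": {"type": "pdf", "name": "c.pdf"},
}
KBS = {"kb789": {"parser_id": "naive"}, "kb101": {"parser_id": "naive"}}
DOCUMENTS = {}       # doc_id -> document record
FILE2DOCUMENT = []   # list of {"file_id", "document_id"} mappings


def innermost_file_ids(file_id, acc):
    """Recursively collect the IDs of non-folder files under file_id."""
    entry = FILES[file_id]
    if entry["type"] == "folder":
        for child in entry["children"]:
            innermost_file_ids(child, acc)
    else:
        acc.append(file_id)
    return acc


def convert(file_ids, kb_ids):
    """Return a plain result dict (a stand-in for the Flask JSON response)."""
    # Validate every referenced entity before changing anything.
    for fid in file_ids:
        if fid not in FILES:
            return {"ok": False, "error": f"File not found: {fid}"}
    for kb_id in kb_ids:
        if kb_id not in KBS:
            return {"ok": False, "error": f"Knowledgebase not found: {kb_id}"}

    created = []
    for fid in file_ids:
        for leaf in innermost_file_ids(fid, []):
            # Assumption: documents previously created from this file are dropped first.
            for old in [m for m in FILE2DOCUMENT if m["file_id"] == leaf]:
                DOCUMENTS.pop(old["document_id"], None)
                FILE2DOCUMENT.remove(old)
            # Create one document per target knowledgebase and record the mapping.
            for kb_id in kb_ids:
                doc_id = uuid.uuid4().hex
                DOCUMENTS[doc_id] = {
                    "kb_id": kb_id,
                    "name": FILES[leaf]["name"],
                    "parser_id": KBS[kb_id]["parser_id"],
                }
                mapping = {"file_id": leaf, "document_id": doc_id}
                FILE2DOCUMENT.append(mapping)
                created.append(mapping)
    return {"ok": True, "data": created}
```

Calling convert(["file123", "folder456"], ["kb789", "kb101"]) in this sketch expands the folder into its two files and creates one document per file per knowledgebase.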


2. rm()

Purpose

Remove documents and their associations with files for given file IDs.

Decorators

Parameters

Expected JSON body key: file_ids (a list of IDs of the files whose documents and associations should be removed).

Returns

Usage Example

POST /rm
Content-Type: application/json
Authorization: Bearer <token>

{
  "file_ids": ["file123", "file456"]
}

Implementation Details
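A minimal sketch of the removal flow, again using in-memory stand-ins rather than the real services. The data, the result shape, and the error messages are hypothetical; the sketch only illustrates looking up the mappings for each file, checking that the linked documents exist, and then deleting both the documents and the mappings.

```python
# In-memory stand-ins for the service layer; all data here is hypothetical.
DOCUMENTS = {
    "doc1": {"name": "a.pdf"},
    "doc2": {"name": "a.pdf"},
    "doc3": {"name": "b.pdf"},
}
FILE2DOCUMENT = [
    {"file_id": "file123", "document_id": "doc1"},
    {"file_id": "file123", "document_id": "doc2"},
    {"file_id": "file456", "document_id": "doc3"},
]


def rm(file_ids):
    """Return a plain result dict (a stand-in for the Flask JSON response)."""
    if not file_ids:
        return {"ok": False, "error": "file_ids is required"}
    for fid in file_ids:
        # Find every document that was created from this file.
        mappings = [m for m in FILE2DOCUMENT if m["file_id"] == fid]
        if not mappings:
            return {"ok": False, "error": f"No document linked to file: {fid}"}
        # Check that each linked document exists before deleting anything.
        for m in mappings:
            if m["document_id"] not in DOCUMENTS:
                return {"ok": False, "error": f"Document not found: {m['document_id']}"}
        # Remove the documents, then the file-to-document mappings.
        for m in mappings:
            del DOCUMENTS[m["document_id"]]
            FILE2DOCUMENT.remove(m)
    return {"ok": True, "data": True}
```

In this sketch a second call for the same file ID returns an error result, because no mapping is left to remove.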


Important Implementation Details


Interactions with Other Components

The endpoints exposed here are likely consumed by frontend components or other backend services that manage file ingestion, knowledgebase management, and document processing workflows in the InfiniFlow platform.


Mermaid Class Diagram

classDiagram
    class file2document_app {
        <<module>>
        +convert()
        +rm()
    }

    class FileService {
        +get_by_ids(ids: list) File[]
        +get_all_innermost_file_ids(folder_id: str, acc: list) list
        +get_by_id(file_id: str) (bool, File)
        +get_parser(type: str, name: str, kb_parser_id: str) str
    }

    class File2DocumentService {
        +get_by_file_id(file_id: str) list
        +insert(data: dict) File2Document
        +delete_by_file_id(file_id: str)
    }

    class DocumentService {
        +get_by_id(doc_id: str) (bool, Document)
        +get_tenant_id(doc_id: str) str
        +remove_document(doc: Document, tenant_id: str) bool
        +insert(data: dict) Document
    }

    class KnowledgebaseService {
        +get_by_id(kb_id: str) (bool, Knowledgebase)
    }

    file2document_app ..> FileService : uses
    file2document_app ..> File2DocumentService : uses
    file2document_app ..> DocumentService : uses
    file2document_app ..> KnowledgebaseService : uses

Summary

The file2document_app.py file is a backend API module that converts files into documents and removes those documents again within knowledgebases. Its two authenticated endpoints create and delete file-document relationships, checking that the referenced files, knowledgebases, and documents exist and returning an error result when a lookup or operation fails. Database access goes through the service layer (FileService, File2DocumentService, DocumentService, KnowledgebaseService), which places this module in the InfiniFlow document ingestion and management pipeline.