common.py


Overview

common.py is a utility module of helper functions for the InfiniFlow backend REST API. It covers the main entities: datasets, documents (files), chunks, chat assistants, and sessions. Each function wraps one HTTP request to an API endpoint, so callers can create, retrieve, update, and delete (CRUD) these entities, or create them in batches, without building requests by hand.

This file acts as a client-side connector for the InfiniFlow system, handling authentication, request formatting, multipart file uploads, and response parsing. It is designed to be used by higher-level application components or scripts that require programmatic access to InfiniFlow’s resources.


Detailed Functionality

The functions in common.py are grouped by the resource or domain they manage: datasets, documents (files), chunks, chat assistants, and chat assistant sessions.

All HTTP requests use the requests library and expect an auth parameter (for authentication, e.g., HTTPBasicAuth). The base URL is configurable via the HOST_ADDRESS environment variable.
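As a rough sketch of the configuration described above, the base URL can be read from the environment with a fallback to the documented default. The helper names `resolve_host` and `build_url` are hypothetical and do not come from common.py; any path passed to them is likewise only an example.

```python
import os

# Default documented for HOST_ADDRESS.
DEFAULT_HOST = "http://127.0.0.1:9380"


def resolve_host(environ=None):
    """Return the API base URL, preferring the HOST_ADDRESS variable."""
    environ = os.environ if environ is None else environ
    # Strip a trailing slash so joining with "/path" never yields "//path".
    return environ.get("HOST_ADDRESS", DEFAULT_HOST).rstrip("/")


def build_url(path, environ=None):
    """Join the base URL with an endpoint path (hypothetical helper)."""
    return f"{resolve_host(environ)}/{path.lstrip('/')}"
```

Passing a mapping instead of reading `os.environ` directly keeps the helper easy to exercise in tests; in normal use the argument can simply be omitted.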


Constants

| Constant | Description |
| --- | --- |
| HEADERS | Default HTTP headers for JSON content. |
| HOST_ADDRESS | Base URL for API requests (default http://127.0.0.1:9380). |
| DATASETS_API_URL | API endpoint path for dataset operations. |
| FILE_API_URL | API endpoint path for documents within datasets. |
| FILE_CHUNK_API_URL | API endpoint for file chunk operations. |
| CHUNK_API_URL | API endpoint for document chunk operations. |
| CHAT_ASSISTANT_API_URL | API endpoint path for chat assistant operations. |
| SESSION_WITH_CHAT_ASSISTANT_API_URL | API endpoint for chat assistant sessions. |
| SESSION_WITH_AGENT_API_URL | API endpoint for agent sessions (not explicitly used in this file). |
| INVALID_API_TOKEN | Placeholder invalid token string. |
| DATASET_NAME_LIMIT | Max length for dataset names (128 chars). |
| DOCUMENT_NAME_LIMIT | Max length for document names (128 chars). |
| CHAT_ASSISTANT_NAME_LIMIT | Max length for chat assistant names (255 chars). |
| SESSION_WITH_CHAT_NAME_LIMIT | Max length for chat assistant session names (255 chars). |
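The name limits lend themselves to boundary-value checks. The sketch below uses the limit values listed above; the helpers `name_of_length` and `boundary_names` are hypothetical, and whether the backend treats each limit as inclusive is an assumption that should be confirmed against the API.

```python
# Limit values as documented in the Constants section.
DATASET_NAME_LIMIT = 128
DOCUMENT_NAME_LIMIT = 128
CHAT_ASSISTANT_NAME_LIMIT = 255
SESSION_WITH_CHAT_NAME_LIMIT = 255


def name_of_length(length, fill="a"):
    """Build a name with exactly `length` characters (hypothetical helper)."""
    return fill * length


def boundary_names(limit):
    """Return a name exactly at the limit and one a single character over it.

    Assumption: the first is expected to be accepted, the second rejected.
    """
    return name_of_length(limit), name_of_length(limit + 1)
```

For example, `boundary_names(DATASET_NAME_LIMIT)` gives one name to pass to `create_dataset` that should succeed and one that should fail, under the inclusive-limit assumption.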


Dataset Management Functions

create_dataset(auth, payload=None, *, headers=HEADERS, data=None)

Creates a new dataset.

list_datasets(auth, params=None, *, headers=HEADERS)

Lists datasets, optionally filtered by parameters.

update_dataset(auth, dataset_id, payload=None, *, headers=HEADERS, data=None)

Updates dataset metadata.

delete_datasets(auth, payload=None, *, headers=HEADERS, data=None)

Deletes datasets in batch via payload.

batch_create_datasets(auth, num)

Creates multiple datasets named sequentially.
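The following sketch suggests how `create_dataset` and `batch_create_datasets` might fit together. Several details are assumptions not stated in this document: the exact `HEADERS` value, the `/api/v1/datasets` path, the `dataset_{i}` naming pattern, and the `{"data": {"id": ...}}` response shape. The extra `post` parameter is a hypothetical injection point so the example can run without a live server.

```python
HEADERS = {"Content-Type": "application/json"}  # assumed value
HOST_ADDRESS = "http://127.0.0.1:9380"
DATASETS_API_URL = "/api/v1/datasets"  # assumed path


def create_dataset(auth, payload=None, *, headers=HEADERS, data=None, post=None):
    """POST a dataset definition and return the decoded JSON body."""
    if post is None:
        # Third-party dependency, imported lazily so a fake `post` needs no install.
        import requests
        post = requests.post
    res = post(
        url=f"{HOST_ADDRESS}{DATASETS_API_URL}",
        headers=headers,
        auth=auth,
        json=payload,  # structured payload, serialized as JSON
        data=data,     # raw body, useful for sending malformed input in tests
    )
    return res.json()


def batch_create_datasets(auth, num, post=None):
    """Create `num` sequentially named datasets and return their IDs."""
    ids = []
    for i in range(num):
        res = create_dataset(auth, {"name": f"dataset_{i}"}, post=post)
        ids.append(res["data"]["id"])  # assumed response shape
    return ids
```

Keeping `payload` and `data` as separate arguments lets a caller send either a well-formed JSON object or an arbitrary raw body through the same function.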


File (Document) Management Functions

upload_documnets(auth, dataset_id, files_path=None)

Uploads one or more documents/files to a dataset.

download_document(auth, dataset_id, document_id, save_path)

Downloads a document from a dataset and saves it locally.

list_documnets(auth, dataset_id, params=None)

Lists documents in a dataset.

update_documnet(auth, dataset_id, document_id, payload=None)

Updates document metadata.

delete_documnets(auth, dataset_id, payload=None)

Deletes one or more documents in a dataset.

parse_documnets(auth, dataset_id, payload=None)

Triggers parsing of documents in a dataset to create chunks.

stop_parse_documnets(auth, dataset_id, payload=None)

Stops ongoing document parsing.

bulk_upload_documents(auth, dataset_id, num, tmp_path)

Helper to create temporary files and bulk upload them.
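A bulk upload helper of this kind could be sketched as two steps: write small files under a temporary directory, then send them in a single multipart request. The function names `create_txt_files` and `upload_files`, the `upload_{i}.txt` file names, and the `file` form-field name are all assumptions for illustration. `post` is expected to be `requests.post` or a compatible callable; the actual module may build its multipart body differently.

```python
from contextlib import ExitStack
from pathlib import Path


def create_txt_files(tmp_path, num):
    """Write `num` small text files under tmp_path and return their paths."""
    tmp_path = Path(tmp_path)
    paths = []
    for i in range(num):
        fp = tmp_path / f"upload_{i}.txt"
        fp.write_text(f"sample text {i}")
        paths.append(fp)
    return paths


def upload_files(auth, url, files_path, post):
    """Send every file in one multipart request and return the JSON body."""
    with ExitStack() as stack:
        # ExitStack closes every opened handle, even if the request raises.
        fields = [
            ("file", (Path(p).name, stack.enter_context(open(p, "rb"))))
            for p in files_path
        ]
        res = post(url=url, auth=auth, files=fields)
        return res.json()
```

With pytest, the built-in `tmp_path` fixture can be passed straight to `create_txt_files`, which matches the `tmp_path` parameter in the `bulk_upload_documents` signature.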


Chunk Management Functions

add_chunk(auth, dataset_id, document_id, payload=None)

Adds a chunk to a specific document.

list_chunks(auth, dataset_id, document_id, params=None)

Lists chunks for a document.

update_chunk(auth, dataset_id, document_id, chunk_id, payload=None)

Updates a chunk’s data.

delete_chunks(auth, dataset_id, document_id, payload=None)

Deletes chunks in batch.

retrieval_chunks(auth, payload=None)

Retrieves chunks based on retrieval query.

batch_add_chunks(auth, dataset_id, document_id, num)

Creates several chunks with test content.
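Chunk functions take both a `dataset_id` and a `document_id`, which suggests a nested endpoint path. The sketch below shows one way such a URL and the batch test payloads might be built. The path template, the `chunk_url` and `chunk_payloads` helper names, and the `chunk test {i}` content pattern are assumptions, since this document does not spell them out.

```python
HOST_ADDRESS = "http://127.0.0.1:9380"
# Assumed template; the real path is defined in common.py.
CHUNK_API_URL = "/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks"


def chunk_url(dataset_id, document_id, chunk_id=None):
    """Build the chunk collection URL, or a single-chunk URL if chunk_id is given."""
    url = (HOST_ADDRESS + CHUNK_API_URL).format(
        dataset_id=dataset_id, document_id=document_id
    )
    return url if chunk_id is None else f"{url}/{chunk_id}"


def chunk_payloads(num):
    """Return `num` payloads with distinct test content, one per chunk to add."""
    return [{"content": f"chunk test {i}"} for i in range(num)]
```

A batch helper would then post each payload to `chunk_url(dataset_id, document_id)` and collect the returned chunk IDs.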


Chat Assistant Management Functions

create_chat_assistant(auth, payload=None)

Creates a new chat assistant.

list_chat_assistants(auth, params=None)

Lists chat assistants.

update_chat_assistant(auth, chat_assistant_id, payload=None)

Updates chat assistant info.

delete_chat_assistants(auth, payload=None)

Deletes chat assistants.

batch_create_chat_assistants(auth, num)

Creates multiple chat assistants.


Session Management with Chat Assistants

create_session_with_chat_assistant(auth, chat_assistant_id, payload=None)

Creates a session linked to a chat assistant.

list_session_with_chat_assistants(auth, chat_assistant_id, params=None)

Lists sessions for a chat assistant.

update_session_with_chat_assistant(auth, chat_assistant_id, session_id, payload=None)

Updates session info.

delete_session_with_chat_assistants(auth, chat_assistant_id, payload=None)

Deletes sessions linked to a chat assistant.

batch_add_sessions_with_chat_assistant(auth, chat_assistant_id, num)

Creates multiple sessions for a chat assistant.
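The batch session helper can be pictured as a loop over the single-session call. In this sketch the function name `batch_add_sessions`, the `session_{i}` naming pattern, and the `{"data": {"id": ...}}` response shape are assumptions. `create_session` stands in for `create_session_with_chat_assistant` and is passed in so the example runs without a server.

```python
def batch_add_sessions(create_session, auth, chat_assistant_id, num):
    """Create `num` sessions for one chat assistant and return their IDs.

    `create_session` must accept (auth, chat_assistant_id, payload) and
    return the decoded JSON response as a dict.
    """
    session_ids = []
    for i in range(num):
        res = create_session(auth, chat_assistant_id, {"name": f"session_{i}"})
        session_ids.append(res["data"]["id"])  # assumed response shape
    return session_ids
```

Returning the collected IDs lets a caller hand them to a later update or delete call, for example during test teardown.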


Important Implementation Details


Interaction with Other System Components


Visual Diagram

The following Mermaid class diagram visualizes the logical grouping of functions in common.py. Since the file is a utility module without classes, the diagram shows function groups as "utility classes" for clarity.

classDiagram
    class DatasetManagement {
        +create_dataset(auth, payload)
        +list_datasets(auth, params)
        +update_dataset(auth, dataset_id, payload)
        +delete_datasets(auth, payload)
        +batch_create_datasets(auth, num)
    }
    class DocumentManagement {
        +upload_documnets(auth, dataset_id, files_path)
        +download_document(auth, dataset_id, document_id, save_path)
        +list_documnets(auth, dataset_id, params)
        +update_documnet(auth, dataset_id, document_id, payload)
        +delete_documnets(auth, dataset_id, payload)
        +parse_documnets(auth, dataset_id, payload)
        +stop_parse_documnets(auth, dataset_id, payload)
        +bulk_upload_documents(auth, dataset_id, num, tmp_path)
    }
    class ChunkManagement {
        +add_chunk(auth, dataset_id, document_id, payload)
        +list_chunks(auth, dataset_id, document_id, params)
        +update_chunk(auth, dataset_id, document_id, chunk_id, payload)
        +delete_chunks(auth, dataset_id, document_id, payload)
        +retrieval_chunks(auth, payload)
        +batch_add_chunks(auth, dataset_id, document_id, num)
    }
    class ChatAssistantManagement {
        +create_chat_assistant(auth, payload)
        +list_chat_assistants(auth, params)
        +update_chat_assistant(auth, chat_assistant_id, payload)
        +delete_chat_assistants(auth, payload)
        +batch_create_chat_assistants(auth, num)
    }
    class SessionManagement {
        +create_session_with_chat_assistant(auth, chat_assistant_id, payload)
        +list_session_with_chat_assistants(auth, chat_assistant_id, params)
        +update_session_with_chat_assistant(auth, chat_assistant_id, session_id, payload)
        +delete_session_with_chat_assistants(auth, chat_assistant_id, payload)
        +batch_add_sessions_with_chat_assistant(auth, chat_assistant_id, num)
    }

    DatasetManagement --> DocumentManagement : manages documents
    DocumentManagement --> ChunkManagement : manages chunks