__init__.py
Overview
This __init__.py file is a foundational utility module within the InfiniFlow project. It provides helper functions, classes, and configuration management utilities used across the codebase. Its main responsibilities include:
Loading and merging configuration files with support for local overrides.
Secure handling of sensitive configuration data (e.g., passwords, keys).
Serialization and deserialization utilities including safe unpickling.
Date/time formatting and conversion helpers.
Cryptographic utilities for decrypting passwords or data.
Network utility to retrieve local IP address.
JSON encoding extensions for custom types.
Unique identifier generation.
Other miscellaneous helpers such as downloading images as base64, hashing strings, and elapsed time formatting.
This module acts as a core utility toolkit that other parts of the system depend on to perform common low-level tasks, facilitating cleaner, safer, and more consistent code.
Detailed Explanations
Global Constants and Variables
CONFIGS: A dictionary containing the merged global and local configuration, loaded at module import.
use_deserialize_safe_module: Boolean flag indicating whether to use a restricted, safer deserialization method for pickled data. Derived from configs or environment variables.
Functions
conf_realpath(conf_name: str) -> str
Returns the absolute path of a configuration file inside the conf/ directory relative to the project base directory.
Parameters:
conf_name - The configuration file name, e.g., "service.yaml".
Returns: Absolute file path string.
Usage:
config_path = conf_realpath("service.yaml")
read_config(conf_name: str = SERVICE_CONF) -> dict
Loads and merges global and local YAML configuration files.
Loads conf/{conf_name} as the global config.
Loads conf/local.{conf_name} as local overrides if it exists.
Merges local config keys into global config, overriding duplicates.
Validates that loaded configs are dictionaries.
Parameters:
conf_name - The main config file name (defaults to the SERVICE_CONF constant).
Returns: Merged configuration dictionary.
Exceptions:
Raises ValueError if any config file is invalid (not a dict).
Example:
configs = read_config()
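The layering described above can be sketched without touching the file system. This is a minimal illustration, not the module's source: merge_configs is a hypothetical helper, the two dictionaries stand in for the parsed global and local YAML files, and the merge is assumed to be shallow (a local top-level key replaces the whole global value).

```python
from typing import Any


def merge_configs(global_config: Any, local_config: Any = None) -> dict:
    """Merge local overrides into the global config (top-level keys only)."""
    if not isinstance(global_config, dict):
        raise ValueError("Invalid global config: expected a dict")
    merged = dict(global_config)  # copy so the caller's dict is not mutated
    if local_config is None:
        return merged  # no local override file was found
    if not isinstance(local_config, dict):
        raise ValueError("Invalid local config: expected a dict")
    merged.update(local_config)  # local keys replace duplicate global keys
    return merged


# Hypothetical contents of conf/service.yaml and conf/local.service.yaml
global_cfg = {"ragflow": {"http_port": 9380}, "mysql": {"user": "root"}}
local_cfg = {"mysql": {"user": "dev"}}
merged = merge_configs(global_cfg, local_cfg)
```

Under the shallow-merge assumption, a local section must repeat every key it wants to keep, because the global section of the same name is replaced as a whole.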
show_configs() -> None
Logs the current configuration with sensitive values masked (e.g., passwords replaced with ********).
Iterates over the config dictionary.
Masks keys containing sensitive info like password, secret_key, access_key, sas_token, client_secret, etc.
Outputs the config summary via logging.info.
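The masking step might look like the sketch below. mask_sensitive and SENSITIVE_MARKERS are hypothetical names, and the sketch assumes secrets sit one level deep (inside per-service sections); the real function may traverse the config differently.

```python
import copy

# Substrings treated as sensitive, taken from the list above.
SENSITIVE_MARKERS = ("password", "secret_key", "access_key", "sas_token", "client_secret")


def mask_sensitive(config: dict) -> dict:
    """Return a copy of config with sensitive values replaced by asterisks."""
    masked = copy.deepcopy(config)  # never modify the live configuration
    for section in masked.values():
        if not isinstance(section, dict):
            continue  # scalar top-level values are left untouched
        for key in section:
            if any(marker in str(key) for marker in SENSITIVE_MARKERS):
                section[key] = "*" * 8
    return masked


# Hypothetical config sections used only for illustration
config = {
    "mysql": {"user": "root", "password": "hunter2"},
    "minio": {"host": "minio:9000", "secret_key": "abc123"},
}
safe_to_log = mask_sensitive(config)
```

Masking a deep copy keeps the real credentials available to the rest of the process while only the masked copy reaches the log.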
get_base_config(key: str, default=None) -> any
Retrieves a configuration value by key from the merged CONFIGS dictionary.
If key is None, returns None.
If default is None, tries to get the default from the environment variable named key.upper().
Then returns CONFIGS[key] if the key exists, otherwise the default.
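The lookup order can be sketched as follows. lookup_config is a hypothetical stand-in that takes the config dictionary explicitly instead of reading the module-level CONFIGS, and the keys shown are made up for the example.

```python
import os


def lookup_config(configs: dict, key, default=None):
    """Look up key in configs, falling back to an environment variable."""
    if key is None:
        return None
    if default is None:
        # No explicit default: use the upper-cased environment variable, if set.
        default = os.environ.get(key.upper())
    return configs.get(key, default)


configs = {"ragflow": {"host": "0.0.0.0"}}  # stand-in for CONFIGS
os.environ["EXAMPLE_FLAG"] = "1"  # hypothetical variable, set for the demo

from_config = lookup_config(configs, "ragflow")
from_env = lookup_config(configs, "example_flag")
from_default = lookup_config(configs, "no_such_key_here", "fallback")
```

Note that in this sketch a value found in the config always wins; the environment variable only supplies the fallback.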
string_to_bytes(string: str|bytes) -> bytes
Ensures the input string is returned as bytes using UTF-8 encoding if input is str.
bytes_to_string(byte: bytes) -> str
Decodes bytes to UTF-8 string.
json_dumps(src: any, byte: bool = False, indent: int | None = None, with_type: bool = False) -> str | bytes
Serializes a Python object to JSON string or bytes.
Supports indentation.
Uses CustomJSONEncoder to handle dates, enums, sets, and BaseType objects.
If byte=True, returns UTF-8 encoded bytes of the JSON string.
with_type=True includes type metadata in serialization.
json_loads(src: str | bytes, object_hook=None, object_pairs_hook=None) -> any
Deserializes JSON string or bytes to Python object.
Decodes bytes to string if needed.
Supports optional object_hook and object_pairs_hook.
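An object_hook receives every decoded JSON object and may replace it with something richer. The sketch below uses the standard json.loads directly with a hypothetical hook, rebuild_hook, that re-creates an object from "type", "module", and "data" keys; passing the data as keyword arguments is an assumption made for this example.

```python
import datetime
import importlib
import json


def rebuild_hook(in_dict: dict):
    """Re-create an object from type metadata, or return the dict unchanged."""
    if {"type", "data", "module"} <= in_dict.keys():
        module = importlib.import_module(in_dict["module"])
        cls = getattr(module, in_dict["type"])
        return cls(**in_dict["data"])  # assumption: data maps to keyword arguments
    return in_dict


payload = '{"type": "timedelta", "module": "datetime", "data": {"days": 1, "seconds": 30}}'
restored = json.loads(payload, object_hook=rebuild_hook)
is_timedelta = isinstance(restored, datetime.timedelta)
```

Because a hook like this imports whatever module the payload names, it should only be applied to JSON from trusted sources.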
current_timestamp() -> int
Returns the current timestamp in milliseconds since the Unix epoch.
timestamp_to_date(timestamp: int | float, format_string: str = "%Y-%m-%d %H:%M:%S") -> str
Converts a millisecond timestamp to a formatted date string.
date_string_to_timestamp(time_str: str, format_string: str = "%Y-%m-%d %H:%M:%S") -> int
Converts a formatted date string to a millisecond timestamp.
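The three time helpers form a round trip between millisecond timestamps and formatted strings. The sketch below uses hypothetical names (now_ms, ms_to_date, date_to_ms) and assumes the conversions use the machine's local time zone, which the text above does not state.

```python
import time

DEFAULT_FORMAT = "%Y-%m-%d %H:%M:%S"


def now_ms() -> int:
    """Current time in milliseconds since the Unix epoch."""
    return int(time.time() * 1000)


def ms_to_date(timestamp, format_string: str = DEFAULT_FORMAT) -> str:
    """Format a millisecond timestamp in local time."""
    return time.strftime(format_string, time.localtime(timestamp / 1000))


def date_to_ms(time_str: str, format_string: str = DEFAULT_FORMAT) -> int:
    """Parse a local-time string into a millisecond timestamp."""
    return int(time.mktime(time.strptime(time_str, format_string)) * 1000)


stamp = date_to_ms("2024-01-15 12:00:00")
round_trip = ms_to_date(stamp)
```

Because the format has second resolution, converting a timestamp to a string and back drops the millisecond part.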
serialize_b64(src: any, to_str: bool = False) -> bytes | str
Serializes a Python object using pickle, then base64 encodes it.
If to_str=True, returns base64 encoded data as UTF-8 string.
Otherwise returns bytes.
deserialize_b64(src: str | bytes) -> any
Decodes base64 string/bytes and deserializes using pickle.
If use_deserialize_safe_module is True, uses a restricted unpickler for security.
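The pickle-then-base64 round trip can be sketched with the standard library alone. to_b64 and from_b64 are hypothetical names, and this version always uses the unrestricted pickle.loads, so it corresponds only to the case where the safe-module flag is off.

```python
import base64
import pickle


def to_b64(obj, to_str: bool = False):
    """Pickle obj and base64-encode the result."""
    encoded = base64.b64encode(pickle.dumps(obj))
    return encoded.decode("utf-8") if to_str else encoded


def from_b64(src):
    """Reverse to_b64. Unrestricted unpickling: use on trusted input only."""
    if isinstance(src, str):
        src = src.encode("utf-8")
    return pickle.loads(base64.b64decode(src))


original = {"ids": [1, 2, 3], "name": "chunk"}
as_text = to_b64(original, to_str=True)
restored = from_b64(as_text)
```

Base64 makes the pickled bytes safe to store in text fields, but it adds no protection: unpickling untrusted data can execute arbitrary code.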
get_lan_ip() -> str
Retrieves the local LAN IP address.
On non-Windows systems, attempts to obtain IP via common network interfaces.
Falls back to hostname resolution.
Returns empty string if no IP found.
from_dict_hook(in_dict: dict) -> any
Custom JSON object hook for deserialization.
If the dictionary contains keys "type", "data", and "module", re-creates the corresponding object by importing the module and instantiating the class with the data.
Otherwise returns the dictionary as is.
decrypt_database_password(password: str) -> str
Decrypts a database password if encryption is enabled in config.
Requires encrypt_password to be True in config.
Uses encrypt_module and private_key from config to perform decryption.
Raises ValueError if private_key is absent.
Dynamically imports the decryption function.
decrypt_database_config(database: dict = None, passwd_key: str = "password", name: str = "database") -> dict
Decrypts the password in a database config dictionary.
If no database dict is provided, loads it from config by name.
Decrypts the password under the passwd_key key.
Returns the updated database dict.
update_config(key: str, value: any, conf_name: str = SERVICE_CONF) -> None
Updates a key-value pair in the main configuration file.
Acquires a file lock to prevent concurrent writes.
Loads existing config, updates key, and rewrites YAML config file.
get_uuid() -> str
Returns a new UUID string (UUID1 hex format).
datetime_format(date_time: datetime.datetime) -> datetime.datetime
Returns a datetime object truncated to seconds precision (dropping microseconds).
get_format_time() -> datetime.datetime
Returns the current datetime truncated to seconds.
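Truncating to whole seconds can be done in one call. The helper name below is hypothetical, and using replace() is one possible approach rather than necessarily how the module does it.

```python
import datetime


def truncate_to_seconds(date_time: datetime.datetime) -> datetime.datetime:
    """Drop the microsecond component, keeping everything else."""
    return date_time.replace(microsecond=0)


precise = datetime.datetime(2024, 1, 15, 12, 0, 0, 123456)
truncated = truncate_to_seconds(precise)
current = truncate_to_seconds(datetime.datetime.now())
```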
str2date(date_time: str) -> datetime.datetime
Parses a string in the format %Y-%m-%d into a datetime.datetime object.
elapsed2time(elapsed: int | float) -> str
Converts elapsed milliseconds into a string formatted as HH:MM:SS.
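As a worked example of the conversion, the sketch below (format_elapsed is a hypothetical name) splits the elapsed time with divmod; fractional seconds are assumed to be discarded.

```python
def format_elapsed(elapsed_ms) -> str:
    """Format a duration given in milliseconds as HH:MM:SS."""
    total_seconds = int(elapsed_ms / 1000)  # discard fractional seconds
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# 3,725,000 ms = 3725 s = 1 h 2 min 5 s
formatted = format_elapsed(3_725_000)  # "01:02:05"
```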
decrypt(line: str) -> str
Decrypts an RSA encrypted base64-encoded string using a private key stored in conf/private.pem.
Uses the PKCS#1 v1.5 cipher from Cryptodome.
Returns the decrypted UTF-8 string.
decrypt2(crypt_text: str) -> str
Alternative RSA decryption method for base64 encoded input, addressing specific decode length edge cases.
Reads the private key from conf/private.pem.
Returns the decrypted UTF-8 string.
download_img(url: str) -> str
Downloads an image from a URL and returns a base64-encoded data URI string.
Uses requests to GET the image.
Returns an empty string if the URL is empty.
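The encoding half of this function can be illustrated without any network access. to_data_uri is a hypothetical helper; it assumes the usual data:<type>;base64,<payload> layout and takes the content type as a parameter, whereas the real function would obtain the bytes (and presumably the type) from the HTTP response.

```python
import base64


def to_data_uri(content: bytes, content_type: str = "image/jpeg") -> str:
    """Build a base64 data URI from raw bytes."""
    payload = base64.b64encode(content).decode("utf-8")
    return f"data:{content_type};base64,{payload}"


# Three placeholder bytes instead of a real image
uri = to_data_uri(b"abc", "image/png")  # "data:image/png;base64,YWJj"
```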
delta_seconds(date_string: str) -> float
Returns the number of seconds elapsed between the given date string ("%Y-%m-%d %H:%M:%S") and now.
hash_str2int(line: str, mod: int = 10**8) -> int
Hashes a string using SHA1, converts to integer, and returns modulo mod (default 100 million).
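The description maps directly onto hashlib. The sketch below uses a hypothetical name, stable_hash, and assumes the string is encoded as UTF-8 before hashing.

```python
import hashlib


def stable_hash(line: str, mod: int = 10**8) -> int:
    """Map a string to an integer in [0, mod) using SHA-1."""
    digest = hashlib.sha1(line.encode("utf-8")).hexdigest()
    return int(digest, 16) % mod


bucket = stable_hash("document-42")
same_bucket = stable_hash("document-42")
```

Unlike the built-in hash(), whose string hashes are randomized per process by default, a digest-based hash gives the same value across runs and machines.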
Classes
BaseType
Base class designed to be inherited by other classes for serialization support.
Methods:
to_dict(self) -> dict
Returns a shallow dictionary representation of the instance, stripping leading underscores from attribute names.
to_dict_with_type(self) -> dict
Recursively converts the object and nested objects into dictionaries including type and module metadata, useful for type-aware serialization.
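A simplified sketch of the two methods is shown below. BaseTypeSketch and Point are hypothetical classes; the "type", "data", "module" layout is an assumption inferred from the keys that from_dict_hook looks for, and this version only recurses into nested BaseTypeSketch instances, not into lists or dicts.

```python
class BaseTypeSketch:
    def to_dict(self) -> dict:
        """Shallow dict of attributes with leading underscores stripped."""
        return {name.lstrip("_"): value for name, value in self.__dict__.items()}

    def to_dict_with_type(self) -> dict:
        """Nested dict that records each value's class name and module."""
        def wrap(obj):
            if isinstance(obj, BaseTypeSketch):
                data = {name.lstrip("_"): wrap(value) for name, value in obj.__dict__.items()}
                module = obj.__class__.__module__
            else:
                data, module = obj, None  # plain values carry no module
            return {"type": obj.__class__.__name__, "data": data, "module": module}

        return wrap(self)


class Point(BaseTypeSketch):
    def __init__(self, x, y):
        self._x = x
        self._y = y


point = Point(1, 2)
plain = point.to_dict()
typed = point.to_dict_with_type()
```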
CustomJSONEncoder(json.JSONEncoder)
A JSON encoder subclass that extends support for additional Python types.
Constructor:
Accepts a with_type boolean to control whether to include type metadata for BaseType objects.
Supported types:
datetime.datetime → formatted string %Y-%m-%d %H:%M:%S.
datetime.date → formatted string %Y-%m-%d.
datetime.timedelta → string via str().
Enum and IntEnum → their .value.
set → converted to list.
BaseType → serialized via to_dict() or to_dict_with_type().
type → class name string.
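The type mapping above can be sketched as a json.JSONEncoder subclass. SketchEncoder and Color are hypothetical names, and BaseType handling and the with_type option are left out to keep the example self-contained.

```python
import datetime
import json
from enum import Enum


class SketchEncoder(json.JSONEncoder):
    def default(self, obj):
        # datetime must be tested before date because it is a subclass of date.
        if isinstance(obj, datetime.datetime):
            return obj.strftime("%Y-%m-%d %H:%M:%S")
        if isinstance(obj, datetime.date):
            return obj.strftime("%Y-%m-%d")
        if isinstance(obj, datetime.timedelta):
            return str(obj)
        if isinstance(obj, Enum):
            return obj.value
        if isinstance(obj, set):
            return list(obj)
        if isinstance(obj, type):
            return obj.__name__
        return super().default(obj)  # raises TypeError for unsupported types


class Color(Enum):
    RED = "red"


record = {
    "created": datetime.datetime(2024, 1, 15, 12, 0, 0),
    "day": datetime.date(2024, 1, 15),
    "took": datetime.timedelta(seconds=90),
    "color": Color.RED,
    "tags": {"alpha"},
    "kind": int,
}
decoded = json.loads(json.dumps(record, cls=SketchEncoder))
```

The default() method is only consulted for objects the base encoder cannot serialize, so an IntEnum member, being an int, is written as a number without reaching it.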
RestrictedUnpickler(pickle.Unpickler)
A pickle unpickler subclass that restricts unpickling to a predefined whitelist of safe modules.
Safe modules:
numpy, rag_flow.
Behavior:
Overrides find_class to only allow classes from safe modules to be loaded. Other module/class requests raise pickle.UnpicklingError.
Usage:
Used in the restricted_loads() function for safer deserialization.
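The whitelist technique follows the pattern of overriding find_class on pickle.Unpickler. The sketch below uses hypothetical names with a Sketch suffix and assumes the check is made on the top-level package name, so that submodules of a safe package are also accepted.

```python
import collections
import io
import pickle

SAFE_MODULES = {"numpy", "rag_flow"}  # whitelist named in the text above


class RestrictedUnpicklerSketch(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a class or function.
        top_level = module.split(".")[0]
        if top_level in SAFE_MODULES:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads_sketch(data: bytes):
    """Like pickle.loads, but only whitelisted modules may be referenced."""
    return RestrictedUnpicklerSketch(io.BytesIO(data)).load()


# Built-in containers need no class lookup, so they load normally.
plain = restricted_loads_sketch(pickle.dumps({"a": [1, 2]}))

# An OrderedDict references collections.OrderedDict, which is not whitelisted.
try:
    restricted_loads_sketch(pickle.dumps(collections.OrderedDict(a=1)))
    outcome = "loaded"
except pickle.UnpicklingError:
    outcome = "blocked"
```

A whitelist narrows the attack surface but does not make pickle safe in general; anything callable inside a whitelisted module remains reachable from a crafted payload.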
Important Implementation Details
Configuration Management:
Configuration files are loaded from a conf directory relative to the project base directory. The module supports a layered config system where local config overrides global config. Sensitive values in configs are masked when logging.
Safe Deserialization:
The module supports a flag-controlled secure unpickling which forbids deserialization of arbitrary classes except those in a whitelist, mitigating security risks with pickle.
Custom Serialization:
The BaseType class and CustomJSONEncoder enable rich serialization of user-defined objects with optional type metadata, useful for complex data interchange.
Encryption and Decryption:
RSA-based decryption utilities use private keys stored in the conf directory, with fallback methods for edge cases.
Thread-Safe Config Updates:
The update_config function uses a file lock to prevent concurrent write conflicts.
Interactions with Other Parts of the System
Imports from file_utils:
Functions like get_project_base_directory(), load_yaml_conf(), and rewrite_yaml_conf() are used for file path management and YAML config reading/writing.
Constants from api.constants:
Uses SERVICE_CONF as the default config filename.
Cryptographic Libraries:
Uses Cryptodome for RSA encryption/decryption.
Third-party Libraries:
requests for HTTP requests (image downloading).
filelock for file locking.
enum (standard library) for enum support.
Environmental Variables:
Allows config overrides via environment variables for flexibility in deployment.
Usage Examples
from utils import (
    conf_realpath, show_configs, get_base_config, json_dumps, deserialize_b64,
    current_timestamp, timestamp_to_date, decrypt_database_config, update_config,
)
# Load and show current config with sensitive data masked
show_configs()
# Get specific config value
db_host = get_base_config('database')['host']
# Serialize an object to JSON with type info
json_data = json_dumps(my_obj, with_type=True)
# Deserialize safely from base64 pickle string
obj = deserialize_b64(pickled_str)
# Get current timestamp in ms
now_ms = current_timestamp()
# Convert timestamp to human-readable string
date_str = timestamp_to_date(now_ms)
# Decrypt database password from config
db_config = decrypt_database_config()
# Update config key safely
update_config('new_key', 'new_value')
Mermaid Diagram - Class Structure
classDiagram
class BaseType {
+to_dict() dict
+to_dict_with_type() dict
}
class CustomJSONEncoder {
+__init__(with_type=False)
+default(obj)
}
class RestrictedUnpickler {
+find_class(module, name)
}
CustomJSONEncoder --|> json.JSONEncoder
CustomJSONEncoder ..> BaseType : serializes
RestrictedUnpickler --|> pickle.Unpickler
Summary
This __init__.py module is the utility foundation for the InfiniFlow project. It centralizes configuration management, serialization and deserialization, cryptographic helpers, and assorted utility functions that are reused widely. Its design emphasizes security (restricted unpickling, encrypted password handling), flexibility (config layering, environment overrides), and extensibility (custom JSON encoding). Understanding this file is key for developers working on configuration, data interchange, and security-sensitive operations within the system.