utils.py


Overview

The utils.py file provides utility support for reading text within the InfiniFlow project. Its primary function, get_text, returns text as a Unicode string from one of two sources: an in-memory binary input or a file path on disk. For binary input, the encoding is detected before decoding, so callers do not have to handle it themselves. For a file path, the file is opened in read mode and its contents are returned directly.


Functions

get_text(fnm: str, binary=None) -> str

Reads and returns the content of a text file or a binary text source as a string.

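Parameters:

fnm (str): Path of the text file to read. Used only when binary is not provided.
binary (bytes, optional): Raw bytes of the text to decode. Defaults to None. When provided, the file at fnm is not read.

Returns:

str: The decoded text (binary input) or the full contents of the file (path input).

Description:

Selects the text source based on whether binary is provided: binary input is decoded in memory, and a path is read from disk. In both cases the result is a single string.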

Usage Example:

from utils import get_text

# Read text from a file path
content = get_text("example.txt")
print(content)

# Read text from binary data
binary_data = b'\xff\xfeH\x00e\x00l\x00l\x00o\x00'
text = get_text("ignored_filename.txt", binary=binary_data)
print(text)  # Output depends on detected encoding
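
The comment about the output depending on the detected encoding can be made concrete with plain bytes.decode calls. The sketch below does not call get_text or find_codec. It only shows how the same sample bytes decode under two candidate codecs when errors="ignore" is used. Which codec find_codec actually selects for these bytes is not stated here.

```python
# Same sample bytes as in the usage example: a UTF-16 little-endian BOM
# (ff fe) followed by "Hello" encoded as UTF-16 LE.
data = b'\xff\xfeH\x00e\x00l\x00l\x00o\x00'

# As UTF-16, the BOM selects little-endian byte order and is consumed.
as_utf16 = data.decode("utf-16", errors="ignore")
print(repr(as_utf16))  # 'Hello'

# As UTF-8, ff and fe are invalid bytes and are silently dropped by
# errors="ignore"; the NUL bytes are valid UTF-8 and remain in the text.
as_utf8 = data.decode("utf-8", errors="ignore")
print(repr(as_utf8))  # 'H\x00e\x00l\x00l\x00o\x00'
```

A wrong codec guess therefore does not raise an exception. It produces different, possibly damaged, text.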

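Implementation Details

Binary input: encoding detection is delegated to find_codec, which takes the binary input and returns an encoding name. The bytes are then decoded with errors="ignore", so undecodable bytes are dropped silently instead of raising an error.

File input: the file at fnm is opened in read mode, read line by line, and the lines are concatenated into one string. No encoding detection is applied on this branch.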

Interaction with Other Modules

This utility function is likely used by other modules in the InfiniFlow project wherever robust text loading from either files or binary blobs is needed.


Diagram: Function Flowchart

flowchart TD
    A["get_text(fnm: str, binary=None)"]
    A -->|binary provided| B["find_codec(binary) -> encoding"]
    B --> C["binary.decode(encoding, errors='ignore')"]
    C --> D[Return decoded text]
    A -->|binary not provided| E[Open file fnm in read mode]
    E --> F[Read lines one by one]
    F --> G[Concatenate lines into txt]
    G --> H[Return txt]
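
The flow above can be sketched in Python as follows. This is a minimal illustration, not the project's source. The find_codec below is a hypothetical stand-in that checks for a byte order mark and then tries UTF-8; the real helper may use a different strategy. Testing binary with "is not None" is also an assumption, since the flowchart only says "binary provided".

```python
import codecs
import os
import tempfile
from typing import Optional


def find_codec(blob: bytes) -> str:
    """Hypothetical stand-in for the project's find_codec helper."""
    # UTF-32 is checked first because its LE BOM starts with the UTF-16 LE BOM.
    if blob.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if blob.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if blob.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    try:
        blob.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so it never fails.
        return "latin-1"


def get_text(fnm: str, binary: Optional[bytes] = None) -> str:
    """Return text from binary input if given, otherwise from the file fnm."""
    if binary is not None:  # assumption: the original check may differ
        encoding = find_codec(binary)
        return binary.decode(encoding, errors="ignore")
    txt = ""
    with open(fnm, "r") as f:  # read mode, platform default encoding
        for line in f:
            txt += line
    return txt


# Binary branch: fnm is not read.
print(get_text("ignored_filename.txt", binary=b'\xff\xfeH\x00e\x00l\x00l\x00o\x00'))

# File branch: write a temporary file, read it back, then clean up.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line one\nline two\n")
    tmp_path = tmp.name
file_text = get_text(tmp_path)
os.remove(tmp_path)
print(file_text, end="")
```

Because the file branch calls open without an encoding argument, Python uses the platform's default text encoding there, so a file in another encoding may be read incorrectly or raise UnicodeDecodeError. Passing the file's bytes through the binary parameter is one way to get encoding detection for files as well.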

Summary

utils.py provides get_text, a small utility that returns text from either a file path or binary input. For binary input, encoding detection and error-tolerant decoding let it handle sources whose encoding is not known in advance. This gives the InfiniFlow system one consistent entry point for text input.