string_transform.py
Overview
The string_transform.py file is part of the InfiniFlow project and provides functionality for transforming strings within a component-based framework. It defines a component named StringTransform that supports two primary operations on strings:
Split: Splits a string into a list using specified delimiters.
Merge: Merges multiple string inputs into a single string based on a user-defined script/template.
This component is designed to be configurable via parameters and integrates with the larger InfiniFlow system to process string data flows. It leverages templating (via Jinja2) and regular expressions to perform flexible string transformations.
Classes
1. StringTransformParam
Defines the parameters for the StringTransform component.
Description:
StringTransformParam inherits from ComponentParamBase and encapsulates configuration parameters for the string transformation operations.
Attributes:
| Attribute | Type | Description | Default |
|---|---|---|---|
| method | | The transformation method, either "split" or "merge" | |
| script | | A script or template used when merging strings | |
| split_ref | | Reference variable name for the string to split | |
| delimiters | | List of delimiter strings used to split or join strings | |
| outputs | | Defines the output structure; contains a "result" entry | |
Methods:
check() -> None
Validates the parameter values. Specifically, it checks:
That method is either "split" or "merge".
That delimiters is not empty.
Usage Example:
param = StringTransformParam()
param.method = "merge"
param.script = "Hello {{name}}, welcome!"
param.delimiters = [" "]
param.check() # Validates the parameters
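The two checks described for check() can be sketched as a stand-alone function. This is a minimal illustration only: the name validate_params and the use of ValueError are assumptions, and the real method may report failures differently.

```python
def validate_params(method: str, delimiters: list[str]) -> None:
    """Hypothetical stand-in for the checks described for check()."""
    # The method must be one of the two supported operations.
    if method not in ("split", "merge"):
        raise ValueError(f"method must be 'split' or 'merge', got {method!r}")
    # At least one delimiter is required.
    if not delimiters:
        raise ValueError("delimiters must not be empty")


validate_params("merge", [" "])  # passes silently
```

Calling validate_params("join", [","]) or validate_params("split", []) raises ValueError in this sketch.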
2. StringTransform
Provides the core string transformation functionality.
Description:
StringTransform is an abstract base class (inherits from Message and ABC) representing a component that performs string transformations according to the defined StringTransformParam. It supports two main operations: splitting strings into lists and merging multiple strings into one using a script/template.
Attributes:
| Attribute | Description |
|---|---|
| component_name | Class-level identifier |
| _param | Instance of StringTransformParam |
| _canvas | Context or environment used to get variable values (inherited or passed) |
Methods:
get_input_form() -> dict[str, dict]
Returns a dictionary describing the expected input form for the component, which depends on the method:
For "split": expects a single input "line" of type "line".
For "merge": expects inputs derived from the placeholders in the merge script.
Returns:
A dictionary mapping input names to their metadata (name and type).
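For the "merge" case, deriving inputs from the script's placeholders might look like the sketch below. It assumes double-brace placeholders such as {{name}} and reuses the metadata shape shown for "split". The function name merge_input_form and the exact regex are hypothetical, not taken from the component.

```python
import re


def merge_input_form(script: str) -> dict[str, dict]:
    """Hypothetical sketch: one input entry per {{placeholder}} in the script."""
    # Assumed placeholder syntax: {{ identifier }} with optional inner spaces.
    names = re.findall(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}", script)
    # dict.fromkeys drops repeated names while keeping first-seen order.
    return {name: {"name": name, "type": "line"} for name in dict.fromkeys(names)}


form = merge_input_form("Hello {{name}}, your order {{ order_id }} is ready, {{name}}.")
print(form)
```

Each distinct placeholder yields exactly one entry, even when it appears several times in the script.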
Example:
st = StringTransform()
st._param.method = "split"
inputs = st.get_input_form()
# {'line': {'name': 'String', 'type': 'line'}}
_invoke(**kwargs) -> None
The main execution method, decorated with a timeout (default 10 minutes, configurable via the COMPONENT_EXEC_TIMEOUT environment variable). It dispatches to _split or _merge depending on the method.
For "split": expects a line keyword argument.
For "merge": expects multiple keyword arguments for template variables.
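The dispatch step can be pictured with the following sketch. The function invoke and its handler arguments are hypothetical stand-ins for _invoke, _split, and _merge, and raising ValueError for an unknown method is an assumption.

```python
def invoke(method, split_handler, merge_handler, **kwargs):
    """Hypothetical dispatch: route keyword arguments by the configured method."""
    if method == "split":
        # "split" reads a single `line` keyword argument.
        return split_handler(kwargs.get("line"))
    if method == "merge":
        # "merge" passes every keyword argument on as a template variable.
        return merge_handler(kwargs)
    raise ValueError(f"unsupported method: {method!r}")


parts = invoke("split", lambda line: line.split(","), lambda kw: kw, line="a,b")
merged_vars = invoke("merge", lambda line: line, lambda kw: sorted(kw), name="Alice", order_id="1")
```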
_split(line: str | None = None) -> None
Splits the input string into parts using the delimiters and sets the result output.
Parameters:
line: Optional string to split. If not provided, retrieves the value from the canvas variable referenced by split_ref.
Behavior:
Retrieves the string to split.
Splits the string on the delimiters using a regex whose capture group keeps the delimiters in the raw split result; the delimiters are then filtered out of the output.
Sets the output "result" to the list of split substrings.
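The capture-group splitting described above can be reproduced with the standard re module. This is a minimal sketch of the technique, not the component's code, and the helper name split_on_delimiters is hypothetical.

```python
import re


def split_on_delimiters(line: str, delimiters: list[str]) -> list[str]:
    """Split `line` on any delimiter and return only the text between them."""
    # Escape each delimiter so characters like "." or "|" are matched literally,
    # and wrap the alternation in one capture group.
    pattern = "(" + "|".join(re.escape(d) for d in delimiters) + ")"
    # With a single capture group, re.split alternates text and delimiter:
    # [text, delim, text, delim, ..., text]
    pieces = re.split(pattern, line)
    # Every second element (odd index) is a delimiter, so keep the even ones.
    return pieces[::2]


print(split_on_delimiters("apple,banana;cherry", [",", ";"]))
# ['apple', 'banana', 'cherry']
```

Adjacent delimiters produce empty strings in the result (for example "a,,b" gives ['a', '', 'b']); whether the component filters those out is not stated here.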
_merge(kwargs: dict[str, str] = {}) -> None
Merges multiple string inputs into one string based on the script parameter.
Parameters:
kwargs: A dictionary of variable names and their string values to be merged into the script.
Behavior:
Extracts the script and variable values.
If the script is a valid Jinja2 template, it renders the template with the variables.
Otherwise, performs direct string substitution using regex for each variable.
Sets the output "result" to the merged string.
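The render-then-fall-back behavior can be sketched as follows. The function name merge_script is hypothetical, the fallback assumes double-brace placeholders such as {{name}}, and Jinja2 is treated as optional so the sketch still runs without it. The component's actual fallback pattern may differ.

```python
import re

try:
    from jinja2 import Template  # third-party; optional in this sketch
except ImportError:
    Template = None


def merge_script(script: str, values: dict[str, str]) -> str:
    """Hypothetical sketch: render with Jinja2, else substitute with regex."""
    if Template is not None:
        try:
            return Template(script).render(**values)
        except Exception:
            pass  # invalid template: fall through to plain substitution
    merged = script
    for name, value in values.items():
        # Assumed placeholder form: {{ name }} with optional inner spaces.
        pattern = r"\{\{\s*" + re.escape(name) + r"\s*\}\}"
        # A function replacement avoids backslash escapes in `value` being interpreted.
        merged = re.sub(pattern, lambda _match, v=value: str(v), merged)
    return merged


print(merge_script("Hello {{name}}, your order {{order_id}} is ready.",
                   {"name": "Alice", "order_id": "12345"}))
# Hello Alice, your order 12345 is ready.
```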
thoughts() -> str
Returns a short descriptive string about the current transformation method (e.g., "It's splitting." or "It's merging.").
Important Implementation Details
Timeout Decorator: The _invoke method is decorated with @timeout, which enforces execution time limits to prevent long-running operations.
Regex Splitting: The _split method uses a regex pattern that includes delimiters as capture groups to split the string but excludes delimiters from the final result by skipping every other split element.
Jinja2 Templating: The _merge method supports Jinja2 templating, allowing users to write flexible merge scripts. If Jinja2 rendering fails, it falls back to simple regex substitutions.
Integration with Canvas: The component fetches and sets variable values through a _canvas interface, which is presumably part of the larger InfiniFlow framework for managing variable states.
Interactions with Other Parts of the System
Inheritance:
StringTransform inherits from Message (imported from .message), which likely provides messaging or event capabilities.
StringTransformParam inherits from ComponentParamBase for parameter handling.
Imported Modules:
jinja2.Template: Used for rendering merge scripts.
agent.component.base.ComponentParamBase: Base parameter class.
api.utils.api_utils.timeout: Decorator to add timeouts on method execution.
Environment Configuration:
Reads the COMPONENT_EXEC_TIMEOUT environment variable to set the timeout duration for execution.
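An environment-configured timeout of this kind can be sketched with the standard library. This is not the timeout decorator from api.utils.api_utils. The helper names read_timeout and timeout are hypothetical, and treating the variable's value as seconds is an assumption.

```python
import functools
import os
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError

DEFAULT_TIMEOUT_SECONDS = 10 * 60  # the 10-minute default described above


def read_timeout(env=os.environ) -> float:
    """Read COMPONENT_EXEC_TIMEOUT (assumed to be seconds), else use the default."""
    return float(env.get("COMPONENT_EXEC_TIMEOUT", DEFAULT_TIMEOUT_SECONDS))


def timeout(seconds: float):
    """Hypothetical decorator: stop waiting for `func` after `seconds`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            pool = ThreadPoolExecutor(max_workers=1)
            try:
                # result() raises concurrent.futures.TimeoutError when time runs out.
                return pool.submit(func, *args, **kwargs).result(timeout=seconds)
            finally:
                # Do not block on the worker; a timed-out call keeps running
                # in the background until it finishes on its own.
                pool.shutdown(wait=False)
        return wrapper
    return decorator


@timeout(0.05)
def slow() -> str:
    time.sleep(0.3)
    return "done"


@timeout(1.0)
def quick() -> str:
    return "done"
```

In this sketch quick() returns normally, while slow() raises concurrent.futures.TimeoutError in the caller. A thread-based timeout only stops the wait and does not cancel the underlying work.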
Variable Management:
Uses _canvas.get_variable_value() together with the set_input_value() and set_output() methods to interact with the system's variable management.
Usage Example
# Example for splitting a string
param = StringTransformParam()
param.method = "split"
param.split_ref = "input_string"
param.delimiters = [",", ";"]
transform = StringTransform()
transform._param = param
# Assume _canvas is properly set with variables
transform._invoke(line="apple,banana;cherry")
result = transform.get_output("result")
# result -> ['apple', 'banana', 'cherry']
# Example for merging strings
param.method = "merge"
param.script = "Hello {{name}}, your order {{order_id}} is ready."
param.delimiters = [" "]
transform._param = param
transform._invoke(name="Alice", order_id="12345")
result = transform.get_output("result")
# result -> "Hello Alice, your order 12345 is ready."
Mermaid Diagram - Flowchart of Main Functions
flowchart TD
A["get_input_form()"] --> B["_invoke(kwargs)"]
B --> |"method == 'split'"| C["_split(line)"]
B --> |"method == 'merge'"| D["_merge(kwargs)"]
C --> E["set_output('result', list_of_substrings)"]
D --> F[Render Jinja2 template or regex substitute]
F --> G["set_output('result', merged_string)"]
Summary
string_transform.py defines a configurable string transformation component for InfiniFlow capable of splitting strings into lists or merging multiple strings based on templates. The design supports flexible inputs, environment-based execution timeouts, and integration with the system's variable management, making it a versatile utility in data processing workflows.