exesql.py
Overview
The exesql.py file defines a reusable software component designed to execute SQL queries on various types of relational databases and return the query results in structured formats. It supports multiple database backends, including MySQL, MariaDB, PostgreSQL, and Microsoft SQL Server (MSSQL). The component encapsulates database connection parameters, SQL input validation, execution logic, and result processing into a clean and flexible interface.
This component is part of a larger system (likely the InfiniFlow platform) where it functions as a backend tool to dynamically execute arbitrary SQL queries, process their results, and return formatted data for downstream use.
Detailed Explanation
Classes
1. ExeSQLParam (inherits from ToolParamBase)
This class defines and validates the parameters required to configure and execute SQL queries through the ExeSQL component.
Purpose:
Encapsulates connection parameters for different database types along with SQL execution settings and enforces input validation rules.Properties:
meta: ToolMeta
Metadata describing the tool, including its name, description, and expected input parameters (notably thesqlstring).db_type: str
Type of database to connect to (default"mysql"). Supported values:'mysql','postgres','mariadb','mssql'.database: str
Database name to connect to.username: str
Username for database authentication.host: str
IP address or hostname of the database server.port: int
Port number to connect to (default3306).password: str
Password for database authentication.max_records: int
Maximum number of records to fetch from the query (default1024).
Methods:
init(self)
Initializes default values and metadata.check(self)
Validates all parameters for correctness:Ensures
db_typeis one of the supported database types.Checks that
database,username,host, andpasswordare not empty.Verifies
portandmax_recordsare positive integers.Adds security checks disallowing the use of database named rag_flow with host ragflow-mysql or password
infini_rag_flow.
get_input_form(self) -> dict[str, dict]
Returns the input form schema, defining the SQL query input as a single-line string field.
Usage Example:
param = ExeSQLParam() param.db_type = "postgres" param.database = "mydb" param.username = "user" param.password = "pass" param.host = "127.0.0.1" param.port = 5432 param.max_records = 500 param.check() # Validate parameters
2. ExeSQL (inherits from ToolBase and ABC)
This is the core class that performs SQL execution using the parameters defined by ExeSQLParam.
Purpose:
Provides an interface to execute SQL queries on the configured database and process the results into JSON and markdown table formats.Class Attributes:
component_name = "ExeSQL"
Methods:
_invoke(self, **kwargs)
The main method to run the SQL query. It is decorated with a configurable timeout (default 60 seconds).Parameters:
sql(string): The SQL query string to execute. Required.
Process:
Extracts variables/placeholders from the SQL string and formats them using input values.
Splits the SQL string by semicolons to support multiple statements.
Establishes a connection to the specified database using the appropriate Python database driver:
pymysqlfor MySQL/MariaDBpsycopg2for PostgreSQLpyodbcfor MSSQL
Executes each SQL statement separately.
Processes query results into pandas DataFrames, limiting rows to
max_records.Formats datetime columns to
YYYY-MM-DDstrings.Converts any
Decimaltypes in the results to floats for JSON serialization.Collects results into two outputs:
JSON serializable list of records (
json)Markdown representation of the query output (
formalized_content)
Returns the markdown formatted content as the primary output.
Returns:
Markdown string representation of the query results.
Raises:
Exception if SQL is empty or database connection fails.
Usage Example:
exe_sql = ExeSQL() exe_sql._param = param # An instance of ExeSQLParam with valid parameters set result = exe_sql._invoke(sql="SELECT * FROM users WHERE id = {user_id};", user_id=123) print(result) # Markdown table of query resultsthoughts(self) -> str
Returns a simple status message indicating query execution progress.
Important Implementation Details and Algorithms
SQL Placeholder Handling:
The_invokemethod extracts placeholders from the SQL string (e.g.,{user_id}), retrieves their values from input, serializes non-string values into JSON strings, and performs safe string formatting.Database Driver Selection:
Depending on thedb_type, the component chooses the appropriate database driver and connection method, ensuring compatibility with multiple RDBMS systems.Query Execution and Result Fetching:
Uses
cursor.execute()to run each SQL statement.Uses
cursor.fetchmany(max_records)to limit the number of returned rows.Converts raw results into pandas DataFrames for ease of manipulation and formatting.
Handles MSSQL's row fetching slightly differently due to driver specifics.
Converts datetime columns to string format for better JSON and markdown compatibility.
Converts
Decimaltypes to floats recursively within the result data structure to avoid serialization issues.
Security Checks in Parameters:
The parameter class enforces restrictions on certain database names and passwords to prevent unauthorized or risky operations.Timeout Management:
The_invokemethod is wrapped with a timeout decorator that reads a timeout value from environment variables, defaulting to 60 seconds, to avoid hanging executions.
Interactions with Other System Components
Base Classes:
Inherits fromToolBaseandToolParamBase(defined elsewhere inagent.tools.base), which provide foundational behavior for parameter handling and tool execution lifecycle.Utilities:
Uses atimeoutdecorator imported fromapi.utils.api_utilsto manage method execution time limits.External Libraries:
pandas: For data manipulation and formatting.Database client libraries (
pymysql,psycopg2,pyodbc) for connecting and querying different databases.jsonandrefor string processing and JSON serialization.
Input/Output Management:
Uses methods likeset_input_value,set_output, andoutput(presumably defined inToolBase) to manage inputs and outputs within the larger system framework.
Mermaid Class Diagram
classDiagram
class ExeSQLParam {
+meta: ToolMeta
+db_type: str
+database: str
+username: str
+host: str
+port: int
+password: str
+max_records: int
+__init__()
+check()
+get_input_form() dict
}
class ExeSQL {
+component_name: str
+_invoke(**kwargs) str
+thoughts() str
}
ExeSQLParam --|> ToolParamBase
ExeSQL --|> ToolBase
ExeSQL --|> ABC
Summary
The exesql.py file implements a robust, configurable SQL execution component that supports multiple database backends. It abstracts connection details and query execution behind a simple interface, formats results for easy consumption, and integrates with the larger InfiniFlow platform via base classes and utilities. Its design prioritizes flexibility, security, and usability, making it a core tool for dynamic database querying tasks within the system.