github.py
Overview
The github.py file provides a tool integration for searching GitHub repositories programmatically within the InfiniFlow agent framework. It defines the parameters and execution logic required to query the GitHub Search API, retrieve a list of repositories matching a search query, and format the results for further use in the system.
This component lets users or other parts of the application search GitHub repositories by keyword, returning the matching repositories sorted by star count in descending order. It abstracts the GitHub API details and provides a consistent interface for repository search within InfiniFlow.
Classes and Their Details
GitHubParam
Defines the parameters and metadata for the GitHub search tool component.
Description
Holds the metadata describing the tool (name, description, parameters).
Defines the input form structure.
Provides validation for parameters such as the maximum number of repositories to fetch.
Attributes
meta (ToolMeta): Metadata dictionary defining the tool's name, description, and input parameters.
top_n (int): Number of top repositories to retrieve; default is 10.
Methods
__init__(self): Initializes the metadata and sets the default top_n to 10.
check(self): Validates that top_n is a positive integer.
get_input_form(self) -> dict[str, dict]: Returns a dictionary describing the input form fields, currently only the search query.
Usage Example
param = GitHubParam()
param.top_n = 5
param.check() # Validates top_n
input_form = param.get_input_form()
# input_form = {
# "query": {"name": "Query", "type": "line"}
# }
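The validation step described above can be pictured with a small stand-in class. This is a minimal sketch, not the InfiniFlow implementation: TopNParam and its error messages are hypothetical, and the real check() may delegate to helpers inherited from ToolParamBase.

```python
class TopNParam:
    """Hypothetical stand-in for a parameter object with a top_n field."""

    def __init__(self) -> None:
        self.top_n = 10  # default value documented above

    def check(self) -> None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.top_n, bool) or not isinstance(self.top_n, int):
            raise ValueError("top_n must be an integer")
        if self.top_n <= 0:
            raise ValueError("top_n must be a positive integer")
```

Raising on invalid input, rather than silently clamping the value, keeps configuration mistakes visible before any request is sent.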
GitHub
Main tool class implementing the GitHub repository search functionality.
Inheritance
Inherits from ToolBase (provides base tool functionality).
Also derives from ABC (abstract base class).
Class Attributes
component_name (str): Identifies the component as "GitHub".
Methods
_invoke(self, **kwargs) -> str: The core method that performs the GitHub API call and processes the response.
Parameters:
kwargs (dict): Should contain the search parameter "query".
Returns:
str: The formatted search results or error message.
Behavior:
Validates the presence of the query parameter.
Constructs the GitHub search API URL with the query, sorting by stars in descending order and limiting results to top_n.
Sends an HTTP GET request with the appropriate headers for GitHub API versioning.
On success, extracts repository details (name, URL, description, stars) and stores them as chunks.
Sets the "json" output to the raw list of repositories.
Returns the formatted content output.
On failure, retries up to the configured number of retries (max_retries), with a delay after each error.
On repeated failure, sets an error output.
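The retry behavior listed above can be sketched as follows. This is an illustrative outline under stated assumptions, not the tool's actual code: search_with_retries, its fetch callable, the delay_seconds parameter, and the error strings are all hypothetical names chosen so the example runs without network access.

```python
import time
from typing import Callable, Optional


def search_with_retries(
    fetch: Callable[[str], list],
    query: str,
    max_retries: int = 2,
    delay_seconds: float = 0.0,
) -> tuple:
    """Call fetch(query), retrying on any exception.

    Returns (items, None) on success, or ([], error_message) once the
    initial attempt and all retries have failed.
    """
    if not query:
        return [], "query is required"

    last_error: Optional[str] = None
    for attempt in range(max_retries + 1):
        try:
            return fetch(query), None
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = str(exc)
            if attempt < max_retries:
                time.sleep(delay_seconds)  # pause before the next attempt
    return [], f"GitHub search failed: {last_error}"
```

Passing the fetch function in as an argument keeps the retry policy separate from the HTTP call, so it can be exercised with a fake that fails a fixed number of times.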
Timeout:
The method is decorated with @timeout() to enforce a maximum execution time (12 seconds by default, or the value of the COMPONENT_EXEC_TIMEOUT environment variable).
thoughts(self) -> str: Returns a human-readable summary string describing the current operation, including the query being searched.
Example:
tool = GitHub()
tool.get_input = lambda: {"query": "machine learning"}
print(tool.thoughts())
# Output: Scanning GitHub repos related to `machine learning`.
Important Implementation Details
API Usage: Uses GitHub's public REST API endpoint /search/repositories with the following parameters:
q = search query string
sort = "stars"
order = "desc"
per_page = number of results to fetch (top_n)
Error Handling and Retry Logic: Implements a retry mechanism for robustness against transient errors or rate limiting, with a delay between retries.
Output Formatting: Uses the _retrieve_chunks() method (from ToolBase) to extract and format repository information (title, URL, description plus star count) into structured output.
Timeout Management: Uses a custom @timeout decorator to prevent the invocation from running beyond a configured timeout, improving system resilience.
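To make the API usage and output formatting points more concrete, here is a minimal sketch of building the search URL and turning one result item into a title/URL/content record. The helper names build_search_url and to_chunk are hypothetical, and the exact layout of the content string is an assumption; the response field names (name, html_url, description, stargazers_count) are those returned by GitHub's repository search endpoint.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def build_search_url(query: str, top_n: int = 10) -> str:
    """Build a repository search URL sorted by stars, descending."""
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": top_n}
    # urlencode escapes spaces and special characters in the query.
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def to_chunk(item: dict) -> dict:
    """Map one search result item to a title/url/content record (assumed layout)."""
    description = item.get("description") or ""  # description may be null
    stars = item.get("stargazers_count", 0)
    return {
        "title": item.get("name", ""),
        "url": item.get("html_url", ""),
        "content": f"{description}\nstars: {stars}",
    }
```

Encoding the query through urlencode, instead of concatenating strings, avoids malformed URLs when the search text contains spaces or reserved characters.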
Interaction with Other System Components
ToolParamBase and ToolBase: The parameter and tool base classes provide shared interfaces and mechanisms for parameter validation, output setting, and chunk retrieval.
api.utils.api_utils.timeout: Provides the timeout decorator controlling maximum execution time per invocation.
agent.tools.base: The base classes used here come from this internal module, which defines tool contracts within InfiniFlow.
Environment Variables: Uses COMPONENT_EXEC_TIMEOUT to configure the maximum allowed execution time dynamically.
Logging: Logs exceptions encountered during the API call for troubleshooting and monitoring.
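As an illustration of the environment-variable point, the timeout value could be resolved along these lines. This is a sketch only: resolve_exec_timeout is a hypothetical helper, and falling back to the 12-second default on invalid or non-positive values is an assumption made here, not documented behavior of the tool.

```python
import os
from typing import Mapping, Optional

DEFAULT_TIMEOUT_SECONDS = 12  # default documented for this component


def resolve_exec_timeout(environ: Optional[Mapping[str, str]] = None) -> int:
    """Read COMPONENT_EXEC_TIMEOUT, falling back to the default if unusable."""
    env = os.environ if environ is None else environ
    raw = env.get("COMPONENT_EXEC_TIMEOUT")
    if raw is None:
        return DEFAULT_TIMEOUT_SECONDS
    try:
        value = int(raw)
    except ValueError:
        # Assumed policy: ignore non-numeric values instead of failing.
        return DEFAULT_TIMEOUT_SECONDS
    return value if value > 0 else DEFAULT_TIMEOUT_SECONDS
```

Accepting the environment as an optional mapping makes the helper easy to test without modifying the real process environment.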
Visual Diagram
classDiagram
class GitHubParam {
+meta: ToolMeta
+top_n: int
+__init__()
+check()
+get_input_form() dict
}
class GitHub {
+component_name: str
+_invoke(**kwargs) str
+thoughts() str
}
GitHub ..> GitHubParam
ToolParamBase <|-- GitHubParam
ToolBase <|-- GitHub
GitHub ..> requests
GitHub ..> timeout
Summary
The github.py file implements a reusable and configurable tool component for searching GitHub repositories by keywords. It manages input parameters, calls GitHub's public search API with error handling and retries, and formats the results for use within the InfiniFlow framework. The component’s design separates parameter definition (GitHubParam) from execution logic (GitHub), which keeps it easy to extend and maintain.
This tool is instrumental in enabling InfiniFlow agents to incorporate up-to-date GitHub repository data into their workflows, supporting tasks such as code discovery, project analysis, or integration with developer environments.