googlescholar.py


Overview

This file implements a component that queries Google Scholar for scholarly articles and related academic literature. It wraps the scholarly Python package in a structured interface that supports filtering, sorting, and retrieval of publication metadata such as titles, authors, abstracts, and URLs.
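As a rough illustration of the metadata retrieval described above, the sketch below trims an iterator of publication records to the first top_n entries and extracts title, authors, abstract, and URL. The record layout (a bib dict plus a pub_url key) mirrors what the scholarly package typically yields, but the helper name summarize_publications and the sample records are assumptions made for illustration, not the component's actual code.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def summarize_publications(results: Iterable[Dict[str, Any]], top_n: int) -> List[Dict[str, Any]]:
    """Keep the first top_n records and extract the fields of interest."""
    summaries = []
    for pub in islice(results, top_n):
        bib = pub.get("bib", {})
        summaries.append({
            "title": bib.get("title", ""),
            "authors": bib.get("author", []),
            "abstract": bib.get("abstract", ""),
            "url": pub.get("pub_url", ""),
        })
    return summaries


# Hypothetical sample data standing in for a live search iterator.
sample_results = iter([
    {"bib": {"title": "Paper A", "author": ["X. Li"], "abstract": "..."},
     "pub_url": "https://example.org/a"},
    {"bib": {"title": "Paper B", "author": ["Y. Kim"]}},
    {"bib": {"title": "Paper C"}},
])

summaries = summarize_publications(sample_results, top_n=2)
for item in summaries:
    print(item["title"], item["url"])
```

Using islice stops consuming the iterator once top_n records have been read, which matters when each further record could trigger another network request.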

The component is designed as part of a larger agent/tool framework (agent.tools.base) and adheres to a standardized parameter and execution model, including retry mechanisms and timeout controls. It supports configurable search parameters like query keywords, number of results, sorting criteria, publication year range, and patent inclusion.
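The retry and timeout behavior mentioned above could look roughly like the following sketch. It is not the framework's implementation: call_with_retries, its parameters, and the flaky_search stand-in are hypothetical names, and the timeout here is a cooperative time budget checked between attempts rather than a hard interrupt of a running call.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T], max_retries: int = 3,
                      timeout_s: float = 10.0, delay_s: float = 0.0) -> T:
    """Call func until it succeeds, retries run out, or the time budget is spent."""
    deadline = time.monotonic() + timeout_s
    last_error: Optional[Exception] = None
    for attempt in range(max_retries + 1):  # first try plus max_retries retries
        if time.monotonic() > deadline:
            raise TimeoutError(f"gave up after {timeout_s} seconds") from last_error
        try:
            return func()
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                time.sleep(delay_s)  # pause before the next attempt
    raise RuntimeError(f"all {max_retries + 1} attempts failed") from last_error


# Hypothetical search call that fails twice before succeeding.
attempts = {"count": 0}


def flaky_search() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


outcome = call_with_retries(flaky_search, max_retries=3, timeout_s=5.0)
print(outcome, attempts["count"])
```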


Classes and Functions

Class: GoogleScholarParam

Defines the parameters used to configure the Google Scholar search component.

Purpose

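Attributes

meta (ToolMeta): tool metadata.
top_n (int): number of results to return.
sort_by (str): sorting criterion, for example 'date'.
year_low (int): lower bound of the publication year range.
year_high (int): upper bound of the publication year range.
patents (bool): whether patents are included in the results.

Methods

__init__(): initializes the parameters.
check(): validates the parameters.
get_input_form(): returns a dict describing the input form.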

Usage Example

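# Import path is an assumption based on the file name and agent.tools.
from agent.tools.googlescholar import GoogleScholarParam

params = GoogleScholarParam()
params.top_n = 10
params.sort_by = 'date'
params.patents = False
params.check()  # Validate parameters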
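
To make the idea of parameter validation concrete, here is a minimal, self-contained sketch. ScholarSearchParams is a hypothetical stand-in for GoogleScholarParam: the field names come from this document, while the defaults and the specific validation rules are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScholarSearchParams:
    """Hypothetical stand-in for GoogleScholarParam; defaults are assumptions."""
    top_n: int = 10
    sort_by: str = "relevance"
    year_low: Optional[int] = None
    year_high: Optional[int] = None
    patents: bool = True

    def check(self) -> None:
        """Raise ValueError if any field holds an unusable value."""
        if not isinstance(self.top_n, int) or self.top_n <= 0:
            raise ValueError("top_n must be a positive integer")
        if self.sort_by not in ("relevance", "date"):
            raise ValueError("sort_by must be 'relevance' or 'date'")
        if (self.year_low is not None and self.year_high is not None
                and self.year_low > self.year_high):
            raise ValueError("year_low must not exceed year_high")
        if not isinstance(self.patents, bool):
            raise ValueError("patents must be a bool")


# A valid configuration passes silently.
valid = ScholarSearchParams(top_n=5, sort_by="date", year_low=2015, year_high=2020)
valid.check()

# An unsupported sort key is rejected.
invalid = ScholarSearchParams(sort_by="citations")
try:
    invalid.check()
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

Validating up front means a bad configuration fails before any request is sent.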

Class: GoogleScholar

Extends ToolBase, together with the abstract base class ABC, to implement the Google Scholar search logic.

Purpose

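Class Attributes

component_name (str): name of the component.

Methods

_invoke(**kwargs): runs the search with the given keyword arguments (such as query) and returns a str.
thoughts(): returns a str.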

Usage Example

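# Import path is an assumption based on the file name and agent.tools.
from agent.tools.googlescholar import GoogleScholar, GoogleScholarParam

# Illustrative only: the framework may require constructor arguments.
gs = GoogleScholar()
gs._param = GoogleScholarParam()
gs._param.top_n = 5
result = gs._invoke(query="machine learning optimization")
print(result)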

Implementation Details and Algorithms


Interaction with Other System Components


Mermaid Class Diagram

classDiagram
    class GoogleScholarParam {
        +meta: ToolMeta
        +top_n: int
        +sort_by: str
        +year_low: int
        +year_high: int
        +patents: bool
        +__init__()
        +check()
        +get_input_form() dict
    }

    class GoogleScholar {
        +component_name: str
        +_invoke(**kwargs) str
        +thoughts() str
    }

    GoogleScholar --> GoogleScholarParam : uses
    ToolParamBase <|-- GoogleScholarParam
    ToolBase <|-- GoogleScholar

Summary

This file defines a component that integrates Google Scholar searches into a larger agent-based system. It abstracts the search parameters, validates them, manages API calls with retries and timeouts, and formats results for further processing. The design relies on inheritance and decorators to keep parameter handling separate from execution logic.