rerank_model.py


Overview

The rerank_model.py file provides rerankers that score search results or text documents by their relevance to a query. It defines an abstract base class and several concrete implementations that call either local models or remote APIs. Each model computes similarity scores between a query and a list of candidate texts; the scores can then be used to reorder search results or improve an information retrieval system.

This file is a core component of a retrieval-augmented generation (RAG) or search system, allowing flexible use of different reranking backends under a common interface.


Classes and Methods

class Base(ABC)

Abstract base class for all reranking models.
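
As a rough illustration of the shared interface, the sketch below defines an abstract class with a similarity method and one toy subclass. The KeywordOverlapRerank class and its scoring rule are hypothetical and do not appear in rerank_model.py; the (scores, token_count) return shape is taken from the usage examples in this document, and the word-count "token" total is only a stand-in.

```python
from abc import ABC, abstractmethod


class Base(ABC):
    """Minimal stand-in for the abstract reranker interface (assumed shape)."""

    def __init__(self, key, model_name, **kwargs):
        self.key = key
        self.model_name = model_name

    @abstractmethod
    def similarity(self, query: str, texts: list):
        """Return (scores, token_count) for the query against each text."""
        raise NotImplementedError


class KeywordOverlapRerank(Base):
    """Hypothetical toy backend: fraction of query words found in each text."""

    def similarity(self, query: str, texts: list):
        query_words = set(query.lower().split())
        scores = []
        token_count = 0
        for text in texts:
            text_words = text.lower().split()
            token_count += len(text_words)  # word count as a stand-in for tokens
            overlap = len(query_words & set(text_words))
            scores.append(overlap / max(len(query_words), 1))
        return scores, token_count
```

Because every backend exposes the same method, calling code can swap one reranker for another without changing how it consumes the scores.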


class DefaultRerank(Base)

Local reranker using the FlagReranker model (from FlagEmbedding).
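
The class diagram later in this document lists _dynamic_batch_size and _min_batch_size attributes and a _process_batch method on DefaultRerank, which suggests that the batch size shrinks when a batch cannot be processed. The sketch below shows one way such adaptive batching could work, assuming the batch size is halved on a memory error. The function name, the score_fn callable, and the use of MemoryError are all assumptions for illustration, not the file's actual implementation.

```python
def process_in_adaptive_batches(pairs, score_fn, batch_size=8, min_batch_size=1):
    """Score all pairs, halving the batch size whenever score_fn raises MemoryError.

    score_fn is a hypothetical callable that takes a list of (query, text)
    pairs and returns one float per pair.
    """
    scores = []
    start = 0
    while start < len(pairs):
        batch = pairs[start:start + batch_size]
        try:
            scores.extend(score_fn(batch))
            start += len(batch)  # advance only after the batch succeeded
        except MemoryError:
            if batch_size <= min_batch_size:
                raise  # cannot shrink further; surface the error
            batch_size = max(batch_size // 2, min_batch_size)
    return scores
```

A failed batch is retried from the same starting position with the smaller size, so no pair is skipped or scored twice.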


class JinaRerank(Base)

Reranker that calls the Jina API for similarity scoring.
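
An API-backed reranker typically builds a JSON request, sends it, and maps the returned scores back to the order of the input texts. The sketch below covers only the request-building and response-parsing steps, with no network call. The field names (model, query, documents, top_n, results, index, relevance_score, usage, total_tokens) reflect a common rerank API shape and should be treated as assumptions to check against the provider's current documentation; both function names are hypothetical.

```python
def build_rerank_payload(model_name, query, texts):
    """Build the JSON body for a rerank request (field names are assumptions)."""
    return {
        "model": model_name,
        "query": query,
        "documents": texts,
        "top_n": len(texts),  # ask for a score for every candidate
    }


def scores_from_response(response_json, num_texts):
    """Place each returned score at the position of its original text."""
    scores = [0.0] * num_texts
    for item in response_json["results"]:
        scores[item["index"]] = item["relevance_score"]
    # Token usage may be absent; fall back to 0 in that case.
    token_count = response_json.get("usage", {}).get("total_tokens", 0)
    return scores, token_count
```

Writing scores by index matters because a rerank API may return results sorted by relevance rather than in input order.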


class YoudaoRerank(DefaultRerank)

Local reranker using the RerankerModel from BCEmbedding.


class XInferenceRerank(Base)

Reranker calling an Xinference API endpoint.


class LocalAIRerank(Base)

Reranker calling a LocalAI API.


class NvidiaRerank(Base)

Reranker calling NVIDIA's AI API.


class LmStudioRerank(Base)

Stub class for LM-Studio reranker.


class OpenAI_APIRerank(Base)

Reranker for endpoints that follow the OpenAI API style.


class CoHereRerank(Base)

Reranker using Cohere's API client.


class TogetherAIRerank(Base)

Stub class for TogetherAI reranker.


class SILICONFLOWRerank(Base)

Reranker calling SiliconFlow API.


class BaiduYiyanRerank(Base)

Reranker using Baidu Yiyan API.


class VoyageRerank(Base)

Reranker using the Voyage AI client.


class QWenRerank(Base)

Reranker using Tongyi-Qianwen API via dashscope.


class HuggingfaceRerank(DefaultRerank)

Reranker that sends requests to a HuggingFace rerank server, typically one hosted locally.


class GPUStackRerank(Base)

Reranker calling GPUStack API.


class NovitaRerank(JinaRerank)

Subclass of JinaRerank with a different default URL.


class GiteeRerank(JinaRerank)

Subclass of JinaRerank with a different default URL.
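
The pattern used by NovitaRerank and GiteeRerank, reusing a parent's request logic while changing only the default endpoint, can be sketched as follows. The class names and the example.invalid URLs are placeholders invented for this illustration; the real default URLs are defined in rerank_model.py and are not reproduced here.

```python
class EndpointRerank:
    """Hypothetical parent that stores the endpoint it would call."""

    def __init__(self, key, model_name,
                 base_url="https://api.example.invalid/v1/rerank"):
        self.base_url = base_url
        self.model_name = model_name
        self.headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        }


class OtherProviderRerank(EndpointRerank):
    """Hypothetical subclass: same behavior, different default endpoint."""

    def __init__(self, key, model_name,
                 base_url="https://other.example.invalid/v1/rerank"):
        # Fall back to this provider's default when an empty URL is passed in.
        if not base_url:
            base_url = "https://other.example.invalid/v1/rerank"
        super().__init__(key, model_name, base_url)
```

Only the constructor differs, so any scoring method defined on the parent is inherited unchanged.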


class Ai302Rerank(Base)

Reranker calling 302.AI API.


Important Implementation Details


Interactions with Other System Components


Usage Examples

Using DefaultRerank (Local Model)

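# Import path is an assumption; adjust it to where rerank_model.py lives in your project.
from rerank_model import DefaultRerank

reranker = DefaultRerank(key="dummy", model_name="BAAI/bge-reranker-v2-m3")
query = "What is AI?"
texts = ["AI stands for Artificial Intelligence.", "Machine learning is a subset of AI."]
# similarity returns one relevance score per text, plus a token count
scores, token_count = reranker.similarity(query, texts)
print(scores)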

Using JinaRerank (API)

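# Import path is an assumption; adjust it to where rerank_model.py lives in your project.
from rerank_model import JinaRerank

reranker = JinaRerank(key="your_api_key")
query = "Climate change effects"
texts = ["Rising sea levels are a concern.", "Global warming impacts agriculture."]
# similarity returns one relevance score per text, plus a token count
scores, token_count = reranker.similarity(query, texts)
print(scores)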
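
The scores returned by similarity line up with the input texts, so reordering results is a matter of sorting by score. A minimal sketch follows, assuming one numeric score per text; the rank_texts helper is hypothetical and the score values are made up for illustration.

```python
def rank_texts(texts, scores):
    """Return (text, score) pairs sorted from most to least relevant."""
    if len(texts) != len(scores):
        raise ValueError("texts and scores must have the same length")
    order = sorted(range(len(texts)), key=lambda i: scores[i], reverse=True)
    return [(texts[i], float(scores[i])) for i in order]


texts = ["Rising sea levels are a concern.", "Global warming impacts agriculture."]
scores = [0.42, 0.87]  # made-up values for illustration, not real model output
for text, score in rank_texts(texts, scores):
    print(f"{score:.2f}  {text}")
```

Sorting indices rather than the texts themselves keeps each text paired with its own score, including when two texts are identical.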

Mermaid Class Diagram

classDiagram
    class Base {
        <<abstract>>
        +__init__(key, model_name, **kwargs)
        +similarity(query: str, texts: list)
        +total_token_count(resp)
    }

    class DefaultRerank {
        -_FACTORY_NAME: str = "BAAI"
        -_model
        -_model_lock
        -_dynamic_batch_size: int
        -_min_batch_size: int
        +__init__(key, model_name, **kwargs)
        +torch_empty_cache()
        -_process_batch(pairs, max_batch_size=None)
        -_compute_batch_scores(batch_pairs, max_length=None)
        +similarity(query: str, texts: list)
    }

    class JinaRerank {
        -_FACTORY_NAME: str = "Jina"
        -base_url: str
        -headers: dict
        -model_name: str
        +__init__(key, model_name, base_url)
        +similarity(query: str, texts: list)
    }

    class YoudaoRerank {
        -_FACTORY_NAME: str = "Youdao"
        -_model
        -_model_lock
        +__init__(key, model_name, **kwargs)
        +similarity(query: str, texts: list)
    }

    class XInferenceRerank {
        -_FACTORY_NAME: str = "Xinference"
        -base_url: str
        -headers: dict
        -model_name: str
        +__init__(key, model_name, base_url)
        +similarity(query: str, texts: list)
    }

    class LocalAIRerank {
        -_FACTORY_NAME: str = "LocalAI"
        -base_url: str
        -headers: dict
        -model_name: str
        +__init__(key, model_name, base_url)
        +similarity(query: str, texts: list)
    }

    class NvidiaRerank {
        -_FACTORY_NAME: str = "NVIDIA"
        -base_url: str
        -headers: dict
        -model_name: str
        +__init__(key, model_name, base_url)
        +similarity(query: str, texts: list)
    }

    class HuggingfaceRerank {
        -_FACTORY_NAME: str = "HuggingFace"
        -base_url: str
        -model_name: str
        +__init__(key, model_name, base_url)
        +similarity(query: str, texts: list)
        +post(query: str, texts: list, url)
    }

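    %% Inheritance
    DefaultRerank --|> Base
    YoudaoRerank --|> DefaultRerank
    HuggingfaceRerank --|> DefaultRerank
    JinaRerank --|> Base
    NovitaRerank --|> JinaRerank
    GiteeRerank --|> JinaRerank
    XInferenceRerank --|> Base
    LocalAIRerank --|> Base
    NvidiaRerank --|> Base
    LmStudioRerank --|> Base
    OpenAI_APIRerank --|> Base
    CoHereRerank --|> Base
    TogetherAIRerank --|> Base
    SILICONFLOWRerank --|> Base
    BaiduYiyanRerank --|> Base
    VoyageRerank --|> Base
    QWenRerank --|> Base
    GPUStackRerank --|> Base
    Ai302Rerank --|> Base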

Summary

The rerank_model.py file defines a modular and extensible reranking system with multiple backend implementations. It abstracts interaction with local ML models and remote APIs behind a consistent interface for generating relevance scores. This flexibility enables the larger system to adapt to different deployment environments and external service providers while maintaining consistent functionality.

Adaptive batching in the local rerankers, together with the token count returned alongside the scores, helps keep reranking efficient, which matters for real-time search and retrieval applications.

