metrics.py
Overview
The `metrics.py` file provides functionality to evaluate the semantic similarity and coverage between two sets of vector embeddings: one representing code snippets (`code_embeddings`) and the other representing documentation snippets (`doc_embeddings`). The primary goal is to calculate a composite metric called CES (Code-Embedding Similarity), which quantifies how well documentation covers the code semantics, how relevant the documentation is to the code, and how novel the documentation content is relative to the code.
This file is designed for use in systems that analyze or improve the quality of code documentation by leveraging vector embeddings and similarity measures. It can be integrated into broader pipelines for code analysis, documentation generation, or quality assessment.
Detailed Explanation
Imports and Constants
`from typing import List, Dict`: Type hints for function parameters and return values.
`import numpy as np`: Numerical operations on embeddings.
`from config import SIM_THRESHOLD, PARTIAL_THRESHOLD`: Threshold constants imported from a configuration module; they control the similarity cutoffs used by the metrics.
Function: cosine_similarity
def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
Description
Computes the cosine similarity between two vectors `a` and `b`. Cosine similarity is the cosine of the angle between two vectors in a multi-dimensional space and ranges from -1 (opposite directions) to 1 (same direction). For this use case the result is expected to lie between 0 and 1, which holds when the embeddings have no negative components.
Parameters
`a` (np.ndarray): First embedding vector.
`b` (np.ndarray): Second embedding vector.
Returns
`float`: The cosine similarity value. Returns `0.0` if either vector has zero magnitude, to avoid division by zero.
Usage Example
import numpy as np
from metrics import cosine_similarity

vec1 = np.array([1, 0, 0])
vec2 = np.array([0, 1, 0])
similarity = cosine_similarity(vec1, vec2)  # Orthogonal vectors, so the result is 0.0
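The behaviour described above can be sketched as follows. This is a minimal illustration that assumes the function is built on `np.dot` and `np.linalg.norm`; the actual body of `cosine_similarity` in `metrics.py` is not shown in this document and may differ.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between a and b, or 0.0 if either has zero norm."""
    norm_a = np.linalg.norm(a)
    norm_b = np.linalg.norm(b)
    if norm_a == 0.0 or norm_b == 0.0:
        # Zero-magnitude guard: avoids dividing by zero.
        return 0.0
    return float(np.dot(a, b) / (norm_a * norm_b))


# Same direction, opposite direction, and a zero vector.
print(cosine_similarity(np.array([1.0, 0.0]), np.array([2.0, 0.0])))   # 1.0
print(cosine_similarity(np.array([1.0, 0.0]), np.array([-1.0, 0.0])))  # -1.0
print(cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 0.0])))   # 0.0
```

The second call shows that the raw formula can return negative values when the vectors contain negative components.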
Function: compute_ces_from_embeddings
def compute_ces_from_embeddings(code_embeddings: List[np.ndarray],
                                doc_embeddings: List[np.ndarray],
                                sim_threshold: float = SIM_THRESHOLD,
                                partial_threshold: float = PARTIAL_THRESHOLD) -> Dict:
Description
Computes the CES (Code-Embedding Similarity) metric and its components by comparing code and documentation embeddings. CES is a weighted sum of three sub-metrics:
Direct Coverage (DC): Proportion of code embeddings that are closely matched (above `sim_threshold`) by at least one documentation embedding.
Relevance (Rel): Proportion of documentation embeddings that closely match (above `sim_threshold`) at least one code embedding.
Novelty (Nov): Proportion of documentation embeddings that partially match (between `partial_threshold` and `sim_threshold`) code embeddings, indicating novel but related content.
These metrics help quantify how well documentation covers, relates to, and adds novel information relative to the code.
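The three definitions above can be sketched on a precomputed similarity matrix. The helper name `sub_metrics`, the threshold values, and the boundary handling (strictly above `sim_threshold` for a strong match, and above `partial_threshold` up to and including `sim_threshold` for a partial match, both judged on each documentation item's best match) are assumptions made for illustration, not details confirmed by the source.

```python
from typing import Dict, List


def sub_metrics(sims: List[List[float]],
                sim_threshold: float,
                partial_threshold: float) -> Dict[str, float]:
    """sims[i][j] is the similarity between code item i and doc item j."""
    n_code = len(sims)
    n_doc = len(sims[0]) if sims else 0
    if n_code == 0 or n_doc == 0:
        return {"DirectCoverage": 0.0, "Relevance": 0.0, "Novelty": 0.0}

    # Best match for each code item (row max) and each doc item (column max).
    best_for_code = [max(row) for row in sims]
    best_for_doc = [max(sims[i][j] for i in range(n_code)) for j in range(n_doc)]

    dc = sum(s > sim_threshold for s in best_for_code) / n_code
    rel = sum(s > sim_threshold for s in best_for_doc) / n_doc
    # Assumed band for a partial match: (partial_threshold, sim_threshold].
    nov = sum(partial_threshold < s <= sim_threshold for s in best_for_doc) / n_doc
    return {"DirectCoverage": dc, "Relevance": rel, "Novelty": nov}


# Two code items (rows) and two doc items (columns); thresholds are placeholders.
sims = [[0.9, 0.6],
        [0.3, 0.2]]
print(sub_metrics(sims, sim_threshold=0.8, partial_threshold=0.5))
# {'DirectCoverage': 0.5, 'Relevance': 0.5, 'Novelty': 0.5}
```

Here only the first code item has a strong match, the first documentation item is a strong match, and the second documentation item falls in the partial band.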
Parameters
`code_embeddings` (List[np.ndarray]): List of vector embeddings representing code snippets.
`doc_embeddings` (List[np.ndarray]): List of vector embeddings representing documentation snippets.
`sim_threshold` (float, optional): Similarity threshold to count as a strong match (default from config).
`partial_threshold` (float, optional): Lower similarity threshold to count as a partial match (default from config).
Returns
`Dict`: A dictionary with keys:
`"CES"` (float): The composite CES score.
`"DirectCoverage"` (float): Fraction of code embeddings covered.
`"Relevance"` (float): Fraction of documentation embeddings relevant.
`"Novelty"` (float): Fraction of documentation embeddings novel.
Usage Example
import numpy as np
from metrics import compute_ces_from_embeddings

code_embs = [np.array([0.1, 0.2, 0.3]), np.array([0.4, 0.5, 0.6])]
doc_embs = [np.array([0.1, 0.2, 0.3]), np.array([0.0, 0.1, 0.2])]
result = compute_ces_from_embeddings(code_embs, doc_embs)
print(result)
# The values depend on SIM_THRESHOLD and PARTIAL_THRESHOLD; the result has the form:
# {'CES': ..., 'DirectCoverage': ..., 'Relevance': ..., 'Novelty': ...}
Implementation Details
The function uses nested generator expressions to compute, for each embedding in one set, the maximum cosine similarity against the embeddings in the other set.
Handles empty input lists gracefully by returning zero scores.
Uses fixed weights (`alpha=0.5`, `beta=0.3`, `gamma=0.2`) to combine the sub-metrics into the final CES score.
The approach is heuristic and designed to balance coverage, relevance, and novelty.
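The weighted combination can be sketched as below. The weights come from the text above; the mapping of `alpha` to Direct Coverage, `beta` to Relevance, and `gamma` to Novelty, and the helper name `combine_ces`, are assumptions made for illustration.

```python
from typing import Dict

ALPHA, BETA, GAMMA = 0.5, 0.3, 0.2  # fixed weights quoted in the text


def combine_ces(dc: float, rel: float, nov: float) -> Dict[str, float]:
    """Weighted sum of the three sub-metrics, returned with its components."""
    # Assumed pairing: alpha * DC + beta * Rel + gamma * Nov.
    ces = ALPHA * dc + BETA * rel + GAMMA * nov
    return {"CES": ces, "DirectCoverage": dc, "Relevance": rel, "Novelty": nov}


scores = combine_ces(dc=0.5, rel=1.0, nov=0.0)
print(round(scores["CES"], 2))  # 0.55, i.e. 0.5*0.5 + 0.3*1.0 + 0.2*0.0
```

Because the weights sum to 1 and each sub-metric is a proportion in [0, 1], a CES computed this way also stays within [0, 1].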
Interaction with Other System Components
Configuration Module (`config.py`): The file imports similarity threshold constants from a configuration file, enabling centralized tuning of similarity parameters.
Embedding Generation: This module expects precomputed embeddings for code and documentation, which are typically generated by other parts of the system such as embedding models or feature extraction pipelines.
Evaluation Pipelines: The CES metric can be used in downstream components for evaluating documentation quality, guiding improvements, or automated feedback systems.
Potential Integration: Can be integrated with documentation generation tools, code review assistants, or knowledge base systems.
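As an illustration of the configuration dependency, a `config.py` might look like the sketch below. The numeric values are placeholders (the real thresholds are not given in this document), and `validate_thresholds` is a hypothetical helper that is not part of the described module.

```python
# Hypothetical config.py contents; the values are placeholders, not the real settings.
SIM_THRESHOLD = 0.8      # similarity above this counts as a strong match
PARTIAL_THRESHOLD = 0.5  # lower bound of the partial-match band


def validate_thresholds(sim: float, partial: float) -> None:
    """Hypothetical sanity check: the partial band must sit below the strong cutoff."""
    if not (0.0 <= partial < sim <= 1.0):
        raise ValueError(
            f"expected 0 <= partial < sim <= 1, got partial={partial}, sim={sim}"
        )


validate_thresholds(SIM_THRESHOLD, PARTIAL_THRESHOLD)  # passes silently
```

A check of this kind matters because the Novelty band is only non-empty when the partial threshold is strictly below the similarity threshold.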
Visual Diagram: Class & Function Structure
flowchart TD
    A["cosine_similarity(a: np.ndarray, b: np.ndarray) -> float"]
    B["compute_ces_from_embeddings(code_embeddings: List[np.ndarray], doc_embeddings: List[np.ndarray], sim_threshold: float, partial_threshold: float) -> Dict"]
    B -->|calls| A
**Diagram Explanation:**
The flowchart shows two main functions.
`compute_ces_from_embeddings` internally calls `cosine_similarity` multiple times to compute similarity scores between embeddings.
There are no classes in this file; the structure is functional, with utility functions focused on similarity and metric computation.
Summary
The `metrics.py` file provides core utility functions to measure the semantic relationship between code and documentation embeddings. By leveraging cosine similarity and threshold-based heuristics, it quantifies coverage, relevance, and novelty through the CES metric. This module is a critical component for systems aiming to assess or enhance documentation quality relative to code semantics.