main.py
Overview
`main.py` is the central script for evaluating, at the project level, how well documentation covers source code. It uses the CES (Code-to-Documentation Embedding Similarity) metric; a BERTScore step is present but currently commented out. The script processes source code and documentation files, generates semantic embeddings for both, computes similarity metrics from those embeddings, and produces visual and JSON summary outputs.
The evaluation leverages:
Ollama for generating semantic embeddings.
tree_sitter_languages for parsing source code.
BERTScore (commented out currently) for reference-based evaluation of documentation quality.
Visualization libraries (Plotly) to create interactive charts summarizing results.
This script is intended to be run as a command-line tool, taking paths to source code and documentation directories and producing evaluation outputs in a designated folder.
Detailed Explanation
Imports and Dependencies
Imports utility functions from project modules:
`code_parsing.extract_code_units_from_file`
`document_parsing.chunk_markdown_files`
`embeddings.chunk_code_units_for_embedding`, `embeddings.get_embeddings_ollama`
`file_utils.discover_files`
`metrics.compute_ces_from_embeddings`
`plotting.plotly_radar_and_bar`, `plotting.plot_semantic_scatter`
Requires external packages:
`ollama`, `bert-score`, `tree_sitter_languages`, `scikit-learn`, `scipy`, `matplotlib`, `numpy`.
Assumes Ollama daemon is running locally.
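Before running the script, it can help to confirm that the listed packages are importable. The following is a minimal sketch, not part of `main.py`; the helper name `missing_modules` is hypothetical. Note that two pip package names differ from their import names (`bert-score` imports as `bert_score`, `scikit-learn` as `sklearn`). The check does not verify that the Ollama daemon is running.

```python
import importlib.util
from typing import Iterable, List

# Import names for the packages listed above. Two differ from their pip names:
# bert-score -> bert_score, scikit-learn -> sklearn.
REQUIRED_MODULES = [
    "ollama",
    "bert_score",
    "tree_sitter_languages",
    "sklearn",
    "scipy",
    "matplotlib",
    "numpy",
]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_modules(REQUIRED_MODULES)` returns an empty list when every package is installed in the active environment.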
Function: evaluate_project
def evaluate_project(code_dir: str, docs_dir: str, out_dir: str,
                     use_umap: bool = False, quiet: bool = False) -> Dict:
Purpose
Performs the full evaluation pipeline on the given project directories and outputs results and visualizations.
Parameters
`code_dir` (str): Path to the root directory containing source code files.
`docs_dir` (str): Path to the root directory containing project documentation (Markdown files).
`out_dir` (str): Directory path to save output files (visualizations, JSON summaries).
`use_umap` (bool, optional): Whether to use UMAP (Uniform Manifold Approximation and Projection) for dimensionality reduction in visualization; default is False (use t-SNE).
`quiet` (bool, optional): If True, suppresses most console output; default is False.
Returns
Dict: A dictionary containing the computed CES evaluation results.
Workflow
Setup Output Directory: Creates `out_dir` if it doesn't exist.
Discover Files: Finds source code and documentation files using `discover_files`.
Extract Code Units: Parses source files to extract raw code units (functions, classes, etc.).
Chunk Code Units: Splits large code units into chunks (max 60 lines) suitable for embedding.
Extract Documentation Units: Parses markdown files into manageable doc units.
Generate Embeddings: Uses Ollama to create semantic embeddings for both code chunks and documentation units.
Compute CES: Calculates the Code-to-Documentation Embedding Similarity metric from embeddings.
(Commented Out) BERTScore Computation: Placeholder for generating pseudo-references and computing BERTScore.
Visualization: Saves radar/bar charts and semantic scatter plots summarizing similarity metrics.
Save Summary JSON: Writes evaluation results and metadata to a JSON file.
Return Results: Returns the CES results dictionary.
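The chunking step in the workflow above can be sketched as follows. This is an illustration only: the actual signature and output shape of `chunk_code_units_for_embedding` are not shown in this document, so the function name `chunk_code_unit` and the dictionary keys are hypothetical. Only the 60-line limit comes from the text.

```python
from typing import Dict, List

MAX_CHUNK_LINES = 60  # limit stated in the workflow description


def chunk_code_unit(name: str, source: str,
                    max_lines: int = MAX_CHUNK_LINES) -> List[Dict]:
    """Split one code unit into consecutive chunks of at most max_lines lines.

    Hypothetical helper; the project's real chunker may behave differently.
    """
    lines = source.splitlines()
    chunks = []
    for start in range(0, len(lines), max_lines):
        part = lines[start:start + max_lines]
        chunks.append({
            "unit": name,
            "start_line": start + 1,        # 1-based, relative to the unit
            "end_line": start + len(part),  # inclusive
            "text": "\n".join(part),
        })
    return chunks
```

A 130-line function would yield three chunks of 60, 60, and 10 lines; a unit that already fits within the limit comes back as a single chunk.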
Usage Example
results = evaluate_project(
    code_dir="./my_project/src",
    docs_dir="./my_project/docs",
    out_dir="./my_project/eval_results",
    use_umap=True,
    quiet=False
)
print(results)
Function: main
def main():
Purpose
Entry point for command-line usage of the script. Parses arguments and invokes `evaluate_project`.
Command-line Arguments
`--code_dir`: Required. Path to source code directory.
`--docs_dir`: Required. Path to documentation directory (Markdown files).
`--out_dir`: Optional. Output directory for results (default: `./out`).
`--use_umap`: Optional flag. Use UMAP instead of t-SNE for visualization.
`--quiet`: Optional flag. Suppress verbose output.
Workflow
Parses CLI arguments.
Calls `evaluate_project` with parsed arguments.
Pretty-prints the JSON results to stdout.
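The argument handling described above could be written with the standard-library `argparse` module. This is a sketch under that assumption; the document does not show which parser `main.py` actually uses, and the helper name `build_parser` is hypothetical. The flag names and the `./out` default are taken from the list of command-line arguments.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Evaluate documentation coverage of a project.")
    parser.add_argument("--code_dir", required=True,
                        help="Path to source code directory")
    parser.add_argument("--docs_dir", required=True,
                        help="Path to documentation directory (Markdown files)")
    parser.add_argument("--out_dir", default="./out",
                        help="Output directory for results")
    parser.add_argument("--use_umap", action="store_true",
                        help="Use UMAP instead of t-SNE for visualization")
    parser.add_argument("--quiet", action="store_true",
                        help="Suppress verbose output")
    return parser
```

With a parser like this, `main` would pass `args.code_dir`, `args.docs_dir`, `args.out_dir`, `args.use_umap`, and `args.quiet` to `evaluate_project`, and could pretty-print the returned dictionary with `json.dumps(results, indent=2)`.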
Usage Example
python main.py --code_dir ./src --docs_dir ./docs --out_dir ./eval --use_umap
Important Implementation Details
File Discovery: Uses a utility function `discover_files` to locate relevant source code and markdown files recursively.
Code Parsing: Employs `tree_sitter_languages` via `extract_code_units_from_file` to robustly parse and extract code units across multiple languages.
Chunking Strategy: Code units are chunked into smaller blocks (max 60 lines) to optimize embedding generation and avoid input size limits.
Embeddings Generation: Uses Ollama's local model (`EMBED_MODEL`) to generate semantic embeddings for both code and documentation units.
CES Metric: Computes similarity between sets of embeddings to evaluate coverage of documentation relative to code.
Visualization: Generates rich interactive charts (radar/bar charts and semantic scatter plots) to help interpret and present evaluation results.
Scalability Considerations: Comments note that some evaluation parts (like pseudo-reference generation and BERTScore) can be computationally expensive and are currently commented out.
Extensibility: The modular design allows swapping out embedding models, parsing strategies, or metrics easily.
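This document does not give the formula behind the CES metric, so the following is only one plausible way to score coverage between two sets of embeddings, not the logic of `compute_ces_from_embeddings`. The assumption here is that each code chunk is matched to its most similar documentation unit by cosine similarity and the best matches are averaged. The names `cosine` and `coverage_score` and the result keys are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def coverage_score(code_vecs: List[Vector], doc_vecs: List[Vector]) -> Dict:
    """Assumed formulation: mean over code chunks of the best doc similarity."""
    if not code_vecs or not doc_vecs:
        return {"mean_best_similarity": 0.0, "per_chunk": []}
    # For every code chunk, keep the similarity of its closest doc unit.
    per_chunk = [max(cosine(c, d) for d in doc_vecs) for c in code_vecs]
    return {
        "mean_best_similarity": sum(per_chunk) / len(per_chunk),
        "per_chunk": per_chunk,
    }
```

Under this formulation, a code chunk with no semantically close documentation unit pulls the mean down, which is the behavior one would want from a coverage measure. The per-chunk values also point to the specific code that is poorly documented.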
Interactions with Other Modules
code_parsing: Provides code extraction from raw source files.
document_parsing: Handles breaking down markdown docs into chunks.
embeddings: Manages chunking for embeddings and interfaces with Ollama API.
file_utils: Responsible for discovering relevant files in directories.
metrics: Contains CES evaluation logic.
plotting: Visualizes results with Plotly charts.
config: Supplies model configuration (e.g., `EMBED_MODEL`).
The script acts as the orchestrator that coordinates these modules to perform a comprehensive project documentation evaluation.
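As an illustration of how the `embeddings` module might interface with Ollama, here is a minimal sketch. It is not the project's `get_embeddings_ollama`: the function names are hypothetical, the model name is a placeholder (the real `EMBED_MODEL` value in `config` is not shown here), and it assumes the `ollama` Python package's `embeddings(model=..., prompt=...)` call, whose response exposes an `"embedding"` entry. The embedding function is injectable so the surrounding logic can be exercised without a running daemon.

```python
from typing import Callable, List, Sequence

# Placeholder only; the project's configured EMBED_MODEL is not shown in this document.
EMBED_MODEL = "nomic-embed-text"


def ollama_embed_one(text: str, model: str = EMBED_MODEL) -> List[float]:
    """Embed one string through a local Ollama daemon (requires it to be running)."""
    import ollama  # imported lazily so this module loads without the package
    response = ollama.embeddings(model=model, prompt=text)
    return list(response["embedding"])


def embed_texts(texts: Sequence[str],
                embed_one: Callable[[str], List[float]] = ollama_embed_one
                ) -> List[List[float]]:
    """Embed each text in order, using the supplied single-text embedder."""
    return [embed_one(text) for text in texts]
```

Passing a stand-in such as `embed_texts(["a", "bb"], embed_one=lambda t: [float(len(t))])` exercises the same code path offline, which is convenient for unit tests of the orchestration logic.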
Visual Diagram: Class/Function Structure
flowchart TD
A[evaluate_project] --> B[discover_files]
A --> C[extract_code_units_from_file]
A --> D[chunk_code_units_for_embedding]
A --> E[chunk_markdown_files]
A --> F[get_embeddings_ollama]
A --> G[compute_ces_from_embeddings]
A --> H[plotly_radar_and_bar]
A --> I[plot_semantic_scatter]
J[main] --> A
**Diagram Explanation**:
`main` calls `evaluate_project`.
`evaluate_project` coordinates multiple helper functions from different modules.
The flowchart depicts the main function call dependencies and workflow in the file.
Summary
`main.py` is the evaluation driver script: it integrates parsing, embedding generation, similarity computation, and visualization to assess how well project documentation covers source code. Its modular design keeps it extensible, and its outputs (interactive charts and a JSON summary) report documentation coverage at the project level. The more expensive steps, pseudo-reference generation and BERTScore, are currently commented out.