source.py
Overview
The [source.py](/projects/286/67491) module provides a robust and immutable abstraction for handling and manipulating source code fragments in Python. Its primary class, `Source`, encapsulates source code lines while automatically normalizing indentation and enabling intuitive slicing, iteration, and querying of code statements.
This module is designed to facilitate introspection and manipulation of source code, especially useful in contexts such as debugging, code analysis, dynamic code modification, and testing frameworks. It leverages Python's standard libraries such as `inspect`, `ast`, and `tokenize` to parse and analyze source code structure accurately.
Detailed Documentation
Class: Source
**Purpose:** An immutable container for a source code fragment. It stores source code lines in a normalized form by deindenting them and provides various utility methods to query, slice, and manipulate these fragments conveniently.
Constructor: __init__(self, obj: object = None) -> None
**Description:** Creates a `Source` instance from various input types:
`None`: creates an empty `Source`.
Another `Source` object: copies its lines.
A list or tuple of strings: treats each string as a line of source and deindents.
A single string: splits into lines and deindents.
Any other Python object: attempts to extract source code using `inspect.getsource()`.
**Parameters:**
`obj` (`object`, optional): The source of the code fragment. Defaults to `None`.
**Usage Example:**
src1 = Source()
src2 = Source(["    def foo():", "        return 42"])
src3 = Source("def bar():\n pass\n")
src4 = Source(some_function)
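The input handling listed above can be sketched as a small standalone function. This is a minimal illustration under stated assumptions, not the module's actual code: `normalize_lines` is a hypothetical helper name, and only `textwrap.dedent` and `inspect.getsource` are real standard-library calls.

```python
import inspect
import textwrap


def normalize_lines(obj=None):
    """Hypothetical helper: turn the supported input types into deindented lines."""
    if obj is None:
        return []
    if isinstance(obj, (list, tuple)):
        # Each string is one line; drop any trailing newline before joining.
        text = "\n".join(line.rstrip("\n") for line in obj)
    elif isinstance(obj, str):
        text = obj
    else:
        # Any other object: ask inspect for its source (may raise OSError or TypeError).
        text = inspect.getsource(obj)
    return textwrap.dedent(text).splitlines()


print(normalize_lines(["    def foo():", "        return 42"]))
# ['def foo():', '    return 42']
```

Only the common leading whitespace is removed, so the relative indentation of the function body is preserved.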
Equality and Hashing
__eq__(self, other: object) -> bool
Two `Source` instances are equal if their deindented lines match exactly.
`__hash__` is deliberately set to `None`, so instances are unhashable and cannot be used as dictionary keys or set members.
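The combination of content-based equality and disabled hashing can be illustrated with a tiny stand-in class. `Fragment` is a hypothetical name used only for this sketch; it is not part of the module.

```python
class Fragment:
    """Hypothetical stand-in for a line container compared by content."""

    def __init__(self, lines):
        self.lines = list(lines)

    def __eq__(self, other):
        if not isinstance(other, Fragment):
            return NotImplemented
        return self.lines == other.lines

    # Explicitly mark the class as unhashable.
    __hash__ = None


a = Fragment(["x = 1"])
b = Fragment(["x = 1"])
print(a == b)  # True: same lines, different objects

try:
    hash(a)
except TypeError as exc:
    print("unhashable:", exc)
```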
Item Access: __getitem__(self, key: int | slice) -> str | Source
**Description:** Supports indexing and slicing:
Index by integer returns the corresponding line as a string.
Slice returns a new `Source` instance with the selected lines.
**Restrictions:** Slices with step values other than 1 are not supported and raise `IndexError`.
**Usage Example:**
line = src[2] # Get the third line
sub_source = src[1:4] # Get lines 1, 2, and 3 as a new Source instance
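A minimal sketch of how integer and slice access with step rejection might look is shown below. `LineView` is a hypothetical class for illustration only; the real `Source.__getitem__` may differ in detail.

```python
class LineView:
    """Hypothetical container illustrating int and slice access over lines."""

    def __init__(self, lines):
        self.lines = list(lines)

    def __getitem__(self, key):
        if isinstance(key, int):
            return self.lines[key]
        # Only contiguous slices are accepted.
        if key.step not in (None, 1):
            raise IndexError("cannot slice with a step")
        return LineView(self.lines[key.start:key.stop])


view = LineView(["a = 1", "b = 2", "c = 3", "d = 4"])
print(view[2])          # c = 3
print(view[1:3].lines)  # ['b = 2', 'c = 3']
```

Rejecting stepped slices keeps every result a contiguous run of lines, which is what statement-level extraction needs.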
Iteration and Length
__iter__(self) -> Iterator[str]
Iterates over the deindented lines.
__len__(self) -> int
Returns the number of lines in the source.
strip(self) -> Source
**Description:** Returns a new `Source` instance with leading and trailing blank lines removed.
**Usage Example:**
clean_source = src.strip()
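The blank-line trimming can be sketched on a plain list of lines. `strip_blank_edges` is a hypothetical helper that assumes a line counts as blank when it contains only whitespace.

```python
def strip_blank_edges(lines):
    """Return a copy of lines without leading or trailing blank lines."""
    start, end = 0, len(lines)
    while start < end and not lines[start].strip():
        start += 1
    while end > start and not lines[end - 1].strip():
        end -= 1
    return lines[start:end]


print(strip_blank_edges(["", "  ", "x = 1", "", "y = 2", ""]))
# ['x = 1', '', 'y = 2']
```

Blank lines in the interior are kept; only the edges are trimmed.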
indent(self, indent: str = "    ") -> Source
**Description:** Returns a new `Source` instance with every line indented by the specified string (default 4 spaces).
**Parameters:**
`indent` (`str`): The string to prepend to each line.
**Usage Example:**
indented_source = src.indent("\t") # Indents each line with a tab
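As a rough sketch, the indentation step amounts to prefixing every line. `indent_lines` is a hypothetical helper; whether the real method also prefixes blank lines is an assumption here.

```python
def indent_lines(lines, indent=" " * 4):
    """Return a new list with `indent` prepended to every line."""
    return [indent + line for line in lines]


print(indent_lines(["if x:", "    y = 1"]))
# ['    if x:', '        y = 1']
```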
getstatement(self, lineno: int) -> Source
**Description:** Returns the minimal statement containing the given line number (0-based). A statement may span multiple lines (like function definitions, loops, etc.).
**Parameters:**
`lineno` (`int`): Zero-based line number in the source.
**Usage Example:**
statement = src.getstatement(5)
print(str(statement)) # Prints the entire statement containing line 5
getstatementrange(self, lineno: int) -> tuple[int, int]
**Description:** Returns the `(start, end)` line indices (0-based, end exclusive) that span the minimal statement containing the given line number.
**Parameters:**
`lineno` (`int`): Zero-based line number.
**Returns:**
Tuple of integers `(start, end)`.
deindent(self) -> Source
**Description:** Returns a new `Source` object with the common leading whitespace removed from every line, so the relative indentation between lines is preserved.
__str__(self) -> str
**Description:** Returns the source code as a single string, joining lines with newline characters.
Helper Functions
findsource(obj) -> tuple[Source | None, int]
**Description:** Attempts to find and return the source code and starting line number of a given object using `inspect.findsource`.
**Returns:**
`Source` instance containing raw source lines (or `None` if not found).
Starting line number (`int`), or `-1` if the source is unobtainable.
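The lookup-with-fallback behaviour described above can be sketched as follows. `find_source_lines` is a hypothetical wrapper that returns plain lists rather than `Source` objects; `inspect.findsource` exists in the standard library but is not part of the documented public API, and the exception types caught here are an assumption.

```python
import inspect
import textwrap


def find_source_lines(obj):
    """Hypothetical wrapper: return (lines, start_index), or (None, -1) on failure."""
    try:
        lines, lineno = inspect.findsource(obj)
    except (OSError, TypeError):
        return None, -1
    # Drop trailing whitespace and newlines from each raw line.
    return [line.rstrip() for line in lines], lineno


lines, lineno = find_source_lines(textwrap.dedent)
if lines is not None:
    print(lines[lineno])  # first line of the function definition

print(find_source_lines(len))  # (None, -1): built-ins have no Python source
```

Note that the returned lines cover the whole containing file, with `lineno` pointing at where the object starts.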
getrawcode(obj: object, trycall: bool = True) -> types.CodeType
**Description:** Fetches the raw code object (`__code__`) for a given function or callable. If `obj` has no code object of its own and `trycall` is true, it falls back to the code of the object's `__call__` method.
**Raises:**
`TypeError` if no code object can be found.
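A minimal sketch of this lookup, assuming a single fallback to `__call__`, is given below. `get_code`, `sample_function`, and `Greeter` are hypothetical names for illustration, and the error message is invented.

```python
def get_code(obj, trycall=True):
    """Hypothetical sketch: return the code object behind a callable."""
    code = getattr(obj, "__code__", None)
    if code is not None:
        return code
    if trycall and not isinstance(obj, type):
        call = getattr(obj, "__call__", None)
        if call is not None:
            # Retry once on __call__, without recursing any further.
            return get_code(call, trycall=False)
    raise TypeError(f"could not get code object for {obj!r}")


def sample_function():
    return 1


class Greeter:
    def __call__(self):
        return "hello"


print(get_code(sample_function).co_name)  # sample_function
print(get_code(Greeter()).co_name)        # __call__
```

Passing `trycall=False` on the retry prevents endless recursion on objects whose `__call__` has no code object either, such as built-in functions.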
deindent(lines: Iterable[str]) -> list[str]
**Description:** Deindents a sequence of lines by removing common leading whitespace.
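Since the document states that deindentation relies on `textwrap.dedent`, the helper can be approximated in a few lines. `deindent_lines` is a hypothetical name; the join-and-split approach is one plausible way to apply `textwrap.dedent` to a sequence of lines.

```python
import textwrap


def deindent_lines(lines):
    """Remove the whitespace prefix shared by all non-blank lines."""
    return textwrap.dedent("\n".join(lines)).splitlines()


print(deindent_lines(["    if x:", "        y = 1"]))
# ['if x:', '    y = 1']
```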
get_statement_startend2(lineno: int, node: ast.AST) -> tuple[int, int | None]
**Description:** Given a line number and an AST node, determines the start and end line indices of the minimal statement containing that line.
**Implementation Details:**
Walks the AST to collect line numbers of all statements and exception handlers.
Uses bisect to find statement bounds relative to the given line number.
Accounts for decorators and else/finally blocks as part of statements.
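The walk-and-bisect idea can be sketched in simplified form. `statement_bounds` is a hypothetical function that omits the decorator and `else`/`finally` handling mentioned above, and it assumes `lineno` is at or after the first statement.

```python
import ast
from bisect import bisect_right


def statement_bounds(lineno, tree):
    """Simplified sketch: 0-based (start, end) of the statement containing lineno."""
    # Collect the 0-based first line of every statement and exception handler.
    starts = sorted(
        node.lineno - 1
        for node in ast.walk(tree)
        if isinstance(node, (ast.stmt, ast.ExceptHandler))
    )
    index = bisect_right(starts, lineno)
    start = starts[index - 1]
    # No later statement means the end is open (None).
    end = starts[index] if index < len(starts) else None
    return start, end


code = "x = 1\ny = (\n    2,\n    3,\n)\nz = 4\n"
tree = ast.parse(code)
print(statement_bounds(3, tree))  # (1, 5): line 3 is inside the multi-line assignment
print(statement_bounds(5, tree))  # (5, None): last statement, end is open
```

An end of `None` signals that the caller must determine the final boundary itself, for example from the length of the source.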
getstatementrange_ast(lineno: int, source: Source, assertion: bool = False, astnode: ast.AST | None = None) -> tuple[ast.AST, int, int]
**Description:** Returns the AST node and line range `(start, end)` for the minimal statement containing the given line number.
**Parameters:**
`lineno`: 0-based index of the line in `source`.
`source`: `Source` object.
`assertion`: Unused; defaults to `False`.
`astnode`: Optional pre-parsed AST for `source`; if `None`, the source is parsed internally.
**Implementation Details:**
Parses the source code into an AST.
Uses `get_statement_startend2` to find rough statement bounds.
Uses `inspect.BlockFinder` and tokenization to refine the end boundary, handling comments, blank lines, and indentation changes.
Ensures the statement boundaries do not include trailing comments or blank lines.
**Returns:**
Tuple of `(AST node, start line, end line)` with lines zero-indexed.
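The final clean-up step, dropping trailing comments and blank lines from the range, can be sketched on its own. `trim_statement_end` is a hypothetical helper; the tokenizer-based refinement with `inspect.BlockFinder` is deliberately left out of this sketch.

```python
def trim_statement_end(lines, start, end):
    """Move the exclusive end back past trailing blank and comment-only lines."""
    while end > start:
        stripped = lines[end - 1].lstrip()
        if stripped and not stripped.startswith("#"):
            break
        end -= 1
    return end


lines = ["x = 1", "y = 2", "# note", "", "z = 3"]
print(trim_statement_end(lines, 1, 4))  # 2: the range shrinks to just "y = 2"
```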
Important Implementation Details and Algorithms
Deindentation:
The `Source` class automatically removes common leading whitespace from all lines to normalize indentation, using `textwrap.dedent`.
Statement Range Resolution:
The module uses a combination of AST analysis (`ast` module), tokenization (`tokenize`), and `inspect.BlockFinder` to accurately pinpoint the start and end lines of Python statements, accounting for decorators, multi-line statements, exception handling blocks, and comments.
Immutable Design:
`Source` instances are designed to be treated as immutable: methods such as `strip`, `indent`, and `deindent` return new instances rather than modifying the original. Equality is based only on the source lines, and hashing is disabled.
Integration with `inspect`:
The module uses Python’s introspection capabilities to retrieve source code from live objects, handling various callable types robustly.
Interactions with Other System Components
This module likely serves as a foundational utility for other components requiring source code introspection or manipulation, such as:
Debuggers and REPLs that display or modify live code.
Testing frameworks that capture or generate code snippets.
Code analysis or transformation tools within the project.
Any system parts that require precise statement-level code extraction or editing.
It depends on Python standard libraries:
`inspect` for code retrieval, `ast` for syntax tree parsing, and `tokenize` for lexical analysis.
Visual Diagram: Class Structure of Source
classDiagram
class Source {
- lines: list[str]
- raw_lines: list[str]
+ __init__(obj: object = None)
+ __eq__(other: object) bool
+ __getitem__(key: int | slice) str | Source
+ __iter__() Iterator[str]
+ __len__() int
+ strip() Source
+ indent(indent: str = "    ") Source
+ getstatement(lineno: int) Source
+ getstatementrange(lineno: int) (int, int)
+ deindent() Source
+ __str__() str
}
Summary
The [source.py](/projects/286/67491) module provides a powerful abstraction for handling Python source code fragments with precise control over indentation and statement boundaries. It leverages introspection and syntax analysis to support advanced use cases that require understanding or manipulating code structure at a granular level. The `Source` class is the core, offering immutable, sliceable, and iterable source representations, complemented by utility functions that integrate tightly with Python's introspection APIs.