pathlib.py
Overview
The `pathlib.py` file provides utilities and abstractions for filesystem path manipulation, directory management, and dynamic Python module importing. It builds on Python's standard `pathlib` and `os` modules, extending them for use cases common in test frameworks like pytest: creating numbered directories with cleanup locks, safely removing directories (even those containing read-only files), and importing Python modules dynamically from arbitrary filesystem paths.
Key functionalities include:
Safe recursive directory removal (`rm_rf`) that handles permission and read-only file issues.
Creation and management of "numbered" directories with atomic locks to prevent premature cleanup.
Utilities to find directory entries by prefix and to extract suffixes.
Advanced dynamic import mechanisms supporting different import modes, including handling namespace packages and import mismatches.
Path resolution utilities that handle Windows extended-length paths and symlink creation with fallback.
Recursive directory scanning and walking with error handling for common filesystem edge cases.
Helpers for relative path computation, module name resolution, and importability checks.
This module is tightly integrated with test environment management, ensuring robust filesystem operations and flexible module loading, which are critical to dynamic test discovery and execution.
Classes
ImportMode (Enum)
Defines possible import modes for dynamic module importing.
| Member | Description |
|---|---|
| `prepend` | Prepend the module's directory to `sys.path` and import with `importlib.import_module`. |
| `append` | Append the module's directory to `sys.path` before importing. |
| `importlib` | Use `importlib` machinery to import the module without modifying `sys.path`. |
ImportPathMismatchError (Exception)
Raised when `import_path` detects a mismatch between the expected file path and the imported module's `__file__` attribute, which can occur if modules with the same basename exist in different package locations.
CouldNotResolvePathError (Exception)
Raised when the utility `resolve_pkg_root_and_module_name` fails to determine the package root directory and module name for a given path.
Functions
Filesystem and Directory Utilities
_ignore_error(exception: Exception) -> bool
Returns `True` if the exception corresponds to a known, ignorable filesystem error (e.g., `ENOENT`, `ENOTDIR`, `EBADF`, or `ELOOP` on POSIX, or certain Windows error codes).
get_lock_path(path: _AnyPurePath) -> _AnyPurePath
Returns a `.lock` file path inside the given directory path. Used for creating lock files to prevent concurrent cleanup.
on_rm_rf_error(func, path, excinfo, *, start_path: Path) -> bool
Error handler for `shutil.rmtree` that handles read-only and permission errors by adjusting file permissions and retrying removal. Returns `True` if the error was handled.
**Usage:**
Passed as the `onerror` callback to `shutil.rmtree` to robustly delete directories. Because `start_path` is keyword-only, it has to be bound first (for example with `functools.partial`) so that the callback matches the three-argument form `shutil.rmtree` expects.
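The retry-after-chmod idea can be sketched with the standard library alone. This is a minimal illustration, not the module's implementation: `add_user_write`, `retry_after_chmod`, and `remove_tree` are hypothetical names, and the sketch assumes that only `PermissionError` raised by `os.rmdir`, `os.remove`, or `os.unlink` is worth retrying.

```python
import os
import shutil
import stat
import sys
import tempfile
from functools import partial
from pathlib import Path


def add_user_write(path: Path) -> None:
    """Grant the owner read/write access (plus traverse access for directories)."""
    extra = stat.S_IRWXU if path.is_dir() else stat.S_IRUSR | stat.S_IWUSR
    os.chmod(path, path.stat().st_mode | extra)


def retry_after_chmod(func, path, excinfo, *, start_path: Path) -> None:
    """Hypothetical handler: fix permissions, then retry the failed call once."""
    # `onerror` passes a (type, value, traceback) tuple; `onexc` passes the exception.
    exc = excinfo[1] if isinstance(excinfo, tuple) else excinfo
    retryable = func in (os.rmdir, os.remove, os.unlink)
    if not isinstance(exc, PermissionError) or not retryable:
        raise exc
    target = Path(path)
    # Make the entry and its ancestors writable, but never climb above start_path.
    for p in (target, *target.parents):
        if p != start_path and start_path not in p.parents:
            break
        add_user_write(p)
    func(path)


def remove_tree(path: Path) -> None:
    """Delete a tree, binding start_path so the handler has the expected arity."""
    handler = partial(retry_after_chmod, start_path=path)
    if sys.version_info >= (3, 12):
        shutil.rmtree(path, onexc=handler)  # `onexc` was added in Python 3.12
    else:
        shutil.rmtree(path, onerror=handler)


# Usage: on POSIX, a read-only directory blocks deletion of the files inside it.
demo_root = Path(tempfile.mkdtemp()) / "tree"
locked = demo_root / "locked"
locked.mkdir(parents=True)
(locked / "data.txt").write_text("x")
os.chmod(locked, 0o500)
remove_tree(demo_root)
```

Restricting the `chmod` walk to `start_path` and below keeps the handler from touching permissions outside the tree being deleted.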
ensure_extended_length_path(path: Path) -> Path
On Windows, converts a path to its extended-length form (`\\?\` prefix) to support long paths beyond the 260 character limit. Returns the path unchanged on other platforms.
get_extended_length_path_str(path: str) -> str
Helper that converts a string path to Windows extended-length path format.
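As a rough illustration of the prefixing rule, the following sketch works on plain strings and can therefore run on any platform. The name `to_extended_length` and the special handling of UNC paths are assumptions made for this example; the real helper may differ in details.

```python
LONG_PREFIX = "\\\\?\\"  # two backslashes, a question mark, one backslash
UNC_LONG_PREFIX = "\\\\?\\UNC\\"  # extended-length form for UNC paths


def to_extended_length(path: str) -> str:
    """Sketch: add the Windows extended-length prefix to an absolute path string."""
    if path.startswith((LONG_PREFIX, UNC_LONG_PREFIX)):
        return path  # already in extended-length form
    if path.startswith("\\\\"):
        # UNC path (server and share): swap the two leading backslashes
        # for the UNC variant of the prefix.
        return UNC_LONG_PREFIX + path[2:]
    return LONG_PREFIX + path


print(to_extended_length(r"C:\very\long\path"))
print(to_extended_length(r"\\server\share\dir"))
```

A caller would normally apply such a conversion only when running on Windows and leave paths untouched elsewhere, as described for `ensure_extended_length_path`.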
rm_rf(path: Path) -> None
Recursively deletes a directory or file tree, handling read-only files and Windows long paths.
**Example:**
rm_rf(Path("/tmp/mydir"))
find_prefixed(root: Path, prefix: str) -> Iterator[os.DirEntry[str]]
Yields directory entries in `root` whose names start with `prefix` (case-insensitive).
extract_suffixes(iter: Iterable[os.DirEntry[str]], prefix: str) -> Iterator[str]
Given an iterator of `os.DirEntry` objects and a prefix, yields the suffixes of the entry names after the prefix.
find_suffixes(root: Path, prefix: str) -> Iterator[str]
Combines `find_prefixed` and `extract_suffixes` to yield suffixes of entries starting with `prefix` in `root`.
parse_num(maybe_num: str) -> int
Attempts to parse an integer from a string; returns `-1` if parsing fails.
_force_symlink(root: Path, target: str | PurePath, link_to: str | Path) -> None
Creates or replaces a symlink named `target` inside `root` pointing to `link_to`. Ignores exceptions, as this is a best-effort operation typically used for pointing to the current numbered directory.
make_numbered_dir(root: Path, prefix: str, mode: int = 0o700) -> Path
Creates a new directory inside `root` with a name consisting of `prefix` followed by an incremented number (e.g., `prefix0`, `prefix1`, ...). Tries up to 10 times to create a unique numbered directory. Also updates a `prefixcurrent` symlink to point to the newly created directory.
**Example:**
new_dir = make_numbered_dir(Path("/tmp"), "run")
print(new_dir) # e.g., /tmp/run3
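The numbering scheme described above can be approximated as follows. This is a sketch under stated assumptions, not the module's code: `parse_num_sketch` and `next_numbered_dir` are hypothetical names, the symlink update is omitted, and only `FileExistsError` is treated as a reason to retry.

```python
import os
import tempfile
from pathlib import Path


def parse_num_sketch(text: str) -> int:
    """Return the integer value of text, or -1 if it is not a number."""
    try:
        return int(text)
    except ValueError:
        return -1


def next_numbered_dir(root: Path, prefix: str, mode: int = 0o700) -> Path:
    """Create root/<prefix><N>, where N is one above the highest existing number."""
    for _ in range(10):  # retry in case another process claims the same number
        with os.scandir(root) as entries:
            names = [e.name for e in entries if e.name.lower().startswith(prefix.lower())]
        highest = max((parse_num_sketch(n[len(prefix):]) for n in names), default=-1)
        candidate = root / f"{prefix}{highest + 1}"
        try:
            candidate.mkdir(mode=mode)
        except FileExistsError:
            continue  # lost the race; rescan and try the next number
        return candidate
    raise OSError(f"could not create a numbered directory for {prefix!r} in {root}")


demo_root = Path(tempfile.mkdtemp())
first = next_numbered_dir(demo_root, "run")
second = next_numbered_dir(demo_root, "run")
print(first.name, second.name)
```

Entries with a non-numeric suffix (such as a `runcurrent` link) parse to `-1` and so never influence the next number.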
create_cleanup_lock(p: Path) -> Path
Creates a lock file `.lock` inside directory `p` atomically to prevent premature cleanup. Writes the current process PID into the lock file.
Raises `OSError` if the lock file already exists.
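Atomic lock creation of this kind is commonly built on `os.open` with `O_CREAT | O_EXCL`, which fails if the file already exists. The sketch below illustrates that technique; `create_lock` is a hypothetical name and the file mode `0o644` is an assumption.

```python
import os
import tempfile
from pathlib import Path


def create_lock(directory: Path) -> Path:
    """Create directory/.lock exclusively and record the current PID in it."""
    lock_path = directory / ".lock"
    # O_EXCL together with O_CREAT makes the call fail with FileExistsError
    # when the file is already present, so only one creator can succeed.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(str(os.getpid()))
    return lock_path


demo_dir = Path(tempfile.mkdtemp())
lock = create_lock(demo_dir)
try:
    create_lock(demo_dir)
except FileExistsError:
    print("directory is already locked")
```

`FileExistsError` is a subclass of `OSError`, so callers that catch `OSError` also see this failure.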
register_cleanup_lock_removal(lock_path: Path, register: Any = atexit.register) -> Any
Registers a function that removes `lock_path` at process exit (by default via `atexit.register`). The removal is skipped when the exiting process is not the one that registered it, for example a forked child.
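A PID guard is one way to keep a forked child from removing its parent's lock. The following is a minimal sketch of that idea; `register_lock_removal` is a hypothetical name, and the injectable `register` parameter mirrors the signature above so the callback can be triggered on demand.

```python
import atexit
import os
import tempfile
from pathlib import Path
from typing import Any, Callable


def register_lock_removal(
    lock_path: Path, register: Callable[..., Any] = atexit.register
) -> Any:
    """Register a callback that removes lock_path, but only in the owning process."""
    owner_pid = os.getpid()

    def remove_lock() -> None:
        if os.getpid() != owner_pid:
            return  # a forked child must not remove the parent's lock
        try:
            lock_path.unlink()
        except OSError:
            pass  # already gone or not removable; nothing more to do

    return register(remove_lock)


# Usage with a stand-in for atexit.register so the callback can be run immediately.
callbacks = []
demo_lock = Path(tempfile.mkdtemp()) / ".lock"
demo_lock.write_text(str(os.getpid()))
register_lock_removal(demo_lock, register=callbacks.append)
callbacks[0]()
```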
maybe_delete_a_numbered_dir(path: Path) -> None
Attempts to delete a numbered directory safely by acquiring a cleanup lock, renaming the directory to a garbage directory, and removing it recursively.
ensure_deletable(path: Path, consider_lock_dead_if_created_before: float) -> bool
Checks if a directory is deletable by verifying if its lock file exists and whether the lock file's modification time is older than a threshold.
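The staleness check can be pictured as a comparison between the lock file's modification time and a cutoff timestamp. This sketch covers only that comparison; `is_deletable` is a hypothetical name, and treating any `OSError` as "not deletable" is an assumption.

```python
import os
import tempfile
import time
from pathlib import Path


def is_deletable(path: Path, consider_lock_dead_if_created_before: float) -> bool:
    """Sketch: a directory is deletable if it is unlocked or its lock is stale."""
    lock = path / ".lock"
    try:
        if not lock.is_file():
            return True  # no lock at all
        lock_time = lock.stat().st_mtime
    except OSError:
        return False  # cannot inspect the lock; err on the side of keeping it
    return lock_time < consider_lock_dead_if_created_before


demo_dir = Path(tempfile.mkdtemp())
cutoff = time.time() - 3600  # locks older than one hour count as dead
unlocked = is_deletable(demo_dir, cutoff)

lock = demo_dir / ".lock"
lock.write_text("12345")
fresh = is_deletable(demo_dir, cutoff)

two_hours_ago = time.time() - 7200
os.utime(lock, (two_hours_ago, two_hours_ago))
stale = is_deletable(demo_dir, cutoff)
print(unlocked, fresh, stale)
```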
try_cleanup(path: Path, consider_lock_dead_if_created_before: float) -> None
Tries to clean up a directory if it passes `ensure_deletable`.
cleanup_candidates(root: Path, prefix: str, keep: int) -> Iterator[Path]
Yields the numbered directories under `root` with `prefix` that fall outside the retention window, that is, entries whose numeric suffix is at most the highest existing number minus `keep`.
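Reduced to plain integers, the retention rule might look like the sketch below. `numbers_to_delete` is a hypothetical helper, and the rule itself (delete everything at or below `highest - keep`) is this example's assumption about how the window is computed.

```python
def numbers_to_delete(numbers: list[int], keep: int) -> list[int]:
    """Return the directory numbers that fall outside the newest `keep` entries."""
    highest = max(numbers, default=-1)
    max_delete = highest - keep
    return [n for n in numbers if n <= max_delete]


print(numbers_to_delete([0, 1, 2, 3, 4, 5], keep=3))  # [0, 1, 2]
```

With gaps in the numbering, fewer than `keep` directories may survive, because the window is defined by number rather than by count.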
cleanup_dead_symlinks(root: Path) -> None
Removes broken symlinks inside `root`.
cleanup_numbered_dir(root: Path, prefix: str, keep: int, consider_lock_dead_if_created_before: float) -> None
Cleans up old numbered directories and garbage directories under `root` according to locking and retention policies. Also removes dead symlinks.
make_numbered_dir_with_cleanup(root: Path, prefix: str, keep: int, lock_timeout: float, mode: int) -> Path
Creates a numbered directory with cleanup locks and registers cleanup for old directories at program exit.
resolve_from_str(input: str, rootpath: Path) -> Path
Resolves a string path `input` by expanding user (~) and environment variables, returning an absolute path relative to `rootpath` if `input` is relative.
fnmatch_ex(pattern: str, path: str | os.PathLike[str]) -> bool
Enhanced filename pattern matching that supports `**` glob patterns across path separators, maintaining compatibility with previous `py.path` behavior.
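One way to get this behavior is to lean on `fnmatch`, whose `*` already matches across `/`, which is why a `**` segment can span several directories. The sketch below is an approximation restricted to POSIX-style separators; `match_path` is a hypothetical name and the rules for separator-free and relative patterns are assumptions.

```python
import fnmatch
from pathlib import PurePosixPath


def match_path(pattern: str, path: str) -> bool:
    """Sketch of path-aware fnmatch using '/' separators only."""
    pure = PurePosixPath(path)
    if "/" not in pattern:
        # No separator in the pattern: compare against the file name only.
        return fnmatch.fnmatchcase(pure.name, pattern)
    if pure.is_absolute() and not pattern.startswith("/"):
        # Let a relative pattern match anywhere inside an absolute path.
        pattern = f"*/{pattern}"
    return fnmatch.fnmatchcase(str(pure), pattern)


print(match_path("test_*.py", "tests/unit/test_foo.py"))
print(match_path("tests/**/test_*.py", "tests/a/b/test_foo.py"))
```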
parts(s: str) -> set[str]
Splits a path string `s` by the OS separator and returns a set of cumulative path parts.
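For a concrete picture of "cumulative path parts", here is a small sketch. `cumulative_parts` is a hypothetical name, and the separator is passed explicitly (defaulting to `/`) so the result does not depend on the platform.

```python
def cumulative_parts(s: str, sep: str = "/") -> set[str]:
    """Return every leading sub-path of s, e.g. '/a/b' -> {'/', '/a', '/a/b'}."""
    pieces = s.split(sep)
    # An empty join result corresponds to the root, so fall back to the separator.
    return {sep.join(pieces[: i + 1]) or sep for i in range(len(pieces))}


print(sorted(cumulative_parts("/a/b")))  # ['/', '/a', '/a/b']
```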
symlink_or_skip(src: os.PathLike[str] | str, dst: os.PathLike[str] | str, **kwargs: Any) -> None
Attempts to create a symlink from `src` to `dst`. If symlink creation is not supported, skips the test (using pytest's `skip`).
Module Import Utilities
import_path(path: str | os.PathLike[str], *, mode: str | ImportMode = ImportMode.prepend, root: Path, consider_namespace_packages: bool) -> ModuleType
Imports a Python module or package from an arbitrary filesystem path.
`path`: Filesystem path to the module or package.
`mode`: Import mode controlling how the module is imported (`prepend`, `append`, or `importlib`).
`root`: Root path used to compute module names, especially in `importlib` mode.
`consider_namespace_packages`: Whether to consider namespace packages when resolving module names.
Raises `ImportPathMismatchError` if the imported module's `__file__` does not match `path` (in `prepend` and `append` modes).
**Example:**
mod = import_path("/path/to/my/module.py", mode=ImportMode.importlib, root=Path("/path/to"), consider_namespace_packages=False)
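Importing a file without touching `sys.path` can be done with the standard `importlib.util` helpers. The sketch below shows that general technique rather than the module's own code path; `import_from_file` and the module name `sketch_demo_mod` are hypothetical.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path
from types import ModuleType


def import_from_file(module_name: str, file_path: Path) -> ModuleType:
    """Import file_path under module_name via a spec, leaving sys.path alone."""
    spec = importlib.util.spec_from_file_location(module_name, str(file_path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {file_path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so imports of this name during execution resolve.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


demo_file = Path(tempfile.mkdtemp()) / "demo_mod.py"
demo_file.write_text("ANSWER = 42\n")
path_before = list(sys.path)
demo = import_from_file("sketch_demo_mod", demo_file)
print(demo.ANSWER)  # 42
```

Because the module name is chosen by the caller, two files with the same basename can be imported side by side under different names.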
_import_module_using_spec(module_name: str, module_path: Path, module_location: Path, *, insert_modules: bool) -> ModuleType | None
Internal helper to import a module using `importlib` `ModuleSpec`. Handles importing parents recursively and inserts missing intermediate modules if requested.
spec_matches_module_path(module_spec: ModuleSpec | None, module_path: Path) -> bool
Checks if the given `ModuleSpec` corresponds to the specified module file path.
module_name_from_path(path: Path, root: Path) -> str
Generates a Python dotted module name for a file path relative to a root directory.
**Example:**
module_name_from_path(Path("projects/src/tests/test_foo.py"), Path("projects"))
# Returns: "src.tests.test_foo"
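The mapping from a file path to a dotted name can be sketched in a few lines. `dotted_name` is a hypothetical helper; unlike a production implementation, it simply raises `ValueError` when `path` is not located under `root`.

```python
from pathlib import Path


def dotted_name(path: Path, root: Path) -> str:
    """Strip the suffix, make the path relative to root, and join parts with dots."""
    relative = path.with_suffix("").relative_to(root)
    return ".".join(relative.parts)


print(dotted_name(Path("projects/src/tests/test_foo.py"), Path("projects")))
# src.tests.test_foo
```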
insert_missing_modules(modules: dict[str, ModuleType], module_name: str) -> None
Creates empty intermediate modules for module names to allow proper import when modules are imported directly by path.
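To illustrate the idea, the sketch below fills a plain dictionary (standing in for `sys.modules`) with placeholder modules for every missing ancestor of a dotted name. `insert_placeholders` is a hypothetical name, and linking each child as an attribute of its parent is an assumption of this example.

```python
from types import ModuleType


def insert_placeholders(modules: dict[str, ModuleType], module_name: str) -> None:
    """Ensure 'a', 'a.b', 'a.b.c' all exist in modules, linking child to parent."""
    parts = module_name.split(".")
    parent = None
    for i in range(1, len(parts) + 1):
        name = ".".join(parts[:i])
        module = modules.get(name)
        if module is None:
            module = ModuleType(name)  # empty stand-in module
            modules[name] = module
        if parent is not None and not hasattr(parent, parts[i - 1]):
            setattr(parent, parts[i - 1], module)
        parent = module


registry: dict[str, ModuleType] = {}
insert_placeholders(registry, "a.b.c")
print(sorted(registry))  # ['a', 'a.b', 'a.b.c']
```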
resolve_package_path(path: Path) -> Path | None
Walks upward from `path` and returns the last (outermost) directory that still contains an `__init__.py`, which is the top-level package directory. Returns `None` if no such directory is found.
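The upward search for a package root can be sketched as follows. `top_package_dir` is a hypothetical name; stopping at the first directory without `__init__.py`, or with a name that is not a valid identifier, is this example's assumption.

```python
import tempfile
from pathlib import Path
from typing import Optional


def top_package_dir(path: Path) -> Optional[Path]:
    """Return the outermost consecutive ancestor directory that has __init__.py."""
    result = None
    for candidate in (path, *path.parents):
        if candidate.is_dir():
            if not (candidate / "__init__.py").is_file():
                break  # the package chain ends here
            if not candidate.name.isidentifier():
                break  # such a directory cannot be imported as a package
            result = candidate
    return result


tmp = Path(tempfile.mkdtemp())
(tmp / "pkg" / "sub").mkdir(parents=True)
(tmp / "pkg" / "__init__.py").write_text("")
(tmp / "pkg" / "sub" / "__init__.py").write_text("")
module_file = tmp / "pkg" / "sub" / "mod.py"
module_file.write_text("")
loose_file = tmp / "loose.py"
loose_file.write_text("")
print(top_package_dir(module_file))
```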
resolve_pkg_root_and_module_name(path: Path, *, consider_namespace_packages: bool = False) -> tuple[Path, str]
Resolves and returns the package root directory and dotted module name for the given path, considering namespace packages optionally.
Raises `CouldNotResolvePathError` if resolution fails.
is_importable(module_name: str, module_path: Path) -> bool
Checks if a module with the given name can be imported normally and corresponds to the specified file path.
compute_module_name(root: Path, module_path: Path) -> str | None
Computes a dotted module name for `module_path` relative to `root`.
Directory Scanning and Walking
scandir(path: str | os.PathLike[str], sort_key: Callable[[os.DirEntry[str]], object] = lambda entry: entry.name) -> list[os.DirEntry[str]]
Scans a directory path and returns a sorted list of directory entries. Ignores entries that raise known ignorable errors.
Returns an empty list if the directory does not exist.
visit(path: str | os.PathLike[str], recurse: Callable[[os.DirEntry[str]], bool]) -> Iterator[os.DirEntry[str]]
Recursively walks directories breadth-first, yielding entries. The `recurse` predicate determines which directories to recurse into.
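The scan-then-recurse pattern described for `scandir` and `visit` can be sketched like this. `sorted_entries` and `walk` are hypothetical names, and the sketch only tolerates a missing directory rather than the full set of ignorable errors.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterator


def sorted_entries(path) -> list:
    """Return the directory's entries sorted by name, or [] if it does not exist."""
    try:
        with os.scandir(path) as it:
            entries = list(it)
    except FileNotFoundError:
        return []
    entries.sort(key=lambda entry: entry.name)
    return entries


def walk(path, recurse: Callable[[os.DirEntry], bool]) -> Iterator[os.DirEntry]:
    """Yield all entries of a directory first, then descend into accepted subdirs."""
    entries = sorted_entries(path)
    yield from entries
    for entry in entries:
        if entry.is_dir() and recurse(entry):
            yield from walk(entry.path, recurse)


tree = Path(tempfile.mkdtemp())
(tree / "a").mkdir()
(tree / "a" / "x.txt").write_text("")
(tree / "b.txt").write_text("")
print([entry.name for entry in walk(tree, lambda entry: True)])
# ['a', 'b.txt', 'x.txt']
```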
Path Utilities
absolutepath(path: str | os.PathLike[str]) -> Path
Returns an absolute `Path` computed with `os.path.abspath`, which is preferred over `Path.resolve()` or `Path.absolute()` due to subtle differences in behavior (for example, `Path.resolve()` follows symlinks).
commonpath(path1: Path, path2: Path) -> Path | None
Returns the common prefix path shared between `path1` and `path2`. Returns `None` if no common prefix or if one path is relative and the other is absolute.
bestrelpath(directory: Path, dest: Path) -> str
Computes a relative path from `directory` to `dest`, if possible. Returns `dest` as string if no relative path can be computed.
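The two helpers above fit together naturally: the relative path is built by climbing from `directory` to the common prefix and then descending to `dest`. The sketch below assumes that approach; `common_path` and `best_rel_path` are hypothetical names.

```python
import os
from pathlib import Path
from typing import Optional


def common_path(path1: Path, path2: Path) -> Optional[Path]:
    """Return the shared prefix, or None when the paths cannot be compared."""
    try:
        return Path(os.path.commonpath((str(path1), str(path2))))
    except ValueError:
        # Raised, for example, when mixing absolute and relative paths.
        return None


def best_rel_path(directory: Path, dest: Path) -> str:
    """Return dest relative to directory, or str(dest) if there is no common base."""
    if dest == directory:
        return os.curdir
    base = common_path(directory, dest)
    if base is None:
        return str(dest)
    # One ".." per component between directory and the common base.
    up = [os.pardir] * len(directory.relative_to(base).parts)
    return os.path.join(*up, *dest.relative_to(base).parts)


print(best_rel_path(Path("/a/b"), Path("/a/c/d")))
```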
safe_exists(p: Path) -> bool
Checks if a path exists, safely handling errors from paths that are too long or invalid (especially on Windows).
Important Implementation Details and Algorithms
Robust Directory Removal:
Robust Directory Removal: `rm_rf` uses a custom error handler, `on_rm_rf_error`, to overcome common permission and read-only file issues during recursive deletion, including adjusting permissions and retrying.
Numbered Directory Management: The module implements a system for creating directories named with a prefix and an incremented number suffix, maintaining a `prefixcurrent` symlink to the latest directory. Cleanup locks prevent concurrent deletion. Cleanup functions are registered with `atexit` to remove stale directories.
Dynamic Module Importing: The `import_path` function supports multiple import modes. The `importlib` mode uses `ModuleSpec` and custom loaders to import modules without modifying `sys.path`, allowing multiple modules with the same name in different locations. It carefully handles packages, namespace packages, and intermediate module insertion to maintain valid import states.
Windows Extended-Length Paths: To handle long paths on Windows, paths are converted to extended-length format (prefixed with `\\?\`) to bypass the 260-character limit.
Error Handling: Functions such as `scandir` and `on_rm_rf_error` include detailed handling of platform-specific errors, ignoring known benign errors but raising others.
Symlink Creation with Fallback: `symlink_or_skip` attempts to create a symlink and skips the test if symlinks are unsupported, improving test portability.
Interaction with Other Parts of the System
Pytest Integration:
Uses `_pytest.outcomes.skip` to skip tests if symlinks are unsupported.
Emits warnings using `_pytest.warning_types.PytestWarning`.
Registers cleanup functions with `atexit` to clean temporary test directories.
Uses `assert_never` from `_pytest.compat` to enforce exhaustive enum handling.
Standard Library Modules:
Extensively uses `pathlib.Path` for path abstraction.
Uses `importlib` utilities for dynamic module importing.
Uses `os` and `shutil` for filesystem operations.
Uses `fnmatch` for pattern matching.
Uses `uuid` to generate unique garbage directory names during cleanup.
Uses `contextlib.suppress` to ignore specific exceptions.
Visual Diagram: Class Diagram for ImportMode and Exception Classes
classDiagram
class ImportMode {
<<Enum>>
+prepend
+append
+importlib
}
class ImportPathMismatchError {
<<Exception>>
}
class CouldNotResolvePathError {
<<Exception>>
}
Usage Examples
Creating a numbered directory and cleaning old ones
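from pathlib import Path
from _pytest.pathlib import make_numbered_dir_with_cleanup  # private pytest module, not the standard-library pathlib
root = Path("/tmp/test-runs")
root.mkdir(parents=True, exist_ok=True)  # make sure the root exists before numbered directories are created in it
prefix = "run"
keep = 3  # number of most recent numbered directories to retain
lock_timeout = 60 * 60 * 24 * 3  # 3 days, in seconds
mode = 0o700
new_dir = make_numbered_dir_with_cleanup(root, prefix, keep, lock_timeout, mode)
print(f"Created new numbered directory: {new_dir}")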
Importing a module dynamically from a path
from pathlib import Path
from _pytest.pathlib import ImportMode, import_path  # private pytest module, not the standard-library pathlib
module_path = "/path/to/tests/test_example.py"
root = Path("/path/to")
mod = import_path(module_path, mode=ImportMode.importlib, root=root, consider_namespace_packages=True)
print(mod)
Removing a directory and handling read-only files
from pathlib import Path
from _pytest.pathlib import rm_rf  # private pytest module, not the standard-library pathlib
rm_rf(Path("/tmp/some_directory"))
Summary
The `pathlib.py` module enhances Python's filesystem and import capabilities with robust, platform-aware utilities designed to support dynamic testing environments. It manages filesystem paths, directories with locking and cleanup, and complex importing scenarios including namespace packages, and it offers safe recursive deletion and scanning. Its design carefully handles platform-specific quirks and integrates tightly with pytest for test isolation and cleanup.
This file is a critical utility component for managing filesystem state and dynamic code execution within the larger pytest testing framework.