Stepwise Test Execution

Purpose

Stepwise Test Execution makes large test suites cheaper to iterate on: the run stops at the first failing test, and the next run resumes from that test. Developers can fix failures one at a time without rerunning tests that already passed, which saves time and computational resources.

This subtopic covers the plugin that tracks the last failing test and deselects the tests collected before it. The execution order itself is not changed. The result is a "stop-on-failure" mode combined with skipping, in subsequent runs, of tests that passed before the recorded failure.

Functionality

The core functionality revolves around three command line options:

--stepwise (--sw): Enables the stepwise workflow: stop at the first failure and resume from that test on the next run.
--stepwise-skip (--sw-skip): Like --stepwise, but ignores the first failing test once and stops at the next failure instead.
--stepwise-reset (--sw-reset): Resets stepwise state and cache, restarting the workflow.
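
How the three flags relate can be sketched with the standard library. pytest registers options through its own parser, so argparse here is only a stand-in, and parse_stepwise_flags is a hypothetical helper. The rule that the skip and reset flags switch stepwise mode on is an assumption of this sketch, not something stated above.

```python
import argparse


def parse_stepwise_flags(argv):
    """Parse the three stepwise flags (hypothetical helper, not pytest's parser)."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--sw", "--stepwise", action="store_true", dest="stepwise")
    parser.add_argument(
        "--sw-skip", "--stepwise-skip", action="store_true", dest="stepwise_skip"
    )
    parser.add_argument(
        "--sw-reset", "--stepwise-reset", action="store_true", dest="stepwise_reset"
    )
    # Anything that is not a stepwise flag (test paths, other options) is ignored.
    options, _unknown = parser.parse_known_args(argv)
    # Assumption: skip and reset only make sense in stepwise mode, so they enable it.
    if options.stepwise_skip or options.stepwise_reset:
        options.stepwise = True
    return options
```

For example, parse_stepwise_flags(["--sw-skip"]) yields options with both stepwise and stepwise_skip set.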

Key workflows managed by the plugin include:

Initialization and Cache Loading:
On pytest startup, the plugin loads cached information about the last failed test, the number of tests in the last run, and the timestamp of the cache.
Modifying Test Collection:
During test collection, the plugin compares the collected tests with the cached data. If the last failed test is known, is still among the collected tests, and the test count is unchanged, the plugin deselects every test before it, so the run resumes at the failure point. Otherwise nothing is deselected and the full suite runs.
Test Failure Handling and Stopping:
When a test fails, the plugin records its node ID as the last failure and signals pytest to stop the session. If --stepwise-skip is enabled, the first failure is ignored once and the next failure stops the session.
Cache Updating:
After the test session, the plugin updates the cache with the latest failure information and timestamps, ensuring continuity of the stepwise workflow in subsequent runs.
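
The cached state described above can be modeled as a small record that is loaded at startup and written back at session end. This is a minimal sketch: StepwiseState, load_state, save_state, the "stepwise" key, and the dict used as a cache are all hypothetical stand-ins, and the real plugin stores its data through pytest's cache facility rather than a plain dict.

```python
import dataclasses
import json
from datetime import datetime
from typing import Optional


@dataclasses.dataclass
class StepwiseState:
    """Hypothetical model of the three cached values."""

    last_failed: Optional[str] = None      # node ID of the last failing test
    last_test_count: Optional[int] = None  # number of tests in the last run
    last_cache_date: str = ""              # ISO timestamp of the last update


def load_state(store, key="stepwise"):
    """Return the cached state, or an empty state if missing or unreadable."""
    raw = store.get(key)
    if raw is None:
        return StepwiseState()
    try:
        return StepwiseState(**json.loads(raw))
    except (TypeError, ValueError):
        # Corrupt or outdated cache contents: start over rather than fail.
        return StepwiseState()


def save_state(store, state, key="stepwise"):
    """Stamp the state with the current time and write it to the store."""
    state.last_cache_date = datetime.now().isoformat()
    store[key] = json.dumps(dataclasses.asdict(state))
```

Falling back to an empty state on unreadable data is a design choice of this sketch: a damaged cache then costs one full run instead of an error.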

An excerpt illustrating how tests are skipped before the last failure:

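# Find the position of the previously failed test among the collected items.
failed_index = None
for index, item in enumerate(items):
    if item.nodeid == self.cached_info.last_failed:
        failed_index = index
        break

# If it was found, drop everything before it and report those items as deselected.
if failed_index is not None:
    deselected = items[:failed_index]
    del items[:failed_index]
    config.hook.pytest_deselected(items=deselected)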
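
The same decision can be tried outside pytest on plain node ID strings. The following is a minimal sketch under the conditions described in this section (a known last failure and an unchanged test count); deselect_before_last_failure is a hypothetical helper, not part of the plugin.

```python
def deselect_before_last_failure(nodeids, last_failed, last_test_count):
    """Return (selected, deselected) node IDs for a resumed run."""
    nodeids = list(nodeids)
    # No recorded failure: run everything.
    if not last_failed:
        return nodeids, []
    # The suite changed size since the last run: cached position is unreliable.
    if last_test_count is not None and last_test_count != len(nodeids):
        return nodeids, []
    # The failed test no longer exists: run everything.
    if last_failed not in nodeids:
        return nodeids, []
    index = nodeids.index(last_failed)
    return nodeids[index:], nodeids[:index]
```

Returning both lists mirrors the excerpt, where the removed items are handed to the pytest_deselected hook so they are reported as deselected.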

And how failure triggers stopping:

if report.failed:
    # Remember the failing test and ask pytest to end the session.
    self.cached_info.last_failed = report.nodeid
    self.session.shouldstop = "Test failed, continuing from this test next run."
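
To see how the --stepwise-skip variant changes this, the failure handling can be simulated without pytest. This is a sketch with hypothetical names (StepwiseTracker, run): the skip flag is consumed by the first failure, and clearing the recorded failure when that test later passes is an assumption of the sketch.

```python
class StepwiseTracker:
    """Hypothetical stand-in for the plugin's failure bookkeeping."""

    def __init__(self, skip_first_failure=False, last_failed=None):
        self.skip = skip_first_failure
        self.last_failed = last_failed
        self.shouldstop = False

    def on_report(self, nodeid, failed):
        if failed:
            if self.skip:
                # Ignore this failure once; the next failure will stop the run.
                if nodeid == self.last_failed:
                    self.last_failed = None
                self.skip = False
            else:
                self.last_failed = nodeid
                self.shouldstop = "Test failed, continuing from this test next run."
        elif nodeid == self.last_failed:
            # Assumption: a previously failing test that now passes is forgotten.
            self.last_failed = None


def run(tracker, outcomes):
    """Feed (nodeid, failed) pairs to the tracker until it asks to stop."""
    executed = []
    for nodeid, failed in outcomes:
        executed.append(nodeid)
        tracker.on_report(nodeid, failed)
        if tracker.shouldstop:
            break
    return executed
```

With outcomes where the second and third tests fail, a plain tracker stops after the second test, while a tracker created with skip_first_failure=True runs through the third.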

Integration

Stepwise Test Execution is a specialized plugin extending pytest’s core test execution and caching infrastructure:

It hooks into Test Execution and Reporting by inspecting each test report and setting the session's stop flag when a test fails.
It uses the Cache & Rerun subsystem to persist state between runs, which is what makes incremental execution possible.
It interacts with Test Discovery and Collection by deselecting tests based on cached failure data. It does not reorder them.
It complements the Test Rerun and Stepwise Execution parent topic by providing the actual implementation of the "stop on failure and resume" behavior.
Under parallel test execution, worker nodes do not update the cache, which prevents race conditions.
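
The worker guard mentioned in the last point can be sketched as follows. pytest-xdist attaches a workerinput attribute to the config object in worker processes, and the check relies on that. finish_session, the dict used as a cache, and the "stepwise" key are hypothetical, and SimpleNamespace stands in for a real config object.

```python
from types import SimpleNamespace


def is_xdist_worker(config):
    """True when the config belongs to a pytest-xdist worker process."""
    return hasattr(config, "workerinput")


def finish_session(config, store, state):
    """Write the state to the store unless running in a worker.

    Returns True if the cache was written, False if the write was skipped.
    """
    if is_xdist_worker(config):
        # Several workers writing the same cache entry could race, so skip.
        return False
    store["stepwise"] = dict(state)
    return True


# Stand-ins for a controller config and a worker config.
controller = SimpleNamespace()
worker = SimpleNamespace(workerinput={"workerid": "gw0"})
```

Calling finish_session with the controller stand-in writes the entry, while the same call with the worker stand-in leaves the store untouched.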

Because it controls only when the session stops and which tests are deselected, Stepwise Test Execution is a narrow optimization that works alongside other pytest features such as fixtures, parametrization, and other plugins.

Diagram

flowchart TD
    Start[Test Session Start]
    LoadCache[Load Stepwise Cache]
    CollectTests[Test Collection]
    CheckCache[Check Last Failed Test]
    SkipPassed[Skip Tests Before Last Failure]
    RunTests[Run Tests]
    TestPass{Test Passed?}
    TestFail{Test Failed?}
    RecordFail[Record Failed Test & Stop Session]
    ContinueNext[Continue Next Test]
    UpdateCache[Update Cache After Session]
    End[Test Session End]

    Start --> LoadCache
    LoadCache --> CollectTests
    CollectTests --> CheckCache
    CheckCache -->|Last Failure Known| SkipPassed
    CheckCache -->|No Last Failure| RunTests
    SkipPassed --> RunTests
    RunTests --> TestPass
    TestPass -->|Yes| ContinueNext
    TestPass -->|No| TestFail
    TestFail --> RecordFail
    ContinueNext --> RunTests
    RunTests -->|All Tests Done| UpdateCache
    RecordFail --> UpdateCache
    UpdateCache --> End

This flowchart illustrates the process of loading failure info, skipping tests before the last failure, running tests until failure, recording that failure, and updating the cache for the next session.