sandbox_security_tests_full.py


Overview

sandbox_security_tests_full.py is a comprehensive automated test suite designed to validate the security and resource management capabilities of a sandboxed code execution environment. This environment is accessible via a REST API (SANDBOX_API_URL) that runs user-submitted code snippets in various programming languages (currently Python and Node.js).

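The test suite executes multiple predefined test cases concurrently. Each case verifies that the sandbox correctly handles one class of behavior reflected in the result statuses documented in this file, such as normal execution, program errors, resource-limit violations (time, memory, output), unauthorized operations, and runtime errors.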

Results are collected, validated against expectations, and summarized in a detailed test report.


File Components

Constants


Enumerations

These Enum classes classify possible sandbox execution outcomes and failure reasons.

ResultStatus (str, Enum)

Enumerates the possible high-level result statuses of the sandbox execution:

Member                     Description
SUCCESS                    Execution completed successfully.
PROGRAM_ERROR              User program raised an error.
RESOURCE_LIMIT_EXCEEDED    Program exceeded predefined resource limits (time, memory, output).
UNAUTHORIZED_ACCESS        Program attempted unauthorized operations.
RUNTIME_ERROR              Runtime exceptions or signals occurred.
PROGRAM_RUNNER_ERROR       Error in the sandbox runner itself or communication failure.
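To make the (str, Enum) pattern concrete, here is a minimal sketch of how such an enumeration could be declared. The member names are taken from the table above; the lowercase string values are an assumption, since this documentation does not list them.

```python
from enum import Enum


class ResultStatus(str, Enum):
    # Member names follow the documented table; the string values are assumed.
    SUCCESS = "success"
    PROGRAM_ERROR = "program_error"
    RESOURCE_LIMIT_EXCEEDED = "resource_limit_exceeded"
    UNAUTHORIZED_ACCESS = "unauthorized_access"
    RUNTIME_ERROR = "runtime_error"
    PROGRAM_RUNNER_ERROR = "program_runner_error"


# Subclassing str lets a raw string from a JSON response convert directly
# into a member, and lets members compare equal to plain strings.
status = ResultStatus("resource_limit_exceeded")
print(status is ResultStatus.RESOURCE_LIMIT_EXCEEDED)
print(status == "resource_limit_exceeded")
```

Mixing in str is a common choice when enum values travel through JSON, because the members serialize as ordinary strings.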

ResourceLimitType (str, Enum)

Specifies which resource limit was exceeded (time, memory, or output, per the RESOURCE_LIMIT_EXCEEDED description above):

UnauthorizedAccessType (str, Enum)

Specifies type of unauthorized access attempted:

RuntimeErrorType (str, Enum)

Specifies runtime error types:


Data Models

The following Pydantic models represent structured test results.

ExecutionResult (BaseModel)

Represents the detailed result of a single sandbox execution.

Field                       Type                                Description
status                      ResultStatus                        Overall result status.
stdout                      str                                 Captured standard output of the program.
stderr                      str                                 Captured standard error output.
exit_code                   int                                 Program exit code.
detail                      Optional[str]                       Additional details (e.g., limit type, error info).
resource_limit_type         Optional[ResourceLimitType]         Resource limit type if applicable.
unauthorized_access_type    Optional[UnauthorizedAccessType]    Unauthorized access type if applicable.
runtime_error_type          Optional[RuntimeErrorType]          Runtime error type if applicable.

TestResult (BaseModel)

Represents the outcome of a test case execution.

Field               Type                         Description
name                str                          Test case name.
passed              bool                         Whether the test passed validation.
duration            float                        Execution duration in seconds.
expected_failure    bool                         True if the test is expected to fail.
result              Optional[ExecutionResult]    Detailed execution result, if available.
error               Optional[str]                Request or execution error string.
validation_error    Optional[str]                Validation error message if test failed validation.
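The relationship between the two models (a TestResult optionally wraps one ExecutionResult) can be sketched as follows. This sketch deliberately uses standard-library dataclasses as a stand-in for the Pydantic BaseModel classes so that it runs without third-party packages. The None defaults, the plain str types for enum fields, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExecutionResult:
    status: str  # a ResultStatus member in the documented model
    stdout: str
    stderr: str
    exit_code: int
    detail: Optional[str] = None  # None defaults are assumed
    resource_limit_type: Optional[str] = None
    unauthorized_access_type: Optional[str] = None
    runtime_error_type: Optional[str] = None


@dataclass
class TestResult:
    name: str
    passed: bool
    duration: float
    expected_failure: bool
    result: Optional[ExecutionResult] = None
    error: Optional[str] = None
    validation_error: Optional[str] = None


# A test that expected the sandbox to stop the program, and it did.
execution = ExecutionResult(
    status="resource_limit_exceeded",  # assumed string value
    stdout="",
    stderr="",
    exit_code=1,                       # illustrative value only
    detail="time limit reached",       # illustrative value only
)
outcome = TestResult(
    name="hypothetical timeout case",
    passed=True,
    duration=5.2,
    expected_failure=True,
    result=execution,
)
print(outcome.result.status, outcome.error)
```

When the request itself fails, result would stay None and error would carry the message, which is why both fields are optional.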


Functions


encode_code(code: str) -> str

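Encodes the source code string as UTF-8 bytes and returns those bytes as a base64 string.

Parameters:

code (str): Source code to encode.

Returns:

str: Base64 text of the UTF-8 encoded source.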

Usage Example:

encoded = encode_code("print('Hello')")
print(encoded)
# Outputs base64 string representing the code
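For reference, a function with this signature can be written in one line using the standard library. This is a minimal sketch assuming standard (not URL-safe) base64; the suite's actual implementation is not shown in this documentation.

```python
import base64


def encode_code(code: str) -> str:
    # str -> UTF-8 bytes -> base64 bytes -> ASCII str
    return base64.b64encode(code.encode("utf-8")).decode("ascii")


encoded = encode_code("print('Hello')")
print(encoded)  # cHJpbnQoJ0hlbGxvJyk=

# Decoding reverses the steps and recovers the original source.
print(base64.b64decode(encoded).decode("utf-8"))  # print('Hello')
```

Base64 keeps quotes, newlines, and non-ASCII characters in the source from interfering with the JSON request body.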

execute_single_test(name: str, code: str, language: str, arguments: dict, expect_fail: bool = False) -> TestResult

Executes a single code snippet test case by sending it to the sandbox API and collecting its response.

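Parameters:

name (str): Test case name.
code (str): Source code to run in the sandbox.
language (str): Language of the snippet, for example "python".
arguments (dict): Arguments sent along with the code.
expect_fail (bool): Whether the program is expected to fail in the sandbox. Defaults to False.

Returns:

TestResult: The outcome of the test case.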

Usage Example:

result = execute_single_test(
    name="Test 1",
    code="def main(): return 1",
    language="python",
    arguments={},
    expect_fail=False,
)
print(result.passed)
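The request flow behind a call like this can be sketched roughly as below. The payload field names, the build_payload and run_case helpers, and the injected post callable are all hypothetical; the real request format of the sandbox API is not documented here. Injecting the transport keeps the sketch runnable without a live SANDBOX_API_URL.

```python
import base64
import time
from typing import Callable


def build_payload(code: str, language: str, arguments: dict) -> dict:
    """Hypothetical request body; the real field names may differ."""
    return {
        "code": base64.b64encode(code.encode("utf-8")).decode("ascii"),
        "language": language,
        "arguments": arguments,
    }


def run_case(name: str, code: str, language: str, arguments: dict,
             post: Callable[[dict], dict]) -> dict:
    """Send one case through `post`, time it, and capture transport errors."""
    started = time.monotonic()
    outcome = {"name": name, "result": None, "error": None}
    try:
        outcome["result"] = post(build_payload(code, language, arguments))
    except Exception as exc:  # e.g. connection refused or timeout
        outcome["error"] = str(exc)
    outcome["duration"] = time.monotonic() - started
    return outcome


def fake_post(payload: dict) -> dict:
    # Stand-in for an HTTP POST to the sandbox; returns a canned response.
    return {"status": "success", "stdout": "1", "stderr": "", "exit_code": 0}


def broken_post(payload: dict) -> dict:
    raise ConnectionError("sandbox unreachable")


ok = run_case("Test 1", "def main(): return 1", "python", {}, fake_post)
failed = run_case("Test 2", "def main(): return 1", "python", {}, broken_post)
print(ok["result"]["status"], ok["error"])
print(failed["result"], failed["error"])
```

Capturing the exception instead of raising it mirrors the documented TestResult shape, where a failed request leaves result empty and fills error.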

validate_test_result(name: str, expect_fail: bool, test_result: TestResult) -> None

Validates the sandbox execution result against the test expectations.

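Parameters:

name (str): Test case name.
expect_fail (bool): Whether the test is expected to fail.
test_result (TestResult): The result to validate.

Returns:

None. Because nothing is returned, the verdict is presumably recorded on test_result (its passed and validation_error fields).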
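One plausible validation rule is sketched below: a test passes when the observed outcome matches the expectation. The check_expectation helper, its messages, and the "success" status string are assumptions; the suite's actual rules may be stricter, for example by also checking the specific limit or access type.

```python
from typing import Optional, Tuple


def check_expectation(expect_fail: bool, status: Optional[str],
                      error: Optional[str]) -> Tuple[bool, Optional[str]]:
    """Return (passed, validation_error) for one executed test case."""
    if error is not None or status is None:
        return False, f"no execution result: {error}"
    succeeded = status == "success"  # assumed string value of SUCCESS
    if expect_fail and succeeded:
        return False, "expected the sandbox to stop the program, but it succeeded"
    if not expect_fail and not succeeded:
        return False, f"expected success, got {status}"
    return True, None


print(check_expectation(False, "success", None))
print(check_expectation(True, "unauthorized_access", None))
print(check_expectation(True, "success", None))
```

Note that a test with expect_fail set passes only when the sandbox blocks or fails the program, so a "successful" malicious snippet is reported as a test failure.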


get_test_cases() -> Dict[str, dict]

Returns a dictionary of predefined test cases with their source code, expected failure flag, language, and arguments.

Returns:

Dict[str, dict]: Test definitions keyed by test case name.

Usage Example:

tests = get_test_cases()
print(tests["7 Normal test: Python without dependencies"]["code"])
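The returned dictionary might look like the sketch below. Only the "code" key and the first test name appear in this documentation; the other key names, the snippet bodies, and the second test case are hypothetical and chosen to match the described contents (source code, expected failure flag, language, and arguments).

```python
from typing import Dict


def example_test_cases() -> Dict[str, dict]:
    """Illustrative stand-in for get_test_cases(); key names are assumed."""
    return {
        "7 Normal test: Python without dependencies": {
            "code": "def main():\n    return 1\n",
            "expect_fail": False,
            "language": "python",
            "arguments": {},
        },
        "hypothetical: unbounded loop": {
            "code": "def main():\n    while True:\n        pass\n",
            "expect_fail": True,  # the sandbox should stop this program
            "language": "python",
            "arguments": {},
        },
    }


cases = example_test_cases()
for name, case in cases.items():
    print(name, "->", "should fail" if case["expect_fail"] else "should succeed")
```

Keeping the cases as plain data makes it straightforward to add a new scenario without touching the execution or validation code.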

print_test_report(results: Dict[str, TestResult]) -> None

Prints a formatted summary report of all test results to the console.

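Parameters:

results (Dict[str, TestResult]): Test results keyed by test case name.

Returns:

None. The report is written to the console.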
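A summary report of this kind can be produced with a few lines of string formatting. The layout below, the Row stand-in type, and the format_report name are assumptions for illustration; the real report format is not shown in this documentation.

```python
from typing import Dict, NamedTuple, Optional


class Row(NamedTuple):
    # Minimal stand-in for the TestResult fields a report needs.
    passed: bool
    duration: float
    validation_error: Optional[str] = None


def format_report(results: Dict[str, Row]) -> str:
    lines = []
    for name, row in results.items():
        mark = "PASS" if row.passed else "FAIL"
        line = f"[{mark}] {name} ({row.duration:.2f}s)"
        if row.validation_error:
            line += f" - {row.validation_error}"
        lines.append(line)
    passed = sum(row.passed for row in results.values())
    lines.append(f"{passed}/{len(results)} tests passed")
    return "\n".join(lines)


report = format_report({
    "case A": Row(passed=True, duration=0.41),
    "case B": Row(passed=False, duration=5.0, validation_error="expected success"),
})
print(report)
```

Returning the text and printing it separately keeps the formatting easy to check in isolation.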


main() -> None

Entry point for the test suite.

Returns:

None.
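Since the suite runs its test cases concurrently, an entry point could fan the cases out to a worker pool along these lines. Whether the real main() uses threads, processes, or asyncio is not stated here, so the thread pool, the run_all helper, and the max_workers default are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict


def run_all(cases: Dict[str, dict], run_one: Callable[[str, dict], bool],
            max_workers: int = 4) -> Dict[str, bool]:
    """Run every case through `run_one` in parallel and collect the verdicts."""
    results: Dict[str, bool] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its test name so results can be keyed by it.
        futures = {pool.submit(run_one, name, case): name
                   for name, case in cases.items()}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


def fake_run(name: str, case: dict) -> bool:
    # Stand-in for sending the case to the sandbox and validating the result.
    return case["expect_fail"] is False


verdicts = run_all(
    {"a": {"expect_fail": False}, "b": {"expect_fail": True}},
    fake_run,
)
print(sorted(verdicts.items()))
```

Threads suit this workload because each test spends most of its time waiting on an HTTP response rather than using the CPU.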


Implementation Details & Algorithms


Interaction With Other System Components

This file is typically run as a standalone test executable but can be integrated into CI/CD pipelines or monitoring systems to continuously verify sandbox security properties.


Visual Diagram

classDiagram
    class ExecutionResult {
        +ResultStatus status
        +str stdout
        +str stderr
        +int exit_code
        +Optional[str] detail
        +Optional[ResourceLimitType] resource_limit_type
        +Optional[UnauthorizedAccessType] unauthorized_access_type
        +Optional[RuntimeErrorType] runtime_error_type
    }

    class TestResult {
        +str name
        +bool passed
        +float duration
        +bool expected_failure
        +Optional[ExecutionResult] result
        +Optional[str] error
        +Optional[str] validation_error
    }

    class SandboxSecurityTests {
        +execute_single_test(name, code, language, arguments, expect_fail) TestResult
        +validate_test_result(name, expect_fail, test_result) void
        +get_test_cases() Dict[str, dict]
        +print_test_report(results) void
        +main() void
        -encode_code(code) str
    }

    TestResult o-- ExecutionResult : result
    SandboxSecurityTests ..> ExecutionResult
    SandboxSecurityTests ..> TestResult

Summary

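sandbox_security_tests_full.py is an automated test suite that programmatically submits diverse code snippets to a sandbox execution environment and verifies that the sandbox reports each outcome correctly, from successful runs to program errors, resource-limit violations, unauthorized operations, and runtime errors.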

Its concurrency, retry logic, and detailed validation make it suitable for continuous testing and regression detection in a sandbox security context.
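The retry logic mentioned above is not described in detail in this documentation. A common pattern for it is retrying with exponential backoff, sketched here under assumed parameters (three attempts, 0.5 s base delay); the with_retries helper is hypothetical and not taken from the suite.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], attempts: int = 3,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `call` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Demonstration with a call that fails twice before succeeding. The delays
# are recorded instead of slept so the example finishes instantly.
calls = {"count": 0}
delays: List[float] = []


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


value = with_retries(flaky, sleep=delays.append)
print(value, calls["count"], delays)
```

Retrying only transport-level failures, and not sandbox verdicts, avoids masking a genuine security regression behind a second attempt.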

