wcwidth.py

Overview

The `wcwidth.py` module provides utility functions to measure the display width of Unicode characters and strings in terminal environments. This is critical for applications that require precise alignment of text output, such as command-line interfaces, terminal user interfaces, and text editors.

The module exposes two main functions: `wcwidth(c)`, which measures a single character, and `wcswidth(s)`, which measures a whole string.

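Widths are returned according to terminal display conventions: -1 for a non-printable (control) character, 0 for zero-width characters such as combining marks, 1 for ordinary narrow characters, and 2 for East Asian Wide or Fullwidth characters.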

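The module classifies each character by its Unicode properties (control, combining mark, East Asian Wide/Fullwidth) and caches `wcwidth` results with `lru_cache(100)`, so repeated lookups of the same character are cheap.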


Detailed Explanation

Function: wcwidth

@lru_cache(100)
def wcwidth(c: str) -> int:
    """Determine how many columns are needed to display a character in a terminal.

    Returns -1 if the character is not printable.
    Returns 0, 1 or 2 for other characters.
    """

**Purpose:** Calculates the number of terminal column cells required to display the character `c`.

**Parameters:** `c` (`str`): the character to measure.

**Returns:** An `int`: -1 if the character is not printable, otherwise 0, 1 or 2 columns.

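**Implementation details:** The checks run in a fixed order, as the flowchart in the Mermaid Diagram section shows: printable ASCII returns 1 immediately; special zero-width characters return 0; control characters return -1; combining marks return 0; East Asian Wide or Fullwidth characters return 2; everything else returns 1. The `@lru_cache(100)` decorator caches results for up to 100 distinct characters.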
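The decision order shown in the flowchart in the Mermaid Diagram section can be sketched roughly as follows. This is a minimal sketch, not the module's actual source: the name `sketch_wcwidth`, the use of `unicodedata` for the property lookups, the choice of categories `Mn` and `Me` for "combining mark", and the small set of zero-width code points are all assumptions made for illustration.

```python
import unicodedata
from functools import lru_cache

# Assumption: an illustrative set of zero-width code points (ZWSP, ZWNJ, ZWJ).
# The module's real list is not given in this document.
ASSUMED_ZERO_WIDTH = {0x200B, 0x200C, 0x200D}


@lru_cache(100)
def sketch_wcwidth(c: str) -> int:
    """Hypothetical re-creation of the flowchart's decision order."""
    o = ord(c)
    # 1. Printable ASCII takes exactly one column.
    if 0x20 <= o < 0x7F:
        return 1
    # 2. Special zero-width characters (assumed set).
    if o in ASSUMED_ZERO_WIDTH:
        return 0
    category = unicodedata.category(c)
    # 3. Control characters are not printable.
    if category == "Cc":
        return -1
    # 4. Combining marks occupy no column of their own (assumed: Mn and Me).
    if category in ("Mn", "Me"):
        return 0
    # 5. East Asian Wide (W) and Fullwidth (F) characters take two columns.
    if unicodedata.east_asian_width(c) in ("W", "F"):
        return 2
    # 6. Everything else is treated as a single column.
    return 1
```

Putting the ASCII test first means the most common input skips the Unicode database lookups entirely.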

**Usage example:**

print(wcwidth('a'))     # Output: 1
print(wcwidth('あ'))    # Output: 2 (Hiragana character)
print(wcwidth('\u0301')) # Output: 0 (Combining acute accent)
print(wcwidth('\x07'))   # Output: -1 (Bell control character)

Function: wcswidth

def wcswidth(s: str) -> int:
    """Determine how many columns are needed to display a string in a terminal.

    Returns -1 if the string contains non-printable characters.
    """

**Purpose:** Computes the total terminal column width required to display the entire string `s`.

**Parameters:** `s` (`str`): the string to measure.

**Returns:** An `int`: the total number of columns, or -1 if the string contains any non-printable character.

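**Implementation details:** The string is first normalized to NFC, then `wcwidth` is called on each character in turn. The first negative result makes the function return -1 at once; otherwise the per-character widths are summed.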
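The string-level loop from the flowchart (normalize, measure each character, stop at the first negative width) might look like the sketch below. `sketch_wcswidth` and the simplified `char_width` helper are hypothetical names; `char_width` is a reduced stand-in for `wcwidth`, not the module's own code.

```python
import unicodedata


def char_width(c: str) -> int:
    """Simplified stand-in for wcwidth (an assumption, not the module's code)."""
    category = unicodedata.category(c)
    if category == "Cc":            # control character: not printable
        return -1
    if category in ("Mn", "Me"):    # combining mark: zero columns
        return 0
    return 2 if unicodedata.east_asian_width(c) in ("W", "F") else 1


def sketch_wcswidth(s: str) -> int:
    """Sum per-character widths, returning -1 on the first non-printable one."""
    width = 0
    for c in unicodedata.normalize("NFC", s):
        w = char_width(c)
        if w < 0:
            return -1               # one bad character invalidates the string
        width += w
    return width
```

Returning early on the first negative width means the function never reports a partial total for a string that cannot be displayed cleanly.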

**Usage example:**

print(wcswidth("hello"))          # Output: 5
print(wcswidth("コンニチハ"))       # Output: 10 (Each character width 2)
print(wcswidth("a\u0301"))        # Output: 1 ('a' + combining acute accent)
print(wcswidth("hello\x07world")) # Output: -1 (Bell character inside string)
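The third example depends on the NFC normalization step. The short sketch below, which uses only the standard `unicodedata` module, shows what normalization does to a base letter followed by a combining accent. The variable names are illustrative.

```python
import unicodedata

# 'a' + COMBINING ACUTE ACCENT composes into the single code point U+00E1.
decomposed = "a\u0301"
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed))      # 2 1
print(composed == "\u00e1")                # True

# 'q' + COMBINING ACUTE ACCENT has no precomposed form, so NFC leaves two
# code points; the second one is still a nonspacing combining mark (Mn).
uncomposed = unicodedata.normalize("NFC", "q\u0301")
print(len(uncomposed))                     # 2
print(unicodedata.category(uncomposed[1])) # Mn
```

In both cases the expected display width is 1: either a single narrow character, or a narrow character plus a zero-width mark.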

Important Implementation Details and Algorithms
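The `@lru_cache(100)` decorator on `wcwidth` passes 100 as the cache's `maxsize`, so results for up to 100 distinct characters are kept and repeated characters are answered from the cache. The sketch below shows this behavior with a hypothetical stand-in function, `width_stub`, which is not part of the module.

```python
from functools import lru_cache


@lru_cache(100)  # same decorator call as on wcwidth; the argument is maxsize
def width_stub(c: str) -> int:
    """Hypothetical stand-in: every character counts as one column."""
    return 1


# Measure an 11-character string that contains only 5 distinct characters.
for ch in "hello hello":
    width_stub(ch)

info = width_stub.cache_info()
print(info.maxsize, info.misses, info.hits)  # 100 5 6
```

Terminal text tends to reuse a small set of characters, which is why a small per-character cache pays off.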


Interaction with Other System Components
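A typical consumer of these functions is code that pads text to a fixed number of terminal columns. As a hedged illustration of that kind of caller, the sketch below pads by display width rather than by `len`. Both `pad_to_width` and the stand-in measure `toy_width` are hypothetical and not part of `wcwidth.py`; a real caller would pass `wcswidth` as the `measure` argument.

```python
import unicodedata
from typing import Callable


def toy_width(s: str) -> int:
    """Hypothetical stand-in for wcswidth: -1 on control characters,
    2 columns for East Asian Wide/Fullwidth characters, 1 otherwise."""
    total = 0
    for c in s:
        if unicodedata.category(c) == "Cc":
            return -1
        total += 2 if unicodedata.east_asian_width(c) in ("W", "F") else 1
    return total


def pad_to_width(text: str, columns: int, measure: Callable[[str], int]) -> str:
    """Right-pad text with spaces until it occupies `columns` terminal cells."""
    width = measure(text)
    if width < 0:
        # Mirror the -1 convention: refuse text that cannot be measured.
        raise ValueError("text contains non-printable characters")
    return text + " " * max(columns - width, 0)


for label in ("ab", "日本"):
    print(pad_to_width(label, 6, toy_width) + "|")
```

With `str.ljust(6)`, the second label would receive four spaces and occupy eight columns, because `len("日本")` is 2 while its display width is 4.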


Mermaid Diagram

Below is a flowchart representing the main functions and their relationships in `wcwidth.py`.

flowchart TD
    A[Start: Input character or string]
    A -->|single character| B
    A -->|string| L

    subgraph Single Character Width
        direction TB
        B["wcwidth(c)"]
        B --> C{Is c ASCII printable?}
        C -- Yes --> D[Return 1]
        C -- No --> E{Is c zero-width special char?}
        E -- Yes --> F[Return 0]
        E -- No --> G{Is c control character?}
        G -- Yes --> H[Return -1]
        G -- No --> I{Is c combining mark?}
        I -- Yes --> F
        I -- No --> J{Is c East Asian Wide/Fullwidth?}
        J -- Yes --> K[Return 2]
        J -- No --> D
    end

    subgraph String Width Calculation
        direction TB
        L["wcswidth(s)"]
        L --> M[Normalize s with NFC]
        M --> N[For each character c in s]
        N --> B
        B --> O{"wcwidth(c) >= 0?"}
        O -- No --> P[Return -1]
        O -- Yes --> Q[Accumulate total width]
        Q --> R{More characters?}
        R -- Yes --> N
        R -- No --> S[Return total width]
    end

Summary

The `wcwidth.py` module provides Unicode-aware functions that measure how many terminal columns a character or string occupies. It classifies characters by their Unicode properties and caches per-character results, which supports terminal text layout and alignment tasks. Its two-function interface and reliance on Python standard libraries make it easy to integrate into terminal-based applications requiring precise text formatting.