regions.py

Overview

The regions.py file provides a hierarchical mapping and utility functions for working with Chinese administrative regions and some international locations. It maintains a large dictionary (TBL) that maps region identifiers (IDs) to their names and parent regions, which defines a tree of administrative divisions. The file includes helper functions to retrieve the hierarchical name chain for a region ID (get_names) and to check whether a string is a known region name (isName).

This file is primarily used for managing and querying geographic region information, useful for applications involving location data, regional analysis, or address normalization.


Data Structures

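TBL (dict)

Maps each region ID to that region's name and the ID of its parent region, so the entries together form a tree of administrative divisions.

NM_SET (set)

A set of region names, used by isName for fast membership tests.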
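
As a rough illustration of how the two structures relate, the sketch below builds a three-entry table and derives a name set from it. The entry keys (name, parent), the sample IDs, and the idea that the name set is derived from the table are assumptions made for illustration, not details taken from regions.py.

```python
# Hypothetical miniature of the region table. The IDs, the key names
# ("name", "parent"), and the entries are illustrative assumptions.
SAMPLE_TBL = {
    "1": {"name": "中国", "parent": "0"},
    "2": {"name": "北京", "parent": "1"},
    "3": {"name": "海淀区", "parent": "2"},
}

# A set of all names gives constant-time membership checks on average.
SAMPLE_NM_SET = {entry["name"] for entry in SAMPLE_TBL.values()}

print("北京" in SAMPLE_NM_SET)
print("上海" in SAMPLE_NM_SET)
```

Keeping names in a set avoids scanning every table entry when only a yes/no answer is needed.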


Functions

get_names(id)

Retrieve the hierarchical name chain for a given region ID.
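
A minimal sketch of such a lookup, assuming a table shaped like the hypothetical one in the code below: it validates the ID with a regular expression, then follows parent links until it leaves the table. The function name sketch_get_names, the key names, the most-specific-first ordering, and the empty-list result for invalid IDs are all assumptions, not the module's confirmed behavior.

```python
import re

# Hypothetical sample data; the real TBL is much larger and may differ in shape.
SAMPLE_TBL = {
    "1": {"name": "中国", "parent": "0"},
    "2": {"name": "北京", "parent": "1"},
    "3": {"name": "海淀区", "parent": "2"},
}


def sketch_get_names(region_id):
    """Return region names from the given ID up to the root (assumed order)."""
    region_id = str(region_id).strip()
    # Assumption: IDs are purely numeric strings; anything else yields no names.
    if not re.fullmatch(r"[0-9]+", region_id):
        return []
    names = []
    seen = set()  # guards against accidental cycles in the parent links
    while region_id in SAMPLE_TBL and region_id not in seen:
        seen.add(region_id)
        entry = SAMPLE_TBL[region_id]
        names.append(entry["name"])
        region_id = entry["parent"]
    return names


print(sketch_get_names("3"))  # ['海淀区', '北京', '中国']
```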


isName(nm)

Check if a given string corresponds to a known region name, accounting for common suffix variations.
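
One way a suffix-tolerant check might look is sketched below. The sample name set, the suffix pattern, the fallback of appending 市, and the function name sketch_is_name are assumptions chosen for illustration; the suffixes that the real isName handles are not listed in this document.

```python
import re

# Hypothetical name set; the real NM_SET holds far more entries.
SAMPLE_NM_SET = {"北京", "新疆", "浙江", "杭州市"}

# Assumed suffixes: province, city, and autonomous-region endings.
SUFFIX_RE = re.compile(r"(省|市|(维吾尔|壮族|回族)?自治区)$")


def sketch_is_name(name):
    """Return True if name, or a suffix variant of it, is in the sample set."""
    if name in SAMPLE_NM_SET:
        return True
    # Try the name with a trailing administrative suffix removed.
    stripped = SUFFIX_RE.sub("", name)
    if stripped != name and stripped in SAMPLE_NM_SET:
        return True
    # Assumption: also accept a short form whose full form ends in 市.
    return name + "市" in SAMPLE_NM_SET


print(sketch_is_name("新疆维吾尔自治区"))  # True
print(sketch_is_name("不存在的地方"))      # False
```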


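Implementation Details and Algorithms

get_names validates the region ID with a regular expression and looks it up in TBL. isName tests membership in NM_SET and uses a regular expression to remove common suffixes before matching.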


Interaction with Other Parts of the System/Application


Diagram: Class/Function Structure

Since this file is a utility module without classes, the following flowchart shows the relationships between main functions and data structures.

flowchart TD
    TBL["TBL (Region data dict)"]
    NM_SET["NM_SET (Set of region names)"]
    
    get_names["get_names(id)"]
    isName["isName(nm)"]
    
    subgraph Data
      TBL
      NM_SET
    end
    
    subgraph Functions
      get_names
      isName
    end
    
    get_names -->|Uses| TBL
    isName -->|Uses| NM_SET
    isName -->|Uses regex for suffix removal| re[/"re module"/]
    get_names -->|Uses regex for ID validation| re

Summary

regions.py provides a comprehensive hierarchical dataset of Chinese and some international regions and utility functions to query and validate region names. It is optimized for quick membership tests and hierarchical lookups, supporting applications that require geographic region management or normalization.


Example Usage Summary

import regions

# Get full name hierarchy for a region ID
names = regions.get_names("33")
print(names)  # Example output: ['北京市', '北京', '1'] (actual values depend on the entries in TBL)

# Check if a string is a valid region name
print(regions.isName("新疆维吾尔自治区"))  # Output: True
print(regions.isName("不存在的地方"))      # Output: False
