industries.py
Overview
The industries.py file serves as a centralized data repository and utility for managing a hierarchical classification of industries. It contains a detailed dictionary mapping industry identifiers (IDs) to their corresponding industry names and parent industry IDs, enabling representation of industry categories in a tree-like structure.
The primary functionality provided is the ability to retrieve a list of industry names tracing the lineage of a given industry ID up through the hierarchy to the root category. This can be useful in applications requiring categorization, filtering, or display of industry-related data with contextual parent industry information.
Detailed Explanation
Industry Data Table (TBL)
TBL is a Python dictionary where:
Key: Industry ID as a string (e.g., "1", "1119").
Value: Another dictionary with two keys:
"name": The name of the industry (string, mostly in Chinese).
"parent": The parent industry ID (string), or "0" indicating a root-level industry.
This dictionary encodes a large hierarchy of industries and sub-industries, allowing traversal from any industry node up to the root node by following the "parent" IDs.
Example entry:
"1119": {"name": "牙科及医疗器械", "parent": "646"}
This means the industry with ID "1119" is named "牙科及医疗器械" (Dental and Medical Devices), and its parent industry has ID "646".
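To make the shape of the table concrete, here is a minimal sketch using a made-up three-entry table. SAMPLE_TBL and its IDs and names are hypothetical and are not entries from the real TBL; the sketch also assumes that the root marker "0" has no entry of its own.

```python
# Hypothetical sample data: these IDs and names are invented for illustration
# and do not come from the real TBL.
SAMPLE_TBL = {
    "10": {"name": "Manufacturing", "parent": "0"},   # root-level: parent is "0"
    "11": {"name": "Electronics", "parent": "10"},
    "12": {"name": "Semiconductors", "parent": "11"},
}

# One step up the hierarchy: read the entry, then look up its parent ID.
entry = SAMPLE_TBL["12"]
parent_entry = SAMPLE_TBL[entry["parent"]]

print(entry["name"])         # Semiconductors
print(parent_entry["name"])  # Electronics

# Assumption: "0" is only a marker and has no entry of its own.
print("0" in SAMPLE_TBL)     # False
```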
Function: get_names(id)
def get_names(id):
    """
    Retrieve the list of industry names starting from the given industry ID,
    tracing up the hierarchy to the root.

    Parameters:
        id (int or str): The industry ID to look up.

    Returns:
        list of str: Ordered list of industry names, starting from the given industry,
        followed by its parent industries up to the root.
        Returns an empty list if the ID is not found.

    Usage example:
        >>> get_names("1119")
        ['牙科及医疗器械', '医疗器械', '医药', '金融']
    """
    # Keys in TBL are strings, so normalize integer IDs first.
    id = str(id)
    nms = []
    d = TBL.get(id)
    if not d:
        # Unknown ID (including the root marker "0"): nothing to add.
        return []
    nms.append(d["name"])
    # Recurse into the parent and append its lineage after the current name.
    p = get_names(d["parent"])
    if p:
        nms.extend(p)
    return nms
Description
The function get_names performs a recursive lookup:
Converts the input id to a string to ensure key compatibility.
Attempts to find the dictionary entry for the id in TBL.
If the entry is not found, returns an empty list.
If found, appends the current industry's name to a list.
Recursively calls itself with the parent industry's ID.
Extends the list with the parent's lineage if it exists.
Returns the complete list of names from the given industry up to the root.
Parameters
id (int or str): Industry identifier to look up.
Returns
list[str]: List of industry names ordered from the specified industry upward.
Usage Example
>>> print(get_names("1119"))
['牙科及医疗器械', '医疗器械', '医药', '金融']
This example shows the lineage for industry ID "1119".
Script Execution
The file includes an entry point to demonstrate the functionality when run as a standalone script:
if __name__ == "__main__":
    print(get_names("1119"))
This will print the industry lineage for the ID "1119".
Implementation Details and Algorithms
Recursive Traversal: The industry hierarchy is navigated recursively by moving from a child industry to its parent. The recursion ends when a parent ID has no entry in TBL, which is how the root marker "0" terminates the walk.
Data Structure: The use of a dictionary (TBL) allows O(1) average time complexity for retrieving industry data by ID.
Lineage Construction: The function builds the lineage list by first adding the current node, then recursively appending the parent's lineage, resulting in a list ordered from the child up to the root.
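The same child-to-root walk can also be written as a loop. The sketch below is an alternative for comparison, not the module's implementation: get_names_iterative, its table parameter, and SAMPLE_TBL are hypothetical names, and the seen set is an added guard that stops the walk if the data ever contained a parent cycle.

```python
def get_names_iterative(industry_id, table):
    """Return names from industry_id up to the root by following parent IDs."""
    names = []
    seen = set()                 # added guard: stop if a parent cycle appears
    current = str(industry_id)   # table keys are strings
    while current in table and current not in seen:
        seen.add(current)
        entry = table[current]
        names.append(entry["name"])
        current = entry["parent"]
    return names


# Hypothetical sample data, not entries from the real TBL.
SAMPLE_TBL = {
    "10": {"name": "Manufacturing", "parent": "0"},
    "11": {"name": "Electronics", "parent": "10"},
    "12": {"name": "Semiconductors", "parent": "11"},
}

print(get_names_iterative(12, SAMPLE_TBL))
# ['Semiconductors', 'Electronics', 'Manufacturing']
print(get_names_iterative("999", SAMPLE_TBL))
# []
```

Each step is a single dictionary lookup, so the cost of one call grows with the depth of the industry in the hierarchy rather than with the size of the table.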
Interaction with Other System Components
This file acts as a utility module providing industry classification data and lookup functionality.
It can be imported by other modules that require:
Industry name resolution.
Industry hierarchy lineage for classification tasks.
Display or filtering logic based on industry categories.
The hierarchical data can assist in analytics, reporting, or user interface components that present industry information.
Given that it contains only a data dictionary and a single utility function, this module is likely a foundational component used across multiple parts of an application dealing with industry-related data.
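As an illustration of the display use case, a consumer could turn the lineage list into a root-first breadcrumb string. This is a sketch only: format_breadcrumb and its separator parameter are hypothetical and not part of industries.py, and the input list is the lineage shown in the usage example for ID "1119".

```python
def format_breadcrumb(names, separator=" > "):
    """Join a child-to-root lineage into a root-first breadcrumb string."""
    # get_names returns child first, so reverse to show the root first.
    return separator.join(reversed(names))


# Lineage as documented for get_names("1119").
lineage = ['牙科及医疗器械', '医疗器械', '医药', '金融']
print(format_breadcrumb(lineage))
# 金融 > 医药 > 医疗器械 > 牙科及医疗器械
```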
Visual Diagram
The following Mermaid class diagram illustrates the structure of the industries.py file, focusing on the key data structure and function:
classDiagram
    class industries {
        <<module>>
        +TBL: dict[str, dict{name: str, parent: str}]
        +get_names(id: int|str) list~str~
    }
    industries : +get_names() - recursively retrieves industry lineage
    industries : +TBL - industry data mapping
Summary
Purpose: Provides a comprehensive industry hierarchy dictionary and a utility function to retrieve industry name lineages.
Data: TBL dictionary encodes industries with parent-child relationships.
Functionality: get_names(id) recursively returns the list of industry names from the given ID to the root.
Use Cases: Industry classification, data categorization, UI display, analytics.
Implementation: Efficient dictionary lookups combined with recursive traversal.
Extensibility: The data can be expanded or localized, and additional utility functions could be added for more complex queries, such as retrieving children or siblings.
This module is a key reference for any component needing structured industry information within the system.
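The children and siblings queries mentioned under Extensibility could be sketched as follows. Nothing here exists in industries.py: build_children_index, get_siblings, and SAMPLE_TBL are hypothetical names, and the sample entries are invented for illustration.

```python
from collections import defaultdict


def build_children_index(table):
    """Map each parent ID to the list of IDs that name it as their parent."""
    children = defaultdict(list)
    for industry_id, entry in table.items():
        children[entry["parent"]].append(industry_id)
    return dict(children)


def get_siblings(industry_id, table, children_index):
    """Return the IDs that share a parent with industry_id, excluding itself."""
    industry_id = str(industry_id)
    entry = table.get(industry_id)
    if not entry:
        return []
    return [i for i in children_index.get(entry["parent"], []) if i != industry_id]


# Hypothetical sample data, not entries from the real TBL.
SAMPLE_TBL = {
    "10": {"name": "Manufacturing", "parent": "0"},
    "11": {"name": "Electronics", "parent": "10"},
    "12": {"name": "Semiconductors", "parent": "11"},
    "13": {"name": "Displays", "parent": "11"},
}

index = build_children_index(SAMPLE_TBL)
print(index["11"])                            # ['12', '13']
print(get_siblings(12, SAMPLE_TBL, index))    # ['13']
```

Building the index once takes a single pass over the table; after that, each children or siblings query only touches the entries under one parent.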