index.tsx
Overview
This file exports a React functional component named ChunkMethodModal. It is a modal dialog interface designed for configuring document parsing methods within a knowledge base or document management system. The modal allows users to select a document parser type and customize various parser configurations such as page ranges, token limits, layout recognition, entity types, and additional parsing-related options.
ChunkMethodModal adapts its form fields dynamically based on the selected parser type and document extension, providing an intuitive and context-sensitive UI for managing document chunking and parsing strategies.
Detailed Explanation
Component: ChunkMethodModal
Description
ChunkMethodModal is a React functional component that renders a modal dialog with a form. This form enables users to configure chunking and parsing methods for documents. It integrates with hooks and other sub-components to fetch parser lists, handle form state, and manage parser-specific configurations.
Props Interface: IProps
interface IProps extends Omit<IModalManagerChildrenProps, 'showModal'> {
  loading: boolean;
  onOk: (
    parserId: DocumentParserType | undefined,
    parserConfig: IChangeParserConfigRequestBody,
  ) => void;
  showModal?(): void;
  parserId: DocumentParserType;
  parserConfig: IParserConfig;
  documentExtension: string;
  documentId: string;
}
loading (boolean): Indicates whether the form submission is in progress; used for disabling the modal's confirm button.
onOk ((parserId, parserConfig) => void): Callback invoked when the user confirms the modal. It receives the selected parser type and the updated parser configuration.
showModal (() => void, optional): Function to trigger showing the modal (not used internally here).
parserId (DocumentParserType): The currently selected parser type.
parserConfig (IParserConfig): The current configuration object for the parser.
documentExtension (string): The file extension of the document being parsed (e.g., 'pdf', 'xlsx').
documentId (string): Identifier of the document to be parsed.
Internal Constants and Variables
hidePagesChunkMethods (DocumentParserType[]): An array of parser types for which page range selection is hidden.
form: Ant Design Form instance that manages form state and validation.
parserList, handleChange, selectedTag: Returned from the useFetchParserListOnMount hook to provide parser options and manage parser selection.
t: Translation function from useTranslate.
knowledgeDetails: Data fetched from the knowledge base configuration, used to determine feature flags like useGraphRag.
useGraphRag: Boolean indicating if the Graph RAG (Retrieval-Augmented Generation) feature is enabled.
isPdf: Boolean, true if the document extension is 'pdf'.
showPages, showOne, showMaxTokenNumber, showEntityTypes, showExcelToHtml: Booleans controlling conditional rendering of form sections based on parser type and document extension.
showAutoKeywords: Function determining if auto keyword features should be shown.
Key Methods
handleOk
const handleOk = async () => {
  const values = await form.validateFields();
  const parser_config = {
    ...values.parser_config,
    pages: values.pages?.map((x: any) => [x.from, x.to]) ?? [],
  };
  onOk(selectedTag, parser_config);
};
Validates the form fields.
Extracts the page ranges from the form and converts each { from, to } entry into a [from, to] pair.
Calls the parent onOk callback with the selected parser ID and the new parser configuration.
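The page-range conversion inside handleOk can be looked at in isolation. The following is a minimal sketch of that one step as a pure function; the names PageRangeField and toPageTuples are hypothetical and do not appear in the component.

```typescript
// Shape of one row in the form's page-range list, as read by handleOk.
// (Hypothetical name; the component itself types these rows as `any`.)
interface PageRangeField {
  from: number;
  to: number;
}

// Converts form rows into [from, to] pairs; a missing list becomes [].
function toPageTuples(pages?: PageRangeField[]): [number, number][] {
  return pages?.map((x): [number, number] => [x.from, x.to]) ?? [];
}

const tuples = toPageTuples([
  { from: 1, to: 10 },
  { from: 20, to: 30 },
]);
console.log(JSON.stringify(tuples)); // [[1,10],[20,30]]
console.log(JSON.stringify(toPageTuples(undefined))); // []
```

Falling back to an empty array means the parent callback always receives a pages field, even when the page-range section is not rendered.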
Effects
Form Initialization on Modal Open
useEffect(() => {
  if (visible) {
    const pages =
      parserConfig?.pages?.map((x) => ({ from: x[0], to: x[1] })) ?? [];
    form.setFieldsValue({
      pages: pages.length > 0 ? pages : [{ from: 1, to: 1024 }],
      parser_config: {
        ...omit(parserConfig, 'pages'),
        graphrag: {
          use_graphrag: get(
            parserConfig,
            'graphrag.use_graphrag',
            useGraphRag,
          ),
        },
      },
    });
  }
}, [
  form,
  knowledgeDetails.parser_config,
  parserConfig,
  useGraphRag,
  visible,
]);
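Populates the form fields whenever the modal becomes visible, and again if any listed dependency changes while it is open.
Maps existing page ranges from parserConfig to the form structure, falling back to a single range of 1 to 1024 when none are stored.
Initializes the use_graphrag flag from the current config, or from the knowledge base setting when the config does not define it.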
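The value passed to form.setFieldsValue can be sketched as a pure function, which makes the fallbacks easier to see. This is an illustration only: ParserConfigSketch is a simplified stand-in for IParserConfig, buildInitialValues is a hypothetical name, and chunk_token_num in the usage line is a made-up field. Lodash is replaced with plain TypeScript so the block is self-contained.

```typescript
// Simplified stand-in for IParserConfig; only the fields used here.
interface ParserConfigSketch {
  pages?: [number, number][];
  graphrag?: { use_graphrag?: boolean };
  [key: string]: unknown;
}

interface InitialFormValues {
  pages: { from: number; to: number }[];
  parser_config: Record<string, unknown>;
}

function buildInitialValues(
  parserConfig: ParserConfigSketch | undefined,
  useGraphRagDefault: boolean,
): InitialFormValues {
  const source: ParserConfigSketch = parserConfig ?? {};
  const pages = source.pages?.map(([from, to]) => ({ from, to })) ?? [];
  // Plays the role of omit(parserConfig, 'pages').
  const { pages: _ignored, ...rest } = source;
  // Like lodash get, the default applies only when the stored value is undefined.
  const stored = source.graphrag?.use_graphrag;
  return {
    pages: pages.length > 0 ? pages : [{ from: 1, to: 1024 }],
    parser_config: {
      ...rest,
      // As in the effect, graphrag is rebuilt with only use_graphrag.
      graphrag: {
        use_graphrag: stored === undefined ? useGraphRagDefault : stored,
      },
    },
  };
}

const initial = buildInitialValues({ pages: [[2, 5]], chunk_token_num: 128 }, true);
console.log(JSON.stringify(initial.pages)); // [{"from":2,"to":5}]
```

Keeping pages outside parser_config matches the form layout described above, where the page list is a separate top-level field that handleOk merges back on submit.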
UI Structure and Conditional Rendering
The modal contains:
Parser Type Selector: Dropdown (Select) listing available parser types.
Page Range Input: Conditional form list of page ranges with validation ensuring ascending order and no overlaps.
Task Page Size: Numeric input shown if layout recognition is enabled.
Max Token Number and Delimiter: Shown for certain parser types.
Auto Keywords and Questions: Components for automatic keyword extraction when applicable.
Excel to HTML Converter: For .xlsx documents with the naive parser.
Raptor Parse Configuration: Advanced parsing options.
Graph RAG Items: Additional graph retrieval-augmented generation settings.
Entity Types: Shown for Knowledge Graph parser.
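The conditional sections above can be modeled as flags derived from the selected parser and the document extension. The sketch below is an assumption-heavy illustration, not the component's code: ParserTypeSketch is a hypothetical subset of DocumentParserType with made-up string values, the contents of hidePagesSketch are invented, and the rule that page ranges require a PDF is inferred from the isPdf and hidePagesChunkMethods descriptions.

```typescript
// Hypothetical subset of DocumentParserType; real members and values may differ.
const ParserTypeSketch = {
  Naive: 'naive',
  KnowledgeGraph: 'knowledge_graph',
  Table: 'table',
} as const;
type ParserTypeSketch = (typeof ParserTypeSketch)[keyof typeof ParserTypeSketch];

// Assumed contents; the source only states that such a list exists.
const hidePagesSketch: ParserTypeSketch[] = [ParserTypeSketch.Table];

interface VisibilityFlags {
  showPages: boolean;
  showEntityTypes: boolean;
  showExcelToHtml: boolean;
}

function computeVisibility(
  selectedTag: ParserTypeSketch,
  documentExtension: string,
): VisibilityFlags {
  const isPdf = documentExtension === 'pdf';
  const pagesHidden = hidePagesSketch.some((p) => p === selectedTag);
  return {
    // Assumption: page ranges apply only to PDFs whose parser is not in the hide list.
    showPages: isPdf && !pagesHidden,
    // Entity types are shown for the Knowledge Graph parser.
    showEntityTypes: selectedTag === ParserTypeSketch.KnowledgeGraph,
    // Excel-to-HTML is shown for xlsx documents with the naive parser.
    showExcelToHtml:
      documentExtension === 'xlsx' && selectedTag === ParserTypeSketch.Naive,
  };
}

console.log(computeVisibility(ParserTypeSketch.Naive, 'pdf').showPages); // true
```

Deriving all flags in one place keeps the JSX free of repeated parser-type comparisons, which is presumably why the component exposes them as named booleans.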
Usage Example
<ChunkMethodModal
  visible={isModalVisible}
  loading={isSaving}
  documentId="doc-123"
  parserId={DocumentParserType.Naive}
  parserConfig={currentParserConfig}
  documentExtension="pdf"
  onOk={(parserId, parserConfig) => {
    saveParserConfig(parserId, parserConfig);
    closeModal();
  }}
  hideModal={closeModal}
/>
Interaction with Other Modules
Hooks
useFetchParserListOnMount: Fetches and manages the list of available parsers and handles parser selection changes.
useShowAutoKeywords: Determines if auto keyword features should be enabled for the selected parser.
useFetchKnowledgeBaseConfiguration: Fetches global knowledge base settings including parser configurations.
Sub-Components
MaxTokenNumber: Displays and manages token limit inputs.
AutoKeywordsItem and AutoQuestionsItem: Provide UI for managing automatic keyword and question extraction.
DatasetConfigurationContainer: Layout wrapper for conditional configuration sections.
Delimiter: UI component for setting delimiters in parsing.
ExcelToHtml: Handles Excel document parsing UI.
LayoutRecognize: Layout recognition settings.
ParseConfiguration: Advanced parsing configurations.
UseGraphRagItem: Graph retrieval-augmented generation settings.
EntityTypesItem: Controls entity type configurations.
Constants
DocumentParserType: Enumeration of document parser types; its values drive the conditional UI behavior.
Styling
Uses CSS module styles from ./index.less for consistent styling.
Important Implementation Details
Dynamic Form Fields
The form fields are rendered dynamically based on the selected parser type and document type, enabling or disabling options to prevent invalid configurations.
Validation Logic
Page range inputs enforce:
Non-overlapping and ascending page ranges.
Required fields with appropriate error messages.
Dependencies between the "from" and "to" inputs ensure logical correctness.
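The rules above can be expressed as a single check over the whole list. This is a minimal sketch under stated assumptions, not the validators used in the form: validatePageRanges and its messages are hypothetical, bounds are treated as inclusive (so a range starting on the previous range's last page counts as an overlap), and from equal to to is accepted.

```typescript
interface PageRangeInput {
  from?: number;
  to?: number;
}

// Returns a message for the first violated rule, or null when the list is valid.
function validatePageRanges(ranges: PageRangeInput[]): string | null {
  let previousTo: number | null = null;
  for (let i = 0; i < ranges.length; i++) {
    const { from, to } = ranges[i];
    // Required fields: both ends of a range must be filled in.
    if (typeof from !== 'number' || typeof to !== 'number') {
      return `Range ${i + 1}: both "from" and "to" are required`;
    }
    // Dependency between the two inputs of one row.
    if (from > to) {
      return `Range ${i + 1}: "from" must not exceed "to"`;
    }
    // Ascending and non-overlapping relative to the previous row.
    if (previousTo !== null && from <= previousTo) {
      return `Range ${i + 1}: must start after the previous range ends`;
    }
    previousTo = to;
  }
  return null;
}

console.log(validatePageRanges([{ from: 1, to: 10 }, { from: 11, to: 20 }])); // null
console.log(validatePageRanges([{ from: 1, to: 10 }, { from: 5, to: 20 }]));
```

In an Ant Design form the same checks would normally be split into per-field rules so that each input shows its own error; the single function here only illustrates the logic.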
State Synchronization
The form state is synchronized with external props (parserConfig) on modal visibility changes using the useEffect hook.
Use of Memoization
useMemo is employed to optimize rendering decisions and avoid unnecessary recalculations.
Use of Utility Functions
Uses Lodash's omit and get to simplify object manipulations.
Mermaid Component Diagram
flowchart LR
ChunkMethodModal -- typed as --> ReactFC["React.FC"]
ChunkMethodModal -- uses --> Form
ChunkMethodModal -- uses --> Modal
ChunkMethodModal -- uses --> Select
ChunkMethodModal -- uses --> InputNumber
ChunkMethodModal -- uses --> Button
ChunkMethodModal -- uses --> Tooltip
ChunkMethodModal -- uses --> Space
ChunkMethodModal -- uses --> Divider
ChunkMethodModal -- uses --> MaxTokenNumber
ChunkMethodModal -- uses --> AutoKeywordsItem
ChunkMethodModal -- uses --> AutoQuestionsItem
ChunkMethodModal -- uses --> DatasetConfigurationContainer
ChunkMethodModal -- uses --> Delimiter
ChunkMethodModal -- uses --> ExcelToHtml
ChunkMethodModal -- uses --> LayoutRecognize
ChunkMethodModal -- uses --> ParseConfiguration
ChunkMethodModal -- uses --> UseGraphRagItem
ChunkMethodModal -- uses --> EntityTypesItem
ChunkMethodModal -- uses --> useFetchParserListOnMount
ChunkMethodModal -- uses --> useShowAutoKeywords
ChunkMethodModal -- uses --> useFetchKnowledgeBaseConfiguration
ChunkMethodModal -- uses --> useTranslate
Summary
index.tsx defines ChunkMethodModal, a configurable modal component facilitating document parsing configuration in a knowledge management context. It provides a flexible, validated UI that adjusts to parser types and document formats, backed by hooks fetching parser lists and knowledge base configurations. The component integrates multiple specialized sub-components to cover a wide range of parsing options, supporting advanced features like graph-based RAG, auto keyword extraction, and Excel parsing.
This modular and dynamic approach ensures users can tailor document chunking and parsing behaviors effectively within the application.