dataset-util.ts


Overview

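The dataset-util.ts file provides utility functions for identifying document parser types and grouping list data, for use in knowledge-processing or document-management code. Specifically, it exports two parser-type predicates (isKnowledgeGraphParser and isNaiveParser), the FilterType type, and the generic groupListByType function, which groups a list of objects by a chosen field and counts the items in each group.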

This utility is likely used in higher-level components or services that manage document ingestion, classification, or filtering based on parser types or grouped metadata.


Detailed Explanation

Imports

import { DocumentParserType } from '@/constants/knowledge';

Functions

isKnowledgeGraphParser

export function isKnowledgeGraphParser(parserId: DocumentParserType): boolean

Returns true when parserId is the KnowledgeGraph parser type. Example usage:

if (isKnowledgeGraphParser(currentParser)) {
  // Execute logic specific to KnowledgeGraph parser
}

isNaiveParser

export function isNaiveParser(parserId: DocumentParserType): boolean

Returns true when parserId is the Naive parser type. Example usage:

if (isNaiveParser(currentParser)) {
  // Execute logic specific to Naive parser
}
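The two predicates above are most naturally written as equality checks against an enum member. The following is a minimal sketch under stated assumptions, not the file's actual source: the local DocumentParserType enum stands in for the one imported from '@/constants/knowledge', its member names and string values are guesses, describeParser is a hypothetical caller, and the export keyword is omitted so the snippet runs on its own.

```typescript
// Stand-in for DocumentParserType from '@/constants/knowledge'.
// Member names and string values are assumptions for illustration only.
enum DocumentParserType {
  Naive = 'naive',
  KnowledgeGraph = 'knowledge_graph',
}

// Each predicate is a plain equality check against one enum member.
function isKnowledgeGraphParser(parserId: DocumentParserType): boolean {
  return parserId === DocumentParserType.KnowledgeGraph;
}

function isNaiveParser(parserId: DocumentParserType): boolean {
  return parserId === DocumentParserType.Naive;
}

// Hypothetical caller: choose which settings to show for a parser type.
function describeParser(parserId: DocumentParserType): string {
  if (isKnowledgeGraphParser(parserId)) return 'knowledge-graph settings';
  if (isNaiveParser(parserId)) return 'naive chunking settings';
  return 'default settings';
}
```

Wrapping the comparison in a named function keeps call sites readable and means the enum member only has to be referenced in one place if it is ever renamed.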

Type Definitions

FilterType

export type FilterType = {
  id: string;
  label: string;
  count: number;
};
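To illustrate how a FilterType value might be consumed, here is a small sketch assuming a UI that renders one filter option per entry. Both helpers, formatFilterLabel and totalCount, are hypothetical and do not appear in dataset-util.ts.

```typescript
type FilterType = {
  id: string;
  label: string;
  count: number;
};

// Hypothetical helper: text for a filter option, e.g. "PDF Document (2)".
function formatFilterLabel(filter: FilterType): string {
  return `${filter.label} (${filter.count})`;
}

// Hypothetical helper: total number of items across all filter groups.
function totalCount(filters: FilterType[]): number {
  return filters.reduce((sum, f) => sum + f.count, 0);
}
```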

Generic Function: groupListByType

export function groupListByType<T extends Record<string, any>>(
  list: T[],
  idField: string,
  labelField: string,
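): FilterType[]

Groups the items in list by the value of idField, takes each group's label from labelField, and counts the items in each group. Example usage:

interface FileItem {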
  typeId: string;
  typeName: string;
  // other properties
}

const files: FileItem[] = [
  { typeId: 'pdf', typeName: 'PDF Document' },
  { typeId: 'pdf', typeName: 'PDF Document' },
  { typeId: 'doc', typeName: 'Word Document' },
];

const grouped = groupListByType(files, 'typeId', 'typeName');
// Result:
// [
//   { id: 'pdf', label: 'PDF Document', count: 2 },
//   { id: 'doc', label: 'Word Document', count: 1 }
// ]
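One way to produce the result shown above is a single pass over the list that either increments an existing group or appends a new one. This is a sketch consistent with the documented signature and output, not necessarily the file's actual implementation; the export keyword is omitted so the snippet is self-contained.

```typescript
type FilterType = {
  id: string;
  label: string;
  count: number;
};

function groupListByType<T extends Record<string, any>>(
  list: T[],
  idField: string,
  labelField: string,
): FilterType[] {
  const groups: FilterType[] = [];
  for (const item of list) {
    // Look for a group already created for this item's id value.
    const existing = groups.find((g) => g.id === item[idField]);
    if (existing) {
      existing.count += 1;
    } else {
      // First occurrence: the label is taken from this item.
      groups.push({ id: item[idField], label: item[labelField], count: 1 });
    }
  }
  return groups;
}
```

Scanning the result array with find keeps the groups in first-seen order and is simple to read. For very large lists with many distinct ids, a Map keyed by id would avoid the repeated scan.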

Important Implementation Details


Interaction with Other System Parts


Mermaid Flowchart Diagram

flowchart TD
    A[Input: list of objects] --> B[groupListByType]
    B --> C[Check if group with idField exists]
    C -- Yes --> D[Increment count]
    C -- No --> E[Add new FilterType with count=1]
    D & E --> F[Return array of FilterType]

    subgraph Parser Identification
        G["isKnowledgeGraphParser(parserId)"] --> H[Returns boolean]
        I["isNaiveParser(parserId)"] --> J[Returns boolean]
    end

Summary

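The dataset-util.ts file provides focused utility functions for identifying KnowledgeGraph and Naive parser types and for grouping a list of objects into counted FilterType entries.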

Its simplicity and generic design make it a reusable utility in document processing pipelines and user interface components dealing with filters and parsers.