categorize.mdx

Overview

The categorize.mdx file documents the Categorize component used within a conversational AI workflow. This component is responsible for classifying user inputs into predefined categories and applying corresponding processing strategies based on the classification results.

Typically, the Categorize component is placed downstream of an Interact component to analyze user intents and determine which branch or processing path to take next in the conversation flow.

This documentation details the component's purpose, configuration options, usage scenarios, and interaction with other workflow components. It serves as a guide for users configuring or troubleshooting the Categorize component within a larger system.

Detailed Explanation

Purpose and Functionality

Primary Function: Classify incoming user inputs or data into one of several predefined categories.
Use Case: Enables the workflow to dynamically select different strategies or downstream components based on user intent or content classification.
Position in Workflow: Usually follows an Interact component that collects user inputs.

Key Concepts and Configuration Sections

1. Query Variables

Role: Define the data sources (queries) that the Categorize component will analyze.
Source: Dropdown lists all global variables declared prior to this component.
Mandatory: Yes.

2. Input Variables

Types:
- Reference: Links to outputs from other components or global variables.
- Text: Static text inputs.
Usage: Added via the + Add variable button.
Example:
- Reference a user message output from an Interact component.
- Provide static prompt text to help guide categorization.

3. Model Configuration

Model: Choice of chat model (e.g., GPT-based models) used to perform categorization.
Freedom: Preset configurations controlling generation randomness and creativity:
- Improvise (creative)
- Precise (conservative, default)
- Balance (middle ground)
Temperature: Controls randomness (default 0.1).
Top P: Controls nucleus sampling threshold (default 0.3).
Presence Penalty: Encourages new tokens (default 0.4).
Frequency Penalty: Discourages repetition (default 0.7).

Note: Users can mix models across components for flexibility and performance tuning.

4. Message Window Size

Specifies how many previous dialogue rounds are included in the input to the model.
Defaults to 1.
Important for multi-turn dialogues only (leave default if not looping).

5. Category Definition

At least two categories must be defined.
Each category has:
- Name: Editable label to help the LLM understand the category.
- Description: Criteria or notes to clarify category intent.
- Examples: Concrete samples to improve classification accuracy.
Categories connect to downstream components via the UI (+ button next to cases).

6. Output

Defines the global variable name that stores the classification result.
Default output variable: category_name.
This output can be referenced by subsequent components to guide workflow branching.

Usage Example

Suppose you want to classify user messages into categories like "Support Request," "Sales Inquiry," and "General Feedback."

Define input variables:
- Reference the user's message output from the Interact component.
Configure model settings:
- Use the default Precise preset.
Add categories:
- Name: "Support Request"
  Description: "User is asking for technical support."
  Examples: "My internet is down," "How do I reset my password?"
- Name: "Sales Inquiry"
  Description: "User wants to learn about products or pricing."
  Examples: "What is the price of your software?", "Tell me about your plans."
- Name: "General Feedback"
  Description: "User is providing feedback or comments."
  Examples: "I love your service," "You should add more features."
Connect each category to different downstream components handling support, sales, or feedback workflows.

Implementation Details and Algorithms

The component leverages an LLM (Large Language Model) to perform natural language classification.
Classification is based on prompt engineering, where descriptions and examples help the model determine which category best fits the input.
Sampling parameters (temperature, top P, penalties) control the style and reliability of the classification.
Message window size allows the inclusion of conversational context for multi-turn dialogue classification.

Interaction with Other Components

Upstream: Usually follows an Interact component that captures user inputs.
Downstream: Each category branches to specific components or workflows tailored to that category's intent.
Global Variables: Reads global variables defined earlier to use as query inputs.
Output: Exposes a global variable category_name (or custom name) which downstream components use to decide execution paths.

Visual Diagram: Component Structure and Configuration Flow

flowchart TD
    A[User Input] --> B(Interact Component)
    B --> C[Categorize Component]
    C -->|Category: Support Request| D[Support Workflow Component]
    C -->|Category: Sales Inquiry| E[Sales Workflow Component]
    C -->|Category: General Feedback| F[Feedback Workflow Component]

    subgraph Categorize Configuration
        C1[Query Variables]
        C2[Input Variables]
        C3[Model Settings]
        C4[Message Window Size]
        C5[Categories (Name, Description, Examples)]
        C1 --> C
        C2 --> C
        C3 --> C
        C4 --> C
        C5 --> C
    end

Summary

The Categorize component is a pivotal element in AI-driven conversational workflows, enabling dynamic response strategies based on user intent classification. It is highly configurable through query inputs, model parameters, and category definitions, and integrates tightly with upstream user interaction components and downstream processing branches.

By leveraging LLM capabilities and carefully crafted prompts, it facilitates intelligent routing of dialogue flows to improve user experience and operational efficiency.