This reference documents the UI elements, configuration options, and actions available in the Classifier Studio interface.

Overview

Classifier Studio enables training classification models that categorize and label data within Narrative’s platform. It integrates data selection, label configuration, and compute resource allocation into a streamlined workflow.

Path: My Models → Classifier Studio

Builder interactions

The builder is a step-by-step flow. Each configured step exposes inline actions so you can revise selections without navigating back through earlier steps:

Action | Description
Edit | Re-opens the step’s configuration view with your existing selections pre-filled
Remove | Clears the step’s configuration (and any dependent downstream steps)

When training is submitted, a compact toast notification confirms success and links directly to the Jobs page to track progress.
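
The Edit and Remove behavior described above can be sketched as a small state model. This is an illustration only, not Classifier Studio's implementation: the `Builder` class, the step names, and the rule that every later step counts as a dependent downstream step are all assumptions made for the example.

```python
# Hypothetical model of the builder flow; all names here are illustrative.
STEPS = ["dataset", "label_column", "features", "algorithm", "finalize"]


class Builder:
    def __init__(self):
        # Maps step name -> configuration; a missing key means "not configured".
        self.config = {}

    def edit(self, step, value):
        """Set or overwrite one step's configuration."""
        if step not in STEPS:
            raise ValueError(f"unknown step: {step}")
        self.config[step] = value

    def remove(self, step):
        """Clear a step and, by assumption, every step that follows it."""
        start = STEPS.index(step)
        for name in STEPS[start:]:
            self.config.pop(name, None)

    def can_train(self):
        """True only when every step has a configuration."""
        return all(name in self.config for name in STEPS)
```

In this sketch, removing the label column also clears features, algorithm, and finalize settings, so `can_train()` returns `False` until those steps are configured again.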

Dataset Selection module

The Dataset Selection module lets you choose the dataset that contains your labeled training data.

Element | Description
Select Dataset button | Opens the dataset selection view
Dataset name | Displays the currently selected training dataset

Dataset selection view

When you click Select Dataset, a full selection view appears with:

Element | Description
Dataset list | Searchable, virtualized list of available datasets in the current data plane
Next button | Proceeds to label column selection (enabled after selecting a dataset)
Cancel | Returns to the builder start view

Only datasets available in your currently selected data plane are shown.

Label Column module

The Label Column module lets you select the column that contains the classification target — the categories your model will learn to predict.

Element | Description
Select Label button | Opens the label column selection view (requires a dataset to be selected first)
Column name | Displays the currently selected label column

Label column selection view

Element | Description
Column dropdown | Filterable list of primitive-type columns from the selected dataset
Column type | Displays the data type next to each column name
Back button | Returns to dataset selection
Next button | Proceeds to feature configuration (enabled after selecting a column)

Supported column types

Only columns with primitive data types are available as label columns:

Type | Description
string | Text-based categories
boolean | Binary classification
double | Numeric labels
long | Integer labels
timestamptz | Timestamp-based labels

For best results, choose a column with a balanced distribution of category values in your training data.
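
One way to check how balanced a candidate label column is before selecting it is to count its values. The helpers below are a minimal sketch that assumes the label values have already been pulled into a Python list; `label_distribution` and `imbalance_ratio` are hypothetical names, not part of Classifier Studio.

```python
from collections import Counter


def label_distribution(labels):
    """Return each label's share of the rows as a dict of label -> fraction."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def imbalance_ratio(labels):
    """Most common label count divided by least common (1.0 = perfectly balanced)."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())
```

For example, `imbalance_ratio(["spam", "spam", "spam", "ham"])` returns `3.0`, meaning the most common label appears three times as often as the rarest one.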

Feature Configuration module

The Feature Configuration module lets you define which columns from your dataset serve as input features for the classifier and how they should be processed.

Element | Description
Configure button | Opens the feature configuration view
Feature list | Displays configured features and their types

Feature types

Type | Description
text | Free-form text processed via natural language techniques
categorical | Discrete categories encoded for model input
numeric | Continuous numeric values
count_vectorizer | Text converted to token frequency vectors
embedding | Pre-computed vector embeddings
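
To make "token frequency vectors" concrete, here is a minimal sketch of the general count-vectorization technique in plain Python. The tokenization rule (lowercase alphanumeric runs) and the function names are assumptions for illustration; the source does not describe how Classifier Studio tokenizes text or builds its vocabulary.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fit_vocabulary(documents):
    """Assign each distinct token a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def count_vector(text, vocabulary):
    """Return token counts aligned to the vocabulary's columns."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:  # tokens unseen during fitting are ignored
            vector[vocabulary[tok]] = n
    return vector
```

With a vocabulary fitted on `["red shoes", "blue shoes shoes"]`, the columns are `blue`, `red`, `shoes`, and `count_vector("blue shoes shoes", vocabulary)` returns `[1, 0, 2]`.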

Algorithm Configuration module

The Algorithm Configuration module lets you choose the classification algorithm for training.

Element | Description
Configure button | Opens the algorithm selection view
Algorithm name | Displays the selected classifier type

Available algorithms

Algorithm | Description
Logistic Regression | Linear model suited for binary and multi-class classification with interpretable results
Random Forest | Ensemble method that builds multiple decision trees for robust predictions

Hyperparameters

Each algorithm exposes its own set of tunable hyperparameters with sensible defaults. The configuration view surfaces the parameters relevant to your selected algorithm — for example, regularization strength for Logistic Regression, or tree count and maximum depth for Random Forest — so you can adjust only what matters for your use case.
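
As an illustration of what a regularization strength controls in logistic regression, the sketch below computes a binary log loss with an L2 penalty on the weights. It shows the general technique only. The parameter name `l2_strength` is hypothetical, and the source does not say whether Classifier Studio expresses this setting as a strength or as an inverse strength (as some libraries do).

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def regularized_log_loss(weights, bias, rows, labels, l2_strength):
    """Mean binary log loss plus an L2 penalty on the weights.

    rows is a list of feature lists; labels holds 0 or 1 for each row.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    total = 0.0
    for features, y in zip(rows, labels):
        z = bias + sum(w * x for w, x in zip(weights, features))
        p = sigmoid(z)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    # A larger l2_strength penalizes large weights more heavily.
    penalty = l2_strength * sum(w * w for w in weights)
    return total / len(rows) + penalty
```

Raising `l2_strength` increases the cost of large weights, which pushes training toward simpler models that are less prone to overfitting.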

Test/train split

Configure how the dataset is partitioned into training and evaluation sets:

Setting | Description
Test size | Fraction of rows reserved for evaluation
Random state | Integer seed that makes the split deterministic across retrains
Stratification | When enabled, preserves the label distribution in both the train and test splits (useful for imbalanced classes)
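
The three settings interact as in the following sketch of a seeded, optionally stratified split. It is a generic illustration in plain Python, not the platform's implementation; the function name, the default values, and the per-label rounding rule are assumptions.

```python
import random
from collections import defaultdict


def split_rows(labels, test_size=0.2, random_state=0, stratify=True):
    """Return (train_indices, test_indices) for rows with the given labels."""
    rng = random.Random(random_state)  # seeded generator: same split every run
    if stratify:
        # Group row indices by label so each label is split separately.
        groups = defaultdict(list)
        for i, label in enumerate(labels):
            groups[label].append(i)
        buckets = list(groups.values())
    else:
        buckets = [list(range(len(labels)))]

    train, test = [], []
    for bucket in buckets:
        rng.shuffle(bucket)
        n_test = round(len(bucket) * test_size)
        test.extend(bucket[:n_test])
        train.extend(bucket[n_test:])
    return sorted(train), sorted(test)
```

Calling `split_rows` twice with the same `random_state` yields identical index lists. With `stratify=True`, each label contributes roughly `test_size` of its own rows to the test set, so a rare class is not left out of evaluation by chance.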

Finalize module

The Finalize module is the last step before training. It lets you name and version the model, attach metadata, confirm the execution environment, and review everything you’ve configured in a single summary view.

Element | Description
Model name | Human-readable name for the trained classifier
Model version | Version identifier for this training run, enabling side-by-side comparison of retrains
Tags | Keywords for organizing and identifying trained classifiers
Data plane selector | Confirm the data plane where training executes
Configuration summary | Read-only review of your dataset, label column, features, algorithm, and split settings before submission

Classifier training runs on Snowflake’s built-in ML capabilities within your data plane. Data never leaves your infrastructure.

Actions reference

Configuration actions

Action | Location | Description | Result
Select Dataset | Dataset Selection module | Choose training dataset | Dataset selected, columns become available
Select Label | Label Column module | Choose target column | Classification target defined
Configure Features | Feature Configuration module | Define input features | Feature columns and types configured
Configure Algorithm | Algorithm Configuration module | Choose classifier type, hyperparameters, and split | Algorithm selected for training
Finalize | Finalize module | Name/version the model, add tags, confirm data plane, review summary | Training request fully configured

Training actions

Action | Location | Description | Result
Train Classifier | Page toolbar | Start training (enabled when all steps are configured) | Training job submitted and progress displayed

Training output

After you click Train Classifier:
  • A training job is submitted to the selected data plane
  • A success confirmation appears when the job is accepted
  • Training progress can be monitored on the Jobs page
  • The trained classifier becomes available for use in your data workflows

Workflow summary

  1. Select dataset → Choose the training dataset in the Dataset Selection module
  2. Select label column → Pick the column containing classification labels
  3. Configure features → Define input features and their processing types
  4. Configure algorithm → Choose between logistic regression and random forest, tune hyperparameters, and set the test/train split
  5. Finalize → Name and version the model, add tags, confirm the data plane, and review the configuration summary
  6. Train classifier → Click Train Classifier and monitor progress from the Jobs page
