Model Registry Classifier Prediction Functions

Classifier Prediction in NQL

Overview

This reference describes the classifier prediction user-defined functions (UDFs) that become available when you register a custom classifier model in the Narrative Model Registry. These functions let you apply trained models directly in NQL queries and return human-readable labels, probability distributions, and uncertainty measurements.

When a classifier model is registered, the following prediction functions become available: predict_label, predict_proba_label, and predict_entropy. These functions eliminate the need for manual ID-to-label mapping and provide built-in support for uncertainty quantification.

Function: `PREDICT_LABEL`

Arguments

features (varies): The input features matching the model's expected schema. They can be passed as individual columns or as an array, depending on model configuration.

What It Does

Returns the predicted class as a human-readable string label instead of a numeric ID.
Automatically maps the model's internal label encoding to the original label names stored during training.
Returns NULL for null or malformed input data.
Works with both binary and multi-class classifiers.
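
Conceptually, the mapping step resembles the Python sketch below. It is an illustration only, not the registry's implementation: the `LABELS` list, the `predict_label` helper, and the assumption that the model emits one score per class are all hypothetical.

```python
from typing import Optional, Sequence

# Hypothetical label names, listed in the order of the model's internal class IDs.
LABELS = ["electronics", "apparel", "home"]

def predict_label(scores: Optional[Sequence[float]]) -> Optional[str]:
    """Map per-class scores to a label name; return None for unusable input."""
    # Mirror the documented NULL behaviour for null or malformed input.
    if scores is None or len(scores) != len(LABELS):
        return None
    # The index of the highest score plays the role of the internal class ID.
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[class_id]

print(predict_label([0.72, 0.18, 0.10]))  # electronics
print(predict_label(None))                # None
```

Because the label list travels with the model in this sketch, callers never handle numeric IDs, which is the behaviour the function description above promises.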

Example

CREATE MATERIALIZED VIEW MY_PREDICTIONS AS
SELECT
    unique_id,
    my_model.predict_label(feature_array) AS predicted_category
FROM company_data.my_dataset;

Function: `PREDICT_PROBA_LABEL`

Arguments

features (varies): The input features matching the model's expected schema.

What It Does

Returns the full probability distribution with human-readable label names as keys.
Output is an object (key-value pairs) where keys are label names and values are probabilities between 0.0 and 1.0.
Probabilities across all labels sum to 1.0.
Returns NULL for null or malformed input data.
Works with both binary and multi-class classifiers.

Example

CREATE MATERIALIZED VIEW MY_PROBABILITY_PREDICTIONS AS
SELECT
    unique_id,
    my_model.predict_proba_label(feature_array) AS label_probabilities
FROM company_data.my_dataset;

Sample Output Structure

The label_probabilities column returns an object like:

{"electronics": 0.72, "apparel": 0.18, "home": 0.10}
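
An object of this shape is straightforward to consume downstream. The Python sketch below assumes the value has been exported as a JSON string (an assumption; the export format is not specified here) and uses a hypothetical helper, `top_label`, to check the sum-to-one property and recover the most likely label.

```python
import json
import math
from typing import Tuple

def top_label(raw: str) -> Tuple[str, float]:
    """Return the most probable label and its probability from a JSON object."""
    probs = json.loads(raw)
    # The probabilities are documented to sum to 1.0; allow for float rounding.
    if not math.isclose(sum(probs.values()), 1.0, abs_tol=1e-6):
        raise ValueError("probabilities do not sum to 1.0")
    label = max(probs, key=probs.get)
    return label, probs[label]

sample = '{"electronics": 0.72, "apparel": 0.18, "home": 0.10}'
print(top_label(sample))  # ('electronics', 0.72)
```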

Function: `PREDICT_ENTROPY`

Arguments

features (varies): The input features matching the model's expected schema.

What It Does

Returns the Shannon entropy of the prediction probability distribution, a measure of prediction uncertainty.
Output is a float value, in bits, where:
- 0 = completely certain (one class has 100% probability)
- log₂(n_classes) = maximum uncertainty (uniform distribution across all classes)
Calculated as: H(X) = -Σ p(x) × log₂(p(x)), summed over all classes x.
Zero probabilities are excluded from the calculation to avoid undefined logarithms.
Returns NULL for null or malformed input data.
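
The formula can be reproduced offline to sanity-check returned values. The following is a minimal Python sketch of the calculation as described above; the function name `shannon_entropy` is illustrative and is not part of NQL.

```python
import math
from typing import Optional, Sequence

def shannon_entropy(probs: Optional[Sequence[float]]) -> Optional[float]:
    """Shannon entropy in bits; None for missing input."""
    if probs is None:
        return None
    # Skip zero probabilities: log2(0) is undefined, and p * log2(p)
    # tends to 0 as p tends to 0, so those terms contribute nothing.
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([1.0, 0.0, 0.0]))            # 0.0
print(shannon_entropy([0.5, 0.5]))                 # 1.0
print(round(shannon_entropy([1/3, 1/3, 1/3]), 2))  # 1.58
```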

Example

CREATE MATERIALIZED VIEW PREDICTIONS_WITH_CONFIDENCE AS
SELECT
    unique_id,
    my_model.predict_label(feature_array) AS predicted_category,
    my_model.predict_entropy(feature_array) AS uncertainty
FROM company_data.my_dataset;

Interpreting Entropy Values

For a 3-class classifier (max entropy = log₂(3) ≈ 1.58):

Entropy	Interpretation	Example Distribution
0.0	Certain	1.0, 0.0, 0.0
~0.75	Confident	0.85, 0.10, 0.05
~1.35	Moderate uncertainty	0.60, 0.25, 0.15
~1.58	Maximum uncertainty	0.33, 0.33, 0.33

Example: Filtering by Confidence

Use entropy to filter for high-confidence predictions:

CREATE MATERIALIZED VIEW HIGH_CONFIDENCE_PREDICTIONS AS
SELECT
    unique_id,
    my_model.predict_label(feature_array) AS predicted_category,
    my_model.predict_entropy(feature_array) AS uncertainty
FROM company_data.my_dataset
WHERE my_model.predict_entropy(feature_array) < 0.5;

Key Benefits

No Manual Mapping: Eliminates the need to maintain separate label encoder mappings or perform additional queries to interpret model outputs.
Version Alignment: Label mappings are stored with the model, preventing misalignment between model versions.
Uncertainty Quantification: The entropy function enables detection of low-confidence predictions for human review, active learning sample selection, or quality filtering.
Consistent Interface: All functions work uniformly across binary and multi-class classifiers.

Notes

The model must be registered in the Narrative Model Registry before these functions become available.
Functions inherit access permissions from the registered model.
For any classifier, entropy ranges from 0 to log₂(n_classes). A 2-class classifier has a maximum entropy of 1.0; a 10-class classifier has a maximum entropy of about 3.32.
Use the NQL Editor to validate your query before execution.
Performance overhead is minimal (<10ms) compared to the base predict and predict_proba functions.
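
The maximum-entropy figures quoted in these notes follow directly from log₂(n_classes), so a fixed threshold such as 0.5 represents a different share of the maximum for each model. A quick Python check, using an illustrative helper name:

```python
import math

def max_entropy_bits(n_classes: int) -> float:
    """Entropy in bits of a uniform distribution over n_classes classes."""
    return math.log2(n_classes)

for n in (2, 3, 10):
    print(n, round(max_entropy_bits(n), 2))
# 2 1.0
# 3 1.58
# 10 3.32
```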