Model Registry Classifier Prediction Functions

Classifier Prediction in NQL

Overview

This reference describes the classifier prediction User-Defined Functions (UDFs) that become available when you register a custom classifier model in the Narrative Model Registry. These functions let you apply a trained model directly in NQL queries and return human-readable labels, probability distributions, and uncertainty measurements.

When a classifier model is registered, the following prediction functions become available: predict_label, predict_proba_label, and predict_entropy. These functions eliminate the need for manual ID-to-label mapping and provide built-in support for uncertainty quantification.

Function: PREDICT_LABEL

Arguments

  • features (varies): The input features matching the model's expected schema. Can be passed as individual columns or as an array, depending on model configuration.

What It Does

  • Returns the predicted class as a human-readable string label instead of a numeric ID.
  • Automatically maps the model's internal label encoding to the original label names stored during training.
  • Returns NULL for null or malformed input data.
  • Works with both binary and multi-class classifiers.

Example

CREATE MATERIALIZED VIEW MY_PREDICTIONS AS
SELECT
    unique_id,
    my_model.predict_label(feature_array) AS predicted_category
FROM company_data.my_dataset;

Function: PREDICT_PROBA_LABEL

Arguments

  • features (varies): The input features matching the model's expected schema.

What It Does

  • Returns the full probability distribution with human-readable label names as keys.
  • Output is an object (key-value pairs) where keys are label names and values are probabilities between 0.0 and 1.0.
  • Probabilities across all labels sum to 1.0.
  • Returns NULL for null or malformed input data.
  • Works with both binary and multi-class classifiers.

Example

CREATE MATERIALIZED VIEW MY_PROBABILITY_PREDICTIONS AS
SELECT
    unique_id,
    my_model.predict_proba_label(feature_array) AS label_probabilities
FROM company_data.my_dataset;

Sample Output Structure

The label_probabilities column returns an object like:

{"electronics": 0.72, "apparel": 0.18, "home": 0.10}
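Once a value like this has been retrieved client-side, the most likely label and its probability can be read straight from the object. The following is a minimal Python sketch, not part of NQL. It assumes the column value arrives as a JSON string (the serialization may differ in your client), and `top_label` is a hypothetical helper name.

```python
import json
import math

# Sample value copied from the output shown above. Receiving it as a
# JSON string is an assumption, not documented behavior.
raw = '{"electronics": 0.72, "apparel": 0.18, "home": 0.10}'


def top_label(probabilities):
    """Return the (label, probability) pair with the highest probability."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


label_probabilities = json.loads(raw)
best_label, best_probability = top_label(label_probabilities)

# The probabilities should sum to 1.0, up to floating-point rounding.
total = sum(label_probabilities.values())

print(best_label, best_probability)  # electronics 0.72
print(math.isclose(total, 1.0))      # True
```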

Function: PREDICT_ENTROPY

Arguments

  • features (varies): The input features matching the model's expected schema.

What It Does

  • Returns the Shannon entropy (in bits) of the prediction probability distribution, a measure of prediction uncertainty.
  • Output is a float value where:
    • 0 = completely certain (one class has 100% probability)
    • log₂(n_classes) = maximum uncertainty (uniform distribution across all classes)
  • Calculated as: H(X) = -Σ p(x) × log₂(p(x))
  • Zero probabilities are excluded from the calculation to avoid undefined logarithms.
  • Returns NULL for null or malformed input data.
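
The formula above can be reproduced offline to sanity-check returned values. This is a minimal Python sketch of the same calculation, not the registry's implementation. `shannon_entropy` is a hypothetical helper name, and the input is assumed to be a plain list of class probabilities.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p)).

    Zero probabilities are skipped, matching the rule above,
    because log2(0) is undefined.
    """
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


# One class has all the probability mass: no uncertainty.
print(shannon_entropy([1.0, 0.0, 0.0]))  # 0.0

# Uniform over 2 classes: maximum entropy, log2(2) = 1 bit.
print(shannon_entropy([0.5, 0.5]))  # 1.0

# Uniform over 4 classes: maximum entropy, log2(4) = 2 bits.
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0
```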

Example

CREATE MATERIALIZED VIEW PREDICTIONS_WITH_CONFIDENCE AS
SELECT
    unique_id,
    my_model.predict_label(feature_array) AS predicted_category,
    my_model.predict_entropy(feature_array) AS uncertainty
FROM company_data.my_dataset;

Interpreting Entropy Values

For a 3-class classifier (max entropy = log₂(3) ≈ 1.58):

Entropy   Interpretation         Example Distribution
0.0       Certain                1.0, 0.0, 0.0
~0.75     Confident              0.85, 0.10, 0.05
~1.35     Moderate uncertainty   0.60, 0.25, 0.15
~1.58     Maximum uncertainty    0.33, 0.33, 0.33

Example: Filtering by Confidence

Use entropy to filter for high-confidence predictions:

CREATE MATERIALIZED VIEW HIGH_CONFIDENCE_PREDICTIONS AS
SELECT
    unique_id,
    my_model.predict_label(feature_array) AS predicted_category,
    my_model.predict_entropy(feature_array) AS uncertainty
FROM company_data.my_dataset
WHERE my_model.predict_entropy(feature_array) < 0.5;
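
The same cutoff can be applied after export, for example to route the remaining rows to human review instead of discarding them. The sketch below is Python, not NQL. The rows are made-up illustrative data, `split_by_confidence` is a hypothetical helper, and sending NULL entropies to review is a design choice assumed here rather than documented behavior.

```python
# Hypothetical rows, shaped like the columns selected in the view above.
rows = [
    {"unique_id": "a1", "predicted_category": "electronics", "uncertainty": 0.12},
    {"unique_id": "b2", "predicted_category": "apparel", "uncertainty": 0.49},
    {"unique_id": "c3", "predicted_category": "home", "uncertainty": 1.31},
]

THRESHOLD = 0.5  # same cutoff as the WHERE clause above


def split_by_confidence(rows, threshold):
    """Split rows into (high_confidence, needs_review) by entropy."""
    high_confidence, needs_review = [], []
    for row in rows:
        uncertainty = row["uncertainty"]
        # A missing (NULL) entropy is treated as needing review.
        if uncertainty is not None and uncertainty < threshold:
            high_confidence.append(row)
        else:
            needs_review.append(row)
    return high_confidence, needs_review


high_confidence, needs_review = split_by_confidence(rows, THRESHOLD)
print([row["unique_id"] for row in high_confidence])  # ['a1', 'b2']
print([row["unique_id"] for row in needs_review])     # ['c3']
```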

Key Benefits

  • No Manual Mapping: Eliminates the need to maintain separate label encoder mappings or perform additional queries to interpret model outputs.
  • Version Alignment: Label mappings are stored with the model, preventing misalignment between model versions.
  • Uncertainty Quantification: The entropy function enables detection of low-confidence predictions for human review, active learning sample selection, or quality filtering.
  • Consistent Interface: All functions work uniformly across binary and multi-class classifiers.

Notes

  • The model must be registered in the Narrative Model Registry before these functions become available.
  • Functions inherit access permissions from the registered model.
  • Entropy ranges from 0 to log₂(n_classes), so the maximum depends on the number of classes: a 2-class classifier has a maximum entropy of 1.0, and a 10-class classifier has a maximum of ~3.32.
  • Use the NQL Editor to validate your query before execution.
  • Performance overhead is minimal (<10ms) compared to the base predict and predict_proba functions.
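
The maximum entropy values quoted in the notes above follow directly from log₂(n_classes). A short Python sketch, with `max_entropy` as a hypothetical helper name, shows the arithmetic:

```python
import math


def max_entropy(n_classes):
    """Upper bound on Shannon entropy (in bits) for n_classes labels."""
    return math.log2(n_classes)


for n in (2, 3, 10):
    print(n, round(max_entropy(n), 2))
# 2 1.0
# 3 1.58
# 10 3.32
```

Because the upper bound grows with the number of classes, a fixed threshold such as 0.5 is stricter, relative to the maximum, for a 10-class model than for a 2-class model.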