26 Model Monitoring

Drift, performance, and inference logs

Keywords

snowflake, R, RStudio, Posit, VS Code, workspace notebooks, snowflakeR, RSnowflake, mlops

26.1 Overview

After deployment, model monitoring tracks drift, performance degradation, and statistical shifts in production inference data. In snowflakeR, you configure it through the Model Registry API.

Read this chapter after you have a logged model version in the registry (Model Registry) and, for partition-heavy workloads, after Many-Model Patterns. Monitoring complements Experiments (training exploration) on the path to production.

Monitoring requires snowflake-ml-python ≥ 1.7.1; verify your environment with sfr_check_environment().

26.2 Learning Objectives

Wire an inference log table to a registered model version
Register monitors and read drift/performance metrics
Apply segment filters for regional or cohort analysis

26.3 Prerequisites

1. A logged model version in the registry (sfr_log_model()).
2. An inference log table or view with:
   - Timestamp column
   - Prediction score column(s)
   - Optional actual/label column(s) for performance metrics
   - Optional ID columns for entity-level analysis
3. A warehouse for scheduled monitor computation.

Typical inference log population: batch scoring SQL, streaming ingest, or application writes to a Snowflake table.

26.4 Configure source and registry

library(snowflakeR)

conn <- sfr_connect()
reg <- sfr_model_registry(conn, database = "ML_DB", schema = "MODELS")

src <- sfr_monitor_source(
  "ML_DB.MY_SCH.INFERENCE_LOG",
  timestamp_column         = "EVENT_TIME",
  prediction_score_columns = "PREDICTION",
  actual_score_columns     = "LABEL",
  id_columns               = "USER_ID"
)

cfg <- sfr_monitor_config(
  model_name         = "MY_MODEL",
  version_name       = "v1",
  warehouse          = "ML_WH",
  aggregation_window = "1 day"
)

Inference log schema means the column layout of the table that stores production predictions (your inference log), not a separate Snowflake object type. The names you pass to sfr_monitor_source() — timestamp_column, prediction_score_columns, actual_score_columns, id_columns — must match real columns in that table.
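As a rough illustration of that requirement, the sketch below builds a tiny in-memory stand-in for an inference log and checks that the column names intended for sfr_monitor_source() are present. The data frame, its values, and the helper missing_columns() are hypothetical; in practice you would check against the columns of the real table in Snowflake.

```r
# Hypothetical local stand-in for ML_DB.MY_SCH.INFERENCE_LOG (values are made up).
inference_log <- data.frame(
  EVENT_TIME = as.POSIXct(c("2024-01-01 10:00:00", "2024-01-01 11:00:00"), tz = "UTC"),
  USER_ID    = c("u1", "u2"),
  PREDICTION = c(0.82, 0.17),
  LABEL      = c(1L, 0L)
)

# Column names you plan to pass to sfr_monitor_source().
monitor_columns <- c(
  timestamp_column         = "EVENT_TIME",
  prediction_score_columns = "PREDICTION",
  actual_score_columns     = "LABEL",
  id_columns               = "USER_ID"
)

# Hypothetical helper: return the configured names that are absent from the table.
missing_columns <- function(configured, table_columns) {
  configured[!configured %in% table_columns]
}

not_found <- missing_columns(monitor_columns, names(inference_log))
if (length(not_found) > 0) {
  stop("Columns not found in inference log: ", paste(not_found, collapse = ", "))
}
```

Running a check like this before registering a monitor moves a "column not found" failure from the scheduled run to your own session.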

Snowflake stores unquoted identifiers as UPPERCASE (EVENT_TIME, PREDICTION). If you created the table with quoted mixed-case names, quote them consistently in R or use uppercase names everywhere. A mismatch surfaces as “column not found” when monitors run.
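The case rule can be sketched in a few lines of base R. The helper resolve_identifier() is hypothetical and not part of snowflakeR; it only mimics how Snowflake stores a name, so you can see which spelling a column lookup will need.

```r
# Hypothetical helper: resolve an identifier the way Snowflake stores it.
# Unquoted names fold to uppercase; double-quoted names keep their exact case.
resolve_identifier <- function(x) {
  quoted <- grepl('^".*"$', x)
  ifelse(
    quoted,
    # Strip the surrounding quotes and unescape doubled quotes inside the name.
    gsub('""', '"', substr(x, 2, nchar(x) - 1), fixed = TRUE),
    toupper(x)
  )
}

resolve_identifier(c("event_time", "Prediction", '"eventTime"'))
# "EVENT_TIME" "PREDICTION" "eventTime"
```

A table created with "eventTime" in quotes therefore has no EVENT_TIME column, which is the mismatch described above.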

26.5 Register monitors

sfr_add_monitor(reg, monitor_name = "MY_MONITOR", src, cfg)

sfr_show_model_monitors(reg)
sfr_get_monitor(reg, name = "MY_MONITOR")

Monitors run on Snowflake’s schedule using the configured warehouse — not in your local R session.

26.6 Read metrics

sfr_monitor_drift(reg, "MY_MONITOR")
sfr_monitor_performance(reg, "MY_MONITOR")
sfr_monitor_stats(reg, "MY_MONITOR")

To restrict the time range or change the granularity, see ?sfr_monitor_drift and the related help pages for the available arguments. Use the results in R dashboards or Shiny apps, or export them to tables that feed alerting pipelines.
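To make "drift" concrete, here is a minimal base-R sketch of the Population Stability Index (PSI), one common drift statistic. It is for intuition only: the values returned by sfr_monitor_drift() are computed in Snowflake and may use different statistics or binning. The function psi() and the simulated scores are assumptions for illustration.

```r
# Population Stability Index between a baseline and a current sample.
# Bin edges come from baseline quantiles; eps avoids log(0) for empty bins.
psi <- function(baseline, current, bins = 10, eps = 1e-6) {
  edges <- unique(quantile(baseline, probs = seq(0, 1, length.out = bins + 1)))
  edges[1] <- -Inf
  edges[length(edges)] <- Inf
  p <- as.numeric(table(cut(baseline, edges))) / length(baseline)
  q <- as.numeric(table(cut(current, edges))) / length(current)
  p <- pmax(p, eps)
  q <- pmax(q, eps)
  sum((q - p) * log(q / p))
}

set.seed(42)
baseline_scores <- rnorm(5000, mean = 0.0)
same_scores     <- rnorm(5000, mean = 0.0)
shifted_scores  <- rnorm(5000, mean = 0.5)

psi(baseline_scores, same_scores)     # close to 0: no meaningful drift
psi(baseline_scores, shifted_scores)  # clearly larger: the distribution has shifted
```

Whatever statistic a monitor reports, the reading is the same: values near zero mean the current window resembles the baseline, and larger values mean it does not.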

26.7 Segments

Filter metrics to a population slice:

seg <- list(column = "REGION", value = "EMEA")
sfr_monitor_drift(reg, "MY_MONITOR", segment = seg)

Segment filters are useful when global drift masks regional issues or when a model serves heterogeneous cohorts.
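The masking effect is easy to reproduce locally. The sketch below uses simulated data (all names and numbers are hypothetical) in which only EMEA scores shift in the second week; because EMEA is a small share of traffic, the global change stays small while the per-region view exposes it.

```r
set.seed(1)
# Hypothetical inference log: 90% AMER, 10% EMEA, split evenly across two weeks.
log_df <- data.frame(
  REGION     = rep(c("AMER", "EMEA"), times = c(9000, 1000)),
  WEEK       = rep(c(1, 2), times = 5000),
  PREDICTION = rnorm(10000, mean = 0.5, sd = 0.1)
)

# Shift EMEA predictions upward in week 2 only.
shift <- log_df$REGION == "EMEA" & log_df$WEEK == 2
log_df$PREDICTION[shift] <- log_df$PREDICTION[shift] + 0.2

# Change in mean prediction from week 1 to week 2.
mean_change <- function(df) {
  mean(df$PREDICTION[df$WEEK == 2]) - mean(df$PREDICTION[df$WEEK == 1])
}

global_change <- mean_change(log_df)
by_region     <- sapply(split(log_df, log_df$REGION), mean_change)

round(global_change, 3)  # small: EMEA is only 10% of traffic
round(by_region, 3)      # EMEA shows the shift; AMER does not
```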

26.8 Operational pattern

flowchart LR
  DEPLOY[sfr_deploy_model] --> SCORE[Batch / online scoring]
  SCORE --> LOG[Inference log table]
  LOG --> MON[sfr_add_monitor]
  MON --> ALERT[Review drift / performance]
  ALERT --> RETRAIN[Retrain + new registry version]

Connect monitoring alerts to your MLOps process — see MLOps on Snowflake.
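One way to turn the "review" step into a rule is to require that drift stay above a threshold for several consecutive windows before triggering a retrain review. The sketch below is an assumption-laden example: the data frame, the 0.2 threshold, and needs_review() are hypothetical, and the drift values stand in for metrics you would export from a monitor.

```r
# Hypothetical daily drift values, e.g. exported from a monitor query.
drift_daily <- data.frame(
  day   = as.Date("2024-03-01") + 0:6,
  drift = c(0.03, 0.05, 0.22, 0.04, 0.27, 0.31, 0.29)
)

# Flag a review when drift exceeds the threshold for `min_days` consecutive days.
needs_review <- function(values, threshold = 0.2, min_days = 3) {
  runs <- rle(values > threshold)
  any(runs$values & runs$lengths >= min_days)
}

needs_review(drift_daily$drift)                # TRUE: the last three days exceed 0.2
needs_review(drift_daily$drift, min_days = 4)  # FALSE: the longest run is three days
```

Requiring a run of consecutive days keeps a single noisy window, such as the isolated 0.22 above, from triggering a retrain.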

26.9 Companion

Vignette: model-monitoring
Notebook: workspace_model_monitoring examples in package inst/notebooks/

26.10 Next steps

End-to-End Pipeline — full lifecycle map.

Model Registry — prerequisites (sfr_log_model(), deployed version).
