27 End-to-End ML Pipeline
From features to deployed model in one flow
snowflake, R, RStudio, Posit, VS Code, workspace notebooks, snowflakeR, RSnowflake, mlops
27.1 Overview
This chapter ties the guide into a single reference pipeline, the path most R teams want on Snowflake: governed features, training in R, registry deployment, monitoring, and optional scale-out. Use it as a map while reviewing individual chapters.
Workspace R readiness has two supported paths: self-serve bootstrap (setup_notebook() on the default runtime) and organisation CRE (Custom Runtime Environment with R pre-baked). The pipeline stages below are the same after R is available; only the first interactive step differs.
27.2 Learning Objectives
- Map chapters to pipeline stages
- Choose Workspace (bootstrap vs CRE) vs local IDE entry points
- Find companion notebooks for copy-paste starting points
- Complete production checklist before go-live
27.3 Reference pipeline
flowchart TB
subgraph ingest [Data]
src[Source tables / streams]
fv[Feature Views]
end
subgraph dev [Development]
boot["R-ready Workspace<br/>bootstrap or CRE"]
train[Train in R]
exp[Experiments optional]
end
subgraph govern [Governance]
ds[Dataset snapshot]
reg[Model Registry]
end
subgraph prod [Production]
dep[Deploy SPCS / SQL]
mon[Monitoring]
task[Tasks / ML Jobs]
end
src --> fv --> ds --> train
boot --> train
train --> exp
train --> reg --> dep --> mon
dep --> task
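The dependencies in this chapter's reference flowchart can also be written down as a small graph, which makes it easy to ask what must be finished before a given stage. The sketch below is illustrative only: it copies the flowchart's node ids (src, fv, ds, boot, train, exp, reg, dep, mon, task) and edges, and the helper names stage_order and upstream_closure are hypothetical, not part of snowflakeR or RSnowflake.

```python
from graphlib import TopologicalSorter

# Node -> set of direct upstream nodes, copied from the flowchart edges.
UPSTREAM = {
    "fv": {"src"},
    "ds": {"fv"},
    "train": {"ds", "boot"},
    "exp": {"train"},
    "reg": {"train"},
    "dep": {"reg"},
    "mon": {"dep"},
    "task": {"dep"},
}


def stage_order(upstream=UPSTREAM):
    """Return one valid execution order for the pipeline nodes."""
    return list(TopologicalSorter(upstream).static_order())


def upstream_closure(node, upstream=UPSTREAM):
    """Return every node that must be complete before `node` can run."""
    seen = set()
    stack = list(upstream.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(upstream.get(current, ()))
    return seen


if __name__ == "__main__":
    print(stage_order())
    print(sorted(upstream_closure("dep")))
```

For example, upstream_closure("dep") shows that deployment depends on the registry, training, the dataset snapshot, the feature views, the source data, and an R-ready Workspace, while experiments are not on that path.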
27.4 Stage-by-stage map
| Stage | What you do | Guide chapter | Starter artifact |
|---|---|---|---|
| 0. Learn platform | Warehouses, Workspace, ML services | 01, 05 | — |
| 1. Connect | Auth, TOML, IDE or Workspace | 03, 04, 16 | connections.toml |
| 2. R-ready Workspace | Bootstrap or attach org CRE | 06, 09, 07 | snowflaker_config.yaml / cre@<org> |
| 3. Features | Entities, views, point-in-time data | 17 | workspace_feature_store.ipynb |
| 4. Train | tidymodels / fable / custom | 08, 19 | — |
| 5. Register | sfr_log_model(), deploy | 18 | workspace_model_registry.ipynb |
| 6. Scale | Parallel doSnowflake → many-model | 22, 21 | parallel SPCS notebooks |
| 7. Monitor | Inference log + drift on deployed versions | 20 | monitoring vignette |
MLOps framing situates these stages in your organization’s lifecycle.
27.5 Entry paths
Three common ways teams enter this pipeline. Stages 3–7 in the table above follow the same order as the snowflakeR chapters (features through monitoring).
27.5.1 Path A — Workspace with self-serve bootstrap
For pilots, sandboxes, or while an org CRE is not yet available:
27.5.2 Path B — Workspace with organisation CRE (recommended at scale)
When platform/IT has registered a Custom Runtime Environment:
- Snowsight → Workspace → attach cre@<org_name> in notebook advanced settings
- Skip setup_notebook() for standard packages already in the image (R, %%R, snowflakeR, RSnowflake, optional ADBC)
- Optional Python cell: session checks, sfr_load_notebook_config(), or EAI-only extras (see RSnowflake in Workspace)
- %%R cells for modeling; promote to ML Job + same CRE for scheduled batch (09)
CRE and bootstrap can coexist: bootstrap for experimentation, CRE for production notebooks and jobs.
27.5.3 Path C — Local IDE-first
- Install RSnowflake + snowflakeR locally (03)
- Develop in RStudio / Posit / VS Code with connections.toml (04)
- Push notebook + config to Workspace Git; run in Workspace via Path A or B for scheduled execution
Paths A, B, and C use the same snowflakeR / RSnowflake APIs after connection — only environment setup and auth differ.
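The choice between the three paths reduces to two questions: are you working in a Workspace, and has a CRE been registered for your organisation? The sketch below encodes that decision, plus the rule for when setup_notebook() is still needed. It is a planning aid under those assumptions; EntryPath, choose_entry_path, and needs_setup_notebook are hypothetical names that exist only in this example.

```python
from enum import Enum


class EntryPath(Enum):
    A_BOOTSTRAP = "Workspace with self-serve bootstrap"
    B_CRE = "Workspace with organisation CRE"
    C_LOCAL_IDE = "Local IDE-first"


def choose_entry_path(in_workspace: bool, cre_registered: bool) -> EntryPath:
    """Pick an entry path from the two questions described in the text."""
    if not in_workspace:
        return EntryPath.C_LOCAL_IDE
    return EntryPath.B_CRE if cre_registered else EntryPath.A_BOOTSTRAP


def needs_setup_notebook(path: EntryPath, extra_packages=()) -> bool:
    """Bootstrap always runs setup; a CRE only for packages missing from the image."""
    if path is EntryPath.A_BOOTSTRAP:
        return True
    if path is EntryPath.B_CRE:
        return bool(extra_packages)
    # Local IDE sessions install packages locally instead of bootstrapping.
    return False


if __name__ == "__main__":
    path = choose_entry_path(in_workspace=True, cre_registered=False)
    print(path.value, needs_setup_notebook(path))
```

Because bootstrap and CRE can coexist, the same team may get different answers for a sandbox notebook and a production notebook.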
27.6 Marketing / causal demo
The demo overview walks through CausalImpact + Robyn:
- Feature Store for marketing features
- Model Registry for response curves
- SQL-served inference for stakeholder dashboards
Good template for measurement and MMM teams evaluating snowflakeR.
27.7 Minimal smoke test
After your Workspace is R-ready (bootstrap finished or CRE attached and kernel started), run this sequence before a full pipeline.
Path A (bootstrap): run the Python bootstrap cell first (06). Wait until it reports success (~60s typical). Then run the %%R cells below.
Path B (CRE): attach cre@<org_name> in notebook settings, start the kernel, and confirm %%R is available (often automatic on kernel start). Do not re-run setup_notebook() unless you need packages not in the image. Then run the %%R cells below.
snowflakeR connection and config:
%%R
library(snowflakeR)
conn <- sfr_connect()
conn <- sfr_load_notebook_config(conn)
sfr_query(conn, "SELECT CURRENT_USER(), CURRENT_DATABASE(), CURRENT_SCHEMA()")
RSnowflake / DBI (same session; validates the SQL API path used by workers and bulk I/O):
%%R
library(DBI)
library(RSnowflake)
con <- dbConnect(Snowflake())
dbGetQuery(con, "SELECT 1 AS ok")
If the first block fails on a CRE notebook, check that snowflakeR is in the image and that sfr_load_notebook_config() points at your project’s YAML (absolute path under /filesystem/ for Git projects; see 06). If the second block fails only on the bootstrap path, re-run the Python bootstrap cell and confirm EAI is enabled.
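If you want the smoke test to report a next step rather than a raw error, the two checks and their failure hints can be wrapped in a small runner. This is a minimal sketch in a Python cell: the checks are passed in as plain callables (stand-ins here, since the real ones need a live session), and run_smoke_test, diagnose, and the HINTS table are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Optional, Tuple

# (entry path, failing check) -> suggested next step, taken from the text above.
HINTS = {
    ("cre", "snowflakeR"): (
        "Check that snowflakeR is in the image and that the notebook "
        "config points at the project's YAML."
    ),
    ("bootstrap", "RSnowflake"): (
        "Re-run the Python bootstrap cell and confirm EAI is enabled."
    ),
}


def run_smoke_test(checks: List[Tuple[str, Callable[[], object]]]) -> Optional[str]:
    """Run checks in order; return the name of the first failure, or None."""
    for name, check in checks:
        try:
            check()
        except Exception:
            return name
    return None


def diagnose(path: str, failed: Optional[str]) -> str:
    """Map a failing check on a given entry path to a suggested next step."""
    if failed is None:
        return "Smoke test passed."
    return HINTS.get((path, failed), "See Appendix C: Troubleshooting.")


if __name__ == "__main__":
    def fake_ok():
        return 1  # stand-in for a query that succeeds

    def fake_fail():
        raise RuntimeError("connection refused")  # stand-in for a failure

    failed = run_smoke_test([("snowflakeR", fake_ok), ("RSnowflake", fake_fail)])
    print(diagnose("bootstrap", failed))
```

Stopping at the first failure keeps the hint unambiguous: a later check often fails only because an earlier one did.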
27.8 Production handoff
Before production cutover:
- Appendix D: Production checklist — includes org CRE onboarding for platform teams and tarball pins for both CRE builds and bootstrap YAML
- Appendix C: Troubleshooting
- Pre-built tarballs in bootstrap configs and CRE build profiles (Appendix B)
- Decide bootstrap vs CRE per audience: analysts on CRE; sandbox/PoC may keep bootstrap until the image is promoted
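One checklist item that is easy to automate is the tarball pin: every package tarball referenced by a bootstrap config or CRE build profile should carry an explicit version in its file name. The sketch below assumes the usual R naming convention of name_version.tar.gz (optionally with a platform suffix for binary builds); the paths, version numbers, and the helper unpinned_tarballs are hypothetical, and the real config layout is described in the appendices rather than here.

```python
import re

# Matches e.g. pkg_1.2.3.tar.gz or pkg_1.2.3_R_x86_64-pc-linux-gnu.tar.gz
TARBALL = re.compile(
    r"^(?P<name>[A-Za-z][A-Za-z0-9.]*)"
    r"_(?P<version>\d+(?:[.-]\d+)+)"
    r"(?:_[A-Za-z0-9._-]+)?"
    r"\.(?:tar\.gz|tgz)$"
)


def unpinned_tarballs(paths):
    """Return the entries whose file name lacks an explicit package version."""
    bad = []
    for path in paths:
        filename = path.rsplit("/", 1)[-1]
        if TARBALL.match(filename) is None:
            bad.append(path)
    return bad


if __name__ == "__main__":
    # Hypothetical entries; real names and versions come from your own config.
    entries = [
        "pkgs/snowflakeR_0.3.1.tar.gz",
        "pkgs/RSnowflake_latest.tar.gz",
    ]
    print(unpinned_tarballs(entries))
```

A check like this fits in CI for the repository that holds the bootstrap YAML and CRE build profiles, so an unpinned entry is caught before cutover.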
Community packages — validate SLAs, support, and compliance with your organization.
27.9 Feedback
Open issues on snowflakeR (guide source under guide/) or package repos linked from the home page.