flowchart TB
subgraph notebook [Workspace Notebook]
PY[Python cells bootstrap Snowpark]
SQL[SQL cells]
Rcell["%%R cells"]
end
subgraph ipy [Python / IPython kernel]
rpy2[rpy2 embeds R]
reticulate[reticulate inside R]
end
subgraph rlayer [Embedded R session]
SFR[snowflakeR]
RS[RSnowflake optional]
end
subgraph pyml [Python ML stack]
ML[snowflake-ml-python]
SP[Snowpark session]
end
subgraph sf [Snowflake]
WH[(Warehouse)]
MLAPI[Feature Store / Registry / SPCS]
end
PY --> SP
Rcell --> rpy2 --> SFR
SFR --> reticulate --> ML --> MLAPI
SFR --> RS
RS --> WH
SQL --> WH
ML --> SP
10 R Cells & Interop
%%R magic, outputs, and Python↔︎R data exchange
snowflake, R, RStudio, Posit, VS Code, workspace notebooks, snowflakeR, RSnowflake, mlops
10.1 Overview
Workspace Notebooks expose a Python kernel — there is no separate R kernel in the notebook UI. R users work through %%R cells: IPython cell magic that hands the cell body to rpy2, which runs it in an embedded R interpreter inside the same Python process.
That design choice is fundamental. It means:
- R is provisioned once per session: either
setup_notebook()in a Python cell (cold bootstrap) or pre-installed in a CRE container (no bootstrap cell) - All subsequent ML and stats work can live in
%%Rcells - SQL cells still run on the warehouse (unchanged)
- Python,
%%R, and SQL cells can be intermingled throughout a notebook on the same compute pool / container runtime - snowflakeR and RSnowflake both run inside the embedded R session
This chapter explains how that feels in practice: session persistence, passing data across the bridge, readable output, plotting, Workspace tools (Variable explorer, Scratchpad), and mixing R with Python and SQL in one notebook. For the three-package stack, see Architecture. For bootstrap detail, see Workspace Bootstrap.
10.2 Learning Objectives
- Explain why
%%Rcells are Python cells under the hood - Use
-i/-omagic flags and know when to avoid cross-bridge data transfer - Configure output and plots for Workspace rendering — including plot + interactive data grid in one cell
- Use the Tools panel Variable explorer and Scratchpad when debugging R/Python interop
- Structure multi-language notebooks and resolve
/filesystem/paths
10.3 There is no R kernel
Snowflake Workspace follows the broader Jupyter model: one active kernel per notebook session. Today that kernel is Python (Snowpark + IPython).
| What you see in UI | What actually runs |
|---|---|
| Python cell | Python / Snowpark |
| SQL cell | SQL on selected warehouse |
“R cell” (%%R) |
Python kernel → rpy2 → embedded R |
You never select “R” as a kernel. After bootstrap registers the magic, you prefix R code with %%R and write normal R syntax below it. Variables, loaded packages, and conn objects persist across %%R cells within the same session — analogous to running cells top-to-bottom in RStudio.
rpy2 embeds a full R interpreter in the Python process (not a subprocess). It registers %%R so IPython routes cell text to R and returns stdout, values, and plots to the notebook display.
reticulate is the R→Python bridge. When you call sfr_connect() or Feature Store APIs, snowflakeR uses reticulate to invoke snowflake-ml-python. You do not write that Python yourself — but it explains why snowflakeR needs a compatible Python env alongside rpy2.
10.4 Architecture in a mixed notebook
Typical roles:
| Component | Use in notebook |
|---|---|
| snowflakeR | ML platform — Feature Store, Registry, experiments, doSnowflake |
| RSnowflake | DBI/dbplyr, Connections-style SQL, optional bulk write paths |
| Snowpark (Python) | ETL, session.table(), passing small frames to R with -i |
| SQL cells | DDL, context, warehouse-side transforms |
Both R packages can coexist. Prefer one connection story: sfr_connect() then sfr_dbi_connection() if you need dplyr (RSnowflake in Workspace).
10.5 Session lifecycle
sequenceDiagram participant U as You participant PY as Python kernel participant BOOT as setup_notebook participant R as Embedded R via rpy2 U->>PY: Bootstrap cell OR attach CRE PY->>BOOT: micromamba or pre-baked R, %%R magic U->>PY: Run %%R cell — library(snowflakeR) PY->>R: Execute R code R-->>PY: Output / plots PY-->>U: Display in notebook Note over R: conn, data, packages persist U->>PY: Next %%R cell PY->>R: Same R session continues
| Event | What happens |
|---|---|
| Start session | Empty Python kernel; R not available until bootstrap or CRE attach |
| After bootstrap or CRE | R available; %%R magic registered |
First %%R cell |
Loads packages, creates conn |
Later %%R cells |
Reuse same R workspace — no need to library() every cell unless you prefer explicitness |
| Restart session | R state lost — re-run bootstrap (fast if cached/CRE) |
| End session | Container recycled; interactive installs may need YAML/CRE |
10.6 First R cells
Without a CRE — run bootstrap first, then R:
- Python:
setup_notebook(...)— see Bootstrap - R (
%%R): connect and smoke test
With a reference CRE — attach cre@<org_name> when starting the notebook service (Workspaces overview); skip step 1. Go straight to %%R (magic is registered on kernel start). Optional Python cell: setup_notebook(config="/opt/sfnb/config/cre_multilang_r.yaml", ...) for EAI checks or session context only — it should not reinstall R when SFNB_CUSTOM_RUNTIME=1.
%%R
library(snowflakeR)
conn <- sfr_connect()
conn <- sfr_load_notebook_config(conn) # warehouse / db / schema from YAML
sfr_query(conn, "SELECT CURRENT_USER(), CURRENT_ROLE()")RSnowflake-only path (no ML Python stack beyond what bootstrap installed):
%%R
library(DBI)
library(RSnowflake)
con <- dbConnect(Snowflake())
dbGetQuery(con, "SELECT CURRENT_USER()")In Workspace, both paths use session OAuth — no PAT in cells. Bootstrap exports SNOWFLAKE_* env vars RSnowflake reads automatically.
10.7 The %%R magic line
The first line of an R cell is not R code — it is an IPython directive:
Optional flags on the magic line:
| Flag | Meaning | Example |
|---|---|---|
-i var |
Import Python variable into R | %%R -i py_df |
-o var |
Export R object to Python | %%R -o r_df |
-o var1,var2 |
Export multiple R objects | %%R -o p,mtcars |
-w / -h |
Plot width / height (pixels) | %%R -w 700 -h 450 |
Combine flags: %%R -i py_df -o r_summary -w 600 -h 400.
Export several objects in one cell:
%%R -w 700 -h 450 -o p,mtcars
library(ggplot2)
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + theme_minimal()
print(p)Both p and mtcars appear in the Python namespace for the next cell. You can also repeat the flag: %%R -o p -o mtcars. Spaces after commas (-o p, mtcars) are normalized automatically by sfnb_multilang — older builds without that fix may error with mtcarslibrary(...).
Combining -w / -h, print(p), and -o gives two complementary outputs in a single %%R cell:
- Static ggplot image —
-w 700 -h 450plusprint(p)renders the plot inline (same as a plot-only cell). - Interactive data grid —
-oexports objects into the Python kernel;SafeRMagicsreturns the last-ovariable that is a pandasDataFrameas the cell result, which Workspace renders as the Table / Chart / Pivot explorer.
So %%R -w 700 -h 450 -o p,mtcars with print(p) typically shows both the scatter plot and an explorable mtcars grid. Put the DataFrame last in the -o list (-o p,mtcars, not -o mtcars,p) if you want the grid to target that table. Non-tabular -o values (ggplot objects, raw PNG bytes) are still exported to Python but do not drive the grid viewer.
Use %%R --text if you want exports without the interactive grid.
10.7.1 Passing Python → R (-i)
Use for small objects created in Python (config dicts, sampled DataFrames):
rpy2 converts common types: pandas DataFrame → R data.frame, scalars → R vectors.
10.7.2 Passing R → Python (-o)
The bundled r_helpers.py (from sfnb_setup) also wires richer conversions for notebook displays — lazy dbplyr tables can execute SQL via Snowpark and land as pandas grids without manual -o in some workflows.
What lands in Python after -o (via rpy2’s default converter):
| R object | Python type in kernel | Useful for |
|---|---|---|
data.frame |
pandas.DataFrame |
Next Python cell, grid display |
ggplot object |
rpy2 RS4 (ggplot2::ggplot) |
Re-print in R via Scratchpad; not matplotlib-ready |
raw vector (PNG bytes) |
rpy2 ByteVector / bytes-like |
IPython.display.Image(data=bytes(...)) |
| Vectors, lists | numpy / Python scalars |
Small summaries, counts |
See Workspace tools panel for inspecting these in the Variable explorer.
10.7.3 When not to use -i / -o
| Anti-pattern | Better approach |
|---|---|
| Passing million-row DataFrames | sfr_write_table() / SQL materialization; read in R with sfr_read_table() |
| Shuttling training data every cell | One table in Snowflake; both languages query it |
| Round-tripping large objects for ETL | Python Snowpark or R/dbplyr — pick one pipeline per step |
Rule of thumb: cross the bridge for prototypes and small samples; use Snowflake tables as the system of record for ML datasets.
10.8 Output helpers
10.8.1 What we know about Workspace + R output
When R runs through rpy2 in Workspace, stdout often reaches the notebook UI as several separate display events (one per write_console / flush from rpy2). Snowsight’s HTML renderer can insert extra vertical spacing between those events — printed tables look stretched and cat() messages double-space.
This is cosmetic, not wrong data. It was observed during multilang notebook development on Workspace; product behaviour may change by release.
Two mitigations work together:
| Layer | Where | What it does |
|---|---|---|
%%R magic (sfnb-multilang) |
Every %%R cell |
Wraps cell code in R capture.output({...}), buffers console writes, emits one Python print() block — see SafeRMagics in r_helpers.py |
| snowflakeR helpers | When you call them explicitly | rprint(), rview(), rglimpse(), rcat() — wide print() width, cli summaries; use when magic buffering is not enough or you use bare rpy2 |
| RSnowflake | DBI-only workflows | Same helper names in RSnowflake use writeLines(capture.output(...)) for the same reason — see RSnowflake/R/helpers.R |
After bootstrap, sfnb-multilang also sets options(width = 200) and options(tibble.width = Inf) so wide tables wrap less inside each line.
Interactive grids: For data frames, the sfnb %%R magic can return a pandas DataFrame to Workspace’s datagrid (lazy dbplyr tables may execute SQL via Snowpark without copying all rows to R). Use %%R --text to force plain text only. Details: multilang r_helpers.py docstring on SafeRMagics.
| Function (snowflakeR) | Replaces | Use for |
|---|---|---|
rprint(x) |
print(x) |
Models, lists, data frames |
rview(df, n = 10) |
head(df) |
Quick row preview + row count message |
rglimpse(df) |
glimpse() / str() |
Column types and sample values |
rcat(...) |
cat() |
Status lines without awkward spacing |
If output still looks wrong, compare the same cell in local RStudio vs Workspace to confirm it is display-layer only. Platform internals: Appendix G.
Bootstrap / CRE also set console width from Python when needed:
10.9 Plotting
Workspace does not use RStudio’s plot pane. ggplot2 and base graphics need:
- Explicit dimensions on the magic line
print(p)on ggplot objects — the grid is not auto-printed
%%R -w 700 -h 450
library(ggplot2)
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + theme_minimal()
print(p)Without print(p), the cell may run successfully but show a blank output area. Tune -w / -h for slide exports vs inline notebook viewing.
To show a plot and pass the underlying data to Python in one shot, combine plotting flags with -o — see the callout under The %%R magic line.
10.10 Workspace tools panel
Snowsight’s notebook Tools panel (alongside the main cell editor) is useful when debugging %%R workflows — especially Python↔︎R exports — without cluttering the notebook with throwaway cells.
10.10.1 Variable explorer
Open Variables in the Tools panel. It lists names in the Python kernel namespace (updated when a cell finishes).
Objects exported with -o appear here as rpy2-converted Python values, not as native R objects in an RStudio Environment pane:
| Name (example) | Type shown | Meaning |
|---|---|---|
mtcars |
DataFrame (32, 11) |
R data.frame → pandas |
p |
RS4 (ggplot2::ggplot, …) |
Live ggplot handle in embedded R |
png_bytes |
ByteVector / raw |
PNG file bytes from readBin() |
fig, ax |
Figure, Axes |
Native matplotlib objects from Python cells |
Use the Preview column to confirm conversion before writing a downstream Python cell. The explorer answers “what did -o actually give me?” — especially for ggplot (RS4) vs tabular data (DataFrame) vs binary (raw).
10.10.2 Scratchpad
Scratchpad provides extra Python kernel cells that do not appear in the saved notebook — ideal for ad-hoc inspection while developing:
%%Rthenprint(mtcars)orstr(conn)to probe the embedded R session%%R -w 700 -h 450+print(p)to re-render a ggplot already in R memory- Python snippets:
mtcars.head(),type(png_bytes),bytes(png_bytes)[:8]
Create multiple scratch tabs (Scratch 1, Scratch 2, …) for parallel experiments. Scratchpad cells share the same kernel session as the notebook (variables from -o, conn, loaded packages all visible).
Scratchpad output is ephemeral — useful for debugging and demos-in-progress, not for production pipeline steps. Commit logic to proper notebook cells once it works.
10.11 Session context in R
Workspace often does not set database/schema the way Snowsight worksheets do. Set context before unqualified table names:
Or explicitly (must reassign — sfr_use() returns a new object):
%%R
conn <- sfr_use(conn, warehouse = "ML_WH", database = "ML_DB", schema = "FEATURES")
sfr_fqn(conn, "TRAINING_DATA") # "ML_DB.FEATURES.TRAINING_DATA"SQL cells can run USE WAREHOUSE / USE SCHEMA for warehouse context; R session context is separate — YAML + sfr_load_notebook_config() keeps R and your config aligned.
10.12 Filesystem layout in Workspace
Git-connected projects mount the repo under /filesystem/<session-hash>/. The hash changes per session — never hard-code it.
10.12.1 Why path discovery matters
| Problem | Cause |
|---|---|
ModuleNotFoundError: sfnb_setup |
Notebook not in same folder as sfnb_setup.py |
| Wrong config loaded | Relative config= resolved from unreliable cwd |
| Broken imports after restart | Different mount hash — discovery pattern still works |
| Notebook location | Bootstrap pattern |
|---|---|
snowflakeR/inst/notebooks/*.ipynb |
from sfnb_setup import setup_notebook directly |
| Anywhere else in the repo tree | Scan /filesystem/, add snowflakeR/inst/notebooks to sys.path (see below) |
import sys, os
_repo_root = None
for _h in os.listdir("/filesystem"):
_p = os.path.join("/filesystem", _h, "snowflakeR", "inst", "notebooks")
if os.path.isdir(_p):
_repo_root = os.path.join("/filesystem", _h)
sys.path.insert(0, _p)
break
if _repo_root is None:
raise RuntimeError("Could not find snowflakeR/inst/notebooks under /filesystem/")
from sfnb_setup import setup_notebook
setup_notebook(
config=os.path.join(_repo_root, "internal", "my_project", "snowflaker_config.yaml"),
packages=["snowflakeR", "RSnowflake"],
)Keep one copy of sfnb_setup.py in the repo (usually snowflakeR/inst/notebooks/) instead of duplicating it per folder. Full walkthrough: Bootstrap — filesystem.
10.13 Structuring a multi-language notebook
A productive layout separates setup, data, modeling, and operational steps:
flowchart TD C1["1 Python — setup_notebook or CRE"] C2["2 SQL — DDL, USE context, seed tables"] C3["3 R — connect, dbplyr ETL, train"] C4["4 R or Python — optional Snowpark ETL"] C5["5 R — log model, registry"] C6["6 SQL — verify scores / grants"] C1 --> C2 --> C3 --> C4 --> C5 --> C6
| Cell | Language | Runs on | Example |
|---|---|---|---|
| Bootstrap (if no CRE) | Python | Container kernel | setup_notebook() |
| Context / DDL | SQL | Warehouse | CREATE TABLE, USE WAREHOUSE |
| Explore / model | R (%%R) |
Embedded R | sfr_connect(), tidymodels |
| Warehouse ETL | R (%%R) + dbplyr |
Warehouse (push-down SQL) | tbl(con, ...) %>% mutate(...) %>% compute() — see RSnowflake: Query |
| Heavy ETL (optional) | Python | Snowpark on warehouse | session.write_pandas() when the notebook is already Python-centric or a colleague owns Snowpark transforms |
| ML platform | R (%%R) |
Embedded R | sfr_log_model() |
| Validation | SQL | Warehouse | SELECT COUNT(*) FROM scores |
Prefer R dbplyr for data pipelines. RSnowflake + dbplyr translates dplyr to Snowflake SQL and materializes with compute() — same push-down pattern as Python Snowpark, without leaving R. Use Snowpark Python only when you need Python-only APIs, an existing Snowpark pipeline, or mixed-language cells where Python already owns the transform step.
Compute reminder: SQL uses warehouse credits; R/Python kernel work uses container runtime credits; Snowflake ML API calls may touch both depending on operation.
10.14 Installing extra R packages
10.14.1 Cold bootstrap (no CRE)
setup_notebook() installs the stack from snowflaker_*.yaml (conda, CRAN, tarballs). For one-off packages in the same session:
%%R
options(repos = c(CRAN = "https://cloud.r-project.org"))
install.packages("forecast", quiet = TRUE)
suppressPackageStartupMessages(library(forecast))Workspace has no interactive stdin — always set repos, use quiet = TRUE, and suppress startup messages. Packages with system libraries (e.g. sf → GDAL) often fail with plain install.packages(); add conda-forge equivalents (r-sf) to YAML instead.
Interactive installs do not survive container recycle — add recurring packages to YAML or move to a CRE.
10.14.2 On top of a CRE (recommended paths)
A CRE already has R, %%R, snowflakeR, and RSnowflake. To add more packages:
| Approach | When to use | Notes |
|---|---|---|
| Rebuild CRE | Production standard | Add packages to cre_profile.yaml extras.conda_r / extras.cran, rebuild image, CREATE OR REPLACE CRE — CRE chapter |
In-session install.packages() |
Quick experiment | Same pattern as above; needs EAI (or internal mirrors: CRAN) if the package is not cached in the image |
micromamba in %%R |
Conda-forge-only deps | Sys.which("micromamba") should exist in CRE; install into workspace_env — platform team may prefer image rebuild instead |
%%R
# Example: ad hoc CRAN add on a CRE session (EAI / mirror must allow CRAN)
options(repos = c(CRAN = "https://cloud.r-project.org"))
if (!requireNamespace("forecast", quietly = TRUE)) {
install.packages("forecast", quiet = TRUE)
}
library(forecast)# Example: bake into next CRE build (cre_profile.yaml)
extras:
cran:
- forecast
conda_r:
- r-forecast # prefer when a conda-forge build existsAfter an in-session install, packages last until kernel restart or container recycle — same as bootstrap. For a durable team stack, rebuild the CRE rather than relying on analysts to install ad hoc.
Pin r-reticulate>=1.25 in config (bootstrap YAML or CRE profile). Older reticulate versions can segfault when loaded with rpy2 in the same process.
10.15 Workspace vs local RStudio
| Aspect | Local RStudio / Posit | Workspace %%R |
|---|---|---|
| Kernel | Native R | Python + rpy2 |
| Auth | connections.toml, key-pair, browser |
Session OAuth auto |
sfr_connect() |
Reads TOML / params | Wraps active Snowpark session |
| Plots | Plot pane | %%R -w … -h … + print(p) |
| Environment / debug | RStudio Environment, Console | Tools panel Variables + Scratchpad (§ Tools panel) |
| Output | Standard | Prefer rprint() / rview() |
| Setup | Once per machine | Bootstrap per session (CRE reduces cost) |
| Python interop | reticulate (optional) | rpy2 + reticulate (required for snowflakeR) |
| Kernel picker | Native R (RStudio / IRkernel locally) | Python only; R via %%R — Appendix G |
Design goal: the same snowflakeR R code runs in both environments — only setup and display differ (Local R Setup).
10.16 REST inference and PATs
Most notebook workflows use sfr_predict() through the Snowpark session — no extra token.
sfr_predict_rest() calls deployed SPCS endpoints over HTTPS; the Workspace session token is not valid for that ingress path. For REST predict from a notebook testing external apps, set a PAT via Sys.setenv(SNOWFLAKE_PAT = "...") — prefer not to hard-code in committed notebooks.
10.17 Common pitfalls
| Symptom | Likely cause | Fix |
|---|---|---|
%%R unknown command |
Bootstrap not run or kernel restarted | Re-run Python bootstrap cell |
Segfault on library(snowflakeR) |
Old reticulate | r-reticulate>=1.25 in YAML |
| Table not found | Missing DB/schema context | sfr_load_notebook_config() |
| Blank ggplot cell | Missing print(p) or size flags |
%%R -w 700 -h 450 + print(p) |
Huge data frame slow across -i |
Bridge transfer | Query from Snowflake in R instead |
sfr_use() ignored |
No reassignment | conn <- sfr_use(conn, ...) |
could not find function "mtcarslibrary" |
Space after comma in -o list on old sfnb_multilang |
Use -o p,mtcars or -o p -o mtcars; upgrade multilang (normalizes -o p, mtcars) |
More: Appendix C: Troubleshooting.
10.18 Companion material
| Resource | Content |
|---|---|
workspace_quickstart.ipynb |
Runnable end-to-end |
Vignette workspace-notebooks |
Full patterns and bulk-write notes |
r_helpers.py |
Output helpers, PAT/key helpers, display bridges |
10.19 Next steps
CRE & ML Jobs — skip repeat bootstrap with pre-baked images.
snowflakeR: Connect — session and operational APIs.