10  R Cells & Interop

%%R magic, outputs, and Python↔︎R data exchange

Keywords

snowflake, R, RStudio, Posit, VS Code, workspace notebooks, snowflakeR, RSnowflake, mlops

10.1 Overview

Workspace Notebooks expose a Python kernel — there is no separate R kernel in the notebook UI. R users work through %%R cells: IPython cell magic that hands the cell body to rpy2, which runs it in an embedded R interpreter inside the same Python process.

That design choice is fundamental. It means:

  • R is provisioned once per session: either setup_notebook() in a Python cell (cold bootstrap) or pre-installed in a CRE container (no bootstrap cell)
  • All subsequent ML and stats work can live in %%R cells
  • SQL cells still run on the warehouse (unchanged)
  • Python, %%R, and SQL cells can be intermingled throughout a notebook on the same compute pool / container runtime
  • snowflakeR and RSnowflake both run inside the embedded R session

This chapter explains how that feels in practice: session persistence, passing data across the bridge, readable output, plotting, Workspace tools (Variable explorer, Scratchpad), and mixing R with Python and SQL in one notebook. For the three-package stack, see Architecture. For bootstrap detail, see Workspace Bootstrap.

10.2 Learning Objectives

  • Explain why %%R cells are Python cells under the hood
  • Use -i / -o magic flags and know when to avoid cross-bridge data transfer
  • Configure output and plots for Workspace rendering — including plot + interactive data grid in one cell
  • Use the Tools panel Variable explorer and Scratchpad when debugging R/Python interop
  • Structure multi-language notebooks and resolve /filesystem/ paths

10.3 There is no R kernel

Snowflake Workspace follows the broader Jupyter model: one active kernel per notebook session. Today that kernel is Python (Snowpark + IPython).

What you see in UI What actually runs
Python cell Python / Snowpark
SQL cell SQL on selected warehouse
“R cell” (%%R) Python kernel → rpy2 → embedded R

You never select “R” as a kernel. After bootstrap registers the magic, you prefix R code with %%R and write normal R syntax below it. Variables, loaded packages, and conn objects persist across %%R cells within the same session — analogous to running cells top-to-bottom in RStudio.

What is rpy2?

rpy2 embeds a full R interpreter in the Python process (not a subprocess). It registers %%R so IPython routes cell text to R and returns stdout, values, and plots to the notebook display.

What is reticulate?

reticulate is the R→Python bridge. When you call sfr_connect() or Feature Store APIs, snowflakeR uses reticulate to invoke snowflake-ml-python. You do not write that Python yourself — but it explains why snowflakeR needs a compatible Python env alongside rpy2.

Why not a native R kernel (IRkernel)?

Jupyter supports a standalone R kernel (IRkernel) on desktop or self-hosted Hub. Workspace today exposes one Python kernel per session; we use %%R instead. CRE can bake IRkernel on disk, but without platform kernel selection you still would not get a Snowsight “R kernel” UX — and you would lose a single shared session with SQL cells and snowflakeR’s Snowpark bridge. See Appendix G.


10.4 Architecture in a mixed notebook

flowchart TB
  subgraph notebook [Workspace Notebook]
    PY[Python cells bootstrap Snowpark]
    SQL[SQL cells]
    Rcell["%%R cells"]
  end

  subgraph ipy [Python / IPython kernel]
    rpy2[rpy2 embeds R]
    reticulate[reticulate inside R]
  end

  subgraph rlayer [Embedded R session]
    SFR[snowflakeR]
    RS[RSnowflake optional]
  end

  subgraph pyml [Python ML stack]
    ML[snowflake-ml-python]
    SP[Snowpark session]
  end

  subgraph sf [Snowflake]
    WH[(Warehouse)]
    MLAPI[Feature Store / Registry / SPCS]
  end

  PY --> SP
  Rcell --> rpy2 --> SFR
  SFR --> reticulate --> ML --> MLAPI
  SFR --> RS
  RS --> WH
  SQL --> WH
  ML --> SP

Typical roles:

Component Use in notebook
snowflakeR ML platform — Feature Store, Registry, experiments, doSnowflake
RSnowflake DBI/dbplyr, Connections-style SQL, optional bulk write paths
Snowpark (Python) ETL, session.table(), passing small frames to R with -i
SQL cells DDL, context, warehouse-side transforms

Both R packages can coexist. Prefer one connection story: sfr_connect() then sfr_dbi_connection() if you need dplyr (RSnowflake in Workspace).


10.5 Session lifecycle

sequenceDiagram
  participant U as You
  participant PY as Python kernel
  participant BOOT as setup_notebook
  participant R as Embedded R via rpy2

  U->>PY: Bootstrap cell OR attach CRE
  PY->>BOOT: micromamba or pre-baked R, %%R magic
  U->>PY: Run %%R cell — library(snowflakeR)
  PY->>R: Execute R code
  R-->>PY: Output / plots
  PY-->>U: Display in notebook
  Note over R: conn, data, packages persist
  U->>PY: Next %%R cell
  PY->>R: Same R session continues

Event What happens
Start session Empty Python kernel; R not available until bootstrap or CRE attach
After bootstrap or CRE R available; %%R magic registered
First %%R cell Loads packages, creates conn
Later %%R cells Reuse same R workspace — no need to library() every cell unless you prefer explicitness
Restart session R state lost — re-run bootstrap (fast if cached/CRE)
End session Container recycled; interactive installs may need YAML/CRE

10.6 First R cells

Without a CRE — run bootstrap first, then R:

  1. Python: setup_notebook(...) — see Bootstrap
  2. R (%%R): connect and smoke test

With a reference CRE — attach cre@<org_name> when starting the notebook service (Workspaces overview); skip step 1. Go straight to %%R (magic is registered on kernel start). Optional Python cell: setup_notebook(config="/opt/sfnb/config/cre_multilang_r.yaml", ...) for EAI checks or session context only — it should not reinstall R when SFNB_CUSTOM_RUNTIME=1.

%%R
library(snowflakeR)
conn <- sfr_connect()
conn <- sfr_load_notebook_config(conn)   # warehouse / db / schema from YAML
sfr_query(conn, "SELECT CURRENT_USER(), CURRENT_ROLE()")

RSnowflake-only path (no ML Python stack beyond what bootstrap installed):

%%R
library(DBI)
library(RSnowflake)
con <- dbConnect(Snowflake())
dbGetQuery(con, "SELECT CURRENT_USER()")

In Workspace, both paths use session OAuth — no PAT in cells. Bootstrap exports SNOWFLAKE_* env vars RSnowflake reads automatically.


10.7 The %%R magic line

The first line of an R cell is not R code — it is an IPython directive:

%%R
# Everything below this line is R
x <- 1 + 1

Optional flags on the magic line:

Flag Meaning Example
-i var Import Python variable into R %%R -i py_df
-o var Export R object to Python %%R -o r_df
-o var1,var2 Export multiple R objects %%R -o p,mtcars
-w / -h Plot width / height (pixels) %%R -w 700 -h 450

Combine flags: %%R -i py_df -o r_summary -w 600 -h 400.

Export several objects in one cell:

%%R -w 700 -h 450 -o p,mtcars
library(ggplot2)
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + theme_minimal()
print(p)

Both p and mtcars appear in the Python namespace for the next cell. You can also repeat the flag: %%R -o p -o mtcars. Spaces after commas (-o p, mtcars) are normalized automatically by sfnb_multilang — older builds without that fix may error with mtcarslibrary(...).

Plot and interactive data grid in one cell

Combining -w / -h, print(p), and -o gives two complementary outputs in a single %%R cell:

  1. Static ggplot image-w 700 -h 450 plus print(p) renders the plot inline (same as a plot-only cell).
  2. Interactive data grid-o exports objects into the Python kernel; SafeRMagics returns the last -o variable that is a pandas DataFrame as the cell result, which Workspace renders as the Table / Chart / Pivot explorer.

So %%R -w 700 -h 450 -o p,mtcars with print(p) typically shows both the scatter plot and an explorable mtcars grid. Put the DataFrame last in the -o list (-o p,mtcars, not -o mtcars,p) if you want the grid to target that table. Non-tabular -o values (ggplot objects, raw PNG bytes) are still exported to Python but do not drive the grid viewer.

Use %%R --text if you want exports without the interactive grid.

10.7.1 Passing Python → R (-i)

Use for small objects created in Python (config dicts, sampled DataFrames):

# Python cell
import pandas as pd
py_data = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
%%R -i py_data
str(py_data)      # R data.frame
summary(py_data)

rpy2 converts common types: pandas DataFrame → R data.frame, scalars → R vectors.

10.7.2 Passing R → Python (-o)

%%R -o r_result
r_result <- sfr_query(conn, "SELECT * FROM my_table LIMIT 100")
# Python cell
r_result.head()   # pandas DataFrame in Python

The bundled r_helpers.py (from sfnb_setup) also wires richer conversions for notebook displays — lazy dbplyr tables can execute SQL via Snowpark and land as pandas grids without manual -o in some workflows.

What lands in Python after -o (via rpy2’s default converter):

R object Python type in kernel Useful for
data.frame pandas.DataFrame Next Python cell, grid display
ggplot object rpy2 RS4 (ggplot2::ggplot) Re-print in R via Scratchpad; not matplotlib-ready
raw vector (PNG bytes) rpy2 ByteVector / bytes-like IPython.display.Image(data=bytes(...))
Vectors, lists numpy / Python scalars Small summaries, counts

See Workspace tools panel for inspecting these in the Variable explorer.

10.7.3 When not to use -i / -o

Anti-pattern Better approach
Passing million-row DataFrames sfr_write_table() / SQL materialization; read in R with sfr_read_table()
Shuttling training data every cell One table in Snowflake; both languages query it
Round-tripping large objects for ETL Python Snowpark or R/dbplyr — pick one pipeline per step

Rule of thumb: cross the bridge for prototypes and small samples; use Snowflake tables as the system of record for ML datasets.


10.8 Output helpers

10.8.1 What we know about Workspace + R output

When R runs through rpy2 in Workspace, stdout often reaches the notebook UI as several separate display events (one per write_console / flush from rpy2). Snowsight’s HTML renderer can insert extra vertical spacing between those events — printed tables look stretched and cat() messages double-space.

This is cosmetic, not wrong data. It was observed during multilang notebook development on Workspace; product behaviour may change by release.

Two mitigations work together:

Layer Where What it does
%%R magic (sfnb-multilang) Every %%R cell Wraps cell code in R capture.output({...}), buffers console writes, emits one Python print() block — see SafeRMagics in r_helpers.py
snowflakeR helpers When you call them explicitly rprint(), rview(), rglimpse(), rcat() — wide print() width, cli summaries; use when magic buffering is not enough or you use bare rpy2
RSnowflake DBI-only workflows Same helper names in RSnowflake use writeLines(capture.output(...)) for the same reason — see RSnowflake/R/helpers.R

After bootstrap, sfnb-multilang also sets options(width = 200) and options(tibble.width = Inf) so wide tables wrap less inside each line.

Tip

Interactive grids: For data frames, the sfnb %%R magic can return a pandas DataFrame to Workspace’s datagrid (lazy dbplyr tables may execute SQL via Snowpark without copying all rows to R). Use %%R --text to force plain text only. Details: multilang r_helpers.py docstring on SafeRMagics.

Function (snowflakeR) Replaces Use for
rprint(x) print(x) Models, lists, data frames
rview(df, n = 10) head(df) Quick row preview + row count message
rglimpse(df) glimpse() / str() Column types and sample values
rcat(...) cat() Status lines without awkward spacing
%%R
df <- sfr_query(conn, "SELECT * FROM orders LIMIT 50")
rview(df, n = 8)
rglimpse(df)

If output still looks wrong, compare the same cell in local RStudio vs Workspace to confirm it is display-layer only. Platform internals: Appendix G.

Bootstrap / CRE also set console width from Python when needed:

from r_helpers import set_r_console_width
set_r_console_width(150)

10.9 Plotting

Workspace does not use RStudio’s plot pane. ggplot2 and base graphics need:

  1. Explicit dimensions on the magic line
  2. print(p) on ggplot objects — the grid is not auto-printed
%%R -w 700 -h 450
library(ggplot2)
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + theme_minimal()
print(p)

Without print(p), the cell may run successfully but show a blank output area. Tune -w / -h for slide exports vs inline notebook viewing.

To show a plot and pass the underlying data to Python in one shot, combine plotting flags with -o — see the callout under The %%R magic line.


10.10 Workspace tools panel

Snowsight’s notebook Tools panel (alongside the main cell editor) is useful when debugging %%R workflows — especially Python↔︎R exports — without cluttering the notebook with throwaway cells.

10.10.1 Variable explorer

Open Variables in the Tools panel. It lists names in the Python kernel namespace (updated when a cell finishes).

Objects exported with -o appear here as rpy2-converted Python values, not as native R objects in an RStudio Environment pane:

Name (example) Type shown Meaning
mtcars DataFrame (32, 11) R data.frame → pandas
p RS4 (ggplot2::ggplot, …) Live ggplot handle in embedded R
png_bytes ByteVector / raw PNG file bytes from readBin()
fig, ax Figure, Axes Native matplotlib objects from Python cells

Use the Preview column to confirm conversion before writing a downstream Python cell. The explorer answers “what did -o actually give me?” — especially for ggplot (RS4) vs tabular data (DataFrame) vs binary (raw).

10.10.2 Scratchpad

Scratchpad provides extra Python kernel cells that do not appear in the saved notebook — ideal for ad-hoc inspection while developing:

  • %%R then print(mtcars) or str(conn) to probe the embedded R session
  • %%R -w 700 -h 450 + print(p) to re-render a ggplot already in R memory
  • Python snippets: mtcars.head(), type(png_bytes), bytes(png_bytes)[:8]

Create multiple scratch tabs (Scratch 1, Scratch 2, …) for parallel experiments. Scratchpad cells share the same kernel session as the notebook (variables from -o, conn, loaded packages all visible).

Note

Scratchpad output is ephemeral — useful for debugging and demos-in-progress, not for production pipeline steps. Commit logic to proper notebook cells once it works.


10.11 Session context in R

Workspace often does not set database/schema the way Snowsight worksheets do. Set context before unqualified table names:

%%R
conn <- sfr_connect()
conn <- sfr_load_notebook_config(conn)   # from snowflaker_*.yaml

Or explicitly (must reassignsfr_use() returns a new object):

%%R
conn <- sfr_use(conn, warehouse = "ML_WH", database = "ML_DB", schema = "FEATURES")
sfr_fqn(conn, "TRAINING_DATA")   # "ML_DB.FEATURES.TRAINING_DATA"

SQL cells can run USE WAREHOUSE / USE SCHEMA for warehouse context; R session context is separate — YAML + sfr_load_notebook_config() keeps R and your config aligned.


10.12 Filesystem layout in Workspace

Git-connected projects mount the repo under /filesystem/<session-hash>/. The hash changes per session — never hard-code it.

10.12.1 Why path discovery matters

Problem Cause
ModuleNotFoundError: sfnb_setup Notebook not in same folder as sfnb_setup.py
Wrong config loaded Relative config= resolved from unreliable cwd
Broken imports after restart Different mount hash — discovery pattern still works
Notebook location Bootstrap pattern
snowflakeR/inst/notebooks/*.ipynb from sfnb_setup import setup_notebook directly
Anywhere else in the repo tree Scan /filesystem/, add snowflakeR/inst/notebooks to sys.path (see below)
import sys, os

_repo_root = None
for _h in os.listdir("/filesystem"):
    _p = os.path.join("/filesystem", _h, "snowflakeR", "inst", "notebooks")
    if os.path.isdir(_p):
        _repo_root = os.path.join("/filesystem", _h)
        sys.path.insert(0, _p)
        break

if _repo_root is None:
    raise RuntimeError("Could not find snowflakeR/inst/notebooks under /filesystem/")

from sfnb_setup import setup_notebook

setup_notebook(
    config=os.path.join(_repo_root, "internal", "my_project", "snowflaker_config.yaml"),
    packages=["snowflakeR", "RSnowflake"],
)

Keep one copy of sfnb_setup.py in the repo (usually snowflakeR/inst/notebooks/) instead of duplicating it per folder. Full walkthrough: Bootstrap — filesystem.


10.13 Structuring a multi-language notebook

A productive layout separates setup, data, modeling, and operational steps:

flowchart TD
  C1["1 Python — setup_notebook or CRE"]
  C2["2 SQL — DDL, USE context, seed tables"]
  C3["3 R — connect, dbplyr ETL, train"]
  C4["4 R or Python — optional Snowpark ETL"]
  C5["5 R — log model, registry"]
  C6["6 SQL — verify scores / grants"]
  C1 --> C2 --> C3 --> C4 --> C5 --> C6

Cell Language Runs on Example
Bootstrap (if no CRE) Python Container kernel setup_notebook()
Context / DDL SQL Warehouse CREATE TABLE, USE WAREHOUSE
Explore / model R (%%R) Embedded R sfr_connect(), tidymodels
Warehouse ETL R (%%R) + dbplyr Warehouse (push-down SQL) tbl(con, ...) %>% mutate(...) %>% compute() — see RSnowflake: Query
Heavy ETL (optional) Python Snowpark on warehouse session.write_pandas() when the notebook is already Python-centric or a colleague owns Snowpark transforms
ML platform R (%%R) Embedded R sfr_log_model()
Validation SQL Warehouse SELECT COUNT(*) FROM scores
Note

Prefer R dbplyr for data pipelines. RSnowflake + dbplyr translates dplyr to Snowflake SQL and materializes with compute() — same push-down pattern as Python Snowpark, without leaving R. Use Snowpark Python only when you need Python-only APIs, an existing Snowpark pipeline, or mixed-language cells where Python already owns the transform step.

Compute reminder: SQL uses warehouse credits; R/Python kernel work uses container runtime credits; Snowflake ML API calls may touch both depending on operation.


10.14 Installing extra R packages

10.14.1 Cold bootstrap (no CRE)

setup_notebook() installs the stack from snowflaker_*.yaml (conda, CRAN, tarballs). For one-off packages in the same session:

%%R
options(repos = c(CRAN = "https://cloud.r-project.org"))
install.packages("forecast", quiet = TRUE)
suppressPackageStartupMessages(library(forecast))

Workspace has no interactive stdin — always set repos, use quiet = TRUE, and suppress startup messages. Packages with system libraries (e.g. sf → GDAL) often fail with plain install.packages(); add conda-forge equivalents (r-sf) to YAML instead.

Interactive installs do not survive container recycle — add recurring packages to YAML or move to a CRE.

10.15 Workspace vs local RStudio

Aspect Local RStudio / Posit Workspace %%R
Kernel Native R Python + rpy2
Auth connections.toml, key-pair, browser Session OAuth auto
sfr_connect() Reads TOML / params Wraps active Snowpark session
Plots Plot pane %%R -w … -h … + print(p)
Environment / debug RStudio Environment, Console Tools panel Variables + Scratchpad (§ Tools panel)
Output Standard Prefer rprint() / rview()
Setup Once per machine Bootstrap per session (CRE reduces cost)
Python interop reticulate (optional) rpy2 + reticulate (required for snowflakeR)
Kernel picker Native R (RStudio / IRkernel locally) Python only; R via %%RAppendix G

Design goal: the same snowflakeR R code runs in both environments — only setup and display differ (Local R Setup).


10.16 REST inference and PATs

Most notebook workflows use sfr_predict() through the Snowpark session — no extra token.

sfr_predict_rest() calls deployed SPCS endpoints over HTTPS; the Workspace session token is not valid for that ingress path. For REST predict from a notebook testing external apps, set a PAT via Sys.setenv(SNOWFLAKE_PAT = "...") — prefer not to hard-code in committed notebooks.


10.17 Common pitfalls

Symptom Likely cause Fix
%%R unknown command Bootstrap not run or kernel restarted Re-run Python bootstrap cell
Segfault on library(snowflakeR) Old reticulate r-reticulate>=1.25 in YAML
Table not found Missing DB/schema context sfr_load_notebook_config()
Blank ggplot cell Missing print(p) or size flags %%R -w 700 -h 450 + print(p)
Huge data frame slow across -i Bridge transfer Query from Snowflake in R instead
sfr_use() ignored No reassignment conn <- sfr_use(conn, ...)
could not find function "mtcarslibrary" Space after comma in -o list on old sfnb_multilang Use -o p,mtcars or -o p -o mtcars; upgrade multilang (normalizes -o p, mtcars)

More: Appendix C: Troubleshooting.


10.18 Companion material

Resource Content
workspace_quickstart.ipynb Runnable end-to-end
Vignette workspace-notebooks Full patterns and bulk-write notes
r_helpers.py Output helpers, PAT/key helpers, display bridges

10.19 Next steps

CRE & ML Jobs — skip repeat bootstrap with pre-baked images.

snowflakeR: Connect — session and operational APIs.