16  RSnowflake in Workspace

Bootstrap order, EAI, ADBC, and sfr_dbi_connection()

Keywords

snowflake, R, RStudio, Posit, VS Code, workspace notebooks, snowflakeR, RSnowflake, mlops

16.1 Overview

RSnowflake in Workspace follows the same Python kernel + %%R model as snowflakeR. This chapter covers bootstrap order, optional ADBC acceleration, the internal SPCS gateway, and bridging to snowflakeR via sfr_dbi_connection().

Prerequisites: Workspace Bootstrap, Network & EAI.

16.2 Learning Objectives

  • Run the correct bootstrap → R cell order (or skip bootstrap when using CRE)
  • Understand ADBC install constraints in Workspace
  • Use sfr_dbi_connection() for dplyr alongside ML APIs
  • Know when bulk writes route via ADBC vs Snowpark fallback

16.3 Execution model

  1. Python kernel holds the Snowpark session Workspace creates.
  2. %%R cells run R via rpy2; RSnowflake in R speaks HTTPS to Snowflake (SQL API).
  3. dbConnect(Snowflake()) succeeds when bootstrap exports SNOWFLAKE_* env vars.
What is rpy2?

RSnowflake runs inside the embedded R session that rpy2 creates in the Python process. There is no separate RSnowflake “Workspace mode” — it is the same DBI driver with Workspace OAuth from environment variables.


16.4 Bootstrap order

Two paths: bootstrap (default without CRE) and CRE (packages pre-baked in the runtime image).

16.4.1 Without CRE — run bootstrap first

Step Cell Action
1 Python setup_notebook(config=..., packages=["RSnowflake", "snowflakeR"])
2 Python (optional) Verify EAI enabled in Snowsight (needed for downloads during bootstrap)
3 R (%%R) library(RSnowflake) or library(snowflakeR)
4 R (%%R) dbConnect(Snowflake()) or sfr_connect()
from sfnb_setup import setup_notebook
setup_notebook(config="snowflaker_config.yaml", packages=["RSnowflake", "snowflakeR"])
%%R
library(DBI)
library(RSnowflake)
con <- dbConnect(Snowflake())
dbGetQuery(con, "SELECT CURRENT_USER(), CURRENT_WAREHOUSE()")

If dbConnect() fails: re-run Python bootstrap; confirm External Access is toggled on; do not pass a laptop connections.toml profile name in Workspace.

16.4.2 With CRE — steps 1 and 2 often unnecessary

When the notebook uses a Custom Runtime Environment that already includes R, %%R, snowflakeR, RSnowflake, and (optionally) ADBC, analysts can usually skip step 1 (setup_notebook) and step 2 (runtime EAI for package downloads):

Step Cell Action
1 Skipped — tarballs and micromamba env are in the image
2 Skipped for cold install — no public download chain unless you add new hosts at runtime
3 R (%%R) library(RSnowflake) / library(snowflakeR)
4 R (%%R) dbConnect(Snowflake()) / sfr_connect()

Attach the CRE in notebook settings (cre@<org_name>). Platform teams rebuild the image when package versions change — same reproducibility model as pinning versions in a CRE profile.

Why RSnowflake (SQL API) in Workspace — not ODBC via bootstrap

Before CRE, installing the Snowflake ODBC driver and configuring unixODBC inside a notebook session was impractical — admin-heavy and a poor fit for setup_notebook(). RSnowflake (SQL REST API) avoided that entirely. ADBC (adbcsnowflake) was installable via bootstrap (with Go + EAI) and is Snowflake’s Arrow-native, optimized bulk path going forward.

With CRE, you could bake the ODBC driver and odbc package into an image if you standardize on odbc::snowflake() — but that is a platform image decision, not something bootstrap is designed to do. This guide still recommends RSnowflake + optional ADBC in CRE for Workspace notebooks.


16.5 Internal gateway (SNOWFLAKE_HOST)

Workspace containers route Snowflake API traffic through an internal SPCS gateway — not always the public *.snowflakecomputing.com URL.

RSnowflake detects SNOWFLAKE_HOST and uses the container OAuth token automatically. Same behaviour for optional ADBC when installed.


16.6 Optional ADBC in Workspace

ADBC (adbcsnowflake) enables Arrow-native bulk reads/writes. In Workspace:

Topic Detail
Install Enable languages.r.addons.adbc: true in YAML or bootstrap preset
CRE Pre-install adbcsnowflake in the runtime image — same startup win as baking R/%%R (see CRE chapter)
Build time Requires Go compiler + EAI for module downloads (~2 min cold without CRE)
Bulk writes "auto" prefers ADBC when installed (internal gateway); Snowpark write_pandas is fallback (~2–3 s vs ~7 s vs ~45 s literal for ~50k rows in RSnowflake tests)
Reads ADBC can use internal gateway with OAuth; preferred for large training pulls and VECTOR/typed ARRAY columns
Semi-structured VARIANT stays JSON in Arrow; typed ARRAY/struct need account ENABLE_STRUCTURED_TYPES_IN_CLIENT_RESPONSE — see Training Data in R — ADBC
%%R
options(RSnowflake.upload_method = "auto")
options(RSnowflake.bulk_write_threshold = 50000L)
options(RSnowflake.backend = "adbc")   # Feature Store materialized reads
dbWriteTable(con, "BIG_COPY", large_df)

Engineering deep dive: WORKSPACE_ADBC.md


16.7 Interop with snowflakeR

Use one Snowpark session for ML APIs and dplyr:

%%R
library(snowflakeR)
conn <- sfr_connect()
dbi <- sfr_dbi_connection(conn)

library(dplyr)
tbl(dbi, in_schema("ANALYTICS", "ORDERS")) |>
  filter(ORDER_DATE >= "2024-01-01") |>
  head(10) |>
  collect()

sfr_dbi_connection() lazily wraps the snowflakeR session — no second login.


16.8 RSnowflake-only vs snowflakeR-only

Need Package
SQL, dbplyr, Connections Pane (local IDE) RSnowflake alone
Feature Store, Registry, doSnowflake snowflakeR
Mixed notebook Both via bootstrap packages=

RSnowflake does not require Python ML stack. snowflakeR does — bootstrap installs both when listed.


16.9 Companion artifacts

Artifact Location
Test notebook RSnowflake/inst/notebooks/workspace_rsnowflake_test.ipynb
Vignette workspace-rsnowflake
ML notebook patterns snowflakeR vignette workspace-notebooks

16.10 Next steps

MLOps on SnowflakesnowflakeR: Connect