3  Architecture

Three packages, rpy2 bridge, and zero-config auth

Keywords

snowflake, R, RStudio, Posit, VS Code, workspace notebooks, snowflakeR, RSnowflake, mlops

3.1 Overview

Snowflake’s ML platform was built around Python and SQL. This guide’s stack adds R as a first-class citizen in Workspace through three layers: notebook bootstrap, DBI connectivity, and ML platform APIs — unified by in-process bridges and Workspace OAuth.

3.2 Learning Objectives

After completing this chapter, you will be able to:

  • Describe the role of each of the three packages
  • Explain how rpy2 connects R cells to Snowpark without R users writing Python
  • Choose between RSnowflake-first vs snowflakeR-first workflows

3.3 The problem

Before these packages, R teams could query Snowflake via ODBC/JDBC and dplyr, but Feature Store, Model Registry, and Snowpark ML exposed Python-first APIs. That forced duplicate infrastructure, PMML/ONNX conversions, or hand-written Python CustomModel wrappers.

3.4 The solution: three packages

3.4.1 snowflake-notebook-multilang

Role: Bootstrap R (and other languages) inside Workspace via micromamba, install packages from YAML, register %%R cell magic.

When you need it: Every Workspace notebook that runs R cells, before library(snowflakeR).

3.4.2 RSnowflake

Role: DBI-compliant driver (SQL API, ADBC, Workspace OAuth). Powers dbplyr, Arrow bulk I/O, Connections Pane.

When you need it: SQL-first workflows, lazy dplyr pipelines, or when you want a dedicated driver without the ML Python stack.

3.4.3 snowflakeR

Role: R API over snowflake-ml-python — Feature Store, Model Registry, experiments, monitoring, SPCS deployment, registerDoSnowflake().

When you need it: End-to-end ML on Snowflake from R.

3.5 Architecture diagram

flowchart TB
  subgraph ws [Workspace Notebook — Python kernel]
    PYcell[Python cells]
    SQLcell[SQL cells]
    Rcell["Cell with %%R magic"]
    subgraph ipy [IPython]
      magic["%%R cell magic"]
      rpy2bridge[rpy2]
      rembed[R interpreter embedded in Python]
    end
    setup["setup_notebook()"]
  end
  subgraph rpkgs [R packages in embedded R]
    RS[RSnowflake]
    SFR[snowflakeR]
  end
  PYcell --> setup
  setup --> magic
  Rcell --> magic --> rpy2bridge --> rembed
  rembed --> RS
  rembed --> SFR
  RS --> WH[(Warehouse)]
  SFR --> ML[Feature Store / Registry / SPCS]
  SQLcell --> WH

The %%R magic tells the Python kernel to execute the cell body as R code. The Workspace notebook UI has no R cell type and no native R kernel; the R code runs in an R interpreter embedded in the Python process via rpy2.

3.6 The bidirectional bridge (rpy2 + reticulate)

Snowflake’s compute layer — Workspace, warehouses, SPCS — runs Python natively. There is no standalone R kernel in Workspace today. R execution is enabled by embedding a full R interpreter inside the Python process via rpy2.

This is not a subprocess or HTTP hop: R and Python share the same process, with efficient data exchange for common types (data frames, vectors).

flowchart LR
  subgraph py [Python world]
    PYK[Jupyter / Snowpark kernel]
    ML[snowflake-ml-python]
  end
  subgraph bridges [Bridges]
    RPY2[rpy2 Python to R]
    RET[reticulate R to Python]
  end
  subgraph r [R world]
    RSESS[R session]
    SFR[snowflakeR]
    RS[RSnowflake]
  end
  PYK -->|%%R cells| RPY2 --> RSESS
  RSESS --> SFR --> RET --> ML
  RSESS --> RS

| Direction | Mechanism | Example |
|---|---|---|
| Python → R | rpy2 | %%R cell magic runs library(tidymodels) |
| R → Python | reticulate | sfr_connect() calls snowflake-ml-python |
| R → SQL | RSnowflake (DBI) or sfr_query() | dplyr pipelines, ad hoc SQL |

Why two bridges? Snowflake ML APIs (Feature Store, Model Registry, Snowpark session helpers) ship as Python SDKs. snowflakeR wraps them from R via reticulate so R users never write Python. rpy2 is the complementary path that lets Workspace run R cells at all.

3.6.1 What happens when you run a %%R cell

  1. IPython sees the %%R magic and hands the cell body to rpy2.
  2. rpy2 passes the text to the embedded R interpreter.
  3. R code runs; if it calls snowflakeR, reticulate invokes Python ML SDK calls.
  4. Results cross back to the notebook as R or Python objects.
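
The first two steps can be pictured with a toy dispatcher. This is a minimal sketch, not IPython's or rpy2's implementation: `run_cell`, `MAGICS`, `fake_r_eval`, and `python_eval` are hypothetical names invented for illustration, and `fake_r_eval` only stands in for the point where rpy2 would hand the text to the embedded R interpreter.

```python
# Toy model of cell-magic dispatch. It does not import IPython or rpy2;
# every name below is hypothetical and exists only for illustration.
from typing import Callable, Dict

Handler = Callable[[str], str]


def fake_r_eval(code: str) -> str:
    """Stand-in for rpy2 passing the cell body to the embedded R interpreter."""
    return f"[R] evaluated {len(code.splitlines())} line(s)"


def python_eval(code: str) -> str:
    """Stand-in for the kernel's default Python execution path."""
    return f"[Python] evaluated {len(code.splitlines())} line(s)"


# Registry of cell magics; the bootstrap step would register "R" here.
MAGICS: Dict[str, Handler] = {"R": fake_r_eval}


def run_cell(source: str) -> str:
    """Route a cell: '%%name' on the first line selects a registered handler."""
    first, _, body = source.partition("\n")
    if first.startswith("%%"):
        parts = first[2:].split()
        name = parts[0] if parts else ""
        if name not in MAGICS:
            raise KeyError(f"unknown cell magic: %%{name}")
        return MAGICS[name](body)  # only the body reaches the handler
    return python_eval(source)


print(run_cell("%%R\nlibrary(tidymodels)\nx <- 1"))
print(run_cell("import os"))
```

The point of the sketch is that the kernel never changes: the cell stays a Python-kernel cell, and the first line alone decides which interpreter receives the body.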

3.6.2 Serving: CustomModel + rpy2

At inference time, registered R models deploy as Python CustomModel handlers that call back into R via rpy2 inside the SPCS inference environment container. Training stays in R; serving runs on Snowflake’s container runtime. See Model Registry.
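
The wrapper pattern described here can be sketched in plain Python. This is an illustration under stated assumptions, not the snowflake-ml-python CustomModel API and not the code snowflakeR generates: `RModelHandler`, `RPredict`, and `fake_r_linear_model` are hypothetical names, and the injected callable stands in for the call into R that rpy2 would make inside the container.

```python
# Illustrative wrapper pattern only. This is NOT the snowflake-ml-python
# CustomModel class; all names here are hypothetical.
from typing import Callable, Dict, List

# A callable that scores one batch. In a real deployment this role would be
# played by a call into the embedded R interpreter through rpy2.
RPredict = Callable[[List[Dict[str, float]]], List[float]]


class RModelHandler:
    """Python-side handler that delegates scoring to an R-backed callable."""

    def __init__(self, r_predict: RPredict) -> None:
        self._r_predict = r_predict

    def predict(self, rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
        preds = self._r_predict(rows)
        if len(preds) != len(rows):
            raise ValueError("R model returned a different number of rows")
        # Return each input row with its prediction attached.
        return [{**row, "prediction": p} for row, p in zip(rows, preds)]


def fake_r_linear_model(rows: List[Dict[str, float]]) -> List[float]:
    """Stand-in for an R predict() call on a fitted model: y = 2 * x + 1."""
    return [2.0 * r["x"] + 1.0 for r in rows]


handler = RModelHandler(fake_r_linear_model)
print(handler.predict([{"x": 1.0}, {"x": 2.5}]))
```

Injecting the scoring callable keeps the Python handler thin: it validates and reshapes data, while the model logic stays on the R side.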

3.7 Design principles

These ideas assume familiarity with Snowflake basics. Terms in bold are in the Glossary.

| Principle | What it means |
|---|---|
| Zero-config auth | In Workspace, sfr_connect() and RSnowflake authenticate with the session's OAuth token, so no PATs appear in notebook cells |
| Automatic network setup | Bootstrap detects or creates the EAI for CRAN, GitHub, and Python package indexes, provided the active role has the required privileges |
| No Python from the R user | snowflakeR owns the reticulate / rpy2 bridge internally |
| Snowflake-native objects | Feature views, datasets, and registry models created in R appear in Python tooling, and Python models created in the Snowflake ML platform appear in R tooling. Polyglot data science teams can collaborate on the same data and models, each in their preferred language and tools |
| Train in R, serve in Snowflake | Models wrap as Python CustomModel + rpy2 for SPCS inference |
| ML lineage | Source table → Feature View → Dataset → Model traceability in Model Registry |

3.8 Choosing your entry point

| Goal | Start with |
|---|---|
| New to Snowflake | Snowflake Platform Primer |
| RStudio, Posit, or VS Code on your laptop | Local R Setup, then IDE chapter |
| Run R in a Workspace notebook | Workspaces overview, then Bootstrap |
| dplyr/SQL only, no ML SDK | RSnowflake |
| Feature Store / Registry from R | MLOps on Snowflake, then snowflakeR Connect |
3.9 Interop: sfr_dbi_connection()

Both packages can coexist. snowflakeR::sfr_dbi_connection(conn) lazily wraps a snowflakeR session connection as a DBI connection for use with RSnowflake and dbplyr, which is useful when you need ML APIs and dplyr in the same notebook.

3.10 Next steps