8  Workspace Bootstrap

setup_notebook() and configuration YAML

Keywords

snowflake, R, RStudio, Posit, VS Code, workspace notebooks, snowflakeR, RSnowflake, mlops

8.1 Overview

After the Workspaces & Notebooks chapter, enable R in notebooks with the snowflake-notebook-multilang package. Most enterprises follow two phases:

1. Organisation onboarding (usually IT or platform engineering): Your Snowflake team builds a Custom Runtime Environment (CRE), a versioned container image with R, %%R, snowflakeR, RSnowflake, and your approved package set pre-installed. They register it in Snowflake (cre@<your_org_name>) and grant notebook roles access. Analysts then open a Workspace notebook that references the CRE image, skip the bootstrap cell, and go straight to %%R; startup is fast once the notebook container is running. This is the path most production teams adopt after a short pilot.

2. Self-serve bootstrap (this chapter’s main content) — While CRE is being set up — or for a solo proof-of-concept — a single Python cell runs setup_notebook() to install micromamba, R, tarballs, and %%R on the default runtime. It works without Docker or admin image builds, but runs on each notebook compute restart (after idle recycle). Typical time for a standard snowflakeR + RSnowflake configuration is ~60 seconds; more conda/CRAN packages, ADBC, or multilanguage options in your YAML add time (large stacks can reach several minutes).

You may use both: bootstrap during evaluation or when testing new R versions and packages, then the CRE once your organisation’s image is live.

8.2 Learning Objectives

  • Run setup_notebook() with a project YAML config
  • Structure snowflaker_*.yaml for context, EAI, and R tarballs
  • Explain who owns CRE setup (platform/IT) vs self-serve bootstrap (analyst PoC)

8.3 CRE vs bootstrap

|  | Org CRE (after IT/platform setup) | Self-serve bootstrap (setup_notebook()) |
|---|---|---|
| Who sets it up | Platform, ML ops, or IT (image build + Snowflake registration) | Individual analyst or data scientist |
| Typical timing | After pilot; ongoing image updates when packages change | Day one; before org CRE exists, or for customisation and experimentation |
| First %%R | Seconds (after container is up); reference CRE auto-registers on kernel start | ~60s typical bootstrap; longer with extra packages |
| Governance | Image scanned in CI; pinned digest; one approved stack | Runtime downloads (EAI); per-notebook YAML |
| Analyst experience | Attach cre@<org_name> in advanced settings; often no setup cell | One Python bootstrap cell per notebook (or project) |
| Still need bootstrap? | Optional: session context, EAI checks, one-off extras | Required on default Snowflake runtime |

Note

Caching: CRE image layers cache on the registry/node after first pull. Kernel restart in the same session is cheap (R state persists in-process). Idle recycle starts new compute and re-runs bootstrap (~60s typical for standard config) unless you use a CRE.


8.4 Recommended: single bootstrap cell

Use the bootstrap script sfnb_setup.py from snowflake-notebook-multilang (same file is also in snowflakeR/inst/notebooks/).

How you use it depends on your project layout:

| Layout | What to do |
|---|---|
| Notebook in the same folder as sfnb_setup.py | from sfnb_setup import setup_notebook |
| Git repo cloned into Workspace (typical) | File is under snowflakeR/inst/notebooks/; see filesystem paths below |
| No Git / minimal project | Copy sfnb_setup.py next to your notebook, or pip install sfnb-multilang and use the package API |

from sfnb_setup import setup_notebook

setup_notebook(
    config="snowflaker_config.yaml",
    packages=["snowflakeR", "RSnowflake"],
)

8.4.1 What setup_notebook() does

| Phase | Action |
|---|---|
| EAI | Discover/create External Access Integration for package downloads |
| micromamba | Install or reuse conda-compatible env manager |
| R runtime | Install R version from YAML |
| R packages | CRAN/conda packages + tarballs for snowflakeR/RSnowflake |
| Magic | Register %%R (and other languages if enabled) on IPython kernel |
| Context | Export SNOWFLAKE_* env vars for RSnowflake / snowflakeR |
| Session | Optionally apply warehouse/database/schema from YAML |
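
The Context phase can be pictured with a small sketch. This is not the setup_notebook() implementation: the helper name export_snowflake_context and the exact variable names (SNOWFLAKE_DATABASE and so on) are assumptions chosen to match the SNOWFLAKE_* pattern described above, and the real bootstrap may use different names or sources.

```python
import os


def export_snowflake_context(context, environ=None):
    """Copy YAML-style context keys into SNOWFLAKE_* environment variables.

    Hypothetical helper for illustration only; variable names are assumed.
    """
    environ = os.environ if environ is None else environ
    exported = {}
    for key in ("database", "schema", "warehouse", "role"):
        value = context.get(key)
        if value:  # skip keys that are missing or empty in the YAML
            name = f"SNOWFLAKE_{key.upper()}"
            environ[name] = str(value)
            exported[name] = str(value)
    return exported


# The dict mirrors the `snowflake:` block of a config file.
env = {}
export_snowflake_context(
    {"database": "MY_DB", "schema": "MY_SCHEMA", "warehouse": "COMPUTE_WH"},
    environ=env,
)
print(sorted(env))
```

Passing a plain dict as environ keeps the example side-effect free; omitting it writes to os.environ, which child R processes would inherit.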

setup_notebook() is idempotent — re-run safely to add packages (edit YAML first). Use force_reinstall only when debugging a broken env.
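
Idempotent here means that a second call finds the work already done and skips it. One common way to get that behaviour is a marker check before each step. The sketch below illustrates the idea with hypothetical names (ensure_step, a marker directory, a force flag); it is not taken from sfnb_setup.py and may differ from how the real script tracks state.

```python
import tempfile
from pathlib import Path


def ensure_step(name, action, marker_dir, force=False):
    """Run `action` once per marker directory; later calls are no-ops.

    Returns True if the action ran, False if it was skipped.
    """
    marker = Path(marker_dir) / f"{name}.done"
    if marker.exists() and not force:
        return False
    action()
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text("ok")  # record completion only after the action succeeds
    return True


calls = []
with tempfile.TemporaryDirectory() as tmp:
    first = ensure_step("install_r", lambda: calls.append("install"), tmp)
    second = ensure_step("install_r", lambda: calls.append("install"), tmp)
    forced = ensure_step("install_r", lambda: calls.append("install"), tmp, force=True)

print(first, second, forced, len(calls))
```

The forced call plays the role that force_reinstall plays in the text: it repeats work that a normal re-run would skip.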

packages= loads the named R packages after bootstrap. For production, always list tarballs in the YAML (see Appendix B).

Figure: Python bootstrap cell with setup_notebook() in a connected notebook

8.5 Configuration file essentials

snowflake:
  database: MY_DB
  schema: MY_SCHEMA
  warehouse: COMPUTE_WH
  role: MY_ROLE

languages:
  r:
    version: "4.3"
    packages:
      - tidyverse
      - arrow
    tarballs:
      - url: https://github.com/Snowflake-Labs/snowflakeR/releases/download/v0.x.x/snowflakeR_0.x.x.tar.gz
      - url: https://github.com/Snowflake-Labs/RSnowflake/releases/download/v0.x.x/RSnowflake_0.x.x.tar.gz
Warning

Wherever possible, avoid compiling snowflakeR / RSnowflake from source inside Workspace. Build the tarballs locally with R CMD build --no-build-vignettes instead.
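
A quick structural check before bootstrapping can catch typos in the YAML early. The sketch below works on an already-parsed dict, so it needs no YAML library; validate_config and its rules are illustrative assumptions, not part of snowflake-notebook-multilang. The version check reflects a general YAML behaviour: an unquoted value such as 4.10 is read as the number 4.1, which is why the example config quotes it.

```python
def validate_config(cfg):
    """Return a list of human-readable problems found in a parsed config."""
    problems = []

    snowflake = cfg.get("snowflake") or {}
    for key in ("database", "schema", "warehouse", "role"):
        if not snowflake.get(key):
            problems.append(f"snowflake.{key} is missing")

    r_cfg = (cfg.get("languages") or {}).get("r") or {}
    if not isinstance(r_cfg.get("version"), str):
        # Unquoted YAML such as `version: 4.3` parses as a float.
        problems.append("languages.r.version should be a quoted string")

    for i, entry in enumerate(r_cfg.get("tarballs") or []):
        url = entry.get("url", "") if isinstance(entry, dict) else ""
        if not (url.startswith("https://") and url.endswith(".tar.gz")):
            problems.append(f"languages.r.tarballs[{i}] needs an https .tar.gz url")

    return problems


# Deliberately uses a float version to show the check firing.
config = {
    "snowflake": {"database": "MY_DB", "schema": "MY_SCHEMA",
                  "warehouse": "COMPUTE_WH", "role": "MY_ROLE"},
    "languages": {"r": {"version": 4.3, "packages": ["tidyverse", "arrow"],
                        "tarballs": [{"url": "https://example.com/pkg_1.0.0.tar.gz"}]}},
}
print(validate_config(config))
```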

8.6 Notebooks deep in your Git repo

8.6.1 Why this matters

If your Workspace project is linked to Git, Snowflake mounts the repository inside the notebook container under:

/filesystem/<session-hash>/          ← root of your cloned repo (hash changes each session)
├── snowflakeR/inst/notebooks/
│   └── sfnb_setup.py                ← bootstrap script (if you use the Labs repos)
├── your_notebooks/                  ← your .ipynb may live elsewhere in the tree
└── ...

Problem 1 — finding sfnb_setup.py: from sfnb_setup import setup_notebook only works if that file is on Python’s path. It is not automatically next to every notebook — usually it sits in snowflakeR/inst/notebooks/.

Problem 2 — os.getcwd() is unreliable: Depending on Workspace version and restart state, os.getcwd() may return your notebook folder, a parent folder, or something else. Never use cwd-relative paths for config YAML or bootstrap imports.

Problem 3 — config paths: setup_notebook(config="snowflaker_config.yaml") resolves relative to cwd — which may be wrong. Pass an absolute path built from the discovered repo root.

8.6.2 Bootstrap pattern for notebooks outside snowflakeR/inst/notebooks/

Use this Python cell before setup_notebook() when your notebook is not in the same directory as sfnb_setup.py (for example under projects/my_forecast_demo/):

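import os
import sys

# Each session mounts the repo under /filesystem/<session-hash>/, so scan
# for the directory that contains the shared bootstrap script.
_repo_root = None
for _h in os.listdir("/filesystem"):
    _p = os.path.join("/filesystem", _h, "snowflakeR", "inst", "notebooks")
    if os.path.isdir(_p):
        _repo_root = os.path.join("/filesystem", _h)
        sys.path.insert(0, _p)  # make sfnb_setup importable
        break

if _repo_root is None:
    raise RuntimeError("Could not find snowflakeR/inst/notebooks under /filesystem/")

from sfnb_setup import setup_notebook

# Build an absolute config path so it does not depend on os.getcwd().
_config = os.path.join(
    _repo_root,
    "projects", "my_forecast_demo",
    "snowflaker_config.yaml",
)
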
setup_notebook(config=_config, packages=["snowflakeR", "RSnowflake"])

| Step | What it does |
|---|---|
| Scan /filesystem/ | Find the session-specific mount hash |
| Check for snowflakeR/inst/notebooks | Confirm this path exists under /filesystem/<hash>/ |
| sys.path.insert(0, ...) | Make sfnb_setup importable |
| Absolute config= path | YAML loads regardless of cwd |
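
The same discovery steps can be wrapped in a function so that several notebooks share one implementation and the logic can be exercised outside Workspace. This is a minimal sketch: find_repo_root and its parameters are illustrative names, not part of sfnb_setup.py, and the demonstration runs against a temporary directory that mimics the mount layout.

```python
import os
import tempfile


def find_repo_root(base="/filesystem",
                   marker=os.path.join("snowflakeR", "inst", "notebooks")):
    """Return (repo_root, marker_dir) for the first mount containing `marker`.

    Raises RuntimeError if `base` is missing or no mount contains the marker.
    """
    if not os.path.isdir(base):
        raise RuntimeError(f"{base} does not exist; is a Git repo attached?")
    for entry in sorted(os.listdir(base)):  # sorted for a deterministic pick
        candidate = os.path.join(base, entry, marker)
        if os.path.isdir(candidate):
            return os.path.join(base, entry), candidate
    raise RuntimeError(f"Could not find {marker} under {base}")


# Demonstration against a fake mount; "abc123" stands in for the session hash.
with tempfile.TemporaryDirectory() as fake_fs:
    os.makedirs(os.path.join(fake_fs, "abc123", "snowflakeR", "inst", "notebooks"))
    root, notebooks_dir = find_repo_root(base=fake_fs)
    config_path = os.path.join(root, "projects", "my_forecast_demo",
                               "snowflaker_config.yaml")
    print(os.path.basename(root))
```

In a notebook, the returned notebooks_dir would go to sys.path.insert(0, ...) before importing sfnb_setup, and config_path would be passed as config=.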

Notebooks already in snowflakeR/inst/notebooks/ can import directly — no scan needed:

from sfnb_setup import setup_notebook
setup_notebook(config="snowflaker_config.yaml", packages=["snowflakeR"])

Full patterns: R Cells & Interop.

Warning

Prefer one shared copy of sfnb_setup.py in your repo (e.g. under snowflakeR/inst/notebooks/) together with the path-discovery pattern in 8.6.2, rather than duplicating the file into every demo folder. That way upgrades from snowflake-notebook-multilang apply in one place.

8.7 Custom Runtime Images (for platform / IT teams)

If you are enabling R for an organisation (not only your own notebook), plan a one-time CRE onboarding project: build the reference v1 image with snowflake-notebook-multilang (docker/create_cre.sh + cre_profile.yaml — default tag sfnb-multilang-r:v1), push to your Snowflake image repository, register cre@<org_name>, and grant USAGE to notebook roles. That image registers %%R on kernel start (empty git repo is fine), includes snowflakeR, RSnowflake, and optional ADBC. Earlier internal image experiments are not a separate supported release — v1 is the first production reference.

Analysts only select the CRE in Workspace advanced settings — they do not run Docker. Share the notebook template after go-live.

Quick verify after attaching cre@<your_cre_name>:

%%R
cat(R.version.string, "\n")
packageVersion("snowflakeR")
requireNamespace("adbcsnowflake", quietly = TRUE)
library(snowflakeR)
sfr_connect()

Full end-to-end guide (build, push, register, customize packages): CRE & ML Jobs — Custom Runtime Images.

Toolkit docs (public repo): Organisation operating model, path matrix, notebook template.

Warning

Git-connected projects that use renv (a .Rprofile at the repo root that sources renv/activate.R) can hijack R on the /filesystem mount. Use the reference CRE (v1) image or enable_r_cells() from the bootstrap helpers; never run raw %load_ext rpy2.ipython on those repos.
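
Whether a repo is affected can be checked from Python before enabling R. The sketch below looks for a .Rprofile at the repo root that references renv/activate.R. repo_activates_renv is a hypothetical helper, and the text match is a heuristic rather than a full parse of the R code.

```python
import os
import tempfile


def repo_activates_renv(repo_root):
    """Heuristic: True if <repo_root>/.Rprofile mentions renv/activate.R."""
    rprofile = os.path.join(repo_root, ".Rprofile")
    if not os.path.isfile(rprofile):
        return False
    with open(rprofile, encoding="utf-8", errors="replace") as fh:
        # Ignore commented-out lines so a disabled renv hook is not flagged.
        active = [ln for ln in fh if not ln.lstrip().startswith("#")]
    return any("renv/activate.R" in ln for ln in active)


# Demonstration with a temporary directory standing in for the repo root.
with tempfile.TemporaryDirectory() as repo:
    before = repo_activates_renv(repo)
    with open(os.path.join(repo, ".Rprofile"), "w", encoding="utf-8") as fh:
        fh.write('source("renv/activate.R")\n')
    after = repo_activates_renv(repo)

print(before, after)
```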

8.8 Companion notebook

8.9 Next steps

Network & EAI