Skip to content

Deploying Streamlit Applications in Snowflake

Previous Chapter Recap:

In the last chapter, we developed an interactive Streamlit application for predicting penguin species using our Random Forest model. We focused on creating an intuitive user interface through:

  • Visualization of prediction probabilities using Streamlit's progress bars
  • Strategic content organization with columns and containers for enhanced layout
  • Implementation of success messages to display clear prediction results

Moving forward, we'll explore how to deploy this application within Snowflake's ecosystem using Streamlit in Snowflake (SiS). This integration combines our interactive prediction capabilities with Snowflake's enterprise-grade data platform features.

In this chapter, we will:

  • Modify our existing Streamlit application for seamless Snowflake integration
  • Establish and configure secure Snowflake connections
  • Deploy and validate our application using Streamlit in Snowflake
  • Explore critical distinctions between local development and Snowflake deployment

Preparing for Deployment

Database

Important

It is assumed that Snowflake CLI has been installed and you have test your snowflake connection.

Set your default connection name using

  • snow connection set-default <your snowflake connection>
  • Set an environment variable named SNOWFLAKE_DEFAULT_CONNECTION_NAME

And then for quick check try:

snow connection test

Database

For this demo all our Snowflake objects will be housed in a DB called st_ml_app.

Create the Database

snow object create database \
  name='st_ml_app' \
  comment='Database used for Streamlit ML App Tutorial'

Schema

Let us create one schema to hold the notebooks.

# notebooks
snow object create schema \
  name='notebooks' comment='Will hold all notebooks' \
  --database='st_ml_app'

Download and import the notebook and follow the instructions on the notebook to prepare the environment for deployment.

Deploying the App

Create Streamlit Project

Create a Snowflake project from a template, to make things easy and clean we will create the application in $TUTORIAL_HOME/sis.

snow init sis --template example_streamlit

Note

The application init uses the example_streamlit template from https://github.com/snowflakedb/snowflake-cli-templates

Update the App

TODO: Note on Copy

Edit and update the $TUTORIAL_HOME/sis/streamlit_app.py with,

streamlit_app.py
import streamlit as st
import os

from sklearn.ensemble import RandomForestClassifier
from snowflake.snowpark.session import Session
from snowflake.snowpark.functions import col
from snowflake.snowpark.types import StringType, DecimalType


def get_active_session() -> Session:
    """Create or get new Snowflake Session.
       When running locally it uses the SNOWFLAKE_CONNECTION_NAME environment variable to get the connection name and when running in SiS it uses the context connection.
    """
    conn = st.connection(
        os.getenv(
            "SNOWFLAKE_CONNECTION_NAME",
            "devrel-ent",
        ),
        type="snowflake",
    )
    return conn.session()


session = get_active_session()

st.title("🤖 Machine Learning App")

st.write("Welcome to world of Machine Learning with Streamlit.")

with st.expander("Data"):
    st.write("**Raw Data**")
    # read the data from table
    # cast the columns to right data types with right precision
    df = session.table("st_ml_app.data.penguins").select(
        col("island").cast(StringType()).alias("island"),
        col("species").cast(StringType()).alias("species"),
        col("bill_length_mm").cast(DecimalType(5, 2)).alias("bill_length_mm"),
        col("bill_depth_mm").cast(DecimalType(5, 2)).alias("bill_depth_mm"),
        col("flipper_length_mm").cast(DecimalType(5, 2)).alias("flipper_length_mm"),
        col("body_mass_g").cast(DecimalType()).alias("body_mass_g"),
        col("sex").cast(StringType()).alias("sex"),
    )
    df = df.to_pandas()
    # make the column names lower to reuse the rest of the code as is
    df.columns = df.columns.str.lower()
    # define and display
    st.write("**X**")
    X_raw = df.drop("species", axis=1)
    X_raw

    st.write("**y**")
    y_raw = df.species
    y_raw

with st.expander("Data Visualization"):
    st.scatter_chart(
        df,
        x="bill_length_mm",
        y="body_mass_g",
        color="species",
    )

# Interactivity
# Columns:
# 'species', 'island', 'bill_length_mm', 'bill_depth_mm',
# 'flipper_length_mm', 'body_mass_g', 'sex'
with st.sidebar:
    st.header("Input Features")
    # Islands
    islands = df.island.unique().astype(str)
    island = st.selectbox(
        "Island",
        islands,
    )
    # Bill Length
    min, max, mean = (
        df.bill_length_mm.min(),
        df.bill_length_mm.max(),
        df.bill_length_mm.mean().round(2),
    )
    bill_length_mm = st.slider(
        "Bill Length(mm)",
        min_value=min,
        max_value=max,
        value=mean,
    )
    # Bill Depth
    min, max, mean = (
        df.bill_depth_mm.min(),
        df.bill_depth_mm.max(),
        df.bill_depth_mm.mean().round(2),
    )
    bill_depth_mm = st.slider(
        "Bill Depth(mm)",
        min_value=min,
        max_value=max,
        value=mean,
    )
    # Filpper Length
    min, max, mean = (
        df.flipper_length_mm.min(),
        df.flipper_length_mm.max(),
        df.flipper_length_mm.mean().round(2),
    )
    flipper_length_mm = st.slider(
        "Flipper Length(mm)",
        min_value=min,
        max_value=max,
        value=mean,
    )
    # Body Mass
    min, max, mean = (
        df.body_mass_g.min().astype(float),
        df.body_mass_g.max().astype(float),
        df.body_mass_g.mean().round(2),
    )
    body_mass_g = st.slider(
        "Body Mass(g)",
        min_value=min,
        max_value=max,
        value=mean,
    )
    # Gender
    gender = st.radio(
        "Gender",
        ("male", "female"),
    )

# Dataframe for Input features
data = {
    "island": island,
    "bill_length_mm": bill_length_mm,
    "bill_depth_mm": bill_depth_mm,
    "flipper_length_mm": flipper_length_mm,
    "body_mass_g": body_mass_g,
    "sex": gender,
}
input_df = pd.DataFrame(data, index=[0])
input_penguins = pd.concat([input_df, X_raw], axis=0)

with st.expander("Input Features"):
    st.write("**Input Penguins**")
    input_df
    st.write("**Combined Penguins Data**")
    input_penguins

## Data Preparation

## Encode X
X_encode = ["island", "sex"]
df_penguins = pd.get_dummies(input_penguins, prefix=X_encode)
X = df_penguins[1:]
input_row = df_penguins[:1]

## Encode Y
target_mapper = {
    "Adelie": 0,
    "Chinstrap": 1,
    "Gentoo": 2,
}


def target_encoder(val_y: str) -> int:
    return target_mapper[val_y]


y = y_raw.apply(target_encoder)

with st.expander("Data Preparation"):
    st.write("**Encoded X (input penguins)**")
    input_row
    st.write("**Encoded y**")
    y


with st.container():
    st.subheader("**Prediction Probability**")
    ## Model Training
    rf_classifier = RandomForestClassifier()
    # Fit the model
    rf_classifier.fit(X, y)
    # predict using the model
    prediction = rf_classifier.predict(input_row)
    prediction_prob = rf_classifier.predict_proba(input_row)

    # reverse the target_mapper
    p_cols = dict((v, k) for k, v in target_mapper.items())
    df_prediction_prob = pd.DataFrame(prediction_prob)
    # set the column names
    df_prediction_prob.columns = p_cols.values()
    # set the Penguin name
    df_prediction_prob.rename(columns=p_cols)

    st.dataframe(
        df_prediction_prob,
        column_config={
            "Adelie": st.column_config.ProgressColumn(
                "Adelie",
                help="Adelie",
                format="%f",
                width="medium",
                min_value=0,
                max_value=1,
            ),
            "Chinstrap": st.column_config.ProgressColumn(
                "Chinstrap",
                help="Chinstrap",
                format="%f",
                width="medium",
                min_value=0,
                max_value=1,
            ),
            "Gentoo": st.column_config.ProgressColumn(
                "Gentoo",
                help="Gentoo",
                format="%f",
                width="medium",
                min_value=0,
                max_value=1,
            ),
        },
        hide_index=True,
    )

# display the prediction
st.subheader("Predicted Species")
st.success(p_cols[prediction[0]])

Verify Python Packages

Edit and update $TUTORIAL_HOME/sis/environment.yml to be like, ensuring that the packages used locally and in Snowflake are same.

environment.yml
1
2
3
4
5
6
7
8
9
name: sf_env
channels:
  - snowflake
dependencies:
  - streamlit=1.35.0
  - snowflake-snowpark-python
  - scikit-learn=1.3.0
  - pandas=2.0.3
  - numpy=1.24.3

Verify Project Manifest

Edit and update the Project manifest $TUTORIAL_HOME/sis/snowflake.yml to be inline with your settings, especially

  • name
  • main_file
  • query_warehouse
  • artifacts

IMPORTANT

If you have followed along the tutorial as is without any changes the only that you might need to change is query_warehouse. If you are using trial account then update it to COMPUTE_WH.

snowflake.yml
definition_version: "2"
entities:
  streamlit_penguin:
    type: streamlit
    identifier:
      name: streamlit_penguin
    main_file: streamlit_app.py
    query_warehouse: kamesh_demos_s
    stage: src
    artifacts:
      - streamlit_app.py
      - environment.yml

Navigate to the application(sis) folder,

cd sis
Run the following command to deploy the Streamlit application to Snowflake,

Note

The output of the command displays the URL to access the application. In case you missed to note run the following command,

# get the url to streamlit app named "streamlit_penguin"
snow streamlit get-url streamlit_penguin

snow streamlit deploy --replace \
  --database='st_ml_app'  --schema='apps'

There you go we have seamlessly deployed the application to SiS with a very little effort.

Summary

This chapter guided you through the process of transforming a locally running Streamlit application into a production-ready deployment within Snowflake. You learned the essential modifications needed for Snowflake compatibility, understood the configuration requirements, and mastered the deployment process. You now have a fully functional Streamlit application running in Snowflake's secure environment, accessible to your organization's users through Snowflake's interface.

Pro Tip: You can run this entire application using Snowflake Notebook. Download and import the environment.yml and application notebook and experience the Snowflake Notebook magic! 🚀