Quick Setup¶
Get your Snowflake Openflow demo running in about 15 minutes with this streamlined setup guide. It covers only the components essential to running the demo and skips optional configuration.
Demo Data Disclaimer
All business data, financial figures, and organizational information in this demo are fictitious and for demonstration purposes only.
Prerequisites Checklist¶
Before starting, ensure you have:
- Snowflake Enterprise Account with Openflow enabled
- Google Workspace Admin access
- Google Cloud Project with Drive API enabled
- Google Service Account (GSA) with domain-wide delegation configured
Need Help?
If you haven't completed the prerequisites, see the detailed prerequisites guide.
Required Tools Check
Before proceeding, ensure you have these tools installed:
git --version # Git (for cloning repository)
task --version # Task (for automation commands)
uv --version # uv (for Python dependencies)
python3 --version # Python 3.12+ (for processing)
snow --version # Snow CLI (for database queries)
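The individual version checks above can also be folded into one pass. The following is a minimal sketch, not part of the repository: `check_tools` is a hypothetical helper name, and it only tests that each command is on `PATH`, not that its version is recent enough.

```shell
# Report which of the required command-line tools are missing from PATH.
# Prints one "missing: <tool>" line per absent tool and returns 1 if any is missing.
# check_tools is a hypothetical helper, not a command shipped with the demo.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Tools listed in this guide
check_tools git task uv python3 snow || echo "Install the missing tools before continuing."
```

Because the helper does not inspect versions, the Python 3.12+ requirement still needs the `python3 --version` check shown above.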
Step 1: Repository Setup (2 minutes)¶
Clone and Install Dependencies¶
# Clone the repository
git clone https://github.com/kameshsampath/openflow-unstructured-data-pipeline-demo.git
cd openflow-unstructured-data-pipeline-demo
# Verify Taskfile automation is working
task --list
# Optional: Install Python dependencies (only needed for document conversion)
uv sync
Verify Document Collection¶
You should see the following file structure, with 16 files across multiple formats:
sample-data/google-drive-docs/
├── Analysis/
│ └── Post-Event-Analysis-Summer-2024.pptx
├── Compliance/
│ └── Health-Safety-Policy.pdf
├── Executive Meetings/
│ └── Board-Meeting-Minutes-Q4-2024.docx
├── Financial Reports/
│ └── Q3-2024-Financial-Analysis.pdf
├── Operations/
│ ├── Venue-Setup-Operations-Manual-0.jpg
│ ├── Venue-Setup-Operations-Manual-1.jpg
│ ├── Venue-Setup-Operations-Manual-2.jpg
│ └── Venue-Setup-Operations-Manual-3.jpg
├── Projects/
│ └── Sound-System-Modernization-Project-Charter.docx
├── Strategic Planning/
│ ├── 2025-Festival-Expansion-Strategy-0.jpg
│ ├── 2025-Festival-Expansion-Strategy-1.jpg
│ ├── 2025-Festival-Expansion-Strategy-2.jpg
│ ├── 2025-Festival-Expansion-Strategy-3.jpg
│ └── 2025-Festival-Expansion-Strategy-4.jpg
├── Training/
│ └── Customer-Service-Training-Guide.pptx
└── Vendors/
└── Audio-Equipment-Service-Agreement.pdf
Document Formats: PDF, DOCX, PPTX, JPG - demonstrating true multi-format document intelligence
Source Files
The .md files are conversion templates and are not uploaded to Google Drive. The demo uses the converted formats shown above.
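To compare a local checkout against the tree above, a per-format file count is a quick check. This is a sketch under the assumption that the sample files live under `sample-data/google-drive-docs` as shown; `count_docs_by_format` is a hypothetical helper name, and it skips the `.md` conversion templates.

```shell
# Count sample files by extension, ignoring the .md conversion templates.
# Prints "<count> <extension>" lines sorted by extension.
# count_docs_by_format is a hypothetical helper, not part of the repository.
count_docs_by_format() {
  docs_dir="$1"
  find "$docs_dir" -type f ! -name '*.md' \
    | sed 's/.*\.//' \
    | sort \
    | uniq -c \
    | awk '{print $1, $2}'
}

# Run from the repository root; skipped if the directory is not present.
if [ -d sample-data/google-drive-docs ]; then
  count_docs_by_format sample-data/google-drive-docs
fi
```

If the checkout matches the tree shown above, the counts should be 2 docx, 9 jpg, 3 pdf, and 2 pptx.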
Step 2: Google Drive Setup (5 minutes)¶
Option A: Use Google Apps Script (Recommended)¶
- Open Google Apps Script: https://script.google.com
- Create New Project: Copy contents from scripts/google-apps-script/CreateFolderStructure.gs
- Configure Shared Drive ID: Edit the createDemoFolders() function
- Run Script: Creates complete folder structure and uploads demo documents automatically
Option B: Manual Web Upload¶
- Create Shared Drive: In Google Drive web, create "Festival Operations" shared drive
- Create Folder Structure: Manually create these demo category folders:
- Strategic Planning/
- Operations/
- Compliance/
- Training/
- Upload Documents: Drag and drop files from sample-data/google-drive-docs/ into corresponding folders based on the demo categories you plan to demonstrate.
Step 3: Document Processing (3 minutes)¶
Convert Documents to All Formats¶
# Optional: Convert all documents (only if you modified sample-data markdowns)
task convert-all-docs
# Copy documents to Google Drive location (only if you have local Google Drive sync)
task copy-all-categories
Verify Document Upload¶
Check your Google Drive "Festival Operations" shared drive contains:
- Strategic Planning/ - Market expansion images, board minutes, financial reports
- Operations/ - Technology projects, venue setup manuals, event analysis
- Compliance/ - Health policies, vendor agreements
- Training/ - Customer service materials
Step 4: Snowflake Configuration (5 minutes)¶
External Access Integration Setup¶
Configure network access for Google Drive API connectivity:
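-- Create the database (if it does not exist yet) and a schema for network rules
USE ROLE ACCOUNTADMIN;
CREATE DATABASE IF NOT EXISTS OPENFLOW_FESTIVAL_DEMO;
USE DATABASE OPENFLOW_FESTIVAL_DEMO;
CREATE SCHEMA IF NOT EXISTS NETWORKS;
USE SCHEMA NETWORKS;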
-- Create network rule for Google APIs
CREATE OR REPLACE NETWORK RULE google_network_rule
MODE = EGRESS
TYPE = HOST_PORT
VALUE_LIST = (
'admin.googleapis.com',
'oauth2.googleapis.com',
'www.googleapis.com',
'google.com'
);
-- Verify the network rule
DESC NETWORK RULE google_network_rule;
Optional: Add Google Workspace Domain Rule
If accessing resources from your specific Google Workspace domain:
-- Replace 'your-domain.com' with your actual domain
CREATE OR REPLACE NETWORK RULE your_workspace_domain_network_rule
MODE = EGRESS
TYPE = HOST_PORT
VALUE_LIST = ('your-domain.com');
-- Verify
DESC NETWORK RULE your_workspace_domain_network_rule;
Create External Access Integration:
-- Create external access integration with Google API access
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION festival_ops_access_integration
ALLOWED_NETWORK_RULES = (
OPENFLOW_FESTIVAL_DEMO.NETWORKS.google_network_rule
-- Add your workspace domain rule if created:
-- , OPENFLOW_FESTIVAL_DEMO.NETWORKS.your_workspace_domain_network_rule
)
ENABLED = TRUE
COMMENT = 'Used for Openflow SPCS runtime to access Google Drive';
-- Verify the integration
DESC EXTERNAL ACCESS INTEGRATION festival_ops_access_integration;
-- Grant access to Openflow admin role
GRANT USAGE ON DATABASE OPENFLOW_FESTIVAL_DEMO TO ROLE OPENFLOW_ADMIN;
GRANT USAGE ON SCHEMA OPENFLOW_FESTIVAL_DEMO.NETWORKS TO ROLE OPENFLOW_ADMIN;
GRANT USAGE ON INTEGRATION festival_ops_access_integration TO ROLE OPENFLOW_ADMIN;
Network Configuration
The OPENFLOW_ADMIN role is created automatically during Openflow SPCS deployment setup. All SQL snippets are also available in sql/network.sql in the repository.
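Since the same statements ship in sql/network.sql, they can be run from the terminal rather than pasted into a worksheet. The sketch below assumes a Snow CLI connection that is already configured with a role able to run the statements. `run_sql_file` and the `SNOW_CMD` override are hypothetical names introduced here for illustration.

```shell
# Run a SQL file through the Snow CLI, failing early if the file is absent.
# run_sql_file and SNOW_CMD are hypothetical; SNOW_CMD lets you swap in
# another command (for example, echo) for a dry run.
run_sql_file() {
  sql_file="$1"
  if [ ! -f "$sql_file" ]; then
    echo "missing SQL file: $sql_file" >&2
    return 1
  fi
  ${SNOW_CMD:-snow} sql -f "$sql_file"
}

# Example, from the repository root:
#   run_sql_file sql/network.sql
# Dry run that only prints the arguments that would be passed:
#   SNOW_CMD=echo run_sql_file sql/network.sql
```

Review the file before running it if you plan to add the optional workspace domain rule, since that rule needs your own domain substituted in.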
Database Setup¶
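-- Create target database and schema
CREATE DATABASE IF NOT EXISTS OPENFLOW_FESTIVAL_DEMO;
CREATE SCHEMA IF NOT EXISTS OPENFLOW_FESTIVAL_DEMO.FESTIVAL_OPS;
-- Create the demo role referenced by the grants below
CREATE ROLE IF NOT EXISTS FESTIVAL_DEMO_ROLE;
-- Create service user for Openflow
-- (service users do not use passwords, so MUST_CHANGE_PASSWORD is omitted)
CREATE USER IF NOT EXISTS festival_demo_service
TYPE = SERVICE;
-- Assumption: the service user operates through FESTIVAL_DEMO_ROLE
GRANT ROLE FESTIVAL_DEMO_ROLE TO USER festival_demo_service;
-- Grant necessary privileges (assumes the FESTIVAL_DEMO_S warehouse already exists)
GRANT USAGE ON WAREHOUSE FESTIVAL_DEMO_S TO ROLE FESTIVAL_DEMO_ROLE;
GRANT ALL ON DATABASE OPENFLOW_FESTIVAL_DEMO TO ROLE FESTIVAL_DEMO_ROLE;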
Openflow Connector Configuration¶
- Access Openflow UI in your Snowflake account
- Create Google Drive Connector:
  - Name: festival_operations_connector
  - Google Service Account (GSA): Upload your JSON key file
  - Shared Drive: Select "Festival Operations"
  - Target: OPENFLOW_FESTIVAL_DEMO.FESTIVAL_OPS
- Start Connector: Begin document processing
Step 5: Cortex Search Intelligence (Automatic)¶
Cortex Search service is created automatically by the Openflow Google Drive (Cortex connect) connector. No manual SQL required!
How It Works¶
graph LR
A[OpenFlow Google Drive Connector] --> B[Document Processing]
B --> C[Cortex Search Service<br/>🤖 Auto-Created]
C --> D[Natural Language Queries]
classDef autoStyle fill:#f8f9fa,stroke:#28a745,stroke-width:2px
class C autoStyle
Automatic Features:
- ✅ Arctic Embeddings: Automatically configured with snowflake-arctic-embed-m-v1.5
- ✅ Document Indexing: All processed documents automatically indexed
- ✅ Semantic Search: Ready for natural language queries immediately
- ✅ Metadata Integration: Document properties, authors, and collaboration data included
Verify Automatic Setup¶
-- Verify the auto-created service
SHOW CORTEX SEARCH SERVICES;
-- Test natural language query
SELECT PARSE_JSON(
SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
'FESTIVALS_OPS_SEARCH_SERVICE',
'{"query": "expansion plans", "limit": 3}'
)
)['results'] as search_results;
Service Name
The service will be auto-created as FESTIVALS_OPS_SEARCH_SERVICE by Openflow after the first document is processed.
Step 6: Demo Validation (1 minute)¶
Test Key Business Queries¶
Run these sample queries to verify everything works:
Ready to Use
The FESTIVALS_OPS_SEARCH_SERVICE service is created automatically by Openflow, so the queries can reference it by name without any manual setup.
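One way to run a validation query from the terminal is sketched below. It reuses the SEARCH_PREVIEW call from Step 5 and the sample question listed under Expected Demo Results. `search_sql` is a hypothetical helper name. The sketch assumes that the Snow CLI connection's current database and schema are the ones holding the search service, and that the question contains no quote characters, because no escaping is performed.

```shell
# Build a SEARCH_PREVIEW statement for one natural-language question.
# Usage: search_sql "<question>" [limit]   (limit defaults to 3)
# search_sql is a hypothetical helper; it performs no escaping, so the
# question must not contain single or double quotes.
search_sql() {
  question="$1"
  limit="${2:-3}"
  printf "SELECT PARSE_JSON(SNOWFLAKE.CORTEX.SEARCH_PREVIEW('FESTIVALS_OPS_SEARCH_SERVICE', '{\"query\": \"%s\", \"limit\": %s}'))['results'] AS search_results;\n" "$question" "$limit"
}

# Example (requires a configured Snow CLI connection):
#   snow sql -q "$(search_sql 'What are our 2025 expansion plans?')"
```

If the query returns an empty result list, wait until the connector has processed at least one document before retrying.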
Demo Resources¶
Your setup is complete! Access these resources for successful demos:
-
Demo Commands Reference
Complete Taskfile automation commands, SQL queries, and troubleshooting for ongoing demo usage
-
Sample Questions Reference
Categorized business questions organized by function for smooth demo presentations
Expected Demo Results¶
After setup, you can demonstrate:
✅ Natural Language Queries: "What are our 2025 expansion plans?"
✅ Multi-Format Search: Find insights across PDF, DOCX, PPTX, JPG documents
✅ Business Intelligence: Strategic, operational, compliance, and training insights
✅ Executive Decision Support: Instant access to investment analysis (demo figures)
✅ Cross-Category Analysis: Unified view across all business functions
Troubleshooting¶
Common Issues¶
Openflow Connector Not Visible
Solution: Verify that you are on an Enterprise account, and contact Snowflake support to enable BYOC/SPCS
Google Drive Access Denied
Solution: Check Google Service Account (GSA) domain-wide delegation and OAuth scopes
Cortex Search Service Creation Failed
Solution: Verify Cortex Search is enabled in your account region
Documents Not Processing
Solution: Check Openflow connector logs and verify file permissions
Next Steps¶
-
Run Demo
Execute the complete demo with sample business questions and interactive queries
-
Business Intelligence
Explore advanced analytics and strategic insights from document processing
🎉 Congratulations! Your Snowflake Openflow document intelligence demo is ready. You can now transform unstructured business documents into queryable strategic intelligence!