Prerequisites¶
Before setting up the Unstructured Document Intelligence Demo, ensure you have the following requirements in place.
Enterprise Requirement
This demo requires Snowflake Openflow, which is currently available only for Enterprise accounts as BYOC (Bring Your Own Cloud) or SPCS (Snowpark Container Services) Public Preview.
Contact your Snowflake account team to enable Openflow access.
Required Development Tools¶
These tools must be installed on your local machine to run the demo commands:
Essential Tools¶
- Git: Version control and repository cloning
- Python >= 3.12: Required for document processing scripts
- Task: Task runner for the demo commands (a Make alternative configured in YAML)
- uv: Fast Python package manager
- Snow CLI: Snowflake command-line client
Installation Commands¶
Verification¶
Test that all tools are installed correctly:
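One way to run this check is a short Python script. The following is a minimal sketch, assuming each tool is exposed under its usual executable name (`git`, `python3`, `task`, `uv`, `snow`); those names and the helper functions are illustrative assumptions, not part of the demo repository.

```python
import shutil
import sys
from typing import Callable, Optional

# Assumed executable names for the tools listed under Essential Tools.
REQUIRED_TOOLS = {
    "Git": "git",
    "Python": "python3",
    "Task": "task",
    "uv": "uv",
    "Snow CLI": "snow",
}


def find_missing_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the display names of required tools that are not on PATH."""
    return [name for name, exe in REQUIRED_TOOLS.items() if which(exe) is None]


def python_is_supported(version: tuple = sys.version_info[:2]) -> bool:
    """Return True when the (major, minor) version meets the >= 3.12 requirement."""
    return tuple(version) >= (3, 12)
```

Calling `find_missing_tools()` with no arguments checks the real PATH; an empty list means every tool was found. `python_is_supported()` only inspects the interpreter that runs the script, so run it with the Python you intend to use for the demo.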
Google Drive & Google Cloud Requirements¶
Google Administrative Access¶
- Google Workspace Admin: Super Admin permissions for your organization
- Google Cloud Console Access: Project owner or editor permissions
Google Cloud Project Setup¶
- Google Cloud Project with the following roles:
  - Organization Policy Administrator
  - Organization Administrator
- Billing Account linked to the project
- APIs Enabled:
  - Google Drive API
  - Google Admin SDK API
Google Service Account (GSA) Configuration¶
Critical Setup Step
Google Service Account (GSA) key creation is disabled by default in Google Cloud. You must enable this capability.
Required Steps:
- Enable Google Service Account (GSA) Key Creation:
  gcloud org-policies reset constraints/iam.disableServiceAccountKeyCreation \
    --project=YOUR_PROJECT_ID
- Create Google Service Account (GSA):
  - Create new Google Service Account (GSA) in Google Cloud Console
  - Download JSON key file securely
  - Store key file in secure location
- Configure Domain-Wide Delegation:
  Required OAuth scopes:
  https://www.googleapis.com/auth/drive
  https://www.googleapis.com/auth/drive.metadata.readonly
  https://www.googleapis.com/auth/admin.directory.group.member.readonly
  https://www.googleapis.com/auth/admin.directory.group.readonly
  https://www.googleapis.com/auth/drive.file
  https://www.googleapis.com/auth/drive.metadata
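The key file downloaded in the Create Google Service Account (GSA) step can be sanity-checked offline before it is used anywhere. The following is a minimal sketch, assuming the standard field names found in a Google service account JSON key; the `check_key_file` helper and the choice of fields to check are illustrative, and the permission check applies to POSIX systems only.

```python
import json
import os
import stat
from pathlib import Path

# Fields a service account JSON key is expected to carry (a subset).
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email", "client_id")


def check_key_file(path) -> list[str]:
    """Return a list of problems found with the key file (empty means OK)."""
    key_path = Path(path)
    if not key_path.is_file():
        return [f"{key_path} does not exist"]
    try:
        data = json.loads(key_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if not data.get(name)]
    if data.get("type") and data["type"] != "service_account":
        problems.append('"type" is not "service_account"')

    # On POSIX systems, flag keys that group or other users can access.
    if os.name == "posix" and stat.S_IMODE(key_path.stat().st_mode) & 0o077:
        problems.append("file is accessible by group or others; consider chmod 600")
    return problems
```

For example, `check_key_file("path/to/service-account.json")` returns an empty list when the file looks like a service account key and is readable only by its owner. It does not confirm that the key is still active in Google Cloud.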
Snowflake Requirements¶
Account & Access¶
- Snowflake Account: Active Enterprise account in AWS Commercial Regions
- Openflow Access: BYOC or SPCS Public Preview enabled
- Account Admin Access: Ability to create services and manage users
Account Requirements¶
- Admin Access: Ability to create databases, schemas, and service users
- Warehouse Access: Existing compute warehouse (e.g., compute_wh)
- Service User Capability: Permission to create SERVICE type users
Setup Note
The actual database, schema, and service user creation will be done in the Quick Setup Guide.
Cortex Search Requirements¶
- Cortex Search Enabled: Available in your Snowflake account
- Compute Warehouse: Dedicated warehouse for search operations
- Storage Database: Target database for processed documents
Infrastructure Requirements¶
Secrets Management (Recommended)¶
Choose one secrets management solution:
AWS Secrets Manager
- AWS account with Secrets Manager access
- IAM role for Snowflake integration
- Secrets stored with proper access policies
Azure Key Vault
- Azure subscription with Key Vault access
- Service principal for Snowflake integration
- Key Vault configured with appropriate policies
HashiCorp Vault
- HashiCorp Vault instance or cloud service
- Authentication method configured
- Policies for Snowflake access
Additional Demo Environment¶
For the complete demo setup:
- Google Shared Drive: "Festival Operations" shared drive created
- Document Collection: Access to the 16 demo business documents
- Network Access: Connectivity between Google Drive and Snowflake
Verification Checklist¶
Before proceeding with setup, verify:
Google Drive Access¶
You can verify Google Drive access in two ways:
Recommended for most users
- Open Google Drive in browser
- Navigate to your "Festival Operations" shared drive
- Try uploading a test file
- ✅ If successful, your permissions are working
Optional - for developers who want programmatic testing
# Install dev dependencies (includes Google API libraries)
uv sync --dev
# Test Google Service Account (GSA) access
python -c "
from google.oauth2 import service_account
from googleapiclient.discovery import build

# Load the service account key and request the Drive scope
credentials = service_account.Credentials.from_service_account_file(
    'path/to/service-account.json',
    scopes=['https://www.googleapis.com/auth/drive'],
)
service = build('drive', 'v3', credentials=credentials)
# A successful list call confirms that the key and Drive API access work
results = service.files().list(pageSize=10).execute()
print('✅ Google Drive API access successful')
"
Snowflake Connectivity¶
You can verify Snowflake access in two ways:
Recommended for most users
- Open your Snowflake account in browser
- Navigate to Worksheets
- Try running: SELECT CURRENT_ACCOUNT(), CURRENT_USER();
- Check Data > Databases and verify you can see your databases
- ✅ If successful, your account access is working
Optional - for developers who want to test Cortex Search specifically
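A programmatic check could go through the Snow CLI listed under Essential Tools. The following is a minimal sketch, assuming `snow sql -q` is available on your PATH and a connection is already configured; the helper names and the choice of `SHOW CORTEX SEARCH SERVICES` as a probe are assumptions, not the demo's own test.

```python
import subprocess
from typing import Optional

# Same statement as the browser check above.
CONNECTIVITY_QUERY = "SELECT CURRENT_ACCOUNT(), CURRENT_USER();"
# Assumed probe: lists Cortex Search services visible to the current role.
CORTEX_SEARCH_QUERY = "SHOW CORTEX SEARCH SERVICES;"


def build_snow_sql_command(query: str, connection: Optional[str] = None) -> list[str]:
    """Build the Snow CLI argument list for running a single SQL statement."""
    command = ["snow", "sql", "-q", query]
    if connection:
        # Name of a connection defined in your Snow CLI configuration.
        command += ["--connection", connection]
    return command


def run_check(query: str, connection: Optional[str] = None) -> bool:
    """Run the query through the Snow CLI; True means the CLI exited with code 0."""
    try:
        result = subprocess.run(
            build_snow_sql_command(query, connection),
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        print("snow executable not found on PATH")
        return False
    if result.returncode != 0:
        print(result.stderr.strip())
    return result.returncode == 0
```

Calling `run_check(CONNECTIVITY_QUERY)` and then `run_check(CORTEX_SEARCH_QUERY)` exercises both checks. A successful second call only shows that the statement is accepted for your role; an empty result is expected until a search service has been created.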
Openflow Access¶
Contact your Snowflake account team to verify:
- BYOC Deployment: Container infrastructure ready
- SPCS Access: Public Preview enabled
- Connector Access: Google Drive connector available
- Processing Limits: Understanding of document volume limits
Common Issues & Solutions¶
Google Service Account (GSA) Issues¶
Key Creation Disabled
Error: Cannot create Google Service Account (GSA) keys
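Solution: Enable key creation via organization policy by running the gcloud org-policies reset command shown under Google Service Account (GSA) Configuration.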
Domain-Wide Delegation
Error: Insufficient permissions for drive access
Solution: Verify all 6 OAuth scopes are configured correctly
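To make this check less error-prone, the configured scope string can be compared against the six required scopes. The following is a minimal sketch, assuming you paste in the comma-delimited scope string from your domain-wide delegation entry; the function names are illustrative.

```python
# The six OAuth scopes listed under Configure Domain-Wide Delegation.
REQUIRED_SCOPES = [
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/drive.metadata.readonly",
    "https://www.googleapis.com/auth/admin.directory.group.member.readonly",
    "https://www.googleapis.com/auth/admin.directory.group.readonly",
    "https://www.googleapis.com/auth/drive.file",
    "https://www.googleapis.com/auth/drive.metadata",
]


def scopes_as_comma_delimited() -> str:
    """Join the required scopes into one comma-delimited string."""
    return ",".join(REQUIRED_SCOPES)


def missing_scopes(configured: str) -> list[str]:
    """Return required scopes absent from a comma- or newline-separated string."""
    parts = configured.replace("\n", ",").split(",")
    have = {part.strip() for part in parts if part.strip()}
    # Exact matching keeps 'drive' from being confused with 'drive.file'.
    return [scope for scope in REQUIRED_SCOPES if scope not in have]
```

An empty list from `missing_scopes(...)` means all six scopes are present; any returned entries are the ones to add to the delegation entry.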
Snowflake Access Issues¶
Openflow Not Available
Error: Openflow connectors not visible
Solution: Contact Snowflake support to enable BYOC/SPCS access
Cortex Search Not Available
Error: Cortex functions not found
Solution: Verify account tier and region support
Next Steps¶
Once you've completed all prerequisites:
- Quick Setup: Complete the 15-minute setup guide, which includes Google Drive, Openflow, and Cortex Search configuration
Questions or Issues? Contact your Snowflake team or check the Quick Setup Guide for common solutions.