Cloud Processing¶
SPIMquant supports cloud-based processing for scalable analysis of large datasets.
Overview¶
Cloud processing enables:
- Scalable compute: Process many subjects in parallel
- Cloud storage: Direct access to data in Amazon S3 and Google Cloud Storage (GCS)
- Cost efficiency: Pay only for resources used
- No local hardware: Process without local infrastructure
Cloud Storage Support¶
Amazon S3¶
Read BIDS datasets directly from S3:
S3 Configuration¶
Set AWS credentials:
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_DEFAULT_REGION="us-east-1"
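Before launching a run, it can help to confirm that all three variables are actually set in the current shell. The helper below is a minimal sketch of such a check; `check_aws_env` is a hypothetical name and is not part of SPIMquant or the AWS CLI.

```shell
# Hypothetical helper: report which of the AWS variables above are unset.
# Returns 0 when all three are set and non-empty, 1 otherwise.
check_aws_env() {
  missing=0
  for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION; do
    # Indirect lookup of the variable named in $var (POSIX-compatible).
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  # Do not leave a copy of the last credential in a scratch variable.
  unset value
  return "$missing"
}
```

One possible use is to gate the run on the check, for example `check_aws_env && pixi run spimquant s3://bucket/bids /local/output participant --cores all`.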
Or use AWS CLI configuration:
Google Cloud Storage¶
Read from GCS buckets:
GCS Configuration¶
Authenticate with gcloud:
Or use service account:
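As a hedged sketch of the two options above: `gcloud auth application-default login` sets up Application Default Credentials interactively, while a service account key is usually exposed to Google client libraries through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. The helper name `use_service_account` and the key path are hypothetical, and whether SPIMquant needs further GCS configuration is not covered here.

```shell
# Option 1 (interactive, typically run once per machine):
#   gcloud auth application-default login

# Option 2: point Google client libraries at a service account key file.
# use_service_account is a hypothetical helper, not a gcloud or SPIMquant command.
use_service_account() {
  key_file="$1"
  # Fail early if the key file is missing or unreadable.
  if [ ! -r "$key_file" ]; then
    echo "key file not readable: $key_file" >&2
    return 1
  fi
  export GOOGLE_APPLICATION_CREDENTIALS="$key_file"
}
```

It could be called as `use_service_account /path/to/service-account-key.json` before starting a run; keep the key file outside the repository.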
Cloud Execution with Coiled¶
SPIMquant integrates with Coiled for cloud execution.
Setup Coiled¶
- Create a Coiled account at coiled.io
- Install Coiled CLI:
- Authenticate:
Run on Cloud¶
Coiled Configuration¶
Configure cloud resources:
Hybrid Workflows¶
Combine local and cloud resources:
# Download from cloud, process locally
pixi run spimquant s3://bucket/bids /local/output participant --cores all
# Process locally, upload results to cloud
pixi run spimquant /local/bids s3://bucket/output participant --cores all
Cost Optimization¶
Data Transfer Costs¶
Minimize data transfer:
- Process in the same region as the data
- Match the storage class to how often the data is read (standard for active data, infrequent-access or archive tiers for cold data)
- Clean up intermediate files
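The first point can be checked before a run. The sketch below assumes the bucket region is obtained with `aws s3api get-bucket-location --bucket <name> --query LocationConstraint --output text`, which prints `None` for buckets in `us-east-1`. The helper names `normalize_bucket_region` and `same_region` are hypothetical.

```shell
# Hypothetical helper: map the CLI's "None" (or an empty value) to us-east-1,
# and pass any other region name through unchanged.
normalize_bucket_region() {
  case "$1" in
    ""|None|null) echo "us-east-1" ;;
    *) echo "$1" ;;
  esac
}

# Hypothetical helper: succeed only when the bucket region ($1, as printed by
# the CLI) matches the compute region ($2).
same_region() {
  [ "$(normalize_bucket_region "$1")" = "$2" ]
}

# Example use (requires the AWS CLI and valid credentials):
#   bucket_region="$(aws s3api get-bucket-location --bucket bucket \
#       --query LocationConstraint --output text)"
#   same_region "$bucket_region" "$AWS_DEFAULT_REGION" \
#       || echo "bucket and compute are in different regions" >&2
```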
Compute Costs¶
Optimize compute resources:
- Right-size instance types
- Use spot instances when possible
- Stop resources when not in use
Storage Costs¶
Manage storage efficiently:
- Delete temporary files after completion
- Archive old results to cheaper storage tiers
- Use compression for large outputs
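A minimal sketch of the compression and cleanup steps, assuming results live in an ordinary local directory. `archive_results` is a hypothetical helper. Outputs that are already stored in a compressed format may shrink very little, so it is worth testing on one subject first.

```shell
# Hypothetical helper: pack a results directory into a gzip-compressed tarball
# next to it, then remove the uncompressed copy only if the archive can be read.
# Prints the path of the archive on success.
archive_results() {
  src_dir="${1%/}"                 # drop a trailing slash, if any
  archive="${src_dir}.tar.gz"
  tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
  # Listing the archive is a cheap integrity check before deleting anything.
  tar -tzf "$archive" >/dev/null || return 1
  rm -rf "$src_dir"
  echo "$archive"
}
```

The resulting archive could then be uploaded to a cheaper tier, for example with the `--storage-class` option of `aws s3 cp`; which class is appropriate depends on how soon the results need to be read again.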
Cloud Provider Guides¶
AWS¶
Setting Up¶
- Create S3 bucket
- Configure IAM permissions
- Set up EC2 instances (if not using Coiled)
Running on AWS¶
Google Cloud Platform¶
Setting Up¶
- Create GCS bucket
- Configure service account
- Set up Compute Engine instances
Running on GCP¶
Azure¶
Azure support is planned for future releases.
Monitoring and Debugging¶
Cloud Logs¶
Access logs for cloud runs:
Resource Monitoring¶
Monitor cloud resource usage:
- CPU utilization
- Memory usage
- Network I/O
- Storage I/O
Troubleshooting¶
Common cloud issues:
- Authentication failures: Check credentials
- Permission errors: Verify IAM/service account permissions
- Region errors: Ensure compute and storage resources are in the same region
- Network timeouts: Increase timeout settings
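For transient network failures, retrying with exponential backoff is a common complement to raising timeouts. The wrapper below is a generic shell sketch of that technique, not a SPIMquant setting; `retry` and `RETRY_BASE_DELAY` are hypothetical names.

```shell
# Hypothetical helper: run a command up to $1 times, doubling the wait
# between attempts. RETRY_BASE_DELAY is the first wait in seconds (default 2).
retry() {
  max_attempts="$1"
  shift
  attempt=1
  delay="${RETRY_BASE_DELAY:-2}"
  while :; do
    # Run the wrapped command; stop as soon as it succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}
```

For example, `retry 5 aws s3 sync /local/bids s3://bucket/bids/` would attempt the upload up to five times before reporting failure.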
Security Considerations¶
Data Security¶
- Use encrypted storage buckets
- Enable encryption in transit
- Implement access controls
- Audit access logs
Credential Management¶
- Never commit credentials to code
- Use environment variables or secret management
- Rotate credentials regularly
- Use least-privilege access
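As one way to act on the first point, a working tree can be scanned for strings shaped like AWS access key IDs before committing. This is a rough sketch: `find_aws_key_ids` is a hypothetical helper, the pattern only covers long-term key IDs with the `AKIA` prefix, and it does not detect secret keys or GCP service account files.

```shell
# Hypothetical helper: list files under a directory (default: current directory)
# that contain "AKIA" followed by 16 uppercase letters or digits.
# Exit status follows grep: 0 if at least one file matches, 1 if none do.
find_aws_key_ids() {
  grep -rlE 'AKIA[0-9A-Z]{16}' "${1:-.}"
}
```

A non-empty result is a prompt to move the value into an environment variable or a secret manager and to rotate the exposed key.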
Performance Tips¶
Network Performance¶
- Process data in the same region (and, where possible, the same zone) as the storage
- Use high-bandwidth instances
- Enable accelerated networking
Storage Performance¶
- Use SSD-backed storage
- Enable caching where appropriate
- Parallelize I/O operations
Compute Performance¶
- Choose appropriate instance types
- Use instances with local SSD for temp files
- Enable instance-level parallelization
Example Workflows¶
Complete Cloud Workflow¶
# 1. Upload data to S3
aws s3 sync /local/bids s3://bucket/bids/
# 2. Process on cloud
pixi run spimquant s3://bucket/bids s3://bucket/output participant \
    --cloud \
    --cores all
# 3. Download results
aws s3 sync s3://bucket/output /local/output/
Hybrid Processing¶
# Process participant-level on cloud
pixi run spimquant s3://bucket/bids /local/output participant --cloud --cores all
# Run group-level locally
pixi run spimquant /local/bids /local/output group \
    --contrast_column treatment \
    --contrast_values control drug
Comparison: Local vs Cloud¶
| Aspect | Local | Cloud |
|---|---|---|
| Setup | Hardware required | Account setup only |
| Cost | Upfront hardware | Pay-as-you-go |
| Scalability | Limited by hardware | Scales on demand, within account quotas and budget |
| Data transfer | None | Can be significant |
| Maintenance | Manual | Managed |
Next Steps¶
- Configuration: Configure cloud storage
- Workflows: Execution strategies
- Examples: Complete cloud examples