Tutorial: Cloud Accelerator Module
This project, cloud-accelerator-function, is an event-driven data pipeline that runs primarily on Google Cloud.
It listens for Pub/Sub messages (e.g., notifications of new file uploads), processes them in a Flask application deployed to Cloud Run, and manages the resulting data workflows.
Data is ingested, transformed (e.g., CSV to Parquet), and loaded into BigQuery.
The system stores metadata about files, tasks, and configurations in a PostgreSQL database, which keeps operations reliable and traceable.
It can also orchestrate external services like Dataform for SQL transformations and Power BI for report refreshes.
Deployment and operational tasks are automated with shell scripts.
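To make the event-driven entry point more concrete, here is a minimal sketch of how a Pub/Sub push message could be decoded and routed before any processing starts. It is an illustration under stated assumptions, not the project's actual code: the function names (decode_push_envelope, route_event), the handler names ("ingest_csv", "ignore"), and the sample bucket and subscription are hypothetical. The envelope layout (a "message" object with base64-encoded "data" plus "attributes") follows the standard Pub/Sub push format, and the "OBJECT_FINALIZE" event type is what Cloud Storage notifications send for a newly uploaded object.

```python
import base64
import json


def decode_push_envelope(envelope: dict) -> dict:
    """Extract the payload and attributes from a Pub/Sub push request body."""
    message = envelope.get("message")
    if not isinstance(message, dict):
        raise ValueError("missing 'message' field in Pub/Sub envelope")
    # Pub/Sub push delivers the message body as a base64-encoded string.
    raw = base64.b64decode(message.get("data", "")).decode("utf-8")
    payload = json.loads(raw) if raw else {}
    return {
        "message_id": message.get("messageId"),
        "attributes": message.get("attributes") or {},
        "payload": payload,
    }


def route_event(event: dict) -> str:
    """Pick a handler name for a decoded event (handler names are hypothetical)."""
    event_type = event["attributes"].get("eventType")
    object_name = event["payload"].get("name", "")
    # Only newly finalized CSV objects are sent to ingestion in this sketch.
    if event_type == "OBJECT_FINALIZE" and object_name.lower().endswith(".csv"):
        return "ingest_csv"
    return "ignore"


# Build a sample envelope shaped like a Cloud Storage upload notification.
sample_payload = {"bucket": "example-bucket", "name": "incoming/sales.csv"}
sample_envelope = {
    "message": {
        "messageId": "1",
        "attributes": {"eventType": "OBJECT_FINALIZE"},
        "data": base64.b64encode(
            json.dumps(sample_payload).encode("utf-8")
        ).decode("ascii"),
    },
    "subscription": "projects/example-project/subscriptions/example-sub",
}

event = decode_push_envelope(sample_envelope)
print(route_event(event))  # prints "ingest_csv"
```

In a Flask application, a function like decode_push_envelope would typically be called with the parsed JSON body of the incoming POST request. Keeping the decoding and routing free of Flask makes them easy to test without a running server.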
Source Repository: Cloud Accelerator
Chapters
- Pub/Sub Event Handling & Routing
- Data Ingestion & Transformation Pipeline
- Metadata Persistence (Database)
- Task Definition & Execution Workflow
- External Services Integration (Dataform & Power BI)
- Configuration Management
- Structured Logging and Alerting
- Deployment and Operational Scripts