Tutorial: Cloud Accelerator Module

This project, cloud-accelerator-function, is an event-driven data pipeline that runs primarily on Google Cloud. It receives messages (e.g., notifications of new file uploads) via Pub/Sub and processes them in a Flask application deployed to Cloud Run. Data is ingested, transformed (e.g., CSV to Parquet), and loaded into BigQuery. A PostgreSQL database stores metadata about files, tasks, and configurations, which keeps operations reliable and traceable. The system can also orchestrate external services such as Dataform for SQL transformations and Power BI for report refreshes. Deployment and operational tasks are automated with shell scripts.
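To make the first step of that flow concrete, here is a minimal sketch of how a Pub/Sub push message might be decoded and routed before any processing begins. It uses only the Python standard library and is not taken from the project's code. The envelope layout (a "message" object with base64-encoded "data" and optional "attributes") follows the standard Pub/Sub push format; the function names decode_push_envelope and route_event, the handler names, and the routing rule based on file extension are assumptions made for illustration.

```python
import base64
import json


def decode_push_envelope(envelope: dict) -> dict:
    """Extract the JSON payload and attributes from a Pub/Sub push envelope."""
    message = envelope.get("message")
    if not isinstance(message, dict):
        raise ValueError("envelope has no 'message' object")
    # Pub/Sub push delivery base64-encodes the message body in "data".
    raw = base64.b64decode(message.get("data", ""))
    payload = json.loads(raw) if raw else {}
    return {"attributes": message.get("attributes", {}), "payload": payload}


def route_event(event: dict) -> str:
    """Hypothetical routing rule: choose a handler name from the file extension."""
    name = event["payload"].get("name", "")
    if name.lower().endswith(".csv"):
        return "ingest_csv"
    return "ignore"


# Example envelope shaped like a Cloud Storage upload notification.
body = json.dumps({"bucket": "example-bucket", "name": "sales/2024.csv"})
envelope = {
    "message": {
        "data": base64.b64encode(body.encode("utf-8")).decode("ascii"),
        "attributes": {"eventType": "OBJECT_FINALIZE"},
    }
}
event = decode_push_envelope(envelope)
handler = route_event(event)
```

In a Flask application, a function like decode_push_envelope would typically be called on the parsed JSON body of the POST request that Pub/Sub sends to the Cloud Run service; how the actual project structures this is covered in the chapters listed below.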

Source Repository: Cloud Accelerator

Chapters

Pub/Sub Event Handling & Routing
Data Ingestion & Transformation Pipeline
Metadata Persistence (Database)
Task Definition & Execution Workflow
External Services Integration (Dataform & Power BI)
Configuration Management
Structured Logging and Alerting
Deployment and Operational Scripts
