Chapter 8: Deployment and Operational Scripts
Welcome to the final chapter of our cloud-accelerator-function tutorial! In Chapter 7: Structured Logging and Alerting, we saw how our application keeps detailed records of its actions and can notify us of important events. Now that our application is well-configured and monitored, how do we actually get it running in the cloud? And how do we manage all the Google Cloud pieces it needs, like storage buckets or special permissions? This is where our Deployment and Operational Scripts come into play.
The Automated Construction & Maintenance Crew: Why We Need These Scripts
Imagine you're building a large, prefabricated house. You have all the parts and a detailed plan. You could try to assemble everything manually – setting up the foundation, raising walls, connecting plumbing, wiring electricity. But it would be slow, tiring, and error-prone.
What if you had an automated construction crew? A team of robots and specialized machines that could:
- Prepare the building site.
- Assemble the house components according to the plan.
- Perform routine maintenance checks.
This is exactly what the Deployment and Operational Scripts do for our cloud-accelerator-function. They are a collection of shell scripts (files containing a series of commands) that automate the setup, deployment, and management of our application and all the Google Cloud Platform (GCP) infrastructure it relies on.
Our Use Case: Setting Up a Fresh Environment
Let's say you want to deploy the cloud-accelerator-function for a new project or in a new GCP environment. You'd need to:
- Tell GCP which project you're working on.
- Enable several GCP services (APIs) that our application uses.
- Create storage buckets for raw, archived, and error files.
- Set up Pub/Sub topics for messaging.
- Deploy our Python Flask application to Cloud Run so it can receive messages.
- Grant the correct permissions so different parts can talk to each other.
Doing all this manually through the GCP web console would be time-consuming and error-prone, especially if you need to do it more than once. Our scripts automate this entire process, making it faster, more reliable, and repeatable.
Key Concepts: Understanding Our "Crew"
1. **Shell Scripts (The Command Lists):**
   - These are simple text files, usually ending in `.sh` (like `setup_project.sh`).
   - They contain a list of commands that your computer's command line (or "shell") can understand and execute one after another.
   - For GCP, these commands are often `gcloud` commands, issued through Google's official command-line tool for interacting with GCP services.
2. **Automation (The "Robot" Power):**
   - The primary goal of these scripts is to automate tasks. Instead of you typing many commands, you run one script, and it executes all those commands for you.
3. **Configuration-Driven (Following the Blueprint):**
   - Remember Chapter 6: Configuration Management? We learned about a `main_config.yaml` file (that you would create and manage) and how running `python run/generate_config.py` creates a `run/scripts/config.sh` file.
   - Our operational scripts read settings from this `run/scripts/config.sh` file. This means if your project ID is `my-cool-project-123`, you set it once in `main_config.yaml`, generate `config.sh`, and all scripts will automatically use `my-cool-project-123` without you needing to change each script.
4. **Modular Scripts (Specialized Workers):**
   - Instead of one giant script that does everything, we have several smaller scripts, each focused on a specific area:
     - `setup_project.sh`: Configures basic GCP project settings and enables necessary APIs.
     - `create_bucket.sh`: Creates Google Cloud Storage buckets.
     - `create_pubsub.sh`: Sets up Pub/Sub topics and subscriptions.
     - `deploy_cloud_run.sh`: Deploys our Python application to Cloud Run.
     - `set_iam_permissions.sh`: Configures access control (who can do what).
     - ...and others for specific needs, like `addDataFormCompletionTrigger.sh` (which we saw in Chapter 1: Pub/Sub Event Handling & Routing and Chapter 5: External Services Integration (Dataform & Power BI)).
   - You can find these scripts in the `run/scripts/` directory of the project.
5. **Idempotency (Safe to Re-run - Mostly!):**
   - Many of these scripts are designed to be "idempotent," a fancy word meaning that running them multiple times should still achieve the same desired state without causing errors or unintended duplications. For example, if a script tries to create a bucket that already exists, `gcloud` commands can often handle this gracefully (e.g., by doing nothing or reporting that it already exists). Perfect idempotency isn't guaranteed for every single script operation, but it's the general aim; see the sketch just after this list for one common pattern.
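To tie the last two concepts together, here is a hedged sketch of the general shape of a generated `config.sh`; the variable names and values are illustrative assumptions, since the real set is determined by your `main_config.yaml`:

```bash
# run/scripts/config.sh -- generated by `python run/generate_config.py`.
# All names and values below are illustrative assumptions.
PROJECT_ID="my-cool-project-123"
PROJECT_REGION="europe-west1"
RAW_BUCKET_NAME="my-cool-project-123-raw"
```

And a script aiming for idempotency can check for a resource before creating it, so re-runs are harmless (again, a sketch of the pattern, not a copy of the project's scripts):

```bash
#!/bin/bash
set -e
source ./config.sh  # load PROJECT_ID, RAW_BUCKET_NAME, etc.

# Only create the bucket if it doesn't already exist, so re-running is safe.
if gcloud storage buckets describe "gs://$RAW_BUCKET_NAME" >/dev/null 2>&1; then
  echo "INFO: Bucket gs://$RAW_BUCKET_NAME already exists, skipping creation."
else
  gcloud storage buckets create "gs://$RAW_BUCKET_NAME" --project "$PROJECT_ID"
fi
```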
How to Use the Scripts: Our Automated Setup Process
Let's walk through setting up a new environment using these scripts. You'd typically run these commands from your terminal in the project's root directory.
Prerequisites:
- You have the Google Cloud SDK (`gcloud` command-line tool) installed and configured.
- You have edited your `main_config.yaml` (you would create this file based on a template or example) with all your project-specific details.
- You have run `python run/generate_config.py` to create `run/scripts/config.sh` and `run/pubsub/.env`, as detailed in Chapter 6: Configuration Management.
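Checking those prerequisites from the project root might look like this (the authentication step depends on how your machine is already set up):

```bash
gcloud --version                 # confirm the Google Cloud SDK is installed
gcloud auth login                # authenticate with your Google account, if needed
python run/generate_config.py    # (re)generate run/scripts/config.sh and run/pubsub/.env
```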
Step-by-Step Automation:
1. **Navigate to the Scripts Directory (Optional but Good Practice):**

   ```bash
   cd run/scripts
   ```

   This makes it easier to run the scripts without typing the full path.
2. **Set Up the GCP Project Defaults & APIs:** The `setup_project.sh` script handles initial GCP project configuration.

   ```bash
   ./setup_project.sh
   ```

   - What it does (conceptually): Reads `config.sh` for your `PROJECT_ID` and `PROJECT_REGION`. It then uses `gcloud` to:
     - Set your active GCP project and default region.
     - Enable essential Google Cloud APIs (services) like Pub/Sub, Cloud Run, Cloud Storage, BigQuery, Dataform, etc., that our application needs to function.
   - Output (example): You'll see messages like "Updated property [core/project]" and "Operation "operations/..." finished successfully."
3. **Create Cloud Storage Buckets:** The `create_bucket.sh` script creates the necessary GCS buckets.

   ```bash
   ./create_bucket.sh
   ```

   - What it does: Reads bucket names (like your raw data bucket, archive bucket) from `config.sh` and uses `gcloud storage buckets create` to make them in your project.
   - Output (example): "Creating gs://your-raw-bucket-name/..."
4. **Create Pub/Sub Topics and Subscriptions:** The `create_pubsub.sh` script sets up the messaging infrastructure.

   ```bash
   ./create_pubsub.sh
   ```

   - What it does: Reads the Pub/Sub topic name from `config.sh`. It then uses `gcloud` to:
     - Create the main Pub/Sub topic our application listens to.
     - (Optionally) Set up subscriptions if specific ones are needed at deployment time.
   - Output (example): "Created topic [projects/your-project-id/topics/your-topic-name]."
5. **Deploy the Application to Cloud Run:** The `deploy_cloud_run.sh` script deploys our Python Flask application (see the simplified sketch after this list).

   ```bash
   ./deploy_cloud_run.sh
   ```

   - What it does: This is a more complex script. It reads many settings from `config.sh` (like the Cloud Run service name, region, memory, CPU, the Pub/Sub topic it should listen to, database connection details for the `.env` file, etc.). It then uses `gcloud run deploy` to:
     - Package your Python application (from `run/pubsub/`) into a container image.
     - Upload this image to Google Container Registry or Artifact Registry.
     - Deploy this image as a new Cloud Run service, configured to be triggered by messages on your Pub/Sub topic.
     - Make the necessary environment variables (from your `.env` file, which was generated by `generate_config.py`) available to the Cloud Run service.
   - Output (example): A lot of output, ending with "Service [your-service-name] revision [your-service-name-...] has been deployed and is serving 100 percent of traffic at [URL]."
6. **Set IAM Permissions:** The `set_iam_permissions.sh` script ensures your Cloud Run service (and other components) have the necessary permissions.

   ```bash
   ./set_iam_permissions.sh
   ```

   - What it does: Reads service account names and project ID from `config.sh`. It uses `gcloud projects add-iam-policy-binding` and other `gcloud iam` commands to grant roles. For example, it ensures:
     - Pub/Sub can invoke your Cloud Run service.
     - Your Cloud Run service can read/write to the GCS buckets.
     - Your Cloud Run service can access the PostgreSQL database.
     - Your Cloud Run service can interact with BigQuery and Dataform.
   - Output (example): "Updated IAM policy for project [your-project-id]."
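For a feel of what steps 5 and 6 boil down to, here is a heavily simplified, hypothetical sketch of their core commands, assuming it runs from `run/scripts/`. The flags shown are a small subset, `$PUBSUB_SERVICE_ACCOUNT` is an assumed variable name, and the real scripts pass many more settings from `config.sh`:

```bash
#!/bin/bash
set -e
source ./config.sh

# Build the app in run/pubsub/ into a container and deploy it to Cloud Run.
gcloud run deploy "$SERVICE_NAME" \
  --source ../pubsub/ \
  --region "$PROJECT_REGION" \
  --project "$PROJECT_ID"

# Allow Pub/Sub's push service account to invoke the service (illustrative).
gcloud run services add-iam-policy-binding "$SERVICE_NAME" \
  --region "$PROJECT_REGION" \
  --project "$PROJECT_ID" \
  --member "serviceAccount:$PUBSUB_SERVICE_ACCOUNT" \
  --role "roles/run.invoker"
```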
After running these scripts, your cloud-accelerator-function environment should be fully set up and operational in GCP!
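If you set up fresh environments regularly, nothing stops you from chaining the steps in a small wrapper of your own. This `run_all.sh` is purely a convenience sketch, not a script shipped with the project:

```bash
#!/bin/bash
# run_all.sh -- hypothetical wrapper that runs the setup scripts in order.
set -e
cd run/scripts
./setup_project.sh
./create_bucket.sh
./create_pubsub.sh
./deploy_cloud_run.sh
./set_iam_permissions.sh
echo "INFO: Environment setup complete."
```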
Under the Hood: Anatomy of a Script
Let's look at what a typical operational script might contain. We'll use a simplified conceptual version of what `setup_project.sh` might do.
A Peek Inside a Simplified `setup_project.sh`:
```bash
#!/bin/bash
# Exit immediately if a command exits with a non-zero status.
set -e

# --- Load Configuration ---
# This line reads all variables from config.sh (like PROJECT_ID)
# and makes them available in this script.
source ./config.sh
echo "INFO: Loaded configuration from config.sh"

# --- Set Project Defaults ---
echo "INFO: Setting GCP project to $PROJECT_ID and region to $PROJECT_REGION..."
gcloud config set project "$PROJECT_ID"
gcloud config set compute/region "$PROJECT_REGION" # Example for region

# --- Enable Necessary APIs ---
echo "INFO: Enabling core APIs for project $PROJECT_ID..."
gcloud services enable pubsub.googleapis.com --project "$PROJECT_ID"
gcloud services enable run.googleapis.com --project "$PROJECT_ID"
gcloud services enable storage.googleapis.com --project "$PROJECT_ID"
# ... add more services like bigquery.googleapis.com, dataform.googleapis.com etc.

echo "INFO: Project setup and API enablement complete for $PROJECT_ID."
```
Let's break this down:
- `#!/bin/bash`: This is called a "shebang." It tells the system that this script should be executed with Bash, a common type of shell.
- `set -e`: A safety measure. If any command in the script fails (returns an error), the script will stop immediately. This prevents further commands from running in an unexpected state.
- `source ./config.sh`: This is super important! `source` is a shell command that reads and executes commands from the specified file (`./config.sh`) in the current shell's environment.
  - This means all the `VARIABLE_NAME="value"` lines in `config.sh` (which were generated by `python run/generate_config.py` from your `main_config.yaml`) are loaded.
  - After this line, the script can use variables like `$PROJECT_ID`, `$PROJECT_REGION`, `$RAW_BUCKET_NAME`, etc., directly.
- `echo "INFO: ..."`: These lines print informational messages to your terminal, so you can see what the script is doing.
- `gcloud config set project "$PROJECT_ID"`:
  - This is a `gcloud` command. `config set project` tells `gcloud` to set the default project for subsequent `gcloud` commands.
  - `"$PROJECT_ID"` uses the value of the `PROJECT_ID` variable loaded from `config.sh`. The quotes are good practice to handle names with spaces (though project IDs usually don't have them).
- `gcloud services enable pubsub.googleapis.com --project "$PROJECT_ID"`:
  - `services enable` tells `gcloud` to turn on a specific Google Cloud service (API). `pubsub.googleapis.com` is the identifier for the Pub/Sub API.
  - `--project "$PROJECT_ID"` explicitly specifies which project to enable it for, again using the configured variable.
Other scripts like `create_bucket.sh` would use commands like `gcloud storage buckets create gs://$RAW_BUCKET_NAME --project "$PROJECT_ID"`, and `deploy_cloud_run.sh` would use `gcloud run deploy $SERVICE_NAME --image ... --region $PROJECT_REGION ...` (highly simplified), all pulling their specific parameters from the `config.sh` file.
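Extending that idea, a minimal conceptual `create_bucket.sh` might look like the sketch below; the loop and the exact bucket variable names are assumptions for illustration, since the real script reads its own set from `config.sh`:

```bash
#!/bin/bash
set -e
source ./config.sh

# Bucket variable names are illustrative; config.sh defines the real ones.
for BUCKET in "$RAW_BUCKET_NAME" "$ARCHIVE_BUCKET_NAME" "$ERROR_BUCKET_NAME"; do
  echo "INFO: Creating gs://$BUCKET..."
  gcloud storage buckets create "gs://$BUCKET" \
    --project "$PROJECT_ID" \
    --location "$PROJECT_REGION"
done
```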
This combination of a central configuration mechanism (Chapter 6: Configuration Management) and these operational scripts makes managing the cloud-accelerator-function much more streamlined and less prone to manual errors.
Conclusion: Your Automated Toolkit
And with that, you've reached the end of our cloud-accelerator-function tutorial journey! In this chapter, we explored the Deployment and Operational Scripts, our automated construction and maintenance crew. You learned:
- These shell scripts automate setting up GCP resources and deploying the application.
- They are driven by configurations defined in `main_config.yaml` (your custom file) and distributed via `run/scripts/config.sh`.
- Scripts for different tasks (project setup, bucket creation, Pub/Sub setup, Cloud Run deployment, IAM permissions) provide a modular way to manage your environment.
- Using these scripts makes deployment faster, more consistent, and repeatable.
Throughout this tutorial, we've covered the entire lifecycle and architecture of the cloud-accelerator-function:
- How it receives messages with Chapter 1: Pub/Sub Event Handling & Routing.
- The Chapter 2: Data Ingestion & Transformation Pipeline for processing files.
- Storing vital information in Chapter 3: Metadata Persistence (Database).
- Orchestrating complex workflows with Chapter 4: Task Definition & Execution Workflow.
- Integrating with tools like Dataform and Power BI in Chapter 5: External Services Integration (Dataform & Power BI).
- Managing all its settings via Chapter 6: Configuration Management.
- Keeping an eye on things with Chapter 7: Structured Logging and Alerting.
- And finally, automating its setup and deployment with the scripts discussed in this chapter.
We hope this tutorial has given you a solid, beginner-friendly understanding of how the cloud-accelerator-function works and how you can use it to build powerful, automated data processing solutions on Google Cloud. Happy building!