Chapter 8: Deployment and Operational Scripts
Welcome to the final chapter of our cloud-accelerator-function tutorial! In Chapter 7: Structured Logging and Alerting, we saw how our application keeps detailed records of its actions and can notify us of important events. Now that our application is well-configured and monitored, how do we actually get it running in the cloud? And how do we manage all the Google Cloud pieces it needs, like storage buckets or special permissions? This is where our Deployment and Operational Scripts come into play.
The Automated Construction & Maintenance Crew: Why We Need These Scripts
Imagine you're building a large, prefabricated house. You have all the parts and a detailed plan. You could try to assemble everything manually – setting up the foundation, raising walls, connecting plumbing, wiring electricity. But it would be slow, tiring, and error-prone.
What if you had an automated construction crew? A team of robots and specialized machines that could:
- Prepare the building site.
- Assemble the house components according to the plan.
- Perform routine maintenance checks.
This is exactly what the Deployment and Operational Scripts do for our cloud-accelerator-function. They are a collection of shell scripts (files containing a series of commands) that automate the setup, deployment, and management of our application and all the Google Cloud Platform (GCP) infrastructure it relies on.
Our Use Case: Setting Up a Fresh Environment
Let's say you want to deploy the cloud-accelerator-function for a new project or in a new GCP environment. You'd need to:
- Tell GCP which project you're working on.
- Enable several GCP services (APIs) that our application uses.
- Create storage buckets for raw, archived, and error files.
- Set up Pub/Sub topics for messaging.
- Deploy our Python Flask application to Cloud Run so it can receive messages.
- Grant the correct permissions so different parts can talk to each other.
Doing all this manually through the GCP web console would be time-consuming and error-prone, especially if you need to do it more than once. Our scripts automate this entire process, making it faster, more reliable, and repeatable.
Key Concepts: Understanding Our "Crew"
1. **Shell Scripts (The Command Lists):**
   - These are simple text files, usually ending in `.sh` (like `setup_project.sh`).
   - They contain a list of commands that your computer's command line (or "shell") can understand and execute one after another.
   - For GCP, these commands are often `gcloud` commands, issued through Google's official command-line tool for interacting with GCP services.
2. **Automation (The "Robot" Power):**
   - The primary goal of these scripts is to automate tasks. Instead of you typing many commands, you run one script, and it executes all those commands for you.
3. **Configuration-Driven (Following the Blueprint):**
   - Remember Chapter 6: Configuration Management? We learned about a `main_config.yaml` file (that you would create and manage) and how running `python run/generate_config.py` creates a `run/scripts/config.sh` file.
   - Our operational scripts read settings from this `run/scripts/config.sh` file. This means if your project ID is `my-cool-project-123`, you set it once in `main_config.yaml`, generate `config.sh`, and all scripts will automatically use `my-cool-project-123` without you needing to change each script.
4. **Modular Scripts (Specialized Workers):**
   - Instead of one giant script that does everything, we have several smaller scripts, each focused on a specific area:
     - `setup_project.sh`: Configures basic GCP project settings and enables necessary APIs.
     - `create_bucket.sh`: Creates Google Cloud Storage buckets.
     - `create_pubsub.sh`: Sets up Pub/Sub topics and subscriptions.
     - `deploy_cloud_run.sh`: Deploys our Python application to Cloud Run.
     - `set_iam_permissions.sh`: Configures access control (who can do what).
     - ...and others for specific needs, like `addDataFormCompletionTrigger.sh` (which we saw in Chapter 1: Pub/Sub Event Handling & Routing and Chapter 5: External Services Integration (Dataform & Power BI)).
   - You can find these scripts in the `run/scripts/` directory of the project.
5. **Idempotency (Safe to Re-run - Mostly!):**
   - Many of these scripts are designed to be "idempotent," a fancy word meaning that running them multiple times should still achieve the same desired state without causing errors or unintended duplications. For example, if a script tries to create a bucket that already exists, `gcloud` commands can often handle this gracefully (e.g., by doing nothing or reporting that it already exists). Perfect idempotency isn't guaranteed for every single script operation, but it's the general aim; see the sketch just after this list for one common pattern.
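To tie the last two concepts together, here is a hedged sketch of the general shape of a generated `config.sh`; the variable names and values are illustrative assumptions, since the real set is determined by your `main_config.yaml`:

```bash
# run/scripts/config.sh -- generated by `python run/generate_config.py`.
# All names and values below are illustrative assumptions.
PROJECT_ID="my-cool-project-123"
PROJECT_REGION="europe-west1"
RAW_BUCKET_NAME="my-cool-project-123-raw"
```

And a script aiming for idempotency can check for a resource before creating it, so re-runs are harmless (again, a sketch of the pattern, not a copy of the project's scripts):

```bash
#!/bin/bash
set -e
source ./config.sh  # load PROJECT_ID, RAW_BUCKET_NAME, etc.

# Only create the bucket if it doesn't already exist, so re-running is safe.
if gcloud storage buckets describe "gs://$RAW_BUCKET_NAME" >/dev/null 2>&1; then
  echo "INFO: Bucket gs://$RAW_BUCKET_NAME already exists, skipping creation."
else
  gcloud storage buckets create "gs://$RAW_BUCKET_NAME" --project "$PROJECT_ID"
fi
```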
How to Use the Scripts: Our Automated Setup Process
Let's walk through setting up a new environment using these scripts. You'd typically run these commands from your terminal in the project's root directory.
Prerequisites:
- You have the Google Cloud SDK (`gcloud` command-line tool) installed and configured.
- You have edited your `main_config.yaml` (you would create this file based on a template or example) with all your project-specific details.
- You have run `python run/generate_config.py` to create `run/scripts/config.sh` and `run/pubsub/.env`, as detailed in Chapter 6: Configuration Management.
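Checking those prerequisites from the project root might look like this (the authentication step depends on how your machine is already set up):

```bash
gcloud --version                 # confirm the Google Cloud SDK is installed
gcloud auth login                # authenticate with your Google account, if needed
python run/generate_config.py    # (re)generate run/scripts/config.sh and run/pubsub/.env
```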
Step-by-Step Automation:
1. **Navigate to the Scripts Directory (Optional but Good Practice):**

   ```bash
   cd run/scripts
   ```

   This makes it easier to run the scripts without typing the full path.
2. **Set Up the GCP Project Defaults & APIs:** The `setup_project.sh` script handles initial GCP project configuration.

   ```bash
   ./setup_project.sh
   ```

   - What it does (conceptually): Reads `config.sh` for your `PROJECT_ID` and `PROJECT_REGION`. It then uses `gcloud` to:
     - Set your active GCP project and default region.
     - Enable essential Google Cloud APIs (services) like Pub/Sub, Cloud Run, Cloud Storage, BigQuery, Dataform, etc., that our application needs to function.
   - Output (example): You'll see messages like "Updated property [core/project]" and "Operation "operations/..." finished successfully."
3. **Create Cloud Storage Buckets:** The `create_bucket.sh` script creates the necessary GCS buckets.

   ```bash
   ./create_bucket.sh
   ```

   - What it does: Reads bucket names (like your raw data bucket, archive bucket) from `config.sh` and uses `gcloud storage buckets create` to make them in your project.
   - Output (example): "Creating gs://your-raw-bucket-name/..."
4. **Create Pub/Sub Topics and Subscriptions:** The `create_pubsub.sh` script sets up the messaging infrastructure.

   ```bash
   ./create_pubsub.sh
   ```

   - What it does: Reads the Pub/Sub topic name from `config.sh`. It then uses `gcloud` to:
     - Create the main Pub/Sub topic our application listens to.
     - (Optionally) Set up subscriptions if specific ones are needed at deployment time.
   - Output (example): "Created topic [projects/your-project-id/topics/your-topic-name]."
5. **Deploy the Application to Cloud Run:** The `deploy_cloud_run.sh` script deploys our Python Flask application (see the simplified sketch after this list).

   ```bash
   ./deploy_cloud_run.sh
   ```

   - What it does: This is a more complex script. It reads many settings from `config.sh` (like the Cloud Run service name, region, memory, CPU, the Pub/Sub topic it should listen to, database connection details for the `.env` file, etc.). It then uses `gcloud run deploy` to:
     - Package your Python application (from `run/pubsub/`) into a container image.
     - Upload this image to Google Container Registry or Artifact Registry.
     - Deploy this image as a new Cloud Run service, configured to be triggered by messages on your Pub/Sub topic.
     - Make the necessary environment variables (from your `.env` file, which was generated by `generate_config.py`) available to the Cloud Run service.
   - Output (example): A lot of output, ending with "Service [your-service-name] revision [your-service-name-...] has been deployed and is serving 100 percent of traffic at [URL]."
6. **Set IAM Permissions:** The `set_iam_permissions.sh` script ensures your Cloud Run service (and other components) have the necessary permissions.

   ```bash
   ./set_iam_permissions.sh
   ```

   - What it does: Reads service account names and project ID from `config.sh`. It uses `gcloud projects add-iam-policy-binding` and other `gcloud iam` commands to grant roles. For example, it ensures:
     - Pub/Sub can invoke your Cloud Run service.
     - Your Cloud Run service can read/write to the GCS buckets.
     - Your Cloud Run service can access the PostgreSQL database.
     - Your Cloud Run service can interact with BigQuery and Dataform.
   - Output (example): "Updated IAM policy for project [your-project-id]."
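For a feel of what steps 5 and 6 boil down to, here is a heavily simplified, hypothetical sketch of their core commands, assuming it runs from `run/scripts/`. The flags shown are a small subset, `$PUBSUB_SERVICE_ACCOUNT` is an assumed variable name, and the real scripts pass many more settings from `config.sh`:

```bash
#!/bin/bash
set -e
source ./config.sh

# Build the app in run/pubsub/ into a container and deploy it to Cloud Run.
gcloud run deploy "$SERVICE_NAME" \
  --source ../pubsub/ \
  --region "$PROJECT_REGION" \
  --project "$PROJECT_ID"

# Allow Pub/Sub's push service account to invoke the service (illustrative).
gcloud run services add-iam-policy-binding "$SERVICE_NAME" \
  --region "$PROJECT_REGION" \
  --project "$PROJECT_ID" \
  --member "serviceAccount:$PUBSUB_SERVICE_ACCOUNT" \
  --role "roles/run.invoker"
```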
After running these scripts, your cloud-accelerator-function environment should be fully set up and operational in GCP!
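If you set up fresh environments regularly, nothing stops you from chaining the steps in a small wrapper of your own. This `run_all.sh` is purely a convenience sketch, not a script shipped with the project:

```bash
#!/bin/bash
# run_all.sh -- hypothetical wrapper that runs the setup scripts in order.
set -e
cd run/scripts
./setup_project.sh
./create_bucket.sh
./create_pubsub.sh
./deploy_cloud_run.sh
./set_iam_permissions.sh
echo "INFO: Environment setup complete."
```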
Under the Hood: Anatomy of a Script
Let's look at what a typical operational script might contain. We'll use a simplified conceptual version of what `setup_project.sh` might do.
A Peek Inside a Simplified `setup_project.sh`:
```bash
#!/bin/bash
# Exit immediately if a command exits with a non-zero status.
set -e

# --- Load Configuration ---
# This line reads all variables from config.sh (like PROJECT_ID)
# and makes them available in this script.
source ./config.sh
echo "INFO: Loaded configuration from config.sh"

# --- Set Project Defaults ---
echo "INFO: Setting GCP project to $PROJECT_ID and region to $PROJECT_REGION..."
gcloud config set project "$PROJECT_ID"
gcloud config set compute/region "$PROJECT_REGION" # Example for region

# --- Enable Necessary APIs ---
echo "INFO: Enabling core APIs for project $PROJECT_ID..."
gcloud services enable pubsub.googleapis.com --project "$PROJECT_ID"
gcloud services enable run.googleapis.com --project "$PROJECT_ID"
gcloud services enable storage.googleapis.com --project "$PROJECT_ID"
# ... add more services like bigquery.googleapis.com, dataform.googleapis.com etc.

echo "INFO: Project setup and API enablement complete for $PROJECT_ID."
```
Let's break this down:
- `#!/bin/bash`: This is called a "shebang." It tells the system that this script should be executed with Bash, a common type of shell.
- `set -e`: A safety measure. If any command in the script fails (returns an error), the script will stop immediately. This prevents further commands from running in an unexpected state.
- `source ./config.sh`: This is super important! `source` is a shell command that reads and executes commands from the specified file (`./config.sh`) in the current shell's environment.
  - This means all the `VARIABLE_NAME="value"` lines in `config.sh` (which were generated by `python run/generate_config.py` from your `main_config.yaml`) are loaded.
  - After this line, the script can use variables like `$PROJECT_ID`, `$PROJECT_REGION`, `$RAW_BUCKET_NAME`, etc., directly.
- `echo "INFO: ..."`: These lines print informational messages to your terminal, so you can see what the script is doing.
- `gcloud config set project "$PROJECT_ID"`:
  - This is a `gcloud` command. `config set project` tells `gcloud` to set the default project for subsequent `gcloud` commands.
  - `"$PROJECT_ID"` uses the value of the `PROJECT_ID` variable loaded from `config.sh`. The quotes are good practice to handle names with spaces (though project IDs usually don't have them).
- `gcloud services enable pubsub.googleapis.com --project "$PROJECT_ID"`:
  - `services enable` tells `gcloud` to turn on a specific Google Cloud service (API). `pubsub.googleapis.com` is the identifier for the Pub/Sub API.
  - `--project "$PROJECT_ID"` explicitly specifies which project to enable it for, again using the configured variable.
Other scripts like `create_bucket.sh` would use commands like `gcloud storage buckets create gs://$RAW_BUCKET_NAME --project "$PROJECT_ID"`, and `deploy_cloud_run.sh` would use `gcloud run deploy $SERVICE_NAME --image ... --region $PROJECT_REGION ...` (highly simplified), all pulling their specific parameters from the `config.sh` file.
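Extending that idea, a minimal conceptual `create_bucket.sh` might look like the sketch below; the loop and the exact bucket variable names are assumptions for illustration, since the real script reads its own set from `config.sh`:

```bash
#!/bin/bash
set -e
source ./config.sh

# Bucket variable names are illustrative; config.sh defines the real ones.
for BUCKET in "$RAW_BUCKET_NAME" "$ARCHIVE_BUCKET_NAME" "$ERROR_BUCKET_NAME"; do
  echo "INFO: Creating gs://$BUCKET..."
  gcloud storage buckets create "gs://$BUCKET" \
    --project "$PROJECT_ID" \
    --location "$PROJECT_REGION"
done
```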
This combination of a central configuration mechanism (Chapter 6: Configuration Management) and these operational scripts makes managing the cloud-accelerator-function much more streamlined and less prone to manual errors.
Conclusion: Your Automated Toolkit
And with that, you've reached the end of our cloud-accelerator-function tutorial journey! In this chapter, we explored the Deployment and Operational Scripts, our automated construction and maintenance crew. You learned:
- These shell scripts automate setting up GCP resources and deploying the application.
- They are driven by configurations defined in `main_config.yaml` (your custom file) and distributed via `run/scripts/config.sh`.
- Scripts for different tasks (project setup, bucket creation, Pub/Sub setup, Cloud Run deployment, IAM permissions) provide a modular way to manage your environment.
- Using these scripts makes deployment faster, more consistent, and repeatable.
Throughout this tutorial, we've covered the entire lifecycle and architecture of the cloud-accelerator-function:
- How it receives messages with Chapter 1: Pub/Sub Event Handling & Routing.
- The Chapter 2: Data Ingestion & Transformation Pipeline for processing files.
- Storing vital information in Chapter 3: Metadata Persistence (Database).
- Orchestrating complex workflows with Chapter 4: Task Definition & Execution Workflow.
- Integrating with tools like Dataform and Power BI in Chapter 5: External Services Integration (Dataform & Power BI).
- Managing all its settings via Chapter 6: Configuration Management.
- Keeping an eye on things with Chapter 7: Structured Logging and Alerting.
- And finally, automating its setup and deployment with the scripts discussed in this chapter.
We hope this tutorial has given you a solid, beginner-friendly understanding of how the cloud-accelerator-function works and how you can use it to build powerful, automated data processing solutions on Google Cloud. Happy building!