Script for backing up a PostgreSQL database

December 22, 2024

I want to write an automated script that runs monthly and saves database dump files to an AWS S3 bucket.

First, create an AWS S3 bucket and obtain access credentials for it.
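If the AWS CLI is already configured, the bucket can be created straight from the terminal; the bucket name and region below are placeholders, so substitute your own (the web console works just as well).

# Create the bucket (bucket name and region are placeholders)
aws s3 mb s3://my-pg-backups --region us-east-1
# Confirm it exists
aws s3 ls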

Then I wrote a bash script, backup_postgres.sh, that creates a PostgreSQL dump file.

#!/bin/bash

# Check if the required arguments are provided
if [ $# -lt 4 ]; then
    echo "Usage: $0 <db_name> <username> <password> <full_path> [host] [port] [backup_dir]"
    exit 1
fi

# Get parameters
DB_NAME=$1                       # The first argument is the database name
DB_USER=$2                       # Database user
DB_PASSWORD=$3                   # Database password (passed to pg_dump via PGPASSWORD)
FULL_PATH=$4                     # Base path under which the backup directory is created
DB_HOST=${5:-localhost}          # Optional: Defaults to 'localhost' if not provided
DB_PORT=${6:-5432}               # Optional: Defaults to '5432' if not provided
BACKUP_DIR=${7:-/backups}        # Optional: Defaults to '/backups' if not provided

# Generate the backup filename
TIMESTAMP=$(date +"%Y-%m-%d_%H-%M-%S")
BACKUP_FILE="$FULL_PATH$BACKUP_DIR/${DB_NAME}_backup_$TIMESTAMP.dump"
# Ensure the backup directory exists
mkdir -p "$FULL_PATH$BACKUP_DIR"
if [ $? -ne 0 ]; then
    echo "Failed to create directory: $BACKUP_DIR"
    exit 1
fi
echo "Directory created successfully: $BACKUP_DIR"
echo "Starting backup for database: $DB_NAME"
PGPASSWORD="$DB_PASSWORD" pg_dump -U "$DB_USER" -h "$DB_HOST" -p "$DB_PORT" -F c -f "$BACKUP_FILE" "$DB_NAME"

# Check if the dump was successful
if [ $? -eq 0 ]; then
    echo "$BACKUP_FILE"
else
    echo "Backup failed!"
    exit 1
fi

exit 0
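The script can be tested on its own before anything else is wired up; every value below is a placeholder. Since the dump is written in the custom format (-F c), it can later be restored with pg_restore if needed.

# Run the backup manually (placeholder credentials and path)
bash backup_postgres.sh mydb myuser mypassword /home/pahtto/backup_postgres

# Restore a dump into a database (placeholder file name)
PGPASSWORD=mypassword pg_restore -U myuser -h localhost -p 5432 -d mydb \
    /home/pahtto/backup_postgres/backups/mydb_backup_2024-12-22_10-00-00.dump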

Then I wrote a Python script that triggers the backup_postgres.sh script, passes arguments to it, and uploads the created dump file to the S3 bucket.

import re
import subprocess
import boto3
import logging
from pathlib import Path
from botocore.exceptions import NoCredentialsError, PartialCredentialsError
from env_handler import var_getter

logging.basicConfig(
    level=logging.INFO,  # Set to DEBUG for more detailed logs
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("upload_to_s3.log"),  # Log to a file
        logging.StreamHandler()  # Log to the console
    ]
)

parent_path = Path(__file__).resolve().parent

def run_bash_script(script_path: Path, *args) -> str | None:
    """
    Runs a bash script with optional arguments and returns its output.

    :param script_path: Path to the bash script
    :param args: Arguments to pass to the script
    :return: str|None The standard output of the bash script
    """
    try:
        command = ["bash", script_path] + list(args)
        # Run the bash script and capture its output
        result = subprocess.run(command, check=True, text=True, capture_output=True)
        return result.stdout.strip()

    except subprocess.CalledProcessError as e:
        logging.error("Error executing script: %s", e)
        logging.error("Error Output: %s", e.stderr)
        return None

def upload_to_s3(file_path, bucket_name, directory, region="us-east-1"):
    """
    Uploads a file to a specific directory in an S3 bucket.

    :param file_path: Path to the local file to upload
    :param bucket_name: Name of the S3 bucket
    :param directory: Directory (prefix) in the S3 bucket where the file should be uploaded
    :param region: AWS region of the S3 bucket (default is 'us-east-1')
    :return: str The S3 URL of the uploaded file if successful
    """
    try:
        logging.info("Starting the upload process.")
        # Extract the file name from the local file path
        file_name = file_path.split("/")[-1]
        logging.debug("File name extracted: %s", file_name)
        # Construct the full S3 key (directory + file name)
        s3_key = f"{directory}/{file_name}" if directory else file_name

        s3_client = boto3.client('s3',
                                 aws_access_key_id=var_getter('AWS_ACCESS_KEY_ID', path=parent_path),
                                 aws_secret_access_key=var_getter('AWS_SECRET_ACCESS_KEY', path=parent_path),
                                 region_name=region)

        logging.info("Uploading to file '%s' to the '%s'.", file_name, s3_key)
        s3_client.upload_file(file_path, bucket_name, s3_key)
        # Generate the S3 URL
        s3_url = f"https://{bucket_name}.s3.{region}.amazonaws.com/{s3_key}"
        logging.info("File '%s' successfully uploaded to '%s'.", file_path, s3_url)
        return s3_url

    except FileNotFoundError:
        logging.error("Error: The file '%s' was not found.", file_path)
    except NoCredentialsError:
        logging.error("Error: AWS credentials not found.")
    except PartialCredentialsError:
        print("Error: Incomplete AWS credentials.")
    except Exception as e:
        logging.error("An unexpected error occurred: %s", e)
        raise

if __name__ == "__main__":
    script_path = parent_path / 'backup_postgres.sh'
    arguments = [var_getter('POSTGRESQL_NAME', path=parent_path),
                 var_getter('POSTGRESQL_USER', path=parent_path),
                 var_getter('POSTGRESQL_PASSWORD', path=parent_path),
                 str(parent_path)
                ]

    output = run_bash_script(script_path, *arguments)
    if output:
        logging.info("Script Output: '%s'", output)
        pattern = r"(/home[^ ]*?/backup_postgres[^ ]*?\.dump)"
        matches = re.findall(pattern, output)
        logging.info("PG dump file names: '%s'", matches)

        if matches:
            upload_to_s3(file_path=matches[0],
                         bucket_name=var_getter('AWS_STORAGE_BUCKET_NAME', path=parent_path),
                         directory='pg_backups',
                         region=var_getter('AWS_S3_REGION_NAME', path=parent_path)
                         )
        else:
            logging.error("No dump file path found in the script output.")
    else:
        logging.error("No stdout returned.")

Note that I handle credentials separately, via the var_getter helper from env_handler. You can find the source code for the whole set of scripts here.
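var_getter pulls each value from local configuration next to the scripts (the path argument points at the script directory); the exact storage format isn't shown here, so the dotenv-style file below is only an illustration of which variables the scripts expect.

# .env — example values only, not real credentials
POSTGRESQL_NAME=mydb
POSTGRESQL_USER=myuser
POSTGRESQL_PASSWORD=changeme
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_STORAGE_BUCKET_NAME=my-pg-backups
AWS_S3_REGION_NAME=us-east-1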

Last but not least, I've written the pg_backup.service unit in /etc/systemd/system/:

[Unit]
Description=Run Python Script Every Month

[Service]
Type=oneshot
User=root
ExecStart=/usr/bin/python3 /home/pahtto/backup_postgres/main.py

and, in the same directory, the timer for this service, pg_backup.timer:

[Unit]
Description=Run Python Script Every Month
Wants=pg_backup.service

[Timer]
OnCalendar=monthly
Persistent=true

[Install]
WantedBy=timers.target
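OnCalendar=monthly fires at midnight on the first day of each month. If a different schedule is wanted, the calendar expression can be adjusted and previewed with systemd-analyze; the expression below is just an example.

# Preview when a calendar expression would trigger (example: 1st of the month at 03:00)
systemd-analyze calendar "*-*-01 03:00:00"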

Then run the following commands:

sudo systemctl daemon-reload     # pick up the new unit files
sudo systemctl enable pg_backup.timer
sudo systemctl start pg_backup.timer
sudo systemctl status pg_backup.timer
systemctl list-timers --all      # to check if the timer runs
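The service can also be started once by hand to make sure the whole chain works; the bucket name below is a placeholder.

# Trigger a backup immediately instead of waiting for the timer
sudo systemctl start pg_backup.service
# Inspect the run's logs
journalctl -u pg_backup.service
# Check that the dump appeared in the bucket (placeholder bucket name)
aws s3 ls s3://my-pg-backups/pg_backups/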

 

