Script for backing up a PostgreSQL database
December 22, 2024
I want to write an automated script that runs monthly and saves database dump files to an AWS S3 bucket.
First, create an AWS S3 bucket and obtain access credentials for it.
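For example, this can be done from the command line with the AWS CLI (the bucket name and region below are placeholders; the access key for an IAM user with write permission on the bucket can be created in the IAM console or with aws iam create-access-key):
# Create the bucket (bucket names are globally unique, so replace the placeholder)
aws s3 mb s3://my-pg-backups-bucket --region us-east-1
# Verify that the bucket exists
aws s3 ls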
Then I wrote a bash script, backup_postgres.sh, that produces the PostgreSQL dump (recovery) files.
#!/bin/bash
# Check if the required arguments are provided
if [ $# -lt 4 ]; then
    echo "Usage: $0 <db_name> <username> <password> <full_path> [host] [port] [backup_dir]"
    exit 1
fi
# Get parameters
DB_NAME=$1                # The first argument is the database name
DB_USER=$2                # The database user
DB_PASSWORD=$3            # The database password
FULL_PATH=$4              # Base path under which the backup directory is created
DB_HOST=${5:-localhost}   # Optional: Defaults to 'localhost' if not provided
DB_PORT=${6:-5432}        # Optional: Defaults to '5432' if not provided
BACKUP_DIR=${7:-/backups} # Optional: Defaults to '/backups' if not provided
# Generate the backup filename
TIMESTAMP=$(date +"%Y-%m-%d_%H-%M-%S")
BACKUP_FILE="$FULL_PATH$BACKUP_DIR/${DB_NAME}_backup_$TIMESTAMP.dump"
# Ensure the backup directory exists
mkdir -p "$FULL_PATH$BACKUP_DIR"
if [ $? -ne 0 ]; then
    echo "Failed to create directory: $FULL_PATH$BACKUP_DIR"
    exit 1
fi
echo "Directory created successfully: $FULL_PATH$BACKUP_DIR"
echo "Starting backup for database: $DB_NAME"
PGPASSWORD="$DB_PASSWORD" pg_dump -U "$DB_USER" -h "$DB_HOST" -p "$DB_PORT" -F c -f "$BACKUP_FILE" "$DB_NAME"
# Check if the dump was successful
if [ $? -eq 0 ]; then
    echo "$BACKUP_FILE"
else
    echo "Backup failed!"
    exit 1
fi
exit 0
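To sanity-check the script before automating it, you can run it by hand; the database name, user, password, and path below are placeholders. Since pg_dump is called with -F c (custom format), the resulting file can be restored with pg_restore:
# Run the backup manually; on success the script prints the path of the dump file
bash backup_postgres.sh mydb myuser mypassword /home/pahtto/backup_postgres
# Restore the custom-format dump into a freshly created database
createdb -U myuser mydb_restored
pg_restore -U myuser -h localhost -p 5432 -d mydb_restored /home/pahtto/backup_postgres/backups/mydb_backup_<timestamp>.dump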
Then I wrote a Python script that triggers the backup_postgres.sh script, passes arguments to it, and uploads the created dump file to the S3 bucket.
import re
import subprocess
import boto3
import logging
from pathlib import Path
from botocore.exceptions import NoCredentialsError, PartialCredentialsError
from env_handler import var_getter

logging.basicConfig(
    level=logging.INFO,  # Set to DEBUG for more detailed logs
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("upload_to_s3.log"),  # Log to a file
        logging.StreamHandler()  # Log to the console
    ]
)

parent_path = Path(__file__).resolve().parent


def run_bash_script(script_path: Path, *args) -> str | None:
    """
    Runs a bash script with optional arguments and returns its output.

    :param script_path: Path to the bash script
    :param args: Arguments to pass to the script
    :return: str|None The standard output of the bash script
    """
    try:
        command = ["bash", str(script_path)] + list(args)
        # Run the bash script and capture its output
        result = subprocess.run(command, check=True, text=True, capture_output=True)
        return result.stdout.strip()
    except subprocess.CalledProcessError as e:
        logging.error("Error executing script: %s", e)
        logging.error("Error Output: %s", e.stderr)
        return None


def upload_to_s3(file_path, bucket_name, directory, region="us-east-1"):
    """
    Uploads a file to a specific directory in an S3 bucket.

    :param file_path: Path to the local file to upload
    :param bucket_name: Name of the S3 bucket
    :param directory: Directory (prefix) in the S3 bucket where the file should be uploaded
    :param region: AWS region of the S3 bucket (default is 'us-east-1')
    :return: str The S3 URL of the uploaded file if successful
    """
    try:
        logging.info("Starting the upload process.")
        # Extract the file name from the local file path
        file_name = file_path.split("/")[-1]
        logging.debug("File name extracted: %s", file_name)
        # Construct the full S3 key (directory + file name)
        s3_key = f"{directory}/{file_name}" if directory else file_name
        s3_client = boto3.client('s3',
                                 aws_access_key_id=var_getter('AWS_ACCESS_KEY_ID', path=parent_path),
                                 aws_secret_access_key=var_getter('AWS_SECRET_ACCESS_KEY', path=parent_path),
                                 region_name=region)
        logging.info("Uploading file '%s' to '%s'.", file_name, s3_key)
        s3_client.upload_file(file_path, bucket_name, s3_key)
        # Generate the S3 URL
        s3_url = f"https://{bucket_name}.s3.{region}.amazonaws.com/{s3_key}"
        logging.info("File '%s' successfully uploaded to '%s'.", file_path, s3_url)
        return s3_url
    except FileNotFoundError:
        logging.error("Error: The file '%s' was not found.", file_path)
    except NoCredentialsError:
        logging.error("Error: AWS credentials not found.")
    except PartialCredentialsError:
        logging.error("Error: Incomplete AWS credentials.")
    except Exception as e:
        logging.error("An unexpected error occurred: %s", e)
        raise


if __name__ == "__main__":
    script_path = parent_path / 'backup_postgres.sh'
    arguments = [var_getter('POSTGRESQL_NAME', path=parent_path),
                 var_getter('POSTGRESQL_USER', path=parent_path),
                 var_getter('POSTGRESQL_PASSWORD', path=parent_path),
                 str(parent_path)
                 ]
    output = run_bash_script(script_path, *arguments)
    if output:
        logging.info("Script Output: '%s'", output)
        pattern = r"(/home[^ ]*?/backup_postgres[^ ]*?\.dump)"
        matches = re.findall(pattern, output)
        logging.info("PG dump file names: '%s'", matches)
        upload_to_s3(file_path=matches[0],
                     bucket_name=var_getter('AWS_STORAGE_BUCKET_NAME', path=parent_path),
                     directory='pg_backups',
                     region=var_getter('AWS_S3_REGION_NAME', path=parent_path)
                     )
    else:
        logging.error("No stdout returned.")
Note that I handle credentials separately. You can find the source code for the whole set of scripts here.
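For reference, these are the variables the scripts read via var_getter, shown here with placeholder values:
POSTGRESQL_NAME=mydb
POSTGRESQL_USER=myuser
POSTGRESQL_PASSWORD=change-me
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_STORAGE_BUCKET_NAME=my-pg-backups-bucket
AWS_S3_REGION_NAME=us-east-1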
Last but not least, I've written the pg_backup.service unit in /etc/systemd/system/:
[Unit]
Description=Run Python Script Every Month
[Service]
Type=oneshot
User=root
ExecStart=/usr/bin/python3 /home/pahtto/backup_postgres/main.py
and, in the same directory, the timer for this service, pg_backup.timer:
[Unit]
Description=Run Python Script Every Month
Wants=pg_backup.service
[Timer]
OnCalendar=monthly
Persistent=true
[Install]
WantedBy=timers.target
Then run the following commands:
sudo systemctl daemon-reload # pick up the new unit files
sudo systemctl enable pg_backup.timer
sudo systemctl start pg_backup.timer
sudo systemctl status pg_backup.timer
systemctl list-timers --all # to check that the timer is scheduled
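To trigger a backup immediately instead of waiting for the monthly schedule, you can start the service directly and check its logs:
sudo systemctl start pg_backup.service
journalctl -u pg_backup.service -n 50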