How to back up your Postgres DB to AWS S3 in Kubernetes
TL;DR: Use a CronJob to schedule a daily DB backup

A good friend of mine told me some time ago…
The difference between a good developer and a bad one is that the former makes backups.

Designing our backup solution
Our business requirements for the backup solution are:
- Not in the same place. We use AWS S3 to store the backups, outside our cluster network.
- Secure and encrypted. Custody of data is essential. We will use secure transport up to S3 and the files will be encrypted with a symmetric key.
- Kubernetes way. Backups should be done automatically using Kubernetes resources.
Create and configure an AWS S3 bucket
Creating an AWS S3 bucket is pretty straightforward. We create it from our main AWS account with the “Create bucket” wizard.

What is not so easy is getting the access and security configuration right.
In our case we need a dedicated user that only uses the AWS API and has write-only permissions (PutObject, in essence).
There is no default S3 policy for write-only access, so we create a custom policy first and assign it when we create the new user.
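A write-only policy of this kind could look like the sketch below. The bucket name `my-backup-bucket`, the folder `postgres` and the file name `s3-write-only-policy.json` are placeholders of ours, not values from a real setup.

```shell
#!/usr/bin/env bash
set -eu

# Placeholder names (assumptions): replace with your own bucket and folder.
BUCKET="${BUCKET:-my-backup-bucket}"
PREFIX="${PREFIX:-postgres}"
POLICY_FILE="${POLICY_FILE:-s3-write-only-policy.json}"

# Write-only policy: s3:PutObject is the only allowed action,
# and only for objects under one folder of one bucket.
cat > "$POLICY_FILE" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "WriteOnlyBackups",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": ["arn:aws:s3:::${BUCKET}/${PREFIX}/*"]
    }
  ]
}
EOF

echo "Policy written to $POLICY_FILE"
```

If you prefer the CLI to the console, the file can then be registered with `aws iam create-policy --policy-name s3-write-only-backups --policy-document file://s3-write-only-policy.json` (the policy name is again a placeholder).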

We even restrict the folders that the user can write to.
Now, the user creation and policy assignment.
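The same steps can also be done from the AWS CLI instead of the console. This is a minimal sketch, assuming the policy already exists; the function name `create_backup_user` and the user name passed to it are hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# Create an API-only user and attach the write-only policy to it.
# Arguments: <user name> <policy ARN>
create_backup_user() {
  local user_name="$1" policy_arn="$2"

  # No login profile is created, so the user cannot sign in to the console.
  aws iam create-user --user-name "$user_name"

  # Attach the custom write-only policy.
  aws iam attach-user-policy --user-name "$user_name" --policy-arn "$policy_arn"

  # Prints the access key ID and secret access key; store them safely,
  # the secret is only shown once.
  aws iam create-access-key --user-name "$user_name"
}
```

It would be called as `create_backup_user backup-writer arn:aws:iam::123456789012:policy/s3-write-only-backups`, where the account ID and names are placeholders.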

Good! We are ready to move on to the Kubernetes part.
Write and apply CronJob
We create a bash script with the following steps:
- Database backup with pg_dump command.
- Compression with BZip2.
- Encrypt the file.
- Upload it to S3.
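The four steps above can be sketched as a single pipeline. This is an illustration under assumptions, not a definitive script: the encryption tool (`openssl enc` with AES-256-CBC), the function name `run_backup` and the variables `S3_BUCKET`, `S3_PREFIX` and `ENCRYPTION_KEY` are our choices. The `PG*` variables are the standard ones that pg_dump reads, and the AWS CLI takes its credentials from `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.

```shell
#!/usr/bin/env bash
# pipefail makes the whole pipeline fail if pg_dump fails,
# so a broken dump is never uploaded as if it were a good backup.
set -euo pipefail

run_backup() {
  # Fail early if a required variable is missing.
  : "${PGDATABASE:?}" "${S3_BUCKET:?}" "${ENCRYPTION_KEY:?}"

  local stamp file
  stamp="$(date -u +%Y-%m-%dT%H-%M-%SZ)"
  file="$(mktemp)"

  # 1. dump, 2. compress, 3. encrypt with a symmetric key.
  # ENCRYPTION_KEY must be exported so that openssl can read it.
  pg_dump \
    | bzip2 \
    | openssl enc -aes-256-cbc -pbkdf2 -salt -pass env:ENCRYPTION_KEY \
    > "$file"

  # 4. upload over HTTPS to the folder allowed by the write-only policy.
  aws s3 cp "$file" \
    "s3://${S3_BUCKET}/${S3_PREFIX:-postgres}/${PGDATABASE}_${stamp}.sql.bz2.enc"

  rm -f "$file"
}
```

Saved as a script with a final `run_backup` line, it needs nothing but environment variables, which is what lets Kubernetes inject everything. A backup made this way would be restored with `openssl enc -d` using the same options, followed by `bunzip2`.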
As you can observe, all the parameters are passed as environment variables.
The CronJob spec is the following:
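A minimal sketch of what such a spec could look like, written to the manifest file from the shell. The image `registry.example.com/postgres-backup:latest` and the resource names `backup-postgres`, `postgres-backup-config` and `postgres-backup-secrets` are hypothetical, and `batch/v1` assumes Kubernetes 1.21 or newer.

```shell
#!/usr/bin/env bash
set -eu

# Quoted 'EOF' keeps the YAML literal: nothing is expanded by the shell.
cat > backup-postgres-cronjob.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-postgres
spec:
  # Every 5 minutes while testing; switch to "0 2 * * *" for a daily backup.
  schedule: "*/5 * * * *"
  # Never start a new backup while the previous one is still running.
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: registry.example.com/postgres-backup:latest
              # All parameters arrive as environment variables.
              envFrom:
                - configMapRef:
                    name: postgres-backup-config
                - secretRef:
                    name: postgres-backup-secrets
EOF

echo "Manifest written to backup-postgres-cronjob.yaml"
```

Using `envFrom` keeps the manifest short: every key in the ConfigMap and the Secret becomes an environment variable in the container.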
Before applying it, we need to set up the secrets and configuration. Visit “Secrets” for more information and to see how secrets are created in Kubernetes.
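As a rough sketch, the configuration and secrets could be created with kubectl from values already present in your shell environment. The function name, the resource names `postgres-backup-config` and `postgres-backup-secrets`, and the split between ConfigMap and Secret are assumptions of ours.

```shell
#!/usr/bin/env bash
set -eu

create_backup_config() {
  # Non-sensitive settings go into a ConfigMap.
  kubectl create configmap postgres-backup-config \
    --from-literal=PGHOST="$PGHOST" \
    --from-literal=PGPORT="$PGPORT" \
    --from-literal=PGUSER="$PGUSER" \
    --from-literal=PGDATABASE="$PGDATABASE" \
    --from-literal=S3_BUCKET="$S3_BUCKET" \
    --from-literal=AWS_DEFAULT_REGION="$AWS_DEFAULT_REGION"

  # Passwords and keys go into a Secret.
  kubectl create secret generic postgres-backup-secrets \
    --from-literal=PGPASSWORD="$PGPASSWORD" \
    --from-literal=AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
    --from-literal=AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
    --from-literal=ENCRYPTION_KEY="$ENCRYPTION_KEY"
}
```

Reading the values from the environment keeps the literal passwords out of the script itself. Because of `set -u`, the function stops with an error if any of the variables is unset.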
We apply our manifest now:
kubectl apply -f backup-postgres-cronjob.yaml
If you are interested in the Dockerfile for this image, here it is:
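A minimal sketch of an image that would be enough for such a backup script, assuming an Alpine base and a script called `backup.sh`. The base image tag, the package names and the script path are assumptions; in particular, the pg_dump client should not be older than your Postgres server.

```shell
#!/usr/bin/env bash
set -eu

# Quoted 'EOF' keeps the Dockerfile literal.
cat > Dockerfile <<'EOF'
# Assumed base image and Alpine package names; adjust versions to your setup.
FROM alpine:3.19

# Tools used by the backup: pg_dump, bzip2, openssl and the AWS CLI.
RUN apk add --no-cache bash postgresql-client bzip2 openssl aws-cli

# Hypothetical script containing the four backup steps.
COPY backup.sh /usr/local/bin/backup.sh
RUN chmod +x /usr/local/bin/backup.sh

ENTRYPOINT ["/usr/local/bin/backup.sh"]
EOF

echo "Dockerfile written"
```

The image is then built and pushed in the usual way (`docker build`, `docker push`) to whatever registry your cluster pulls from.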
Check it out!
For quick testing, we apply a cron schedule that runs every 5 minutes (*/5 * * * *).

Once we have checked that the backup works, we re-apply the manifest with the desired backup frequency (e.g. every day at 2 am: 0 2 * * *).
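Checking the runs and changing the schedule can be sketched with plain kubectl. The function names are hypothetical, and `kubectl patch` is shown here as an alternative to editing and re-applying the manifest.

```shell
#!/usr/bin/env bash
set -eu

# Show the CronJob and the Jobs it has spawned so far.
check_backup_runs() {
  local cronjob="$1"
  kubectl get cronjob "$cronjob"
  kubectl get jobs
}

# Change only the schedule of an existing CronJob.
set_backup_schedule() {
  local cronjob="$1" schedule="$2"
  kubectl patch cronjob "$cronjob" --type merge \
    -p "{\"spec\":{\"schedule\":\"${schedule}\"}}"
}
```

For example, `check_backup_runs backup-postgres` followed by `kubectl logs job/<job-name>` shows whether the upload succeeded, and `set_backup_schedule backup-postgres "0 2 * * *"` switches to the daily backup. If you patch the live object, remember to update the manifest file too, so the next `kubectl apply` does not bring the test schedule back.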
Conclusion
If you have a Kubernetes cluster, you probably need backups.
Thanks to Kubernetes CronJobs, it’s very simple to create good backups.
A possible improvement to this solution is to use AWS SAS or Telegram API to notify us when the backup is done. Maybe I’ll update the post later with that notification.
Please, if you liked it, give it a round of applause. And if you want to know more about DevOps, Kubernetes, Docker, etc., follow me :)
KR
