File backup implementation using AWS S3

Regular file backups are a must for any back-office management solution.

Here is how we implemented a remote backup using AWS.

To copy files remotely to the S3 storage service, you first need to install the AWS command line interface (aws) on the main server. We assume you already have an AWS account and a user with the proper role and access to the S3 service (for more details, see the AWS documentation on the S3 service and on roles and permissions).

AWS installation

In our case we installed aws on Ubuntu 14.04. The installation is straightforward and well described on the aws-cli code page.

After installation, you need to configure access as explained in the "Getting started" section. You need to provide an Access Key ID and a Secret Access Key. This is why it is important to have a user or role dedicated to the S3 service only: it restricts access to the AWS account to what is strictly needed, in our case uploading files to S3 storage.

Tip: the config and credentials files are located in the home folder of the logged-in user. For example, if you are logged in as user1 and installed aws with your account, the files can be found in the /home/user1/.aws folder. The files should be owned by user1 so that the configuration can be recorded. If this is not the case, just execute "chown user1:root" on the files.
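To check the ownership before touching anything, a small helper like the one below can be run as the user who will execute aws. It is only a sketch: check_aws_files is a hypothetical name of our own, not something shipped with the AWS CLI, and it only looks at the two files mentioned in the tip.

```shell
# Sketch: verify that the AWS CLI files exist and are owned by the current user.
# check_aws_files is a hypothetical helper name, not part of the AWS CLI.
check_aws_files() {
    local dir="${1:-$HOME/.aws}" status=0 f
    for f in "$dir/config" "$dir/credentials"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f"
            status=1
        elif [ ! -O "$f" ]; then
            # -O is true only when the file is owned by the effective user
            echo "not owned by $(id -un): $f"
            status=1
        fi
    done
    return "$status"
}
```

For example, "check_aws_files /home/user1/.aws" prints nothing and returns 0 when both files are in place and owned by the user running the command.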

The synchronisation script

To store your local files on the S3 storage, you can use the sync command.

Command line example:

aws s3 sync /var/www/my_folder/ s3://my_bucket/my_folder --region <bucket region location>

If the credentials file has been set up properly, all local files within my_folder will be copied to the S3 bucket under my_folder. On later runs, sync only copies new and modified files.
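Before the first real copy, it can be reassuring to preview what sync would transfer. The sketch below relies on the --dryrun flag of aws s3 sync, which lists the operations without performing them. The preview_sync function and the AWS_BIN variable are hypothetical names introduced for this example.

```shell
# AWS_BIN is an assumption of this sketch: override it with the full path
# to the aws executable if it is not on your PATH.
AWS_BIN="${AWS_BIN:-aws}"

# preview_sync (hypothetical helper): show what sync would copy,
# without transferring anything, using the --dryrun flag.
preview_sync() {
    local src="$1" dest="$2" region="$3"
    "$AWS_BIN" s3 sync "$src" "$dest" --region "$region" --dryrun
}
```

A call would look like "preview_sync /var/www/my_folder/ s3://my_bucket/my_folder ap-southeast-1", using the region of your own bucket.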

The cron script

To automate the backup, we create a script, run by cron, that synchronises the files at a given frequency.


#!/bin/bash
# Log file for the sync activity; it must be writable by the cron user.
LOG=/backup/script/sync.log
DATE=$(date +%Y-%m-%d_%Hh%Mm%Ss)
echo "------------ start S3 sync $DATE" >> "$LOG"
# Full path to aws so cron finds it; 2>&1 also records error messages in the log.
/usr/local/bin/aws s3 sync /var/www/my_folder/ s3://my_bucket/my_folder --region ap-southeast-1 --storage-class REDUCED_REDUNDANCY --sse >> "$LOG" 2>&1
DATE=$(date +%Y-%m-%d_%Hh%Mm%Ss)
echo "------------ end S3 sync $DATE" >> "$LOG"

You can save this file anywhere convenient on your server (here /backup/script/s3sync.sh) and make it executable (chmod +x) so that cron can run it.

Tip: in the script we used the full path to the aws executable (/usr/local/bin/aws); otherwise the command might not be found when run by cron. Note also that we log the cron activity in a log file, "sync.log", which must be writable.

In our case we run the script hourly, so the cron entry is very simple:

@hourly /backup/script/s3sync.sh

When testing your cron execution, you may face the "Unable to locate credentials" error. In this case, look at the following possible problems:

  • The .aws/credentials file is not owned by the crontab user (i.e. the user under which the cron job is executed cannot read the .aws/credentials file)
  • The crontab does not belong to the proper user (i.e. cron is executed as root but the credentials are owned by a different user)

In most cases the error happens because the cron job is not executed under the proper user.
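One way to see which of the two cases applies is to let the cron job itself report who it is running as. The helper below is a sketch under our own assumptions: cron_identity is a hypothetical name, and it only checks the default location $HOME/.aws/credentials unless another path is given.

```shell
# cron_identity (hypothetical helper): print the user and HOME seen by the
# running process, and whether the credentials file is readable from there.
cron_identity() {
    local creds="${1:-$HOME/.aws/credentials}"
    echo "user=$(id -un) home=$HOME"
    if [ -r "$creds" ]; then
        echo "credentials readable: $creds"
    else
        echo "credentials NOT readable: $creds"
        return 1
    fi
}
```

Calling "cron_identity >> /backup/script/sync.log 2>&1" at the top of the cron script writes the result next to the sync messages, so a wrong user or an unreadable file shows up in the log.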


This setup is a very simple and economical way to keep a safe backup of your files in near real time. You can increase the cron frequency if you need more frequent copies to remote storage. In our case we also added a second cron script that runs monthly with the --delete option on the command line. With this second script, files are backed up on an hourly basis, and on the 1st day of each month a specific synchronisation deletes the files in the backup that are no longer on the main server. This means that for a certain period, not exceeding one month, files deleted on the main server can still be retrieved from the backup storage, which is an added safety net for those who accidentally delete a file on the main system.
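The monthly script could look like the sketch below. It reuses the paths, bucket and options of the hourly script and simply adds --delete. The monthly_sync function name and the overridable AWS_BIN variable are assumptions made for this example, not our exact production script.

```shell
# AWS_BIN is an assumption of this sketch; the default is the full path
# used in the hourly script so that cron can find the executable.
AWS_BIN="${AWS_BIN:-/usr/local/bin/aws}"

# monthly_sync (hypothetical helper): same sync as the hourly job, plus
# --delete, which removes from the bucket the files no longer present locally.
monthly_sync() {
    echo "------------ start S3 sync with delete $(date +%Y-%m-%d_%Hh%Mm%Ss)"
    "$AWS_BIN" s3 sync /var/www/my_folder/ s3://my_bucket/my_folder \
        --region ap-southeast-1 --storage-class REDUCED_REDUNDANCY --sse --delete
    echo "------------ end S3 sync with delete $(date +%Y-%m-%d_%Hh%Mm%Ss)"
}
```

In the script file, a final line such as "monthly_sync >> /backup/script/sync.log 2>&1" runs the function and logs its output. The matching cron entry can use the @monthly shortcut, which runs at midnight on the 1st day of each month, followed by the path where you saved the script.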
