How to Perform Incremental Backup in Linux Using the tar Utility
This guide is an advanced version of my earlier guide, How to backup files and directories in Linux using tar & cron jobs. Here we move on to a more advanced feature of the tar utility that lets us take incremental backups of our important data.
My aim in writing this guide is to keep it as simple and easy to understand as possible for our readers.
Backing up a large amount of data (which is common nowadays) can take a long time to complete. Normally we take a full backup the first time, and after that we only want to back up files that have been modified or newly created. An incremental backup does exactly that: it backs up all modified and new files and leaves unchanged files out.
To do this with the tar utility we simply pass the option “--listed-incremental=snapshot-file-name”, or its short form “-g snapshot-file-name”, where the snapshot file is a special file that tar creates and maintains to track which files have been modified, added, or deleted.
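As a rough illustration of the option itself, the sketch below runs tar twice against a throwaway directory, once with the long form and once with the short form. It assumes GNU tar; the temporary directory and the names data, a.txt, b.txt, and demo.snar are placeholders made up for this demo.

```shell
#!/bin/bash
# Sketch only: assumes GNU tar; all names below are made up for the demo.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/data"
echo "first" > "$work/data/a.txt"

# Level 0 (full) backup: the snapshot file does not exist yet, so tar creates it.
tar -cf "$work/full.tar" --listed-incremental="$work/demo.snar" -C "$work" data

sleep 1   # make sure the next file's timestamp is clearly newer than the snapshot
echo "second" > "$work/data/b.txt"

# Level 1 (incremental) backup: -g is the short form of --listed-incremental.
tar -cf "$work/incr.tar" -g "$work/demo.snar" -C "$work" data

# Capture and print what each archive contains.
full_list=$(tar -tf "$work/full.tar")
incr_list=$(tar -tf "$work/incr.tar")
echo "Full backup:";        echo "$full_list"
echo "Incremental backup:"; echo "$incr_list"
```

The second listing should show the directory and b.txt but not a.txt, since a.txt has not changed since the snapshot was written.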
Let’s set up a scenario to apply incremental backup and understand the process in the simplest way.
Suppose we have a directory under root called “/mydata” that contains important files. We want to make a full backup first and then take an incremental backup on a daily basis. All our backups will be stored in a directory under root named “/backup”.
At the end we will see how to do this with a shell script and make the whole process automatic.
Lab Starts Here
I have already created the /mydata and /backup directories using the mkdir command.
You can see that the mydata directory already contains two files.
# ls -l /mydata
-rw-r--r-- 1 root root 17 Feb 28 12:47 file1.txt
-rw-r--r-- 1 root root 17 Feb 28 12:47 file2.txt
Now let’s create the first full backup of the /mydata directory, with a snapshot file, using the -g argument.
# tar -cvf /backup/backup-0.tar.gz -g /mydata/mydata.snar /mydata
tar: /mydata: Directory is new
tar: Removing leading `/' from member names
What just happened?
We have created the backup-0.tar.gz archive, plus the mydata.snar file named by the “-g” argument. This snapshot file keeps a record of the files that have been backed up so far, so the same file will be used next time to identify all modified and new files in the /mydata directory.
tar -cvf /backup/backup-0.tar.gz -g /mydata/mydata.snar /mydata
- tar = Tape Archive utility
- c = Create a new archive
- v = Verbose mode
- f = The next argument is the archive file name
- g = Short form of --listed-incremental; the next argument is the snapshot file
/backup/backup-0.tar.gz is the location and name of the archive file to create. Note that because -z is not given, this archive is not actually gzip-compressed despite its .tar.gz name.
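If a truly gzip-compressed archive is wanted, -z can be added to the same command. The sketch below is one possible variant, not the command used in this lab: it assumes GNU tar and gzip, works in a temporary directory, and keeps the snapshot file in a separate, hypothetical state/ directory so that it is not archived along with the data.

```shell
#!/bin/bash
# Sketch only: assumes GNU tar and gzip; directory and file names are placeholders.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/mydata" "$work/backup" "$work/state"
echo "important data" > "$work/mydata/file1.txt"

# -z pipes the archive through gzip, so the .tar.gz name matches the content.
# The snapshot file lives in state/, outside the directory being backed up.
tar -czf "$work/backup/backup-0.tar.gz" -g "$work/state/mydata.snar" -C "$work" mydata

# gzip -t exits with status 0 only if the file is valid gzip data.
if gzip -t "$work/backup/backup-0.tar.gz"; then
    compressed=yes
else
    compressed=no
fi

# List the archive contents (-z is needed again when reading it back).
contents=$(tar -tzf "$work/backup/backup-0.tar.gz")
echo "gzip-compressed: $compressed"
echo "$contents"
```

Keeping the snapshot file outside the source directory is a design choice for this sketch; the lab commands in this guide store it inside /mydata.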
We can see the newly created tar archive as a result of the above command:
# ls -l /backup
-rw-r--r-- 1 root root 10240 Feb 28 12:50 backup-0.tar.gz
To see the contents of the tar archive we can use the following command:
# tar -tvf /backup/backup-0.tar.gz
drwxr-xr-x root/root 36 2016-02-28 12:50 mydata/
-rw-r--r-- root/root 17 2016-02-28 12:47 mydata/file1.txt
-rw-r--r-- root/root 17 2016-02-28 12:47 mydata/file2.txt
-rw-r--r-- root/root 0 2016-02-28 12:50 mydata/mydata.snar
Hope everything is clear so far :).
Now let’s create a new file inside our /mydata directory and back up again to see whether tar backs up only the new data.
Create one new file in the /mydata directory:
# touch /mydata/file3.txt
Now run the tar command again, changing the archive file name to backup-1.tar.gz.
# tar -cvf /backup/backup-1.tar.gz -g /mydata/mydata.snar /mydata
List the newly created archive to see what was backed up this time:
# tar -tvf /backup/backup-1.tar.gz
drwxr-xr-x root/root 47 2016-02-28 13:15 mydata/
-rw-r--r-- root/root 29 2016-02-28 13:15 mydata/file3.txt
-rw-r--r-- root/root 119 2016-02-28 13:13 mydata/mydata.snar
The result shows that only file3.txt was backed up (along with the snapshot file, which lives in the same directory), leaving the unchanged file1.txt and file2.txt out. Great!! It means tar is performing the incremental backup properly.
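Since each incremental archive holds only what changed, getting the data back means extracting the archives in the order they were made. The following is a minimal sketch of that idea, assuming GNU tar; it rebuilds a small version of the lab in a temporary directory, and every path in it is a placeholder rather than the real /mydata or /backup.

```shell
#!/bin/bash
# Sketch only: assumes GNU tar; every path and file name here is a placeholder.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/mydata" "$work/restore"
echo "one" > "$work/mydata/file1.txt"

# Level 0 (full) backup, then a level 1 backup after adding file3.txt.
tar -cf "$work/backup-0.tar" -g "$work/mydata.snar" -C "$work" mydata
sleep 1   # keep the new file's timestamp clearly after the first run
echo "three" > "$work/mydata/file3.txt"
tar -cf "$work/backup-1.tar" -g "$work/mydata.snar" -C "$work" mydata

# Restore in the same order the archives were made.
# -g /dev/null tells tar to treat each archive as incremental on extraction;
# the snapshot file itself is not needed at this point.
tar -xf "$work/backup-0.tar" -g /dev/null -C "$work/restore"
tar -xf "$work/backup-1.tar" -g /dev/null -C "$work/restore"

# Show which files ended up in the restored directory.
restored=$(ls "$work/restore/mydata")
echo "$restored"
```

After both extractions the restore directory should hold file1.txt from the full backup and file3.txt from the incremental one.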
Let’s be smart and automate this whole process using a shell script.
Keeping the same scenario in mind, I will modify my old super simple backup script.
Create the script file:
# vi script.sh
Copy & paste the contents below:
#!/bin/bash
#Purpose = Backup of Important Data
#Created on 28-02-2016
#Author = Hafiz Haider
#Version 2.0
#START
TIME=$(date +%b-%d-%y%s)        # Date and epoch seconds, added to the backup file name.
FILENAME=backup-$TIME.tar.gz    # Backup file name format.
SRCDIR=/mydata                  # Location of the important data directory (source of backup).
DESDIR=/backup                  # Destination of the backup file.
SNF=/mydata/mydata.snar         # Snapshot file name and location.
tar -cvf "$DESDIR/$FILENAME" -g "$SNF" "$SRCDIR"    # Backup command.
#END
This shell script generates backup archives whose names contain the current date plus the current time in seconds since the epoch, which avoids duplicate file names if you run the script several times in the same day.
See the result below after running the script two times.
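The script can also be wrapped in a function that checks its inputs before calling tar. The sketch below is one way this might look, not a replacement for the script above: run_backup and its three arguments are hypothetical names, it assumes GNU tar, it adds -z so the archive is really gzip-compressed, and the example call uses throwaway directories instead of /mydata and /backup.

```shell
#!/bin/bash
# Sketch only: run_backup and its arguments are hypothetical names for this example.
# Assumes GNU tar and GNU date.
run_backup() {
    local srcdir=$1 desdir=$2 snf=$3
    local stamp filename
    stamp=$(date +%b-%d-%y%s)              # same date format as the script above
    filename="backup-$stamp.tar.gz"

    # Refuse to run if the source or destination directory is missing.
    [ -d "$srcdir" ] || { echo "source $srcdir not found" >&2; return 1; }
    [ -d "$desdir" ] || { echo "destination $desdir not found" >&2; return 1; }

    # -z added so the archive is gzip-compressed; -C keeps member names relative.
    tar -czf "$desdir/$filename" -g "$snf" \
        -C "$(dirname "$srcdir")" "$(basename "$srcdir")" || return 1

    # Print the path of the archive that was written.
    echo "$desdir/$filename"
}

# Example call against throwaway directories.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/mydata" "$work/backup"
echo "data" > "$work/mydata/file1.txt"
archive=$(run_backup "$work/mydata" "$work/backup" "$work/mydata.snar")
echo "Created: $archive"
```

Returning a non-zero status on failure lets a caller such as cron notice that a backup did not complete.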
# ls -l /backup
-rw-r--r-- 1 root root 10240 Feb 28 14:32 backup-Feb-28-161456655572.tar.gz
-rw-r--r-- 1 root root 10240 Feb 28 14:33 backup-Feb-28-161456655620.tar.gz
If you have any questions, please leave a comment, and don’t forget to say thanks, it’s free. :)