2013-01-22

or, a summary for lazy guys on how to use it...

I know that a lot has been written about Xtrabackup, and good documentation can be found on the Percona web site.
Anyhow, I had to write a summary and a clear procedure for my teams, so I chose to share it with everyone, given that it could benefit the whole community.

Each major topic is associated with a checklist that needs to be followed to prevent mistakes.
Overview
Xtrabackup is a hot backup tool that allows you to perform backups on InnoDB with very limited impact on the running transactions/operations.
In order to do this, xtrabackup copies the IBD files AND takes information out of the REDO log, starting from a point X.
This information then needs to be applied to the datafiles before restarting the MySQL server on restore.
In short, the backup operation is composed of 2 main phases:

Copy of the files
Copy of the deltas recorded in the REDO log.

Another phase is the "prepare" phase, where the REDO log modifications must be applied.
This phase can be done as soon as the backup is complete, if the files are not STREAMED (we will see it later), or must be done on restore if STREAM was used.
Xtrabackup is composed of two main parts: the innobackupex script wrapper, and the xtrabackup binary.
The xtrabackup binary comes in three different versions:
- xtrabackup
- xtrabackup_51
- xtrabackup_55
The binary changes with respect to the MySQL version and is automatically selected by innobackupex as follows:
MySQL 5.0.* - xtrabackup_51
MySQL 5.1.* - xtrabackup_51
MySQL 5.1.* with InnoDB plugin - xtrabackup
Percona Server >= 11.0 - xtrabackup
MySQL 5.5.* - xtrabackup_55
It is important to note that while the backup of InnoDB tables is taken with minimal impact, the backup of MyISAM tables still requires a full table lock.
The full process can be described as follows:

check the connection to MySQL
start xtrabackup as a child process
wait until xtrabackup suspends the process
connect to MySQL
if the server is a slave, wait for replication to catch up
if the server is a master, return right away
flush tables and acquire a read lock (unless explicitly asked in the settings to NOT get the lock)
write the slave information
perform the physical copy of the files
resume the xtrabackup process
unlock the tables
close the connection to MySQL
copy the last LRU information
write the backup status report
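
The flow above can be wrapped in a small script. This is only a sketch, under the assumption that checking innobackupex's log for its final "completed OK!" marker is an acceptable success test; the paths are placeholders.

```shell
#!/bin/sh
# Hypothetical wrapper sketch: run the backup, then verify the log marker.
# innobackupex prints "completed OK!" on success (check your version's output).

backup_completed_ok() {
    # $1 = path to the captured innobackupex log
    tail -1 "$1" | grep -q "completed OK!"
}

# --- real invocation (commented out: requires a running MySQL server) ---
# /usr/bin/innobackupex-1.5.1 --defaults-file=/home/mysql/instances/mtest1/my.cnf \
#     --slave-info --user=backup --password=bckuser123 \
#     /home/mysql/backup/ 2> /tmp/innobackupex.log
# backup_completed_ok /tmp/innobackupex.log || { echo "BACKUP FAILED" >&2; exit 1; }

# quick self-check of the helper on a fake log
printf 'copying files...\ncompleted OK!\n' > /tmp/fake_ok.log
backup_completed_ok /tmp/fake_ok.log && echo "marker found"
```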

User and Grants
The backup user SHOULD NOT be a common user or a DBA user; it should be one created for this operation, as below:


CREATE USER 'backup'@'localhost' IDENTIFIED BY 'bckuser123';
REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'backup'@'localhost';
GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'backup'@'localhost';
FLUSH PRIVILEGES;

How to invoke Xtrabackup in the standard, easy way
This is the easiest way to take a FULL backup using Xtrabackup:
/usr/bin/innobackupex-1.5.1 --defaults-file=<config_file> --slave-info --user=<user> --password=<password> /path/to/destination/backup/folder
i.e.:
/usr/bin/innobackupex-1.5.1 --defaults-file=/home/mysql/instances/mtest1/my.cnf --slave-info --user=backup --password=bckuser123   /home/mysql/backup/
This will produce a full, uncompressed backup.

root@mysqlt3:/home/mysql/backup/2012_12_21_1300/2012-12-21_14-32-02# ll
total 200088
drwxr-xr-x 15 root root 4096 Dec 21 14:41 ./
drwxr-xr-x 3 root root 4096 Dec 21 14:46 ../
-rw-r--r-- 1 root root 263 Dec 21 14:32 backup-my.cnf
-rw-r----- 1 root root 104857600 Dec 21 14:32 ibdata1
drwxr-xr-x 2 root root 4096 Dec 21 14:41 mysql/
drwxr-xr-x 2 root root 4096 Dec 21 14:41 performance_schema/
drwx------ 2 root root 4096 Dec 21 14:41 security/
drwx------ 2 root root 4096 Dec 21 14:41 test/
drwx------ 2 root root 4096 Dec 21 14:41 test_audit/
drwx------ 2 root root 4096 Dec 21 14:41 timstaging/
drwx------ 2 root root 4096 Dec 21 14:41 timtags/
drwxr-xr-x 2 root root 4096 Dec 21 14:41 world/
-rw-r--r-- 1 root root 13 Dec 21 14:41 xtrabackup_binary
-rw-r--r-- 1 root root 26 Dec 21 14:41 xtrabackup_binlog_info
-rw-r----- 1 root root 85 Dec 21 14:41 xtrabackup_checkpoints
-rw-r----- 1 root root 99912192 Dec 21 14:41 xtrabackup_logfile
-rw-r--r-- 1 root root 53 Dec 21 14:41 xtrabackup_slave_info
/home/mysql/backup/2012_12_21_1300/full_mtest1.tar.gz
Once the copy is over, you will have a file like this:

drwxr-xr-x 3 root root 4.0K Dec 21 17:08 ./
drwxr-xr-x 3 root root 4.0K Dec 21 14:16 ../
drwxr-xr-x 15 root root 4.0K Dec 21 16:31 2012-12-21_14-32-02/
-rw-r--r-- 1 root root 737M Dec 21 17:18 full_mtest1.tar.gz
show schemas;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| security |
| test |
| test_audit |
| world |
+--------------------+
13 rows in set (0.04 sec)
root@localhost [(none)]> SELECT TABLE_SCHEMA, ENGINE, COUNT(1) as 'TABLES', sum(TABLE_ROWS) as 'ROWS',
TRUNCATE(sum(DATA_LENGTH)/pow(1024,2),2) as 'DATA (M)',
TRUNCATE(sum(INDEX_LENGTH)/pow(1024,2),2) as 'INDEX (M)',
TRUNCATE((sum(DATA_LENGTH)+sum(INDEX_LENGTH))/pow(1024,2),2) AS 'TOTAL(M)'
FROM information_schema.tables WHERE TABLE_SCHEMA <> 'information_schema'
AND TABLE_SCHEMA <> 'mysql' AND TABLE_SCHEMA not like 'avail%'
AND TABLE_SCHEMA <> 'maatkit' AND TABLE_TYPE = 'BASE TABLE'
GROUP BY TABLE_SCHEMA, ENGINE WITH ROLLUP;
+--------------------+--------------------+--------+----------+----------+-----------+----------+
| TABLE_SCHEMA | ENGINE | TABLES | ROWS | DATA (M) | INDEX (M) | TOTAL(M) |
+--------------------+--------------------+--------+----------+----------+-----------+----------+
| performance_schema | PERFORMANCE_SCHEMA | 17 | 23014 | 0.00 | 0.00 | 0.00 |
| performance_schema | NULL | 17 | 23014 | 0.00 | 0.00 | 0.00 |
| security | InnoDB | 1 | 1454967 | 170.73 | 60.75 | 231.48 |
| security | NULL | 1 | 1454967 | 170.73 | 60.75 | 231.48 |
| test | InnoDB | 51 | 9298913 | 6058.39 | 1347.78 | 7406.17 |
| test | NULL | 51 | 9298913 | 6058.39 | 1347.78 | 7406.17 |
| test_audit | InnoDB | 9 | 1189343 | 685.56 | 236.56 | 922.12 |
| test_audit | NULL | 9 | 1189343 | 685.56 | 236.56 | 922.12 |
| world | MyISAM | 3 | 5302 | 0.35 | 0.06 | 0.42 |
| world | NULL | 3 | 5302 | 0.35 | 0.06 | 0.42 |
| NULL | NULL | 227 | 11971539 | 6916.70 | 1645.74 | 8562.44 |
+--------------------+--------------------+--------+----------+----------+-----------+----------+

So far so good.
Now it is time to configure the slave.
First, take the current status:


root@localhost [(none)]> show slave status\G
Empty set (0.00 sec)
root@localhost [(none)]>

OK, nothing there; good.
Assign the master, and the log file and position from xtrabackup_binlog_info.


cat xtrabackup_binlog_info
binlog.000001 150497896

Prepare the command as:
change master to master_host='192.168.0.3', master_port=3310,master_user='replica',master_password='xxxx', master_log_file='binlog.000001',master_log_pos=150497896;
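
When this has to be done often, building the statement from xtrabackup_binlog_info can be scripted. A sketch, reusing the host, port and user from the example above and writing a sample file first (the real file comes from the backup directory):

```shell
#!/bin/sh
# Sketch: derive the CHANGE MASTER statement from xtrabackup_binlog_info.

# sample file with the layout shown above: <binlog file> <position>
printf 'binlog.000001\t150497896\n' > /tmp/xtrabackup_binlog_info

FILE=$(awk '{print $1}' /tmp/xtrabackup_binlog_info)
POS=$(awk '{print $2}' /tmp/xtrabackup_binlog_info)

echo "change master to master_host='192.168.0.3', master_port=3310, master_user='replica', master_password='xxxx', master_log_file='$FILE', master_log_pos=$POS;"
```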
Check again:


root@localhost [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.0.3
Master_User: replica
Master_Port: 3310
Connect_Retry: 60
Master_Log_File: binlog.000001
Read_Master_Log_Pos: 150497896
Relay_Log_File: mysql-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: binlog.000001
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 150497896
Relay_Log_Space: 107
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 0
1 row in set (0.00 sec)
root@localhost [(none)]>

Perfect, start the slave:

start slave;

and check again:

root@localhost [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.0.3
Master_User: replica
Master_Port: 3310
Connect_Retry: 60
Master_Log_File: binlog.000001
Read_Master_Log_Pos: 206843593
Relay_Log_File: mysql-relay-bin.000002
Relay_Log_Pos: 22872
Relay_Master_Log_File: binlog.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 150520518
Relay_Log_Space: 56346103
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 30
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 3310
1 row in set (0.00 sec)

OK, we have some delay, as expected, but all is running as it should.
Our server is up and running.
How to do an INCREMENTAL BACKUP with Xtrabackup

An incremental backup works in a different way.
To understand it correctly, we need to remember that InnoDB pages have a sequence number, the LSN (Log Sequence Number); each incremental backup starts from the previously stored LSN.

An incremental backup must have a first FULL backup as base; each following incremental will then be stored in a different directory (by timestamp).
To restore an incremental backup, the full set of incrementals, from the BASE to the last point in time, needs to be applied.
So if the full backup was done on Monday and incrementals are taken every day, to restore the full set on Friday we must apply the logs to the BASE (Monday) in chronological order: Monday (base), then Tuesday, Wednesday, Thursday, Friday.
Only at that point will we have the full set of data that can replace the one we had on the server.
Remember that this works only for InnoDB; other storage engines like MyISAM are copied in full every time.
Let us do this without compression:
/opt/percona-xtrabackup-2.0.4/bin/innobackupex-1.5.1 --defaults-file=/home/mysql/instances/mtest1/my.cnf --slave-info --user=backup --password=bckuser123   /home/mysql/backup/
The new directory 2013-01-10_13-07-24 is the BASE.
Checking the files inside, we can see the LSN position:

root@tusacentral03:/home/mysql/backup/2013-01-10_13-07-24# cat xtrabackup_checkpoints
backup_type = full-backuped
from_lsn = 0
to_lsn = 32473279827
last_lsn = 32473279827
The last LSN is 32473279827.

As an exercise, let us do TWO incremental backups starting from this base, but first add some data...


root@localhost [test]> SHOW processlist;
+-----+--------+---------------------------+------+---------+------+--------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+--------+---------------------------+------+---------+------+--------+------------------------------------------------------------------------------------------------------+
| 87 | root | localhost | test | Query | 0 | NULL | SHOW processlist |
| 92 | stress | tusacentral01.LOCAL:37293 | test | Sleep | 0 | | NULL |
| 94 | stress | tusacentral01.LOCAL:37296 | test | Query | 0 | UPDATE | INSERT INTO tbtest30 (uuid,a,b,c,counter,partitionid,strrecordtype) VALUES(UUID(),731188002,"hd rsg |
| 95 | root | localhost:37295 | test | Query | 0 | update | INSERT INTO test_audit.tbtest4 values(NEW.autoInc,NEW.a,NEW.uuid,NEW.b,NEW.c,NEW.counter,NEW.time,NE |
| 96 | stress | tusacentral01.local:37298 | test | Query | 0 | NULL | COMMIT |
| 97 | root | localhost:37299 | test | Query | 0 | update | INSERT INTO test_audit.tbtest4 values(NEW.autoInc,NEW.a,NEW.uuid,NEW.b,NEW.c,NEW.counter,NEW.time,NE |
| 98 | stress | tusacentral01.local:37300 | test | Query | 0 | update | insert INTO tbtest15 (uuid,a,b,c,counter,partitionid,strrecordtype) VALUES(UUID(),598854171,"usfcrgl |
| 99 | root | localhost:37301 | test | Query | 0 | UPDATE | INSERT INTO test_audit.tbtest4 VALUES(NEW.autoInc,NEW.a,NEW.uuid,NEW.b,NEW.c,NEW.counter,NEW.time,NE |
| 100 | stress | tusacentral01.LOCAL:37302 | test | Query | 0 | UPDATE | INSERT INTO tbtest15 (uuid,a,b,c,counter,partitionid,strrecordtype) VALUES(UUID(),22723485,"vno ehhr |
| 101 | stress | tusacentral01.local:37303 | test | Query | 0 | update | insert INTO tbtest1 (uuid,a,b,c,counter,partitionid,strrecordtype) VALUES(UUID(),991063177,"nqdcogeu |
| 102 | stress | tusacentral01.LOCAL:37304 | test | Query | 0 | UPDATE | INSERT INTO tbtest1 (uuid,a,b,c,counter,partitionid,strrecordtype) VALUES(UUID(),86481207,"sdfabnogn |
| 103 | stress | tusacentral01.local:37305 | test | Query | 0 | NULL | COMMIT |
+-----+--------+---------------------------+------+---------+------+--------+------------------------------------------------------------------------------------------------------+
12 rows in set (0.00 sec)

Now let us create the first incremental backup:
/opt/percona-xtrabackup-2.0.4/bin/innobackupex-1.5.1 --incremental --incremental-basedir=/home/mysql/backup/2013-01-10_13-07-24 --defaults-file=/home/mysql/instances/mtest1/my.cnf --slave-info --user=backup --password=bckuser123   /home/mysql/backup/

After the process is complete, we will have TWO directories:

total 20
drwxr-xr-x 5 root root 4096 Jan 10 13:30 ./
drwxr-xr-x 18 mysql mysql 4096 Dec 28 12:16 ../
drwxr-xr-x 15 root root 4096 Jan 10 13:17 2013-01-10_13-07-24/
drwxr-xr-x 15 root root 4096 Jan 10 13:34 2013-01-10_13-30-43/
/home/mysql/backup/incremental_2013_01_10_19_05.gz

Note the parameter --extra-lsndir, which allows you to specify an additional location for the LSN position file;
this is very important because it needs to be "grep"-ed for the next incremental backup.
Like:

grep last_lsn xtrabackup_checkpoints|awk -F' = ' '{print $2}'
34925032837

and the parameter --parallel=4 to implement multi-threaded streaming.
So the next one will be:

/opt/percona-xtrabackup-2.0.4/bin/innobackupex-1.5.1 --incremental \
--incremental-lsn=34925032837 \
--defaults-file=/home/mysql/instances/mtest1/my.cnf \
--slave-info --user=backup --password=bckuser123 \
--extra-lsndir=/home/mysql/backup/ \
--stream=xbstream --parallel=4 ./ |pigz -p4 - > /home/mysql/backup/incremental_2013_01_11_11_35.gz
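
The LSN hand-off between runs can be wrapped in a small helper. A sketch, using the value shown above and writing a sample xtrabackup_checkpoints file (the real file sits in the --extra-lsndir location):

```shell
#!/bin/sh
# Sketch: extract last_lsn so it can be fed to --incremental-lsn next time.

# sample checkpoints file with the layout shown earlier
cat > /tmp/xtrabackup_checkpoints <<'EOF'
backup_type = incremental
from_lsn = 32473279827
to_lsn = 34925032837
last_lsn = 34925032837
EOF

LSN=$(grep last_lsn /tmp/xtrabackup_checkpoints | awk -F' = ' '{print $2}')
echo "next run: --incremental-lsn=$LSN"
```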

Once done, taking the LSN value again, it will be 35209627102.
At this point we have a compressed incremental backup using xbstream and pigz.
The point is: can we restore it correctly?
Copy all the files to the restore area/server:

root@tusacentral03:/home/mysql/backup# ll
total 631952
drwxr-xr-x 3 root root 4096 Jan 11 11:36 ./
drwxr-xr-x 19 mysql mysql 4096 Jan 10 15:01 ../
drwxr-xr-x 15 root root 4096 Jan 10 17:27 full_2013_01_10_18_54.gz
-rw-r--r-- 1 root root 360874358 Jan 11 11:25 incremental_2013_01_10_19_05.gz
-rw-r--r-- 1 root root 286216063 Jan 11 11:41 incremental_2013_01_11_11_35.gz
-rw-r--r-- 1 root root 93 Jan 11 11:41 xtrabackup_checkpoints

Then, to expand it:
pigz -d -c full_2013_01_10_18_54.gz | xbstream -x -v
Create 2 directories:
mkdir 2013_01_10_19_05
mkdir 2013_01_11_11_35
Then:
pigz -d -c incremental_2013_01_10_19_05.gz | xbstream -x -v
pigz -d -c incremental_2013_01_11_11_35.gz | xbstream -x -v

After that, the procedure will be the same as before.
/opt/percona-xtrabackup-2.0.4/bin/innobackupex-1.5.1 --use-memory=1G --apply-log --redo-only /home/mysql/restore/2013-01-10_17-15-27
/opt/percona-xtrabackup-2.0.4/bin/innobackupex-1.5.1 --use-memory=1G --apply-log --redo-only /home/mysql/restore/2013-01-10_17-15-27 --incremental-dir=/home/mysql/restore/2013_01_10_19_05
/opt/percona-xtrabackup-2.0.4/bin/innobackupex-1.5.1 --use-memory=1G --apply-log --redo-only /home/mysql/restore/2013-01-10_17-15-27 --incremental-dir=/home/mysql/restore/2013_01_11_11_35
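
With many increments, the chronological apply can be scripted; the timestamped directory names sort chronologically, so a plain sort gives the right order. A dry-run sketch (the demo directory is invented; it only prints the commands it would run):

```shell
#!/bin/sh
# Sketch: print the apply-log commands for each incremental, in order.

BASE=/home/mysql/restore/2013-01-10_17-15-27
IBEX=/opt/percona-xtrabackup-2.0.4/bin/innobackupex-1.5.1
RESTORE=/tmp/restore_demo   # demo area, invented for this sketch

mkdir -p "$RESTORE/2013_01_11_11_35" "$RESTORE/2013_01_10_19_05"

# sort gives chronological order thanks to the naming scheme; echo instead of
# executing, since this is only a dry run
for inc in $(ls -d "$RESTORE"/20* | sort); do
    echo "$IBEX --use-memory=1G --apply-log --redo-only $BASE --incremental-dir=$inc"
done
```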

Finalize the process
/opt/percona-xtrabackup-2.0.4/bin/innobackupex-1.5.1 --use-memory=1G --apply-log /home/mysql/restore/2013-01-10_17-15-27

Copy the files to the production location.
To remove possibly unneeded files:
find . -name "*.TR*" -exec  \rm -v '{}' \;

Assign the correct ownership to the mysql user:
chown -R mysql:mysql data
Restart, and if it is a slave, set the right binlog file and position as before.
Done!

Incremental with compression and NetCat
There are two possible ways to perform the copy with NetCat:

one is "on the fly", meaning that the stream, instead of being directed to a local file, is pushed directly to the "Recovery" server;
the other is to write the file and then push it to the "Recovery" server.

Using the "on the fly" method is, in my opinion, conceptually dangerous.
This is because a backup operation should be as solid as possible.
Having the stream directed to the final server opens the door to problems at any network glitch.
Any network fluctuation could affect the whole backup, and there are also possible scenarios where a fully transmitted backup turns out corrupted.
This is because IF a network issue happens during the transfer, the process on the source or destination server, the one DOING the backup or the one receiving it, can crash or hang.
All the above imposes a sanity check on the process and on the final result, to be sure that in case of failure the backup will be taken again, or at least there will be awareness of the issue.

It needs to be said that the process is not so fragile when dealing with a small amount of data, but it can become much more concerning when dealing with Gigs, because of resource allocation limits on the source machine.
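
A minimal sanity check of this kind is comparing checksums of the file on both sides after the transfer. A sketch with two local files standing in for source and destination (in practice the sums would be computed on the two servers and compared):

```shell
#!/bin/sh
# Sketch: verify a transferred backup by comparing md5 checksums.

printf 'backup payload\n' > /tmp/source_backup.gz
cp /tmp/source_backup.gz /tmp/received_backup.gz   # stands in for the nc transfer

SRC_SUM=$(md5sum /tmp/source_backup.gz | awk '{print $1}')
DST_SUM=$(md5sum /tmp/received_backup.gz | awk '{print $1}')

if [ "$SRC_SUM" = "$DST_SUM" ]; then
    echo "transfer verified"
else
    echo "checksum mismatch, take the backup again" >&2
fi
```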
The NetCat solution involves two elements in our case:

server (sender)
client (receiver)

This is valid in our case, but it should be mentioned that the server can also get input from the client; that is not a topic here.
The on-the-fly method
The backup process is supposed to be launched on the server with the following statement:
/opt/percona-xtrabackup-2.0.4/bin/innobackupex-1.5.1 --incremental  --incremental-lsn=35209627102 --defaults-file=/home/mysql/instances/mtest1/my.cnf --slave-info --user=backup --password=bckuser123 --extra-lsndir=/home/mysql/backup/ --stream=xbstream --parallel=4 ./ |pigz -p4 - | nc -l 6666
while on the client:
nc 192.168.0.3 6666|pv -trb > /home/mysql/recovery/incremental_2013_01_14_12_05.gz
So the only difference is the addition of the NetCat commands, and obviously the need to run the receiving command on the client.
Once the process is over, the expansion can be done as usual:
pigz -d -c incremental_2013_01_14_12_05.gz | xbstream -x -v
Two-step process
This is exactly the same as "Incremental with compression", but instead of doing a file copy, issue these commands:
on the server:
cat /home/mysql/backup/incremental_2013_01_14_12_05.gz | nc -l 6666| pv -rtb
On the client:
nc 192.168.0.3 6666|pv -trb > /home/mysql/recovery/incremental_2013_01_14_12_05.gz
Conclusion
I think it makes sense to use NetCat ONLY in very specific cases, and only with solid scripts developed around it, including:

status checks of the backup operation
list of the transmitted files
LSN position validation
network status/monitor during the operations

In short a possible nightmare.
Check lists
Simple backup

[] Check binary version
[] Check that the binaries are in the PATH and accessible
[] Assign correct user/password in MySQL for backup user
[] Create or check backup data destination folder
[] Check my.cnf for datadir and be sure it points to the right place
[] Execute backup
[] Apply logs

Simple restore
[] be sure mysql server is down
[] remove / move data from original directory
[] remove / move ib_logs from original directory
[] run innobackupex --copy-back
[] check file permissions for mysql
[] start mysql
[] check the mysql log for error
[] log in and check for data.

Backup with Stream and compression
[] Check binary version
[] Check that the binaries are in the PATH and accessible
[] Assign correct user/password in MySQL for backup user
[] Create or check backup data destination folder
[] Check my.cnf for datadir and be sure it points to the right place
[] Check for Pigz presence and version
[] Execute backup

Restore from Stream on a different machine (slave)
[] be sure mysql server is down
[] remove / move data from original directory
[] remove / move ib_logs from original directory
[] copy over the compressed file
[] expand the backup in a safe directory
[] run innobackupex --copy-back
[] check file permissions for mysql
[] check that server WILL NOT restart the slave process on start
[] start mysql
[] check the mysql log for error
[] log in and check for data.
[] take the master log position
[] check for slave process information
[] apply new binary log position
[] restart slave
[] check slave status

Incremental Backup with Stream and compression
[] Check binary version
[] Check that the binaries are in the PATH and accessible
[] Assign correct user/password in MySQL for backup user
[] Create or check backup data destination folder
[] Check my.cnf for datadir and be sure it points to the right place
[] Check for Pigz presence and version
[] Check for the LSN position in xtrabackup_checkpoints
[] Assign the LSN to the "incremental-lsn" parameter
[] Be sure that the --extra-lsndir parameter is present and points to an existing directory
[] Execute backup

Incremental Restore from Stream on a different machine or slave
[] be sure mysql server is down
[] remove / move data from original directory
[] remove / move ib_logs from original directory
[] copy over the compressed files
[] validate the chronological order from the BASE to the last increment
{loop for each file set}
[] expand the backups in a safe directory, one at a time
[] be sure to apply the logs with the "--apply-log --redo-only" parameters every time
[] be sure you always have the correct destination directory set (BASE set)
[] remove each incremental once applied
{loop end}
[] run innobackupex --apply-log on the BASE set
[] remove the ib_log files
[] copy files to destination directory
[] check file permissions for mysql
[] check that server WILL NOT restart the slave process on start
[] start mysql
[] check the mysql log for error
[] log in and check for data.
[] take the master log position
[] check for slave process information
[] apply new binary log position
[] restart slave
[] check slave status
