Monday, October 1, 2012

With MySQL 5.6 coming closer to its release, I got a chance to sneak a look into the Row Based Replication (RBR). 5.6 release of MySQL will be a replication features packed one, and it is for this reason I write  another post on the new enhancements in Row Based Replication(RBR). RBR provides a safe way of replicating between master and slave and thats why RBR enhancements become even more important. RBR in 5.6 is far more optimized than what it was before and we hope it will be fun to deploy, maintain and use.

Folowing are some interesting enhancements in RBR in MySQL 5.6 Replication.

1. Optimized row image:

Large data transfer between master and slave while using RBR is a thing of past. Why log complete rows on the master? Why not just the changed colums?  The MySQL 5.6 release enables a DBA to configure the master to log only the changed columns in the row.  The Before Image (BI) will contain the columns for identifying the rows and the After Image (AI) will just contain the column that has changed.

Before MySQL 5.6: 

As one can observe in the illustration that the row images were full, i.e. all the columns in the row were logged for both the Before Image(BI) and the After Image AI

In MySQL 5.6:

In MySQL 5.6 the DBA can configure if full or minimal or NOBLOB image of rows should be logged in the binary log in the following ways.

1. start the server with --binlog-row-image=MINIMAL|FULL|NOBLOB

In minimal image if there is a Primary Key (PK) or any kind of Unique Index/Key (UK) that can identify a row uniquely on the slave then the BI contains just that column. In the After Image (AI) only the changed columns are logged (See illustration below). 

In case of NOBLOB is used, the BLOB fields will be skipped from logging unless changed. In case  FULL is used the logging will continue as before.

As you can see the amount of data transfer will be reduced by a lot especially if you have used proper indexes/keys for the tables on the master and the slave. A detailed post of this enhancement be found here.

2. Batch operations on table without PK or UK:

I have already mentioned the details of this enhancement in a previous post. This is one of the long awaited optimization when using RBR. This enables the DBA to configure what algorithm will be used  on the slave to find the appropriate row for updating/deleting. We have introduced the concept of HASH_SCAN, in which an in-memory hash is being created on the slave, and provides a way to find and apply rows in a row event in O(n) time.  This enhancement provides a great scope for performance improvement on the slave for the events logged in row format.

3. Informational Log events: 

One of the difficulties DBA have faced till now, while using RBR is that, it is difficult to maintain. The Row images are binary in nature and is not suitable for human "consumption". This is where this enhancement comes handy. There are two ways of using this.

    1. Enabling switch --binlog-rows-query-log-events during the server startup
    2. Changing the corresponding session variable.

Once enabled this will cause the original query for the row event to be logged in the binary log as a special event which the DBA can check using one of the two  methods.

1. One can check these special events in the binary-log by using SHOW BINLOG EVENTS. The query corresponding the the row event will be logged as an event type: Row_query. below is one such example

| Log_name      | Pos | Event_type  | | End... | Info                                          |
| master.000001 |   4 | Format_desc |     1 |    114 | Server ver: 5.6.7-m5-debug-log, Binlog ver: 4 |
| master.000001 | 114 | Query       |     1 |    200 | use `test`; CREATE TABLE NUM_INTS (a INT)     |
| master.000001 | 200 | Query       |     1 |    268 | BEGIN                                         |
| master.000001 | 268 | Rows_query  |     1 |    323 | # INSERT INTO NUM_INTS VALUES (1), (2), (3)   |
| master.000001 | 323 | Table_map   |     1 |    364 | table_id: 54 (test.NUM_INTS)                  |
| master.000001 | 364 | Write_rows  |     1 |    408 | table_id: 54 flags: STMT_END_F                |
| master.000001 | 408 | Query       |     1 |    477 | COMMIT                                        |

In the above terminal output (snip-ed), one can see the event at pos= 268 is an informational event for the next Table_map and Write_rows events. 

2. Another way is to use the mysqlbinlog utility with  --vv switch (verbosity level 2 or more).

shell> mysqlbinlog -vv /master.000001
# at 268
#110401 14:24:29 server id 1  end_log_pos 323   Rows_query
# INSERT INTO t1 VALUES (1), (2), (3)
# at 323
#110401 14:24:29 server id 1  end_log_pos 364   Table_map: `test`.`NUM_INTS` mapped to number 54
# at 364
#110401 14:24:29 server id 1  end_log_pos 408   Write_rows: table id 54 flags:STMT_END_F


So we see how MySQL-5.6 Replication will provide a lot of features that enables a better, faster, reliable slave that is easy to maintain.  So why not just go try it out?


About the author

Rohit Kalhans is a Software Development Engineer based out of Banglore India. His area of focus revolves around Row-based replication and Multi-threaded parallel slave. His interests include system programming, highly-available and scalable systems. In his free time, he plays guitar and piano and loves to write short stories and poems. More Information can be found on his homepage.


  1. I've been using the minimal RBR format for some time and indeed it does help reduce the size of logging quite a lot. It's really good.

    I see a few missing extra features to replication in 5.6:
    * On busy servers even minimal RBR still consumes a lot of disk space. The retention period is configured in days and that's much too coarse sometimes. Just making the configuration setting for expire_logs_days work for non integer periods would give more flexibility when you can not afford to store more than a day or so binlogs.
    * The binlog events may be quite large, and especially if using SBR there's a lot of text in there. I'd love to see an option to compress the events as that probably requires only a minimal change to the format, and in many cases would again help reduce the binlog sizes.
    * Finally if RBR breaks it can be pretty tricky to figure out what broke and where. The information is there partly in what's logged by mysqld but it's not easy to process without quite a bit of processing with mysqlbinlog and some scripting. Recovery from a failed situation like this can also be tricky but in many cases you can _safely_ continue for certain situations even though mysqld will currently stop and complain:
    - a duplicate key error triggered by an insert, could be replaced by an UPDATE on the PK.
    - a failed DELETE if the PK value is missing can be ignored
    - an UPDATE (not minimal) which fails because the BI does not match _exactly_ can be executed by using the full AI, if the PK matches.
    ( all of this is of course when recovering from some replication failure where master and slaves have for some unexpected reason drifted apart )
    Allowing these recovery options (and counters to indicate how many times the different recovery methods are triggered) means there's more chance that MySQL will be able to recover without requiring a manual time-consuming intervention to figure out what's broken so that can be fixed.

    And for anyone that says that MySQL replication never breaks... that's not been my experience. It's certainly not frequent, but often may be triggered by failures in hardware or sometimes even human error. Whatever the cause often we need to fix it so having better tools to help us do that would help.

    * My other pet wish is that Oracle implement what will be in MariaDB 10 soon: I've still yet to play with that, but having used the same mechanism some time ago with Sybase think it will help many people who need to combine data from multiple sources, like central reporting servers, something that can't be done now unless using something like Tungsten Replicator.

    1. Thank you Simon for your feedback. we have taken your feedback into consideration.

  2. This comment has been removed by a blog administrator.