Tuesday, October 29, 2013

Two weeks back I decided to switch to Ubuntu 13.04. I was using 12.04 LTS version and everything was fine.  However I was seeing crash in bzr-gtk  upon bzr gcommit and hence I decided to upgrade. All was well until I saw the sys-tray icon of Network Manager crash upon switching connections due to  bug#1164631 and bug#1159063. I have to switch between connections often and this was frustrating.

The worst part of all this is that you cannot go online and find a solution to the issue since you are no longer connected to the internet.  So much for our dependency on the Internet! And you cannot restart your system if you have a huge test suite being run.

Skimming through the man pages I came across the old friend,   NMCLI

$  nmcli dev
DEVICE     TYPE              STATE        
ttyUSB0    cdma              disconnected    
wlan0      802-11-wireless   disconnected 
eth0       802-3-ethernet    unavailable 

Okay, So this was the epiphany I was looking for, Cool! So what's next.

$ nmcli con
NAME                      UUID                                   TYPE              TIMESTAMP-REAL                    
VPN connection 1          de9fd222-7a27-4e29-ad9e-9095f953e837   vpn               never                             
LAN                       c26d0089-0f41-422a-9997-8a9b832d7159   802-3-ethernet    Saturday 26 October 2013 07:23:48 PM IST
Tata Indicom (Photon+) connection b3c0f820-7521-4eb4-8507-8c931f44d691   cdma              Tuesday 29 October 2013 04:25:28 PM IST

To connect copy the uuid or the name of the connection you want to connect to  e.g. I wanted to connect to my cdma dongle and hence I used

$ nmcli con up uuid b3c0f820-7521-4eb4-8507-8c931f44d691

This is cool but I wanted to switch the connection from the systray. I used Python appIndicator3  to create an appindicator and added a menu with the list of all the available connections.

from gi.repository import AppIndicator3 as appindicator

def getIndicator():                                                             
   indicator = appindicator.Indicator.new (                                     
   indicator.set_status (appindicator.IndicatorStatus.ACTIVE)                   
   return indicator

Next was to register the appindcator to dbus notification interface.  This will ensure that we get notifications of network connect or disconnect,  whether from this appindicator or from outside.

from gi.repository import Gdk                                                   
import dbus                                                                     
from dbus.mainloop.glib import DBusGMainLoop 

def registerSignalHandler(ind):                                                 
        global indicator                                                        
        indicator= ind                                                          
        dbus_loop = DBusGMainLoop()                                                 
        system_bus = dbus.SystemBus(mainloop=dbus_loop)                             

 From here everything was downhill :)  I added a call back to check if the notification is a  connect or a disconnect.  You may get the active connection from

$ nmcli con status
NAME                      UUID                         launchpad          DEVICES    DEFAULT  VPN   MASTER-PATH                                 
Tata Indicom (Photon+) connection b3c0f820-7521-4eb4-8507-8c931f44d691   ttyUSB0    yes      no    --  

A screenshot of the systray icon.

It is a simple python script and is available  on launchpad.

Saturday, September 21, 2013

MySQL 5.7.2 features enhanced Multi-threaded slave which can be used to apply transactions in parallel even within a single database. Internal details of its working can be found in an earlier post. In this  post we will see how we can configure our replication slave to use this enhancement.

MySQL 5.7.2 has a new system variable  --slave-parallel-type which is dynamic. It can be set to the following values:

1. DATABASE  : (Default) Use the db partitioned MTS (1 worker per database)
2. LOGICAL_CLOCK:  Use logical clock based parallelization mode.

Apart from this the original option of --slave-parallel-workers=N is still valid and it sets that number of workers that we need to spawn. Also since the slave leverages the group of transactions that have committed in parallel on the slave, it makes sense to leave --binlog-max-flush-queue-time=0 which is the default value intact, on the master. This will ensure that the leader thread on the master flushes all the transactions queued in the FLUSH QUEUE of binlog group commit without getting timed out, thereby delivering maximum  parallelization on the slave.

Finally to summarize the steps to set up the enhanced MTS

1. start master with --binlog-max-flush-queue-time=0

1.a. Start slave server with --slave-parallel-type=LOGICAL_CLOCK --slave-parallel-workers=N

Or alternatively,

1.b Start the slave server normally. Change the MTS options dynamically using
the following

mysql: STOP SLAVE: --if the slave is running

A small Demo:

1. We created 5 tables in a single test database on master and used 5 clients to do inserts on them, in parallel.
2. The slave was configured as --slave-parallel-type="logical_clock"  and --slave-parallel-workers=5.
3. We let the slave replicate from the master and we checked the status of the workers by using
    performance schema tables for replication and show processlist command

Here is the sample output on the slave (click on the image to zoom)

When to use enhanced MTS

Since the slave uses the parallelization information from the master, it performs best when there are multiple clients on the master and there are multiple transactions committing at the same time. In case the master is underloaded, spawning multiple threads may not have effect on the slave performance, and may even lead to performance degradation.


This enhancement is available in MySQL 5.7.2 which can be downloaded from the MySQL download page. So try it out and let us know your valuable feedback.


Re-applying binary logs generated from highly concurrent master on the slave has always been an area of focus. It is important for various reasons. First, in real-time systems, it becomes extremely important for the slave to keep up with the master. This can only be guaranteed if the slaves’ performance in reapplying the transactions from the binary log is similar (or at-least comparable) to that of master, which is accepting queries directly from multiple clients. Second, in synchronous replication scenarios, having a fast slaves, aids in reducing the response times as seen by the clients to the master. This can be made possible by applying transactions from the binary log in parallel. However if left uncontrolled, a simple round-robin multi-threaded applying will lead to inconsistency and the slave will no longer be the exact replica of the leader.

The infamous out of order commit problem

The Out of order execution of transaction on the slave if left uncontrolled will lead to the slave diverging from the master. Here is an example: consider two transactions T1 and T2 being applied on an initial state.

On Master we apply T1 and T2 in that order.
State0: x= 1, y= 1
T1: { x:= Read(y);
          x:= x+1;
          Commit; }
State1: x= 2, y= 1

T2: { y:= Read(x);
          Commit; }
State2: x= 2, y= 3

On the slave however these two transactions commit out of order (Say T2 and then T1).
State0: x= 1, y= 1
T2: { y:= Read(x);
          y:= y+1;
          Commit; }
State1: x= 1, y= 2

T1: { x:= Read(y);
          Commit; }
State2: x= 3, y= 2

As we see above the final state state 2 is different in the two cases. Needless to say that we need to control the transactions that can execute in parallel.

Controlled parallelization

The above problem can be solved by controlling what transactions can be executed in parallel with the ones being executed by the slave. This means we need to have some kind of information in the transactions themselves. Interesting to note that we can use the information of parallelization from the master on the slave. Since we have multiple transactions committing at the same time on the master, we can store the information of the transactions that were in the "process of committing" when this transaction committed. Now let's define the phrase "process of committing".

The process of committing: On the slave we need to make sure that the transactions that we schedule for parallel execution will be the one which do not have conflicting read and write set. This is the only and the necessary requirement for the slave  workers to work without conflicts. This also implies that if the transactions being executed in parallel do not have intersecting read and write sets, we don't care if they are committed out of order. Since MySQL uses lock based scheduling, all the transactions that have entered the prepared stage but not as yet committed will have disjoint read and write sets and hence can be executed in parallel.

Logical clock and commit parent

We have introduced a logical clock. Now before I am tackled by a mathematician from one side and a computer engineer from the other, let me explain. It is a simple counter which is stepped when a binlog group of transaction commits on the master. Essentially this clock is stepped every time the leader execute the flush stage of binlog group commit. The value of this clock is recorded on each transaction when it enters the prepare stage. This recorded value is the "commit parent"

The pseudo code is as follows.

During Prepare
trx.commit_parent= commit_clock.get_timestamp();

During Commit
for every binlog group

As it is evident by now the transactions with the same commit parent follow our guiding principle of slave side parallelization i.e. transactions that have entered the prepared stage but has not as yet committed, and hence can be executed in parallel.

Schematics of inlog prepare stage and commit parent
In the example we will take up three transactions (T1 T2 T3), two of which have been committed as a part of the same binlog group. T1 enters the prepare stage and get the commit parent as 0 since none of the group have been committed as yet. T1 assigns itself as the leader and then goes on to flush its transaction/statement cache. In the meanwhile transaction T2 enters the prepare stage. It is also assigned the same commit parent "0"(CP as used in the figure) since the commit clock has not as yet been stepped. T2 then goes on a wait for the leader to flush its cache in to the binlog. After the flush has been completed by the leader, it signals T2 to continue and both of them enter the Sync stage, where the leader thread  calls fsync() there by finishing the binlog commit process. The  transaction T3 however enters the prepare stage after the previous group has been synced and there-by ends up getting the next CP.

Another thing to note here is that the "group" of transactions that are being executed in parallel are not bounded by binlog commit group. There is a possibility that a transaction have entered the binlog prepare stage but could not make it to the current binlog group. Our approach takes care of such cases and makes sure that we relax the boundary of the group being executed in parallel on the slave.

On the slave we use the existing infrastructure of DB partitioned MTS to execute the tranactions in parallel, simply by modifying the scheduling logic.


This feature provides the great enhancement to the existing MySQL replication. To know more about the configuration options of this enhancement refer to this post.
This feature is available in MySQL 5.7.2 release. You can try it out and let us know the feedback.

Monday, May 6, 2013


MySQL replication slave features  a powerful capability of ignoring conflicts like duplicate key error, key not found errors etc. while applying row events.  This is exceptionally useful while doing row based replication(RBR) from the master when the slave already contains some data which may conflict with the data coming from the master. In MySQL 5.7 we extend this capability while applying row events from mysql-binlog files using mysqlbinlog tool.  This enhancement will prevent such errors from aborting mysql client in case of conflicts like the ones mentioned above.


Prior to MySQL 5.7 we have been using the mysqlbinlog tool as follows.

shell$> mysqlbinlog master.000001|mysql -uroot -hexample.com -ps3cret  

This allows us to pipe the output of mysqlbinlog to mysql, and works as long as the row events from the mysqlbinlog do not conflict with the data already present on the mysql server on example.com. However in case of conflicts, the mysql client aborts and the we have to restart the whole process again skipping the event and starting from the next event using the start-position option.  This is a problem in case of multiple conflicts.

How to use

The enhancement is pretty simple to use. mysqlbinlog in MySQL 5.7.0 provides command line options to enable this feature.

--idempotent, -i 

To use this from the command line  simply use the short option

shell$> mysqlbinlog master.000001 -i | mysql -uroot -hexample.com -ps3cret  

or use full option
shell$> mysqlbinlog master.000001 --idempotent | mysql -uroot -hexample.com -ps3cret  

How does it work

In MySQL server version  5.7.1 we have introduced a new session system variable as rbr_exec_mode which can be set to STRICT or IDEMPOTENT. When set to IDEMPOTENT, the server does not throw out any conflict errors for that particular session.

When executed with -i or --idempotent  option the mysql binlog writes
to the output file or to the stdout. When piped into MySQL client this directs the server to use the idempotent mode while applying row events.

This enhancement is available in MySQL 5.7.1 which can be downloaded from the MySQL download page. So go try it out and let us know your valuable feedback using the comment section.