2012-12-18

It is time for christmas presents: some sharding support and cache locality optimizations are coming with PECL/mysqlnd_ms 1.5. PECL/mysqlnd_ms is a plugin for the mysqlnd library. The plugin adds replication and load balancing support to any PHP MySQL API (mysql, mysqli, PDO_MySQL) if compiled to use the mysqlnd library.

As a MySQL user you can choose between a wide variety of clustering solutions to scale-out. Your options range from eventual consistent solutions to strong consistent ones, from built-in (MySQL Replication, MySQL Cluster) to third party or home-grown. PECL/mysqlnd_ms is a client side load balancer that aims to serve all.

Background: cluster-wide partitioning and sharding

Divide-and-conquer is a true classic. If you are lucky, your business requires you to spread load and data over many MySQL nodes using some sort of cluster. In many cases read scale-out offered by MySQL Replication is all you need. Any step further means digging deeper into the world of distributed systems.

Writes

MySQL Master

|
|
|

Slave 1
Slave 2
Slave 3

Reads

Possibly, you need more than the above and look into schema-based partitioning for distributing data over many nodes. MySQL Replication cluster topologies are hardly limited by anything but your fantasy. Please, mind the writes because MySQL Replication is using lazy primary copy but who do I tell this?

Writes for DB1, DB2

Writes for DB3, DB4

Reads covering DB1…DB4

Reads covering DB1…DB4

MySQL Master 1
replicate DB1, DB2 –>
MySQL Master 2

DB1 (rw), DB2 (rw), DB3 (ro), DB4 (ro)

1e9)

Writes for DB1.T2


Reads covering …

Reads covering …



MySQL Master 1
replicate … –>
MySQL Master n

DB1.T1 (ID n…m) (rw)

query("INSERT INTO t1(id) VALUES(1)");

$server = get_shard_server($id = 2);
$link = new mysqli($server);
$link->query("INSERT INTO t1(id) VALUES(2)");

The above is not very nice as all the connection handling is left to the user as an excercise. Also, imagine you wanted to add read scale-out through a lazy primary copy system like MySQL Replication and had to do read-write splitting, failover or load balancing on top. All tasks could be handled by a load balancer using code like this:

$link = new mysqli("cluster_name");
$link->query(get_shard_hint(1) . "INSERT INTO t1(id) VALUES(1)");
$link->query("/*shard_a*/INSERT INTO t1(id) VALUES(2)");

A SQL hint (comment) at the very beginning of the query tells the load balancer which "shard" to use. The load balancer can then map the shard to one node or a group of nodes that form a MySQL Replication setup consisting of one master to handle the write load and multiple slaves for the reads. That’s one use case of the node_group filter coming with PECL/mysqlnd_ms 1.5.

Static sharding example

It is rather unlikely you will ever setup a MySQL cluster like the below, but let’s assume you do to give me a chance to demo all kinds of funny setups. Your cluster consists of three groups ("shards"). Some of the groups use replication (A, B), some don’t (C). Based on some application rules you want to distribute queries to A, B and C.

cluster_name

Group A
Group B
Group C

Master 1
Master 2
Master 3

Slave 1
Slave 2
Slave 3

This is an example PECL/mysqlnd_ms 1.5 load balancer configuration to match the above cluster topology. Instead of naming the groups A, B, and C, one could use 0, 1, 2 as well. This may be handy if your application is using a very basic function to compute the "shard" such as ID % 3.
node_group.conf

{
"cluster_name": {
"master": {
"master1": {
"host": "192.168.2.1",
"port": 3306,
},
"master2": {
"host": "192.168.2.2",
"port": 3306,
},
"master3": {
"host": "192.168.2.3",
"port": 3306,
}
},
"slave": {
"slave1": {
"host": "192.168.2.4",
"port": 3306,
},
"slave2": {
"host": "192.168.2.5",
"port": 3306,
},
"slave3": {
"host": "192.168.2.6",
"port": 3306,
}
},
"filters": {
"node_groups": {
"A": {
"master": [
"master1"
],
"slave": [
"slave1"
]
},
"B": {
"master": [
"master2"
],
"slave": [
"slave2",
"slave3"
]
},
"C": {
"master": [
"master3"
],
"slave": [
"master3"
]
}
},
"roundrobin": [

]
}
}
}

Next, you load PECL/mysqlnd_ms into PHP, enable it, tell it where to find the configuration and instruct it to emit no warning about multiple masters given in the configuration.
php.ini

extension=/path/to/mysqlnd_ms.so
mysqlnd_ms.enable=1
mysqlnd_ms.multi_master=1
mysqlnd_ms.config_file=node_group.conf

You are done with the configuration. If you open a connection to the host "cluster_name" using any PHP MySQL API, the load balancers begins its work. To send queries to the "shard" A, prefix the query with the SQL hint "/*A*/". PECL/mysqlnd_ms will recognize the hint and not use all configured nodes but only those listed in the node group "A". With read write splitting being enabled by default, the SELECT will go to slave1 and the INSERT will be directed to master1. The SELECT statements that begin with the hint "/*B*/" are load balanced in a round robin fashion over slave2 and slave3.

$link = new mysqli("cluster_name", "user", "password", "database");
$link->query("/*A*/SELECT 1 FROM DUAL");
$link->query("/*A*/INSERT INTO test(id) VALUES (1)");
$link->query("/*B*/SELECT 1 FROM DUAL");
$link->query("/*B*/SELECT 1 FROM DUAL");

That’s the idea, that’s what works with the initial push. Details to be implemented.

Cache locality optimization

Optimizing query distribution for better cache usage is another use case of the new node_group filter. Assume all nodes hold all the data, no partitioning or sharding is used, all nodes can answer all queries. Queries are evenly distributed over all nodes (Node 1, 2) and databases (DB1, DB2). Thus, all caches hold roughly the same set of data.

Node 1
Node 2

~6MB DB1.* cache
~5MB DB1.* cache

~4MB DB2.* cache
~5MB DB2.* cache

Using node groups you could manually influence, for example, SELECT distribution for selected queries in order to keep all large SELECT questions for DB1 on Node 1 and all large ones for DB2 on Node 2. This may result in more efficient usage of rare cache memory. However, it is a detail optimization for experts.

Node 1
Node 2

~9MB DB1.* cache
~1MB DB1.* cache

~1MB DB2.* cache
~9MB DB2.* cache

Merry Christmas and a Happy New Year!

Christmas is around the corner, PECL/mysqlnd_ms 1.5 is not yet. The node_group filter is brand new (from yesterday) and its not feature complete. Sharding/partitioning support does not really depend on it. Both use cases sketched in this blog can be implemented using user provided callback functions. Search the fine manual for user and user_multi filter.

Happy hacking!

@Ulf_Wendel
The post Some sharding support and cache locality optimization support for PHP MySQL driver appeared first on Ulf Wendel.

Show more