MySQL Fabric is a new framework that adds High Availability (HA) and/or scaling-out for MySQL. This is the third in a series of posts on the new MySQL Fabric framework; the first article (MySQL Fabric – adding High Availability to MySQL) explains how MySQL Fabric can deliver HA and then steps through how to configure and use it. The second (MySQL Fabric – adding Scaling to MySQL) focuses on sharding and again provides an introduction to how it’s achieved and then works through an example.
When the first two posts went live, MySQL Fabric was still an alpha – the good news is that MySQL Fabric now has a Release Candidate – download the MySQL Utilities 1.4.2 RC here (“Development Releases” tab).
This post focuses on MySQL Fabric as a whole: both High Availability and scaling out (sharding). It starts with introductions to HA and scaling out (by partitioning/sharding data) and how MySQL Fabric achieves them, before going on to work through a full example of deploying HA with MySQL Fabric and then adding sharding on top.
What MySQL Fabric Provides
MySQL Fabric is built around an extensible framework for managing farms of MySQL Servers. Currently two features have been implemented – High Availability and scaling out using data sharding. Either of these features can be used in isolation or in combination.
Both features are implemented in two layers:
The mysqlfabric process which processes any management requests – whether received through the mysqlfabric command-line-interface (the manual for which can be found at http://dev.mysql.com/doc/mysql-utilities/1.4/en/fabric.html) or from another process via the supplied XML/RPC interface. When using the HA feature, this process can also be made responsible for monitoring the master server and initiating failover to promote a slave to be the new master should it fail. The state of the server farm is held in the state store (a MySQL database) and the mysqlfabric process is responsible for providing the state and routing information to the connectors.
MySQL Connectors are used by the application code to access the database(s), converting instructions from a specific programming language to the MySQL wire protocol, which is used to communicate with the MySQL Server processes. A ‘Fabric-aware’ connector stores a cache of the routing information that it has received from the mysqlfabric process and then uses that information to send transactions or queries to the correct MySQL Server. Currently the three supported Fabric-aware MySQL connectors are for PHP, Python and Java (and in turn the Doctrine and Hibernate Object-Relational Mapping frameworks). This approach means that the latency and potential bottleneck of sending all requests via a proxy can be avoided.
High Availability (HA) refers to the ability for a system to provide continuous service; a system is available while that service can be utilized. The level of availability is often expressed in terms of the “number of nines”. For example, a HA level of 99.999% means that the service can be used for 99.999% of the time; in other words, on average, the service is unavailable for only about 5.26 minutes per year (and that includes all scheduled as well as unscheduled down-time).
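The “number of nines” arithmetic is easy to check for yourself. The short sketch below is not part of MySQL Fabric; the function name is made up for illustration and it assumes a 365-day year. For five nines it gives a little over five minutes of downtime per year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the average yearly downtime implied by an availability level."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


def show_table() -> None:
    """Print the yearly downtime for two to five nines."""
    for level in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(level)
        print(f"{level}% available: {minutes:.2f} minutes of downtime per year")
```

Calling show_table() makes it clear how quickly the downtime budget shrinks with each extra nine.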
Different Points of High Availability
The figure shows the different layers in the system that need to be available for service to be provided.
At the bottom is the data that the service relies on. Obviously, if that data is lost then the service cannot function correctly and so it’s important to make sure that there is at least one extra copy of that data. This data can be duplicated at the storage layer itself but with MySQL, it’s most commonly replicated by the layer above – the MySQL Server using MySQL Replication. The MySQL Server provides access to the data – there is no point in the data being there if you can’t get at it! It’s a common misconception that having redundancy at these two levels is enough to have a HA system but you also need to look at the system from the top-down.
To have a HA service, there needs to be redundancy at the application layer; in itself this is very straightforward: just load balance all of the service requests over a pool of application servers which are all running the same application logic. If the service were something as simple as a random number generator then this would be fine, but most useful applications need to access data and as soon as you move beyond a single database server (for example because it needs to be HA) then a way is needed to connect the application server to the correct data source. In a HA system, the routing isn’t a static function; if one database server should fail (or be taken down for maintenance) the application should be directed instead to an alternate database. Some HA systems implement this routing function by introducing a proxy process between the application and the database servers; others use a virtual IP address which can be migrated to the correct server. When using MySQL Fabric, this routing function is implemented within the Fabric-aware MySQL connector library that’s used by the application server processes.
What MySQL Fabric Adds in Terms of High Availability
MySQL Fabric has the concept of a HA group which is a pool of two or more MySQL Servers; at any point in time, one of those servers is the Primary (MySQL Replication master) and the others are Secondaries (MySQL Replication slaves). The role of a HA group is to ensure that access to the data held within that group is always available.
While MySQL Replication allows the data to be made safe by duplicating it, for a HA solution two extra components are needed and MySQL Fabric provides these:
Failure detection and promotion – the MySQL Fabric process monitors the Primary within the HA group and should that server fail then it selects one of the Secondaries and promotes it to be the Primary (with all of the other slaves in the HA group then receiving updates from the new master). Note that the connectors can inform MySQL Fabric when they observe a problem with the Primary and the MySQL Fabric process uses that information as part of its decision making process surrounding the state of the servers in the farm.
Routing of database requests – When MySQL Fabric promotes the new Primary, it updates the state store and notifies the connectors so that they can refresh their caches with the updated routing information. In this way, the application does not need to be aware that the topology has changed and that writes need to be sent to a different destination.
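To make those two components more concrete, here is a toy, in-memory model of them. It is only a sketch of the idea and not how MySQL Fabric is implemented: the HAGroup and ConnectorCache classes, the version counter used to spot stale caches, the “first healthy server wins” promotion rule and the host names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class HAGroup:
    """Toy stand-in for the state store's view of one HA group."""
    name: str
    servers: List[str]
    primary: Optional[str] = None
    failed: Set[str] = field(default_factory=set)
    version: int = 0  # bumped whenever the topology changes

    def promote(self, server: Optional[str] = None) -> str:
        """Promote the named server, or the first healthy one if none is given."""
        candidates = [s for s in self.servers if s not in self.failed]
        if not candidates:
            raise RuntimeError("no healthy server left to promote")
        if server is None:
            server = candidates[0]
        elif server not in candidates:
            raise ValueError(f"{server} is not a healthy member of {self.name}")
        self.primary = server
        self.version += 1
        return server

    def report_failure(self, server: str) -> None:
        """Mark a server as failed and fail over if it was the Primary."""
        self.failed.add(server)
        if server == self.primary:
            self.promote()


class ConnectorCache:
    """Toy connector-side cache of the routing information."""

    def __init__(self, group: HAGroup) -> None:
        self._group = group
        self._version = -1
        self._primary: Optional[str] = None

    def primary(self) -> Optional[str]:
        """Return the cached Primary, refreshing if the topology has changed."""
        if self._version != self._group.version:
            self._primary = self._group.primary
            self._version = self._group.version
        return self._primary
```

With a group built from three hypothetical hosts, HAGroup("group_id-1", ["fab2", "fab3", "fab4"]), promoting fab4 and then calling report_failure("fab4") makes the cache hand back fab2 on its next lookup, without the calling code having to know that anything changed.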
Scaling Out – Sharding
When nearing the capacity or write performance limit of a single MySQL Server (or HA group), MySQL Fabric can be used to scale-out the database servers by partitioning the data across multiple MySQL Server “groups”. Note that a group could contain a single MySQL Server or it could be a HA group.
The administrator defines how data should be partitioned/sharded between these servers; this is done by creating shard mappings. A shard mapping applies to a set of tables and for each table the administrator specifies which column from those tables should be used as a shard key (the shard key will subsequently be used by MySQL Fabric to calculate which shard a specific row from one of those tables should be part of). Because all of these tables use the same shard key and mapping, the use of the same column value in those tables will result in those rows being in the same shard – allowing a single transaction to access all of them. For example, if using the subscriber-id column from multiple tables then all of the data for a specific subscriber will be in the same shard. The administrator then defines how that shard key should be used to calculate the shard number:
HASH: A hash function is run on the shard key to generate the shard number. If values held in the column used as the sharding key don’t tend to have too many repeated values then this should result in an even partitioning of rows across the shards.
RANGE: The administrator defines an explicit mapping between ranges of values for the sharding key and shards. This gives maximum control to the user of how data is partitioned and which rows should be co-located.
When the application needs to access the sharded database, it sets a property for the connection that specifies the sharding key – the Fabric-aware connector will then apply the correct range or hash mapping and route the transaction to the correct shard.
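The two mapping types can be sketched in a few lines. This is a simplified illustration rather than the algorithm MySQL Fabric itself uses: the function names are invented, the HASH variant assumes an MD5 digest reduced modulo the number of shards, and the RANGE variant assumes each shard is described by the lower bound of its key range.

```python
import bisect
import hashlib
from typing import Sequence


def hash_shard(shard_key, shard_count: int) -> int:
    """HASH mapping: hash the key and reduce it to a shard number."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def range_shard(shard_key: int, lower_bounds: Sequence[int]) -> int:
    """RANGE mapping: find the range whose lower bound covers the key.

    lower_bounds must be sorted in ascending order; the result is the
    index of the matching range.
    """
    index = bisect.bisect_right(lower_bounds, shard_key) - 1
    if index < 0:
        raise KeyError(f"no range covers shard key {shard_key}")
    return index
```

With lower bounds of [1, 1000], range_shard(500, [1, 1000]) selects the first range and range_shard(1000, [1, 1000]) the second, whereas hash_shard spreads keys without the administrator having to choose any boundaries.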
If further shards/groups are needed then MySQL Fabric can split an existing shard into two and then update the state-store and the caches of routing data held by the connectors. Similarly, a shard can be moved from one HA group to another.
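The effect of a split or a move on the routing information can be pictured with a small model. The sketch below only covers the routing table; it ignores copying the rows themselves, and the RangeShardMap class, its method names and the group_id-3 group used in the note afterwards are hypothetical.

```python
import bisect
from typing import List


class RangeShardMap:
    """Toy routing table: sorted lower bounds, each owned by one HA group."""

    def __init__(self, lower_bound: int, group: str) -> None:
        self._bounds: List[int] = [lower_bound]
        self._groups: List[str] = [group]

    def lookup(self, shard_key: int) -> str:
        """Return the group holding the shard that covers shard_key."""
        index = bisect.bisect_right(self._bounds, shard_key) - 1
        if index < 0:
            raise KeyError(f"no shard covers key {shard_key}")
        return self._groups[index]

    def split(self, split_key: int, new_group: str) -> None:
        """Split the shard covering split_key; keys >= split_key move."""
        if split_key in self._bounds:
            raise ValueError("a shard already starts at that key")
        index = bisect.bisect_right(self._bounds, split_key)
        if index == 0:
            raise ValueError("split key is below the first shard")
        self._bounds.insert(index, split_key)
        self._groups.insert(index, new_group)

    def move(self, lower_bound: int, new_group: str) -> None:
        """Reassign the shard starting at lower_bound to another group."""
        self._groups[self._bounds.index(lower_bound)] = new_group
```

Starting from RangeShardMap(1, "group_id-1"), a call to split(1000, "group_id-2") leaves keys below 1000 where they were and routes everything from 1000 upwards to the new group; move(1000, "group_id-3") would then relocate that second shard.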
Note that a single transaction or query can only access a single shard and so it is important to select shard keys based on an understanding of the data and the application’s access patterns. It doesn’t always make sense to shard all tables as some may be relatively small and having their full contents available in each group can be beneficial given the rule about no cross-shard queries. These global tables are written to a ‘global group’ and any additions or changes to data in those tables are automatically replicated to all of the other groups. Schema changes are also made to the global group and replicated to all of the others to ensure consistency.
To get the best mapping, it may also be necessary to modify the schema if there isn’t already a ‘natural choice’ for the sharding keys.
The example that this post steps through starts by setting up MySQL Fabric itself and then uses it to manage HA using a group of MySQL Servers. An example application will store data in this new configuration. After that, the example will introduce shards to the server farm in order to scale out capacity and read/write performance.
The following sections set up the sharded MySQL configuration shown here before running some (Python) code against it, with queries and transactions routed to the correct MySQL Server.
Building the MySQL Fabric Framework
Note that this section is repeated from the earlier two posts and so can be skipped if you’ve already worked through one of them.
The machines being used already have MySQL 5.6 installed (though in a custom location) and so the only software pre-requisite is to install the MySQL connector for Python from the “Development Releases” tab from the connector download page and MySQL Fabric (part of MySQL Utilities) from the “Development Releases” tab on the MySQL Utilities download page:
MySQL Fabric needs access to a MySQL Database to store state and routing information for the farm of servers; if there isn’t already a running MySQL Server instance that can be used for this then it’s simple to set one up:
MySQL Fabric needs to be able to access this state store and so a dedicated user is created (note that the fabric database hasn’t yet been created – that will be done soon using the mysqlfabric command):
All of the management requests that we make for MySQL Fabric will be issued via the mysqlfabric command. This command is documented in the MySQL Fabric User Guide but sub-commands can be viewed from the terminal using the list-commands option:
MySQL Fabric has its own configuration file (note that its location can vary depending on your platform and how MySQL Utilities were installed). The contents of this configuration file should be reviewed before starting the MySQL Fabric process (in this case, the mysqldump_program and mysqlclient_program settings needed to be changed as MySQL was installed in a user’s directory):
The final step before starting the MySQL Fabric process is to create the MySQL Fabric schema within the state store:
An optional step is then to check for yourself that the schema is indeed there:
The MySQL Fabric process can now be started; in this case the process will run from the terminal from which it’s started but the --daemonize option can be used to make it run as a daemon.
If the process had been run as a daemon then it’s useful to be able to check if it’s actually running:
Adding MySQL Servers to Create a HA Farm
At this point, MySQL Fabric is up and running but it has no MySQL Servers to manage. This figure shows what the configuration will look like once MySQL Servers have been added to create a HA server farm.
Three MySQL Servers will make up the managed HA group – each running on a different machine – these are the configuration files for each (there’s no detailed commentary as this is standard MySQL stuff):
These MySQL Servers can then be bootstrapped and started:
At this point, the MySQL Fabric process (and its associated state store) is up and running, as are the MySQL Servers that will become part of the HA group. MySQL Fabric is now able to access and manipulate those MySQL Servers and so they can now be added to a HA group called group_id-1.
The mysqlfabric command can then be used to confirm the list of servers that are part of the HA group:
Note that all of the MySQL Servers are reported as being Secondaries (in other words, none of them is acting as the MySQL Replication master). The next step is to promote one of the servers to be the Primary; in this case the uuid of the server we want to promote is provided, but that isn’t required. If it is omitted, MySQL Fabric will select one.
Note that fab4 is now showing as the Primary; any of the Secondary servers can also be queried to confirm that they are indeed MySQL replication slaves of the Primary.
At this stage, the MySQL replication relationship is configured and running but there isn’t yet High Availability as MySQL Fabric is not monitoring the state of the servers – the final configuration step fixes that:
Everything is now set up to detect if the Primary (master) should fail and in the event that it does, promote one of the Secondaries to be the new Primary. If using one of the MySQL Fabric-aware connectors (initially PHP, Python and Java) then that failover can be transparent to the application.
Run an Application Against the HA Farm
The code that follows shows how an application can access the new HA group, in this case using the Python connector. First an application table is created:
Note the following about that code sample:
The connector is provided with the address for the MySQL Fabric process localhost:8080 rather than any of the MySQL Servers
The mode property for the connection is set to fabric.MODE_READWRITE which the connector will interpret as meaning that the transaction should be sent to the Primary (as that’s where all writes must be executed so that they can be replicated to the Secondaries)
The group property is set to group_id-1 which is the name that was given to the single HA Group
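The points above can be illustrated with a sketch along the following lines. It assumes a Connector/Python release that includes the fabric module (such as the development release mentioned earlier in this post). The credentials and the first_name and last_name columns are assumptions made for illustration; only the test database, the subscribers table and its sub_no column come from this post. The connector is imported inside the functions so that the rest of the code can be loaded without it.

```python
FABRIC_ADDRESS = {"host": "localhost", "port": 8080}  # the mysqlfabric process
HA_GROUP = "group_id-1"

# Assumed schema: only sub_no is named in this post.
CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS subscribers ("
    "sub_no INT, first_name CHAR(40), last_name CHAR(40))"
)


def open_fabric_connection():
    """Connect through MySQL Fabric rather than to a specific MySQL Server."""
    import mysql.connector  # needs a Fabric-aware Connector/Python
    return mysql.connector.connect(
        fabric=FABRIC_ADDRESS,
        user="root", password="", database="test",  # assumed credentials
        autocommit=True,
    )


def create_table(conn, read_write_mode):
    """Create the application table on the Primary of the HA group."""
    # A read/write mode tells the connector to route to the Primary.
    conn.set_property(mode=read_write_mode, group=HA_GROUP)
    cur = conn.cursor()
    cur.execute(CREATE_TABLE)


def main():
    from mysql.connector import fabric
    create_table(open_fabric_connection(), fabric.MODE_READWRITE)
```

Running main() needs a working MySQL Fabric process listening on localhost:8080 and a populated HA group.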
This code can now be executed and then a check made on one of the Secondaries that the table creation has indeed been replicated from the Primary.
The next step is to add some rows to the table:
And then the data can be retrieved (note that the mode parameter for the connection is set to fabric.MODE_READONLY and so the connector knows that it can load balance the requests across any MySQL Servers in the HA Group).
Note that if the Secondary servers don’t all have the same performance then you can skew the ratio for how many reads are sent to each one using the mysqlfabric server set_weight command – specifying a value between 0 and 1 (default is 1 for all servers). Additionally, the mysqlfabric server set_mode command can be used to specify if the Primary should receive some of the reads (READ_WRITE) or only writes (WRITE_ONLY).
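The effect of the weights can be pictured as a weighted random choice. The sketch below is an illustration of that idea only; it is not the connector's actual selection code, and the function name, the host names and the decision to skip servers with a weight of 0 are all assumptions.

```python
import random
from typing import Dict, Optional


def pick_read_server(weights: Dict[str, float],
                     rng: Optional[random.Random] = None) -> str:
    """Pick a server for a read, in proportion to its weight."""
    chooser = rng if rng is not None else random
    # Servers with a weight of zero are never offered any reads.
    servers = [name for name, weight in weights.items() if weight > 0]
    if not servers:
        raise ValueError("no server has a positive weight")
    return chooser.choices(
        servers, weights=[weights[name] for name in servers], k=1
    )[0]
```

With weights of {"fab2": 1.0, "fab3": 0.5}, fab2 should receive roughly twice as many reads as fab3 over a large number of requests.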
The next section describes how this configuration can be extended to add scalability by sharding the table data (and it can be skipped if that isn’t needed).
Adding Scale-Out with Sharding
The example in this section builds upon the previous one by adding more servers in order to scale out the capacity and read/write performance of the database. The first step is to create a new group (which is named global-group in this example) – the Global Group is a special HA group that performs two critical functions:
Any data schema changes are applied to the Global Group and from there they will be replicated to each of the other HA Groups
If there are tables that contain data that should be replicated to all HA groups (rather than sharded) then any inserts, updates or deletes will be made on the Global Group and then replicated to the others. Those tables are referred to as global tables.
The following figure illustrates what the configuration will look like once the Global Group has been created.
The global group will contain three MySQL Servers running on the same host (in a production environment they would be split over multiple machines) and so the first step is to configure, bootstrap and start those servers.
The Global Group is defined and populated with MySQL Servers and then a Primary is promoted in the following steps:
As an application table has already been created within the original HA group, that will need to be copied to the new Global Group:
A shard mapping is an entity that is used to define how certain tables should be sharded between a set of HA groups. It is possible to have multiple shard mappings but in this example, only one will be used. When defining the shard mapping, there are two key parameters:
The type of mapping – can be either HASH or RANGE
The global group that will be used
The commands that follow define the mapping and identify the index number assigned to this mapping (3 in this example). That same index is recovered in two different ways: using the mysqlfabric command and then reading the data directly from the state store:
The next step is to define which columns from which tables should be used as the sharding key (the value on which the HASH function is executed or which is compared with the defined RANGEs). In this example, only one table is being sharded (the subscribers table with the sub_no column being used as the sharding key) but the command can simply be re-executed for further tables. Note that the identifier for the shard mapping (3) is passed on the command-line:
At this point, the shard mapping has been defined but no shards have been created and so the next step is to create a single shard and that shard will be stored in the existing HA group (group_id-1):
At this point, the database has technically been sharded but of course it offers no scalability as there is only a single shard. The steps that follow evolve that configuration into one containing two shards as shown in the following figure.
Another HA group (group_id-2) is created from three newly created MySQL Servers, and then one of the servers is promoted to be the Primary:
At this point, the new HA group exists but is missing the application schema and data. Before allocating a shard to the group, a reset master needs to be executed on the Primary for the group (this is required because changes have already been made on that server – if nothing else, to grant permissions for one or more users to connect remotely). The mysqlfabric group lookup_server command is used to first check which of the three servers is currently the Primary.
Splitting the Shard
The next step is to split the existing shard, specifying the shard id (in this case 2) and the name of the HA group where the new shard will be stored:
Before looking at the application code changes that are needed to cope with the sharded data, a simple test can be run to confirm that the table’s existing data has indeed been split between the two shards:
The next example Python code adds some new rows to the subscribers table. Note that the tables property for the connection is set to test.subscribers and the key to the value of the sub_no column for that table – this is enough information for the Fabric-aware connector to choose the correct shard/HA group and then the fact that the mode property is set to fabric.MODE_READWRITE further tells the connector that the transaction should be sent to the Primary within that HA group.
The mysql client can then be used to confirm that the new data has also been partitioned between the two shards/HA groups.
Example Application Code (Includes Sharding)
The final example application code reads the row for each of the records that have been added. The key thing to note here is that the mode property for the connection has been set to fabric.MODE_READONLY so that the Fabric-aware Python connector knows that it can load balance requests over the Secondaries within the HA groups rather than sending everything to the Primary.
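The sharded write and read patterns described above can be sketched as follows. As before, this assumes a Connector/Python release that includes the fabric module. The first_name and last_name columns, the credentials and the sample values are assumptions made for illustration; the tables, key and mode properties are the ones described in the text.

```python
SHARDED_TABLES = ["test.subscribers"]


def add_subscriber(conn, read_write_mode, sub_no, first_name, last_name):
    """Insert one row; the connector picks the shard from sub_no."""
    conn.set_property(tables=SHARDED_TABLES, key=sub_no, mode=read_write_mode)
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO subscribers (sub_no, first_name, last_name) "
        "VALUES (%s, %s, %s)",
        (sub_no, first_name, last_name),
    )


def find_subscriber(conn, read_only_mode, sub_no):
    """Read one subscriber; a read-only mode allows use of a Secondary."""
    conn.set_property(tables=SHARDED_TABLES, key=sub_no, mode=read_only_mode)
    cur = conn.cursor()
    cur.execute(
        "SELECT first_name, last_name FROM subscribers WHERE sub_no = %s",
        (sub_no,),
    )
    return cur.fetchall()


def main():
    import mysql.connector  # needs a Fabric-aware Connector/Python
    from mysql.connector import fabric
    conn = mysql.connector.connect(
        fabric={"host": "localhost", "port": 8080},
        user="root", password="", database="test",  # assumed credentials
        autocommit=True,
    )
    add_subscriber(conn, fabric.MODE_READWRITE, 72, "Billy", "Fish")
    print(find_subscriber(conn, fabric.MODE_READONLY, 72))
```

Because the sharding key is set before every statement, the same two functions keep working after a shard has been split or moved. Note that a read sent to a Secondary straight after a write may not see the new row until replication has caught up.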
The initial version of MySQL Fabric is designed to be simple, robust and able to scale to thousands of MySQL Servers. This approach means that this version has a number of limitations, which are described here:
Sharding is not completely transparent to the application. While the application need not be aware of which server stores a set of rows and it doesn’t need to be concerned when that data is moved, it does need to provide the sharding key when accessing the database.
All transactions and queries need to be limited in scope to the rows held in a single shard, together with the global (non-sharded) tables. For example, joins involving multiple shards are not supported.
Because the connectors perform the routing function, the extra latency involved in proxy-based solutions is avoided but it does mean that Fabric-aware connectors are required; at the time of writing these exist for PHP, Python and Java.
The MySQL Fabric process itself is not fault-tolerant and must be restarted in the event of it failing. Note that this does not represent a single-point-of-failure for the server farm (HA and/or sharding) as the connectors are able to continue routing operations using their local caches while the MySQL Fabric process is unavailable.
MySQL Fabric Architecture & Extensibility
MySQL Fabric has been architected for extensibility at a number of levels. For example, in the first release the only option for implementing HA is based on MySQL Replication but in future releases we hope to add further options (for example, MySQL Cluster). We also hope to see completely new applications around the managing of farms of MySQL Servers – both from Oracle and the wider MySQL community.
The following diagram illustrates how new applications and protocols can be added using the pluggable framework.
We really hope that people try out MySQL Fabric and let us know how you get on; one way is to comment on this post, another is to post to the MySQL Fabric forum or if you think you’ve found a bug then raise a bug report.