2015-12-09

Tesora’s Doug Shelley and Oracle’s Matt Lord weigh in on the latest with MySQL on OpenStack Trove.



Transcript of Session:

Doug:     As Ken said, I'm Doug Shelley. I'm the VP of development at Tesora. I just have a couple of slides where I want to introduce some newer features in the Tesora DBaaS platform, specifically around MySQL. Then I have a short demo to show off those features that I wanted to take you through. Before I do that, have you all heard of the book? I don't know if that's been mentioned yet, the book, OpenStack Trove.

The Tesora DBaaS platform, just a quick intro for those of you who don't know what it is. It's completely based on community Trove, which we all know and love, and we provide a few extra things. We provide certified guest images; pretty much all the databases are supported, plus some extras. I think this is one of the things that's important. In the last discussion, the meet the experts panel, there was a significant amount of discussion around guest images: what it takes to make one, how do you make sure it works, all those things. We definitely identified that as an area where we could add value. We're building these production quality images that we test all the time, across all platforms and distributions, and across all the databases, to make sure that they're going to work. We're starting to add more enterprise features. We've definitely identified Oracle support as an area that is important for enterprise customers, so we started building out Oracle 11g and 12c support. And we've done a bunch of work around making the platform easier to install and configure.

Really in this presentation I wanted to focus on what's new, and I want to focus on MySQL support. Since the last Trove day, if anybody was there, about a year ago, there are some things that I wanted to highlight. We really took a look at replication within the context of MySQL, and we did this basically in two phases. The first phase we called replication v1. It was binlog based: asynchronous replication based on the kind of stock binlog that's in MySQL 5.5. At the time it also supported MySQL 5.6. Basically you create a slave from a master, you can create multiple slaves one at a time, and you can detach a slave.

We looked at that and decided to evolve the feature, and we came up with replication v2, where we decided to use the global transaction ID (GTID) mechanism that's in MySQL 5.6, which basically provides for some of these other features I'll talk about in a second. We also provided the ability to start multiple slaves simultaneously: with one command you can have n replicas come up. We also recognized a need to be able to start a slave from an incremental backup, because if every time a slave starts it has to do a full backup, that could be annoying. If there's a full backup available, it will automatically provision the slaves off of an incremental.
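To make the provisioning decision above concrete, here's a small Python sketch (Python being the language of Trove and its client library). This is illustrative only, not Trove's actual code; the backup record shape and the function name are invented for the example.

```python
# Illustrative sketch (not Trove's actual code) of the replica-provisioning
# decision described above: if a full backup of the master already exists,
# base the new replica on an incremental backup instead of taking a fresh
# full backup for every replica.

def choose_backup_strategy(existing_backups):
    """Return ('incremental', parent_backup) if a full backup exists,
    else ('full', None)."""
    fulls = [b for b in existing_backups if b["type"] == "full"]
    if fulls:
        # Base the incremental on the most recent full backup.
        parent = max(fulls, key=lambda b: b["created"])
        return ("incremental", parent)
    return ("full", None)

backups = [
    {"id": "b1", "type": "full", "created": 100},
    {"id": "b2", "type": "incremental", "created": 150},
]
print(choose_backup_strategy(backups)[0])  # -> incremental
print(choose_backup_strategy([])[0])       # -> full
```

The point is simply that the expensive full-backup step is amortized across replicas once a full backup exists.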

The biggest thing I'd say we started out here, and Amrith touched on some of this this morning, is failover. We added, at this point, manual failover. The ability to do this was really enabled by the use of the GTID mechanism. Now basically what we can do is promote and eject, and I'll talk more about those in a second. All of this stuff went upstream in Juno and Kilo, I believe.

Then another thing that has been recently added and is being worked on, something I think has been talked about for a long time in the community and with customers, is the ability to pull various logs off of the guest. I would say the major focus is the database specific error log and the database specific query logs, the general and slow query logs. We built out a mechanism that's generic, but the initial implementation is focused on MySQL. For MySQL it exposes the error, general, and slow query logs. We also optionally expose the actual Trove guest agent log.

The publishing of the logs can be enabled or disabled. There's a Horizon interface for this, which I'm going to show you. There's also a CLI and a REST API; the publish command is log publish. You can stream individual logs back to your command line using log tail, which is also available in Horizon, and you can download the entire log with log save.

I wanted to take a look here, let's see if this works. That's the next slide. Okay. We can flip to the demo, before my friends at HP laugh at me. My demo is a video that I'm going to walk you through. I do that because, as my VP of marketing said, ridicule is better than epic fail. Actually he's not my VP of marketing; he's my favorite marketing dude.

This is the Horizon dashboard, our themed Horizon. This is the Tesora DBaaS platform, so here we're just going to sign into that. Then what I'm going to show you first is the data store panel, which just lists the different data stores that are available. As you heard today, these all link basically to guest images. I loaded up a few here just so you could see more than what I'm going to demo, but you can see we have Oracle, Cassandra, Postgres.

I'm going to focus on an image we just recently created with help from our friends in the MySQL enterprise group at Oracle. You see this MySQL-eeguest. I'm going to click on that here, eventually. There we go. This is just to drill down into it. You can see the name and that it points to the datastore. This is version 5.6 of MySQL Enterprise; there's the datastore version panel showing that. I didn't click on it, but you can see there's a link on the far right side to the image ID that's actually in Glance.

Okay. What we're going to do is bring up an instance that's going to become my master. I hit the Launch Instance dialog and fill it in: my master. I'm going to pick this medium sized flavor, go for a 3 gig volume, and pick MySQL Enterprise 5.6 as my guest. On the create I'm also going to create a database called employees, because I'm actually going to load the MySQL employees sample into it so we have some data, and I'm going to create a user for myself. These will all be created on the MySQL database when it's provisioned. Now we're going to launch. Here we go. It's building. You can see. Build, build, build. There we go. It's active. See how fast that was.

Now what we're going to do is use this master instance to show you our log retrieval feature. We're drilling into the instance details first. You can see down here I'm going to cut and paste the locator info so I can connect to the MySQL instance via the CLI and show you some things. Just going to copy that out; that has the connect information for MySQL. You can see here the user got created, and the database, employees, is there.

Now we're going to go over to the logs. As I mentioned in my overview slide, this logs panel is new; in Tesora DBaaS it will be 1.6, I believe, and the feature will end up being upstreamed in Mitaka. These are the four logs we expose for MySQL. You can see there: error, slow query, guest, general log. The publishable flag indicates whether it's a publishable log; for the demo I set them all to yes so we can get through this. What I'm going to do is go ahead and show you the general log. I'm going to click on this view log button here. As you can see, for anybody familiar with MySQL, that happens to look like a MySQL general log. Not much has happened yet; I'm going to come back to this in a minute after I do some more work.

There are some more options for you, like log download. We're going to go back to the list. This is the Trove guest agent log. One of the things in the feature is that the operator can choose which of these logs to expose to the end user. In the case of this demo I've exposed them all, but there are certain reasons you can imagine that the guest agent log would be a particular log that you wouldn't want to expose to the end users of the system. That's available; I just have it exposed here. This is a Trove guest agent log. The Trove people out there would recognize that for sure.

Going back to the list, I'm going to pop open the error log. This is a MySQL error log. I'm going to show this twice just so people don't think I made it up. Notice the name of the database at the bottom there: it says 5.6.26 enterprise commercial advanced. When I read that I was trying to figure out how many more words Oracle could put in the name of the database to indicate that it's enterprise edition. I'm using, in fact, MySQL Enterprise Edition 5.6.

Now what I'm going to do is load the database. I'm going to go to the command line and connect as my user ID, doug. Here we are, and you can see again, it comes up, it's MySQL Enterprise. I'm going to show you that this database, employees, is on the instance; there it is. Then we're going to go out here and load this employees sample; people who have used MySQL know this. It loads quite a lot of data, I'd say, for a sample database, which will help exercise the replication feature here. I'm just going to load that. It shouldn't take that long. This Mac is really fast; either that or I'm really good at video editing.

Okay. I just want to go in and show you on the database that the data is loaded, and specifically how many rows are in the employees table. I'm going to select count star from employees. This is all loaded into my master database that I created previously. There it is: 300,024 rows. Now we're going to flip back to our dashboard and go back out to the instances panel to launch some replicas. Actually, sorry, no we're not. First I'm going to show you the general log through the log retrieval feature. You can see that mess above; that's the bulk insert from the employees sample. Then you can see at the bottom here the commands that I typed into the master to select the count star from employees, so it's indeed collecting data as we go.

Now we're going to launch the replicas, after I get back to the top of this screen. There's all the bulk load of that database. Okay. Come on. Let's stop collection on the general log first. Okay. See, it stopped. Okay, here we go. We're going to launch; this time we're going to launch two slaves off that master. I'm going to give them a base name of my replica; they're going to come up as my replica 1 and my replica 2. We're going to use MySQL 5.6 Enterprise. Under the advanced tab we're going to pick the source as replicate from instance, and we're going to pick the master instance as the source. We're going to do two slaves. This is basically going to take a backup of the master, shoot it to both of the two slaves it provisions, and then provision GTID-based replication on there. Right now you see it's doing the backup of the master. Then the master goes back to active. Then it loads that onto the slaves and they go active. Now we effectively have a three node replication set in MySQL here.

I'm just going to go through the instances and show you how they're provisioned. This is the first replica. Basically on the dashboard, what gets added to the detail output, if you look right at the very bottom, is that it is a replica of; this is a replica of that instance. I'm going to copy and paste out the replica locator info so we can go check the database out. Okay. Now we're going to go look at the second replica quickly here, I believe. Nope, we're going to check the first replica for data. That entire employees database should have been backed up, restored on here, and be replicating now, so we're going to just check that. I'm going to see how many rows are in our employees table: select count star from employees. Anybody know what the answer's going to be? 300,024. Stunning. There we go. Our replica has been correctly provisioned with that base dataset from the master.

Now what I wanted to show you is one of the failover mechanisms. In the case where you want to initiate a failover and the master is still reachable, you can use the promote to replica source function, which is right here. What it's going to do, basically, is take the replica you picked and make it the master, and convert the master to a slave. They go into this promote state, the whole set, for a minute, so it can get everything consistent. Once they're active we can take a look and see that indeed my replica 2, the one I selected, is now the master. You can see that because instead of saying it is a replica of, at the bottom it says replicas, and it has two of them. We can see now that our original master is actually a replica of, so it's been demoted to a slave. My replica 1 now will have switched its master from the original master to my replica 2. I click on that link and you'll see that my replica 2 actually comes up. That's promote.

Then the next thing I wanted to show you was just some of the other features here, the commands you can do. For a replica you can do detach replica. That basically leaves the database alone but takes it out of the replication network, so it's no longer replicating. For the master you can do eject. The purpose of eject is for when the master isn't actually reachable and you want to kick it out and have Trove automatically promote one of the slaves to be master. The way it does this is configurable, but I think at this point by default it uses the slave that it thinks is most current, which makes sense in most cases, I suspect.
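The "most current slave" choice above can be sketched in a few lines of Python. This is a hypothetical illustration, not Trove's implementation: real MySQL GTID sets are UUID:interval lists, while this simplified model just maps a server UUID to the last transaction number applied from it.

```python
# Hypothetical sketch of how an eject operation might pick the "most
# current" replica: compare each replica's executed GTID set and choose
# the one that has applied the most transactions. The data shapes and
# names here are invented for illustration.

def gtid_count(executed_set):
    """Total transactions in a {server_uuid: last_txn_number} executed set."""
    return sum(executed_set.values())

def pick_new_master(replicas):
    """replicas: {name: executed_gtid_set}; return the most current one."""
    return max(replicas, key=lambda name: gtid_count(replicas[name]))

replicas = {
    "my_replica_1": {"uuid-a": 120, "uuid-b": 4},
    "my_replica_2": {"uuid-a": 118, "uuid-b": 4},
}
print(pick_new_master(replicas))  # -> my_replica_1
```

Because GTIDs identify exactly which transactions each replica has applied, the promotion logic can make this comparison without guessing from binlog file positions.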

We're going to go ahead and do that. We're going to eject my replica 2. Here we go. It generated an error. The reason there's an error is because my replica 2 is still reachable; it's tagged as active. This is basically a seatbelt: you're not allowed to eject the master if it's reachable. If you wanted to do that, you should use promote, because promote makes sure all your databases are consistent. In the case of eject it is possible that you could lose data if there are transactions on your master that haven't been replicated yet. Basically we're telling you here, you can't eject this guy because he's still reachable. We determine reachability by the Trove guest heartbeat. This guy has a current heartbeat, so he's not ejectable. What I'm going to do is knock over the Trove guest agent as a way to simulate a failure. I just SSH into it and kick the guest agent over. You can see here I just did that, and it says the guest agent is inactive. Now that master will end up with a stale heartbeat, in which case we can eject it. They all go into eject state. It's working through figuring out which replica to promote. It will promote it and detach the master, which in this case was my replica 2.

Let's go see if we can figure out who the new master is. First we'll check my replica 2, and notice now that he has no replication information at the bottom, because he's basically been detached; he's not in the replication network anymore. We'll go check out my master, and maybe he got promoted back to being the master. Let's find out. Yes, he did, because he has one replica, which is the other guy. There we go. I think that's actually what I had for today. I don't know if there are any questions for me. I can take a couple and then I'll turn it over to Matt. Does anybody have any? Oh, sorry, I can't see. Go ahead.

Doug: Yes.

Doug:     The promote won't take longer because they're already all in sync, generally speaking. The lag could possibly depend on the transaction rate during promote. But in terms of creating a new replication network, of course, if you have a very large volume that didn't have a previous full backup on it and you make him a master, yeah, it's got to basically shoot that database back up over to all of the slaves. That's part of the reason for the incremental feature. But definitely at this point, if you have a significantly large master and you create a new slave, there is time in terms of getting the backup to it. Anybody else? Go ahead.

Audience member: do what you just demoed in a program.

Doug:     You asked me if I know of any Java client.

Audience member:         Are there any Java client libraries that you know of that could be used to build a demo like what you just did without going to the UI?

Doug:     So you’re asking me if you could demo this feature without using the UI?

Audience member: Yeah, basically I'm interested in Trove, client libraries for Trove.

Doug:     Well, the core client library for Trove is Python. All of the things that I just demoed through Horizon are available through the REST API, which in turn means they're available through the Python API and also through the Trove CLI. You can completely script that, for sure.

Doug:     No, I don’t think that’s advisable.

Audience member: Okay.

Doug:     Because basically at the point you ejected it you possibly, maybe likely depending on the transaction rate, lost data, because it could have kept getting partitioned or something, right.

Audience member:         Why is the ejected VM labeled as active just like all of the others?

Doug:     Yeah. That is a very good question. That is an artifact which I think we need to sort out. The Trove instance's active state is being set because the heartbeat is current. There's no mechanism at the moment to actually detect that the heartbeat isn't current, in terms of active monitoring. When I shot that guest agent, it just stopped sending heartbeats. In the infrastructure database, which is where the heartbeat is stored, there's no mechanism to go, oh, he's stale, change his state to not active. I think part of that goes to the monitoring discussion that Amrith was talking about this morning, which I think the community is definitely very interested in. Okay. Did I mention the book? Oh, sorry. Okay, Matt, I'll turn it over to you.

Matt:     Thanks Doug. Can you hear me? Can you hear me now? Thank you, Doug. My name is Matt Lord. I'm a product manager in the MySQL group at Oracle. I'm going to try to go through these really quickly; I know we're already running a little long and I've got quite a few slides. I wanted to at least cover the highlights of the highlights.

MySQL 5.7: our second release candidate was made available just a few weeks ago. There's a ton of new stuff in 5.7. Again, I'm just going to touch on the highlights of the highlights. I have links on the last slide, and this presentation will be made available later on by Tesora; I encourage you to look up some additional information. It's by far the biggest release that we've ever done for MySQL. There's something in it for everybody.

Even if you're not necessarily interested in any of the new features, I would encourage you to upgrade to 5.6 and then to 5.7, if nothing else for the increased performance. With each release we identify and eliminate certain hot spots and bottlenecks; typically it's mutex usage. By splitting various mutexes, moving to atomics, and looking at our synchronization primitives, we're able to scale up more effectively with each release. Of course, we want to be able to scale horizontally as well, which we'll talk about, group replication for example, but being able to scale up is a big part of the full scale picture as well.

We also have some NoSQL APIs to MySQL. This was actually added in MySQL 5.6. We have the memcached API, with which you can talk directly to the storage engine, which is InnoDB. Facebook was one of the early adopters of that technology. They gave us a lot of feedback that was extremely helpful; Facebook operates at a scale that not too many entities do. That helped us with a lot of the improvements that you can see there in the 5.7 numbers compared to 5.6.

This is another area where Facebook was a big help. They still use PHP; they have their own modified version, their HipHop VM, which is still based on PHP, which in practice means that they have a lot of connects and disconnects. There's not a big connection pooling layer, and when you don't have a connection pooling layer, of course you're making tons of connects and disconnects. We made some pretty serious improvements to how quickly we're able to handle connections: doing the handshake, doing authentication, and so on.

Optimizer improvements: we made a number of them in 5.6, most notably related to subqueries, and we've made far more in 5.7. This is just kind of a highlight of them. We have a new cost model. Before, it was almost entirely based on heuristics; there were some costs involved, but they were all essentially static. Now we have a configurable cost model. There are cost model tables in the system schema, the mysql schema, and you can actually go in and modify those. You can modify the cost of a disk read, for example. If you're using a SolidFire or Nimble storage flash array, you're going to have a much different I/O cost than if you're just using a single SATA disk.
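Here's a tiny Python model of the idea of a configurable cost model. The constant names mirror the ones found in MySQL's cost tables, but the scan-cost formula itself is simplified for illustration; it is not the server's actual costing code.

```python
# Simplified sketch of a configurable cost model, in the spirit of
# MySQL's mysql.engine_cost / mysql.server_cost tables: the cost the
# optimizer assigns to a plan step is computed from tunable constants
# rather than hard-coded heuristics. The formula here is illustrative.

cost_constants = {
    "io_block_read_cost": 1.0,   # cost of reading one page from disk
    "row_evaluate_cost": 0.2,    # cost of evaluating one row
}

def table_scan_cost(pages, rows, constants=cost_constants):
    """Estimated cost of a full table scan: page reads + row evaluations."""
    return (pages * constants["io_block_read_cost"]
            + rows * constants["row_evaluate_cost"])

# Default (single SATA disk) vs. a tuned-down I/O cost for a flash array:
print(table_scan_cost(1000, 50000))                         # -> 11000.0
flash = dict(cost_constants, io_block_read_cost=0.25)
print(table_scan_cost(1000, 50000, flash))                  # -> 10250.0
```

Lowering the I/O constant for fast storage shrinks the scan's estimated cost, which is exactly how a tunable cost model lets the optimizer prefer different plans on different hardware.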

Those costs are built in throughout. We also expose them through JSON: we have a JSON based explain, which was new in 5.6, and we have expanded it. Every time we add new costs, we also expose them in the explain.

I'm going to have to skip through, just because of time. I wanted to put the slides in there even if I couldn't spend much time on them, just so you can download them later. The query rewrite plugin: this is really helpful for a few specific use cases. For example, if you want to build your own firewall; we actually leverage that for an enterprise offering as well, some similar things. Or if you just have a third party application that you're using, maybe you inherited it or you're just using it as is, and you can't go in and modify the query. There are certain times where even the best optimizer doesn't pick the optimal path. Then you can go in and add an index hint, for example. This is one way you can do that even when you don't have the ability to modify the application or the queries directly.

If you've been using MySQL 5.5 or 5.1, you've almost certainly run into cases where you needed to try to solve a performance related issue and you ended up having to make educated guesses, in some cases less educated than others. That's because MySQL was a bit of a black box before MySQL 5.6, where we really started to add a lot of instrumentation to performance schema. We've added a ton of additional instrumentation in 5.7, as you can see listed here. One of the big ones is memory, but also internal locking, transactions, stored procedures. We filled in a lot of the gaps that we had in MySQL 5.6, if you've been using 5.6.

In conjunction with that, we lowered the overhead a lot. Of course there's always a tradeoff between monitorability, logging, debugging facilities, and so on, and performance. We've been able to lower the performance overhead of performance schema while at the same time adding new features. We've also made it a bit more accessible for the average user. If you look at performance schema directly, there are a ton of tables and a ton of different options, and it can be a bit daunting; it can scare people off at first. So we've added the SYS schema in 5.7 as well, which is just a collection of helpful views, stored procedures, and stored routines that really make all of that data in performance schema more accessible. It provides it in a nice, user friendly format. It allows you to easily answer common questions like: what queries are resulting in the most I/O? How is the I/O usage broken down by table? What is blocking this particular query? And things like that.
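What a sys-schema view does, conceptually, is aggregate raw instrumentation events into an answerable summary. This toy Python sketch illustrates the "I/O broken down by table" question; the event rows are invented sample data, not real performance_schema output.

```python
# Toy illustration of what a sys-schema view provides: aggregate raw
# performance_schema-style events into a summary that answers a common
# question, e.g. "how is I/O usage broken down by table?". The event
# data here is invented for the example.

events = [
    {"table": "employees", "bytes_read": 4096},
    {"table": "employees", "bytes_read": 8192},
    {"table": "salaries",  "bytes_read": 2048},
]

io_by_table = {}
for e in events:
    io_by_table[e["table"]] = io_by_table.get(e["table"], 0) + e["bytes_read"]

# Report the biggest consumers first, the way the sys views sort output:
for table, total in sorted(io_by_table.items(), key=lambda kv: -kv[1]):
    print(table, total)
# employees 12288
# salaries 2048
```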

JSON support is one of the more headline features. It's certainly something that a wide swath of people are interested in. We have a native JSON type in 5.7. We also have a native binary format: if you're using MongoDB, they have their BSON format; we have our own internal binary format, so things perform well, although when it's exposed via the query results it's just native JSON. We also have JSON functions, and the syntax that we support conforms to the proposed SQL standard; there's an SQL standard proposal out there for JSON in SQL, so we conform to that.

In 5.7 we also support generated columns; if you're familiar with SQL Server, it's very similar. We support generated columns that are either virtual or stored. You can create virtual generated columns and then add indexes on them to achieve what's often referred to as a functional index. That allows you to then query the JSON efficiently: you can add an index on the result of some JSON document lookup.
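The functional-index idea can be modeled in a few lines of Python: the index is keyed by an expression over the document (here, a field lookup), so queries on that expression avoid re-parsing every row. In MySQL 5.7 you would do this with a virtual generated column plus an index on it; this sketch, with invented sample data, just shows the concept.

```python
import json

# Conceptual model of a "functional index" over JSON documents:
# a "generated column" is an expression over each row (extracting
# the dept field), and the "index" maps that generated value to the
# matching row ids. Sample rows are invented for illustration.

rows = [
    (1, '{"name": "alice", "dept": "eng"}'),
    (2, '{"name": "bob", "dept": "sales"}'),
    (3, '{"name": "carol", "dept": "eng"}'),
]

index = {}
for row_id, doc in rows:
    # The extraction below plays the role of the virtual generated column.
    index.setdefault(json.loads(doc)["dept"], []).append(row_id)

# A lookup on the generated value now avoids scanning and re-parsing rows:
print(index["eng"])  # -> [1, 3]
```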

GIS is one area where MySQL has historically been behind; in the open source world, behind PostGIS. We're really looking to provide a solid foundation for MySQL users. It's come up before, polyglot persistence: one of the common cases is that MySQL users may have a Postgres system set up just for their GIS needs. We really want to provide a GIS system internally in MySQL that eliminates that use case. If you're a MySQL user, you're not forced to go off and set up a separate PostGIS instance just for whatever your geospatial needs are.

You can see a rundown there. The single biggest one is that we added spatial index support to InnoDB; previously it was only for the MyISAM storage engine. We also added a bunch of other things like GeoJSON and GeoHash support. We also ripped out and replaced the internal implementation, which is probably not of too much interest to you. We hired two of the primary developers from Boost.Geometry. They've been working with us, and we commit that code back upstream to Boost.Geometry.

InnoDB is the primary storage engine. If you're not intimately familiar with MySQL, you might not even know what I'm talking about with storage engines; if you're using MySQL 5.5, you're just using InnoDB without even knowing it. We've added native partitioning. We've had partitioning for some time, but it was just generic partitioning support, so you had an intermediate handler layer. Now we have native support, which eliminates a lot of the problems that we've had with partitioning, namely around the number of file descriptors that are used and the memory that's used. We have CJK support for full text indexes: InnoDB supports full text indexes as of 5.6, and now we have two CJK related parsers, a MeCab parser and an n-gram parser.

I mentioned the native spatial indexes already; we use R-tree indexes for that. We support larger page sizes, and you can use those in conjunction with another feature I'll talk about, which is transparent page compression. That allows you to get a much better compression ratio: the bigger the page size, the more redundant data you're going to have in the page, and the better the compression ratios you can get.

We added general tablespace support. Now let's say you have an internal organization or a customer: you can create all of their tables in a shared general tablespace. It makes it a little easier for migrating things at the file system level. For example, you can have a single .ibd file for a given customer or a given internal organization. Group replication I'm going to talk about in just a minute.

Security is always an important topic, especially in the cloud, where literally and figuratively you may have other eyes on your data. You don't have physical control over its location, so being able to support data encryption both in transit and at rest is important. We've done a number of improvements there. We've also helped with general security issues, with password rotation policies; we can notify you if someone is using a weak password, and so on.

SSL I already mentioned in general, but we've had two notable things. One is a new helper. When you initialize MySQL with MySQL 5.7 it will bootstrap itself, so you don't need to run mysql_install_db, the separate script; you run mysqld --initialize. If you don't already have SSL certificates set up, it will create those SSL artifacts for you. You can also use a separate utility, mysql_ssl_rsa_setup, as well. And you can now require secure transport, which is something that you could never do before. It was actually a cause of some confusion; there was previously no way to require SSL connections. Now you can do that with MySQL 5.7.

Server-side timeouts are something that people have requested for a long time; it's a long-standing feature request from users, so I thought it was worth mentioning. Now you can specify a global timeout. For any read query, you can say, globally, I want it to time out at 20 seconds. For example, some developer added a query that scans a two billion row table and it's run every time someone logs in. This gives you a chance to protect your machine from total failure in the event that certain things come in. You can also do it per query, as you can see in this example here: you can specify n milliseconds using the new hint syntax as well.

We have a new hint syntax; I mentioned that in the optimizer slide, or rather listed it there. It's much nicer: if you ever used hints before, they were kind of spread all over the place. Now, in conjunction with the old ones, we support this syntax, which will be familiar to an Oracle database user. It goes right after the SELECT, INSERT, UPDATE, or DELETE, and then you can specify any number of hints, which is much easier than having them all over the place when you get into really long queries. We've also added a bunch of new hints. But there are so many new things in 5.7, I'm just giving you the highlights of the highlights. I encourage you to go talk to one of us, talk to a MySQL sales person, look at MySQL.com, and you can find out all the details.

As Doug mentioned, we added GTID, global transaction ID based replication, in MySQL 5.6, and we made some major enhancements to it in 5.7. The biggest one is that you can move from the old style of replication, binary log file and position, to GTIDs as an online operation.

We also enhanced the semi-synchronous replication that we added in MySQL 5.6. Going back to MySQL 3.23, when we first added replication, it was just asynchronous. In MySQL 5.6 we added support for semi-synchronous replication: you can say, I don't want you to do the local commit until n nodes have said, okay, we have received this, go ahead and commit it.

Multi-source replication is another one. Before, you always had one master to n slaves. Now it can be many to many: a slave can have many masters, so you can do things like fan-in replication.

Performance is always a big issue. Another thing that a lot of people have kind of glossed over in 5.6, and something that we've improved even more in 5.7, is slave performance. If you've been using MySQL for any period of time, you've probably come across slave lag; it's just a common issue that you have to be aware of. With asynchronous replication the slave can fall behind the master by n transactions or n seconds. We added parallel replication in 5.6, but the parallelization was done based on schema: if you had queries coming in for n schemas, we would execute them in parallel. Now we've made that logical clock based, so you can get parallelization regardless of the schema; as long as there are non-conflicting writes, we'll do that. We leveraged some of the binary log group commit work from the master. You can specify n number of threads on each slave in order to do that.
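The logical-clock idea above can be sketched as follows. Each transaction in the binary log carries a last_committed and a sequence_number from the master's group commit; transactions that overlapped on the master (an earlier transaction's sequence_number is greater than a later one's last_committed) may be applied in parallel on the slave. The field names match the binlog events, but the greedy batching scheduler below is a simplification, not the server's applier.

```python
# Sketch of the logical-clock scheduling rule behind the MySQL 5.7
# multi-threaded slave. Each transaction is a (last_committed,
# sequence_number) pair taken from the master's binary log group
# commit. A transaction must wait for every transaction whose
# sequence_number is <= its last_committed; otherwise it can run
# in parallel. This greedy batcher is a simplified illustration.

def parallel_groups(txns):
    """Greedily batch transactions that may be applied concurrently."""
    groups, current, watermark = [], [], 0
    for last_committed, sequence_number in txns:
        if current and last_committed >= watermark:
            # This txn depends on something in the current batch:
            # the batch must finish before it can start.
            groups.append(current)
            current = []
        current.append(sequence_number)
        watermark = max(watermark, sequence_number)
    if current:
        groups.append(current)
    return groups

# Three txns group-committed together on the master, then one dependent txn:
txns = [(0, 1), (0, 2), (0, 3), (3, 4)]
print(parallel_groups(txns))  # -> [[1, 2, 3], [4]]
```

The first three transactions shared a commit window on the master, so the slave can apply them on separate threads regardless of which schema they touch.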

I'll skip over that; it just talks a little bit about the GTID online operations. This is just highlighting the multi-source replication. I say multi-source; some people have called it multi-master, I've noticed, which leads to confusion, because this isn't what I'm going to talk about shortly, which is multi-master, or write anywhere, replication. That is another thing that's new in 5.7.

General HA improvements. Group replication is probably the biggest, which I’m going to talk about in just a second. We’ve added a lot of little things that really help us internally with MySQL Fabric, but also any tooling. We’ve added some server-side state tracking that really helps understand when a connection can be moved from one server to another: am I in the middle of a transaction? If I am, then I can’t migrate that connection from one instance to another; if I’m not, then the load balancer is free to move that connection around.
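The server-side piece here is the session state trackers added in 5.7; for example, a proxy can ask the server to report transaction state with each response (a sketch using the 5.7 `session_track_transaction_info` system variable):

```sql
-- Ask the server to include transaction-state information in the
-- OK packet of each response, so a proxy or load balancer can tell
-- whether the session is inside a transaction before moving it.
SET SESSION session_track_transaction_info = 'STATE';
```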

Version tokens help with caches. If you have a caching layer, that caching layer can compare its version token to the one that’s currently on the server, and if they don’t match, then the caching layer can go ahead and reload its cache. We leverage that in MySQL Fabric.
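Version tokens ship as a 5.7 plugin; a hedged sketch, assuming the plugin and function names as they appear in the 5.7 distribution (the token name `schema_ver` is just an example):

```sql
-- Load the version tokens plugin (MySQL 5.7).
INSTALL PLUGIN version_tokens SONAME 'version_token.so';

-- Set a token on the server; a caching layer that pinned a different
-- value for this token will get an error on its next statement and
-- know to refresh its cache.
SELECT version_tokens_set('schema_ver=42');

-- A client pins the token value it expects for its session:
SET @@SESSION.version_tokens_session = 'schema_ver=42';
```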

Group replication, as far as HA and replication go, is the single most interesting feature. I’m sure a number of people here are familiar with Galera. Galera replication and MySQL group replication are very similar at a high level: they’re both based on the same academic papers from the late ’90s, the replicated database state machine papers. The implementation differs. At a high level, it’s multi-master, write-anywhere, synchronous replication. The difference between the two is that Galera is a bolted-on, third-party product. It has its own options, its own logging facilities, its own DBA facilities, all the wsrep stuff if you’ve used it. Whereas group replication is baked in. It uses the same options, the same parallel applier code, the same debugging facilities and performance schema and so on. From a DBA perspective, one is learning another product, and the other means you now have the option of asynchronous, semi-synchronous, or synchronous, in effect.
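At the time of this talk group replication was a labs preview; the variable names below follow the plugin as it later shipped, so treat this as an assumed sketch with placeholder addresses:

```sql
-- Load the group replication plugin and join a group.
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

SET GLOBAL group_replication_group_name  = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
SET GLOBAL group_replication_local_address = 'node1.example.com:33061';
SET GLOBAL group_replication_group_seeds =
  'node1.example.com:33061,node2.example.com:33061,node3.example.com:33061';

-- The first member bootstraps the group; later members just start.
SET GLOBAL group_replication_bootstrap_group = ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;
```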

The other main difference that I’ll talk about is how they handle the not so rare case that one of the nodes in the cluster becomes much slower. Say you have a four-node cluster and on one of them the performance drops off for whatever reason. Galera will degrade the performance of every node in the cluster to match the slow one, so that they all stay in synchronous replication. Group replication will silently move the one slow node out of synchronous replication into eventual consistency. Then once things come back, it will ensure that node is consistent and a synchronous member of the group again. There are tradeoffs there, and I would like to see us have a configurable choice. I think those are the two big differences between the implementations.

I mentioned native JSON support. We also have, in the MySQL labs, which is where we release beta, non-GA previews of certain add-ons or features aside from the MySQL server, an example of one: the HTTP plugin. Not only can you use JSON internally, but you can set up the HTTP plugin on the server so that rather than sending normal SQL to MySQL, clients can speak JSON over HTTP. You can send plain SQL calls or you can make document-oriented calls.
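The native JSON support itself is in the 5.7 server; a quick sketch of the JSON column type and path-based extraction:

```sql
-- MySQL 5.7 native JSON: a JSON column, validated on insert,
-- with path-based extraction (-> is shorthand for JSON_EXTRACT).
CREATE TABLE docs (id INT PRIMARY KEY, doc JSON);
INSERT INTO docs VALUES (1, '{"name": "trove", "tags": ["dbaas", "openstack"]}');

SELECT doc->'$.name' FROM docs WHERE id = 1;        -- "trove"
SELECT JSON_EXTRACT(doc, '$.tags[0]') FROM docs;    -- "dbaas"
```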

I’m running out of time here, I know, I apologize. Just quickly, this is the MySQL official tool for handling sharding and HA. Each HA group that you see in this picture is managing a shard. Fabric will then handle two things: it will handle the sharding aspect, so a query comes in and it will direct it to the appropriate shard, and it will also handle the HA within each HA group. It monitors the nodes; if one of them fails it will handle the promotion, the demotion, and so on. Group replication can be used in there moving forward to improve the characteristics of each HA group. In theory you can write to any of them, although you still probably want to have some of them read-only and some of them read-write.

This I’ll skip over. We moved from what we used to use years ago, which was Bazaar, or even before that we used, I’m spacing on the name, an old proprietary versioning mechanism, and then we moved to Bazaar. Now we’ve moved to GitHub, which just makes it easier for people to work with us. We have our code available on GitHub; you can even submit pull requests and so on. BitKeeper, that’s the old one we used to use.

If you don’t already know, we also have official MySQL repos. We have YUM repos for Red Hat, APT repos for Debian, and so on.

This is just additional info. This was just the highlights of the highlights. 5.7 is a huge release. Really excited about it. I encourage you to go find out additional information. I doubt we have time for questions, but feel free to come talk to me. I’ll be in the back there by the little Oracle table. That’s it. Thank you.

The post The Latest with MySQL on OpenStack Trove appeared first on Tesora.