2014-11-06

I missed the PASS welcome, but I did arrive to see Cloud Databases from Dr. Rimma Nehme.

We begin with a few references to Dr. DeWitt, who has given amazing keynotes. It’s a tough act to follow, but Dr. Nehme goes in her own direction. Dr. Nehme is highly qualified, with an MA and a PhD, and she has worked with Microsoft Research and EMC as her career has grown. She’s currently at the Gray Systems Lab.

We start with the Blind Men and the Cloud – lots of different views on what the “cloud” means. People have Shiny Object Syndrome, and think that the cloud is something new and might solve their problems. That’s certainly been true for many companies.

Cloud computing – Computing resources and software delivered on demand, as a service. That’s a good definition, but it’s also what I want to see from my database. It’s a service. It’s not a box, not a server, but it’s a service I ensure developers can use. I think as we move to that mindset, we’ll also start to manage our on-premises boxes differently and perhaps perform our jobs differently.

Characteristics of the cloud

on demand self-service

location transparent resource pooling

ubiquitous network access

rapid elasticity

measured service, pay per use

It’s a little scary to think about that last one, and I’m certain that has people concerned. Do we really know what we use, or even need to use? Often we don’t get elastic revenue even if we go to elastic cost. It’s a scary balance.
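To make the pay-per-use question concrete, here is a minimal sketch comparing a metered bill with the cost of capacity sized for peak load. The usage figures, the rate, and the function names are all made up for illustration; none of them come from the keynote or from any real price list.

```python
# Hypothetical comparison of pay-per-use vs. peak-provisioned cost.
# All numbers below are illustrative assumptions, not real pricing.

def metered_cost(hourly_usage, rate_per_unit_hour):
    """Pay only for the units actually consumed in each hour."""
    return sum(hourly_usage) * rate_per_unit_hour


def provisioned_cost(hourly_usage, rate_per_unit_hour):
    """Pay for peak capacity during every hour, used or not."""
    return max(hourly_usage) * len(hourly_usage) * rate_per_unit_hour


# Six hours of a bursty workload (units of compute per hour, assumed).
usage = [2, 2, 2, 10, 2, 2]
rate = 0.5  # assumed cost per unit-hour

print("metered:    ", metered_cost(usage, rate))
print("provisioned:", provisioned_cost(usage, rate))
```

The gap between the two numbers depends entirely on how bursty the workload is. A flat workload would cost the same either way, which is why knowing what we actually use matters before moving to a measured service.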

One big thing, which I wonder how many IT people realize, is that the cloud eliminates, or reduces, CAPEX (capital expenditure). If you don’t know what that is, you should. It’s important.

I truly think that many DBAs and other IT staff need to be thinking more widely, considering the impact of our work on the business, both positive and negative. The more we can phrase, discuss, and couch our benefits and costs in terms that business people understand, the better. That alone will help us advance our careers, and our industry.

Data centers

Where do they live? For Microsoft, they’re all over. They have over 100 in 40 countries, and over 1 million servers. Dr. Nehme moves to efficiency, since that’s one of the reasons the cloud matters.

What’s the efficiency of a data center? Traditional centers spend according to this chart.



Why do we care? Certainly you could look at climate change, but perhaps more importantly, we want to remember that resources are not infinite, and the limits, or the races to reduce costs and raise efficiency, drive companies towards innovation.

PUE = Facility Power / IT Equipment Power. In a typical data center this is usually 2, so half our power is spent on non-computing resources. Modular data centers have gotten down to a 1.12 – 1.2 PUE. The fifth generation data centers, integrated ones, are in the 1.07 range.
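The PUE arithmetic is easy to check for yourself. This is a small sketch of the ratio as defined above, plus the share of power that never reaches IT equipment. The function names and the sample kilowatt figures are my own assumptions for illustration.

```python
# Sketch of the PUE ratio: total facility power / IT equipment power.
# Function names and sample inputs are illustrative assumptions.

def pue(facility_power_kw, it_equipment_power_kw):
    """Power Usage Effectiveness for a data center."""
    if it_equipment_power_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return facility_power_kw / it_equipment_power_kw


def overhead_fraction(pue_value):
    """Fraction of total facility power spent on non-IT load
    (cooling, power distribution, lighting, and so on)."""
    return 1 - 1 / pue_value


# The PUE values mentioned in the keynote.
for label, value in [("traditional", 2.0), ("modular (high)", 1.2),
                     ("modular (low)", 1.12), ("fifth generation", 1.07)]:
    print(f"{label}: PUE {value}, overhead {overhead_fraction(value):.1%}")
```

At a PUE of 2 the overhead is 50%, which matches the "half our power" observation. At 1.07 the overhead drops to roughly 6.5% of the total.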

We now move to the cloud, and how the services may differ. We get a picture of the various cloud service levels. It’s a complex picture.



Pizza as a service, a good analogy.



Lift and shift your database to the cloud? I’m not sure it’s that easy. If you are IaaS, then it probably is, though load is an issue. If you go to PaaS, then because the “surface” of what’s available as a database differs, you probably need code changes.

The focus shifts to cloud specific services for data management, looking at areas that do not exist for the “earthed” databases. That’s a better term than on-premises, and I’ll start using it. It might help us differentiate the two types of databases that I suspect more and more of us will deal with.

Virtualization came about because it makes better use of resources, but it does bring issues: bottlenecks and delays for the various services using the hardware. The abstraction of resources (CPU, memory, network, disk) means that there are indirect paths to use those resources, which can cause performance issues.

There is a good slide, showing the ways you might look at consolidation. I’ve often looked at the timing of workloads, but understanding IO v CPU v memory (for the buffer pool), might be good.

If you don’t use virtualization, I’d be surprised, but I certainly think you should start to understand it. Even if only on your own desktop, start to learn about how things work.

We get a slide on multi-tenancy – four main approaches, with examples.

This is interesting. It’s good to build a comparison of the differences. I typically hadn’t considered the right two, though the third one is one I’ve done for auditing or ETL.

Inside Azure

It’s designed with HA in mind, with 2 secondary replicas of the data. We’ve heard this before, but we’re building a base here. We get a little architecture: layers for routing and billing sit between clients and databases. However, there’s a twist.

Each machine has a SQL Server instance (not the edition you buy) with a master, msdb, and model db. Alongside those, however, is one SQL DB of some sort. Inside of this SQL DB is the primary for one of the databases you create, along with secondaries for other SQL databases from other customers. That’s an interesting concept.

Somehow we share a database, but we can each build whatever schemas we want. There must be some sort of “virtual schema” that separates those things but we don’t get details.

The DBA’s Role

Dr. Nehme sees the cloud as not a threat to DBAs. Demand grows for apps and capacity increases, but the number of DBAs doesn’t grow as quickly. You could argue that the (relatively) low growth of DBAs is because we need fewer.

However, we still need data quality, security, HA, provisioning, tuning and more from DBAs. Arguably we might need more of those things in the cloud, so I don’t think the cloud matters, unless you are unwilling to learn. If you build your skills, and get away from the simple aspects of managing databases, I suspect you’ll be fine.

Watch the keynote. It’s interesting and should make you think. Whether you agree or not, there is something here.

