2016-10-24

This post was authored by Cosmos Darwin, Program Manager, Windows Server.

The Challenge

In the Windows Server team, we tend to focus on going big. Our enterprise customers and service providers are increasingly relying on Windows as the foundation of their software-defined datacenters, and needless to say, our hyperscale public cloud Azure does too. Recent big announcements – support for 24 TB of memory per server with Hyper-V, 6+ million IOPS per cluster with Storage Spaces Direct, and 50 Gb/s of throughput per virtual machine with Software-Defined Networking – are the proof.

But what can these same features in Windows Server do for smaller deployments? Those known in the IT industry as Remote-Office / Branch-Office (“ROBO”) – think retail stores, restaurants, bank branches or private practices, remote industrial or construction sites, and more. After all, their basic requirement isn’t so different – they need high availability for mission-critical apps, with rock-solid storage for those apps. And generally, they need it to be local, so they can operate – process transactions, or look up a patient’s records – even when their Internet connection is flaky or non-existent.

For these deployments, cost is paramount. Major retail chains operate thousands, or tens of thousands, of locations. This multiplier makes IT budgets extremely sensitive to the per-unit cost of each system. The simplicity and savings of hyper-convergence – using the same servers to provide compute and storage – present an attractive solution.

With this in mind, under the auspices of Project Kepler-47, we set about going small…



This tiny two-server cluster packs powerful compute and spacious storage into one cubic foot.

Meet Kepler-47

“The storage is flash-accelerated, the chips are Intel Xeon,
and the memory is error-correcting DDR4 – no compromises.”

The resulting prototype – and it’s just that, a prototype – was revealed at Microsoft Ignite 2016 last week.



Kepler-47 on expo floor at Microsoft Ignite 2016 in Atlanta.

In our configuration, this tiny two-server cluster provides over 20 TB of available storage capacity, and over 50 GB of available memory for a handful of mid-sized virtual machines. The storage is flash-accelerated, the chips are Intel Xeon, and the memory is error-correcting DDR4 – no compromises. The storage is mirrored to tolerate hardware failures – drive or server – with continuous availability. And if one server goes down or needs maintenance, virtual machines live migrate to the other server with no appreciable downtime.
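In case you’re curious what that failover looks like from the admin’s chair, here’s a minimal PowerShell sketch using the standard Failover Clustering cmdlets – the VM and node names are just placeholders for illustration:

```powershell
# Live migrate one running VM to the other server (names are placeholders).
Move-ClusterVirtualMachineRole -Name "VM01" -Node "Kepler47-B" -MigrationType Live

# Or drain a server entirely before planned maintenance; every clustered
# role live migrates to the surviving node.
Suspend-ClusterNode -Name "Kepler47-A" -Drain -Wait

# When maintenance is done, bring the node back and fail back.
Resume-ClusterNode -Name "Kepler47-A" -Failback Immediate
```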



Kepler-47 is 45% smaller than standard 2U rack servers.

In terms of size, Kepler-47 is barely one cubic foot – 45% smaller than standard 2U rack servers. For perspective, this means both servers fit readily in one carry-on bag in the overhead bin!

We bought (almost) every part online at retail prices. The total cost for each server was just $1,101. This excludes the drives, which we salvaged from around the office, and which could vary wildly in price depending on your needs.

Each Kepler-47 server cost just $1,101 retail, excluding drives.

Technology

Kepler-47 comprises two servers, each running Windows Server 2016 Datacenter. The servers form one hyper-converged Failover Cluster, with the new Cloud Witness as the low-cost, low-footprint quorum technology. The cluster provides high availability to Hyper-V virtual machines (which may also run Windows, at no additional licensing cost), and Storage Spaces Direct provides fast and fault-tolerant storage using just the local drives.
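In outline, standing up a cluster like this takes only a handful of PowerShell commands. Here’s a minimal sketch – the cluster, node, and storage account names are placeholders, not our exact build script:

```powershell
# Form the two-node cluster (all names here are placeholders).
New-Cluster -Name "Kepler47" -Node "Kepler47-A","Kepler47-B" -NoStorage

# Cloud Witness: a tiny blob in an Azure storage account acts as the
# third quorum vote, so no third server or file share witness is needed.
Set-ClusterQuorum -Cluster "Kepler47" -CloudWitness `
    -AccountName "<storage-account>" -AccessKey "<access-key>"

# Enable Storage Spaces Direct on the local drives, then create a
# mirrored, continuously available volume for the VMs.
Enable-ClusterStorageSpacesDirect
New-Volume -FriendlyName "Mirror" -FileSystem CSVFS_ReFS `
    -StoragePoolFriendlyName "S2D*" -Size 10TB
```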

Additional fault tolerance can be achieved using new features such as Storage Replica with Azure Site Recovery.
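For instance, Storage Replica can pair a volume with a server in another site in a single partnership. Here’s a rough sketch – every name below is a placeholder, not a tested configuration:

```powershell
# Replicate a data volume (plus its log volume) to an off-site server.
# All names below are placeholders for illustration only.
New-SRPartnership `
    -SourceComputerName "Kepler47-A" -SourceRGName "RG01" `
    -SourceVolumeName "D:" -SourceLogVolumeName "L:" `
    -DestinationComputerName "DR-Server" -DestinationRGName "RG02" `
    -DestinationVolumeName "D:" -DestinationLogVolumeName "L:"
```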

Notably, Kepler-47 does not use traditional Ethernet networking between the servers, eliminating the need for costly high-speed network adapters and switches. Instead, it uses Intel Thunderbolt™ 3 over a USB Type-C connector, which provides up to 20 Gb/s (or up to 40 Gb/s when utilizing display and data together!) – plenty for replicating storage and live migrating virtual machines.

To pull this off, we partnered with our friends at Intel, who furnished us with pre-release PCIe add-in-cards for Thunderbolt™ 3 and a proof-of-concept driver.

Kepler-47 does not use traditional Ethernet between the servers; instead, it uses Intel Thunderbolt™ 3 over USB Type-C.

To our delight, it worked like a charm – here’s the Networks view in Failover Cluster Manager. Thanks, Intel!

The Networks view in Failover Cluster Manager, showing Thunderbolt™ Networking.
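The same check works from PowerShell, where the Thunderbolt™ link appears as just another cluster network (output will vary by system, of course):

```powershell
# List the cluster networks; the Thunderbolt link shows up alongside
# any ordinary Ethernet management network.
Get-ClusterNetwork | Format-Table Name, Role, State
```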

While Thunderbolt™ 3 is already in widespread use in laptops and other devices, this kind of server application is new, and it’s one of the main reasons Kepler-47 is strictly a prototype. It also boots from USB 3 DOM, which isn’t yet supported, and has neither a host-bus adapter (HBA) nor a SAS expander, both of which are currently required for Storage Spaces Direct to leverage SCSI Enclosure Services (SES) for slot identification. However, it otherwise passes all our validation and testing and, as far as we can tell, works flawlessly.
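That validation is the standard Test-Cluster suite – a minimal sketch, again with our placeholder node names:

```powershell
# Run cluster validation, including the Storage Spaces Direct tests.
Test-Cluster -Node "Kepler47-A","Kepler47-B" `
    -Include "Storage Spaces Direct", "Inventory", "Network", "System Configuration"
```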

(In case you missed it, support for Storage Spaces Direct clusters with just two servers was announced at Ignite!)

Parts List

Ok, now for the juicy details. Since Ignite, we have been asked repeatedly what parts we used. Here you go:

The key parts of Kepler-47.

* Just one needed for both servers.

Practical Notes

The ASRock C236 WSI motherboard is the only one we could locate that is mini-ITX form factor, has eight SATA ports, and supports server-class processors and error-correcting memory with SATA hot-plug. The E3-1235L v5 is just 25 watts, which helps keep Kepler-47 very quiet. (Dan has been running it literally on his desk since last month, and he hasn’t complained yet.)

Having spent all our SATA ports on the storage, we needed to boot from something else. We were delighted to spot the USB 3 header on the motherboard.

The U-NAS NSC-800 chassis is not the cheapest option – you could go cheaper. However, it features an aluminum outer casing, a steel frame, and rubberized drive trays; the quality appealed to us.

We actually had to order two sets of SATA cables – the first set was not malleable enough to weave around the tight corners from the board to the drive bays in our chassis. The second set, flat 30 AWG cables, works great.

Likewise, we ran up against physical limits on the heatsink – the fan we use is barely 2.7 cm tall, so that it fits in the chassis.

We salvaged the drives we used, for cache and capacity, from other systems in our test lab. In the case of the SSDs, they’re several years old and discontinued, so it’s not clear how to accurately price them. In the future, we imagine ROBO deployments of Storage Spaces Direct will vary tremendously in the drives they use – we chose 4 TB HDDs, but some folks may only need 1 TB, or may want 10 TB. This is why we aren’t focusing on the price of the drives themselves – it’s really up to you.
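Whichever drives you choose, Storage Spaces Direct sorts them into cache and capacity automatically. A quick way to see what it claimed, using the standard Storage cmdlets (output varies by hardware):

```powershell
# List the pooled drives by media type to see the cache (SSD) and
# capacity (HDD) devices Storage Spaces Direct claimed.
Get-StorageSubSystem Cluster* | Get-PhysicalDisk |
    Sort-Object MediaType |
    Format-Table FriendlyName, MediaType, Usage, Size
```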

Finally, the Thunderbolt™ 3 controller chip in PCIe add-in-card form factor was pre-release, for development purposes only. It was graciously provided to us by our friends at Intel. They have cited a price tag of $8.55 for the chip, but they haven’t made us pay yet.

Takeaway

With Project Kepler-47, we used Storage Spaces Direct and Windows Server 2016 to build an unprecedentedly low-cost high-availability solution for remote-office and branch-office needs. It delivers the simplicity and savings of hyper-convergence – compute and storage in a single two-server cluster, with next to no networking gear – on a very friendly budget.

Are you or your organization interested in this type of solution? Let us know in the comments!
