We talk a lot about building “task-specific” solutions that enable open source to be adopted at scale in the software-defined data center, but what does that *actually* mean? Well, here’s an example of what you can achieve when you build the hardware from the ground up with a specific task in mind. Meet the newly patented* “Ceph Button”.

 

First, a bit of background. One of the many properties of Ceph’s software-defined storage platform is that it keeps multiple copies of the data it stores, spread across nodes, so there’s no single point of failure. The algorithm that decides where those copies live is called CRUSH, if you want to do more reading.

Now, should Ceph unexpectedly lose communication with a drive or set of drives (for whatever reason), its automatic response is to re-balance: it re-creates the missing data replicas until there are enough again and makes sure they’re distributed evenly across the cluster. This, as you might imagine, can have an impact on overall system performance.
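To make the idea concrete, here is a toy sketch of why a lost drive triggers extra work. It is not CRUSH and not how Ceph tracks placement internally; the `placement` map, the drive names and the `under_replicated` helper are all made up for illustration.

```python
from typing import Dict, Set


def under_replicated(
    placement: Dict[str, Set[str]], lost_drive: str, target: int
) -> Dict[str, int]:
    """Return, per object, how many copies must be re-created after a drive is lost.

    Toy model only: `placement` maps an object name to the set of drives
    holding a copy of it. Real Ceph placement is computed by CRUSH.
    """
    missing = {}
    for obj, drives in placement.items():
        surviving = drives - {lost_drive}
        if len(surviving) < target:
            missing[obj] = target - len(surviving)
    return missing


# Hypothetical cluster: three objects, three copies each.
placement = {
    "obj-a": {"drive-1", "drive-2", "drive-3"},
    "obj-b": {"drive-2", "drive-4", "drive-5"},
    "obj-c": {"drive-1", "drive-4", "drive-6"},
}

print(under_replicated(placement, "drive-1", target=3))
# {'obj-a': 1, 'obj-c': 1}
```

Every object that had a copy on the lost drive needs a new copy written somewhere else, and on a real cluster that copying competes with normal client traffic.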

If you’re familiar with Ceph’s command line interface, you can “warn” Ceph about the drive, or set of drives, that is about to disappear and tell it to pause this re-balancing. To do that you’ll need to know which drives in which appliances will be affected, or you may just remove the entire appliance from the cluster. More importantly, identifying which specific physical disk a logical OSD is running on isn’t always straightforward. That’s a pain, a skilled job, and not something you really want to get into if all you’re trying to do is replace a failed drive.
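For a sense of what the manual route involves, here is a minimal sketch that builds the relevant commands in Python. It assumes the standard `ceph` CLI: `ceph osd set noout` and `ceph osd unset noout` toggle the cluster-wide flag that stops Ceph from marking unreachable OSDs “out” (which is what holds off the re-balancing), and `ceph osd metadata` reports details about an OSD. The helper names are ours, and the `hostname` and `devices` fields read in `locate_osd` are an assumption about that command’s JSON output, so check them against your Ceph release.

```python
import json
import subprocess
from typing import List


def pause_rebalancing_cmd() -> List[str]:
    # Set the cluster-wide noout flag before pulling drives.
    return ["ceph", "osd", "set", "noout"]


def resume_rebalancing_cmd() -> List[str]:
    # Clear the flag once the drives are back.
    return ["ceph", "osd", "unset", "noout"]


def osd_metadata_cmd(osd_id: int) -> List[str]:
    # Ask Ceph what it knows about one OSD, as JSON.
    return ["ceph", "osd", "metadata", str(osd_id), "--format", "json"]


def locate_osd(metadata_json: str) -> dict:
    # Assumed field names; they may differ between Ceph versions.
    meta = json.loads(metadata_json)
    return {"host": meta.get("hostname"), "devices": meta.get("devices")}


def run(cmd: List[str]) -> str:
    # Only call this on a node with the ceph CLI and admin credentials.
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


# Illustrative input only, not real cluster output.
sample = '{"hostname": "node-3", "devices": "sdb"}'
print(locate_osd(sample))
```

Even with this, you only get a host name and a device name such as `sdb`. Matching that to a physical slot in a chassis is still a separate, manual step.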

 

Enter the “Ceph button”.

To make maintenance easier, we build our appliances so that drives sit in caddies, each holding multiple drives. Their tool-less design means they can be installed and removed with no risk of rogue screws or screwdrivers damaging the appliance. Each caddy also has a unique button that is able to “talk” to Ceph.

The button does all the hard work in Ceph so that you don’t have to. Once you’ve pushed it, the LEDs in the caddy confirm that Ceph is ready for your drive to be pulled, and you can safely take out and replace any drive in the caddy. The caddy also has a set of indicators that highlight the specific drive in need of attention (a small on-board battery keeps the LEDs lit when the caddy is removed), making it a snap for a regular engineer, not a Ceph expert, to identify and replace the failing drive, again with no tools required.

Check out the cool little video to get a better sense of what’s happening:

You can’t do this if you take an off-the-shelf commodity appliance and run Ceph on it. You just can’t. And it adds real value for our customers trying to deploy software-defined storage at scale. If that’s got you interested, why not get in touch for a chat and we’ll explain some of the other things that a task-specific approach can deliver. We think you’ll be impressed.

*United States Patent: 10,684,928