How HyperDrive can help stem spiraling co-lo costs as your storage footprint grows.
A scenario that we regularly see: you run a business that monetizes data, maybe directly by offering data storage services for other businesses – things like backup, cloud file sharing systems, or processing – or indirectly through insights that help your business run smarter and be more competitive.
That data is increasingly generated and stored outside the four walls of a traditional data center: closer to cloud computing resources, closer to geographically distributed workforces, or closer to where the data gets generated, perhaps because of bandwidth limitations or evolving data privacy or sovereignty laws.
Co-location provides an attractive solution and, based on IDC’s forecast, is on track to become a $15B market within the next 5 years. These facility providers have effectively monetized the outsourcing of facilities management, all the way down to the rack and physical security. They’ve also built strong relationships with network backhaul service providers to ensure high bandwidth and reliability.
Co-location providers typically monetize three aspects of their service:
- The physical space – often in partial or full rack increments. It facilitates secure separation of tenants within the retail co-location market for those who don’t have enough digital footprint to consume entire rooms or floors of a building.
- Network connectivity to the outside world – this can vary widely depending on networking needs and sophistication, but usually includes some combination of bandwidth and an allocation of logical network segmentation (IP subnets or addresses, VLANs, etc.).
- Power reservation – typically, customers are charged for a circuit (or redundant circuits) and can consume as much energy as the circuit allows, for the power they actually consume, or both. Pricing varies widely by region because of differences in the cost of power infrastructure and of electricity from utility providers, and power is sold at a premium over the “raw” cost per kWh, reflecting that it tends to be both an in-demand and limited commodity inside the four walls of a data center.
Recently, Earth Capital produced a sustainability-focused report which demonstrated that SoftIron storage appliances use, on average, 70% less energy than industry-standard servers. That efficiency also has a significant bottom-line impact, especially for those using co-location services. Let’s do some math to demonstrate how this relates to specific costs in a co-location scenario.
An example – 1PB in co-location
Let’s say you want to store 1 petabyte (PB) of data in a co-lo facility. Let’s compare a HyperDrive cluster built to do the job effectively with something based on generic hardware.
Using SoftIron, your first thought might be to architect a solution using the densest possible system that we provide – which today is the HD11144, providing 144TB of capacity per 1 rack unit. But you’ll also want to ensure that your system is both durable (resistant to downtime through drive or node failure) and efficient (it still performs well if such failures occur, without more redundant hardware than necessary). To get very cost-effective and durable storage, you might use a technology like erasure coding – let’s say in an 8:3 configuration (in a HyperDrive cluster, that’s the ability to lose 3 drives or nodes without losing data, at a storage efficiency of about 73%).
In that case, you’d want a minimum of 12 storage nodes to distribute resilience evenly across each node. That way you can lose drives or entire systems without losing your resilience, and if you lose one, the data can rebalance across the remaining 11 without sacrificing durability. You’d also want to plan for about 20% of headroom.
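The 8:3 figures above can be checked with a few lines of Python. This is only a sketch: the function name `ec_profile` is hypothetical, and the "one chunk per node plus one spare node for rebalancing" rule is an assumption read from the paragraph above, not a documented HyperDrive sizing rule.

```python
from fractions import Fraction


def ec_profile(data_chunks: int, parity_chunks: int, spare_nodes: int = 1):
    """Return (storage efficiency, minimum node count) for a k:m erasure-coding profile."""
    total_chunks = data_chunks + parity_chunks
    # Efficiency is the share of raw capacity that holds data rather than parity.
    efficiency = Fraction(data_chunks, total_chunks)
    # Assumption: one chunk per node, plus spare node(s) so the cluster can
    # rebalance after losing a node without giving up durability.
    min_nodes = total_chunks + spare_nodes
    return efficiency, min_nodes


efficiency, min_nodes = ec_profile(8, 3)
print(f"efficiency = {float(efficiency):.1%}, minimum nodes = {min_nodes}")
```

With 8 data chunks and 3 parity chunks the efficiency is 8/11, which is where the roughly 73% figure comes from, and the node count comes out at 12.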
The math looks roughly like this…
| Item | Value |
|---|---|
| Target usable capacity | 1PB (1,000TB) |
| Resiliency and efficiency profile | 8:3 ≈ 73% |
| Raw capacity needed | 1,000 ÷ 73% ≈ 1,370; 1,370 ÷ (1 – 20%) ≈ 1,713TB |
| Raw capacity per HD11144 | 144TB |
| Number of HD11144s required | 1,713 ÷ 144 ≈ 12 nodes |
| Usable capacity sanity check | 12 x 144 = 1,728 x 73% ≈ 1,261 – 20% ≈ 1,009TB |
In this case, the HD11144 provides a sweet spot – an ideal node count given the target capacity and data durability.
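For readers who want to rerun the sizing with their own targets, here is a minimal Python sketch of the same arithmetic. It assumes the 20% headroom is capacity held back from the usable pool (which is how the sanity check treats it), and the function and variable names are illustrative only.

```python
import math


def size_cluster(target_usable_tb, efficiency, headroom, node_raw_tb, min_nodes):
    """Return (raw TB needed, node count, usable TB actually delivered)."""
    # Raw capacity such that the target remains after erasure-coding
    # overhead and after reserving the headroom fraction.
    raw_needed_tb = target_usable_tb / efficiency / (1 - headroom)
    # Never go below the node count the durability profile requires.
    nodes = max(math.ceil(raw_needed_tb / node_raw_tb), min_nodes)
    # Sanity check: what the chosen node count really delivers.
    usable_check_tb = nodes * node_raw_tb * efficiency * (1 - headroom)
    return raw_needed_tb, nodes, usable_check_tb


raw_needed, node_count, usable_check = size_cluster(
    target_usable_tb=1000, efficiency=0.73, headroom=0.20,
    node_raw_tb=144, min_nodes=12,
)
print(f"raw needed: {raw_needed:,.0f} TB")
print(f"nodes: {node_count}")
print(f"usable after headroom: {usable_check:,.0f} TB")
```

Changing `node_raw_tb` to a smaller or larger appliance shows quickly whether capacity or the durability minimum ends up driving the node count.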
In addition to the storage nodes, you’ll likely want management and storage router nodes to provide remote management and monitoring and storage interfaces for your applications or users. SoftIron recommends those be independent devices because it improves predictability during cluster stress, though other vendors may collapse those functions into the storage devices.
Your total power footprint now looks like:
[Table: average power per appliance (in watts) and total power needed]
There’s an industry rule of thumb to allow 20% headroom in your power budget, which puts the total wattage required at 3,348W (about 2,790W of expected draw plus 20%).
If you want similar resilience from other SDS solutions using commodity hardware, you’d end up with a similar TB/appliance strategy. Those solutions typically don’t use separate management and router nodes, accepting less predictability under stress to save a few bucks on hardware, so let’s strip those out. That leaves 12 appliances that operate at roughly 3.3x the power consumption, basically 660W per node, or 7,920W for 1PB usable capacity. Factor in headroom, and you’re at 9,504W.
What we’re leaving out here is that those commodity servers often reach their capacity footprint by being taller, packing in more drives to offset the power consumed by the non-storage parts of the system. That means:
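The commodity power budget is simple enough to reproduce in Python. In this sketch the helper name `power_budget_w` is hypothetical; the 660W per node, 12 nodes, 20% headroom, and the 3,348W HyperDrive total are the figures quoted in this article.

```python
def power_budget_w(nodes: int, watts_per_node: float, headroom_pct: int = 20) -> float:
    """Total wattage to reserve: average draw across all nodes plus headroom."""
    average_draw_w = nodes * watts_per_node
    return average_draw_w * (100 + headroom_pct) / 100


commodity_w = power_budget_w(nodes=12, watts_per_node=660)
# Quoted above; already includes headroom and the management/router nodes.
hyperdrive_w = 3348

print(f"commodity budget: {commodity_w:,.0f} W")
print(f"HyperDrive budget: {hyperdrive_w:,.0f} W")
print(f"ratio: {commodity_w / hyperdrive_w:.2f}x")
```

Note that the ratio between the two budgets is a little lower than 3.3x, because the HyperDrive total also carries the separate management and router nodes.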
- more drives per node, leading to
- fewer nodes available to get to 1PB, which in turn means
- you may not have enough nodes to reach your data durability goal.
But for this exercise, let’s assume that commodity server vendors can get to a similar density as HyperDrive in a 1U form factor.
What customers and partners tell us, here in the US at least, is that a typical 30A 208V redundant circuit (meaning you get two separate circuits into the rack in case one fails) might cost from $700 to $1,000 per month if the co-location provider only charges for the circuit and not also for the power used. That means you pay up to $1,000 per month, regardless of how much you consume, for up to 6,240W of safe power (safe meaning that if you exceed that threshold, you could have power failures in your rack if one circuit fails).
The commodity equivalent for that 1PB usable storage capacity will require two redundant circuits because 9,504W is greater than the 6,240W available, whereas a HyperDrive solution would require one. That’s a potential additional $12,000 per year; over the average 5-year life of a storage system, that’s $60,000 of savings in power alone. And don’t forget that power in many data centers is in high demand and short supply. With the current economic situation, rising energy costs, and the rapid growth in data infrastructure, these costs (and savings) are likely only heading in one direction over the next few years.
It turns out that rethinking how data infrastructure should work, top to bottom and inside out, and rethinking how it gets designed means taking a harder path than the rest of the industry takes.
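To adapt the circuit comparison to your own provider’s pricing, a short Python sketch like the one below may help. It assumes circuit-only billing at the upper end of the quoted range ($1,000 per redundant circuit per month), the 6,240W per-circuit limit used above, and a 5-year system life; the constant and function names are illustrative.

```python
import math

CIRCUIT_AMPS = 30
CIRCUIT_VOLTS = 208
MONTHLY_COST_PER_CIRCUIT = 1000  # assumption: upper end of the $700 to $1,000 range
SYSTEM_LIFE_YEARS = 5


def circuits_needed(load_w: float) -> int:
    """Number of redundant circuits needed to stay within the per-circuit limit."""
    circuit_w = CIRCUIT_AMPS * CIRCUIT_VOLTS  # 30A x 208V = 6,240W
    return math.ceil(load_w / circuit_w)


def lifetime_circuit_cost(load_w: float) -> int:
    """Circuit charges over the life of the system, ignoring metered energy."""
    return circuits_needed(load_w) * MONTHLY_COST_PER_CIRCUIT * 12 * SYSTEM_LIFE_YEARS


hyperdrive_cost = lifetime_circuit_cost(3348)
commodity_cost = lifetime_circuit_cost(9504)
print(f"HyperDrive: {circuits_needed(3348)} circuit(s), ${hyperdrive_cost:,}")
print(f"Commodity:  {circuits_needed(9504)} circuit(s), ${commodity_cost:,}")
print(f"Difference over {SYSTEM_LIFE_YEARS} years: ${commodity_cost - hyperdrive_cost:,}")
```

Providers that also bill for metered energy would add a per-kWh term on top of this, which the sketch deliberately leaves out.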
Right-sizing all the pieces results in a leaner system. And that leads to greater power savings and greener products.
How would this example work in your region or with your co-lo providers? Let me know. I’m keen to hear how the model might be different elsewhere.