Here’s a great video from Backblaze discussing their storage pods:
Hope you enjoy!
We thought ten people would care; instead a million people read our Storage Pod 1.0 blog post where we open sourced the Backblaze Storage Pod design and introduced the world’s most cost-efficient way to store big data. The interest grew when we published our Petabytes on a Budget: Revealing More Secrets blog post that announced Storage Pod 2.0, which doubled the amount of storage and reduced the price. Since then several companies have built businesses selling Storage Pods inspired by Backblaze to hundreds of organizations around the world who are storing hundreds of petabytes of data on their own Storage Pods. Today we introduce Backblaze Storage Pod 3.0 which stores more data, costs less, is more reliable, and is easier to service.
For Storage Pod 3.0 we redesigned the chassis and upgraded many of the components. Most of the changes are aimed at improving the reliability and flexibility of the Storage Pod. The full parts list is in Appendix A at the end of this post.
Here are the highlights:
- 180 terabytes of storage. With the availability of 4 TB hard drives a Storage Pod can now be configured to store 180 TB (45 x 4 TB). As a bonus the same chassis and components can be used with any capacity of 3.5” hard drives.
- Anti-vibration drive bay assemblies. There are now 3 assemblies, one for each row of 15 drives. Each assembly is designed to lock down a row of drives in place. These assemblies replace the “drive bands” around each drive. This saves nearly an hour during Pod assembly and makes drive replacement easier as well. The key advantage of the drive bay assemblies is to reduce vibration. These assemblies not only keep the drives still, they also keep them firmly seated in the backplanes. Over the past several months we have tested different models of drives in the new drive bay assemblies and we have seen a dramatic improvement in overall system performance along with lower drive failure rates.
- Upgraded motherboard. We now use the Supermicro MBD-X9SCL-F motherboard which replaces the previous model. The new motherboard adds a host of advanced processing features from Intel, as well as upgrades the PCIe slots to double throughput. Note that while our currently specified SATA cards do not take advantage of this increased throughput, it’s nice to know we can use it in the future.
- More motherboard choices. We added standoffs to the chassis to provide better support for Micro ATX motherboards while still supporting the Standard ATX form factor. Specifically the new standoffs support the outer edge of the Micro ATX boards.
- CPU. We upgraded the CPU to a 2nd generation Intel Core i3-2100 processor to replace the end-of-life (EOL) i3-540 model. This also gets us a little bump in clock speed (3.06 to 3.1 GHz), lower power usage (65 versus 73 watts), and more supported RAM (32GB up from 16GB).
- Memory. We changed memory suppliers, so now the memory is certified by Supermicro.
- Boot drive options. With Storage Pod 3.0, boot drives can now be 2.5” or 3.5” and we allow a second 2.5” drive to be attached for use in a redundant RAID1 boot volume. Boot-up drives can also be traditional HD or SSD drives. Backblaze switched to 2.5” boot drives because they are less expensive and are more reliable, but we didn’t want to eliminate support for 3.5” drives in case anyone needs a higher capacity 3.5” drive.
- Backplanes. While we continue to use the same backplanes (Sil3726 chipset) as with Pod 2.0, there is another backplane based on the newer Silicon Image Sil3826 chipset that can be used. If you do use the Sil3826 based backplanes, you’ll notice that boot up can take a long time due to a large number of time-out/retry errors during the boot process. Eventually the boot up process will succeed. To fix this, you can use these instructions to update the backplane driver in your Linux kernel so you can use the Sil3826 based backplanes.
- SATA cables. We replaced our SATA cable vendor with Nippon Labs. If you decide to use another vendor, look for cables that are SATA II or SATA III compliant and test them extensively. While any SATA II or SATA III cable should work we have found quality control problems with a number of vendors. In all cases it was the connectors, not the cables, which were defective.
- Metal standoffs in the chassis. We replaced the plastic standoffs with metal components that can be manufactured as part of the chassis to reduce the cost and assembly time.
- Improved airflow. The vent design was improved to increase airflow through the pod. We’ve never really had a problem with heat in the pods, and we’d like to keep it that way.
- Chassis rivets. We replaced many of the screws with rivets. This simplifies the manufacturing process and saves time during the assembly process. Note, if you end up buying a case from Protocase (more on them later), they continue to use screws versus rivets. The two manufacturing processes produce the same basic chassis regardless.
- Costs Less. For Storage Pod 2.0, the price for the components without drives was $1,984.00. For Pod 3.0 the price is $1,942.59, or $41.41 less, or about a 2.1% decrease in the cost (see Appendix A for a parts breakdown). The main component in the total cost will be the hard drives. The lingering effects from the Thailand drive crisis and consolidation in the drive industry have meant that even today hard drive prices are higher than they were when Pod 2.0 was introduced back in July of 2011, but here’s one way you can save a little on the price of hard drives.
Backblaze currently has over 450 Storage Pods deployed and manages nearly 50 petabytes of data. There are also many Storage Pods being used by organizations around the globe. Along the way we’ve learned a few things:
- Firmware revisions matter. Watch out, manufacturers update hardware and upgrade firmware without changing their model number. These updates are intended to fix bugs but in the process new bugs can be introduced. Read the release notes carefully and downgrade the firmware to the version you’ve tested whenever possible. When updates are unavoidable, test them thoroughly before deploying them.
- Let your vendors do the testing. Components like memory, PCIe cards and hard drives are modular and should all play well together. Unfortunately, the specs that allow for this interoperability (PCIe, SATA, DDR3, etc) are as complicated as are the components themselves. Because of this, use ‘certified’ components whenever possible. This will minimize problems and avoid finger pointing between vendors if problems do arise.
- Don’t make random changes to our design. It might be tempting to try to build a pod out of the spare parts lying around your office but don’t do it. The components specified in the parts list in Appendix A are known to work well together. We believe in iteration and experimentation but don’t reinvent the wheel unless you have to.
- There is more to power than just Watts. ATX power supplies deliver power at several voltages or ‘rails’ (12V, 5V, 3.3V, etc). Each vendor imposes unique limits on the amount of power you can draw off of each rail and unused power on one rail cannot be used on another. In particular, most high end power supplies are designed to deliver most of their power on the 12V rail because that is what high end gamer PCs use. Unfortunately, hard drives draw a lot of power off the 5V rail and can easily overwhelm a high wattage power supply. You will hit serious problems if power requirements for each component are not met so be careful if you don’t use the power supplies we recommend.
- Keep it simple. For hard drives, SATA cards, SATA cables, backplanes, etc. you should use the same vendor, part number and/or model number as much as possible. By keeping things simple you reduce the number of variables that need to be considered if things go wrong. If you do mix components do it intentionally and for a good reason like comparing the performance of 3 different hard drives. And when you do this, make sure you’ve set things up in a way that allows you to draw clear conclusions. For example: with hard drives, if you want to compare the performance of 3 different models you should arrange them so that each RAID array is homogeneous. You should also take care to make sure each array is spread across all backplanes and SATA cards so that you don’t have IO hotspots which could taint your results.
- Things change. Just when you get comfortable with a part, it will be discontinued or upgraded. We buy in quantity, we buy spares, and we have substitutes ready to go at any time. We also look for parts that have long-term support policies. We realize you may not be able to do these things, so be prepared when you upgrade one component that something may break.
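The “Keep it simple” advice about comparing three drive models can be sketched in a few lines of Python. This is only an illustration, not Backblaze’s tooling: the model names are placeholders, the count of 9 backplanes is inferred from 45 drives on the 5-port backplanes listed in Appendix A, and the mapping of backplanes to SATA cards is left out because the post does not specify it. Assigning models round-robin by slot number gives each model 15 drives (one homogeneous array per model) and puts every model on every backplane.

```python
from collections import Counter

DRIVES_PER_POD = 45        # from the post: 45 drives per pod
PORTS_PER_BACKPLANE = 5    # from Appendix A: 5-port backplanes
# Inferred, not stated in the post: 45 drives / 5 ports = 9 backplanes.
NUM_BACKPLANES = DRIVES_PER_POD // PORTS_PER_BACKPLANE
# Hypothetical placeholder names for the three drive models under test.
MODELS = ["model_A", "model_B", "model_C"]


def plan_layout(models=MODELS):
    """Return one (backplane, port, model) tuple per drive slot."""
    layout = []
    for slot in range(DRIVES_PER_POD):
        backplane, port = divmod(slot, PORTS_PER_BACKPLANE)
        # Round-robin by slot number: any 5 consecutive slots cover
        # all 3 models, so every backplane hosts every model.
        model = models[slot % len(models)]
        layout.append((backplane, port, model))
    return layout


layout = plan_layout()

# Drives per model; each model forms one homogeneous RAID array.
drives_per_model = Counter(model for _, _, model in layout)

# Set of backplanes each model's array touches.
backplanes_per_model = {
    m: {bp for bp, _, model in layout if model == m} for m in MODELS
}

for m in MODELS:
    print(m, drives_per_model[m], "drives on",
          len(backplanes_per_model[m]), "backplanes")
```

With a layout like this, a slow backplane or cable affects all three arrays about equally, so a performance difference between arrays is more likely to come from the drives themselves.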
The Uses and Users of a Storage Pod
In general there are three types of storage:
- Transactional storage that provides real time or near real time access to data.
- Bulk storage which stores a large amount of data yet provides access to that data within seconds to minutes.
- Archive storage which keeps all of your data and often requires hours to days to access the data you need.
All are important. Our Backblaze Storage Pods were designed, built and implemented to economically hold large amounts of data yet provide access to that data in seconds to minutes, thus they are purpose-built for bulk storage. The economics of using a Backblaze Storage Pod for storage have driven many organizations and people to build and utilize Storage Pods inspired by Backblaze. This community has also built upon the Backblaze design and specifications to build Storage Pods for all types of storage needs. Here are a few examples:
- GINA, The Geographic Information Network of Alaska stores geospatial information.
- Vanderbilt University Institute of Image Science stores MRI scanner data for their research partners.
- Crispin Porter + Bogusky creates ads for companies like Old Navy, Best Buy, Coke Zero and others, and archives its media campaigns on a Storage Pod.
- Netflix created their Open Connect Appliances, inspired by Backblaze Storage Pods, to store videos closer to their customers.
- Rensselaer Polytechnic Institute, the NASA Jet Propulsion Laboratory, Heritage Auctions, the University of New Mexico and many more.
Storage Pod Economics
When we announced Storage Pod 2.0 in July of 2011, the price dropped by 15% and we doubled the amount of storage versus Storage Pod 1.0. So with Storage Pod 3.0, this should be the section where we would say something like “as hard drive prices have dropped, the economics have continued to improve.” We would say that, except that the Thailand drive crisis which began in October of 2011 raised hard drive prices dramatically. That event, and perhaps the continuing consolidation of hard drive manufacturers, has meant that hard drive prices have yet to return to their July 2011 levels. For example, the hard drive we specified in the Storage Pod 2.0 blog post was listed at $120 (Hitachi 3TB 5400 HDS5C3030ALA630). To purchase that same drive today you would pay $199 with very limited availability. To be fair there are other drives available that we use, but even the least expensive drive is $125 and that’s an external USB drive that must be ‘shucked’ before it can be used. With the increase in drive prices and decrease in availability, we have validated a variety of drives that work in Backblaze Storage Pods. Here is a list of the drives we have tested and currently use:
4 TB drives
- Hitachi – HDS5C4040ALE630 (Just starting to use these, but they look good.)
3 TB drives
- Hitachi – HDS5C3030ALA630 (low power)
- Hitachi – HDS723030ALA640 (high power)
- Western Digital – WDC WD30EFRX‑68AX9N0 (RED)
- Seagate – ST3000DM001‑9YN166 (slightly higher failure rate)
Another thing we expected to happen was wide availability of 4 TB drives with the subsequent price drop. Only recently have we been able to get a reliable supply of 4 TB drives and prices are starting to drop. While we have built and deployed 180 TB Storage Pods (forty-five 4 TB drives) and we expect to move to exclusively building 180 TB Storage Pods in the near future, the cost per TB for 3 TB systems versus 4 TB systems is nearly the same.
The table below compares the total cost of different Backblaze Storage Pod configurations.
|Description|Pod 2.0|Pod 3.0|Pod 3.0|
|---|---|---|---|
|Drive size|3 TB|3 TB|4 TB|
|Our cost per drive|$120|$125|$195|
|Number of drives|45|45|45|
|Total size in TB|135|135|180|
|Cost per TB|$54.70|$56.06|$59.54|
|Cost per GB|$0.0547|$0.0561|$0.0595|
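The cost-per-TB figures in the table can be reproduced with a short calculation. The sketch below assumes the total cost is the price of the components without drives quoted earlier in the post ($1,984.00 for Pod 2.0, $1,942.59 for Pod 3.0) plus 45 drives at the listed price. The function and variable names are illustrative only.

```python
def cost_per_tb(chassis_cost, drive_cost, drive_tb, drive_count=45):
    """Total pod cost (components plus drives) divided by raw capacity in TB."""
    total_cost = chassis_cost + drive_cost * drive_count
    total_tb = drive_tb * drive_count
    return total_cost / total_tb


# (components without drives in USD, cost per drive in USD, drive size in TB)
configs = {
    "Pod 2.0, 3 TB": (1984.00, 120, 3),
    "Pod 3.0, 3 TB": (1942.59, 125, 3),
    "Pod 3.0, 4 TB": (1942.59, 195, 4),
}

results = {name: round(cost_per_tb(*args), 2) for name, args in configs.items()}

for name, value in results.items():
    print(f"{name}: ${value:.2f} per TB")
```

Rounded to the cent, this gives $54.70, $56.06 and $59.54 per TB, matching the table. It covers purchase cost only and leaves out the rack space, electricity and labor that the next paragraph brings in.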
While it looks like a 4 TB drive system is more expensive, when you factor in rack space, electricity, installation labor, etc. the long term cost for Backblaze leans towards using 4 TB drives. Our monthly cost for a full rack of Storage Pods with 3 TB drives is $0.63 per TB, while a full rack of Storage Pods with 4 TB drives is $0.47 per TB. When you factor all the costs together, it takes about 5 months for us to recover the extra cost incurred when building 4 TB based Storage Pods.
Building Your Own Storage Pod
Since Backblaze has given away the design, you can build your own Storage Pod. You’ll want to start with the parts list in Appendix A. From there feel free to change out some of the components, redesign the case, or completely change out the design. It’s up to you, but pay attention to the “lessons learned” noted earlier. To help you out, here’s a nice screen shot assembly walk-through from Protocase you can review and here’s a Storage Pod assembly overview (PDF-1.5MB) that you can download. You may also find some fellow Pod Builders by visiting theopenstoragepod.org web site for information on their efforts to develop “affordable & energy-efficient high-capacity storage servers, built from commodity components”.
One thing to know is since we don’t sell the Storage Pods and don’t make money off any ancillary services, we don’t provide support or a warranty for people who build their own. To all of those builders who take up the challenge, we’d love to hear from you and welcome any insights you provide about the experience.
Buying a Storage Pod
If you are interested in buying a Storage Pod or having one built for you we recommend you talk to the folks at Protocase. They have built hundreds of Storage Pods. They can also help you with design changes for a reasonable fee. Also, their 45 Drives wiki site contains useful technical information about their Backblaze inspired Storage Pods.
As noted earlier Protocase continues to use screws versus rivets in the assembly process of their Storage Pods. From a functional point-of-view you will notice no difference in the finished product.
Another thing to know about Protocase is that they do not include hard drives with their Storage Pods. Protocase does test each Storage Pod they build with drives before they ship the unit to you, but it’s up to you to buy and install the hard drives. This is pretty easy to do, but be forewarned; sometimes buying 45 hard drives at Costco can be an issue.
Thanks for All the Fish
Thank you to everyone who has helped with ideas for how to improve the Storage Pods and to those who have helped spread the word on the reality of inexpensive big data storage. We welcome your continued feedback and support. Enjoy the design!
Appendix A – Price List:
The prices listed below are what we pay for the parts. To obtain these prices we do purchase them in quantity.
Includes case, anti-vibration assemblies, power supply bracket, etc. (Available in quantities of 1 from Protocase for more money)
Zippy PSM-5760V Power Supply
Vantec VDK-PSU Power Supply Vibration Dampener
Supermicro MBD-X9SCL-F (MicroATX)
CFI-B53PM 5 Port Backplane (SiI3726 Chipset)
4GB Samsung BDDR3-1333 PC3-10600 (Certified by Supermicro)
Intel Core i3 processor i3-2100
Mechatronics G1238M(OR E)12B1-FSR 12V 3-Wire Fan
Syba PCI Express SATA II 4-Port RAID Controller Card SY-PEX40008
SATA cables RA-to-STR 3 ft locking from Nippon Labs
FrozenCPU ele-302 Bulgin Vandal Momentary LED Power Switch 12″ 2-pin
WD SCORPIO BLUE WD1600BPVT 160GB 2.5″ Internal