Nimbus.io, from the makers of SpiderOak, is an inexpensive way to back up your data. Currently in beta, Nimbus.io doesn't do automated backups or syncing, but it gives you Amazon S3-style object storage to back up your data very securely.
$6 per 100 GB of purchased storage. Transfer in is always free; transfer out is $0.06/GB. PUT and LISTMATCH requests cost $0.01 per 1,000; other requests cost $0.01 per 10,000 or are free.
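The pricing above is easy to turn into a monthly estimate. A minimal sketch, using only the figures from this post (the per-item billing granularity is an assumption):

```python
# Rough monthly-cost estimate from the published rates. The function name
# and exact billing granularity are assumptions for illustration.
def nimbus_monthly_cost(stored_gb, transfer_out_gb, put_requests, other_requests=0):
    storage = (stored_gb / 100) * 6.00        # $6 per 100 GB purchased
    egress = transfer_out_gb * 0.06           # $0.06/GB out; transfer in is free
    puts = (put_requests / 1000) * 0.01       # PUT/LISTMATCH: $0.01 per 1,000
    other = (other_requests / 10_000) * 0.01  # other requests: $0.01 per 10,000
    return storage + egress + puts + other

# e.g. 500 GB stored, 20 GB restored, 10,000 PUTs:
# 30.00 + 1.20 + 0.10 = $31.30/month
print(nimbus_monthly_cost(500, 20, 10_000))
```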
- REST API similar to Amazon S3
- 100% open source hardware designs, client libraries, and server software
- Built by SpiderOak on the same proven backend storage network that powers hundreds of thousands of backups
- Focused on high throughput and reliability, trading away low latency for cost effectiveness
Nimbus.io uses parity striping to spread data across many servers redundantly – providing bulk storage with equal reliability and less than half the overhead of replication based storage systems.
The API is similar to S3 or Rackspace Cloud Files – favoring JSON over XML. For easy portability, we supply libraries to use Nimbus.io as drop-in alternatives for the most popular libraries to access competing cloud storage platforms.
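The post doesn't show the wire format, so the sketch below is purely illustrative of what an S3-style, JSON-favoring REST API looks like. The host, path scheme, and auth header are all invented for this example, not Nimbus.io's real endpoints:

```python
# Hypothetical request shape for an S3-like object store. Every name here
# (host, path layout, header values) is an assumption for illustration.
import json

def build_archive_request(collection, key, body):
    # S3-style APIs address objects by a collection/bucket plus a key.
    return {
        "method": "POST",
        "url": f"https://example-storage.invalid/data/{collection}/{key}",
        "headers": {
            "Authorization": "HMAC <signature>",   # placeholder, not a real scheme
            "Content-Length": str(len(body)),
        },
        "body": body,
    }

req = build_archive_request("backups", "2011-11/photos.tar", b"...")
print(json.dumps({k: req[k] for k in ("method", "url")}, indent=2))
```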
The server and client components are all free and open source software. The server software is made available under the AGPL license; client libraries are available under the LGPL.
As 37signals famously described, in the software business we almost always create valuable byproducts. To build a privacy-respecting backup and sync service that was affordable, we also had to build a world class long term archival storage system.
We had to do it. Most companies in the online backup space (including BackBlaze, Carbonite, Mozy, and SpiderOak, to name a few) have made substantial investments in building internal systems to cost-effectively store data at massive scale. Those who haven't, such as Dropbox and JungleDisk, are not price competitive per GB and put their efforts into competing on other factors.
Long term archival data is different from everyday data. It's created in bulk, generally ignored for weeks or months with only small additions and accesses, and restored in bulk (and then often in a hurried panic!).
This access pattern means that a storage system for backup data ought to be designed differently than a storage system for general data. Designed for this purpose, reliable long term archival storage can be delivered at dramatically lower prices.
Unfortunately, the storage hardware industry does not offer great off-the-shelf solutions for reliable long term archival data storage. For example, if you consider NAS, SAN and RAID offerings across the spectrum of storage vendors, they are not appropriate for one or both of these reasons:
- Unreliable: They do not protect against whole machine failure. If you have enough data on enough RAID volumes, over time you will lose a few of them. RAID failures happen every day.
- Expensive: Pricey hardware and high power consumption. This is because you are paying for low-latency performance that does not matter in the archival data world.
Of course #1 is solvable by making #2 worse. This is the approach of existing general purpose redundant distributed storage systems. All offer excellent reliability and performance but require overpaying for hardware. Examples include GlusterFS, Linux DRBD, MogileFS, and more recently Riak+Luwak. All of these systems replicate data to multiple whole machines, making the combined cluster tolerant of machine failure at the cost of 3x or 4x overhead. Nimbus.io takes a different approach, using parity striping instead of replication, for only 1.25x overhead.
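The overhead arithmetic behind those numbers is simple. A minimal sketch, where the 8+2 stripe geometry is an assumption chosen to match the 1.25x figure (the post doesn't describe the actual cluster layout):

```python
# Raw storage consumed per byte of user data, for replication vs parity
# striping. Stripe geometry (8 data + 2 parity) is an assumed example.
def overhead(data_segments, parity_segments):
    return (data_segments + parity_segments) / data_segments

# 3-way replication: each byte stored on 3 machines, survives 2 failures.
print(overhead(1, 2))   # 3.0x

# Parity striping, 8 data + 2 parity segments across 10 machines:
# also survives any 2 machine failures, but at far lower overhead.
print(overhead(8, 2))   # 1.25x
```

Both schemes tolerate the same number of whole-machine failures in this example; parity striping simply pays much less raw storage for that tolerance.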
Customers purchasing long term storage don't typically notice or care whether a transfer starts in 0.006 seconds or 0.6 seconds. That's two orders of magnitude of latency. Customers care greatly about throughput (megabytes per second of transfer speed), but latency (how long until the first byte begins moving) is not relevant the way it is when you're serving images on a website.
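A back-of-envelope calculation makes the point concrete. The 10 GB restore size and 5 MB/s sustained throughput below are assumed figures; only the two latency values come from the text:

```python
# How much does first-byte latency matter for a bulk restore?
# Restore size and throughput are assumptions for illustration.
size_mb = 10 * 1024          # 10 GB expressed in MB
throughput_mb_s = 5.0        # assumed sustained transfer rate
transfer_s = size_mb / throughput_mb_s   # 2048 seconds of raw transfer

for latency in (0.006, 0.6):
    total = latency + transfer_s
    print(f"first-byte latency {latency}s -> total {total:.1f}s")

# Both totals are ~2048 s: a 100x latency difference changes the
# customer's wait by well under 0.1%.
```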
Meanwhile the added cost to support those two orders of magnitude of latency performance is huge. It impacts all three of the major cost components – bandwidth, hardware, and power consumption.
A service designed specifically for bulk, long term storage, without low-latency guarantees, is easily less than half the cost to provide.
Since launching SpiderOak in 2007, we’ve rewritten the storage backend software four times and gone through five different major hardware revisions for the nodes in our storage clusters. Nimbus.IO is a new software architecture leveraging everything we’ve learned so far.
The Nimbus.io online service is noteworthy in that the backend hardware designs and software are also open source, making it possible either to purchase storage from Nimbus.io, similar to S3, or to run storage clusters locally on site.
If you are currently using or planning to adopt cloud storage, we hope you will give Nimbus.io some consideration. Chances are we can eliminate 2/3 of your monthly bill.