Like many people, I keep most of my records, pictures, memories, and important documents in digital format. Keeping everything digital is far more efficient, flexible, and lossless (in the sense that pictures don’t age). But this flexibility comes at the cost of being much more fragile overall. Live data - data that is online and accessible - is as vulnerable in most home environments as a file cabinet sitting next to a big glass window off a busy street. The wall between your data and disaster is very thin.
Digital records and data are incredibly easy to store in multiple locations, much easier than physical records. A duplicate of every record you have, every picture, etc… can be made for virtually nothing, and in some cases in a matter of minutes. By not leveraging that capability, you are skipping one of the best perks of this digital age. Today your records and data are much more exposed, both to attack by anyone with the intent and determination, and to mistakes by the end user.
Your Cloud Drive is not Backed up:
Just because your data is already in your Dropbox, OneDrive, Box, etc… account doesn’t mean it’s safe. Those accounts are still a single point of failure for your data in many cases, and are actually more open to attack and exploit than your on premises data. They should be backed up just like anything else.
Malware wake up call:
Cryptolocker Malware: A virus that locks your files and requires you to pay to unlock them; when the clock runs out, it erases everything you had.
One wrong move, whether by mistake or by malware, could wipe out a lifetime of pictures and records. With so many “Data locker” viruses and malware (like Cryptolocker) going around these days, your precious data could be locked or encrypted forever unless you pay some Russian mobster hundreds or thousands of dollars for the decryption key. This is where having backups of your data comes into play. It’s insurance against you or someone else destroying the most important data we all keep and archive these days.
Synolocker - a NAS specific malware:
Some of the new Malware is able to focus attacks on NAS (networked hard drives) specifically as well. No Antivirus on your PC will protect your NAS device in that case.
If you have backups of your data, you could effectively just ignore these malware demands, restore the data, and delete the malware encrypted files (provided you have removed the malware, that is).
If your backups are online and accessible via the same account, there is a good chance they could be compromised or blocked from you as well. That is why you need to consider an “air gap”, a security firewall (2 factor auth for accessing backups), or offline storage to keep backups protected from real time attacks like these data locker malwares.
Like any insurance, backups are something you hope never to need, but you keep them just in case. Backing up your data is, at a simple level, just making copies of your data in another location, much like making copies of important physical documents and storing some in a safety deposit box off site.
Backups come in many forms. You can backup to a USB drive, second hard drive, optical disc, tape, cloud, etc… But there are a few main categories of backup:
- On Site, Online - Backed up data is stored in the same physical location, and at the same time is currently connected and accessible via your network or systems. (Example: Second hard drive in your PC, always running and accessible)
- On Site, Offline - Backed up data is stored in the same physical location, but is only mounted or online while the actual backup is being performed, after that it is offline, disconnected and stored. (Example: Burned DVD, then stored next to your PC, or a USB drive you back up to now and then but keep in your house.)
- Off Site, Online (ish) - Same as On Site, Online, just in a different physical location, but still accessible. (Example: Backup to another office location that remains online and is updated continuously. Dropbox, OneDrive, and other services that sync files from on site fall here too: the copy is always available for read/write, and can be erased, with no special security.)
- Off Site, Offline (ish) - Same as On Site, Offline, but in a different physical location. This is the highest level of resiliency for backed up data. (Example: Backup to a DVD or disk, then move that disk off site after the backup is complete. Also a cloud service that lets you do snapshot or incremental backups and retains previous backup images for a time period. These are typically encrypted and require special logins to access any data already uploaded.)
Most home users these days back up to an on site solution (if they back up at all), with USB drives, NAS devices, or internal hard drives being the primary backup location. These are good for general protection of data, but in the case of a fire/flood/theft/natural disaster, most of them are prone to the same data loss as having no backup. You can mitigate the risk with fire safes, or even the fireproof USB drives that are out there. But those devices are still prone to theft, and even BitLocker encryption has been shown to be vulnerable to attack in some cases on those devices if not configured properly.
Fireproof USB drive - iOSafe:
To the Cloud!
With the advent of relatively cheap cloud storage many consumer level cloud backup solutions are coming into play. Companies like Carbonite, CrashPlan, and Mozy are offering cost effective consumer (and enterprise in some cases) cloud backup solutions. These are cheap enough and easy enough in some cases to replace what someone would typically backup onto USB drives or DVDs.
Example of Consumer pricing:
These products are easy to use, and can be considered Off Site and Offline backups (in some cases; other times they are still very much online and accessible), giving you the best of all worlds in terms of protection. The downside of many of these systems is that they don’t scale well. These solutions are meant to protect 10’s or 100’s of GB of data for a monthly fee in most cases (Example: Mozy Home offers 125GB for $10/mo, 20GB extra for $2/mo). That is great for most home users today, but as cameras get more powerful, images take up more space, and soon avid photographers may find themselves outrunning what the “consumer” cloud solutions can offer for space. This is where you need to start to become strategic in your backup plans.
Backing up everything to the cloud isn’t feasible for most people, so prioritizing which data you back up to which location becomes important. Just as you don’t buy the most expensive insurance plan you can get, you weigh the risk and cost into the equation.
The basic rule is - Nothing important should reside only in one physical location, whether that be a single PC, or in a single residence. Malware, Fires, Floods, Tornados, Hurricanes, Burst pipes, Pet accidents, etc… are all generally capable of removing all your most valuable items in one quick move. So might as well keep a copy somewhere else.
Some basic things I use to prioritize data I back up:
- Is the data replaceable - (Ex: OS installs, downloadable apps, downloadable app data, etc…) - Don’t Backup.
- Is the data already copied in multiple locations or cloud solutions - (Ex: iTunes Match, Amazon Cloud Player for music, One Drive or Drop Box for other forms of data.) - Not important for off site backups as the data already resides off site and in a cloud or another datacenter. In this case, backing up to an on site (or another cloud) but offline system could be a good idea.
- Is the data not replaceable and is important - (Ex: Photographs, records, and documents) - Backup, off site and off line.
- Is the data semi-important and replaceable, but time consuming to replace? - (Ex: Purchased Downloaded Movies and Music, scanned pictures, archived data from other mediums) - Backup if you feel the time to recover/download the lost data again is more of a hassle than the cost to backup the data.
As you can see above, there is no straightforward and universal recommendation for how to back up everything. Some things aren’t worth the cost, some things are, and it is largely a personal decision based on risk, time, and cost.
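The prioritization rules above can be sketched as a small decision function. This is only an illustration of the reasoning, not a tool: the `DataSet` fields and the wording of each recommendation are hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    """Hypothetical description of one pile of data (fields are assumptions)."""
    name: str
    replaceable: bool              # can it be re-downloaded or reinstalled?
    already_offsite: bool = False  # does a copy already live in a cloud service?
    slow_to_replace: bool = False  # replaceable, but a hassle to rebuild


def backup_plan(data: DataSet) -> str:
    """Map a data set to one of the four rules in the list above."""
    if data.already_offsite:
        # Already in a cloud or another datacenter: add an offline copy instead.
        return "on site (or second cloud) offline copy"
    if not data.replaceable:
        # Photographs, records, documents.
        return "back up off site and offline"
    if data.slow_to_replace:
        # Purchased movies, scanned pictures: a cost vs. hassle judgment call.
        return "optional: back up if re-creating it costs more than storing it"
    # OS installs, downloadable apps.
    return "don't back up"


for item in (
    DataSet("family photos", replaceable=False),
    DataSet("OS install", replaceable=True),
    DataSet("music in a cloud locker", replaceable=True, already_offsite=True),
    DataSet("purchased movies", replaceable=True, slow_to_replace=True),
):
    print(f"{item.name}: {backup_plan(item)}")
```

The order of the checks is a design choice for this sketch; where a data set matches more than one rule, your own sense of risk and cost should win.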
On Site storage is largely purchased in bulk, with one time payments. You buy a NAS device, you buy a USB disk, etc… These are one time purchases, and you use as much of the storage as you can, then buy more when you need it.
Off Site Cloud storage is much more granular, and you generally pay for what you use. So if cost is a major concern you need to organize your data you want to backup very carefully to not use more cloud storage than you need.
Getting the data up to a cloud provider means you need to (in most cases) upload the data over your internet connection. This is a bigger hurdle than one may think. Let’s take the Mozy 125GB example from above and look at how this comes into play in the consumer market.
Most consumer internet connections have bandwidth caps in place, ranging between 200GB/mo and 500GB/mo. After that you are charged a higher rate, or your speeds drop down.
Also consider that your upload speeds are a fraction of your download speeds. Many consumer connections are limited to 1 - 5Mb/sec upload, with most being at the lower end of that range. Many consumer connections also suffer from congestion when all the upload speed is being used, so even while the download stream is clear, the heavy upload seems to lock up the connection.
Upload times can be measured in weeks or months depending on your connection, and this is at 7Mb/sec up (picture of Veeam Cloud Edition):
The Bad news:
So if you were to back up 125GB of data at 2Mb/sec, that would take approximately 7 days to run, and could drag your connection to the ground while it is happening. This could also potentially (depending on how your provider measures your bandwidth consumption) use half your monthly allocation.
You can throttle most of the backup apps to only back up on certain schedules or only use so much bandwidth.
Not many consumer level backup solutions offer what is called “pre seeding” which lets you mail them a disk to do the initial replication of data.
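The upload estimate above is easy to check with a few lines of arithmetic. This is a rough sketch that assumes decimal units (1GB = 10^9 bytes, 1Mb = 10^6 bits) and a link running flat out with no protocol overhead; the function names are made up for the example. Under those ideal assumptions 125GB at 2Mb/sec comes out a little under six days, and real world overhead on a shared connection would push that toward the week quoted above.

```python
def upload_days(size_gb: float, uplink_mbps: float) -> float:
    """Ideal transfer time in days for size_gb gigabytes at uplink_mbps megabits/sec."""
    bits = size_gb * 1e9 * 8                  # gigabytes -> bits
    seconds = bits / (uplink_mbps * 1e6)      # bits / (bits per second)
    return seconds / 86_400                   # seconds -> days


def share_of_cap(size_gb: float, cap_gb: float) -> float:
    """Fraction of a monthly bandwidth cap that one full upload would consume."""
    return size_gb / cap_gb


for mbps in (1, 2, 5):
    print(f"125GB at {mbps}Mb/sec: {upload_days(125, mbps):.1f} days")
# 125GB at 1Mb/sec: 11.6 days
# 125GB at 2Mb/sec: 5.8 days
# 125GB at 5Mb/sec: 2.3 days

for cap in (200, 500):
    print(f"125GB against a {cap}GB cap: {share_of_cap(125, cap):.0%}")
```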
The Good News:
Almost all cloud backup providers do block level backups, which means that on each run only the changed portions of files, or whole new files, are backed up. So after the initial large backup (7+ days in this case), your weekly or daily backups could be much smaller. The more often you back up, the smaller the uploads become.
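To make the block level idea concrete, here is a minimal sketch of one common way to detect changed blocks: split the file into fixed size blocks, hash each block, and upload only the blocks whose hash differs from the previous run. This is an illustration under my own assumptions, not how any particular provider implements it; the tiny `BLOCK_SIZE` and the function names are chosen only to keep the demo readable.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; absurdly small so the demo is easy to follow


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """SHA-256 digest of each fixed size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Indexes of blocks in `new` that are new or differ from the same block in `old`."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        index
        for index, digest in enumerate(new_hashes)
        if index >= len(old_hashes) or old_hashes[index] != digest
    ]


previous = b"aaaabbbbcccc"
print(changed_blocks(previous, b"aaaaXbbbcccc"))      # [1]  one byte edited in block 1
print(changed_blocks(previous, b"aaaabbbbccccdddd"))  # [3]  one block appended
```

Only the listed blocks would need to go over the wire, which is why the runs after the initial upload are so much smaller.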
So an avid photographer with hundreds of GB of images and data to backup, may need to consider business class internet connections, and enterprise grade backup solutions.
Business class internet connections are getting cheaper and faster every year, with 10Mb uploads as low as $100/month in my area. With Gbit fiber coming to more locations in the coming years, this bottleneck may not be an issue much longer.
Big Data = Big Money
If you are a data pack rat like I am, you may look at some of the consumer backup solutions and shake your head. Because when you start getting to data sizes measured in terabytes, the game changes dramatically.
You are now playing in the realm of enterprise solutions and enterprise level costs. Most of the consumer players mentioned earlier have business level solutions that scale, but at business/enterprise level costs.
Mozypro Business pricing (as an example since we used Mozy above):
As you can see, for 1TB Mozy business grade is over $400/month. That is extremely significant and shows that these solutions tend to not scale well. They come packed with features, from enhanced management, to easy restore, etc… but at an extreme cost.
There are other, more out of the box and custom options out there that may fit better for this level of customer.
To the BIG cloud! (on the cheap… kind of…)
Amazon S3, Microsoft Azure, RackSpace, Amazon Glacier, etc… These are intimidating names generally reserved for the developer or enterprise circles.
In reality, though, they are the biggest cloud storage organizations on earth. (Microsoft stated back in 2011 that they were already adding 1 Petabyte a week of storage to Azure, and that was when Azure was a fraction of its current size.) Because they buy so much storage, and really just offer it as bulk storage with few native features, it’s incredibly cheap.
How cheap? 1TB of Microsoft Azure Blob Storage is $24/mo, Amazon S3 $30/mo, Amazon Glacier is $10/mo
The issue? Tools to actually access it. The Mozys of the world package their storage (which… is largely actually stored on Amazon and Microsoft storage on the back end) in easy to use tools they support, and you pay for that. These cheap, massive cloud storage providers offer no (or only limited) native tools around their storage solutions.
One exception is Microsoft Azure - they do offer a native plugin / Backup tool for Windows Server 2008 and 2012 to back up direct to Azure Cloud storage. But this backup isn’t to the cheap storage we are talking about here. That agent backs up to special storage in Azure called the “Backup Vault”, a managed and specialized storage area that is 10x more expensive than bulk storage. The agent is free for any Windows Server instance, but per 1TB you will pay $290/mo for it. So for the purpose of this, we are not counting that as a cheap option. It’s more in line with the MozyPro example above, only with fewer features.
In terms of Amazon Glacier, the primary method of storing data in that cloud is by shipping them a disk to load into the storage system. But they also do have a direct connect feature as well that allows 3rd party tools to upload and sync. Glacier is meant for long term records retention (keeping the records on ice… get it.. glacier… ice! )
Azure and S3 also have similar import/export physical disk capabilities, but the primary way to use them is to upload, and to have the backups updated via online transactions over the internet.
Amazon S3 and Microsoft Azure both have options for local redundant and geo redundant replication. Meaning your data will either be stored in one DC or spread around the country. (both have different pricing for those tiers)
Discussions about enterprise cloud storage will need an entirely separate post because they all have so many different features and capabilities. For the scope of this, I am going to leave it here and just say they are big, cheap, and straightforward purpose built systems.
Potential Hidden Costs:
One aspect of the a la carte pricing in these large cloud systems is a potential “hidden” or unexpected cost: data transfers. For most big cloud providers, uploads into the cloud are free. Downloads (Egress / Restore) cost money though. In theory this will only come up when restoring from a backup. In the case of Microsoft Azure (Amazon S3 is the same price) it is close to $120 if you choose to pull 1TB of data back from the cloud.
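To see how a restore changes the bill, here is a small sketch that combines the per TB storage prices quoted earlier with the roughly $120 per TB download charge. The prices are the ones from this post and will have changed since; the function and dictionary names are made up for the example, and it assumes cost scales linearly with the amount stored.

```python
# Prices quoted in this post, in dollars (assumed flat per TB).
STORAGE_PER_TB_MONTH = {"Azure Blob": 24, "Amazon S3": 30}
EGRESS_PER_TB = 120  # approximate cost to download 1TB back out


def yearly_cost(provider: str, tb_stored: float, full_restores: int = 0) -> float:
    """Twelve months of storage plus the egress for any full restores that year."""
    storage = STORAGE_PER_TB_MONTH[provider] * tb_stored * 12
    egress = EGRESS_PER_TB * tb_stored * full_restores
    return storage + egress


for provider in STORAGE_PER_TB_MONTH:
    quiet_year = yearly_cost(provider, tb_stored=1)
    bad_year = yearly_cost(provider, tb_stored=1, full_restores=1)
    print(f"{provider}: ${quiet_year:.0f}/yr with no restore, ${bad_year:.0f}/yr with one")
# Azure Blob: $288/yr with no restore, $408/yr with one
# Amazon S3: $360/yr with no restore, $480/yr with one
```

A single full restore costs about as much as four or five months of storage, which is worth knowing before the day you need it.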
Amazon and Azure offer “Export” services to throw data onto a disk and mail it to you for $40/disk in most cases. But your backup solution may not know how to handle data that has been delivered on a local disk.
Now let’s talk about those 3rd party tools.
Tools to use the cheap clouds:
Every enterprise backup provider is starting to incorporate some cloud backup solutions, generally to their own datacenter or offering, similar to the Microsoft Azure backup agent mentioned above. These native tools come at a premium cost, and are bundled into the enterprise management tools you know and love.
But there are some challengers in the market that are making this cost effective, and strangely easy.
One of the leaders in this space is Cloudberry labs. (I am not compensated by Cloudberry, nor do I have any involvement with them; their product just fits this discussion, and after lots of testing, I’ve found it to be one of the better backup tools to the cloud.)
Cloudberry makes a backup tool that is actually licensed and rebranded by other backup solutions, like Veeam Cloud edition.
This tool will connect to almost any bulk storage cloud on the web (local and FTP included), and allow encrypted, block level, managed backups.
Pricing around it (direct from Cloudberry) has options from Desktop users, to server users:
Licensing for these tools is generally based on the devices you are backing up. In the case of the desktop, it’s per desktop. Same for server. These licenses are perpetual, so the one time cost of $30 for desktop backups, plus Azure storage, gives you backup capabilities for roughly 1/20th the price of what that MozyPro was offering.
But with different features as well. The expensive MozyPro and other embedded options give you stuff like management and restore from a mobile device, enhanced web management, etc… This tool is pretty simple and bare bones, but in the end, it does exactly what it says.
The Cloudberry Lab based Veeam Cloud Backup:
This tool allows you to set bandwidth schedules and limits, encryption levels (Up to AES 256), Compression, SSL uploading, and after the initial sync, it will do block level changes and can do it real time as changes occur in data.
In the case of the Veeam option, this is licensed as part of their Backup and Replication suite, and in my case is backing up my Veeam Backup images. But it can backup raw files direct as well.
Encryption of your backups?
Encryption of your backup data is more important than many may think. Many consumer products tout their military level encryption, and in most cases it’s all done on the cloud side without any end user intervention.
But as with any encryption, your data is only as secure as its key is private.
If you use Amazon S3 or Microsoft Azure and your account gets compromised, anyone can download your stored data. If it’s not encrypted, they can just open it and read it. This is one reason many recommend not only keeping 2 factor authentication on your cloud service accounts, but also encrypting your data before or at the time of upload.
The Cloudberry tool will not only encrypt the uploaded data, but will also encrypt the file names making the objects in the cloud file system look like gibberish if anyone were to gain access.
Below is an image of a storage container I am backing up into in Microsoft Azure. Note all the file names are complete garbage, and no one knows what they are.
The decryption key, which is required for any restore from Azure to the Cloudberry tool (Veeam in this case), is stored separately and offline. I can control where that key resides, and the data is useless in Azure without it.
So if someone were to compromise this storage area of Azure, worst thing they can do is delete the data, and I need to re-upload it.
Wrap up (for now):
Every single person who has important, or sensitive data stored in only one location (a laptop, NAS device, Drop Box, One Drive, etc…) needs to have some kind of backup strategy. It is far too easy for an exploit or a misplaced key stroke to permanently delete or damage data these days.
As data keeps piling up, and the need for backups grows in all segments of computer and data users, options are everywhere and for all sizes. If you want easy, highly manageable, “one throat to choke” kind of offsite and offline backup solutions to the cloud, then solutions like Carbonite, Crash Plan, Mozy, etc… are your solutions. But they all come at a cost.
If you are more inclined to get your hands more dirty, and play with the big boys with enterprise grade tools and systems, then look at using some of the big cloud players and the associated 3rd party tools. They can be far cheaper, but if you aren’t careful in what you use, or how you use it, that cost savings can evaporate.
In the end, it doesn’t matter what you use to back up important data; the most critical thing is to know which of your data is actually important. Do the work beforehand to identify what you need to back up and where it should go, then look for solutions to fit your needs and budget. Because no one wants to be sitting in the middle of a data disaster and wondering if they actually backed everything up.
Oh and one last thing - test your restores. Before you ever start putting data in any backup solution, do a test, so you know what you will need to do to restore down the road. No one wants to learn on the fly in the middle of a crisis.
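One simple way to check a test restore is to compare checksums of the original files against the restored copies. The sketch below assumes you have restored into a separate folder; `verify_restore` and the folder paths in the usage note are hypothetical names for illustration, not part of any backup product.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1MB chunks so large files don't fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_dir: Path, restored_dir: Path) -> list:
    """List every original file that is missing from, or differs in, the restore."""
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(source) != file_digest(restored):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

Calling something like `verify_restore(Path("Pictures"), Path("restore-test/Pictures"))` should return an empty list when the restore is complete; anything it does list is worth chasing down before you trust the backup.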