Backups: Drives, NAS's, RAIDs, & Clouds
A good backup strategy is one of the most important things you can implement as a computer user, regardless of the work you do. Backups will save you in the inevitable event of a drive or system failure. The benefits of a good backup strategy outweigh the costs, even if you never have to use it. Backups give you an easy, time-saving way to recover your files after a failure, which should also bring you peace of mind knowing that your data is secure.
What makes a good backup strategy?
There is a saying in the IT world that if your data doesn’t exist in at least 3 places, it doesn’t exist at all. Another idea you may have heard is “two is one, one is none.” These ideas get expanded upon with the “3-2-1 rule”, a rule of thumb that states that you should have at least 3 different copies of your data, on at least 2 separate devices or storage media, with at least 1 of these being off site (such as stored at a friend’s house or in the cloud). This ensures that your data is protected whether you face a single drive/system failure or a studio-wide fire.
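As a quick way to sanity-check a plan against the 3-2-1 rule, here is a minimal Python sketch. The `Copy` class, the `satisfies_3_2_1` function, and the device labels are all hypothetical names made up for illustration, not part of any backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    device: str    # hypothetical label for the drive or service holding this copy
    offsite: bool  # True if the copy lives away from your primary location


def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule of thumb."""
    enough_copies = len(copies) >= 3
    enough_devices = len({c.device for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_devices and has_offsite


# Example plan (assumed setup): working copy, external drive, cloud service
plan = [
    Copy("laptop-internal", offsite=False),
    Copy("external-drive", offsite=False),
    Copy("cloud-provider", offsite=True),
]
print(satisfies_3_2_1(plan))
```

Dropping the cloud copy from the list, or keeping every copy on the same device, makes the check fail.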
Some popular small-scale solutions for backups include external drives, a NAS system, and cloud-based storage.
External drives come in a variety of sizes, speeds, and form factors. (See “Which Storage Drive Should I Buy?” for more information). You "could" use one hard drive and set up a partition on it to designate a set amount of space for backups, but I would not recommend this because it stores the primary data and the backup on the same disk. I recommend getting a separate drive specifically for backing up your data.
Consider how drives usually fail: hard drives usually fail due to mechanical issues, typically from wear and tear or user error (such as dropping a drive). Solid state drives also wear out, but in a different way: their flash memory cells can only be written a limited number of times. For most use cases, users do not get close to this limit, but a drive that is constantly rewritten by a backup system can reach it sooner than one might think. Both drive types are also susceptible to outside events such as floods, fires, data corruption, viruses, and electrical issues, all of which can also cause your drive to fail.
With a NAS (network attached storage) unit, you have a separate computer that holds several drives and is connected to your network. The drives inside can be arranged in a variety of RAID formats, which essentially determine how the computer stores data on the drives and how much redundancy it keeps across them.
! With NAS’s and RAID systems, or any time you are using a server with multiple hard drives spinning in close proximity to each other, it is a good idea to use server grade drives. These are built to a higher standard and are able to withstand the vibrations that such a setup generates. This will also help protect your data.
RAID stands for Redundant Array of Inexpensive Disks (the “I” is also commonly expanded as “Independent”), and it works exactly as it sounds. In a RAID array, the inexpensive disks (or drives) are arranged in a way that usually provides data redundancy, so if one drive fails you can replace it without losing any data. This is incredibly beneficial, as all drives WILL fail at some point, but this allows them to fail with minimal risk of losing any data.
RAIDs can be configured in a variety of formats, designated as the word RAID followed by a number:
RAID 0 is your fastest possible option for spreading data across drives. Essentially, it can roughly double, triple, or quadruple your total bandwidth, depending on how many drives you have. This practice of spreading data “horizontally” across multiple drives is known as “striping” and is great for your NVMe SSD sample drives, as the computer will read the multiple drives as one large drive with multiple lanes of data transfer rather than just one. The downside of RAID 0 is that it offers NO data redundancy, so if one drive fails you lose everything.
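To make striping a little more concrete, here is a toy Python sketch that deals a byte string out across several "drives" in round-robin chunks and then puts it back together. The function names and the 4-byte chunk size are assumptions for illustration only; real RAID 0 works on much larger blocks and is handled by the controller or operating system.

```python
def stripe(data: bytes, n_drives: int, chunk: int = 4):
    """Split data into chunks and deal them out to the drives in turn."""
    drives = [bytearray() for _ in range(n_drives)]
    for i in range(0, len(data), chunk):
        drives[(i // chunk) % n_drives] += data[i:i + chunk]
    return [bytes(d) for d in drives]


def unstripe(drives, chunk: int = 4) -> bytes:
    """Read the chunks back in the same round-robin order."""
    out = bytearray()
    offsets = [0] * len(drives)
    turn = 0
    while any(off < len(d) for off, d in zip(offsets, drives)):
        d = turn % len(drives)
        out += drives[d][offsets[d]:offsets[d] + chunk]
        offsets[d] += chunk
        turn += 1
    return bytes(out)


data = b"sample library data!"
drives = stripe(data, n_drives=3)
for number, contents in enumerate(drives):
    print(number, contents)      # each drive holds only fragments of the file
print(unstripe(drives) == data)  # all drives together give the file back
```

Because every drive holds only scattered fragments, losing any one of them leaves gaps throughout the file, which is why a single failure in RAID 0 takes everything with it.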
RAID 1 is an exact copy of all the data on one drive to another, so drive A is exactly the same as drive B. This is likely the most similar to how most people would utilize a backup drive, though it is automated, so it is much more efficient and accurate. The practice of exactly copying data from one drive to another is known as “mirroring”. RAID 1 is usually best when read speed or reliability is more important than write speed or storage capacity. For example, it would be great as an archive system for your personal photos and documents.
RAID 5 introduces us to the concept of “parity”, which is very important for protecting data. Parity is extra data, stored on a different drive than the data it protects, that can be used to reconstruct lost data if needed. Think of it like algebra: if your data is 1 + 4 = 5, and one drive fails, that will leave you with the equation 1 + X = 5. The computer can figure out the value of X using the parity information and rebuild the original equation of 1 + 4 = 5.
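In practice, RAID 5 parity is commonly computed with XOR rather than addition, but the idea is the same as the algebra analogy. Here is a minimal Python sketch; the `xor_blocks` helper and the two-byte blocks are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data_blocks = [b"\x01\x0f", b"\x04\xf0"]  # blocks held on two data drives
parity = xor_blocks(data_blocks)          # block stored as parity

# Simulate losing the second data drive, then rebuild its block
# from the surviving data block plus the parity block.
rebuilt = xor_blocks([data_blocks[0], parity])
print(rebuilt == data_blocks[1])
```

XOR-ing the surviving blocks with the parity block always yields the missing block, no matter which single drive was lost.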
RAID 5 is one of the most commonly used RAID configurations. It requires at least 3 drives, and can work with up to 16 in many implementations. The minimum is 3 because this configuration uses one drive's worth of space for parity data, and the data itself is striped across the rest. The drive that stores the parity data alternates with each stripe, so no single drive contains all of the parity data for the system. RAID 5 is great for storing backups of data that is not regularly updated, such as videos, photos, e-books, etc.
RAID 6 is similar to RAID 5, except that it stores a second set of parity data, using two drives' worth of space instead of one. This means you can have up to two drive failures simultaneously without losing data. With RAID 5 and RAID 6, once a drive fails it takes time for the system to perform the necessary calculations and rebuild the data. If a second drive fails during this time in RAID 5, you lose your data. In RAID 6, a third drive would have to fail.
RAID 10 is also a popular choice, though it is less common in small setups. This is because RAID 10 requires at least 4 drives and you can use only half of your total storage capacity. Data is striped across half of the drives, like in a RAID 0, and then mirrored on a second set of drives, like in a RAID 1. So the name is literally RAID 1 + RAID 0, or RAID 10 for short. In a RAID 10, the rebuild time of your data is exceptionally fast, but giving up half of your capacity makes this an expensive, but secure, RAID configuration.
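To compare the levels above side by side, here is a rough Python sketch that estimates usable capacity and the number of drive failures each level is guaranteed to survive. It assumes equal-size drives and textbook layouts; `raid_summary` is a hypothetical helper, and real NAS units may report slightly different figures.

```python
def raid_summary(level: int, n_drives: int, drive_tb: int):
    """Return (usable_tb, guaranteed_failures_survived) for equal-size drives."""
    minimum = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}
    if level not in minimum:
        raise ValueError(f"RAID {level} is not covered by this sketch")
    if n_drives < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} drives")
    if level == 10 and n_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives")

    if level == 0:
        return n_drives * drive_tb, 0        # striping only, no redundancy
    if level == 1:
        return drive_tb, n_drives - 1        # every drive is a full copy
    if level == 5:
        return (n_drives - 1) * drive_tb, 1  # one drive's worth of parity
    if level == 6:
        return (n_drives - 2) * drive_tb, 2  # two drives' worth of parity
    # RAID 10: half the space; only one failure is guaranteed survivable,
    # although more can be survived if they land in different mirror pairs.
    return n_drives // 2 * drive_tb, 1


# Example: four 4 TB drives (assumed sizes)
for level in (0, 1, 5, 6, 10):
    usable, failures = raid_summary(level, n_drives=4, drive_tb=4)
    print(f"RAID {level}: {usable} TB usable, survives {failures} failure(s)")
```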
There is more to RAID than this, and I encourage you to do your own research before settling on a configuration that is right for you. You can combine different RAID configurations to suit your purpose, such as setting up your sample drives as a RAID 0 and your backup drives as a RAID 1. RAID is NOT a substitute (on its own) for a backup system. It is only a means of configuring your hard drives and the way your computer accesses and stores data across them!
Cloud-based storage
Cloud-based storage is the most common way people achieve their off-site backup. You pay for a service that you connect to via the Internet, and it backs your data up on someone else’s servers. What’s great about the cloud is that you can access your data from virtually anywhere, and your data is generally very secure against data loss. However, you also have the risk of someone else having access to your data. This is generally not an issue if you vet the cloud provider properly, but the data does exist on someone else’s storage drives, so it is something to be aware of. Another issue with cloud-based storage is that its speed is determined by your network capabilities. If, for instance, you had to restore a backup of a large system, it may take a long time. Backblaze is a popular option among small businesses.
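To get a feel for how long a cloud restore might take, here is a back-of-the-envelope Python sketch. The backup size, link speed, and the 80% efficiency factor are all assumed numbers for illustration; plug in your own.

```python
def restore_hours(size_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to download size_gb over a link_mbps connection.

    efficiency is an assumed fudge factor for protocol overhead and
    real-world slowdowns; 1.0 would mean the link runs at its rated speed.
    """
    bits = size_gb * 8 * 1000**3  # decimal gigabytes to bits
    seconds = bits / (link_mbps * 1_000_000 * efficiency)
    return seconds / 3600


# Example: restoring 2 TB over a 100 Mbps connection
print(f"{restore_hours(2000, 100):.1f} hours")  # prints "55.6 hours"
```

At that rate a full restore takes more than two days, which is one reason to keep a local backup alongside the off-site one.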
What is a Time Machine?
Time Machine (macOS) is Apple’s built-in backup technology. With it, you connect your computer to a drive, and the OS automatically backs up your data at regularly scheduled intervals and maintains an archive of the backups it has performed, removing the oldest ones when the drive fills up. This is wonderful for restoring your system to an earlier point in time or migrating to a new machine.
! There is a "feature" in some macOS versions where, if you don’t have a drive connected after you’ve set up Time Machine, it will start saving local snapshots to your device’s storage, even if you’ve manually disabled this feature. So if you are getting constant “out of space” messages, try opening Terminal and typing “tmutil listlocalsnapshots /” to see if any are on your system. If there are, follow the instructions here.
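If you would rather check for local snapshots from a script, here is a small Python sketch that wraps the same command. It only works on macOS, and it assumes the snapshot lines in the command's output start with "com.apple.TimeMachine.", which may differ between macOS versions. The function names are hypothetical.

```python
import subprocess


def parse_snapshots(output: str):
    """Pick out snapshot names from the text printed by tmutil.

    Assumes snapshot lines start with "com.apple.TimeMachine.".
    """
    return [
        line.strip()
        for line in output.splitlines()
        if line.strip().startswith("com.apple.TimeMachine.")
    ]


def list_local_snapshots(mount_point: str = "/"):
    """Run `tmutil listlocalsnapshots` (macOS only) and return snapshot names."""
    result = subprocess.run(
        ["tmutil", "listlocalsnapshots", mount_point],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_snapshots(result.stdout)


# Made-up sample output, used here so the parser can be tried on any machine
sample = "Snapshots for disk /:\ncom.apple.TimeMachine.2024-01-01-120000.local\n"
print(parse_snapshots(sample))
```

On a Mac, calling `list_local_snapshots()` returns an empty list when there are no snapshots taking up space.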