Something that I've battled with for a long time is how best to manage my family's photo collection. Our requirements are quite specific:
- Everything has to be triple backed-up. Our photo collection contains treasured images of our kids growing up, our wedding day, honeymoon, and images of loved ones who are no longer with us. Risking losing it all just isn't an option. We need local copies and remote copies.
- We have four devices to manage (my iPhone, my wife's iPhone, a Nikon D60 DSLR, and a Nikon Coolpix waterproof camera). This rules out any 'out-of-the-box' solutions such as just letting Apple Photos/Google Photos/whatever upload our images automatically from our phones. The solution that we settle on needs to be able to collate the photos taken from all four sources, as well as any other sources when needed, such as photos of the kids taken from professional photo shoots etc.
- I want total control over how the collection is structured. Our digital photo collection stretches back over 10 years already, and having those images all dumped by some system into yearly folders, or worse all dumped into one folder using only metadata to sort between them, sounds like a total nightmare.
- While not strictly a requirement, I don't like the 'smart' kinds of photo library management software that have been appearing recently. I don't want anything to scan my photos using image recognition so that I can type in keywords and see images of those things. I don't want anything to build 'best of the month' slideshows for me, or to do anything else with my images without my express say so. I just want to store my photo collection securely.
After a lot of trial and error with various services, here's the solution that I've landed upon, and which I've been using with success for a while now:
Import the files
I begin by importing the photos/videos from the cameras into a local directory on my laptop, which I try to do as frequently as I can. Within my note-taking software (Apple Notes) I keep a list of all four of our cameras, along with the filenames and dates of the last photo/video that I uploaded from each one. For instance, it currently looks like this:
- iPhone 6s - IMG_0432.JPG 31.10.18
- iPhone 6s Plus - IMG_3708.JPG 30.10.18
- Nikon D60 - DSC_0980.JPG - 11.05.18
- Nikon Coolpix - DSCN0805.JPG - 12.05.18
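The bookkeeping in that list can be sketched in a few lines of Python. This is only an illustration of the idea, not part of my actual process: the function names are made up, and it assumes that each camera's filename counter only ever goes up (in reality cameras wrap back around after 9999, which this sketch ignores).

```python
import re

def sequence_number(filename):
    """Return the numeric counter embedded in a camera filename, e.g. 432 for IMG_0432.JPG."""
    match = re.search(r"(\d+)\.[^.]+$", filename)
    if match is None:
        raise ValueError(f"no counter found in {filename!r}")
    return int(match.group(1))

def not_yet_imported(filenames, last_imported):
    """Return the files whose counter is higher than the last imported one, in counter order."""
    threshold = sequence_number(last_imported)
    newer = [name for name in filenames if sequence_number(name) > threshold]
    return sorted(newer, key=sequence_number)

# Hypothetical contents of a camera, checked against the note for the iPhone 6s.
on_camera = ["IMG_0431.JPG", "IMG_0432.JPG", "IMG_0433.JPG", "IMG_0434.MOV"]
print(not_yet_imported(on_camera, "IMG_0432.JPG"))
```

Given the note entry IMG_0432.JPG, only the two later files are reported as still needing to be imported.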
Optimise the file sizes
I then run the images through a piece of software called JPEGmini. JPEGmini uses magic to strip out a lot of metadata and generally unneeded junk from each photo, leaving them on average 40% smaller in file size, without visibly affecting the image quality. It still leaves in place the kinds of metadata that you're likely to want to keep, so the photos retain info such as when they were taken, the model of camera that was used, etc. (this kind of metadata is called Exif data, more on that later). I'm not entirely certain what it is that JPEGmini removes from the photos, but it works fantastically, and I've been using it for years now without any problems. This step is a no-brainer for me: JPEGmini doesn't cost a lot of money, and the file size savings will amount to savings in HDD and cloud storage space in the future (more on that to come too).
Unfortunately, I haven't found anything with the same drag-and-drop simplicity as JPEGmini for compressing video files. As mobile phone cameras are improving in quality (and the resulting photos are increasing in file size) I'm finding that a huge proportion of our photo collection is taken up with videos. However, tools to convert/compress/shrink video files are still very slow to process and very manual to use. I don't have the time or the inclination to manually compress hours' worth of video on a regular basis, so until something more drag-and-drop comes along I'd rather take the file size hit.
Rename the files
The trouble with merging images from four different cameras into one collection is that the file names aren't in a consecutive order. Photos taken on an iPhone begin with 'IMG', while images from our DSLR begin with 'DSC', and our waterproof camera uses something different still. This means that our photos typically won't get sorted in chronological order when viewing them in a file browser, but alphabetically instead, leaving them out of sequence.
To fix this I use another tool, a free download called ExifTool. ExifTool is a command line program that can use the aforementioned Exif data that's stored within the files to automatically rename photos and videos, so that instead of e.g. 'IMG_1234.jpg' the photo will be named according to the date and time that it was taken, accurate to the second. By running ExifTool on all of the photos and videos within the directory, regardless of which camera was used to take them, they will all be named and therefore displayed chronologically. That is, just so long as the dates and times are set correctly on all of the cameras before the photos and videos are taken. I learned that the hard way.
ExifTool is a real "hacker's" tool which can be run in a variety of ways. The command that I run on my photos is:
exiftool '-FileName<CreateDate' -d %Y%m%d_%H%M%S%%-c.%%e dir
Here dir represents the directory to run the command on. This results in filenames such as 20180804_103946.jpg, which follow the pattern YYYYMMDD_HHMMSS. It's even smart enough that if multiple photos were taken in the same second (something that happens surprisingly often when my wife and I are both taking photos of the same thing at the same time), it'll append -1 to the second filename, then -2 to the next one, and so on.
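To make the naming scheme concrete, here's a minimal Python sketch of the same pattern. It isn't how ExifTool is implemented and it doesn't read any Exif data; the function name is made up, and the capture time is passed in by hand purely to show how the timestamp and the -1, -2 suffixes fit together.

```python
from datetime import datetime

def dated_name(taken_at, extension, existing_names):
    """Build a YYYYMMDD_HHMMSS filename, adding -1, -2, ... if the name is already taken."""
    base = taken_at.strftime("%Y%m%d_%H%M%S")
    candidate = f"{base}.{extension}"
    copy_number = 0
    while candidate in existing_names:
        copy_number += 1
        candidate = f"{base}-{copy_number}.{extension}"
    return candidate

# Three photos taken in the same second, e.g. on two phones at once.
taken = datetime(2018, 8, 4, 10, 39, 46)
names = set()
for _ in range(3):
    new_name = dated_name(taken, "jpg", names)
    names.add(new_name)
    print(new_name)
```

The first photo gets the plain timestamp, and the two that clash with it get the -1 and -2 suffixes.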
Store the collection on a USB HDD
I manage the 'master' copy of our photo collection on a USB HDD. This removes any concerns around my MacBook Pro running out of adequate storage to house the entire collection, which currently weighs in at ~250GB. That's not too large by professional photography standards, but large enough to fill up a laptop pretty quickly. Keeping the photos on a USB HDD also means that I can structure them however I want. The directory structure that I've chosen is
At this stage, I just need to look through the imported photos and drag them into the relevant directories within the current year directory. Because the files are now all named chronologically, it can be quite quick to find contiguous batches of images taken from the same event, so it's a fairly fast process to sort the files manually in this way.
Duplicate the USB HDD
Once the photos are all filed away nicely on the USB HDD, I use a program called Carbon Copy Cloner to clone the entire contents of that USB HDD to a secondary USB HDD. Carbon Copy Cloner automatically finds the differences between the files on the two HDDs and copies files as necessary until the two HDDs are identical. I tend to keep this backup HDD in a separate physical location to the original one (sometimes at a different house, sometimes just in my laptop bag with me somewhere outside of the house, while the original stays at home). This helps a bit with protection against an event such as a house fire. However, it's still not an ideal solution, because sometimes I do have both drives at home with me. Not to mention the fact that eventually, all hard drives will fail, which brings me to…
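For anyone curious what "finding the differences" between two drives involves, here's a rough Python sketch. To be clear, this is not how Carbon Copy Cloner works internally, which I don't know. It's just one simple way to list the files that a backup drive is missing, under the assumption that a file needs copying if it's absent from the backup, differs in size, or is newer on the master drive. The function name is my own.

```python
import os

def files_to_copy(source_root, backup_root):
    """List relative paths that are missing from the backup, or that look changed on the source."""
    pending = []
    for folder, _subfolders, filenames in os.walk(source_root):
        for filename in filenames:
            source_path = os.path.join(folder, filename)
            relative = os.path.relpath(source_path, source_root)
            backup_path = os.path.join(backup_root, relative)
            if not os.path.exists(backup_path):
                # The backup has never seen this file.
                pending.append(relative)
                continue
            src, dst = os.stat(source_path), os.stat(backup_path)
            if src.st_size != dst.st_size or src.st_mtime > dst.st_mtime:
                # Same path, but the master copy is a different size or newer.
                pending.append(relative)
    return sorted(pending)
```

Calling files_to_copy with the paths of the master and backup drives returns the list of files that would need copying across. It only reports differences and never copies or deletes anything itself.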
Backup to the cloud
This is the part that has taken me the longest to work out. I need to keep the third copy of my photo collection on the cloud somewhere, as a safeguard against losing both hard drives. There are plenty of consumer-focused cloud storage providers available, and I've tried several, but none have been quite right for my needs, leading me in a different direction that I'm very happy with.
I first used Google Drive, which allowed me to drag-and-drop my files onto Google Chrome to upload them. The process was slow and very manual, but it worked well enough. Google Drive does offer a desktop app that silently syncs your files from your machine to the cloud. However, it doesn't play nicely when the files you want to sync are on an external drive rather than on your machine, so I was stuck doing it the manual drag-and-drop way.
I used Google Drive in this way for a while, uploading any new files whenever they were added to the HDD, but eventually this manual process became too much work. It was easy to add an entirely new directory to the cloud backup, but when I had to add some additional photos to an existing directory (that year's 'randoms' directory for instance) I'd have to manually keep an eye on which files were and weren't already uploaded to the cloud, to ensure that I didn't miss any or create duplicates. This process became too error-prone and too time-consuming to continue.
When I began moving away from Google a while back, I quickly switched to a similar setup using Dropbox instead of Google Drive. The process and limitations were exactly the same: I still couldn't use the desktop app, so I had to manually upload the photos via drag-and-drop in Google Chrome. The setup wasn't any better than I'd had before, but it allowed me to remove the dependency on Google, which was important to me at the time. However, I still needed to find a better long-term solution.
I played with the idea of using a cloud photo hosting service to manage the entire collection, one which allows for managing the collection however you want. Flickr ticked almost all of the right boxes, coming with unlimited free storage space. However, a big problem was that it didn't allow for large video files to be uploaded. Some of our videos can get quite large in file size, which would prohibit us from uploading them to Flickr, and because of that, Flickr was a no-go for us. I could split those larger files into smaller ones when necessary of course, but I didn't want to add that extra step. I wanted a solution that I could just throw all of my photos and videos at without question.
To explain, cloud storage services such as Google Drive and Dropbox are designed to be very consumer oriented. They come with nice user interfaces, making it very simple to store your files in the cloud. The cost of this convenience is 1: monetary (they are more expensive than the solution that I've chosen), and 2: they impose their own limitations, such as assuming that if you're using their desktop apps, that you want to sync files that live internally on your machine, rather than on an external drive.
However, there is another type of cloud storage that exists, and which isn't aimed at consumers at all. This kind of cloud storage is incredibly cheap to purchase because it's a type of 'wholesale' storage that's intended directly for use by developers. It doesn't come with any kind of user interface. Instead, it relies upon developers using an API to read and write data to that storage space. This is what keeps the storage so affordable: the provider hasn't had to put in all of the work to build a consumer-friendly interface around it. It's just raw storage space, intended for developers to build something around.
The missing piece of the puzzle then is how to best access this space. I'm a software developer so I could use the provided API to put together a command line tool of some sort that would handle uploads for me, but due to time constraints it would probably end up being quite crude, and wouldn't be smart enough to monitor the files for changes without me having to put in a huge amount of work, leaving me with another fairly manual process. Thankfully, however, a solution does exist: Cloudberry Backup. Cloudberry has built the missing piece of the puzzle, a nice intuitive desktop app that connects with these providers of raw storage space, providing the same friendly user experience as you'd get with a consumer-facing product, without the need to develop anything.
Cloudberry Backup can interface with a great many of these storage providers, and I'd initially intended to use Amazon's S3 storage, almost certainly the largest player in the market. However before signing up with them I decided to research a few of the other providers that Cloudberry Backup works with, and I was pleased to discover that Backblaze, another provider of consumer-friendly storage that I researched, also offers raw storage space in the form of its B2 product, and somehow B2 works out to be a lot cheaper than Amazon S3 does. Backblaze has been in business for over a decade and seems very trustworthy, and while I did see some negative reviews about their consumer-facing backup software, their B2 storage seems to be very solid.
For comparison, using a service such as Google Drive or Dropbox, I'd be paying around $10 per month for the amount of storage space that I require. Using Amazon S3 would have cost me around $5 per month, whereas Backblaze B2 somehow comes in at only around $1.50 per month. Cloudberry Backup can be used for free if you're syncing less than 200GB of files in total, but I'm syncing more than that, so I need a pro license to manage my collection. This should cost around $30 as a one-off fee, plus around $6 per year in maintenance fees for updates, so the cost saving still holds. For full disclosure, however, I have to say that in return for writing this post about Cloudberry Backup, Cloudberry has provided me with a free one-year pro license. However, even if that weren't the case I'd still have chosen to use it, and I will continue to pay the annual maintenance fee going forward, which should show that I'm invested in using it regardless of the great deal that I got.
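To show how those numbers stack up over time, here's a small Python sketch that uses only the rough figures quoted above. It makes a few assumptions of its own: that the maintenance fee is paid every year including the first, that the pro license would be needed for either raw storage provider, and that there are no download or transaction charges. Treat it as back-of-the-envelope arithmetic rather than real pricing.

```python
# Approximate monthly storage prices quoted above, in US dollars.
MONTHLY_STORAGE_COST = {
    "Google Drive or Dropbox": 10.00,
    "Amazon S3": 5.00,
    "Backblaze B2": 1.50,
}
LICENSE_ONE_OFF = 30.00           # approximate pro license fee
LICENSE_ANNUAL_MAINTENANCE = 6.00  # assumed to be charged every year
NEEDS_LICENSE = {"Amazon S3", "Backblaze B2"}  # raw storage accessed via Cloudberry Backup

def total_cost(provider, years):
    """Estimate the total spend on one provider over a number of years."""
    storage = MONTHLY_STORAGE_COST[provider] * 12 * years
    if provider not in NEEDS_LICENSE:
        return storage
    return storage + LICENSE_ONE_OFF + LICENSE_ANNUAL_MAINTENANCE * years

for provider in MONTHLY_STORAGE_COST:
    print(f"{provider}: ${total_cost(provider, 1):.2f} in year one, "
          f"${total_cost(provider, 5):.2f} over five years")
```

Under those assumptions Backblaze B2 plus the license comes to about $54 in the first year, against about $96 for Amazon S3 and $120 for Google Drive or Dropbox, and the gap widens every year after that.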
Cloudberry Backup makes it very simple to get everything set up. The first step is to sign up with the cloud storage provider of your choice, in my case Backblaze B2. This step is done completely outside of Cloudberry Backup, through the storage provider's own website. Once signed up, you'll need to use that website to create a storage 'bucket', a named container to store your files inside. The storage provider will then give you various credentials which are needed to access that bucket remotely.
All you then need to do is to add that storage provider account within Cloudberry Backup by entering the same credentials, and then to create a backup plan. This step involves choosing from a variety of options about how the backup should run. Which folder(s) should be backed up, and to which storage bucket? Should the backup run automatically or only when you tell it to? Should it only back up whole files, or blocks of files if only a portion of those files has changed? Should it use encryption and/or compression when storing the files? Should it send an email once the backup completes, or when it throws an error?
For several of the options, I decided to go with the simplest route. If I'm completely honest, the thought of using Cloudberry's encryption and compression features scares me a bit. The entire reason for me backing up our photo collection to cloud storage is for future security. Hopefully, I'll never need to rely on this cloud backup, but if I do I want to be able to retrieve the files without any problems, even if that happens to be in twenty years' time. By using Cloudberry Backup to upload the files unencrypted and uncompressed, I can later retrieve those files however I want to, even if that's simply done by clicking the download button from within the Backblaze B2 website itself.
However, as soon as I choose to allow Cloudberry Backup to encrypt and compress those files, I've chosen to make my entire retrieval process dependent upon Cloudberry's algorithms. Without their software to decompress and decrypt those retrieved files, they'll be useless to me. As fantastic as Cloudberry Backup is, and indeed Cloudberry are market leaders and have been for many years, I have no guarantee that they'll be around in twenty years' time. As much as I'd prefer to encrypt my files (for added privacy against unwanted attempts to access them) and compress them (to speed up the upload process and lower the storage costs), intrinsically linking my entire backup with another service when I don't need to seems too dangerous. Of course, Backblaze themselves could go out of business in the future, causing me to lose all of my data anyway, but I have to put my faith somewhere. I just don't need to extend that trust to two companies if I can rely on only the one. Again, this isn't a negative against Cloudberry Backup, which is fantastic. It's just far from ideal to entrust my entire ability to restore my backup to a company that isn't hosting that backup.
Something that's also worth noting about Cloudberry Backup is that it can back up files locally as well as remotely, which means that I could actually replace Carbon Copy Cloner with it, using Cloudberry Backup to duplicate files between both USB HDDs as well as to the cloud. I'm still using Carbon Copy Cloner for now because it's already an established part of my process, but in the future, when I get to the point where I'll need to pay to update Carbon Copy Cloner to the latest version, perhaps I'll switch to only using Cloudberry Backup.