File Archiving Strategy – the Bucket System

A Guest Post by Nick Rains.

Using metadata correctly means that images can be fully catalogued and images can be retrieved at any time without necessarily knowing which folder the file is in. King Penguins, Macquarie Island – Canon 5D MkII, 300f2.8L 1/500 second @ f5.6.


Where do you actually put your image files?

Do you file images in folders with meaningful names like “Sydney 01-01-2010” and “Perth 11-03-2010”, or “Flowers” or some such? What if it’s a picture of a flower in Perth: which folder would you put it in? Or would you put it in both? This is a physical filing system, not unlike literal files in a literal filing cabinet. It can serve as a filing system but it neglects the single most useful aspect of digital imagery – metadata.

Correct use of metadata means that computers can do what they are best at: remembering large amounts of data and making connections between lists and records. An image with metadata such as City = ‘Perth’ and Caption = ‘flower’ can easily be referenced by a database, which merely matches an image file’s metadata with the words Flower and Perth and displays its location on your hard drive. If you search for ‘Flower’ and ‘Perth’ the database will list all files which are tagged with those words – and here’s the trick: the file does not have to be in any particular place on your hard drive as long as the database has previously recorded its position. In other words, it has already catalogued all the file locations. You could have a dozen images of flowers in Perth in a dozen different locations and the database can effortlessly list those files when you search for those terms. This is what computers do best, and they are very good at it.
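To make the idea concrete, here is a minimal Python sketch of what a catalogue does behind the scenes. It is not how any particular product works internally; the `Catalogue` class, the file names and the keywords are all invented for illustration.

```python
from collections import defaultdict


class Catalogue:
    """Toy catalogue: maps lower-cased keywords to the files tagged with them."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, path, keywords):
        """Record where a file lives and which keywords describe it."""
        for word in keywords:
            self._index[word.lower()].add(path)

    def search(self, *terms):
        """Return the sorted paths tagged with every one of the search terms."""
        if not terms:
            return []
        matches = [self._index.get(term.lower(), set()) for term in terms]
        return sorted(set.intersection(*matches))


# Hypothetical files scattered across unrelated folders.
catalogue = Catalogue()
catalogue.add("D:/DVDs 001-050/DVD007/banksia_0001.dng", ["Flower", "Perth"])
catalogue.add("D:/DVDs 101-150/DVD112/kings_park_0310.dng", ["Flower", "Perth"])
catalogue.add("D:/DVDs 101-150/DVD112/opera_house_0001.dng", ["Sydney"])

# Both Perth flower images are found, whichever folder they happen to be in.
print(catalogue.search("flower", "perth"))
```

The search never looks at folder names. It only consults the index, which is why the folder layout is free to serve another purpose.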

Using good catalogue software means you simply do not have to arrange your files in any sort of logical folder structure as long as the software has catalogued all the locations of all the images. If all your images live on one hard drive, and that hard drive has been fully catalogued, then the folder structure of that hard drive can be anything you want.

OK, so we have established that using something like Idimager, Lightroom or Expressions Media 2 is a good idea. That’s one part of the problem. The other part is: how do we store those files off-site as a back-up on DVDs (given that we can only fit 4.5GB of images onto one DVD) in such a manner that we can easily retrieve a file if it’s lost or corrupted somehow? How do we know where it is? We could catalogue each DVD as well, I suppose, but that would be very time-consuming and fortunately it’s not necessary.

The trick is to mirror the contents of your DVDs on your hard drive by using folders with the exact same names as the DVDs, containing the exact same images. If we catalogue a hard drive laid out like this then the catalogue will, at the same time, be a catalogue of the DVDs.

This panel from Expressions Media 2 shows the folders on the networked PC: D Drive \ DVDs 151-200 \ DVD161. The green dot marks the folder that is currently being displayed in a browser window (see screenshot below).

The Bucket System

I follow the 3-2-1 approach to archiving and back up. That’s at least three copies of each image, on two different media types (HD and DVD) and at least one copy stored fully off-site.

I also follow the ‘Bucket’ approach to archiving popularised by Peter Krogh in his book The DAM Book. I had been using a similar system for a few years when I came across Peter’s book. It was great to have my own methods re-affirmed and developed, so I bought the book, made some changes to my workflow and have been following this path ever since.

The Bucket system is based around optical media of finite size, like DVDs. Please feel free to substitute ‘Blu-ray’ for ‘DVD’ in this article as technology has moved on. Regardless of the actual media that you use, the point remains the same – you put your files into folders called buckets and when a bucket is ‘full’ you burn it to a DVD, start a new bucket with a new folder name and start filling that up, and so on. A bucket is considered ‘full’ when it approaches the size of the optical media to which it will be burned.

So ‘Buckets’ are simply folders that are created to be filled with images until they reach, for single-sided DVDs, the 4.5GB mark, at which point they are burned to DVD and filed off-site. I name these folders on my hard drive DVD001, DVD002, DVD003 and so on. When I burn the DVD, its title in the DVD burning software will be exactly the same (DVD001 etc.) and I will write DVD001 on the case (not on the DVD).

Now, here is the crucial point. If you keep your images in these bucket folders on your main hard-drive, or wherever you habitually store your images, you can import them into a cataloguer still within these folders leaving the folders as they are with the same folder name, DVD001 etc. The cataloguer will reference these folders by their folder name and so you have an exact mirror of what is on your hard drive burned to a set of DVDs. For each physical DVD there will be a corresponding folder on your hard-drive with the exact same name containing the exact same files. This makes it amazingly easy to track down a file if for some reason you cannot access it on the hard drive. The cataloguer will tell you which folder it should be in, and all you need to do is find the DVD of the same name.
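The lookup described above amounts to reading the bucket name out of the path the cataloguer reports. A minimal Python sketch, assuming bucket folders are named `DVD` followed by at least three digits as in this article; the function name `disc_for` and the image file name are made up for the example.

```python
import re
from pathlib import PureWindowsPath

# Bucket folders are assumed to look like DVD001, DVD161, DVD1024 ...
BUCKET_NAME = re.compile(r"DVD\d{3,}")


def disc_for(catalogued_path):
    """Return the bucket folder, and therefore the disc label, for a catalogued path.

    Returns None if no folder in the path looks like a bucket.
    """
    # PureWindowsPath accepts both backslashes and forward slashes as separators.
    for part in reversed(PureWindowsPath(catalogued_path).parts):
        if BUCKET_NAME.fullmatch(part):
            return part
    return None


print(disc_for(r"D:\DVDs 151-200\DVD161\penguins_0042.dng"))
```

A parent folder such as `DVDs 151-200` is ignored because it does not match the bucket pattern, so only the bucket itself is ever returned.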

The browser window in Expressions Media 2 shows the thumbnails of the images in the folder and identifies the individual file at the top, in the status bar. In this case this file is clearly in a folder named DVD161. It's of course also on a DVD called DVD161 which is stored off-site but which could be retrieved and the whole catalogue rebuilt in the event of a disaster.

So, I edit my images, add metadata and make adjustments in Lightroom (and then convert them all to DNG format), and then take the whole collection and split it into new folders each containing about 4.5GB of images. These folders are sequentially named, i.e. DVD001, DVD002 etc., and then the whole set of folders is imported into the cataloguer, which in my case is Expressions Media 2. Each folder is also copied onto a second computer, which acts as a local backup, and burned to a DVD of the same name.

The net result is that I have three sets of identical folders containing identical images. Two sets on two different hard-drives (on two different PCs in my case) and the third set on DVDs stored off-site. The cataloguer has imported this exact folder structure as well, so any file references will match the physical DVD names as well as the identically named folders on the two hard-drives. It even works over a network.
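Since the scheme depends on the copies really being identical, it can be worth checking a bucket against its mirror from time to time. The following Python sketch is one possible way to do that by comparing SHA-256 checksums. It is not part of the workflow described here, and the function names are illustrative.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(bucket_dir):
    """Map each file's path relative to the bucket to the digest of its contents."""
    root = Path(bucket_dir)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def differences(primary, mirror):
    """Return relative paths that are missing from one copy or differ between them."""
    a, b = manifest(primary), manifest(mirror)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Calling `differences` on the same bucket folder on the two PCs (or on a mounted DVD) returns an empty list when the copies match; the paths passed in are whatever your own drives are called.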

There is one slight fly in the ointment here – adding files to folders so that they total 4.5GB is tedious because you can normally only do it manually. You have to select groups of files, copy them into the ‘bucket’ and keep checking the total folder size. It’s a bottleneck. I despaired of finding a way to automate this after searching the net and only finding one application for the Mac that could do this (Big Mean Folder Machine) and none whatsoever for the PC. So, to cut a long story short, I made my own application for PC, called Bucketeer and by the time you read this it should be available from my website (look under Products / Software).

Bucketeer simply takes a large folder of images and copies the contents to new folders of a specified size. Each folder is named sequentially so I end up with one big folder and the same images in a set of smaller folders named in a sequence like DVD001, DVD002 – exactly the method outlined above. All I then have to do is burn each folder to a DVD (manually for now) and it’s all done – no more trial-and-error multiple selections to fill a 4.5GB DVD.
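For readers who would rather script this step, the same idea can be sketched in a few lines of Python. This is not Bucketeer's actual code, only an illustration of the technique; the function name `fill_buckets`, its parameters and the treatment of 4.5GB as 4,500,000,000 bytes are assumptions.

```python
import shutil
from pathlib import Path

# Assumption: the article's 4.5GB figure taken as decimal bytes.
DVD_CAPACITY = 4_500_000_000


def fill_buckets(source_dir, dest_dir, capacity=DVD_CAPACITY, prefix="DVD", start=1):
    """Copy files from source_dir into sequential bucket folders under dest_dir.

    Files are taken in name order. A new bucket is started whenever the next
    file would push the current bucket past `capacity`. Only files directly
    inside source_dir are handled. Returns the list of bucket names created.
    """
    source, dest = Path(source_dir), Path(dest_dir)
    files = sorted(p for p in source.iterdir() if p.is_file())

    buckets = []
    number = start
    used = 0
    current = None
    for path in files:
        size = path.stat().st_size
        if size > capacity:
            raise ValueError(f"{path.name} is larger than one bucket")
        if current is None or used + size > capacity:
            current = dest / f"{prefix}{number:03d}"
            # Refuse to reuse an existing bucket folder rather than mix contents.
            current.mkdir(parents=True, exist_ok=False)
            buckets.append(current.name)
            number += 1
            used = 0
        shutil.copy2(path, current / path.name)  # copy2 also keeps timestamps
        used += size
    return buckets
```

Filling buckets strictly in name order keeps a shoot together on consecutive discs, at the cost of leaving a little unused space on each one. Pass `start` to continue numbering from the last bucket you burned.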

The bucket system is a boon for small collections of a few tens of thousands of images. Bigger collections might benefit from an enterprise-level system with servers and dedicated RAID drive arrays but, for most of us, DVDs and Blu-ray discs will do just fine as long as you have a good cataloguing system and a methodical approach to filing, archiving and backup.

Nick Rains has been a professional photographer for 28 years and his work has been published in books, calendars and magazines all over the world. He currently specialises in feature work around Australia and is a regular contributor to Australian Geographic magazine. Nick is also the Editor of Better Digital Camera magazine and regularly conducts advanced photographic workshops around the country.
