This is a post explaining the Canon XF folder structure that’s used by Canon professional video cameras, principally the XF305 and C300 (there may be others).
Some of this relates to Premiere, and how it is vital to import Canon XF clips properly using Media Browser (never using File > Import or drag and drop).
But mostly it is about the way the Canon folder structure and file naming works, which is something I learnt a lot about years ago onsite at the BBC, and have been meaning to share ever since. I’m planning to post about the Sony and Panasonic and Red structures, too.
To be honest, I was always hoping someone else would do it – but nobody ever has. So either nobody else understands it, or the few of us who do have just been hoping someone else would shoot first.
So, here’s the thing:
Professional cameras record their data onto cards in complex folder structures and files.
From our phones and pocket cameras, we’re used to video clips being recorded as single .MOV or .MP4 files which you can copy off and view and edit just by themselves.
This is not the case with video clips recorded by professional Canon, Sony, Panasonic and Red cameras.
This is for various reasons to do with large data rates, card file systems and stupid design by engineers who don’t have to use what they make.
Worse still, there’s a different complex folder and file structure for every camera manufacturer. Multiple different proprietary standards.
You don’t get this kind of bullshit with a consumer product. Consumers wouldn’t put up with it. They shoot a clip, they want a single file for that clip copied to their computer, end of story.
But in this niche, we have to use what we’re given, and there aren’t enough of us to complain about it; still fewer who understand it enough to point out authoritatively how ridiculous it is.
So it continues, and people like us have to suck it up and make it work, even though it costs the production industry millions of hours and dollars in postproduction wastage every year.
And as I start to write this, I’m realising that – because of this insanity – I now have to actually describe what the word “Clip” is going to mean in this post, to avoid confusion.
When I’m talking about a Clip here, I’m talking about the bit of video and audio that the cameraperson recorded between the moment they pressed Record and the moment they pressed stop.
You’d think that this was a perfectly obvious statement – but unfortunately, over in the professional camera dimension, “Clip” is not a single file.
Yes, I guess probably the most important thing to know about a professional camera card format is:
There can be many files and folders that make up a single Clip.
In Canon XF cameras, a clip is not just made up of a video/audio file, it’s also made up of a stack of metadata files, in a complex nested folder structure that must be retained intact for editing & media applications to be able to read it properly.
And worse, if a single clip is recorded for longer than 5 minutes 15 seconds, even just the video/audio part of it will not be made up of one file – it will start to be made up of several separate consecutive video/audio files.
And the problem is that if you don’t have all the right elements in exactly the right place in relation to each other, the card isn’t readable/mountable by any media software, including Premiere. Your cards will no longer work the way they’re supposed to. You won’t be able to view or import your clips.
So a 64GB card can’t be broken up into its constituent elements. It has to remain a 64GB card.
As I said, this madness is not just Canon’s. All the others have their own version of it.
In Panasonic P2 cards, they separate video and audio into separate folders; and then for good measure, they create separate audio files for every channel!
In Red, they split up a single clip into several consecutive video files, with a metadata file alongside.
In Sony, there’s a single video/audio file for each clip (hurray!), but it’s hidden among a forest of metadata files on several levels (boo!).
I’ve recently had to explain all this to a MAM (Media Asset Management) system manufacturer, because one of our clients wanted their system to be able to ingest professional video camera card folder structures. It reminded me that this information is still just NOT OUT THERE.
Even Canon’s own Canon Professional Network site doesn’t contain it in their special education section – just info about their other less complex formats. I raised it with a few people on their side years ago and tried to get more details direct from engineers, but with no luck. In the end, I could figure most of it out myself, guessing at the purpose of one or two obscurely coded metadata files.
I had to do this because I was supporting the ingest of thousands of hours’ worth of footage, and was trying to build a script that would let us separate out individual clips, rename them, archive them individually, and then recombine them back into a single card. It was important to try and save money and time in the archiving and restoring of petabytes of data. (More on this at the bottom of the post.)
After figuring out how all the formats work, I made my script work. But of course, we then couldn’t roll it out as a supported application at the large media companies we support, in case I got hit by a car and nobody else knew how to support it in future. It was not fundable, and too complicated. If you’re interested in it, let me know.
But anyway, for those of you who are in the same position as I was, trying to figure out how a Canon XF card works and how to reconstruct it in order to support post production or fix a problem, this is how it works:
(I’d go and get a cup of tea, if I were you – we’re going to be here a while, and it’s not going to be much fun.)
THE CANON XF (XF305 AND C300) FOLDER STRUCTURE AND NAMING CONVENTION – AND HOW IT WORKS
When you mount the card on a computer using a CF card reader, you’ll see the top level folder is called CONTENTS.
The card itself will be named CANON XF by the camera, but when you offload it to a drive, you can call the folder that contains this CONTENTS folder anything you want. You should have a naming convention to keep all your card folders uniquely named, but that’s another (long) story.
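Because the structure has to stay intact, an offload is really just a whole-tree copy into a uniquely named folder. Here is a minimal sketch in Python, not a Canon tool: the function names and the card name you pass in are made up for illustration, and the size comparison is only a quick sanity check, not a checksum.

```python
import shutil
from pathlib import Path


def offload_card(card_root: Path, dest_parent: Path, card_name: str) -> Path:
    """Copy the whole CONTENTS tree into dest_parent/card_name/CONTENTS.

    card_name is whatever your own naming convention produces (hypothetical here).
    """
    source = card_root / "CONTENTS"
    if not source.is_dir():
        raise FileNotFoundError(f"No CONTENTS folder in {card_root}")
    target = dest_parent / card_name / "CONTENTS"
    # copytree refuses to overwrite an existing folder, which protects
    # an earlier offload that happens to share the same name.
    shutil.copytree(source, target)
    return target


def listing(root: Path) -> dict[str, int]:
    """Map each file's path relative to root onto its size in bytes."""
    return {
        p.relative_to(root).as_posix(): p.stat().st_size
        for p in root.rglob("*")
        if p.is_file()
    }


def offload_matches(card_root: Path, target: Path) -> bool:
    """True when the copy has the same files, in the same places, at the same sizes."""
    return listing(card_root / "CONTENTS") == listing(target)
```

You would call `offload_card` with the mounted card's path, the drive folder, and a unique card name, then run `offload_matches` before wiping the physical card.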
I should say a couple of things about how we use the words Card and Mount here, to avoid confusion.
- “Card”: Whether or not the data is being read from a CF card or from a copy on a drive, we refer to both of these things as “cards”; even though the latter is technically a copy of the data from the card, not an actual physical card. We do this because the card was the original container for that video, so it’s easier to think of that collection of video clips as being a card’s worth. This is just like we used to think of a reel or a tape, even after we had digitised them.
- “Mount”: When you insert the physical card into a card reader and connect it to a computer, the computer mounts the card as a storage volume, the same way as it mounts a disk drive. When we’ve copied the data from the card to a drive, the computer will no longer mount that data as a separate volume of storage.
However, partly because of an old feature in FCP, we still use the word mount to describe loading up that card for viewing. Media software (Premiere, FCP, Avid, Canon XF Utility, Resolve, etc) will read the complex card folder structure and files, and understand how to display all the messy data inside it as single separately viewable video clips with the right metadata. So we generally say that in doing this, these systems are ‘mounting the card’, even though it means something quite different from mounting an actual card. Jeez, that’s a convoluted explanation.
Inside the CONTENTS folder, there’s a single folder called CLIPS001, which contains all the card’s clip folders and metadata. Let’s take a look at a picture of a sample Card Structure:
CLIPS001 is unique to Canon XF. Lots of cameras shoot a top level folder called CONTENTS – old Sony cameras, and Panasonic P2 cameras, for instance – but they each create a very different set of folders inside CONTENTS.
So when you’ve been given a drive containing a bunch of offloaded cards from unknown cameras, CLIPS001 will quickly tell you it was a Canon XF. (Sony XDCAM cards have a Clip, Component, Edit and other files and folders and Panasonic cards have AUDIO, CLIP, ICON, PROXY, VIDEO, VOICE – I will do separate posts on these.)
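If you want a script to do that sorting for you, this is a rough sketch in Python. It only knows the folder names mentioned in this post, the function names are invented for illustration, and which level of the card you feed it is an assumption: for Canon and Panasonic that is the inside of CONTENTS; for Sony, pass in whichever folder holds Clip, Component and Edit.

```python
from pathlib import Path

# Folder names taken from this post; real cards may carry extra items.
CANON_XF_MARKER = "CLIPS001"
PANASONIC_P2_FOLDERS = {"AUDIO", "CLIP", "ICON", "PROXY", "VIDEO", "VOICE"}
SONY_XDCAM_FOLDERS = {"Clip", "Component", "Edit"}


def guess_card_type(folder_names: set[str]) -> str:
    """Guess the camera family from the folder names found on an offloaded card."""
    if CANON_XF_MARKER in folder_names:
        return "Canon XF"
    # "<=" on sets means "every expected folder is present".
    if PANASONIC_P2_FOLDERS <= folder_names:
        return "Panasonic P2"
    if SONY_XDCAM_FOLDERS <= folder_names:
        return "Sony XDCAM"
    return "unknown"


def folders_in(path: Path) -> set[str]:
    """Names of the immediate subfolders of path."""
    return {p.name for p in path.iterdir() if p.is_dir()}
```

For example, `guess_card_type(folders_in(card / "CONTENTS"))` on a Canon XF offload would come back as "Canon XF". The match is case-sensitive on purpose, since that is how the names are written above.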
Inside CLIPS001, you will see the folders that contain your individual clips (I’ll go into what constitutes a “clip” below) as well as a very important file called INDEX.MIF
Without INDEX.MIF, the media apps can’t mount your card (i.e. know how to read it and see the files inside it as video clips). You won’t be able to read it. The same if INDEX.MIF doesn’t contain exactly the right information, in exactly the right structure.
If you open INDEX.MIF up in a text editor, you will see that there’s a lot of gobbledegook inside. I once spent a long time reverse engineering this gobbledegook, and there’s a line of information for each clip recorded on the card. I’m guessing that MIF stands for something like Media Information File or Metadata Index File. Each line matches information stored in the CIF, below.
There’s also often a folder here called JOURNAL which is usually empty, but this is less crucial. Unlike INDEX.MIF, this can be missing, and the card will still read and mount.
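A quick way to catch the obvious breakages before you open a media app is to check that those pieces are where they should be. This is a hedged sketch in Python with a made-up function name. It only checks that things exist; it cannot tell you whether the contents of INDEX.MIF are right.

```python
from pathlib import Path


def card_problems(card_root: Path) -> list[str]:
    """List the missing pieces that would stop a card folder from mounting."""
    clips_dir = card_root / "CONTENTS" / "CLIPS001"
    if not clips_dir.is_dir():
        return ["CONTENTS/CLIPS001 is missing"]
    problems = []
    if not (clips_dir / "INDEX.MIF").is_file():
        problems.append("INDEX.MIF is missing")
    # JOURNAL is deliberately not checked: the card still mounts without it.
    return problems
```

An empty list means the skeleton is there, nothing more.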
Finally, to the Clips themselves!
You’ll see that each clip – each bit of video between when the camera op pressed record and when they pressed stop – is given its own folder inside CLIPS001.
In the picture above, you’ll see that the example clips are called AA0046, AA0047, AA0048 and AA0049.
The letter naming convention can be set in the camera. The one used by Canon is a bit limiting – just 2 letters are customisable at the start of each clip. (Sony’s XD standard allowed you to set around 40 characters).
Name them wisely
Out of the box, it will use AA. I recommend that you use a naming convention for this – either the Camera Op’s initials, or a sequential camera code agreed within your production, company or shoot. This helps in the edit, to easily know which camera is which. (I found a way to edit the 2 letters in the names in postproduction and make it work – but I won’t cover that here, as it’s a hack. If you want to know how to do it, leave a comment and I’ll write it up.)
Number them sequentially
The numbering is sequential. And it’s super important to set the cameras to keep continuous numbering, if you’re gathering a lot of footage on an ongoing shoot which will all go into one big edit. The other option resets the clip count to 0001 every time you insert a new card, and that’s a nightmare in post and archiving. Even if you’ve remembered to customise the letters to your initials – say RH – you will have masses of cards, all with clips called RH0001, RH0002, etc., unless you have continuous numbering turned on in camera.
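To see how bad the clash is across a pile of offloaded cards, something like the following Python sketch would do. The pattern is an assumption based on the names shown in this post (two capital letters, then four digits), and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: two capital letters followed by four digits, as in AA0046.
CLIP_NAME = re.compile(r"([A-Z]{2})([0-9]{4})")


def parse_clip_name(name: str) -> tuple[str, int]:
    """Split 'RH0012' into ('RH', 12); raise ValueError for anything else."""
    match = CLIP_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"Not a clip name: {name!r}")
    return match.group(1), int(match.group(2))


def clashing_names(cards: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each clip name that appears on more than one card to those cards."""
    seen = defaultdict(list)
    for card, clips in cards.items():
        for clip in clips:
            seen[clip].append(card)
    return {clip: found for clip, found in seen.items() if len(found) > 1}
```

Feed `clashing_names` a dictionary of card name to clip folder names. With continuous numbering turned on, it should come back empty.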
So, what’s inside each clip folder?
Each Clip folder contains a cluster of media and metadata files.
As I said above, you’d expect from your experience with every other form of recorded video that a single clip would be represented by a single file. That’s sadly not the case.
Look at clip folder AA0046 in the image below. Inside that folder is AA0046.CIF, AA0046.CPF, AA0046.THM, AA0046.XML, AA0046.XMP, AA004601.MXF, AA004601.SIF. (Your own clip folder may not contain all of these, but don’t worry: some of them are optional, as described below.)
Of these AA0046.XMP is an Adobe metadata file, created by Premiere, Prelude, or Media Encoder when browsing through the card. (You can stop it from doing that – but that’s another post).
The rest are the files that the C300 or XF305 make when they record the clip onto the card.
AA004601.MXF contains the video, audio and timecode for this short clip. There can be more than one of these for each clip – I’ll get to that later.
AA0046.CIF is a Clip Information File which contains a bunch of gobbledegook code identifying the clip and the camera it was shot on (type, serial and version). There’s a full copy of this code in the INDEX.MIF file – so anything attempting to read the card will read the MIF and match it to this CIF. This also stores some information that you can see and edit in Canon XF Utility, such as the Clip Status (OK / Check / None) and probably a lot more information about camera settings.
AA0046.CPF is the Custom Picture File, which contains information about any picture profile that was used to shoot the clip. Again, it’s gobbledegook, and actually it’s an optional metadata file. If no picture profiles have been set, there’ll be no CPF file here.
AA0046.THM is the Thumbnail image file generated by Canon XF Utility for previewing. Again, this is optional.
AA0046.XML is the XML metadata which is actually readable. You can see a snapshot of its contents below. This is required, but it doesn’t contain a huge amount of useful info. If you want an external program to be able to look at the card, this file shows it the following things that are set in-camera: Creator name, Description, Camera serial number, model name and firmware version, Lens, GPS coordinates and location name. You can also edit these in Canon XF Utility, which will save changes into this file (you can see my own edits in the image here), and of course you can directly edit the text in the XML itself (with a greater chance of breaking it!).
The way this has come in most useful is when checking whether corrupt clips from different cards have come from the same camera, by checking the serial number.
AA004601.SIF must be a Shot/Segment/Span Information File – but I still don’t really know what this is. It contains a lot more unreadable code. I’d love to know more. I have tested switching SIF files with other SIFs from other clips without breaking the card/clip’s readability and without changing what metadata is shown in Canon XF Utility, which suggests that the SIF file doesn’t contain key metadata – but there should certainly be one SIF for every MXF, so when there are multiple MXFs for a single shot, there are matching numbered SIFs too. Like I say, I’ll come back to that, just below.
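Putting that file list together, a clip folder can be sanity-checked with a few lines of Python. This is a sketch under assumptions, not a definitive validator: the XML is required as described above, treating the CIF as required is my own assumption (because INDEX.MIF is matched against it), and CPF, THM and XMP are ignored as optional. The function name is made up.

```python
def clip_folder_problems(clip_name: str, file_names: set[str]) -> list[str]:
    """Report expected files that are missing from one clip folder."""
    problems = []
    # XML is required per this post; requiring CIF is an assumption.
    for ext in ("CIF", "XML"):
        if f"{clip_name}.{ext}" not in file_names:
            problems.append(f"{clip_name}.{ext} is missing")
    # Strip the four-character extension to compare MXF and SIF names.
    mxf = {n[:-4] for n in file_names if n.endswith(".MXF")}
    sif = {n[:-4] for n in file_names if n.endswith(".SIF")}
    if not mxf:
        problems.append("no MXF media file")
    for stem in sorted(mxf - sif):
        problems.append(f"{stem}.MXF has no matching SIF")
    for stem in sorted(sif - mxf):
        problems.append(f"{stem}.SIF has no matching MXF")
    return problems
```

Passing it "AA0046" and the seven file names listed above gives an empty list.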
So, those are the types of files you’ll see in the clip folder. Now for the really annoying bit about Canon XF card structures:
MULTIPLE SPANNED FILES FOR EACH CLIP
AA0046 was a simple clip, because it was short enough (10 seconds) to be completely contained within 1 MXF video/audio file.
Let’s look at clip AA0049, which is a much longer clip – 41 minutes. So on a Canon card, it is made up of 8 (EIGHT) separate consecutive MXF video/audio files.
Canon, because they are sadists, decided to use FAT32 as the filesystem for their professional camera cards. FAT32 cannot store a single file bigger than 4GB, and these cameras split their recordings at roughly 2GB, which is why nobody in video production uses FAT32. Ever.
Let’s just do a little roleplay to explore this decision.
Your company is making a professional video camera that shoots a codec whose file size you know. It’s consistent. How big is it? IT’S 25 GIGABYTES PER HOUR. EVERY HOUR.
Now, imagine you’re the engineer in charge of choosing the file system that your company is going to use for the cards that these files are recorded on. What file system are you going to choose?
Do you choose one that:
A) Allows you to record as much data as you need for every clip? Recording all the data you need into one handy file that can be copied and used anywhere, individually?
B) Limits every file to 2GB, so that when the camera operator records a single shot that’s more than 5 (FIVE) minutes YOUR CAMERA HAS TO START BREAKING IT UP INTO MULTIPLE FILES THAT CAN ONLY BE UNDERSTOOD AS ONE SINGLE CLIP BY YOUR OWN SOFTWARE AS LONG AS IT’S KEPT TOGETHER WITH ALL THE OTHER FILES?
It would take too long, and be too boring (longer and more boring even than this blog post, if you can imagine such a fate) to go into the amount of pain and cost our clients and the rest of the Canon-using professional market have had to deal with as a result of this design, so I’ll just stop moaning and get down to explaining what it actually does. And how we deal with it, apart from moaning.
Where was I? Oh yes, I said we’d look at clip AA0049.
You can see in the image below that AA0049 has 8 separate MXFs. Again, the MXFs contain all the video, audio and timecode information.
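The count of 8 follows from the length of the clip. As a rough worked example in Python, assuming (as described earlier in this post) that one subfile holds about 5 minutes 15 seconds; the exact split point on a real card may differ slightly, and the function name is just for illustration:

```python
import math

# Roughly how much footage one MXF subfile holds, per this post.
SUBFILE_SECONDS = 5 * 60 + 15


def expected_subfiles(clip_seconds: float) -> int:
    """Estimate how many MXF subfiles a clip of this length is split into."""
    return max(1, math.ceil(clip_seconds / SUBFILE_SECONDS))


print(expected_subfiles(10))       # the 10 second clip AA0046
print(expected_subfiles(41 * 60))  # the 41 minute clip AA0049
```

41 minutes is 2460 seconds, and 2460 / 315 is about 7.8, so the clip needs 8 subfiles; the 10 second clip fits in 1.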
We tend to call these either subfiles or spanned files. “Spanned clips” is how people often describe them, meaning that the clip spans multiple files – but that can be confusing because “spanned clips” can also mean something different: when a single clip spans across two consecutive cards. So for the sake of this post, I’ll call these subfiles.
You can see the naming/numbering system they’ve used here for these subfiles: the name of the clip, AA0049, but with 01, 02, 03 tacked on the end.
As described above, you can see each MXF subfile has one of the mysterious .SIFs alongside it, matching its name. So that the MXFs and SIFs look like this:
AA004901.MXF – AA004901.SIF
AA004902.MXF – AA004902.SIF
AA004903.MXF – AA004903.SIF
AA004904.MXF – AA004904.SIF
AA004905.MXF – AA004905.SIF
AA004906.MXF – AA004906.SIF
AA004907.MXF – AA004907.SIF
AA004908.MXF – AA004908.SIF
Canon XF Utility or Adobe Premiere / Prelude, or any other media application which can read Canon XF cards properly will display all of these subfiles as one single clip, as the camera operator intended.
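If you need to do the same grouping yourself in a script, the naming makes it straightforward. Here is a minimal Python sketch, assuming the pattern shown above (a six-character clip name plus a two-digit subfile number); the function name is hypothetical, and this only groups file names: it does not read or join any media.

```python
import re
from collections import defaultdict

# Assumption: clip name (2 letters + 4 digits), then a two-digit subfile number.
SUBFILE = re.compile(r"([A-Z]{2}[0-9]{4})([0-9]{2})\.MXF")


def group_subfiles(file_names: list[str]) -> dict[str, list[str]]:
    """Map each clip name onto its MXF subfiles, in playback order."""
    clips = defaultdict(list)
    for name in file_names:
        match = SUBFILE.fullmatch(name)
        if match:
            clips[match.group(1)].append(name)
    # Zero-padded numbers mean a plain sort gives 01, 02, 03 ...
    return {clip: sorted(names) for clip, names in clips.items()}
```

Given the file names from the AA0049 folder, this returns one entry, "AA0049", holding the eight MXFs in order. The SIF, XML and other metadata files are skipped.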
HOW TO IMPORT CANON XF INTO PREMIERE
The right way to import in Premiere: Media Browser.
In Premiere, you should use the Media Browser panel to view the card folder structure before importing. If you do that, Media Browser should detect that it’s an XF card as soon as you click on the top level folder. You’ll see the little eye icon menu flick over to show that it knows that it’s a Canon card. It won’t show all the subfolders: you’ll see just one playable asset for each of the clips. So for AA0049, you won’t see all the 8 MXF subfiles that make it up, you’ll see one asset called AA0049 – which you can then import by itself, with all its metadata. In fact, since all clips are just shown as single assets, you can select all the clips on the card and import them at the same time, without error.
The wrong way to import in Premiere: Import, or drag and drop
If you use File > Import or you drag and drop the folders or files into Premiere, it will show a bunch of errors as it tries to import all the MIF, CIF, CPF, SIF and XML files into Premiere and fails to understand what they are.
That’s not the problem, though. The problem is that it won’t understand the structure, and it will see all the MXFs as separate files that need importing. It will therefore import every single MXF subfile that makes up AA0049 as a separate clip asset into Premiere. In this case, it will import 8 separate clip assets into Premiere, with names from AA004901.MXF to AA004908.MXF.
The even more confusing thing is, these 8 Premiere assets won’t just have the little 5min 15secs of video/audio that’s covered by their subfile. Premiere will be smart enough to know that they’re part of a larger whole (not smart enough to import only one, though) and so it will show all 8 clip assets with the full 41 minute clip length. You’ll have 8 full length duplicate assets in your project, but all named differently and wrong.
So don’t do that. Use Media Browser.
HOW TO WORK AROUND THIS INSANITY, AND WHY IT IS UNSUSTAINABLE
There are plenty of Media Asset Management (MAM) providers and Archiving systems who don’t tend to understand what’s happening inside a professional camera card, or why a single clip is not a single file. To be fair, the developers haven’t sat where we’ve sat having to look at these formats in mass ingest, so you can’t blame them for not knowing that this is a problem or a requirement.
What you don’t want to do is to transcode all your media to single files of a different format – losing a generation of quality and usually ending up with a larger codec than you started with (e.g. the commonly used DNxHD 10 bit option is 4x the size of the Canon XF’s MPEG2).
What you need to do is convert your media to single MXF files, but without transcoding, preserving the original media data but copied into a single file.
Some people call this transwrapping. At SP, ever since we started doing it in FCP, we’ve called this native rewrapping – bundling all the subfiles in each Canon XF clip into a single MOV file wrapper, without transcoding the video and audio inside the files. We can then rename that MOV and move it around as much as we want. FCP Log and Transfer was an amazing tool for this.
Now you can do the same thing inside Canon XF Utility or Adobe Media Encoder (we specifically requested it as a feature) – turning the files into single MXF files, without transcoding the video data inside them. Preserving their initial quality, but joining all the subfiles in each clip together into one file.
You can do that for Panasonic and Sony, too. (I’ll write up a separate post about the method for this, at some point.)
That’s the method that’s used by a lot of places to manage the ingest and archive of individual clips. That’s how things are imported into NLEs and MAMs, stored on LTO tape for individual recall.
Apart from the fact that professional MAM systems can’t understand the clips unless they’re individual files, there’s a huge inefficiency in only being able to archive and restore whole cards, if you only want to move some of the clips inside them.
Say you shot your media on 64GB cards, and you want to move a couple of 5GB shots on two separate cards. Do you want to move both cards – 128GB – to get those clips moved, or do you just want to pull out the 2 shots separately, for a total of 10GB? If you’re using the original camera cards, you have no choice: you have to move the whole cards.
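To put a number on that, here is the same arithmetic as a tiny Python sketch. The function name is made up, and it assumes every card is full.

```python
def whole_card_overhead(card_gb: float, cards_touched: int, wanted_gb: float) -> float:
    """How many times more data moves when whole cards travel instead of clips."""
    return (card_gb * cards_touched) / wanted_gb


# Two 64GB cards moved to retrieve 10GB of clips.
print(whole_card_overhead(64, 2, 10))
```

That is 128GB moved for 10GB wanted: nearly 13 times more data than you actually need.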
Now expand that to an in-house postproduction facility with an archive of Petabytes of data. Then look at the fact that we’re now shooting UHD and larger – bigger files on bigger cards. We have 128GB, 256GB, 512GB cards coming in – a single shoot can come back with 20TB of data.
You can’t be pushing whole 512GB cards around to pull back a single 10GB clip. You have to break the card up into its constituent clips. The only way to do that is to process those files – transwrap/rewrap them – into single files for each clip.
BUT this process requires people, and machines. In the long term, it will be automated; but, as I’ve said, many off-the-shelf MAM/DAM systems can’t understand and mount these cards. So we have to do it manually, with desktop media apps. Lots of people are dedicated to this every day – just this. That’s an enormous wastage of money, time and resource that would be better applied to something more productive. All because of thoughtless design.
It’s wider than that, though. Say I come back from the shoot and I just want to keep the good stuff, and bin off the bad stuff? I can’t – I’m stuck with the card, unless I want to start tinkering with it and risking breaking it. Say I’ve shot a bunch of footage and want to share just a couple of the original clips at their original quality with someone else? I have to share whole cards: 16/32/64/128/256/512GB cards. Oh, it’s just a 5GB clip, and you want to use the cloud? Too bad – you’ve got to upload the whole card, or process out just that clip, taking the time and duplicating that data just to send it. Why can’t we just have a single file for a single clip, separable from the original card structure? It’s 2016.
I hope somebody at each of the camera companies eventually reads this, and realises that this sucks, and does something about it. I think that to them, it’s an invisible problem; that’s really why I should have written this years ago. As well as the fact that it probably could have helped some of you who have had to go through and learn what I’ve had to, over the past few years, and who have Googled about Canon XF or Panasonic or Sony Folder Structures and found nothing.
Since I started writing this post 24 hours ago, I’ve watched Whiplash (amazing) while drinking beer and gin, slept for 8 hours, made breakfast and drunk a lot of coffee while discussing Kate’s album structure, song choices, lyrics, CD design and career direction, helped Amy with her maths homework, done a bit of this post, gone to Clevedon and walked by the sea, watched Miss Peregrine’s Home for Peculiar Children at the Curzon (vg, just the right amount of scary), come home, had crumpets with marmite and tea (it was 4.45 on a Sunday in October), re-dressed Amy’s painful swollen grazed knee, received hundreds of notifications from a live news client that a piece of storage has shut down (dealt with by other SP people) and sat staring at this screen for another 4 hours. It hasn’t been a bad day.
It’s Sunday night at 10.20pm, so I’m going to stop and pick it up again next week, with more unspeakable geekery. Probably a follow up to this, about Panasonic P2.
But for now, I’m going to go and have another little gin with Kate before bed, where I hope to dream of a world where camera card design is less stupid.