We didn’t have any shared-mime-info release for some time, with the last one dating back to March 2005. Much work has been accomplished recently, though. Debian patches were pushed upstream and, which is really nice, we finally found a solution for the long-standing issue of multiple MIME types per extension, which caused media container b0rkage. This justifies a new release within the next few weeks.
Let me go into a bit of detail on the “multiple MIME types per extension” issue:
Some of you might already have noticed that when Nautilus determines that the contents of a file differ from what its extension suggests, and you select the file, its icon suddenly changes, it gets re-sorted when you sort by MIME type, etc. Among other things, this used to be a widespread phenomenon for Windows Media files, and constantly happened for all of them.
In terms of MIME handling, media “containers” are quite evil.
You often have multiple media streams encapsulated into one file/stream (this is called “multiplexing”). The enclosing stream/file is called a container. When naively looking at the container’s file extension, it is not possible to tell what the contents are. Since nowadays there are many media container formats available, like Ogg or ASF, it is crucial to deal properly with those encapsulated streams, and give the user the possibility to associate audio players with encapsulated audio, and video players with encapsulated video.
To make the situation worse, mistakes were made when determining MIME types for containers. For instance, all Ogg files are meant to have the MIME type “application/ogg”, according to the IANA. This is a very bad idea, because of the bad user experience described in the previous paragraph.
Summing up the above, the MIME container analysis revealed two major points of attack:
a) never ever determine the MIME type of a container without peeking at its contents
b) we need more MIME types
a) was resolved to 98% by Matthias Clasen, who improved the xdgmime API to include support for this. Because the container concept can’t simply be mapped to MIME type definitions, he took a different approach: he hacked support for N:M relations between filename glob patterns (“*.ogg”) and MIME types into xdgmime, adding API for querying whether multiple MIME types exist for a particular glob pattern. Thus, xdgmime can now query whether multiple MIME types exist for a particular passed-in file name and, if so, conclude that peeking at its contents is always required, telling its client that the actual MIME type is unknown. The last 2% were slight adaptations of helpers exported in the API to actually make GnomeVFS semantics work correctly with it, which was my contribution.
Note that these changes can also be extremely useful for other use cases, like the file extension “*.pot” being assigned to both translation templates and PowerPoint presentations.
b) For Ogg, the task was quite straightforward:
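To make the N:M idea a bit more tangible, here is a minimal Python sketch. It is not the xdgmime API (that one is C, and its function names are not shown in this post); the table `GLOBS` and the helpers `types_for_filename` and `guess_from_name` are made up for illustration, and the MIME type strings are just example entries. It only shows the logic: if a glob pattern maps to more than one MIME type, the file name alone is not enough and the caller is told so.

```python
from fnmatch import fnmatchcase

# Hypothetical glob table: one pattern may map to several MIME types (N:M).
GLOBS = {
    "*.ogg": ["application/ogg", "audio/x-vorbis", "audio/x-oggflac",
              "audio/x-speex", "video/x-theora"],
    "*.pot": ["text/x-gettext-translation-template",
              "application/vnd.ms-powerpoint"],
    "*.png": ["image/png"],
}


def types_for_filename(name):
    """Return every MIME type whose glob pattern matches the file name."""
    matches = []
    lowered = name.lower()  # treat extensions case-insensitively
    for pattern, mime_types in GLOBS.items():
        if fnmatchcase(lowered, pattern):
            for mime_type in mime_types:
                if mime_type not in matches:
                    matches.append(mime_type)
    return matches


def guess_from_name(name):
    """Return the MIME type if the name is unambiguous, else None.

    None means "unknown from the name alone": the caller has to peek
    at the contents before deciding anything.
    """
    candidates = types_for_filename(name)
    if len(candidates) == 1:
        return candidates[0]
    return None
```

With this table, `guess_from_name("pic.png")` gives a definite answer, while `guess_from_name("song.ogg")` and `guess_from_name("talk.pot")` return `None`, so the client knows it has to sniff.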
“application/ogg” is used for “*.ogg” and its magic matchlet (the sniffing information) matches against all Ogg files.
The trick is now to define new MIME types for the encapsulated streams, all of which also match “*.ogg”, the Ogg magic matchlet mentioned above, and an additional stream type-specific magic matchlet:
“audio/x-vorbis” maps to “Ogg Vorbis audio”
“audio/x-oggflac” maps to “Ogg FLAC audio”
“audio/x-speex” maps to “Ogg Speex audio”
“video/x-theora” maps to “Ogg Theora video”
Additionally, we have an “OGM video” MIME type matching “*.ogm”, whose contents match a modified version of the Ogg magic matchlet. It is a bit of a legacy type, considering that it was mainly added because the crappy Windows MIME system doesn’t allow any contents sniffing, and it is used to identify Ogg videos.
All of the mentioned stream types are registered as sub-types of “application/ogg”, so that old MIME associations are not broken and apps operating on the container itself still work as expected.
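The combination of container matchlet, stream-specific matchlet and sub-typing can be sketched roughly like this in Python. This is not how shared-mime-info or xdgmime implement it; `sniff_ogg`, `is_subtype` and the tables are hypothetical names for illustration. The byte signatures and the offset of 28 are assumptions: they presume the first Ogg page carries a single segment, so that the codec identification header starts right after the 27-byte page header plus a one-byte segment table. The MIME type names are the ones listed above.

```python
OGG_MAGIC = b"OggS"   # container signature at offset 0
STREAM_OFFSET = 28    # assumed start of the first packet (see lead-in)

# Assumed stream-specific signatures found at STREAM_OFFSET.
STREAM_MAGIC = [
    (b"\x01vorbis", "audio/x-vorbis"),
    (b"\x7fFLAC", "audio/x-oggflac"),
    (b"fLaC", "audio/x-oggflac"),
    (b"Speex  ", "audio/x-speex"),
    (b"\x80theora", "video/x-theora"),
]

# Every stream type is a sub-type of the container type.
PARENTS = {mime: "application/ogg" for _, mime in STREAM_MAGIC}


def sniff_ogg(data):
    """Return the most specific Ogg MIME type for the given leading bytes.

    Returns None if the data is not Ogg at all, and falls back to the
    generic container type if no stream signature matches.
    """
    if not data.startswith(OGG_MAGIC):
        return None
    for magic, mime in STREAM_MAGIC:
        if data.startswith(magic, STREAM_OFFSET):
            return mime
    return "application/ogg"


def is_subtype(mime, ancestor):
    """Walk up the parent chain to see whether mime derives from ancestor."""
    while mime is not None:
        if mime == ancestor:
            return True
        mime = PARENTS.get(mime)
    return False
```

An application that registered itself for “application/ogg” would still be offered for a file sniffed as “video/x-theora”, because `is_subtype("video/x-theora", "application/ogg")` holds; that is the point of registering the stream types as sub-types.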
a) and b) were resolved today for Ogg. b) hasn’t been resolved for ASF yet, but I’m quite optimistic that it will make it into the next shared-mime-info release as well.
This is really major progress in xdgmime/shared-mime-info, and should convince the KDE guys to take a close look at the bundle for KDE 4, although I can’t tell how editor-friendly the shared-mime-info file structure is. I doubt they’ll tell users that MIME associations are the distributor’s job and they should just file bugs if something goes wrong, like we did. MIME type editing will probably still be possible in KDE 4. If the MIME editor friendliness of the structure is bad, take it as revenge for the over-complex arcane XML-based menu system we adopted ;).
Congrats to KDE 3.5, btw.
Update
Peter Bowen points out that a MIME naming scheme following the XML naming spec (“video/x-foo+ogg”) would be more convenient. I agree and changed it accordingly.
Maybe it would make more sense to use application/x-*+ogg, similar to XML, as it has the same purpose and issues? So application/x-flac+ogg, application/x-vorbis+ogg, etc.
The XML Mimetype RFC basically points out the same issues with XML you are talking about with ogg.
It would be great if raw files like .cr2 weren’t detected as TIFF files. Is there anything that can be done about that?