soundprint

As some astute observers may be aware, free software isn’t my only nerdy obsession. A quick perusal of my Flickr photos may reveal some of my other interests. If you guessed “taking poor pictures of wildlife”, you’d be pretty close.

Zonotrichia albicollis

Yes, I watch birds.

To make a long story short, a couple of years ago I became quite interested in the vocalizations of birds: learning their calls and songs, learning to identify a bird by ear. It turns out that in order to really internalize a sound, it’s actually very helpful to be able to visualize it. This is generally done with a spectrogram, a plot of frequency vs. time. To generate these spectrograms, I’ve been using a slightly modified version of spek, which is a great little program. However, I’ve also found myself wishing I could have an easy visual overview of all of the files in a folder, so that I could see at a glance what sort of sound each one contains.
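
For anyone curious what goes into such a plot of frequency vs. time, here is a minimal sketch in pure Python. It is not how spek or any other particular tool does it, just an illustration under simple assumptions: the audio is already a list of floating-point samples, and the names `spectrogram`, `peak_frequency`, `frame_size`, and `hop` are made up for this example. Each short, overlapping slice of the signal is windowed and run through a discrete Fourier transform; the resulting columns of magnitudes, laid side by side, form the picture.

```python
import cmath
import math


def spectrogram(samples, frame_size=256, hop=128):
    """Return one list of magnitudes (lowest to highest frequency) per time slice."""
    # Hann window: fades each slice in and out to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]

    # Only bins 0..frame_size/2 are needed for a real-valued signal.
    bins = frame_size // 2 + 1
    # Precompute the DFT basis once; it is the same for every slice.
    basis = [[cmath.exp(-2j * math.pi * k * n / frame_size)
              for n in range(frame_size)]
             for k in range(bins)]

    columns = []
    # Step through the signal in overlapping slices of frame_size samples.
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        # Direct (slow) DFT: one magnitude per frequency bin.
        columns.append([abs(sum(c * b for c, b in zip(chunk, row)))
                        for row in basis])
    return columns


def peak_frequency(column, sample_rate, frame_size=256):
    """Return the frequency in Hz of the strongest bin in one time slice."""
    k = max(range(len(column)), key=column.__getitem__)
    return k * sample_rate / frame_size


if __name__ == "__main__":
    # Hypothetical test signal: a quarter second of a pure 1000 Hz tone.
    rate = 8000
    tone = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate // 4)]
    columns = spectrogram(tone)
    print(len(columns), "time slices,", len(columns[0]), "frequency bins each")
    print("strongest frequency in the first slice:",
          peak_frequency(columns[0], rate), "Hz")
```

A real tool would use a fast Fourier transform instead of this slow direct one, and would usually draw the magnitudes on a logarithmic (decibel) scale.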

By happy coincidence, I have just learned some basics of GStreamer, so I thought it would be a nice opportunity to kill a couple of birds with a single stone[1]. So after a couple of hours of hacking, I’ve pushed a git repository for a little utility I’ve tentatively called ‘soundprint’. It generates a sort of fingerprint for sound files: a spectrogram of the first 5 seconds of audio. It also installs a .thumbnailer file so that Nautilus can use it to generate thumbnails for audio files. It’s quite simplistic, but it does what I want it to do.
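
For readers who haven’t met the mechanism before, a .thumbnailer file is a small key file that tells the GNOME thumbnailing code which command to run for which MIME types. The fragment below is only a sketch of the general shape. The command name `example-audio-thumbnailer`, its argument order, and the MIME type list are placeholders invented for illustration, not the contents of the file that soundprint actually installs.

```ini
[Thumbnailer Entry]
# Placeholder command; only used to check that the program is installed.
TryExec=example-audio-thumbnailer
# %i is the input file path, %o the output image path, %s the requested size.
Exec=example-audio-thumbnailer %i %o %s
# Placeholder list of the audio types this thumbnailer claims to handle.
MimeType=audio/mpeg;audio/ogg;audio/x-wav;
```

Files like this are typically picked up from a `thumbnailers` directory under the system’s shared data directory (for example `/usr/share/thumbnailers`).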

Spectrogram thumbnails for audio files

I admit that it’s a bit of a niche application. Spectrograms work best on audio that consists of relatively pure tones; music files tend to end up looking fairly similar to each other. But in the hope that it may be useful to somebody else, there it is.

[1] no birds were actually killed during this process.

11 Comments

  1. Simon
    Posted August 18, 2011 at 4:54 am

    It’s an interesting hobby. My workplace is within walking distance of a large park, and I regularly go there with my camera in my lunch break. I enjoy the challenge of trying to get good photos of some of the local birdlife…

  2. Stefan Sauer
    Posted August 18, 2011 at 11:43 am

    Really nice. Next feature – using a rainbow color scale :) and skipping initial near-silence.

    • Posted August 18, 2011 at 2:38 pm

      I know some people like the rainbow color scale, but to me it makes the spectrogram almost useless, because I can’t instinctively tell the difference in amplitude between, e.g., a bright green and a bright red. I need to go check the legend to figure out which color is supposed to represent a higher amplitude. On the other hand, I can instinctively tell the difference in amplitude between black and white (or gray) with a very brief glance. In fact, you may notice that in the post I linked to a bug about the modifications I made to spek. That patch was actually to switch from a rainbow color scheme to a monochrome scheme ;)

  3. Yo'av Moshe
    Posted August 18, 2011 at 12:10 pm

    So cool. Seriously.

  4. Posted August 18, 2011 at 12:59 pm

    That thumbnailer is great! Why not propose that for incorporation upstream into GNOME? Or at least a soundwave thumbnailer rather than a spectrum one, so as not to take as much processing power :)

  5. Posted August 18, 2011 at 2:31 pm

    So neat! Do you make a habit of uploading this stuff into Wikimedia Commons, too?

    Your footnote was very funny.

    • Posted August 18, 2011 at 2:57 pm

      Unfortunately, I don’t. I do license them so others can upload them there though! ;)

      • Posted August 24, 2011 at 8:03 am

        Thanks! I mentioned your project here: https://secure.wikimedia.org/wikipedia/en/wiki/Wikipedia_talk:WikiProject_Birds#Bird_photo_and_birdcall_recording_source and was wondering whether the birdcalls and your spectrograms are uploaded somewhere, freely licensed, so they, too, can be used on Commons. Thanks!

        • Posted August 24, 2011 at 9:58 am

          Actually, almost all of the audio files that I use are at xeno-canto.org (which is a fabulous resource for anyone who has any interest in bird song), so they’re all licensed under Creative Commons licenses, and I’m fairly sure that the Wikipedia birds project is well aware of xeno-canto already. Unfortunately the standard xeno-canto license is more restrictive than I’d like, so all recordings that I’ve actually uploaded are also licensed under a more permissive license (share-alike).

          As for the spectrograms, I don’t actually upload them anywhere. This program is just to make thumbnail icons so that I can recognize the files more easily in the file manager.

  6. bt
    Posted August 18, 2011 at 5:25 pm

    That’s amazing, something original. Nice work!

  7. Posted September 2, 2011 at 3:17 pm

    Hello. I came here via a Google search for bird vocalizations. I then recognized your name as being on many eBird submissions. Most of these were from Powderhorn Park.

    I’m a first-year birder who has checked out Powderhorn a couple of times and has never seen much of anything other than a lot of Wood Ducks. You seem to be seeing a lot of interesting warblers there.

    Can you tell me in what area of the park you have had the most luck? Or perhaps we can meet up there sometime.

    Thanks much.

    Brian