Why Phonon is a broken wheel

So I attended the OSDL Desktop Summit in Mainz, Germany from Sunday to Tuesday evening this week. The goal of the meeting was to follow up on the tasks done at the earlier meeting in Portland and to look for new areas that could do with some work, multimedia being the one pulling me in.

My goal, and that of the GStreamer community, in such a scenario is of course to advocate the use of GStreamer as the answer free software can offer to the advanced media frameworks on other platforms. We believe we have the best and most extensive framework available, and with the work currently being done in the community this is not likely to change anytime soon.

In the discussion, the approach taken by the Phonon abstraction layer, which the Phonon project is advocating for inclusion in KDE4, also came up. I have held back from blogging about Phonon for some time to avoid flamewars, but I don’t want to see efforts like OSDL delayed because setups like Phonon are being promoted or thought of as a workable solution to the issues faced. Let me start off with a brief introduction to the area of multimedia frameworks.

First of all, multimedia frameworks are very complex systems that handle very hard technical problems. They have to cope with the legacy of the analog past (for some technical information in this field I recommend fourcc.org), a forest of media formats with different traits, a host of features and deficiencies in hardware, and a wide range of other software to interact with, such as sound servers and legacy systems. On top of this you add performance and latency requirements, network protocols and multiplatform issues. In the end you have a problem space where even those who have worked on the issues for many years need to keep their heads clear when designing the framework.

Multimedia frameworks are also by their nature abstraction layers themselves, trying to abstract away all the demuxers, muxers, decoders, encoders, cameras, soundcards, sound systems, network protocols and various types of filters into a coherent and user-friendly API.

With all this complexity, the frameworks are struggling not only to bring you a coherent API, but also to offer some high-level APIs that are useful for application developers. What we discovered in GStreamer is that while application developers initially ask for a ‘play this file’ API, which is what we offer through our ‘playbin’ component, they often end up pushing playbin to its limits and sometimes do not use it at all in the end, as they want to do fancy stuff which demands manipulating the pipelines more directly.
In many cases application authors want to do something which demands that a special plugin or filter be written.

Now to get back to why I think Phonon is conceptually broken. First of all, it is destined to fall into one of two traps. Either its API becomes so high-level and limited that application developers will shun it due to a lack of features, meaning that you have an API useful for doing ‘ding’ sounds in standard applications, but anyone wanting more powerful operations will feel it’s a bigger hindrance than a help. On the other hand, if they actually try to implement a feature set that is big enough to satisfy at least a subset of, for instance, music player writers, then they will be forced into accessing things so deep in the frameworks that the operations become framework-specific, and generalizing them away into a common API will at best be a kludge and at worst produce broken behaviour that changes depending on the framework chosen.
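The first trap can be illustrated with a small sketch. The backend names and feature sets below are made up purely for illustration and describe no real framework; the point is only that an API promising identical behaviour on every backend can expose no more than the intersection of what they all support.

```python
# Hypothetical backends and feature sets, invented for illustration only.
BACKEND_FEATURES = {
    "backend_a": {"play", "pause", "seek", "volume", "equalizer", "transcode"},
    "backend_b": {"play", "pause", "seek", "volume", "visualisation"},
    "backend_c": {"play", "pause", "volume"},
}


def common_api(backends):
    """Return the features every backend supports (the portable subset)."""
    feature_sets = list(backends.values())
    return set.intersection(*feature_sets)


def backend_specific(backends):
    """Return, per backend, the features a portable API cannot expose."""
    shared = common_api(backends)
    return {name: features - shared for name, features in backends.items()}


if __name__ == "__main__":
    print(sorted(common_api(BACKEND_FEATURES)))
    for name, extras in sorted(backend_specific(BACKEND_FEATURES).items()):
        print(name, sorted(extras))
```

In this made-up example the portable subset shrinks to play, pause and volume. Everything else either stays out of reach or has to be exposed in a backend-specific way, which is the second trap.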

Phonon also falls short in many other areas. For instance, the stated goal is to let application writers write their applications against one API and have them work with a host of media frameworks. The reasoning is that no framework ‘does it all’, so having this flexibility is a good thing. This logic falls down quickly when you start thinking about it. While the general statement that no framework supports all formats or all features is true, the opposite, that a combination of multiple frameworks ‘does it all’, is equally untrue. An application developer who wants to add support for a new or rare audio format, for instance, would very often need to write the plugin or library to support it him/herself anyway. Targeting an API with, say, five backends doesn’t make that job easier; it actually means that an application developer who wants to ensure all users can play this format needs to repeat the job five times.

So if you choose to standardize on one framework, like GNOME has done with GStreamer, then there is a chance that the application developer wants to support a format that GStreamer doesn’t currently support. But at least the developer has a clear idea where to add support for this format.

Also, in a usability context, telling users to make hit-and-miss framework changes based on which media file they are trying to play is simply broken, and I have a hard time believing that this is the user experience anyone wants to present to users of KDE4.

The counter argument to this is that Phonon does allow application developers to force, or at least strongly suggest, a specific backend for the user to use. This does solve the problem of the application developer knowing where to add support for something, but it also means that another stated goal of Phonon, to avoid enforcing such a ‘heavy’ dependency as GStreamer, might very well be replaced by enforcing five different media frameworks to be bundled with KDE4 as a whole. And if you think one framework is a heavy dependency, then I promise you that five is not less. It also reduces the synergy effect of the KDE community a lot, as it means that the work done by one music player author to add support for a new format will not be automatically available to the other KDE music players.

What we have been advocating for a long time from the GStreamer corner is that if both GNOME and KDE share a multimedia framework the synergy effect for both desktops will be huge.

My final objection to Phonon is that even if they manage to prove me wrong on their ability to provide a truly useful limited cross-framework API, and demonstrate that having a menu option offering your grandma to play her music using framework X, Y or Z actually solves more problems than it creates, I still think that it falls short, because it wouldn’t provide an API for writing applications like Pitivi, Diva, Jokosher, Buzztard, Flumotion and so on, which I think is where we want to be today in order to provide a competitive desktop. MacOS X and Windows Vista are showing us that this is the role that the desktop is heading towards.

One scenario I know has been contemplated is using Phonon, but at the same time saying that GStreamer is the recommended framework if you want to do something outside the scope of Phonon. But my opinion is that in this use case focusing on Qt-style bindings for GStreamer is a better solution and a much easier thing to do, and would result in something more useful for developers and users alike.

So I hope that interested people in the KDE community agree with my analysis and start working on Qt-style bindings for GStreamer, and that as a result Phonon falls by the wayside. If not, well, hopefully we will at a minimum be able to cooperate on some of the lower-level issues in the desktop, like improved driver handling through HAL.

All this said, people from the GStreamer community will of course try to help out the people developing the GStreamer Phonon backend, for instance. We do our best to help anyone using GStreamer, even when they do something whose viability or direction we don’t believe in. Zaheer Merali, for instance, has already volunteered to mentor or co-mentor anyone interested in working on Phonon-GStreamer integration as part of the Google Summer of Code, and as far as I know there were multiple proposals submitted for that.

Trick modes in GStreamer

Jan just demoed trick modes on his machine using GStreamer. Trick modes is the term used for playback at double speed, quadruple speed, half speed, backwards and so on. A little code cleanup is still needed before committing, but it will be in GStreamer CVS next week.
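As a rough illustration of what a playback rate means, the sketch below maps elapsed wall-clock time to a media position for a given rate. It is plain Python arithmetic, not GStreamer code; the function name and the clamping behaviour are assumptions made for this example.

```python
def media_position(start, rate, elapsed, duration):
    """Media position (seconds) after `elapsed` seconds of wall-clock time.

    A rate of 2.0 is double speed, 0.5 is half speed, and a negative
    rate plays backwards. The result is clamped to [0, duration].
    """
    if rate == 0:
        raise ValueError("a rate of 0 would never advance; pause instead")
    position = start + rate * elapsed
    return max(0.0, min(float(duration), position))


if __name__ == "__main__":
    # Start 10 seconds into a 100 second clip and let 4 seconds pass.
    for rate in (2.0, 4.0, 0.5, -1.0):
        print(rate, media_position(10, rate, 4, 100))
```

A negative rate simply walks the position back towards the start of the clip, which is why backwards play fits in the same model as fast and slow playback.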

The initial goal is to support server-side trick modes properly, like when getting a video feed from a ViiV-enabled server, but it already supports some client-side trick modes too.

Along with the recently added Quality of Service framework and network clocks we are adding a lot of advanced functionality to GStreamer these days.

Edward and Wim also did some critical fixes in GStreamer and GNonlin today for Jokosher, to help ensure that the Jokosher team will be able to demo a working application at GUADEC.

The Google Summer of Code deadline is approaching fast (Monday morning) and we still have room for more students to propose GStreamer-related projects under the GNOME, Xiph or BBC banners. Be sure to check out these project idea lists and submit a proposal. Other projects are probably also still open to more proposals, but I am not mentoring those so I don’t know their status.

GStreamer and SoC

So the application period for Google Summer of Code is this week, so any interested students should get their stuff together and start applying to any project of their liking. In addition to the projects mentioned in that previous blog entry, Mono, Ruby and Ekiga have GStreamer proposals too. And last but not least, the BBC will be accepting GStreamer and Schroedinger/Dirac related proposals as part of their SoC program. The GStreamer MXF container plugins might be better suited to run under the BBC header than under the Xiph.org banner where they currently reside. If you have any questions please come by the #gstreamer channel on irc.freenode.org or mail me at uraeus at linuxrising dot org, and I will try to put you in touch with the right people.

So multimedia is still considered a weak point for GNU/Linux systems; be sure to take on a project this summer that makes that less so! Take on the right projects and we will be a huge step closer to multimedia nirvana when fall comes.

GStreamer and Google Summer of Code

Are you a student? Do you belong to the best and brightest among us? Then you probably woke up this morning wanting to do a GStreamer-related Google Summer of Code project. But to your horror GStreamer was not listed as a mentoring organisation for Google Summer of Code.

Well things are much brighter than they might look at first glance. In fact GStreamer is available as a Google Summer of Code project from a long list of mentoring organisations and GStreamer hackers are mentoring or offering to mentor many of them. I will try to give an overview of projects available.

GNOME:
There are multiple projects available as part of the GNOME project. Proposals already listed there are doing a Gst-editor application for GStreamer 0.10 using Python and improving the integration of Bluetooth devices in GStreamer (this second project could maybe be integrated with Bastien’s Bluetooth manager project). There are also a few Totem-related projects, like improving the Totem Mozilla plugin, getting GStreamer DVD support working in Totem and Annodex support in Totem (the Annodex project might be mostly Xine hacking actually, as GStreamer already supports Annodex). There is also a project proposal from the Jokosher project to work on LADSPA support in GStreamer and Jokosher.

Xiph.org

Through our close collaboration with the Xiph.org project there are many GStreamer related projects listed on their Summer of Code page. Projects include doing MXF plugins for GStreamer (MXF is the container format used in the TV industry), RTP plugins for Vorbis and Theora, OggMNG support and more. In fact almost every Xiph.org project either includes GStreamer or would be of direct use for the GStreamer project. I strongly recommend taking a look at them.

KDE
KDE has a project to create a GStreamer backend for their Phonon media playback framework. I am also sure that they would be open to more proposals from interested students. For instance, Edward Hervey is offering to mentor a project to create a Qt/KDE frontend to the Pitivi non-linear video editor using the KDE Python bindings.

OpenSolaris

The OpenSolaris project has a lot of GStreamer related tasks listed on the ideas page like improved GStreamer hardware plugins for Solaris and JMF/GStreamer integration.

handhelds.org

The handhelds.org project lists a number of tasks related to the GPE embedded environment where GStreamer could and should be part of the solution, like their VoIP proposals. This work should probably be based upon the work of the Farsight project, which is working on this in the context of Maemo.

XMMS2

XMMS2 has a project about evaluating and maybe porting XMMS2 to use GStreamer. A good project to take on if you want to limit fragmentation and help consolidate the desktop.

Others:
There are other projects too that would probably accept GStreamer-related projects. Remember that as long as you are able to find a mentor, most organisations are happy to take in good projects proposed by students. For instance, BBC Research would probably be willing to take on Schroedinger – Dirac related projects or the MXF plugins also listed under Xiph.org. David Schleef is willing to mentor a student writing a Dirac encoder targeted especially at desktop recording, using things like the X damage extension. The Mono project might be interested in taking on a project to do a plugin for f-spot that takes selected photos and creates an Ogg/Theora/Vorbis movie file of a slideshow of the photos, transitioned with effects and muxed with audio from, say, an rb/banshee playlist. iPhoto on Mac OS X has a similar feature and so does ULead CD & DVD PictureShow on Windows. The MythTV project might be interested in a project to port it to use GStreamer. The Creative Commons project might be willing to take on a project to ensure easy CC tagging of all files generated by GStreamer. The Gimp project would maybe be interested in a Pitivi/Gimp integration project using GEGL. The examples just go on and on.

Patent pains

Rambus wins patent claim

Saw today that Rambus won in a court case over a memory maker called Hynix. The jury in the case said that all 10 of the Rambus patents in question were valid. My first thought was that if all 10 patents were found valid, the jury probably looked more at the nationality of the companies involved: with one being an ‘evil Korean’ company taking American jobs and the other being a ‘good American’ company defending its intellectual property, it was a clear-cut case.

I have become more and more sceptical of the jury system over the last years, having started out as a strong supporter of it. I think in a lot of cases the jury doesn’t have the competence to judge a case, and they are also more likely to be swayed by non-legal things, ranging from how charming the parties involved in the case are to local political considerations. It reminds me of how Jonathan Schwartz was saying that they didn’t want to have to take the Kodak patent case in front of a jury in Kodak’s home state, as the jury would be too predisposed towards Kodak. I think this is also something SCO is trying to bank on in their ongoing case with IBM: that a local Utah jury will be more sympathetic to their plight as a small local company, will not be able to fully understand the technical questions involved, and will thus give them a favourable judgement.

On the other side, the recent patent case between RIM and NTP shows that even professional judges are not as good as one could hope, with the judge trying to force a settlement even in light of the patent office seemingly being about to invalidate most of the patents in question. Once again you had the situation of a foreign company versus a local one, but I think the major problem here was the judge seemingly thinking that forcing a settlement would give a result that was a ‘fair compromise’.

Of course, the fact that in many of these cases today’s victim was yesterday’s troll doesn’t make things easier, especially in the court of public opinion.

Allergy pains

So it seems there is something in the air this year in Barcelona that is causing me grief. My eyeballs hurt, I have a constant headache and a general feeling of being about to become seasick.

I do tend to be a little busted during spring, but I don’t think I was even close to this last year, so I guess there is more of whatever is causing it this year.

The last Norwegian invasion?

There was an article in a Norwegian newspaper today about what I believe is the last attempt by a Norwegian king at a foreign invasion, namely Harald Hardrada’s invasion of England in 1066. (Hardrada means ‘hard ruler’, as his rule was supposed to have been very harsh.)

The story of Harald Hardrada has always fascinated me, as he played a part in many important events in Norwegian history, like participating in the battle of Stiklestad in 1030. This battle is famous in Norwegian history as it was the battle which basically turned Norway into a Christian country, even though the army fighting to keep the faith in the Norse gods won the battle and managed to kill the king later known as Olav the Holy.

In 1034 Harald Hardrada travelled to Constantinople and took service with the Byzantine emperor, which was where he gained his battle experience and wealth. In 1045 he returned to Norway and became co-ruler for a year, before gaining sole kingship. Harald Hardrada is also known as the founder of Oslo, which is today the capital of Norway, which is why a statue of him decorates the city hall.

Anyway, the story in today’s paper was about the Battle of Fulford, which I have to admit I didn’t know about. The Battle of Fulford preceded the more famous (and, for Harald Hardrada, final) Battle of Stamford Bridge. The Battle of Stamford Bridge of course preceded the even more famous Battle of Hastings, in which Harold Godwinson (the then king of England) was beaten by William the Conqueror.

It seems the site of the Battle of Fulford is actually preserved today in a condition almost identical to that of 1066, with an English society working to keep it preserved, as there is a recent effort to build a road and housing over the site.

So if you are in that area, lend your support to the Battle of Fulford group :)

GUADEC video

We got some DVDs with a presentation video of Vilanova i la Geltru, which I thought would be nice to transcode to Ogg and share with everyone.

While waiting for Thoggen to get ported to 0.10 I had to make do with gst-launch. The pipeline below is what I managed to put together with the help of Zaheer.

gst-launch-0.10 dvdreadsrc title="5" ! decodebin name="dvd" dvd. ! ffmpegcolorspace ! video/x-raw-yuv,format=\(fourcc\)YUY2 ! videoscale method=1 ! video/x-raw-yuv,format=\(fourcc\)YUY2,width=360,height=288,pixel-aspect-ratio=\(fraction\)16/15 ! videorate ! video/x-raw-yuv,framerate=25/2 ! ffmpegcolorspace ! theoraenc ! queue ! oggmux name=mux ! gnomevfssink location=file:///home/cschalle/vilanova_present.ogg dvd. ! audioconvert ! vorbisenc ! queue ! mux.

What this pipeline basically does is take the mpeg2/ac3 on the DVD, scale it down to 360×288 size, drop the framerate to half of the original and output the result as an Ogg Theora/Vorbis file.
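The numbers in the caps can be checked with a few lines of Python. This sketch assumes the DVD is a PAL title stored at 720×576 and 25 frames per second, which the post does not state explicitly; only the 360×288 size, the 16/15 pixel aspect ratio and the 25/2 framerate come from the pipeline.

```python
from fractions import Fraction

# Assumed source format (PAL DVD); not stated in the post.
SRC_WIDTH, SRC_HEIGHT = 720, 576
SRC_FRAMERATE = Fraction(25, 1)

# Values taken from the caps in the pipeline above.
OUT_WIDTH, OUT_HEIGHT = 360, 288
PIXEL_ASPECT_RATIO = Fraction(16, 15)
OUT_FRAMERATE = Fraction(25, 2)


def display_aspect_ratio(width, height, par):
    """Shape of the displayed picture: storage aspect times pixel aspect."""
    return Fraction(width, height) * par


if __name__ == "__main__":
    print(display_aspect_ratio(OUT_WIDTH, OUT_HEIGHT, PIXEL_ASPECT_RATIO))
    print(Fraction(OUT_WIDTH, SRC_WIDTH), OUT_FRAMERATE / SRC_FRAMERATE)
```

With these values the picture comes out at 4:3, and both the frame width and the framerate are half of the assumed source, which matches the description of the pipeline.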

Presentation video of Vilanova i la Geltru, home of GUADEC 2006

Anyway, you can now get an impression of the city of Vilanova i la Geltru by looking at this presentation video of Vilanova, available in Ogg format and licensed under the Creative Commons Attribution-ShareAlike license. You can also watch the video online using the Cortado java-applet through this link.

Dirac
I have also been testing the Schroedinger Dirac implementation recently. Thanks to Ralph Giles there is an official Dirac in Ogg specification now, and I am able to create Ogg Dirac files using the GStreamer plugins provided by the Schroedinger project. We still have some way to go before this is truly useful, but it is nice to be able to actually encode something and view it in Totem.

svideo and linux continued

So in my previous blog entry I mentioned my initial work to get my svideo output working with Linux. I did notice though that there was one remaining issue, which was that there was a black border around the computer screen image on the TV. I ended up spending more time on resolving that than I did on getting the thing working in the first place. Anyway, Jan aka thaytan told me (after I had already spent quite some hours on the problem) that there is an option called TVOverScan in the xorg.conf file which can be used to scale the image up and get rid of black borders like the ones I had. The problem was that whatever I set TVOverScan to, my nvidia board seemed to ignore it. Adjusting it using nvidia-settings, however, worked fine. It seems that TVOverScan in xorg.conf gets ignored, so what I did instead was set up my system to run ‘/usr/bin/nvidia-settings --load-config-only’ on login to solve it. A bit hackish, but it will have to do for now.

I also rediscovered my old issue of nautilus-cd-burner not being able to deal with both my internal CD writer and my USB DVD burner at the same time. I ended up having to remove the internal drive and reboot to get it to deal with my USB drive properly.

Enabling SVIDEO from my Dell 8600 laptop

On my Inspiron 8600 laptop I have an SVIDEO output on the back which I thought would be nice to use for playing back videos etc. on my TV. Although my TV also has SVGA input support, the cable for that is way too short to work nicely for me. Anyway, after a lot of googling and testing back and forth I managed to put together this xorg.conf file which does what I want, giving me a separate X screen on the svideo port. I tried playing some movies onto it yesterday and it worked very nicely. In addition to duplicating everything for two screens, the magic was in the BusID option and the tvstandard, tvoutformat and connectedmonitor options.

Getting this going though was a very manual process with editing the xorg.conf file, reading the NVIDIA driver README and googling to find answers to some specific questions. (None of the Linux on Dell Inspiron sites seemed to have actually tried testing/using the SVIDEO output port).

I assume part of making this nice is that X could do with some HAL/dbus magic in order to be able to handle this in a more automated fashion.
I am not sure in the end whether xorg or GNOME will be able to offer something to set these things up in a nice GUI, or whether we depend on the hardware vendors to do this because it is relatively vendor-specific. NVidia already have a little GTK+ based setup tool bundled which maybe they could extend (currently it only seems to allow you to adjust stuff, not add anything). Does anyone know if there are any efforts by anyone in this area currently?

This reminds me of my USB soundcard issues from some time ago. While there is rudimentary support in the drivers, we still have some way to go before it’s ready for joe average user. And when we do get to the point of trying to make it joe average friendly we will probably find, like they discovered with Network Manager, that the drivers need a lot of fixes before being ready to work properly in such a scenario. At least for the sound card scenario we should have infrastructure for it in the next release of GNOME thanks to Jürg Billeter’s work. Hopefully the makers of USB soundcard drivers follow suit and improve their Linux support.

Blog talking about Fedora, GNOME, GStreamer and related topics
