GStreamer and Mobile phones

On May 17th I posted a request for people to submit movies and clips from their mobile phones and digital cameras to our tracking bugzilla entry. Today Wim posted a comment to the bug saying that almost all the bugs were fixed. I think people will still have some problems due to the AMR decoding library not being generally available. If you want AMR support under GStreamer you need to follow the instructions in the README in the ext/amrwb directory in gst-plugins-bad. We are of course aware that this is a bit of a pain, but luckily writing an AMR decoder/encoder is an ffmpeg SoC project this year.

There was one file submitted using a Qualcomm codec which also lacks an open source implementation. As I pointed out in the bugzilla there is a free-as-in-beer library available for supporting this on desktop systems, so if someone is interested in helping us create a GStreamer plugin for it that would be a good thing for those stuck with phones or cameras using this codec.

So there are no known bugs anymore in relation to these files. There is only an unimplemented codec (QCELP) and a codec which needs some manual compilation to get support for (AMR).

We would still appreciate even more movie clips and camera model information. I am sure there are more phones and cameras out there producing more weird files. So if you have one of those please add it to the tracker bug report.

Google Summer of Code

The SoC projects are now chosen and we got some nice projects approved. I am particularly happy that the updated gst-editor application got approved. There was a very good application for MXF demuxing/muxing GStreamer plugins submitted under the Xiph.org banner, but unfortunately Xiph.org didn’t get enough projects allocated to take on this project. Not getting that project through is the only thing I am feeling a bit bummed out about, as it was a really well written and thought out application. A big thanks to Marko for applying.

Open Source and Strategy – A GUADEC story

With GUADEC approaching fast I have been thinking a bit about what we do at GUADEC and what has and has not worked in earlier years. One thing GUADEC has at times been good for is helping us set some strategies for going forward. I think the single biggest success story in this regard is the strategic decision taken during and after GUADEC 3 in 2002. At GUADEC 3 Jim Gettys gave a talk called Draining the Swamp. In this talk he pointed out some of the problems we face today and how we need to be better at reaching out to others to solve them. This talk and the subsequent debate led to some major decisions being taken in the GNOME community. (To be fair others, like Havoc Pennington, had brought up similar issues before and even worked on them, but I think GUADEC 3 and Jim’s talk was a turning point.)

The strategic choices I think can be summed up as follows:

  • We need to reach out to other projects and organisations and pull them closer
  • We need to fix problems on the right level not try to fix them on our level
  • Focus on getting things right on Linux while making sure the solutions we choose don’t permanently lock out other platforms
  • Work on cross-desktop standards to avoid redundant efforts and increase interoperability
  • It isn’t our way or the highway

I think in all these areas we succeeded and the world is a better place today for it. In terms of project outreach I think we succeeded in pulling projects such as Mozilla, OpenOffice, Eclipse and Xiph.org, to name a few, closer, with the first three all now having GTK+/GNOME integration as their natural Linux integration point. True enough, in some sense we didn’t have any real competition, with the alternatives either being unacceptable license-wise or already on the way out, but a lot of personal initiatives were taken by people in and around the GNOME community to engage with these projects and I think it has paid off. Adoption by commercial vendors such as VMware and Adobe is another example of how the choice to be more welcoming to ‘outsiders’ has worked out.

As for fixing the problems on the right level, this was about renewed engagement with people in the kernel and X communities for instance, and in some cases pushing projects to freedesktop. This also ties in with the Linux focus point. Instead of creating a forest of abstraction layers to try to overcome the problems of varying platforms with or without certain features and functionality, there was a decision to choose good solutions and help popularize them instead of letting a lack of uniformity hold us back or cause us to waste time and effort trying to support highly different systems. HAL is a good example here. GNOME decided to go for HAL even though it was a pure Linux solution at the time. The design of HAL wasn’t made to be Linux exclusive, but the implementation at the time was. Up to this point we had tried, with varying levels of success, to work around and abstract this away at the GNOME level. With the current work done on porting HAL to Solaris and FreeBSD I do feel that this strategy is vindicated and HAL is providing us with a much better plug and play story than we ever had before. The lesson here is that if you try to be a jack of all trades you become master of none.

After GUADEC 3 GNOME people put a lot more emphasis on trying to push more things into freedesktop.org. There was some initial pushback with people crying about freedesktop being a cover operation to force GNOME technologies onto the world and so on, but I think today most people would agree that the steps taken with freedesktop have been good and if anything we have not managed to take enough steps.

It isn’t our way or the highway. I also think that it was about this time that GNOME developers started coming to a true acceptance that there was nothing wrong with developing applications in other programming languages and frameworks. There was a lot of discussion about what it means to be a GNOME application, like can you be a GNOME application even if you don’t use library XYZ and so on. Today we have a thriving community of developers doing applications in C++, Python, Mono, Java, Ruby and many more. Some are using GTK+ or GTK+ bindings, while others integrate much more indirectly using toolkits such as SWT, Swing, wxWidgets, XUL and VCL. We still have an ongoing job to define what we expect of a GNOME application in a more neutral fashion, but long term there should be nothing stopping someone from even doing a GNOME application in Qt for instance. The philosophy being that if it looks, walks and talks like a duck, then it is a duck. You don’t need to dissect it to prove it is a duck. This is software development not biology :).

There have been a lot of other strategic decisions made over the years too of course, which have had smaller or bigger impact. The HIG and the 6 month development cycle, to mention a few. Or in some sense we never made any strategic decisions, but GUADEC has always played an instrumental role in ensuring developer buy-in for such things, and those ideas which have gotten enough buy-in have, looking back, become strategic decisions :) I guess that is the sort of thing which people coming from a non-free software background have trouble relating to :)

I don’t know what strategic choices will be made or strongly influenced by this year’s GUADEC, but I will be there to find out and so should you :)

First preview release of Dirac implementation

For some time now we have been working with the BBC and David Schleef to create an implementation of the Dirac codec in ANSI C. The homepage of this project is schrodinger.sf.net. Due to delays in getting the bitstream specification finalized things have taken longer than anticipated, but things are moving ahead at full speed now so we felt it was a good time to make our first ever release.

This release isn’t 100% spec compliant or as fast as we want it to be, but it does provide you with a full set of GStreamer plugins for encoding and decoding video and embedding the Dirac video into a valid Ogg file. It’s basically a technology preview release, making it easy for people to grab the code and compile it knowing that we did some basic testing on it to make sure it ‘works’, as opposed to grabbing a random snapshot from svn.

You can download this 0.2.0 release here

Over the next months we will do further releases which will gradually get closer to the specification and faster, and the library will also get a public API for people who want to access it directly instead of through GStreamer. The 1.0 release will be 100% spec compliant and reasonably fast for common use.

Testing Pitivi CVS

Edward is planning a new release of Pitivi very soon so I have been helping him out with some testing. The target of this release is to allow you to transcode as many of your files as possible to Ogg Theora. We are building Pitivi step-by-step now with the roadmap being something like:

  1. Play any file that Totem plays (previous release)
  2. Transcode any of those files we play to Ogg Theora (this release)
  3. Merge together multiple clips into one bigger movie (next release)
  4. Basic transitions

There is of course a long todo list after that too, adding more and more advanced features, more output formats and so on, and there is a lot of related bugfixing to make those items a reality. Like fixing the issues caused by those mobile phone and camera files we got yesterday. But our goal is that starting from the current release Pitivi will actually do something that’s useful for many people, namely transcode their video clips into Ogg Theora. In some sense that todo list doesn’t look very ambitious, but there are a lot of bugs that get found and need fixing during this process. My thinking initially was that if we could decode it (for instance play it back with Totem) we could transcode it. This turned out to be a huge oversimplification. In the 1-to-1 file case this is mostly true, but since Pitivi’s goal is to handle many-to-1 in the end, the design requires a higher level of perfection from the decoders, in terms of how they handle seeking for instance. So far AVI, Windows Media, Matroska and Ogg files have transcoded fine for me. MPEG/Quicktime movies seem to lose either sound or video partway into the transcoding process, but those issues are being looked at and will be resolved with future GStreamer plugin releases.
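To give a feel for what ‘transcode to Ogg Theora’ means at the pipeline level, here is a minimal sketch that only builds a gst-launch style pipeline description as a string. It is not how Pitivi does it internally. The helper name theora_transcode_pipeline is made up for illustration, and the colorspace converter defaults to videoconvert on the assumption of a recent GStreamer series (older 0.10 installs use ffmpegcolorspace instead).

```python
def theora_transcode_pipeline(src, dst, video_convert="videoconvert"):
    """Build a gst-launch style description: any decodable file -> Ogg Theora/Vorbis.

    `video_convert` is an assumption about the installed GStreamer series;
    pass "ffmpegcolorspace" for the 0.10 series.
    """
    for path in (src, dst):
        # Quoting rules for gst-launch are out of scope for this sketch.
        if any(ch.isspace() for ch in path):
            raise ValueError("paths containing whitespace are not handled here")
    return (
        # decodebin picks the demuxer and decoders for the input file.
        f"filesrc location={src} ! decodebin name=dec "
        # oggmux collects the encoded streams and writes one Ogg file.
        f"oggmux name=mux ! filesink location={dst} "
        # Video branch: convert colorspace, encode as Theora.
        f"dec. ! queue ! {video_convert} ! theoraenc ! queue ! mux. "
        # Audio branch: convert/resample, encode as Vorbis.
        "dec. ! queue ! audioconvert ! audioresample ! vorbisenc ! queue ! mux."
    )


if __name__ == "__main__":
    print(theora_transcode_pipeline("clip.avi", "clip.ogg"))
```

The printed description can be handed to the gst-launch tool. It covers the simple 1-to-1 case only and assumes the input has both audio and video; the many-to-1 case discussed above needs gnonlin and much more careful seeking.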

Also in terms of getting gnonlin stable and bugfree there is now a good tag-team effort going on between Pitivi and Jokosher. The kind of issues that the Jokosher team runs into tend to be the same ones that Pitivi is or will be running into, so with both of them exercising it mightily a lot of bugs are found but also resolved. I also noticed a new batch of really sweet Jokosher screenshots posted a little over a week ago. Reminds me that we need to get working on more Pitivi screenshots also :)

On a related note, Mike Smith committed the ALSA spdifsink to CVS today. This means that outputting AC3 through the spdif port on your soundcard should now work with GStreamer if your ALSA driver has spdif support. Another feature in Totem we can now support :)

Also, checking the Jokosher webpage today I noticed that they don’t sport the cool GStreamer family web button. Anyone doing GStreamer based applications should be sure to sport this token of eliteness on their webpage :)

Got a camera or phone supporting video?

Edward is hard at work preparing a new release of the Pitivi non-linear editor. The goal of this release is to be able to transcode to Ogg anything you throw at it. Once that is done we have a solid base to build from and an application that does something genuinely useful for people.

One thing we want to make sure of is that we are able to handle all the videos created by mobile phones and digital cameras (non-dv cameras). So we ask that anyone who has such a device create a 5 second clip and transfer it to their system. If it plays in Totem (using GStreamer) or CVS Pitivi and you are able to seek in it, then please just add a comment to this bugzilla entry stating the device that produced the clip. If it either doesn’t play or seeking is broken, please attach the clip to that bug with a statement about what device captured it and what the problem is.

Update: Please attach a test video clip to that bug even if it works in Totem as there are cases where playback works, but we still hit problems when trying more advanced operations on the files.

Update on Dirac library

Some time ago I announced the start of a project to implement the Dirac codec in C. The project is called Schroedinger and we are currently starting to get to a point where it’s becoming interesting for people to play with. If you check out current SVN it includes both an encoder and a decoder plugin for GStreamer. David Schleef has also done the work to make sure it embeds properly in the Ogg container format.

There is still more work required however to finalize it. It’s not 100% compliant with the Dirac specification, much more optimisation is needed and the library also needs to be properly set up with a public API. Current SVN is also hardcoded to do lossless encoding so the files are rather big.

I will report back on further progress as it happens. We have at least reached that first important milestone where it’s possible to transcode a file and play it back (on a fast computer) so visible progress should be easier to track from here on out.

In the land of emotions

I got a few comments on my phonon blog entry yesterday. A big thanks to everyone who posted comments and gave feedback. I was positively surprised at both the number of comments and the general quality of comments. These things tend to quickly degenerate into mindless flamewars, but a high percentage of the comments were quite on point and well reasoned. I will try to comment on some of the general issues raised, but after this thing got onto osnews and lwn the range of comments is too big for me to go into them all here in a proper fashion. The thing with such debates is of course that they tend to stray far from their origin as they go on.

That said I am not going to do a detailed discussion per blog on this subject either, so those unhappy about this debate even taking place can rest assured I will not post on it again for a while :)

API/ABI:
Maybe the most important issue brought up was that of API/ABI stability, which for instance Scott Wheeler brought up. I consider Scott a friend and I think his entry is well considered. My general response is that the bigger and more complex an API gets, the lower the chances of getting it right the first time. Current use cases proposed for Phonon go beyond the ‘playbin wrapper’ that Scott talks about, including adding VoIP support for instance. I am not saying Phonon already has a scope where being ABI stable until 2012 is unrealistic, but I already see a lot of talk and pressure inside the KDE community to increase the scope of Phonon. And some of the expectations expressed through, for instance, comments to my initial blog or the linux.com article will not be met by the API Scott proposes. That is of course more an issue of expectation management than anything else.

Lessons from arts:
Another common point brought up was the lessons learned from arts in KDE. Why a project fails has many explanations of course, but I don’t think it would be fair to say arts in KDE failed because KDE decided to choose one media framework. An oversimplification of the argument made for sure, but it highlights why I feel most of the arts arguments brought up don’t ring true to me. I would say that factors such as adopting a project with little prior use, a small development community and a design which was clearly not meant for video from the outset were the bigger contributing factors to its failure. So while ‘burned child doesn’t play with fire’ might be an old saying that applies here, I feel using it as an excuse isn’t fully justified. I also find it sad if the experience with arts has caused the KDE community to think that from here on out they only want to be a multimedia framework consumer, not a development partner. Because that is why I want KDE behind GStreamer, to be our development partner on GStreamer. I think that will be good for GStreamer and help us increase our pace of development even more, and thus good for KDE because you get better, more advanced media framework features for your users and developers even quicker.

The application of force:
One common theme, more among the user replies than the developer blogs, is that of developers/users being forced to do something. I always found those kinds of arguments a bit bemusing. GNOME and KDE, or GNUStep for that matter, are basically a collection of development libraries, a suite of applications and also some written and unwritten rules about what goes and what does not. Those rules cover things such as programming languages used, what kind of dependencies are acceptable and so on. Philosophically I guess one can claim based on that that these projects are vehicles of subjugation, but personally I see these things more as developer aid and guidance. But for those who disagree I guess it’s time to start printing the ‘Free software is all about choice subjugation’ t-shirts :)

Anyway we could go on debating for a long time the whys and why nots of choosing a multimedia framework, or what is possible to abstract in what fashion in what timeframe. I think my arguments from the initial post still stand on their own, and of course people now have interesting viewpoints from a lot of people on the subject too, from the people commenting on my initial post to other bloggers like Aaron Seigo, Jono Bacon, Ian Monroe and last but not least Scott Wheeler. While we might not all agree fully with each other I think it’s been an interesting exchange of thoughts. Thanks :)
Update: The very cool Michael Pyne has also chimed in. He doesn’t agree with me, but that doesn’t make him less cool :)

Why Phonon is a broken wheel

So I attended the OSDL Desktop Summit in Mainz, Germany from Sunday to Tuesday evening this week. The goal of the meeting was to follow up on the tasks done at the earlier meeting in Portland and to look for new areas that could do with some work, multimedia being the one pulling me in.

The goal of myself and the GStreamer community in such a scenario is of course to advocate the use of GStreamer as the response free software can offer to advanced media frameworks on other platforms. We do believe we have the best and most extensive framework available and that with the work currently being done in the community this is not likely to change anytime soon.

In the discussion the approach taken by the Phonon abstraction layer, which the Phonon project is advocating for inclusion in KDE4, also came up. I have held back blogging about Phonon for some time to avoid flamewars, but I don’t want to have efforts like OSDL delayed due to setups like Phonon being promoted or thought of as a workable solution for the issues faced. Let me start off with a brief introduction to the area of multimedia frameworks.

First of all multimedia frameworks are very complex systems, handling very hard technical problems. The issues range from dealing with the analog past (for some technical information in this field I recommend fourcc.org), through a forest of media formats with different traits and a host of features/deficiencies in hardware, to a wide range of other software solutions to interact with, like sound servers and legacy systems. On top of this you add performance and latency requirements, network protocols and multiplatform issues. In the end you have a problem space where even those who have worked on the issues for many years need to keep their head clear when designing the framework.

Multimedia frameworks are also by their nature abstraction layers themselves, trying to abstract away all the demuxers, muxers, decoders, encoders, cameras, soundcards, sound systems, network protocols and various types of filters into a coherent and userfriendly API.

With all this complexity the frameworks are struggling to not only bring you a coherent API, but also offer some high-level APIs that are useful for application developers. What we discovered in GStreamer is that while application developers initially ask for a ‘play this file’ API, which is what we offer through our ‘playbin’ component, they often end up pushing playbin to its limits and sometimes not using it at all in the end, as they want to do fancy stuff which demands manipulating the pipelines more directly.
In many cases application authors want to do something which demands that a special plugin or filter be written.
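As a rough illustration of the ‘play this file’ level, the sketch below builds a command line for the gst-launch tool that hands a file URI to playbin. The tool name gst-launch-1.0 and the helper name playbin_command are assumptions for illustration (older installs ship the tool as gst-launch-0.10); only playbin and its uri property come from GStreamer itself.

```python
from pathlib import Path


def playbin_command(path, tool="gst-launch-1.0"):
    """Return an argv list that asks playbin to play a local file.

    `tool` is an assumption: the launcher name differs between
    GStreamer series, so adjust it to what is installed.
    """
    # playbin takes a URI, not a bare path, so convert the path first.
    uri = Path(path).resolve().as_uri()
    return [tool, "playbin", f"uri={uri}"]


if __name__ == "__main__":
    print(" ".join(playbin_command("/tmp/clip.ogg")))
```

Anything beyond this, say inserting a custom filter between decoder and sink, means building the pipeline by hand, which is where applications tend to end up.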

Now to get back to why I think Phonon is conceptually broken. First of all it is destined to fall into one of two traps. Either its API becomes so high level and limited that application developers will shun it due to a lack of features, meaning that you have an API useful for doing ‘ding’ sounds in standard applications, but anyone wanting more powerful operations will feel it’s a bigger hindrance than a help. On the other hand, if they actually try to implement a feature set that is big enough to at least satisfy a subset of, for instance, music player writers, then they will be forced into accessing things so deep in the frameworks that the operations become so framework specific that generalizing them away into a common API will at best be a kludge and at worst produce various broken behaviour changing depending on the framework chosen.

Phonon also falls short in many other areas. For instance the stated goal is to let application writers write their applications against one API and have them work with a host of media frameworks. The reasoning is that no framework ‘does it all’ so having this flexibility is a good thing. This logic falls down quickly when you start thinking about it. While the general statement that no framework supports all formats or all features is true, the opposite, that a combination of multiple frameworks ‘does it all’, is equally untrue. An application developer who wants to add support for a new or rare audio format, for instance, would very often need to write the plugin or library to support it him/herself anyway. Targeting an API with, for instance, 5 backends doesn’t make that job easier; it actually means that an application developer who wants to ensure all users can play this format needs to repeat the job five times.
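The arithmetic behind that argument can be sketched in a few lines. The backend names and format sets below are entirely hypothetical and only there to make the point: what every user can play is the intersection of what the backends support, and a new format means one plugin per backend that lacks it.

```python
# Hypothetical backends and the formats each one can play (made-up data).
BACKENDS = {
    "backend_a": {"ogg", "mp3", "flac"},
    "backend_b": {"ogg", "mp3"},
    "backend_c": {"ogg", "wma"},
}


def playable_for_every_user(backends):
    """Formats that work no matter which backend a user has configured."""
    return set.intersection(*backends.values())


def backends_needing_plugin(backends, fmt):
    """Backends for which someone still has to write support for `fmt`."""
    return sorted(name for name, formats in backends.items() if fmt not in formats)


if __name__ == "__main__":
    print(playable_for_every_user(BACKENDS))
    print(backends_needing_plugin(BACKENDS, "rare_format"))
```

With a single shared framework the second list has at most one entry, which is the whole point.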

So if you choose to standardize on one framework like GNOME has done with GStreamer then there is a chance that the application developer wants to support a format that GStreamer doesn’t currently support. But at least the developer has a clear idea where to add support for this format.

Also in a usability context telling the users to do hit and miss framework changes based on which media file they are trying to play is simply broken and I have a hard time believing that this is the user experience anyone wants to present users with for KDE4.

The counter argument to this is that Phonon does allow application developers to force or at least strongly suggest a specific backend for the user to use. This does solve the problem of the application developer knowing where to add support for something, but it also means that another stated goal of Phonon, to avoid enforcing such a ‘heavy’ dependency as GStreamer, might very well be replaced by enforcing 5 different media frameworks to be bundled with KDE4 as a whole. And if you think one framework is a heavy dependency then I promise you that five is not less. It also reduces the synergy effect of the KDE community a lot, as it means that the work done by one music player author to add support for a new format will not be automatically available to the other KDE music players.

What we have been advocating for a long time from the GStreamer corner is that if both GNOME and KDE share a multimedia framework the synergy effect for both desktops will be huge.

My final objection to Phonon is that even if they manage to prove me wrong on their ability to provide a truly useful limited cross framework API and demonstrate that having a menu option offering your grandma to play her music using framework X, Y or Z actually solves more problems than it creates, I still think that it falls short. Because it wouldn’t provide an API to do applications like Pitivi, Diva, Jokosher, Buzztard, Flumotion and so on, which I think is where we want to be today in order to provide a competitive desktop. MacOS X and Windows Vista are showing us that this is the role that the desktop is heading towards.

One scenario I know has been contemplated is using Phonon, but at the same time saying that GStreamer is the recommended framework if you want to do something outside of the scope of Phonon. But my opinion is that in this use case focusing on Qt-style bindings for GStreamer is a better solution and a much easier thing to do, and would result in something more useful for developers and users alike.

So I hope that interested people in the KDE community agree with my analysis and start working on Qt-style bindings for GStreamer, and that as a result Phonon falls by the wayside. If not, well, hopefully we will be able to cooperate on some of the lower level issues in the desktop, like improved driver handling through HAL for instance, as the minimum.

All this said, people from the GStreamer community will of course try to help out people developing the GStreamer phonon backend for instance. We do our best to help anyone using GStreamer, even when they do something we don’t believe in the viability or direction of. Zaheer Merali for instance has already volunteered to mentor or co-mentor anyone interested in working on Phonon-GStreamer integration as part of the Google Summer of Code, and as far as I know there were multiple proposals submitted for that.

Trick modes in GStreamer

Jan just demoed trick modes on his machine using GStreamer. Trick modes is the term used for playback at double speed, quadruple speed, half speed, backwards and so on. A little code cleanup is still needed before committing, but it will be in GStreamer CVS next week.
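For the curious, a trick mode boils down to a seek that carries a playback rate. The sketch below does not call GStreamer at all; it is plain Python with a made-up helper name, trick_mode_segment, showing the usual convention as I understand it: a positive rate plays from the current position toward the end, a negative rate plays from the current position back toward the start, and the wall-clock time needed shrinks or grows with the rate.

```python
def trick_mode_segment(rate, position, duration):
    """Return (start, stop, wall_clock_seconds) for playback at `rate`.

    All times are in seconds of stream time. rate=2.0 is double speed,
    rate=0.5 is half speed, rate=-1.0 is backwards at normal speed.
    """
    if rate == 0:
        raise ValueError("rate must be non-zero; pausing is not a trick mode")
    if not 0 <= position <= duration:
        raise ValueError("position must lie within the stream")
    if rate > 0:
        # Forward: play from where we are until the end of the stream.
        start, stop = position, duration
    else:
        # Reverse: play the part before the current position, back to front.
        start, stop = 0.0, position
    wall_clock = (stop - start) / abs(rate)
    return start, stop, wall_clock


if __name__ == "__main__":
    print(trick_mode_segment(2.0, 10.0, 70.0))
    print(trick_mode_segment(-1.0, 10.0, 70.0))
```

Whether a given file can actually be played this way depends on the demuxers and decoders handling such seeks properly, which is where the real work is.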

The initial goal is supporting server side trickmode properly like when getting a video feed from a ViiV enabled server, but it already supports some modes of client side trickmodes too.

Along with the recently added Quality of Service framework and network clocks we are adding a lot of advanced functionality to GStreamer these days.

Edward and Wim also did some critical fixes in GStreamer and GNonlin today for Jokosher, to help ensure that the Jokosher team will be able to demo a working application at GUADEC.

The Google Summer of Code deadline is approaching fast (Monday morning) and we still have room for more students to propose GStreamer related projects under the GNOME, Xiph or BBC banners. Be sure to check out these project ideas lists and submit a proposal. Other projects are probably also still open to more proposals, but I am not mentoring those so I don’t know their status.

GStreamer and SoC

So the application period for Google Summer of Code is this week, so any interested students should get their stuff together and start applying to any project of their liking. In addition to the projects mentioned in that previous blog entry, Mono, Ruby and Ekiga have GStreamer proposals too. And last but not least BBC will be accepting GStreamer and Schroedinger/Dirac related proposals as part of their SoC program. The GStreamer MXF container plugins might be better suited to run under the BBC header than under the Xiph.org banner where they currently reside. If you have any questions please come by the channel on irc.freenode.org or you could mail me at uraeus at linuxrising dot org, and I will try to put you in touch with the right people.

So multimedia is still considered a weak point for GNU/Linux systems. Be sure to take on a project this summer that makes that less so! Take on the right projects and we will be a huge step closer to multimedia nirvana when fall comes.

GStreamer and Google Summer of Code

Are you a student? Do you belong to the best and brightest among us? Then you probably woke up this morning wanting to do a GStreamer related Google Summer of Code project. But to your horror GStreamer was not listed as a mentoring organisation for Google Summer of Code.

Well things are much brighter than they might look at first glance. In fact GStreamer is available as a Google Summer of Code project from a long list of mentoring organisations and GStreamer hackers are mentoring or offering to mentor many of them. I will try to give an overview of projects available.

GNOME:
There are multiple projects available as part of the GNOME project. Proposals already listed there are doing a Gst-editor application for GStreamer 0.10 using Python and improving the integration of bluetooth devices in GStreamer (this second project could maybe be integrated with Bastien’s bluetooth manager project). There are also a few Totem related projects, like improving the Totem Mozilla plugin, getting GStreamer DVD support working in Totem and Annodex support in Totem (the Annodex project might be mostly Xine hacking actually, as GStreamer already supports Annodex). There is also a project proposal from the Jokosher project to work on LADSPA support in GStreamer and Jokosher.

Xiph.org

Through our close collaboration with the Xiph.org project there are many GStreamer related projects listed on their Summer of Code page. Projects include doing MXF plugins for GStreamer (MXF is the container format used in the TV industry), RTP plugins for Vorbis and Theora, OggMNG support and more. In fact almost every Xiph.org project either includes GStreamer or would be of direct use for the GStreamer project. I strongly recommend taking a look at them.

KDE
KDE has a project to create a GStreamer backend for their Phonon media playback framework. I am also sure that they would be open to more proposals from interested students. For instance, Edward Hervey offers to mentor a project to create a Qt/KDE frontend to the Pitivi non-linear video editor using the KDE Python bindings.

OpenSolaris

The OpenSolaris project has a lot of GStreamer related tasks listed on the ideas page like improved GStreamer hardware plugins for Solaris and JMF/GStreamer integration.

handhelds.org

The handhelds.org project lists a number of tasks related to the GPE embedded environment where GStreamer could and should be part of the solution, like their VoIP proposals. This work should probably be based upon the work of the Farsight project, who are working on this in the context of Maemo.

XMMS2

XMMS2 has a project about evaluating and maybe porting XMMS2 to use GStreamer. A good project to take on if you want to limit fragmentation and help consolidate the desktop.

Others:
There are other projects too which would probably accept GStreamer related projects. Remember that as long as you are able to find a mentor most organisations are happy to take in good projects proposed by students. For instance BBC Research would probably be willing to take on Schroedinger – Dirac related projects or the MXF plugins also listed under Xiph.org. David Schleef is willing to mentor a student writing a Dirac encoder targeted especially at desktop recording, using things like the X damage extension for instance. The Mono project might be interested in taking on a project to do a plugin for f-spot that takes selected photos and creates an Ogg/Theora/Vorbis movie file of a slideshow of the photos, transitioned with effects and muxed with audio from say a rb/banshee playlist. iPhoto on Mac OS X has a similar feature and so does ULead CD & DVD PictureShow on Windows. The MythTV project might be interested in a project to port it to use GStreamer. The Creative Commons project might be willing to take on a project to ensure easy CC tagging of all files generated by GStreamer. The Gimp project would maybe be interested in a Pitivi/Gimp integration project using GEGL. The examples just go on and on.