GStreamer Conference 2018

This year the GStreamer Conference will be held for the 9th time. It will be in Edinburgh, UK right after the Embedded Linux Conference Europe, on the 25th and 26th of October. The GStreamer Conference is always a lot of fun, with a wide variety of talks around Linux and multimedia, not all of them tied to GStreamer itself; in the past, for instance, we have had a lot of talks about PulseAudio, V4L, OpenGL, Vulkan and new codecs. This year I am really looking forward to talks such as the DeepStream talk by NVidia, Bringing Deep Neural Networks to GStreamer by Pexip and D3Dx Video Game Streaming on Windows by Bebo, to mention a few.

For a variety of reasons I missed the last couple of conferences, but this year I will be back in attendance and I am really looking forward to it. In fact it will be the first GStreamer Conference I attend that I am not the organizer of, so it will be nice to be able to just enjoy the conference and the hallway track this time.

So if you haven’t booked yourself in already I strongly recommend going to the GStreamer Conference website and getting yourself signed up to attend.

See you all in Edinburgh!

Also looking forward to seeing everyone attending the PipeWire Hackfest happening right after the GStreamer Conference.

Launching PipeWire!

In quite a few blog posts I have been referencing PipeWire, our new Linux infrastructure piece to handle multimedia under Linux better. Well, we are finally ready to formally launch PipeWire as a project, and we have created a PipeWire website and logo.

To give you all some background, PipeWire is the latest creation of GStreamer co-creator Wim Taymans. The original reason it was created was that we realized that as desktop applications moved towards primarily being shipped as containerized Flatpaks, we would need something for video similar to what PulseAudio was doing for audio. As part of his job here at Red Hat, Wim had already been contributing to PulseAudio for a while, including implementing a new security model for PulseAudio to ensure we could securely have containerized applications output sound through PulseAudio. So he set out to write PipeWire, although initially the name he used was PulseVideo. As he was working on figuring out the core design of PipeWire, he came to the conclusion that designing it to only handle video would be a mistake, as a major challenge he was familiar with from working on GStreamer was how to ensure perfect audio and video synchronisation. If both audio and video could be routed through the same media daemon, then ensuring that audio and video worked well together would be a lot simpler, and frameworks such as GStreamer would need to do a lot less heavy lifting to make it work. So just before we started sharing the code publicly we renamed the project to Pinos, after Pinos de Alhaurín, a small town close to where Wim lives in southern Spain. In retrospect Pinos was probably not the world's best name :)

Anyway, as work progressed Wim decided to also take a look at JACK, as supporting the pro-audio use case was something PulseAudio had never tried to do. We felt that if we could ensure PipeWire supported the pro-audio use case in addition to consumer-level audio and video, it would improve our multimedia infrastructure significantly and make pro-audio a first-class citizen on the Linux desktop. Of course, as the scope grew, the development time got longer too.

Another major use case for PipeWire for us was that we knew that with the migration to Wayland we would need a new mechanism to handle screen capture, as the way it was done under X was very insecure. So Jonas Ådahl started working on creating an API we could support in the compositor, with PipeWire carrying the resulting streams. This is meant to cover everything from single-frame captures like screenshots to local desktop recording and remoting protocols. It is important to note that what we have done is not just implement support for a specific protocol like RDP or VNC; we have ensured there is an advanced infrastructure in place that any protocol can be supported on top of. For instance, we will be working with the SPICE team here at Red Hat to ensure SPICE can take advantage of PipeWire and the new API. We will also ensure Chrome and Firefox support this, so that you can share your Wayland desktop through systems such as Blue Jeans.

Where we are now
So after multiple years of development we are now landing PipeWire in Fedora Workstation 27. This initial version is video only, as that is the most urgent thing we need supported for Flatpaks and Wayland. Audio is completely unaffected by this for now, and rolling that out will require quite a bit of work, as we do not want to risk breaking audio on your system as a result of this change. We know that for many the original rollout of PulseAudio was painful and we do not want a repeat of that history.

So I strongly recommend grabbing the Fedora Workstation 27 beta to test PipeWire, and check out the new website at Pipewire.org and the initial documentation at the PipeWire wiki. Especially interesting are probably the pages that will eventually outline our plans for handling the PulseAudio and JACK use cases.
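
If you want a quick way to poke at the video side while testing, PipeWire also ships a GStreamer plugin, so a small script along the following lines should let you view a PipeWire video stream. This is just a minimal sketch, assuming the pipewiresrc element from the PipeWire GStreamer plugin is installed on your system:

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)
# Pull video from the PipeWire daemon and show it in a window
pipeline = Gst.parse_launch("pipewiresrc ! videoconvert ! autovideosink")
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()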

If you are interested in PipeWire please join us on IRC in #pipewire on freenode. Also, if things go as planned, Wim will be on Linux Unplugged tonight talking to Chris Fisher and the Unplugged crew about PipeWire, so tune in!

GStreamer Conference 2013 – Haggis edition

So we are quickly approaching the time for GStreamer Conference 2013. This year we will be in Edinburgh, Scotland and the conference will be hosted at the Edinburgh International Conference Centre alongside the Embedded Linux Conference Europe and the Automotive Linux Summit.

The GStreamer Conference 2013 Schedule is now live, and as you can see there are a lot of great talks this year too, covering OpenGL integration, embedded hardware, new codecs and more. As always, the GStreamer Conference is the best place to be to discuss the challenges of multimedia.

The conference will be held on the 22nd and 23rd of October, so I strongly recommend you get yourself registered. If you want to attend ELCE or the Automotive summit, make sure to register for those too and extend your stay in Edinburgh to also cover the 24th and 25th of October.

Looking forward to seeing you all there.

And last but not least, a big thank you to this year's sponsors of the GStreamer Conference, especially Platinum sponsor Collabora. Also a big thank you to sponsors Fluendo, Google and, for the first time this year, Cisco.

And a big thank you to Ubicast, who as always will be recording and publishing the conference for us. Be sure to check out their recordings from earlier years to find out what the GStreamer Conference is about.

GStreamer, Python and videomixing

One feature that would be of interest to us in the Empathy video conferencing client is the ability to record conversations. To that end I have been putting together a simple prototype Python test application in free moments, to verify that everything works as expected before any effort is put into doing actual work inside Empathy.

The sample code below requires two webcams to be connected to your system to work. It takes the two camera video streams, puts one of them through an encode/RTP/decode process (to roughly emulate what happens in a video call) and puts a text overlay onto the other to let the conference participant know the call is being recorded. The two video streams are then mixed together and displayed. In the actual application the combined stream would of course be saved to disk instead, and audio would also be captured and mixed.

Whether we ever get around to working on this feature is an open question, but at least we can now assume that it is likely to work. Of course, getting one stream in over the network via RTP is very different from what this sample does, so that might uncover some bugs.
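
For reference, below is a rough sketch (not part of the prototype itself) of what the receive side might look like when the remote stream actually arrives over the network. The port number and caps string are placeholders; in practice the caps on udpsrc must match what the sender advertises, since the Theora depayloader needs the codec configuration headers carried in the caps.

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)
# Placeholder caps: a real receiver must use the caps the sender emits
pipeline = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp, media=video, '
    'clock-rate=90000, encoding-name=THEORA, payload=96" '
    '! rtptheoradepay ! theoradec ! videoconvert ! xvimagesink')
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()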

The sample also works with Python3, so even though it is only a prototype it already fulfils the GNOME Goal :)

import signal
import sys

from gi.repository import GObject
from gi.repository import Gst

GObject.threads_init()
Gst.init(None)


class VideoBox():
    def __init__(self):
        self.mainloop = GObject.MainLoop()
        # Pipeline that mixes the two camera streams into one view
        self.pipeline = Gst.Pipeline()

        # First camera: shown directly, with a text overlay
        self.v4lsrc1 = Gst.ElementFactory.make('v4l2src', None)
        self.v4lsrc1.set_property("device", "/dev/video0")
        self.pipeline.add(self.v4lsrc1)

        # Second camera: run through an encode/RTP/decode round trip
        # to roughly emulate the remote stream in a video call
        self.v4lsrc2 = Gst.ElementFactory.make('v4l2src', None)
        self.v4lsrc2.set_property("device", "/dev/video1")
        self.pipeline.add(self.v4lsrc2)

        # Force both cameras to a common, small resolution
        camera1caps = Gst.Caps.from_string("video/x-raw, width=320,height=240")
        self.camerafilter1 = Gst.ElementFactory.make("capsfilter", "filter1")
        self.camerafilter1.set_property("caps", camera1caps)
        self.pipeline.add(self.camerafilter1)

        self.videoenc = Gst.ElementFactory.make("theoraenc", None)
        self.pipeline.add(self.videoenc)

        self.videodec = Gst.ElementFactory.make("theoradec", None)
        self.pipeline.add(self.videodec)

        self.videortppay = Gst.ElementFactory.make("rtptheorapay", None)
        self.pipeline.add(self.videortppay)

        self.videortpdepay = Gst.ElementFactory.make("rtptheoradepay", None)
        self.pipeline.add(self.videortpdepay)

        # Overlay telling the participants the call is being recorded
        self.textoverlay = Gst.ElementFactory.make("textoverlay", None)
        self.textoverlay.set_property("text", "Talk is being recorded")
        self.pipeline.add(self.textoverlay)

        camera2caps = Gst.Caps.from_string("video/x-raw, width=320,height=240")
        self.camerafilter2 = Gst.ElementFactory.make("capsfilter", "filter2")
        self.camerafilter2.set_property("caps", camera2caps)
        self.pipeline.add(self.camerafilter2)

        # Mixer that composites the two streams side by side
        self.videomixer = Gst.ElementFactory.make('videomixer', None)
        self.pipeline.add(self.videomixer)

        # Shift the first stream 320 pixels to the right (a transparent
        # border is added on the left) so the second stream fits beside it
        self.videobox1 = Gst.ElementFactory.make('videobox', None)
        self.videobox1.set_property("border-alpha", 0)
        self.videobox1.set_property("top", 0)
        self.videobox1.set_property("left", -320)
        self.pipeline.add(self.videobox1)

        # Colorspace converters needed between the various elements
        self.videoformatconverter1 = Gst.ElementFactory.make('videoconvert', None)
        self.pipeline.add(self.videoformatconverter1)

        self.videoformatconverter2 = Gst.ElementFactory.make('videoconvert', None)
        self.pipeline.add(self.videoformatconverter2)

        self.videoformatconverter3 = Gst.ElementFactory.make('videoconvert', None)
        self.pipeline.add(self.videoformatconverter3)

        self.videoformatconverter4 = Gst.ElementFactory.make('videoconvert', None)
        self.pipeline.add(self.videoformatconverter4)

        self.xvimagesink = Gst.ElementFactory.make('xvimagesink', None)
        self.pipeline.add(self.xvimagesink)

        # Branch 1: camera 1 -> text overlay -> videobox -> mixer
        self.v4lsrc1.link(self.camerafilter1)
        self.camerafilter1.link(self.videoformatconverter1)
        self.videoformatconverter1.link(self.textoverlay)
        self.textoverlay.link(self.videobox1)
        self.videobox1.link(self.videomixer)

        # Branch 2: camera 2 -> Theora/RTP round trip -> mixer
        self.v4lsrc2.link(self.camerafilter2)
        self.camerafilter2.link(self.videoformatconverter2)
        self.videoformatconverter2.link(self.videoenc)
        self.videoenc.link(self.videortppay)
        self.videortppay.link(self.videortpdepay)
        self.videortpdepay.link(self.videodec)
        self.videodec.link(self.videoformatconverter3)
        self.videoformatconverter3.link(self.videomixer)

        # The mixed output is displayed in an X video window
        self.videomixer.link(self.videoformatconverter4)
        self.videoformatconverter4.link(self.xvimagesink)

    def run(self):
        self.pipeline.set_state(Gst.State.PLAYING)
        self.mainloop.run()
        return 0

if __name__ == "__main__":
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    app = VideoBox()
    sys.exit(app.run())

The long journey towards good free video conferencing

One project we have been working on here at Red Hat Brno is making sure voice and video calling works nicely with Empathy in Fedora 18. The project is being spearheaded by Debarshi Ray, with me trying to help out with the testing. We are still not there, but we are making good progress thanks to the help of people like Brian Pepple, Sjoerd Simons, Olivier Crete, Guillaume Desmottes and more.

But having been involved with open source multimedia for so long, I thought it could be interesting for people to know why free video calling has taken so long to get right and why we still have a little way to go. So I decided to write up some of the challenges involved. Be aware though that this article mostly discusses the general historical challenges of getting free VoIP up and running, although I will try to tie that into the specific issues we are currently trying to resolve where relevant.

Protocols

The first challenge that had to be overcome was the challenge of protocols. VoIP and video calling have been around for a while (which an application like Ekiga is proof of), but they have been hampered by a jungle of complex standards, closed protocols, lack of interoperability and so on. Some of the older standards also require non-free codecs to operate. The open standard that has started to turn this around is XMPP, the protocol that came out of the Jabber project. Originally it was just an open text chat network, but thanks to ongoing work it now features voice and video conferencing too. It also got a boost when Google chose it as the foundation for their GTalk offering, ensuring that anyone with a gmail address was suddenly available to chat or call. That said, like any developing protocol it has its challenges, and some slight differences in behaviour between a Google Jabber server and most others are currently causing us some pain with video calls, which is one of the issues we are trying to figure out how to resolve.

Codecs and interoperability

The other thing that has hounded us is the combination of non-free codecs and the need for interoperability. For a video calling system to be interesting to use, you need to be able to use it to contact at least a substantial subset of your friends and family. For the longest time this meant using a non-free codec, because if you relied solely on free codecs no widely used client out there would be able to connect with you. But the efforts of Xiph.org, first with the Speex audio codec and most recently the Opus audio codec, together with the later adoption of Speex by Google, have at least mostly resolved things on the audio side. On the video side things are still not 100% there. We have the Theora video codec from Xiph.org, but unfortunately when the RTP specification for that codec was written, the primary use case in mind was RTSP streaming and not video conferencing, making Theora RTP a bit hairy to use for video conferencing. The other, bigger issue with Theora is that outside the Linux world nobody adopted it for video calling, so once again you are not likely to be able to use it to call a very large subset of your friends and family unless they are all on Linux systems.
There might be a solution on the way though, in the form of the new kid on the block, VP8. VP8 is a video codec that Google released as part of their WebM HTML5 video effort. The RTP specification for VP8 is still under development, so adoption is limited, but the hope and expectation is that Google will support VP8 in their GTalk client once the RTP specification is stable, and thus we should have a good set of free codecs for both audio and video available and in the hands of a large user base.

Frameworks

Video calling is a quite complex technical problem, with a lot of components needing to work together: audio and video acquisition on your local machine, integration with your address book, negotiating the call between the parties involved, putting everything into RTP packets on one side and unpacking and displaying them on the other, all while taking into account the network, firewalls, and audio and video sync. So in order for a call to work you need (among others) ALSA, PulseAudio, V4L2, GStreamer, Evolution Data Server, Farstream, libnice, the XMPP server, Telepathy and Empathy to work together across two different systems. And if you want to interoperate with a 3rd party system like GTalk, the list of components that all need to work perfectly with each other grows further.

A lot of this software has been written in parallel with each other, in parallel with evolving codecs and standards, while trying to interoperate with as many 3rd party systems as possible. This has come at the cost of stability, which of course has turned people off using and testing the video call functionality of Empathy. But we believe we have now reached a turning point where the pieces are in place, which is why we are trying to help stabilize and improve the experience, to make VoIP and video conferencing calls work nicely out of the box on Fedora 18.

Missing pieces

In addition to the nitty gritty of protocols and codecs, there are other pieces that have been lacking to give users a really good experience. The most critical one is good echo cancellation. This is required in order to avoid an ugly echo effect when using your laptop's built-in speakers and microphone for a call. So people have been forced to use a headset to make things work reasonably well. This was a quite hard issue to solve, as there was neither any great open source echo cancellation code available nor a good way to hook it into the system. To start addressing this issue while I was working for Collabora Multimedia, we reached out to the Dutch non-profit NLnet Foundation, who sponsored us to have Wim Taymans work on creating an echo cancellation framework for PulseAudio. The goal was to create the framework within PulseAudio to support pluggable echo cancellation modules, turn two existing open source echo cancellation solutions into plugins for this framework as examples and proof of concept, and hope that the availability of such a framework would encourage other groups or individuals to release better echo cancellation modules going forward.
When we started this work the best existing open source echo cancellation system was Speex DSP. Unfortunately Speex DSP had a lot of limitations; for instance, it could not work well with two sound cards, which meant using your laptop speakers for output and a USB microphone for input would not work. Although we can claim no direct connection, as things would have it Google ended up releasing a quite good echo cancellation algorithm as part of their WebRTC effort. This was quickly turned into a library and a plugin for PulseAudio by Arun Raghavan, and this combined PulseAudio and WebRTC echo cancellation system is what we will have packaged and available in Fedora 18.
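
As an illustration of how the pluggable setup is used in practice, here is a small hedged sketch of loading the echo cancellation module from a script. The module-echo-cancel module and its aec_method, source_name and sink_name arguments are real PulseAudio options; the Python wrapper itself is just for illustration and could equally be a single pactl command line.

import subprocess

# Load PulseAudio's echo cancellation module with the WebRTC canceller
# (aec_method=speex selects the older Speex DSP implementation instead)
subprocess.check_call([
    "pactl", "load-module", "module-echo-cancel",
    "aec_method=webrtc",
    "source_name=ec_source",  # applications record from this virtual source
    "sink_name=ec_sink",      # applications play to this virtual sink
])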

Summary

So I have outlined a few of the challenges around having a good quality VoIP and video conferencing solution shipping out of the box on a Linux distribution. Some of the items, like the video codec situation and general stack stability, are not 100% there yet. There are also quite a few bugs in Empathy in terms of behaviour, but Debarshi is already debugging those, and with the help of the Telepathy and Empathy teams we should hopefully get those issues patched and merged before Fedora 18 ships. Our goal is to get Empathy up to a level where people want to be using it to make VoIP and video calls, as that is also the best way to ensure things stay working going forward.

In addition to Debarshi, another key person helping us with this effort in the Fedora community is Brian Pepple, who is making sure we are getting releases and updates of GStreamer, Telepathy, Farstream, libnice and so on packaged for Fedora 18 almost on the day of release. This is making testing and verifying bugfixes a lot easier for us.

Future plans

There are also some nice-to-have items we want to look at going forward, after having stabilized the current functionality. For instance, Red Hat and Xiph.org codec guru Monty Montgomery suggested we add a video noise reduction element to the GStreamer pipeline inside Empathy, in order to improve quality and performance when using a low quality built-in web camera. [Edit: Sjoerd just told me the GStreamer 0.10 version of the code had such a plugin available, so this might not be too hard to resolve.]
Debarshi is also interested in seeing if we can help move the multiparty chat feature forward. But we are not expecting to be able to work on these issues before Fedora 18 is released.

Back from GStreamer Conference 2012

I came back from the GStreamer Conference last evening and am now back in Cambridge for the weekend. The GStreamer Conference was a lot of fun this year and it was great seeing everyone again. I think the mixture of talks we had this year was really good, and I think everyone attending enjoyed themselves. For those who missed the conference this year, Phoronix and LWN.net posted articles from it. The talks were also recorded and will soon be available at the Ubicast GStreamer Conference website. We did try to get livestreaming going this year, but due to technical problems it didn't work out; maybe next year.

A big thank you again to our Gold Sponsor Collabora and our Silver Sponsors Entropy Wave, Fluendo, Igalia and Google. Thanks also go to LWN.net, Phoronix and Ubicast for making sure the talks and sessions at the GStreamer Conference reach as wide an audience as possible. And last but not least, a big thanks to all our conference speakers, who took the time and effort to prepare presentations for this year's GStreamer Conference.

For me personally, the GStreamer Conference this year also marks the end of my life in Cambridge, UK. Starting from next week I will have completed my period of commuting to Brno and will instead be living in Brno, Czech Republic on a permanent basis. Which reminds me: we are looking to hire more members for our Brno desktop engineering team, so I will be posting a blog entry soon outlining what kind of experience we are looking for.

Interview with Arun Raghavan about PulseAudio

With all the talk generated by Arun Raghavan's blog post comparing PulseAudio and AudioFlinger, I thought it would be good to follow up with an interview with Arun about the latest developments in PulseAudio and the way forward for the project. You can find the PulseAudio interview here. I also made a new page listing all the Collabora developer interviews done so far. Enjoy :)

Farstream and libnice, an interview with Youness Alaoui

After the success of the GStreamer interview with Wim Taymans, I thought we would follow up with another great interview with a Collabora developer.

This time we are talking with Youness Alaoui, who is one of the maintainers of Farstream, the audio and video conferencing framework built on top of GStreamer. We also cover another of Youness Alaoui's projects, libnice, the NAT traversal library. So if you want to know what is happening with audio and video conferencing on Linux, be sure to read the full interview with Youness Alaoui here.

Dynamic Adaptive Streaming over HTTP standard, ST Microelectronics, Fraunhofer and GStreamer

Edward Hervey pointed out to me this morning that there are some nice articles online about an effort between ST Microelectronics and Fraunhofer around the 3GP DASH (Dynamic Adaptive Streaming over HTTP) standard. There is for instance this article in Thinq magazine and this article on TMCnet. What you might not know, and which is even cooler, is that Emanuele Quacchio will be speaking about this GStreamer-based DASH implementation at the GStreamer Conference later this month. So if you haven't signed up for the GStreamer Conference already, then maybe this is the time to do it :)

I am really excited about this year's GStreamer Conference, as we have a lot of ongoing efforts about to come to fruition. From Collabora, Wim Taymans will be talking about the GStreamer 1.0 effort, which we expect to have out before year's end, and Tim-Philipp Müller will speak about a lot of the other incredible advances we have made over the last year. Being in the middle of it, I think it is easy to go a bit blind to the progress since it happens gradually, but consider the new parsing libraries that Thibault Saunier has been working on, which will enable much quicker and better support for things like libva and vdpau plugins in GStreamer. Or the new base classes that Mark Nauwelaerts has now ported most of our plugins over to, which in one fell swoop improved our plugin quality by leaps and bounds. And of course there are things like the GStreamer Editing Services (GES), discoverer and encodebin, all created by Edward Hervey, which will make applications like the Transmageddon video transcoder and remuxer and the PiTiVi video editor a lot easier to develop.

We will also be doing some really cool demonstrations of stuff we have been working on at Collabora at the Linux Con Showcase on Thursday. Thanks to GES we have a great demo of a mobile editing solution using either QML or HTML5. We have HTML5 video calling using Telepathy, and we have video calling using Telepathy from the Media Explorer media center solution.

Another talk that I will be sure not to miss is Jan Schmidt's on Blu-ray playback with GStreamer. In addition to being technically interesting, Jan's talks are always fun; last year he did his presentation using GStreamer instead of something like LibreOffice, having created his slides as a DVD menu through a small program he wrote to turn SVG files into DVD menus :)

Media Explorer

I discovered a new piece of software using GStreamer the other day called Media Explorer. It is a nice media center type solution for the desktop and MeeGo devices. The system has been developed mostly by Intel engineers, but they have now made it freely available. I tried it on my Fedora system last evening and it seems to have a bit of a MeeGo bias currently, as it complained about Connman being installed and also didn't look for media in the desktop media directories, but I am sure those are smaller issues that can quickly be sorted out. Thomas Wood wrote a nice blog entry about it a couple of weeks ago, with some more details.

Anyway, if you are looking for a Linux-based media center UI this might be what you are looking for; personally I will try to see if I can get it going on my little Panda board at home.

Update: Also noticed there is a nice article about Media Explorer on linux.com.