One thing we are doing here at Red Hat Brno is maintaining Firefox for Fedora and RHEL. The job is mostly focused on making sure we have Firefox available on all RHEL versions with all the latest security fixes, but it also gives our great team of Martin Stransky and Jan Horak some time to work on adding new features to Firefox to make sure it feels like a more integrated part of your desktop. They are currently working on 3 such features that you will hopefully be able to enjoy soon. The first is a patch to inhibit the screensaver when you are watching HTML5 or Flash content fullscreen. So if you are annoyed by having to move your mouse every 3 minutes to avoid the screen dimming when watching The Daily Show, this is the fix for you. The second item they are working on is enabling the GStreamer backend in Firefox on Fedora, which means that if you install, for instance, H264 support for Totem you will also have H264 support for HTML5 in Firefox. And finally there is also ongoing work on adding support for GIO in Firefox to make sure that any setup that works with GIO in terms of remote file access also works with Firefox. This last task is taking some time though, as it is currently blocked on some code refactoring in Firefox.
So the Linux based Steam Gaming Console has been released, or at least one version of it. It is called Piston and it seems quite nice looking.
Personally I think this device has the potential to truly transform the Linux desktop and gaming market. If this thing takes off it could, for instance, make Linux drivers the top priority for the makers of graphics chips. And people specializing in gaming oriented high end PCs would also be likely to start offering those machines with Linux.
So I don’t know about you, but I will for sure buy one of these boxes when it comes out.
As mentioned in my previous blog entry I am working on multistream handling in Transmageddon. There have not been a lot of changes, but I have been able to put in a little time here and there. The changes needed to accommodate this have also cleaned up the codebase quite a bit in my opinion, moving from a forest of variables to a list of Python dictionaries. This change makes keeping track of what's happening in the codepath a lot easier, as I can now just print the dictionary from the list to see what all the relevant values are at a given point. Anyway, a little screenshot below to show where I am at:
Still quite a bit of work to do to clean up the codebase and decide how certain things are to be handled (or not handled), but it is getting there. The screenshot above actually demonstrates one thing I haven’t decided on yet, which is how to deal with combining a device preset with a multistream file.
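To give an idea of what the list of dictionaries mentioned above buys you, here is a minimal sketch. The keys, the stream ids and the helper names are all made up for illustration and are not taken from the actual Transmageddon code.

```python
from pprint import pformat


def new_stream(stream_id, kind, codec, language=None):
    """Return one dictionary describing a single stream in the input file.

    The keys are illustrative assumptions, not Transmageddon's real fields.
    """
    return {
        "streamid": stream_id,
        "type": kind,
        "codec": codec,
        "language": language,
        "passthrough": False,
    }


# One dictionary per stream replaces a separate variable per value.
streams = [
    new_stream("abc/001", "video", "video/x-theora"),
    new_stream("abc/002", "audio", "audio/x-vorbis", language="en"),
    new_stream("abc/003", "audio", "audio/x-vorbis", language="de"),
]


def audio_streams(stream_list):
    """Return only the records that describe audio streams."""
    return [s for s in stream_list if s["type"] == "audio"]


def dump_streams(stream_list):
    """Format every record so the full state can be printed in one go."""
    return "\n".join(pformat(s) for s in stream_list)


# Printing the whole list shows all relevant values at this point in the code.
print(dump_streams(streams))
```

Adding a new per-stream setting then only means adding one more key, rather than yet another set of variables.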
The biggest blocker currently for finishing this work is that the GStreamer encodebin element does not yet have an API for selecting encoding settings for multiple streams, as detailed in this bug report. If anyone has the inclination to cook up a patch for encodebin which adds support for this, that would be much appreciated.
Anyway, once I have this completed I think my next step will be to try to add some kind of DVD ripping support to Transmageddon and some basic metadata checking/editing, move the video flipping support into a special menu, and add support for enabling/disabling deinterlacing in that same special menu. I am trying to figure out as I go along how I can keep the user interface simple and straightforward and still add requested features. The question that I continuously ask myself is which features belong in Transmageddon and which features are of a level where people should go to something like PiTiVi instead.
Thanks to Sebastian Dröge there is a new thing in GStreamer called streamid. It basically gives all streams inside a given file a unique id, making files with multiple streams a lot easier to deal with. This streamid is also supported by the GStreamer discoverer object. So once you have identified the contents of a file with discoverer you can be sure to grab the exact stream you want coming out of (uri)decodebin by checking the pad for the streamid. The most common use case for this is of course files with multiple audio streams in different languages.
From the output of Discoverer the stream id is really easy to get: you simply ask the stream object you get out of Discoverer for its stream id.
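As a rough sketch of the Discoverer side, assuming GStreamer 1.x with the GstPbutils introspection bindings installed: the helper names audio_stream_ids and discover are my own inventions for this example, while get_audio_streams, get_stream_id and get_language are the GstPbutils calls I would expect to use here.

```python
def audio_stream_ids(info):
    """Map the stream id of every audio stream to its language (or None).

    `info` is expected to behave like a GstPbutils.DiscovererInfo.
    """
    return {
        stream.get_stream_id(): stream.get_language()
        for stream in info.get_audio_streams()
    }


def discover(uri, timeout_seconds=5):
    """Run Discoverer on `uri` and return the resulting info object.

    The imports live inside the function so that audio_stream_ids() can be
    used on its own; this part needs GStreamer 1.x and PyGObject.
    """
    import gi
    gi.require_version("Gst", "1.0")
    gi.require_version("GstPbutils", "1.0")
    from gi.repository import Gst, GstPbutils

    Gst.init(None)
    discoverer = GstPbutils.Discoverer.new(timeout_seconds * Gst.SECOND)
    return discoverer.discover_uri(uri)
```

With a real file you would call something like audio_stream_ids(discover(uri)) with the URI of your own media file, and then pick the id belonging to the language you are after.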
On the pad you get from decodebin or uridecodebin the path is a bit more convoluted, but not too hard once you know how (there might be some kind of convenience API added for this at some point).
Before you connect the pad you get from the bin you attach a probe to it like this:
src_pad.add_probe(Gst.PadProbeType.EVENT_DOWNSTREAM, self.padprobe, None)
Then in the probe function you define you can extract the stream id with the parse_stream_start call as seen below:
def padprobe(self, pad, probeinfo, userdata):
    event = probeinfo.get_event()
    eventtype = event.type
    # The stream-start event is the one that carries the stream id
    if eventtype == Gst.EventType.STREAM_START:
        streamid = event.parse_stream_start()
    return Gst.PadProbeReturn.OK
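To tie the two halves together, here is a minimal sketch of how the ids picked from Discoverer could be matched against the id seen in the pad probe. The StreamSelector class and its method names are hypothetical helpers made up for this example, not GStreamer or Transmageddon API. The probe function would simply hand over the pad and the string it got from parse_stream_start.

```python
class StreamSelector:
    """Remember which pads carry the stream ids chosen from Discoverer.

    Hypothetical helper for illustration only.
    """

    def __init__(self, wanted_ids):
        self.wanted_ids = set(wanted_ids)
        self.matched = {}  # stream id -> pad that carries it

    def on_stream_start(self, pad, stream_id):
        """Record the pad if its stream id is one we asked for.

        Returns True when the pad should be linked, False otherwise.
        """
        if stream_id in self.wanted_ids:
            self.matched[stream_id] = pad
            return True
        return False

    def all_found(self):
        """True once every wanted stream id has shown up on some pad."""
        return self.wanted_ids == set(self.matched)
```

Keeping the matching logic outside the probe callback means it can be exercised with plain strings, without a running pipeline.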
I have been using this code in my local copy of Transmageddon to start implementing support for files with multiple audio streams (also supporting multiple video streams would be easy, but I am not sure how useful it would be). I have a screenshot of my current development snapshot below, but I am still trying to figure out what would be a nice way to present it. The current setup will look quite crap if the incoming file has more than a few audio streams. Suggestions welcome.
One feature that would be of interest to us in the Empathy Video Conference client is the ability to record conversations. Due to that I have been putting together a simple prototype Python test application in free moments to verify that everything works as expected, before any effort is put into doing any work inside Empathy.
The sample code below requires two webcams to be connected to your system to work. It basically takes the two camera video streams, puts one of them through an encode/RTP/decode process (to roughly emulate what happens in a video call) and puts a text overlay onto the other to let the conference participant know the call is being recorded. The two video streams are then mixed together and displayed. In the actual application the combined stream would of course be saved to disk instead, and audio would also be captured and mixed.
Whether we ever get around to working on this feature is an open question, but at least we can now assume that it is likely to work. Of course getting one stream in over the network over RTP is very different from what this sample does, so that might uncover some bugs.
The sample also works with Python3, so even though it is only a prototype it already fulfils the GNOME Goal.
import signal

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst
from gi.repository import GObject

GObject.threads_init()
Gst.init(None)


class VideoBox():
    def __init__(self):
        self.mainloop = GObject.MainLoop()

        # Create the pipeline
        self.pipeline = Gst.Pipeline()

        # One source element per webcam
        self.v4lsrc1 = Gst.ElementFactory.make('v4l2src', None)
        self.v4lsrc1.set_property("device", "/dev/video0")
        self.pipeline.add(self.v4lsrc1)

        self.v4lsrc2 = Gst.ElementFactory.make('v4l2src', None)
        self.v4lsrc2.set_property("device", "/dev/video1")
        self.pipeline.add(self.v4lsrc2)

        # Restrict the first camera to 320x240
        camera1caps = Gst.Caps.from_string("video/x-raw, width=320,height=240")
        self.camerafilter1 = Gst.ElementFactory.make("capsfilter", "filter1")
        self.camerafilter1.set_property("caps", camera1caps)
        self.pipeline.add(self.camerafilter1)

        # Theora encode -> RTP pay -> RTP depay -> decode, to roughly
        # emulate what happens to the video in a call
        self.videoenc = Gst.ElementFactory.make("theoraenc", None)
        self.pipeline.add(self.videoenc)

        self.videodec = Gst.ElementFactory.make("theoradec", None)
        self.pipeline.add(self.videodec)

        self.videortppay = Gst.ElementFactory.make("rtptheorapay", None)
        self.pipeline.add(self.videortppay)

        self.videortpdepay = Gst.ElementFactory.make("rtptheoradepay", None)
        self.pipeline.add(self.videortpdepay)

        # Overlay telling the participant that the call is being recorded
        self.textoverlay = Gst.ElementFactory.make("textoverlay", None)
        self.textoverlay.set_property("text", "Talk is being recorded")
        self.pipeline.add(self.textoverlay)

        # Restrict the second camera to 320x240
        camera2caps = Gst.Caps.from_string("video/x-raw, width=320,height=240")
        self.camerafilter2 = Gst.ElementFactory.make("capsfilter", "filter2")
        self.camerafilter2.set_property("caps", camera2caps)
        self.pipeline.add(self.camerafilter2)

        self.videomixer = Gst.ElementFactory.make('videomixer', None)
        self.pipeline.add(self.videomixer)

        # A negative "left" value adds a 320 pixel border on the left side,
        # and border-alpha 0 makes that border transparent
        self.videobox1 = Gst.ElementFactory.make('videobox', None)
        self.videobox1.set_property("border-alpha", 0)
        self.videobox1.set_property("top", 0)
        self.videobox1.set_property("left", -320)
        self.pipeline.add(self.videobox1)

        self.videoformatconverter1 = Gst.ElementFactory.make('videoconvert', None)
        self.pipeline.add(self.videoformatconverter1)

        self.videoformatconverter2 = Gst.ElementFactory.make('videoconvert', None)
        self.pipeline.add(self.videoformatconverter2)

        self.videoformatconverter3 = Gst.ElementFactory.make('videoconvert', None)
        self.pipeline.add(self.videoformatconverter3)

        self.videoformatconverter4 = Gst.ElementFactory.make('videoconvert', None)
        self.pipeline.add(self.videoformatconverter4)

        self.xvimagesink = Gst.ElementFactory.make('xvimagesink', None)
        self.pipeline.add(self.xvimagesink)

        # First camera: text overlay, then shifted sideways into the mixer
        self.v4lsrc1.link(self.camerafilter1)
        self.camerafilter1.link(self.videoformatconverter1)
        self.videoformatconverter1.link(self.textoverlay)
        self.textoverlay.link(self.videobox1)
        self.videobox1.link(self.videomixer)

        # Second camera: through the encode/RTP/decode chain into the mixer
        self.v4lsrc2.link(self.camerafilter2)
        self.camerafilter2.link(self.videoformatconverter2)
        self.videoformatconverter2.link(self.videoenc)
        self.videoenc.link(self.videortppay)
        self.videortppay.link(self.videortpdepay)
        self.videortpdepay.link(self.videodec)
        self.videodec.link(self.videoformatconverter3)
        self.videoformatconverter3.link(self.videomixer)

        # Mixed output goes to the screen
        self.videomixer.link(self.videoformatconverter4)
        self.videoformatconverter4.link(self.xvimagesink)

    def run(self):
        self.pipeline.set_state(Gst.State.PLAYING)
        self.mainloop.run()


if __name__ == "__main__":
    # Restore the default handler so Ctrl+C stops the application
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    app = VideoBox()
    app.run()
GStreamer makes assembling advanced video applications quite easy, in fact so easy that even I can write such an application in Python. What I have had a lot more issues with is understanding how to deal with things like USB cameras and such. Well, luckily the developers of Cheese realized this and created libcheese to help. libcheese is today used by Cheese itself of course, but also by Empathy for its camera handling.
Since I have been thinking about adding some kind of video recording support in Transmageddon I wanted to test libcheese from Python. Unfortunately there were no Python examples available anywhere online, so I had to write my own example.
With some pointers from David King I managed to put the following Python code together.
import sys

from gi.repository import Gtk
from gi.repository import Cheese
from gi.repository import Clutter
from gi.repository import Gst

Gst.init(None)
Clutter.init(sys.argv)


class VideoBox():
    def __init__(self):
        # Clutter stage that hosts the video texture
        self.stage = Clutter.Stage()
        self.stage.set_size(400, 400)
        self.layout_manager = Clutter.BoxLayout()
        self.textures_box = Clutter.Actor(layout_manager=self.layout_manager)
        self.stage.add_actor(self.textures_box)

        self.video_texture = Clutter.Texture.new()
        self.video_texture.set_keep_aspect_ratio(True)
        self.video_texture.set_size(400, 400)
        self.layout_manager.pack(self.video_texture, expand=False,
                                 x_fill=False, y_fill=False,
                                 x_align=Clutter.BoxAlignment.CENTER,
                                 y_align=Clutter.BoxAlignment.CENTER)

        # Set up the camera and start showing its video in the texture
        self.camera = Cheese.Camera.new(self.video_texture, None, 100, 100)
        Cheese.Camera.setup(self.camera, None)
        Cheese.Camera.play(self.camera)

        def added(signal, data):
            # Called for every camera found; switch the feed to that camera
            uuid = data.get_uuid()
            node = data.get_device_node()
            print("uuid is " + str(uuid))
            print("node is " + str(node))
            self.camera.set_device_by_device_node(node)
            self.camera.switch_camera_device()

        device_monitor = Cheese.CameraDeviceMonitor.new()
        device_monitor.connect("added", added)
        device_monitor.coldplug()

        self.stage.show()
        Clutter.main()


if __name__ == "__main__":
    app = VideoBox()
The application creates a simple Clutter window to host the stream from the webcam. So when you run the application it should display the video from the system webcam. Then if you plug a second webcam into a USB port it will switch the video feed to that stream. Not a very useful application in itself, but hopefully enough to get you started on using libcheese from Python. You can find the libcheese API docs here; they are for C, but the Python API from GObject Introspection follows it so closely that you should be able to find the right calls. And remember, for figuring out exact API names ipython is your friend.
P.S. You need Cheese 3.6 installed to be able to use libcheese with Python, a version which will be in Fedora starting with Fedora 18.
At Red Hat we are involved with a lot of cool open source projects. One of these is the popular LibreOffice productivity Suite, where we are putting in a lot of effort to make sure Red Hat customers and the community in general have a dependable and feature rich Office Suite available.
In addition to of course doing work to add features requested by Red Hat customers, the team focuses on helping build the upstream project and making sure we help push desktop integration forward.
In fact the work done by Caolán McNamara, David Tardon, Stephan Bergmann, Michael Stahl and Eike Rathke is making Red Hat a major contributor to LibreOffice. So to celebrate the success of our team so far we wanted to have some nice t-shirts made for this year's LibreOffice conference in Berlin to give to the team. It would have added a nice little touch to a conference where Caolán did a talk about his cool widget layout work (*1), Michael did a talk about the migration of LibreOffice to gbuild, Stephan did a talk about API stability and Eike did a talk about collaborative editing.
Unfortunately the t-shirts came back late from the printer and thus missed the conference, but I will be sending them out to the team today so that they have them ready for the next LibreOffice event.
Anyway, a big thank you from me to the team. They have been a pleasure to work with since I joined Red Hat and I am looking forward to seeing what we will achieve over the next years.
One project we have been working on here at Red Hat Brno is making sure we have nicely working voice and video calling with Empathy in Fedora 18. The project is being spearheaded by Debarshi Ray, with me trying to help out with the testing. We are still not there, but we are making good progress thanks to the help of people like Brian Pepple, Sjoerd Simons, Olivier Crete, Guillaume Desmottes and more.
But having been involved with open source multimedia for so long I thought it could be interesting for people to know why free video calling has taken so long to get right and why we still have a little bit to go. So I decided to do this write-up of some of the challenges involved. Be aware though that this article will mostly discuss the general historical challenges of getting free VoIP up and running, but I will try to tie that into the specific issues we are currently trying to resolve where relevant.
The first challenge that had to be overcome was the challenge of protocols. VoIP and video calling have been around for a while (which an application like Ekiga is proof of), but they have been hampered by a jungle of complex standards, closed protocols, lack of interoperability and so on. Some of the older standards also require non-free codecs to operate. The open standard that has started to turn this around is XMPP, which is the protocol that came out of the Jabber project. Originally it was just an open text chat network, but thanks to ongoing work it now features voice and video conferencing too. It also got a boost as Google chose it as the foundation for their GTalk offering, ensuring that anyone with a Gmail address suddenly was available to chat or call. That said, like any developing protocol it has its challenges, and some slight differences in behaviour between a Google Jabber server and most others are causing us some pain with video calls currently, which is one of the issues we are trying to figure out how to resolve.
Codecs and interoperability
The other thing that has hounded us is the combination of non-free codecs and the need for interoperability. For a video calling system to be interesting to use you need to be able to use it to contact at least a substantial subset of your friends and family. For the longest time this meant using a non-free codec, because if you relied solely on free codecs no widely used client out there would be able to connect with you. But the effort of Xiph.org to create first the Speex audio codec and most recently the Opus audio codec, together with the later adoption of Speex by Google, has at least mostly resolved things on the audio side. On the video side things are still not 100% there. We have the Theora video codec from Xiph.org, but unfortunately when the RTP specification for that codec was written, the primary use case in mind was RTSP streaming and not video conferencing, making the Theora RTP payload a bit hairy to use for video conferencing. The other, bigger issue with Theora is that outside the Linux world nobody adopted Theora for video calling, so once again you are not likely to be able to use it to call a very large subset of your friends and family unless they are all on Linux systems.
There might be a solution on the way though in the form of the new kid on the block, VP8. VP8 is a video codec that Google released as part of their WebM HTML5 video effort. The RTP specification for VP8 is still under development, so adoption is limited, but the hope and expectation is that Google will support VP8 in their GTalk client once the RTP specification is stable, and thus we should have a good set of free codecs for both audio and video available and in the hands of a large user base.
Video calling is a quite complex technical issue, with a lot of components needing to work together: acquiring audio and video on your local machine, integrating with your address book, negotiating the call between the parties involved, putting everything into RTP packets on one side and unpacking and displaying them on the other side, all while taking into account the network, firewalls and audio and video sync. So in order for a call to work you will need (among others) ALSA, PulseAudio, V4L2, GStreamer, Evolution Data Server, Farstream, libnice, the XMPP server, Telepathy and Empathy to work together across two different systems. And if you want to interoperate with a 3rd party system like GTalk the list of components that all need to work perfectly with each other grows further.
A lot of this software has been written in parallel with each other and in parallel with evolving codecs and standards, and it tries to interoperate with as many 3rd party systems as possible. This has come at the cost of stability, which of course has turned people off from using and testing the video call functionality of Empathy. But we believe that we have now reached a turning point where the pieces are in place, which is why we are now trying to help stabilize and improve the experience to make VoIP and video conferencing calls work nicely out of the box on Fedora 18.
In addition to the nitty gritty of protocols and codecs there are other pieces that have been lacking to give users a really good experience. The most critical one is good echo cancellation. This is required in order to avoid having an ugly echo effect when trying to use your laptop's built-in speakers and microphone for a call. So people have been forced to use a headset to make things work reasonably well. This was a quite hard issue to solve, as there was neither any great open source code available which implemented echo cancellation nor a good way to hook it into the system. To start addressing this issue while I was working for Collabora Multimedia, we reached out to the Dutch non-profit NLnet Foundation, who sponsored us to have Wim Taymans work on creating an echo cancellation framework for PulseAudio. The goal was to create the framework within PulseAudio to support pluggable echo cancellation modules, turn two existing open source echo cancellation solutions into plugins for this framework as examples and proof of concept, and hope that the availability of such a framework would encourage other groups or individuals to release better echo cancellation modules going forward.
When we started this work the best existing open source echo cancellation system was SpeexDSP. Unfortunately SpeexDSP had a lot of limitations; for instance it could not work well with two soundcards, which meant using your laptop speakers for output and a USB microphone for input would not work. Although we can claim no direct connection, as things would have it Google ended up releasing a quite good echo cancellation algorithm as part of their WebRTC effort. This was quickly turned into a library and plugin for PulseAudio by Arun Raghavan. And this combined PulseAudio and WebRTC echo cancellation system is what we will have packaged and available in Fedora 18.
So I have outlined a few of the challenges around having a good quality VoIP and video conferencing solution shipping out of the box on a Linux distribution. And some of the items, like the video codec situation and general stack stability, are not 100% there yet. There are also quite a few bugs in Empathy in terms of behaviour, but Debarshi is already debugging those, and with the help of the Telepathy and Empathy teams we should hopefully get those issues patched and merged before Fedora 18 ships. Our goal is to get Empathy up to a level where people want to be using it to make VoIP and video calls, as that is also the best way to ensure things stay working going forward.
In addition to Debarshi, another key person helping us with this effort in the Fedora community is Brian Pepple, who is making sure we are getting releases and updates of GStreamer, Telepathy, Farstream, libnice and so on packaged for Fedora 18 almost on the day. This is making testing and verifying bugfixes a lot easier for us.
There are also some nice to have items we want to look at going forward after having stabilized the current functionality. For instance Red Hat and Xiph.org codec guru Monty Montgomery suggested we add a video noise reduction plugin to the GStreamer pipeline inside Empathy in order to improve quality and performance when using a low quality built-in web camera. [Edit: Sjoerd just told me the Gst 0.10 version of the code had such a plugin available, so this might not be too hard to resolve.]
Debarshi is also interested in seeing if we can help move the multiparty chat feature forward. But we are not expecting to be able to work on these issues before Fedora 18 is released.
GStreamer maintainer Wim Taymans decided that having a brand new GStreamer 1.x series was only worth the effort if we also had some nice up to date documentation for GStreamer 1.0. So over the last week he has been going over the GStreamer Application Development manual, making sure it is up to date, fixing all the code examples and even adding new chapters. So if you want to get into GStreamer development our introduction manual should now be a good starting point again!
So I got an email yesterday telling me I hit the 3 month milestone at Red Hat. It has been an incredible experience so far and it has been a joy meeting all the talented people working for Red Hat here in Brno. It has been a lot of ramping up for me, learning and understanding the Red Hat release processes and the tools we use. I think one of the things I have appreciated the most during my time here is seeing how Red Hat, even now as a large billion dollar company, has stayed so true to its values, and how the belief in the open source model is not just something you see among the engineers and technical staff, but even among the top management of the company, going all the way up to the CEO.
There are also a lot of really nice efforts and programs being run here in Brno, with the company having an impressive level of interaction and collaboration with the local universities. Red Hat staff do lectures, mentor bachelor and master projects and let a large number of students work with us as interns. And more often than not those internships lead to a job with Red Hat.
Yesterday we had a big party here in Brno to celebrate the opening of our new office building, as having reached a headcount of 500 here in Brno, the current building was starting to feel quite cramped. The bottom floors of the current building will be emptied out now and rebuilt, so that they are ready to house the next batch of new hires. We have a nice space set aside for the Brno desktop team in the new building, so this Friday we will all be moving there which I think will help us work even better together.
On a personal front my wife and daughter arrived last week after we finally managed to get my wife a visa, and this morning the moving truck arrived with our stuff from the UK. Next step is a raid on IKEA to fill in the missing pieces. So all set for the next months and years here.