Theora in Empathy

Over the last few months a lot of effort has gone into getting Empathy up to scratch, not only as a chat client but also as an audio and video conferencing client. The long-term goal is to support all the major instant-messaging-based video conferencing systems in addition to stalwarts such as SIP. Being free software enthusiasts, we have of course been especially keen on combining the open XMPP protocol, used by Jabber and Google Talk, with a set of free audio and video codecs, namely Speex for audio and Theora for video.
[Screenshot: Empathy video conference, Sjoerd and myself talking]

Because Theora RTP is a bit special, we had to wait for Farsight2 to be integrated into Empathy before this could work. Now, using our eminent stack of GStreamer, Farsight and Telepathy, it works on my computer, as the screenshot above shows. It requires the very latest versions of everything, and there are still some bugs that need hammering out, but all in all we are very close to reaching the goal. Once the relevant versions of Empathy and the underlying libraries have trickled out into the distribution ecosystem over the next few months, we will finally have XMPP-based Speex+Theora video conferencing working out of the box between all Linux and Unix systems. Empathy also supports audio calls with Google Talk, and video calls with Nokia N810 devices over XMPP if you install the relevant H.263 encoders and decoders for GStreamer. We’re still working on support for the new GMail Video Chat plugin. It initially supported H.264 SVC, and although it now also supports H.264 AVC, we can’t yet get it to decode streams created by the x264 encoder.

27 thoughts on “Theora in Empathy”

  1. Wow, that’s awesome.

    Once it works with the new web-based GTalk, I can finally ditch Skype to talk to family and friends :)

    Great, thanks a thousand times.

  2. Cool, but the interface needs a refresh, the buttons take way too much space. I would also prefer an overlay where the local cam input is placed on the larger external cam input.

  3. @menko: it’s a VU meter for the microphone, so you can visually see that you are actually producing audio :)

    @Hans: yeah, although functionality comes first, so step 1 is getting Empathy where we want it in terms of supporting protocols etc., and the UI cleanup/improvement is step 2.

  4. >> Cool, but the interface needs a refresh, the buttons take way too much space.
    I think it respects the GNOME button settings

    >> I would also prefer an overlay where the local cam input is
    >> placed on the larger external cam input.
    +1 for this!

    Great work, thanks to everyone involved!

  5. “Once it works with the new web-based GTalk, I can finally ditch Skype to talk to family and friends :)”

    Very yes.

  6. great. but do you think we will see some encryption stuff for this like ZRTP? or would at least Zfone be able to sniff and encrypt the streams?

    i am a bit curious about such things, as it seems that all free software in this area doesn’t care much about privacy. best example is ekiga where the ZRTP support is postponed again :(. this is so sad as skype can’t do real privacy because of its CS nature, but it still beats us with a “basic level” of privacy. meaning “only” skype/ebay and probably some government agencies can access the audio and video streams somehow, where all the FOSS apps send those streams completely unencrypted via the internet.

    but anyway great to see these improvements! :)

  7. @bojo42: there are plans for adding security mechanisms to the system to ensure privacy, but probably not ZRTP for various reasons I will not go into here. If you want the details I suggest going to #telepathy on the freenode IRC server.

  8. @bojo42: To follow up on Christian, we’re not likely to work on ZRTP ourselves. Our first plan is to implement end-to-end encryption in XMPP; then we can use SRTP with shared session keys exchanged over the secure signalling channel. The current proposal is to use Jingle to set up peer-to-peer streams, and then just use normal TLS to encrypt them. Each client would create its own X.509 certificate (behind the scenes) and verify the fingerprints of others. We’d then at least have “leap of faith” like ssh, or could implement channel binding to make sure the peer has a shared secret exchanged by other means. Using TLS also means that a more enterprise-style deployment can set up CAs, signature chains, CRLs, etc. if it wants to.

  9. Are there any plans to support webcams with the GMail web-based interface? Specifically, I want to see friends who are using the Windows-only add-on that Google has provided.

  10. Maybe this isn’t the right place to ask, but are there any plans to implement basic Voice/Video for other protocols like MSN or Yahoo in the nearer future (like let’s say end of ’09)? Really excited about this development, and Empathy seems to be shaping up to be quite an impressive IM system.

  11. Nice work! More little updates on the Empathy project would be awesome.

    I’ve been using Empathy to make voip calls for a while. All it really needs is a proper dialpad, or some way of storing frequently used phone numbers

  12. Great work !! it’s awesome. So many questions

    Questions about the experience itself: is there any lag? How CPU intensive is it? What is more important: CPU / graphics card / IP connection (bandwidth/ping)?

    To make this a standard… wouldn’t it be nice to have some framework to integrate empathy in totally different GUIs ? (ie for instance develop a full screen UI for a mobile, or in a car, or a widget on a computer… all kinds of derivatives)

  13. @frustphil, bash: Yes, MSN support is in development, Yahoo is on the todo list, but it might take a bit longer to get there (unless someone in the community steps up to the plate, that is :)

    @Nick: I don’t have any numbers for you, but the Speex codec was created for this exact sort of thing so it should be fine. And Theora is by today’s standards a fairly simple and low-CPU codec.

    As for a framework, yes: in some sense Empathy was created to show off the underlying framework. Telepathy is the framework offered to do this, and it can be used with a lot of different user interfaces.

  14. @Robert: great that you have such plans for privacy :) but as you seem to make really good progress and AFAIK the new end-to-end stuff in XMPP will still take a while, do you think users should be able to use Zfone in the meantime in case Empathy’s privacy stuff isn’t ready when the basic infrastructure is? of course you have nothing to do with Zfone and it’s not even real OS or FOSS, but i am asking this from a user perspective, as i am very keen on the great work you’re doing and can’t wait to use it, but i won’t feel safe using Empathy without basic privacy and Zfone is AFAIK the only working solution for free software right now.

    and BTW thank you both for your responses and don’t mind my greedy requesting of privacy features ;)

  15. @bojo42: I don’t think the end-to-end stuff in XMPP will take too long: we’ve already got a working Jingle state machine, peer-to-peer SOCKS5 and IBB streams, and an XMPP library inside Salut (our link-local backend) which supports peer-to-peer connections and TLS, so for us it’s just a bit of refactoring to plug all of these things together. In the meantime, I don’t make any prescriptions about what users should and shouldn’t use. I personally will stick to free software, but if encrypting your voice calls is more important to you than free software, feel free to use what you want.

  16. I tried 5 FOSS SIP-phones inside NAT. I still haven’t really tried to do it where both people are behind NAT. Empathy kinda-worked.

    But I’ve got servers: is there any software I can easily set up to make Empathy indestructible for all my users, so we can ditch Skype (+normal phones)?

    I *love* the development, seems we’re finally close to «getting there». :-)
