So as I blogged about before, Collabora Multimedia has been doing a project with NLnet to improve echo cancellation support under Linux and PulseAudio. There were, and still are, a lot of challenges in getting it right, but we wanted to lay the groundwork for a system-wide solution, which is why we decided to implement it within PulseAudio.
For those wondering what echo cancellation actually means, it addresses the following problem: if you record sound from your laptop microphone while also playing sound through your speakers, the sound easily ends up looping, creating an irritating echo effect which makes voice calls on the machine painful and sometimes impossible. Echo cancellation systems basically try to analyse the data going out to the speakers so that they can filter it out and ignore it when it comes back through the microphone.
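To make the idea a bit more concrete, here is a minimal sketch of one classic approach, a normalised least-mean-squares (NLMS) adaptive filter. This is only an illustration of the general principle and is not the algorithm used by either of the cancellers mentioned later in this post. The function names, the number of taps, the step size and the simulated "room" response are all assumptions made up for the example.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=16, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the speaker echo from the mic signal."""
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent speaker samples, newest first
    cleaned = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate            # mic signal with the estimated echo removed
        energy = sum(h * h for h in history) + eps
        step = mu * error / energy           # normalise the update by the input energy
        weights = [w + step * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned

def simulate(n=4000, seed=1):
    """Play random noise through a made-up echo path and cancel it again."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    echo_path = [0.0, 0.0, 0.6, 0.3, -0.1]   # hypothetical speaker-to-mic response
    mic = [
        sum(g * far_end[i - k] for k, g in enumerate(echo_path) if i - k >= 0)
        for i in range(n)
    ]
    return mic, nlms_echo_cancel(far_end, mic)

def power(samples):
    """Mean squared amplitude of a block of samples."""
    return sum(s * s for s in samples) / len(samples)

if __name__ == "__main__":
    mic, cleaned = simulate()
    print("echo power before:", power(mic[-500:]))
    print("echo power after: ", power(cleaned[-500:]))
```

In this toy setup the speaker and microphone samples line up perfectly, so the filter converges quickly. Real systems have to cope with noise, people talking at the same time, and the two streams not being perfectly aligned in time.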
The final result is that we have created a virtual device pair which adds echo cancellation. These virtual devices are automatically used by your application if it announces itself as a ‘phone’ application to PulseAudio. Our Empathy messaging and video conferencing client does this, for instance. The bulk of our work is now done, and for many use cases things should just work as soon as the output of our effort gets merged into PulseAudio and packaged by the distributions. There are some open questions left, but we hope that by making this work available and working with people within the ALSA community we will be able to resolve the remaining issues over time. Anyway, let me just give you the small report Wim Taymans wrote to summarize the work and what has been done:
PulseAudio filter infrastructure
The echo-cancel module is currently built upon the virtual source and sink examples, which are for now considered to be the PulseAudio filter infrastructure.
We briefly looked into the ideas for a different, generic filter infrastructure for PulseAudio. Lennart Poettering’s first attempts to implement such an infrastructure were put on hold because of the considerable complexity with respect to latency and rewinds. Because echo cancellation is not purely a filter (it needs input from both sink and source), we decided to build the echo-cancel module as a virtual source and sink instead.
Echo cancelling for PulseAudio
A new module called ‘module-echo-cancel’ was added to PulseAudio. The module adds a new echo-cancel source and sink to the existing devices. All samples played to the echo-cancel sink get echo-cancelled from the samples captured from the echo-cancel source.
The module is built so that new echo cancellation algorithms can be plugged in very easily. Two echo cancellers are implemented already, one based on Speex DSP and another based on the code from Andre Adrian.
The echo-cancel source and sink currently proxy the default source and sink in PulseAudio. This can be changed with the pavucontrol application by changing the source and sink of the virtual streams.
Currently the echo cancellation code can deal with sources and sinks that share a common clock, such as those found on the same sound card.
For devices that don’t share a common clock, we don’t yet get accurate enough timings from PulseAudio to implement dynamic resampling. Most echo cancelling algorithms are extremely sensitive to drift and fail as soon as resampling is slightly inaccurate. This means that if you record sound using the USB microphone on your webcam, for instance, there is a good chance its clock will drift relative to the sound card driving your speakers and the echo cancellation will cease to work.
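To get a feeling for why drift matters so much, the following sketch works through the arithmetic. The 48 kHz sample rate and the 50 parts-per-million clock mismatch are assumed, illustrative numbers rather than measurements from our work, and the function name is made up for the example.

```python
def drift_after(seconds, nominal_rate_hz=48000, ppm_offset=50.0):
    """Return (samples, milliseconds) of misalignment between two free-running clocks.

    ppm_offset is the assumed rate difference between the playback and
    capture clocks, in parts per million.
    """
    samples = nominal_rate_hz * seconds * ppm_offset / 1_000_000
    milliseconds = samples / nominal_rate_hz * 1000
    return samples, milliseconds

if __name__ == "__main__":
    for seconds in (1, 60, 600):
        samples, ms = drift_after(seconds)
        print(f"after {seconds:4d} s: {samples:7.1f} samples ({ms:.2f} ms)")
```

Under these assumptions the two streams are already 144 samples (about 3 ms) apart after one minute, and the gap keeps growing unless one of the streams is resampled to follow the other.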
Enable echo cancellation in Empathy
Any application can connect to the new echo-cancel source and sink to use the provided echo-cancellation.
The new echo-cancel source and sink are tagged with the ‘phone’ media.role so that the module called ‘module-intended-roles’ will automatically link Empathy to the echo-cancel source and sinks.
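For applications that do not set a role themselves, one way to experiment is to pass stream properties through the PULSE_PROP environment variable, which the PulseAudio client library reads when an application connects. The sketch below assumes that mechanism is available in your PulseAudio version. The helper name and the ‘some-voip-app’ command are placeholders invented for the example.

```python
import os
import subprocess

def phone_role_env(base_env=None):
    """Return a copy of the environment that asks PulseAudio for the 'phone' role."""
    env = dict(os.environ if base_env is None else base_env)
    # Note: this replaces any PULSE_PROP value that was already set.
    env["PULSE_PROP"] = "media.role=phone"
    return env

def launch_as_phone(command):
    """Start a command (a list of arguments) with the 'phone' media role set."""
    return subprocess.Popen(command, env=phone_role_env())

if __name__ == "__main__":
    # 'some-voip-app' is a hypothetical program name; substitute your own.
    launch_as_phone(["some-voip-app"])
```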
Echo Cancellation test application
If echo cancellation doesn’t work on your system, there is a good chance it is due to bugs in the audio driver. To make it easier to test for this and to provide useful bug reports, we wrote a test application that can be run when echo cancellation doesn’t work. As with all such applications there is of course only a limited range of things we can test for, so it will not be able to detect all types of problems, but it should be able to expose some of the more common ones.
Given that the echo canceller has to run on more than just the ALSA API (PulseAudio has various backends), we implemented a new PulseAudio module called ‘module-test’ that will run a set of tests on all existing source and sink devices.
The current tests include measuring the accuracy of the time values reported by the internal clock. The accuracy of this clock is one of the most crucial parts of PulseAudio because it is directly used to estimate when samples will be played or captured. Having an accurate time value for when a particular sample was played and recorded is essential for implementing echo cancellation.
An application called ‘patest’ is provided that runs the tests and outputs the results on the standard output.
The output of the patest application contains information about the various devices that were tested, along with the min/max jitter and drift measured on those devices. The jitter and drift are mostly caused by inaccurate results returned from the ALSA drivers. Normal jitter would be around ±15 µs; a typical drift would be ±100 µs.
The module is constructed in such a way that more tests can be added later.
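As a rough illustration of what jitter and drift figures like these mean, here is a sketch that derives them from pairs of timestamps. This is not how patest itself computes its numbers. The function name, the input format and the exact definitions (drift as the total divergence between the two clocks over the run, jitter as the deviation of each reading from the straight line between the first and last reading) are assumptions chosen for the example.

```python
def clock_stats(readings):
    """Compute drift and min/max jitter from clock readings.

    readings: ordered list of (system_time_us, device_time_us) pairs,
    both in microseconds.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    sys_first, dev_first = readings[0]
    sys_last, dev_last = readings[-1]
    sys_elapsed = sys_last - sys_first
    if sys_elapsed <= 0:
        raise ValueError("system time must increase over the run")
    dev_elapsed = dev_last - dev_first
    # Drift: how far the device clock moved away from the system clock overall.
    drift_us = dev_elapsed - sys_elapsed
    # Jitter: deviation of each reading from the line through the endpoints.
    rate = dev_elapsed / sys_elapsed
    jitter = [
        (dev_t - dev_first) - rate * (sys_t - sys_first)
        for sys_t, dev_t in readings
    ]
    return {
        "drift_us": drift_us,
        "min_jitter_us": min(jitter),
        "max_jitter_us": max(jitter),
    }

if __name__ == "__main__":
    sample = [(0, 0), (1000, 1010), (2000, 1990), (3000, 3003)]
    print(clock_stats(sample))
```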
Future plans
We would really like to resolve the issue of using multiple sound cards, i.e. laptop speakers playing through the internal sound card while the microphone records through a USB sound card in a webcam. In order to do so we will try to reach out to some of the core ALSA developers and see if we can work together to figure out if something can be done. In some cases the hardware might not be good enough to resolve the problem in software, but we hope that in at least the majority of cases we will be able to compensate for hardware issues in the driver layer.
Another thing we hope to see is that, with this infrastructure in place, some commercial entities might decide to open source their echo cancellation algorithms, enabling us to plug in new ones that are more robust in terms of handling clock drift, for instance. Another possibility is that academic projects working on echo cancellation will now find it easy to put their algorithms into a production system, so we can see some innovation in this field happen in the open source space.