Almost a year ago, I decided to quit my job and start my own business. Video coding technology in general, and VP9 specifically, seemed interesting enough that I should be able to build a business on top of it, right? The company is called Two Orioles.
As a first product, I’ve created a VP9 bitstream analyzer. What’s a bitstream analyzer? It’s a tool to analyze the VP9 bitstream, of course! It visualizes the coding tools used in each VP9 frame, such as block/transform decompositions, intra/inter prediction modes, and segmentation maps. It also displays the frame buffer at each decoding stage (prediction, pre-loopfilter, final reconstruction), the differences between each of these stages, and the error between each stage and the source. Finally, it can export block-, frame- and stream-level statistics to external tools (e.g. Google Sheets or Microsoft Excel) for further analysis.
I’m considering adding support for more codecs; let me know if you’re interested in that.
Good luck with your new venture. We would love to see an HEVC version.
Your test in the previous post was interesting, but not a fair comparison. You mention that “all forms of threading/tiling/slicing/wpp were disabled”. Why? x265 is highly multi-threaded, and it uses frame parallelism and Wavefront Parallel Processing by default. These features have minimal impact on quality, but a massive positive impact on performance. This is the way x265 was designed to be run from the start of our development. WPP is an integral part of the HEVC standard, and we expect most HEVC video to use WPP, as it makes both encoding and decoding run faster and more efficiently.
We’ve recently doubled x265 performance for most of our 10 presets. See http://x265.org/performance-presets/
I think it’s time to repeat these tests, using x265 default settings.
@Tom: thanks for the response! I had a conversation with Deepthi about this at VDD as well; you can watch it back on Vimeo/YouTube. The basic assumption was that we’re not so much interested in wall clock time as in data center cost. In that context, two threads are twice as expensive as one thread (per unit of time). If you take wall clock time as the metric of choice, you’re obviously correct, and the assumption would indeed change.
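A rough sketch of that cost argument (the price per core-hour and the 3.2x speedup for 4 threads are purely hypothetical numbers for illustration, not measurements of x265):

```python
# Toy cost model for the "data center cost vs. wall clock time" argument.
# You pay for every core for the whole duration of the run, so a
# multi-threaded job with imperfect scaling costs more in core-hours
# even though it finishes sooner.

PRICE_PER_CORE_HOUR = 0.05  # hypothetical $/core/hour

def encode_cost(cores, wall_hours):
    """Dollar cost of one encode job: cores x hours x price."""
    return cores * wall_hours * PRICE_PER_CORE_HOUR

single = encode_cost(cores=1, wall_hours=10.0)       # 1 thread, 10 h
multi = encode_cost(cores=4, wall_hours=10.0 / 3.2)  # 4 threads, assumed 3.2x scaling

print(f"single-threaded: ${single:.3f}")  # $0.500
print(f"4 threads:       ${multi:.3f}")   # $0.625 -- faster wall clock, more core-hours
```

Under wall-clock accounting the 4-thread run wins (3.125 h vs 10 h); under per-core-hour accounting the single-threaded run wins, which is exactly where the two viewpoints diverge.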
The new performance metrics look great. I’ll try to run new comparisons sometime soon if I can find the time; it should be very exciting. Thanks for your work!
@rbultje: thanks for your response! I understand that one particularly big video service operator thinks this way (running encode jobs on a single thread). But the vast majority of companies and end-users run their jobs multi-threaded.
In some ways single-threaded execution is more efficient, as a single execution thread always has work to do (it never waits on results from another thread). But in other ways it’s less efficient: the physical CPU cache is shared by the many threads running on the same machine, each poisoning the cache with data irrelevant to the others, so cache misses increase.
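One standard way to make the threading-efficiency trade-off concrete is Amdahl’s law (a general model, not a claim about x265 specifically): if a fraction p of the work parallelizes perfectly, the speedup on n threads is 1/((1-p) + p/n), so per-thread efficiency falls below 100% as n grows, whereas n independent single-threaded jobs keep full aggregate throughput. The 0.95 parallel fraction below is an illustrative assumption:

```python
# Amdahl's law: speedup of one job split across n threads, given
# that only a fraction p of the work parallelizes. The remaining
# (1 - p) runs serially, capping per-thread efficiency below 100%.

def amdahl_speedup(p, n):
    """Speedup on n threads when fraction p of the work parallelizes."""
    return 1.0 / ((1.0 - p) + p / n)

p = 0.95  # hypothetical: 95% of the encode parallelizes
for n in (1, 2, 4, 8):
    s = amdahl_speedup(p, n)
    print(f"{n} threads: {s:.2f}x speedup, {s / n:.0%} per-thread efficiency")
```

With these numbers, 8 threads yield roughly a 5.9x speedup, i.e. about 74% per-thread efficiency, which is the gap the per-core-hour accounting penalizes.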
@Tom: I’ll see if I can add meaningful data about threaded encoding to the next iteration. Nothing beats data, right? Maybe even $bigserviceoperator can change their ways if the data shows that it’s beneficial to their bottom line.
Totally off-topic, but I found your blog!
I was looking up VP8 stuff and found that your other blog had died; I could only use archive.org, and I couldn’t really find anything linking to your other handle.
I’ve recently been learning some things about VP8, and all of this is really interesting!