March 22, 2017

Christian Schaller: Another media codec on the way!

(Christian Schaller)

One of the things we are working hard on currently is ensuring you have the codecs you need available in Fedora Workstation. Our main avenue for doing this is looking at the various codecs out there and trying to determine whether the intellectual property situation allows us to start shipping all or parts of the technologies involved. This was how we were able to start shipping mp3 playback support in Fedora Workstation 25. In cases where the IP situation obviously does not allow that, we have things like the agreement with our friends at Cisco that lets us offer H264 support using their licensed codec, which is how OpenH264 became available in Fedora Workstation 24.

As you might imagine, clearing a codec for shipping is a slow and labour-intensive process, with lawyers and engineers spending a lot of time reviewing things to figure out what can be shipped when and how. I am hoping to have more announcements like this coming out during the course of the year.

So I am very happy to announce today that we are now working on packaging the codec known as AC3 (also known as A52) for Fedora Workstation 26. The name AC3 might not be very well known to you, but AC3 is part of a set of technologies developed by Dolby and marketed as Dolby Surround. This means that if you have video files with surround sound audio, it is most likely something we can play back with an AC3 decoder. AC3/A52 is also used for surround sound TV broadcasts in the US, and it is the audio format used by some Sony and Panasonic video cameras.

We will be offering AC3 playback in Fedora Workstation 26, and we are looking into options for offering an encoder. To be clear, there is nothing stopping us from offering an encoder apart from finding an implementation that can be packaged and shipped with Fedora with a reasonable amount of effort. The best-known open source implementation we are aware of is the one found in ffmpeg/libav, but extracting a single codec from ffmpeg or libav to ship is a lot of work and not something we currently have the resources to do. We found another implementation called aften, but that seems to have been unmaintained for years; we will still look at it to see if it could be used.
So if you are interested in AC3 encoding support, we would love it if someone started working on a standalone AC3 encoder we could ship, be that by picking up maintainership of Aften, splitting out AC3 encoding from libav or ffmpeg, or writing something new.

If you want to learn more about AC3, the best place to look is probably the Wikipedia page for Dolby Digital, or the A/52 ATSC audio standard document for more of a technical deep dive.

by uraeus at March 22, 2017 05:02 PM

March 18, 2017

Jean-François Fortin Tam: Defence against the Dark Arts involves controlling your hardware

In light of the Vault 7 documents leak (and the rise to power of Lord Voldemort this year), it might make sense to rethink just how paranoid we need to be.  Jarrod Carmichael puts it quite vividly:

I find the general surprise… surprising. After all, this is in line with what Snowden told us years ago, which was already in line with what many computer geeks thought deep down inside for years prior. In the good words of monsieur Crête circa 2013, the CIA (and to an extent the NSA, FBI, etc.) is a spy agency. They are spies. Spying is what they’re supposed to do! 😁

Well, if these agencies are really on to you, you’re already in quite a bit of trouble to begin with. Good luck escaping them, other than living in an embassy or airport for the next decade or so. But that doesn’t mean the repercussions of their technological recklessness—effectively poisoning the whole world’s security well—are not something you should ward against.

It’s not enough to just run FLOSS apps. When you don’t control the underlying OS and hardware, you are inherently compromised. It’s like driving over a minefield with a consumer-grade Hummer while dodging rockets (at least use a hovercraft or something!) and thinking “Well, I’m not driving a Ford Pinto!” (but see this post where Todd Weaver explains the implications much more eloquently—and seriously—than I do).

Considering the political context we now find ourselves in, pushing for privacy and software freedom has never been more relevant, as Karen Sandler pointed out at the end of the year. This is why I’m excited that Purism’s work on coreboot is coming to fruition and that it will be neutralizing the Intel Management Engine on its laptops, because this is finally providing an option for security-concerned people other than running exotic or technologically obsolete hardware.

by nekohayo at March 18, 2017 02:00 PM

March 17, 2017

Víctor Jáquez: GStreamer VAAPI 1.11.x (development branch)

Greetings GstFolks!

Last month the unstable release 1.11.2 of GStreamer hit the streets, and I would like to share with you all a quick heads-up of what we are working on in gstreamer-vaapi, since there is a lot of new stuff:

  1. GstVaapiDisplay inherits from GstObject

    GstVaapiDisplay is a wrapper for VADisplay. Before, it was a custom C structure shared throughout the pipeline via the GstContext mechanism. Now it is a GObject-based object, which can be queried, introspected and, perhaps later on, exposed in a separate library.

  2. Direct rendering and upload

    Direct rendering and upload are mechanisms based on using vaDeriveImage to upload a raw image into a VASurface, or to download a VASurface into a raw image, which is faster than exporting the VASurface to a VAImage.

    Nonetheless, we have found some issues with direct rendering on new Intel hardware (Skylake and above), and we are still assessing whether to keep it as the default.

  3. Improve the GstValidate pass rate

    GstValidate provides a battery of tests for the whole of GStreamer. Sadly, with gstreamer-vaapi the battery didn’t yield the same pass rate as without it. Though we still have some issues with vaapisink that might need to be tackled in VA-API, the pass rate has increased a lot.

  4. Refactor the GstVaapiVideoMemory

    We have completely refactored the internals of the VAAPI video memory (related to the work done for direct download and upload). We have also added locks when mapping and unmapping, to avoid race conditions.

  5. Support dmabuf sharing with downstream

    gstreamer-vaapi already had support for sharing dmabuf-based buffers with upstream elements (e.g. cameras), but now it is also capable of sharing dmabuf-based buffers with downstream sinks that can import them (e.g. glimagesink under supported EGL); see the sketch after this list.

  6. Support compilation with meson

    Meson is a new build system supported by GStreamer alongside autotools, and now it is supported by gstreamer-vaapi too.

  7. Headless rendering improvements

    There have been a couple of improvements in the DRM backend for vaapisink, for headless environments.

  8. Wayland backend improvements

    There have also been improvements in the Wayland backend for vaapisink and GstVaapiDisplay.

  9. Dynamically report the supported raw caps

    Now the elements query the VA backend at run time to learn which color formats it supports, so the source or sink caps are negotiated correctly, avoiding possible error conditions (like negotiating an unsupported color space). This has been done for encoders, decoders and the post-processor.

  10. Encoders enhancements

    We have improved the encoders a lot, adding new features such as constant bit rate support for VP8, the handling of stream metadata through tags, etc.
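
As a small illustration of item 5, here is a minimal sketch, assuming a GStreamer installation with gstreamer-vaapi and the PyGObject bindings: a playback pipeline where playbin can auto-plug a VAAPI decoder and glimagesink may import the resulting dmabuf-backed buffers. The media URI is a placeholder.

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

# playbin will auto-plug a VAAPI decoder when gstreamer-vaapi is installed
pipeline = Gst.ElementFactory.make('playbin', None)
pipeline.set_property('uri', 'file:///path/to/video.mp4')  # placeholder URI
# glimagesink can import dmabuf-backed buffers on supported EGL platforms
pipeline.set_property('video-sink', Gst.ElementFactory.make('glimagesink', None))

pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.ERROR | Gst.MessageType.EOS)
pipeline.set_state(Gst.State.NULL)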

And many, many more changes, improvements and fixes. But there is still a long road to the stable release (1.12) with many pending tasks and bugs to tackle.

Thanks a bunch to Hyunjun Ko, Julien Isorce, Scott D Phillips, Stirling Westrup and everyone else for all their work.

Also, Intel Media and Audio For Linux was accepted into the Google Summer of Code this year! If you are willing to face this challenge, you can browse the list of ideas to work on, not only in gstreamer-vaapi, but also in the driver or other projects surrounding VAAPI.

Finally, do not forget these dates: the 20th and 21st of May @ A Coruña (Spain), where the GStreamer Spring Hackfest is going to take place. Sign up!

by vjaquez at March 17, 2017 04:53 PM

March 15, 2017

Andy Wingo: guile 2.2 omg!!!

(Andy Wingo)

Oh, good evening my hackfriends! I am just chuffed to share a thing with yall: tomorrow we release Guile 2.2.0. Yaaaay!

I know in these days of version number inflation that this seems like a very incremental, point-release kind of a thing, but it's a big deal to me. This is a project I have been working on since soon after the release of Guile 2.0 some 6 years ago. It wasn't always clear that this project would work, but now it's here, going into production.

In that time I have worked on JavaScriptCore and V8 and SpiderMonkey and so I got a feel for what a state-of-the-art programming language implementation looks like. Also in that time I ate and breathed optimizing compilers, and really hit the wall until finally paging in what Fluet and Weeks were saying so many years ago about continuation-passing style and scope, and eventually came through with a solution that was still CPS: CPS soup. At this point Guile's "middle-end" is, I think, totally respectable. The backend targets a quite good virtual machine.

The virtual machine is still a bytecode interpreter for now; native code is a next step. Oddly my journey here has been precisely opposite, in a way, to An incremental approach to compiler construction; incremental, yes, but starting from the other end. But I am very happy with where things are. Guile remains very portable, bootstrappable from C, and the compiler is in a good shape to take us the rest of the way to register allocation and native code generation, and performance is pretty ok, even better than some natively-compiled Schemes.

For a "scripting" language (what does that mean?), I also think that Guile is breaking nice ground by using ELF as its object file format. Very cute. As this seems to be a "Andy mentions things he's proud of" segment, I was also pleased with how we were able to completely remove the stack size restriction.

high fives all around

As is often the case with these things, I got the idea for removing the stack limit after talking with Sam Tobin-Hochstadt from Racket and the PLT group. I admire Racket and its makers very much and look forward to stealing from (er, working with) them in the future.

Of course the ideas for the contification and closure optimization passes are in debt to Matthew Fluet and Stephen Weeks for the former, and Andy Keep and Kent Dybvig for the latter. The intmap/intset representation of CPS soup itself is highly indebted to the late Phil Bagwell, to Rich Hickey, and to Clojure folk; persistent data structures were an amazing revelation to me.

Guile's virtual machine itself was initially heavily inspired by JavaScriptCore's VM. Thanks to WebKit folks for writing so much about the early days of Squirrelfish! As far as the actual optimizations in the compiler itself, I was inspired a lot by V8's Crankshaft in a weird way -- it was my first touch with fixed-point flow analysis. As most of yall know, I didn't study CS, for better and for worse; for worse, because I didn't know a lot of this stuff, and for better, as I had the joy of learning it as I needed it. Since starting with flow analysis, Carl Offner's Notes on graph algorithms used in optimizing compilers was invaluable. I still open it up from time to time.

While I'm high-fiving, large ups to two amazing support teams: firstly to my colleagues at Igalia for supporting me on this. Almost the whole time I've been at Igalia, I've been working on this, for about a day or two a week. Sometimes at work we get to take advantage of a Guile thing, but Igalia's Guile investment mainly pays out in the sense of keeping me happy, keeping me up to date with language implementation techniques, and attracting talent. At work we have a lot of language implementation people, in JS engines obviously but also in other niches like the networking group, and it helps to be able to transfer hackers from Scheme to these domains.

I put in my own time too, of course; but my time isn't really my own either. My wife Kate has been really supportive and understanding of my not-infrequent impulses to just nerd out and hack a thing. She probably won't read this (though maybe?), but it's important to acknowledge that many of us hackers are only able to do our work because of the support that we get from our families.

a digression on the nature of seeking and knowledge

I am jealous of my colleagues in academia sometimes; of course it must be this way, that we are jealous of each other. Greener grass and all that. But when you go through a doctoral program, you know that you push the boundaries of human knowledge. You know because you are acutely aware of the state of recorded knowledge in your field, and you know that your work expands that record. If you stay in academia, you use your honed skills to continue chipping away at the unknown. The papers that this process reifies have a huge impact on the flow of knowledge in the world. As just one example, I've read all of Dybvig's papers, with delight and pleasure and avarice and jealousy, and learned loads from them. (Incidentally, I am given to understand that all of these are proper academic reactions :)

But in my work on Guile I don't actually know that I've expanded knowledge in any way. I don't actually know that anything I did is new and suspect that nothing is. Maybe CPS soup? There have been some similar publications in the last couple years but you never know. Maybe some of the multicore Concurrent ML stuff I haven't written about yet. Really not sure. I am starting to see papers these days that are similar to what I do and I have the feeling that they have a bit more impact than my work because of their medium, and I wonder if I could be putting my work in a more useful form, or orienting it in a more newness-oriented way.

I also don't know how important new knowledge is. Simply being able to practice language implementation at a state-of-the-art level is a valuable skill in itself, and releasing a quality, stable free-software language implementation is valuable to the world. So it's not like I'm negative on where I'm at, but I do feel wonderful talking with folks at academic conferences and wonder how to pull some more of that into my life.

In the meantime, I feel like (my part of) Guile 2.2 is my master work in a way -- a savepoint in my hack career. It's fine work; see A Virtual Machine for Guile and Continuation-Passing Style for some high level documentation, or many of these bloggies for the nitties and the gritties. OKitties!

getting the goods

It's been a joy over the last two or three years to see the growth of Guix, a packaging system written in Guile and inspired by GNU stow and Nix. The laptop I'm writing this on runs GuixSD, and Guix is up to some 5000 packages at this point.

I've always wondered what the right solution for packaging Guile and Guile modules was. At one point I thought that we would have a Guile-specific packaging system, but one with stow-like characteristics. We had problems with C extensions though: how do you build one? Where do you get the compilers? Where do you get the libraries?

Guix solves this in a comprehensive way. From the four or five bootstrap binaries, Guix can download and build the world from source, for any of its supported architectures. The result is a farm of weirdly-named files in /gnu/store, but the transitive closure of a store item works on any distribution of that architecture.

This state of affairs was clear from the Guix binary installation instructions that just have you extract a tarball over your current distro, regardless of what's there. The process of building this weird tarball was always a bit ad-hoc though, geared to Guix's installation needs.

It turns out that we can use the same strategy to distribute reproducible binaries for any package that Guix includes. So if you download this tarball, and extract it as root in /, then it will extract some paths in /gnu/store and also add a /opt/guile-2.2.0. Run Guile as /opt/guile-2.2.0/bin/guile and you have Guile 2.2, before any of your friends! That pack was made using guix pack -C lzip -S /opt/guile-2.2.0=/ guile-next glibc-utf8-locales, at Guix git revision 80a725726d3b3a62c69c9f80d35a898dcea8ad90.

(If you run that Guile, it will complain about not being able to install the locale. Guix, like Scheme, is generally a statically scoped system; but locales are dynamically scoped. That is to say, you have to set GUIX_LOCPATH=/opt/guile-2.2.0/lib/locale in the environment, for locales to work. See the GUIX_LOCPATH docs for the gnarlies.)

Alternatively, of course, you can install Guix and just guix package -i guile-next. Guix itself will migrate to 2.2 over the next week or so.

Welp, that's all for this evening. I'll be relieved to push the release tag and announcements tomorrow. In the meantime, happy hacking, and yes: this blog is served by Guile 2.2! :)

by Andy Wingo at March 15, 2017 10:56 PM

March 07, 2017

Zeeshan Ali: GDP meets GSoC

(Zeeshan Ali)
Are you a student? Passionate about Open Source? Want your code to run on the next generation of automobiles? You're in luck! Genivi Development Platform will be participating in Google Summer of Code this summer and you are welcome to take part. We have collected a bunch of ideas for what would make a good 3-month project for a student, but you're more than welcome to suggest your own project. The ideas page also has instructions on how to get started with GDP.

We look forward to your participation!

March 07, 2017 07:08 PM

March 06, 2017

Andy Wingo: it's probably spam

(Andy Wingo)

Greetings, peoples. As you probably know, these words are served to you by Tekuti, a blog engine written in Scheme that uses Git as its database.

Part of the reason I wrote this blog software was that, back when I was using Wordpress, I actually appreciated the comments that I would get. Sometimes nice folks visit this blog and comment with information that I find really interesting, and I thought it would be a shame if I had to disable those entirely.

But allowing users to add things to your site is tricky. There are all kinds of potential security vulnerabilities. I thought about the ones that were important to me, back in 2008 when I wrote Tekuti, and I thought I did a pretty OK job on preventing XSS and designing-out code execution possibilities. When it came to bogus comments though, things worked well enough for the time. Tekuti uses Git as a log-structured database, and so to delete a comment, you just revert the change that added the comment. I added a little security question ("what's your favorite number?"; any number worked) to prevent wordpress spammers from hitting me, and I was good to go.

Sadly, what was good enough in 2008 isn't good enough in 2017. In 2017 alone, some 2000 bogus comments made it through. So I took comments offline and painstakingly went through and separated the wheat from the chaff while pondering what to do next.

an aside

I really wondered why spammers bothered though. I mean, I added the rel="external nofollow" attribute on links, which should prevent search engines from granting relevancy to the spammer's links, so what gives? Could be that all the advice from the mid-2000s regarding nofollow is bogus. But it was definitely the case that while I was adding the attribute to commenters' home page links, I wasn't adding it to links in the comments themselves. Doh! With this fixed, perhaps I will just have to deal with the spammers I have and not even more spammers in the future.

i digress

I started by simply changing my security question to require a number in a certain range. No dice; bogus comments still got through. I changed the range; could it be the numbers they were using were already in range? Again the bogosity continued undaunted.

So I decided to break down and write a bogus comment filter. Luckily, Git gives me a handy corpus of legit and bogus comments: all the comments that remain live are legit, and all that were ever added but are no longer live are bogus. I wrote a simple tokenizer across the comments, extracted feature counts, and fed that into a naive Bayesian classifier. I finally turned it on this morning; fingers crossed!

My trials at home show that if you train the classifier on half the data set (around 5300 bogus comments and 1900 legit comments) and then run it against the other half, I get about 6% false negatives and 1% false positives. The feature extractor interns sequences of 1, 2, and 3 tokens, and doesn't have a lower limit for number of features extracted -- a feature seen only once in bogus comments and never in legit comments is a fairly strong bogosity signal; as you have to make up the denominator in that case, I set it to indicate that such a feature is 99.9% bogus. A corresponding single feature in the legit set without appearance in the bogus set is 99% legit.

Of course with this strong of a bias towards precise features of the training set, if you run the classifier against its own training set, it produces no false positives and only 0.3% false negatives, some of which were simply reverted duplicate comments.

It wasn't straightforward to get these results out of a Bayesian classifier. The "smoothing" factor that you add to both numerator and denominator was tricky, as I mentioned above. Getting a useful tokenization was tricky. And the final trick was even trickier: limiting the significant-feature count when determining bogosity. I hate to cite Paul Graham but I have to do so here -- choosing the N most significant features in the document made the classification much less sensitive to the varying lengths of legit and bogus comments, and less sensitive to inclusions of verbatim texts from other comments.
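
For the curious, here is a rough Python sketch of the approach; the real filter is written in Scheme inside Tekuti, all names here are invented, and the smoothing is simplified from what I described above.

import math
import re
from collections import Counter

def features(text, max_n=3):
    # intern sequences of 1, 2 and 3 tokens as features
    tokens = re.findall(r"\w+|\S", text.lower())
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield ' '.join(tokens[i:i + n])

def train(bogus_docs, legit_docs):
    bogus, legit = Counter(), Counter()
    for d in bogus_docs: bogus.update(features(d))
    for d in legit_docs: legit.update(features(d))
    probs = {}
    for f in set(bogus) | set(legit):
        b, l = bogus[f], legit[f]
        if l == 0:   probs[f] = 0.999        # seen only in bogus comments
        elif b == 0: probs[f] = 0.01         # seen only in legit comments
        else:        probs[f] = b / (b + l)  # naive per-feature bogosity
    return probs

def bogosity(probs, text, top_n=30):
    # keep only the N features whose probability is furthest from 0.5
    ps = sorted((probs[f] for f in features(text) if f in probs),
                key=lambda p: abs(p - 0.5), reverse=True)[:top_n]
    if not ps:
        return 0.5
    # combine the per-feature probabilities in log-odds space
    log_odds = sum(math.log(p) - math.log(1 - p) for p in ps)
    return 1 / (1 + math.exp(-log_odds))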

We'll see I guess. If your comment gets caught by my filters, let me know -- over email or Twitter I guess, since you might not be able to comment! I hope to be able to keep comments open; I've learned a lot from yall over the years.

by Andy Wingo at March 06, 2017 02:16 PM

March 01, 2017

Christian Schaller: 2016 in review

(Christian Schaller)

I started writing this blog entry at the end of January, but kept delaying publishing it while waiting for some cool updates we are working on. But today I decided that instead of pushing the 2016 review back any further, I should just do this as two separate blog entries. So here is my Fedora Workstation 2016 summary :)

We did two major releases of Fedora Workstation, namely 24 and 25, each taking us steps closer to realising our vision for the future of the Linux Desktop. I am really happy that we finally managed to default to Wayland in Fedora Workstation 25. As Jonathan Corbet of LWN so well put it: “That said, it’s worth pointing out that the move to Wayland is a huge transition; we are moving away from a display manager that has been in place since before Linus Torvalds got his first computer”.
Successfully replacing the X11 system that has been used since 1987 is no small feat, and we have to remember that many tried over the years and failed. So a big Thank You to Kristian Høgsberg for his incredible work getting Wayland off the ground and building consensus around it in the community. I am always full of admiration for those who manage to carry these kinds of efforts from their first line of code to a place where a vibrant and dynamic community can form around them.

And while we certainly have some issues left to resolve, I think the launch of Wayland in Fedora Workstation 25 was so strong that we managed to keep and accelerate the momentum needed to leave the orbit of X11 and have Wayland truly take on a life of its own.
We have succeeded not just in forming a community around Wayland, but in getting the existing Linux graphics driver community, the major desktop toolkits and, I believe, the community at large to join the move. And we needed all three to join us for this transition to have a chance of succeeding. If it had only been us at Red Hat and in the Fedora community who cared and contributed, it would have gone nowhere; this was truly one of those efforts that pulled together almost everyone in the wider Linux community, and it showcased what is possible when such a wide coalition of people gets together. So while we don’t ship an Enlightenment spin of Fedora, for instance (interested parties would be encouraged to do so though), we did value and appreciate the work they were doing around Wayland, simply because the bigger the community, the more development and bug fixing you will see on the shared infrastructure.

A part of the Wayland effort was the new input library Peter Hutterer put out, called libinput. That library allowed us to clean up our input stack and also share the input code between X and Wayland. A lot of credit to Peter and Benjamin Tissoires for their work here, as this transition, like the later Wayland one, succeeded without causing a huge amount of pain for our users.

And this is also our approach for Flatpak, which for us forms a crucial tandem with Wayland for the future of the Linux desktop: ensure the project is managed in a way that is open and transparent to all and allows different groups to adapt it to their specific use cases. And so far it is looking good, with early adoption and trials from the IVI community, traditional Linux distributions, device makers like Endless and platforms such as Steam. Each of these is using the technologies, or looking to use them, in slightly different ways, but all are still collaborating on pushing the shared technologies forward.

We managed to take some good steps forward in our effort to drain the swamp of Desktop Linux land (the only unfortunate thing here being a certain Trump deciding to cybersquat on the ‘drain the swamp’ mantra) by adding H264 and mp3 support to Fedora Workstation. And while the H264 support still needs some work to cover more profiles (which we unfortunately did not get to in 2016), we have other codec-related work underway which I think will move the needle on this issue even further. The work needed on OpenH264 is not forgotten, but Wim Taymans ended up doing a lot more multimedia plumbing work for our container-based future than originally planned. I am looking forward to sharing more details of where his work is going these days, as it could bring another group of makers into the world of mainstream desktop Linux when it’s ready.

Another piece of swamp draining happened around the Linux Firmware Service, which went from strength to strength in 2016. We had new vendors sign up throughout the year, and while not all of those efforts are public, I do expect that by the end of 2017 we will have most major hardware vendors offering firmware through the service. And not only system firmware updates: firmware for things like Logitech mice and keyboards will also be available.

Of course the firmware update service also has a client part, and GNOME Software truly became a powerhouse for driving change during 2016, being the catalyst not only for the firmware update service, but also for Linux applications providing good metadata in a standardized manner. The AppStream format required by GNOME Software has become the de facto standard. And speaking of GNOME Software, the distribution upgrade functionality we added in Fedora 24 and improved in Fedora 25 has become pretty flawless. Always possible to improve, of course, but the biggest problem I heard of was a versioning issue caused by us pushing the mp3 decoding support into Fedora at the very last minute, thus not giving 3rd party repositories a reasonable chance to update their packaging to account for it. Lesson learnt for going forward :)

These are of course just a small subset of the things we accomplished in 2016, but I was really happy to see the great reception Fedora 25 got last year, with a lot of major news sites giving it stellar reviews and also making it their distribution of the year. The growth curves in terms of adoption we are seeing for Fedora Workstation are a great encouragement for the team and help us validate that we are on the right track with setting our development priorities. My hope for 2017 is that even more of you will decide to join our effort and switch to Fedora, and 2017 will be the year of Fedora Workstation! On that note, the very positive reception of the Fedora Media Writer, which we introduced as the default download for Fedora Workstation 25, was great to see. Being able to have one simple tool to use regardless of which operating system you come to us from simplifies so much, in terms of both communication on our end and lowering the threshold of adoption on the end user side.

by uraeus at March 01, 2017 05:47 PM

February 27, 2017

GStreamer: GStreamer 1.11.2 unstable release (binaries)

(GStreamer)

Pre-built binary images of the 1.11.2 unstable release of GStreamer are now available for Windows 32/64-bit, iOS, Mac OS X and Android.

The builds are available for download from: Android, iOS, Mac OS X and Windows.

February 27, 2017 08:00 AM

February 24, 2017

Andy Wingo: encyclopedia snabb and the case of the foreign drivers

(Andy Wingo)

Peoples of the blogosphere, welcome back to the solipsism! Happy 2017 and all that. Today's missive is about Snabb (formerly Snabb Switch), a high-speed networking project we've been working on at work for some years now.

What's Snabb all about you say? Good question and I have a nice answer for you in video and third-party textual form! This year I managed to make it to linux.conf.au in lovely Tasmania. Tasmania is amazing, with wild wombats and pademelons and devils and wallabies and all kinds of things, and they let me talk about Snabb.

Click to download video

You can check that video on the youtube if the link above doesn't work; slides here.

Jonathan Corbet from LWN wrote up the talk in an article here, which besides being flattering is a real windfall as I don't have to write it up myself :)

In that talk I mentioned that Snabb uses its own drivers. We were recently approached by a customer with a simple and honest question: does this really make sense? Is it really a win? Why wouldn't we just use the work that the NIC vendors have already put into their drivers for the Data Plane Development Kit (DPDK)? After all, part of the attraction of a switch to open source is that you will be able to take advantage of the work that others have produced.

Our answer is that while it is indeed possible to use drivers from DPDK, there are costs and benefits on both sides and we think that when we weigh it all up, it makes both technical and economic sense for Snabb to have its own driver implementations. It might sound counterintuitive on the face of things, so I wrote this long article to discuss some perhaps under-appreciated points about the tradeoff.

Technically speaking there are generally two ways you can imagine incorporating DPDK drivers into Snabb:

  1. Bundle a snapshot of the DPDK into Snabb itself.

  2. Somehow make it so that Snabb could (perhaps optionally) compile against a built DPDK SDK.

As part of a software-producing organization that ships solutions based on Snabb, I need to be able to ship a "known thing" to customers. When we ship the lwAFTR, we ship it in source and in binary form. For both of those deliverables, we need to know exactly what code we are shipping. We achieve that by having a minimal set of dependencies in Snabb -- only LuaJIT and three Lua libraries (DynASM, ljsyscall, and pflua) -- and we include those dependencies directly in the source tree. This requirement of ours rules out (2), so the option under consideration is only (1): importing the DPDK (or some part of it) directly into Snabb.

So let's start by looking at Snabb and the DPDK from the top down, comparing some metrics, seeing how we could make this combination.

                                    Snabb   DPDK
Code lines                          61K     583K
Contributors (all-time)             60      370
Contributors (since Jan 2016)       32      240
Non-merge commits (since Jan 2016)  1.4K    3.2K

These numbers aren't directly comparable, of course; in Snabb our unit of code change is the merge rather than the commit, and in Snabb we include a number of production-ready applications like the lwAFTR and the NFV, but they are fine enough numbers to start with. What seems clear is that the DPDK project is significantly larger than Snabb, so adding it to Snabb would fundamentally change the nature of the Snabb project.

So depending on the DPDK makes it so that suddenly Snabb jumps from being a project that compiles in a minute to being a much more heavy-weight thing. That could be OK if the benefits were high enough and if there weren't other costs, but there are indeed other costs to including the DPDK:

  • Data-plane control. Right now when I ship a product, I can be responsible for the whole data plane: everything that happens on the CPU when packets are being processed. This includes the driver, naturally; it's part of Snabb and if I need to change it or if I need to understand it in some deep way, I can do that. But if I switch to third-party drivers, this is now out of my domain; there's a wall between me and something that is running on my CPU. And if there is a performance problem, I now have someone to blame that's not myself! From the customer perspective this is terrible, as you want the responsibility for software to rest in one entity.

  • Impedance-matching development costs. Snabb is written in Lua; the DPDK is written in C. I will have to build a bridge, and keep it up to date as both Snabb and the DPDK evolve. This impedance-matching layer is also another source of bugs; either we make a local impedance matcher in C or we bind everything using LuaJIT's FFI. In the former case, it's a lot of duplicate code, and in the latter we lose compile-time type checking, which is a no-go given that the DPDK can and does change API and ABI.

  • Communication costs. The DPDK development list had 3K messages in January. Keeping up with DPDK development would become necessary, as the DPDK is now in your dataplane, but it costs significant amounts of time.

  • Costs relating to mismatched goals. Snabb tries to win development and run-time speed by searching for simple solutions. The DPDK tries to be a showcase for NIC features from vendors, placing less of a priority on simplicity. This is a very real cost in the form of the way network packets are represented in the DPDK, with support for such features as scatter/gather and indirect buffers. In Snabb we were able to do away with this complexity by having simple linear buffers, and our speed did not suffer; adding the DPDK again would either force us to marshal and unmarshal these buffers into and out of the DPDK's format, or otherwise to reintroduce this particular complexity into Snabb.

  • Abstraction costs. A network function written against the DPDK typically uses at least three abstraction layers: the "EAL" environment abstraction layer, the "PMD" poll-mode driver layer, and often an internal hardware abstraction layer from the network card vendor. (And some of those abstraction layers are actually external dependencies of the DPDK, as with Mellanox's ConnectX-4 drivers!) Any discrepancy between the goals and/or implementation of these layers and the goals of a Snabb network function is a cost in developer time and in run-time. Note that those low-level HAL facilities aren't considered acceptable in upstream Linux kernels, for all of these reasons!

  • Stay-on-the-train costs. The DPDK is big and sometimes its abstractions change. As a minor player just riding the DPDK train, we would have to invest a continuous amount of effort into just staying aboard.

  • Fork costs. The Snabb project has a number of contributors but is really run by Luke Gorrie. Because Snabb is so small and understandable, if Luke decided to stop working on Snabb or take it in a radically different direction, I would feel comfortable continuing to maintain (a fork of) Snabb for as long as is necessary. If the DPDK changed goals for whatever reason, I don't think I would want to continue to maintain a stale fork.

  • Overkill costs. Drivers written against the DPDK have many considerations that simply aren't relevant in a Snabb world: kernel drivers (KNI), special NIC features that we don't use in Snabb (RDMA, offload), non-x86 architectures with different barrier semantics, threads, complicated buffer layouts (chained and indirect), interaction with specific kernel modules (uio-pci-generic / igb-uio / ...), and so on. We don't need all of that, but we would have to bring it along for the ride, and any changes we might want to make would have to take these use cases into account so that other users won't get mad.

So there are lots of costs if we were to try to hop on the DPDK train. But what about the benefits? The goal of relying on the DPDK would be that we "automatically" get drivers, and ultimately that a network function would be driver-agnostic. But this is not necessarily the case. Each driver has its own set of quirks and tuning parameters; in order for a software development team to be able to support a new platform, the team would need to validate the platform, discover the right tuning parameters, and modify the software to configure the platform for good performance. Sadly this is not a trivial amount of work.

Furthermore, using a different vendor's driver isn't always easy. Consider Mellanox's DPDK ConnectX-4 / ConnectX-5 support: the "Quick Start" guide has you first install MLNX_OFED in order to build the DPDK drivers. What is this thing exactly? You go to download the tarball and it's 55 megabytes. What's in it? 30 other tarballs! If you build it somehow from source instead of using the vendor binaries, then what do you get? All that code, running as root, with kernel modules, and implementing systemd/sysvinit services!!! And this is just step one!!!! Worse yet, this enormous amount of code powering a DPDK driver is mostly driver-specific; what we hear from colleagues whose organizations decided to bet on the DPDK is that you don't get to amortize much knowledge or validation when you switch between an Intel and a Mellanox card.

In the end when we ship a solution, it's going to be tested against a specific NIC or set of NICs. Each NIC will add to the validation effort. So if we were to rely on the DPDK's drivers, we would have paid all the costs but we wouldn't save very much in the end.

There is another way. Instead of relying on so much third-party code that it is impossible for any one person to grasp the entirety of a network function, much less be responsible for it, we can build systems small enough to understand. In Snabb we just read the data sheet and write a driver. (Of course we also benefit by looking at DPDK and other open source drivers as well to see how they structure things.) By only including what is needed, Snabb drivers are typically only a thousand or two thousand lines of Lua. With a driver of that size, it's possible for even a small ISV or in-house developer to "own" the entire data plane of whatever network function you need.

Of course Snabb drivers have costs too. What are they? Are customers going to be stuck forever paying for drivers for every new card that comes out? It's a very good question and one that I know is in the minds of many.

Obviously I don't have the whole answer, as my role in this market is a software developer, not an end user. But having talked with other people in the Snabb community, I see it like this: Snabb is still in relatively early days. What we need are about three good drivers. One of them should be for a standard workhorse commodity 10Gbps NIC, which we have in the Intel 82599 driver. That chipset has been out for a while so we probably need to update it to the current commodities being sold. Additionally we need a couple cards that are going to compete in the 100Gbps space. We have the Mellanox ConnectX-4 and presumably ConnectX-5 drivers on the way, but there's room for another one. We've found that it's hard to actually get good performance out of 100Gbps cards, so this is a space in which NIC vendors can differentiate their offerings.

We budget somewhere between 3 and 9 months of developer time to create a completely new Snabb driver. Of course it usually takes less time to develop Snabb support for a NIC that is only incrementally different from others in the same family that already have drivers.

We see this driver development work to be similar to the work needed to validate a new NIC for a network function, with the additional advantage that it gives us up-front knowledge instead of the best-effort testing later in the game that we would get with the DPDK. When you add all the additional costs of riding the DPDK train, we expect that the cost of Snabb-native drivers competes favorably against the cost of relying on third-party DPDK drivers.

In the beginning it's natural that early adopters of Snabb make investments in this base set of Snabb network drivers, as they would to validate a network function on a new platform. However over time as Snabb applications start to be deployed over more ports in the field, network vendors will also see that it's in their interests to have solid Snabb drivers, just as they now see with the Linux kernel and with the DPDK, and given that the investment is relatively low compared to their already existing efforts in Linux and the DPDK, it is quite feasible that we will see the NIC vendors of the world start to value Snabb for the performance that it can squeeze out of their cards.

So in summary, in Snabb we are convinced that writing minimal drivers that are adapted to our needs is an overall win compared to relying on third-party code. It lets us ship solutions that we can feel responsible for: both for their operational characteristics as well as their maintainability over time. Still, we are happy to learn and share with our colleagues all across the open source high-performance networking space, from the DPDK to VPP and beyond.

by Andy Wingo at February 24, 2017 05:37 PM

GStreamer: GStreamer 1.11.2 unstable release

(GStreamer)

The GStreamer team is pleased to announce the second release of the unstable 1.11 release series. The 1.11 release series is adding new features on top of the 1.0, 1.2, 1.4, 1.6, 1.8 and 1.10 series and is part of the API and ABI-stable 1.x release series of the GStreamer multimedia framework. The unstable 1.11 release series will lead to the stable 1.12 release series in the next weeks. Any newly added API can still change until that point.

Full release notes will be provided at some point during the 1.11 release cycle, highlighting all the new features, bugfixes, performance optimizations and other important changes.

Binaries for Android, iOS, Mac OS X and Windows will be provided in the next days.

Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx.

February 24, 2017 02:00 PM

GStreamer: GStreamer 1.10.4 stable release (binaries)

(GStreamer)

Pre-built binary images of the 1.10.4 stable release of GStreamer are now available for Windows 32/64-bit, iOS, Mac OS X and Android.

See /releases/1.10/ for the full list of changes.

The builds are available for download from: Android, iOS, Mac OS X and Windows.

February 24, 2017 10:00 AM

February 23, 2017

GStreamer: GStreamer 1.10.4 stable release

(GStreamer)

The GStreamer team is pleased to announce the fourth bugfix release in the stable 1.10 release series of your favourite cross-platform multimedia framework!

This release only contains bugfixes and it should be safe to update from 1.10.0. For a full list of bugfixes see Bugzilla.

See /releases/1.10/ for the full release notes.

Binaries for Android, iOS, Mac OS X and Windows will be available shortly.

Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx.

February 23, 2017 03:00 PM

February 07, 2017

Arun Raghavan: Stricter JSON parsing with Haskell and Aeson

I’ve been having fun recently, writing a RESTful service using Haskell and Servant. I did run into a problem that I couldn’t easily find a solution to on the magical bounty of knowledge that is the Internet, so I thought I’d share my findings and solution.

While writing this service (and practically any Haskell code), step 1 is of course defining our core types. Our REST endpoint is basically a CRUD app which exchanges these with the outside world as JSON objects. Doing this is delightfully simple:

{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson
import GHC.Generics

data Job = Job { jobInputUrl :: String
               , jobPriority :: Int
               , ...
               } deriving (Eq, Generic, Show)

instance ToJSON Job where
  toJSON = genericToJSON defaultOptions

instance FromJSON Job where
  parseJSON = genericParseJSON defaultOptions

That’s all it takes to get the basic type up with free serialization using Aeson and Haskell Generics. This is followed by a few more lines to hook up GET and POST handlers and instantiate the server using warp, and we’re good to go. All standard stuff, right out of the Servant tutorial.

The POST request accepts a new object in the form of a JSON object, which is then used to create the corresponding object on the server. Standard operating procedure again, as far as RESTful APIs go.

The nice part about doing it like this is that the input is automatically validated based on types. So input like:

{
  "jobInputUrl": 123, // should have been a string
  "jobPriority": 123
}

will result in:

Error in $: expected String, encountered Number

However, as this nice tour of how Aeson works demonstrates, if the input has keys that we don’t recognise, no error will be raised:

{
  "jobInputUrl": "http://arunraghavan.net",
  "jobPriority": 100,
  "junkField": "junkValue"
}

This behaviour is not desirable in use-cases such as mine: if the client is sending fields we don’t understand, I’d like the server to signal an error so the underlying problem can be caught early.

As it turns out, making the JSON parsing stricter so that it catches extraneous fields is just a little more involved. I didn’t find how this could be done in any single place on the Internet, so here’s the best I could do:

{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE DeriveGeneric      #-}

import Data.Aeson
import Data.Aeson.Types (fieldLabelModifier)
import Data.Data
import Data.HashMap.Strict (keys)  -- Object is a HashMap Text Value
import Data.List (sort)
import Data.Text (unpack)
import GHC.Generics

data Job = Job { jobInputUrl :: String
               , jobPriority :: Int
               , ...
               } deriving (Data, Eq, Generic, Show)

instance ToJSON Job where
  toJSON = genericToJSON defaultOptions

instance FromJSON Job where
  parseJSON json = do
    job <- genericParseJSON defaultOptions json
    if keysMatchRecords json job
    then
      return job
    else
      fail "extraneous keys in input"
    where
      -- Make sure the set of JSON object keys is exactly the same as the fields in our object
      keysMatchRecords (Object o) d =
        let
          objKeys   = sort . fmap unpack . keys
          recFields = sort . fmap (fieldLabelModifier defaultOptions) . constrFields . toConstr
        in
          objKeys o == recFields d
      keysMatchRecords _ _          = False

The idea is quite straightforward, and likely very easy to make generic. The Data.Data module lets us extract the constructor for the Job type, and the list of fields in that constructor. We just make sure that’s an exact match for the list of keys in the JSON object we parsed, and that’s it.

Of course, I’m quite new to the Haskell world so it’s likely there are better ways to do this. Feel free to drop a comment with suggestions! In the mean time, maybe this will be useful to others facing a similar problem.

Update: I’ve fixed parseJSON to properly use fieldLabelModifier from the default options, so that comparison actually works when you’re not using Aeson‘s default options. Thanks to /u/tathougies for catching that.

I’m also hoping to rewrite this in generic form using Generics, so watch this space for more updates.

by Arun at February 07, 2017 05:10 AM

February 01, 2017

GStreamer: GStreamer 1.10.3 stable release (binaries)

(GStreamer)

Pre-built binary images of the 1.10.3 stable release of GStreamer are now available for Windows 32/64-bit, iOS, Mac OS X and Android.

See /releases/1.10/ for the full list of changes.

The builds are available for download from: Android, iOS, Mac OS X and Windows.

February 01, 2017 08:40 AM

January 30, 2017

GStreamer: GStreamer 1.10.3 stable release

(GStreamer)

The GStreamer team is pleased to announce the third bugfix release in the stable 1.10 release series of your favourite cross-platform multimedia framework!

This release only contains bugfixes and it should be safe to update from 1.10.0. For a full list of bugfixes see Bugzilla.

See /releases/1.10/ for the full release notes.

Binaries for Android, iOS, Mac OS X and Windows will be available shortly.

Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx.

January 30, 2017 02:25 PM

January 29, 2017

Thomas Vander Stichele: A morning in San Francisco

(Thomas Vander Stichele)

This morning in San Francisco, I check out from the hotel and walk to Bodega, a place I discovered last time I was here. I walk past a Chinese man swinging his arms slowly and deliberately, celebrating a secret of health us Westerners will never know. It is Chinese New Year, and I pass bigger groups celebrating and children singing. My phone takes a picture of a forest of phones taking pictures.

I get to the corner hoping the place is still in business. The sign outside asks “Can it all be so simple?” The place is open, so at least for today, the answer is yes. I take a seat at the bar, and I’m flattered when the owner recognizes me even if it’s only my second time here. I ask her if her sister made it to New York to study – but no, she is trekking around Colombia after helping out at the bodega every weekend for the past few months. I get a coffee and a hibiscus mimosa as I ponder the menu.

The man next to me turns out to be her cousin, Amir. He took a plane to San Francisco from Iran yesterday after hearing an executive order might get signed banning people with visas from seven countries from entering the US. The order was signed two hours before his plane landed. He made it through immigration. The fact sheet arrived on the immigration officers’ desks right after he passed through, and the next man in his queue, coming from Turkey, did not make it through. Needles and eyes.

Now he is planning to get a job, and get a lawyer to find a way to bring over his wife and 4-year-old child, who are now officially banned from following him for 120 days or more. In Iran he does business strategy and teaches at university. It hits home really hard that we are not that different, him and I, and how undeservedly lucky I am that I won’t ever be faced with such a horrible choice to make. Paria, the owner, chimes in, saying that she’s an Iranian Muslim who came to the US 15 years ago with her family, and they all can’t believe what’s happening right now.

The church bell chimes a song over Washington Square Park and breaks the spell, telling me it’s eleven o’clock and time to get going to the airport.


by Thomas at January 29, 2017 04:57 AM

January 19, 2017

Arun Raghavan: Quantifying Synchronisation: Oscilloscope Edition

I’ve written a bit in my last two blog posts about the work I’ve been doing in inter-device synchronised playback using GStreamer. I introduced the library and then demonstrated its use in building video walls.

The important thing in synchronisation, of course, is how much in-sync are the streams? The video in my previous post gave a glimpse into that, and in this post I’ll expand on that with a more rigorous, quantifiable approach.

Before I start, a quick note: I am currently providing freelance consulting around GStreamer, PulseAudio and open source multimedia in general. If you’re looking for help with any of these, do get in touch.

The sync measurement setup

Quantifying what?

What is it that we are trying to measure? Let’s look at this in terms of the outcome — I have two computers, on a network. Using the gst-sync-server library, I play a stream on both of them. The ideal outcome is that the same video frame is displayed at exactly the same time, and the audio sample being played out of the respective speakers is also identical at any given instant.

As we saw previously, the video output is not a good way to measure what we want. This is because video displays are updated in sync with the display clock, over which consumer hardware generally does not have control. Besides, our eyes are not that sensitive to minor differences in timing unless images are side-by-side. After all, we’re fooling them with static pictures that change every 16.67ms or so.

Using audio, though, we should be able to do better. Digital audio streams for music/videos typically consist of 44100 or 48000 samples a second, so we have a much finer granularity than video provides us. The human ear is also fairly sensitive to timings with regards to sound. If it hears the same sound at an interval larger than 10 ms, you will hear two distinct sounds and the echo will annoy you to no end.

Measuring audio is also good enough because once you’ve got audio in sync, GStreamer will take care of A/V sync itself.

Setup

Okay, so now we know what we want to measure. But how do we measure it? The setup is illustrated below:

Sync measurement setup illustrated

As before, I’ve set up my desktop PC and laptop to play the same stream in sync. The stream being played is a local audio file — I’m keeping the setup simple by not adding network streaming to the equation.

The audio itself is just a tick sound every second. The tick is a simple 440 Hz sine wave (A₄ for the musically inclined) that runs for 1600 samples. It sounds something like this:

https://arunraghavan.net/wp-content/uploads/test-audio.ogg
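
If you’d like to generate a similar test file yourself, here is a quick Python sketch using NumPy. It’s not the exact tool used here, and the 48 kHz sample rate and one-minute length are assumptions.

import numpy as np
import wave

rate, tick_len, seconds = 48000, 1600, 60
t = np.arange(tick_len) / rate
tick = 0.8 * np.sin(2 * np.pi * 440 * t)                    # the audible tick
second = np.concatenate([tick, np.zeros(rate - tick_len)])  # pad to one second
samples = np.tile(second, seconds)

with wave.open('ticks.wav', 'wb') as f:
    f.setnchannels(1)   # mono
    f.setsampwidth(2)   # 16-bit PCM
    f.setframerate(rate)
    f.writeframes((samples * 32767).astype(np.int16).tobytes())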

I’ve connected the 3.5mm audio output of both the computers to my faithful digital oscilloscope (a Tektronix TBS 1072B if you wanted to know). So now measuring synchronisation is really a question of seeing how far apart the leading edge of the sine wave on the tick is.

Of course, this assumes we’re not more than 1s out of sync (that’s the periodicity of the tick itself), and I’ve verified that by playing non-periodic sounds (any song or video) and making sure they’re in sync as well. You can trust me on this, or better yet, get the code and try it yourself! :)

The last piece to worry about — the network. How well we can sync the two streams depends on how well we can synchronise the clocks of the pipeline we’re running on each of the two devices. I’ll talk about how this works in a subsequent post, but my measurements are done on both a wired and wireless network.
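
As a taste of what’s to come, here is a minimal Python sketch of that underlying mechanism: a pipeline slaving its clock to a network clock provider. This is not gst-sync-server’s actual code, and the address and port are made up.

import gi
gi.require_version('Gst', '1.0')
gi.require_version('GstNet', '1.0')
from gi.repository import Gst, GstNet

Gst.init(None)

# Slave this machine's pipeline clock to a clock provider on the server;
# every device doing the same shares a common notion of time.
clock = GstNet.NetClientClock.new('net-clock', '192.168.1.10', 5678, 0)
pipeline = Gst.parse_launch('audiotestsrc ! autoaudiosink')
pipeline.use_clock(clock)
pipeline.set_state(Gst.State.PLAYING)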

Measurements

Before we get into it, we should keep in mind that due to how we synchronise streams — using a network clock — how in-sync our streams are will vary over time depending on the quality of the network connection.

If this variation is small enough, it won’t be noticeable. If it is large (tens of milliseconds), then we may start to notice it as echo, or as glitches when the pipeline tries to correct for the lack of sync.

In the first setup, my laptop and desktop are connected to each other directly via a LAN cable. The result looks something like this:

The first two images show the best case — we need to zoom in real close to see how out of sync the audio is, and it’s roughly 50µs.

The next two images show the “worst case”. This time, the zoomed out (5ms) version shows some out-of-sync-ness, and on zooming in, we see that it’s in the order of 500µs.

So even our bad case is actually quite good — sound travels at about 340 m/s, so 500µs is the equivalent of two speakers about 17cm apart.

Now let’s make things a little more interesting. With both my laptop and desktop connected to a wifi network:

On average, the sync can be quite okay. The first pair of images show sync to be within about 300µs.

However, the wifi on my desktop is flaky, so you can see it go off by up to 2.5ms in the next pair. In my setup, it even goes off by up to 10-20ms before returning to the average case. The next two images show it go back and forth.

Why does this happen? Well, let’s take a quick look at what ping statistics from my desktop to my laptop look like:

Ping from desktop to laptop on wifi

That’s not good — you can see that the minimum, average and maximum RTT are very different. Our network clock logic probably needs some tuning to deal with this much jitter.

Conclusion

These measurements show that we can get some (in my opinion) pretty good synchronisation between devices using GStreamer. I wrote the gst-sync-server library to make it easy to build applications on top of this feature.

The obvious area to improve is how we cope with jittery networks. We’ve added some infrastructure to capture and replay clock synchronisation messages offline. What remains is to build a large enough body of good and bad cases, and then tune the sync algorithm to work as well as possible with all of these.

Also, Florent over at Ubicast pointed out a nice tool they’ve written to measure A/V sync on the same device. It would be interesting to modify this to allow for automated measurement of inter-device sync.

In a future post, I’ll write more about how we actually achieve synchronisation between devices, and how we can go about improving it.

by Arun at January 19, 2017 12:09 PM

January 17, 2017

GStreamerGStreamer 1.11.1 unstable release (binaries)

(GStreamer)

Pre-built binary images of the 1.11.1 unstable release of GStreamer are now available for Android, iOS, Mac OS X and Windows 32/64-bit.

The builds are available for download from: Android, iOS, Mac OS X and Windows.

January 17, 2017 05:45 AM

January 12, 2017

GStreamerGStreamer 1.11.1 unstable release

(GStreamer)

The GStreamer team is pleased to announce the first release of the unstable 1.11 release series. The 1.11 release series adds new features on top of the 1.0, 1.2, 1.4, 1.6, 1.8 and 1.10 series and is part of the API- and ABI-stable 1.x release series of the GStreamer multimedia framework. The unstable 1.11 release series will lead to the stable 1.12 release series in the coming weeks. Any newly added API can still change until that point.

Full release notes will be provided at some point during the 1.11 release cycle, highlighting all the new features, bugfixes, performance optimizations and other important changes.

Binaries for Android, iOS, Mac OS X and Windows will be provided in the coming days.

Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx.

January 12, 2017 03:00 PM

January 11, 2017

Phil NormandWebKitGTK+ and HTML5 fullscreen video

(Phil Normand)

HTML5 video is really nice and all, but one annoying thing is the lack of a fullscreen video specification; it is currently up to the User-Agent to allow fullscreen video display.

WebKit allows a fullscreen button in the media controls, and Safari can, on user demand only, switch a video to fullscreen and display nice controls over the video. There's also a webkit-specific DOM API that allows custom media controls to make use of that feature, as long as it is initiated by a user gesture. Vimeo's HTML5 player uses that feature, for instance.

Thanks to the efforts made by Igalia to improve the WebKit GTK+ port and its GStreamer media-player, I have been able to implement support for this fullscreen video display feature. So any application using WebKitGTK+ can now make use of it :)

The way it is done in this first implementation is that a new fullscreen GTK+ window is created, and our GStreamer media-player inserts an autovideosink in the pipeline and overlays the video in the window. Some simple controls are supported; the UI is actually similar to Totem's. One improvement we could make in the future would be to allow applications to override or customize that simple controls UI.

The nice thing about this is that we of course use the GstXOverlay interface, which is implemented by the Linux, Windows and Mac OS video sinks :) So when other WebKit ports start to use our GStreamer player implementation, it will be fairly easy to port the feature and allow more platforms to support fullscreen video display :)

Benjamin also suggested that we could reuse the cairo surface of our WebKit videosink and paint it in a fullscreen GdkWindow. But this should be done only when his Cairo/Pixman patches to enable hardware acceleration land.

So there is room for improvement but I believe this is a first nice step for fullscreen HTML5 video consumption from Epiphany and other WebKitGTK+-based browsers :) Thanks a lot to Gustavo, Sebastian Dröge and Martin for the code-reviews.

by Philippe Normand at January 11, 2017 02:42 PM

January 10, 2017

Jean-François Fortin TamReviewing the Librem 15

Following up on my previous post where I detailed the work I've been doing mostly on Purism's website, today's post will cover some video work. Near the beginning of October, I received a Librem 15 v2 unit for testing and reviewing purposes. I have been using it as my main laptop since then, as I don't believe in reviewing something without using it daily for at least a couple of weeks. And so, on nights and week-ends, I wrote down testing results, rough impressions and recommendations, then wrote a detailed plan and script to make the first in-depth video review of this laptop. Here's the result—not your typical 2-minute superficial tour:

With this review, I wanted to:

  • Satisfy my own curiosity and then share the key findings; one of the things that annoyed me some months ago is that I couldn’t find any good “up close” review video to answer my own technical questions, and I thought “Surely I’m not the only one! Certainly a bunch of other people would like to see what the beast feels like in practice.”
  • Make an audio+video production I would be proud of, artistically speaking. I'm rather meticulous in my craft, as I like creating quality work made to last (similarly, I have recently finished a particular decorative painting after months of obsession… I'll let you know about that in some other blog post ;)
  • Put my production equipment to good use; I had recently purchased a lot of equipment for my studio and outdoors shooting—it was just begging to be used! Some details on that further down in this post.
  • Provide a ton of industrial design feedback to the Purism team for future models, based on my experience owning and using almost every laptop type out there. And so I did. Pages and pages of it, way more than can fit in a video:
Pictured: my review notes

A fine line

For the video, I spent a fair amount of time writing and revising my video’s plan and narration (half a dozen times at least), until I came to a satisfactory “final” version that I was able to record the narration for (using the exquisite ancient techniques of voice acting).

The tricky part is being simultaneously concise, exhaustive, fair, and entertaining. I wanted to be as balanced as possible and to cover the most crucial topics.

  • At the end of the day, there are some simple fundamentals of what makes a good or bad computer. Checking my 14 pages of review notes, I knew I was being extremely demanding, and that some of those expectations of mine came down to personal preference, things that most people don’t even care about, or common issues that are not even addressed by most “big brand” OEMs’ products… so I balanced my criticism with a dose of realism, making sure to focus on what would matter to people.
  • I also chose topics that would have a longer “shelf life”, considering how much work it takes to produce a high-quality video. For example, even while preparing the review over the course of 2-3 months, some aspects (such as the touchpad drivers) changed/improved and made me revise my opinion. The touchpad behaved better in Fedora 25 than in Fedora 24… until a kernel update broke it (Ugh. At that point I decided to version-lock my kernel package in Fedora).
  • I was conservative in my estimates, even if that makes the device look less shiny than it is. For example, while I said “5-6 hours” of battery life in the review video, in practice I realized that I can get 10-12 hours of battery life with my personal usage pattern (writing text in Gedit, no web browser open, 50% brightness, etc.) and a few simple tweaks in PowerTop.

The final version of my script, as I recorded it, was 59 minutes long. No joke. And that was after I had decided to ignore some topics (ex.: the whole part about the preloaded operating system; that part would be long enough to be a standalone review).

I spent some days processing that 59-minute recording to remove any sound impurities or mistakes, and sped up the tempo, bringing the duration down to 37 and then 31 minutes. Still, that was too long, so while I was doing the final video edit, I tightened everything further (removing as many speech gaps as possible) and cut out a few more topics at the last minute. The final rendered result is a video that is 21 minutes long. Much more reasonable, considering it's an "in depth" review.

Real audio, lighting, and optics

My general work ethic is: when you do something, do it properly or don’t do it at all.

For this project I used studio lighting, tripods and stands, a dolly, a DSLR with two lenses, a smaller camera for some last minute shots, a high-end PCM sound recorder, a phantom-powered shotgun microphone, a recording booth, monitoring headphones, sandbags, etc.

To complement the narration and cover the length of the review, I chose seven different songs (out of many dozens) based on genre, mood and tempo. I sometimes had to cut/mix songs to avoid the annoying parts. The result, I hope, is a video that remains pleasant and entertaining to watch throughout, while also having a certain emotional or “material” quality to it. For example, the song I used for the thermal design portion makes me feel like I’m touching heatpipes and watching energy flow. Maybe that’s just me though—perhaps if there was some lounge music I would feel like a sofa ;)

Fun!

Much like when I made a video for a local symphonic orchestra, I have enjoyed the making of this review video immensely. I’m glad I got an interesting device to review (not a whole chicken in a can) with the freedom to make the “video I would have wanted to see”. On that note, if anyone has a fighter jet they’d want to see reviewed, let me know (I’m pretty good at dodging missiles ;)

by nekohayo at January 10, 2017 04:00 PM

January 03, 2017

Jean-François Fortin TamRenaming an audio device with PulseAudio

I discovered by accident that you can right-click on a device's "profile" in pavucontrol to rename the device. However, this feature is not available by default, since it requires an additional module (on Fedora, at least). To try it out live:

$ pactl load-module module-device-manager
34

"34"? What a strange reply! I presume it means the command worked. If we immediately retry the same command, we get:

$ pactl load-module module-device-manager
Échec : Échec lors de l'initialisation du module

Which, if you think like a developer who is a bit lazy about semantics, reveals that the module is indeed already loaded (the message roughly translates to "Failure: Failure initialising the module").

We can now give much shorter, more practical names to our favourite devices. For example, my SoundBlaster with its overly complex name:

Or my good old Logitech microphone with its thoroughly cryptic name:

Interface-wise, all of this is rather ugly and hard to discover, so I opened a bug report about it. What is a bit unfortunate is that this renames the inputs/outputs (sources/sinks) individually, instead of renaming the hardware device as a whole. Also, the renamed versions are not picked up by GNOME Control Center's sound settings.

In any case, after this successful test, we can add the following lines to ~/.config/pulse/default.pa to make the change permanent (I prefer editing this file in my home directory rather than the system file "/etc/pulse/default.pa", so that it survives distro "clean installs"):

.include /etc/pulse/default.pa
load-module module-device-manager

Note: unlike daemon.conf (in which I simply put "flat-volumes = no" to get old-school mixing back), "default.pa" does not automatically inherit the system settings, which is why I inserted the ".include" line.

by nekohayo at January 03, 2017 04:00 PM

December 29, 2016

Sebastian PölsterlAnnouncing scikit-survival – a Python library for survival analysis built on top of scikit-learn

I've been meaning to do this release for quite a while now, and last week I finally had some time to package everything and update the dependencies. scikit-survival contains the majority of the code I developed during my Ph.D.

About Survival Analysis

Survival analysis – also referred to as reliability analysis in engineering – refers to a type of problem in statistics where the objective is to establish a connection between a set of measurements (often called features or covariates) and the time to an event. The name survival analysis originates from clinical research: in many clinical studies, one is interested in predicting the time to death, i.e., survival. Broadly speaking, survival analysis is a type of regression problem (one wants to predict a continuous value), but with a twist. Consider a clinical study that investigates coronary heart disease and has been carried out over a one-year period, as in the figure below.

Patient A was lost to follow-up after three months with no recorded cardiovascular event, patient B experienced an event four and a half months after enrollment, patient D withdrew from the study two months after enrollment, and patient E did not experience any event before the study ended. Consequently, the exact time of a cardiovascular event could only be recorded for patients B and C; their records are uncensored. For the remaining patients it is unknown whether they did or did not experience an event after termination of the study. The only valid information that is available for patients A, D, and E is that they were event-free up to their last follow-up. Therefore, their records are censored.

Formally, each patient record consists of a set of covariates $x \in \mathbb{R}^d$, and the time $t > 0$ when an event occurred or the time $c > 0$ of censoring. Since censoring and experiencing an event are mutually exclusive, it is common to define an event indicator $\delta \in \{0; 1\}$ and the observable survival time $y > 0$. The observable time $y$ of a right censored sample is defined as
\[ y = \min(t, c) =
\begin{cases}
t & \text{if } \delta = 1 , \\
c & \text{if } \delta = 0 ,
\end{cases}
\]

What is scikit-survival?

Recently, many methods from machine learning have been adapted to this kind of problem: random forests, gradient boosting, and support vector machines, many of which are only available for R, but not Python. Some of the traditional models are part of lifelines or statsmodels, but none of those libraries plays nicely with scikit-learn, which is the quasi-standard machine learning framework for Python.

This is exactly where scikit-survival comes in. Models implemented in scikit-survival follow the scikit-learn interfaces. Thus, it is possible to use PCA from scikit-learn for dimensionality reduction and feed the low-dimensional representation to a survival model from scikit-survival, or cross-validate a survival model using the classes from scikit-learn. You can see an example of the latter in this notebook.
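As a rough sketch of what that interoperability buys you (the dataset loader and estimator names below are taken from the scikit-survival documentation and may differ between versions, so treat them as assumptions):

# Rough sketch: scikit-learn's PCA feeding a survival model, cross-validated
# with scikit-learn machinery. load_whas500 and CoxPHSurvivalAnalysis follow
# the scikit-survival docs and are assumptions for this particular version.
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

from sksurv.datasets import load_whas500
from sksurv.linear_model import CoxPHSurvivalAnalysis

X, y = load_whas500()                      # y is a structured array: (event, time)
X = X.select_dtypes(include=["number"])    # keep only numeric covariates for brevity

model = make_pipeline(PCA(n_components=5), CoxPHSurvivalAnalysis())

# Survival estimators score with the concordance index by default.
print(cross_val_score(model, X.values, y, cv=3))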

Download and Install

The source code is available at GitHub and can be installed via Anaconda (currently only for Linux) or pip.

conda install -c sebp scikit-survival

pip install scikit-survival

The API documentation is available here and scikit-survival ships with a couple of sample datasets from the medical domain to get you started.

by sebp at December 29, 2016 02:08 PM

December 28, 2016

Arun RaghavanSynchronised Playback and Video Walls

Hello again, and I hope you’re having a pleasant end of the year (if you are, maybe don’t check the news until next year).

I’d written about synchronised playback with GStreamer a little while ago, and work on that has been continuing apace. Since I last wrote about it, a bunch of work has gone in:

  • Landed support for sending a playlist to clients (instead of a single URI)

  • Added the ability to start/stop playback

  • The API has been cleaned up considerably to allow us to consider including this upstream

  • The control protocol implementation was made an interface, so you don’t have to use the built-in TCP server (different use-cases might want different transports)

  • Made a bunch of robustness fixes and documentation improvements

  • Introduced API for clients to send the server information about themselves

  • Also added API for the server to send video transformations for specific clients to apply before rendering

While the other bits are exciting in their own right, in this post I’m going to talk about the last two items.

Video walls

For those of you who aren’t familiar with the term, a video wall is just an array of displays stacked to make a larger display. These are often used in public installations.

One way to set up a video wall is to have each display connected to a small computer (such as the Raspberry Pi), and have them play a part of the entire video, cropped and scaled for the display that is connected. This might look something like:

A 4×4 video wall

The tricky part, of course, is synchronisation — which is where gst-sync-server comes in. Since we’re able to play a given stream in sync across devices on a network, the only missing piece was the ability to distribute a set of per-client transformations so that clients could apply those, and that is now done.

In order to keep things clean from an API perspective, I took the following approach:

  • Clients now have the ability to send a client ID and a configuration (which is just a dictionary) when they first connect to the server

  • The server API emits a signal with the client ID and configuration, which allows you to know when a client connects, what kind of display it’s running, and where it is positioned

  • The server now has additional fields to send a map of client ID to a set of video transformations

This allows us to do fancy things like having each client manage its own information with the server dynamically adapting the set of transformations based on what is connected. Of course, the simpler case of having a static configuration on the server also works.
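To make the idea of a per-client transformation concrete, here's a sketch of what one client of a 2×2 wall might effectively run. The crop/scale numbers are made up for a 1920×800 source and the top-left tile, and this is not gst-sync-server's actual configuration format, just stock GStreamer elements.

# Sketch of one client's transformation in a 2x2 wall: crop to its quadrant
# of a 1920x800 source, then scale to the local display. Numbers and element
# choices (videocrop/videoscale) are illustrative only.
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

pipeline = Gst.parse_launch(
    'uridecodebin uri=file:///path/to/video.mp4 ! '
    'videocrop right=960 bottom=400 ! '     # keep the top-left 960x400 quadrant
    'videoscale ! video/x-raw,width=1920,height=1080 ! '
    'autovideosink')

pipeline.set_state(Gst.State.PLAYING)
pipeline.get_bus().timed_pop_filtered(
    Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR)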

Demo

Since seeing is believing, here’s a demo of the synchronised playback in action:

The setup is my laptop, which has an Intel GPU, and my desktop, which has an NVidia GPU. These are connected to two monitors (thanks go out to my good friends from Uncommon for lending me their thin-bezelled displays).

The video resolution is 1920×800, and I’ve adjusted the crop parameters to account for the bezels, so the video actually does look continuous. I’ve uploaded the text configuration if you’re curious about what that looks like.

As I mention in the video, the synchronisation is not as tight as I would like it to be. This is most likely because of the differing device configurations. I've been working with Nicolas to try to address this shortcoming by using some timing extensions that the Wayland protocol allows for. More news on this as it breaks.

More generally, I’ve done some work to quantify the degree of sync, but I’m going to leave that for another day.

p.s. the reason I used kmssink in the demo was that it was the quickest way I know of to get a full-screen video going — I’m happy to hear about alternatives, though

Future work

Make it real

My demo was implemented quite quickly by allowing the example server code to load and serve up a static configuration. What I would like is to have a proper working application that people can easily package and deploy on the kinds of embedded systems used in real video walls. If you’re interested in taking this up, I’d be happy to help out. Bonus points if we can dynamically calculate transformations based on client configuration (position, display size, bezel size, etc.)

Hardware acceleration

One thing that’s bothering me is that the video transformations are applied in software using GStreamer elements. This works fine(ish) for the hardware I’m developing on, but in real life, we would want to use OpenGL(ES) transformations, or platform specific elements to have hardware-accelerated transformations. My initial thoughts are for this to be either API on playbin or a GstBin that takes a set of transformations as parameters and internally sets up the best method to do this based on whatever sink is available downstream (some sinks provide cropping and other transformations).

Why not audio?

I’ve only written about video transformations here, but we can do the same with audio transformations too. For example, multi-room audio systems allow you to configure the locations of wireless speakers — so you can set which one’s on the left, and which on the right — and the speaker will automatically play the appropriate channel. Implementing this should be quite easy with the infrastructure that’s currently in place.
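As a rough illustration of such a per-client audio transformation (audiopanorama here is just a stand-in; a real multi-room system would more likely remap or select channels rather than pan):

# Rough illustration: this client is configured as the "left" speaker, so it
# pans the stream fully left before its sink. audiopanorama is a stand-in
# element, not what a real multi-room system would necessarily use.
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

pipeline = Gst.parse_launch(
    'uridecodebin uri=file:///path/to/song.ogg ! audioconvert ! '
    'audiopanorama panorama=-1.0 ! audioconvert ! autoaudiosink')

pipeline.set_state(Gst.State.PLAYING)
pipeline.get_bus().timed_pop_filtered(
    Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR)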

Merry Happy *.*

I hope you enjoyed reading that — I’ve had great responses from a lot of people about how they might be able to use this work. If there’s something you’d like to see, leave a comment or file an issue.

Happy end of the year, and all the best for 2017!

by Arun at December 28, 2016 06:01 PM

December 15, 2016

Bastien NoceraMaking your own retro keyboard

(Bastien Nocera) We're about a week before Christmas, and I'm going to explain how I created a retro keyboard as a gift for my father, who introduced me to computers when he brought home a Thomson TO7, all the way back in 1985.

The original idea was to fit a smaller computer, such as a CHIP or Raspberry Pi, inside a Thomson computer, but software update support would have been difficult, the use would have been limited to the built-in programs, and it would have required a separate screen. So I restricted myself to only making a keyboard. It was a big enough task, as we'll see.

How do keyboards work?

Loads of switches, that's how. I'll point you to Michał Trybus' blog post « How to make a keyboard - the matrix » for details on how this works. You'll just need to remember that most of the keyboards present in those older computers have no support for xKRO, and that the micro-controller we'll be using already has the necessary pull-up resistors built in.
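For the curious, here's a toy sketch of matrix scanning on an AVR, in the spirit of that blog post. The pin assignments (columns on PORTD, rows on PORTB) are made up for illustration and do not match the MO5 wiring or the TMK firmware.

/* Toy sketch of scanning a keyboard matrix on an AVR. Pin assignments are
 * made up for illustration only. */
#define F_CPU 16000000UL
#include <stdint.h>
#include <avr/io.h>
#include <util/delay.h>

static uint8_t scan_column(uint8_t col)
{
    PORTD = (uint8_t)~(1 << col); /* drive only the selected column low */
    _delay_us(30);                /* let the lines settle */
    return (uint8_t)~PINB;        /* with pull-ups, pressed keys read as 0 */
}

int main(void)
{
    DDRD = 0xff;   /* columns are outputs */
    DDRB = 0x00;   /* rows are inputs... */
    PORTB = 0xff;  /* ...with the AVR's internal pull-ups enabled */

    for (;;) {
        for (uint8_t col = 0; col < 8; col++) {
            uint8_t pressed = scan_column(col);
            (void)pressed; /* a real firmware debounces and maps to keycodes */
        }
    }
}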

The keyboard hardware

I chose the smallest Thomson computer available for my project, the MO5. I could have used a stand-alone keyboard, but would have lost all the charm of it (it just looks like a PC keyboard), some other computers have much bigger form factors, to include cartridge, cassette or floppy disk readers.

The DCMoto emulator's website includes tons of documentation, including technical documentation explaining the inner workings of each one of the chipsets on the mainboard. In one of those manuals, you'll find this page:



Whoot! The keyboard matrix in detail, no need for us to discover it with a multimeter.

That needs a wash in soapy water

After opening up the computer, and eventually giving the internals a good clean (especially the keyboard, if it has mechanical keys), we'll need to see how the keyboard is connected.

Finicky metal covered plastic

Those keyboards usually are membrane keyboards, with pressure pads, so we'll need to either find replacement connectors at our local electronics store, or desolder the ones on the motherboard. I chose the latter option.

Desoldered connectors

After matching the physical connectors to the rows and columns in the matrix, using a multimeter and a few key presses, we now know which connector pin corresponds to which row or column of the matrix. We can start soldering.

The micro-controller

The micro-controller in my case is a Teensy 2.0, an Atmel AVR-based micro-controller with a very useful firmware that makes it very very difficult to brick. You can either press the little button on the board itself to upload new firmware, or wire it to an external momentary switch. The funny thing is that the Atmega32U4 is 16 times faster than the original CPU (yeah, we're getting old).

I chose to wire it to the "Initial. Prog" ("Reset") button on the keyboard, so as to make it easy to upload new firmware. To do this, I needed to cut a few traces coming out of the physical switch on the board, using a tile cutter, to avoid interference from components on the board. This is completely optional, and if you're only going to use firmware that you already know at least somewhat works, you can set a key combo in the firmware to go into firmware-upload mode. We'll get back to that later.

As far as connecting and soldering to the pins goes, we can use any I/O pins we want, except D6, which is connected to the board's LED. Note that if you deviate from the pinout used in your firmware, you'll need to make changes to the firmware. We'll come back to that again in a minute.

The soldering

Colorful tinning

I wanted to keep the external ports full, so it didn't look like there were holes in the case, but there was enough headroom inside the case to fit the original board, the teensy and pins on the board. That makes it easy to rewire in case of error. You could also dremel (yes, used as a verb) a hole in the board.

As always, make sure early that things would fit, especially the cables!

The unnecessary pollution

The firmware

Fairly early on during my research, I found the TMK keyboard firmware, as well as a very well written forum post with detailed explanations on how to modify an existing firmware for your own uses.

This is what I used to modify the firmware for the gh60 keyboard for my own use. You can see here a step-by-step example, implementing the modifications in the same order as the forum post.

Once you've followed the steps, you'll need to compile the firmware. Fedora ships with the necessary packages, so it's a simple:


sudo dnf install -y avr-libc avr-binutils avr-gcc

I also compiled and installed in my $PATH the teensy_cli firmware uploader, and fixed up the udev rules. And after a "make teensy" and a button press...

It worked first time! This is a good time to verify that all the keys work, and you don't see doubled-up letters because of short circuits in your setup. I had 2 wires touching, and one column that just didn't work.

I also prepared a stand-alone repository, with a firmware that uses the tmk_core from the tmk firmware, instead of modifying an existing one.

Some advice

This isn't the first time I've hacked on hardware, but I'll repeat some old adages and advice, because I rarely heed those warnings, and I regret...
  • Don't forget the size, length and non-flexibility of cables in your design
  • Plan ahead when you're going to cut or otherwise modify hardware, because you might regret it later
  • Use breadboard cables and pins to connect things, if you have the room
  • Don't hotglue until you've tested and retested and are sure you're not going to make more modifications
That last one explains the slightly funny cabling of my keyboard.

Finishing touches

All Sugru'ed up

To finish things off nicely, I used Sugru to hold the USB cable coming out of the machine in place. And as before, it avoids having an opening onto the internals.

There are a couple more things that I'll need to finish up before delivery. First, the keymap I have chosen in the firmware only works when a US keymap is selected. I'll need to make a keymap for Linux, possibly hard-coding it. I will also need to create a Windows keymap for my father to use (yep, genealogy software on Linux isn't quite up-to-par).

Prototype and final hardware

All this will happen in the aforementioned repository. And if you ever make your own keyboard, I'm happy to merge in changes to this repository with documentation for your Speccy, C64, or Amstrad CPC hacks.

(If somebody wants to buy me a Sega keyboard, I'll gladly work on a non-destructive adapter. Get in touch :)

by Bastien Nocera (noreply@blogger.com) at December 15, 2016 04:48 PM

November 30, 2016

GStreamerGStreamer 1.10.2 stable release (binaries)

(GStreamer)

Pre-built binary images of the 1.10.2 stable release of GStreamer are now available for Android, iOS, Mac OS X and Windows 32/64-bit.

See /releases/1.10/ for the full list of changes.

The builds are available for download from: Android, iOS, Mac OS X and Windows.

November 30, 2016 05:45 PM

November 29, 2016

GStreamerGStreamer 1.10.2 stable release

(GStreamer)

The GStreamer team is pleased to announce the second bugfix release in the stable 1.10 release series of your favourite cross-platform multimedia framework!

This release only contains bugfixes and it should be safe to update from 1.10.0. For a full list of bugfixes see Bugzilla.

See /releases/1.10/ for the full release notes.

Binaries for Android, iOS, Mac OS X and Windows will be available shortly.

Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx.

November 29, 2016 02:00 PM

GStreamerRevamped documentation and gstreamer.com switch-off

(GStreamer)

The GStreamer project is pleased to announce its new revamped documentation featuring a new design, a new navigation bar, search functionality, source code syntax highlighting as well as new tutorials and documentation about how to use GStreamer on Android, iOS, macOS and Windows.

It now contains the former gstreamer.com SDK tutorials which have kindly been made available by Fluendo & Collabora under a Creative Commons license. The tutorials have been reviewed and updated for GStreamer 1.x. The old gstreamer.com site will be shut down with redirects pointing to the updated tutorials and the official GStreamer website.

Thanks to everyone who helped make this happen.

This is just the beginning. Our goal is to provide a more cohesive documentation experience for our users going forward. To that end, we have converted most of our documentation into markdown format. This should hopefully make it easier for developers and contributors to create new documentation, and to maintain the existing one. There is a lot more work to do, do get in touch if you want to help out. The documentation is maintained in the new gst-docs module.

If you encounter any problems or spot any omissions or outdated content in the new documentation, please file a bug in bugzilla to let us know.

November 29, 2016 12:00 PM

November 24, 2016

Sebastian DrögeWriting GStreamer Elements in Rust (Part 3): Parsing data from untrusted sources like it’s 2016

(Sebastian Dröge)

This is part 3, the older parts can be found here: part 1 and part 2

And again it took quite a while to write a new update about my experiments with writing GStreamer elements in Rust. The previous articles can be found here and here. Since last time, there was also the GStreamer Conference 2016 in Berlin, where I had a short presentation about this.

Progress was rather slow unfortunately, due to work and other things getting in the way. Let's hope this improves. Anyway!

There will be three parts again, and especially the last one is something where I could use some suggestions from more experienced Rust developers on how to solve state handling / state machines in a nicer way. The first part will be about parsing data in general, especially from untrusted sources; the second part will be about my experimental and current proof-of-concept FLV demuxer; and the third will be about state handling in that demuxer.

Parsing Data

Safety?

First of all, you probably all saw a couple of CVEs about security relevant bugs in (rather uncommon) GStreamer elements going around. While all of them would’ve been prevented by having the code written in Rust (due to by-default array bounds checking), that’s not going to be our topic here. They also would’ve been prevented by using various GStreamer helper API, like GstByteReader, GstByteWriter and GstBitReader. So just use those, really. Especially in new code (which is exactly the problem with the code affected by the CVEs, it was old and forgotten). Don’t do an accountant’s job, counting how much money/many bytes you have left to read.

But yes, this is something where Rust will also provide an advantage by having by-default safety features. It’s not going to solve all our problems, but at least some classes of problems. And sure, you can write safe C code if you’re careful but I’m sure you also drive with a seatbelt although you can drive safely. To quote Federico about his motivation for rewriting (parts of) librsvg in Rust:

Every once in a while someone discovers a bug in librsvg that makes it all the way to a CVE security advisory, and it’s all due to using C. We’ve gotten double free()s, wrong casts, and out-of-bounds memory accesses. Recently someone did fuzz-testing with some really pathological SVGs, and found interesting explosions in the library. That’s the kind of 1970s bullshit that Rust prevents.

You can directly replace the word librsvg with GStreamer here.

Ergonomics

The other aspect of parsing data is that it's usually a very boring part of programming. It should be as painless as possible, as easy as possible to do in a safe way, and after having written your 100th parser by hand you probably don't want to do it again. Parser combinator libraries like Parsec in Haskell provide a nice alternative. You essentially write down something very close to a formal grammar of the format you want to parse, and out of this comes a parser for the format. Unlike parser generators like good old yacc, everything is written in the target language though, and there is no separate code generation step.

Rust, being quite a bit more expressive than C, has also led people to write parser combinator libraries. None of them are as ergonomic (yet?) as in Haskell, but they're still a big improvement over anything else. There's nom, combine and chomp, each with a slightly different approach. Choose your favorite. I decided on nom for the time being.
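To give a flavour of what this looks like, here's a toy nom parser for the very start of an FLV header (the "FLV" signature followed by the version byte), in the macro style nom used around its 2.x era. This is an illustration of the style, not the actual code of the FLV parsing library mentioned below.

// Toy parser: match the "FLV" signature, then read the version byte.
#[macro_use]
extern crate nom;

use nom::{be_u8, IResult};

named!(flv_version<&[u8], u8>,
    do_parse!(
        tag!("FLV") >>
        version: be_u8 >>
        (version)
    )
);

fn main() {
    match flv_version(&b"FLV\x01\x05rest"[..]) {
        IResult::Done(_rest, version) => println!("FLV version {}", version),
        _ => println!("not an FLV stream"),
    }
}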

A FLV Demuxer in Rust

For implementing a demuxer, I decided on using the FLV container format. Mostly because it is super-simple compared to e.g. MP4 and WebM, but also because Geoffroy, the author of nom, wrote a simple header parsing library for it already and a prototype demuxer using it for VLC. I’ll have to extend that library for various features in the near future though, if the demuxer should ever become feature-equivalent with the existing one in GStreamer.

As usual, the code can be found here, in the “demuxer” branch. The most relevant files are rsdemuxer.rs and flvdemux.rs.

Following the style of the sources and sinks, the first is some kind of base class / trait for writing arbitrary demuxers in Rust. It’s rather unfinished at this point though, just enough to get something running. All the FLV specific code is in the second file, and it’s also very minimal for now. All it can do is to play one specific file (or hopefully all other files with the same audio/video codec combination).

As part of all this, I also wrote bindings for GStreamer's buffer abstraction and a Rust rewrite of the GstAdapter helper type. Both showed Rust's strengths quite well: the buffer bindings by being able to express various concepts of the buffers in a compiler-checked, safe way in Rust (e.g. ownership, readability/writability), the adapter implementation by being so much shorter (it's missing features… but still).

So here we are, this can already play one specific file (at least) in any GStreamer based playback application. But some further work is necessary, for which I hopefully have some time in the near future. Various important features are still missing (e.g. other codecs, metadata extraction and seeking), the code is rather proof-of-concept style (stringly-typed media formats, lots of unimplemented!() and .unwrap() calls). But it shows that writing media handling elements in Rust is definitely feasible, and generally seems like a good idea.

If only we had Rust already when all this media handling code in GStreamer was written!

State Handling

Another reason why all this took a bit longer than expected, is that I experimented a bit with expressing the state of the demuxer in a more clever way than what we usually do in C. If you take a look at the GstFlvDemux struct definition in C, it contains about 100 lines of field declarations. Most of them are only valid / useful in specific states that the demuxer is in. Doing the same in Rust would of course also be possible (and rather straightforward), but I wanted to try to do something better, especially by making invalid states unrepresentable.

Rust has this great concept of enums, also known as tagged unions or sum types in other languages. These are not to be confused with C enums or unions, but instead allow multiple variants (like C enums) with fields of various types (like C unions). But all of that in a type-safe way. This seems like the perfect tool for representing complicated state and building a state machine around it.

So much for the theory. Unfortunately, I’m not too happy with the current state of things. It is like this mostly because of Rust’s ownership system getting into my way (rightfully, how would it have known additional constraints I didn’t know how to express?).

Common Parts

The first problem I ran into was that many of the states have common fields, e.g.

enum State {
    ...
    NeedHeader,
    HaveHeader {header: Header, to_skip: usize },
    Streaming {header: Header, audio: ... },
    ...
}

When writing code that matches on this, and that tries to move from one state to another, these common fields would have to be moved. But unfortunately they are (usually) borrowed by the code already and thus can’t be moved to the new variant. E.g. the following fails to compile

match self.state {
    ...
    State::HaveHeader { header, to_skip: 0 } => {
        // ERROR: cannot move `header` out of the borrowed `self.state`
        self.state = State::Streaming { header: header, ... };
    },
    ...
}

A Tree of States

Repeating the common parts is not nice anyway, so I went with a different solution by creating a tree of states:

enum State {
    ...
    NeedHeader,
    HaveHeader {header: Header, have_header_state: HaveHeaderState },
    ...
}

enum HaveHeaderState {
    Skipping {to_skip: usize },
    Streaming {audio: ... },
}

Apart from making it difficult to find names for all of these, and resulting in relatively deeply nested code, this works:

match self.state {
    ...
    State::HaveHeader { ref header, ref mut have_header_state } => {
        match *have_header_state {
            HaveHeaderState::Skipping { to_skip: 0 } => {
                *have_header_state = HaveHeaderState::Streaming { audio: ... };
            }
            ...
        }
    },
    ...
}

If you look at the code, however, it ends up much bigger than needed, and I'm also not sure yet how it would be possible to nicely move "backwards" one state if that situation ever arises. Also, there is still the previous problem, although less often: if I matched on to_skip here by reference (or if it was not a Copy type), the compiler would prevent me from overwriting have_header_state, for the same reasons as before.

So my question for the end: How are others solving this problem? How do you express your states and write the functions around them to modify the states?

Update

I actually implemented the state handling as a State -> State function before (and forgot about that), which seems conceptually the right thing to do. It however has a couple of other problems. Thanks for the suggestions so far, it seems like I’m not alone with this problem at least.
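For readers curious what that State -> State style can look like, here's a minimal self-contained sketch using std::mem::replace to sidestep the borrow problem; the types are simplified stand-ins for the real demuxer, and this is one common idiom rather than necessarily what my code does.

// Sketch: temporarily move the state out with std::mem::replace so its
// fields can be moved into the next variant instead of being borrowed.
use std::mem;

#[derive(Debug)]
struct Header {
    version: u8,
}

#[derive(Debug)]
enum State {
    NeedHeader,
    HaveHeader { header: Header, to_skip: usize },
    Streaming { header: Header },
}

struct Demuxer {
    state: State,
}

impl Demuxer {
    fn advance(&mut self) {
        // Leave a cheap placeholder behind; `header` can now be moved.
        self.state = match mem::replace(&mut self.state, State::NeedHeader) {
            State::HaveHeader { header, to_skip: 0 } => State::Streaming { header: header },
            other => other,
        };
    }
}

fn main() {
    let mut demuxer = Demuxer {
        state: State::HaveHeader { header: Header { version: 1 }, to_skip: 0 },
    };
    demuxer.advance();
    println!("{:?}", demuxer.state);
}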

Update 2

I’ve went a bit closer to the C-style struct definition now, as it makes the code less convoluted and allows me to just get forwards with the code. The current status can be seen here now, which also involves further refactoring (and e.g. some support for metadata).

by slomo at November 24, 2016 11:10 PM