March 26, 2020

Nirbheek Chauhan: GStreamer's Meson and Visual Studio Journey


Almost 3 years ago, I wrote about how we at Centricular had been working on an experimental port of GStreamer from Autotools to the Meson build system for faster builds on all platforms, and to allow building with Visual Studio on Windows.

At the time, the response was mixed, and for good reason—Meson was a very new build system, and it needed to work well on all the targets that GStreamer supports, which was all major operating systems. Meson did aim to support all of those, but a lot of work was required to bring platform support up to speed with the requirements of a non-trivial project like GStreamer.

The Status: Today!

After years of work across several components (Meson, Ninja, Cerbero, etc), GStreamer is being built with Meson on all platforms! Autotools is scheduled to be removed in the next release cycle (1.18). Edit: as of October 2019, Autotools has been removed.

The first stable release with this work was 1.16, which was released yesterday. It has already led to a number of new capabilities:
  • GStreamer can be built with Visual Studio on Windows inside Cerbero, which means we now ship official binaries for GStreamer built with the MSVC toolchain.
  • From-scratch Cerbero builds are much faster on all platforms, which has aided the implementation of CI-gated merge requests on GitLab.
  • The developer workflow has been streamlined and is the same on all platforms (Linux, Windows, macOS) using the gst-build meta-project. The meta-project can also be used for cross-compilation (Android, iOS, Windows, Linux).
  • The Windows developer workflow no longer requires installing several packages by hand or setting up an MSYS environment. All you need is Git, Python 3, Visual Studio, and 15 min for the initial build.
  • Profiling on Windows is now possible, and I've personally used it to profile and fix numerous Windows-specific performance issues.
  • Visual Studio projects that use GStreamer now have debug symbols since we're no longer mixing MinGW and MSVC binaries. This also enables usable crash reports and symbol servers.
  • We can ship plugins that can only be built with MSVC on Windows, such as the Intel MSDK hardware codec plugin, Directshow plugins, and also easily enable new Windows 10 features in existing plugins such as WASAPI.
  • iOS bitcode builds are more correct, since Meson is smart enough to know how to disable incompatible compiler options on specific build targets.
  • The iOS framework now also ships shared libraries in addition to the static libraries.
Overall, it's been a huge success and we're really happy with how things have turned out!

You can download the prebuilt MSVC binaries, reproduce them yourself, or quickly bootstrap a GStreamer development environment. The choice is yours!

Further Musings

While working on this over the years, what's really stood out to me was how this sort of gargantuan task was made possible through the power of community-driven FOSS and community-focused consultancy.

Our build system migration quest has been long with valleys full of yaks with thick coats of fur, and it would have been prohibitively expensive for a single entity to sponsor it all. Thanks to the inherently collaborative nature of community FOSS projects, people from various backgrounds and across companies could come together and make this possible.

There are many other examples of this, but seeing the improbable happen from the inside is something special.

Special shout-outs to ZEISS, Barco, Pexip, and Cablecast.tv for sponsoring various parts of this work!

Their contributions also made it easier for us to spend thousands more hours of non-sponsored time to fill in the gaps so that all the sponsored work done could be upstreamed in a form that's useful for everyone who uses GStreamer. This sort of thing is, in my opinion, an essential characteristic of being a community-focused consultancy, and we make sure that it always has high priority.

by Nirbheek (noreply@blogger.com) at March 26, 2020 08:35 PM

March 25, 2020

Andy Wingo: firefox's low-latency webassembly compiler

(Andy Wingo)

Good day!

Today I'd like to write a bit about the WebAssembly baseline compiler in Firefox.

background: throughput and latency

WebAssembly, as you know, is a virtual machine that is present in web browsers like Firefox. An important initial goal for WebAssembly was to be a good target for compiling programs written in C or C++. You can visit a web page that includes a program written in C++ and compiled to WebAssembly, and that WebAssembly module will be downloaded onto your computer and run by the web browser.

A good virtual machine for C and C++ has to be fast. The throughput of a program compiled to WebAssembly (the amount of work it can get done per unit time) should be approximately the same as its throughput when compiled to "native" code (x86-64, ARMv7, etc.). WebAssembly meets this goal by defining an instruction set that consists of similar operations to those directly supported by CPUs; WebAssembly implementations use optimizing compilers to translate this portable instruction set into native code.

There is another dimension of fast, though: not just work per unit time, but also time until first work is produced. If you want to go play Doom 3 on the web, you care about frames per second but also time to first frame. Therefore, WebAssembly was designed not just for high throughput but also for low latency. This focus on low-latency compilation expresses itself in two ways: binary size and binary layout.

On the size front, WebAssembly is optimized to encode small files, reducing download time. One way in which this happens is to use a variable-length encoding anywhere an instruction needs to specify an integer. In the usual case where, for example, there are fewer than 128 local variables, this means that a local.get instruction can refer to a local variable using just one byte. Another strategy is that WebAssembly programs target a stack machine, reducing the need for the instruction stream to explicitly load operands or store results. Note that size optimization only goes so far: it's assumed that the bytes of the encoded module will be compressed by gzip or some other algorithm, so sub-byte entropy coding is out of scope.
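
As an illustration of that variable-length scheme (LEB128), here is a minimal decoder sketch for an unsigned integer; the function name is invented for the example and this is not SpiderMonkey's actual code:

#include <cstdint>
#include <stdexcept>
#include <vector>

// Minimal unsigned LEB128 decoder: each byte carries 7 payload bits plus a
// continuation bit, so values below 128 (e.g. most local indices) take one byte.
uint32_t decodeVarU32(const std::vector<uint8_t>& bytes, size_t& pos) {
  uint32_t result = 0;
  int shift = 0;
  for (;;) {
    if (pos >= bytes.size()) throw std::runtime_error("truncated varuint");
    uint8_t b = bytes[pos++];
    result |= uint32_t(b & 0x7f) << shift;
    if ((b & 0x80) == 0) return result;  // continuation bit clear: done
    shift += 7;
    if (shift >= 35) throw std::runtime_error("varuint too long for u32");
  }
}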

On the layout side, the WebAssembly binary encoding is sorted by design: definitions come before uses. For example, there is a section of type definitions that occurs early in a WebAssembly module. Any use of a declared type can only come after the definition. In the case of functions, which can of course be mutually recursive, function type declarations come before the actual definitions. In theory this allows web browsers to take a one-pass, streaming approach to compilation, starting to compile as functions arrive and before download is complete.

implementation strategies

The goals of high throughput and low latency conflict with each other. To get best throughput, a compiler needs to spend time on code motion, register allocation, and instruction selection; to get low latency, that's exactly what a compiler should not do. Web browsers therefore take a two-pronged approach: they have a compiler optimized for throughput, and a compiler optimized for latency. As a WebAssembly file is being downloaded, it is first compiled by the quick-and-dirty low-latency compiler, with the goal of producing machine code as soon as possible. After that "baseline" compiler has run, the "optimizing" compiler works in the background to produce high-throughput code. The optimizing compiler can take more time because it runs on a separate thread. When the optimizing compiler is done, it replaces the baseline code. (The actual heuristics about whether to do baseline + optimizing ("tiering") or just to go straight to the optimizing compiler are a bit hairy, but this is a summary.)

This article is about the WebAssembly baseline compiler in Firefox. It's a surprising bit of code and I learned a few things from it.

design questions

Knowing what you know about the goals and design of WebAssembly, how would you implement a low-latency compiler?

It's a question worth thinking about so I will give you a bit of space in which to do so.

.

.

.

After spending a lot of time in Firefox's WebAssembly baseline compiler, I have extracted the following principles:

  1. The function is the unit of compilation

  2. One pass, and one pass only

  3. Lean into the stack machine

  4. No noodling!

In the remainder of this article we'll look into these individual points. Note, although I have done a good bit of hacking on this compiler, its design and original implementation comes mainly from Mozilla hacker Lars Hansen, who also currently maintains it. All errors of exegesis are mine, of course!

the function is the unit of compilation

As we mentioned, in the binary encoding of a WebAssembly module, all definitions needed by any function come before all function definitions. This naturally leads to a partition between two phases of bytestream parsing: an initial serial phase that collects the set of global type definitions, annotations as to which functions are imported and exported, and so on, and a subsequent phase that compiles individual functions in an essentially independent manner.

The advantage of this approach is that compiling functions is a natural task unit of parallelism. If the user has a machine with 8 virtual cores, the web browser can keep one or two cores for the browser itself and farm out WebAssembly compilation tasks to the rest. The result is that the compiled code is available sooner.
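
As a sketch of what "function as the unit of parallelism" can look like, here is a toy dispatcher that compiles each body as an independent task. The types and the baselineCompile stub are invented for the example; a real engine uses a managed helper-thread pool and shares the parsed module metadata read-only across tasks.

#include <cstdint>
#include <future>
#include <vector>

// Invented stand-ins: one entry per function body found in the code section.
struct FuncBody { size_t offset = 0; size_t length = 0; };
struct MachineCode { std::vector<uint8_t> bytes; };

// Stub for the sketch; the real baseline compiler emits code for one body.
MachineCode baselineCompile(FuncBody body) { (void)body; return {}; }

// Because each function body depends only on the read-only metadata gathered
// in the serial phase, every body can be compiled as an independent task.
std::vector<MachineCode> compileAllFunctions(const std::vector<FuncBody>& bodies) {
  std::vector<std::future<MachineCode>> tasks;
  for (FuncBody body : bodies)
    tasks.push_back(std::async(std::launch::async, baselineCompile, body));
  std::vector<MachineCode> code;
  for (auto& task : tasks) code.push_back(task.get());
  return code;
}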

Taking functions to be the unit of compilation also allows for an easy "tier-up" mechanism: after the baseline compiler is done, the optimizing compiler can take more time to produce better code, and when it is done, it can swap out the results on a per-function level. All function calls from the baseline compiler go through a jump table indirection, to allow for tier-up. In SpiderMonkey there is no mechanism currently to tier down; if you need to debug WebAssembly code, you need to refresh the page, causing the wasm code to be compiled in debugging mode. For the record, SpiderMonkey can only tier up at function calls (it doesn't do OSR).
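
Conceptually, the jump-table indirection is a table of per-function entry points that every call goes through, which the optimizing tier can later overwrite. The following is only an illustrative sketch, not SpiderMonkey's actual data structures:

#include <atomic>
#include <cstdint>
#include <memory>

using CodePtr = void (*)();  // entry point of one compiled wasm function

// One slot per function. Calls always go through the slot, so the code a
// call reaches can be changed after the fact.
struct TierTable {
  std::unique_ptr<std::atomic<CodePtr>[]> slots;

  explicit TierTable(size_t numFuncs)
      : slots(new std::atomic<CodePtr>[numFuncs]()) {}

  // Install the baseline entry point as each function finishes compiling.
  void setBaseline(uint32_t funcIndex, CodePtr baselineEntry) {
    slots[funcIndex].store(baselineEntry, std::memory_order_release);
  }

  // Call site: load the current entry point, then call it.
  void call(uint32_t funcIndex) const {
    slots[funcIndex].load(std::memory_order_acquire)();
  }

  // The optimizing tier finished a function: swap in the better code. New
  // calls see it immediately; baseline frames already running simply finish.
  void tierUp(uint32_t funcIndex, CodePtr optimizedEntry) {
    slots[funcIndex].store(optimizedEntry, std::memory_order_release);
  }
};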

This simple approach does have some down-sides, in that it leaves interprocedural optimizations on the table (inlining, contification, custom calling conventions, speculative optimizations). This is mitigated in two ways, the most obvious being that LLVM or whatever produced the WebAssembly has ideally already done whatever inlining might be fruitful. The second is that WebAssembly is designed for predictable performance. In JavaScript, an implementation needs to do run-time type feedback and speculative optimizations to get good performance, but the result is that it can be hard to understand why a program is fast or slow. The designers and implementers of WebAssembly in browsers all had first-hand experience with JavaScript virtual machines, and actively wanted to avoid unpredictable performance in WebAssembly. Therefore there is currently a kind of détente among the various browser vendors: everyone has agreed that they won't do speculative inlining -- yet, anyway. Who knows what will happen in the future, though.

Digressing, the summary here is that the baseline compiler receives an individual function body as input, and generates code just for that function.

one pass, and one pass only

The WebAssembly baseline compiler makes one pass through the bytecode of a function. Nowhere in all of this are we going to build an abstract syntax tree or a graph of basic blocks. Let's follow through how that works.

Firstly, emitFunction simply emits a prologue, then the body, then an epilogue. emitBody is basically a big loop that consumes opcodes from the instruction stream, dispatching to opcode-specific code emitters (e.g. emitAddI32).
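
For illustration, the shape of that loop is roughly as follows. The emitter type and its methods are invented for this sketch (only the opcode byte values are the real WebAssembly encodings), and real code validates and decodes far more carefully:

#include <cstdint>
#include <vector>

// Real one-byte opcode values from the wasm binary format; everything else
// here is a toy stand-in for the actual emitter.
enum : uint8_t { OP_LOCAL_GET = 0x20, OP_I32_ADD = 0x6a, OP_I32_SUB = 0x6b, OP_END = 0x0b };

struct ToyEmitter {
  void emitGetLocal(uint8_t idx) { (void)idx; /* push local idx on the value stack */ }
  void emitAddI32()              { /* pop two i32s, emit add, push result */ }
  void emitSubI32()              { /* pop two i32s, emit sub, push result */ }
};

// One pass over the bytecode: read an opcode, dispatch, repeat until `end`.
bool emitBody(const std::vector<uint8_t>& code, ToyEmitter& bc) {
  size_t pc = 0;
  while (pc < code.size()) {
    switch (code[pc++]) {
      case OP_LOCAL_GET: bc.emitGetLocal(code[pc++]); break;  // 1-byte index, for the sketch
      case OP_I32_ADD:   bc.emitAddI32(); break;
      case OP_I32_SUB:   bc.emitSubI32(); break;
      case OP_END:       return true;     // end of the function body
      default:           return false;    // unknown opcode: validation error
    }
  }
  return false;  // ran off the end without seeing `end`
}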

The opcode-specific code emitters are also responsible for validating their arguments; for example, emitAddI32 is wrapped in an assertion that there are two i32 values on the stack. This validation logic lives in a templatized codestream iterator, so that it can be re-used by the optimizing compiler as well as by the publicly-exposed WebAssembly.validate function.

A corollary of this approach is that machine code is emitted in bytestream order; if the WebAssembly instruction stream has an i32.add followed by an i32.sub, then the machine code will have an addl followed by a subl.

WebAssembly has a syntactically limited form of non-local control flow; it's not goto. Instead, instructions are contained in a tree of nested control blocks, and control can only exit nonlocally to a containing control block. There are three kinds of control blocks -- blocks, ifs, and loops: jumping to a block or an if will continue at the end of the block, whereas jumping to a loop will continue at its beginning. In either case, as the compiler keeps a stack of nested control blocks, it has the set of valid jump targets and can use the usual assembler logic to patch forward jump addresses when the compiler gets to the block exit.
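
Here is a small sketch of that bookkeeping, with invented names; code offsets stand in for real assembler labels, and the actual compiler's structures are richer than this:

#include <cstdint>
#include <vector>

// Sketch of control-flow bookkeeping, not SpiderMonkey's types.
struct Label {
  bool bound = false;
  size_t offset = 0;                    // target offset once bound
  std::vector<size_t> pendingBranches;  // branch sites emitted before binding
};

struct ControlFrame {
  bool isLoop = false;  // loops branch back to their start; blocks/ifs to their end
  Label label;
};

struct ControlStack {
  std::vector<ControlFrame> frames;

  // Stand-in for the assembler's "rewrite the jump at `site` to target `dest`".
  static void patchJump(size_t site, size_t dest) { (void)site; (void)dest; }

  void pushBlockOrIf() { frames.push_back(ControlFrame{}); }

  void pushLoop(size_t startOffset) {
    ControlFrame f;
    f.isLoop = true;
    f.label.bound = true;          // a loop's target is already known: its start
    f.label.offset = startOffset;
    frames.push_back(f);
  }

  // br/br_if with relative depth N targets the Nth enclosing frame, so the
  // set of valid jump targets is always just this stack.
  void branch(uint32_t depth, size_t branchSite) {
    Label& l = frames[frames.size() - 1 - depth].label;
    if (!l.bound) l.pendingBranches.push_back(branchSite);  // patch later
    // else: emit a jump straight to l.offset
  }

  // Reaching the end of a block binds its label and patches forward branches.
  void popAndBind(size_t hereOffset) {
    Label& l = frames.back().label;
    if (!l.bound) {
      l.bound = true;
      l.offset = hereOffset;
      for (size_t site : l.pendingBranches) patchJump(site, hereOffset);
    }
    frames.pop_back();
  }
};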

lean into the stack machine

This is the interesting bit! So, WebAssembly instructions target a stack machine. That is to say, there's an abstract stack onto which evaluating i32.const 32 pushes a value, and if followed by i32.const 10 there would then be i32(32) | i32(10) on the stack (where new elements are added on the right). A subsequent i32.add would pop the two values off, and push on the result, leaving the stack as i32(42). There is also a fixed set of local variables, declared at the beginning of the function.

The easiest thing that a compiler can do, then, when faced with a stack machine, is to emit code for a stack machine: as values are pushed on the abstract stack, emit code that pushes them on the machine stack.

The downside of this approach is that you emit a fair amount of code to read and write values from the stack. Machine instructions generally take arguments from registers and write results to registers; going to memory is a bit superfluous. We're willing to accept suboptimal code generation for this quick-and-dirty compiler, but isn't there something smarter we can do for ephemeral intermediate values?

Turns out -- yes! The baseline compiler keeps an abstract value stack as it compiles. For example, compiling i32.const 32 pushes nothing on the machine stack: it just adds a ConstI32 node to the value stack. When an instruction needs an operand that turns out to be a ConstI32, it can either encode the operand as an immediate argument or load it into a register.

Say we are evaluating the i32.add discussed above. After the add, where does the result go? For the baseline compiler, the answer is always "in a register" via pushing a new RegisterI32 entry on the value stack. The baseline compiler includes a stupid register allocator that spills the value stack to the machine stack if no register is available, updating value stack entries from e.g. RegisterI32 to MemI32. Note, a ConstI32 never needs to be spilled: its value can always be reloaded as an immediate.
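
A sketch of this bookkeeping, with invented types that merely echo the names used above (the real BaseCompiler's value stack is considerably more involved):

#include <cstdint>
#include <vector>

struct StackEntry {
  enum Kind { ConstI32, RegisterI32, MemI32 } kind;
  int32_t imm;  // used when kind == ConstI32
  int reg;      // used when kind == RegisterI32
  int slot;     // machine-stack slot, used when kind == MemI32
};

struct ValueStack {
  std::vector<StackEntry> entries;

  // i32.const pushes nothing on the machine stack: just record the constant.
  void pushConstI32(int32_t v) { entries.push_back({StackEntry::ConstI32, v, -1, -1}); }
  void pushRegisterI32(int r)  { entries.push_back({StackEntry::RegisterI32, 0, r, -1}); }

  // A consumer that can take an immediate asks for the constant directly...
  bool popConstI32(int32_t* out) {
    if (entries.empty() || entries.back().kind != StackEntry::ConstI32) return false;
    *out = entries.back().imm;
    entries.pop_back();
    return true;
  }

  // ...otherwise the operand is materialized in a register: already there for
  // RegisterI32, reloaded from its slot for MemI32, or loaded as an immediate
  // for ConstI32 (constants never need to be spilled).
  int popIntoRegister(int scratchReg) {
    StackEntry e = entries.back();
    entries.pop_back();
    if (e.kind == StackEntry::RegisterI32) return e.reg;
    // emit "mov scratchReg, imm" or "load scratchReg, [slot]" here
    return scratchReg;
  }
};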

The end result is that the baseline compiler avoids lots of stack store and load code generation, which speeds up the compiler, and happens to make faster code as well.

Note that there is one limitation, currently: control-flow joins can have multiple predecessors and can pass a value (in the current WebAssembly specification), so the allocation of that value needs to be agreed-upon by all predecessors. As in this code:

(func $f (param $arg i32) (result i32)
  (block $b (result i32)
    (i32.const 0)
    (local.get $arg)
    (i32.eqz)
    (br_if $b) ;; return 0 from $b if $arg is zero
    (drop)
    (i32.const 1))) ;; otherwise return 1
;; result of block implicitly returned

When the br_if branches to the block end, where should it put the result value? The baseline compiler effectively punts on this question and just puts it in a well-known register (e.g., $rax on x86-64). Results for block exits are the only place where WebAssembly has "phi" variables, and the baseline compiler allocates all integer phi variables to the same register. A hack, but there we are.

no noodling!

When I started to hack on the baseline compiler, I did a lot of code reading, and eventually came upon code like this:

void BaseCompiler::emitAddI32() {
  int32_t c;
  if (popConstI32(&c)) {
    // Top of the value stack is a constant: fold it in as an immediate.
    RegI32 r = popI32();
    masm.add32(Imm32(c), r);
    pushI32(r);
  } else {
    // General case: both operands materialized in registers.
    RegI32 r, rs;
    pop2xI32(&r, &rs);
    masm.add32(rs, r);
    freeI32(rs);
    pushI32(r);
  }
}

I said to myself, this is silly, why are we only emitting the add-immediate code if the constant is on top of the stack? What if instead the constant is the deeper of the two operands -- why do we then load the constant into a register? I asked on the chat channel if it would be OK if I improved codegen here and got a response I was not expecting: no noodling!
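
For concreteness, the kind of change I had in mind would have looked roughly like the following hypothetical rewrite, which was never actually written; it reuses the helpers from the snippet above and leans on the fact that addition is commutative:

// Hypothetical "noodled" version, never landed: also catch the case where the
// constant is the deeper of the two operands, not just the one on top.
void BaseCompiler::emitAddI32() {
  int32_t c;
  if (popConstI32(&c)) {           // constant on top
    RegI32 r = popI32();
    masm.add32(Imm32(c), r);
    pushI32(r);
  } else {
    RegI32 r = popI32();
    if (popConstI32(&c)) {         // constant was the deeper operand
      masm.add32(Imm32(c), r);
    } else {
      RegI32 rs = popI32();
      masm.add32(rs, r);
      freeI32(rs);
    }
    pushI32(r);
  }
}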

The reason is, performance of baseline-compiled code essentially doesn't matter. Obviously let's not pessimize things but the reason there's a baseline compiler is to emit code quickly. If we start to add more code to the baseline compiler, the compiler itself will slow down.

For that reason, changes are only accepted to the baseline compiler if they are necessary for some reason, or if they improve latency as measured using some real-world benchmark (time-to-first-frame on Doom 3, for example).

This to me was a real eye-opener: a compiler optimized not for the quality of the code that it generates, but rather for how fast it can produce the code. I had seen this in action before but this example really brought it home to me.

The focus on compiler throughput rather than compiled-code throughput makes it pretty gnarly to hack on the baseline compiler -- care has to be taken when adding new features not to significantly regress the old. It is much more like hacking on a production JavaScript parser than your traditional SSA-based compiler.

that's a wrap!

So that's the WebAssembly baseline compiler in SpiderMonkey / Firefox. Until the next time, happy hacking!

by Andy Wingo at March 25, 2020 04:29 PM

Jean-François Fortin Tam: 2020: the fecal matter is colliding with the rotary oscillator

Many friends of mine, including a significant portion of GNOME contributors, are in the United States, and I’m personally worried they (or those around them) will face particularly deep trouble this year and beyond. It seems nobody dares talk openly about it, so what the heck, I’m sharing my concern here and getting it off my chest (then, after worrying about death, I can move on to worrying about taxes). Maybe I’ll be able to sleep a bit better.

As you most probably know, Europe is taking a serious beating and is struggling as we speak… but if you thought the US would fare any better, just wait. Shiitake is about to hit the fan, and the case of the United States of America is particularly concerning because of the many reasons I extensively documented here a couple of days ago. Not only is the US’s preparation for this pandemic very much insufficient, and not only does it lack a true safety net for its citizens, but it also has very unique societal factors that, compared to all the other countries in the world, put it at risk of suffering extremely deep social disruption and pervasive hardship.

I wish you the best of luck in the fight against the SARS-coronavirus-2, just as I am wishing good luck to the rest of the world. I hope I will be incredibly wrong (so far the trends seem to be confirming my predictions, however) and that some unforeseen radical solutions will turn the tide, but I’m not holding my breath here. The US needs more than band-aid quick-fixes.

Let’s hope that this time, the sheer scale of the problem will bring about real positive change in the system. Not just a bigger economic bubble at the expense of the people and planet. It would be about time.

The post 2020: the fecal matter is colliding with the rotary oscillator appeared first on The Open Sourcerer.

by Jeff at March 25, 2020 03:06 PM

March 16, 2020

Víctor Jáquez: Review of the Igalia Multimedia team Activities (2019/H2)

This blog post is a review of the various activities the Igalia Multimedia team was involved in during the second half of 2019.

Here are the previous 2018/H2 and 2019/H1 reports.

GstWPE

Succinctly, GstWPE is a GStreamer plugin which allows rendering web pages as a video stream whose frames are GL textures.

Phil, its main author, wrote a blog post explaining in detail what GstWPE is and its possible use-cases. He wrote a demo too, which grabs and previews a live stream from a webcam session and blends it with an overlay from wpesrc, which displays HTML content. This composited live stream can be broadcast through YouTube or Twitch.

These concepts are better explained by Phil himself in the following lightning talk, presented at the last GStreamer Conference in Lyon:

Video Editing

After implementing a deep integration of the GStreamer Editing Services (a.k.a GES) into Pixar’s OpenTimelineIO during the first half of 2019, we decided to implement an important missing feature for the professional video editing industry: nested timelines.

Toward that goal, Thibault worked with the GSoC student Swayamjeet Swain to implement a flexible API to support nested timelines in GES. This means that users of GES can now decouple each scene into different projects when editing long videos. This work is going to be released in the upcoming GStreamer 1.18 version.

Henry Wilkes also implemented support for nested timelines in OpenTimelineIO, making the GES integration one of the most advanced ones, as you can see in the following table:

| Feature               | OTIO | EDL | FCP7 XML | FCP X | AAF | RV  | ALE | GES |
|-----------------------|------|-----|----------|-------|-----|-----|-----|-----|
| Single Track of Clips | ✔    | ✔   | ✔        | ✔     | ✔   | W-O | ✔   | ✔   |
| Multiple Video Tracks | ✔    | ✖   | ✔        | ✔     | ✔   | W-O | ✔   | ✔   |
| Audio Tracks & Clips  | ✔    | ✔   | ✔        | ✔     | ✔   | W-O | ✔   | ✔   |
| Gap/Filler            | ✔    | ✔   | ✔        | ✔     | ✔   | ✔   | ✖   | ✔   |
| Markers               | ✔    | ✔   | ✔        | ✔     | ✖   | N/A | ✖   | ✔   |
| Nesting               | ✔    | ✖   | ✔        | ✔     | ✔   | W-O | ✔   | ✔   |
| Transitions           | ✔    | ✔   | ✖        | ✖     | ✔   | W-O | ✖   | ✔   |
| Audio/Video Effects   | ✖    | ✖   | ✖        | ✖     | ✖   | N/A | ✖   | ✔   |
| Linear Speed Effects  | ✔    | ✔   | ✖        | ✖     | R-O | ✖   | ✖   | ✖   |
| Fancy Speed Effects   | ✖    | ✖   | ✖        | ✖     | ✖   | ✖   | ✖   | ✖   |
| Color Decision List   | ✔    | ✔   | ✖        | ✖     | ✖   | ✖   | N/A | ✖   |

Along these lines, Thibault delivered a 15 minutes talk, also in the GStreamer Conference 2019:

After detecting a few regressions and issues in GStreamer related to frame accuracy, we decided to make sure that we can seek in a perfectly frame-accurate way using GStreamer and the GStreamer Editing Services. In order to ensure that, an extensive integration testsuite has been developed, mostly targeting the most important container formats and codecs (namely mxf, quicktime, h264, h265, prores, jpeg), and issues have been fixed in different places. On top of that, new APIs are being added to GES to allow expressing times in frame numbers instead of nanoseconds. This work is still ongoing but should be merged in time for GStreamer 1.18.

GStreamer Validate Flow

GstValidate has been turning into one of the most important GStreamer testing tools to check that elements behave as they are supposed to do in the framework.

Along with our MSE work, we found that another way to specify tests was needed, one related to the buffers and events produced through specific pads. Thus, Alicia developed a new plugin for GstValidate: Validate Flow.

Alicia gave an informative 30 minutes talk about GstValidate and the new plugin in the last GStreamer Conference too:

GStreamer VAAPI

Most of the work during the second half of 2019 was maintenance tasks and code reviews.

We worked mainly on memory restrictions per backend driver, and we reviewed a big refactor: internal encoders now use GstObject, instead of the custom GstVaapiObject. Also we reviewed patches for new features such as video rotation and cropping in vaapipostproc.

Servo multimedia

Last year we worked on integrating media playback in Servo. We finally delivered hardware-accelerated video playback on Linux and Android. We also worked on the Windows and Mac ports, but they were not finished. Naturally, most of the work was in the servo/media crate, pushing code and reviewing contributions. The major tasks were rewriting the media player example and making the internal source element handle playbin‘s download flag properly.

We also added WebGL integration with <video> elements, so web pages can use video frames as WebGL textures.

Finally we explored how to isolate the multimedia processing in a dedicated thread or process, but that task remains pending.

WebKit Media Source Extension

We did a lot of downstream and upstream bug fixing and patch review, both in WebKit and GStreamer, for our MSE GStreamer-based backend.

Along this line, we improved WebKitMediaSource to use playbin3, and we also added compatibility with older GStreamer versions.

WebKit WebRTC

Most of the work in this area was maintenance and fixing regressions uncovered by the layout tests. Besides that, the support for the Raspberry Pi was improved by handling encoded streams from v4l2 video sources, with some explorations with the Minnowboard on top of that.

Conferences

GStreamer Conference

Igalia was a Gold sponsor of the last GStreamer Conference, held in Lyon, France.

The whole team attended and five talks were delivered. Thibault alone presented, besides the video editing talk already mentioned, two more: one about the GstTranscoder API and the other about the new documentation infrastructure based on Hotdoc:

We also had a productive hackfest, after the conference, where we worked on AV1 Rust decoder, HLS Rust demuxer, hardware decoder flag in playbin, and other stuff.

Linaro Connect

Phil attended the Linaro Connect conference in San Diego, USA. He delivered a talk about WPE/Multimedia which you can enjoy here:

Demuxed

Charlie attended Demuxed, in San Francisco. The conference is heavily focused on streaming and codec engineering and validation. Sadly there is not much interest in GStreamer there, as the main focus is on FFmpeg.

RustFest

Phil and I attended the last RustFest in Barcelona. Basically we went to meet with the Rust community and we attended the “WebRTC with GStreamer-rs” workshop presented by Sebastian Dröge.

by vjaquez at March 16, 2020 03:20 PM

March 05, 2020

Jean-François Fortin Tam: The Ultimate Free and Open Source conference explanation video

Have you ever wondered what the best community-oriented open source conference events look like? Ever wanted to attend one, but never dared to? Or need something to convince your boss to support you in attending as part of your work?

For many veteran FLOSS contributors who are part of big established projects, it is easy to take things for granted and just go to those events without hesitation; we forget how mysterious and intimidating this can be for casual or new contributors. We don’t typically spend the time to articulate what makes these events great, and why we spend so much effort organizing and attending them.

It also seems quite mysterious to our non-technical friends and family members. They sometimes know that we’re travelling to some mythical “computer conference” event in some faraway land, suspiciously held in a different city every year (as is the case with GNOME’s GUADEC), but it’s hard to explain why we’re mostly going there for a few days to spend time “indoors in some auditorium” instead of sipping margaritas on the beach.

Well, I have the solution for this longstanding communication problem.

After weeks of preparation, a few days of shooting, and over 13 days of full-time editing, I have produced the Ultimate FLOSS conference explanation video:

[Video thumbnail] Click the image above to view the video

It is a dynamic, cinematic, professional-grade short documentary, meant to serve as evergreen material that you can point people to. I hope you will appreciate the level of attention to detail present in this edit!

It is longer than the typical “2-minutes conference highlights” video, but I believe the quality and depth of topics being discussed, combined with my tight script and editing, will make it a pleasure for you to watch from beginning to end. It’s shorter than a TV episode yet still makes for good nighttime entertainment!

  • In order to make the topic accessible, the cinematic part is preceded by a narrated introduction to establish the context in terms anyone can understand. This is so that you can safely share the video with newcomers, friends, family, new acquaintances you meet for years to come—no matter their knowledge level.
  • After the educational introduction, it then continues to the “cinematic” documentary part.

“What if I’m already a Lv.70 geek?”

If you’re in a big hurry and you already know all about Free & Open Source software, you can skip to the 5:05 mark, and if you already know about GStreamer and don’t care to know why I made the video “around GStreamer” in the first place, you can jump directly to 8:14… But if you have a few minutes extra, it’s certainly worth watching from the start (you’d be missing some jokes otherwise).

I recommend listening to this with good speakers or headphones. While I am no musician, my editing style is centered around sound, rhythm and (e)motion:

  • I adapt motion, beats and flow to fit the desired atmosphere and impact. In terms of editing style, I primarily “cut to the music”, but also sometimes rearrange the music itself to fit the motion.
  • I tweak all the sound levels & frequencies to ensure you can always hear speech effortlessly—whether you are on my studio monitoring speakers or headphones, or on some crappy 1w laptop speakers (which I obviously do not recommend).

While I’m at it, I might as well mention that I’m open for contractual video production or editing work (in addition to being available as a long-term specialized marketer ;)


P.s.: if you’re French, no need to challenge me to a duel after watching the intro! I actually like your transportation system. Especially the fact that you actually have trains. We don’t have that here.

The post The Ultimate Free and Open Source conference explanation video appeared first on The Open Sourcerer.

by Jeff at March 05, 2020 02:30 PM

February 17, 2020

Jean-François Fortin Tam: Revival of Getting Things GNOME: survey results and first status update

Ever since my previous blogging frenzy where I laid bare the secret to my productivity, formulated my typology of workers, and published a survey to evaluate the revival potential for Getting Things GNOME, I’m sure y’all have been dying to know what were the outcomes of that survey, and how the GTG project is doing.

Well, you can find out in this video where I present my findings and the path forward for the project:

I hope you like it. Much more fun than reading a wall of text, isn’t it?

I’ll have another video (on a different open-source project/community management topic) coming up very soon, and it will be of absolutely epic proportions—it took me months to produce it to the level of quality I desired. If you don’t want to miss it, I suggest you subscribe to my YouTube channel (and/or mailing list).

The post Revival of Getting Things GNOME: survey results and first status update appeared first on The Open Sourcerer.

by Jeff at February 17, 2020 07:17 PM

February 09, 2020

Andy Wingo: state of the gnunion 2020

(Andy Wingo)

Greetings, GNU hackers! This blog post rounds up GNU happenings over 2019. My goal is to celebrate the software we produced over the last year and to help us plan a successful 2020.

Over the past few months I have been discussing project health with a group of GNU maintainers and we were wondering how the project was doing. We had impressions, but little in the way of data. To that end I wrote some scripts to collect dates and versions for all releases made by GNU projects, as far back as data is available.

In 2019, I count 243 releases, from 98 projects. Nice! Notably, on ftp.gnu.org we have the first stable releases from three projects:

GNU Guix
GNU Guix is perhaps the most exciting project in GNU these days. It's a package manager! It's a distribution! It's a container construction tool! It's a package-manager-cum-distribution-cum-container-construction-tool! Hearty congratulations to Guix on their first stable release.
GNU Shepherd
The GNU Daemon Shepherd is a modern dependency-based init service, written in Guile Scheme, and used in Guix. When you install Guix as an operating system, it actually stages Scheme programs from the operating system definition into the Shepherd configuration. So cool!
GNU Backgammon
Version 1.06.002 is not GNU Backgammon's first stable release, but it is the earliest version which is available on ftp.gnu.org. Formerly hosted on the now-defunct gnubg.org, GNU Backgammon is a venerable foe, and has been using neural networks since before they were cool. Welcome back, GNU Backgammon!

The total release counts above are slightly above what Mike Gerwitz's scripts count in his "GNU Spotlight", posted on the FSF blog. This could be because in addition to files released on ftp.gnu.org, I also manually collected release dates for most packages that upload their software somewhere other than gnu.org. I don't count alpha.gnu.org releases, and there were a handful of packages for which I wasn't successful at retrieving their release dates. But as a first approximation, it's a relatively complete data set.

I put my scripts in a git repository in case anyone is interested in playing with the data. Some raw CSV files are there as well.

where we at?

Hair toss, check my nails, baby how you GNUing? Hard to tell!

To get us closer to an answer, I calculated the active package count per year. There can be other definitions, but my reading is that an active package is one that has had a stable release within the preceding 3 calendar years. So for 2019, for example, a GNU package is considered active if it had a stable release in 2017, 2018, or 2019. What I got was a graph that looks like this:

What we see is nothing before 1991 -- surely pointing to lacunae in my data set -- then a more or less linear rise in active package count until 2002, some stuttering growth rising to a peak in 2014 at 208 active packages, and from there a steady decline down to 153 active packages in 2019.
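
To make that definition concrete, here is a small sketch of how such an active-package count could be computed from per-package release years; this is only an illustration, not the author's actual scripts (which live in the git repository mentioned above and work from the collected release dates):

#include <map>
#include <string>
#include <vector>

// "Active in year Y" here means: at least one stable release in Y-2, Y-1, or Y.
std::map<int, int> activePackageCounts(
    const std::map<std::string, std::vector<int>>& releaseYearsByPackage,
    int firstYear, int lastYear) {
  std::map<int, int> counts;
  for (int year = firstYear; year <= lastYear; ++year) {
    int active = 0;
    for (const auto& entry : releaseYearsByPackage) {
      for (int releaseYear : entry.second) {
        if (releaseYear >= year - 2 && releaseYear <= year) { ++active; break; }
      }
    }
    counts[year] = active;
  }
  return counts;
}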

Of course, as a metric, active package count isn't precisely the same as project health; GNU ed is indeed the standard editor but it's not GCC. But we need to look for measurements that indirectly indicate project health and this is what I could come up with.

Looking a little deeper, I tabulated the first and last release date for each GNU package, and then grouped them by year. In this graph, the left blue bars indicate the number of packages making their first recorded release, and the right green bars indicate the number of packages making their last release. Obviously a last release in 2019 indicates an active package, so it's to be expected that we have a spike in green bars on the right.

What this graph indicates is that GNU had an uninterrupted growth phase from its beginning until 2006, with more projects being born than dying. Things are mixed until 2012 or so, and since then we see many more projects making their last release and above all, very few packages "being born".

where we going?

I am not sure exactly what steps GNU should take in the future but I hope that this analysis can be a good conversation-starter. I do have some thoughts but will post in a follow-up. Until then, happy hacking in 2020!

by Andy Wingo at February 09, 2020 07:44 PM

February 07, 2020

Andy Wingo: lessons learned from guile, the ancient & spry

(Andy Wingo)

Greets, hackfolk!

Like just about every year, last week I took the train up to Brussels for FOSDEM, the messy and wonderful carnival of free software and of those that make it. Mostly I go for the hallway track: to see old friends, catch up, scheme about future plans, and refill my hacker culture reserves.

I usually try to see if I can get a talk or two in, and this year was no exception. First on my mind was the recent release of Guile 3. This was the culmination of a 10-year plan of work and so obviously there are some things to say! But at the same time, I wanted to reflect back a bit and look at the past with a bit of distance.

So in the end, my one talk was two talks. Let's start with the first one. (I'm trying a new thing where I share my talks as blog posts. We'll see how this goes. I know the rendering can be a bit off relative to the slides, but hopefully it's good enough. If you prefer, you can just watch the video instead!)

Celebrating Guile 3

FOSDEM 2020, Brussels

Andy Wingo | wingo@igalia.com

wingolog.org | @andywingo

So yeah let's celebrate! I co-maintain the Guile implementation of Scheme. It's a programming language. Guile 3, in summary, is just Guile, but faster. We added a simple just-in-time compiler as well as a bunch of ahead-of-time optimizations. The result is that it runs faster -- sometimes by a lot!

In the image above you can see Guile 3's performance on a number of microbenchmarks, relative to Guile 2.2, sorted by speedup. The baseline is 1.0x as fast. You can see that besides the first couple microbenchmarks where things are a bit inconclusive (click for full-size image), everything gets faster. Most are at least 2x as fast, and one benchmark is even 32x as fast. (Note the logarithmic scale on the Y axis.)

I only took a look at microbenchmarks at the end of the Guile 3 series; before that, I was mostly going by instinct. It's a relief to find out that in this case, my instincts did align with improvement.

mini-benchmark: eval

(primitive-eval
 '(let fib ((n 30))
    (if (< n 2)
        n
        (+ (fib (- n 1)) (fib (- n 2))))))

Guile 1.8: primitive-eval written in C

Guile 2.0+: primitive-eval in Scheme

Taking a look at a more medium-sized benchmark, let's compute the 30th fibonacci number, but using the interpreter instead of compiling the procedure. In Guile 2.0 and up, the interpreter (primitive-eval) is implemented in Scheme, so it's a good test of an important small Scheme program.

Before 2.0, though, primitive-eval was actually implemented in C. This had a number of disadvantages, notably that it prevented tail calls between interpreted and compiled code. When we switched to a Scheme implementation of primitive-eval, we knew we would have a performance hit, but we thought that we would gain it back eventually as the compiler got better.

As you can see, it took a while before the compiler and run-time improved to the point that primitive-eval in Scheme reached the speed of its old hand-tuned C implementation, but for Guile 3, we finally got there. Note again the logarithmic scale on the Y axis.

macro-benchmark: guix

guix build libreoffice ghc-pandoc guix \
  --dry-run --derivation

7% faster

guix system build config.scm \
  --dry-run --derivation

10% faster

Finally, taking a real-world benchmark, the Guix package manager is implemented entirely in Scheme. All ten thousand packages are defined in Scheme, the building scripts are in Scheme, the initial RAM disk is in Scheme -- you get the idea. Guile performance in Guix can have an important effect on user experience. As you can see, Guile 3 lowered elapsed time for some operations by around 10 percent or so. Of course there's a lot of I/O going on in addition to computation, so Guile running twice as fast will rarely make Guix run twice as fast (Amdahl's law and all that).

spry /sprī/

  • adjective: active; lively

So, when I was thinking about words that describe Guile, the word "spry" came to mind.

spry /sprī/

  • adjective: (especially of an old person) active; lively

But actually when I went to look up the meaning of "spry", Collins Dictionary says that it especially applies to the agèd. At first I was a bit offended, but I knew in my heart that the dictionary was right.

Lessons Learned from Guile, the Ancient & Spry

FOSDEM 2020, Brussels

Andy Wingo | wingo@igalia.com

wingolog.org | @andywingo

That leads me into my second talk.

guile is ancient

2010: Rust

2009: Go

2007: Clojure

1995: Ruby

1995: PHP

1995: JavaScript

1993: Guile (27 years before 3.0!)

It's common for a new project to be lively, but Guile is definitely not new. People have been born, raised, and earned doctorates in programming languages in the time that Guile has been around.

built from ancient parts

1991: Python

1990: Haskell

1990: SCM

1989: Bash

1988: Tcl

1988: SIOD

Guile didn't appear out of nothing, though. It was hacked up from the pieces of another Scheme implementation called SCM, which itself was initially based on Scheme in One Defun (SIOD), back before the Berlin Wall fell.

written in an ancient language

1987: Perl

1984: C++

1975: Scheme

1972: C

1958: Lisp

1958: Algol

1954: Fortran

1958: Lisp

1930s: λ-calculus (90 years ago!)

But it goes back further! The Scheme language, of which Guile is an implementation, dates from 1975, before I was born; and you can, if you choose, trace the lines back to the lambda calculus, created in mid-30s as a notation for computation. I suppose at this point I should say mid-2030s, to disambiguate.

The point is, Guile is old! Statistically, most software projects from olden times are now dead. How has Guile managed to survive and (sometimes) thrive? Surely there must be some lesson or other that can be learned here.

ancient & spry

Men make their own history, but they do not make it as they please; they do not make it under self-selected circumstances, but under circumstances existing already, given and transmitted from the past.

The tradition of all dead generations weighs like a nightmare on the brains of the living. [...]

Eighteenth Brumaire of Louis Bonaparte, Marx, 1852

I am no philospher of history, but I know that there are some ways of looking at the past that do not help me understand things. One is the arrow of enlightened progress, in which events exist in a causal chain, each producing the next. It doesn't help me understand the atmosphere, tensions, and possibilities inherent at any particular point. I find the "progress" theory of history to be an extreme form of selection bias.

Much more helpful to me is the Hegelian notion of dialectics: that at an given point in time there are various tensions at work. In our field, an example could be memory safety versus systems programming. These tensions create an environment that favors actions that lead towards resolution of the tensions. It doesn't mean that there's only one way to resolve the tensions, and it's not an automatic process -- people still have to do things. But the tendency is to ratchet history forward to a new set of tensions.

The history of a project, to me, is then a process of dialectic tensions and resolutions. If the project survives, as Guile has, then it should teach us something about the way this process works in practice.

ancient & spry

Languages evolve; how to remain minimal?

Dialectic opposites

  • world and guile

  • stable and active

  • ...

Lessons learned from inside Hegel’s motor of history

One dialectic is the tension between the world's problems and what tools Guile offers to understand and solve them. In 1993, the web didn't really exist. In 2033, if Guile doesn't run well in a web browser, probably it will be dead. But this process operates very slowly, for an old project; Guile isn't built on CORBA or something ephemeral like that, so we don't have very much data here.

The tension between being a stable base for others to build on, and in being a dynamic project that improves and changes, is a key tension that this talk investigates.

In the specific context of Guile, and for the audience of the FOSDEM minimal languages devroom, we should recognize that for a software project, age and minimalism don't necessarily go together. Software gets features over time and becomes bigger. What does it mean for a minimal language to evolve?

hill-climbing is insufficient

Ex: Guile 1.8; Extend vs Embed

One key lesson that I have learned is that the strategy of making only incremental improvements is a recipe for death, in the long term. The natural result is that you reach what you perceive to be the most optimal state of your project. Any change can only make it worse, so you stop moving.

This is what happened to Guile around version 1.8: we had taken the paradigm of the interpreter as language implementation strategy as far as it could go. There were only around 150 commits to Guile in 2007. We were stuck.

users stay unless pushed away

Inertial factor: interface

  • Source (API)

  • Binary (ABI)

  • Embedding (API)

  • CLI

  • ...

Ex: Python 3; local-eval; R6RS syntax; set!, set-car!

So how do we make change, in such a circumstance? You could start a new project, but then you wouldn't have any users. It would be nice to change and keep your users. Fortunately, it turns out that users don't really go away; yes, they trickle out if you don't do anything, but unless you change in an incompatible way, they stay with you, out of inertia.

Inertia is good and bad. It does conflict with minimalism as a principle; if you were to design Scheme in 2020, you would not include mutable variables or even mutable pairs. But they are still with us because if we removed them, we'd break too many users.

Users can even make you add back things that you had removed. In Guile 2.0, we removed the capability to evaluate an expression at run-time within the lexical environment of an expression, as we didn't know how to implement this outside an interpreter. It turns out this was so important to users that we had to add local-eval back to Guile, later in the 2.0 series. (Fortunately we were able to do it in a way that layered on lower-level facilities; this approach reconciled me to the solution.)

you can’t keep all users

What users say: don’t change or remove existing behavior

But: sometimes losing users is OK. Hard to know when, though

No change at all == death

  • Natural result of hill-climbing

Ex: psyntax; BDW-GC mark & finalize; compile-time; Unicode / locales

Unfortunately, the need to change means that sometimes you will lose users. It's either a dead project, or losing users.

In Guile 1.8, for example, the macro expander ran lazily: it would only expand code the first time it ran it. This was good for start-up time, because not all code is evaluated in the course of a simple script. Lazy expansion allowed us to start doing important work sooner. However, this approach caused immense pain to people that wanted "proper" Scheme macros that preserved lexical scoping; the state of the art was to eagerly expand an entire file. So we switched, and at the same time added a notion of compile-time. This compromise kept good start-up time while allowing fancy macros.

But eager expansion was a change. Users that relied on side effects from macro expansion would see them at compile-time instead of run-time. Users of old "defmacros" that could previously splice in live Scheme closures as literals in expanded source could no longer do that. I think it was the right choice but it did lose some users. In fact I just got another bug report related to this 10-year-old change last week.

every interface is a cost

Guile binary ABI: libguile.so; compiled Scheme files

Make compatibility easier: minimize interface

Ex: scm_sym_unquote, GOOPS, Go, Guix

So if you don't want to lose users, don't change any interface. The easiest way to do this is to minimize your interface surface. In Go, for example, they mostly haven't had dynamic-linking problems because that's not a thing they do: all code is statically linked into binaries. Similarly, Guix doesn't define a stable API, because all of its code is maintained in one "monorepo" that can develop in lock-step.

You always have some interfaces, though. For example Guix can't change its command-line interface from one day to the next, for example, because users would complain. But it's been surprising to me the extent to which Guile has interfaces that I didn't consider. Recently for example in the 3.0 release, we unexported some symbols by mistake. Users complained, so we're putting them back in now.

parallel installs for the win

Highly effective pattern for change

  • libguile-2.0.so

  • libguile-3.0.so

https://ometer.com/parallel.html

Changed ABI is new ABI; it should have a new name

Ex: make-struct/no-tail, GUILE_PKG([2.2]), libtool

So how does one do incompatible change? If "don't" isn't a sufficient answer, then parallel installs is a good strategy. For example in Guile, users don't have to upgrade to 3.0 until they are ready. Guile 2.2 happily installs in parallel with Guile 3.0.

As another small example, there's a function in Guile called make-struct (old doc link), whose first argument is the number of "tail" slots, followed by initializers for all slots (normal and "tail"). This tail feature is weird and I would like to remove it. Unfortunately I can't just remove the argument, so I had to make a new function, make-struct/no-tail, which exists in parallel with the old version that I can't break.

deprecation facilitates migration

__attribute__ ((__deprecated__))
(issue-deprecation-warning
 "(ice-9 mapping) is deprecated."
 "  Use srfi-69 or rnrs hash tables instead.")
scm_c_issue_deprecation_warning
  ("Arbiters are deprecated.  "
   "Use mutexes or atomic variables instead.");

begin-deprecated, SCM_ENABLE_DEPRECATED

Fortunately there is a way to encourage users to migrate from old interfaces to new ones: deprecation. In Guile this applies to all of our interfaces (binary, source, etc). If a feature is marked as deprecated, we cause its use to issue a warning, ideally at compile-time when users responsible for the package can fix it. You can even add __attribute__((__deprecated__)) on C types!
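
As a small C-level illustration of that last point (standard GCC/Clang attribute syntax; the names below are made up for the example, not Guile's actual declarations):

// Hypothetical declarations, only to show the attribute usage: callers of the
// old names get a warning at compile time, when the person responsible for
// the package can most easily fix it.
__attribute__((__deprecated__("use the_new_api() instead")))
void the_old_api(void);

// The attribute also works on types, as mentioned above.
typedef unsigned char old_u8_type __attribute__((__deprecated__(
    "use uint8_t from <stdint.h> instead")));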

the arch-pattern

Replace, Deprecate, Remove

All change is possible; question is only length of deprecation period

Applies to all interfaces

Guile deprecation period generally one stable series

Ex: scm_t_uint8; make-struct; Foreign objects; uniform vectors

Finally, you end up in a situation where you have replaced the old interface and issued deprecation warnings to help users migrate. The next step is to remove the old interface. If you don't do this, you are failing as a project maintainer -- your project becomes literally unmaintainable as it just grows and grows.

This strategy applies to all changes. The deprecation period may last a while, and it may be that the replacement you built doesn't serve the purpose. There is still a dialog with the users that needs to happen. As an example, I made a replacement for the "SMOB" facility in Guile that allows users to define new types, backed by C interfaces. This new "foreign object" facility might not actually be good enough to replace SMOBs; since I haven't formally deprecated SMOBs, I don't know yet because users are still using the old thing!

change produces a new stable point

Stability within series: only additions

Corollary: dependencies must be at least as stable as you!

  • for your definition of stable

  • social norms help (GNU, semver)

Ex: libtool; unistring; gnulib

In my experience, the old management dictum that "the only constant is change" does not describe software. Guile changes, then it becomes stable for a while. You need an unstable series to escape hill-climbing, then once you've found your new hill, you start climbing again in the stable series.

Once you reach your stable point, the projects you rely on need to exhibit the same degree of stability that you envision for your project. You can't build a web site that you expect to maintain for 10 years on technology that fundamentally changes every 6 months. But stable dependencies isn't something you can ensure technically; rather it relies on social norms of who makes the software you use.

who can crank the motor of history?

All libraries define languages

Allow user to evolve the language

  • User functionality: modules (Guix)

  • User syntax: macros (yay Scheme)

Guile 1.8 perf created tension

  • incorporate code into Guile

  • large C interface “for speed”

Compiler removed pressure on C ABI

Empowered users need less from you

A dialectic process does not progress on its own: it requires actions. As a project maintainer, some of my actions are because I want to do them. Others are because users want me to do them. The user-driven actions are generally a burden and as a lazy maintainer, I want to minimize them.

Here I think Guile has to a large degree escaped some of the pressures that weigh on other languages, for example Python. Because Scheme allows users to define language features that exist on par with "built-in" features, users don't need my approval or intervention to add (say) new syntax to the language they work in. Furthermore, their work can still compose with the work of others, even if the others don't buy in to their language extensions.

Still, Guile 1.8 did have a dynamic whereby the relatively poor performance of having to run all code through primitive-eval meant that users were pushed towards writing extensions in C. This in turn pushed Guile to expose all of its guts for access from C, which obviously has led to an overbloated C API and ABI. Happily the work on the Scheme compiler has mostly relieved this pressure, and we may therefore be able to trim the size of the C API and ABI over time.

contributions and risk

From maintenance point of view, all interface is legacy

Guile: Sometimes OK to accept user modules when they are more stable than Guile

In-tree users keep you honest

Ex: SSAX, fibers, SRFI

It can be a good strategy to "sediment" solutions to common use cases into Guile itself. This can improve the minimalism of an entire ecosystem of code. The maintenance burden has to be minimal, however; Guile has sometimes adopted experimental code into its repository, and without active maintenance, it soon becomes stale relative to what users and the module maintainers expect.

I would note an interesting effect: pieces of code that were adopted into Guile become a snapshot of the coding style at that time. It's useful to have some in-tree users because it gives you a better idea about how a project is seen from the outside, from a code perspective.

sticky bits

Memory management is an ongoing thorn

Local maximum: Boehm-Demers-Weiser conservative collector

How to get to precise, generational GC?

Not just Guile; e.g. CPython __del__

There are some points that resist change. The stickiest of these is the representation of heap-allocated Scheme objects in C. Guile currently uses a garbage collector that "automatically" finds all live Scheme values on the C stack and in registers. It was the right choice at the time, given our maintenance budget. But to get the next bump in performance, we need to switch to a generational garbage collector. It's hard to do that without a lot of pain to C users, essentially because the C language is too weak to express the patterns that we would need. I don't know how to proceed.

I would note, though, that memory management is a kind of cross-cutting interface, and that it's not just Guile that's having problems changing; I understand PyPy has had a lot of problems regarding changes on when Python destructors get called due to its switch from reference counting to a proper GC.

future

We are here: stability

And then?

  • Parallel-installability for source languages: #lang

  • Sediment idioms from Racket to evolve Guile user base

Remove myself from “holding the crank”

So where are we going? Nowhere, for the moment; or rather, up the hill. We just released Guile 3.0, so let's just appreciate that for the time being.

But as far as next steps in language evolution, I think in the short term they are essentially to further enable change while further sedimenting good practices into Guile. On the change side, we need parallel installability for entire languages. Racket did a great job facilitating this with #lang and we should just adopt that.

As for sedimentation, we should step back and see whether any common Guile use patterns built by our users should be included in core Guile, and we should widen our gaze to Racket as well. It will take some effort, both from a technical perspective and in reaching a social/emotional consensus about how much change is good and how bold versus conservative to be: putting the dialog into dialectic.

dialectic, boogie woogie woogie

https://gnu.org/s/guile

https://wingolog.org/

#guile on freenode

@andywingo

wingo@igalia.com

Happy hacking!

Hey that was the talk! Hope you enjoyed the writeup. Again, video and slides available on the FOSDEM web site. Happy hacking!

by Andy Wingo at February 07, 2020 11:38 AM

January 10, 2020

Guillaume Desmottes: Rust/GStreamer paid internship at Collabora

Collabora is offering various paid internship positions for 2020. We have a nice range of very cool projects involving kernel work, Panfrost, Monado, etc.

I'll be mentoring a GStreamer project aiming to write a Chromecast sink element in Rust. It would be a great addition to GStreamer and would give the student a chance to learn about our favorite multimedia framework but also about bindings between C GObject code and Rust.

So if you’re interested, don’t hesitate to apply, or contact me if you have any questions.

by Guillaume Desmottes at January 10, 2020 11:06 AM

January 08, 2020

Víctor Jáquez: GStreamer-VAAPI 1.16 and libva 2.6 in Debian

Debian has migrated libva 2.6 into testing. This release includes a pull request that changes how the drivers are selected to be loaded and used. As the pull request mentions:

libva will try to load iHD firstly, if it failed. then it will load i965.

Also, Debian testing has imported that iHD driver in two flavors: intel-media-driver and intel-media-driver-non-free. So basically the iHD driver is now the main VAAPI driver for Intel platforms, though it only supports the newer chips; the older ones still require i965-va-driver.

Sadly, for the current GStreamer-VAAPI stable release, the iHD driver is not included in its driver white list. This will pose a problem for users that have installed either of the intel-media-driver packages, because, by default, that driver is ignored and the VAAPI GStreamer elements won’t be registered.

There are three temporary workarounds (mutually exclusive) for those users (updated):

  1. Uninstall intel-media-driver* and install (or keep) the old i965-va-driver-shaders/i965-va-driver.
  2. Export LIBVA_DRIVER_NAME=i965 by default in your session. Normally this is done by adding the export to your $HOME/.profile file. This environment variable will force libva to load the i965 driver.
  3. And finally, export GST_VAAPI_ALL_DRIVERS=1 by default in your session. This is not advised, since many applications, such as Epiphany, might fail.

We prefer not to include iHD in the stable white list because most of the work on that driver has occurred after the 1.16 release.

In the case of the GStreamer-VAAPI master branch (actively in development) we have merged iHD into the white list, since the Intel team has been working a lot to make it work. This change will be released with GStreamer 1.18.

by vjaquez at January 08, 2020 04:36 PM

December 21, 2019

Sebastian Pölsterl: scikit-survival 0.11 featuring Random Survival Forests released

Today, I released a new version of scikit-survival which includes an implementation of Random Survival Forests. As with its popular counterparts for classification and regression, a Random Survival Forest is an ensemble of tree-based learners. A Random Survival Forest ensures that individual trees are de-correlated by 1) building each tree on a different bootstrap sample of the original training data, and 2) at each node, only evaluating the split criterion for a randomly selected subset of features and thresholds. Predictions are formed by aggregating predictions of individual trees in the ensemble.

For a full list of changes in scikit-survival 0.11, please see the release notes.

The latest version can be downloaded via conda or pip. Pre-built conda packages are available for Linux, OSX and Windows via

 conda install -c sebp scikit-survival

Alternatively, scikit-survival can be installed from source via pip:

 pip install -U scikit-survival

Using Random Survival Forests

To demonstrate Random Survival Forest, I’m going to use data from the German Breast Cancer Study Group (GBSG-2) on the treatment of node-positive breast cancer patients. It contains data on 686 women and 8 prognostic factors:

  1. age,
  2. estrogen receptor (estrec),
  3. whether or not a hormonal therapy was administered (horTh),
  4. menopausal status (menostat),
  5. number of positive lymph nodes (pnodes),
  6. progesterone receptor (progrec),
  7. tumor size (tsize),
  8. tumor grade (tgrade).

The goal is to predict recurrence-free survival time.

The code to reproduce the results below is available in this notebook.

First, we need to load the data and transform it into numeric values.

import numpy as np
from sklearn.preprocessing import OrdinalEncoder
from sksurv.datasets import load_gbsg2
from sksurv.preprocessing import OneHotEncoder

X, y = load_gbsg2()
grade_str = X.loc[:, "tgrade"].astype(object).values[:, np.newaxis]
grade_num = OrdinalEncoder(categories=[["I", "II", "III"]]).fit_transform(grade_str)
X_no_grade = X.drop("tgrade", axis=1)
Xt = OneHotEncoder().fit_transform(X_no_grade)
Xt = np.column_stack((Xt.values, grade_num))
feature_names = X_no_grade.columns.tolist() + ["tgrade"]

Next, the data is split into 75% for training and 25% for testing so we can determine how well our model generalizes.

from sklearn.model_selection import train_test_split
random_state = 0  # any fixed seed works; the exact value used in the notebook is not shown here
X_train, X_test, y_train, y_test = train_test_split(
    Xt, y, test_size=0.25, random_state=random_state)

Training

Several split criteria have been proposed in the past, but the most widespread one is based on the log-rank test, which you probably know from comparing survival curves among two or more groups. Using the training data, we fit a Random Survival Forest comprising 1000 trees.

from sksurv.ensemble import RandomSurvivalForest

rsf = RandomSurvivalForest(n_estimators=1000,
                           min_samples_split=10,
                           min_samples_leaf=15,
                           max_features="sqrt",
                           n_jobs=-1,
                           random_state=random_state)
rsf.fit(X_train, y_train)

We can check how well the model performs by evaluating it on the test data.

rsf.score(X_test, y_test)

This gives a concordance index of 0.68, which is a good value and matches the results reported in the Random Survival Forests paper.

Predicting

For prediction, a sample is dropped down each tree in the forest until it reaches a terminal node. Data in each terminal node is used to non-parametrically estimate the survival and cumulative hazard function using the Kaplan-Meier and Nelson-Aalen estimator, respectively. In addition, a risk score can be computed that represents the expected number of events for one particular terminal node. The ensemble prediction is simply the average across all trees in the forest.

Let’s first select a couple of patients from the test data according to the number of positive lymph nodes and age.

import pandas as pd

a = np.empty(X_test.shape[0], dtype=[("age", float), ("pnodes", float)])
a["age"] = X_test[:, 0]
a["pnodes"] = X_test[:, 4]
sort_idx = np.argsort(a, order=["pnodes", "age"])
X_test_sel = pd.DataFrame(
    X_test[np.concatenate((sort_idx[:3], sort_idx[-3:]))],
    columns=feature_names)
     age  estrec  horTh  menostat  pnodes  progrec  tsize  tgrade
0   33.0     0.0    0.0       0.0     1.0     26.0   35.0     2.0
1   34.0    37.0    0.0       0.0     1.0      0.0   40.0     2.0
2   36.0    14.0    0.0       0.0     1.0     76.0   36.0     1.0
3   65.0    64.0    0.0       1.0    26.0      2.0   70.0     2.0
4   80.0    59.0    0.0       1.0    30.0      0.0   39.0     1.0
5   72.0  1091.0    1.0       1.0    36.0      2.0   34.0     2.0

The predicted risk scores indicate that risk for the last three patients is quite a bit higher than that of the first three patients.

pd.Series(rsf.predict(X_test_sel))
0 91.477609
1 102.897552
2 75.883786
3 170.502092
4 171.210066
5 148.691835
dtype: float64

We can have a more detailed insight by considering the predicted survival function. It shows that the biggest difference occurs roughly within the first 750 days.

import matplotlib.pyplot as plt

surv = rsf.predict_survival_function(X_test_sel)
for i, s in enumerate(surv):
    plt.step(rsf.event_times_, s, where="post", label=str(i))
plt.ylabel("Survival probability")
plt.xlabel("Time in days")
plt.grid(True)
plt.legend()

Alternatively, we can also plot the predicted cumulative hazard function.

surv = rsf.predict_cumulative_hazard_function(X_test_sel)
for i, s in enumerate(surv):
    plt.step(rsf.event_times_, s, where="post", label=str(i))
plt.ylabel("Cumulative hazard")
plt.xlabel("Time in days")
plt.grid(True)
plt.legend()

Permutation-based Feature Importance

The implementation is based on scikit-learn’s Random Forest implementation and inherits many features, such as building trees in parallel. What’s currently missing is feature importances via the feature_importances_ attribute. This is due to the way scikit-learn’s implementation computes importances. It relies on a measure of impurity for each child node, and defines importance as the amount of decrease in impurity due to a split. For traditional regression, impurity would be measured by the variance, but for survival analysis there is no per-node impurity measure due to censoring. Instead, one could use the magnitude of the log-rank test statistic as an importance measure, but scikit-learn’s implementation doesn’t seem to allow this.

Fortunately, this is not a big concern though, as scikit-learn’s definition of feature importance is non-standard and differs from what Leo Breiman proposed in the original Random Forest paper. Instead, we can use permutation to estimate feature importance, which is preferred over scikit-learn’s definition. This is implemented in the ELI5 library, which is fully compatible with scikit-survival.

import eli5
from eli5.sklearn import PermutationImportance
perm = PermutationImportance(rsf, n_iter=15, random_state=random_state)
perm.fit(X_test, y_test)
eli5.show_weights(perm, feature_names=feature_names)
Weight Feature
0.0676 ± 0.0229 pnodes
0.0206 ± 0.0139 age
0.0177 ± 0.0468 progrec
0.0086 ± 0.0098 horTh
0.0032 ± 0.0198 tsize
0.0032 ± 0.0060 tgrade
-0.0007 ± 0.0018 menostat
-0.0063 ± 0.0207 estrec

The result shows that the number of positive lymph nodes (pnodes) is by far the most important feature. If its relationship to survival time is removed (by random shuffling), the concordance index on the test data drops on average by 0.0676 points. Again, this agrees with the results from the original Random Survival Forests paper.

December 21, 2019 04:46 PM

December 18, 2019

GStreamer: GStreamer Rust bindings 0.15.0 release

(GStreamer)

A new version of the GStreamer Rust bindings, 0.15.0, was released.

As usual this release follows the latest gtk-rs release, and a new version of the GStreamer plugins written in Rust was also released.

This new version features a lot of newly bound API for creating subclasses of various GStreamer types: GstPreset, GstTagSetter, GstClock, GstSystemClock, GstAudioSink, GstAudioSrc, GstDevice, GstDeviceProvider, GstAudioDecoder and GstAudioEncoder.

In addition to that, a lot of bugfixes and further API improvements have happened over the last few months that should make development of GStreamer applications or plugins in Rust as convenient as possible.

A new release of the GStreamer Rust plugins will follow in the next days.

Details can be found in the release notes for gstreamer-rs and gstreamer-rs-sys.

The code and documentation for the bindings is available on the freedesktop.org GitLab, as well as on crates.io.

If you find any bugs, missing features or other issues please report them in GitLab.

December 18, 2019 05:00 PM

December 17, 2019

Bastien Nocera: GMemoryMonitor (low-memory-monitor, 2nd phase)

(Bastien Nocera) TL;DR

Use GMemoryMonitor in glib 2.63.3 and newer in your applications to lower overall memory usage, and detect low memory conditions.

low-memory-monitor

To start with, let's come back to low-memory-monitor, announced at the end of August.

It’s not really a “low memory monitor”. I know, the name is deceiving, but it actually monitors memory pressure stalls, that is, how hard it is for the kernel to allocate memory when applications need it. The more memory pressure there is, the longer the kernel takes to allocate memory, usually because it needs to move memory around to make room for a big allocation, for example when an application starts up, or prepares an in-memory buffer for saving.
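For the curious, the kernel exposes this pressure information through the PSI files under /proc/pressure/. The following is a minimal sketch of my own (not code from low-memory-monitor) that just dumps /proc/pressure/memory; low-memory-monitor itself goes further and uses the kernel 5.2 trigger interface, writing a threshold to that file and polling it for notifications.

/* Minimal sketch: dump the current memory pressure stall information.
 * /proc/pressure/memory contains lines such as:
 *   some avg10=0.00 avg60=0.00 avg300=0.00 total=0
 *   full avg10=0.00 avg60=0.00 avg300=0.00 total=0
 * (low-memory-monitor instead registers a PSI trigger and poll()s the
 * file, which is what requires Linux 5.2.) */
#include <stdio.h>

int
main (void)
{
  char line[256];
  FILE *f = fopen ("/proc/pressure/memory", "r");

  if (f == NULL)
    {
      perror ("/proc/pressure/memory (no PSI support?)");
      return 1;
    }

  while (fgets (line, sizeof line, f))
    fputs (line, stdout);

  fclose (f);
  return 0;
}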

It is not a daemon that will kill programs on low memory. It's not a user-space out-of-memory killer, and does not take those policy decisions. It can however be configured to ask the kernel to do that. The kernel doesn't really know what it's doing though, and user-space isn't helping either, so best disable that for now...

As listed in low-memory-monitor's README (and in the announcement post), there were a number of similar projects around, but none that would offer everything we needed, e.g.:
  • Has a D-Bus interface to propagate low memory conditions
  • Requires Linux 5.2's kernel memory pressure stalls information (Android's lowmemorykiller daemon has loads of code to get the same information from the kernel for older versions, and it really is quite a lot of code)
  • Written in a compiled language to save on startup/memory usage costs (around 500 lines of C code, as counted by sloccount)
  • Built-in policy, based upon values used in Android and Endless OS
GMemoryMonitor

Next up, in our effort to limit memory usage, we'll need some help from applications. That's where GMemoryMonitor comes in. It's simple enough: listen to the low-memory-warning signal and, when you receive it, free some image thumbnails or index caches, or dump some data to disk.

The signal also gives you a “warning level”, with 255 being the point at which low-memory-monitor would trigger the kernel's OOM killer, and lower values indicating different levels of “try to be a good citizen”.
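To make that concrete, here is a minimal C sketch of my own (not taken from the glib documentation); drop_caches() is a hypothetical stand-in for whatever your application can afford to throw away:

#include <gio/gio.h>

/* Hypothetical callback: free thumbnails, caches, etc., more aggressively
 * the higher the warning level (LOW = 50, MEDIUM = 100, CRITICAL = 255). */
static void
drop_caches (GMemoryMonitor            *monitor,
             GMemoryMonitorWarningLevel level,
             gpointer                   user_data)
{
  g_message ("Memory pressure warning, level %d: freeing caches", level);
  /* ... free image thumbnails, index caches, dump data to disk ... */
}

int
main (void)
{
  GMainLoop *loop = g_main_loop_new (NULL, FALSE);
  GMemoryMonitor *monitor = g_memory_monitor_dup_default ();

  g_signal_connect (monitor, "low-memory-warning",
                    G_CALLBACK (drop_caches), NULL);

  g_main_loop_run (loop);

  g_object_unref (monitor);
  g_main_loop_unref (loop);
  return 0;
}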

The more astute amongst you will have noticed that low-memory-monitor runs as root, on the system bus, and wonder how those newfangled (5 years old today!) sandboxed applications would receive those signals. Fear not! Support for a portal version of GMemoryMonitor landed in xdg-desktop-portal on the same day as in glib. Everything is tied together with installed tests that use the real xdg-desktop-portal to test the portal and unsandboxed versions.

How about an OOM killer?

By using memory pressure stall information, we receive information about the state of the kernel before getting into swapping that'd cause the machine to become unusable. This also means that, as our threshold for keeping everything ticking is low, if we were to kill high memory consumers, we'd get a butter smooth desktop, but, based on my personal experience, your browser and your mail client would take it in turns disappearing from your desktop in a way that you wouldn't even notice.

We'll definitely need to think about our next step in application state management, and changing our running applications paradigm.

Distributions should definitely disable the OOM killer for now, and possibly try their hands at upstreaming some systemd OOMPolicy and OOMScoreAdjust options for system daemons.

Conclusion

Creating low-memory-monitor was easy enough; getting everything else in place was decidedly more complicated. In addition to requiring changes to glib, xdg-desktop-portal and python-dbusmock, it also required a lot of work on the glib CI to save me from having to write integration tests in C that would have required a lot of scaffolding. So thanks to all involved, in particular Philip Withnall for his patience reviewing my changes.

by Bastien Nocera (noreply@blogger.com) at December 17, 2019 11:53 PM

December 15, 2019

Gustavo Orrillo: Special collections’ design process

The visualization we developed for the Network of Libraries of the Bank of the Republic of Colombia is a web tool that allows users to create their own search paths through the special documents and collections available in the libraries. This project took around 6 months of work, from initial research and sketching to the final product that is currently in use. It gave us a unique opportunity to apply novel frameworks and technologies for web development such as p5.js to make the rich cultural heritage deposited at the Network of Libraries more easily accessible to a wide range of users, from occasional visitors to expert researchers. This post shows some of the visual materials and concepts that inspired the tool, as well as design sketches and prototypes.

Early UI sketch

Background

The Network of Libraries of the Bank of the Republic is the depository of more than thirty-five historical archives that constitute a primary source for the reconstruction of the history of Colombia. Researchers and historians can consult most of these archives in the Room of Rare Books and Manuscripts of the Luis Ángel Arango Library; however, some of these archives are available at other cultural centers of the Bank of the Republic around the country.

When we started working on this project, several of the materials were available in digital form through a web portal where the information was organized in various thematic groupings and by document types. Analysis of this portal revealed the following features:

  • Most of the data had already been digitized
  • A fraction of the contents were available through a “pre-made” timeline visualization implemented with an existing javascript library.
  • Another part of the contents were available through different interfaces (list, maps, etc)
  • There was little connection between all the contents, each topic had to be visualized within its own separate page

The major aims of the interactive visualization to be developed were the following:

  • to give more holistic access to the contents of the site
  • to emphasize relationships between separate themes and types of materials, and facilitate finding contents and understanding the context of these contents

Time-based data

As the special documents and collections had a strong temporal dimension, and timelines were used in the original version of the website, we started exploring timeline-based visualizations of the data. The slideshow below contains some previous projects we considered in our research:

  • A timeline of history - http://histography.io/
  • World Digital Library Timelines - https://www.wdl.org/en/timelines/
  • Interactive Blog Calendar - https://eagereyes.org/blog-calendar
  • Summit on the Summit Annual Reports - https://fathom.info/reports

Visualizations using the timeline metaphor.

Two issues with the use of timelines to visualize the collections from the Network of Libraries were that the data is sparsely populated, and that some items could cover a very wide time range, for example 50 years or even more. So the problem became how to handle such a large interval properly in a traditional timeline. Because of this issue, we looked into more dynamic approaches. The interactive timelines in the New Cooper Hewitt Experience at the Smithsonian Cooper Hewitt Museum in New York:

and the Timeline of Modern Art at Tate Modern in London:

are great examples where museums’ collections form a flowing stream where visitors can select and manipulate the items they find interesting.

Search processes

Another concept that we considered early on was that of serendipitous search, as suggested by the following picture of a library user browsing the shelves:

Browsing the shelves

To us, this idea of serendipitous search resonated with Psychogeography, the term coined in the 1950s by the Situationists and denoting the “exploration of urban environments that emphasizes playfulness and drifting”. Can we transfer this practice into the visualization of cultural materials to create some kind of “Librageography” or “Bibliogeografía”, which prioritizes playful search? Some visual inspiration related to these concepts:

In particular, the psychogeographic concept of drifting led to the idea of a visual exploration where each user constructs their own map of the data by wandering through the connections between the data elements, defined by the common tags shared by the collections. The following early sketch offers a glimpse of these schemes:

First hand-drawn sketch

While a drifting navigation through the collections could provide an engaging and playful experience for ocassional users, the visualization should also offer tools for a more directed search that advanced users may need when researching the data or looking for specific information. A first mockup of the web viewer incorporates such tools:

First hand-drawn sketch

Refining the design

Once we identified an initial visual metaphor and navigation mechanism, it was the time to start iterating over the early sketches while discussing the progress with the team from the Virtual Library of the Bank of the Republic, who was in charge of the project.

The prototype also supported mobile phones from the beginning, as it was very important that the web visualizer be accessible on both desktop and mobile:

Mobile prototype

In parallel to the design of the main visualization screen, we also started sketching the intro screen. This is also very important, as the intro screen is the entry point to the visualization, and it may dissuade users, especially newcomers, from staying on the page and exploring the data:

Sketches of intro screen

We developed a working prototype based on those initial designs and sketches, which allowed us to test the basic user flows and iterate the designs to obtain feedback quickly:

Once we agreed on the overall modes of introduction, presentation and interaction, we started refining the visual appearance of the prototype through color, textures, text, animation, and mutual relationships of all the elements:

Work on the UI was also ongoing at this stage:

After a few additional rounds of design iteration, the color palette and UI improved significantly, and at that stage we were approaching the final version, with only a few minor tweaks left:

Responsive design

With a great majority of users navigating the web on their phones, a big priority for us was to ensure that the viewer worked well on mobile browsers. In the end, we were able to keep the functionality and appearance consistent across desktop, iOS, and Android, while dealing with the differences in screen real estate and interaction modalities (i.e.: touch vs mouse):

Conclusions

The Virtual Library of the Bank of the Republic released the new Special Documents and Collections portal, including the interactive viewer, in November of 2019. We are very satisfied with the design process described here as well as with the final product. We believe that we created a useful tool for cultural promotion and research in the humanities, and we hope to gain insight on the user engagement with the viewer based on the analytics collected by the portal.

by Andres Colubri at December 15, 2019 04:00 PM

December 13, 2019

Bastien Nocera: Dual-GPU support follow-up: NVIDIA driver support

(Bastien Nocera) If you remember, back in 2016, I did the work to get a “Launch on Discrete GPU” menu item added to applications in gnome-shell.

This cycle I worked on adding support for the NVIDIA proprietary driver, so that the menu item shows up, and the right environment variables are used to launch applications on that device.

Tested with another unsupported device...


Behind the scenes

There were a number of problems with the old detection code in switcheroo-control:
- it required the graphics card to use vga_switcheroo in the kernel, which the NVIDIA driver didn't do
- it couldn’t support more than 2 GPUs
- and it didn’t really know which GPU was going to be the “main” one

And, on top of all that, gnome-shell expected the Mesa OpenGL stack to be used, so it only knew the right environment variables to do that, and only for one secondary GPU.

So we've extended switcheroo-control and its API to do all this.

(As a side note, commenters asked me about the KDE support, and how it would integrate, and it turns out that KDE's code just checks for the presence of a file in /sys, which is only present when vga_switcheroo is used. So I would encourage KDE to adopt the switcheroo-control D-Bus API for this)

Closing

All this will be available in Fedora 32, using GNOME 3.36 and switcheroo-control 2.0. We might backport this to Fedora 31 after it's been tested, and if there is enough interest.

by Bastien Nocera (noreply@blogger.com) at December 13, 2019 04:15 PM

December 08, 2019

Phil Normand: HTML overlays with GstWPE, the demo

(Phil Normand)

Once again this year I attended the GStreamer conference and just before that, Embedded Linux conference Europe which took place in Lyon (France). Both events were a good opportunity to demo one of the use-cases I have in mind for GstWPE, HTML overlays!

As we, at Igalia, usually have a …

by Philippe Normand at December 08, 2019 02:00 PM

December 03, 2019

GStreamer: GStreamer 1.16.2 stable bug fix release

(GStreamer)

The GStreamer team is pleased to announce the second bug fix release in the stable 1.16 release series of your favourite cross-platform multimedia framework!

This release only contains bugfixes and it should be safe to update from 1.16.x.

See /releases/1.16/ for the details.

Binaries for Android, iOS, Mac OS X and Windows will be available shortly.

Download tarballs directly here: gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx.

December 03, 2019 08:00 PM

November 13, 2019

Sebastian Dröge: The GTK Rust bindings are not ready yet? Yes they are!

(Sebastian Dröge)

When talking to various people at conferences in the last year, a recurring topic was that they believed that the GTK Rust bindings are not ready for use yet.

I don’t know where that perception comes from but if it was true, there wouldn’t have been applications like Fractal, Podcasts or Shortwave using GTK from Rust, or I wouldn’t be able to do a workshop about desktop application development in Rust with GTK and GStreamer at the Linux Application Summit in Barcelona this Friday (code can be found here already) or earlier this year at GUADEC.

One reason I sometimes hear is that there is no support for creating subclasses of GTK types in Rust yet. While that was true in the past, it is not anymore. But even more importantly: unless you want to create your own special widgets, you don’t need that. Many examples and tutorials in other languages make use of inheritance/subclassing for the applications’ architecture, but that’s because it is the idiomatic pattern in those languages. However, in Rust other patterns are more idiomatic, and even for those examples and tutorials in other languages it wouldn’t be the one and only option for designing applications.

Almost everything is included in the bindings at this point, so seriously consider writing your next GTK UI application in Rust. While some minor features are still missing from the bindings, none of those should prevent you from successfully writing your application.

And if something is actually missing for your use-case or something is not working as expected, please let us know. We’d be happy to make your life easier!

P.S.

Some people are already experimenting with new UI development patterns on top of the GTK Rust bindings. So if you want to try developing a UI application but want to try something different from the usual signal/callback spaghetti code, also take a look at those.

by slomo at November 13, 2019 03:02 PM

November 06, 2019

GStreamer: GStreamer Conference 2019 talk recordings online

(GStreamer)

Thanks to our partners at Ubicast, the recordings of this year's GStreamer Conference talks are now available online.

You can view or download the GStreamer Conference 2019 videos here.

Talks:

Lightning Talks:

November 06, 2019 02:00 PM

November 01, 2019

Jean-François Fortin Tam: Survey: making Getting Things GNOME sustainable as a productivity app for public good

Now that you’ve been introduced to the overall concept of Getting Things Done with the video in my previous blog post, let me show you the secret weapon of chaos warriors who want to follow that methodology with a digital tool they can truly own.

Your secret weapon: “Getting Things GNOME”

Getting Things GNOME” is a native GNOME desktop application that is entirely free and open-source (licensed under the GPL) and runs locally on your computer.

Here’s what it looks like on my computer (and yes, I do have over 600 actionable tasks at all times lately):

GTG 0.3.1 showing my personal task list

Getting Things GNOME is one of the most well-made productivity apps of the decade. It has:

  • A mature codebase with almost a decade of work that has been put into it
  • A nice GTK interface (GTK2 in the stable version, GTK3 in the development version), with a flexible free-form text editor for handling inline @tags, and extended descriptions/notes below the title
  • The ability to quickly defer tasks (ex: “do it tomorrow”, “do it next week”, “do it next month”, etc.)
  • Natural language parsing (you can tag things while you type the title, and you can use dates such as “Tomorrow”, “Thursday” in addition to the standard “YYYY-MM-DD”)
  • “Work view” mode for displaying only “actionable” tasks so you can focus on your work
  • The notion of sub-tasks and dependencies
  • Tags, which can also be made hierarchical (ex: “@phone” can be a child of “@work”) and can have colors and icons
  • Search and “saved searches”
  • Plugins!
  • Works offline, owned by you with no restrictions, no licensing B.S., software “by the people for the people”

This is what the Git (development) version of GTG’s UI currently looks like, showing my personal tasks (it needs some improvement, but it’s already impressive):

My tasks with the “kusanagi” tag, as shown by the development version’s GTK3 UI.

Here are some more (old) screenshots:

Historical context

Unfortunately, the previous maintainers never developed a business model to sustain development, and, after a long period of development hell, abandoned the project as they moved on to get various jobs.

This is understandable and I can’t fault them… however, I still need a working and maintained application that will continue working throughout the years as our technological landscape evolves (the Linux platform is pretty ruthless on that front). Otherwise, you risk having your Linux distribution suddenly stop packaging the app you dearly depend on.

If anyone was wondering if all Free and Open-Source software can continue as a side-project forever, this is a prime case study. This is what happens when software goes unmaintained.

Meanwhile, in the world of proprietary software and software-as-a-service, countless people are paying a big one-time, monthly or yearly licensing fee to use to-do software applications that they don’t own, and for which they can’t be 110% certain that the software respects their digital rights. Depending on these raises fundamental questions when it’s something as long-lived and all-encompassing as a personal productivity system: Can you trust that cloud service? Or that this proprietary app isn’t profiling what’s going on in your OS, or uploading your data if it sees certain key words? That the licensing fee won’t increase? That the whole thing won’t change when the parent company becomes part of a merger, acquisition or bankruptcy? That the app will keep functioning throughout the years as you upgrade your operating systems, that it will work on more than one or two “authorized” computers?

All these questions are relevant when you’re depending on proprietary software. And when you’re depending on an online/cloud service, you’re always at the mercy of that service eventually shutting down (or being acquired, merged, etc.). There are numerous examples of that happening, including this, this, this, this and that, to name only a few.

This is not The End?

It doesn’t have to be that way.

What if we could bring “GTG” back from the dead, finish the job, and sustain it forever more so that we can all benefit from it?

There are two ways this can be done (which are not mutually exclusive, I might add):

  • From the existing userbase, reform a completely new team of software developers that will be willing to come together and “finish the job” to get a new release out of the door. I have obtained administration rights to the GitHub repository so I can, in practice, play “project manager” and help volunteers get on board. We could have a status analysis and brainstorming session to establish a roadmap with the shortest path to releasability, and beyond, and then work together for the months and years to come to accomplish our goals
  • Someone with enough freedom and time (it could be me, it could be someone else, it could be multiple people) gets paid to be day-to-day maintainer(s), the one(s) doing the majority of the core “boring” work to get everything back into shape and mentoring the part-time contributors who contribute opportunistic (“drive-by”) patches and merge requests.

The two options are viable, but require enough public interest to happen. This is why I’m making a survey, linked below, to evaluate the potential for such an initiative. Are you interested in GTG coming back in full force? Whatever your answer, let me know through this 5-minute survey (November 24th update: the survey is now closed, thanks for participating!). Your input was very much appreciated:

  • It is much more efficient to run this survey (and share results here) than to spend months preparing a campaign only to find out that there is or isn’t enough demand.
  • It also lets me raise awareness and hopefully assemble a team of new contributors (because you don’t just make them appear out of thin air); if enough volunteers show up (with a lot of time and passion to share) we can get started without much delay, get together and create momentum.

Help me evaluate how we can bring it back to life, with this 5-minute survey

The survey can be found here (November 24th update: the survey is now closed, thanks for participating!)

Filling this survey should take 4-6 minutes at most.

Let’s be clear: I’m not doing this for myself (just grabbing a proprietary app package is much easier and would let me move on to MUCH more lucrative opportunities), I would be doing this for the greater public good, because it breaks my heart to think that GTG would die when it’s such a great piece of software.

There is no sane FLOSS native desktop alternative for Linux users, and open-source software should be worth more money than proprietary software, not less: you are getting better value out of it, with an implicit guarantee that the software respects your rights and privacy, and that it will remain available forever as long as there is someone on the planet willing to maintain it.

On the other hand, spending time creating software costs money; the alternative is not caring and pursuing a lucrative career, so the software remains unmaintained and everybody loses. So I need to know that nursing GTG back to health would be worth the effort, that the application would be used by many (not just a handful) of people around the world. I seek “meaningful” work.

Help me determine if this is worth my (or anyone’s) time by filling the survey today, and please share it with those around you, and elsewhere on the interwebs. Thanks!

The post Survey: making Getting Things GNOME sustainable as a productivity app for public good appeared first on The Open Sourcerer.

by Jeff at November 01, 2019 12:30 PM

October 30, 2019

Jean-François Fortin Tam: The goldsmith and the chaos warrior: a typology of workers

As I’ve spent a number of years working for various organizations, big and small, with different types of collaborators and staffers, I’ve devised a simple typology of workers that can help explain the various levels of success, self-organization, productivity and stress of those workers, depending on whether there is a fit between their work type and their work processes. This is one of the many typologies I use to describe human behavior, and I haven’t spent years and a Ph.D. thesis devising this, this is just some down-to-earth reflections I’ve had. Without much further ado, here’s what I’ve come up with so far.


The first type of worker is what I call the “chaos warrior”: this includes the busy managers, professional event organizers, executives, deal-with-everything assistants, researchers, freelancers or contractors.

In my view, “chaos warriors” are the types of workers who—from a systemic point of view—have to deal with constantly changing environments and demands, time-based deadlines, dependencies on other people or materials, multiple parallel projects, etc.

  • Chaos warriors, in their natural state, very rarely have the luxury of single-threaded work and interruption-free environments (though those certainly would be welcome, and chaos warriors sometimes have to naturally retreat to external “think spaces” to get foundational work done).
  • Chaos warriors don’t necessarily enjoy the chaos (some of them hate it, some of them crave it), but it’s part of the system they find themselves in, so they have to structure their workflow around it—or risk incompetence or burning out really fast. They have to become “organized” chaos warriors, otherwise they’re just chaos “victims”.
  • The in-between state, the somewhat-organized-but-not-zen chaos worker that many freelancers experience, is what I call, “Calm Like a Bomb”.

Note that the chaos I am referring to is cognitive, not physical; firefighters, paramedics and ER nurses, the police and military, are “emergency” workers, not warriors of cognitive “chaos”. They are beyond the scope of what I’m covering here (and what’s coming in my next blog posts). They don’t need a “productivity system” to sort through cognitive overload, they deal with whatever comes forth as best as they can in any given situation.

The ultimate embodiment of a chaos warrior is the nameless heroine in the 4th DaiCon event opening animation from 1983: you can’t get a better representation of triumph amidst chaos than that!

The second category of workers is what I call the “goldsmith”—that is, people with a very specific role, who work in a single regular “employment” type of job, often with set hours, and possibly on-site (in an office/warehouse/shop/etc.).

This may include most office workers, public servants, software design & development folks who make a sharp separation between work and personal life, construction subcontractors working as part of a big real estate project, waiters and bartenders, technicians, retail sales & logistics, etc. I’m vastly simplifying and generalizing of course, but here I sketch the picture of someone who comes in in the morning, looks at the task list/assignments/inbox, works on that throughout the day, and then leaves their work life behind to enjoy their personal life; then the process resets on the next day.

  • “Pure” goldsmiths do not track work items outside of the workplace, and usually do not need to track personal items while at work (or aren’t allowed to). As such, in both settings, their mind is focused and clear. You arrive at context A, you work on context A’s items that are in front of you. You arrive at context B, you relax or deal with whatever has come up in your home “as it happens”. Arguably, from this standpoint of work-life separation, you could put some “emergency workers” in this category.
  • The goldsmith may have a simpler life, which is kind of a luxury, really: they can more easily have a tranquil mind, without the cognitive weight of hundreds of pending items and complex dependency chains governing their tasks. You do the job, you move on.
  • When they are asked to “produce” output in their area of expertise, those are often the type of workers that would benefit from a quiet, interruption-free work environment. There’s a reason why Joel Spolsky designed the FogCreek offices to allow developers to close the door and work in peace, instead of the chaos of open-space offices (that’s a story for another day).

Some specialized “creative” goldsmiths have a hard time separating work from personal life; even when they are home, they can’t help but think about potential creative solutions to the challenges they’re trying to solve at work. In that case, those may be “chaos warriors” in disguise.

In my view, personal productivity methodologies like GTD cater first and foremost to the “organized chaos warriors”, rather than the goldsmiths, who may have little use for all-encompassing cognitive techniques, or who may have tools that already structure their work for them.

Notably, in some industries like IT or the Free & Open-Source software sub-industry, we have done a pretty good job at externalizing (for better or for worse) the software developers and designers’ todo list as “bug/issue trackers”, and their assignments may often be linear and fairly predictable, allowing them to be “goldsmiths”. Most of the time, a software developer or designer, in their core duties, are going to deal with “whatever is in the issue list” (or kanban board), particularly in a team setting, and as such probably don’t feel the need to have a dedicated personal todo list, which might be considered duplication of information and management overhead. Input goes in (requirements, bug reports, feature requests), output goes out (a new feature, design, or fix). There are some exceptions to this generalization however:

  • When your issue tracker (bug inventory) is not actively managed (triaged, organized, regularly pruned), you eventually end up declaring “bugtracker bankruptcy”, or, like Benjamin Otte once said to me on IRC, “Whoever catches me first on IRC in the morning, wins.”
  • Some goldsmiths may have more complicated lives than just their job duties and might be interested by this approach nonetheless.
  • Sometimes, shared/public/open bug trackers are a tyranny on the mind, much like a popular email inbox: the demands are so numerous and complex (or unstructured) that they are not only externally imposed goals, they become imposed chaos—in which case the goldsmith may find that they need to extract a personal subset of the items from the “firehose” into a remixed, personalized, digestible task list for themselves.

Do you recognize yourself in one of these categories I’ve come up with? Or do you fit into some other category I might not have thought about? Did you find this essay interesting? Let me know in the comments!

If you’re a chaos warrior, or you fit any of the goldsmith’s “exceptions” (or if you’re interested in the field of personal productivity in general), you’ll probably be interested in reading my next article on (re)building the best free & open-source “GTD” application out there (but before that, if you haven’t read it already, check out my previous article on “getting things done”).

The post The goldsmith and the chaos warrior: a typology of workers appeared first on The Open Sourcerer.

by Jeff at October 30, 2019 12:30 PM

October 28, 2019

Jean-François Fortin Tam: A secret to productivity for busy individuals with chaotic contexts

Over the years, some people have asked me how I manage so many projects—short and long—without forgetting anything, without breaking promises and commitments, all while looking like a zen buddha. A few observers also remarked (often in mockery) that I tend to take a note of everything, that I document an outrageous amount of seemingly mundane details in my professional and personal life.

In the battlefield of the modern world’s incessant demands and boundless opportunities, I survive “seemingly effortlessly” by being methodology-intensive and adhering to a particular cognitive philosophy that acknowledges the limitations of the human brain—and works around them. This way of life lets me keep track of the big picture without being paralyzed by the daunting nature of a project:

(How do you climb a colossus? One hair at a time!)

I do procrastinate sometimes (and spend a long time juggling and incubating ideas), but this is not the same thing as being paralyzed by the anxiety of a restless mind, which is what, I posit, plagues many knowledge workers today.

  • My procrastination is often the result of the context and timing I find myself in: prioritizing other emergencies, or having no energy left for certain types of tasks, needing to find some long blocks of uninterrupted time to focus on a complex task, or simply needing additional tools or information.
  • When I’m fed up with “incubating” a task long enough and need to force the creation of “focused time”, I can always shut down my email and chat clients and blast out a State of Trance. That sometimes helps me get into a state of flow.

But surely, listening to Armin van Buuren’s playlists is not the primary gateway to productivity, is it? There has to be more to why I virtually don’t experience stress and anxiety!

So here’s a little secret I’ve been harboring for the last ten years: I’m a hardcore GTD practitioner. It basically governs my life.

In geek terms, that means I’m a cyborg who outsources part of his brain to an external memory system so that he can have all his CPU and RAM available for focused processing.

I could write an unbearably long blog post on the matter and try to explain GTD without forgetting anything, but I thought I’d make a quick overview video instead, which summarizes the core concept for you:

(By the way, hi, I have resurrected my YouTube channel! Feel free to subscribe as that will encourage me to publish more content)

David Allen’s GTD is certainly the single most transformative personal productivity book, bar none, that you should read, if you care about efficiency and stress reduction. It changed my life: it helped me throughout my university studies, through the jobs, through the Free & Open-Source software projects, and the personal day-to-day (things get a lot more complicated when you juggle career with projects, finances, home improvement, continuous learning, and a fat corgi dog).

Pictured: the fat corgi. That fluffy guy doesn’t care about personal productivity.

This methodology and philosophy works best for those I would call “organized chaos warriors”. Check out my two complementary blog posts: my typology of workers (where I define the chaos warriors) and the presentation of my favorite Free and Open-Source tool for Getting Things Done.


P.s.: I linked to the 2001 edition of the book (which is the one I had read) because, according to some reviewers’ comments on the 2015 remake, the 2nd edition is much longer for no real gain other than to mention digital tools rather than a paper-based workflow.

The post A secret to productivity for busy individuals with chaotic contexts appeared first on The Open Sourcerer.

by Jeff at October 28, 2019 12:30 PM

October 24, 2019

Jean-François Fortin Tam: Understanding the Rotschild vs GNOME case in 12 minutes

What’s the deal with the Rothschild vs GNOME Shotwell patent litigation case that the GNOME Foundation must defend against, and why does it matter for protecting the Free & Open-Source software community at large? Here’s my personal attempt at explaining the matter with a short video.

Please note that this video represents my personal opinions, I am not a lawyer nor a representative of the GNOME Foundation, etc. That said, please feel free to share far and wide 🙂

The post Understanding the Rotschild vs GNOME case in 12 minutes appeared first on The Open Sourcerer.

by Jeff at October 24, 2019 06:00 PM

Xabier Rodríguez Calvar: VCR to WebM with GStreamer and hardware encoding

Many years ago my family bought a Panasonic VHS video camera and we recorded quite a lot of things: holidays, some local shows, etc. I even got paid 5000 pesetas (30€, more than 20 years ago) a couple of times to record weddings in an amateur way. Since my father passed away less than a year ago, I have wanted to convert those VHS tapes into something that can survive better, technologically speaking.

For the job I bought a USB 2.0 dongle and connected it to a VHS VCR through a SCART to RCA cable.

The dongle creates a V4L2 device for video and is detected by Pulseaudio for audio. As I want to see what I am converting live, I need to tee both audio and video to the corresponding sinks, and the other part goes to the encoders, muxer and filesink. The command line for that would be:

gst-launch-1.0 matroskamux name=mux ! filesink location=/tmp/test.webm \
v4l2src device=/dev/video2 norm=255 io-mode=mmap ! queue ! vaapipostproc ! tee name=video_t ! \
queue ! vaapivp9enc rate-control=4 bitrate=1536 ! mux.video_0 \
video_t. ! queue ! xvimagesink \
pulsesrc device=alsa_input.usb-MACROSIL_AV_TO_USB2.0-02.analog-stereo ! 'audio/x-raw,rate=48000,channels=2' ! tee name=audio_t ! \
queue ! pulsesink \
audio_t. ! queue ! vorbisenc ! mux.audio_0

As you can see, I convert to WebM with VP9 and Vorbis. Something interesting is passing norm=255 to the v4l2src element so that it captures PAL, and rate-control=4 (VBR) to the vaapivp9enc element; otherwise it will use cqp by default and the file size would end up being huge.

You can see the pipeline, which is beautiful, here:

As you can see, we’re using vaapivp9enc here, which is hardware enabled, and having this pipeline running on my computer was consuming more or less 20% of CPU, with the CPU absolutely relaxed, leaving me the necessary computing power for my daily work. This would not be possible without GStreamer and the GStreamer VAAPI plugins, which is what happens with other solutions whose instructions you can find online.

If for some reason you can’t find vaapivp9enc in Debian, you should know there are a couple of packages for the Intel drivers and that the one you should install is intel-media-va-driver. Thanks go to my colleague at Igalia Víctor Jáquez, who maintains gstreamer-vaapi and helped me solve this problem.

My workflow for this was converting all tapes into WebM and then cutting them into the different relevant pieces with PiTiVi, running GStreamer Editing Services, both co-maintained by my colleague at Igalia, Thibault Saunier.

by calvaris at October 24, 2019 10:27 AM

October 14, 2019

GStreamer: GStreamer Conference 2019: Full Schedule, Talks Abstracts and Speakers Biographies now available

(GStreamer)

The GStreamer Conference team is pleased to announce that the full conference schedule including talk abstracts and speaker biographies is now available for this year's lineup of talks and speakers, covering again an exciting range of topics!

The GStreamer Conference 2019 will take place on 31 October - 1 November 2019 in Lyon, France just after the Embedded Linux Conference Europe (ELCE).

Details about the conference and how to register can be found on the conference website.

This year's topics and speakers:

Lightning Talks:

  • Raising the Importance of the V4L2 plugin and Challenges
    Nicolas Dufresne, Collabora
  • WebKit-powered HTML overlays in your pipeline with GstWPE
    Philippe Normand, Igalia
  • Detect a metal can using GStreamer/OpenFoodFacts
    Stéphane Cerveau, Collabora
  • A new GStreamer RTSP Server
    Sebastian Dröge, Centricular
  • A brand new documentation infrastructure for the GStreamer framework
    Thibault Saunier, Igalia
  • GStreamer on Windows: Everything New
    Nirbheek Chauhan, Centricular
  • An Improved Latency Tracer
    Nicolas Dufresne, Collabora
  • Using Bots to Improve the Gitlab Workflow
    Jordan Petridis, Centricular
  • GNOME Radio
    Ole Aamot, GNOME
  • SCTE-35 support in GStreamer
    Edward Hervey, Centricular
  • Closed captions, AFD, BAR
    Aaron Boxer, Collabora
  • ...and more to come
  • ...
  • Submit your lightning talk now!

Many thanks to our sponsors, Collabora, Pexip, Igalia, Fluendo, Centricular, Facebook and Zeiss, without whom the conference would not be possible in this form. And to Ubicast who will be recording the talks again.

Considering becoming a sponsor? Please check out our sponsor brief.

We hope to see you all in Lyon in October! Don't forget to register!

October 14, 2019 06:00 PM

October 08, 2019

Andy Wingo: thoughts on rms and gnu

(Andy Wingo)

Yesterday, a collective of GNU maintainers publicly posted a statement advocating collective decision-making in the GNU project. I would like to expand on what that statement means to me and why I signed on.

For many years now, I have not considered Richard Stallman (RMS) to be the head of the GNU project. Yes, he created GNU, speaking it into existence via prophetic narrative and via code; yes, he inspired many people, myself included, to make the vision of a GNU system into a reality; and yes, he should be recognized for these things. But accomplishing difficult and important tasks for GNU in the past does not grant RMS perpetual sovereignty over GNU in the future.

ontological considerations

More on the motivations for the non serviam in a minute. But first, a meta-point: the GNU project does not exist, at least not in the sense that many people think it does. It is not a legal entity. It is not a charity. You cannot give money to the GNU project. Besides the manifesto, GNU has no by-laws or constitution or founding document.

One could describe GNU as a set of software packages that have been designated by RMS as forming part, in some way, of GNU. But this artifact-centered description does not capture movement: software does not, by itself, change the world; it lacks agency. It is the people that maintain, grow, adapt, and build the software that are the heart of the GNU project -- the maintainers of and contributors to the GNU packages. They are the GNU of whom I speak and of whom I form a part.

wasted youth

Richard Stallman describes himself as the leader of the GNU project -- the "chief GNUisance", he calls it -- but this position only exists in any real sense by consent of the people that make GNU. So what is he doing with this role? Does he deserve it? Should we consent?

To me it has been clear for many years that to a first approximation, the answer is that RMS does nothing for GNU. RMS does not write software. He does not design software, or systems. He does hold a role of accepting new projects into GNU; there, his primary criteria is not "does this make a better GNU system"; it is, rather, "does the new project meet the minimum requirements".

By itself, this seems to me to be a failure of leadership for a software project like GNU. But unfortunately when RMS's role in GNU isn't neglect, more often as not it's negative. RMS's interventions are generally conservative -- to assert authority over the workings of the GNU project, to preserve ways of operating that he sees as important. See for example the whole glibc abortion joke debacle as an example of how RMS acts, when he chooses to do so.

Which, fair enough, right? I can hear you saying it. RMS started GNU so RMS decides what it is and what it can be. But I don't accept that. GNU is about practical software freedom, not about RMS. GNU has long outgrown any individual contributor. I don't think RMS has the legitimacy to tell this group of largely volunteers what we should build or how we should organize ourselves. Or rather, he can say what he thinks, but he has no dominion over GNU; he does not have majority sweat equity in the project. If RMS actually wants the project to outlive him -- something that by his actions is not clear -- the best thing that he could do for GNU is to stop pretending to run things, to instead declare victory and retire to an emeritus role.

Note, however, that my personal perspective here is not a consensus position of the GNU project. There are many (most?) GNU developers that still consider RMS to be GNU's rightful leader. I think they are mistaken, but I do not repudiate them for this reason; we can work together while differing on this and other matters. I simply state that I, personally, do not serve RMS.

selective attrition

Though the "voluntary servitude" questions are at the heart of the recent joint statement, I think we all recognize that attempts at self-organization in GNU face a grave difficulty, even if RMS decided to retire tomorrow, in the way that GNU maintainers have selected themselves.

The great tragedy of RMS's tenure in the supposedly universalist FSF and GNU projects is that he behaves in a way that is particularly alienating to women. It doesn't take a genius to conclude that if you're personally driving away potential collaborators, that's a bad thing for the organization, and actively harmful to the organization's goals: software freedom is a cause that is explicitly for everyone.

We already know that software development in people's free time skews towards privilege: not everyone has the ability to devote many hours per week to what is for many people a hobby, and it follows of course that those that have more privilege in society will be more able to establish a position in the movement. And then on top of these limitations on contributors coming in, we additionally have this negative effect of a toxic culture pushing people out.

The result, sadly, is that a significant proportion of those that have stuck with GNU don't see any problems with RMS. The cause of software freedom has always run against the grain of capitalism so GNU people are used to being a bit contrarian, but it has also had the unfortunate effect of creating a cult of personality and a with-us-or-against-us mentality. For some, only a traitor would criticise the GNU project. It's laughable but it's a thing; I prefer to ignore these perspectives.

Finally, it must be said that there are a few GNU people for whom it's important to check if the microphone is on before making a joke about rape culture. (Incidentally, RMS had nothing to say on that issue; how useless.)

So I honestly am not sure if GNU as a whole effectively has the demos to make good decisions. Neglect and selective attrition have gravely weakened the project. But I stand by the principles and practice of software freedom, and by my fellow GNU maintainers who are unwilling to accept the status quo, and I consider attempts to reduce GNU to founder-loyalty to be mistaken and without legitimacy.

where we're at

Given this divided state regarding RMS, the only conclusion I can make is that for the foreseeable future, GNU is not likely to have a formal leadership. There will be affinity groups working in different ways. It's not ideal, but the differences are real and cannot be papered over. Perhaps in the medium term, GNU maintainers can reach enough consensus to establish a formal collective decision-making process; here's hoping.

In the meantime, as always, happy hacking, and: no gods! No masters! No chief!!!

by Andy Wingo at October 08, 2019 03:34 PM

October 07, 2019

Christian SchallerGStreamer Conference 2019 (including GStreamer and PipeWire hackfests)

(Christian Schaller)

GStreamer Conference 2019 in Lyon, France


So the GStreamer Conference 2019 is approaching, being held in Lyon, France between 31st October and 1st November 2019. This year is special as it marks the GStreamer project's 20th year of existence. I still remember seeing the announcement of GStreamer 0.0.9 which Erik Walthinsen sent to the GNOME announce mailing list. Back then I felt that multimedia support was one of the big gaps around the Linux operating system that needed filling (no, XAnim was nice for its time, but it was not a long term solution :) and GStreamer seemed like the perfect project to fill it. So I joined the GStreamer IRC channel determined to try to help the project succeed however I could. A little over a year later we all met for the first time at GUADEC in Copenhagen, even posing for this exciting team photo.

GStreamer Team at GUADEC Copenhagen in 2001 (we all looked slightly younger and fresher back then.)


Anyway, 20 years later there will be a talk and presentation by GStreamer co-founder Wim Taymans (wearing blue shirt and black pants in picture above) at the GStreamer Conference commemorating 20 years of GStreamer, detailing how the project went from an idealistic spare-time effort to the multimedia industry juggernaut it is today.

Of course the conference is not going to be focused on the past, as there is a long line-up of great talks covering modern streaming with DASH, HDR support in GStreamer, the latest developments around GStreamer and Rust, virtual reality, Vulkan and more. Actually on the ‘and more’ topic, Wim Taymans will also do a presentation on PipeWire, the next generation audio and video server, at the GStreamer Conference this year, hopefully demoing some of the great improvements in things like our pro-audio Jack emulation support.
So if you haven’t already, make your way to the GStreamer Conference 2019 website and register for the 10th annual GStreamer Conference!

For those going, be aware that there will also be a joint GStreamer fall hackfest and PipeWire hackfest in the two days following the GStreamer Conference, so be sure to sign up for those if interested. They will be co-located, with participants flowing freely between the two events.

by uraeus at October 07, 2019 03:57 PM

September 23, 2019

GStreamerGStreamer 1.16.1 stable bug fix release

(GStreamer)

The GStreamer team is pleased to announce the first bug fix release in the stable 1.16 release series of your favourite cross-platform multimedia framework!

This release only contains bugfixes and it should be safe to update from 1.16.x.

See /releases/1.16/ for the details.

Binaries for Android, iOS, Mac OS X and Windows will be available shortly.

Download tarballs directly here: gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, gst-rtsp-server, gst-python, gst-editing-services, gst-validate, gstreamer-vaapi, or gst-omx.

September 23, 2019 11:00 PM

Christian SchallerFedora Workstation 31 – Whats new

(Christian Schaller)

We are laboring on getting Fedora Workstation 31 out the door next month, with the beta release having been made available last week. So here are some of the highlights of this upcoming release which I and the team hope you will enjoy. Many of these items I already covered in my June blogpost about Fedora Workstation 31, so if you read that one, consider this one a status update as there will be some repeats.

Wayland improvements
Fedora has been leading the migration to Wayland since day one and we are not planning to stop. XWayland-on-demand has been a big effort this cycle, with contributions from a lot of people and companies. The goal is to only need XWayland for legacy X applications, not have it started and running all the time, as that is a waste of system resources; having core functionality still depend on X under Wayland also makes the system more fragile. One piece of this was the systemd user session patches that were originally written by Iain Lane from Canonical. They had been lingering for a bit, so Benjamin Berg took those patches on for this cycle and helped shepherd them over the finish line and get them merged upstream. That work wasn't a hard requirement for XWayland-on-demand, but it makes it a lot easier to do different things under X and Wayland, which in turn makes moving towards XWayland-on-demand a little simpler to implement. It will also allow us (in future releases) to do things like only start services under GNOME that are actually needed for your hardware, so for instance if you don't have a bluetooth adapter in your computer there is no reason to run the bits of GNOME dealing with bluetooth. So expect further resource savings coming from this work over time.

Carlos Garnacho then spent time going through GNOME Shell removing any lingering X dependencies, while Olivier Fourdan worked on cleaning up the control center. This work has mostly landed, but it is hidden behind an experimental flag (gsettings set org.gnome.mutter experimental-features "[...,'autostart-xwayland']") in Fedora 31 as we need to mature it a bit more before it's ready for primetime. But we hope and expect to have it running by default in Fedora Workstation 32.

One example of something that still required X and is now gone is the keyboard and mouse accessibility features in GNOME 3, which Olivier Fourdan re-implemented and improved for this release. So if anyone out there reading this relies on the hover click accessibility feature, it is actually a lot nicer in Fedora Workstation 31. As seen in the screenshot below, you now have a nice little pie animation filling up as it prepares to click, which is a huge improvement over how it used to work.

Click on hover in action

Another item we feel is an important part of reducing the need for XWayland is having Firefox running natively on Wayland. Martin Stransky and Jan Horak have been working tirelessly on ensuring Firefox works well on Wayland, and in the Fedora 31 Beta it is running on Wayland by default. However, a few bugs have been discovered that Martin and Jan are trying hard to fix at the moment so we can keep this default for the GA release; if they miss the deadline we will ship the X backend version in F31 and then move to the Wayland version later on.

In Fedora Workstation 31, Wayland is still disabled by default if you use the Nvidia binary driver. The reason for this is the lack of acceleration under XWayland, meaning that any application depending on GLX, like a lot of games, will just get software GL rendering with the binary NVidia driver. This isn't something we can resolve on our own; Nvidia has to do the work since it's their closed source driver. But we have been discussing it regularly with them, and we have now been told that they are looking at the work Adam Jackson did some time ago, which was specifically aimed at helping them bring their X.org driver to XWayland. We don't have a timeline yet, but it is being actively looked at and hopefully a proper date can be provided soon. I am actually running Fedora Workstation 31 using the NVidia driver myself at the moment on this laptop, and for those interested in helping dogfood this setup, in preparation for hopefully being able to enable Wayland on NVidia in Fedora Workstation 32, it is a fairly simple thing to do. Under /usr/lib/udev/rules.d/ you will find a file called 61-gdm.rules; just edit that file and comment out (#) the line that reads ‘DRIVER=="nvidia", RUN+="/usr/libexec/gdm-disable-wayland"‘ and you will revert to a standard setup where your default session is a Wayland session, with an X.org session available as a fallback. The more people that run this and report issues the better, as it helps us make this rock solid before releasing it upon the world.
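
For reference, the edit described above amounts to commenting out that single rule; the excerpt below is a rough sketch of what the relevant part of 61-gdm.rules looks like after the change (the surrounding contents of the file may differ between releases):

# /usr/lib/udev/rules.d/61-gdm.rules (excerpt)
# Commenting out this rule stops GDM from disabling Wayland when the NVidia binary driver is in use:
#DRIVER=="nvidia", RUN+="/usr/libexec/gdm-disable-wayland"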

Atomic kernel modesetting
Jonas Ådahl has been hard at work this cycle on adding support for atomic mode setting. This work is not done, but the first parts of it have landed, and it has major long-term advantages for us. I asked Jonas to provide a short description of the work and what it will eventually achieve, as I don't think we have articulated that anywhere else yet:

There are two ways for a display server to control the configuration and content of monitors – the old classic Kernel Mode Setting (classic KMS), and the newer atomic Kernel Mode Setting (atomic KMS). The main difference between these two modes of operation is that with atomic KMS, the display server posts configuration transactions that are then processed atomically by the kernel, while with classic KMS the display server posts configurations command by command, where each monitor is configured by posting multiple commands. The benefits of atomic KMS are, for example, that the display server will know up front whether a configuration is valid (e.g. enough memory bandwidth), and that the display server can configure multiple aspects of the hardware atomically.

During the cycle leading up to Fedora Workstation 31, the foundations for how mutter (the window manager powering GNOME Shell) can make use of the new atomic KMS API were put in place. What was done was to introduce an internal transactional API for configuring monitors. This will eventually allow us to have much more control over how more advanced monitor features are utilized. For example, it will be possible to place client windows directly in hardware overlay planes, meaning we can more often completely bypass full frame compositing when only the content of a single window changes. Another example of what this enables us to do is with color management; we will be able to do seamless switching between managing window color profiles using OpenGL and, for instance, gamma ramps. Yet another example of what this work opens the door for is framebuffer modifiers, which will among other things potentially result in higher performance with very high resolution monitors.
Finally, an important aspect of the work done on the new internal KMS API is that it aims to be thread safe, meaning eventually it will be possible to put KMS processing completely in a separate thread. This means that, together with e.g. moving input device processing to its own thread, it will be possible to get very short latency between mouse movement and the cursor being moved on screen.

QtGNOME improvements
Jan Grulich has continued improving the QtGNOME module to make sure Qt apps integrate as well as possible into Fedora Workstation. His latest updates ensure that the theming keeps up to date with the latest upstream changes in Adwaita, that we have a fully working dark theme, that accessibility theming works and that it works with Flatpaks. Below is a screenshot showing Okular running, allowing you to see how the QtGNOME module affects the look and feel of Qt applications.

Firmware improvements
The LVFS firmware service keeps going from strength to strength. Richard Hughes presented on it during the Open Firmware Conference recently and was approached by a lot of vendors afterwards, both thanking him and Red Hat for the effort and asking about getting more of their hardware supported. New vendors are coming onboard at a rapid pace; for instance, Acer joined recently and is planning to support more of its hardware on the LVFS going forward. It is also worth mentioning the GNOME Firmware tool, which can now be downloaded from Flathub and which works great on Fedora Workstation 31.

OpenH264 Greatly Improved
The much improved version of OpenH264 will be available soon for Fedora users. This new version adds support for the High and Advanced profiles of H264, which is what most videos found online or produced by your camera would be using. This means you can add H264 playback support to your Fedora Workstation without having to search online for 3rd party repositories like you have had to do up to now. We are also trying to ensure this will eventually be usable by Firefox for video playback. This was work we partnered with Endless and Cisco on, hiring the multimedia experts at Centricular to do it; another great example of cross-company collaboration bringing improved functionality to the community.

Fedora Toolbox
Debarshi Ray has been working on many small improvements and better robustness for Fedora Toolbox going into Fedora Workstation 31. Fedora Toolbox, for those not aware of it yet, is our tool to make doing development using pet containers simple and convenient, providing ease of use features on top of traditional container tools and integration with GNOME Terminal and the GNOME Shell. The version shipping in F31 will be the last shell script based one, as once Fedora Workstation 31 is out we will be going all in on rewriting Fedora Toolbox in Go, in preparation for future development and expansion. I strongly recommend trying it out as it will help open your eyes to the possibilities that using pet containers for development gives you. For instance, you can easily set up a RHEL based pet container on your Fedora system to do development work that is meant to be deployed on a RHEL system, or grab our special AI/ML development container for easy access to TensorFlow and similar tools.

Improved Classic mode
Another notable change in this release is the updates to GNOME Classic mode. GNOME Classic mode is a set of extensions to GNOME 3 that makes it look and behave a lot more like GNOME 2, which still has many fans out there. With this release we collected feedback from a group of Classic mode users and tried to improve the experience further, mostly by removing some remaining GNOME 3’isms that didn’t really fit the GNOME Classic user experience, like the overview and the hot corner. The session manager is now also easily accessible in the bottom corner. The theming also got cleaned up a little to remove the last bit of the ‘black’ GNOME 3 theming. That said, I think it is important to remember that this is still GNOME 3 in the end; we are really just showcasing the power of extensions to tweak the user experience in quite fundamental ways here.

Improved GNOME Classic mode


Better support for non-English users
Fedora Workstation is used all over the globe, but we have not been happy about how our support for picking languages other than English has worked so far. The thing is that if you chose one or more languages at install time, things tended to just work fine, but if you wanted to add a new language afterwards it required jumping onto the command line and figuring out how to install the needed langpacks. In Fedora Workstation 31, Sundeep Anand has worked hard to improve this, so if you choose a new language in the GNOME Control Center in Fedora Workstation 31, the required langpacks should be installed automatically for you.

Fleet Commander
Fleet Commander 0.14.1 is out just in time for Fedora Workstation 31. Fleet Commander is a tool for doing large scale deployments of Fedora and RHEL workstations, allowing you to set system wide profiles. So for instance if you have a GNOME Shell extension that everyone in your organization, or a specific team inside your organization, should have enabled, you can deploy a profile with Fleet Commander ensuring that extension is enabled for those users. Basically any setting within GNOME can be set using this, including network configuration options. There is also support for Firefox and LibreOffice settings in Fleet Commander. The big feature addition of 0.14.1 is that Fleet Commander can now be used with Active Directory, which means that even if your company or university uses Active Directory for their user management, you can now deploy Fedora and RHEL profiles without needing FreeIPA. Fleet Commander is pretty much finished at this point, at least as far as any piece of software can ever be finished. Oliver Gutierrez Suarez is currently finishing up some last bits of Firefox support, but we don’t have any major Fleet Commander items on his todo list after that, so if you have been waiting to test it out, there are no new major features you need to wait for anymore; it is all there. If you are doing large scale Linux desktop deployments, I definitely recommend checking out Fleet Commander; you will find that it makes Fedora a great choice for such deployments.

Pipewire
We are not doing a lot of changes to Pipewire for Fedora Workstation 31; mostly some bugfixes and minor improvements to the video infrastructure it already provides in Fedora 30 for Flatpaks and web browsers. We are planning major changes for Fedora Workstation 32 though, where we in fact plan to ship Pipewire as a tech preview for both Jack and PulseAudio users. The way it will work is that the system will still default to PulseAudio, but we will provide either a script or a UI option to switch over to Pipewire (and back again). There is also a plan to have a core set of ProAudio applications available as Flatpaks for Fedora Workstation 32, tested and verified to work perfectly with Pipewire; the current apps planned to be included are Ardour, Carla, a2jmidid, Hydrogen, Qtractor and Patroneo, but if there are interested contributors joining the effort we could have even more. Then for Fedora Workstation 33 the idea is to ship with Pipewire as the default audio handler, but with some way for users to switch back to PulseAudio if they have a need, not unlike how the setup currently is with Wayland and X.org in Fedora. Wim Taymans will also be attending the Sonoj conference in Cologne, Germany at the end of October to discuss Pipewire with many members of the Linux ProAudio community and hopefully help prepare them for a future where Fedora Workstation is the perfect home for ProAudio users and developers.

Sysprof
Christian Hergert spent some cycles this round on improving the Sysprof tool, as it was becoming clear that to keep improving GNOME Shell and general desktop performance going forward we needed better data and a better ability to find the bottlenecks. Tools like Sysprof often end up being the unsung heroes of the system, but as we continue improving overall GNOME performance and resource usage over the next few years, the revamped Sysprof tool will be a big part of that story.

Much improved Sysprof tool

Silverblue
A lot of the items we work on are part of our vision around Silverblue, a Linux desktop OS built on the idea of an immutable core image. We often mention the theoretical advantages that such a setup with an immutable OS brings, but as I upgraded from F30 to the F31 beta on my RPM based laptop (I have a separate machine where I run Silverblue) I hit the exact kind of issue that Silverblue can help us and our users avoid. What happened was that after my upgrade I suddenly had no Wayland session anymore, just the fallback X.org session. After quite a bit of fault searching I discovered that the reason for this was that I had been testing Valve's ACO shader compiler on F30. These packages had a newer version number than the F31 packages and thus were not overwritten as part of the upgrade. Unfortunately the EGL package that came as part of that repository did not work well on F31 and thus the Wayland session failed. Once I did a distro sync and forced all packages to be the actual F31 versions, things worked correctly, but it did illustrate the challenges with systems where different parts of the core can and will get updated at different times. With a single well tested core OS image these kinds of problems will not happen. That said, being able to test things such as ACO is valuable and useful, and luckily OSTree and Silverblue do offer functionality for installing such things in a clean and non-damaging way through what is known as package layering. When you install new packages like that using package layering, they will only last until your next reboot; after you reboot you are back to a clean, original-state system. Of course if you really want to keep some experimental packages around there are other things you can do too, like overriding, but for simple testing like I did with ACO, package layering will provide you with a simple and safe way to do that.

We realize that Silverblue is a major change in how a Linux distro is ‘supposed’ to work, so we are taking our time with it to ensure we do it right and that we have made sure applications and tools work in a way that functions well on an immutable OS. So if you are interested I do recommend that you grab the Fedora 31 Silverblue image and give it a spin, but we are still working on polishing the experience so don’t expect it to be a seamless experience at this point in time. Of course as things like Flatpaks, Fedora Toolbox and a host of smaller issues get improved upon we do believe this will be such an overall improvement over an ‘old fashioned’ linux distro that you will be asking yourself why the Linux world didn’t do this years ago.

Improved performance
A lot of work has gone into improving the general performance of GNOME 3.34. The GNOME Shell team has been very active and is a great example of a large number of developers from different backgrounds working together. So this release features a lot of great performance work by Daniel van Vugt from Canonical and by Georges Stavracas from Endless, for instance. The Red Hat team has focused on providing patch review and feedback and on working on bigger long term changes and enablers, like Christian Hergert's work on Sysprof, Jonas Ådahl's work on atomic mode setting and Benjamin Berg's work on systemd user session support. All in all, I think you will find that Fedora Workstation 31 with GNOME 3.34 provides a faster and smoother experience, an experience we will continue to build upon going forward as some of these long term efforts start paying off.

Performance is better than ever

Summary
So this has been a roundup of some of the core items you should look forward to in Fedora Workstation 31. There are other items coming in this release too, like the Miracast GNOME Network Display application that Benjamin Berg has written, more Fedora Flatpaks available than ever before, and more. We also have a lot of interesting items coming up in Fedora Workstation 32, like Bastien Nocera's work improving low memory handling. So stay tuned.

by uraeus at September 23, 2019 05:45 PM

September 02, 2019

Sebastian Pölsterlscikit-survival 0.10 released

This release of scikit-survival adds two features that are standard in most software for survival analysis, but were missing so far:

  1. CoxPHSurvivalAnalysis now has a ties parameter that allows you to choose between Breslow’s and Efron’s likelihood for handling tied event times. Previously, only Breslow’s likelihood was implemented and it remains the default. If you have many tied event times in your data, you can now select Efron’s likelihood with ties="efron" to get better estimates of the model’s coefficients (a short example follows after this list).
  2. A compare_survival function has been added. It can be used to assess whether survival functions across 2 or more groups differ.
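
For the new ties option, here is a minimal sketch of how it could be used on the same veterans dataset that the compare_survival example below uses (encoding the categorical columns with sksurv's OneHotEncoder is just one way to obtain the numeric design matrix the Cox model needs):

from sksurv.datasets import load_veterans_lung_cancer
from sksurv.linear_model import CoxPHSurvivalAnalysis
from sksurv.preprocessing import OneHotEncoder

data_x, data_y = load_veterans_lung_cancer()
# Cox regression needs numeric inputs, so one-hot encode the categorical columns.
X = OneHotEncoder().fit_transform(data_x)

# Use Efron's likelihood for tied event times; Breslow remains the default.
estimator = CoxPHSurvivalAnalysis(ties="efron")
estimator.fit(X, data_y)
print(estimator.coef_)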

To illustrate the use of compare_survival, let’s consider the Veterans’ Administration Lung Cancer Trial. Here, we are considering the Celltype feature and we want to know whether the tumor type impacts survival. We can visualize the survival function for each subgroup using the Kaplan-Meier estimator.

import matplotlib.pyplot as plt
from sksurv.datasets import load_veterans_lung_cancer
from sksurv.nonparametric import kaplan_meier_estimator

data_x, data_y = load_veterans_lung_cancer()
group_indicator = data_x.loc[:, "Celltype"]
groups = group_indicator.unique()

for group in groups:
    group_y = data_y[group_indicator == group]
    time, surv_prob = kaplan_meier_estimator(
        group_y["Status"],
        group_y["Survival_in_days"])
    plt.step(time, surv_prob, where="post",
             label="Celltype = {}".format(group))

plt.xlabel("time $t$")
plt.ylabel("est. probability of survival")
plt.ylim(0, 1)
plt.grid(True)
plt.legend()
Kaplan-Meier estimates of survival function.

The figure indicates that patients with adenocarcinoma (green line) do not survive beyond 200 days, whereas patients with squamous cell lung cancer (blue line) can survive several years. We can determine whether this difference is indeed statistically significant by performing a non-parametric log-rank test. It groups patients according to cell type and compares the estimated group-specific hazard rate with the pooled hazard rate. Under the null hypothesis, the hazard rates of all groups are equal at all time points. The alternative hypothesis is that the hazard rate of at least one group differs from the others at some time.

from sksurv.compare import compare_survival

chisq, pvalue, stats, covar = compare_survival(
    data_y, group_indicator, return_stats=True)

The resulting test statistic is $\chi^2 = 25.40$, which corresponds to a highly significant P-value of $1.3\cdot{10}^{-5}$. In addition, because we specified return_stats=True above, we can look at group-specific statistics.

group       counts  observed  expected  statistic
adeno           27        26     15.69      10.31
large           27        26     34.55      -8.55
smallcell       48        45     30.10      14.90
squamous        35        31     47.65     -16.65

The column counts lists the size of each group and is followed by the number of observed and expected events. The last column, statistic, is the difference between the observed and expected number of events, from which the overall $\chi^2$ statistic is computed.

Download

The latest version of scikit-survival can be obtained via conda or pip. Pre-built conda packages are available for Linux, OSX and Windows:

 conda install -c sebp scikit-survival

Alternatively, you can install it from source via pip:

 pip install -U scikit-survival
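
If you want to double-check which version ended up in your environment, something along these lines should work (assuming, as in recent releases, that the package exposes a __version__ attribute):

import sksurv
# Should print 0.10 (or a later version) after upgrading.
print(sksurv.__version__)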

September 02, 2019 04:06 PM