July 24, 2014

Bastien Nocera: Watch out for DRI3 regressions

(Bastien Nocera) DRI3 has plenty of necessary fixes for X.org and Wayland, but its integration is still young. It has been integrated in the upcoming Fedora 21, and recently in Arch as well.

If WebKitGTK+ applications hang or become unusably slow when an HTML5 video is supposed to be playing, you might be hitting this bug.

If Totem crashes on startup, it's likely this problem, reported against cogl for now.

Feel free to add a comment if you see other bugs related to DRI3, or have more information about those.

Update: Wayland is already perfect, and doesn't use DRI3. The "DRI2" structures in Mesa are just that, structures. With Wayland, the DRI2 protocol isn't actually used.

by Bastien Nocera (noreply@blogger.com) at July 24, 2014 01:47 AM

July 23, 2014

GStreamer: GStreamer OpenMAX IL wrapper plugin 1.2.0 release


The GStreamer project is proud to announce a new release of the GStreamer OpenMAX IL wrapper plugin for the API and ABI-stable 1.x series of the GStreamer multimedia framework.

This plugin wraps available OpenMAX IL components and makes them available as standard GStreamer elements.

Check out the release notes for gst-omx, or download tarballs for gst-omx.

July 23, 2014 09:47 AM

July 22, 2014

Arun Raghavan: Quick-start guide to gst-uninstalled for GStreamer 1.x

One of the first tools that you should get if you’re hacking with GStreamer or want to play with the latest version without doing evil things to your system is probably the gst-uninstalled script. It’s the equivalent of Python’s virtualenv for hacking on GStreamer. :)

The documentation around getting this set up is a bit sparse, though, so here’s my attempt to clarify things. I was going to put this on our wiki, but that’s a bit search-engine-unfriendly, so it’s probably easiest to just keep it here. The setup I outline below can probably be automated further, and comments/suggestions are welcome.

  • First, get build dependencies for GStreamer core and plugins on your distribution. Commands to do this on some popular distributions follow. This will install a lot of packages, but should mean that you won’t have to play find-the-plugin-dependency for your local build.

    • Fedora: $ sudo yum-builddep gstreamer1-*
    • Debian/Ubuntu: $ sudo apt-get build-dep gstreamer1.0-plugins-{base,good,bad,ugly}
    • Gentoo: having the GStreamer core and plugin packages should suffice
    • Others: drop me a note with the command for your favourite distro, and I’ll add it here
  • Next, check out the code (by default, it will turn up in ~/gst/master)

    • $ curl http://cgit.freedesktop.org/gstreamer/gstreamer/plain/scripts/create-uninstalled-setup.sh | sh
    • Ignore the pointers to documentation that you see — they’re currently defunct
  • Now put the gst-uninstalled script somewhere you can get to it easily:

    • $ ln -sf ~/gst/master/gstreamer/scripts/gst-uninstalled ~/bin/gst-master
    • (the -master suffix for the script is important to how the script works)
  • Enter the uninstalled environment:

    • $ ~/bin/gst-master
    • (this puts you in the directory with all the checkouts, and sets up a bunch of environment variables to use your uninstalled setup – check with echo $GST_PLUGIN_PATH)
  • Time to build

    • $ ./gstreamer/scripts/git-update.sh
  • Take it out for a spin

    • $ gst-inspect-1.0 filesrc
    • $ gst-launch-1.0 playbin uri=file:///path/to/some/file
    • $ gst-discoverer-1.0 /path/to/some/file
  • That’s it! Some tips:

    • Remember that you need to run ~/bin/gst-master to enter the environment for each new shell
    • If you start up a GStreamer app from your system in this environment, it will use your uninstalled libraries and plugins
    • You can and should periodically update your tree by rerunning the git-update.sh script
    • To run gdb on gst-launch, you need to do something like:
    • $ libtool --mode=execute gdb --args gstreamer/tools/gst-launch-1.0 videotestsrc ! videoconvert ! xvimagesink
    • I find it useful to run cscope on the top-level tree, and use that for quick code browsing

by Arun at July 22, 2014 06:43 PM

Zeeshan Ali: GUADEC

(Zeeshan Ali)
So it's that time of the year! GUADEC is always loads of fun, and meeting all those awesome GNOME contributors in person and listening to their exciting stories and ideas gives me a renewed sense of motivation.

I have two regular talks this year:
  • Boxes: All packed & ready to go?
  • Geo-aware OS: Are we there yet?
Apart from that I also intend to present a lightning talk titled "Examples to follow". This talk will present stories of a few of our awesome GNOME contributors and what we all can learn from them.

July 22, 2014 12:11 AM

July 19, 2014

GStreamer: GStreamer Core, Plugins and RTSP server 1.4.0 stable release


The GStreamer team is pleased to announce the first release of the stable 1.4 release series. The 1.4 release series is adding new features on top of the 1.0 and 1.2 series and is part of the API and ABI-stable 1.x release series of the GStreamer multimedia framework.

Binaries for Android, iOS, Mac OS X and Windows are provided together with this release.

The stable 1.4 release series is API and ABI compatible with 1.0.x, 1.2.x and any other 1.x release series in the future. Compared to 1.2.x it contains some new features and more intrusive changes that were considered too risky to include in a bugfix release.

Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, or gst-rtsp-server, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, or gst-libav, or gst-rtsp-server.

Check the release announcement mail for details and the release notes above for a list of changes.

Also available are binaries for Android, iOS, Mac OS X and Windows.

July 19, 2014 05:00 PM

Phil Normand: Moving to Pelican

(Phil Normand)

Time for a change! Almost 10 years ago I started hacking on a blog engine with two friends; it was called Alinea, and it powered this website for a long time. Back then, hacking on your own blog engine was a prerequisite to hosting your blog :) But nowadays people just use WordPress or similar platforms, if they still have a blog at all. Alinea fell into oblivion as I didn’t have the time and motivation to maintain it.

Moving to Pelican was quite easy: since I’d been writing content in ReST on the previous blog, I only had to pull the data from the database and hack pelican_import.py a bit :)

Now that this website looks almost modern I’ll hopefully start to blog again, expect at least news about the WebKit work I still enjoy doing at Igalia.

by Philippe Normand at July 19, 2014 02:13 PM

July 16, 2014

Thomas Vander Stichele: morituri 0.2.3 ‘moved’ released!

(Thomas Vander Stichele)

It’s two weeks shy of a year since the last morituri release. It’s been a pretty crazy year for me, getting married and moving to New York, and I haven’t had much time throughout the year to do any morituri hacking at all. I miss it, and it was time to do something about it, especially since there’s been quite a bit of activity on github since I migrated the repository to it.

I wanted to get this release out, combining all of the bug fixes since the last release, before tackling one of the most-requested issues – not ripping the hidden track one audio if it’s digital silence. There are patches floating around that will hopefully be good enough that I can quickly do another release with that feature, and there are still a lot of minor issues floating around that should be easy to fix.

But the best way to get back into the spirit of hacking and to remove that feeling of it’s-been-so-long-since-a-release-so-now-it’s-even-harder-to-do-one is to just Get It Done.

I look forward to my next hacking stretch!

Happy ripping everybody.


by Thomas at July 16, 2014 04:01 AM

July 12, 2014

GStreamer: GStreamer Core, Plugins and RTSP server 1.3.91 release candidate


The GStreamer team is pleased to announce the second release candidate of the stable 1.4 release series. The 1.4 release series is adding new features on top of the 1.0 and 1.2 series and is part of the API and ABI-stable 1.x release series of the GStreamer multimedia framework.

This release candidate will hopefully be followed shortly by the stable 1.4.0 release, provided that no bigger regressions or issues are detected and that the release candidate has seen enough testing. The new API that was added during the 1.3 release series is not expected to change anymore at this point.

Binaries for Android, iOS, Mac OS X and Windows are provided together with this release.

The stable 1.4 release series is API and ABI compatible with 1.0.x, 1.2.x and any other 1.x release series in the future. Compared to 1.2.x it contains some new features and more intrusive changes that were considered too risky to include in a bugfix release.

Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, or gst-rtsp-server, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, or gst-libav, or gst-rtsp-server.

Check the release announcement mail for details and the release notes above for a list of changes.

Also available are binaries for Android, iOS, Mac OS X and Windows.

July 12, 2014 05:00 PM

July 10, 2014

Christian Schaller: Desktop Containers – The Way Forward

(Christian Schaller)

One feature we are spending quite a bit of effort on around the Workstation is container technology for the desktop. This has been on the wishlist for quite some time, and luckily the pieces for it are now coming together. Thanks to strong collaboration between Red Hat and Docker we have a great baseline to start from. One of the core members of the desktop engineering team, Alex Larsson, has been leading the Docker integration effort inside Red Hat, and we are now preparing to build onwards on that work, using the desktop container roadmap created by Lennart Poettering.

So while Lennart’s LinuxApps ideas predate Docker, they do provide a great set of steps we need to turn Docker into a container solution not just for server and web applications, but also for desktop applications. And luckily a lot of the features we need for the desktop are also useful for the other use cases; for instance, one of the main things Red Hat has been working on with our friends at Docker is integrating systemd with Docker.

There are a set of other components as part of this plan too. One of the big ones is Wayland, and I assume that if you are reading this you
have already seen my Wayland in Fedora updates.

Two other core technologies we identified are kdbus and overlayfs. Alex Larsson has already written an overlayfs backend for Docker, and Fedora Workstation Steering committee member, Josh Bowyer, just announced the availability of a Copr which includes experimental kernels for Fedora with overlayfs and kdbus enabled.

In parallel with this, David King has been prototyping a version of Cheese that can be run inside a container and that uses the concept that the LinuxApps proposal calls ‘Portals’, which is basically D-Bus APIs for accessing resources outside the container, like the webcam and microphone in the case of Cheese. For those interested, he will be presenting his work at GUADEC at the end of the month, on Monday the 28th of July. The talk is called ‘Cheese: TNG (less libcheese, more D-Bus)’.

So all in all the pieces are really starting to come together now, and we expect to have some sessions during both GUADEC and Flock this year to try to hammer out the remaining details. If you are interested in learning more or joining the effort, be sure to check the two conferences’ notice boards for the time and place of the container sessions.

There is still a lot of work to do, but I am confident we have the right team assembled to do it. In addition to the people already mentioned, we have Allan Day, who is kicking off an effort to look at the user experience we want to provide around the container-hosted application bundles, for instance in terms of upgrades and installation. And we will also work with the wider Docker community to make sure we have great composition tools for creating these container images available to developers on Fedora.

by uraeus at July 10, 2014 08:48 AM

July 04, 2014

Christian Schaller: Transmageddon 1.2 ‘All bugs must die’

(Christian Schaller)

Update: I had actually managed to disable the VAAPI encoding in 1.2, so I just rolled a 1.3 release which re-enabled it. Apart from that it is identical.

So I finally managed to put out a new Transmageddon release today. It is primarily a bugfix release, but considering how many critical bugs I ended up fixing for this release I am actually a bit embarrassed about my earlier 1.x releases. There was for instance some stupidity in my code that triggered thread safety issues, which I know hit some of my users quite badly. But other things weren’t working properly either, like dropping the video stream from a file. Anyway, I know some people think that filing bugs doesn’t help, but I think I fixed every reported Transmageddon bug with this release (although not every feature request bugzilla item). So if you have issues with Transmageddon 1.2 please let me know and I will try my best to fix them. I do try to keep to a policy that it is better to have limited functionality that is solid, as opposed to a lot of features that are unreliable or outright broken.

That said I couldn’t help myself so there are a few new features in this release. First of all if you have the GStreamer VAAPI plugins installed (and be sure to have the driver too) then the VAAPI GPU encoder will be used for h264 and MPEG2.

Secondly, I brought back the so-called ‘xvid’ codec (even though xvid isn’t really a separate codec, but a name used to refer to the MPEG-4 Video codec using the Advanced Simple profile).

So as the screenshot below shows, there are not a lot of UI changes since the last version, just some smaller layout and string fixes, but stability is hopefully greatly improved.

I am currently looking at a range of things as the next feature for Transmageddon including:

  • Batch transcoding, allowing you to create a series of transcoding jobs upfront instead of doing the transcodes one by one
  • Advanced settings panel, allowing you to choose which encoders to use for a given format, what profiles to use, turn deinterlacing on/off and so on
  • Profile generator, create new device profiles by inspecting existing files
  • Redo the UI to switch away from deprecated widgets

If you have any preference for which I should tackle first, feel free to let me know in the comments, and I will let popular will decide what I do first :)

P.S. I would love to have a high contrast icon for Transmageddon (HighContrast App icon guidelines) – so if there are any graphics artists out there willing to create one for me, I would be duly grateful.

by uraeus at July 04, 2014 08:47 AM

July 03, 2014

Christian Schaller: Wayland in Fedora Update

(Christian Schaller)

As we are approaching Fedora Workstation 21, we recently held a meeting inside Red Hat to review our Wayland efforts for Fedora Workstation. Switching to a new core technology like Wayland is a major undertaking, and there are always big and small surprises that come along the way. So the summary is that while we expect to have a version of Wayland in Fedora Workstation 21 that will be able to run a fully functional desktop, we now know that some missing pieces will not make it. Since we want to ship at least one Fedora release with a feature-complete Wayland as an option before making it the default, that means Fedora Workstation 23 is the earliest release in which Wayland can be the default.

Anyway, here is what you can expect from Wayland in Fedora 21.

  • Wayland session available in GDM (already complete and fully working)
  • XWayland working, but without accelerated 3D (done, adding accelerated 3D will be done before FW 22)
  • Wayland session working with all free drivers (Currently only Intel working, but we expect to have NVidia and AMD support enabled before F21)
  • IBUS input working. (Using the IBUS X client. Wayland native IBUS should be ready for FW22.)
  • Touchpad acceleration working. (Last missing piece for a truly usable Wayland session, lots of work around libinput and friends currently to have it ready for F21).
  • Wacom tablets will not be ready for F21
  • 3D games should work using the Wayland backend for SDL2. SDL1 games will need to wait for FW22 so they can use the accelerated XWayland support.
  • Binary driver support from NVidia and AMD very unlikely to be ready for F21.
  • Touch screen support working under Wayland.

We hope to have F21 test builds available soon that the wider community can use to help us test, because even when all the big ‘checkboxes’ are filled in, there will of course be a host of smaller issues and outright bugs that need ironing out before Wayland is ready to replace X completely. We really hope the community will get involved with testing Wayland so that we can iron out all major bugs before F21.

How to get involved with the Fedora Workstation effort

To help more people get involved we recently put up a tasklist for the Fedora Workstation. It is a work in progress, but we hope that it will help more people get involved and help move the project forward.

Update: Peter Hutterer posted this blog entry explaining pointer acceleration and what we are looking at to improve it.

by uraeus at July 03, 2014 02:55 PM

July 02, 2014

Jean-François Fortin Tam: Come one, come all to GUADEC!

For the first time in fourteen years, the GUADEC conference is returning to France. I encourage you to come in great numbers to this event, which will be held in Strasbourg during the last week of July. I have to cross the Atlantic (swimming), then take several planes and buses to get there, so no excuses for those of you located less than 6,000 km away!

"Is GUADEC of any interest to someone who isn't a developer?"

The majority of the presentations (see the program, which will be published shortly) are accessible to the general public with a geeky bent, as long as you have a strong interest in, or curiosity about, the technologies involved. You won't usually find Gaspard the baker or Amélie the surgeon there, but everyone is welcome. After all, GUADEC stands for "GNOME Users and Developers European Conference"!

GUADEC 2013 interns lightning talks, by Ana Rey

(photo by Ana Rey)

Presenters often center their talks around recent advances in their software, design and user experience, the general direction of a given project, and so on. Of course, several topics remain fairly technical, but I believe that those who read Planet Libre, Planet GNOME or Linux FR are already well-enough informed to understand the content. Since the event is an international conference, you do however need a decent understanding of English, as it is the language used for the presentations.

Besides the formal presentations, there are also informal hackfests and BoF ("birds of a feather") sessions, where small groups form around particular topics. For example, there is traditionally a hackfest for Pitivi attended by GStreamer developers, GSoC students and other interested members of the community. Bring your computer and your good mood!

In short, GUADEC remains much more "accessible" to non-devs than the GNOME Summit, the latter being a very technical event, closer to a mega-hackfest.

"What can you do as a non-developer?"

  • Meet the contributors you had only known by name for years, whether in the hallways or at the social events (there are generally at least one or two organized evenings).
  • Listen to presentations in person, ask questions or debate.
  • Chase down your favorite developers and discuss your favorite bugs. There is nothing better for reproducing a hard-to-reproduce bug, or simply for getting a developer to look into an issue by investigating it in real time.
  • Politely discuss design without having to write walls of text or risk misunderstandings caused by the limited richness of text-based interactions.
  • Play reporter and provide media coverage of the event.
  • Meet new contributors, especially the GSoC students.
  • Take part in the BoF ("birds of a feather") sessions, notably the Pitivi Hackfest 2014.
  • Take on Bastien Nocera at football.

I hope to see many of you at this unmissable event!

by nekohayo at July 02, 2014 04:57 PM

July 01, 2014

Andy Wingoflow analysis in guile

(Andy Wingo)

Greets, and welcome back to the solipsism! I've been wandering the wilderness with my Guile hackings lately, but I'm finally ready to come back to civilization. Hopefully you will enjoy my harvest of forest fruit.

Today's article is about flow analysis and data structures. Ready? Let's rock!

flow analysis

Many things that a compiler would like to know can be phrased as a question of the form, "What do I know about the data flowing through this particular program point?" Some things you might want to know are:

  1. The set of variables that must be live.

  2. The set of variables that happen to be live. This is the same as (1) except it includes variables that aren't needed but haven't been clobbered by anything.

  3. The set of expressions whose results are still valid (i.e., haven't been clobbered by anything else).

  4. An upper and lower bound on the range of numeric variables.

Et cetera. I'll talk about specific instances of flow analysis problems in the future, but today's article is a bit more general.

The first thing to note about these questions is that they don't necessarily need or have unique answers. If GCC decides that it can't prove anything about the ranges of integers in your program, it's not the end of the world -- it just won't be able to do some optimizations that it would like to do.

At the same time, there are answers that are better and worse than others, and answers that are just invalid. Consider a function of the form:

int f():
  int a = 1
  int b = 2
  int c = a + b
  int d = b + c
  int e = c + d
  ...
  int z = x + y
  return z

In this function, there are 27 different expressions, including the return, and 27 different program points. (You can think of a program point as a labelled sub-expression. In this example none of the expressions have sub-expressions.) If we number the program points in order from 0 to 26, we will have a program that first executes expression 0 (int a = 1), then 1, and so on to the end.

Let's plot some possible solutions to the live variable flow-analysis problem for this program.

Here we see two solutions to the problem (in light and dark blue), along with a space of invalid solutions (in red). The Y axis corresponds to the variables of the program, starting with a on the bottom and finishing with z on the top.

For example, consider position 4 in the program, corresponding to int e = c + d. It is marked in the graph with a vertical bar. After position 4, the only values that are used in the rest of the program are d and e. These are the variables that are contained within the light-blue area. It wouldn't be invalid to consider a, b, and c to be live also, but it also wouldn't be as efficient to allocate space and reason about values that won't contribute to the answer. The dark blue space holds those values that may harmlessly be considered to be live, but which actually aren't live.

It would, however, be invalid to consider the variable f to be live after position 4, because it hasn't been defined yet. This area of the variable space is represented in red on the graph.
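To make the live-variable question concrete, here is a small hypothetical Python sketch (my own construction, not from the article) that computes the minimal, light-blue solution for this straight-line program by walking backwards over its 27 program points, assuming as above that from c onward each variable is the sum of the previous two:

```python
import string

# The straight-line program from above: a = 1, b = 2, and from c onward
# each variable is the sum of the previous two, ending with "return z".
names = string.ascii_lowercase                   # 'a' .. 'z', points 0 .. 25
uses = {0: set(), 1: set()}                      # points 0 and 1 use nothing
uses.update({p: {names[p - 2], names[p - 1]} for p in range(2, 26)})
uses[26] = {'z'}                                 # point 26, the return, uses z

live_after = {}
live = set()                                     # nothing is live after the return
for p in range(26, -1, -1):
    live_after[p] = set(live)
    defined = {names[p]} if p < 26 else set()    # point p defines names[p]
    live = (live - defined) | uses[p]            # the backward liveness equation

# live_after[4] == {'d', 'e'}: after "int e = c + d", only d and e are needed.
```

Any superset of these sets (the dark-blue region) would also be a valid, if less efficient, answer.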

Of course, the space of all possible solutions isn't possible to represent nicely on a two-dimensional graph; we're only able to show two with colors, and that not very well as they overlap. This difficulty cuts close to the heart of the data-flow problem: that it ultimately requires computing a two-dimensional answer, which necessarily takes time and space O(n²) in program size.

Or does it?

classical flow analysis frameworks

The classical way to do flow analysis is to iterate a set of data-flow equations over a finite lattice until you reach a fixed point.

That's a pithy sentence that deserves some unpacking. If you're already comfortable with what it means, you can skip a couple sections.

Still here? Cool, me too. Let's take a simple example of sign analysis. The problem is to determine, for the integer variables of a program, at every point in the program, which ones may be negative (-), which ones may be zero (0), and which may be positive (+). All of these are conceptually bit-flags.

For example, in this program:

int f(int x):
 L0:  while (x >= 0)
 L1:    int y = x - 1
 L2:    x = y
 L3:  return x

We can assign the flags -0+ to the argument x as the initial state that flows into L0, because we don't know what it is ahead of time, and it is the only variable in scope. We start by representing the initial state of the solution as a set of sets of state values:

state := {L0: {x: -0+}, L1: Ø, L2: Ø, L3: Ø}

In this notation, Ø indicates a program point that hasn't been visited yet.

Now we iterate through all labels in the program, propagating state to their successors. Here is where the specific problem being solved "hooks in" to the generic classical flow analysis framework: before propagating to a successor, a flow equation transforms the state that flows into a program point to a state that flows out, to the particular successor. In this case we could imagine equations like this:

visit_test(expr, in, true_successor, false_successor):
  if expr matches "if var >= 0":
    # On the true branch, var is not negative.
    propagate(in + {var: in[var] - -}, true_successor)
    # On the false branch, var is not zero and not positive.
    propagate(in + {var: in[var] - 0+}, false_successor)
  else if ...

visit_expr(expr, in, successor):
  if expr matches "left = right - 1":
    if in[right] has +:
      if in[right] has 0:
        # Subtracting one from a non-negative arg may be negative.
        propagate(in + {left: in[right] + -}, successor)
        # Subtracting one from a positive arg may be 0.
        propagate(in + {left: in[right] + 0}, successor)
      # Subtracting one from a nonpositive arg will be negative.
      propagate(in + {left: -}, successor)
  else if expr matches "left = right":
    propagate(in + {left: in[right]}, successor)

The meat of classical data-flow analysis is the meet operation:

propagate(out, successor):
  if state[successor] is Ø:
    state[successor] = out
  else:
    state[successor] = meet(out, state[successor])

# A version of meet for sign analysis
meet(out, in):
  return intersect_vars_and_union_values(out, in)
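In runnable form, this meet might look like the following Python sketch (a hypothetical rendering of intersect_vars_and_union_values, with sign sets represented as plain Python sets of the flags '-', '0', '+'):

```python
def meet(out, state_in):
    """Meet for sign analysis: intersect the variable sets (a variable
    must be known on both paths to survive) and union the sign sets."""
    return {v: out[v] | state_in[v] for v in out.keys() & state_in.keys()}

# Merging a back-edge state into a loop header's state: 'y' is dropped
# because the header's state doesn't know it, and 'x' keeps the union
# of its possible signs from both paths.
merged = meet({'x': {'0', '+'}, 'y': {'-', '0', '+'}}, {'x': {'-'}})
# merged == {'x': {'-', '0', '+'}}
```

Note that set union makes this operation commutative, associative, and idempotent, which is exactly what the termination argument later in the article relies on.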

Let's run this algorithm by hand over the example program. Starting from the initial state, we propagate the L0→L1 and L0→L3 edges:

visit_test("if x >= 0", {x: -0+}, L1, L3)
→ propagate({x: 0+}, L1)
→ state[L1] = {x: 0+}
→ propagate({x: -}, L3)
→ state[L3] = {x: -}

Neat. Let's keep going. The successor of L1 is L2:

visit_expr("y = x - 1", {x: 0+}, L2)
→ propagate({x: 0+, y: -0+}, L2)
→ state[L2] = {x: 0+, y: -0+}

L2→L0 is a back-edge, returning to the top of the loop:

visit_expr("x = y", {x: 0+, y: -0+}, L0)
→ propagate({x: -0+, y: -0+}, L0)
→ state[L0] = meet({x: -0+, y: -0+}, state[L0])
→ state[L0] = meet({x: -0+, y: -0+}, {x: -0+})
→ state[L0] = {x: -0+}

Finally, L3 has no successors, so we're done with this iteration. The final state is:

{L0: {x: -0+},
 L1: {x: 0+},
 L2: {x: 0+, y: -0+},
 L3: {x: -}}

which indeed corresponds with what we would know intuitively.

fixed points and lattices

Each of the steps in our example flow analysis was deterministic: the result was calculated from the inputs and nothing else. However the backwards branch in the loop, L2→L0, could have changed inputs that were used by the previous L0→L1 and L0→L3 forward edges. So what we really should do is iterate the calculation to a fixed point: start it over again, and run it until the state doesn't change any more.
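As a worked, self-contained sketch (the helper and variable names here are my own, not Guile's), the whole sign analysis for the four-label program can be iterated to a fixed point in a few lines of Python; it reproduces the final state given above:

```python
NEG, ZERO, POS = '-', '0', '+'

def meet(a, b):
    # Intersect the variable sets, union the sign sets.
    return {v: a[v] | b[v] for v in a.keys() & b.keys()}

def dec(signs):
    # Possible signs of (value - 1), given the possible signs of value.
    out = set()
    if POS in signs:
        out |= {ZERO, POS}          # positive - 1 may be zero or positive
    if ZERO in signs or NEG in signs:
        out.add(NEG)                # nonpositive - 1 is negative
    return out

def analyze():
    # Program:  L0: while (x >= 0)   L1: y = x - 1   L2: x = y   L3: return x
    state = {'L0': {'x': {NEG, ZERO, POS}}, 'L1': None, 'L2': None, 'L3': None}

    def propagate(out, succ):
        new = out if state[succ] is None else meet(out, state[succ])
        changed = new != state[succ]
        state[succ] = new
        return changed

    changed = True
    while changed:                  # iterate to a fixed point
        changed = False
        s = state['L0']
        # True branch of "x >= 0": x is not negative; false branch: negative.
        changed |= propagate({**s, 'x': s['x'] - {NEG}}, 'L1')
        changed |= propagate({**s, 'x': s['x'] - {ZERO, POS}}, 'L3')
        if state['L1'] is not None:
            s = state['L1']
            changed |= propagate({**s, 'y': dec(s['x'])}, 'L2')
        if state['L2'] is not None:
            s = state['L2']
            changed |= propagate({**s, 'x': s['y']}, 'L0')
    return state
```

The loop settles after the second pass: the back-edge merge into L0 no longer changes the state, so what remains is a fixed point.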

It's easy to see in this case that running it again won't end up modifying the state. But do we know that in all cases? How do we know that iteration would terminate at all? It turns out that a few simple conditions are sufficient.

The first thing to ensure is that the state space being explored is finite. Here we can see this is the case, because there are only so many ways you can combine -, 0, and +. Each one may be present or not, and so we have 2^n = 2^3 = 8 possible states. The elements of the state array will be a set with at most one entry for each variable, so the whole state space is finite. That at least ensures that an answer exists.

Next, the "meet" operation has to be commutative, associative, and idempotent. The above example used intersect_vars_and_union_values. We intersect vars because it only makes sense to talk about a variable at a program point if the variable dominates the program point. It didn't make sense to propagate y on the L2→L0 branch, for example. It's usually a good idea to model a data-flow problem using sets, as set union and intersection operations fulfill these commutativity, associativity, and idempotence requirements.

Finally, the state being modelled should have a partial order, and functions that add information along control-flow edges -- above, visit_test and visit_expr -- should preserve this partial ordering. That is to say, visit_test and visit_expr should be monotonic. This means that no matter on what control paths data propagates, we keep building towards an answer with more information, making forward progress. This condition is also easily fulfilled with sets, or more generally with any lattice. (A lattice is nothing more than a data type that fulfills these conditions.)

Iterating the data-flow equations until the state stops changing will find a fixed point of the lattice. Whether you find the greatest or least fixed point is another question; I can't help linking to Paul Khuong's old article on Québécois student union strikes for a lovely discussion.

Another question is, how many iterations are necessary to reach a fixed point? I would first note that although in our walk-through we iterated in forward order (L0, L1, L2, L3), we could have visited nodes in any order and the answer would be the same. I'll cut to the chase and say that if:

  1. you represent your state with bitvectors

  2. the control-flow graph is reducible (has only natural loops)

  3. the meet operation on values is bitvector union or intersection

  4. you visit the program points in topologically sorted order

If these conditions are fulfilled, then you will reach a fixed point after LC + 2 iterations, where LC is the "loop-connectness number" of your graph. You can ensure (1), (3), and (4) by construction. (Reverse post-order numbering is an easy way to fulfill (4).) (2) can be ensured by using programming languages without goto (a for loop is always a natural loop) but can be violated by optimizing compilers (for example, via contification).

Loop connectedness is roughly equivalent to the maximum nesting level of loops in the program, which has experimentally been determined to rarely exceed 3. Therefore in practice, data-flow analysis requires a number of steps that is O(n * 5) = O(n) in program size.

For more information on data-flow analysis, including full proofs and references, see Carl Offner's excellent, excellent manuscript "Notes on Graph Algorithms used in Optimizing Compilers". I don't know of any better free resource than that. Thanks, Carl!

an aside: the kCFA algorithms

I just finished describing what I called "classical" data-flow analysis. By that I mean to say that people have been doing it since the 1970s, which is classical enough as far as our industry goes. However with the rise of functional languages in the 1980s, it became unclear how to apply classical data-flow analysis on a language like Scheme. Let's hear it from the horse's mouth:

This brings us to the summer of 1984. The mission was to build the world's most highly-optimising Scheme compiler. We wanted to compete with C and Fortran. The new system was T3, and the compiler was to be called Orbit. We all arrived at WRL and split up responsibility for the compiler. Norman was going to do the assembler. Philbin was going to handle the runtime (as I recall). Jonathan was project leader and (I think) wrote the linker. Kranz was to do the back end. Kelsey, the front end. I had passed the previous semester at CMU becoming an expert on data-flow analysis, a topic on which I completely grooved. All hot compilers do DFA. It is necessary for all the really cool optimisations, like loop-invariant hoisting, global register allocation, global common subexpression elimination, copy propagation, induction-variable elimination. I knew that no Scheme or Lisp compiler had ever provided these hot optimisations. I was burning to make it happen. I had been writing 3D graphics code in T, and really wanted my floating-point matrix multiplies to get the full suite of DFA optimisation. Build a DFA module for T, and we would certainly distinguish ourselves from the pack. So when we divided up the compiler, I told everyone else to back off and loudly claimed DFA for my own. Fine, everyone said. You do the DFA module. Lamping signed up to do it with me.

Lamping and I spent the rest of the summer failing. Taking trips to the Stanford library to look up papers. Hashing things out on white boards. Staring into space. Writing little bits of experimental code. Failing. Finding out *why* no one had ever provided DFA optimisation for Scheme. In short, the fundamental item the classical data-flow analysis algorithms need to operate is not available in a Scheme program. It was really depressing. I was making more money than I'd ever made in my life ($600/week). I was working with *great* guys on a cool project. I had never been to California before, so I was discovering San Francisco, my favorite city in the US and second-favorite city in the world. Silicon Valley in 1984 was beautiful, not like the crowded strip-mall/highway hell hole it is today. Every day was perfect and beautiful when I biked into work. I got involved with a gorgeous redhead. And every day, I went in to WRL, failed for 8 hours, then went home.

It was not a good summer.

At the end of the summer, I slunk back to CMU with my tail between my legs, having contributed not one line of code to Orbit.

Olin Shivers, A history of T

It took him another 7 years, but Shivers stuck with it, and in the end came out with the family of algorithms known as k-CFA. Instead of focusing on loops, which Scheme doesn't have syntactically, Shivers used continuation-passing style to ruthlessly simplify Scheme into a dialect consisting of not much more than function calls, and focused his attention on function calls. The resulting family of flow algorithms can solve flow equations even in the presence of higher-order functions -- a contribution to computer science born out of necessity, failure, and stubbornness.

With all those words, you'd think that I'd be itching to use k-CFA in Guile, and I'm not. Unfortunately even the simplest, least expressive version (0-CFA) is O(n²); 1-CFA is exponential. I don't have time for that. Instead, Guile is able to use classical DFA because it syntactically distinguishes labelled continuations and functions, and contifies functions to continuations where possible, which makes the Scheme DFA problem exactly the same as in any other language.

n times what?

Now that we have established that the number of visit operations is O(n), it remains to be seen what the individual complexity of a visit operation is in order to determine the total complexity. The naïve thing is just to use bitvectors, with each of the bitvectors having as many entries as the program has variables, times however many bits we are using.

This leads to O(|L|*|V|) space and time complexity, where |L| is the number of program points (labels) and |V| is the number of variables. As the number of variables is generally proportional to the size of the program, we can approximate this as O(n²).

In practice, this means that we can apply data-flow analysis to programs up to about 10000 labels in size. Sign analysis on a 10000-label function would require 10000² * 3 / 8 = 37.5 MB of memory, which is already a bit hefty. It gets worse if you need to represent more information. I was recently doing some flow-sensitive type and range inference, storing 12 bytes per variable per program point; for a 10000-label function, that's more than a gigabyte of memory. Badness.
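As a quick sanity check of the arithmetic (this is just back-of-the-envelope code, not anything from the compiler), the quadratic space figures can be reproduced directly:

```python
labels = 10_000        # program points
variables = 10_000     # roughly proportional to program size
bits_per_var = 3       # sign analysis: one bit each for -, 0, +

# One state per label, one entry per variable: O(|L| * |V|) space.
total_bytes = labels * variables * bits_per_var // 8
assert total_bytes == 37_500_000   # 37.5 MB, matching the figure above

# Flow-sensitive type/range inference at 12 bytes per variable per label:
assert labels * variables * 12 == 1_200_000_000   # over a gigabyte
```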

shared tails

Although it was the type inference case that motivated this investigation, sign inference is similar and more simple so let's go with that. The visit_expr and visit_test functions above are only ever going to add additional information about the variables that are used in or defined by an expression; in practice this is a small finite number. What if we chose a representation of state that could exploit this fact by only adding O(1) amounts of data, sharing a common tail with preceding expressions?

If we draw a control-flow graph for the sign analysis program, we get something like:

The goal is to create a data structure that looks like the dominator tree. For "normal" control-flow edges -- those whose destination only have one predecessor -- we can avoid the "meet" operations, and just copy the predecessor's out set to the successor's in set. We then define "meet" as an adjoin operation that effectively conses the new information onto a shared tail, if it wasn't there already. The first iteration through the CFG will initialize the shared tail of a given control-flow join to the set of variables flowing into the join's dominator. Subsequent information will adjoin (cons) on new incoming values. In this case the resulting data structure ends up looking like:

Here the italic references like L1 indicate shared structure, and the tuples annotating the edges represent additional information flow, beyond that information that was already present in the successor's dominator.

Of course, you can implement this with linked lists and it will work fine. The problem there will be lookup speed -- when your visit operation (visit_expr or visit_test) goes to look up the sign of a variable, or the same happens via the meet operation, you get O(n) lookup penalties. Anticipating this, I implemented this with a version of Phil Bagwell's vhashes, which promise O(log n) variable lookup. See Guile's documentation, or Bagwell's excellent paper.

Note that you can't remove items from sets once they have been added in a shared-tail flow analysis; to keep the meet function monotonic, you have to instead insert tombstone entries. Not so nice, but it is what it is.

A shared-tail flow analysis consumes only O(1) additional memory per node, leading to O(n) space complexity. I have some measured space and time graphs below that show this experimentally as well.

space and time

Unfortunately, lookup time on branchy vhashes is really terrible: O(log n) in the best case, and O(n) at worst. This is especially acute because there is no easy way to do unions or intersections on vhashes -- you end up having to compute the unshared heads of the two vhashes you are merging, and looking up elements in one in the other... I could go on, but I think you believe me when I say it gets complicated and slow. It's possible to beat a bitvector approach in time for relatively "big" problems like type analysis, but for common subexpression elimination where I was just storing a bit per expression, it was tough to beat the speed of bitvectors.

I started looking for another solution, and in the end came on a compromise that I am much happier with, and again it's Phil Bagwell to the rescue. Instead of relying on vhashes that explicitly share state, I use Clojure-style persistent sparse bit-sets and bit-maps that share state opportunistically.

Guile's intset module implements a bitvector as a functional tree whose branches are vectors and whose leaves are fixnums. Each leaf represents one range of 32 integers, and each branch on top of it increases the range by a factor of 8. Branches can be sparse, so not all integers in the range of an intset need leaves.

As you would expect, adjoining an element onto such a tree is O(log n). Intersecting is much faster than vhashes though, as intsets partition the key space into power-of-two blocks. Intsets try hard to share state, so that if your adjoin would return the same value, the result is the same object, at the same address. This allows sub-trees to be compared for equality via pointer comparison, which is a great fast-path for intersection and union.
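Here is a deliberately miniature model (not Guile's implementation, which is written in Scheme with vector branches) of the two intset properties just described: sparse 32-integer leaf masks, and an adjoin that returns the very same object when nothing changed, enabling pointer-equality fast paths for union.

```python
class IntSet:
    """Toy persistent sparse bitset: leaves are 32-bit masks keyed by
    leaf index (the real intset uses a tree of vectors over fixnums)."""
    __slots__ = ("leaves",)

    def __init__(self, leaves):
        self.leaves = leaves  # {leaf_index: 32-bit mask}

    def adjoin(self, i):
        idx, bit = divmod(i, 32)
        mask = self.leaves.get(idx, 0)
        if mask & (1 << bit):
            return self            # already present: same identity, shared state
        new = dict(self.leaves)
        new[idx] = mask | (1 << bit)
        return IntSet(new)

    def union(self, other):
        if self is other:          # pointer-equality fast path
            return self
        new = dict(self.leaves)
        for idx, mask in other.leaves.items():
            new[idx] = new.get(idx, 0) | mask
        return IntSet(new)

    def __contains__(self, i):
        idx, bit = divmod(i, 32)
        return bool(self.leaves.get(idx, 0) & (1 << bit))

s = IntSet({}).adjoin(5).adjoin(100)
assert s.adjoin(5) is s            # no-op adjoin returns the same object
assert 100 in s and 6 not in s
assert s.union(s) is s             # identity comparison short-circuits the union
```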

Likewise, Guile's new intmap module allows associating larger values with integer keys.

science! fetch your white coats and lab books!

I had the chance to actually test the system with all three of these data structures, so I compiled one of Guile's bigger files and recorded the memory used and time taken when solving a 1-bit flow analysis problem. This file has around 600 functions, many of them small nested functions, many of them macro-generated, some of them quite loopy, and one big loopless one (6000 labels) to do the initialization.

First, a plot of how many bytes are consumed per label while solving this 1-bit DFA.

Note that the X axis is on a log scale.

The first thing that pops out at me from these graphs is that the per-label overhead for vhashes is indeed constant. This is a somewhat surprising result for me; I thought that iterated convergence would make this overhead depend on the size of the program being compiled.

Secondly we see that the bitvector approach, while quadratic in overall program size, is still smaller until we get to about 1000 labels. It's hard to beat the constant factor for packed bitvectors! Note that I restricted the Y range, and the sizes for the bitvector approach are off the charts for N > 1024.

The intset size is, as we expected, asymptotically worse than vhashes, but overall not bad. It stays on the chart at least. Surprisingly, intsets are often better than vhashes for small functions, where we can avoid allocating branches at all -- note the "shelves" in the intset memory usage, at 32 and 256 entries, respectively, corresponding to the sizes that require additional levels in the tree. Things keep on rising with n, but sublinearly (again, recall that the X axis is on a log scale).

Next, a plot of how many nanoseconds it takes per label to solve the DFA equation above.

Here we see, as expected, intsets roundly beating vhashes for all n greater than about 200 or so, showing sublinear dependence on program size.

The good results for vhashes for the largest program are because the largest program in this file doesn't have any loops, and hardly any branching either. It's the best case for vhashes: all appends and no branches. Unfortunately, it's not the normal case.

It's not quite fair to compare intsets to bitvectors, as Guile's bitvectors are implemented in C and intsets are implemented in Scheme, which runs on a bytecode VM right now. But still, the results aren't bad, with intsets even managing to beat bitvectors for the biggest function. The gains there probably pay for the earlier losses.

This is a good result, considering that the goal was to reduce the space complexity of the algorithm. The 1-bit case is also the hardest case; when the state size grows, as in type inference, the gains of using structure-sharing trees grow accordingly.


Let's wrap up this word-slog. A few things to note.

Before all this DFA work in Guile, I had very little appreciation of the dangers of N-squared complexity. I mean, sometimes I had to think about it, but not often, especially if your constant factors are low, or so I thought. But I got burned by it; hopefully the next time, if any, will be a long time coming.

I was happily, pleasantly surprised at the expressiveness and power of Bagwell/Clojure-style persistent data structures when applied to the kinds of problems that I work on. Space-sharing can make a fundamental difference to the characteristics of an algorithm, and Bagwell's data structures can do that well. Intsets simplified my implementations because I didn't have to reason much about space-sharing on my own -- finding the right shared tail for vhashes is, as I said, an unmitigated mess.

Finally I would close by saying that I was happy to fail in such interesting (to me) ways. It has been a pleasant investigation and I hope I have been able to convey some of the feeling of it. If you want to see what the resulting algorithm looks like in practice, see compute-truthy-expressions.

Until next time, happy hacking!

by Andy Wingo at July 01, 2014 08:00 AM

June 29, 2014

Thomas Vander Stichele: mach 1.0.3 ‘moved’ released

(Thomas Vander Stichele)

It’s been very long since I last posted something. Getting married, moving across the Atlantic, enjoying the city, it’s all taken its time. And the longer you don’t do something, the harder it is to get back into.

So I thought I’d start simple – I updated mach to support Fedora 19 and 20, and started rebuilding some packages.

Get the source, update from my repository, or wait until updates hit the Fedora repository.

Happy packaging!


by Thomas at June 29, 2014 09:09 PM

Jean-François Fortin Tam: MyPaint + Wacom screen tablet + GNOME3 = ♥

A little-known fact about me is that I can draw better than your average cat. It is a hobby of mine that became dormant with my pretty challenging professional and Free Software activities in the past few years.

Drawing is hard, let’s go hacking

One of the things that previously threw me off from drawing more is the huge amount of time and effort required to produce a single coloured illustration. I used to sketch, ink, scan, then colour… and end up feeling like the paper sketch looked better than the digitized, coloured version (scanners have shitty dynamic range, by the way—I eventually built a photographic ring light just to be able to preserve details in sketches and forego inking).


An illustrative example from Final Fantasy VIII. True story.

The colouring process was improved and sped up quite a bit when I bought an inexpensive USB Wacom graphire tablet a few years ago; I moved away from “cell shading” and achieved smoother results, quicker. Why Wacom? They’re the industry standard, their stylus/pen uses electromagnetic induction (no batteries required), and their hardware works out of the box with GNU/Linux and GNOME.

A piece was still missing, however: I was never able to sketch on a USB tablet because I really need my hand to be physically coordinated with my gaze. I could not replicate the precision feel of paper with a “desk-top” tablet. Drawing directly on the screen is an illustrator’s dream, and there’s the Wacom Cintiq for that, but it is so outrageously expensive that it is not something you can really justify to yourself if you do occasional art “for the sake of it”.

On the other hand, Linux geeks like me tend to love Thinkpads: they are very well supported by Linux, their design is timeless and robustness legendary. Well, guess what? It just so happens that the tablet variants of the Thinkpad X series are actually inexpensive Wacom screens. And you can find them used much more easily than a Cintiq. I took a mental note to try one of these one day.

First illustration in years

Except on one recent occasion, I have not made any sketches in the past three years, and I have not made a full-colour drawing for over five years.

Trying out a stylus-enabled Thinkpad today, I managed to replicate the feel of my mechanical pen on paper, by setting the zoom to 25% in MyPaint and using David Revoy’s “pen” brush with a 2.0 opacity setting (default size and hardness). Since you’re at 25% zoom, you can zoom in to clean way more details than what would be possible on paper.

I was quite pleased to see that, much like riding a bicycle, drawing is a skill that does not really fade away in the absence of practice. With my new toolset (MyPaint + Thinkpad tablet + GNOME 3.12), I spent one hour sketching and two hours colouring, leading to this result:


I think this is simultaneously the quickest and most satisfying color illustration I’ve done so far. I rarely get the feeling that the coloring process was worth it, but this time I did. It took me more time to write this enthusiastic blog post than to draw the damn thing.

Speaking broadly, GNOME Shell and GTK3 applications are quite nice to use with a touchscreen: thumb scrolling (I miss that in my browser), dragging windows in/out of your virtual workspace, tiling or maximizing them using screen edges, etc.

GNOME’s distraction-free desktop experience, combined with MyPaint’s minimalist interface, is quite soothing. I’m looking forward to the GTK3 version of MyPaint, which is said to use symbolic icons and the dark variant of Adwaita. I hope it will also feature GTK3 client-side decorations to save vertical space.

GNOME Control Center provides a nice Wacom tablet/screen configuration and calibration tool. It allowed me to remap the stylus’ sidebutton to middle-click (useful for panning around an image) and to calibrate the screen. However, the screen calibration yields incorrect results when the screen is rotated, whereas I was able to achieve perfect accuracy when the screen is not rotated. Filed bug #732442 about it.


It is amazing that we’ve come to a point where, barring the bug above, everything just works and is a pleasure to use.

Artists should prioritize the use of Free Software creative applications (like MyPaint, GIMP, Inkscape, Pitivi and others) and a Free Software desktop/OS like GNOME, rather than relying on proprietary platforms:

  • Those tools are available to everyone.
  • They will continue to improve and be available to everyone.
  • They use open file formats and standards to ensure that, down the road, even in ten years, I will still be able to easily open my digital art files. The .ora format is not only MyPaint’s default format, it is also supported by GIMP, Krita, Scribus, Pinta, etc.

What concerns me is that I’m one of those “odd beasts” that love traditional devices like Thinkpad laptops. I’m wondering if many artists are now moving towards consumer tablets—the “disposable” ones, without a keyboard and user-replaceable operating system or battery. It would be great if, with GNOME OS, we could provide a compelling alternative to that and ensure the long-term freedom of artists.

by nekohayo at June 29, 2014 08:41 PM

GStreamer: GStreamer Core, Plugins and RTSP server 1.3.90 release candidate


The GStreamer team is pleased to announce the first release candidate of the stable 1.4 release series. The 1.4 release series is adding new features on top of the 1.0 and 1.2 series and is part of the API and ABI-stable 1.x release series of the GStreamer multimedia framework.

This release candidate will hopefully shortly be followed by the stable 1.4.0 release if no bigger regressions or bigger issues are detected, and enough testing of the release candidate happened. The new API that was added during the 1.3 release series is not expected to change anymore at this point.

Binaries for Android, iOS, Mac OS X and Windows are provided together with this release.

The stable 1.4 release series is API and ABI compatible with 1.0.x, 1.2.x and any other 1.x release series in the future. Compared to 1.2.x it contains some new features and more intrusive changes that were considered too risky as a bugfix.

Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, or gst-rtsp-server, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, or gst-libav, or gst-rtsp-server.

Check the release announcement mail for details and the release notes above for a list of changes.

Also available are binaries for Android, iOS, Mac OS X and Windows.

June 29, 2014 01:00 PM

June 25, 2014

Bastien Nocera: Firewalls and per-network sharing

(Bastien Nocera) Firewalls

Fedora has had problems for a long while with the default firewall rules. They would make a lot of things not work (media and file sharing of various sorts, usually, whether as a client or a server) and users would usually disable the firewall altogether, or work around it through micro-management of opened ports.

We went through multiple discussions over the years trying to break the security folks' resolve on what should be allowed to be exposed on the local network (sometimes trying to get rid of the firewall). Or rather we tried to agree on a setup that would be implementable for desktop developers and usable for users, while still providing the amount of security and dependability that the security folks wanted.

The last round of discussions was more productive, and I posted the end plan on the Fedora Desktop mailing-list.

By Fedora 21, Fedora will have a firewall that's completely open for the user's applications (with better tracking of what applications do what once we have application sandboxing). This reflects how the firewall was used on the systems that the Fedora Workstation version targets. System services will still be blocked by default, except a select few such as ssh or mDNS, which might need some tightening.

But this change means that you'd be sharing your music through DLNA on the café's Wi-Fi right? Well, this is what this next change is here to avoid.

Per-network Sharing

To avoid showing your music in the café, or exposing your holiday photographs at work, we needed a way to restrict sharing to wireless networks where you'd already shared this data, and provide a way to avoid sharing in the future, should you change your mind.

Allan Day mocked up such controls in our Sharing panel which I diligently implemented. Personal File Sharing (through gnome-user-share and WebDAV), Media Sharing (through rygel and DLNA) and Screen Sharing (through vino and VNC) implement the same per-network sharing mechanism.

Make sure that your versions of gnome-settings-daemon (which implements the starting/stopping of services based on the network) and gnome-control-center match for this all to work. You'll also need the latest version of all 3 of the aforementioned sharing utilities.

(and it also works with wired network profiles :)

by Bastien Nocera (noreply@blogger.com) at June 25, 2014 05:32 PM

June 24, 2014

Zeeshan Ali: oFono? It's dead, Jim!

(Zeeshan Ali)
Soon after I mentioned the need for an oFono backend in Geoclue in my blog, Sri kindly helped me get in touch with the oFono developers. What started as a nice, friendly discussion soon turned into a not-so-nice one. I won't get into details or assign blame, but here is what I found out about the project:

  •  oFono developers claim that it is still a maintained project, while the rest of the world, even people who love the project, thinks it's dead. The last release was in 2012, loads of essential features are missing (see the rest of the points below), and the mailing-list link on the homepage is broken (even though I pointed it out 3 weeks ago, and it has been broken for much longer). It all points to the fact that it's essentially a dead project.

  • No proper D-Bus introspection nor any client libraries. This already makes it extremely difficult to work with oFono, but wait, there are more hurdles on the way.

  • No online cross-referenced documentation: the documentation link on the homepage leads you to an architecture diagram and gives you no information about the API. Searching through Google doesn't yield any results either. The developers pointed out that all documentation lives in the source tree in the form of plain-text documents, and is hence not very appropriate for the web.

  • It does not implement the standard D-Bus Properties interface. Combined with the fact that their D-Bus API is heavily based on properties, this makes it even harder to work with oFono, not to mention very weird.

  • None of the modern 3G modems are supported, at least out of the box. I tried both the Option and Qualcomm Gobi modems that I have, and neither worked.

While the rest of the issues can be overcome, the last one makes it impossible for me to test any code written against oFono. So I'm giving up on this.

With a nice alternative available that is well-maintained, works out of the box with most modems, has a nice, object-oriented, well-documented and introspectable D-Bus API, and also provides a nice client library, I don't understand why phone vendors insist on supporting the oFono interface. Could it be because the name makes it sound like it's *the* solution for phones? Well, they do have the right to use whatever they like. I'm just curious.

Having said all that, I did make it easy for anyone to add oFono support to Geoclue as promised in my last blog post. Patches to add that are more than welcome!

June 24, 2014 01:07 AM

June 23, 2014

Jean-François Fortin Tam: A quick Pitivi status update for June 2014

I have been extremely busy in the past few months and thus have not been able to spend much time at all on Pitivi, but here’s a quick status update about the work of others in case you missed it:

  • In follow-up to our initial look at this summer’s GSoC projects for Pitivi:
  • Mathieu and Thibault have been hard at work to bring editing-ready MPEG-TS support for GStreamer, so you can finally edit those quirky AVCHD camcorder videos in Pitivi. Learn more about what they did in this blog post. The official fix, due for GStreamer 1.4, comes as part of bug #675132. For a historical sense of scale, see also this meta-bug on Launchpad and GStreamer bug #550634.
  • We will be at GUADEC. As per tradition, we will have a week-long Pitivi hackfest after the core presentations days. Do not hesitate to come hang around and get involved!

I will start preparing my GUADEC Pitivi talk soon. If there are topics or questions you would like to see covered, feel free to leave a comment to let me know.

by nekohayo at June 23, 2014 06:40 PM

June 22, 2014

GStreamer: GStreamer Core and Plugins 1.3.3 development release


The GStreamer team is pleased to announce the third release of the unstable 1.3 release series. The 1.3 release series is adding new features on top of the 1.0 and 1.2 series and is part of the API and ABI-stable 1.x release series of the GStreamer multimedia framework. The unstable 1.3 release series will lead to the stable 1.4 release series in the next weeks, and newly added API can still change until that point.

This is hopefully the last 1.3 development release; it will be followed by the first 1.4.0 release candidate (1.3.90) in 1-2 weeks, which then will hopefully be followed by 1.4.0 in early July.

Binaries for Android, iOS, Mac OS X and Windows will be provided separately during the unstable 1.3 release series.

Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, or gst-rtsp-server, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, or gst-rtsp-server.

Check the release announcement mail for details and the release notes above for a list of changes.

June 22, 2014 08:00 PM

Jean-François Fortin Tam: Freeing yourself from Bell and Vidéotron: pass Go and collect $1000 a year

This is a guide, based on my experience, to freeing yourself from the hegemony of providers like Bell and Vidéotron while improving your knowledge of the technology. Note that this guide is highly specific to the near-total absence of competition found in Québec. The French, for their part, have Free (among others).

Right off the bat, I don't care what promotional offers the salespeople at Bell and Vidéotron can dig up to seduce you:

  • Discounted rates for the next 12 months mean nothing. I look at the long term (5-10 years) and always evaluate offers at the “regular rate”.
  • Likewise, I ignore any “package deal” offer: the classic “move all your services to us! phone + Internet + television! You'll save money!” is only true if you don't have the patience or the audacity to set up what I propose in this post (for example, I know someone who is quite happy to pay $50 a month for cable just to get TV5…), or if you think $100 a month is a “bargain” (if that's the case, I have a nice, almost-new bridge to sell you, you're going to love it).

How much does a basic analog “cable” TV subscription cost in Québec? If my aunt's bills are to be believed, a minimum of $39 a month. Note that the analog offering is no longer sold today; you now have to subscribe to digital cable, where Vidéotron requires a set-top box – and therefore there is no way to have several TVs without paying more for each set (then again, who would be crazy enough to want more than one TV? One is already enough to zombify you).

And how much does it cost to get a plain basic phone line from Bell?

“Starter” residential line: $28.09
+ $6.95 “network access fee” (bullshit!)
+ $6.95 provincial long-distance plan
+ $0.16 for the 911 emergency call service
+ $0.40 municipal tax for 911 (!)
+ $2.80 for “touch-tone service” (bullshit!)
+ taxes

… for a total of $52. Well done, guys. You've found a way to bill practically double the price of high-speed Internet access… for the state of the art of 1940s technology!

slow clap

The approach I advocate here covers:

  • High-speed cable Internet. Important. The trick doesn't work as well with ADSL, because Bell will charge you “dry loop” fees, which is an outrage. In short, without a phone line, in 2014 it is cheaper to get Internet over cable than over ADSL. Personally I recommend Teksavvy, but you may already have a favourite wholesale ISP.
  • VoIP telephony, without disrupting your habits, without having to use a computer, and without losing your phone number. Monthly cost: one to two dollars a month for normal people (if you want to set up a call center with 300 employees, that's another matter).
  • Terrestrial digital television. High-definition ATSC is better quality than anything Bell or Vidéotron can offer you in HD! Monthly cost: $0, it's over the air! Obviously, that's only if, like many people in my family, you are content with the handful of main public channels (5 to 10 local channels, depending on your reception location). A very large proportion of Quebecers are mainly interested in Radio-Canada, CBC, V, Télé-Québec and TVA. It's up to you to judge whether a few more premium channels are worth paying $40-50 a month for.

…all of this for about $30 to $35 a month before taxes (the $25 cost of the basic Internet connection, plus a conservative estimate of $5 to $10 in call charges at 1¢/minute). No scams. No kidding.


The only catch is that you must not be afraid of the Internet, and you need to be somewhat comfortable with technology (otherwise, ask your little nephew who wears t-shirts with a penguin on them; if you have no such nephew, you can always hire me to do it, but I cost a lot more). In particular, you need to be patient and resourceful enough to make sense of messes like this one (my aunt's house was very old and its phone wiring was a pile of vile hacks):


Doubled-up surge protectors, connected to one another, eventually leading to a random plumbing pipe, with various cut wires dangling all over the place.


Splitters like this one were absolutely everywhere, daisy-chained to one another. Or connected to nothing at all.


A master ADSL filter, now useless since we had just switched to cable Internet.

…and of course, all the handsets and wall jacks were wired with cheap thirty-metre RJ11 extension cords running over the splitters. Next to that, the other house was super clean (especially after I added labels to identify the components):

The telephone wiring of the other house, before the conversion to VoIP

The telephone wiring of the other house, before the VoIP conversion. Note the simple ADSL splitter and filter at the demarcation point: my modem sat right there and the rest of the house was filtered at a single point. It worked quite well.

Required hardware:

  • A cable modem. In my case I didn't have one; I decided to go overboard and buy a brand-new modem supporting DOCSIS 3.0. A Thomson DCM475 costs about $100 if you are extremely lazy and ask your ISP to sell you one, otherwise about $65-70 new from Canadian online retailers such as NCIX or DirectCanada.
  • An analog telephone adapter (ATA) to make VoIP completely transparent in your home. In short, the box the phones plug into, itself plugged into your router/switch for Internet access. In my case, a Grandstream Handytone 701 (I previously had the venerable Handytone 486… they are much the same). $37 online.
  • A good outdoor UHF/VHF antenna to receive over-the-air ATSC digital television (not a "rabbit ears" antenna). In my case, I already had two good antennas on hand. Otherwise, $50-70 in stores.
  • A modern North American high-definition television (with a built-in ATSC tuner), or, if you want to keep your indestructible old CRT set, an ATSC receiver/converter like the Homeworx HW-150PVR for $50 from online retailers (on store shelves, you will likely pay quite a bit more). Why no brand-new television for my aunt? Because she would not see the slightest difference even with 20/20 vision. On an HD television, my parents have trouble telling an HD movie from an SD one; and honestly, once absorbed in the story, even I stop caring.

So, in my case, total hardware spending came to $260 (including taxes and shipping) for two houses (two modems, one ATSC converter, two ATAs). For a single house, it would be around $150. Do the math. Your mileage will vary.

Additional but strongly recommended equipment:

  • A UPS (uninterruptible power supply) for your network equipment (modem, router, ATA). About $50-70 at a big-box store near you (e.g. Costco). Power outage? The phone keeps working for as long as the UPS holds up.
  • A router that can run Tomato, and some networking knowledge to get a rock-solid QoS setup on your Internet connection (otherwise, ask the nephew in the penguin t-shirt). You do not want YouTube, Steam, wget or BitTorrent to be prioritized over the VoIP line! Also, you probably do not want your ATA to be your gateway to the Internet: most ATAs are rather dumb and vulnerable to outside attacks. For example, a few days after the firewall was removed at my aunt's, her phones started ringing randomly at all hours of the day and night: calls with nobody on the line, no tone, and no trace in VoIP.ms's call logs. In short, the phenomenon known as "phantom calls" or "ghost rings". I then secured the system and temporarily reinstated the firewall, which solved the problem.

The steps:

  1. Pre-checks with the ISP. In my case (Teksavvy) I was told that the activation fee and the modem price are non-negotiable: "go ahead, buy your modem wherever you like and call us back to start the process".
  2. Order all the gear.
  3. Start the migration process:
    1. Call the ISP to migrate to cable. Be prepared for the monstrous $85 activation fee.
    2. Using the information on the Bell invoice, start the migration ("LNP") of the phone number to a DID at your favourite VoIP provider (in my case: VoIP.ms, a Montreal company with very low latency and good sound quality; the LNP fee is $10, your mileage may vary).
  4. Run tests once the hardware has arrived, the new ISP has set everything up and the phone number has been migrated to VoIP.
  5. Call Bell and Vidéotron to cancel all services. Accept no counter-offer. Take no prisoners. We do not negotiate with terrorists.

A comparison in my aunt's case:

  • One year at regular rates with high-speed Internet, a Bell residential phone line and cable television from Vidéotron: $1432.59
  • With my solution (high-speed Internet, VoIP phone line, ATSC television): ~$362.31, i.e. ~$1070 in savings every year.
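The arithmetic behind that comparison is easy to check (a quick sketch; the dollar amounts are the ones quoted above):

```python
# Yearly cost comparison, figures taken from the post (before taxes where noted).
bell_videotron_yearly = 1432.59   # Internet + Bell landline + Vidéotron cable TV
diy_yearly = 362.31               # cable Internet + VoIP line + free ATSC TV

savings = bell_videotron_yearly - diy_yearly
print(f"Yearly savings: ${savings:.2f}")   # about $1070 per year
```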

How much does all this cost in practice with pay-per-minute service (no subscription) at VoIP.ms? Well, over the period from March 1st, 2014 to June 21st, 2014, for three phone lines (my aunt's, my parents' and mine, the last being negligible since I phone very little), we made more than 2000 calls, totalling 100 hours of conversation, costing… $52 (about $7 per house per month). Even adding roughly $2 in fixed monthly fees for the incoming numbers and 911, we still come out at a fraction of the big carriers' prices.

Note: I have no business relationship with Teksavvy or VoIP.ms; I am simply a long-time satisfied customer and mention them here because these are indeed the two I recommend to people around me.

by nekohayo at June 22, 2014 04:57 AM

June 18, 2014

Gustavo Orrillomirador-sorting

It is now almost three years since I moved to Boston to start working at Fathom Information Design and the Sabeti Lab at Harvard. As I noted back then, one of the goals of this work was to create new tools for exploring complex datasets (mainly of epidemiological and health data) which could potentially contain […]

by ac at June 18, 2014 02:27 PM

June 10, 2014

George KiagiadakisGStreamer Wayland GTK Demo

During the past few months I’ve been occasionally working on integrating GStreamer better with wayland. GStreamer already has a ‘waylandsink’ element in gst-plugins-bad, but the implementation is very limited. One of the things I’ve been working on was to add GstVideoOverlay support in it, and recently, I managed to make this work nicely embedded in a GTK+ window.

GStreamer Wayland GTK Demo

gtk+ video player demo running on weston

I’m happy to say that it works pretty well, even though GTK does not support wayland sub-surfaces, which was being thought of as a problem initially. It turns out there is no problem with that, and both the GTK and GstVideoOverlay APIs are sufficient to make this work. However, there needs to be a small addition in GstVideoOverlay to allow smooth resizing. Currently, I have a GstWaylandVideo API that includes those additions.

This essentially means that as soon as this work is merged (hopefully soon), there is nothing stopping applications like totem from being ported to wayland :D

I believe embedding waylandsink in Qt should also work without any problems, I just haven’t tested it.

If you are interested, check the code. The code of the above running demo can also be found here, and the ticket for merging this branch is being tracked here.

I should say many thanks here to my employer, Collabora, for sponsoring this work.

by gkiagia at June 10, 2014 12:24 PM

May 28, 2014

Zeeshan AliLocation hackfest 2014 report

(Zeeshan Ali)
So the Location hackfest 2014 took place at the awesome Mozilla offices in London during last weekend. Even though some of the important participants didn't manage to be physically present, enough people did:
  • John Layt (KDE)
  • Hanno Schlichting (Mozilla)
  • Mattias Bengtsson (GNOME)
  • Jonas Danielsson (GNOME)
and some participated remotely:
  • Bastien Nocera (GNOME)
  • Garvan Keeley (Mozilla)
Unfortunately Aaron McCarthy of Jolla couldn't even attend remotely, as he lives in a very incompatible timezone (AU), but we had a lot of productive discussion with him through email, which still continues.

Some very fruitful discussions we had:

  • Why doesn't Mozilla make the wifi data it gathers for its location service available for everyone to download? Hanno explained in great detail how making this data available would seriously compromise the privacy and even the safety of people. One good example given was someone getting out of an abusive relationship and not wanting to be traceable by their ex-partner; if they take their wifi router with them, their ex has an easy way to track them using the wifi database. There is an easy (if very ugly) way to keep your AP from being scanned by harvesters for such services, but most people do not possess enough technical knowledge to know to enable it.

    Hence their reluctance to making it available for download, even though they'd want to. If you are interested in more details, you should read up all about that on Hanno's blog.
  • Had some discussion about Firefox and Firefox OS using Geoclue2. Hopefully we'll at least have Firefox using Geoclue2 soon. Unfortunately, I might need to add support for the totally unmaintained ofono in Geoclue2 to make a very compelling case for Firefox OS to adopt Geoclue2.
  • We had a discussion about A-GPS support in Geoclue. There are two possible ways to do that:
    1. Give the URL of a SUPL service to the modem and let it do everything for you.
    2. Get the geospatial data (that a SUPL service would provide) from a custom service and feed that to the modem.
    Hanno informed us that the only free SUPL implementation out there is Google's, but nobody knows what the ToS really are. He also informed us that many modem chipsets just don't implement the API to feed them geospatial data, which makes SUPL our only hope for implementing A-GPS.
  • There was a discussion about POI and check-in UI in Maps between me and Mattias. We had a bit of disagreement about it, but it seems we are now coming to some conclusions about how it should look.
It was a hackfest so we also did some hacking:

  • John spent most of his time getting familiar with Qt's location code and how to port it to Geoclue2. He wrote a nice post about it so I won't get into details here.
  • Mattias worked tirelessly to finish off his routing branch so it can finally be merged. It's not a very easy task, so it's not surprising that he hasn't managed to finish it yet. I'm pretty hopeful it will be merged in the next few weeks.
  • Hanno added proper support for geoip-only queries in the Mozilla location service, made it cope better with queries containing stale wifi data, and improved the accuracy of results from 300m to 100m, among other things.
  • Jonas was doing live reviews of Mattias' patches (in Swedish!) and at the same time working on getting command-line options parsing to work in gjs so we can do so in Maps.
  • Garvan was working on adding Geoclue2 support to Firefox/Gecko.
  • I finished off my patches to port geoclue2 to directly use wpa_supplicant rather than NetworkManager, which makes wifi-geolocation work on FreeBSD, Firefox OS and Jolla. The last two don't use Geoclue2, but I'm hoping that this is a step towards convincing them to use it. While at it, I provided a patch to wpa_supplicant to make its D-Bus policy a bit more lenient.

    I also looked into the ofono API, but not only is the project unmaintained, it doesn't provide proper introspection on D-Bus and there are no API docs. :( To make things worse, neither of my modems seems to work, at least out of the box. I'd really rather not have to deal with it, but if I can't convince the Firefox OS folks to provide the ModemManager API, adding ofono support is essential to get them to use Geoclue.

    I started refactoring the Modem sources in Geoclue so that:
    • all ModemManager code is isolated in its own module, so that it is easy to add ofono handling code without changing anything in the sources themselves.
    • the 3G source can more easily/cleanly share code with the Wifi source, use the Mozilla Location Service (rather than opencellid, which it currently uses) and also submit cell tower data to Mozilla.
I can't thank Mozilla, and specifically Chris Lord, enough for hosting this event.

May 28, 2014 12:06 PM

May 21, 2014

GStreamerGStreamer Core and Plugins 1.3.2 development release


The GStreamer team is pleased to announce the first release of the unstable 1.3 release series. The 1.3 release series is adding new features on top of the 1.0 and 1.2 series and is part of the API and ABI-stable 1.x release series of the GStreamer multimedia framework. The unstable 1.3 release series will lead to the stable 1.4 release series in the next few weeks, and newly added API can still change until that point.

Binaries for Android, iOS, Mac OS X and Windows will be provided separately during the unstable 1.3 release series.

Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, or gst-rtsp-server, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, or gst-rtsp-server.

Check the release announcement mail for details and the release notes above for a list of changes.

May 21, 2014 03:00 PM

May 18, 2014

Andy Wingoeffects analysis in guile

(Andy Wingo)

OK kids, so I had a bit of time recently. I've been hacking on Guile's new CPS-based compiler, which should appear in a stable release in a few months. I have a few things to write about, but today's article is on effects analysis.

what it is, yo

The job of the optimization phase of a compiler is to remove, replace and reorder the sub-expressions of a program. The optimizer recognizes ways in which the program can be better, and then needs to check if those transformations are valid. A transformation is valid if a program exhibits the same behavior both before and after the transformation.

Effects analysis is a proof technique that builds a conservative model of how expressions can affect each other. It can be used to prove the validity of some transformations. For example, if we determine that an expression A reads the first field in a vector, and expression B sets the second field in a vector, then we can use effects analysis to show that the expressions don't affect each other's value and can be freely reordered, for example to hoist either one out of a loop.

In effects analysis, we model the program state coarsely and conservatively with a limited set of categories. The precise set of categories depends on the domain. In graphics, for example, you might have a state bit for the current coordinate system, the current color, the current fill mode, etc. In general-purpose compilation, the effects are mostly about memory and (sometimes) exceptions.

In Guile we model four kinds of effects: type checks (T), allocations (A), reads (R), and writes (W). Each of these effects is allocated to a bit. An expression can have any or none of these effects.
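To make the bit allocation concrete, here is a minimal sketch of the idea in Python (an illustration of the scheme described above, not Guile's actual representation; the per-primitive effect assignments are examples):

```python
# Each effect kind gets one bit; an expression's effects are the bitwise OR.
TYPE_CHECK = 1 << 0   # T: may throw if arguments are out of range
ALLOCATE   = 1 << 1   # A: allocates a fresh object with a fresh identity
READ       = 1 << 2   # R: reads from memory
WRITE      = 1 << 3   # W: writes to memory

def effect_free(effects):
    """No bits set: the expression can be deleted or reordered freely."""
    return effects == 0

# Example per-primitive effects: + may type-check its arguments,
# cons allocates, vector-ref type-checks and reads.
plus_effects = TYPE_CHECK
cons_effects = ALLOCATE
vector_ref_effects = TYPE_CHECK | READ

print(effect_free(plus_effects))          # False: it may throw
print(bool(vector_ref_effects & READ))    # True: it reads memory
```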

For the purposes of Guile, a "type check" is the possibility that this expression will throw an exception if its arguments are somehow out of range. For example, cons will never throw an exception, except in out-of-memory situations, but + may throw an exception if its arguments aren't numbers. However if an expression is dominated by a copy of itself -- that is, if we see that (+ a b) (which may throw an exception) is dominated by (+ a b), perhaps after CSE -- then we can determine that only the first will exhibit the type-check effects, and that the second will not.

Allocation indicates that an expression allocates a fresh object. In Scheme (and many other languages), each allocated object has its own identity: (eq? (cons 1 2) (cons 1 2)) must be false. Note that this restriction does not apply to constant literals, in Scheme; (eq? '(1 . 2) '(1 . 2)) may or may not be true. In Guile both results are possible. They are the same object when compiled (and thus deduplicated), but different when interpreted; the objects are just the ones returned from `read' and thus are different. Anyway we use this allocation bit to indicate that an expression allocates a fresh object with a fresh identity.

The remaining effect kinds are "read" and "write", which indicate reads or writes to memory. So there are just 4 kinds of effects.

Allocation, read, and write effects are associated at run-time with particular memory locations. At compile-time we cannot in general know which of these locations are distinct, and which are actually the same. To simplify the problem, we simply record the type of the object, knowing that (say) a pair and a vector will never be at the same location. We also record the field in the object in the case of objects with multiple fields. There are special catch-all values to indicate "all memory kinds", when we really don't know what an expression will do (which is the case for all expression kinds without specific support in the effect analyzer), and for "all fields" when we don't know which field we are accessing.

One example of the use of this analysis is in common subexpression elimination (CSE). If at a program point P we have a set of available expressions A, then to determine which members of A are still available after the expression at P, you subtract members of A that are clobbered by P. Computation of A at each P plus value numbering is most of CSE; but more on that in some later dispatch. Anyway here's the definition of effect-clobbers?.

    (define (effect-clobbers? a b)
      "Return true if A clobbers B.  This is the case
    if A might be a write, and B might be a read or a
    write to the same location as A."
      (define (locations-same?)
        (let ((a (ash a (- &effect-kind-bits)))
              (b (ash b (- &effect-kind-bits))))
          (or (eqv? &unknown-memory-kinds
                    (logand a &memory-kind-mask))
              (eqv? &unknown-memory-kinds
                    (logand b &memory-kind-mask))
              (and (eqv? (logand a &memory-kind-mask)
                         (logand b &memory-kind-mask))
                   ;; A negative field indicates "the
                   ;; whole object".  Non-negative fields
                   ;; indicate only part of the object.
                   (or (< a 0) (< b 0) (= a b))))))
      (and (not (zero? (logand a &write)))
           (not (zero? (logand b (logior &read &write))))
           (locations-same?)))

This analysis is simple, small, and fast. It's also coarse and imprecise -- if you are reading from and writing to two vectors at once, you're almost sure to miss some optimization opportunities as accesses to all vectors are conflated into one bit. Oh well. If you get into this situation, you'll know it, and be able to invest a bit more time into alias analysis; there's lots of literature out there. A simple extension would be to have alias analysis create another mapping from expression to equivalence class, and to use those equivalence classes in the same-location? check above.
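To see the clobber logic outside of Scheme, here is a Python model of the same test; it uses tuples instead of Guile's packed bitmask, and the memory-kind names are invented for illustration:

```python
# Model of the clobber test: an effect is (kinds, memory_kind, field).
# kinds is a set of 'read'/'write'; memory_kind 'unknown' is the catch-all
# "all memory kinds"; field None means "the whole object".

def locations_same(a, b):
    (_, akind, afield), (_, bkind, bfield) = a, b
    if akind == 'unknown' or bkind == 'unknown':
        return True                  # conservatively assume an overlap
    if akind != bkind:
        return False                 # a pair and a vector never alias
    return afield is None or bfield is None or afield == bfield

def effect_clobbers(a, b):
    """True if A might write to a location that B might read or write."""
    return ('write' in a[0]
            and bool(b[0] & {'read', 'write'})
            and locations_same(a, b))

write_field0 = ({'write'}, 'vector', 0)
read_field1  = ({'read'},  'vector', 1)
print(effect_clobbers(write_field0, read_field1))                  # False: disjoint fields
print(effect_clobbers(write_field0, ({'read'}, 'vector', None)))   # True: whole object
```

Writes to distinct fields of the same object kind are reorderable, while anything touching an "unknown" memory kind conservatively clobbers everything.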

Of course this assumes that expressions just access one object. This is the case for Guile's CPS intermediate language, because in CPS, as in SSA or ANF, expressions don't have subexpressions.

This contrasts with direct-style intermediate languages, in which expressions may have nested subexpressions. If you are doing effects analysis on such a language, it's probably more appropriate to allocate a bit to each kind of effect on each kind of object, so that you can usefully union effects for a tree of expressions. Since you don't have to do this for CPS, we can allocate a fixed bit-budget towards more precision as to which field of an object is being accessed. The inability to be precise as to which field was being accessed due to the direct-style IL was one of the problems in Guile's old CSE pass.

Finally, a note about type checks. Guile includes type checks as part of the effects analysis for two reasons. The first is the obvious case of asking whether an expression is effect-free or not, which can lead to some optimizations in other parts of the compiler. The other is to express the potential for elision of duplicate expressions, if one dominates the other. But it's also possible to remove type checks in more cases than that: one can run a type inference pass to remove type-check effects if we can prove that the arguments to an expression are in range. Obviously this is more profitable for dynamically-typed languages, but the same considerations apply to any language with sum types.

Guile's effects analysis pass is here. V8 seems to have two effects analysis passes; one is in effects.h and typing.cc, and operates over the direct-style AST, and the other is in the value numbering pass (hydrogen-instructions.h and hydrogen-gvn.h; search for GVNFlag).

More on how effects analysis is used in Guile in a future missive. Until then, happy hacking.

by Andy Wingo at May 18, 2014 07:19 PM

May 16, 2014

Sebastian DrögeHTTP Adaptive Streaming with GStreamer

(Sebastian Dröge)

Let’s talk a bit about HTTP Adaptive Streaming and GStreamer, what it is and how it works. Especially the implementation in GStreamer is not exactly trivial and can be a bit confusing at first sight.

If you’re just interested in knowing whether GStreamer supports any HTTP adaptive streaming protocols, and which ones, you can stop after this paragraph: yes, and there are currently elements for handling HLS, MPEG DASH and Microsoft SmoothStreaming.

What is it?

So what exactly do I mean when talking about HTTP Adaptive Streaming? There are a few streaming protocols out there that basically work the following way:

1. You download a Manifest file with metadata about the stream via HTTP. This Manifest contains the location of the actual media, possibly in multiple different bitrates and/or resolutions and/or languages and/or separate audio or subtitle streams or any other different type of variant. It might also contain the location of additional metadata or sub-Manifests that provide more information about a specific variant.
2. The actual media is also downloaded via HTTP and split into fragments of a specific size, usually 2 to 10 seconds. Depending on the actual protocol these separate fragments can be played standalone or need additional information from the Manifest. The actual media is usually in a container format like MPEG TS or a variant of ISO MP4.

Examples of this are HLS, MPEG DASH and Microsoft SmoothStreaming.
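As a toy illustration of point 1, an HLS-style media playlist can be reduced to a list of (fragment URI, duration) pairs (a deliberately minimal sketch; real Manifest parsing handles many more tags and edge cases):

```python
def parse_media_playlist(text):
    """Toy parser: pair each #EXTINF duration with the URI line that follows it."""
    fragments, duration = [], None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('#EXTINF:'):
            # "#EXTINF:9.8,<title>" -> take the duration before the comma
            duration = float(line[len('#EXTINF:'):].split(',')[0])
        elif line and not line.startswith('#'):
            fragments.append((line, duration))   # a media fragment URI
            duration = None
    return fragments

playlist = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:9.8,
fragment0.ts
#EXTINF:10.0,
fragment1.ts
#EXT-X-ENDLIST"""
print(parse_media_playlist(playlist))
# [('fragment0.ts', 9.8), ('fragment1.ts', 10.0)]
```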

These kinds of protocols are used for both video-on-demand and “live” streaming, but they obviously can’t provide low-latency live streams. When used for live streaming it is usually required to add more than one fragment of latency and then download one or more fragments, reload the playlist to get the location of the next fragments and then download those.


You might wonder why one would want to implement such a complicated protocol on top of HTTP that can’t even provide low-latency live streaming, and why one would choose it over other streaming protocols like RTSP, or any RTP-based protocol, or simply serving a stream over HTTP at a single location.

The main reason for this is that these other approaches don’t allow usage of the HTTP-based CDNs that are deployed via thousands of servers all around the world, and that they don’t allow using their caching mechanisms. One would need to deploy a specialised CDN just for this other kind of streaming protocol.

A secondary reason is that especially UDP-based protocols like RTP, but also anything else not based on HTTP, can be a bit complicated to deploy because of all the middle boxes (e.g. firewalls, NATs, proxies) that exist in almost any network out there. HTTP(S) generally works everywhere; in the end it’s the only protocol most people consciously use or know about.

And yet another reason is that this splitting of the streams into fragments makes it trivial to implement switching between bitrates or any other stream alternatives at fragment boundaries.

GStreamer client-side design

In GStreamer the above mentioned three protocols are implemented, and after trying a few different approaches to implementing this kind of protocol, all of them converged to a single design. This is what I’m going to describe now.


The naive approach would be to implement all this as a single source element. Because that’s how network protocols are usually implemented in GStreamer, right?

While this seems to make sense, there’s one problem with it. There is no separate URI scheme defined for such HTTP adaptive streams. They all use normal HTTP URIs that point to the Manifest file. And GStreamer chooses the source element that should be used based on the URI scheme alone, and especially not based on the data that would be received from that URI.

So what are we left with? HTTP URIs are handled by the standard GStreamer elements for HTTP, and using such a source element gives us a stream containing the Manifest. To make any sense of this collection of bytes and detect its type, we additionally have to implement a typefinder that provides the media type based on looking at the data and actually tells us that this is e.g. an MPEG DASH Manifest.
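Such a typefinder boils down to content sniffing on the first bytes of the stream. As a sketch (this is not GStreamer's actual typefind code, and the exact media-type strings GStreamer registers may differ):

```python
def typefind_manifest(data: bytes):
    """Guess an adaptive-streaming manifest type from its first bytes."""
    head = data.lstrip()[:4096]
    if head.startswith(b'#EXTM3U'):
        return 'application/x-hls'               # HLS playlist
    if head.startswith(b'<'):                    # some XML document
        if b'<MPD' in head:
            return 'application/dash+xml'        # MPEG DASH MPD
        if b'<SmoothStreamingMedia' in head:
            return 'application/vnd.ms-sstr+xml' # Microsoft SmoothStreaming
    return None                                  # not a known manifest

print(typefind_manifest(b'#EXTM3U\n#EXT-X-VERSION:3\n'))
# application/x-hls
```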


This Manifest stream is usually rather short and is followed by the EOS event like every other stream. We now need another element that does something with this Manifest and implements the specific HTTP adaptive streaming protocol. In GStreamer terminology this would act like a demuxer: it has one stream of input and outputs one or more streams based on that. Strictly speaking it’s not really a demuxer though, as it does not demultiplex the input stream into the separate streams, but that’s just an internal detail in the end.

This demuxer now has to wait until it has received the EOS event of the Manifest, then needs to parse it and do whatever the protocol defines. Specifically, it starts a second thread that handles the control flow of the protocol. This thread downloads any additional resources specified in the Manifest, decides which media fragment(s) are supposed to be downloaded next, starts downloading them and makes sure they leave the demuxer on the source pads in a meaningful way.

The demuxer also has to handle the SEEK event and, based on the time specified in there, jump to a different media fragment.

Examples of such demuxers are hlsdemux, dashdemux and mssdemux.

Downloading of data

For downloading additional resources listed in the Manifest that are not actual media fragments (sub-Manifests, reloading the Manifest, headers, encryption keys, …) there is a helper object called GstURIDownloader, which basically provides a blocking (but cancellable) API like this: GstBuffer * buffer = fetch_uri(uri)
Internally it creates a GStreamer source element based on the URI, starts it with that URI, collects all buffers and then returns all of them as a single buffer.
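In plain Python terms, the idea of GstURIDownloader is a blocking but cancellable fetch that concatenates everything into one buffer (an illustrative equivalent; the real helper creates a GStreamer source element rather than using urllib):

```python
import threading
import urllib.request

def fetch_uri(uri, cancelled=None, chunk_size=4096):
    """Blockingly download `uri`, collecting all chunks into one buffer.
    Returns None if the `cancelled` event is set mid-download."""
    chunks = []
    with urllib.request.urlopen(uri) as stream:
        while True:
            if cancelled is not None and cancelled.is_set():
                return None                  # download was cancelled
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    return b''.join(chunks)

# urllib supports data: URIs, which makes for a dependency-free example:
print(fetch_uri('data:,hello'))   # b'hello'
cancel = threading.Event()
cancel.set()
print(fetch_uri('data:,hello', cancelled=cancel))   # None
```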

Initially this was also used to download media fragments, but it was noticed that this is not ideal. Mostly because the demuxer has to download a complete fragment before it can be passed downstream, very large buffers are passed downstream that cause any buffering elements to mostly jump between 0% and 100% all the time.

Instead, thanks to the recent work of Thiago Santos, the media fragments are now downloaded differently. The demuxer element is actually a GstBin now and it has a child element that is connected to the source pad: a source element for downloading the media fragments.

This allows forwarding the data immediately as it is downloaded and allows downstream elements to handle the data one block after another instead of getting it in multi-second chunks. In particular it also means that playback can start much sooner, as you don’t have to wait for a complete fragment but can already fill up your buffers with a partial fragment.

Internally this is a bit tricky as the demuxer will have to catch EOS events from the source (we don’t want to stop streaming just because a fragment is done), catch errors and other messages (maybe instead of forwarding the error we want to retry downloading this media fragment) and switch between different source elements (or just switch the URI of the existing one) during streaming once a media fragment is finished. I won’t describe this here, best to look at the code for that.

Switching between alternatives

Now what happens if the demuxer wants to switch between different alternative streams, e.g. because it has noticed that it can barely keep up with downloading streams of a high bitrate and wants to switch to a lower bitrate. Or even to an alternative that is audio-only. Here the demuxer has to select the next media fragment for the chosen alternative stream and forward that downstream.
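The decision itself is usually a simple heuristic: pick the highest-bitrate variant that the measured download throughput can sustain, with some safety margin. A sketch (the 0.8 factor and the variant list are invented for illustration):

```python
def select_variant(variant_bitrates, measured_bps, safety=0.8):
    """Pick the highest bitrate not exceeding measured throughput * safety,
    falling back to the lowest variant if even that is too much."""
    budget = measured_bps * safety
    affordable = [b for b in variant_bitrates if b <= budget]
    return max(affordable) if affordable else min(variant_bitrates)

variants = [400_000, 1_200_000, 2_500_000, 5_000_000]   # bps, from the Manifest
print(select_variant(variants, 3_500_000))   # 2500000: fits in the 2.8 Mbps budget
print(select_variant(variants, 300_000))     # 400000: below everything, take lowest
```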

    But it’s of course not that simple because currently we don’t support renegotiation of decoder pipelines. It could easily happen that codecs change between different alternatives, or the topology changes (e.g. the video stream disappears). Note that not supporting automatic renegotiation for cases like this in decodebin and related elements is not a design deficit of GStreamer but just a limitation of the current implementation.

    There already is a similar case that is handled in GStreamer already, which is handling of chained Ogg files (i.e. different Ogg files concatenated to each other). In that case it should also behave like a single stream, but codecs could change or the topology could change. Here the demuxer just has to add new pads for the following streams after having emitted the no-more-pads signal, and then remove all the old pads. decodebin and playbin then first drain the old streams and then handle the new ones, while making sure their times align perfectly with the old ones.

    (uri)decodebin and playbin

    If we now look at the complete (source) pipeline that is created by uridecodebin for this case we come up with the following (simplified):

    HTTP Adaptive Streaming Pipeline

    We have a source element for the Manifest, which is added by uridecodebin. uridecodebin also uses a typefinder to detect that this is actually an HTTP adapative streaming Manifest. This is then connected to a decodebin instance, which is configured to do buffering after the demuxers with the multiqueue element. In the normal HTTP stream buffering, the buffering is done between the source element and decodebin with the queue2 element.

    decodebin then selects an HTTP adaptive streaming protocol demuxer for the current protocol, waits until it has decided on what to output and the connects it to a multiqueue element like every other demuxer. However it uses different buffering settings as this demuxer is going to behave a bit different.

    Followed by that is the actual demuxer for the media fragments, which could for example be tsdemux if they’re MPEG TS media fragments. If an elementary stream is used for the media fragment, decodebin will insert a parser for that elementary stream which will chunk the stream into codec frames and also put timing information on them.

    Afterwards there comes another multiqueue element, which does not happen after a parser if no HTTP adaptive streaming demuxer is used. decodebin will configure this multiqueue element to handle buffering and send buffering messages to the application to allow it to pause playback until the buffer is full.

    In playbin and playsink no special handling is necessary. Now let’s take a look at a complete pipeline graph for an HLS stream that contains one video and one audio stream inside MPEG TS containers. Note: this is huge, like all playbin pipelines. Everything on the right half is for converting raw audio/video and postprocessing it, and then outputting it on the display and speakers.

    playbin: HLS with audio and video

    Keep in mind that the audio and video streams, and also subtitle streams, could also be in separate media fragments. In that case the HTTP adaptive streaming demuxer would have multiple source pads, each of them followed by a demuxer or parser and multiqueues. And decodebin would aggregate the buffering messages of each of the multiqueues to give the application a consistent view of the buffering status.
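    As an illustration of that aggregation (a sketch, not decodebin’s actual code), the combined buffering level reported to the application can be thought of as the minimum across all multiqueues, since playback can only resume once the least-filled queue has enough data:

```python
def aggregate_buffering(levels):
    """Combine per-multiqueue buffering percentages into one value.

    Playback can only resume once every queue has enough data, so the
    aggregate is the minimum of the individual levels.
    """
    if not levels:
        return 100  # no queues means there is nothing left to buffer
    return min(levels)

# One queue at 40% holds playback back, regardless of the others.
print(aggregate_buffering([100, 40, 75]))  # 40
```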

    Possible optimisations

    Now all of this sounds rather complex and probably slow compared to the naive approach of just implementing all of this in a single source element without so many different elements involved. As time has shown it actually is not slow at all, and if you consider a design for such a protocol in a single element you will notice that all the different components that are separate elements here will also show up in your design. But as GStreamer is like Lego and we like having lots of generic components that are put together to build a more complex whole, the current design seems to follow the idea of GStreamer in a more consistent way. And especially it makes it possible to reuse lots of existing elements and to replace individual elements with custom implementations, transparently, thanks to GStreamer’s autoplugging mechanisms.

    So let’s talk about a few possible optimisations here. Some of them are implemented in the default GStreamer elements and should be kept in mind when replacing elements, others could be implemented on top of the existing elements.

    Keep-alive Connections and other HTTP features

    All these HTTP adaptive streaming protocols require lots of HTTP requests, each of which traditionally required a new TCP connection. This involves quite some overhead and increases the latency because of TCP’s handshake protocol, and even more so if you use HTTPS and also have to handle the SSL/TLS handshake on top of that. We’re talking about multiple 100ms to seconds per connection setup here. HTTP 1.1 allows connections to be kept alive for some period of time and reused for multiple HTTP requests. Browsers have been using this for a long time already to efficiently show you websites composed of many different files with low latency.

    Also, previously all GStreamer HTTP source elements closed their connection(s) when going back to the READY state, but it is required to set them to the READY state to switch URIs. This basically means that although HTTP 1.1 allows connections to be reused for multiple requests, we were not able to make use of this. Now the souphttpsrc HTTP source element keeps connections open until it goes back to the NULL state if the keep-alive property is set to TRUE, and other HTTP source elements could implement this too. The HTTP adaptive streaming demuxers are making use of this “implicit interface” to reuse connections for multiple requests as much as possible.
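    The effect of connection reuse can be sketched with a toy downloader. This is a hypothetical illustration (none of these names exist in GStreamer; souphttpsrc handles all of this internally):

```python
class FragmentDownloader:
    """Toy sketch of HTTP connection reuse across fragment downloads.

    Hypothetical names; souphttpsrc implements this internally via its
    keep-alive property.
    """

    def __init__(self, connect):
        self._connect = connect  # factory that opens a new connection
        self._conn = None

    def fetch(self, uri):
        # Reuse the open connection instead of paying the TCP (and
        # possibly TLS) handshake cost for every single fragment.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn.request(uri)


connections_opened = 0

class FakeConnection:
    def request(self, uri):
        return "data for " + uri

def connect():
    global connections_opened
    connections_opened += 1
    return FakeConnection()

downloader = FragmentDownloader(connect)
for i in range(5):
    downloader.fetch("http://example.com/fragment%d.ts" % i)
print(connections_opened)  # 1: one handshake served all five requests
```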


    HTTP also defines a way for clients and servers to negotiate the encodings that both support. In particular, this allows both to negotiate that the actual data (the response body) should be compressed with gzip (or another method) instead of being transferred as plaintext. For media fragments this is not very useful, but for Manifests it can be very useful, especially in the case of HLS, where the Manifest is a plaintext ASCII file that can easily be a few hundred kilobytes in size.

    The HTTP adaptive streaming demuxers are using another “implicit interface” on the HTTP source element to enable compression (if supported by the server) for Manifest files. This is also currently only handled in the souphttpsrc element.
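    To get a feeling for why compressing the Manifest pays off, here is a small sketch that compresses a repetitive HLS-style playlist with Python’s stdlib gzip (the URIs are made up):

```python
import gzip

# A repetitive plaintext HLS-style playlist (made-up URIs).
manifest = "#EXTM3U\n" + "".join(
    "#EXTINF:10.0,\nhttp://example.com/segment%05d.ts\n" % i
    for i in range(1000)
)

# Playlists are highly repetitive, so gzip shrinks them dramatically,
# which matters when the Manifest is redownloaded again and again.
compressed = gzip.compress(manifest.encode("utf-8"))
print(len(manifest), "->", len(compressed), "bytes")
```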

    Other minor features

    HTTP defines many other headers, and the HTTP adaptive streaming demuxers make use of two more if supported by the HTTP source element. The “implicit interface” for setting more headers in the HTTP request is the extra-headers property, which can be set to arbitrary headers.

    The HTTP adaptive streaming demuxers currently set the Referer header to the URI of the Manifest file. This is not mandated by any standard to my knowledge, but there are streams out there that actually forbid downloading media fragments without it. The demuxers also set the Cache-Control header to a) tell caches/proxies to update their internal copy of the Manifest file when redownloading it and b) tell caches/proxies that some requests must not be cached (if indicated so in the Manifest). The latter can of course be ignored by the caches.
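    Conceptually, the headers described above could be assembled like this. This is a hedged sketch: the function and the exact header values are illustrative, not what the GStreamer elements literally set:

```python
def request_headers(manifest_uri, cacheable=True):
    """Sketch of extra request headers for media fragment downloads.

    Illustrative values only; the exact strings set by the GStreamer
    demuxers may differ.
    """
    headers = {
        # Some servers refuse fragment requests without a Referer.
        "Referer": manifest_uri,
    }
    if cacheable:
        # Ask caches/proxies to revalidate their copy on re-download.
        headers["Cache-Control"] = "max-age=0"
    else:
        # The Manifest marked this request as not cacheable at all.
        headers["Cache-Control"] = "no-store"
    return headers

print(request_headers("http://example.com/stream.m3u8"))
```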

    If you implement your own HTTP source element it is probably a good idea to copy the interface of the souphttpsrc element at least for these properties.

    Caching HTTP source

    Another area that could easily be optimised is a cache for the downloaded media fragments and Manifests. This is especially useful for video-on-demand streams, and even more so when the user wants to seek in the stream. Without any cache all media fragments would have to be downloaded again after seeking, even if the seek position was already downloaded before.

    A simple way for implementing this is a caching HTTP source element. This basically works like an HTTP cache/proxy like Squid, only one level higher. It behaves as if it is an HTTP source element, but actually it does magic inside.

    From a conceptual point of view this caching HTTP source element would implement the GstURIHandler interface and handle the HTTP(S) protocols, and ideally also implement some of the properties of souphttpsrc as mentioned above. Internally it would actually be a GstBin that dynamically creates a pipeline for downloading a URI when transitioning from the READY to the PAUSED state. It could have the following internal configurations:

    Caching HTTP Source

    The only tricky bits here are proxying the different properties to the relevant internal elements, and error handling. You probably don’t want to stop the complete pipeline if for some reason writing to or reading from your cache file fails. For that you could catch the error messages and also intercept data flow (to ignore error flow returns from filesink), and then dynamically reconfigure the internal pipeline as if nothing had happened.

    Based on the Cache-Control header you could implement different functionality, e.g. for refreshing data stored in your cache already. Based on the Referer header you could correlate URIs of media fragments to their corresponding Manifest URI. But how to actually implement the storage, cache invalidation and pruning is a standard problem of computer science and covered elsewhere already.
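    As a toy illustration of such cache invalidation based on the Cache-Control max-age directive (an in-memory sketch; a real caching source would persist fragments to disk and handle more directives such as no-store and no-cache, plus pruning):

```python
import time

def parse_max_age(cache_control):
    """Extract the max-age value (in seconds) from a Cache-Control header."""
    for directive in cache_control.split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            return int(directive[len("max-age="):])
    return None

class FragmentCache:
    """Toy in-memory cache keyed by URI with max-age based expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable for deterministic tests
        self._entries = {}    # uri -> (data, expiry timestamp or None)

    def store(self, uri, data, cache_control=""):
        max_age = parse_max_age(cache_control)
        expiry = None if max_age is None else self._clock() + max_age
        self._entries[uri] = (data, expiry)

    def lookup(self, uri):
        entry = self._entries.get(uri)
        if entry is None:
            return None
        data, expiry = entry
        if expiry is not None and self._clock() >= expiry:
            del self._entries[uri]  # expired: must be re-downloaded
            return None
        return data

# Usage with a fake clock so the expiry is deterministic.
now = [0.0]
cache = FragmentCache(clock=lambda: now[0])
cache.store("segment1.ts", b"data", "public, max-age=10")
print(cache.lookup("segment1.ts"))  # b'data' (still fresh)
now[0] = 11.0
print(cache.lookup("segment1.ts"))  # None (expired)
```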

    Creating Streams – The Server Side

    And in the end some final words about the server side of things. How to create such HTTP adaptive streams with GStreamer and serve them to your users.

    In general this is a relatively simple task with GStreamer and the standard tools and elements that are already available. We have muxers for all the relevant container formats (one for the MP4 variant used in MPEG DASH is available in Bugzilla), there is API to request keyframes at specific positions (so that you can actually start a new media fragment at that position) and the dynamic pipeline mechanisms allow for dynamically switching muxers to start a new media fragment.

    On top of that it would only be required to create the Manifest files, and then serve all of this with some HTTP server. For which there are already too many implementations out there, and in the end you want to hide your server behind a CDN anyway.

    Additionally there are the hlssink and dashsink elements, the latter one just being in Bugzilla right now. These implement the creation of the media fragments already with the above mentioned GStreamer tools and generate a Manifest file for you.
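    To illustrate the Manifest side, here is a sketch that generates a minimal HLS media playlist for a VOD stream (made-up segment names; hlssink writes richer playlists than this):

```python
def make_hls_manifest(segments, target_duration):
    """Generate a minimal HLS media playlist (VOD).

    segments: list of (uri, duration_in_seconds) tuples.
    """
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        "#EXT-X-TARGETDURATION:%d" % target_duration,
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for uri, duration in segments:
        lines.append("#EXTINF:%.3f," % duration)
        lines.append(uri)
    lines.append("#EXT-X-ENDLIST")  # VOD: the playlist is complete
    return "\n".join(lines) + "\n"

playlist = make_hls_manifest(
    [("segment0.ts", 10.0), ("segment1.ts", 9.76)], target_duration=10)
print(playlist)
```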

    And then there is the GStreamer Streaming Server, which also has support for HTTP adaptive streaming protocols and can be used to easily serve multiple streams. But that one deserves its own article.

    by slomo at May 16, 2014 09:45 PM

    May 08, 2014

    Gustavo Orrillospheretess

    Processing 2.0 was released almost a year ago, and introduced many exciting improvements in several areas. Among those, a new OpenGL renderer with GLSL shader support. Since that time, the shader API – defined not only by the new functions in the Processing language to load and run shaders in a sketch, but also by […]

    by ac at May 08, 2014 06:19 PM

    May 06, 2014

    Zeeshan AliBerlin, DX hackfest, Boxes, rain

    (Zeeshan Ali)
    I just flew back from Berlin where I spent the last week, mainly to participate in the GNOME Developer Experience hackfest. As you can see from blog posts from other awesome gnomies, the hackfest was a pretty big success.

    I focused on the use of virtual machines (as that's right up my alley) for making application development as easy as possible. I talked to Christian, who has been working on an IDE for GNOME, about his idea of a simulator VM which allows the developer to quickly test their app in a pristine environment. We discussed if and how Boxes can be involved. After some discussion we decided that we probably don't want to use Boxes but rather create another binary that re-uses the existing virtualization infrastructure: libvirt, qemu, spice (and maybe libosinfo) etc.

    Another way to make GNOME development easy through VMs would be what we already have on a very crude level: distribution of ready-made VMs with the whole development environment set up. Continuous already creates and distributes ready VM disk images of the latest GNOME (almost everything from git) and Boxes can import these images. These images however are insufficient and unreliable since they do not contain any metadata about the VM, especially recommended/required system resources. Christian recommended VMware's vmx format but that turned out to only contain metadata, and you'd need a separate file (vmdk) for the disk image with that. What we really need here is a format that contains both metadata and all disk images in one file, in other words a virtual appliance. After doing some research on this, I once again came to the conclusion that OVF is our only hope here. Not only is it one file that can contain all important aspects of a VM, it's an open standard that is agreed upon and implemented by many different vendors. Boxes being able to import/export from/to this format and Continuous being able to provide these virtual appliances wouldn't just enable a more reliable producer-consumer relationship between Continuous and Boxes but also allow individual developers to easily share their work with others: "Hey! I've been working on this project/feature X the last few days and want to get some input. I'm sending you the VM, just import it in Boxes and let me know what you think..".

    We've actually been wanting to have this feature for a very long time and I've mentioned the need for OVF support quite a few times, so I decided it's about time I do something about it. So during the hackfest, I designed the minimum interface for a library to convert from/to a libvirt domain (or configuration) to/from OVF files and started the implementation too. I hope to continue working on it and have something demoable by the end of this month.

    Some other discussions/activities I had during/around the hackfest:

    * Talked with Aleksander about modem enabling/disabling. Currently geoclue has to enable the modem itself to use it, but since it doesn't know if the modem is in use by another application, it doesn't disable it afterwards. This results in a waste of battery power. I suggested to Aleksander that the best thing would be for ModemManager to take care of that as it knows if the modem is in use or not. The alternative would be some kind of explicit UI, but throwing this decision on the user would be a terrible thing to do. Aleksander liked the idea and he promised to look into implementing it.

    * I took part in the Gtk+ roadmap discussion, mostly as a silent observer (Gtk+ hackers and designers covered it nicely so I didn't feel like saying anything), but during the meeting I learnt of a widget that was introduced in 3.12 that I had missed completely: GtkFlowBox. Since developers were unhappy that we introduced a widget that is not used by any GNOME app, and since using this widget in Boxes will bring me very close to my aim of dropping libgd usage, I decided to make Boxes the first user of this widget soon.

    * I realized (very late) that both my SoC students live in Germany so I asked both of them if they could join us for any of the days. While it was impossible for baedert to join us on such short notice, Lasse was still able to make it for the last day. It was really nice to meet this very bright young fellow. We had a lot of discussion about the past, present and future of Boxes. Lasse has already written a very nice and detailed blog post about that, so I'll leave you with a link to it if you are interested.

    * Talked with Cosimo about use of Geoclue by Endless mobile.

    * Tested gnome-clang against Geoclue to provide feedback to Philip and it resulted in him fixing some important bugs.

    * Talked briefly to Lennart about
      * How authorization of geoclue-using apps will/can work in the kdbus world.
      * The best way to allow Boxes to access USB devices with ISO9660 filesystems on them.

    That's mostly it! I would like to thank Lennart for providing me with nice accommodation, Endocode for providing us a great venue (+ an endless supply of Club-Mate) and the GNOME foundation for sponsoring my travel.

    May 06, 2014 12:07 AM

    May 03, 2014

    GStreamerGStreamer Core and Plugins 1.3.1 development release


    The GStreamer team is pleased to announce the first release of the unstable 1.3 release series. The 1.3 release series is adding new features on top of the 1.0 and 1.2 series and is part of the API and ABI-stable 1.x release series of the GStreamer multimedia framework. The unstable 1.3 release series will lead to the stable 1.4 release series in the next weeks, and newly added API can still change until that point.

    Binaries for Android, iOS, Mac OS X and Windows will be provided separately during the unstable 1.3 release series.

    Check out the release notes for GStreamer core, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, or gst-rtsp-server, or download tarballs for gstreamer, gst-plugins-base, gst-plugins-good, gst-plugins-ugly, gst-plugins-bad, gst-libav, or gst-rtsp-server.

    Check the release announcement mail for details and the release notes above for a list of changes.

    May 03, 2014 08:00 PM