Capabilities of a pad

Because pads largely determine how an element is seen by the outside world, GStreamer provides a mechanism, called capabilities, to describe the data that can flow or currently flows through a pad. Here, we briefly describe what capabilities are and how to use them, enough to understand the concept. For an in-depth look at capabilities and a list of all capabilities defined in GStreamer, see the Plugin Writer's Guide.

Capabilities are attached to pad templates and to pads. For a pad template, they describe the types of media that may stream over a pad created from that template. For a pad, they are either a list of possible caps (usually a copy of the pad template's capabilities), in which case the pad is not yet negotiated, or the type of media that currently streams over the pad, in which case the pad has already been negotiated.

Dissecting capabilities

A pad's capabilities are described in a GstCaps object. Internally, a GstCaps contains one or more GstStructure objects, each of which describes one media type. A negotiated pad has capabilities that contain exactly one structure, and that structure contains only fixed values. These constraints do not hold for unnegotiated pads or pad templates.

As an example, below is a dump of the capabilities of the vorbisdec element, which you will get by running gst-inspect-1.0 vorbisdec. You will see two pads: a source pad and a sink pad. Both of these pads are always available, and both have capabilities attached to them. The sink pad accepts Vorbis-encoded audio data, with the media type audio/x-vorbis. The source pad sends raw (decoded) audio samples to the next element, with a raw audio media type (in this case, audio/x-raw). The source pad's capabilities also contain properties for the audio sample rate and the number of channels, plus some more that you don't need to worry about for now.

Pad Templates:
  SRC template: 'src'
    Availability: Always
    Capabilities:
      audio/x-raw
                 format: F32LE
                   rate: [ 1, 2147483647 ]
               channels: [ 1, 256 ]

  SINK template: 'sink'
    Availability: Always
    Capabilities:
      audio/x-vorbis

Properties and values

Properties are used to describe extra information for capabilities. A property consists of a key (a string) and a value. There are different possible value types that can be used:

  • Basic types: these can be pretty much any GType registered with GLib. They indicate a specific, non-dynamic value for the property. Examples include:

    • An integer value (G_TYPE_INT): the property has this exact value.

    • A boolean value (G_TYPE_BOOLEAN): the property is either TRUE or FALSE.

    • A float value (G_TYPE_FLOAT): the property has this exact floating point value.

    • A string value (G_TYPE_STRING): the property contains a UTF-8 string.

    • A fraction value (GST_TYPE_FRACTION): contains a fraction expressed by an integer numerator and denominator.

  • Range types are GTypes registered by GStreamer to indicate a range of possible values. They are used, for example, to indicate allowed audio sample rates or supported video sizes. The three types defined in GStreamer are:

    • An integer range value (GST_TYPE_INT_RANGE): the property denotes a range of possible integers, with a lower and an upper boundary. The vorbisdec element, for example, has a rate property that can be between 1 and 2147483647.

    • A float range value (GST_TYPE_FLOAT_RANGE): the property denotes a range of possible floating point values, with a lower and an upper boundary.

    • A fraction range value (GST_TYPE_FRACTION_RANGE): the property denotes a range of possible fraction values, with a lower and an upper boundary.

  • A list value (GST_TYPE_LIST): the property can take any value from a list of basic values given in this list.

    Example: caps expressing that either a sample rate of 44100 Hz or a sample rate of 48000 Hz is supported would use a list of integer values, with one value being 44100 and the other being 48000.

  • An array value (GST_TYPE_ARRAY): the property is an array of values. Each value in the array is a full value on its own, too. All values in the array should be of the same elementary type. This means that an array can contain any combination of integers, lists of integers, and integer ranges (and likewise for floats or strings), but it cannot contain both floats and integers at the same time.

    Example: for audio with more than two channels, the channel layout needs to be specified (for one- and two-channel audio the channel layout is implicit unless stated otherwise in the caps). The channel layout would then be an array of integer enum values, where each enum value represents a loudspeaker position. Unlike the values in a GST_TYPE_LIST, the values in an array are interpreted as a whole.