vmaf

VMAF (Video Multi-Method Assessment Fusion) is a perceptual video quality assessment algorithm developed by Netflix. It combines several elementary quality metrics (VIF, ADM, which was formerly called DLM, and Motion) and fuses them with a machine learning model to predict video quality as perceived by human viewers. VMAF scores range from 0 to 100, where higher scores indicate better perceptual quality.

This element is useful for:

  • Evaluating video encoding quality and compression efficiency
  • Comparing different encoding settings or codecs
  • Quality assurance in video processing pipelines
  • A/B testing of video content

For more information about VMAF, see: https://github.com/Netflix/vmaf

VMAF performs perceptual video quality analysis on two input pads: ref_sink receives the reference video and dist_sink receives the distorted video.

The output video is the reference video received on ref_sink.

VMAF posts a message containing a structure named "VMAF" at EOS and, if the frame-message property is set to true, also for every reference frame.

The VMAF message structure contains the following fields:

  • "timestamp" G_TYPE_UINT64 Buffer timestamp in nanoseconds
  • "stream-time" G_TYPE_UINT64 Stream time in nanoseconds
  • "running-time" G_TYPE_UINT64 Running time in nanoseconds
  • "duration" G_TYPE_UINT64 Duration in nanoseconds
  • "score" G_TYPE_DOUBLE The VMAF quality score (0-100, higher is better)
  • "type" G_TYPE_STRING Message type: "frame" = per-frame score, "pooled" = aggregate score
  • "index" G_TYPE_INT Frame index (only present for type="frame", per-frame messages)
  • "psnr-y" G_TYPE_DOUBLE Peak Signal-to-Noise Ratio for Y (luma) channel in dB (only present if psnr property is enabled)
  • "ssim" G_TYPE_DOUBLE Structural Similarity Index (0-1, higher is better) (only present if ssim property is enabled)
  • "ms-ssim" G_TYPE_DOUBLE Multi-Scale Structural Similarity Index (0-1, higher is better) (only present if ms-ssim property is enabled)

The "type" field indicates whether the message contains a score for an individual frame (type="frame") or a pooled score for the entire stream up to that point (type="pooled"). Pooled scores are calculated at EOS using the pool-method property (mean, min, max, or harmonic mean).
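The four pooling methods can be illustrated with a minimal sketch. It assumes the textbook definitions of each reduction; the element's own arithmetic (in particular for the harmonic mean) may differ, and the helper name pool_scores is hypothetical.

```python
from statistics import fmean, harmonic_mean

# Hypothetical helper: maps the pool-method nicknames used on this page
# to plain Python reductions. These are textbook definitions and are an
# assumption; the element's internal computation may not match exactly.
POOLERS = {
    "min": min,
    "max": max,
    "mean": fmean,
    "harmonic_mean": harmonic_mean,
}

def pool_scores(scores, method="mean"):
    """Reduce a list of per-frame scores to a single aggregate value."""
    if not scores:
        raise ValueError("no per-frame scores to pool")
    return POOLERS[method](scores)

frame_scores = [80.0, 90.0, 100.0]
for method in POOLERS:
    print(method, round(pool_scores(frame_scores, method), 3))
```

The harmonic mean is pulled towards the lowest scores, so it penalizes short stretches of poor quality more than the arithmetic mean does.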

The timing fields (timestamp, stream-time, running-time, duration) allow correlation of VMAF scores with specific video frames in the pipeline.
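Because all timing fields are expressed in nanoseconds, a small conversion step is usually needed before they can be matched against frame positions. The following sketch shows one way to do that; the function names are hypothetical, and deriving a frame rate from "duration" assumes a constant frame duration.

```python
GST_SECOND = 1_000_000_000  # nanoseconds per second

def format_ns(ns):
    """Format a nanosecond value as H:MM:SS.nnnnnnnnn."""
    seconds, frac = divmod(ns, GST_SECOND)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{frac:09d}"

def frame_rate_from_duration(duration_ns):
    """Frames per second implied by one frame duration (assumes constant rate)."""
    return GST_SECOND / duration_ns

print(format_ns(1234567890))               # 0:00:01.234567890
print(frame_rate_from_duration(40000000))  # 25.0
```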

Per-frame messages (type="frame") include an "index" field indicating the frame number. With sub-sampling enabled, scores are only computed for frames at the sub-sampling rate, except motion scores which are computed for every frame.
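As a rough illustration of sub-sampling, the sketch below lists which frame indices would receive a score for a given subsample value. It assumes that scored frames are those whose index is a multiple of N, starting at frame 0; this page does not state the exact selection rule, so treat the rule and the helper name as assumptions.

```python
def scored_indices(frame_count, subsample=1):
    """Frame indices that get a score when computing one of every N frames.

    Assumption: frames whose index is a multiple of `subsample` are scored.
    """
    if subsample < 1:
        raise ValueError("subsample must be at least 1")
    return [i for i in range(frame_count) if i % subsample == 0]

print(scored_indices(10, subsample=4))
```

With subsample=1 (the default) every frame is scored.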

PSNR, SSIM, and MS-SSIM can be computed alongside VMAF by setting the psnr, ssim, and ms-ssim properties to true.

For example, if ms-ssim, ssim, psnr are set to true, the emitted structure will look like this:

VMAF, timestamp=(guint64)1234567890, stream-time=(guint64)1234567890, running-time=(guint64)1234567890, duration=(guint64)40000000, score=(double)78.910751757633022, index=(int)26, type=(string)frame, ms-ssim=(double)0.96676034472760064, ssim=(double)0.8706783652305603, psnr-y=(double)30.758853484390933;
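When the messages are captured as text (for example from console output), the serialized structure shown above can be turned into a dictionary. This is a minimal sketch that only handles the simple name=(type)value form seen in the example; it does not cover quoted strings or nested values, and the function name is hypothetical. An application that receives the message directly would normally read the fields through the GstStructure accessors instead.

```python
import re

# Matches one serialized field such as: score=(double)78.9
FIELD_RE = re.compile(r"([\w-]+)=\((\w+)\)([^,;]+)")

# Assumed mapping from the type names seen in the example to Python types.
CASTS = {"guint64": int, "int": int, "double": float, "string": str}

def parse_vmaf_structure(text):
    """Return (structure_name, fields) parsed from a serialized structure."""
    name, _, rest = text.partition(",")
    fields = {}
    for key, type_name, raw in FIELD_RE.findall(rest):
        # Unknown type names fall back to a plain string.
        fields[key] = CASTS.get(type_name, str)(raw.strip())
    return name.strip(), fields

line = ("VMAF, timestamp=(guint64)1234567890, stream-time=(guint64)1234567890, "
        "running-time=(guint64)1234567890, duration=(guint64)40000000, "
        "score=(double)78.910751757633022, index=(int)26, type=(string)frame, "
        "ms-ssim=(double)0.96676034472760064, ssim=(double)0.8706783652305603, "
        "psnr-y=(double)30.758853484390933;")
name, fields = parse_vmaf_structure(line)
print(name, fields["type"], fields["index"], fields["score"])
```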

Example launch line

 gst-launch-1.0 -m \
   filesrc location=test1.yuv ! rawvideoparse width=1920 height=1080 ! v.ref_sink \
   filesrc location=test2.yuv ! rawvideoparse width=1920 height=1080 ! v.dist_sink \
   vmaf name=v frame-message=true results-filename=scores.json psnr=true ssim=true ms-ssim=true ! autovideosink

This pipeline will output messages to the console for each set of compared frames.
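For scripted comparisons it can be convenient to build the same launch line programmatically. The sketch below assembles an argument list that mirrors the example above; the helper name vmaf_launch_args and its keyword-to-property mapping (underscores become hyphens) are hypothetical conveniences, not part of the element. Passing the result as a list to subprocess avoids shell quoting of the "!" separators.

```python
def vmaf_launch_args(ref, dist, width, height, **props):
    """Build a gst-launch-1.0 argument list mirroring the example launch line."""
    def raw_branch(location, pad):
        # One raw-video input branch linked to the named pad of element "v".
        return ["filesrc", f"location={location}", "!",
                "rawvideoparse", f"width={width}", f"height={height}", "!",
                f"v.{pad}"]

    def fmt(value):
        # Booleans are written as true/false, as in the example launch line.
        return str(value).lower() if isinstance(value, bool) else str(value)

    settings = [f"{k.replace('_', '-')}={fmt(v)}" for k, v in props.items()]
    return (["gst-launch-1.0", "-m"]
            + raw_branch(ref, "ref_sink")
            + raw_branch(dist, "dist_sink")
            + ["vmaf", "name=v"] + settings + ["!", "autovideosink"])

args = vmaf_launch_args("test1.yuv", "test2.yuv", 1920, 1080,
                        frame_message=True, psnr=True)
print(" ".join(args))
# To run it (requires GStreamer with the vmaf plugin installed):
# import subprocess; subprocess.run(args, check=True)
```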

Hierarchy

GObject
    ╰──GInitiallyUnowned
        ╰──GstObject
            ╰──GstElement
                ╰──GstAggregator
                    ╰──GstVideoAggregator
                        ╰──vmaf

Factory details

Authors – Casey Bateman, Andoni Morales, Diego Nieto

Classification – Filter/Analyzer/Video

Rank – none

Plugin – vmaf

Package – GStreamer Bad Plug-ins

Pad Templates

dist_sink

video/x-raw:
         format: { I420, NV12, YV12, Y42B, Y444, I420_10LE, I422_10LE, Y444_10LE }
          width: [ 1, 2147483647 ]
         height: [ 1, 2147483647 ]
      framerate: [ 0/1, 2147483647/1 ]

Presence – always

Direction – sink

Object type – GstVideoAggregatorPad


ref_sink

video/x-raw:
         format: { I420, NV12, YV12, Y42B, Y444, I420_10LE, I422_10LE, Y444_10LE }
          width: [ 1, 2147483647 ]
         height: [ 1, 2147483647 ]
      framerate: [ 0/1, 2147483647/1 ]

Presence – always

Direction – sink

Object type – GstVideoAggregatorPad


src

video/x-raw:
         format: { I420, NV12, YV12, Y42B, Y444, I420_10LE, I422_10LE, Y444_10LE }
          width: [ 1, 2147483647 ]
         height: [ 1, 2147483647 ]
      framerate: [ 0/1, 2147483647/1 ]

Presence – always

Direction – src

Object type – GstAggregatorPad


Properties

conf-interval

“conf-interval” gboolean

Enable confidence intervals

Flags : Read / Write

Default value : false


disable-clip

“disable-clip” gboolean

Disable clipping of VMAF values

Flags : Read / Write

Default value : false


enable-transform

“enable-transform” gboolean

Enable transformation of VMAF scores

Flags : Read / Write

Default value : false


frame-message

“frame-message” gboolean

Enable frame level score messaging

Flags : Read / Write

Default value : false


log-level

“log-level” GstVmafLogLevel *

VMAF log level

Flags : Read / Write

Default value : none (0)


model-filename

“model-filename” gchararray

Absolute filename of a model *.pkl file, or the version name of a built-in model

Flags : Read / Write

Default value : vmaf_v0.6.1


ms-ssim

“ms-ssim” gboolean

Estimate MS-SSIM

Flags : Read / Write

Default value : false


phone-model

“phone-model” gboolean

Use VMAF phone model

Flags : Read / Write

Default value : false


pool-method

“pool-method” GstVmafPoolMethod *

Pooling method used to compute the aggregate (pooled) score

Flags : Read / Write

Default value : mean (3)


psnr

“psnr” gboolean

Estimate PSNR

Flags : Read / Write

Default value : false


results-filename

“results-filename” gchararray

VMAF results filename for scores

Flags : Read / Write

Default value : NULL


results-format

“results-format” GstVmafResultsFormat *

VMAF results file format used for scores (csv, xml, json)

Flags : Read / Write

Default value : none (0)


ssim

“ssim” gboolean

Estimate SSIM

Flags : Read / Write

Default value : false


subsample

“subsample” guint

Compute scores on one of every N frames

Flags : Read / Write

Default value : 1


threads

“threads” guint

The number of threads

Flags : Read / Write

Default value : 8


Named constants

GstVmafLogLevel

Members

none (0) – No logging
error (1) – Error
warning (2) – Warning
info (3) – Info
debug (4) – Debug

GstVmafPoolMethod

Members

min (1) – Minimum value
max (2) – Maximum value
mean (3) – Arithmetic mean
harmonic_mean (4) – Harmonic mean

GstVmafResultsFormat

Members

none (0) – None
xml (1) – XML
json (2) – JSON
csv (3) – Comma Separated File (csv)
