vmaf
VMAF (Video Multi-Method Assessment Fusion) is a perceptual video quality assessment algorithm developed by Netflix. It combines multiple elementary quality metrics (VIF, DLM (also known as ADM), and a motion feature) and fuses them using a machine learning model to predict the perceived video quality as experienced by human viewers. VMAF scores range from 0 to 100, where higher scores indicate better perceptual quality.
This element is useful for:
- Evaluating video encoding quality and compression efficiency
- Comparing different encoding settings or codecs
- Quality assurance in video processing pipelines
- A/B testing of video content
For more information about VMAF, see: https://github.com/Netflix/vmaf
The vmaf element performs perceptual video quality analysis on a pair of input pads: the first pad (ref_sink) receives the reference video and the second (dist_sink) receives the distorted video.
The video output on the src pad is a passthrough of the reference video received on ref_sink.
The element posts a bus message containing a structure named "VMAF" at EOS, and additionally for every reference frame when the frame-message property is set to true.
The VMAF message structure contains the following fields:
- "timestamp" G_TYPE_UINT64 Buffer timestamp in nanoseconds
- "stream-time" G_TYPE_UINT64 Stream time in nanoseconds
- "running-time" G_TYPE_UINT64 Running time in nanoseconds
- "duration" G_TYPE_UINT64 Duration in nanoseconds
- "score" G_TYPE_DOUBLE The VMAF quality score (0-100, higher is better)
- "type" G_TYPE_STRING Message type: "frame" = per-frame score, "pooled" = aggregate score
- "index" G_TYPE_INT Frame index (only present for type="frame", per-frame messages)
- "psnr-y" G_TYPE_DOUBLE Peak Signal-to-Noise Ratio for Y (luma) channel in dB (only present if psnr property is enabled)
- "ssim" G_TYPE_DOUBLE Structural Similarity Index (0-1, higher is better) (only present if ssim property is enabled)
- "ms-ssim" G_TYPE_DOUBLE Multi-Scale Structural Similarity Index (0-1, higher is better) (only present if ms-ssim property is enabled)
The "type" field indicates whether the message contains a score for an individual frame (type="frame") or a pooled score for the entire stream up to that point (type="pooled"). Pooled scores are calculated at EOS using the pool-method property (mean, min, max, or harmonic mean).
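The pooling step can be sketched in a few lines of Python. This is an illustration of the behaviour described above, not the element's implementation; the function name is hypothetical, and libvmaf's exact pooling may differ in edge cases (for example, frames scoring exactly zero under the harmonic mean):

```python
from statistics import mean

def pool(scores, method):
    """Aggregate per-frame VMAF scores the way the pool-method property selects."""
    if method == "min":
        return min(scores)
    if method == "max":
        return max(scores)
    if method == "mean":
        return mean(scores)
    if method == "harmonic_mean":
        # A harmonic mean weights low-scoring (bad) frames more heavily,
        # so a few very poor frames pull the pooled score down sharply.
        return len(scores) / sum(1.0 / s for s in scores)
    raise ValueError(f"unknown pool method: {method}")
```

The harmonic mean is often used for quality summaries precisely because brief quality drops are penalized more than the arithmetic mean would suggest.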
The timing fields (timestamp, stream-time, running-time, duration) allow correlation of VMAF scores with specific video frames in the pipeline.
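For example, with constant-rate video the running-time field can be mapped back to a frame index. The helper below is a hypothetical illustration of that correlation, not part of the element's API:

```python
GST_SECOND = 1_000_000_000  # nanoseconds per second, as in GStreamer

def frame_for_running_time(running_time_ns, fps_num, fps_den):
    """Return the index of the frame presented at the given running time,
    assuming a constant frame rate of fps_num/fps_den."""
    return round(running_time_ns * fps_num / (fps_den * GST_SECOND))
```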
Per-frame messages (type="frame") include an "index" field indicating the frame number. With sub-sampling enabled, scores are only computed for frames at the sub-sampling rate, except motion scores which are computed for every frame.
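Under that behaviour, the set of frames receiving full scores for a given subsample value N can be sketched as below. That the schedule starts at frame 0 is an assumption for illustration, not something the element documents:

```python
def scored_frames(total_frames, subsample):
    """Indices of frames that get full scores with the given subsample rate,
    assuming the schedule starts at frame 0. Motion scores are still
    computed for every frame, as noted above."""
    return [i for i in range(total_frames) if i % subsample == 0]
```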
PSNR, SSIM, and MS-SSIM can be computed alongside VMAF by setting the corresponding properties to true.
For example, if the ms-ssim, ssim, and psnr properties are all set to true, the emitted structure will look like this:
VMAF, timestamp=(guint64)1234567890, stream-time=(guint64)1234567890, running-time=(guint64)1234567890, duration=(guint64)40000000, score=(double)78.910751757633022, index=(int)26, type=(string)frame, ms-ssim=(double)0.96676034472760064, ssim=(double)0.8706783652305603, psnr-y=(double)30.758853484390933;
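In an application you would read the GstStructure fields from the bus message directly; the regex sketch below only illustrates the field layout of the serialized form shown above. The parser is a hypothetical helper, not part of the element:

```python
import re

# Matches fields of the form name=(type)value in a serialized GstStructure.
FIELD_RE = re.compile(r'(?P<name>[\w-]+)=\((?P<type>\w+)\)(?P<value>[^,;]+)')

def parse_vmaf_structure(text):
    """Turn the serialized "VMAF" structure string into a Python dict,
    converting numeric types and leaving strings as-is."""
    fields = {}
    for m in FIELD_RE.finditer(text):
        value = m.group('value')
        if m.group('type') == 'double':
            value = float(value)
        elif m.group('type') in ('int', 'guint64'):
            value = int(value)
        fields[m.group('name')] = value
    return fields
```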
Example launch line
gst-launch-1.0 -m \
filesrc location=test1.yuv ! rawvideoparse width=1920 height=1080 ! v.ref_sink \
filesrc location=test2.yuv ! rawvideoparse width=1920 height=1080 ! v.dist_sink \
vmaf name=v frame-message=true results-filename=scores.json psnr=true ssim=true ms-ssim=true ! autovideosink
Run with the -m flag, this pipeline prints a message to the console for each pair of compared frames.
Hierarchy
GObject ╰──GInitiallyUnowned ╰──GstObject ╰──GstElement ╰──GstAggregator ╰──GstVideoAggregator ╰──vmaf
Factory details
Authors: – Casey Bateman
Classification: – Filter/Analyzer/Video
Rank – none
Plugin – vmaf
Package – GStreamer Bad Plug-ins
Pad Templates
dist_sink
video/x-raw:
format: { I420, NV12, YV12, Y42B, Y444, I420_10LE, I422_10LE, Y444_10LE }
width: [ 1, 2147483647 ]
height: [ 1, 2147483647 ]
framerate: [ 0/1, 2147483647/1 ]
ref_sink
video/x-raw:
format: { I420, NV12, YV12, Y42B, Y444, I420_10LE, I422_10LE, Y444_10LE }
width: [ 1, 2147483647 ]
height: [ 1, 2147483647 ]
framerate: [ 0/1, 2147483647/1 ]
src
video/x-raw:
format: { I420, NV12, YV12, Y42B, Y444, I420_10LE, I422_10LE, Y444_10LE }
width: [ 1, 2147483647 ]
height: [ 1, 2147483647 ]
framerate: [ 0/1, 2147483647/1 ]
Properties
conf-interval
“conf-interval” gboolean
Enable confidence intervals
Flags : Read / Write
Default value : false
disable-clip
“disable-clip” gboolean
Disable clipping VMAF values
Flags : Read / Write
Default value : false
enable-transform
“enable-transform” gboolean
Enable transform VMAF scores
Flags : Read / Write
Default value : false
frame-message
“frame-message” gboolean
Enable frame level score messaging
Flags : Read / Write
Default value : false
log-level
“log-level” GstVmafLogLevel *
VMAF log level
Flags : Read / Write
Default value : none (0)
model-filename
“model-filename” gchararray
Absolute filename of a *.pkl model file, or the version string of a built-in model
Flags : Read / Write
Default value : vmaf_v0.6.1
pool-method
“pool-method” GstVmafPoolMethod *
Pooling method used to aggregate per-frame scores
Flags : Read / Write
Default value : mean (3)
results-filename
“results-filename” gchararray
VMAF results filename for scores
Flags : Read / Write
Default value : NULL
results-format
“results-format” GstVmafResultsFormat *
VMAF results file format used for scores (csv, xml, json)
Flags : Read / Write
Default value : none (0)
subsample
“subsample” guint
Compute full scores only on one of every N frames
Flags : Read / Write
Default value : 1
Named constants
GstVmafLogLevel
Members
none (0) – No logging
error (1) – Error
warning (2) – Warning
info (3) – Info
debug (4) – Debug
GstVmafPoolMethod
Members
min (1) – Minimum value
max (2) – Maximum value
mean (3) – Arithmetic mean
harmonic_mean (4) – Harmonic mean
GstVmafResultsFormat
Members
none (0) – None
xml (1) – XML
json (2) – JSON
csv (3) – Comma Separated File (csv)