llamacpp-texttransform

llama.cpp-based text transformation element that passes text through a LLM and forwards the output.

It is possible to configure a system prompt, various sampling parameters and to keep a history of the last inputs/outputs for producing more consistent outputs.

The element can be used for example for text translation.

Models

The model can be selected via the model-path property. This expects a local GGUF file, which can be downloaded e.g. from Hugging Face. Models that are known to work well are

Gemma 4 E4B (quantizations) or Gemma 4 E2B (quantizations),
Qwen 3.5 9B (quantizations) or Qwen 3.5 4B (quantizations),
Ministral 3 14B Instruct (quantizations) or Ministral 3 8B Instruct (quantizations) or Ministral 3 3B Instruct (quantizations),
Hunyuan MT 7B (quantizations). This model is specifically trained for translations and expects a system prompt in a specific format that is described on the Hugging Face page.

It generally makes no sense to use huge models for this element, and even smaller ones than the ones above will give useful results.

Keep in mind that all these models have safeguards integrated, which can lead to rejections. For subtitle translations of Rated R movies, for example, it might be necessary to use an abliterated / decensored model like this.

Examples

Subtitle translation

 gst-launch-1.0 filesrc location=subtitles.eng.srt ! subparse ! llamacpp-texttransform model-path=/path/to/Hunyuan-MT-7B.Q4_K_M.gguf system-prompt="Translate the following segments into German, without additional explanation." history-size=5 ! overlay.text_sink \
     filesrc location=movie.mp4 ! decodebin3 name=dbin \
     dbin. ! queue ! videoconvert ! textoverlay name=overlay ! videoconvert ! navseek ! autovideosink \
     dbin. ! queue ! audioconvert ! autoaudiosink

Plays a movie with English subtitles and translates them to German before overlaying.

Audio transcription and translation

 gst-launch-1.0 filesrc location=movie.mp4 ! decodebin3 name=dbin \
     dbin. ! audio/x-raw ! tee name=audio-tee
     audio-tee. ! queue max-size-time=10000000000 max-size-buffers=0 max-size-bytes=0 ! audioconvert ! audioresample ! \
         whispertranscriber model-path=whisper-ggml-large-v3.bin model-preset=large-v3 chunk-duration=4000 ! \
         textaccumulate latency=0 ! queue max-size-time=10000000000 max-size-buffers=0 max-size-bytes=0 ! \
         llamacpp-texttransform model-path=Hunyuan-MT-7B.Q4_K_M.gguf system-prompt="Translate the following segments into German, without additional explanation." history-size=5 ! \
         textwrap columns=72 ! overlay.text_sink \
     dbin. ! queue max-size-time=10000000000 max-size-buffers=0 max-size-bytes=0 ! videoconvert ! \
         textoverlay name=overlay ! videoconvert ! autovideosink
     audio-tee. ! queue max-size-time=10000000000 max-size-buffers=0 max-size-bytes=0 ! audioconvert ! autoaudiosink

Plays a movie, transcribes the audio to text, translates the text to German and overlays it on top of the video.

Hierarchy

GObject
    ╰──GInitiallyUnowned
        ╰──GstObject
            ╰──GstElement
                ╰──llamacpp-texttransform

Factory details

Authors: – Sebastian Dröge

Classification: – Text/LLM

Rank – none

Plugin – llamacpp

Package – gst-plugin-llamacpp

Pad Templates

`sink`

text/x-raw:
         format: utf8

Presence – always

Direction – sink

Object type – GstPad

`src`

text/x-raw:
         format: utf8

Presence – always

Direction – src

Object type – GstPad

Properties

context-size

“context-size” guint

Size of the context window for the LLM

Flags : Read / Write

Default value : 2048

history-size

“history-size” guint

Number of previous messages to keep in context

Flags : Read / Write

Default value : 5

min-p

“min-p” gfloat

Minimum probability threshold (0.0 = disabled)

Flags : Read / Write

Default value : 0.05

model-path

“model-path” gchararray

Path to the GGUF model file

Flags : Read / Write

Default value : NULL

penalty-freq

“penalty-freq” gfloat

Frequency penalty (0.0 = disabled)

Flags : Read / Write

Default value : 0

penalty-last-n

“penalty-last-n” gint

Last n tokens to penalize (0 = disable, -1 = context size)

Flags : Read / Write

Default value : 64

penalty-present

“penalty-present” gfloat

Presence penalty (0.0 = disabled)

Flags : Read / Write

Default value : 0

penalty-repeat

“penalty-repeat” gfloat

Repetition penalty (1.0 = disabled)

Flags : Read / Write

Default value : 1

seed

“seed” guint

Random seed for sampling

Flags : Read / Write

Default value : -1159983106

system-prompt

“system-prompt” gchararray

System prompt for the LLM

Flags : Read / Write

Default value : NULL

temp

“temp” gfloat

Sampling temperature

Flags : Read / Write

Default value : 0.8

top-k

“top-k” gint

Top-k sampling parameter (<= 0 to use vocab size)

Flags : Read / Write

Default value : 40

top-p

“top-p” gfloat

Top-p sampling parameter (1.0 = disabled)

Flags : Read / Write

Default value : 0.95

The results of the search are