This page is a draft for an ongoing Google Summer of Code project. Feel free to discuss it with me: my email is robertofaga - NO_SPAM_AT - g mail _NO_SPAM_DOT_ com, or find me in #gstreamer under the nick robertofaga.
Idea
Nowadays, media comes in many distinct formats and types, such as an audio file or a DVD-Video disc. Users commonly want to convert an MP3 to Ogg, extract the audio from a DVD-Video, multiplex (merge) audio with an image into a video, and so on. I will define Media Operations as these three operations: convert, converge and extract (together this could be known as transcoding media types).
The Free Software community has powerful libraries and command-line programs, such as GStreamer, ffmpeg, mencoder, transcoder, sox and so on. But command-line programs are terrible for end users, who don't have enough experience to use advanced commands and sometimes need more than one program, or even shell scripting, to perform a Media Operation. Transcoder, for example, does most of the proposed operations, but sometimes needs shell scripting for batch processing and needs more than one library. We also have some converters with a GUI, such as SoundKonverter, SoundConverter and Gnormalize for audio and MultimediaConverter for audio and video, but none of them offers extraction and the other operations in an easy way for any type of media. MultimediaConverter comes closest, but it only performs the selected operation, without detecting the file type. It is also not integrated with the desktop and does not cover all three Media Operations.
It would also be interesting to offer this transcoding service to other applications: a file converter could be available in Nautilus via right-click, a simple audio conversion could be done in Rhythmbox, an extraction could be done in PiTiVi or even in Totem, and so on. So gst-media-services aims to be a service, offering simple dialog widgets and library calls to transcode media types, making these operations easy to implement in users' favorite programs. gst-media-services will use GStreamer to perform these Media Operations behind an easy-to-use GUI, based on some XML template (or something else) that makes it easy to adapt to GStreamer updates.
Use Cases
Here are some use cases; I want to write more with input from mentors and the community. These use cases describe the ideal; I'm not sure we can implement them all with GStreamer.
- User wants to extract the audio from his DVD-Video. The user is running Totem and playing his DVD when he chooses "Extract Operation" from some menu, and a window appears with the titles, chapters and other information about the DVD. The user chooses which titles and chapters to extract and what to extract from them: Images, Audio or Video. For Images he can, for example, choose to extract one image every 10 s of the DVD-Video. For Audio, he chooses an audio codec to re-encode with, or keeps the original audio; each chapter/title can become a new chapter/title, or he can concatenate all audio chapters into one, or rearrange them as he wants (a drag-and-drop interface will be implemented). For Video, he can convert to a new video format or just extract the video without audio. This example is a bit complex; it is a nice feature that gst-media-services could have, but perhaps part of what is described here could be done by the original application.
- User wants to convert his audio file. He can just right-click the audio file, choose the gst-media-services menu, choose Convert and pick the new audio type and properties, such as bitrate and codec. He clicks Convert and it's done. He can also select multiple files and apply the same rule to all of them.
- User wants to replace the audio track of his recorded video. He can just multiplex the new audio file into the video file if the video format supports multiple audio tracks, or extract the video without audio and multiplex the new audio with the extracted video. Note that this is transparent to the user: if the application performs two operations, the user sees only one. This operation can also be done from the file manager, or from a program more commonly used for it, such as PiTiVi or a video player.
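The audio conversion use case above, applied to several selected files with one rule, can be sketched as follows. This is only an illustration under stated assumptions: `ENCODERS`, `build_convert_pipeline` and `batch_convert` are hypothetical names, not part of gst-media-services, and the code only builds `gst-launch`-style pipeline descriptions as strings without running GStreamer. The elements named (`filesrc`, `decodebin`, `audioconvert`, `vorbisenc`, `oggmux`, `flacenc`, `wavenc`, `filesink`) are standard GStreamer elements, but the choice of chain for each target is an assumption.

```python
import os

# Hypothetical table: target extension -> (encoder element, muxer element or None).
ENCODERS = {
    "ogg": ("vorbisenc", "oggmux"),
    "flac": ("flacenc", None),
    "wav": ("wavenc", None),
}


def build_convert_pipeline(src, target_ext):
    """Return (pipeline description, output path) for one audio file."""
    encoder, muxer = ENCODERS[target_ext]
    # The output file sits next to the input, with the new extension.
    dst = os.path.splitext(src)[0] + "." + target_ext
    stages = [
        'filesrc location="%s"' % src,
        "decodebin",      # let GStreamer detect and decode the input format
        "audioconvert",   # adapt the raw audio format for the encoder
        encoder,
    ]
    if muxer:
        stages.append(muxer)
    stages.append('filesink location="%s"' % dst)
    return " ! ".join(stages), dst


def batch_convert(files, target_ext):
    """Apply the same conversion rule to every selected file."""
    return [build_convert_pipeline(path, target_ext) for path in files]


if __name__ == "__main__":
    for description, output in batch_convert(["a.mp3", "b.wav"], "ogg"):
        print(output, "<-", description)
```

Each description could then be handed to `gst-launch` or to GStreamer's parse-launch facility. File names containing quotes or other special characters would need escaping, which this sketch does not handle.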
Implementation
gst-media-services is being prototyped in Python, with Glade/PyGTK for the dialogs. The intention is to port it to C, like the rest of the GStreamer code, once the prototype works well.
As Stefan also said, the original project is too big to be completed in three months. The proposal here is limited to offering a service, some library calls and a simple set of dialogs for the desired operations. The intention is to integrate fully with the user's desktop, so that any application (s)he uses can offer conversion operations (and only the operations that application needs).
I think a suitable plan that can bring good results is to divide this Summer of Code into four parts:
1) Code the core - Code the base of the service: recognize the input format, extract properties from the input file, build the XML/EncodingProfiles interpreter, and code the core classes (such as a media format class).
2) Listening - Code the first format conversions: audio operations. SoundConverter could help a lot here, as a reference and even as a source of reusable code. The only operation in this part is conversion, so conversion and audio are what this first milestone version supports. Also, design a simple dialog GUI to provide to applications by default.
3) Viewing - Extend the convert operation to video, and start on extracting audio / muted video and on mixing audio and video into a video. At this point it is possible to extract, mix and convert video and audio, but advanced options are not intended yet.
4) First usable version - This is the point to bring everything together, reorganize the code and add some advanced options, like choosing the bitrate or positioning audio at a given time in the video. If time permits, I can work on a plugin system, so that in the future we could have nice plugins, such as adding effects/noise to audio/video. Supporting all GStreamer formats is not expected, only the most popular ones; since this can easily be extended afterwards, it could be considered Extra.
After Summer of Code, new operations could be implemented, such as merging text with videos, making videos from images, audio and text, and so on, turning this into an easy media service provider with simple operations. They would be very nice to have in the project, but are not essential for gst-media-services to be usable.
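The core described in part 1 (recognize the input format, keep its properties in a media format class) might start from something like the sketch below. All names here (`MediaFormat`, `recognize`, `available_operations`) are hypothetical, and the standard library's `mimetypes` module, which only looks at the file extension, is used as a stand-in for GStreamer's real type detection so that the example runs without GStreamer installed. The mapping from media kind to operations is likewise an assumption for illustration.

```python
import mimetypes
from dataclasses import dataclass, field


@dataclass
class MediaFormat:
    """Hypothetical core class describing one input file."""
    path: str
    mime_type: str                      # e.g. "audio/mpeg"
    properties: dict = field(default_factory=dict)  # bitrate, codec, ...

    @property
    def kind(self):
        # "audio", "video" or "image": the part before the slash.
        return self.mime_type.split("/", 1)[0]


def recognize(path):
    """Guess the media type from the file name (stand-in for typefinding)."""
    mime_type, _encoding = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError("unknown media type: %s" % path)
    return MediaFormat(path=path, mime_type=mime_type)


# Assumed mapping of media kind to the three Media Operations.
OPERATIONS = {
    "audio": ["convert", "converge"],
    "image": ["convert", "converge"],
    "video": ["convert", "converge", "extract"],
}


def available_operations(media):
    """Operations a dialog could offer for this input."""
    return OPERATIONS.get(media.kind, [])


if __name__ == "__main__":
    media = recognize("song.mp3")
    print(media.kind, available_operations(media))
```

Keeping detection behind a single `recognize` function means the extension-based guess could later be swapped for GStreamer's own detection without touching the dialogs.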
The project could use GStreamer's infrastructure for the repository, website and so on, but SourceForge is an alternative. It would be worthwhile to use a platform that supports internationalization of the application. Operations on multiple files will also be available.
Software to use: a code editor, Glade or Gazpacho, Python, CVS/Bazaar/SVN/Git, GNOME and any other free software I may need. Libraries to use: PyGTK, PyGlade, GTK and of course GStreamer.
GStreamer Encoding Profiles
EncodingProfiles need to be implemented in some way. Use XML? Or just a line-by-line list of values?
Reasons to do it in XML:
- easily extensible: if we want to add a new property in a future version, it can be done easily;
- plugin hints alongside profiles: the same profile file can reference the plugins needed for each operation;
- validation: as with any XML format, we can provide an XML schema file to catch possible bugs/errors.
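To make the XML option concrete, here is a minimal sketch of what a profile file and its reader could look like. The element and attribute names (`profiles`, `profile`, `plugin`, `property`, `name`, `extension`) are invented for this example and are not an existing GStreamer format; `load_xml_profiles` is a hypothetical helper. Python's `xml.etree.ElementTree` does not validate against an XML schema, so the hand-written check for a missing `name` only stands in for the schema validation mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile file: one <profile> per target format.
PROFILES_XML = """
<profiles>
  <profile name="ogg-vorbis" extension="ogg">
    <plugin>vorbisenc</plugin>
    <plugin>oggmux</plugin>
    <property name="bitrate">128000</property>
  </profile>
  <profile name="flac" extension="flac">
    <plugin>flacenc</plugin>
  </profile>
</profiles>
"""


def load_xml_profiles(xml_text):
    """Parse the profile file into a dict keyed by profile name."""
    profiles = {}
    root = ET.fromstring(xml_text)
    for node in root.findall("profile"):
        name = node.get("name")
        if not name:
            # Minimal stand-in for real schema validation.
            raise ValueError("profile without a name attribute")
        profiles[name] = {
            "extension": node.get("extension"),
            # Plugins needed for the operation, in order.
            "plugins": [p.text.strip() for p in node.findall("plugin")],
            # Properties are kept as strings; unknown ones are simply carried along.
            "properties": {
                p.get("name"): p.text.strip() for p in node.findall("property")
            },
        }
    return profiles


if __name__ == "__main__":
    for name, profile in load_xml_profiles(PROFILES_XML).items():
        print(name, profile)
```

Adding a new `<property>` to a profile needs no change to the reader, which is the extensibility argument in practice.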
Reasons to do it in a line-by-line plain text file:
- easy to work with: handling a plain text file is simpler than parsing an XML file.
Any more items to add to either list? Feel free to add them!
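The line-by-line alternative could look like the following sketch. The file layout (one profile per line, a name followed by `key=value` pairs, `#` for comments) and the helper `load_text_profiles` are assumptions made up for this illustration, not a defined format.

```python
# Hypothetical plain text profile file: one profile per line.
PROFILES_TXT = """
# name       key=value pairs
ogg-vorbis   extension=ogg encoder=vorbisenc muxer=oggmux bitrate=128000
flac         extension=flac encoder=flacenc
"""


def load_text_profiles(text):
    """Parse 'name key=value ...' lines into a dict keyed by profile name."""
    profiles = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        name, *pairs = line.split()
        profiles[name] = dict(pair.split("=", 1) for pair in pairs)
    return profiles


if __name__ == "__main__":
    for name, profile in load_text_profiles(PROFILES_TXT).items():
        print(name, profile)
```

The reader is only a few lines long, which illustrates the "easy to work with" argument, but values containing spaces and nested data such as an ordered plugin list would need extra conventions that XML gives for free.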
Extra
- Challenges: What I expect to be hardest for me in this project, given my experience, is learning GStreamer and implementing the media operations, as I have very little experience with GStreamer. I will also need suggestions and help with the GUIs and the interaction, but the hardest part for me will be processing the operations. I will need help with Encoding Profiles as well.
- Project Draft: Everything I wrote here is a draft. I'm ready to discuss any change with the community and mentors, as I'm here to learn.
- Project portability: This project can run on any other platform that has PyGTK and the Python bindings for GStreamer (and of course GStreamer itself).
Talking
Christian Schaller: Looks good. One important feature you should aim to make work well is keeping the original audio. For instance, when I rip a DVD I want to preserve the AC3 audio without any changes, while transcoding the video to Dirac or H264, for instance. I wrote a blog entry on this issue not too long ago, discussing UI challenges with transcoders - http://blogs.gnome.org/uraeus/2008/01/17/the-world-of-transcoding/
Roberto Faga: Christian, nice entry. I agree with the options to keep the same audio (or video/image) and with being easy to use. We have some transcoders, but none of them is very easy to use, or they don't cover all the operations described here. That is the main goal of this application: to provide easy media operations/transcoding. As for the many possible formats, including per-platform formats (PS3, iPod, etc.), that is why I thought of an XML database with different instructions. I expect community help in indexing all the possible formats, and I can work on it at the same time, since changing or adding operations and format-to-platform associations is not supposed to be hard.

