Thursday, August 21, 2008

Multichannel audio with gstreamer

Over the past few months, I have spent a lot of time working with the gstreamer multimedia framework. From their site:
GStreamer is a library for constructing [...] graphs of media-handling components. The use cases it covers range from simple Ogg/Vorbis playback, audio/video streaming to complex audio (mixing) and video (non-linear editing) processing.
I've found gstreamer to be remarkably flexible and useful for a variety of audio-video applications. One of the trickier things I had to figure out was how to have 8 channels of audio playing in one gstreamer pipeline.

These examples require the following packages:
  1. JACK Audio Connection Kit libraries
  2. gst-plugins-base-0.10.20
  3. gst-plugins-good-0.10.9
  4. gst-plugins-bad-0.10.8

These modules should be installed in the above order. Personally I use the CVS head for all of the above, which you can get by doing:
$ cvs co modulename
where modulename is gstreamer, gst-plugins-base, gst-plugins-good, and gst-plugins-bad respectively.

Gstreamer includes a command-line utility, gst-launch, that allows a user to quickly build a gstreamer pipeline with a simple text description. For example:

$ gst-launch audiotestsrc ! jackaudiosink

will play a sine wave, provided you are already running a jack server. I generally run a jack server using the qjackctl application.

Users should be aware of this warning from the gst-launch man page:

gst-launch is primarily a debugging tool for developers and users. You should not build applications on top of it. For applications, use the gst_parse_launch() function of the GStreamer API as an easy way to construct pipelines from pipeline descriptions.

It is possible to run a simple multichannel audio example with the following launch line:

gst-launch-0.10 -v interleave name=i ! audioconvert ! audioresample ! queue ! jackaudiosink \
audiotestsrc volume=0.125 freq=200 ! audioconvert ! queue ! i. \
audiotestsrc volume=0.125 freq=300 ! audioconvert ! queue ! i. \
audiotestsrc volume=0.125 freq=400 ! audioconvert ! queue ! i. \
audiotestsrc volume=0.125 freq=500 ! audioconvert ! queue ! i. \
audiotestsrc volume=0.125 freq=600 ! audioconvert ! queue ! i. \
audiotestsrc volume=0.125 freq=700 ! audioconvert ! queue ! i. \
audiotestsrc volume=0.125 freq=800 ! audioconvert ! queue ! i. \
audiotestsrc volume=0.125 freq=900 ! audioconvert ! queue ! i.

This pipeline consists of 8 audiotestsrc elements, which generate sine tones of increasing frequency. The audioconvert element is used to convert a given audiotestsrc's output to the appropriate numeric data type that the queue element expects. The queue element is a simple data buffer, to which the audioconvert element writes, and from which the interleave element reads. The interleave element combines multiple channels of audio into one interleaved "frame" of audio. For example, if we had 2 independent channels of audio like so:

Channel1: 00000...
Channel2: 11111...

where channel 1 outputs only 0's, and channel 2 only 1's, the interleaved frame would look like:

Interleaved: 0101010101...

The interleaved audio again needs to go through an audioconvert and an audioresample element in case the audio from our pipeline differs in datatype or sample rate from the jack server. Finally the audio is output by the jackaudiosink element, which writes audio from our pipeline into corresponding jack ports.
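
To make the interleaving step concrete, here is a minimal sketch in plain C of what combining separate channels into interleaved frames amounts to. This is not gstreamer's actual implementation; the function name interleave_channels and the use of float samples are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of gstreamer): copy `channels` separate
 * buffers of `frames` samples each into one interleaved buffer.
 * `out` must have room for channels * frames samples. */
static void interleave_channels(const float *const *in, size_t channels,
                                size_t frames, float *out)
{
    assert(in != NULL && out != NULL);

    for (size_t f = 0; f < frames; f++) {
        /* One frame holds one sample from every channel, in channel order. */
        for (size_t c = 0; c < channels; c++) {
            out[f * channels + c] = in[c][f];
        }
    }
}
```

Called with the two channels from the example above (one buffer of all 0's and one of all 1's), the output buffer alternates 0, 1, 0, 1, and so on. With 8 channels, each frame would hold 8 consecutive samples, one per channel.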

Many plugins require that the interleave element explicitly specify each channel's spatial position. Unfortunately, this cannot be done with gst-launch. I've created an example C program, multiChannel.c, which initializes interleave appropriately. It can be compiled with this Makefile. The relevant section in multiChannel.c is the function set_channel_layout(GstElement *interleave). This function is passed our interleave element, and sets its channel-positions property to an array of valid spatial positions.

Gstreamer uses GLib's object system (GObject) heavily, and as a result the above example program might be a little tricky to follow for programmers used to straight C. Check out gstreamer's application development manual for further examples of gstreamer usage in C.


Anonymous said...

That is really a good idea! So do you have any idea how to switch audio tracks when playing back a multi-audio-track movie?

Tristan Matthews said...

Hmm...that's a good question. If the audio for the movie is interleaved already, you would probably need to deinterleave it, then interleave it and set whatever layout you want. Support for multichannel layouts is relatively new in gstreamer so I'm not sure how feasible this would be.

mjoachimiak said...

Thank you so much for sharing your knowledge. You saved a lot of my time, really.

mjoachimiak said...

I am a bit new to gstreamer. I am using two video sources but I couldn't find a gst-launch example of how to read from two files into one element (filter). Many thanks for the idea.
My usage is:
gst-launch -v encoder name=s264 ! filesink location=./Out filesrc location=./Grasshopper/texture/Grasshopper_Right.yuv ! s264.sink_right filesrc location=./Grasshopper/texture/Grasshopper_Left.yuv ! s264.sink_left

Tristan Matthews said...

Hi Michael,

I saw on the mailing list that you got some feedback. Let me know if you have any other questions, and thanks for reading! Also I'm due to write another tutorial soon so if you have any requests please let me know.