Using Sox

Last updated February 2007.

Introduction
Copying the Input
Effects
Changing the Number of Channels
Using Sox to Change Volume
Using Sox to Obtain Information
Using Sox to Synthesize Sound
Using Sox to Combine Sound Files
Using Sox to Play and Record

Introduction

Sox is a very useful program, but its command line syntax is confusing, and it isn't always easy to work out how to make it do what you want. Under most circumstances, sox copies its input to its output, possibly making changes along the way. It therefore needs an input file name and an output file name, possibly together with information about them. If you want to do anything other than copy the input to the output (possibly with a change in format), you must also specify what to do.

Copying the Input

The simplest use of sox therefore is with two filenames as arguments:

sox foo.aiff foo.wav

This command tells sox to copy the file foo.aiff, changing its format from aiff to wav. Sox will infer the type of a file from its extension. Since the header of the aiff file contains sufficient information about the file to convert it to wav format, no other information is necessary.
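When many files need the same conversion, the command can be wrapped in a loop. The following is a minimal sketch, assuming a POSIX shell; aiff_to_wav is a hypothetical helper name, not something sox provides.

```shell
# aiff_to_wav FILE.aiff ...
# Convert each named aiff file to wav using the two-argument form of sox.
# (aiff_to_wav is a hypothetical helper, not part of sox.)
aiff_to_wav() {
    for src in "$@"; do
        dst="${src%.aiff}.wav"        # foo.aiff -> foo.wav
        sox "$src" "$dst" || return 1 # stop at the first failure
    done
}
```

For example, aiff_to_wav *.aiff would convert every aiff file in the current directory, leaving the originals in place.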

Sometimes you need to convert a file to pure PCM data so that it can be processed by programs that don't understand the various encodings and headers. Such pure PCM files are called raw files and are recognized by sox by the extension .raw. By using this extension you can use sox to convert a file to raw format. The command:

sox foo.wav foo.raw

converts the file foo.wav to raw format.

We can also use sox to convert a raw file to another format. In this case, we have to supply some information about the raw file:

sox -r 44100 -s -w foo.raw foo.wav

The three flags preceding the input file name tell sox that the input file has a sampling rate of 44,100 samples per second, that the data is signed, and that each sample consists of a two-byte word. With this information, sox can create a copy in wav format. The wav header must also record the number of channels, but this need not be specified for the input file because sox assumes a default of mono.
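Because a raw file has no header, its size is the only clue to its length, and the arithmetic only works out if the parameters given to sox are right. As a rough sanity check, the sketch below computes the duration in whole seconds implied by a file size. raw_seconds is a hypothetical helper name, and the sample width is given in bytes (2 for a two-byte word).

```shell
# raw_seconds BYTES RATE WIDTH CHANNELS
# Print the duration, in whole seconds, of a raw file of BYTES bytes
# sampled at RATE samples per second, WIDTH bytes per sample, CHANNELS channels.
# (raw_seconds is a hypothetical helper, not part of sox.)
raw_seconds() {
    bytes=$1
    rate=$2
    width=$3
    channels=$4
    # Integer division: any fraction of a second is discarded.
    echo $(( bytes / (rate * width * channels) ))
}
```

For example, raw_seconds 88200 44100 2 1 prints 1, since one second of mono audio at 44,100 two-byte samples per second occupies 88,200 bytes. The size of a file can be obtained with wc -c < foo.raw. If the duration computed for a file is off by a large factor, one of the parameters is probably wrong.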

It is also possible to change the representation of the data. For example, we can change the sampling rate by specifying the sampling rate for the output file:

sox foo.wav -r 22050 foonew.wav

This command changes the sampling rate to 22,050 samples per second.

Effects

Thus far we have used sox only to copy a file, possibly with a change in format. Sox can also transform its input in various ways. Some of these, such as reverb, are for musical use, but a number of effects, such as filtering, may be useful for phonetics. The name of the effect follows the name of the output file. Any further parameters necessary to specify the effect follow its name. For example, the command:

sox foo.wav bar.wav lowp 1000.0

applies a low pass filter with cutoff at 1000 Hz to foo.wav and puts the result in bar.wav.

Changing the Number of Channels

Sox can also change the number of channels. For example, some sound cards insist on stereo data, so it may be useful to convert monaural sound files to stereo. This command does the job:

sox foo.wav -c 2 foostereo.wav split

Of course, we can't create true stereo from mono data; the effect of the command is to duplicate the original single channel.

On the other hand, sometimes it is necessary to extract a single channel from a stereo recording. This may be because we want to process it using software that cannot deal with stereo input, or it may be because we are interested only in one channel. Sox can deal with monaural (1 channel), stereo (2 channel) and quadriphonic (4 channel) data.

There are two ways to reduce the number of channels. One is to select a particular channel. This is done by using the avg effect with an option indicating which channel to use. The options are -l for left, -r for right, -f for front, and -b for back. For example, to extract the left channel give a command like this:

sox foo.wav -c 1 foomono.wav avg -l

Another approach is to average the channels. To create a monaural file from a stereo file by averaging the two channels, give a command like this:

sox foo.wav -c 1 foomono.wav avg
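If both channels of a stereo recording are wanted as separate files, the channel-selecting form of avg can simply be run twice. The following is a minimal sketch using only the avg options described above; split_channels and the -left/-right file naming are hypothetical choices made for illustration.

```shell
# split_channels FILE.wav
# Write the left and right channels of a stereo file to
# FILE-left.wav and FILE-right.wav, using avg -l and avg -r.
# (split_channels is a hypothetical helper, not part of sox.)
split_channels() {
    src=$1
    base="${src%.wav}"                          # foo.wav -> foo
    sox "$src" -c 1 "${base}-left.wav" avg -l &&
    sox "$src" -c 1 "${base}-right.wav" avg -r
}
```

For example, split_channels foo.wav would leave foo.wav untouched and create foo-left.wav and foo-right.wav beside it.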

Using Sox to Change Volume

The general option -v is used to change the volume. The argument to this option is used as a multiplier. For example, the command:

sox -v 2.0 foo.wav bar.wav

places in bar.wav a copy of foo.wav with the volume doubled.

Using Sox to Obtain Information

The "stat" effect produces statistical information about the audio data:

sox foo.wav -e stat

The -e flag tells sox not to generate any output other than the statistical information.

If the stat effect is followed by the flag -v, all that is printed is the multiplier that will maximize the volume without clipping. This value can be used as the argument to the -v general option.
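The two steps can be chained so that a file is raised to its maximum volume in one go. The sketch below is hedged: it assumes that stat writes its report to standard error (worth verifying on the installed version of sox), and normalize is a hypothetical helper name.

```shell
# normalize INPUT OUTPUT
# Copy INPUT to OUTPUT at the largest volume that does not clip.
# (normalize is a hypothetical helper, not part of sox.)
normalize() {
    src=$1
    dst=$2
    # Assumption: stat prints to standard error, so 2>&1 is needed
    # for the command substitution to capture the multiplier.
    gain=$(sox "$src" -e stat -v 2>&1) || return 1
    # Feed the multiplier back in through the -v general option.
    sox -v "$gain" "$src" "$dst"
}
```

For example, normalize foo.wav foonorm.wav would read foo.wav twice, once to measure it and once to write the rescaled copy.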

Using Sox to Synthesize Sound

Sox can synthesize a number of standard waveforms (sine wave, square wave, etc.) and types of noise. These are specified by means of the synth effect. Even though sox creates the output from scratch, an input file name must still be specified. The input file should be /dev/null, with the -t flag to specify its special type:

sox -t nul /dev/null sine.wav synth 1.0 sine 1000.0

This command synthesizes a 1000 Hz sine wave 1.0 seconds long, leaving the result in sine.wav.
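To produce a set of test tones, the same command can be repeated for each frequency. The following is a minimal sketch built on the synth command above; make_tones and the sineFREQ.wav naming are hypothetical, and it assumes sox accepts a frequency written without a decimal point.

```shell
# make_tones FREQ ...
# Synthesize a one-second sine wave for each frequency (in Hz),
# writing each to sineFREQ.wav in the current directory.
# (make_tones is a hypothetical helper, not part of sox.)
make_tones() {
    for freq in "$@"; do
        sox -t nul /dev/null "sine${freq}.wav" synth 1.0 sine "$freq" || return 1
    done
}
```

For example, make_tones 100 250 would create sine100.wav and sine250.wav.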

Using Sox to Combine Sound Files

If called as soxmix, sox adds two input files together to produce its output. For example, the command:

soxmix sine100.wav sine250.wav sine100-250.wav

adds sine100.wav and sine250.wav, leaving the result in sine100-250.wav.

Using Sox to Play and Record

On our GNU/Linux systems, sox provides the usual means for playing and recording sound files. The play command is actually a shell script that calls sox. Playing a sound file is accomplished by copying the file to the device special file /dev/dsp. The following command plays the file foo.wav:

sox foo.wav -t ossdsp /dev/dsp

The -t flag specifies the type of the file /dev/dsp.