Compressing Sound

On the Newton 2.1 OS, you can play compressed sounds and record sound, compressing it on the fly. You control sound compression by setting slots in the sound frame; for a complete description of sound frame slots, see "Sound Frame". The key slots controlling sound compression include:

sndFrameType, specifying whether or not to use a codec
codecName, identifying the codec to use
bufferCount, setting how many codec buffers to allocate
bufferSize, setting the size of each codec buffer in bytes
compressionType, specifying the compression type if a codec is not used
samplingRate, specifying the number of samples per second to play or record
compressionRatio, an optional slot specifying the compression ratio for user interface updating
You can compress and decompress sound using one of the built-in codecs, as described in the following subsections.
In addition to using compression, you can control the size of recorded sounds by varying the sampling rate used when recording. The highest-quality sound results from a sampling rate of 21600; however, this rate also uses the most storage. You can use lower sampling rates to decrease the size of the resulting sound object, with some loss of quality. Reasonable sampling rates are 8000 on the MessagePad 2000 unit and 10000 on the eMate 300 unit.
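The storage trade-off can be estimated with simple arithmetic. The following Python sketch is illustrative only and is not part of the Newton API; the helper name `uncompressed_bytes` is hypothetical, and it assumes uncompressed 16-bit (2-byte) linear samples.

```python
def uncompressed_bytes(sampling_rate, seconds, bytes_per_sample=2):
    """Estimate storage for uncompressed sound.

    Assumes linear samples of a fixed size (16-bit, i.e. 2 bytes, by default).
    """
    return int(sampling_rate * seconds * bytes_per_sample)


# Compare ten seconds of sound at the sampling rates mentioned in the text.
for rate in (21600, 10000, 8000):
    print(rate, "samples/sec ->", uncompressed_bytes(rate, 10), "bytes")
```

Under these assumptions, recording at 8000 samples per second takes well under half the storage of recording at 21600, before any codec is applied.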
Note that playing back a sound at a sampling rate different from that at which it was recorded shifts the sound's pitch.
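The size of the pitch shift follows from the ratio of the two rates: playing a sound faster than it was recorded raises its pitch and shortens it by the same factor. The Python sketch below is illustrative only (the function name is hypothetical, not a Newton API) and expresses the shift in equal-tempered semitones, where a doubling of the rate is one octave (12 semitones).

```python
import math


def pitch_shift_semitones(recorded_rate, playback_rate):
    """Return the pitch shift, in semitones, caused by a rate mismatch.

    Positive values mean the sound plays back higher (and shorter);
    negative values mean lower (and longer).
    """
    ratio = playback_rate / recorded_rate
    return 12 * math.log2(ratio)


# A sound recorded at 8000 samples/sec and played at 10000 samples/sec.
print(round(pitch_shift_semitones(8000, 10000), 2))
```

To avoid the shift, keep the samplingRate slot of the sound frame at the rate used when the sound was recorded.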

Using Codecs to Compress and Decompress Sound
You can use one of several codecs built into the Newton 2.1 OS to compress and decompress sounds. The codec mechanism uses intermediate buffers to hold the sound data, so that the sound does not skip while the system accesses VBO storage. You should use at least two buffers; four buffers of the same size typically work well. The total size of all the buffers should be enough to hold 2-3 seconds of sound, so the size of each buffer varies with the sampling rate and the size of each sample.
For example, if the sampling rate is 8000 samples per second and you are recording 16-bit samples, the data rate is 16 KB per second. A total buffer size of 40 KB holds 2.5 seconds of sound, which is adequate to handle the data. You could allocate four 10 KB buffers to satisfy this need. In the following examples, various values for buffer size are shown, depending on the sampling rate.
Note that if the sound sample data is not stored in a VBO, the buffers can be much smaller.
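The buffer sizing rule above can be worked through as a small calculation. This Python sketch is illustrative only; `codec_buffer_size` is a hypothetical helper, not a Newton API. It assumes 16-bit samples and a 2.5-second target (the middle of the 2-3 second range), and it sizes the buffers for the uncompressed data held in them.

```python
import math


def codec_buffer_size(sampling_rate, seconds=2.5, bytes_per_sample=2,
                      buffer_count=4):
    """Return a per-buffer size in bytes (a candidate bufferSize value).

    The buffers together hold `seconds` of uncompressed sound.
    """
    if buffer_count < 2:
        raise ValueError("use at least two codec buffers")
    total_bytes = sampling_rate * bytes_per_sample * seconds
    return math.ceil(total_bytes / buffer_count)


print(codec_buffer_size(8000))    # four buffers for 8000 samples/sec
print(codec_buffer_size(10000))   # four buffers for 10000 samples/sec
```

With these assumptions, the results for 8000 and 10000 samples per second agree with the bufferSize values of 10000 and 12500 used in the sound frames in this section.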
When using a codec, the samplingRate, compressionType, and dataType slots define the format of the data produced by decompression (for playback), or required for compression (for recording). They also define the format of the data in the intermediate buffers.
To compress or decompress sounds using the muLaw codec, set up the slots in the sound frame like this:
 {sndFrameType: 'codec,             // use a codec
  codecName: "TMuLawCodec",         // select muLaw codec
  bufferSize: 10000,                // size of codec buffers
  bufferCount: 4,                   // # of codec buffers
  compressionType: kSampleLinear,   // for playback
  dataType: k16bit,                 // for playback
  samples: mySamples,               // compressed sound object
  samplingRate: 8000,               // set to any value you want
  compressionRatio: 1,              // for muLaw
 }
The muLaw codec compresses each 16-bit sample to 8 bits.
To use the IMA codec, set up the slots in the sound frame like this:
 {sndFrameType: 'codec,             // use a codec
  codecName: "TIMACodec",           // select IMA codec
  bufferSize: 12500,                // size of codec buffers
  bufferCount: 4,                   // # of codec buffers
  compressionType: kSampleLinear,   // for playback
  dataType: k16bit,                 // for playback
  samples: mySamples,               // compressed sound object
  samplingRate: 10000,              // set to any value you want
  compressionRatio: 64/34,          // for IMA
 }
This example shows the same values as those used for the highest-quality recording level on the Newton 2.1 devices (Music on the MessagePad 2000, and High on the eMate 300). A sampling rate of 8000 is used for the next-lower quality level (Voice 4K on the MessagePad 2000, and Low on the eMate 300).
The IMA codec compresses each block of 64 16-bit samples (128 bytes) to 34 bytes, saving about 73% over no compression.
To use the GSM codec, set up the slots in the sound frame like this:
 {sndFrameType: 'codec,             // use a codec
  codecName: "TGSMCodec",           // select GSM codec
  bufferSize: 10000,                // size of codec buffers
  bufferCount: 4,                   // # of codec buffers
  compressionType: kSampleLinear,   // for playback
  dataType: k16bit,                 // for playback
  samples: mySamples,               // compressed sound object
  samplingRate: 8000,               // set to any value you want
  compressionRatio: 160/33,         // for GSM
 }
This example shows the same values as those used for the lowest-quality recording level on the MessagePad 2000 device. Note that GSM is a mathematically intensive compression technique that works effectively only on fast processors, such as the one used in the MessagePad 2000. It is possible to play GSM-encoded sounds on the eMate 300, but not to record them.
The GSM codec compresses each block of 160 16-bit samples (320 bytes) to 33 bytes, saving about 90% over no compression.
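The savings figures for the three codecs can be checked from the block sizes given in this section. The Python sketch below is illustrative only; the table and function names are hypothetical and not part of the Newton API. Reading compressionRatio as "samples per compressed byte" is an inference from the example values (1, 64/34, and 160/33), not something the text states.

```python
# (samples per block, compressed bytes per block), as described in the text.
# muLaw is treated as a one-sample block: each 16-bit sample becomes 1 byte.
CODEC_BLOCKS = {
    "muLaw": (1, 1),
    "IMA": (64, 34),
    "GSM": (160, 33),
}


def savings_percent(samples_per_block, bytes_per_block, bytes_per_sample=2):
    """Percentage of storage saved relative to uncompressed 16-bit samples."""
    raw_bytes = samples_per_block * bytes_per_sample
    return 100 * (1 - bytes_per_block / raw_bytes)


def compressed_bytes_per_second(sampling_rate, samples_per_block,
                                bytes_per_block):
    """Compressed data rate for a given sampling rate."""
    return sampling_rate * bytes_per_block / samples_per_block


for name, (samples, nbytes) in CODEC_BLOCKS.items():
    print(name,
          "ratio:", round(samples / nbytes, 3),
          "savings:", round(savings_percent(samples, nbytes)), "%")
```

The same figures give the compressed data rate: for example, GSM at 8000 samples per second produces 8000 × 33 / 160 = 1650 bytes per second.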