Digital audio is the most commonly used method of representing sound inside a computer. In this method sound is stored as a sequence of samples taken from the audio signal at constant time intervals. A sample represents the volume of the signal at the moment it was measured. The length of the interval between samples determines the sampling rate. Commonly used sampling rates are between 8 kHz (telephone quality) and 48 kHz (DAT tapes). In uncompressed digital audio each sample requires one or more bytes of storage. The number of bytes required depends on the number of channels (mono, stereo) and the sample format (8 or 16 bits, mu-Law, etc.).
The physical devices used in digital audio are called the ADC (Analog to Digital Converter) and the DAC (Digital to Analog Converter). A device containing both an ADC and a DAC is commonly known as a codec. The codec device used in Sound Blaster cards is called a DSP, which is somewhat misleading since DSP also stands for Digital Signal Processor (the SB DSP chip is very limited when compared to "true" DSP chips).
Sampling parameters affect the quality of the sound which can be reproduced from the recorded signal. The most fundamental parameter is the sampling rate, which limits the highest frequency that can be stored. It is well known (Nyquist's Sampling Theorem) that the highest frequency that can be stored in a sampled signal is at most 1/2 of the sampling frequency. For example, an 8 kHz sampling rate permits the recording of a signal in which the highest frequency is less than 4 kHz. Higher frequency signals must be filtered out before feeding them to the ADC.
Sample encoding limits the dynamic range of the recorded signal (the difference between the faintest and the loudest signal that can be recorded). In theory the maximum dynamic range of a signal is number_of_bits * 6 dB. This means that 8-bit sampling resolution gives a dynamic range of 48 dB and 16-bit resolution gives 96 dB.
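The 6 dB per bit rule of thumb follows from the ratio between the largest and the smallest step that an n-bit linear sample can represent. A short derivation, assuming linear (not mu-Law or A-Law) encoding:

```latex
\mathrm{DR}(n) = 20 \log_{10}\!\left(2^{n}\right) = 20\,n \log_{10} 2 \approx 6.02\,n \ \mathrm{dB}

\mathrm{DR}(8) \approx 48\ \mathrm{dB}, \qquad \mathrm{DR}(16) \approx 96\ \mathrm{dB}
```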
Quality has its price. The number of bytes required to store an audio sequence depends on the sampling rate, the number of channels and the sampling resolution. For example, just 8000 bytes of memory are required to store one second of sound using 8 kHz/8 bit/mono, but 48 kHz/16 bit/stereo takes 192 kilobytes. A 64 kbps ISDN channel is required to transfer an 8 kHz/8 bit/mono audio stream in real time, and about 1.5 Mbps is required for DAT quality (48 kHz/16 bit/stereo). Put the other way round, a megabyte of memory holds just 5.46 seconds of sound when using 48 kHz/16 bit/stereo sampling. With 8 kHz/8 bit/mono the same amount of memory holds 131 seconds of sound. It is possible to reduce memory and communication costs by compressing the recorded signal, but this is outside the scope of this document.
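The figures above can be reproduced with a few lines of arithmetic. The following is a minimal sketch for uncompressed linear samples; the function names are illustrative only and are not part of the library, and one megabyte is taken as 1048576 bytes, which is what the 5.46 and 131 second figures imply.

```c
/* Storage and bandwidth needed for uncompressed audio.
   All names here are illustrative helpers, not library functions. */

/* Bytes needed to store one second of sound. */
unsigned long bytes_per_second(unsigned long rate_hz, unsigned int bits,
                               unsigned int channels)
{
    return rate_hz * (bits / 8) * channels;
}

/* Bits per second needed to transfer the stream in real time. */
unsigned long bits_per_second(unsigned long rate_hz, unsigned int bits,
                              unsigned int channels)
{
    return rate_hz * bits * channels;
}

/* Seconds of sound that fit into one megabyte (1048576 bytes). */
double seconds_per_megabyte(unsigned long rate_hz, unsigned int bits,
                            unsigned int channels)
{
    return 1048576.0 / (double)bytes_per_second(rate_hz, bits, channels);
}
```

For example, bytes_per_second( 48000, 16, 2 ) gives 192000 and bits_per_second( 8000, 8, 1 ) gives 64000, matching the 192 kilobyte and 64 kbps figures.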
Audio devices are opened exclusively for a selected direction. The same audio device cannot be opened in the same direction by more than one process at a time, but the playback direction and the record direction can be opened independently by separate open calls. Audio devices return an EBUSY error to applications when another application has already opened the requested direction.
Low-Level layer supports these formats:
#define SND_PCM_SFMT_MU_LAW 0
#define SND_PCM_SFMT_A_LAW 1
#define SND_PCM_SFMT_IMA_ADPCM 2
#define SND_PCM_SFMT_U8 3
#define SND_PCM_SFMT_S16_LE 4
#define SND_PCM_SFMT_S16_BE 5
#define SND_PCM_SFMT_S8 6
#define SND_PCM_SFMT_U16_LE 7
#define SND_PCM_SFMT_U16_BE 8
#define SND_PCM_SFMT_MPEG 9
#define SND_PCM_SFMT_GSM 10
#define SND_PCM_FMT_MU_LAW (1 << SND_PCM_SFMT_MU_LAW)
#define SND_PCM_FMT_A_LAW (1 << SND_PCM_SFMT_A_LAW)
#define SND_PCM_FMT_IMA_ADPCM (1 << SND_PCM_SFMT_IMA_ADPCM)
#define SND_PCM_FMT_U8 (1 << SND_PCM_SFMT_U8)
#define SND_PCM_FMT_S16_LE (1 << SND_PCM_SFMT_S16_LE)
#define SND_PCM_FMT_S16_BE (1 << SND_PCM_SFMT_S16_BE)
#define SND_PCM_FMT_S8 (1 << SND_PCM_SFMT_S8)
#define SND_PCM_FMT_U16_LE (1 << SND_PCM_SFMT_U16_LE)
#define SND_PCM_FMT_U16_BE (1 << SND_PCM_SFMT_U16_BE)
#define SND_PCM_FMT_MPEG (1 << SND_PCM_SFMT_MPEG)
#define SND_PCM_FMT_GSM (1 << SND_PCM_SFMT_GSM)
Constants with prefix SND_PCM_FMT_ are used in info structures
and constants with prefix SND_PCM_SFMT_ are used in format structures.
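Because each SND_PCM_FMT_ constant is simply 1 shifted left by the matching SND_PCM_SFMT_ value, a supported-formats mask can be tested against a format number directly. A minimal sketch follows; format_supported is an illustrative helper that is not part of the library, and the three constant pairs are repeated from the listing above only so that the fragment compiles on its own.

```c
/* Values repeated from the listing above so this sketch is self-contained. */
#define SND_PCM_SFMT_MU_LAW 0
#define SND_PCM_SFMT_U8 3
#define SND_PCM_SFMT_S16_LE 4
#define SND_PCM_FMT_MU_LAW (1 << SND_PCM_SFMT_MU_LAW)
#define SND_PCM_FMT_U8 (1 << SND_PCM_SFMT_U8)
#define SND_PCM_FMT_S16_LE (1 << SND_PCM_SFMT_S16_LE)

/* Illustrative helper (not a library function): returns 1 if the bit mask
   formats_mask (SND_PCM_FMT_ bits) contains the format number sfmt
   (an SND_PCM_SFMT_ value), otherwise 0. */
int format_supported(unsigned int formats_mask, unsigned int sfmt)
{
    return (formats_mask & (1u << sfmt)) != 0;
}
```

An application would typically pass the formats field of an info structure as the mask and the format number it intends to put into a format structure as the second argument.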
Creates a new handle and opens a connection to the kernel sound audio interface for soundcard number card (0-N) and audio device number device. The function also checks whether the protocol is compatible, to prevent old programs from being used with a new kernel API. The function returns zero if successful, otherwise it returns an error code. Error code -EBUSY is returned when another process owns the selected direction.
Default format after opening is mono mu-Law at 8000Hz. This device can be used directly for playback of standard .au (Sparc) files.
The following modes should be used for the mode argument:
#define SND_PCM_OPEN_PLAYBACK (O_WRONLY)
#define SND_PCM_OPEN_RECORD (O_RDONLY)
#define SND_PCM_OPEN_DUPLEX (O_RDWR)
Frees all resources allocated with audio handle and closes the connection to the kernel sound audio interface. Function returns zero if successful, otherwise it returns an error code.
Returns the file descriptor of the connection to the kernel sound audio interface. Function returns an error code if an error was encountered.
The file descriptor should be used with the select synchronous multiplexer function to wait until the device is ready for reading or writing. The application should then call the snd_pcm_read or snd_pcm_write function if some data is waiting to be read or a write can be performed. Calling these functions, rather than reading or writing the file descriptor directly, is highly recommended, as it leaves room for the API to do things like data conversions, if needed.
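Waiting on a descriptor with select can be sketched as follows. This is only an illustration using standard POSIX calls; wait_readable and wait_writable are hypothetical helper names, and in a real program fd would be the descriptor returned for the PCM handle.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait until fd is ready in one direction or the timeout expires.
   Returns 1 if ready, 0 on timeout, -1 on error. */
static int wait_fd(int fd, int for_write, int timeout_ms)
{
    fd_set fds;
    struct timeval tv;
    int rc;

    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* Put the descriptor into either the read set or the write set. */
    rc = select(fd + 1, for_write ? NULL : &fds, for_write ? &fds : NULL,
                NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/* Hypothetical helpers: data is waiting to be read / a write can be done. */
int wait_readable(int fd, int timeout_ms) { return wait_fd(fd, 0, timeout_ms); }
int wait_writable(int fd, int timeout_ms) { return wait_fd(fd, 1, timeout_ms); }
```

When wait_readable reports 1 the application would call snd_pcm_read, and when wait_writable reports 1 it would call snd_pcm_write.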
Sets up block (default) or nonblock mode for a handle. In block mode, a call to snd_pcm_read or snd_pcm_write suspends the program for the time needed to actually play back or record the entire buffer. In nonblock mode, the program isn't suspended and the above functions return immediately with the count of bytes which were read or written by the driver. When used in this way, don't assume that the entire buffer was processed by the call; instead handle the number of bytes returned and call the function again for the rest.
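The "process the number of bytes returned and call again" pattern for nonblock mode can be sketched like this. To keep the sketch independent of the sound library, the write call is passed in as a callback; pcm_write_fn, write_all, struct fake_dev and fake_write are all hypothetical names invented for this illustration, and fake_write merely stands in for a device that accepts a few bytes per call.

```c
#include <string.h>

/* Hypothetical callback with the general shape of a PCM write call:
   returns the number of bytes accepted (>= 0) or a negative error code. */
typedef int (*pcm_write_fn)(void *handle, const char *data, int count);

/* Keep calling write_fn with the unwritten remainder of the buffer.
   Returns the number of bytes written, which is less than count if the
   device stops accepting data, or a negative error code. */
int write_all(pcm_write_fn write_fn, void *handle, const char *data, int count)
{
    int done = 0;

    while (done < count) {
        int n = write_fn(handle, data + done, count - done);
        if (n < 0)
            return n;   /* error reported by the write call */
        if (n == 0)
            break;      /* nothing accepted now: wait, then call again */
        done += n;
    }
    return done;
}

/* Stand-in device for the sketch: accepts at most 3 bytes per call. */
struct fake_dev { char data[64]; int used; };

int fake_write(void *handle, const char *data, int count)
{
    struct fake_dev *dev = (struct fake_dev *)handle;
    int room = (int)sizeof(dev->data) - dev->used;
    int n = count < 3 ? count : 3;

    if (n > room)
        n = room;
    memcpy(dev->data + dev->used, data, (size_t)n);
    dev->used += n;
    return n;
}
```

If write_all returns less than count, the caller would wait until the device is writable again (for example with select) and then continue with the remaining bytes.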
Fills the *info structure with data about the PCM device selected by *handle. Function returns zero if successful, otherwise it returns an error code.
#define SND_PCM_INFO_CODEC 0x00000001
#define SND_PCM_INFO_DSP SND_PCM_INFO_CODEC
#define SND_PCM_INFO_MMAP 0x00000002 /* reserved */
#define SND_PCM_INFO_PLAYBACK 0x00000100
#define SND_PCM_INFO_RECORD 0x00000200
#define SND_PCM_INFO_DUPLEX 0x00000400
#define SND_PCM_INFO_DUPLEX_LIMIT 0x00000800 /* rate for playback & record are same */

struct snd_pcm_info {
  unsigned int type; /* soundcard type */
  unsigned int flags; /* see SND_PCM_INFO_XXXX */
  unsigned char id[32]; /* ID of this PCM device */
  unsigned char name[80]; /* name of this device */
  unsigned char reserved[64]; /* reserved for future use */
};
SND_PCM_INFO_MMAP: This flag is reserved and should never be used. It remains for compatibility with the Open Sound System driver.
SND_PCM_INFO_DUPLEX_LIMIT: If this bit is set, the rate must be the same for the playback and record directions.
Fills the *info structure with data about PCM playback. Function returns zero if successful, otherwise it returns an error code.
#define SND_PCM_PINFO_BATCH 0x00000001
#define SND_PCM_PINFO_8BITONLY 0x00000002
#define SND_PCM_PINFO_16BITONLY 0x00000004

struct snd_pcm_playback_info {
  unsigned int flags; /* see SND_PCM_PINFO_XXXX */
  unsigned int formats; /* supported formats */
  unsigned int min_rate; /* min rate (in Hz) */
  unsigned int max_rate; /* max rate (in Hz) */
  unsigned int min_channels; /* min channels (probably always 1) */
  unsigned int max_channels; /* max channels */
  unsigned int buffer_size; /* playback buffer size */
  unsigned int min_fragment_size; /* min fragment size in bytes */
  unsigned int max_fragment_size; /* max fragment size in bytes */
  unsigned int fragment_align; /* align fragment value */
  unsigned char reserved[64]; /* reserved for future use */
};
SND_PCM_PINFO_BATCH: The driver implements double buffering with this device. This means that the chip used for data processing has its own memory, and output is likely to be delayed more than with a traditional codec chip.
SND_PCM_PINFO_8BITONLY: If this bit is set, the driver uses 8-bit format for 16-bit samples and does software conversion. This bit is set on broken SoundBlaster 16/AWE soundcards which can't do full 16-bit duplex. If this bit is set, the application or a higher digital audio layer should do the conversion from 16-bit samples to 8-bit samples rather than making the driver do it in the kernel.
SND_PCM_PINFO_16BITONLY: If this bit is set, the driver uses 16-bit format for 8-bit samples and does software conversion. This bit is set on broken SoundBlaster 16/AWE soundcards which can't do full 8-bit duplex. If this bit is set, the application or a higher digital audio layer should do the conversion from 8-bit samples to 16-bit samples rather than making the driver do it in the kernel.
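The conversions that the application is asked to do itself when SND_PCM_PINFO_8BITONLY or SND_PCM_PINFO_16BITONLY is set can be sketched as below. This is one simple way to do it, not necessarily what the driver does: it assumes signed 16-bit samples in the host's byte order and unsigned 8-bit samples with 128 as silence, and all function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Signed 16-bit sample to unsigned 8-bit: shift the range to 0..65535
   and keep the top 8 bits. */
uint8_t s16_to_u8(int16_t s)
{
    return (uint8_t)(((int)s + 32768) / 256);
}

/* Unsigned 8-bit sample to signed 16-bit: remove the 128 offset and
   scale up by 256. */
int16_t u8_to_s16(uint8_t u)
{
    return (int16_t)(((int)u - 128) * 256);
}

/* Convert n samples from src to dst (the buffers must not overlap). */
void convert_s16_to_u8(const int16_t *src, uint8_t *dst, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = s16_to_u8(src[i]);
}

void convert_u8_to_s16(const uint8_t *src, int16_t *dst, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = u8_to_s16(src[i]);
}
```

Converting from 16 bits to 8 bits discards the low byte of every sample, so the dynamic range of the result is that of 8-bit audio.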
Fills the *info structure with data about PCM record. Function returns zero if successful, otherwise it returns an error code.
#define SND_PCM_RINFO_BATCH 0x00000001
#define SND_PCM_RINFO_8BITONLY 0x00000002
#define SND_PCM_RINFO_16BITONLY 0x00000004

struct snd_pcm_record_info {
  unsigned int flags; /* see SND_PCM_RINFO_XXXX */
  unsigned int formats; /* supported formats */
  unsigned int min_rate; /* min rate (in Hz) */
  unsigned int max_rate; /* max rate (in Hz) */
  unsigned int min_channels; /* min channels (probably always 1) */
  unsigned int max_channels; /* max channels */
  unsigned int buffer_size; /* record buffer size */
  unsigned int min_fragment_size; /* min fragment size in bytes */
  unsigned int max_fragment_size; /* max fragment size in bytes */
  unsigned int fragment_align; /* align fragment value */
  unsigned char reserved[64]; /* reserved for future use */
};
SND_PCM_RINFO_BATCH: The driver implements buffering for this device. This means that the chip used for data processing has its own memory, and the data is likely to be delayed more than with a traditional codec chip.
SND_PCM_RINFO_8BITONLY: If this bit is set, the device uses 8-bit format for 16-bit samples and does software conversion. This bit is set on broken SoundBlaster 16/AWE soundcards which can't do full 16-bit duplex. If this bit is set, the application or a higher digital audio layer should do the conversion from 16-bit samples to 8-bit samples rather than making the driver do it in the kernel.
SND_PCM_RINFO_16BITONLY: If this bit is set, the device uses 16-bit format for 8-bit samples and does software conversion. This bit is set on broken SoundBlaster 16/AWE soundcards which can't do full 8-bit duplex. If this bit is set, the application or a higher digital audio layer should do the conversion from 8-bit samples to 16-bit samples rather than making the driver do it in the kernel.
Sets up the format, rate (in Hz) and number of channels for the playback direction. Function returns zero if successful, otherwise it returns an error code.
struct snd_pcm_format {
  unsigned int format; /* SND_PCM_SFMT_XXXX */
  unsigned int rate; /* rate in Hz */
  unsigned int channels; /* channels (voices) */
  unsigned char reserved[16];
};
Sets up the format, rate (in Hz) and number of channels for the record direction. Function returns zero if successful, otherwise it returns an error code.
struct snd_pcm_format {
  unsigned int format; /* SND_PCM_SFMT_XXXX */
  unsigned int rate; /* rate in Hz */
  unsigned int channels; /* channels (voices) */
  unsigned char reserved[16];
};
Sets various parameters for the playback direction. Function returns zero if successful, otherwise it returns an error code.
struct snd_pcm_playback_params {
  int fragment_size;
  int fragments_max;
  int fragments_room;
  unsigned char reserved[16]; /* must be filled with zero */
};
fragment_size: Requested size of fragment. This value should be aligned for the current format (for example to 4 if stereo 16-bit samples are used) or with the fragment_align variable from the snd_pcm_playback_info_t structure. Its range can be from min_fragment_size to max_fragment_size.
fragments_max: Maximum number of fragments in the queue for wakeup. This number doesn't count a partly used fragment. If the current count of filled playback fragments is greater than this value, the driver blocks the application, or returns immediately if nonblock mode is active.
fragments_room: Minimum number of writeable fragments for wakeup. In most cases this value should be 1, which means that control returns to the application when at least one fragment is free for playback. This value includes partly used fragments, too.
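Choosing a valid fragment size from a requested value can be sketched as follows. The helpers are illustrative and not part of the library; the align, min_size and max_size arguments are assumed to come from the fragment_align, min_fragment_size and max_fragment_size fields of the info structure, or align may be the size of one sample frame.

```c
/* Size in bytes of one sample frame, e.g. 4 for stereo 16-bit samples. */
int frame_bytes(int channels, int bits)
{
    return channels * (bits / 8);
}

/* Illustrative helper: clamp requested into [min_size, max_size] and
   round it down to a multiple of align. If rounding down falls below
   min_size, the next multiple above it is used instead. */
int align_fragment_size(int requested, int align, int min_size, int max_size)
{
    if (align < 1)
        align = 1;
    if (requested < min_size)
        requested = min_size;
    if (requested > max_size)
        requested = max_size;

    requested -= requested % align;
    if (requested < min_size)
        requested += align;
    return requested;
}
```

The result would then be stored in fragment_size before the parameters are passed to the driver.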
Sets various parameters for the record direction. Function returns zero if successful, otherwise it returns an error code.
struct snd_pcm_record_params {
  int fragment_size;
  int fragments_min;
  unsigned char reserved[16];
};
fragment_size: Requested size of fragment. This value should be aligned for the current format (for example to 4 if stereo 16-bit samples are used) or with the fragment_align variable from the snd_pcm_record_info_t structure. Its range can be from min_fragment_size to max_fragment_size.
fragments_min: Minimum number of filled fragments for wakeup. The driver blocks the application (if block mode is selected) until at least this many fragments have been filled.
Fills the *status structure. Function returns zero if successful, otherwise it returns an error code.
struct snd_pcm_playback_status {
  unsigned int rate;
  int fragments;
  int fragment_size;
  int count;
  int queue;
  int underrun;
  struct timeval time;
  struct timeval stime;
  unsigned char reserved[16];
};
rate: Real playback rate. This value reflects hardware limitations.
fragments: Number of fragments currently allocated by the driver for the playback direction.
fragment_size: Current fragment size used by the driver for the playback direction.
count: Count of bytes writeable without blocking.
queue: Count of bytes in the queue. Note: (fragments * fragment_size) - queue is not necessarily equal to count.
underrun: The number of underruns since the last call of snd_pcm_playback_status.
time: Delay until the first sample from the next write is played. This value should be used for time synchronization. The value is in the same format as returned by the standard C function gettimeofday( &time, NULL ). This variable contains a valid value only if playback time mode is enabled (see the snd_pcm_playback_time function).
stime: Time when playback was started. This variable contains a valid value only if playback time mode is enabled (see the snd_pcm_playback_time function).
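When time mode is not enabled, the byte count in the queue can still be turned into a rough estimate of how much sound is waiting to be played. A minimal sketch, assuming uncompressed linear samples; queued_milliseconds is an illustrative helper and not a library function.

```c
/* Illustrative helper: playing time, in milliseconds, of queue_bytes
   bytes of uncompressed audio at the given rate, channel count and
   sample width. Returns -1 if the parameters describe no data flow. */
long queued_milliseconds(long queue_bytes, long rate_hz, int channels, int bits)
{
    long long bytes_per_sec = (long long)rate_hz * channels * (bits / 8);

    if (bytes_per_sec <= 0)
        return -1;
    return (long)((long long)queue_bytes * 1000 / bytes_per_sec);
}
```

The rate argument would be the real rate reported in the status structure rather than the rate that was requested, since the two can differ because of hardware limitations.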
Fills the *status structure. Function returns zero if successful, otherwise it returns an error code.
struct snd_pcm_record_status {
  unsigned int rate;
  int fragments;
  int fragment_size;
  int count;
  int free;
  int overrun;
  struct timeval time;
  unsigned char reserved[16];
};
rate: Real record rate. This value reflects hardware limitations.
fragments: Number of fragments currently allocated by the driver for the record direction.
fragment_size: Current fragment size used by the driver for the record direction.
count: Count of bytes readable without blocking.
free: Count of bytes in the buffer still free. Note: (fragments * fragment_size) - free is not necessarily equal to count.
overrun: The number of overruns since the last call to snd_pcm_record_status.
time: Lag since the next sample to be read was recorded. This value should be used for time synchronization. The value is in the same format as returned by the standard C function gettimeofday( &time, NULL ). This variable contains a valid value only if record time mode is enabled (see the snd_pcm_record_time function).
Time when recording was started. This variable contains a valid value only if record time mode is enabled (see the snd_pcm_record_time function).
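The time values in the status structures are ordinary struct timeval values, so they can be handled like the result of gettimeofday. A minimal sketch that converts such a value to milliseconds, for example to compare the record lag against a threshold; timeval_to_ms is an illustrative helper and not part of the library.

```c
#include <sys/time.h>

/* Illustrative helper: convert a struct timeval (seconds plus
   microseconds) to whole milliseconds. */
long long timeval_to_ms(const struct timeval *tv)
{
    return (long long)tv->tv_sec * 1000 + tv->tv_usec / 1000;
}
```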
This function drains the playback buffers immediately. Function returns zero if successful, otherwise it returns an error code.
This function flushes the playback buffers. It blocks the program until all the waiting samples in the kernel playback buffers have been processed. Function returns zero if successful, otherwise it returns an error code.
This function flushes (destroys) the record buffers. Function returns zero if successful, otherwise it returns an error code.
This function enables or disables time mode for the playback direction. Time mode allows the application better time synchronization. Function returns zero if successful, otherwise it returns an error code.
This function enables or disables time mode for the record direction. Time mode allows the application better time synchronization. Function returns zero if successful, otherwise it returns an error code.
Writes samples to the device. The samples must be in the format specified by the snd_pcm_playback_format function. The function returns zero or a positive value if playback was successful (the value is the count of bytes successfully written to the device) or a negative error value if an error occurred. The function suspends the process if block mode is active.
Reads samples from the driver. The samples are in the format specified by the snd_pcm_record_format function. The function returns zero or a positive value if recording was successful (the value is the count of bytes successfully read from the device) or a negative error value if an error occurred. The function suspends the process if block mode is active.
The following example shows how to play the first 512kB from the /tmp/test.au file with soundcard #0 and PCM device #0:
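#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
/* The sound library header which declares the snd_pcm_* functions
   must be included here as well. */

/* Wrapper function added so that the fragment forms a complete unit. */
void play_au_example( void )
{
  int card = 0, device = 0, err, fd, count, size, idx = 0;
  void *handle;
  snd_pcm_format_t format;
  char *buffer;

  buffer = (char *)malloc( 512 * 1024 );
  if ( !buffer ) return;
  if ( (err = snd_pcm_open( &handle, card, device, SND_PCM_OPEN_PLAYBACK )) < 0 ) {
    fprintf( stderr, "open failed: %s\n", snd_strerror( err ) );
    free( buffer );
    return;
  }
  format.format = SND_PCM_SFMT_MU_LAW;
  format.rate = 8000;
  format.channels = 1;
  if ( (err = snd_pcm_playback_format( handle, &format )) < 0 ) {
    fprintf( stderr, "format setup failed: %s\n", snd_strerror( err ) );
    snd_pcm_close( handle );
    free( buffer );
    return;
  }
  fd = open( "/tmp/test.au", O_RDONLY );
  if ( fd < 0 ) {
    perror( "open file" );
    snd_pcm_close( handle );
    free( buffer );
    return;
  }
  count = read( fd, buffer, 512 * 1024 );
  if ( count <= 0 ) {
    perror( "read from file" );
    close( fd );
    snd_pcm_close( handle );
    free( buffer );
    return;
  }
  close( fd );
  /* Skip the .au header: bytes 4-7 hold the big-endian offset of the
     audio data. The bytes are read as unsigned char to avoid sign
     extension when they are shifted. */
  if ( count >= 8 && !memcmp( buffer, ".snd", 4 ) ) {
    const unsigned char *hdr = (const unsigned char *)buffer;
    unsigned long offset = ((unsigned long)hdr[4] << 24) | ((unsigned long)hdr[5] << 16) |
                           ((unsigned long)hdr[6] << 8) | (unsigned long)hdr[7];
    if ( offset > 128 ) offset = 128;
    if ( offset > (unsigned long)count ) offset = (unsigned long)count;
    idx = (int)offset;
  }
  size = snd_pcm_write( handle, &buffer[ idx ], count - idx );
  printf( "Bytes written %i from %i...\n", size, count - idx );
  snd_pcm_close( handle );
  free( buffer );
}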