
5. Digital Audio (PCM) Interface

Digital audio is the most commonly used method of representing sound inside a computer. In this method, sound is stored as a sequence of samples taken from the audio signal at constant time intervals. A sample represents the amplitude of the signal at the moment it was measured. The length of the interval between samples determines the sampling rate. Commonly used sampling rates range from 8 kHz (telephone quality) to 48 kHz (DAT tapes). In uncompressed digital audio each sample requires one or more bytes of storage. The number of bytes required depends on the number of channels (mono, stereo) and the sample format (8 or 16 bits, mu-Law, etc.).

The physical devices used in digital audio are called ADC (Analog to Digital Converter) and DAC (Digital to Analog Converter). A device containing both an ADC and a DAC is commonly known as a codec. The codec device used in Sound Blaster cards is called a DSP, which is somewhat misleading since DSP also stands for Digital Signal Processor (the SB DSP chip is very limited when compared to "true" DSP chips).

Sampling parameters affect the quality of the sound that can be reproduced from the recorded signal. The most fundamental parameter is the sampling rate, which limits the highest frequency that can be stored. It is well known (Nyquist's Sampling Theorem) that the highest frequency that can be stored in a sampled signal is at most 1/2 of the sampling frequency. For example, an 8 kHz sampling rate permits recording of a signal in which the highest frequency is less than 4 kHz. Higher frequency components must be filtered out before the signal is fed to the ADC.

Sample encoding limits the dynamic range of the recorded signal (the difference between the faintest and the loudest signal that can be recorded). In theory the maximum dynamic range of a signal is number_of_bits * 6 dB. This means that 8-bit sampling resolution gives a dynamic range of 48 dB and 16-bit resolution gives 96 dB.
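As a rough illustration of the rule of thumb above, the helper below multiplies the sample width by 6 dB. The name dynamic_range_db is hypothetical and not part of the library, and the figure is an approximation (the exact theoretical value is closer to 6.02 dB per bit).

```c
/* Hypothetical helper, not part of the library: approximate dynamic
   range in dB of a linear sample format, using the 6 dB per bit
   rule of thumb from the text. */
int dynamic_range_db(int bits_per_sample)
{
    return bits_per_sample * 6;
}
```

Calling dynamic_range_db( 8 ) and dynamic_range_db( 16 ) reproduces the 48 dB and 96 dB figures quoted above.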

Quality has its price. The number of bytes required to store an audio sequence depends on the sampling rate, the number of channels and the sampling resolution. For example, just 8000 bytes of memory are required to store one second of sound at 8 kHz/8 bits/mono, but 48 kHz/16 bits/stereo takes 192 kilobytes. A 64 kbps ISDN channel is required to transfer an 8 kHz/8 bits/mono audio stream, and about 1.5 Mbps is required for DAT quality (48 kHz/16 bits/stereo). Put another way, a megabyte of memory holds just 5.46 seconds of sound at 48 kHz/16 bits/stereo, whereas at 8 kHz/8 bits/mono the same amount of memory holds 131 seconds. It is possible to reduce memory and communication costs by compressing the recorded signal, but this is outside the scope of this document.
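The storage figures above follow from a single multiplication. The sketch below shows the arithmetic; both function names are hypothetical, and it assumes a sample width that is a whole number of bytes and nonzero arguments.

```c
/* Hypothetical helper: bytes needed for one second of uncompressed
   audio. Assumes `bits` is a multiple of 8. */
unsigned long pcm_bytes_per_second(unsigned long rate_hz,
                                   unsigned int bits,
                                   unsigned int channels)
{
    return rate_hz * (bits / 8) * channels;
}

/* Hypothetical helper: seconds of audio that fit into `bytes` of
   storage. Assumes all arguments are nonzero. */
double pcm_seconds_in(unsigned long bytes, unsigned long rate_hz,
                      unsigned int bits, unsigned int channels)
{
    return (double)bytes /
           (double)pcm_bytes_per_second(rate_hz, bits, channels);
}
```

With these definitions, pcm_bytes_per_second( 48000, 16, 2 ) gives 192000, and pcm_seconds_in( 1048576, 48000, 16, 2 ) gives roughly 5.46.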

5.1 Low-Level Layer

Audio devices are opened exclusively for the selected direction. A given direction of an audio device cannot be opened more than once, whether by one process or by several, but the playback and record directions can be opened independently with two separate open calls. The audio device returns the EBUSY error to the application when another application owns the requested direction.

The low-level layer supports these formats:


#define SND_PCM_SFMT_MU_LAW             0
#define SND_PCM_SFMT_A_LAW              1
#define SND_PCM_SFMT_IMA_ADPCM          2
#define SND_PCM_SFMT_U8                 3
#define SND_PCM_SFMT_S16_LE             4
#define SND_PCM_SFMT_S16_BE             5
#define SND_PCM_SFMT_S8                 6
#define SND_PCM_SFMT_U16_LE             7
#define SND_PCM_SFMT_U16_BE             8
#define SND_PCM_SFMT_MPEG               9
#define SND_PCM_SFMT_GSM                10

#define SND_PCM_FMT_MU_LAW              (1 << SND_PCM_SFMT_MU_LAW)
#define SND_PCM_FMT_A_LAW               (1 << SND_PCM_SFMT_A_LAW)
#define SND_PCM_FMT_IMA_ADPCM           (1 << SND_PCM_SFMT_IMA_ADPCM)
#define SND_PCM_FMT_U8                  (1 << SND_PCM_SFMT_U8)
#define SND_PCM_FMT_S16_LE              (1 << SND_PCM_SFMT_S16_LE)
#define SND_PCM_FMT_S16_BE              (1 << SND_PCM_SFMT_S16_BE)
#define SND_PCM_FMT_S8                  (1 << SND_PCM_SFMT_S8)
#define SND_PCM_FMT_U16_LE              (1 << SND_PCM_SFMT_U16_LE)
#define SND_PCM_FMT_U16_BE              (1 << SND_PCM_SFMT_U16_BE)
#define SND_PCM_FMT_MPEG                (1 << SND_PCM_SFMT_MPEG)
#define SND_PCM_FMT_GSM                 (1 << SND_PCM_SFMT_GSM)

Constants with the prefix SND_PCM_SFMT_ are format numbers and are used in the format structure. Constants with the prefix SND_PCM_FMT_ are the corresponding bit masks and are used in the formats field of the info structures.
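Each SND_PCM_FMT_ constant is the bit 1 << SND_PCM_SFMT_ value, so a set of formats fits in one unsigned int and can be tested with a bitwise AND. The sketch below repeats a few of the definitions so that it stands alone; pcm_format_supported is a hypothetical helper name, not a library function.

```c
/* A few values repeated from the listing above. */
#define SND_PCM_SFMT_MU_LAW   0
#define SND_PCM_SFMT_U8       3
#define SND_PCM_SFMT_S16_LE   4

#define SND_PCM_FMT_MU_LAW    (1 << SND_PCM_SFMT_MU_LAW)
#define SND_PCM_FMT_U8        (1 << SND_PCM_SFMT_U8)
#define SND_PCM_FMT_S16_LE    (1 << SND_PCM_SFMT_S16_LE)

/* Hypothetical helper: returns nonzero if format number `sfmt`
   is present in the bit mask `formats`. */
int pcm_format_supported(unsigned int formats, unsigned int sfmt)
{
    if (sfmt >= 32)
        return 0;               /* does not fit into the mask */
    return (formats & (1u << sfmt)) != 0;
}
```

For example, with a mask of SND_PCM_FMT_MU_LAW | SND_PCM_FMT_S16_LE the helper accepts SND_PCM_SFMT_S16_LE and rejects SND_PCM_SFMT_U8.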

int snd_pcm_open( void **handle, int card, int device, int mode )

The function creates a new handle and opens a connection to the kernel audio interface for soundcard number card (0-N) and audio device number device. The function also checks protocol compatibility, to prevent old programs from being used with a new kernel API. It returns zero on success, otherwise a negative error code. The error code -EBUSY is returned when another process owns the selected direction.

The default format after open is mono mu-Law at 8000 Hz, so the device can be used directly for playback of standard .au (Sparc) files.

The following modes can be used for the mode argument:


  #define SND_PCM_OPEN_PLAYBACK   (O_WRONLY)
  #define SND_PCM_OPEN_RECORD     (O_RDONLY)
  #define SND_PCM_OPEN_DUPLEX     (O_RDWR)
  

int snd_pcm_close( void *handle )

The function frees all resources allocated with the audio handle and closes the connection to the kernel audio interface. It returns zero on success, otherwise a negative error code.

int snd_pcm_file_descriptor( void *handle )

The function returns the file descriptor of the connection to the kernel audio interface. It returns a negative error code if an error occurred.

The file descriptor should be used only with the select synchronous multiplexing function. When select reports that data is waiting to be read or that a write can be performed, the application should call snd_pcm_read or snd_pcm_write rather than reading from or writing to the descriptor directly. Using these functions is strongly recommended because it leaves room for them to perform, for example, data conversions if needed.

int snd_pcm_block_mode( void *handle, int enable )

The function selects block (default) or nonblock mode. In block mode, a call to snd_pcm_read or snd_pcm_write suspends execution of the program for the time needed to actually play or record the whole buffer. In nonblock mode the program is not suspended and these functions return immediately with the count of bytes that were read from or written to the driver. In this mode the functions may not read or write the whole buffer, so the application should call them again to continue the operation.
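In nonblock mode a single call may accept only part of the buffer, so the application has to loop. The sketch below shows one way to do that. It takes the write function as a pointer with the same shape as snd_pcm_write so that it can be exercised without a soundcard; pcm_write_all, fake_write and fake_write_calls are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Same shape as snd_pcm_write( handle, buffer, size ). */
typedef ssize_t (*pcm_write_fn)(void *handle, const void *buffer,
                                size_t size);

/* Hypothetical helper: call `write_fn` repeatedly until all `size`
   bytes are accepted or an error is reported. Returns the total
   number of bytes written, or the negative error code. */
ssize_t pcm_write_all(pcm_write_fn write_fn, void *handle,
                      const void *buffer, size_t size)
{
    const char *p = buffer;
    size_t done = 0;

    while (done < size) {
        ssize_t n = write_fn(handle, p + done, size - done);
        if (n < 0)
            return n;           /* pass the error code back */
        /* n == 0 means nothing was accepted; a real program would
           wait (for example with select) here instead of spinning. */
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Stand-in for the driver, used only to exercise the loop: it
   accepts at most 3 bytes per call and counts the calls. */
int fake_write_calls = 0;

ssize_t fake_write(void *handle, const void *buffer, size_t size)
{
    (void)handle;
    (void)buffer;
    fake_write_calls++;
    return (ssize_t)(size < 3 ? size : 3);
}
```

In a real program the first argument would be snd_pcm_write and the second the handle returned by snd_pcm_open.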

int snd_pcm_info( void *handle, snd_pcm_info_t *info )

The function fills in the *info structure. It returns zero on success, otherwise a negative error code.


  #define SND_PCM_INFO_CODEC              0x00000001
  #define SND_PCM_INFO_DSP                SND_PCM_INFO_CODEC
  #define SND_PCM_INFO_MMAP               0x00000002      /* reserved */
  #define SND_PCM_INFO_PLAYBACK           0x00000100
  #define SND_PCM_INFO_RECORD             0x00000200
  #define SND_PCM_INFO_DUPLEX             0x00000400
  #define SND_PCM_INFO_DUPLEX_LIMIT       0x00000800      /* rate for playback & record are same */

  struct snd_pcm_info {
    unsigned int type;                    /* soundcard type */
    unsigned int flags;                   /* see SND_PCM_INFO_XXXX */
    unsigned char id[32];                 /* ID of this PCM device */
    unsigned char name[80];               /* name of this device */
    unsigned char reserved[64];           /* reserved for future... */
  };
  

SND_PCM_INFO_MMAP

This flag is reserved and should never be used. It remains for compatibility with the Open Sound System driver.

SND_PCM_INFO_DUPLEX_LIMIT

If this bit is set, the rate must be the same for the playback and record directions.

int snd_pcm_playback_info( void *handle, snd_pcm_playback_info_t *info )

The function fills in the *info structure. It returns zero on success, otherwise a negative error code.


  #define SND_PCM_PINFO_BATCH             0x00000001
  #define SND_PCM_PINFO_8BITONLY          0x00000002
  #define SND_PCM_PINFO_16BITONLY         0x00000004

  struct snd_pcm_playback_info {
    unsigned int flags;                   /* see SND_PCM_PINFO_XXXX */
    unsigned int formats;                 /* supported formats */
    unsigned int min_rate;                /* min rate (in Hz) */
    unsigned int max_rate;                /* max rate (in Hz) */
    unsigned int min_channels;            /* min channels (probably always 1) */
    unsigned int max_channels;            /* max channels */
    unsigned int buffer_size;             /* playback buffer size */
    unsigned int min_fragment_size;       /* min fragment size in bytes */
    unsigned int max_fragment_size;       /* max fragment size in bytes */
    unsigned int fragment_align;          /* align fragment value */
    unsigned char reserved[64];           /* reserved for future... */
  };
  

SND_PCM_PINFO_BATCH

The driver does double buffering for this device. This means that the chip used for data processing has its own memory, so output is delayed more than when a traditional codec chip is used.

SND_PCM_PINFO_8BITONLY

If this bit is set, the driver uses the 8-bit format for 16-bit samples and does the conversion in software. This bit is used with broken SoundBlaster 16/AWE soundcards, which can't do full 16-bit duplex. If this bit is set, the application or a higher digital audio layer should convert 16-bit samples to 8-bit samples itself rather than leaving the driver to do it in the kernel.

SND_PCM_PINFO_16BITONLY

If this bit is set, the driver uses the 16-bit format for 8-bit samples and does the conversion in software. This bit is used with broken SoundBlaster 16/AWE soundcards, which can't do full 8-bit duplex. If this bit is set, the application or a higher digital audio layer should convert 8-bit samples to 16-bit samples itself rather than leaving the driver to do it in the kernel.
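Before requesting a format, an application may want to compare its request with the limits reported in the playback info structure. The sketch below repeats the structure so that it stands alone. It assumes that the formats field is a bit mask of SND_PCM_FMT_ values; playback_request_ok is a hypothetical helper and the values in example_info are made up for illustration.

```c
/* Structure repeated from the listing above. */
struct snd_pcm_playback_info {
    unsigned int flags;
    unsigned int formats;
    unsigned int min_rate;
    unsigned int max_rate;
    unsigned int min_channels;
    unsigned int max_channels;
    unsigned int buffer_size;
    unsigned int min_fragment_size;
    unsigned int max_fragment_size;
    unsigned int fragment_align;
    unsigned char reserved[64];
};

/* Hypothetical helper: returns 1 if the requested format number,
   rate and channel count lie within the reported limits, else 0.
   Assumes `formats` is a bit mask with bit (1 << sfmt) per format. */
int playback_request_ok(const struct snd_pcm_playback_info *info,
                        unsigned int sfmt, unsigned int rate,
                        unsigned int channels)
{
    if (sfmt >= 32 || !(info->formats & (1u << sfmt)))
        return 0;
    if (rate < info->min_rate || rate > info->max_rate)
        return 0;
    if (channels < info->min_channels || channels > info->max_channels)
        return 0;
    return 1;
}

/* Made-up capabilities: mu-Law (0), U8 (3) and S16_LE (4),
   4000 to 48000 Hz, mono or stereo. */
const struct snd_pcm_playback_info example_info = {
    .formats = (1u << 0) | (1u << 3) | (1u << 4),
    .min_rate = 4000,
    .max_rate = 48000,
    .min_channels = 1,
    .max_channels = 2
};
```

In a real program the structure would be filled by snd_pcm_playback_info rather than initialized by hand.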

int snd_pcm_record_info( void *handle, snd_pcm_record_info_t *info )

The function fills in the *info structure. It returns zero on success, otherwise a negative error code.


  #define SND_PCM_RINFO_BATCH             0x00000001
  #define SND_PCM_RINFO_8BITONLY          0x00000002
  #define SND_PCM_RINFO_16BITONLY         0x00000004

  struct snd_pcm_record_info {
    unsigned int flags;                   /* see SND_PCM_RINFO_XXXX */
    unsigned int formats;                 /* supported formats */
    unsigned int min_rate;                /* min rate (in Hz) */
    unsigned int max_rate;                /* max rate (in Hz) */
    unsigned int min_channels;            /* min channels (probably always 1) */
    unsigned int max_channels;            /* max channels */
    unsigned int buffer_size;             /* record buffer size */
    unsigned int min_fragment_size;       /* min fragment size in bytes */
    unsigned int max_fragment_size;       /* max fragment size in bytes */
    unsigned int fragment_align;          /* align fragment value */
    unsigned char reserved[64];           /* reserved for future... */
  };
  

SND_PCM_RINFO_BATCH

The driver does double buffering for this device. This means that the chip used for data processing has its own memory, so input is delayed more than when a traditional codec chip is used.

SND_PCM_RINFO_8BITONLY

If this bit is set, the driver uses the 8-bit format for 16-bit samples and does the conversion in software. This bit is used with broken SoundBlaster 16/AWE soundcards, which can't do full 16-bit duplex. If this bit is set, the application or a higher digital audio layer should do the conversion between 8-bit and 16-bit samples itself rather than leaving the driver to do it in the kernel.

SND_PCM_RINFO_16BITONLY

If this bit is set, the driver uses the 16-bit format for 8-bit samples and does the conversion in software. This bit is used with broken SoundBlaster 16/AWE soundcards, which can't do full 8-bit duplex. If this bit is set, the application or a higher digital audio layer should do the conversion between 16-bit and 8-bit samples itself rather than leaving the driver to do it in the kernel.

int snd_pcm_playback_format( void *handle, snd_pcm_format_t *format )

The function sets the format, rate (in Hz) and number of channels for the playback direction. It returns zero on success, otherwise a negative error code.


  struct snd_pcm_format {
    unsigned int format;                  /* SND_PCM_SFMT_XXXX */
    unsigned int rate;                    /* rate in Hz */
    unsigned int channels;                /* channels (voices) */
    unsigned char reserved[16];
  };
  

int snd_pcm_record_format( void *handle, snd_pcm_format_t *format )

The function sets the format, rate (in Hz) and number of channels for the record direction. It returns zero on success, otherwise a negative error code.


  struct snd_pcm_format {
    unsigned int format;                  /* SND_PCM_SFMT_XXXX */
    unsigned int rate;                    /* rate in Hz */
    unsigned int channels;                /* channels (voices) */
    unsigned char reserved[16];
  };
  

int snd_pcm_playback_params( void *handle, snd_pcm_playback_params_t *params )

The function sets various parameters for the playback direction. It returns zero on success, otherwise a negative error code.


  struct snd_pcm_playback_params {
    int fragment_size;
    int fragments_max;
    int fragments_room;
    unsigned char reserved[16];           /* must be filled with zero */
  };
  

fragment_size

Requested fragment size in bytes. This value should be aligned for the current format (for example to 4 if stereo 16-bit samples are used) and to the fragment_align value from the snd_pcm_playback_info_t structure. The allowed range is min_fragment_size to max_fragment_size.

fragments_max

Maximum number of fragments in the queue for wakeup. This number does not count a partly used fragment. If the current count of filled playback fragments is greater than this value, the driver blocks the application, or returns immediately if nonblock mode is active.

fragments_room

Minimum number of writeable fragments for wakeup. In most cases this value should be 1, which means that control returns to the application as soon as at least one fragment is free for playback. This value includes a partly used fragment, too.
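The alignment rules for fragment_size can be applied with a little integer arithmetic. The sketch below rounds a requested size down to a multiple of both the frame size and the alignment value, then keeps it within the reported range. It assumes that fragment_align means "the size must be a multiple of this value", which the text does not state explicitly; pcm_fit_fragment_size is a hypothetical helper name.

```c
/* Greatest common divisor, used to build the least common multiple. */
static unsigned int gcd_u(unsigned int a, unsigned int b)
{
    while (b != 0) {
        unsigned int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Hypothetical helper: pick a fragment size close to `requested`
   that is a multiple of both `frame_bytes` (bytes per sample times
   channels) and `fragment_align`, and lies within
   [min_size, max_size]. Returns 0 if no such size exists. */
unsigned int pcm_fit_fragment_size(unsigned int requested,
                                   unsigned int frame_bytes,
                                   unsigned int fragment_align,
                                   unsigned int min_size,
                                   unsigned int max_size)
{
    unsigned int step, size;

    if (frame_bytes == 0 || fragment_align == 0)
        return 0;
    /* least common multiple of the two alignment requirements */
    step = frame_bytes / gcd_u(frame_bytes, fragment_align) * fragment_align;

    if (requested > max_size)
        requested = max_size;
    size = requested - requested % step;            /* round down */
    if (size < min_size)
        size = (min_size + step - 1) / step * step; /* round minimum up */
    if (size == 0 || size > max_size)
        return 0;
    return size;
}
```

For stereo 16-bit samples frame_bytes is 4, so a request of 4099 bytes with an alignment of 16 comes back as 4096.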

int snd_pcm_record_params( void *handle, snd_pcm_record_params_t *params )

The function sets various parameters for the record direction. It returns zero on success, otherwise a negative error code.


  struct snd_pcm_record_params {
    int fragment_size;
    int fragments_min;
    unsigned char reserved[16];
  };
  

fragment_size

Requested fragment size in bytes. This value should be aligned for the current format (for example to 4 if stereo 16-bit samples are used) and to the fragment_align value from the snd_pcm_record_info_t structure. The allowed range is min_fragment_size to max_fragment_size.

fragments_min

Minimum number of filled fragments for wakeup. The driver blocks the application (if block mode is selected) until the number of filled fragments reaches this value.

int snd_pcm_playback_status( void *handle, snd_pcm_playback_status_t *status )

The function fills in the *status structure. It returns zero on success, otherwise a negative error code.


  struct snd_pcm_playback_status {
    int fragments;
    int fragment_size;
    int count;
    int queue;
    int underrun;
    struct timeval time;
    unsigned char reserved[16];
  };
  

fragments

Number of fragments currently allocated by the driver for the playback direction.

fragment_size

Current fragment size used by the driver for the playback direction.

count

Count of bytes writeable without blocking.

queue

Count of bytes in the queue. Note: (fragments * fragment_size) - queue is not necessarily equal to count.

underrun

Count of underruns since the last call of snd_pcm_playback_status.

time

Time at which the first sample from the next write is going to play. This value should be used for time synchronization. The returned value is in the same form as the one obtained from the standard C function gettimeofday( &time, NULL ).
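Since the playback time is reported in the same form as gettimeofday, the delay until the next written sample is heard can be estimated by subtracting the current time from it. The sketch below shows the subtraction only; usec_until is a hypothetical helper name.

```c
#include <sys/time.h>

/* Hypothetical helper: microseconds from `now` until `play_time`.
   The result is negative if `play_time` lies in the past. */
long long usec_until(const struct timeval *now,
                     const struct timeval *play_time)
{
    long long sec  = (long long)play_time->tv_sec  - (long long)now->tv_sec;
    long long usec = (long long)play_time->tv_usec - (long long)now->tv_usec;

    return sec * 1000000LL + usec;
}
```

An application would pass the result of gettimeofday as now and the time field of the playback status structure as play_time.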

int snd_pcm_record_status( void *handle, snd_pcm_record_status_t *status )

The function fills in the *status structure. It returns zero on success, otherwise a negative error code.


  struct snd_pcm_record_status {
    int fragments;                        /* allocated fragments */
    int fragment_size;                    /* current fragment size in bytes */
    int count;                            /* number of bytes readable without blocking */
    int free;                             /* bytes in buffer still free */
    int overrun;                          /* count of overruns from last status */
    struct timeval time;                  /* time the next read was taken */
    unsigned char reserved[16];
  };
  

fragments

Number of fragments currently allocated by the driver for the record direction.

fragment_size

Current fragment size used by the driver for the record direction.

count

Count of bytes readable without blocking.

free

Count of bytes in the buffer that are still free. Note: (fragments * fragment_size) - free is not necessarily equal to count.

overrun

Count of overruns since the last call of snd_pcm_record_status.

time

Time at which the next sample to be read was taken. This value should be used for time synchronization. The returned value is in the same form as the one obtained from the standard C function gettimeofday( &time, NULL ).

int snd_pcm_drain_playback( void *handle )

This function drains the playback buffers immediately. It returns zero on success, otherwise a negative error code.

int snd_pcm_flush_playback( void *handle )

This function flushes the playback buffers. It blocks the program until the last sample has been processed. It returns zero on success, otherwise a negative error code.

int snd_pcm_flush_record( void *handle )

This function flushes (destroys) the record buffers. It returns zero on success, otherwise a negative error code.

ssize_t snd_pcm_write( void *handle, const void *buffer, size_t size )

The function writes samples to the driver. The samples must be in the format specified by the snd_pcm_playback_format function. It returns zero or a positive value if playback succeeded (the value is the count of bytes successfully written to the device), or a negative error value if an error occurred. The function suspends the process if block mode is active.

ssize_t snd_pcm_read( void *handle, void *buffer, size_t size )

The function reads samples from the driver. The samples are in the format specified by the snd_pcm_record_format function. It returns zero or a positive value if recording succeeded (the value is the count of bytes successfully read from the device), or a negative error value if an error occurred. The function suspends the process if block mode is active.

5.2 Examples

The example below shows how to play the first 512 kB of the file /tmp/test.au on soundcard #0, device #0:


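/* Needs <stdio.h>, <stdlib.h>, <string.h>, <fcntl.h>, <unistd.h> and the ALSA library header. */
int card = 0, device = 0, err, fd, count, size, idx;
void *handle;
snd_pcm_format_t format;
unsigned char *buffer;    /* unsigned so that header bytes >= 0x80 are not sign-extended */

buffer = (unsigned char *)malloc( 512 * 1024 );
if ( !buffer ) return;
if ( (err = snd_pcm_open( &handle, card, device, SND_PCM_OPEN_PLAYBACK )) < 0 ) {
  fprintf( stderr, "open failed: %s\n", snd_strerror( err ) );
  free( buffer );
  return;
}
memset( &format, 0, sizeof( format ) );    /* clear the reserved bytes */
format.format = SND_PCM_SFMT_MU_LAW;
format.rate = 8000;
format.channels = 1;
if ( (err = snd_pcm_playback_format( handle, &format )) < 0 ) {
  fprintf( stderr, "format setup failed: %s\n", snd_strerror( err ) );
  snd_pcm_close( handle );
  free( buffer );
  return;
}
fd = open( "/tmp/test.au", O_RDONLY );
if ( fd < 0 ) {
  perror( "open file" );
  snd_pcm_close( handle );
  free( buffer );
  return;
}
idx = 0;
count = read( fd, buffer, 512 * 1024 );
if ( count <= 0 ) {
  perror( "read from file" );
  close( fd );
  snd_pcm_close( handle );
  free( buffer );
  return;
}
close( fd );
/* A .au file starts with ".snd"; bytes 4..7 hold the big-endian offset of the sample data. */
if ( count >= 8 && !memcmp( buffer, ".snd", 4 ) ) {
  unsigned int offset = ((unsigned int)buffer[4]<<24)|((unsigned int)buffer[5]<<16)|
                        ((unsigned int)buffer[6]<<8)|((unsigned int)buffer[7]);
  if ( offset > 128 ) offset = 128;
  if ( offset > (unsigned int)count ) offset = (unsigned int)count;
  idx = (int)offset;
}
size = snd_pcm_write( handle, &buffer[ idx ], count - idx );
if ( size < 0 )
  fprintf( stderr, "write failed: %s\n", snd_strerror( size ) );
else
  printf( "Bytes written %i from %i...\n", size, count - idx );
snd_pcm_close( handle );
free( buffer );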

