WaveForm File Format

Pulse Code Modulation (PCM) is a technique used to store audio data on computers as raw samples, without compressing the audio. In this post I shall briefly explain the format of an uncompressed audio file. The structure of such a file is as follows:

Offset (hex)  Bytes  Data
0             4      “RIFF”
4             4      size of the RIFF chunk. This is the actual file size minus 8 bytes
8             4      “WAVE”
C             4      “fmt ” (note the trailing space)
10            4      size of format chunk
14            2      wf.wFormatTag
16            2      wf.nChannels
18            4      wf.nSamplesPerSec
1C            4      wf.nAvgBytesPerSec
20            2      wf.nBlockAlign
22            2      wf.wBitsPerSample
24            4      “data”
28            4      size of waveform data
2C                   waveform data
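
As a rough illustration of the layout above, here is a minimal C sketch that fills in the 44-byte header for an uncompressed PCM file. The names build_pcm_header, put_le16 and put_le32 are made up for this example. The sketch assumes a format tag of 1 (PCM), a format chunk of exactly 16 bytes, an even number of data bytes (so no pad byte is needed) and little-endian storage for every multi-byte field.

```c
#include <stdint.h>
#include <string.h>

/* Store values in little-endian byte order, regardless of the host CPU. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: write the 44-byte header that precedes the samples.
   Assumes plain PCM and an even data_bytes value. */
static void build_pcm_header(uint8_t hdr[44], uint16_t channels,
                             uint32_t sample_rate, uint16_t bits,
                             uint32_t data_bytes)
{
    uint16_t block_align = (uint16_t)(channels * (bits / 8));

    memcpy(hdr + 0x00, "RIFF", 4);
    put_le32(hdr + 0x04, 36 + data_bytes);          /* file size minus 8 */
    memcpy(hdr + 0x08, "WAVE", 4);
    memcpy(hdr + 0x0C, "fmt ", 4);                  /* trailing space matters */
    put_le32(hdr + 0x10, 16);                       /* size of format chunk */
    put_le16(hdr + 0x14, 1);                        /* wFormatTag: 1 = PCM */
    put_le16(hdr + 0x16, channels);                 /* nChannels */
    put_le32(hdr + 0x18, sample_rate);              /* nSamplesPerSec */
    put_le32(hdr + 0x1C, sample_rate * block_align);/* nAvgBytesPerSec */
    put_le16(hdr + 0x20, block_align);              /* nBlockAlign */
    put_le16(hdr + 0x22, bits);                     /* wBitsPerSample */
    memcpy(hdr + 0x24, "data", 4);
    put_le32(hdr + 0x28, data_bytes);               /* size of waveform data */
}
```

Writing these 44 bytes followed by data_bytes bytes of samples should give a file that common players accept, though real files may carry extra chunks that this sketch does not produce.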

This format is based on RIFF (Resource Interchange File Format). This is a tagged file format in which the file consists of chunks of data, each preceded by a four-character ASCII name (the tag) and the chunk size. So in a RIFF file, you may encounter several tags.
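
Because every chunk carries its own tag and size, a reader can skip chunks it does not understand. The C sketch below walks the chunks of a file already loaded into memory and looks for a given tag. The names find_chunk and read_le32 are hypothetical, and the sketch assumes little-endian sizes and the RIFF rule that a chunk with an odd size is followed by one pad byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value, independent of host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: return the offset of the data of the first chunk
   named `tag` inside a RIFF/WAVE buffer, or -1 if it is not found.
   On success the chunk size is stored in *size_out (if not NULL). */
static long find_chunk(const uint8_t *buf, size_t len,
                       const char *tag, uint32_t *size_out)
{
    size_t pos = 12;                      /* skip "RIFF", size, "WAVE" */

    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;

    while (len - pos >= 8) {              /* room for a tag and a size */
        uint32_t size = read_le32(buf + pos + 4);

        if (memcmp(buf + pos, tag, 4) == 0) {
            if (size_out)
                *size_out = size;
            return (long)(pos + 8);
        }
        if (size > len - pos - 8)         /* chunk claims more than we have */
            break;
        pos += 8 + (size_t)size;          /* skip header and data */
        if ((size & 1u) && pos < len)     /* skip the pad byte, if any */
            pos++;
    }
    return -1;
}
```

For example, find_chunk(buf, len, "data", &size) would locate the sample data even when other chunks sit between the format chunk and the data chunk.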

The first item in a waveform audio file is the string “RIFF”, which identifies it as a RIFF file. The next field is the 32-bit chunk size, which is the size of the rest of the file, i.e. the actual file size is this value plus 8. The chunk data starts with the string “WAVE”, a tag which states that the rest of the file is a WAVE chunk. Next is the string “fmt ” (four characters, including a trailing space), the tag of the chunk that holds the format of the audio data. Next comes the size of the format information. For PCM audio, the format information contains the first 16 bytes of the WAVEFORMATEX structure. The WAVEFORMATEX structure defines the format of waveform audio data:

typedef struct {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
    WORD  wBitsPerSample;
    WORD  cbSize;
} WAVEFORMATEX;

So the format information contains every member of the WAVEFORMATEX structure except cbSize.
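
To make this concrete, here is a small C sketch that pulls those 16 bytes out of a header laid out as in the table at the top of this post. The struct wave_format and the functions parse_wave_format, get_le16 and get_le32 are illustrative names, not part of any Windows API. The sketch assumes the format chunk comes directly after “WAVE” and that all fields are little-endian.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable mirror of the first 16 bytes of WAVEFORMATEX (no cbSize). */
struct wave_format {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
};

static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns 0 on success, -1 if the buffer does not
   start with a RIFF/WAVE header followed directly by a "fmt " chunk. */
static int parse_wave_format(const uint8_t *hdr, size_t len,
                             struct wave_format *out)
{
    if (len < 0x24)                               /* need bytes 0x00..0x23 */
        return -1;
    if (memcmp(hdr, "RIFF", 4) != 0 ||
        memcmp(hdr + 0x08, "WAVE", 4) != 0 ||
        memcmp(hdr + 0x0C, "fmt ", 4) != 0)
        return -1;
    if (get_le32(hdr + 0x10) < 16)                /* format chunk too small */
        return -1;

    out->wFormatTag      = get_le16(hdr + 0x14);
    out->nChannels       = get_le16(hdr + 0x16);
    out->nSamplesPerSec  = get_le32(hdr + 0x18);
    out->nAvgBytesPerSec = get_le32(hdr + 0x1C);
    out->nBlockAlign     = get_le16(hdr + 0x20);
    out->wBitsPerSample  = get_le16(hdr + 0x22);
    return 0;
}
```

Reading the fields byte by byte, rather than copying the bytes straight into a struct, avoids surprises from struct padding and from big-endian hosts.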


I found a nice tool called RiffPad for inspecting waveform audio files. You can load an audio file and view the above format in this editor. Below are screenshots of a file in RiffPad. The first one shows the WAVEFORMATEX structure, while the second one shows the actual audio data section.