Simple GUI FLAC player (Java)

Introduction

The goal of this mini-project was to make a FLAC audio player with a GUI and a working seek bar while keeping the implementation small. The result delivers the promised features in ~650 lines of Java code. The tradeoff for smallness is that the code is less modular and reusable than ideal, has few explanatory or documentation comments, and ignores many error conditions. Even so, the result illustrates the modest effort needed to implement a seekable FLAC player.

Download source code: SimpleGuiFlacPlayer.java

This monolithic program can be considered an amalgamation of SimpleDecodeFlacToWav, FrameInfo, FlacDecoder, and SeekableFlacPlayerGui, which are published on other pages.

Code overview

Major components

Graphical user interface (~100 lines): Displays widgets, gives requests to the audio worker, and accepts display updates from the audio worker.
Audio worker (~150 lines): Runs a loop that decodes audio samples and sends them to an output line, accepts an open-file or seek request from the GUI, or waits for a new request because the end of the file was reached or the file was closed due to a runtime error.
FLAC decoder (~300 lines): Parses FLAC frames, decodes audio data, and implements logic to seek to a frame at the desired audio position.
Low-level file stream (~100 lines): Provides methods to read bits, read bytes, and seek around.

Threads and call tree

Main thread: Initializes the GUI by creating objects and configuring attributes and event handlers, then transitions to process audio data or user requests in an infinite loop. The thread maintains objects like a FLAC decoder and an output line for audio processing, and receives user requests from shared variables. The FLAC decoder object contains and uses a low-level file stream object, which separates out the very simple operations (such as reading an n-bit unsigned integer) that are not specific to FLAC.
AWT thread: Handles both GUI input and output. When the user clicks a button or interacts with the UI, the AWT thread calls the appropriate piece of event listener code, which may update the UI and/or set some shared variables. Also, when the main program thread needs to update the UI, it needs to execute a callback on the AWT thread because Swing is not thread-safe – the main thread cannot directly update the UI objects.
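
As an illustration of the "shared variables" handoff between the two threads, here is a minimal sketch. It is not taken from SimpleGuiFlacPlayer.java; the class name SeekRequestBox and its methods are hypothetical, and a pending seek is assumed to be represented as a fraction of the track, with -1 meaning "no request".

```java
// Hypothetical sketch: a one-slot mailbox that a GUI thread writes and a worker thread reads.
// None of these names come from SimpleGuiFlacPlayer.java.
final class SeekRequestBox {
    private double pending = -1;  // -1 means no request is waiting (an assumption of this sketch)

    // Called on the GUI thread, e.g. from a slider listener. A newer request replaces an older one.
    synchronized void post(double fraction) {
        pending = fraction;
        notifyAll();  // wake the worker if it is idle in take()
    }

    // Called by the worker between decoded blocks: returns the latest request, or -1 if none.
    synchronized double poll() {
        double result = pending;
        pending = -1;
        return result;
    }

    // Called by the worker when it has nothing to play: blocks until a request arrives.
    synchronized double take() throws InterruptedException {
        while (pending < 0)
            wait();
        return poll();
    }

    public static void main(String[] args) throws InterruptedException {
        SeekRequestBox box = new SeekRequestBox();
        Thread gui = new Thread(() -> box.post(0.5));  // stands in for a slider listener
        gui.start();
        System.out.println("Worker received seek request: " + box.take());
    }
}
```

For the opposite direction, a worker thread would typically wrap each UI update in SwingUtilities.invokeLater(...) so that it runs on the AWT thread.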

Major classes and methods

class SimpleGuiFlacPlayer {
	(... Various fields for GUI ...)
	void main(String[] args);
	void setSliderPosition(double t);
	
	(... Various fields for worker ...)
	void doAudioDecoderWorkerLoop();
	void doWorkerIteration();
	
	class FlacDecoder {
		(... Various fields ...)
		void close();
		long[][] seekAndReadBlock(long samples);
		long[] findNextDecodableFrame(long filePos);
		Object[] readNextBlock();
		long[][] decodeSubframes(...);
		void decodeSubframe(...);
		void decodeRiceResiduals(...);
		
		class Stream {
			(... Various fields ...)
			void close();
			long getLength();
			long getPosition();
			void seekTo(long pos);
			int readByte();
			int readUint(int n);
			int readSignedInt(int n);
			void alignToByte();
		}
		
		class FormatException {}
	}
}
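
To make the role of the Stream class more concrete, the following sketch shows one way its bit-level methods could work. It is an assumption-laden illustration, not the author's code: it reads from an in-memory byte array rather than a file, and the class name BitReaderSketch and its fields are hypothetical. Only the method names readUint, readSignedInt, and alignToByte mirror the outline above.

```java
// Hypothetical sketch of a big-endian bit reader over a byte array.
final class BitReaderSketch {
    private final byte[] data;
    private int bytePos = 0;       // index of the next unread byte
    private long bitBuffer = 0;    // recently loaded bytes; only the low bitBufferLen bits are unread
    private int bitBufferLen = 0;  // number of unread bits in bitBuffer (0 to 39)

    BitReaderSketch(byte[] data) {
        this.data = data;
    }

    // Reads the next n bits (0 <= n <= 32), most significant bit first.
    // For n = 32 the unsigned value is returned as the raw 32-bit pattern.
    int readUint(int n) {
        if (n < 0 || n > 32)
            throw new IllegalArgumentException("n must be in [0, 32]");
        while (bitBufferLen < n) {  // load whole bytes until enough bits are available
            if (bytePos >= data.length)
                throw new IllegalStateException("End of data");
            bitBuffer = (bitBuffer << 8) | (data[bytePos++] & 0xFF);
            bitBufferLen += 8;
        }
        bitBufferLen -= n;
        int result = (int)(bitBuffer >>> bitBufferLen);
        if (n < 32)
            result &= (1 << n) - 1;  // mask off bits that were consumed earlier
        return result;
    }

    // Reads n bits and sign-extends them as a two's complement integer.
    int readSignedInt(int n) {
        return (readUint(n) << (32 - n)) >> (32 - n);
    }

    // Discards any bits remaining in the current partially read byte.
    void alignToByte() {
        bitBufferLen -= bitBufferLen % 8;
    }

    public static void main(String[] args) {
        BitReaderSketch in = new BitReaderSketch(new byte[]{(byte)0xA5, (byte)0xFF});
        System.out.println(in.readUint(3));       // top 3 bits of 0xA5
        System.out.println(in.readUint(5));       // low 5 bits of 0xA5
        System.out.println(in.readSignedInt(4));  // top 4 bits of 0xFF, sign-extended
    }
}
```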

FLAC decoding

These are the steps to decode a FLAC audio file into raw uncompressed samples:

Parse the stream info metadata block. This contains essential facts such as the sample rate, bit depth, number of channels, and total number of samples (i.e. clip length).
Skip all other metadata blocks (tags, pictures, etc.).
Sequentially decode every audio frame until the end of file is reached.
To decode a frame, first confirm its sync code, then parse about a dozen header fields.
Next, decode every subframe in the frame, with one subframe per audio channel.
To decode a subframe, first read its header fields to determine the encoding parameters.
A subframe encoded in constant mode or verbatim mode is easily handled.
When a subframe is encoded in fixed prediction mode or linear predictive coding (LPC) mode, first the uncompressed warm-up samples are read, then the residuals for the remaining samples are decoded using Rice coding, and finally LPC restoration is applied.
When all the subframes of a frame are decoded, there may be a bit more work to decode stereo encoding modes such as mid-side coding.
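
The arithmetic behind the last few steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from SimpleGuiFlacPlayer.java: the class and method names are hypothetical, samples are held in long arrays, and bit reading is left out so that only the math is shown. The fixed-prediction coefficients and the mid-side reconstruction follow the FLAC format.

```java
// Hypothetical sketch of the per-subframe and per-frame arithmetic in a FLAC decoder.
final class SubframeMathSketch {
    // Fixed prediction is LPC with these preset coefficients (orders 0 to 4) and a shift of 0.
    static final int[][] FIXED_COEFS = {{}, {1}, {2, -1}, {3, -3, 1}, {4, -6, 4, -1}};

    // Maps a Rice-decoded unsigned value to a signed residual: 0, 1, 2, 3 -> 0, -1, 1, -2.
    static long unfoldResidual(long u) {
        return (u >>> 1) ^ -(u & 1);
    }

    // On entry, samples[0 .. coefs.length-1] hold warm-up samples and the rest hold residuals.
    // On exit, every element holds a restored sample.
    static void restoreLpc(long[] samples, int[] coefs, int shift) {
        for (int i = coefs.length; i < samples.length; i++) {
            long sum = 0;
            for (int j = 0; j < coefs.length; j++)
                sum += samples[i - 1 - j] * coefs[j];  // coefs[0] applies to the most recent sample
            samples[i] += sum >> shift;  // prediction plus residual
        }
    }

    // Converts mid-side channels to left-right in place: ch0 becomes left, ch1 becomes right.
    static void decodeMidSide(long[] ch0, long[] ch1) {
        for (int i = 0; i < ch0.length; i++) {
            long side = ch1[i];
            long right = ch0[i] - (side >> 1);
            ch1[i] = right;
            ch0[i] = right + side;
        }
    }

    public static void main(String[] args) {
        long[] samples = {10, 12, 0, 0, 1};  // two warm-up samples, then three residuals
        restoreLpc(samples, FIXED_COEFS[2], 0);
        System.out.println(java.util.Arrays.toString(samples));
    }
}
```

With the order-2 fixed predictor, each sample is predicted by linear extrapolation from the previous two, so small residuals correspond to a nearly straight waveform segment.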

To seek in a FLAC file, we can search entries in the embedded seek table or search blindly in the whole audio file. Unfortunately, many publishers omit seek tables from their FLAC files, so the latter method is more dependable in practice. Here is how seeking works:

Suppose we want the playback position to jump to a specific audio sample offset in the file.
We use binary search over the whole file data to narrow down which frame to ultimately decode.
Define the range start as the file position of the foremost frame (i.e. immediately after the header metadata ends) and the range end as the end of the file.
In each iteration, calculate the middle file position as the average of the range start and end.
Starting at the middle position, read forward and try to find a sync sequence.
When a sync code is found, try decoding the frame starting there. If decoding fails, then most likely some audio data accidentally mimicked a sync code and this wasn’t a real frame, so keep reading forward to find a valid sync and frame.
If decoding succeeds, then we can look at the sample offset encoded in the frame header. If it is less than or equal to the sample offset we want to seek to, we set start = middle; otherwise we set end = middle.
After binary search terminates, the value of the range start must satisfy the constraint that the first frame found starting at that file offset will have a sample offset less than or equal to the requested seek position.
We seek to the range start, find a sync, and decode the next frame.
If the frame’s end sample position is after the requested seek position, then we return the appropriate suffix of the frame’s samples as the result.
Otherwise, if the frame’s end is not after the requested seek position, we advance forward and decode the next frame, repeating until the frame’s data falls in the desired range of sample offsets.
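
The seek procedure above can be sketched as follows. To keep the example self-contained, the file is modeled as two parallel arrays of frame file positions and frame sample offsets, and scanning for a sync code is replaced by a lookup. The class name SeekSketch, its fields, and the stopGap parameter are all hypothetical and do not come from SimpleGuiFlacPlayer.java.

```java
// Hypothetical sketch: binary search over file positions, then a forward scan over frames.
final class SeekSketch {
    private final long[] framePos;     // file offset where each frame starts (ascending)
    private final long[] frameSample;  // sample offset of the first sample in each frame
    private final long fileLength;
    private final long totalSamples;

    SeekSketch(long[] framePos, long[] frameSample, long fileLength, long totalSamples) {
        this.framePos = framePos;
        this.frameSample = frameSample;
        this.fileLength = fileLength;
        this.totalSamples = totalSamples;
    }

    // Stand-in for "read forward from filePos until a sync code and a decodable frame are found".
    // Returns the frame index, or -1 if no frame starts at or after filePos.
    int findNextFrame(long filePos) {
        for (int i = 0; i < framePos.length; i++) {
            if (framePos[i] >= filePos)
                return i;
        }
        return -1;
    }

    // Sample offset just past the end of frame f.
    long frameEnd(int f) {
        return f + 1 < frameSample.length ? frameSample[f + 1] : totalSamples;
    }

    // Returns the index of the frame containing samplePos. The binary search stops once the
    // range is at most stopGap bytes wide (stopGap >= 1), and the forward scan finishes the job.
    int seek(long samplePos, long stopGap) {
        if (samplePos < 0 || samplePos >= totalSamples)
            throw new IllegalArgumentException("Sample position out of range");
        long start = framePos[0];  // position of the foremost frame
        long end = fileLength;
        while (end - start > stopGap) {
            long middle = start + (end - start) / 2;
            int f = findNextFrame(middle);
            if (f == -1 || frameSample[f] > samplePos)
                end = middle;
            else
                start = middle;  // invariant: first frame at or after start begins at or before samplePos
        }
        int f = findNextFrame(start);
        while (frameEnd(f) <= samplePos)  // advance until the frame covers samplePos
            f++;
        return f;
    }

    public static void main(String[] args) {
        SeekSketch file = new SeekSketch(new long[]{100, 400, 900, 1500},
            new long[]{0, 4096, 8192, 12288}, 2000, 16384);
        System.out.println("Sample 5000 is in frame " + file.seek(5000, 1));
    }
}
```

A larger stopGap trades fewer search iterations for more frames decoded during the forward scan; the result is the same either way because the invariant on the range start holds throughout.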