Project Nayuki


PNG file chunk inspector

PNG file format overview

Portable Network Graphics is a ubiquitous file format for conveying still images. It is used on the web and in various document systems, and it has a decent level of lossless compression.

A PNG file is composed of an 8-byte signature header, followed by any number of chunks that contain control data, metadata, or image data. Each chunk contains three standard fields – a 4-byte length and a 4-byte type code before the data, and a 4-byte CRC after it – and various internal fields that depend on the chunk type.
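To make the layout above concrete, here is a minimal sketch of a chunk walker. It is not the code behind the tool on this page; the names `PngChunk`, `crc32`, and `parseChunks` are hypothetical and chosen only for illustration. It assumes the whole file is already in memory as a `Uint8Array`, and it relies on two facts from the PNG specification: the length field counts only the data bytes, and the CRC covers the type code plus the data.

```typescript
// Hypothetical types and names for illustration; not taken from the tool's source.
interface PngChunk {
  offset: number;    // Byte offset of the chunk's length field within the file
  type: string;      // 4-character type code, e.g. "IHDR"
  data: Uint8Array;  // Chunk payload (exactly `length` bytes)
  crcOk: boolean;    // Whether the stored CRC matches the computed one
}

const PNG_SIGNATURE: number[] = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];

// Bit-at-a-time CRC-32 with the reflected polynomial 0xEDB88320.
function crc32(bytes: Uint8Array): number {
  let crc = 0xFFFFFFFF;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i];
    for (let j = 0; j < 8; j++)
      crc = (crc >>> 1) ^ ((crc & 1) !== 0 ? 0xEDB88320 : 0);
  }
  return (crc ^ 0xFFFFFFFF) >>> 0;  // Force an unsigned 32-bit result
}

function parseChunks(file: Uint8Array): PngChunk[] {
  for (let i = 0; i < PNG_SIGNATURE.length; i++) {
    if (file[i] !== PNG_SIGNATURE[i])
      throw new Error("Not a PNG file: bad signature");
  }
  const view = new DataView(file.buffer, file.byteOffset, file.byteLength);
  const chunks: PngChunk[] = [];
  let offset = PNG_SIGNATURE.length;
  while (offset < file.length) {
    // Every chunk needs at least 12 bytes: length + type + CRC.
    if (file.length - offset < 12)
      throw new Error("Truncated chunk at offset " + offset);
    const length = view.getUint32(offset);  // DataView reads big-endian by default
    if (length > 0x7FFFFFFF || length > file.length - offset - 12)
      throw new Error("Chunk length out of range at offset " + offset);
    // The CRC is computed over the type code and the data, but not the length.
    const typeAndData = file.subarray(offset + 4, offset + 8 + length);
    const type = String.fromCharCode(
      typeAndData[0], typeAndData[1], typeAndData[2], typeAndData[3]);
    const storedCrc = view.getUint32(offset + 8 + length);
    chunks.push({
      offset: offset,
      type: type,
      data: typeAndData.subarray(4),
      crcOk: crc32(typeAndData) === storedCrc,
    });
    offset += 12 + length;
  }
  return chunks;
}
```

Calling `parseChunks` on the bytes of a file returns one entry per chunk in file order. The sketch only checks the three standard fields; interpreting the internal fields of each chunk type is a separate and much larger job.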

The JavaScript tool on this page reads a given PNG file and dissects it deeply, showing the list of chunks and fields as well as any errors that violate the format specification. This can be helpful in looking for hidden metadata (i.e. stuff not in the visual picture), as well as in developing software that reads or writes PNG files in a compliant manner.

Notes

  • The PNG file format starts off with a magic signature, which is followed by any number of chunks that all share a uniform syntax. This design is similar to other popular multimedia file formats such as BMP, TIFF, WAV, AVI, and RIFF in general. It’s different from plain text file formats (examples like Netpbm surprisingly exist), XML’s hierarchical elements, a ZIP container of subfiles, PDF’s custom binary format, etc.

  • A PNG file can contain multiple IDAT chunks to hold the compressed image data. This is semantically equivalent to a single IDAT chunk holding the concatenation of their data bytes; it does not mean the file has multiple (sub)images. Having multiple IDATs costs a bit more space in chunk headers and footers, and it offers no benefit to the decoder. The main reason to have multiple IDATs is to let the encoder software keep memory usage low: because each chunk’s length field comes before its data, writing a single IDAT would require buffering the entire compressed stream just to learn its final length. A far rarer reason for many IDATs is if the payload data after compression exceeds 2 GiB; in this case it is mandatory.

  • Although this tool goes quite deep into various values and fields in PNG files, it can’t be perfectly detailed. For example, outputting every DEFLATE symbol, every pixel value, or even every palette value, would be very verbose and probably not helpful. Also, there are some complex external data formats (not defined in the PNG standard) that I choose not to understand or decode, such as ICC profiles or Exif metadata. When this tool is insufficient for your needs, the only remaining solution is to examine the raw bytes of a file, and parse the chunks and fields yourself.

  • The source TypeScript code and compiled JavaScript code are available for viewing.
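The note above about multiple IDAT chunks can be illustrated with a short sketch. This is not taken from the tool’s source; the `Chunk` type and the functions `concatIdatData` and `splitIntoIdats` are hypothetical names, and the chunk size passed to the splitter is an arbitrary choice. The first function shows what a decoder conceptually does before decompressing, and the second shows how a streaming encoder might cut its compressed output into bounded pieces.

```typescript
// Minimal chunk shape assumed for this sketch (hypothetical, not the tool's own type).
interface Chunk {
  type: string;
  data: Uint8Array;
}

// Decoder side: join the payloads of all IDAT chunks, in file order,
// into the single compressed stream that they represent.
function concatIdatData(chunks: Chunk[]): Uint8Array {
  const idats = chunks.filter(c => c.type === "IDAT");
  let total = 0;
  for (const c of idats)
    total += c.data.length;
  const result = new Uint8Array(total);
  let pos = 0;
  for (const c of idats) {
    result.set(c.data, pos);
    pos += c.data.length;
  }
  return result;
}

// Encoder side: cut a compressed stream into IDAT payloads of at most maxLen bytes.
// A chunk's length field cannot exceed 2^31 - 1, so maxLen is capped at that value.
function splitIntoIdats(compressed: Uint8Array, maxLen: number): Chunk[] {
  if (Math.floor(maxLen) !== maxLen || maxLen < 1 || maxLen > 0x7FFFFFFF)
    throw new RangeError("maxLen must be an integer from 1 to 2^31 - 1");
  const chunks: Chunk[] = [];
  for (let i = 0; i < compressed.length; i += maxLen) {
    const end = Math.min(i + maxLen, compressed.length);
    chunks.push({ type: "IDAT", data: compressed.subarray(i, end) });
  }
  return chunks;
}
```

Splitting a stream and then concatenating the pieces gives back the original bytes regardless of where the cuts fall, which is why the boundaries between IDAT chunks carry no meaning for the image.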

Compared to pngcheck

  • pngcheck, which was not made by me, is a similar command-line program that displays information about a PNG file’s chunks and checks for format errors. It’s implemented in C and has existed since 1995, around when PNG was born. As far as I know, it’s the only other publicly available and comprehensive PNG checker, so presumably many developers over the decades have relied on it to check their own work. My program was released in 2021, long after it.

  • In terms of code size, pngcheck is about 5000 lines of C code (which includes MNG and JNG chunk support, but excludes a decompressor), whereas my work is about 2000 lines of TypeScript code (which includes HTML GUI logic and a DEFLATE decompressor).

  • Both my program and pngcheck correctly detect almost all common and serious file errors. pngcheck fails to detect some errors, such as bKGD overflow, tRNS overflow, tIME wrong day-of-month, sPLT duplicate name, iTXt wrong UTF-8, and zTXt wrong compressed data. My program does not detect errors of any kind in IDAT – e.g. decompression error, wrong length, wrong filter.

  • The author(s) of pngcheck admit to fixing a number of buffer overrun bugs over the years, and acknowledge that more security vulnerabilities can still remain. By contrast, I was aware of buffer overruns and numeric overflows while writing my code, and I believe I have avoided all possible mistakes; furthermore my code runs in a JavaScript virtual machine which is inherently memory-safe.
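As an illustration of one of the finer checks mentioned above, the tIME day-of-month error, here is a minimal sketch of how such a check could be written. It does not reproduce how either program implements it; `checkTimeChunk` and `daysInMonth` are hypothetical names. It assumes the tIME layout from the PNG specification: a 2-byte year followed by one byte each for month, day, hour, minute, and second, where the second may be 60 to allow for a leap second.

```typescript
// Number of days in the given month (1 to 12) of the given year, Gregorian calendar.
function daysInMonth(year: number, month: number): number {
  const isLeap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
  return [31, isLeap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31][month - 1];
}

// Returns a list of problems found in a tIME chunk payload (empty if none found).
function checkTimeChunk(data: Uint8Array): string[] {
  if (data.length !== 7)
    return ["Invalid data length"];
  const errors: string[] = [];
  const year = (data[0] << 8) | data[1];  // Big-endian 16-bit year
  const month = data[2];
  const day = data[3];
  const hour = data[4];
  const minute = data[5];
  const second = data[6];
  if (month < 1 || month > 12)
    errors.push("Invalid month");
  else if (day < 1 || day > daysInMonth(year, month))
    errors.push("Invalid day of month");  // e.g. February 30, or February 29 in a non-leap year
  if (hour > 23)
    errors.push("Invalid hour");
  if (minute > 59)
    errors.push("Invalid minute");
  if (second > 60)
    errors.push("Invalid second");
  return errors;
}
```

The day can only be judged once the month is known to be valid, so the sketch skips the day check when the month is out of range rather than reporting a second, possibly misleading error.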
