WebP Format
Google's Modern Image Format
RIFF Container
Google didn't invent a new container for WebP. They went with RIFF, the same format that's been carrying WAV audio and AVI video since the early '90s. If you've ever parsed a WAV file, you already know how WebP works at the container level.
The format uses nested chunks with 4-character identifiers (called FourCC codes). It's a simple and robust design that has stood the test of time.
WebP File Structure (RIFF Container)
WebP Header
| Offset | Hex | ASCII |
|---|---|---|
| 0000 | 52494646241A00005745425056503858 | RIFF$...WEBPVP8X |
| 0010 | 0A00000010000000FF0000FF00000000 | ................ |
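To make the dump above concrete, here is a minimal sketch that decodes those 32 bytes with Python's `struct` module. The variable names are illustrative only, and the VP8X payload layout (one flags byte, three reserved bytes, then 24-bit width-minus-one and height-minus-one fields) is taken from the WebP container specification rather than from this page.

```python
import struct

# The 32 header bytes shown in the hex dump.
header = bytes.fromhex(
    "52494646241A00005745425056503858"
    "0A00000010000000FF0000FF00000000"
)

# Bytes 0-11: "RIFF", little-endian size field, "WEBP".
riff, riff_size, webp = struct.unpack_from("<4sI4s", header, 0)

# Bytes 12-19: first chunk header (FourCC + little-endian payload size).
fourcc, chunk_size = struct.unpack_from("<4sI", header, 12)

# VP8X payload: 1 flags byte, 3 reserved bytes, then two 24-bit
# little-endian fields holding (canvas width - 1) and (canvas height - 1).
flags = header[20]
width = int.from_bytes(header[24:27], "little") + 1
height = int.from_bytes(header[27:30], "little") + 1

print(riff, riff_size, webp)           # b'RIFF' 6692 b'WEBP'
print(fourcc, chunk_size, hex(flags))  # b'VP8X' 10 0x10
print(width, height)                   # 256 256
```

In this sample the flags byte is 0x10, which corresponds to the alpha bit, and the canvas is 256 by 256 pixels.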
RIFF Chunk Structure
Note: Chunk data is padded to even length with a null byte if necessary.
Key Offsets
- Sub-chunk start: 12 bytes (after RIFF + size + WEBP)
- Chunk header: 8 bytes (4-byte FourCC + 4-byte size)
- Size field offset: +4 from chunk start (little-endian)
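The offsets and the padding rule above are enough to walk a file chunk by chunk. The following is a rough sketch, not PicScrub's actual code: `iter_chunks` and `make_chunk` are hypothetical helper names, and the sample payloads are placeholders rather than decodable image data.

```python
import struct
from typing import Iterator, Tuple


def iter_chunks(data: bytes) -> Iterator[Tuple[bytes, int, bytes]]:
    """Yield (fourcc, offset, payload) for each top-level chunk."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a RIFF/WEBP file")
    pos = 12  # first chunk starts after RIFF + size + WEBP
    while pos + 8 <= len(data):
        fourcc, size = struct.unpack_from("<4sI", data, pos)
        payload = data[pos + 8 : pos + 8 + size]
        if len(payload) != size:
            raise ValueError("truncated chunk")
        yield fourcc, pos, payload
        # Skip the header, the payload, and one pad byte if size is odd.
        pos += 8 + size + (size & 1)


def make_chunk(fourcc: bytes, payload: bytes) -> bytes:
    """Build a chunk for the demo, padding odd payloads to even length."""
    pad = b"\x00" if len(payload) % 2 else b""
    return fourcc + struct.pack("<I", len(payload)) + payload + pad


# Placeholder payloads: a 3-byte (odd) chunk followed by a 2-byte chunk.
body = b"WEBP" + make_chunk(b"VP8L", b"\x2f\x00\x00") + make_chunk(b"EXIF", b"ab")
sample = b"RIFF" + struct.pack("<I", len(body)) + body

found = [(cc, off, len(p)) for cc, off, p in iter_chunks(sample)]
print(found)  # [(b'VP8L', 12, 3), (b'EXIF', 24, 2)]
```

The second chunk lands at offset 24 rather than 23 because the 3-byte payload is followed by a pad byte.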
Image Chunks
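WebP is really three formats in one: lossy (VP8), lossless (VP8L), and extended (VP8X). The FourCC of the first chunk tells you which one you are looking at: VP8 and VP8L identify the compression method directly, while VP8X marks an extended file whose image data follows in later chunks.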
VP8: Lossy Compression
Uses the key-frame coding of the VP8 video codec. Google reports files 25-34% smaller than comparable JPEGs at equivalent SSIM quality. A simple lossy WebP file contains only this chunk.
VP8L: Lossless Compression
Lossless compression with prediction, color transform, and entropy coding. Google reports files about 26% smaller than PNG.
VP8X: Extended Format
Indicates presence of optional features: animation, alpha channel, ICC profile, EXIF, or XMP. Required when any of these features are present.
ALPH: Alpha Channel
Separate alpha channel data for lossy (VP8) images. Stored independently of the color data, either uncompressed or with WebP lossless compression, optionally after quantizing the alpha levels.
ANIM/ANMF: Animation
ANIM contains global animation parameters. ANMF chunks contain individual animation frames with timing info.
Metadata Chunks
Unlike simple WebP files that contain only image data, extended WebP files can carry metadata. The good news is that these chunks are cleanly separated from the image data, making them easy to identify and remove.
EXIF: EXIF Metadata
Contains EXIF data in the same format as JPEG (raw TIFF structure). Includes camera info, GPS coordinates, timestamps, and thumbnails. Primary privacy concern.
XMP: XMP Metadata
XML-based metadata. Can duplicate EXIF data plus editing history, software info, and custom fields.
ICCP: ICC Color Profile
ICC color profile for color management. PicScrub preserves it by default to maintain color accuracy.
VP8X Flags
The VP8X chunk acts like a table of contents: its flags byte tells decoders which optional chunks to expect in the file, so the flags have to stay in sync with the chunks that are actually present.
Flag Bits
Why Flags Matter
When PicScrub removes EXIF/XMP chunks, it must also update the VP8X flags to reflect their absence. Otherwise, decoders may expect data that isn't there.
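As an illustration of the flag handling, here is a small sketch of the flags byte (the first byte of the VP8X payload). The bit values follow the WebP container specification; the constant and function names are hypothetical and not taken from PicScrub.

```python
from typing import List

# Bit masks within the VP8X flags byte (per the WebP container spec).
ICC_FLAG = 0x20    # ICCP chunk present
ALPHA_FLAG = 0x10  # image uses transparency
EXIF_FLAG = 0x08   # EXIF chunk present
XMP_FLAG = 0x04    # XMP chunk present
ANIM_FLAG = 0x02   # animated image (ANIM/ANMF chunks)

_NAMES = [
    (ICC_FLAG, "icc"),
    (ALPHA_FLAG, "alpha"),
    (EXIF_FLAG, "exif"),
    (XMP_FLAG, "xmp"),
    (ANIM_FLAG, "animation"),
]


def describe_flags(flags: int) -> List[str]:
    """List the features announced by a VP8X flags byte."""
    return [name for mask, name in _NAMES if flags & mask]


def clear_metadata_flags(flags: int) -> int:
    """Clear the EXIF and XMP bits, leaving all other bits untouched."""
    return flags & ~(EXIF_FLAG | XMP_FLAG) & 0xFF


before = 0x3C  # ICC + alpha + EXIF + XMP
after = clear_metadata_flags(before)
print(describe_flags(before))  # ['icc', 'alpha', 'exif', 'xmp']
print(describe_flags(after))   # ['icc', 'alpha']
```

Masking only the two metadata bits keeps the alpha, ICC, and animation flags exactly as the encoder wrote them.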
How PicScrub Processes WebP
Validate RIFF Header
Check for the RIFF....WEBP signature (the four dots stand for the 4-byte size field)
Parse Chunks
Iterate through all RIFF chunks, reading FourCC and size
Remove Metadata Chunks
Skip EXIF and XMP chunks when writing output
Update VP8X Flags
Clear EXIF/XMP flag bits in VP8X chunk
Recalculate File Size
Update the RIFF header's size field to the new file size minus 8 bytes (the field does not count the RIFF tag or itself)
Preserved
- VP8/VP8L image data
- ALPH alpha channel
- ANIM/ANMF animation data
- ICCP color profile (optional)
Removed
- EXIF metadata
- XMP metadata
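The five steps above could be sketched roughly as follows. This is an assumption-laden illustration rather than PicScrub's real implementation: `strip_metadata` and `chunk` are hypothetical names, the flag masks come from the WebP container specification, and the sample file uses placeholder payloads that no decoder would accept as an image.

```python
import struct

EXIF_FLAG = 0x08
XMP_FLAG = 0x04
METADATA_FOURCCS = {b"EXIF", b"XMP "}  # the XMP FourCC has a trailing space


def strip_metadata(data: bytes) -> bytes:
    # Step 1: validate the RIFF....WEBP signature.
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a RIFF/WEBP file")
    out = bytearray(b"WEBP")
    pos = 12
    # Step 2: walk every chunk, reading FourCC and size.
    while pos + 8 <= len(data):
        fourcc, size = struct.unpack_from("<4sI", data, pos)
        if pos + 8 + size > len(data):
            raise ValueError("truncated chunk")
        end = pos + 8 + size + (size & 1)  # include the pad byte, if any
        piece = bytearray(data[pos:end])
        pos = end
        # Step 3: skip metadata chunks when writing output.
        if fourcc in METADATA_FOURCCS:
            continue
        # Step 4: clear the EXIF/XMP bits in the VP8X flags byte.
        if fourcc == b"VP8X" and size >= 1:
            piece[8] &= ~(EXIF_FLAG | XMP_FLAG) & 0xFF
        out += piece
    # Step 5: the RIFF size field counts everything after the first 8 bytes.
    return b"RIFF" + struct.pack("<I", len(out)) + bytes(out)


def chunk(fourcc: bytes, payload: bytes) -> bytes:
    """Demo helper: build a chunk with even-length padding."""
    return fourcc + struct.pack("<I", len(payload)) + payload + b"\x00" * (len(payload) & 1)


# Placeholder file: VP8X (alpha + EXIF flags), dummy VP8 payload, EXIF chunk.
vp8x = chunk(b"VP8X", bytes([0x18, 0, 0, 0, 0xFF, 0, 0, 0xFF, 0, 0]))
body = b"WEBP" + vp8x + chunk(b"VP8 ", b"\x00" * 5) + chunk(b"EXIF", b"II*\x00")
sample = b"RIFF" + struct.pack("<I", len(body)) + body

clean = strip_metadata(sample)
print(len(sample), len(clean))  # 56 44
print(hex(clean[20]))           # 0x10
```

Because chunks are copied byte for byte, the image data is never re-encoded; only the flags byte and the RIFF size field change.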
Simple vs Extended WebP
If a WebP file doesn't have a VP8X chunk, it's a "simple" WebP: just the RIFF wrapper around a single VP8 or VP8L chunk. No metadata is possible in simple WebP files, so if you're privacy-conscious, these are ideal.
Simple WebP
Contains only VP8 or VP8L chunk. No metadata possible.
Extended WebP
Has VP8X chunk enabling metadata, alpha, animation.
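Telling the two layouts apart only requires looking at the first chunk's FourCC at offset 12. A minimal sketch, with `webp_kind` as a hypothetical helper name and the sample bytes taken from the header dump earlier on this page:

```python
def webp_kind(data: bytes) -> str:
    """Classify a WebP file by the FourCC of its first chunk."""
    if len(data) < 16 or data[0:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a RIFF/WEBP file")
    first = data[12:16]
    if first == b"VP8X":
        return "extended"
    if first == b"VP8 ":  # lossy FourCC has a trailing space
        return "simple (lossy)"
    if first == b"VP8L":
        return "simple (lossless)"
    raise ValueError("unexpected first chunk: %r" % first)


# First 16 bytes of the header dump: RIFF, size, WEBP, VP8X.
header = bytes.fromhex("52494646241A00005745425056503858")
print(webp_kind(header))  # extended
```

A result of "extended" does not by itself mean metadata is present; it only means the VP8X flags and the remaining chunks are worth inspecting.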