TIFF Format
Tagged Image File Format
Header Structure
TIFF is the grandfather of image metadata. When you see EXIF data in a JPEG, that's actually a TIFF structure embedded inside the file. Understanding TIFF means understanding how most image metadata works.
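The format starts with an 8-byte header that tells you two critical things: the byte order (Intel vs Motorola) and where to find the first IFD. Between them sits a 2-byte magic number, 42 (0x002A), which confirms that the file really is a TIFF. From there, it's all pointer-chasing.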
TIFF File Structure
TIFF Header
| Offset | Hex | ASCII |
|---|---|---|
| 0000 | 49 49 2A 00 08 00 00 00 | II*..... |
Byte Order
49 49 (II) Intel: Little-endian (least significant byte first)
4D 4D (MM) Motorola: Big-endian (most significant byte first)
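As a rough illustration of the header layout above, here is a minimal Python sketch that decodes the 8 header bytes. The function name `parse_tiff_header` is hypothetical, and the sketch assumes a classic TIFF (magic number 42), not BigTIFF.

```python
import struct

def parse_tiff_header(data: bytes):
    """Return (endian_prefix, first_ifd_offset) for a classic TIFF header."""
    if len(data) < 8:
        raise ValueError("TIFF header needs at least 8 bytes")
    order = data[:2]
    if order == b"II":
        endian = "<"   # little-endian (Intel)
    elif order == b"MM":
        endian = ">"   # big-endian (Motorola)
    else:
        raise ValueError(f"unknown byte-order mark: {order!r}")
    # Bytes 2-3: magic number 42. Bytes 4-7: offset of the first IFD.
    magic, first_ifd = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError(f"bad magic number: {magic}")
    return endian, first_ifd

# The example header from the table above.
header = bytes.fromhex("49492A0008000000")
print(parse_tiff_header(header))  # ('<', 8)
```

The returned prefix (`"<"` or `">"`) can be passed straight to `struct` format strings when decoding the rest of the file.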
IFD Structure
IFD stands for Image File Directory, but think of it as a table of contents. It's a flat list of tags, each pointing to some piece of data. The clever part is that IFDs can point to other IFDs, creating a tree structure.
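IFD Layout
An IFD is a 2-byte entry count, followed by that many 12-byte tag entries, followed by a 4-byte offset to the next IFD (0 if there is none).
Tag Entry Structure (12 bytes)
Each entry is a 2-byte tag ID, a 2-byte type, a 4-byte count (the number of values, not bytes), and a 4-byte value field. If the data fits in 4 bytes, it is stored directly in the value field. Otherwise, the value field contains an offset to the data.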
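To make the 12-byte entry structure concrete, the following sketch reads one IFD into a list of tuples. `read_ifd` is a hypothetical helper name, and the sample IFD is built by hand purely for illustration.

```python
import struct

def read_ifd(data: bytes, offset: int, endian: str = "<"):
    """Return (entries, next_ifd_offset) for the IFD at `offset`.

    Each entry is (tag, type_id, count, raw_value_field), where the
    raw value field is the untouched 4 bytes at the end of the entry.
    """
    (entry_count,) = struct.unpack_from(endian + "H", data, offset)
    entries = []
    for i in range(entry_count):
        pos = offset + 2 + i * 12            # entries are 12 bytes each
        tag, type_id, count = struct.unpack_from(endian + "HHI", data, pos)
        raw = data[pos + 8 : pos + 12]       # value or offset, undecoded
        entries.append((tag, type_id, count, raw))
    # The 4-byte next-IFD offset sits right after the last entry.
    (next_ifd,) = struct.unpack_from(endian + "I", data, offset + 2 + entry_count * 12)
    return entries, next_ifd

# A hand-built IFD with one entry: ImageWidth (0x0100), SHORT, count 1, value 640.
ifd = (
    struct.pack("<H", 1)
    + struct.pack("<HHI", 0x0100, 3, 1)
    + struct.pack("<HH", 640, 0)             # inline value, left-justified
    + struct.pack("<I", 0)                   # no next IFD
)
entries, next_ifd = read_ifd(ifd, 0)
print(entries, next_ifd)
```

Keeping the value field as raw bytes defers the inline-versus-offset decision until the type and count are known.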
Tag Types
TIFF is strongly typed. Each tag specifies not just what it contains, but how to interpret the bytes. This is why you need to understand the type system to parse TIFF correctly.
| ID | Name | Size | Description |
|---|---|---|---|
| 1 | BYTE | 1 | Unsigned 8-bit integer |
| 2 | ASCII | 1 | NUL-terminated string (7-bit ASCII, one byte per character) |
| 3 | SHORT | 2 | Unsigned 16-bit integer |
| 4 | LONG | 4 | Unsigned 32-bit integer |
| 5 | RATIONAL | 8 | Two LONGs: numerator/denominator |
| 6 | SBYTE | 1 | Signed 8-bit integer |
| 7 | UNDEFINED | 1 | Raw bytes (binary data) |
| 8 | SSHORT | 2 | Signed 16-bit integer |
| 9 | SLONG | 4 | Signed 32-bit integer |
| 10 | SRATIONAL | 8 | Two SLONGs: signed num/denom |
| 11 | FLOAT | 4 | IEEE 32-bit float |
| 12 | DOUBLE | 8 | IEEE 64-bit double |
| 13 | IFD | 4 | Pointer to nested IFD |
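One way to act on the type table is to map each type ID to a `struct` format code. The sketch below is an assumption-laden illustration: `STRUCT_CODES` and `decode_values` are hypothetical names, rationals are returned as `(numerator, denominator)` pairs to avoid dividing by zero, and ASCII handling is simplified to a single string.

```python
import struct

# struct format codes for the fixed-width numeric types in the table above.
STRUCT_CODES = {1: "B", 3: "H", 4: "I", 6: "b", 8: "h",
                9: "i", 11: "f", 12: "d", 13: "I"}

def decode_values(type_id: int, count: int, payload: bytes, endian: str = "<"):
    """Decode `count` values of the given TIFF type from `payload`."""
    if type_id == 2:                      # ASCII: drop the trailing NUL
        return payload[:count].rstrip(b"\x00").decode("ascii", errors="replace")
    if type_id == 7:                      # UNDEFINED: hand back raw bytes
        return bytes(payload[:count])
    if type_id in (5, 10):                # RATIONAL / SRATIONAL
        code = "I" if type_id == 5 else "i"
        nums = struct.unpack_from(endian + code * (2 * count), payload)
        return [(nums[i], nums[i + 1]) for i in range(0, len(nums), 2)]
    return list(struct.unpack_from(endian + STRUCT_CODES[type_id] * count, payload))

print(decode_values(3, 2, struct.pack("<HH", 8, 8)))      # two SHORTs
print(decode_values(5, 1, struct.pack("<II", 72, 1)))     # one RATIONAL
print(decode_values(2, 6, b"Canon\x00"))                  # ASCII string
```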
Value Storage Rule
If size × count ≤ 4 bytes, the value is stored directly in the 4-byte value field. Otherwise, the field contains an offset pointing to where the data is stored in the file.
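The rule can be sketched in a few lines of Python. `TYPE_SIZES` mirrors the Size column of the table above, and `locate_value` is a hypothetical helper; inline values occupy the first bytes of the field (they are left-justified).

```python
import struct

# Bytes per value, keyed by type ID (from the tag type table).
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 6: 1, 7: 1,
              8: 2, 9: 4, 10: 8, 11: 4, 12: 8, 13: 4}

def locate_value(type_id: int, count: int, raw_field: bytes, endian: str = "<"):
    """Return ("inline", value_bytes) or ("offset", file_offset)."""
    total = TYPE_SIZES[type_id] * count
    if total <= 4:
        # Value lives in the field itself, left-justified.
        return "inline", raw_field[:total]
    # Otherwise the field is a 4-byte offset to the data.
    (offset,) = struct.unpack(endian + "I", raw_field)
    return "offset", offset

print(locate_value(3, 1, b"\x80\x02\x00\x00"))        # one SHORT: 2 bytes, inline
print(locate_value(5, 1, struct.pack("<I", 200)))     # one RATIONAL: 8 bytes, offset
```

Note that a 4-character ASCII value (three characters plus the NUL) still fits inline, while a 5-character one does not.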
IFD Offset Calculations
The hardest part of parsing TIFF is following the offset chain. Every offset is measured from the start of the TIFF header (byte 0 of a standalone file, or the start of the embedded TIFF block inside a JPEG), and you'll be jumping around a lot. Here's how the math works:
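IFD0_offset = u32(bytes[4:8]) // From TIFF header, decoded in the file's byte order
entry_count = u16(bytes[IFD0_offset : IFD0_offset + 2])
next_ptr_pos = IFD0_offset + 2 + (entry_count × 12) // Position of the 4-byte next-IFD pointer
next_IFD = u32(bytes[next_ptr_pos : next_ptr_pos + 4]) // 0 means there are no more IFDs
subIFD_offset = tag_value // From pointer tag (0x8769, 0x8825)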
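The offset arithmetic above can be sketched as a small loop that walks the main IFD chain. `walk_ifd_chain` is a hypothetical name, and the `seen` set is an added safeguard (not something the format requires) against files whose next-IFD pointers form a cycle.

```python
import struct

def walk_ifd_chain(data: bytes):
    """Return [(ifd_offset, entry_count), ...] for the main IFD chain."""
    if data[:2] == b"II":
        endian = "<"
    elif data[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF byte-order mark")
    (offset,) = struct.unpack_from(endian + "I", data, 4)   # IFD0 offset
    seen, chain = set(), []
    while offset != 0 and offset not in seen:
        seen.add(offset)
        (entry_count,) = struct.unpack_from(endian + "H", data, offset)
        chain.append((offset, entry_count))
        # The next-IFD pointer is stored after the count and the entries.
        next_ptr_pos = offset + 2 + entry_count * 12
        (offset,) = struct.unpack_from(endian + "I", data, next_ptr_pos)
    return chain

# Hand-built file: header, IFD0 at offset 8, IFD1 at offset 26.
entry = struct.pack("<HHI4s", 0x0100, 3, 1, struct.pack("<HH", 640, 0))
data = (
    b"II*\x00" + struct.pack("<I", 8)
    + struct.pack("<H", 1) + entry + struct.pack("<I", 26)
    + struct.pack("<H", 1) + entry + struct.pack("<I", 0)
)
print(walk_ifd_chain(data))  # [(8, 1), (26, 1)]
```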
Important Tags
Image Data Tags (Preserved)
0x0100: ImageWidth
0x0101: ImageLength
0x0102: BitsPerSample
0x0103: Compression
0x0111: StripOffsets
0x0117: StripByteCounts
Metadata Tags (Removed)
0x010F: Make (camera manufacturer)
0x0110: Model (camera model)
0x0132: DateTime
0x013B: Artist
0x8298: Copyright
0x9003: DateTimeOriginal (stored in the EXIF SubIFD)
Pointer Tags (Sub-IFDs)
0x8769: ExifIFDPointer
0x8825: GPSInfoIFDPointer
0xA005: InteroperabilityIFDPointer (stored in the EXIF SubIFD)
SubIFDs
The main IFD (IFD0) often contains "pointer tags" that reference sub-directories. This is how EXIF, GPS, and other metadata gets organized into logical groups while keeping the main IFD clean.
EXIF SubIFD
Contains camera-specific metadata: exposure, aperture, ISO, focal length, flash, white balance, and more.
GPS SubIFD
Contains geolocation data: latitude, longitude, altitude, timestamp, satellites, speed, direction. Major privacy concern.
Maker Notes
Proprietary camera data. Format varies by manufacturer. May contain serial numbers, firmware version, custom settings.
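To show how the pointer tags lead to these sub-directories, here is a minimal sketch that scans one IFD for the EXIF and GPS pointers. `POINTER_TAGS` and `find_sub_ifds` are hypothetical names; the sketch assumes each pointer is stored as a single LONG (type 4) or IFD (type 13) value.

```python
import struct

POINTER_TAGS = {0x8769: "Exif", 0x8825: "GPS"}

def find_sub_ifds(data: bytes, ifd_offset: int, endian: str = "<"):
    """Return {name: sub_ifd_offset} for pointer tags found in one IFD."""
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    found = {}
    for i in range(entry_count):
        pos = ifd_offset + 2 + i * 12
        # For a single LONG, the value field is the value itself.
        tag, type_id, count, value = struct.unpack_from(endian + "HHII", data, pos)
        if tag in POINTER_TAGS and type_id in (4, 13) and count == 1:
            found[POINTER_TAGS[tag]] = value
    return found

# Hand-built IFD0: ImageWidth plus EXIF and GPS pointers (offsets are made up).
ifd0 = (
    struct.pack("<H", 3)
    + struct.pack("<HHII", 0x0100, 4, 1, 640)
    + struct.pack("<HHII", 0x8769, 4, 1, 100)
    + struct.pack("<HHII", 0x8825, 4, 1, 200)
    + struct.pack("<I", 0)
)
print(find_sub_ifds(ifd0, 0))  # {'Exif': 100, 'GPS': 200}
```

Each returned offset points at another IFD with the same count, entries, next-pointer layout, so the same entry reader can be reused on it.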
How PicScrub Processes TIFF
Parse Header
Determine byte order (II/MM) and locate first IFD
Build Tag Whitelist
Identify essential tags for image rendering
Filter IFD Entries
Remove metadata tags, zero out SubIFD pointers
Rewrite File
Reconstruct TIFF with clean IFDs and copied image data
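The filtering step can be illustrated with a short sketch. This is not PicScrub's actual code: `WHITELIST` contains only the image data tags listed in this document (a real whitelist would need more rendering tags), `filter_ifd` is a hypothetical name, and out-of-line offsets are copied unchanged here rather than recalculated.

```python
import struct

# Assumption: only the "preserved" tags named in this document.
WHITELIST = {0x0100, 0x0101, 0x0102, 0x0103, 0x0111, 0x0117}

def filter_ifd(entries, whitelist=WHITELIST, endian="<"):
    """Keep whitelisted (tag, type, count, raw_field) entries and
    serialize them as a standalone IFD with no next-IFD pointer."""
    kept = sorted((e for e in entries if e[0] in whitelist), key=lambda e: e[0])
    out = struct.pack(endian + "H", len(kept))
    for tag, type_id, count, raw in kept:
        out += struct.pack(endian + "HHI4s", tag, type_id, count, raw)
    out += struct.pack(endian + "I", 0)   # terminate the chain
    return kept, out

sample = [
    (0x0100, 3, 1, b"\x80\x02\x00\x00"),          # ImageWidth: kept
    (0x010F, 2, 6, struct.pack("<I", 200)),       # Make: dropped
    (0x8769, 4, 1, struct.pack("<I", 300)),       # ExifIFDPointer: dropped
]
kept, ifd_bytes = filter_ifd(sample)
print([hex(e[0]) for e in kept], len(ifd_bytes))
```

Dropping a pointer tag entirely, as done here, is one alternative to zeroing its value; either way the sub-directory it referenced is no longer reachable from the new IFD.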
Offset Recalculation
TIFF files use file offsets extensively. When metadata is removed, PicScrub must recalculate all offsets to maintain file integrity.
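As a rough sketch of what recalculation involves (an illustration under stated assumptions, not PicScrub's implementation), the function below lays out an IFD followed by its out-of-line values and writes fresh offsets into the value fields. `rebuild_ifd` is a hypothetical name, and each entry is assumed to carry its fully resolved value bytes. Strip data is not handled: the values inside StripOffsets are themselves file offsets and would need the same treatment once the pixel data is copied.

```python
import struct

def rebuild_ifd(entries, endian: str = "<", ifd_offset: int = 8) -> bytes:
    """entries: (tag, type_id, count, payload_bytes) with resolved payloads.
    Returns the IFD bytes followed by the out-of-line data region."""
    entries = sorted(entries, key=lambda e: e[0])
    # First free byte after the IFD: count + entries + next-IFD pointer.
    data_pos = ifd_offset + 2 + 12 * len(entries) + 4
    ifd = struct.pack(endian + "H", len(entries))
    blob = b""
    for tag, type_id, count, payload in entries:
        if len(payload) <= 4:
            field = payload.ljust(4, b"\x00")        # inline, left-justified
        else:
            field = struct.pack(endian + "I", data_pos + len(blob))
            blob += payload
            if len(blob) % 2:                        # keep values word-aligned
                blob += b"\x00"
        ifd += struct.pack(endian + "HHI4s", tag, type_id, count, field)
    ifd += struct.pack(endian + "I", 0)
    return ifd + blob

resolution = struct.pack("<II", 72, 1)
out = rebuild_ifd([
    (0x0100, 3, 1, struct.pack("<H", 640)),   # ImageWidth, fits inline
    (0x011A, 5, 1, resolution),               # XResolution, stored out of line
])
(new_offset,) = struct.unpack_from("<I", out, 2 + 12 + 8)
print(new_offset, len(out))
```

Because the new offset is computed from the rebuilt layout rather than copied from the original file, it stays valid no matter how many entries were removed.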