A comprehensive reference guide to file magic bytes (file signatures)
Identify file types by their binary signatures, not just extensions
Magic bytes (also known as file signatures or magic numbers) are short byte sequences, usually at the very start of a file, that identify its format. Unlike a file extension, which anyone can change by renaming the file, magic bytes are part of the file's binary content, so they are a more reliable basis for file type identification. They are not infallible: some formats share a signature, and a few store it at a fixed offset rather than at byte 0.
| Use Case | Description |
|---|---|
| 🔒 Security | Detect disguised malware (e.g., an .exe renamed to .jpg) |
| 🔍 Digital Forensics | Recover and identify files without extensions |
| 🛡️ File Validation | Verify uploaded files match their claimed type |
| 🔧 Development | Build robust file handling in applications |
| 📁 Data Recovery | Carve files from damaged storage media |
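
As an illustration of the file validation use case, the sketch below compares a file's claimed extension with its leading bytes. The `EXPECTED_SIGNATURES` allow-list and the `matches_claimed_type` helper are hypothetical names introduced for this example; the signatures themselves come from the tables in this guide.

```python
from pathlib import Path

# Hypothetical allow-list: extension -> acceptable leading bytes.
EXPECTED_SIGNATURES = {
    ".jpg": (b"\xFF\xD8\xFF",),
    ".jpeg": (b"\xFF\xD8\xFF",),
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".gif": (b"GIF87a", b"GIF89a"),
    ".pdf": (b"%PDF-",),
}


def matches_claimed_type(filename: str, header: bytes) -> bool:
    """Return True if header starts with a signature allowed for the extension."""
    ext = Path(filename).suffix.lower()
    signatures = EXPECTED_SIGNATURES.get(ext)
    if signatures is None:
        return False  # unknown extension: reject rather than guess
    return header.startswith(signatures)


# An .exe renamed to .jpg still begins with 'MZ', so the check fails.
print(matches_claimed_type("photo.jpg", b"MZ\x90\x00\x03\x00"))
print(matches_claimed_type("photo.jpg", b"\xFF\xD8\xFF\xE0\x00\x10"))
```

A matching signature only shows that the header looks right; it says nothing about the rest of the file, so a check like this is best treated as a first filter.
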
- 🖼️ Images
- 🎬 Video & Audio
- 📄 Documents
- 📦 Archives
- ⚙️ Executables
- 🐍 Python Usage Example
- 📚 Resources
- 🤝 Contributing
| Format | Extension(s) | Hex Signature | ASCII/String | Notes |
|---|---|---|---|---|
| JPEG | .jpg, .jpeg | `FF D8 FF E0` | `ÿØÿà` | JFIF format |
| JPEG | .jpg, .jpeg | `FF D8 FF E1` | `ÿØÿá` | EXIF format |
| JPEG | .jpg, .jpeg | `FF D8 FF DB` | `ÿØÿÛ` | Raw JPEG |
| PNG | .png | `89 50 4E 47 0D 0A 1A 0A` | `‰PNG....` | Portable Network Graphics |
| GIF87a | .gif | `47 49 46 38 37 61` | `GIF87a` | Original GIF |
| GIF89a | .gif | `47 49 46 38 39 61` | `GIF89a` | GIF with animation support |
| BMP | .bmp, .dib | `42 4D` | `BM` | Windows Bitmap |
| WebP | .webp | `52 49 46 46 ?? ?? ?? ?? 57 45 42 50` | `RIFF....WEBP` | Google WebP format |
| TIFF (LE) | .tif, .tiff | `49 49 2A 00` | `II*.` | Little-endian TIFF |
| TIFF (BE) | .tif, .tiff | `4D 4D 00 2A` | `MM.*` | Big-endian TIFF |
| ICO | .ico | `00 00 01 00` | `....` | Windows Icon |
| PSD | .psd | `38 42 50 53` | `8BPS` | Adobe Photoshop |
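
Signatures such as WebP's contain `??` wildcards for bytes that vary from file to file (here, the RIFF chunk size). A minimal sketch of matching such a pattern follows; `compile_signature` and `matches` are illustrative helpers, not part of any library.

```python
def compile_signature(pattern: str) -> list:
    """Turn '52 49 ?? 46' into [0x52, 0x49, None, 0x46]; None matches any byte."""
    return [None if tok == "??" else int(tok, 16) for tok in pattern.split()]


def matches(data: bytes, pattern: str, offset: int = 0) -> bool:
    """Check whether data matches the hex pattern starting at offset."""
    sig = compile_signature(pattern)
    window = data[offset:offset + len(sig)]
    if len(window) < len(sig):
        return False  # not enough bytes to compare
    return all(want is None or want == got for want, got in zip(sig, window))


WEBP = "52 49 46 46 ?? ?? ?? ?? 57 45 42 50"

print(matches(b"RIFF\x24\x00\x00\x00WEBPVP8 ", WEBP))  # True
print(matches(b"RIFF\x24\x00\x00\x00WAVEfmt ", WEBP))  # False
```
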
| Format | Extension(s) | Hex Signature | ASCII/String | Notes |
|---|---|---|---|---|
| MP4 | .mp4, .m4v, .m4a | `00 00 00 ?? 66 74 79 70` | `....ftyp` | MPEG-4 container |
| MP4 (isom) | .mp4 | `00 00 00 ?? 66 74 79 70 69 73 6F 6D` | `....ftypisom` | ISO Base Media |
| MP4 (M4A) | .m4a | `00 00 00 ?? 66 74 79 70 4D 34 41 20` | `....ftypM4A` | Apple Audio |
| MP3 | .mp3 | `FF FB` | `ÿû` | MPEG Audio Layer III |
| MP3 | .mp3 | `FF F3` | `ÿó` | MPEG Audio Layer III |
| MP3 | .mp3 | `FF F2` | `ÿò` | MPEG Audio Layer III |
| MP3 (ID3v2) | .mp3 | `49 44 33` | `ID3` | MP3 with ID3v2 tag |
| WAV | .wav | `52 49 46 46 ?? ?? ?? ?? 57 41 56 45` | `RIFF....WAVE` | Waveform Audio |
| AVI | .avi | `52 49 46 46 ?? ?? ?? ?? 41 56 49 20` | `RIFF....AVI` | Audio Video Interleave |
| MKV | .mkv, .webm | `1A 45 DF A3` | `.Eß£` | Matroska container |
| FLV | .flv | `46 4C 56 01` | `FLV.` | Flash Video |
| FLAC | .flac | `66 4C 61 43` | `fLaC` | Free Lossless Audio Codec |
| OGG | .ogg, .oga, .ogv | `4F 67 67 53` | `OggS` | Ogg container |
| WMV/WMA | .wmv, .wma, .asf | `30 26 B2 75 8E 66 CF 11` | `0&²u.fÏ.` | Windows Media |
| MIDI | .mid, .midi | `4D 54 68 64` | `MThd` | Musical Instrument Digital |
| MOV | .mov, .qt | `00 00 00 ?? 66 74 79 70 71 74 20 20` | `....ftypqt` | QuickTime Movie |
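
The MP4, M4A, and MOV rows differ only in the four bytes that follow `ftyp`, the so-called major brand. The sketch below reads that brand from a header; `read_ftyp_brand` and the `BRAND_HINTS` mapping are hypothetical names, and the mapping covers only the brands listed in the table above.

```python
from typing import Optional

# Brands taken from the table above; real files use many more.
BRAND_HINTS = {
    "isom": "MP4 (ISO Base Media)",
    "M4A ": "MPEG-4 audio (M4A)",
    "qt  ": "QuickTime movie",
}


def read_ftyp_brand(header: bytes) -> Optional[str]:
    """Return the 4-character major brand if header begins with an 'ftyp' box."""
    # Bytes 0-3 hold the box size, bytes 4-7 the box type, bytes 8-11 the brand.
    if len(header) < 12 or header[4:8] != b"ftyp":
        return None
    return header[8:12].decode("ascii", errors="replace")


brand = read_ftyp_brand(b"\x00\x00\x00\x14ftypqt  \x00\x00\x00\x00")
print(repr(brand), "->", BRAND_HINTS.get(brand, "unrecognised brand"))
```
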
| Format | Extension(s) | Hex Signature | ASCII/String | Notes |
|---|---|---|---|---|
| PDF | .pdf | `25 50 44 46 2D` | `%PDF-` | Portable Document Format |
| DOCX | .docx | `50 4B 03 04` | `PK..` | Word (Office Open XML) |
| XLSX | .xlsx | `50 4B 03 04` | `PK..` | Excel (Office Open XML) |
| PPTX | .pptx | `50 4B 03 04` | `PK..` | PowerPoint (Office Open XML) |
| DOC | .doc | `D0 CF 11 E0 A1 B1 1A E1` | `ÐÏ.à¡±.á` | Word (OLE Compound) |
| XLS | .xls | `D0 CF 11 E0 A1 B1 1A E1` | `ÐÏ.à¡±.á` | Excel (OLE Compound) |
| PPT | .ppt | `D0 CF 11 E0 A1 B1 1A E1` | `ÐÏ.à¡±.á` | PowerPoint (OLE Compound) |
| RTF | .rtf | `7B 5C 72 74 66 31` | `{\rtf1` | Rich Text Format |
| ODT | .odt | `50 4B 03 04` | `PK..` | OpenDocument Text |
| ODS | .ods | `50 4B 03 04` | `PK..` | OpenDocument Spreadsheet |
| EPUB | .epub | `50 4B 03 04` | `PK..` | Electronic Publication |
| XML | .xml | `3C 3F 78 6D 6C 20` | `<?xml` | Extensible Markup Language |
| HTML | .html, .htm | `3C 21 44 4F 43 54 59 50 45` | `<!DOCTYPE` | HTML Document |

⚠️ Note: DOCX, XLSX, PPTX, ODT, ODS, and EPUB all share the same magic bytes (`50 4B 03 04`) because they are ZIP-based archives. To differentiate them, you need to examine the archive contents (e.g., `[Content_Types].xml` for Office formats, or the `mimetype` entry for OpenDocument and EPUB).
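
One way to look inside such a container is Python's standard `zipfile` module. The sketch below is a heuristic, not a full parser: `classify_zip_container` and `build_zip` are hypothetical helpers, and the `word/`, `xl/`, and `ppt/` folder names are the conventional Office layout rather than something the format guarantees.

```python
import io
import zipfile

# MIME types stored in the 'mimetype' member of OpenDocument and EPUB files.
MIMETYPE_LABELS = {
    "application/vnd.oasis.opendocument.text": "ODT",
    "application/vnd.oasis.opendocument.spreadsheet": "ODS",
    "application/epub+zip": "EPUB",
}


def classify_zip_container(data: bytes) -> str:
    """Guess the document type of a ZIP-based file from its member names."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = set(zf.namelist())
        if "mimetype" in names:
            mimetype = zf.read("mimetype").decode("ascii", errors="replace").strip()
            return MIMETYPE_LABELS.get(mimetype, f"ZIP container ({mimetype})")
        if "[Content_Types].xml" in names:
            # Assumption: Office packages keep their parts in a per-format folder.
            for prefix, label in (("word/", "DOCX"), ("xl/", "XLSX"), ("ppt/", "PPTX")):
                if any(name.startswith(prefix) for name in names):
                    return label
            return "Office Open XML (unknown kind)"
        return "Plain ZIP archive"


def build_zip(members: dict) -> bytes:
    """Build an in-memory ZIP for the demo below (illustrative helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, body in members.items():
            zf.writestr(name, body)
    return buf.getvalue()


docx_like = build_zip({"[Content_Types].xml": "<Types/>", "word/document.xml": "<w/>"})
print(classify_zip_container(docx_like))  # DOCX
```
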
| Format | Extension(s) | Hex Signature | ASCII/String | Notes |
|---|---|---|---|---|
| ZIP | .zip | `50 4B 03 04` | `PK..` | Standard ZIP archive |
| ZIP (empty) | .zip | `50 4B 05 06` | `PK..` | Empty ZIP archive |
| ZIP (spanned) | .zip | `50 4B 07 08` | `PK..` | Spanned ZIP archive |
| RAR v1.5+ | .rar | `52 61 72 21 1A 07 00` | `Rar!...` | RAR archive v1.5-4.x |
| RAR v5.0+ | .rar | `52 61 72 21 1A 07 01 00` | `Rar!....` | RAR archive v5.0+ |
| 7-Zip | .7z | `37 7A BC AF 27 1C` | `7z¼¯'.` | 7-Zip archive |
| GZIP | .gz, .tar.gz | `1F 8B 08` | `...` | GNU Zip |
| TAR | .tar | `75 73 74 61 72` (offset 257) | `ustar` | Tape Archive |
| BZIP2 | .bz2 | `42 5A 68` | `BZh` | BZIP2 compressed |
| XZ | .xz | `FD 37 7A 58 5A 00` | `ý7zXZ.` | XZ compressed |
| ZSTD | .zst | `28 B5 2F FD` | `(µ/ý` | Zstandard compressed |
| LZ4 | .lz4 | `04 22 4D 18` | `."M.` | LZ4 compressed |
| CAB | .cab | `4D 53 43 46` | `MSCF` | Microsoft Cabinet |
| ISO | .iso | `43 44 30 30 31` (offset 32769) | `CD001` | ISO 9660 image |
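
TAR and ISO are the two entries above whose signature does not sit at byte 0, so reading only the first few bytes will miss them. A minimal sketch of checking fixed offsets with `seek` follows; `OFFSET_SIGNATURES` and `identify_by_offset` are names made up for this example.

```python
import io
from typing import BinaryIO, Optional

# (offset, signature, label) for formats whose magic is not at byte 0.
OFFSET_SIGNATURES = [
    (257, b"ustar", "TAR archive"),
    (32769, b"CD001", "ISO 9660 image"),
]


def identify_by_offset(stream: BinaryIO) -> Optional[str]:
    """Seek to each known offset and compare the bytes found there."""
    for offset, signature, label in OFFSET_SIGNATURES:
        stream.seek(offset)
        if stream.read(len(signature)) == signature:
            return label
    return None  # stream too short or no signature matched


# Demo on an in-memory stand-in for a 512-byte tar header block.
fake_tar = bytearray(512)
fake_tar[257:262] = b"ustar"
print(identify_by_offset(io.BytesIO(bytes(fake_tar))))  # TAR archive
```

The same function works on a real file opened with `open(path, 'rb')`, since it only relies on `seek` and `read`.
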
| Format | Extension(s) | Hex Signature | ASCII/String | Notes |
|---|---|---|---|---|
| EXE/DLL (MZ) | .exe, .dll, .sys | `4D 5A` | `MZ` | DOS/Windows Executable |
| ELF | (none), .so, .o | `7F 45 4C 46` | `.ELF` | Linux/Unix Executable |
| Mach-O (32-bit) | (none), .dylib | `FE ED FA CE` | `þíúÎ` | macOS Executable (32-bit) |
| Mach-O (64-bit) | (none), .dylib | `FE ED FA CF` | `þíúÏ` | macOS Executable (64-bit) |
| Mach-O (Universal) | (none), .dylib | `CA FE BA BE` | `Êþº¾` | macOS Universal Binary |
| Java Class | .class | `CA FE BA BE` | `Êþº¾` | Java bytecode |
| DEX | .dex | `64 65 78 0A 30 33 35 00` | `dex.035.` | Android Dalvik Executable |
| WebAssembly | .wasm | `00 61 73 6D` | `.asm` | WebAssembly binary |
| COM | .com | n/a | n/a | No standard signature |
| Python Bytecode | .pyc | Varies by version | n/a | Version-dependent magic |
| Shell Script | .sh | `23 21` | `#!` | Shebang (e.g., `#!/bin/bash`) |

💡 Tip: Java Class files and Mach-O Universal binaries share the same magic bytes (`CA FE BA BE`). Context (file extension, platform) is needed to differentiate them.
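
Besides extension and platform, the four bytes after the signature can offer a hint. In a Java class file they hold the minor and major version (two big-endian 16-bit values), while in a Mach-O universal binary they hold a big-endian 32-bit architecture count. The sketch below relies on the assumption that Java major versions start at 45 and that a universal binary contains fewer than 45 architectures; `classify_cafebabe` is a hypothetical helper and the threshold is a heuristic, not a rule from either format.

```python
import struct

CAFEBABE = b"\xCA\xFE\xBA\xBE"
# Assumption: values below the first Java major version (45) are read as a
# Mach-O architecture count; anything else is read as a class-file version.
MIN_JAVA_MAJOR = 45


def classify_cafebabe(header: bytes) -> str:
    """Heuristically tell a Java class file from a Mach-O universal binary."""
    if len(header) < 8 or header[:4] != CAFEBABE:
        return "not a CAFEBABE file"
    (value,) = struct.unpack(">I", header[4:8])
    if value < MIN_JAVA_MAJOR:
        return f"Mach-O universal binary ({value} architectures)"
    _minor, major = struct.unpack(">HH", header[4:8])
    return f"Java class file (major version {major})"


print(classify_cafebabe(CAFEBABE + b"\x00\x00\x00\x34"))  # Java class file (major version 52)
print(classify_cafebabe(CAFEBABE + b"\x00\x00\x00\x02"))  # Mach-O universal binary (2 architectures)
```
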
Here's a practical Python script to detect file types using magic bytes:

```python
#!/usr/bin/env python3
"""
Magic Bytes File Type Detector
Identify file types by their binary signatures.
"""
import sys

# Dictionary of file signatures (magic bytes) found at offset 0
MAGIC_SIGNATURES = {
    # Images
    b'\xFF\xD8\xFF': 'JPEG Image',
    b'\x89PNG\r\n\x1a\n': 'PNG Image',
    b'GIF87a': 'GIF Image (87a)',
    b'GIF89a': 'GIF Image (89a)',
    b'BM': 'BMP Image',
    b'RIFF': 'RIFF Container (WAV/AVI/WebP)',  # fallback for unknown form types
    # Documents
    b'%PDF-': 'PDF Document',
    b'PK\x03\x04': 'ZIP Archive / Office Document (DOCX/XLSX/PPTX)',
    b'\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1': 'Microsoft Office (DOC/XLS/PPT)',
    b'{\\rtf1': 'RTF Document',
    # Archives
    b'Rar!\x1a\x07\x00': 'RAR Archive (v1.5-4.x)',
    b'Rar!\x1a\x07\x01\x00': 'RAR Archive (v5.0+)',
    b'7z\xBC\xAF\x27\x1C': '7-Zip Archive',
    b'\x1F\x8B\x08': 'GZIP Archive',
    b'BZh': 'BZIP2 Archive',
    # Executables
    b'MZ': 'Windows Executable (EXE/DLL)',
    b'\x7FELF': 'ELF Executable (Linux)',
    b'\xFE\xED\xFA\xCE': 'Mach-O Executable (32-bit)',
    b'\xFE\xED\xFA\xCF': 'Mach-O Executable (64-bit)',
    b'\xCA\xFE\xBA\xBE': 'Java Class / Mach-O Universal',
    # Audio/Video
    b'ID3': 'MP3 Audio (ID3 tag)',
    b'\xFF\xFB': 'MP3 Audio',
    b'\xFF\xF3': 'MP3 Audio',
    b'\xFF\xF2': 'MP3 Audio',
    b'fLaC': 'FLAC Audio',
    b'OggS': 'OGG Container',
    b'\x1A\x45\xDF\xA3': 'Matroska Video (MKV/WebM)',
}

# RIFF form types, stored at bytes 8-11 after the 4-byte chunk size
RIFF_FORMS = {
    b'WEBP': 'WebP Image',
    b'WAVE': 'WAV Audio',
    b'AVI ': 'AVI Video',
}


def identify_file(filepath: str, read_bytes: int = 32) -> str:
    """
    Identify a file's type by reading its magic bytes.

    Args:
        filepath: Path to the file to identify
        read_bytes: Number of bytes to read from the start (default: 32)

    Returns:
        String describing the detected file type
    """
    try:
        with open(filepath, 'rb') as f:
            header = f.read(read_bytes)

        if not header:
            return "Empty file"

        # RIFF containers: the form type at bytes 8-11 names the actual format
        if header.startswith(b'RIFF') and header[8:12] in RIFF_FORMS:
            return RIFF_FORMS[header[8:12]]

        # Check against known signatures
        for signature, file_type in MAGIC_SIGNATURES.items():
            if header.startswith(signature):
                return file_type

        # Special case: MP4 and MOV files ('ftyp' at offset 4)
        if header[4:8] == b'ftyp':
            return "MP4/MOV Video"

        # Special case: check for text/script files
        if header.startswith(b'#!'):
            return "Shell Script"
        if header.startswith(b'<?xml'):
            return "XML Document"
        if header.lower().startswith((b'<!doctype', b'<html')):
            return "HTML Document"

        return "Unknown file type"

    except FileNotFoundError:
        return f"Error: File '{filepath}' not found"
    except PermissionError:
        return f"Error: Permission denied for '{filepath}'"
    except OSError as e:
        return f"Error: {e}"


def print_hex_dump(filepath: str, num_bytes: int = 16) -> None:
    """
    Print a hex dump of the first N bytes of a file.

    Args:
        filepath: Path to the file
        num_bytes: Number of bytes to display (default: 16)
    """
    try:
        with open(filepath, 'rb') as f:
            data = f.read(num_bytes)

        hex_str = ' '.join(f'{b:02X}' for b in data)
        # Non-printable and non-ASCII bytes are shown as '.'
        ascii_str = ''.join(chr(b) if 32 <= b < 127 else '.' for b in data)

        print(f"\n📁 File: {filepath}")
        print(f"🔢 Hex: {hex_str}")
        print(f"📝 ASCII: {ascii_str}")
        print(f"🎯 Type: {identify_file(filepath)}")
    except OSError as e:
        print(f"Error: {e}")


# Example usage
if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Check files provided as command-line arguments
        for filepath in sys.argv[1:]:
            print_hex_dump(filepath)
            print("-" * 50)
    else:
        # Show usage when no files are given
        print("🔮 Magic Bytes File Type Detector")
        print("=" * 40)
        print("\nUsage: python magic_bytes.py <file1> [file2] ...")
        print("\nExample:")
        print("  python magic_bytes.py image.jpg document.pdf archive.zip")
        print("\nOr import and use in your code:")
        print('  from magic_bytes import identify_file')
        print('  file_type = identify_file("myfile.bin")')
```

For a quick look at a file's first 16 bytes:

```python
# Quick check for file signature
with open('file.bin', 'rb') as f:
    print(' '.join(f'{b:02X}' for b in f.read(16)))
```

Example output of the script for a JFIF image:

```
📁 File: example.jpg
🔢 Hex: FF D8 FF E0 00 10 4A 46 49 46 00 01 01 00 00 01
📝 ASCII: ......JFIF......
🎯 Type: JPEG Image
```

- Gary Kessler's File Signatures Table - Comprehensive database
- Wikipedia: List of file signatures - General reference
- File Signatures (Forensics Wiki) - Digital forensics perspective
- python-magic - Python library using libmagic
Contributions are welcome! If you'd like to add more file signatures or improve the documentation:
- Fork this repository
- Create a feature branch (`git checkout -b add-new-signatures`)
- Add your changes
- Submit a Pull Request
Please ensure any new signatures include:
- Format name
- Common file extensions
- Verified hex signature
- ASCII representation (if applicable)
- Any relevant notes
⭐ Star this repo if you found it useful! ⭐
Made with ❤️ for the security & development community