Magic Numbers
Sometimes we need to scan a disk at a low level to determine the files that it contains. One method is to look for standard signatures (magic numbers), which are normally fixed byte sequences at the start of a file. I’ve tried to gather as many of these signatures as possible for key file types (see Table 1). For example, an Adobe Illustrator file should start with the hex sequence 0x25, 0x50, 0x44, 0x46 (the ASCII characters %PDF), which is the same signature as a standard PDF file. If we scan a disk and find this signature, the file may thus be an Illustrator file or a PDF document.
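As a minimal sketch of the scanning idea, the following searches a block of raw bytes for a few of the signatures from Table 1 and reports the offset of each hit. The names SIGNATURES and find_signatures, and the toy "disk image", are assumptions made for illustration and not part of any particular forensic tool:

```python
# Hypothetical signature set for illustration; the byte values come from Table 1.
SIGNATURES = {
    "pdf/ai": bytes.fromhex("25504446"),
    "png": bytes.fromhex("89504E470D0A1A0A"),
    "zip": bytes.fromhex("504B0304"),
}


def find_signatures(data: bytes, signatures=SIGNATURES):
    """Return a sorted list of (offset, label) for every signature found in data."""
    hits = []
    for label, magic in signatures.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, label))
            start = data.find(magic, start + 1)
    return sorted(hits)


# A toy "disk image": padding, a PDF header, more padding, then a ZIP header.
image = b"\x00" * 16 + b"%PDF-1.4" + b"\x00" * 8 + b"PK\x03\x04" + b"\x00" * 4
print(find_signatures(image))  # [(16, 'pdf/ai'), (32, 'zip')]
```

A hit only suggests that a file may start at that offset. Short signatures can also occur by chance inside other data, so each hit would normally be checked against the rest of the file format.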
PNG File
PNG files provide a high-quality, losslessly compressed bitmapped (raster) graphics format. They have a magic number of 0x89 0x50 0x4E 0x47 0x0D 0x0A 0x1A 0x0A. The following gives a sample listing for a real PNG file:
http://www.profsims.com/information/png?file=bg.png
The starting part of the file shows the magic number:
[00000000] 89 50 4E 47 0D 0A 1A 0A .PNG....
[00000008] 00 00 00 0D 49 48 44 52 ....IHDR
[00000016] 00 00 00 F3 00 00 00 C3 ........
[00000024] 08 06 00 00 00 57 8C 27 .....W.'
[00000032] 92 00 00 00 04 67 41 4D .....gAM
[00000040] 41 00 00 AF C8 37 05 8A A....7..
[00000048] E9 00 00 00 19 74 45 58 .....tEX
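The bytes that follow the magic number in this listing form the IHDR chunk, which holds the image dimensions. A small sketch of how they could be decoded is given below. The function name read_png_header is an assumption for illustration, and the sample bytes are copied from the first four rows of the listing:

```python
import struct

PNG_MAGIC = bytes.fromhex("89504E470D0A1A0A")


def read_png_header(data: bytes):
    """Check the PNG signature and return (width, height, bit_depth, colour_type)."""
    if data[:8] != PNG_MAGIC:
        raise ValueError("not a PNG signature")
    # Each chunk starts with a 4-byte big-endian length and a 4-byte type.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a valid IHDR")
    # IHDR data begins with width, height (32 bits each), bit depth, colour type.
    return struct.unpack(">IIBB", data[16:26])


# First 32 bytes of the listing above.
sample = bytes.fromhex(
    "89504E470D0A1A0A" "0000000D49484452"
    "000000F3000000C3" "0806000000578C27"
)
print(read_png_header(sample))  # (243, 195, 8, 6)
```

For the sample file this gives a 243 x 195 pixel image with 8 bits per channel and colour type 6 (RGB with alpha).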
GIF File
The GIF file format uses a file signature of 0x47 0x49 0x46 0x38 0x39 0x61 (GIF89a) in the first six bytes of the file (older files use GIF87a). After this, the key fields are Width (16 bits, little-endian), Height (16 bits, little-endian), Packed (8 bits), Background Color Index (8 bits) and Aspect (8 bits), followed by a colour table of up to 256 24-bit colours. This means that each colour is defined with good resolution, but an image can only use 256 different colours at a time, which limits its scope. For example, it is not good for photographs, as these typically need thousands of colours.
A sample analysis is:
http://www.profsims.com/information/gif?file=cat01_with_hidden_text.gif
An example header is then:
[00000000] 47 49 46 38 39 61 64 00 GIF89ad.
[00000008] 55 00 E6 00 00 FF FF FF U.......
[00000016] F7 F7 F6 F1 F4 F2 EE EE ........
[00000024] EF E7 E7 E7 E1 E4 E6 DF ........
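The header fields in this listing can be unpacked with a few lines of code. The sketch below is illustrative only: the function name read_gif_header and the dictionary keys are assumptions, and the sample bytes are the first two rows of the listing. The low three bits of the Packed field give the size of the global colour table as 2^(n+1) entries:

```python
import struct


def read_gif_header(data: bytes):
    """Decode the GIF signature and the fields that follow it."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF signature")
    # Width and height are 16-bit little-endian; the rest are single bytes.
    width, height, packed, background, aspect = struct.unpack("<HHBBB", data[6:13])
    has_table = bool(packed & 0x80)  # top bit: global colour table present
    entries = 2 ** ((packed & 0x07) + 1) if has_table else 0
    return {
        "version": data[3:6].decode("ascii"),
        "width": width,
        "height": height,
        "background_index": background,
        "aspect": aspect,
        "colour_table_entries": entries,
    }


# First 16 bytes of the listing above.
sample = bytes.fromhex("474946383961" "64005500E60000" "FFFFFF")
print(read_gif_header(sample))
```

For the sample, the image is 100 x 85 pixels and the Packed value of 0xE6 indicates a global colour table of 128 entries, with the first entry being white (FF FF FF).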
Note that I have added a covert message (the ASCII text hello) into the colour table, which changes the colour of only a few pixels:
[00000048] A1 CC CC CC C4 C8 CC 68 .......h
[00000056] 65 6C 6C 6F C0 D1 C6 84 ello....
[00000064] C0 BF BD BD BB B8 B8 B6 ........
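One simple way to spot text hidden like this is to look for runs of printable ASCII characters in the raw bytes. The following is a rough sketch of that idea and not the method used to produce the listing. The function name printable_runs and its parameters are assumptions for illustration:

```python
import re


def printable_runs(data: bytes, min_len: int = 4, base_offset: int = 0):
    """Return (offset, text) for each run of at least min_len printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [
        (base_offset + match.start(), match.group().decode("ascii"))
        for match in pattern.finditer(data)
    ]


# The three rows of the listing above, which start at file offset 48.
colour_table_part = bytes.fromhex(
    "A1CCCCCCC4C8CC68" "656C6C6FC0D1C684" "C0BFBDBDBBB8B8B6"
)
print(printable_runs(colour_table_part, base_offset=48))  # [(55, 'hello')]
```

An ordinary colour table can contain printable byte values by chance, so short runs are weak evidence on their own; a longer minimum length reduces false positives.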
PKZIP File
The PKZIP file format is used to compress files and, potentially, to encrypt them. It can be identified by the magic number 0x504B0304 (PK followed by 0x03 0x04) at the start of the file, followed by a fairly structured format of:
Version: 14 00
General purpose bit flag: 02 00
Compression method: 08 00
File last modification time: 80 9D
File last modification date: 6C 39
CRC: DA4DB80F
Compressed size: 90010000
Uncompressed size: 27060000
File name length: 0900
Extra field length: 0000
Filename: anim.xaml
The following shows an example with a real file:
http://www.profsims.com/information/zip?file=anim.zip
where we see the following at the start of the file:
[00000000] 50 4B 03 04 14 00 02 00 PK......
[00000008] 08 00 80 9D 6C 39 DA 4D ....l9.M
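The field list above maps directly onto a fixed 30-byte little-endian structure, so it can be decoded with the struct module. The sketch below is illustrative: the name read_local_header and the dictionary keys are assumptions, and the sample is rebuilt from the field values listed above. The modification time and date use the packed MS-DOS format:

```python
import struct
from datetime import datetime

# Signature, version, flags, method, time, date, CRC, sizes, name/extra lengths.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")  # 30 bytes in total


def read_local_header(data: bytes):
    """Decode the first local file header of a ZIP archive."""
    (sig, version, flags, method, dos_time, dos_date,
     crc, comp_size, uncomp_size, name_len, extra_len) = LOCAL_HEADER.unpack_from(data)
    if sig != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    name = data[30:30 + name_len].decode("cp437")
    # MS-DOS date: 7 bits year since 1980, 4 bits month, 5 bits day.
    # MS-DOS time: 5 bits hour, 6 bits minute, 5 bits seconds / 2.
    modified = datetime(
        1980 + (dos_date >> 9), (dos_date >> 5) & 0x0F, dos_date & 0x1F,
        dos_time >> 11, (dos_time >> 5) & 0x3F, (dos_time & 0x1F) * 2,
    )
    return {
        "version": version, "flags": flags, "method": method,
        "modified": modified, "crc": crc,
        "compressed_size": comp_size, "uncompressed_size": uncomp_size,
        "extra_length": extra_len, "name": name,
    }


sample = bytes.fromhex(
    "504B0304" "1400" "0200" "0800" "809D" "6C39"
    "DA4DB80F" "90010000" "27060000" "0900" "0000"
) + b"anim.xaml"
print(read_local_header(sample))
```

For the sample, this gives compression method 8 (deflate), a compressed size of 400 bytes, an uncompressed size of 1575 bytes, and a modification time of 12 November 2008 at 19:44.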
An interesting fact is that Office 2010 files, such as DOCX, XLSX, and so on, are stored as XML documents inside a PKZIP-compressed container. This can be seen with:
http://www.profsims.com/information/docx?file=hello.docx
[00000000] 50 4B 03 04 14 00 06 00 PK......
[00000008] 08 00 00 00 21 00 09 24 ....!..$
[00000016] 87 82 81 01 00 00 8E 05 ........
[00000024] 00 00 13 00 08 02 5B 43 ......[C
[00000032] 6F 6E 74 65 6E 74 5F 54 ontent_T
[00000040] 79 70 65 73 5D 2E 78 6D ypes].xm
[00000048] 6C 20 A2 04 02 28 A0 00 l....(..
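Since the magic number alone cannot separate a DOCX file from any other ZIP archive, one possible approach is to open the archive and look at the names of the entries inside it, such as the [Content_Types].xml entry seen in the listing above. The sketch below uses Python's zipfile module. The function name office_kind and the folder-name heuristic (word/, xl/, ppt/) are assumptions for illustration, and the test archive is built in memory rather than taken from a real Office file:

```python
import io
import zipfile


def office_kind(data: bytes):
    """Guess whether ZIP data is a DOCX, XLSX or PPTX file, or a plain ZIP."""
    if not data.startswith(b"PK\x03\x04"):
        return None
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = set(archive.namelist())
    if "[Content_Types].xml" not in names:
        return "zip"
    # Heuristic: the main document parts normally sit in these folders.
    for folder, kind in (("word/", "docx"), ("xl/", "xlsx"), ("ppt/", "pptx")):
        if any(name.startswith(folder) for name in names):
            return kind
    return "unknown office package"


def make_zip(entries):
    """Build a small in-memory ZIP archive from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in entries.items():
            archive.writestr(name, content)
    return buffer.getvalue()


fake_docx = make_zip({"[Content_Types].xml": b"<Types/>", "word/document.xml": b"<doc/>"})
plain_zip = make_zip({"anim.xaml": b"<Canvas/>"})
print(office_kind(fake_docx), office_kind(plain_zip))  # docx zip
```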
Table 1: Magic file numbers
Description | Extension | Magic Number |
Adobe Illustrator | .ai | 25 50 44 46 [%PDF] |
Bitmap graphic | .bmp | 42 4D [BM] |
Class File | .class | CA FE BA BE |
JPEG graphic file | .jpg | FF D8 |
JPEG 2000 graphic file | .jp2 | 00 00 00 0C 6A 50 20 20 0D 0A [....jP  ..] |
GIF graphic file | .gif | 47 49 46 38 [GIF8] |
TIF graphic file | .tif | 49 49 [II] |
PNG graphic file | .png | 89 50 4E 47 [.PNG] |
Photoshop Graphics | .psd | 38 42 50 53 [8BPS] |
Windows Meta File | .wmf | D7 CD C6 9A |
MIDI file | .mid | 4D 54 68 64 [MThd] |
Icon file | .ico | 00 00 01 00 |
MP3 file with ID3 identity tag | .mp3 | 49 44 33 [ID3] |
AVI video file | .avi | 52 49 46 46 [RIFF] |
Flash Shockwave | .swf | 46 57 53 [FWS] |
Flash Video | .flv | 46 4C 56 [FLV] |
Mpeg 4 video file | .mp4 | 00 00 00 18 66 74 79 70 6D 70 34 32 [….ftypmp42] |
MOV video file | .mov | 6D 6F 6F 76 [moov] |
Windows Video file | .wmv | 30 26 B2 75 8E 66 CF |
Windows Audio file | .wma | 30 26 B2 75 8E 66 CF |
PKZip | .zip | 50 4B 03 04 [PK] |
GZip | .gz | 1F 8B 08 |
Tar file | .tar | 75 73 74 61 72 [ustar] (at offset 257) |
Microsoft Installer | .msi | D0 CF 11 E0 A1 B1 1A E1 |
Object Code File | .obj | 4C 01 |
Dynamic Library | .dll | 4D 5A [MZ] |
CAB Installer file | .cab | 4D 53 43 46 [MSCF] |
Executable file | .exe | 4D 5A [MZ] |
RAR file | .rar | 52 61 72 21 1A 07 00 [Rar!…] |
SYS file | .sys | 4D 5A [MZ] |
Help file | .hlp | 3F 5F 03 00 [?_..] |
VMWare Disk file | .vmdk | 4B 44 4D 56 [KDMV] |
Outlook Post Office file | .pst | 21 42 44 4E 42 [!BDNB] |
PDF Document | .pdf | 25 50 44 46 [%PDF] |
Word Document | .doc | D0 CF 11 E0 A1 B1 1A E1 |
RTF Document | .rtf | 7B 5C 72 74 66 31 [{\rtf1] |
Excel Document | .xls | D0 CF 11 E0 A1 B1 1A E1 |
PowerPoint Document | .ppt | D0 CF 11 E0 A1 B1 1A E1 |
Visio Document | .vsd | D0 CF 11 E0 A1 B1 1A E1 |
DOCX (Office 2010) | .docx | 50 4B 03 04 [PK] |
XLSX (Office 2010) | .xlsx | 50 4B 03 04 [PK] |
PPTX (Office 2010) | .pptx | 50 4B 03 04 [PK] |
Microsoft Database | .mdb | 53 74 61 6E 64 61 72 64 20 4A 65 74 |
PostScript File | .ps | 25 21 [%!] |
Outlook Message File | .msg | D0 CF 11 E0 A1 B1 1A E1 |
EPS File | .eps | 25 21 50 53 2D 41 64 6F 62 65 2D 33 2E 30 20 45 50 53 46 2D 33 2E 30 [%!PS-Adobe-3.0 EPSF-3.0] |
Jar File | .jar | 50 4B 03 04 14 00 08 00 08 00 |
SLN File | .sln | 4D 69 63 72 6F 73 6F 66 74 20 56 69 73 75 61 6C 20 53 74 75 64 69 6F 20 53 6F 6C 75 74 69 6F 6E 20 46 69 6C 65 |
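Table 1 shows that several file types share the same magic number, so a lookup by signature can only return a list of candidate types. A minimal sketch of such a lookup, using a handful of rows from the table, is given below. The names MAGIC_TABLE and candidates are assumptions for illustration:

```python
# A few rows from Table 1, grouped by shared signature (illustrative subset).
MAGIC_TABLE = [
    (bytes.fromhex("D0CF11E0A1B11AE1"), [".doc", ".xls", ".ppt", ".vsd", ".msi", ".msg"]),
    (bytes.fromhex("504B0304"), [".zip", ".docx", ".xlsx", ".pptx", ".jar"]),
    (bytes.fromhex("25504446"), [".pdf", ".ai"]),
    (bytes.fromhex("89504E47"), [".png"]),
    (bytes.fromhex("47494638"), [".gif"]),
    (bytes.fromhex("4D5A"), [".exe", ".dll", ".sys"]),
]


def candidates(header: bytes):
    """Return the extensions whose signature matches the start of header."""
    for magic, extensions in MAGIC_TABLE:
        if header.startswith(magic):
            return extensions
    return []


print(candidates(b"MZ\x90\x00"))       # ['.exe', '.dll', '.sys']
print(candidates(b"%PDF-1.7\n"))       # ['.pdf', '.ai']
print(candidates(b"plain text file"))  # []
```

Narrowing the candidates down further needs format-specific checks beyond the magic number, such as inspecting the entries inside a ZIP container.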