File Forensics with Signatures

Magic Numbers

Sometimes we need to scan a disk at a low level and determine which files it contains. One method is to look for standard signatures, which are normally fixed byte sequences at the start of a file. I’ve tried to gather as many of these signatures as possible for key file types (see Table 1). For example, an Adobe Illustrator file should start with the hex sequence 0x25, 0x50, 0x44, 0x46 (the ASCII characters %PDF), which is the same signature as a standard PDF file. If we scan a disk and find this signature, the file may thus be either a PDF document or an Illustrator file.
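As a minimal sketch of this kind of scan (not the tool used for the listings in this article), the following Python searches a byte buffer for a few of the signatures from Table 1. The function name find_signatures, the labels, and the tiny fake "disk image" are all assumptions made for illustration; a real scan would read the raw device in blocks.

```python
# Illustrative subset of signatures from Table 1; the labels are hypothetical.
SIGNATURES = {
    "pdf/ai": bytes.fromhex("25504446"),            # %PDF
    "png": bytes.fromhex("89504E470D0A1A0A"),       # .PNG....
    "zip": bytes.fromhex("504B0304"),               # PK..
}


def find_signatures(image: bytes, signatures=SIGNATURES):
    """Return sorted (offset, label) pairs for every signature hit in image."""
    hits = []
    for label, magic in signatures.items():
        pos = image.find(magic)
        while pos != -1:
            hits.append((pos, label))
            pos = image.find(magic, pos + 1)
    return sorted(hits)


if __name__ == "__main__":
    # A fake 'disk image': padding, a PDF header, padding, a ZIP header.
    image = b"\x00" * 16 + b"%PDF-1.4" + b"\x00" * 8 + b"PK\x03\x04"
    for offset, label in find_signatures(image):
        print(f"{offset:08d}  {label}")
```

A hit only says that the bytes match. Short signatures will also appear by chance inside unrelated data, so each hit is a candidate that still needs checking.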

PNG File

PNG files provide a high-quality, losslessly compressed bitmapped (raster) graphics format. They have a magic number of 0x89 0x50 0x4E 0x47 0x0D 0x0A 0x1A 0x0A. The following gives a sample listing for a real PNG file:

http://www.profsims.com/information/png?file=bg.png

The starting part of the file shows the magic number:

[00000000] 89 50 4E 47 0D 0A 1A 0A   .PNG....
[00000008] 00 00 00 0D 49 48 44 52   ....IHDR
[00000016] 00 00 00 F3 00 00 00 C3   ........
[00000024] 08 06 00 00 00 57 8C 27   .....W.'
[00000032] 92 00 00 00 04 67 41 4D   .....gAM
[00000040] 41 00 00 AF C8 37 05 8A   A....7..
[00000048] E9 00 00 00 19 74 45 58   .....tEX
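The bytes after the magic number can be decoded further. The sketch below checks the signature and then unpacks the IHDR chunk, which the PNG format places first; the function name parse_png_header and the dictionary keys are my own choices for illustration, and the sample bytes are the first 26 bytes of the listing above.

```python
import struct

PNG_MAGIC = bytes.fromhex("89504E470D0A1A0A")


def parse_png_header(data: bytes) -> dict:
    """Check the PNG signature and decode the start of the IHDR chunk."""
    if data[:8] != PNG_MAGIC:
        raise ValueError("not a PNG signature")
    # Chunk layout: 4-byte big-endian length, 4-byte type, then the data.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a 13-byte IHDR")
    width, height, depth, colour_type = struct.unpack(">IIBB", data[16:26])
    return {"width": width, "height": height,
            "bit_depth": depth, "colour_type": colour_type}


# First 26 bytes of the listing above.
sample = bytes.fromhex(
    "89504E470D0A1A0A" "0000000D49484452" "000000F3000000C3" "0806")
print(parse_png_header(sample))
```

For the sample file this gives a width of 243 (0xF3) and a height of 195 (0xC3) pixels, with 8 bits per sample and colour type 6 (RGB with alpha).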

GIF file

The GIF file format uses a file signature of 0x47 0x49 0x46 0x38 0x39 0x61 (GIF89a) in the first six bytes of the file (the older version of the format uses GIF87a instead). After this, the key fields are Width (16 bits), Height (16 bits), Packed (8 bits), Background Colour Index (8 bits) and Aspect (8 bits), followed by a colour table of up to 256 24-bit colours. This means that GIF files have good resolution for the colour of a pixel, but an image can only use up to 256 different colours, which limits its scope. For example, GIF is not good for photographs, as these typically need thousands of colours.

A sample analysis is:

http://www.profsims.com/information/gif?file=cat01_with_hidden_text.gif

which analyses this image:

[Image: cat01]

An example header is then:

[00000000] 47 49 46 38 39 61 64 00   GIF89ad.
[00000008] 55 00 E6 00 00 FF FF FF   U.......
[00000016] F7 F7 F6 F1 F4 F2 EE EE   ........
[00000024] EF E7 E7 E7 E1 E4 E6 DF   ........
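The fields after the signature are little-endian, and can be unpacked with a few lines of Python. This is only a sketch: parse_gif_header and its dictionary keys are names I have made up, and the sample is the first 13 bytes of the header above.

```python
import struct


def parse_gif_header(data: bytes) -> dict:
    """Decode the GIF signature and the screen descriptor that follows it."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF signature")
    # Little-endian: width and height (16 bits each), then three single bytes.
    width, height, packed, background, aspect = struct.unpack(
        "<HHBBB", data[6:13])
    has_table = bool(packed & 0x80)          # top bit: global colour table
    # The low three bits N of the packed byte give a table of 2**(N+1) entries.
    entries = 2 ** ((packed & 0x07) + 1) if has_table else 0
    return {"version": data[3:6].decode("ascii"),
            "width": width, "height": height,
            "background_index": background, "aspect": aspect,
            "colour_table_entries": entries}


# First 13 bytes of the header above.
sample = bytes.fromhex("4749463839616400" "5500E60000")
print(parse_gif_header(sample))
```

For this file the image is 100 by 85 pixels, and the packed byte of 0xE6 indicates a global colour table of 128 entries (384 bytes, starting at offset 13).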

It should be noted that I have added a covert message (the ASCII text hello, stored as the bytes 68 65 6C 6C 6F) into the colour table. This only changes the colour of the few pixels that use the overwritten entries:

[00000048] A1 CC CC CC C4 C8 CC 68   .......h
[00000056] 65 6C 6C 6F C0 D1 C6 84   ello....
[00000064] C0 BF BD BD BB B8 B8 B6   ........
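One simple way to spot this kind of hidden text is to look for runs of printable ASCII inside the binary data, much as a strings tool would. The sketch below is a heuristic under my own assumptions (the name printable_runs and the minimum run length of four are arbitrary choices); colour values can fall in the printable range by chance, so hits still need a human to judge them.

```python
import re


def printable_runs(data: bytes, base_offset: int = 0, min_len: int = 4):
    """Yield (offset, text) for runs of printable ASCII at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield base_offset + match.start(), match.group().decode("ascii")


# The 24 bytes shown above, which start at offset 48 in the file.
sample = bytes.fromhex(
    "A1CCCCCCC4C8CC68" "656C6C6FC0D1C684" "C0BFBDBDBBB8B8B6")
for offset, text in printable_runs(sample, base_offset=48):
    print(offset, text)
```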

PKZIP File

The PKZIP file format is used to compress files and, potentially, to encrypt them. It can be identified by the magic number 0x504B0304 at the start of the file, followed by a fairly structured format (the bytes are shown as stored in the file, with multi-byte values in little-endian order):

Version: 14 00
General purpose bit flag: 02 00
Compression method: 08 00
File last modification time: 80 9D
File last modification date: 6C 39
CRC: DA 4D B8 0F
Compressed size: 90 01 00 00
Uncompressed size: 27 06 00 00
File name length: 09 00
Extra field length: 00 00
Filename: anim.xaml
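The field list above maps directly onto a fixed 30-byte header followed by the file name, so it can be unpacked in one call. The following is a sketch rather than a full ZIP parser: parse_zip_local_header and its dictionary keys are hypothetical names, the sample bytes are assembled from the field values listed above, and the date and time are assumed to use the usual MS-DOS packed format.

```python
import struct
from datetime import datetime


def parse_zip_local_header(data: bytes) -> dict:
    """Decode the first local file header of a ZIP archive."""
    if data[:4] != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    (version, flags, method, dos_time, dos_date, crc,
     comp_size, uncomp_size, name_len, extra_len) = struct.unpack(
        "<HHHHHIIIHH", data[4:30])
    name = data[30:30 + name_len].decode("cp437")
    # MS-DOS packing: date = yyyyyyym mmmddddd, time = hhhhhmmm mmmsssss (s/2).
    modified = datetime(1980 + (dos_date >> 9), (dos_date >> 5) & 0x0F,
                        dos_date & 0x1F, dos_time >> 11,
                        (dos_time >> 5) & 0x3F, (dos_time & 0x1F) * 2)
    return {"version": version, "flags": flags, "method": method,
            "modified": modified, "crc": crc,
            "compressed_size": comp_size, "uncompressed_size": uncomp_size,
            "extra_length": extra_len, "name": name}


# Header bytes built from the field values listed above.
sample = bytes.fromhex(
    "504B0304" "1400" "0200" "0800" "809D" "6C39"
    "DA4DB80F" "90010000" "27060000" "0900" "0000") + b"anim.xaml"
for key, value in parse_zip_local_header(sample).items():
    print(f"{key}: {value}")
```

Read this way, the sample header describes a file called anim.xaml, compressed with method 8 (deflate) from 1575 bytes down to 400 bytes, and last modified on 12 November 2008 at 19:44.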

The following shows an example with a real file:

http://www.profsims.com/information/zip?file=anim.zip

where we see the following at the start of the file:

[00000000] 50 4B 03 04 14 00 02 00   PK......
[00000008] 08 00 80 9D 6C 39 DA 4D   ....l9.M

An interesting fact is that Office 2010 files, such as DOCX, XLSX, and so on, are stored in an XML format that is packaged in a PKZIP compressed container. This can be seen with:

http://www.profsims.com/information/docx?file=hello.docx

[00000000] 50 4B 03 04 14 00 06 00   PK......
[00000008] 08 00 00 00 21 00 09 24   ....!..$
[00000016] 87 82 81 01 00 00 8E 05   ........
[00000024] 00 00 13 00 08 02 5B 43   ......[C
[00000032] 6F 6E 74 65 6E 74 5F 54   ontent_T
[00000040] 79 70 65 73 5D 2E 78 6D   ypes].xm
[00000048] 6C 20 A2 04 02 28 A0 00   l....(..
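Since the magic number alone cannot separate a DOCX file from any other ZIP archive, one option is to look at the names of the entries inside it. The sketch below uses Python's standard zipfile module; the function name classify_zip, the returned labels, and the choice of marker parts (word/document.xml and so on) are my assumptions about typical Office output, not a rule taken from the listing above.

```python
import io
import zipfile

# Parts assumed to be typical of each Office type; illustrative, not exhaustive.
MARKERS = {
    "word/document.xml": "docx",
    "xl/workbook.xml": "xlsx",
    "ppt/presentation.xml": "pptx",
}


def classify_zip(data: bytes) -> str:
    """Guess whether a PKZIP file is a plain archive or an Office document."""
    if data[:4] != b"PK\x03\x04":
        return "not zip"
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return "damaged zip"      # e.g. a carved fragment with no directory
    if "[Content_Types].xml" in names:
        for part, kind in MARKERS.items():
            if part in names:
                return kind
        return "ooxml (unknown type)"
    return "zip"


if __name__ == "__main__":
    # Build a tiny in-memory archive that mimics the layout of a DOCX file.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document/>")
    print(classify_zip(buffer.getvalue()))
```

This relies on the archive's central directory being intact, so it suits complete files better than partially recovered ones.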

Table 1: Magic file numbers

Description                     Extension  Magic Number
Adobe Illustrator               .ai        25 50 44 46 [%PDF]
Bitmap graphic                  .bmp       42 4D [BM]
Class File                      .class     CA FE BA BE
JPEG graphic file               .jpg       FF D8
JPEG 2000 graphic file          .jp2       00 00 00 0C 6A 50 20 20 0D 0A [....jP  ..]
GIF graphic file                .gif       47 49 46 38 [GIF8]
TIF graphic file                .tif       49 49 [II]
PNG graphic file                .png       89 50 4E 47 [.PNG]
Photoshop Graphics              .psd       38 42 50 53 [8BPS]
Windows Meta File               .wmf       D7 CD C6 9A
MIDI file                       .mid       4D 54 68 64 [MThd]
Icon file                       .ico       00 00 01 00
MP3 file with ID3 identity tag  .mp3       49 44 33 [ID3]
AVI video file                  .avi       52 49 46 46 [RIFF]
Flash Shockwave                 .swf       46 57 53 [FWS]
Flash Video                     .flv       46 4C 56 [FLV]
MPEG-4 video file               .mp4       00 00 00 18 66 74 79 70 6D 70 34 32 [....ftypmp42]
MOV video file                  .mov       6D 6F 6F 76 [moov] (at offset 4)
Windows Video file              .wmv       30 26 B2 75 8E 66 CF
Windows Audio file              .wma       30 26 B2 75 8E 66 CF
PKZip                           .zip       50 4B 03 04 [PK..]
GZip                            .gz        1F 8B 08
Tar file                        .tar       75 73 74 61 72 [ustar] (at offset 257)
Microsoft Installer             .msi       D0 CF 11 E0 A1 B1 1A E1
Object Code File                .obj       4C 01
Dynamic Library                 .dll       4D 5A [MZ]
CAB Installer file              .cab       4D 53 43 46 [MSCF]
Executable file                 .exe       4D 5A [MZ]
RAR file                        .rar       52 61 72 21 1A 07 00 [Rar!...]
SYS file                        .sys       4D 5A [MZ]
Help file                       .hlp       3F 5F 03 00 [?_..]
VMWare Disk file                .vmdk      4B 44 4D 56 [KDMV]
Outlook Post Office file        .pst       21 42 44 4E [!BDN]
PDF Document                    .pdf       25 50 44 46 [%PDF]
Word Document                   .doc       D0 CF 11 E0 A1 B1 1A E1
RTF Document                    .rtf       7B 5C 72 74 66 31 [{\rtf1]
Excel Document                  .xls       D0 CF 11 E0 A1 B1 1A E1
PowerPoint Document             .ppt       D0 CF 11 E0 A1 B1 1A E1
Visio Document                  .vsd       D0 CF 11 E0 A1 B1 1A E1
DOCX (Office 2010)              .docx      50 4B 03 04 [PK..]
XLSX (Office 2010)              .xlsx      50 4B 03 04 [PK..]
PPTX (Office 2010)              .pptx      50 4B 03 04 [PK..]
Microsoft Database              .mdb       53 74 61 6E 64 61 72 64 20 4A 65 74 [Standard Jet]
PostScript File                 .ps        25 21 [%!]
Outlook Message File            .msg       D0 CF 11 E0 A1 B1 1A E1
EPS File                        .eps       25 21 50 53 2D 41 64 6F 62 65 2D 33 2E 30 20 45 50 53 46 2D 33 2E 30 [%!PS-Adobe-3.0 EPSF-3.0]
Jar File                        .jar       50 4B 03 04 14 00 08 00 08 00
SLN File                        .sln       4D 69 63 72 6F 73 6F 66 74 20 56 69 73 75 61 6C 20 53 74 75 64 69 6F 20 53 6F 6C 75 74 69 6F 6E 20 46 69 6C 65 [Microsoft Visual Studio Solution File]
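Several rows in Table 1 share a signature (for example, the D0 CF 11 E0 A1 B1 1A E1 sequence is used by DOC, XLS, PPT, VSD, MSI and MSG files), so a lookup is better thought of as returning a set of candidates than a single answer. The sketch below illustrates this with a handful of rows; the names TABLE and candidates are hypothetical, and the offset of 257 for the tar signature comes from the tar header layout rather than from the table.

```python
# (label, offset, magic) triples for a few rows of Table 1.
TABLE = [
    ("doc/xls/ppt/vsd/msi/msg", 0, bytes.fromhex("D0CF11E0A1B11AE1")),
    ("exe/dll/sys", 0, b"MZ"),
    ("zip/docx/xlsx/pptx/jar", 0, b"PK\x03\x04"),
    ("pdf/ai", 0, b"%PDF"),
    ("tar", 257, b"ustar"),   # not at the start of the file
]


def candidates(header: bytes) -> list:
    """Return every label whose magic number matches at its expected offset."""
    return [label for label, offset, magic in TABLE
            if header[offset:offset + len(magic)] == magic]


if __name__ == "__main__":
    print(candidates(bytes.fromhex("D0CF11E0A1B11AE1") + b"\x00" * 8))
    print(candidates(b"\x00" * 257 + b"ustar"))
    print(candidates(b"plain text"))
```

Narrowing a group such as doc/xls/ppt down to one type needs a look deeper into the file's structure, in the same way as the DOCX example above.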
