The File Forensics of a GIF file …

A bit of History

GIF is still one of the most widely used formats for graphics files, even though it has been around for nearly 20 years. In 1987 CompuServe released GIF (Graphics Interchange Format) as a free and open specification. In fact, I remember using CompuServe for my Internet access, just before I switched to AOL (which, as I remember, bought CompuServe and acquired all of its customers). GIF quickly became a standard way to present graphics on the Web, and unfortunately many developers started to write software supporting GIF without even acknowledging the existence of CompuServe. Along with this, GIF used a compression technique called LZW (Lempel-Ziv-Welch), on which Unisys holds a patent.

The GIF format became so successful that, at the end of December 1994, CompuServe Inc. and Unisys Corporation announced that developers would have to pay a license fee in order to continue to use the technology patented by Unisys. This, though, only applied to certain categories of software supporting the GIF format. These first statements caused immediate reactions and some confusion. With all these legal discussions, it is likely that GIF will be replaced, in the future, by formats which do not have any patent or licensing problems, especially PNG. The great strength of GIF over JPEG is that it supports transparent colours (where the colour of the background shows through), which JPEG does not. PNG also supports this.

This led to a great deal of anger (including an article in Time), with statements like:

"The announcement by CompuServe and Unisys that users of the GIF image format must 
 register by January 10 and pay a royalty or face lawsuits for their past usage, 
is the online communications community's equivalent of the sneak attack at Pearl Harbor."

In the end it was ruled that the GIF file format itself cannot be patented, but that the use of the LZW algorithm is patented (by Unisys). So as long as your software does not use LZW, you are not breaching any patents. If it does, you must pay a royalty for its usage.

File Format Analysis

[Images: four sample GIF images, each linked to its forensic analysis on the original page.]

Detailed File Format

The graphics interchange format (GIF) is the copyright of CompuServe Incorporated. Its popularity has increased mainly because of its wide usage on the Internet. CompuServe Incorporated, luckily, has granted a limited, non-exclusive, royalty-free license for the use of GIF (but any software using the GIF format must acknowledge the ownership of the GIF format).

Most graphics software supports the Version 87a or 89a format (the 89a format is an update of the 87a format). Both have the same basic specification:

  • A header with GIF identification.
  • A logical screen descriptor block which defines the size, aspect ratio and color depth of the image plane.
  • A global color table.
  • Data blocks with bitmapped images and the possibility of text overlay.
  • Multiple images, with image sequencing or interlacing. This process is defined in a graphic-rendering block.
  • LZW compressed bitmapped images.

Color tables

Color tables store the color information of part of an image (a local color table) or they can be global (a global table).

Blocks, extensions and scope

Blocks can be classified into three groups: control, graphic-rendering and special purpose. Control blocks contain information used to control the processing of the data stream or information used in setting hardware parameters. The blocks defined by the format include:

  • GIF Header – which contains basic information on the GIF file, such as the version number and the GIF file signature.
  • Logical screen descriptor – which contains information about the active screen display, such as screen width and height, and the aspect ratio.
  • Global color table – which contains up to 256 colors from a palette of 16.7M colors (i.e. 256 colors with 24-bit color information).
  • Data subblocks – which contain the compressed image data.
  • Image descriptor – which defines the image width and height, and its top left coordinate, and indicates whether a local color table follows.
  • Local color table – an optional block which contains local color information for an image. As with the global color table, it has a maximum of 256 colors from a palette of 16.7M.
  • Table-based image data – which contains compressed image data.
  • Graphic control extension – an optional block which has extra graphic-rendering information, such as timing information and transparency.
  • Comment extension – an optional block which contains comments ignored by the decoder.
  • Plain text extension – an optional block which contains textual data.
  • Application extension – which contains application-specific data. This block can be used by a software package to add extra information to the file.
  • Trailer – which defines the end of the GIF data stream.

GIF header

The header is 6 bytes long and identifies the GIF signature and the version number of the chosen GIF specification. Its format is:

  • 3 bytes with the characters ‘G’, ‘I’ and ‘F’.
  • 3 bytes with the version number (such as 87a or 89a). Version numbers are ordered with two digits for the year, followed by a letter (‘a’, ‘b’, and so on).
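The header layout above can be checked with a few lines of Python. This is only a minimal sketch: the function name `parse_gif_header` is my own choice for illustration, and it assumes the first six bytes of the file have already been read into memory.

```python
def parse_gif_header(data: bytes) -> str:
    """Return the version string (such as '87a' or '89a') from a GIF header."""
    if len(data) < 6:
        raise ValueError("a GIF header needs at least 6 bytes")
    signature, version = data[:3], data[3:6]
    if signature != b"GIF":
        raise ValueError("signature is not 'GIF'")
    return version.decode("ascii")


# 47 49 46 38 39 61 is the ASCII text "GIF89a"
print(parse_gif_header(bytes.fromhex("474946383961")))
```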

Logical screen descriptor

The logical screen descriptor appears after the header. Its format is:

  • 2 bytes with the logical screen width (unsigned integer).
  • 2 bytes with the logical screen height (unsigned integer).
  • 1 byte of a packed bit field, with 1 bit for the global color table flag, 3 bits for color resolution, 1 bit for the sort flag and 3 bits for the size of the global color table (a value of N means the table holds 2^(N+1) entries).
  • 1 byte for the background color index.
  • 1 byte for the pixel aspect ratio.
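As a rough illustration of how the packed byte breaks down, the sketch below decodes the seven descriptor bytes with Python's `struct` module. The function name and dictionary keys are my own, and the sample bytes (64 00 55 00 E6 00 00) are the ones that follow the signature in the cat01 header dump later in this post. Multi-byte fields are read as little-endian, which matches that dump (64 00 is a width of 100).

```python
import struct


def parse_logical_screen_descriptor(data: bytes) -> dict:
    """Decode the 7 bytes that follow the 6-byte GIF header."""
    width, height, packed, bg_index, aspect = struct.unpack("<HHBBB", data[:7])
    return {
        "width": width,
        "height": height,
        "global_color_table": bool(packed & 0x80),           # top bit
        "color_resolution_bits": ((packed >> 4) & 0x07) + 1,  # next 3 bits
        "sorted": bool(packed & 0x08),                        # 1 bit
        "global_color_table_entries": 2 ** ((packed & 0x07) + 1),  # low 3 bits
        "background_index": bg_index,
        "pixel_aspect_ratio": aspect,
    }


lsd = parse_logical_screen_descriptor(bytes.fromhex("64005500E60000"))
print(lsd["width"], lsd["height"], lsd["global_color_table_entries"])
```

For this sample the packed byte E6h (11100110 in binary) has the global color table flag set and a size field of 6, so the table holds 2^7 = 128 entries rather than the full 256.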

Global color table

After the header and the logical screen descriptor comes the global color table. It contains up to 256 colors from a palette of 16.7M colors. Each of the colors is defined as a 24-bit color of red (8 bits), green (8 bits) and blue (8 bits). The format in memory is:

RRRRRRRR
GGGGGGGG
BBBBBBBB
RRRRRRRR
GGGGGGGG
BBBBBBBB
:    :
RRRRRRRR
GGGGGGGG
BBBBBBBB

The 24-bit color scheme allows a total of 16777216 (2^24) different colors to be displayed. The table below defines some colors in terms of their RGB (red/green/blue) strength. The format is rrggbbh, where rr is the hexadecimal equivalent for the red component, gg the hexadecimal equivalent for the green component and bb the hexadecimal equivalent for the blue component. For example, in binary:

000000000000000000000000 represents black (000000h)
111111111111111111111111 represents white (FFFFFFh)
011101110111011101110111 represents dark gray (777777h)
111111001110010100000011 represents yellow (FCE503h)
001110100000101101011001 represents purple (3A0B59h)
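The mapping between the raw table bytes and the rrggbbh notation can be sketched as follows. The helper name `color_table_to_hex` is hypothetical, and the sample bytes are the first three entries of the colour table in the cat01 header dump later in this post.

```python
def color_table_to_hex(table: bytes) -> list:
    """Split a color table into rrggbbh strings, one per 3-byte RGB entry."""
    if len(table) % 3:
        raise ValueError("color table length must be a multiple of 3")
    return [table[i:i + 3].hex().upper() + "h" for i in range(0, len(table), 3)]


first_entries = bytes.fromhex("FFFFFF F7F7F6 F1F4F2")
print(color_table_to_hex(first_entries))

# The 24-bit binary form of a color code, e.g. yellow (FCE503h)
print(format(0xFCE503, "024b"))
```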

Table: Hexadecimal colors for 24-bit color representation

Color         Code       Color         Code
White         FFFFFFh    Dark red      C91F16h
Light red     DC640Dh    Orange        F1A60Ah
Yellow        FCE503h    Light green   BED20Fh
Dark green    088343h    Light blue    009DBEh
Dark blue     0D3981h    Purple        3A0B59h
Pink          F3D7E3h    Nearly black  434343h
Dark gray     777777h    Gray          A7A7A7h
Light gray    D4D4D4h    Black         000000h

Image descriptor

After the global color table is the image descriptor. Its format is:

  • 1 byte for the image separator (always 2Ch).
  • 2 bytes for the image left position (unsigned integer).
  • 2 bytes for the image top position (unsigned integer).
  • 2 bytes for the image width (unsigned integer).
  • 2 bytes for the image height (unsigned integer).
  • 1 byte of a packed bit field, with 1 bit for the local color table flag, 1 bit for the interlace flag, 1 bit for the sort flag, 2 reserved bits and 3 bits for the size of the local color table.
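The ten bytes of the image descriptor can be unpacked in the same way as the logical screen descriptor. This is a sketch only: the function name and keys are my own, and the sample bytes are made up for illustration (a 100 x 85 interlaced image at position 0,0 with no local color table), not taken from a real file.

```python
import struct


def parse_image_descriptor(data: bytes) -> dict:
    """Decode a 10-byte image descriptor that starts with the 2Ch separator."""
    separator, left, top, width, height, packed = struct.unpack("<BHHHHB", data[:10])
    if separator != 0x2C:
        raise ValueError("image separator 2Ch not found")
    has_local_table = bool(packed & 0x80)
    return {
        "left": left,
        "top": top,
        "width": width,
        "height": height,
        "local_color_table": has_local_table,
        "interlaced": bool(packed & 0x40),
        "sorted": bool(packed & 0x20),
        # The size field only matters when the local color table flag is set.
        "local_color_table_entries": 2 ** ((packed & 0x07) + 1) if has_local_table else 0,
    }


# Hypothetical descriptor bytes, for illustration only
print(parse_image_descriptor(bytes.fromhex("2C000000006400550040")))
```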

Local color table

The local color table is an optional block which defines the color map for the image whose descriptor precedes it. The format is identical to the global color table, i.e. 3 bytes for each of the colors.

Table-based image data

The table-based image data follows the local color table (or the image descriptor, if there is no local color table). It contains the compressed image data, stored as a series of subblocks of up to 255 bytes. The data consists of an index to the color table (either global or local) for each pixel in the image. As the global (or local) color table has up to 256 entries, the data value (in its uncompressed form) will range from 0 to 255 (8 bits). The format is:

  • 1 byte for the LZW minimum code size, which sets the initial number of bits used in the LZW coding (the first codes are one bit wider than this value, to leave room for the clear and end-of-information codes).
  • N bytes for the LZW compressed image data. Each subblock is preceded by a byte giving its size, and a zero-length subblock terminates the data.
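Reading the subblocks back into one continuous stream is simple enough to sketch. The helper below is my own (not from any GIF library): it starts at the first size byte, joins each subblock's payload, and stops at the zero-length terminator.

```python
def read_sub_blocks(data: bytes, offset: int = 0):
    """Join size-prefixed subblocks; return (payload, offset after terminator)."""
    payload = bytearray()
    while True:
        size = data[offset]          # 1 byte: number of data bytes that follow
        offset += 1
        if size == 0:                # zero-length subblock ends the sequence
            return bytes(payload), offset
        payload += data[offset:offset + size]
        offset += size


# Two subblocks ("abc" and "de"), the terminator, then the 3Bh trailer
stream = b"\x03abc\x02de\x00\x3b"
print(read_sub_blocks(stream))
```

The same helper works for the comment, plain text and application extensions, as their data also ends with the 00h block terminator.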

GIF coding uses the variable-length-code LZW technique, where variable-length codes replace the image data (pixel color references). Unlike Huffman coding, these codes do not come from a predefined code table; they are indexes into a dictionary that is built up as the data is processed. The encoder reads the input and builds a dictionary of the patterns it finds. Every new pattern is entered into the dictionary, and the index of the longest pattern already stored is added to the coded data. When a previously stored pattern is encountered, its dictionary index is added to the coded data. The decoder takes the compressed data and builds a dictionary which is identical to the encoder's. It then replaces each index with the corresponding pattern from the dictionary.

The VLC algorithm uses an initial code size to specify the initial number of bits used for the compression codes. When the number of patterns detected by the encoder exceeds the number of patterns encodable with the current number of bits, the number of bits per LZW code is increased by 1 (up to a maximum of 12 bits in GIF).
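To make the dictionary building and the growing code width more concrete, here is a minimal sketch of a GIF-style LZW decoder. It is an illustration rather than a production decoder: the function name is my own, the input is assumed to be the already-joined subblock payload, and codes are assumed to be packed least-significant bit first.

```python
def lzw_decode(min_code_size: int, data: bytes) -> list:
    """Decode GIF LZW data into a list of color table indexes."""
    clear_code = 1 << min_code_size
    end_code = clear_code + 1

    table = {i: [i] for i in range(clear_code)}
    code_size = min_code_size + 1      # the first codes are one bit wider
    next_code = end_code + 1
    prev = None
    out = []
    bit_buffer = 0
    bit_count = 0

    for byte in data:
        bit_buffer |= byte << bit_count
        bit_count += 8
        while bit_count >= code_size:
            code = bit_buffer & ((1 << code_size) - 1)
            bit_buffer >>= code_size
            bit_count -= code_size

            if code == clear_code:     # reset the dictionary
                table = {i: [i] for i in range(clear_code)}
                code_size = min_code_size + 1
                next_code = end_code + 1
                prev = None
                continue
            if code == end_code:
                return out

            if code in table:
                entry = table[code]
            elif prev is not None and code == next_code:
                entry = prev + [prev[0]]   # pattern not yet in the dictionary
            else:
                raise ValueError("invalid LZW code")
            out.extend(entry)

            if prev is not None and next_code < 4096:
                table[next_code] = prev + [entry[0]]
                next_code += 1
                # Grow the code width once the current one is used up
                if next_code == (1 << code_size) and code_size < 12:
                    code_size += 1
            prev = entry
    return out


# Codes 4,1,6,6 (3 bits each) then 2,9,5 (4 bits each), packed into 3 bytes
print(lzw_decode(2, bytes([0x8C, 0x2D, 0x59])))
```

With a minimum code size of 2, code 4 is the clear code and code 5 marks the end of the data, so the three bytes above decode to the pixel indexes 1, 1, 1, 1, 1, 2, 2, 2.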

Graphic control extension

The graphic control extension is optional and contains information on the rendering of the image that follows. Its format is:

  • 1 byte with the extension identifier (21h).
  • 1 byte with the graphic control label (F9h).
  • 1 byte with the block size, which is the number of bytes following this field up to, but not including, the block terminator. It always has a fixed value of 4.
  • 1 byte with a packed bit field, of which the first 3 bits are reserved, 3 bits define the disposal method, 1 bit defines the user input flag and 1 bit defines the transparent color flag.
  • 2 bytes with the delay time, in hundredths of a second, that the decoder waits before continuing to process the data stream.
  • 1 byte with the transparent color index.
  • 1 byte for the block terminator (00h).
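The eight bytes of the graphic control extension can be decoded as in the sketch below. The function name and keys are my own, and the sample bytes are made up for illustration (a 0.1 second delay with color index 31 marked as transparent).

```python
import struct


def parse_graphic_control_extension(data: bytes) -> dict:
    """Decode an 8-byte graphic control extension (21h F9h 04h ... 00h)."""
    (introducer, label, size, packed,
     delay, transparent_index, terminator) = struct.unpack("<BBBBHBB", data[:8])
    if (introducer, label, size, terminator) != (0x21, 0xF9, 4, 0):
        raise ValueError("not a graphic control extension")
    return {
        "disposal_method": (packed >> 2) & 0x07,    # 3 bits
        "user_input": bool(packed & 0x02),          # 1 bit
        "transparent_color": bool(packed & 0x01),   # 1 bit
        "delay_seconds": delay / 100,               # stored in 1/100 s units
        "transparent_index": transparent_index,
    }


# Hypothetical extension bytes, for illustration only
print(parse_graphic_control_extension(bytes.fromhex("21F904050A001F00")))
```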

Comment extension

The comment extension is optional and contains information which is ignored by the decoder. Its format is:

  • 1 byte with the extension identifier (21h).
  • 1 byte with the comment extension label (FEh).
  • N bytes, with comment data.
  • 1 byte for the block terminator (00h).

Plain text extension

The plain text extension is optional and contains text information. Its format is:

  • 1 byte with the extension identifier (21h).
  • 1 byte with the plain text label (01h).
  • 1 byte with the block size. This is the number of bytes after the block size field up to but not including the beginning of the plain text data block. It always contains the value 12.
  • 2 bytes for the text grid left position.
  • 2 bytes for the text grid top position.
  • 2 bytes for the text width.
  • 2 bytes for the text height.
  • 1 byte for the character cell width.
  • 1 byte for the character cell height.
  • 1 byte for the text foreground color.
  • 1 byte for the text background color.
  • N bytes for the plain text data.
  • 1 byte for the block terminator (00h).

Application extension

The application extension is optional and contains information for application programs. Its format is:

  • 1 byte with the extension identifier (21h).
  • 1 byte with the application extension label (FFh).
  • 1 byte for the block size. This is the number of bytes after the block size field up to but not including the beginning of the application data. It always contains the value 11.
  • 8 bytes for the application identifier.
  • 3 bytes for the application authentication code.
  • N bytes, for the application data.
  • 1 byte for the block terminator (00h).

Trailer

The trailer indicates the end of the GIF file. Its format is:

  • 1 byte identifying the trailer (3Bh).

File Forensics with Signatures

Magic Numbers

Sometimes we need to scan a disk at a low level, and determine the files that are contained on it. One method of determining the files is to look for standard signatures, normally using standard sequences at the start of the file. I’ve tried to gather as many of these signatures as possible for key file types (see Table 1). For example, an Adobe Illustrator file should start with the hex sequence 0x25, 0x50, 0x44, 0x46 (the ASCII characters %PDF), which is the same signature as a standard PDF file. If we scan a disk and find this signature, it may thus be either a PDF file or an Illustrator file.
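A signature check of this kind takes only a few lines. The sketch below uses a handful of the entries from Table 1. The `SIGNATURES` list and the `identify` function are my own names, and a real scanner would need many more signatures, plus checks at offsets other than the start of the file.

```python
# (magic bytes, description) pairs taken from a few rows of Table 1
SIGNATURES = [
    (bytes.fromhex("89504E470D0A1A0A"), "PNG graphic file"),
    (b"GIF87a", "GIF graphic file"),
    (b"GIF89a", "GIF graphic file"),
    (bytes.fromhex("504B0304"), "PKZip (also DOCX, XLSX, PPTX, JAR)"),
    (bytes.fromhex("25504446"), "PDF document or Adobe Illustrator"),
    (bytes.fromhex("FFD8"), "JPEG graphic file"),
]


def identify(data: bytes) -> str:
    """Return a description for the first signature that matches the data."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


print(identify(bytes.fromhex("47494638396164005500")))
print(identify(b"%PDF-1.4"))
```

As the PDF and PKZip entries show, a match only narrows down the possibilities, since several file types share the same magic number.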

PNG File

PNG files provide a high-quality, losslessly compressed bitmapped graphics format. They have a magic number of 0x89 0x50 0x4E 0x47 0x0D 0x0A 0x1A 0x0A. The following gives a sample listing for a real PNG file:

http://www.profsims.com/information/png?file=bg.png

The starting part of the file shows the magic number:

[00000000] 89 50 4E 47 0D 0A 1A 0A   .PNG....
[00000008] 00 00 00 0D 49 48 44 52   ....IHDR
[00000016] 00 00 00 F3 00 00 00 C3   ........
[00000024] 08 06 00 00 00 57 8C 27   .....W.'
[00000032] 92 00 00 00 04 67 41 4D   .....gAM
[00000040] 41 00 00 AF C8 37 05 8A   A....7..
[00000048] E9 00 00 00 19 74 45 58   .....tEX

GIF file

The GIF file format uses a file signature of 0x47 0x49 0x46 0x38 0x39 0x61 (GIF89a) in the first six bytes of the file. After this, the key fields are Width (16 bits), Height (16 bits), Packed (8 bits), Background Color Index (8 bits) and Aspect (8 bits), followed by a colour table of up to 256 24-bit colours. This means that each GIF colour is defined with good resolution, but an image can only use 256 different colours at a time, which limits its scope. For example, it is not good for photographs, as these typically need thousands of colours.

A sample analysis is:

http://www.profsims.com/information/gif?file=cat01_with_hidden_text.gif

which analyses this image:

[Image: cat01]

An example header is then:

[00000000] 47 49 46 38 39 61 64 00   GIF89ad.
[00000008] 55 00 E6 00 00 FF FF FF   U.......
[00000016] F7 F7 F6 F1 F4 F2 EE EE   ........
[00000024] EF E7 E7 E7 E1 E4 E6 DF   ........

It should be noted that I have added a covert message into the colour table (which will only affect a few pixels – where a few pixels change their colour):

[00000048] A1 CC CC CC C4 C8 CC 68   .......h
[00000056] 65 6C 6C 6F C0 D1 C6 84   ello....
[00000064] C0 BF BD BD BB B8 B8 B6   ........
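One simple way to spot this kind of hidden text is to look for runs of printable ASCII characters in data that should be more or less random colour values. The sketch below does this for the 24 bytes shown above. The function name and the minimum run length of 4 are my own choices, and a short run can of course also occur by chance.

```python
import re


def printable_runs(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII (20h to 7Eh) of at least min_length bytes."""
    pattern = re.compile(rb"[\x20-\x7E]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Bytes at offsets 48 to 71 of the colour table dump above
colour_table_part = bytes.fromhex(
    "A1CCCCCC C4C8CC68 656C6C6F C0D1C684 C0BFBDBD BBB8B8B6"
)
print(printable_runs(colour_table_part))
```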

PKZIP File

The PKZIP file format is used to compress files and, potentially, encrypt them. It can be identified by the magic number 0x504B0304 at the start of the file, followed by a fairly structured format of:

Version: 14 00
General purpose bit flag: 02 00
Compression method: 08 00
File last modification time: 80 9D
File last modification date: 6C 39
CRC: DA4DB80F
Compressed size: 90010000
Uncompressed size: 27060000
File name length: 0900
Extra field length: 0000
Filename: anim.xaml
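The fields listed above can be unpacked with Python's `struct` module, as in the sketch below. The function name and keys are my own. The sample is built from the byte values in the listing, read as little-endian, and the date and time are decoded on the assumption that they use the packed MS-DOS format that ZIP files normally use.

```python
import struct


def parse_zip_local_header(data: bytes) -> dict:
    """Decode the 30-byte ZIP local file header and the filename after it."""
    (magic, version, flags, method, mod_time, mod_date, crc,
     comp_size, uncomp_size, name_len, extra_len) = struct.unpack(
        "<4sHHHHHIIIHH", data[:30])
    if magic != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    return {
        "version": version,
        "flags": flags,
        "method": method,
        # MS-DOS packed date and time: year offset from 1980, seconds / 2
        "modified": (1980 + (mod_date >> 9), (mod_date >> 5) & 0x0F,
                     mod_date & 0x1F, mod_time >> 11,
                     (mod_time >> 5) & 0x3F, (mod_time & 0x1F) * 2),
        "crc": crc,
        "compressed_size": comp_size,
        "uncompressed_size": uncomp_size,
        "extra_length": extra_len,
        "filename": data[30:30 + name_len].decode("ascii", errors="replace"),
    }


sample = bytes.fromhex(
    "504B0304 1400 0200 0800 809D 6C39 DA4DB80F 90010000 27060000 0900 0000"
) + b"anim.xaml"
print(parse_zip_local_header(sample))
```

For these values the compressed size 90 01 00 00 is 400 bytes and the uncompressed size 27 06 00 00 is 1575 bytes.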

The following shows an example with a real file:

http://www.profsims.com/information/zip?file=anim.zip

where we see the following at the start of the file:

[00000000] 50 4B 03 04 14 00 02 00   PK......
[00000008] 08 00 80 9D 6C 39 DA 4D   ....l9.M

An interesting fact is that Office 2010 files, such as DOCX and XLSX, are stored as XML documents inside a PKZIP-compressed container. This can be seen with:

http://www.profsims.com/information/docx?file=hello.docx

[00000000] 50 4B 03 04 14 00 06 00   PK......
[00000008] 08 00 00 00 21 00 09 24   ....!..$
[00000016] 87 82 81 01 00 00 8E 05   ........
[00000024] 00 00 13 00 08 02 5B 43   ......[C
[00000032] 6F 6E 74 65 6E 74 5F 54   ontent_T
[00000040] 79 70 65 73 5D 2E 78 6D   ypes].xm
[00000048] 6C 20 A2 04 02 28 A0 00   l....(..

Table 1: Magic file numbers

Description Extension Magic Number
Adobe Illustrator .ai 25 50 44 46 [%PDF]
Bitmap graphic .bmp 42 4D [BM]
Class File .class CA FE BA BE
JPEG graphic file .jpg FF D8
JPEG 2000 graphic file .jp2 00 00 00 0C 6A 50 20 20 0D 0A [….jP..]
GIF graphic file .gif 47 49 46 38 [GIF8]
TIF graphic file .tif 49 49 [II]
PNG graphic file .png 89 50 4E 47 [.PNG]
Photoshop Graphics .psd 38 42 50 53 [8BPS]
Windows Meta File .wmf D7 CD C6 9A
MIDI file .mid 4D 54 68 64 [MThd]
Icon file .ico 00 00 01 00
MP3 file with ID3 identity tag .mp3 49 44 33 [ID3]
AVI video file .avi 52 49 46 46 [RIFF]
Flash Shockwave .swf 46 57 53 [FWS]
Flash Video .flv 46 4C 56 [FLV]
Mpeg 4 video file .mp4 00 00 00 18 66 74 79 70 6D 70 34 32 [….ftypmp42]
MOV video file .mov 6D 6F 6F 76 [….moov]
Windows Video file .wmv 30 26 B2 75 8E 66 CF
Windows Audio file .wma 30 26 B2 75 8E 66 CF
PKZip .zip 50 4B 03 04 [PK]
GZip .gz 1F 8B 08
Tar file .tar 75 73 74 61 72 [ustar]
Microsoft Installer .msi D0 CF 11 E0 A1 B1 1A E1
Object Code File .obj 4C 01
Dynamic Library .dll 4D 5A [MZ]
CAB Installer file .cab 4D 53 43 46 [MSCF]
Executable file .exe 4D 5A [MZ]
RAR file .rar 52 61 72 21 1A 07 00 [Rar!…]
SYS file .sys 4D 5A [MZ]
Help file .hlp 3F 5F 03 00 [?_..]
VMWare Disk file .vmdk 4B 44 4D 56 [KDMV]
Outlook Post Office file .pst 21 42 44 4E 42 [!BDNB]
PDF Document .pdf 25 50 44 46 [%PDF]
Word Document .doc D0 CF 11 E0 A1 B1 1A E1
RTF Document .rtf 7B 5C 72 74 66 31 [{\rtf1]
Excel Document .xls D0 CF 11 E0 A1 B1 1A E1
PowerPoint Document .ppt D0 CF 11 E0 A1 B1 1A E1
Visio Document .vsd D0 CF 11 E0 A1 B1 1A E1
DOCX (Office 2010) .docx 50 4B 03 04 [PK]
XLSX (Office 2010) .xlsx 50 4B 03 04 [PK]
PPTX (Office 2010) .pptx 50 4B 03 04 [PK]
Microsoft Database .mdb 53 74 61 6E 64 61 72 64 20 4A 65 74
PostScript File .ps 25 21 [%!]
Outlook Message File .msg D0 CF 11 E0 A1 B1 1A E1
EPS File .eps 25 21 50 53 2D 41 64 6F 62 65 2D 33 2E 30 20 45 50 53 46 2D 33 20 30
Jar File .jar 50 4B 03 04 14 00 08 00 08 00
SLN File .sln 4D 69 63 72 6F 73 6F 66 74 20 56 69 73 75 61 6C 20 53 74 75 64 69 6F 20 53 6F 6C 75 74 69 6F 6E 20 46 69 6C 65