PROPERTY LISTS IN DIGITAL FORENSICS

Author: Jerome Carter

Alex Caithness CCL-Forensics December 2010

CCL-Forensics Limited Tel: 01789 261200 Email: [email protected] www.ccl-forensics.com


Abstract

Property Lists (often shortened to “Plists”) are one of the favoured data storage formats used in Apple products. The data held within them can be of high evidential value, so an understanding of the format is essential during the forensic investigation of iPhones, iPads and Apple’s desktops and laptops. This document sets out to explain the structure and meaning of Property Lists and examine some of the challenges that they can present.

History

Property Lists originate from the NeXTSTEP platform. NeXT was Apple Inc. CEO Steve Jobs’ company between his two stints at Apple itself and had produced an effective object oriented operating system (NeXTSTEP), in addition to a range of desktop machines and a framework for web applications (WebObjects). When, in 1996, Apple Computer Inc. acquired NeXT (and in doing so re-acquired Jobs), NeXTSTEP was to become the foundation of Apple’s new operating system OS X, and the concept of Property Lists persisted.

The original property list format was designed to be a human readable and editable representation of serialised data, and as such was encoded in ASCII and had a format similar to a programming language (it bears more than a passing resemblance to JSON, JavaScript Object Notation). OS X still understands this format; however, it has been deprecated in favour of two new formats: an XML based format and a binary encoded format.

XML Format

Overview

The XML format (officially known as xml1) introduced in OS X overcame a number of the NeXTSTEP property list format’s shortcomings, namely the encoding of non-ASCII characters (the files are usually encoded using UTF-8) and better representation of data-types: NeXTSTEP’s format only encoded strings and binary data, whereas the XML format dealt natively with strings, integers, real numbers, Booleans, dates and binary data.
Structure

The XML format is defined by Apple in a public DTD (Document Type Definition), which can be found in Appendix A; however, it is probably more useful to see an example of a Plist file (see Figure 1).


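<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
    <dict>
        <key>Created</key>
        <real>297262420.38746202</real>
        <key>Domain</key>
        <string>.apple.com</string>
        <key>Expires</key>
        <date>2011-06-03T12:53:44Z</date>
        <key>Name</key>
        <string>X-Dsid</string>
        <key>Path</key>
        <string>/WebObjects</string>
        <key>Value</key>
        <string>1334998668</string>
    </dict>
    <dict>
        <key>Created</key>
        <real>300106178.69718999</real>
        <key>Domain</key>
        <string>.apple.com</string>
        <key>Expires</key>
        <date>2010-08-06T09:49:35Z</date>
        <key>Name</key>
        <string>Pod</string>
        <key>Path</key>
        <string>/</string>
        <key>Value</key>
        <string>18</string>
    </dict>
</array>
</plist>

Figure 1: Extract from Cookies.plist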

All well-formed XML Plist files should begin with <?xml ...?> and <!DOCTYPE ...> tags defining the XML version, encoding and the DTD (as mentioned above). The root node of the document should always be <plist>, and it is this element which will contain the data in the document (if any). This structure allows us to define a regular expression for an XML property list containing data, which could then be used to search for Plists in an arbitrary block of data (see Figure 2).

<\?xml[^>]*\?>\n<!DOCTYPE plist[^>]*>\n<plist[^>]*>(\n|.)*</plist>

Figure 2: Regular Expression for an XML Property list
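The search described above can be sketched in a few lines of Python. This is an illustration rather than a tested carving tool: the pattern is a tolerant variant written for this example (it accepts any attributes inside the three opening tags and any whitespace between them), and the function name find_xml_plists is hypothetical.

```python
import re

# Assumed pattern: tolerant of attribute and DTD variations; the non-greedy
# body stops at the first closing tag so adjacent plists stay separate.
XML_PLIST_RE = re.compile(
    rb"<\?xml[^>]*\?>\s*<!DOCTYPE plist[^>]*>\s*<plist[^>]*>.*?</plist>",
    re.DOTALL,
)


def find_xml_plists(blob: bytes):
    """Return a list of (offset, raw_bytes) for each candidate XML plist in blob."""
    return [(m.start(), m.group(0)) for m in XML_PLIST_RE.finditer(blob)]
```

Each hit gives the byte offset at which the candidate begins, which can be recorded alongside the recovered data. A non-greedy body is used here so that two property lists lying close together in unallocated space are not merged into one match.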

If there is to be more than one element of data in the property list (as in our example) the first child of the <plist> element should be one of the two collection elements [i]: <array> or <dict>. Array is straightforward: an ordered collection of other data type elements. Arrays in property lists are not ‘typed’, i.e. they may contain a mixture of different data types (so any number of strings, reals, etc., and even other collections, can co-exist in a single array). In our example the <array> element contains two <dict> elements.

“Dict” (Dictionary) is the other possible collection type in a property list, and rather than a straight collection of other elements, it contains a collection of key-value pairs. The key-value pairs are represented by a <key> element containing a string which is the key (it is possible, although unusual, for the key to contain an empty string) followed directly by another element which contains the value (as with an array, the element could be any of the data types, including another collection). It is important to note that, as in a ‘normal’ dictionary data structure, it is not valid to have multiple instances of the same key inside a dictionary, which affords us some safety when we eventually come to parsing data from a property list dictionary.

Other Primitive Data Types

In addition to the two collection types there are a number of “primitive” data types which can be directly represented in an XML property list. As this is an XML based file they are all represented in string form, encoded as defined in the <?xml ...?> tag. The data types and their associated XML elements are as follows:

• Boolean: <true/> and <false/> are the two Boolean values. It is of note that these are the only two elements defined by the DTD (see Appendix A) which are explicitly empty; they cannot contain any other data
• Integer: <integer> contains a possibly signed base-10 integer number
• Real: <real> contains a possibly signed floating point number using “e notation”
• String: <string> contains a text string
• Data: <data> contains binary data encoded as Base-64 [ii]
• Date: <date> contains a date in conformance with ISO 8601, generally in the form: YYYY-MM-DDTHH:MM:SSZ

Longevity

The XML Property List format maintained and even improved upon the original NeXTSTEP Plist’s raison d’être: serialized data in a human readable and editable form. For that reason it remains in frequent use; however, XML is fairly inefficient in terms of space, and some might say that it tends to be an overly verbose way to store data. With that in mind, at the release of OS X 10.2 (August 2002) support for an alternative model was implemented.

[i] Actually there is a 3rd collection type defined for Property Lists: “Set”; however, this data-type is not implemented in the XML representation.

[ii] It is not unusual to see embedded property lists stored in <data> tags!
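The primitive data types of the XML format map directly onto native types in most languages. As a minimal sketch, assuming Python's standard plistlib module is an acceptable reader, the example below loads a small hand-written property list (the keys and values are invented for illustration) and prints the type each element becomes. The helper try_embedded_plist is a hypothetical name; it illustrates the point about embedded property lists by checking whether a decoded data payload is itself a Plist.

```python
import plistlib

# Hand-written sample; keys and values are illustrative only.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>enabled</key><true/>
    <key>count</key><integer>-18</integer>
    <key>created</key><real>297262420.387462</real>
    <key>name</key><string>X-Dsid</string>
    <key>expires</key><date>2011-06-03T12:53:44Z</date>
    <key>blob</key><data>aGVsbG8=</data>
</dict>
</plist>"""

values = plistlib.loads(SAMPLE)
for key, value in values.items():
    # bool, int, float, str, datetime and bytes respectively
    print(f"{key}: {type(value).__name__} = {value!r}")


def try_embedded_plist(blob: bytes):
    """Return the decoded plist if a <data> payload is itself a plist, else None."""
    if blob.startswith((b"bplist00", b"<?xml")):
        return plistlib.loads(blob)
    return None
```

Base-64 decoding is handled by the reader, so the blob value arrives as raw bytes ready to be tested for an embedded signature.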


Binary Property List Format

Overview

The Binary Property List file format provided a more compact method to store data in a Plist. It removes the need to define each element of data in a full XML tag; data can be stored in a format which is more “digitally native” (integers are stored as a sequence of bytes rather than a string, for example) and repeated values need only be stored once and then just re-referenced when needed. These changes can reduce the space required to store a Plist by a significant amount, especially on larger files; however, they break the human readability that was the hallmark of previous formats.

Structure

A Binary Property List (bplist) file is made up of 4 parts; they will be described not in the order that they occur, but in the order that makes most sense when reading the file. They are as follows:

Header

The header is the 1st section of the file and occupies the first 8 bytes of the file. It comprises the file signature (the string “bplist” encoded in ASCII) and the file format version (also encoded in ASCII, and as of July 2010 should always be “00”).

Trailer

The trailer takes up the final 32 bytes in the file (which necessarily makes it the 4th section in the file) and contains metadata that is essential for reading the file (the full meaning of which will be described later). The information it contains is detailed in Table 1; offsets are given from the start of the trailer.

Data                                                      Offset   Length   Data Structure
Size of integers (in bytes) for offset table              6        1        8-bit unsigned integer
Size of collection object reference integers (in bytes)   7        1        8-bit unsigned integer
Number of objects                                         8        8        64-bit unsigned big endian integer
Top level object index                                    16       8        64-bit unsigned big endian integer
Offset of offset table                                    24       8        64-bit unsigned big endian integer

Table 1: Data in Bplist trailer
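The layout in Table 1 can be read with a single structure format string. The following is a minimal sketch in Python; the names Trailer and parse_trailer are invented for this example, and the sample file is produced with the standard plistlib module purely to have something to read.

```python
import plistlib
import struct
from typing import NamedTuple


class Trailer(NamedTuple):
    offset_int_size: int       # size of integers (in bytes) for offset table
    object_ref_size: int       # size of collection object reference integers
    num_objects: int
    top_object: int            # top level object index
    offset_table_offset: int


def parse_trailer(data: bytes) -> Trailer:
    """Decode the final 32 bytes of a binary property list."""
    if len(data) < 40 or data[:6] != b"bplist":   # 8 byte header + 32 byte trailer
        raise ValueError("not a binary property list")
    # 6 skipped bytes, two 8-bit integers, three 64-bit big endian integers
    return Trailer(*struct.unpack(">6xBBQQQ", data[-32:]))


sample = plistlib.dumps({"Name": "Pod"}, fmt=plistlib.FMT_BINARY)
print(parse_trailer(sample))
```

The ">" prefix in the format string selects big endian byte order with no padding, so the five fields land at trailer offsets 6, 7, 8, 16 and 24 as listed in the table.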

Offset Table

The Offset Table is the 3rd section to occur in the file; its starting offset in the file is detailed in the trailer (“Offset of offset table”) and its length can be derived by multiplying the trailer field “Number of objects” by the trailer field “Size of integers (in bytes) for offset table”. The offset table contains an array of offsets, zero-indexed (i.e. the first entry is referred to as entry 0, the second as entry 1 and so on), which point to objects in the Object Table (see below). The offsets are encoded as big endian integers of the length described in the trailer field “Size of integers (in bytes) for offset table”. See Figure 3 for an example.

Figure 3: Example of a Binary Property List trailer and Offset Table.
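The length calculation and the reading of the Offset Table can be illustrated as follows. This is a sketch under the layout described in Table 1; read_offset_table is a hypothetical helper name, and the sample file is again generated with Python's plistlib.

```python
import plistlib
import struct


def read_offset_table(data: bytes) -> list:
    """Return the file offset of every object, indexed from zero."""
    offset_size, _ref_size, num_objects, _top, table_start = struct.unpack(
        ">6xBBQQQ", data[-32:]
    )
    table_len = num_objects * offset_size     # length of the Offset Table in bytes
    table = data[table_start:table_start + table_len]
    return [
        int.from_bytes(table[i:i + offset_size], "big")
        for i in range(0, table_len, offset_size)
    ]


sample = plistlib.dumps({"Name": "Pod"}, fmt=plistlib.FMT_BINARY)
print(read_offset_table(sample))
```

Entry 0 of the returned list is the offset of object 0, entry 1 the offset of object 1, and so on, which is the form needed when following object references later.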

Object Table

The Object Table is the 2nd section in the file, starting directly after the Header at file offset 8 and continuing until the start of the Offset Table. It contains the binary representations of each “object” in the Plist. Each object comprises a type-descriptor byte, optionally followed by a length (depending on the data-type, the length may be implicit in the data type byte), followed by the data itself. The type descriptor bytes and their meanings are detailed in Table 2.

Data Type    Type Descriptor Byte   Meaning
Null         0000 0000              Null: entire data is contained within the type byte, no further data follows
Boolean      0000 1000              False: entire data is contained within the type byte, no further data follows
Boolean      0000 1001              True: entire data is contained within the type byte, no further data follows
Fill         0000 1111              Fill byte
Integer      0001 nnnn              Big endian integer, data that follows is 2^nnnn bytes in length
Real         0010 nnnn              Big endian floating point number, data that follows is 2^nnnn bytes in length
Date         0011 0011              8 byte big endian floating point number follows, represents number of seconds from 2001/01/01 00:00:00
Data         0100 nnnn              Binary data, length of data is nnnn bytes, unless 1111 in which case an integer data type follows, giving the length of the data in bytes
String       0101 nnnn              ASCII string, length is nnnn characters, unless 1111 in which case an integer data type follows, giving the length of the data in characters
String       0110 nnnn              UTF-16-be string, length is nnnn characters, unless 1111 in which case an integer data type follows, giving the length of the data in characters
Unused       0111 xxxx              -
UID          1000 nnnn              UID (encoded as a big endian integer), data that follows is nnnn+1 bytes in length
Unused       1001 xxxx              -
Array        1010 nnnn              Array of object references (i.e. indices for the offset table): nnnn is the number of object references, unless 1111 in which case an integer data type follows, giving the count. The size of each object reference is defined in the trailer’s “Size of collection object reference integers” field.
Unused       1011 xxxx              -
Set          1100 nnnn              Set of object references (i.e. indices for the offset table): nnnn is the number of object references, unless 1111 in which case an integer data type follows, giving the count. The size of each object reference is defined in the trailer’s “Size of collection object reference integers” field.
Dictionary   1101 nnnn              Dictionary is a collection of key-object references (i.e. indices for the offset table) followed by a collection of value-object references: nnnn is the number of key-value pairs, unless 1111 in which case an integer data type follows, giving the count; note that this count is for pairs of object references, therefore the total number of object references is twice this count. The size of each object reference is defined in the trailer’s “Size of collection object reference integers” field.
Unused       1110 xxxx              -
Unused       1111 xxxx              -

Table 2: Binary Property List Files, Type Descriptor Bytes and their Meanings
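When examining a bplist in a hex viewer it can help to decode type descriptor bytes mechanically. The sketch below follows Table 2: the upper four bits select the data type and the lower four bits (nnnn) give the size or count. The function name describe_type_byte and the wording of the returned strings are invented for this example.

```python
# Objects that are fully described by the type byte itself (upper nibble 0000).
SINGLE_BYTE = {0x0: "null", 0x8: "false", 0x9: "true", 0xF: "fill"}

# Types whose lower nibble is a length or count (1111 = an integer object follows).
SIZED = {
    0x4: ("data", "bytes"),
    0x5: ("ASCII string", "characters"),
    0x6: ("UTF-16-be string", "characters"),
    0xA: ("array", "object references"),
    0xC: ("set", "object references"),
    0xD: ("dictionary", "key-value pairs"),
}


def describe_type_byte(byte: int) -> str:
    """Describe a bplist type descriptor byte in the terms used in Table 2."""
    high, low = byte >> 4, byte & 0x0F
    if high == 0x0:
        return SINGLE_BYTE.get(low, "unused")
    if high in (0x1, 0x2):
        kind = "integer" if high == 0x1 else "real"
        return f"{kind}: {2 ** low} byte(s) follow"
    if high == 0x3:
        return "date: 8 byte(s) follow" if low == 0x3 else "unused"
    if high == 0x8:
        return f"UID: {low + 1} byte(s) follow"
    if high in SIZED:
        name, unit = SIZED[high]
        if low == 0x0F:
            return f"{name}: count given by the integer object that follows"
        return f"{name}: {low} {unit}"
    return "unused"
```

For example, describe_type_byte(0xD2) reports a dictionary of two key-value pairs, which means four object references follow the type byte.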


Decoding

It is possible to directly convert between the binary and XML file formats as they contain the same data (with a few caveats to be discussed later). The coarse algorithm for doing so is:

• Locate and read the Offset Table, based on the data held in the trailer
• Read the “Top level object index” from the trailer and look up the object’s file offset in the Offset Table
• Read the object; if it is a collection it will contain object references which are indices for the Offset Table. Use them to look up the file offsets for the objects contained in the collection, and repeat this (recursively if collections contain collections) for each object in the collection, until the top level object and the objects it contains (if any) have been read

The hierarchical nature of the data read from a binary Plist in this fashion then translates to XML almost directly, wrapping the data in the relevant tags; however, there are, as mentioned, a few caveats.

UIDs

Although there is no “<uid>” tag in the specification for the XML property list format, there is a transformation that is applied to this data type when converting between the binary and XML file formats. A UID containing the data “1” (a UID can only contain an integer) should be represented in the XML as shown in Figure 4:

<dict>
    <key>CF$UID</key>
    <integer>1</integer>
</dict>

Figure 4: XML Property List representation of the UID data type
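The coarse decoding algorithm above can be sketched in Python as shown below. This is an illustration under several assumptions rather than a forensic-grade parser: it performs no bounds checking and no protection against circular references, it returns sets as lists, it treats 8 byte integers as signed (which is what Apple's implementation does; smaller sizes are read as unsigned), and it returns UIDs in the dictionary form of Figure 4. The function name parse_bplist is hypothetical.

```python
import plistlib
import struct
from datetime import datetime, timedelta

EPOCH = datetime(2001, 1, 1)   # dates are stored as seconds from this moment


def parse_bplist(data: bytes):
    """Decode a bplist00 file into Python objects (sketch only)."""
    if data[:8] != b"bplist00":
        raise ValueError("missing bplist00 signature")
    off_size, ref_size, count, top, table_at = struct.unpack(">6xBBQQQ", data[-32:])
    offsets = [
        int.from_bytes(data[table_at + i * off_size:table_at + (i + 1) * off_size], "big")
        for i in range(count)
    ]

    def length(low, pos):
        """Return (length, payload position); 1111 means an integer object follows."""
        if low != 0x0F:
            return low, pos
        size = 2 ** (data[pos] & 0x0F)
        return int.from_bytes(data[pos + 1:pos + 1 + size], "big"), pos + 1 + size

    def refs(pos, n):
        """Read n object references (indices for the Offset Table)."""
        return [
            int.from_bytes(data[pos + i * ref_size:pos + (i + 1) * ref_size], "big")
            for i in range(n)
        ]

    def read(index):
        pos = offsets[index]
        marker = data[pos]
        high, low = marker >> 4, marker & 0x0F
        pos += 1
        if marker in (0x00, 0x08, 0x09):
            return {0x00: None, 0x08: False, 0x09: True}[marker]
        if high == 0x1:
            size = 2 ** low
            return int.from_bytes(data[pos:pos + size], "big", signed=size >= 8)
        if high == 0x2:
            return struct.unpack(">f" if low == 2 else ">d", data[pos:pos + 2 ** low])[0]
        if high == 0x3:
            return EPOCH + timedelta(seconds=struct.unpack(">d", data[pos:pos + 8])[0])
        if high == 0x8:
            return {"CF$UID": int.from_bytes(data[pos:pos + low + 1], "big")}
        if high in (0x4, 0x5, 0x6, 0xA, 0xC, 0xD):
            n, pos = length(low, pos)
            if high == 0x4:
                return data[pos:pos + n]
            if high == 0x5:
                return data[pos:pos + n].decode("ascii")
            if high == 0x6:
                return data[pos:pos + 2 * n].decode("utf-16-be")
            if high in (0xA, 0xC):
                return [read(r) for r in refs(pos, n)]
            keys, values = refs(pos, n), refs(pos + n * ref_size, n)
            return {read(k): read(v) for k, v in zip(keys, values)}
        raise ValueError(f"unused type descriptor byte {marker:#04x}")

    return read(top)


sample = plistlib.dumps({"Name": "Pod", "Value": 18}, fmt=plistlib.FMT_BINARY)
print(parse_bplist(sample))   # {'Name': 'Pod', 'Value': 18}
```

The sample file is written with Python's plistlib only so that the sketch has something to decode; the nested Python structure it returns corresponds one-to-one with the elements that would be emitted when writing the XML form.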

Null and Set

Neither “null” nor “set” have representations in the current XML specification; indeed, trying to convert a file containing either of these data types using Apple’s official “plutil” utility raises the error: “invalid object in plist for destination format”. Of course this doesn’t preclude the possibility of using imaginary <null> and <set> tags for ease of parsing/understanding; just be aware that the resulting document will not be a “well-formed” property list (it won’t be understood by OS X, for example). Although this initially seems to be an issue, in practice none of the test data investigated thus far has contained either of these data types.

Carving for Binary Property List Files

Binary property lists have a clear file signature (bplist00) which is obviously useful when it comes to identifying the start of a file in (for example) unallocated space; however, there is no static footer to the file. This causes significant complications when you consider that the last 32 bytes of a bplist are essential when it comes to parsing/opening the file: binary plists must be carved accurately, to the byte, in order for a plist viewer/converter to open the file.

It is possible, with practice, to recognise the end of a bplist file (where the offset table ends and the trailer begins); it may also be possible to automate the recovery by exploiting some properties of the rest of the file and its relationship to the trailer:

• The first two fields (byte size of integers for the offset table and for object references) can only sensibly be 0x01, 0x02, 0x03 or 0x04
• The 3rd field’s value (total number of objects) must not be greater than half the number of bytes between the end of the signature and the start of the theoretical footer [iii]
• The 4th field’s value (top level object index) doesn’t have to be, but is often, 0
• The 5th field’s value (offset of the offset table) must be less than or equal to the number of bytes between the start of the signature and the start of the theoretical footer
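The trailer properties listed above lend themselves to an automated check. The following Python sketch slides a 32 byte window forward from each signature and accepts the first position that satisfies the constraints. Two of the tests are additional assumptions that do not appear in the list: that the top level object index is smaller than the number of objects, and that the offset table ends exactly where the trailer begins (which follows from the section order described earlier). The names plausible_trailer and carve_bplists, and the max_size limit, are invented for this example.

```python
import plistlib
import struct


def plausible_trailer(buf: bytes, start: int, trailer_at: int) -> bool:
    """Test the 32 bytes at trailer_at as the trailer of a bplist starting at start."""
    if trailer_at + 32 > len(buf) or buf[start:start + 8] != b"bplist00":
        return False
    off_size, ref_size, count, top, table_at = struct.unpack(
        ">6xBBQQQ", buf[trailer_at:trailer_at + 32]
    )
    body = trailer_at - (start + 8)      # bytes between signature and footer
    return (
        off_size in (1, 2, 3, 4)
        and ref_size in (1, 2, 3, 4)
        and 0 < count <= body // 2
        and top < count                                          # assumption
        and table_at + count * off_size == trailer_at - start    # assumption
    )


def carve_bplists(buf: bytes, max_size: int = 1 << 20):
    """Yield (start, end) byte ranges of candidate binary plists found in buf."""
    start = buf.find(b"bplist00")
    while start != -1:
        end = min(len(buf), start + max_size)
        for trailer_at in range(start + 10, end - 31):
            if plausible_trailer(buf, start, trailer_at):
                yield start, trailer_at + 32
                break
        start = buf.find(b"bplist00", start + 8)


plist = plistlib.dumps({"Name": "Pod", "Value": 18}, fmt=plistlib.FMT_BINARY)
blob = b"\xff" * 10 + plist + b"\x00" * 50
print(list(carve_bplists(blob)))
```

Because the search resumes eight bytes after each signature rather than after each carved file, a property list embedded inside the data object of another is also reported as a candidate. Every candidate should still be confirmed by attempting to parse it.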

[iii] Worst case is a plist containing a single ‘null’ or empty collection object. This object still takes up a single byte in the object table and a single byte in the offset table (though this still satisfies the requirement). For everything else the length would be greater than two bytes per object.

Hurdles to Parsing Property List Content

As it is generally easier to parse property lists in their XML form, it is often useful to convert binary plists to their XML representation; however, even then there can be hurdles when it comes to parsing or understanding the data, especially when it comes to automating the process.

Dictionary Structure

The structure used in the XML Property List file format to represent dictionaries, while initially clear, does not (in the author’s opinion) make good use of XML. The hierarchical nature of XML lends itself to the idea that an element can have “ownership” of another by having the owner enclose the owned. In the case of the dictionary data type it would make sense for the value to “belong” to the key; however, this is not the case. Instead the key and the value are “siblings”, making dealing with deeply nested structures (array inside dictionary, inside dictionary, etc.) and accessing the value through the key using (for example) XPath needlessly complex, although by no means impossible (see Figure 5).

Using a “parent-child” model makes it abundantly clear when a value belongs to a key because it is contained within it. Although the structure is clear enough in our example below, in files 100 times the size of our example or those with complex nested data it can be easy to lose your way with the default “sibling” model.

To further illustrate the point, imagine the task of selecting each node with the name of an available service (signified by “serv_name”). We could use the XPath language to achieve this; however, the implementation would be very different for the two versions of the file in Figure 5.

Firstly, for key-values as siblings the query would be:

/dict/key[contains(self::*, "services")]/following-sibling::*[position()=1]/dict/key[contains(self::*, "serv_name")]/following-sibling::*[position()=1]

This is fairly unwieldy when contrasted with the following for the parent-children structure:

/dict/key[@name="services"]/array/key[@name="serv_name"]/string

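Property List Format (key-values as siblings): a <dict> holding the keys user_name (string “john.doe”), last_log_on (date 2010-07-27T15:04:45Z) and services; the value of services is an <array> of two <dict> elements, each holding a serv_name key (strings “email” and “ftp”) and an enabled key, with every value placed directly after its <key> element.

Proposed “Improved” Format (key-values as parent-children): the same data, with each value enclosed by its <key> element and the name of the key held in a name attribute.

Figure 5: Comparison of Key-Value pairs as siblings and as parent-children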

Being able to perform this conversion can significantly simplify later parsing of the data contained.
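One way to carry out the conversion is sketched below using Python's standard xml.etree.ElementTree module. The input document is a hypothetical reconstruction in the spirit of Figure 5 (in particular, the Boolean values given to the enabled keys are assumed), and to_parent_child is an invented helper name. Note that in this sketch the converted array still holds dict elements, so the query keeps a dict step between array and key.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on Figure 5; the enabled values are assumed.
SIBLINGS = """<dict>
  <key>user_name</key><string>john.doe</string>
  <key>last_log_on</key><date>2010-07-27T15:04:45Z</date>
  <key>services</key>
  <array>
    <dict><key>serv_name</key><string>email</string><key>enabled</key><true/></dict>
    <dict><key>serv_name</key><string>ftp</string><key>enabled</key><false/></dict>
  </array>
</dict>"""


def to_parent_child(node: ET.Element) -> ET.Element:
    """Return a copy of node in which every <dict> value is nested inside its <key>."""
    out = ET.Element(node.tag, node.attrib)
    if node.tag == "dict":
        children = list(node)
        # Children of a dict alternate: key, value, key, value, ...
        for key, value in zip(children[0::2], children[1::2]):
            wrapper = ET.SubElement(out, "key", {"name": key.text or ""})
            wrapper.append(to_parent_child(value))
    else:
        out.text = node.text
        for child in node:
            out.append(to_parent_child(child))
    return out


root = to_parent_child(ET.fromstring(SIBLINGS))
query = "./key[@name='services']/array/dict/key[@name='serv_name']/string"
print([element.text for element in root.findall(query)])   # ['email', 'ftp']
```

ElementTree supports only a subset of XPath, but attribute predicates of this kind are within that subset, which is part of the appeal of the parent-child form.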


NSKeyedArchiver

Some test data investigated, especially from iPhones, showed references to “NSKeyedArchiver”, generally in complex Plists with seemingly difficult structures. NSKeyedArchiver is part of Apple’s standard application framework (since OS X 10.2) and is used by an application to archive a set of objects in memory to a format that can be safely transported and reloaded when necessary; the two file formats which NSKeyedArchiver supports are, unsurprisingly, XML Property List and Binary Property List.

These files contained a far higher number of UID objects than had been seen in other property lists, and also a high number of data objects which at first appearance had no context with which to ascertain their meaning. Despite these initial difficulties it was noted that at the least these documents did have a consistent high-level structure. The property list was always based around a dictionary containing 4 keys (see Figure 6).

<dict>
    <key>$archiver</key>
    <string>NSKeyedArchiver</string>
    <key>$objects</key>
    <array>...</array>
    <key>$top</key>
    <dict>
        <key>root</key>
        <dict>
            <key>CF$UID</key>
            <integer>1</integer>
        </dict>
    </dict>
    <key>$version</key>
    <integer>100000</integer>
</dict>

Figure 6: Overview of an NSKeyedArchiver Property List File

The 4 keys involved in the structure are described below:

• $archiver: value is always the string “NSKeyedArchiver”
• $version: value is always an integer value, and during research only the value 100000 was encountered
• $top: value is another dictionary with a single key “root” with the value of a UID
• $objects: value is always an array with any number of objects, often nested and including a high number of UIDs; a partial example of these contents can be seen in Figure 7.


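<string>$null</string>
<dict>
    <key>$class</key>
    <dict><key>CF$UID</key><integer>7</integer></dict>
    <key>models</key>
    <dict><key>CF$UID</key><integer>2</integer></dict>
</dict>
<dict>
    <key>$class</key>
    <dict><key>CF$UID</key><integer>6</integer></dict>
    <key>NS.objects</key>
    <array>
        <dict><key>CF$UID</key><integer>3</integer></dict>
    </array>
</dict>
<dict>
    <key>$class</key>
    <dict><key>CF$UID</key><integer>5</integer></dict>
    <key>json</key>
    <dict><key>CF$UID</key><integer>4</integer></dict>
    <key>version</key>
    <integer>1</integer>
</dict>
<string>{"imCount":0,"unreadCount":0,"inputState":{"lastVisibleRow":0,"text":""},"buddy":{"displayId":"mybot","state":"unknown","aimId":"mybot"},"lastHistoryAppendDate":1278415643.763698}</string>
Cont…

Figure 7: Excerpt of the contents of the $objects array (taken from the iPhone AOL Messenger Application conversations file)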


A combination of reverse engineering and analysis of the source code for the NSKeyedArchiver class in the GNUstep base library (GNUstep is the open source successor to NeXTSTEP and its NSKeyedArchiver class is designed to be fully compatible with the one found in OS X) revealed the meaning of the UID fields. Each node directly under the $objects array can be enumerated starting from zero, giving each object an index. Each time a UID object is subsequently encountered it represents an object in the $objects array, the index of which is the UID value. In Figure 7 it should be noted that UID objects are found nested inside other objects; the same principle applies: no matter where the UID occurs, it represents an object in the $objects array.

The $top section of the Plist contains a single UID which is the “entry-point” for the whole structure. If a recursive transformation is performed, replacing all UIDs with their matching object from the array, $top will now contain the complete structure of the archived data. This can give the previously unstructured data in the $objects section context, allowing meaningful parsing to take place.

It should be noted that, because of the nature of NSKeyedArchiver, which is designed to archive a set of objects in memory, even once the data has been “expanded” in the manner described above, the resulting structure can still be complex, and some understanding of object oriented programming concepts is helpful when it comes to manually parsing the data.

Conclusion

Because Property Lists are one of Apple’s own favoured data storage formats they are widely found on Apple devices, both desktop and mobile. Although they may seem straightforward on initial inspection, they do have some particular quirks which the forensic practitioner should be aware of. Additionally, having an understanding of the file’s structure, especially the binary format, allows a practitioner to be more successful when it comes to recovering these files.
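The recursive UID replacement described in the NSKeyedArchiver section can be sketched in Python as follows. The sketch assumes the archive has already been loaded into Python objects (for example with the standard plistlib module, which represents binary UIDs as plistlib.UID objects) and it also accepts the dictionary form of Figure 4. The archive used here is a trimmed, hypothetical version of Figure 7 with the $class entries omitted, and the names uid_index and expand_archive are invented for this example.

```python
import plistlib


def uid_index(value):
    """Return the $objects index if value is a UID in either form, else None."""
    if isinstance(value, plistlib.UID):
        return value.data
    if isinstance(value, dict) and list(value) == ["CF$UID"]:
        return value["CF$UID"]
    return None


def expand_archive(archive: dict) -> dict:
    """Replace every UID reachable from $top with the object it refers to."""
    objects = archive["$objects"]

    def resolve(value, seen=()):
        index = uid_index(value)
        if index is not None:
            if index in seen:                 # guard against circular references
                return f"<cycle to object {index}>"
            return resolve(objects[index], seen + (index,))
        if isinstance(value, dict):
            return {k: resolve(v, seen) for k, v in value.items()}
        if isinstance(value, list):
            return [resolve(v, seen) for v in value]
        return value

    return {name: resolve(ref) for name, ref in archive["$top"].items()}


# Trimmed, hypothetical archive modelled on Figure 7 ($class entries omitted).
archive = {
    "$archiver": "NSKeyedArchiver",
    "$version": 100000,
    "$top": {"root": plistlib.UID(1)},
    "$objects": [
        "$null",
        {"models": plistlib.UID(2)},
        {"NS.objects": [plistlib.UID(3)]},
        {"json": plistlib.UID(4), "version": 1},
        '{"imCount":0,"unreadCount":0}',
    ],
}
print(expand_archive(archive))
```

The cycle guard is included because archived object graphs may refer back to their own ancestors; a real tool would probably want a more informative placeholder than the string used here.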


References

Apple Inc. (Press Release) (1996) Apple Computer, Inc. Agrees to Acquire NeXT Software Inc., [Online], Available: http://web.archive.org/web/20020208190346/http://product.info.apple.com/pr/press.releases/1997/q1/961220.pr.rel.next.html [28th July 2010]

Wikimedia Foundation (2010) Wikipedia: NeXT, [Online], Available: http://en.wikipedia.org/wiki/NeXT [28th July 2010]

World Wide Web Consortium (2010) Extensible Markup Language (XML), [Online], Available: http://www.w3.org/XML/ [6th December 2010]

Apple Inc. (2003) Mac OS X Reference Library, Manual Page for PLIST(5), [Online], Available: http://developer.apple.com/mac/library/documentation/Cocoa/Reference/Foundation/Classes/NSKeyedArchiver_Class/Reference/Reference.html [28th July 2010]

Apple Inc. (2009) Source Code for Binary Property List format (CFBinaryPList.c), [Online], Available: http://opensource.apple.com/source/CF/CF550/CFBinaryPList.c [28th July 2010]

Apple Inc. (2010) Mac OS X Reference Library, NSKeyedArchiver Class Reference, [Online], Available: http://developer.apple.com/mac/library/documentation/Cocoa/Reference/Foundation/Classes/NSKeyedArchiver_Class/Reference/Reference.html [28th July 2010]

Free Software Foundation (2004) Implementation for NSKeyedArchiver for GNUstep (NSKeyedArchiver.m), [Online], Available: http://cvs.savannah.gnu.org/viewvc/gnustep/core/base/Source/NSKeyedArchiver.m?revision=1.18&root=gnustep&view=markup [28th July 2010]


Appendix A: XML Property List File Format DTD

http://www.apple.com/DTDs/PropertyList-1.0.dtd


