Introduction
- Know what you’ve got…
- How we try to identify file formats.
- Use and contribute to PRONOM!
- It isn’t just the beginning of your PRONOM journey, it’s the beginning of your digital forensics journey!
- Enjoy!
Hexadecimal
- Hexadecimal is a number system.
- Hexadecimal makes it easier to understand “binary”.
- Hexadecimal is mapped to signals and characters that have meaning to a computer.
- Hexadecimal can take on arbitrary meaning through “encodings”.
- Hexadecimal is the foundation for a PRONOM signature!
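
To make the points above concrete, here is a minimal Python sketch that shows one byte as decimal, binary, and hexadecimal, and then shows the same bytes taking on different meanings under different encodings. The helper name `describe_byte` is invented for this illustration; the byte values are standard ASCII, UTF-8, and Latin-1.

```python
def describe_byte(value: int) -> dict:
    """Show a single byte in the three number systems discussed above."""
    return {
        "decimal": value,
        "binary": format(value, "08b"),   # 8 bits, zero-padded
        "hex": format(value, "02X"),      # 2 hex digits per byte
    }

# 0x50 is the letter "P" in ASCII.
print(describe_byte(0x50))

# Four bytes written as hexadecimal, read back as ASCII text.
magic = bytes.fromhex("25504446")
print(magic.decode("ascii"))

# The same two bytes mean different things under different encodings.
two_bytes = bytes.fromhex("C3A9")
print(two_bytes.decode("utf-8"))    # one character
print(two_bytes.decode("latin-1"))  # two characters
```

Two hex digits always describe exactly one byte, which is why hex is so much easier to read than eight binary digits.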
Using a hex editor
- Recommend HexEd.it as an online tool for the session. Mention HxD and other editors.
- ‘Bytecode’ representation of a file - both Hexadecimal and ‘ASCII’, with control characters (0x00-0x1F) usually represented as periods (dots) or spaces
- File Forensics, reverse engineering, understanding file formats at a low-level. My first exposure to Hex Editors was editing the save files of video games to give me extra lives or gold!
- Understanding offsets, Hex-view, limitations of ASCII view
- Encourage ‘Safety First’ - it’s called an ‘editor’ for a reason, so to avoid the risk of corrupting your own originals, always work with a copy of your original files
- Drag in a plain text file that everybody has access to, to demonstrate the ASCII representation. Drag in a further file (PDF?) to demonstrate a mixture of binary and ASCII data; with reference to PRONOM, highlight the magic number and reinforce the meaning of offsets. Demonstrate how easy it is to change data, to reinforce the safety-first message!
- Drag a file of your choosing into the hex editor - raise your hand if you’d like to share any observations
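
For anyone who wants to see what a hex editor's view is made of, the following is a rough Python sketch of the three columns (offset, hex view, ASCII view). It is not how HexEd.it or HxD are implemented; `hex_dump` and the sample bytes are made up for this illustration. It only reads data, in keeping with the safety-first advice.

```python
def hex_dump(data: bytes, width: int = 16) -> list[str]:
    """Format bytes as offset, hex view, and ASCII view, one row per `width` bytes."""
    pad = width * 3 - 1  # width of a full hex column, including spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Anything outside printable ASCII is shown as a dot.
        ascii_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{pad}}  {ascii_part}")
    return lines

# Illustrative bytes resembling the start of a PDF: text mixed with binary.
sample = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n1 0 obj\n"
for line in hex_dump(sample):
    print(line)
```

The dots in the right-hand column show the limitation of the ASCII view: a line feed (0x0A) and a byte such as 0xE2 both appear as ".", so only the hex view tells them apart.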
Looking for patterns
- The more samples you have from different versions of a format, the more reliable identification becomes.
- Not all formats have available specifications.
- The more varied the samples, the more clearly patterns emerge.
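
One simple way to look for a pattern is to compare the opening bytes of several samples. The sketch below is an assumption about how you might automate that comparison, not a PRONOM or DROID feature; `common_prefix` is a made-up helper and the sample headers are typed in by hand rather than read from real files.

```python
def common_prefix(samples: list[bytes]) -> bytes:
    """Return the leading bytes shared by every sample."""
    if not samples:
        return b""
    shortest = min(len(s) for s in samples)
    for i in range(shortest):
        # Stop at the first position where the samples disagree.
        if len({s[i] for s in samples}) > 1:
            return samples[0][:i]
    return samples[0][:shortest]

# Hand-typed headers standing in for three versions of one format.
headers = [b"%PDF-1.4\n", b"%PDF-1.7\n", b"%PDF-2.0\n"]

print(common_prefix(headers[:2]))  # two samples: prefix looks longer than it should
print(common_prefix(headers))      # three samples: the stable part emerges
```

With only two samples the shared prefix includes part of the version number, which would make a signature too narrow. Adding a sample from another version trims it back to the part that is actually stable.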
Introducing PRONOM syntax
- PRONOM syntax is a form of regular expression (regex) for matching byte sequences.
- PRONOM syntax can be combined in multiple ways.
- Sometimes there is more than one way to write a signature.
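
To illustrate the regex connection, here is a sketch that translates a small subset of PRONOM syntax into a Python byte regex. It handles only hex byte pairs, `??` (any one byte), `*` (any number of bytes), and `{n}`, `{n-m}`, `{n-*}` (gaps); brackets, alternatives, and offsets are deliberately left out. `pronom_to_regex` is a made-up name, and this is not how DROID itself processes signatures.

```python
import re

# Tokens for the small subset of PRONOM syntax supported here.
TOKEN = re.compile(r"""
    (?P<byte>[0-9A-Fa-f]{2})                     # literal byte, e.g. 25
  | (?P<any>\?\?)                                # any single byte
  | (?P<star>\*)                                 # any number of bytes
  | \{(?P<min>\d+)(?:-(?P<max>\d+|\*))?\}        # {n}, {n-m}, {n-*}
""", re.VERBOSE)

def pronom_to_regex(signature: str) -> re.Pattern:
    parts = []
    pos = 0
    while pos < len(signature):
        m = TOKEN.match(signature, pos)
        if m is None:
            raise ValueError(f"unsupported syntax at position {pos}: {signature[pos:]!r}")
        if m.group("byte"):
            parts.append(re.escape(bytes.fromhex(m.group("byte"))))
        elif m.group("any"):
            parts.append(b".")
        elif m.group("star"):
            parts.append(b".*?")
        else:
            lo, hi = m.group("min").encode(), m.group("max")
            if hi is None:
                parts.append(b".{%s}" % lo)
            elif hi == "*":
                parts.append(b".{%s,}?" % lo)
            else:
                parts.append(b".{%s,%s}?" % (lo, hi.encode()))
        pos = m.end()
    # DOTALL so that "." also matches the newline byte 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)

# Two ways of writing the same signature: "??" and "{1}" both skip one byte.
one_way = pronom_to_regex("47494638??61")
another_way = pronom_to_regex("47494638{1}61")
print(bool(one_way.match(b"GIF89a")), bool(another_way.match(b"GIF87a")))
```

Using `.match` here anchors the pattern at the first byte, which roughly corresponds to a beginning-of-file signature with an offset of zero.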
Reversing PRONOM syntax
- You can reverse engineer PRONOM signatures to debug existing files.
- Reversing PRONOM syntax has other implications, e.g. skeleton files.
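
As an illustration of the skeleton file idea, the sketch below reverses a simple signature into the shortest sequence of bytes that would satisfy it. It covers only hex pairs, `??`, `*`, and `{n}`-style gaps, and it ignores offsets; `skeleton_bytes` and the zero-byte filler are assumptions made for this example, not the method used by any existing skeleton suite.

```python
import re

# Groups: 1 = literal byte, 2 = "??", 3 = minimum gap length, 4 = "*".
TOKEN = re.compile(r"([0-9A-Fa-f]{2})|(\?\?)|\{(\d+)(?:-(?:\d+|\*))?\}|(\*)")

def skeleton_bytes(signature: str, filler: int = 0x00) -> bytes:
    """Build the shortest byte string that satisfies a simple signature."""
    out = bytearray()
    pos = 0
    while pos < len(signature):
        m = TOKEN.match(signature, pos)
        if m is None:
            raise ValueError(f"unsupported syntax at position {pos}: {signature[pos:]!r}")
        literal, wildcard, gap, _star = m.groups()
        if literal:
            out += bytes.fromhex(literal)
        elif wildcard:
            out.append(filler)                 # any byte will do, so use the filler
        elif gap:
            out += bytes([filler]) * int(gap)  # use the minimum gap length
        # "*" can match zero bytes, so the shortest skeleton adds nothing for it.
        pos = m.end()
    return bytes(out)

print(skeleton_bytes("255044462D??2E"))
print(skeleton_bytes("474946{2}61"))
```

Writing the result to a new file gives a tiny test file that should trigger the signature, even though it is not a valid file of that format.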
Creating signature files
- A signature file is a set of instructions for DROID.
- You can create signature files using the Signature Development Utility.
- A signature file is separated into sections.
- One section is used for metadata about identification results.
- Another section is used to store the instructions for identification.
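
The two sections can be pictured with a heavily simplified fragment. The XML below is an illustration only: the element names are modelled on a DROID signature file as far as I understand it, real files also declare an XML namespace and carry many more attributes, and the format name, PUID, and IDs are hypothetical. Check a real signature file before relying on any of it. `summarise` is a made-up helper.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative signature file (hypothetical values throughout).
SIGNATURE_FILE = """\
<FFSignatureFile>
  <InternalSignatureCollection>
    <InternalSignature ID="1" Specificity="Specific">
      <ByteSequence Reference="BOFoffset" Sequence="255044462D"/>
    </InternalSignature>
  </InternalSignatureCollection>
  <FileFormatCollection>
    <FileFormat ID="1" Name="Example format" PUID="example/1">
      <InternalSignatureID>1</InternalSignatureID>
      <Extension>ex</Extension>
    </FileFormat>
  </FileFormatCollection>
</FFSignatureFile>
"""

def summarise(xml_text: str) -> list[dict]:
    """Join the metadata section to the identification instructions."""
    root = ET.fromstring(xml_text)
    # Instructions section: signature ID -> byte sequences to look for.
    instructions = {
        sig.get("ID"): [seq.get("Sequence") for seq in sig.iter("ByteSequence")]
        for sig in root.iter("InternalSignature")
    }
    # Metadata section: what to report when a signature matches.
    results = []
    for fmt in root.iter("FileFormat"):
        sequences = []
        for ref in fmt.iter("InternalSignatureID"):
            sequences.extend(instructions.get(ref.text, []))
        results.append({"name": fmt.get("Name"), "puid": fmt.get("PUID"), "sequences": sequences})
    return results

print(summarise(SIGNATURE_FILE))
```

The point of the sketch is the link between the sections: the metadata entry refers to an instruction by ID, so one format can point to several signatures.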
Plugging it in
- You can use any tool!
- There are different merits to each.
Doing it for yourself
- You’ve all the tools needed to write file format signatures.
- It might not always work.
- It will certainly take trial and error.
- Persevere and keep working on it.
- Practice makes perfect!
Teaching us how to do it!
- Seeing one, doing one, teaching one allows you to reinforce what you’ve learned.
- Teaching one helps you to externalise and formalise your language around this work, making it easier to articulate in other forums in future.
Advanced PRONOM
- Much of this effort goes into researching files and writing a signature, but another big part is testing, calibration, AND documentation.
Final thoughts
- It might look scary at first, but take your time, explore, and enjoy!
- There’s help out there.
- Keep in touch!