Revisiting REAPER: Automating digital forensic investigations

The Rapid Evidence Acquisition Project for Event Reconstruction [1] was one of the first projects that I worked on during my PhD. It started around 2008, when I got interested in trying to completely automate digital forensic investigations. Yes, it sounds impossible, but I wanted to see how far we could automatically handle digital evidence.<div class="separator" style="clear: both; text-align: center;"></div><div>
</div><div>This was a little before digital forensic triage [2] and preliminary analysis gained popularity.</div><div>
<div>The idea was that once the process started, the investigator would not need to interact with the system. At the end of the automated investigation process, the “smoking gun” would be presented to the investigator in context.</div><div>
</div><div>Literally push-button forensics.</div><div>
</div><div>The Process</div><div>An investigator would insert a forensic live CD into the suspect’s computer (post mortem). After starting the computer, the live CD (with an attached external disk) would show only an information panel indicating the current stage of the investigation process.</div><div>
</div><div>First, REAPER would check the suspect computer to see what disks it could access, and if there was encryption / hidden data. If hidden / encrypted data was detected, it would try to recover / access the data. With toy examples, this worked, but how it would work on real systems - especially now - I’m not sure. All detectable media would be hashed, and verbose logging was on by default (for every action).</div><div>
</div><div>Next, all detectable media would be automatically imaged to the investigator’s external disk. Once complete, the images would be verified. If verification failed, the disk would be re-imaged.</div><div> </div><div>Next, REAPER would start standard carving, parsing and indexing. The Open Computer Forensic Architecture was used to extract as much data as possible. OCFA is an extremely powerful architecture, but the open source version is a bit difficult to use (especially from a live CD). I understand that the NFI has a commercial front-end that makes working with it much easier.</div><div>
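The image, verify and re-image loop described above can be sketched in a few lines of shell. This is only an illustration of the idea, not the code REAPER actually used: the function name image_and_verify, the retry limit and the use of dd with sha256sum are all assumptions made for this sketch.

```shell
#!/bin/sh
# Hypothetical helper (not from REAPER): copy SRC to DST and re-image
# until the SHA-256 of the image matches the SHA-256 of the source.
image_and_verify() {
    src="$1"
    dst="$2"
    max_tries="${3:-3}"   # assumed retry limit
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # Hash the source, image it, then hash the image.
        src_hash=$(sha256sum < "$src" | cut -d' ' -f1)
        dd if="$src" of="$dst" bs=1M 2>/dev/null
        dst_hash=$(sha256sum < "$dst" | cut -d' ' -f1)
        if [ -n "$src_hash" ] && [ "$src_hash" = "$dst_hash" ]; then
            echo "verified $dst sha256=$dst_hash (attempt $try)"
            return 0
        fi
        echo "verification failed for $dst (attempt $try), re-imaging" >&2
        try=$((try + 1))
    done
    return 1
}
```

For example, image_and_verify /dev/sdb /mnt/evidence/sdb.dd would image a disk to the external drive (both paths are placeholders). A real acquisition tool would also need to handle read errors and write-blocking, which this sketch ignores.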
</div><div>Once all data has been acquired, verified and processed, the actual investigation / analysis should take place.</div><div>
</div><div>Here is where things get tricky.</div><div>
</div><div>First, we have to know what the investigation question is, and we have to ‘tell’ the system what that question is. We currently do this by specifying the general type of investigation, for example “hacking” or “child exploitation”. We then have a (manually) pre-set list of tasks related to each type of crime. Alternatively, we could search for ‘all crimes’.</div><div>
</div><div>Here, some basic analysis could take place. For example, we could automatically determine attack paths of intrusions based on processed data [3]. We could also test whether it was possible / impossible for a certain statement to be true based on the current state of the system [4]. Also, by building up ‘knowledge’ (models) about systems before an investigation, we could also accurately, automatically determine user actions using traces that are difficult for humans to analyze [5].</div><div>
</div><div>Where it falls apart</div><div>The problem is, we are still essentially in the processing phase of the investigation. We are condensing the available information into a useable form, but we are not yet saying what this information means in the context of the investigation. While we can gain more information about the data in an automated way, a human still needs to ‘make sense’ of the information.</div><div>
</div><div>Even though we are not there yet, automation has been shown to be useful for investigations [6], and can help reduce the time for investigations while improving the accuracy [7] of the investigation. For more comments on automation in investigations, please see [8].</div><div>
</div><div><ol><li><div style="margin-left: 24pt; text-indent: -24.0pt;">James, J. I., Koopmans, M., & Gladyshev, P. (2011). Rapid Evidence Acquisition Project for Event Reconstruction. In The Sleuth Kit & Open Source Digital Forensics Conference. McLean, VA: Basis Technology. Retrieved from </div></li><li><div style="margin-left: 24pt; text-indent: -24.0pt;">Koopmans, M. B., & James, J. I. (2013). Automated network triage. Digital Investigation, 1–9.</div></li><li><div style="margin-left: 24pt; text-indent: -24.0pt;">Shosha, A. F., James, J. I., & Gladyshev, P. (2012). A novel methodology for malware intrusion attack path reconstruction. In P. Gladyshev & M. K. Rogers (Eds.), Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering (Vol. 88 LNICST, pp. 131–140). Springer Berlin Heidelberg.</div></li><li><div style="margin-left: 24pt; text-indent: -24.0pt;">James, J., Gladyshev, P., Abdullah, M. T., & Zhu, Y. (2010). Analysis of Evidence Using Formal Event Reconstruction. In Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (pp. 85–98). Springer Berlin Heidelberg.</div></li><li><div style="margin-left: 24pt; text-indent: -24.0pt;">James, J. I., & Gladyshev, P. (2014). Automated inference of past action instances in digital investigations. International Journal of Information Security.</div></li><li><div style="margin-left: 24pt; text-indent: -24.0pt;">James, J. I., & Gladyshev, P. (2013). A survey of digital forensic investigator decision processes and measurement of decisions based on enhanced preview. Digital Investigation, 10(2), 148–157.</div></li><li><div style="margin-left: 24pt; text-indent: -24.0pt;"></div><div style="margin-left: 24pt; text-indent: -24.0pt;">James, J. I., Lopez-Fernandez, A., & Gladyshev, P. (2014). Measuring Accuracy of Automated Parsing and Categorization Tools and Processes in Digital Investigations. 
In Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (pp. 147–169). Springer International Publishing.</div></li><li><div style="margin-left: 24pt; text-indent: -24.0pt;"></div><div style="margin-left: 24pt; text-indent: -24.0pt;">James, J. I., & Gladyshev, P. (2013). Challenges with Automation in Digital Forensic Investigations, 17. Computers and Society. Retrieved from</div></li></ol></div><div><div style="margin-left: 24pt; text-indent: -24.0pt;">
</div><div style="margin-left: 24pt; text-indent: -24.0pt;">


Philipp Amann interviewed about robustness and resilience in digital forensics laboratories

Forensic Focus recently interviewed Philipp Amann, Senior Strategic Analyst at Europol, about our DFRWS EU 2015 paper “Designing robustness and resilience in digital investigation laboratories”. Philipp and his team are doing some great work that is definitely worth following. See the full interview here.


[How-To] Using GnuPG to verify data using detached signatures


Many software downloads come with a signature file. You normally need to download this signature file separately. Signatures are a great way to let people know that you are the person / company that is making the software available, and that no one else has changed the data since its release.
</div><table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; text-align: right;"><tbody><tr><td style="text-align: center;">Tails linux ISO and signature download links with SHA256 checksum</td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 1: Tails ISO and signature file download</td></tr></tbody></table><div>We are going to use Tails Linux as an example. On their download page, you will find a link to download the Tails ISO image. This is the data we are interested in running. Think of it like the main program that we want to install / use.</div><div>
</div><div>Next, we are given a link to the “Tails 1.4 signature”. This is the signature file that the distributor created. With this signature we can verify that the Tails ISO image has not been modified by anyone else.</div><div>
</div><div>Tails also provides a “SHA256 Checksum”. A checksum is a less rigorous way than a signature to verify that the data has not changed.</div><div>
</div><div>First, download the ISO file AND the signature file. The signature file will almost always end with “.sig”. Make sure both files are in the same directory.</div><div>
</div><div class="separator" style="clear: both; text-align: center;"></div><div>
</div><div>Once you have both files, open the command line / terminal and navigate to that directory. Next we need to use gpg to verify the signature. If we try to verify now, we may get the following results:</div><div>
</div><pre>gpg2 --verify tails-i386-1.4.iso.sig
gpg: assuming signed data in 'tails-i386-1.4.iso'
gpg: Signature made Tue 12 May 2015 02:56:27 AM KST using RSA key ID 752A3DB6
gpg: Can't check signature: No public key</pre>
In this case, we also need to get the public key of the person who created the signature. From the Tails website, I found the fingerprint of their signing key, which we now need to import.

<pre>gpg2 --recv-keys A490D0F4D311A4153E2BB7CADBB802B258ACD84F
gpg: key 58ACD84F: public key "Tails developers (offline long-term identity key)" imported
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 2 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 2u
gpg: next trustdb check due at 2017-01-09
gpg: Total number processed: 1
gpg: imported: 1</pre>
Make sure we have the right key:

<pre>gpg2 --list-keys
pub rsa4096/58ACD84F 2015-01-18 [expires: 2016-01-11]
uid [ unknown] Tails developers (offline long-term identity key)
sub rsa4096/752A3DB6 2015-01-18 [expires: 2016-01-11]
sub rsa4096/2F699C56 2015-01-18 [expires: 2016-01-11]</pre>
Now verify the signature again:

<pre>gpg2 --verify tails-i386-1.4.iso.sig
gpg: assuming signed data in 'tails-i386-1.4.iso'
gpg: Signature made Tue 12 May 2015 02:56:27 AM KST using RSA key ID 752A3DB6
gpg: Good signature from "Tails developers (offline long-term identity key)" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: A490 D0F4 D311 A415 3E2B B7CA DBB8 02B2 58AC D84F
Subkey fingerprint: BA2C 222F 44AC 00ED 9899 3893 98FE C6BC 752A 3DB6</pre>
Here we can see when the signature was made, and the ID of that key. Next we see “Good signature” which means that the signature does verify the data.
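If you want to script this check rather than read the output by eye, one possible approach is to ask gpg for machine-readable status lines and compare the primary key fingerprint against the one published by the project. The sketch below is an assumption-laden illustration: the function names are made up for this example, and it assumes your gpg version prints the primary key fingerprint as the last field of the VALIDSIG status line (older versions may omit it, in which case the comparison simply fails).

```shell
#!/bin/sh
# Read "gpg --status-fd" output on stdin and print the last field of the
# VALIDSIG line, assumed here to be the primary key fingerprint.
primary_fpr_from_status() {
    awk '$1 == "[GNUPG:]" && $2 == "VALIDSIG" { print $NF }'
}

# Hypothetical wrapper: succeed only if SIG is a good signature over DATA
# and the signing key belongs to the primary key EXPECTED_FPR.
verify_with_fingerprint() {
    sig="$1"
    data="$2"
    expected_fpr="$3"
    status=$(gpg2 --status-fd 1 --verify "$sig" "$data" 2>/dev/null) || return 1
    actual_fpr=$(printf '%s\n' "$status" | primary_fpr_from_status)
    [ -n "$actual_fpr" ] && [ "$actual_fpr" = "$expected_fpr" ]
}
```

With the files from this post, the call would look like: verify_with_fingerprint tails-i386-1.4.iso.sig tails-i386-1.4.iso A490D0F4D311A4153E2BB7CADBB802B258ACD84F. Note that this only moves the trust question to where you obtained the expected fingerprint.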

Remember, we were given the SHA256 value of the ISO file. Get the SHA256 hash with the following command (linux):

<pre>sha256sum tails-i386-1.4.iso
339c8712768c831e59c4b1523002b83ccb98a4fe62f6a221fee3a15e779ca65d tails-i386-1.4.iso</pre>
Now we can compare this hash value to the one on the website, and we see that they are the same.
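Comparing 64 hex characters by eye is error-prone, so you may prefer to let sha256sum do the comparison. A minimal sketch, assuming GNU coreutils sha256sum; the function name check_sha256 is just a name chosen for this example:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE has the EXPECTED SHA-256 value.
check_sha256() {
    file="$1"
    expected="$2"
    # "sha256sum -c -" reads "HASH  FILENAME" lines (two spaces) from stdin
    # and exits non-zero if any file does not match its hash.
    printf '%s  %s\n' "$expected" "$file" | sha256sum -c -
}
```

For the ISO above, the call would be: check_sha256 tails-i386-1.4.iso 339c8712768c831e59c4b1523002b83ccb98a4fe62f6a221fee3a15e779ca65d, with the expected value copied from the download page.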
<h3>If I can just check the hash value, why verify with a signature?</h3><div>Hash values do allow you to make sure that the data has not changed. However, they have a number of weaknesses. For example, someone intercepting your network traffic could deliver the web page to you with an altered ISO link AND an altered hash value on the page. This means that the hash value will match the file, but the source of the information cannot be trusted.</div><div>
</div><div>Signatures help with this in a number of ways. Because the signature is generated with the developer’s private key, and we are verifying it with their public key, it is nearly impossible for someone to pretend to be the developer. Also, since we did not download the public key from the web page, but looked it up on a different server, it is slightly more difficult for someone to trick us into downloading the wrong key. Further, we can try to use the Web of Trust to make sure we are getting the right key. In our case, we can see who has signed this key by checking a keyserver.</div>



The DFRWS EU 2016 conference will be held in Lausanne, Switzerland from March 30th to April 1st, 2016.

The DFRWS is dedicated to the advancement of digital forensics research through open sharing of knowledge and ideas. Ever since it organized the first open workshop in 2001, the DFRWS has continued to bring leading researchers, developers, practitioners, and educators from around the world together in an informal collaborative environment. DFRWS conferences publicize and discuss high quality research outcomes selected in a thorough peer review process.

The DFRWS EU 2016 conference extends the 15-year tradition of research conferences organized by DFRWS, including the DFRWS US 2015 conference from August 9 to 13, 2015 in Philadelphia. Information on the upcoming US conference, its program, and how to register can be found at

The continued expansion of DFRWS EU conferences is intended as a focal point for the European digital forensic community, allowing participants to meet and exchange ideas without the need for transatlantic travel. The proceedings of DFRWS EU 2016 will be published in a special issue of Elsevier’s Digital Investigation journal, and will be freely available on the DFRWS website.

NOTE: Immediately before the conference, on March 29, rooms are available at the venue to be booked by research consortia to have meetings. If you are interested in reserving one of these meeting rooms, please contact us at eu-sponsorship <at> dfrws <dot> org.
<h3>Possibilities to contribute</h3>In recent years, DFRWS conferences have added practitioner presentations and hands-on tutorials/workshops taught by leading experts in the field. Presentations are opportunities for industry researchers and practitioners who do not have the time to write a paper, but who have forensics information and experiences that would be of interest to DFRWS attendees. Presentation proposals undergo a light reviewing process to filter out sales pitches and ensure the topic is relevant to our audience.

We invite original contributions as research papers, presentation proposals, panel proposals, tutorial/workshop proposals, and demo or poster proposals on the following topics:
<ul><li>“Big data” approaches to forensics, including data collection, data mining, and large scale visualization</li><li>Addressing forensic challenges of Systems-on-a-chip</li><li>Anti-forensics and anti-anti-forensics</li><li>Bridging the gap between analog and digital traces/evidences/investigators</li><li>Case studies and trend reports</li><li>Data hiding and discovery</li><li>Data recovery and reconstruction</li><li>Database forensics</li><li>Digital evidence and the law</li><li>Digital evidence storage and preservation</li><li>Event reconstruction methods and tools</li><li>Impact of digital forensics on forensic science</li><li>Incident response and live analysis</li><li>Interpersonal communications and social network analysis</li><li>Malware and targeted attacks: analysis, attribution</li><li>Memory analysis and snapshot acquisition</li><li>Mobile and embedded device forensics</li><li>Multimedia analysis</li><li>Network and distributed system forensics</li><li>Non-traditional forensic scenarios and approaches (e.g. vehicles, control systems, and SCADA)</li><li>Storage forensics, including file system and Flash</li><li>Tool testing and development</li><li>Triage, Prioritization, Automation: Efficiently processing large amounts of data in digital forensics</li><li>Typology of digital traces</li><li>Virtualized environment forensics, with specific attention to the cloud and virtual machine introspection</li></ul>
The above list is only suggestive. We welcome new, original ideas from people in academia, industry, government, and law enforcement who are interested in sharing their results, knowledge, and experience. Authors are encouraged to demonstrate the applicability of their work to practical issues.  Questions about submission topics can be sent via email to: eu-papers <at> dfrws <dot> org

IMPORTANT DATES - Please note that all deadlines are firm.

<ul><li>Papers & Presentation/Panel Proposals: October 5, 2015</li><li>Author notification: December 14, 2015</li><li>Final draft papers due and presenter registration: January 25, 2016 (papers for which no author has registered by this date may be dropped from the program)</li><li>Workshop/Tutorial Submission Deadline: October 26, 2015</li><li>Demo & Poster Proposals: January 18, 2016</li><li>Conference Dates: March 30 - April 1, 2016</li></ul>
The FULL CFP with details about submissions is located at:
<h3>SUBMISSIONS</h3>Research papers and presentation proposals must be submitted through the EasyChair site at
Submissions must be in Adobe Acrobat PDF format. Send any questions about research paper / presentation proposal submissions to: eu-papers (at) dfrws (dot) org.

Panel proposals must be emailed to eu-panels (at) dfrws (dot) org in PDF or plain text format.
Demo proposals must be emailed to eu-demos (at) dfrws (dot) org in PDF or plain text format.

To submit a tutorial/workshop proposal please visit the Call for Workshop Proposals page at
<h3>STUDENT AWARD and STUDENT SCHOLARSHIP PROGRAM</h3>DFRWS continues its outreach to students studying digital forensics. This year DFRWS will be offering an award with a cash prize to the best student paper. A student paper is any paper in which the  majority of the work was performed and the paper written by full-time students at an accredited university, college, or high school.

A limited number of scholarships may be awarded to students presenting a paper at the conference. The intent is to help alleviate the financial burden due to the cost of hotel expenses and conference registration. For more information, see the DFRWS EU 2016 homepage at:


Clearing USB disk read cache for testing and forensics in Linux

When copying data from USB devices in Linux (Debian / Ubuntu), you may have noticed that reading data from the disk the first time takes a while, and reading the second time takes only a few seconds.
</div><div>For example:</div><div>
</div><pre>/media/joshua/ucdntfs $ time sudo md5sum /dev/sdf1
3a698f0c3155e494274e5e7829f4d246 /dev/sdf1

real 2m58.620s
user 0m7.032s
sys 0m1.429s

/media/joshua/ucdntfs $ time sudo md5sum /dev/sdf1
3a698f0c3155e494274e5e7829f4d246 /dev/sdf1

real 0m3.467s
user 0m3.285s
sys 0m0.181s</pre>
Here the first read took 2 minutes 58 seconds, while the second took only 3 seconds. This is because the data read from the disk is cached in memory the first time, so the second read is served from memory instead of the device. In cases where the disk may change between reads, caching may return results (like a hash) that are not consistent with the current state of the disk.

When looking for how to disable the read cache, I found a lot of information about disabling the write cache, but not a lot about disabling the read cache.

To disable write cache (if supported) for the current session that the device is plugged in:

<pre>sudo hdparm -W 0 /dev/[device]</pre>
But this does not solve our read cache problems. Unfortunately, I could not find a way to completely disable read cache, but we can clear the cache buffer.

First, determine the path to echo with

<pre>which echo</pre>
Then we want to tell the kernel to drop caches. To do this, we need to echo a value to /proc/sys/vm/drop_caches.
<blockquote class="tr_bq"><pre style="white-space: pre-wrap; word-wrap: break-word;">To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free reclaimable slab objects (includes dentries and inodes):
echo 2 > /proc/sys/vm/drop_caches
To free slab objects and pagecache:
echo 3 > /proc/sys/vm/drop_caches</pre></blockquote>
So our echo command to clear all caches should look like:

<pre>sudo sh -c "/bin/echo 3 > /proc/sys/vm/drop_caches"</pre>
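Note: You probably cannot echo directly to drop_caches by putting sudo in front of echo, because the redirection is performed by your own (non-root) shell rather than by sudo. The work-around is to wrap the whole command, including the redirection, in a shell started by sudo. Make sure you use the full path to echo on your system.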
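An alternative to wrapping everything in sh -c is to pipe the value through tee running under sudo, so that the privileged process is the one that opens the file. The helper below is a sketch under that assumption; the name drop_caches and its optional second argument (a target file, mainly useful for trying the function out without root) are made up for this example. It also runs sync first, since dropping caches only frees clean (already written) pages.

```shell
#!/bin/sh
# Hypothetical helper: flush dirty pages, then write LEVEL (1, 2 or 3)
# to the kernel's drop_caches control file.
drop_caches() {
    level="${1:-3}"
    target="${2:-/proc/sys/vm/drop_caches}"   # override only for testing
    sync
    if [ -w "$target" ]; then
        # Already privileged enough (for example, running as root).
        printf '%s\n' "$level" > "$target"
    else
        # Let tee, running under sudo, open and write the file.
        printf '%s\n' "$level" | sudo tee "$target" > /dev/null
    fi
}
```

Calling drop_caches with no arguments is then equivalent to the sudo sh -c command above.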

<pre>/media/joshua/ucdntfs $ time sudo md5sum /dev/sdf1
3a698f0c3155e494274e5e7829f4d246 /dev/sdf1

real 3m18.294s
user 0m6.389s
sys 0m1.390s

/media/joshua/ucdntfs $ sudo sh -c "/bin/echo 3 > /proc/sys/vm/drop_caches"

/media/joshua/ucdntfs $ time sudo md5sum /dev/sdf1
3a698f0c3155e494274e5e7829f4d246 /dev/sdf1

real 3m18.344s
user 0m6.545s
sys 0m1.438s</pre>
If you use it a lot, like me, you might want to make an alias:

<pre>alias clearusbcache="sudo sh -c '/bin/echo 3 > /proc/sys/vm/drop_caches'"</pre>
If you want to clear the cache in the background while running experiments, you can try this script:

<pre># Run this loop as root (for example in a root shell), since it writes to /proc.
while true; do
    /bin/echo 3 > /proc/sys/vm/drop_caches
    sleep 1
done</pre>


Seoul Tech Society Crypto Event

On June 24th, Seoul Tech Society held an ‘introduction to cryptography’ event. First, Artem Lenskiy gave an overview of how symmetric and asymmetric encryption work. Next, Joshua James gave a hands-on tutorial about using GnuPG for electronic document signing and encryption. Finally, Max Goncharov talked about the new Paranoid.EMAIL service. This was all rounded out with pizza and libations.

Overall, attendees got a quick (but intense) overview of theoretical and practical encryption. We had a lot of great questions from the audience, and almost everyone left with their own GPG keypair.

For anyone who didn’t make it, the presentations can be found here:

<ul><li>20150624-James-Practical_GnuPrivacyGuard_(GPG).pdf</li><li>20150624-Lenskiy-Cryptography_as_a_solution_to_secure_communication.pdf</li><li>20150624-Goncharov-Welcome_to_Paranoid.EMAIL.pdf</li></ul><div>If you are in Korea, check out Seoul Tech Society’s Meetup page to catch our next event.</div><div></div><div class="separator" style="clear: both; text-align: center;"></div>
<div class="separator" style="clear: both; text-align: center;"></div><div>
