Forensic Hashing In Criminal And Civil Discovery

Published date: 18 May 2022
Subject Matter: Intellectual Property, Litigation, Mediation & Arbitration, Trade Secrets, Disclosure & Electronic Discovery & Privilege, Trials & Appeals & Compensation
Law Firm: Holland & Knight
Author: Mr Jacob W Schneider

After reading an earlier IP/Decode post about hashing, my friend Jenny Rossman reached out to explain how law enforcement was using hash values to fight the spread of child pornography. For over a decade, Jenny had been a sex crimes prosecutor in Florida. She, alongside law enforcement, had been using the technique to identify suspects and secure convictions. It is a brilliant use of hashing that is also worth considering in civil cases, particularly trade secret litigation.

Using Forensic Hashing to Fight Child Pornography

As I wrote in the earlier post, hashing can convert files to shorter strings of numbers and letters (the "hash value"). To demonstrate this, below is a set of five files that contain different content. I computed their unique hash values using the MD5 algorithm:

Filename    MD5 Hash Value
File1       585960c5cf6ed77c10d37e8dfa66629f
File2       994d6db8e10d41ac5cc49f15281a5bef
File3       fec2a0796d37905dec5b9ef0b24045bf
File4       a3d95a3899c1050c146cd05c054cebf8
File5       748f65d8e5d27d17dd2f142a7b712392
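For readers who want to see the conversion in action, here is a minimal sketch using Python's standard hashlib module. The two byte strings are hypothetical stand-ins for file contents, not the files in the table above, and the helper name md5_hex is an illustration only.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hexadecimal string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical contents standing in for a file and a near-copy of it.
original = b"quarterly sales forecast, draft 1"
altered = b"quarterly sales forecast, draft 2"

original_hash = md5_hex(original)
altered_hash = md5_hex(altered)

print(original_hash)
print(altered_hash)

# Hashing the same content again always yields the same value.
print(original_hash == md5_hex(original))

# Changing a single character yields a different value.
print(original_hash == altered_hash)
```

The hash value depends only on the content passed in, so renaming a file does not change its hash.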

Law enforcement, along with private entities, has been using these unique hash values like fingerprints to identify illicit digital materials. In practice, if law enforcement knows from a previous investigation that File5 is child pornography, then File5's hash value can be used to identify other files with that same hash value. If there is a match, then there may be a crime. (U.S. v. Miller, 982 F.3d 412 (6th Cir. 2020), is a good read for those interested in how this practice implicates the Fourth Amendment.)

As I wrote in the previous post, the solution to speeding up nearly any search problem is hashing, and it provides the solution in this context as well. To find File5 on a suspect's computer, one need only run every file on the computer through the MD5 algorithm. Once those hash values are generated, one searches them for File5's unique string: 748f65d8e5d27d17dd2f142a7b712392. Below are hash values for another set of randomized files that includes a copy of the illicit File5:

Filename    MD5 Hash Value
File6       01cadc70bb61741a28915dd336f878d0
File7       748f65d8e5d27d17dd2f142a7b712392
File8       8259db3e9b95531adae71e740ff362b0
File9       d76c67896451dc0d920dc39ed8c802fb
File10      cdf2d0112d601302ede03f6eafea0ad4

File7's MD5 hash value is the same as File5's, so we have a match. Due to the math behind the MD5 hash algorithm, the odds of File7's content differing from File5's, but still resulting in the same hash value, are almost...
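The search described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any tool law enforcement actually uses: the function names md5_of_file and find_matches are hypothetical, and the only known hash in the set is File5's value from the example.

```python
import hashlib
from pathlib import Path

# Hash value of the known file (File5 in the example above).
KNOWN_HASHES = {"748f65d8e5d27d17dd2f142a7b712392"}


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_matches(root: Path, known_hashes: set[str]) -> list[Path]:
    """Return every file under root whose MD5 hash is in known_hashes."""
    matches = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and md5_of_file(path) in known_hashes:
            matches.append(path)
    return matches
```

Calling find_matches(Path("evidence"), KNOWN_HASHES), where "evidence" is a hypothetical folder, would list every file whose hash equals File5's, whatever the file is named. Keeping the known hashes in a set makes each lookup fast even when the list of known values is very long.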
