Legal Technology News - E-Discovery and Compliance Blog

November 10, 2007

Zeno, Draco, Mozart and Solomon, L.L.P.

Do hard cases make bad law?  Or do hard cases just so piss off judges that angry jurists make bad law?  Case in point is John B. v. Goetz, 2007 WL 3012808 (M.D. Tenn. Oct. 10, 2007).  At 187 pages, the Court’s ponderous memorandum opinion was brutal to wade through, but when I saw the accompanying order requiring production of “all metadata as well as all deleted information on any computer of any of the Defendants' designated key custodians,” (emphasis added) I was determined to find what led to such an untenable directive.

Folks, if you’re wondering what’s wrong with an order directing a party to produce all metadata and all deleted information, we need to talk.  I mean, we need to talk for a while.

In a nutshell, there are all sorts of different fields of metadata associated with all sorts of different files. Some metadata is familiar and easily captured and interpreted. A file’s name, last modified date, location and size are common examples. But plenty of metadata is much harder to grab and present.  Consider a binary flag in the master file table--that's just a one or zero--or an embedded UTC date and time value stored as a hexadecimal value representing the number of 100 nanosecond intervals since January 1, 1601?  Or how about the way an e-mail program represents whether a message has or has not been read or flagged for follow up?
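
To make that date format concrete, here is a minimal Python sketch of decoding such a value. It assumes the eight bytes are stored little-endian, as they are in NTFS records; the function name and the sample bytes are mine, chosen for illustration, and are not drawn from any particular forensic tool.

```python
from datetime import datetime, timedelta, timezone

# The epoch described above: midnight UTC on January 1, 1601.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw: bytes) -> datetime:
    """Decode an 8-byte timestamp (assumed little-endian) into a UTC datetime."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    ticks = int.from_bytes(raw, "little")  # count of 100-nanosecond intervals
    # Python datetimes resolve to microseconds, so 10 ticks make one step.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# Illustrative sample: the eight bytes that encode the start of 1970.
sample = bytes.fromhex("00803ed5deb19d01")
print(filetime_to_datetime(sample))  # 1970-01-01 00:00:00+00:00
```

Those eight bytes mean nothing until you know the epoch, the unit and the byte order, and every other metadata field has its own rules to learn.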

Worse than the vagary of “all metadata” is the absurdity of demanding “all deleted information.” If you want someone to fish a few files out of the Recycle Bin, fine. But when you delve into computer forensics, recovering “all” deleted information is a monumental—and frustrating—task.

To appreciate why requires some understanding of the principal ways by which computer forensic examiners recover deleted data. The first entails reconstructing the name, metadata and disk location of deleted files using folder remnants. Basically, you determine where deleted files used to be, and then grab data from those locations. If you’re lucky, you’ll get the old file back. Or you might get just the part that wasn’t fragmented or the part that wasn’t overwritten when clusters once used by the deleted file were dedicated to new data.

The second method entails searching the typically vast expanse of unallocated clusters (disk space previously used to store data but, by virtue of deletion, released for reuse). To recover deleted data this way (called “data carving”), you’ve got to know what type of file you’re seeking (Excel spreadsheet, JPEG photo, WAV sound file, etc.) and comb each of millions of recycled clusters hoping to match one of hundreds of unique binary signatures that occur within the first few bytes or “header” of many (but not all) file types. Then, assuming you find a recognizable signature, you have to figure out a way to deduce where the deleted file ended and hope that the system hasn’t re-allocated part of the intervening space to another use and plopped an entirely unrelated block of data down in the midst of the deleted file.
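
A stripped-down sketch of the header-matching step, in Python, looks something like this. The 4,096-byte cluster size, the three-entry signature table and the function name are assumptions for illustration. Real carving tools check hundreds of signatures and, as noted, must still work out where each file ends.

```python
# A few well-known file signatures that appear at the very start of a file.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",                      # JPEG photo
    "riff": b"RIFF",                              # RIFF container (WAV audio, among others)
    "ole2": b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",  # legacy Office files such as .xls
}

CLUSTER_SIZE = 4096  # assumed cluster size in bytes


def find_headers(image: bytes, cluster_size: int = CLUSTER_SIZE):
    """Return (offset, file_type) for every cluster that begins with a known signature."""
    hits = []
    for offset in range(0, len(image), cluster_size):
        for file_type, magic in SIGNATURES.items():
            if image.startswith(magic, offset):
                hits.append((offset, file_type))
                break  # one label per cluster is enough for this sketch
    return hits
```

This only flags where a deleted file might begin. It says nothing about where the file ends or whether the clusters in between still belong to it, and that is the hard part.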

If the binary signature was overwritten, you’ve got an even bigger challenge. Lacking a file identifier, you’ve got to find some other way to pick out deleted file fragments with no metadata and no header to guide you. Now, you’re likely talking keywords or code fragments requiring manual examination and manual collection of thousands of “hits.”

And did I mention the file slack space (the area between the end of a file and the end of its final cluster, where one or more old file remnants may lodge)? In file slack, the header data is always overwritten, so keyword search and manual examination are pretty much your only option. In a typical Windows computer, the instances of slack space may number in the hundreds of thousands.
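
The arithmetic behind slack is simple, which is part of why there is so much of it. Here is a minimal sketch, assuming 4,096-byte clusters (a common size, but an assumption here) and ignoring very small files that the file system may store inside the file table itself. The function name is hypothetical.

```python
def slack_bytes(file_size: int, cluster_size: int = 4096) -> int:
    """Bytes between the end of a file and the end of its final cluster."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    remainder = file_size % cluster_size
    # A file that exactly fills its last cluster leaves no slack at all.
    return 0 if remainder == 0 else cluster_size - remainder


print(slack_bytes(10_000))  # 2288
print(slack_bytes(4_096))   # 0
```

Nearly every file that doesn't land exactly on a cluster boundary leaves a pocket like this, and each pocket has to be searched on its own terms.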

To this point, we’ve spoken only of deletion of files and file fragments from within the file system and unallocated clusters; however, data can be deleted within a so-called compound file and entail entirely different methods of recovery.

The classic example is an e-mail container like the .PST or .OST files used to house Outlook e-mail on a desktop or laptop machine. When you delete Outlook e-mail, it doesn’t slink off to unallocated clusters. Initially, it hangs about in the purgatory of “Deleted Items,” where it’s not really “gone” at all and can be easily resurrected. Even when you empty the Deleted Items folder (“double deletion”), the messages don’t tumble out of their container into unallocated clusters. They invisibly skulk around the .PST or .OST container and can sometimes be recovered by, e.g., the peculiar method of corrupting the file header and running a repair tool to leap back in time before the double deletion. Until the container file is periodically compacted and the double-deleted messages really do get squeezed out, they’re not gone. Even then, older versions of the e-mail container files can sometimes be carved from the unallocated clusters and the whole dance begins again.

Then you’ve got the jigsaw puzzle of web page pieces retained in the web cache or Temporary Internet Files area where you’ll pick up traces of web e-mail and other online activity. Oh, and did I mention the deleted web cache pieces relegated to unallocated clusters when a user clears or fills their web cache?

Bottom line: unless you have all the time and money in the world, you don’t ever order anyone to undelete everything. It sounds straightforward, but it’s an abyss, a Zeno’s paradox destination you can never quite reach but must ultimately abandon in favor of wherever you find yourself when you decide that enough is enough.

Coming back to the John B. opinion and order, the court had darned good reason to be livid with the State of Tennessee, which had repeatedly thumbed its nose at e-discovery duties and offered up endless excuses. To set the stage for its Draconian order, the opinion cobbles together much of the text of the Rules amendments and the Zubulake opinions, expending effort quibbling about how many gigabytes are implicated by a keyword search, without any consideration of what documents and file formats comprise that volume. Shouldn’t the focus be on what these documents convey with respect to the issues in the case? If the search was a good one in terms of precision and recall then a contention that there’s too much relevant evidence is as ridiculous as the emperor Joseph II’s carping that Mozart’s opera, Die Entführung aus dem Serail, had “too many notes.”

The opinion inexplicably holds that “a gigabyte can range from 75,000 to 77,000 pages,” never addressing what the form of the content might be. Regrettably, the court offered no like estimate of the number of angels that can dance on the head of a pin—clearly a lost opportunity to commit a similarly incalculable value to numerical certainty.
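
A quick back-of-the-envelope check shows what that range quietly assumes. In the sketch below, the gigabyte is taken as one billion bytes (my assumption; the opinion doesn't say), and the two per-page sizes at the end are hypothetical round numbers chosen only to show the spread, not measurements of any real collection.

```python
GIGABYTE = 10**9  # assumed: decimal gigabyte, one billion bytes


def implied_bytes_per_page(pages_per_gb: int) -> float:
    """Average page size, in bytes, that a pages-per-gigabyte figure presumes."""
    return GIGABYTE / pages_per_gb


def pages_per_gigabyte(bytes_per_page: int) -> int:
    """Whole pages that fit in a gigabyte at a given average page size."""
    return GIGABYTE // bytes_per_page


# The opinion's range works out to roughly 13 KB per page.
print(round(implied_bytes_per_page(75_000)))  # 13333
print(round(implied_bytes_per_page(77_000)))  # 12987

# Hypothetical page sizes, for illustration only.
print(pages_per_gigabyte(2_000))   # 500000
print(pages_per_gigabyte(50_000))  # 20000
```

Change the assumed page size and the page count swings by more than an order of magnitude, which is why a figure that ignores file format can't be pinned to a 2,000-page range.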

Dear reader, I hope that the frustration and derision you take from my tone doesn’t mask my conviction (based on the facts as related in the opinion) that the court was amply justified in pounding the stuffing out of the defendants. My concern is that, by imposing a problematic remedy and propagating an unfounded page equivalency, the effort to discipline the errant defendants will have untoward repercussions. Faced with the complexity of e-discovery, lawyers are clinging to anything that seems simple and clear, grabbing onto pronouncements about EDD duties as though they were writ by lightning onto stone. They aren’t. Sometimes such pronouncements are just the halt leading the blind, or more accurately, judges doing their level best to decide technical matters without benefit of knowledgeable counsel or experts. Though a laudable effort to help the children of Tennessee, the John B. opinion leaves me wondering how wise Solomon’s dictate to cut the baby in half would have been if nobody had appreciated its impact on the child.

Comments

In discovery as in other areas of life, we should all beware of the extremely harsh words such as "all" and "everything" as well as "none" and "nothing." If these words crop up in a discovery request or answer, they should be reviewed before a judge or magistrate sees them.
