While the gaze of the eDiscovery community has been firmly transfixed on the unfolding drama in the Da Silva Moore, et al. v. Publicis Groupe, et al. predictive coding case, an equally important case in the Northern District of Illinois has been quietly flying under the radar. I recently traveled to Chicago to attend the second day of a two-day hearing in the 7th Circuit case Kleen Products, LLC, et al. v. Packaging Corporation of America, et al., where plaintiff and defense experts duked it out over whether defendants should be required to “redo” their document production. On its face, plaintiffs’ request may not seem particularly unusual. However, a deeper dive into the facts reveals that plaintiffs are essentially asking Magistrate Judge Nan R. Nolan to issue an order that could potentially change the way parties are expected to handle eDiscovery in the future.
Can One Party Dictate Which Technology Tool Their Opponent Must Use?
The reason plaintiffs’ position is shocking to many observers is as much about the stage of the case as it is about their argument. Plaintiffs basically ask Judge Nolan to order defendants to redo their production even though defendants have spent thousands of hours reviewing documents, have already produced over a million documents, and at least one defendant claims their review is over 99 percent complete. Given that plaintiffs don’t appear to point to any glaring deficiencies in defendants’ production, an order by Judge Nolan requiring defendants to redo their production using a different technology tool would likely sound alarm bells for 7th Circuit litigants. Judges normally care more about results than methodology when it comes to eDiscovery and they typically do not allow one party to dictate which technology tools their opponents must use.
Plaintiffs’ main contention appears to be that defendants should redo the production because keyword search and other tools were used instead of predictive coding technology. There is no question that keyword search tools have obvious limitations. In fact, Ralph Losey and I addressed this precise issue in a recent webinar titled: “Is Keyword Search in eDiscovery Dead?” The problem with keyword searches, says Losey, is that they are much like the card game “go fish.” Parties applying keyword searches typically make blind guesses about which keywords might reveal relevant documents. Since guessing every relevant keyword contained in a large collection of documents is virtually impossible, using keyword search tools normally results in some relevant documents being overlooked (those that do not contain the keyword) and some irrelevant documents being retrieved (those that contain the keyword but are not relevant). Although imperfect, keyword search tools still add value when used properly because they can help identify important documents quickly and expedite document review.
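To make the “go fish” problem concrete, here is a minimal Python sketch of both kinds of keyword error. The five documents, their relevance labels, and the keyword list are all invented for illustration and have nothing to do with the actual Kleen Products record. The sketch reports precision (the share of retrieved documents that are relevant) and recall (the share of relevant documents that were retrieved), the two measures commonly used to describe these errors.

```python
# Toy collection of (text, is_relevant) pairs. The texts and labels are
# hypothetical; in practice relevance is decided by human reviewers.
documents = [
    ("Align on containerboard prices before the call", True),
    ("Pricing memo: raise list price next quarter", True),
    ("We should all hold the line on what we charge", True),  # relevant, no keyword
    ("Cafeteria price list for March", False),                # keyword, not relevant
    ("Holiday party schedule", False),
]

# The guessed keywords (an assumption for this example).
keywords = {"price", "prices", "pricing"}


def keyword_hit(text, keywords):
    """Return True if any whole word in the text matches a keyword."""
    words = {w.strip(".,:;!?'\"").lower() for w in text.split()}
    return bool(words & keywords)


def precision_recall(documents, keywords):
    """Compute precision and recall of the keyword search on labeled documents."""
    retrieved = [rel for text, rel in documents if keyword_hit(text, keywords)]
    true_positives = sum(retrieved)
    total_relevant = sum(rel for _, rel in documents)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / total_relevant if total_relevant else 0.0
    return precision, recall


precision, recall = precision_recall(documents, keywords)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy collection the search retrieves three documents, two of them relevant, and it misses the third relevant document because its author never used any of the guessed words. That single miss is the overlooked-document problem in miniature.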
Regardless, plaintiffs take the position that defendants should have used predictive coding to avoid the limitations of keyword search tools. The arguments are not well framed, but ostensibly plaintiffs rely on the common belief that predictive coding tools can minimize the inherent limitations of keyword search tools. The rationale is based in part on the notion that predictive coding tools are better because they don’t require users to know all the relevant keywords in order to identify all the relevant documents. Instead, predictive coding tools rely on human input to construct complex search algorithms. Provided the human input is accurate, computers can use these algorithms to automate the identification of potentially relevant documents during discovery faster and more accurately than humans using traditional linear document review methodologies. Plaintiffs contend defendants should redo their document production using predictive coding technology instead of relying on keywords and traditional linear review because it would provide added assurances that defendants’ productions were thorough.
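For readers who want a feel for what “human input” means here, the following Python sketch trains a very small text classifier on a handful of reviewer-coded documents and then scores unreviewed ones. It is only an illustration under stated assumptions: the seed documents are invented, and the naive Bayes scoring is a stand-in I chose for simplicity, not a description of how any commercial predictive coding product works or of anything proposed in Kleen Products.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    stripped = (w.strip(".,:;!?'\"").lower() for w in text.split())
    return [w for w in stripped if w]


def train(labeled_docs):
    """Learn per-class word counts from reviewer-coded (text, is_relevant) pairs.

    Assumes the seed set contains at least one relevant and one
    irrelevant example.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labeled_docs:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(text, model):
    """Return naive Bayes log-odds of relevance; positive leans relevant."""
    word_counts, doc_counts, vocab = model
    totals = {c: sum(word_counts[c].values()) for c in (True, False)}
    score = math.log(doc_counts[True] / doc_counts[False])
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in the seed set carry no evidence
        # Add-one smoothing keeps unseen class/word pairs from zeroing out.
        p_rel = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_irr = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_rel / p_irr)
    return score


# Hypothetical seed set coded by a human reviewer.
seed_set = [
    ("agree to raise containerboard prices together", True),
    ("competitors should hold the line on price", True),
    ("lunch menu and cafeteria price list", False),
    ("holiday party schedule and parking", False),
]
model = train(seed_set)

# Hypothetical unreviewed documents; neither was chosen by keyword.
for doc in ["we all hold the line together this quarter",
            "cafeteria menu and parking schedule"]:
    label = "likely relevant" if relevance_score(doc, model) > 0 else "likely irrelevant"
    print(f"{label}: {doc}")
```

The first unreviewed document never mentions “price,” yet it scores as likely relevant because it shares language with the documents the reviewer marked relevant. The flip side is visible too: change the labels in the seed set and the scores change with them, which is why the accuracy of the human input matters so much.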
Aside from the fact that defendants have essentially completed their document production, the problem with plaintiffs’ initial argument is that too much emphasis is placed on the tool and almost no value is attributed to how the tool is used. Today there is a wide range of technology tools available in the litigator’s tool belt, including keyword search, transparent concept search, topic grouping, discussion threading, and predictive coding, to name a few. Knowing which of these tools to use for a particular case and in what combination is important. However, even more important is the realization that none of these tools will yield the desired results unless they are used properly. Simply swapping a keyword tool for a predictive coding tool will not solve the problem if the new tool is used poorly.
The Artist or The Brush?
Plaintiffs’ blanket assertion that defendants’ document production would be more thorough if a predictive coding tool were used as a replacement for keyword searching is naïve. First, using keyword searches and other tools to filter data before using a predictive coding tool is a logical first step for weeding out clearly irrelevant documents. Second, ignoring the importance of the process by focusing only on the tool is like assuming the brush rather than the artist is responsible for the Mona Lisa. The success of a project depends on the artist as much as the tool. Placing a brush in the hands of a novice painter isn’t likely to result in a masterpiece and neither is placing a predictive coding tool in the hands of an untrained end user. To the contrary, placing sophisticated tools in unskilled hands is likely to end poorly.
Hearing Testimony and Da Silva Moore Lessons
Perhaps recognizing that their early arguments placed too much emphasis on predictive coding technology, plaintiffs spent most of their time during the hearing attacking defendants’ process. Plaintiffs relied heavily on testimony from their expert, Dr. David Lewis, in an attempt to poke holes in defendants’ search, review, and sampling protocol. For example, on direct examination Dr. Lewis criticized the breadth of defendants’ collection, their selection of custodians for sampling purposes, and their methodology for validating document review accuracy. During a spirited cross-examination of Dr. Lewis by Stephen Neuwirth, counsel for defendant Georgia-Pacific, many of Dr. Lewis’ criticisms seemed somewhat trivial when measured against today’s eDiscovery status quo, which is basically the “go fish” method of eDiscovery. If anything, defendants appear to have followed a rigorous search and sampling protocol that goes far beyond what is customary in most document productions today. Since courts require “reasonableness” when it comes to eDiscovery rather than “perfection,” plaintiffs are likely facing an uphill battle in terms of challenging the tools defendants used or their process for using those tools.
The important relationship between technology and process is the lesson in Da Silva Moore and Kleen Products that is buried in thousands of pages of transcripts and pleadings. Although both cases deal squarely with predictive coding technology, the central issue stirring debate is confusion and disagreement about the process for using technology tools. The issue is most glaring in Da Silva Moore where the parties actually agreed to the use of predictive coding technology, but continue to fight like cats and dogs about establishing a mutually agreeable protocol.
The fact that the parties have spent several weeks arguing about proper predictive coding protocols highlights the complexity surrounding the use of predictive coding tools in eDiscovery and the need for a new generation of predictive coding tools that simplify the current process. Until predictive coding tools become easier to use and more transparent, litigants are likely to shy away from new tools in favor of more traditional eDiscovery tools that are more intuitive and less risky. The good news is that predictive coding technology has the potential to save millions of dollars in document review if done correctly. This fact is fostering a competitive environment that will soon drive development of better predictive coding tools that are easier to use.
Given the amount of time and money defendants have already spent reviewing documents, it is unlikely that Judge Nolan would go out on a limb and order defendants to redo their production unless plaintiffs point to some glaring defect. I did not attend the entire hearing and have not read every court submission. However, based on my limited observations, plaintiffs have not provided much, if any, evidence that defendants failed to produce a particular document or documents. Similarly, plaintiffs’ attacks on defendants’ keyword search and sampling protocol are not likely to convince the average observer. Even if plaintiffs could poke holes in defendants’ process, a complete redo is unlikely because courts typically require reasonable efforts during document production, not perfection. A third day of hearings has been scheduled in this case, so it may be several more weeks before we find out if Judge Nolan agrees.