Discovery is the largest time and cost component of civil litigation. As the volume of documents in scope keeps growing, teams need to manage them as “quickly, inexpensively, and efficiently as possible” (to quote the Federal Court rules).
One commonly overlooked way to streamline the process is the choice of production format. As we explore below, this choice can have a material impact on time and cost for both sides.
Background – Near Paper (Image) Production
In the early days of eDiscovery, rather than printing every file, teams created a digital image of each page of a document, stamped it with an ID and page number, and exchanged the images along with .txt files of extracted text and a metadata index file (dates, authors, etc.).
As the technology evolved and PDFs became an easily accessible file type, the preferred format of discovery in Australia has become PDF image productions – with the document’s text disclosed as a searchable layer within the PDF.
While discovery approaches differ, many teams prefer to convert every electronic document in scope to these stamped PDF formats before review. As the scale of data in contentious proceedings increases, this conversion becomes very time-consuming and can delay the start of review by days or weeks.
The Future – Native Production
As data sizes have grown, so has the variety of data sources, with videos, complex spreadsheets, and 3D renders becoming commonplace. These types of documents should not, or cannot, be rendered to PDF for effective disclosure.
To avoid these problems with “PDFing”, disclosing electronic files in their native format has become the simplest and most cost-efficient production format, because the documents themselves do not need to be changed.
Outside of renaming the files to match their assigned document IDs, native production doesn’t involve any transformation to the documents themselves – no digital images are created, and documents are produced as they are (e.g., Word documents as .doc or .docx and Excel files as .xlsx or .xls). Alongside the native files, a metadata index or load file is provided to ensure all agreed metadata is properly preserved and transferred.
The only exceptions are:
- Email files – To remove the risk of a reviewer inadvertently forwarding or replying to a live email, emails are provided as .mhtml files, which can be viewed in a browser or eDiscovery tool.
- Redactions – For redactions to be properly burnt in, the affected documents are converted to an image-based format. However, this generally affects only a very small percentage of the documents that are ultimately disclosed.
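The renaming and index steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `produce_natives`, the `ABC-000001` ID pattern, the `natives` folder, and the CSV column names are all hypothetical, and a real production would follow whatever document ID scheme and metadata fields the parties have agreed in their exchange protocol. The sketch also ignores the email and redaction exceptions listed above.

```python
import csv
import shutil
from pathlib import Path


def produce_natives(source_files, out_dir, prefix="ABC", width=6):
    """Copy each file under a sequential document ID and write a CSV index.

    The ID pattern and column names are illustrative assumptions, not a
    standard; substitute the fields agreed in the exchange protocol.
    """
    out_dir = Path(out_dir)
    natives = out_dir / "natives"
    natives.mkdir(parents=True, exist_ok=True)

    rows = []
    for n, src in enumerate(source_files, start=1):
        src = Path(src)
        doc_id = f"{prefix}-{n:0{width}d}"
        # Only the file name changes; the content is copied byte for byte.
        dest = natives / f"{doc_id}{src.suffix.lower()}"
        shutil.copy2(src, dest)  # copy2 also tries to preserve timestamps
        rows.append({
            "doc_id": doc_id,
            "original_name": src.name,
            "produced_path": dest.relative_to(out_dir).as_posix(),
            "size_bytes": dest.stat().st_size,
        })

    # Write the index (load file) that travels with the native files.
    index_path = out_dir / "index.csv"
    with index_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(
            fh,
            fieldnames=["doc_id", "original_name", "produced_path", "size_bytes"],
        )
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `produce_natives(sorted(Path("collected").iterdir()), "production_01")` on a hypothetical `collected` folder would leave the renamed copies in `production_01/natives` and the index in `production_01/index.csv`. In practice the index would also carry fields such as author and date exported from the review platform, which this sketch does not attempt to extract.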
What Are Other Legal Markets Doing?
Australia remains strongly attached to PDF production, arguably a legacy of the Ringtail electronic review platform. By way of comparison:
United States – While the US legal market still tends towards single-page image productions, the issues cited above are even more onerous because of the scale of discovery involved. Preference is shifting to the native format due to its lower cost and speed advantages.
United Kingdom – Civil Procedure Practice Directions make native format the standard disclosure format. While imaged documents can be exchanged, this is generally limited to when a document requires redaction.
Near Paper vs. Native Format – Which Is Best?
For reviewing and disclosing documents in discovery, by far the best option is to keep documents in native format:
- Native formats are generally much smaller than PDF-converted counterparts, meaning the cost of hosting documents is significantly reduced.
- Hosting data produced by other parties is also significantly cheaper.
- The time to import/export/transfer files is significantly reduced.
- There is less risk of inadvertent metadata spoilage.
- Documents can generally be viewed in their original format, preserving unique file properties.
When starting to prepare documents for trial, it can be more practical and efficient to have documents converted to PDFs, stamped with doc IDs, and redactions burnt in. But, given that only a small percentage of documents end up being relied on as evidence at trial, it is most efficient to do this only for documents that will actually be used in evidence.