
May 05, 2008

Request From An Attorney

Here's a request that was posted on the LitSupport Listserv this morning, in the midst of a thread about how uninformed attorneys are about e-documents. I referred the poster to Rick Borstein but thought this group would find it, well... interesting.

Can someone recommend a whitepaper, article or book which will clearly reinforce the following statement:

"converting a document such as a contract from a MS Word document into a .PDF eliminates metadata concerns associated with the MS Word version of the document."

I am working with a law firm currently where one attorney will only convert to .PDF, print, and re-scan the document to another .PDF before she will transmit it anywhere.

04:01 PM

Comments

Why would anyone convert a document to PDF simply to print it and then scan it in again?

Scanning a paper document would likely eliminate the metadata concerns that the person has. A metadata removal program will do the same, and much more easily.

Converting to PDF (without scanning) will likely remove most of the metadata concerns (depending on what your concerns are). Keep in mind, however, that depending on your settings, things such as comments in a Word document can be carried over into the final PDF that you create.

Posted by: Bryan Sims | May 5, 2008 8:03:59 PM

One thing you can do to appease her is to have her use a metadata stripping tool first before converting the doc to PDF. Try Doc Scrubber for this (http://www.javacoolsoftware.com/docscrubber/index.html). It is quite effective.

Posted by: Curtis Carmack | May 6, 2008 8:45:33 AM

Actually, whether metadata is included in the Word->PDF conversion depends on what software you use and its settings.

For example, the PDFMakers of Adobe Acrobat (the toolbar buttons that we add to Office) will, by default, COPY the metadata from the Word file to the PDF - because that's what the average user expects - the highest fidelity conversion. However, you can certainly turn that option off in its settings.

In addition, Adobe Acrobat includes an Examine Document feature that enables you to check a PDF for any metadata AND other potentially "problematic" things (hidden text, scripts, etc.) and remove them before sending the document out of house.

Leonard Rosenthol
Adobe Systems

Posted by: Leonard Rosenthol | May 6, 2008 9:33:33 AM

I did some analysis of this issue some time back. The idea that "metadata" can be transmitted from Word to PDF when the PDF is created has some basis in truth, but the danger (as described by some authors) is overstated. It boils down to what is meant by the word.

Essentially, the standard identification metadata that appears in MS documents created in Word and Excel (and perhaps in other apps) can appear in the PDF file after it is created. This is the information that appears when you choose File | Properties, and includes such things as Title, Author, Subject, Keywords. These data can even be passed along when the PDF is created by a third-party program like pdfFactory. (It apparently only happens with MS apps, by the way.) This "metadata" is relatively innocuous, and in any case it can be changed before the PDF is generated if it contains something unwanted.
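For anyone who wants to see these File | Properties fields before converting, here is a minimal Python sketch. It assumes the document is saved in the newer .docx format (a zip archive that normally keeps these fields in docProps/core.xml), not the older binary .doc format. The function name read_core_properties and the field labels are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core-properties part of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative labels mapped to the XML elements that hold them.
FIELDS = {
    "title": "dc:title",
    "author": "dc:creator",
    "subject": "dc:subject",
    "keywords": "cp:keywords",
    "last_modified_by": "cp:lastModifiedBy",
}


def read_core_properties(docx_file):
    """Return the identification fields present in a .docx file (path or file object)."""
    with zipfile.ZipFile(docx_file) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))

    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        # Skip fields that are missing or empty.
        if element is not None and element.text:
            found[label] = element.text
    return found
```

Calling read_core_properties("contract.docx") (a hypothetical file name) returns a dictionary of whichever fields are filled in, so they can be reviewed or cleared in Word before the PDF is made.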

The problem is that the word "metadata" is also used to refer to such undesirable stuff as deleted text, file comments, etc. There is no evidence that this type of data is passed along to the PDF when it is created.
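To check whether a Word file holds this second kind of "metadata" in the first place, a rough Python sketch follows. It again assumes a .docx file, where tracked deletions and insertions are normally stored as w:del and w:ins elements in word/document.xml and comments as w:comment elements in word/comments.xml. The function name count_hidden_content is hypothetical, and the sketch only inspects the Word file. It says nothing about what a given PDF converter does with that content.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used by .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _count(archive, part, local_name):
    """Count elements named local_name in one part of the package (0 if the part is absent)."""
    if part not in archive.namelist():
        return 0
    root = ET.fromstring(archive.read(part))
    return sum(1 for _ in root.iter("{%s}%s" % (W, local_name)))


def count_hidden_content(docx_file):
    """Report comments and tracked changes found in a .docx file (path or file object)."""
    with zipfile.ZipFile(docx_file) as archive:
        return {
            "comments": _count(archive, "word/comments.xml", "comment"),
            "tracked_deletions": _count(archive, "word/document.xml", "del"),
            "tracked_insertions": _count(archive, "word/document.xml", "ins"),
        }
```

If all three counts are zero, the main document body holds no comments or tracked changes to worry about. This sketch does not look at headers, footers, footnotes, or hidden text.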

Where I think some authors have gone wrong is in failing to make the distinction between these different concepts of "metadata" when discussing this issue. As a result they have unnecessarily injected a lot of fear into this area.

Posted by: yclipse | May 14, 2008 8:51:02 PM

There is a tool named Metadata Assistant that will remove metadata from any Office document, that is, the Office metadata. It does not remove system metadata such as the access, creation, and modification times, but neither does creating a .pdf version.

The idea of creating a .pdf document, printing it, and scanning in the printed document is not a good one. First of all it wastes resources. Secondly, if this is in a document production and the second .pdf document is not searchable, it will be in violation of the e-Discovery amendments to the FRCP.

Finally, again in document production, the courts are beginning to hold that the metadata is discoverable under the e-Discovery amendments.

Posted by: Johnette Hassell | May 15, 2008 4:33:02 PM
