OOo Compare: Inadequate
OpenOffice.org’s compare function has, historically, performed very poorly for me. Being able to compare two documents for changes more or less accurately is essential for any law practice which does any sort of transactional work. Without it, OpenOffice will never find a place in law firms. Moreover, this functionality is of use to anyone who wants changes between versions to be readily identifiable without having to leave track changes on all of the time.
OpenOffice’s compare performance is inadequate. By way of a test I took two similar clauses and saved them to separate text files. I then used OpenOffice’s compare function to show the changes. This was the output:
Despite there being some commonality between the two clauses – e.g. “may invoice” are the second and third words of both clauses – OpenOffice simply marked the whole thing as a change. The clauses are each less than 60 words – a markup should not be difficult. The output is unhelpful, and this sort of output from compare is not unusual for OpenOffice. It should come as no surprise that things do not improve with longer documents.
By way of comparison, Word’s compare function gave this output:
This markup is a pretty accurate record of the changes which were made. Text which has not changed is shown as unchanged. Changed text is shown as a change. This is not to say that Word’s markup does not go haywire from time to time, but its track record is vastly superior to OOo’s in my experience. That said, OOo’s markup has been so poor for me that I have not used it all that many times.
Exactly who uses OOo’s compare function? What is its intended domain of application?
Update: Richard [thanks Richard] has posted a comment linking to an issue dating to 2005 in the OOo bug tracker. If you think this should be improved please go vote for the issue.
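Update 2: Oh, and based on my experience – any compare based on diff will also be inadequate. Don’t even consider it. Diff compares line by line, and each clause here is a single line, so it reports the whole clause as changed. FWIW I ran them through diff (the output is at least more reader friendly):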
< 1.1[Vendor] may invoice Customer the Fees for each Service in accordance with the Payment Terms and, where no relevant time is set out in the Payment Terms, [Vendor] may invoice the Customer for ongoing fees quarterly in advance and other fees in arrears. [Vendor] will provide Customer with a tax invoice in respect of all GST charged.
> 1.1Contractor may invoice [Customer] for ongoing fees quarterly in arrears and other fees monthly in arrears. Contractor must provide [Customer] with a tax invoice in respect of all GST charged. [Customer] is not liable to pay any amount in respect of GST except as set out in a valid tax invoice.
Update 3: One of the comments on a linking site points out this code (which I haven’t tried): http://www.plagiarism.phys.virginia.edu/copyfind.cpp
Update 4: Wdiff seems to be very promising.
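To illustrate what a word-by-word comparison buys you over a line-by-line one, here is a minimal sketch in Python using the standard library’s difflib. It is not how wdiff, Word or OOo work internally. The helper name word_diff is my own invention for illustration, and the [-…-] and {+…+} markers simply imitate wdiff’s default output style. The two clauses are the ones from the diff output above.

```python
import difflib


def word_diff(old: str, new: str) -> str:
    """Return a wdiff-style markup of the word-level changes from old to new.

    Hypothetical helper for illustration: deleted words are wrapped in
    [-...-] and inserted words in {+...+}; unchanged words are left bare.
    """
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(" ".join(a[i1:i2]))
            continue
        if i1 < i2:  # words present only in the old text
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if j1 < j2:  # words present only in the new text
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)


old_clause = (
    "1.1[Vendor] may invoice Customer the Fees for each Service in accordance "
    "with the Payment Terms and, where no relevant time is set out in the "
    "Payment Terms, [Vendor] may invoice the Customer for ongoing fees "
    "quarterly in advance and other fees in arrears. [Vendor] will provide "
    "Customer with a tax invoice in respect of all GST charged."
)
new_clause = (
    "1.1Contractor may invoice [Customer] for ongoing fees quarterly in "
    "arrears and other fees monthly in arrears. Contractor must provide "
    "[Customer] with a tax invoice in respect of all GST charged. [Customer] "
    "is not liable to pay any amount in respect of GST except as set out in a "
    "valid tax invoice."
)

print(word_diff(old_clause, new_clause))
```

Because the comparison unit is the word rather than the line, shared runs such as “with a tax invoice in respect of all GST charged.” come through unmarked instead of the whole clause being flagged as changed.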
OOo compare results were the same for both versions of OOo I tried (2.3.something and 3.0.1).