Archive for September, 2010

Localised [sic] Advertising Fail

Sydney Mom Makes $77/HR online;  Australian Mom discovers one simple trick to turn yellow teeth white from home for under $10

Ahem, no.

[1/10/10] Oooh ahhhh they’ve fixed it.  Fancy, I didn’t even tell them.  Maybe they watch my blog??

Open Source Licence Non-Compliance == Legal Trouble

Brendan Scott, September MMX

For those of you who haven’t seen it, I have recently released the results of some research into the Trade Practices Implications of Infringing Copies of Open Source Software. Linux Australia has agreed to contribute some funding towards this research note. The research is specific to Australian law.

The main finding of the research is that a corporate vendor selling an infringing copy of open source software is likely to be in breach of at least one section of Part V of the Trade Practices Act 1974 (Cth) relating to misleading or deceptive statements or conduct, and likely more than one. There are many cases in which such breaches have been found in relation to infringing copies of software. Even where a vendor only offers to sell (as opposed to actually selling) an infringing copy, they are still likely to be in breach of the Act.

The Research Note is available here: Research Note on Trade Practices Implications of Infringing Copies of Open Source Software and also from the publications section at

Linux .doc to Text Conversions Inadequate

Having a look at converting doc and html to text on Linux. While the available converters are good for indexing purposes, they aren’t that great for actually presenting the output.

update (16/6/11): LibreOffice seems to do numbering export correctly – at least after a quick look.  I need to check in detail.

Column Width

The engines which perform pretty well, like wvText and lynx (via html), have the annoying ‘feature’ of formatting to a column width. This means that it is very hard to predict which line breaks are actually paragraph breaks and which are just the pager splitting lines. The output of wvHtml can instead be passed through html2text -width 0 to avoid this width problem (although results are a little haphazard at times). Lynx maxes out at a width of about 990 characters, but there are plenty of paragraphs around which exceed that length. … I take it back about html2text -width 0, which seems to fail more often than it succeeds. However, it does seem to honour large width arguments (unlike lynx, which has a magic number limiting the width) – width 20000 seems to work (although if there is a horizontal line in the document, it inserts 20000 ‘=’ characters) – w3m also looks promising.
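The 20000 ‘=’ characters could probably be tidied up after the conversion. What follows is only a rough sketch, assuming the text has already been produced with a large width and that a rule is a line made up of nothing but ‘=’ characters. The function name shorten_rules and the 72 character default are made up for illustration.

```python
import re

# A line consisting only of '=' characters (optionally padded with whitespace).
RULE_LINE = re.compile(r"^\s*=+\s*$")


def shorten_rules(text: str, rule_width: int = 72) -> str:
    """Cut over-long '=' rules down to rule_width characters.

    Lines that are not pure '=' rules, and rules already shorter than
    rule_width, are passed through unchanged.
    """
    fixed = []
    for line in text.split("\n"):
        if RULE_LINE.match(line) and len(line.strip()) > rule_width:
            fixed.append("=" * rule_width)
        else:
            fixed.append(line)
    return "\n".join(fixed)


if __name__ == "__main__":
    sample = "Heading\n" + "=" * 20000 + "\nA very long paragraph on one line."
    print(shorten_rules(sample))
```

Paragraphs are left on a single line, so a blank line or newline in the result should still be a real paragraph break.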


It is possible to automate conversion with OpenOffice (and this seems to be less fragile with version 3 than with earlier versions). However, version 3 can’t import Word outline numbering correctly. It drops the first numbering level, marking it as a heading style (see this, this and this among others). They have had 10 years to work this stuff out, and it’s probably something simple (like setting the right numbering level in the xml style), yet they have been putting heaps of effort into braindead things like commenting (and, really, whoever introduced the commenting anti-feature in MS Word should be broken on the wheel – maybe not, but only just). Not importing outline numbering correctly strikes me as pretty astounding.

OpenOffice 3.1 and earlier also have the problem that they run numbering together with the text without a separator (eg you get something like “1.1This is second level numbering” rather than “1.1 This is second level numbering” or “1.1\tThis is second level numbering”). This, by the way, means that your indexes will be wrong for the first word in any numbered paragraph. This problem has been fixed in 3.2. You might think it would be easy to regex after export to insert a tab or something after leading numbering, until you realise that A, A1, a., (a), 1A, 1a, I and IV are all valid numbering structures, so you would fail on something like “1A A First level number”.
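To make the regex problem concrete, here is a small sketch of the naive approach. It assumes numbering is purely decimal (“1”, “1.1”, “2.3.4”), which is exactly the assumption that breaks. The function name and the pattern are mine, for illustration only.

```python
import re

# Naive pattern: leading dotted decimal numbering immediately followed by text.
# The lookahead excludes whitespace, dots and digits so that the match cannot
# stop part way through a number such as "1.1" or "12".
NAIVE_NUMBERING = re.compile(r"^(\d+(?:\.\d+)*)(?=[^\s.\d])")


def insert_tab_after_numbering(line: str) -> str:
    """Insert a tab between run-together decimal numbering and the text."""
    return NAIVE_NUMBERING.sub(lambda m: m.group(1) + "\t", line, count=1)


if __name__ == "__main__":
    # Works when the numbering really is decimal.
    print(repr(insert_tab_after_numbering("1.1This is second level numbering")))
    # Goes wrong when the number is "1A" and the text starts with "A":
    # the tab lands after "1", not after "1A".
    print(repr(insert_tab_after_numbering("1AA First level number")))
```

The second call splits the line after “1”, because nothing in the text itself says whether the “A” belongs to the number or to the sentence. Widening the pattern to accept letters just moves the mistake to other paragraphs.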

wvText and Abiword don’t get outline numbering right. They do recognise outline numbering, but don’t preserve the numbering symbols (eg numbering running A, B, C will become 1, 2, 3) or decorators (“(a)” becomes “a.”). wvText also outputs numbering on a line by itself (probably because the import is going via html?). So “1.1 This is second level numbering” will become something like:



"1.1

        This is second level numbering"

Perhaps this could be scriptable into some sort of sanity but numbering like A, 1A will still be a problem.
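A rough sketch of what such a script might look like, assuming decimal numbering only and assuming that a line holding nothing but a number is always numbering for the next non-blank line. The function name rejoin_numbering is hypothetical, and numbering like A or 1A is deliberately not handled.

```python
import re

# A line holding only dotted decimal numbering, eg "1", "1.1" or "2.".
NUMBER_ONLY = re.compile(r"^\s*(\d+(?:\.\d+)*\.?)\s*$")


def rejoin_numbering(lines):
    """Join a numbering-only line to the next non-blank line with a tab."""
    out = []
    pending = None  # numbering waiting for its text
    for line in lines:
        if pending is not None:
            if line.strip() == "":
                continue  # skip blank lines between the number and its text
            out.append(pending + "\t" + line.strip())
            pending = None
            continue
        match = NUMBER_ONLY.match(line)
        if match:
            pending = match.group(1)
        else:
            out.append(line)
    if pending is not None:
        out.append(pending)  # trailing number with no text after it
    return out


if __name__ == "__main__":
    sample = ["1.1", "", "        This is second level numbering", "Plain paragraph"]
    for line in rejoin_numbering(sample):
        print(repr(line))
```

A paragraph which genuinely consists of a bare number (a year, say) would be glued to the following paragraph, so this is only sanity of a sort.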

IBM’s Lotus Symphony does import numbering correctly (hurrah!), including decorators on numbering (eg “(a)” and “1.1”). However, being based on an old version of OpenOffice, it suffers from the export problem of running the numbering together with the text <sigh>. Maybe .doc -> Symphony -> html -> text might work?

Moreover, there seems to be no obvious way to get the pyUno bridge working for Symphony in the way that you can for OpenOffice. How, for example, do you get Symphony to run headless/listen on a port?

Yet to be tried are go-oo (which apparently conflicts with openoffice?) and oxygen office….

Note to self:

/opt/ibm/lotus/Symphony/framework/shared/eclipse/plugins/ -nologo -nodefault -norestore -nocrashreport -nofirststartwizard -symphony -accept=pipe,name=_214862394_Office;urp;StarOffice.ServiceManager
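For reference, the -accept argument in that note has the usual UNO shape (connection type, then protocol, then object name), and a pyUno client would need a matching “uno:” URL. The sketch below only builds the two strings. It assumes Symphony honours the same syntax as OpenOffice, which I have not confirmed, and the helper names are made up.

```python
def accept_argument(pipe_name: str) -> str:
    """Build an -accept argument in the same shape as the note above."""
    return "-accept=pipe,name=" + pipe_name + ";urp;StarOffice.ServiceManager"


def client_url(pipe_name: str) -> str:
    """Build the UNO URL a client would hand to a UnoUrlResolver.

    Assumption: the office process was started with the matching
    accept_argument(pipe_name) and is actually listening on the pipe.
    """
    return "uno:pipe,name=" + pipe_name + ";urp;StarOffice.ComponentContext"


if __name__ == "__main__":
    name = "_214862394_Office"  # pipe name taken from the note above
    print(accept_argument(name))
    print(client_url(name))
```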

World Didn’t End, Annoying Reporters

If I hear another reporter talk about who would win if we went back to the polls or who has a mandate or who got the most votes (on a two party preferred basis) I may well scream. Our Constitution does not recognise parties, or policies, or platforms. What we vote for is people. People. We elect them to make decisions on our behalf as our representative (whence “representative democracy”). Who forms government is the person who can convince enough of those representatives to support them in a no confidence motion on the floor of the House. That can always change, and where there is a minority government, change is perhaps a little more likely. We don’t vote for mandates. Any analysis which talks about which party has a right to hold government presumes that one or other of the parties has some pre-existing right to govern. They’re caught in the two party box. The reason we have a two party box is primarily because of Labor’s caucusing system, where party members have to vote as a bloc on any policy which has been debated/agreed in caucus.

A minority government is good news for everyone – or at least everyone interested in democracy, primarily because it is now more likely that there will be greater transparency generally and greater independence for the public service in particular (“without fear or favour” and all that). The Independents have already signalled that they want improvements to question time procedures. The media should be happy because it will be an environment ripe for leaks. This is not to say that minority government is always best, but after such a long period of party dominance it should be a good thing now.

Rob Oakeshott is completely right when he says it’ll be “beautiful and ugly”.   Whether it will hold together will depend entirely on whether the three independents + Mr Bandt can control the urge to pork barrel for their electorates and make law for the benefit of the country instead.

With a little luck we’ll also be given some respite from the ridiculous overanalysis of every little action or statement of the Independents.  Heavens!

Copyright Harms Australia Again

Title                           Amazon (AUD)  Booktopia  Australians Ripped off by
The Wealth of Networks: Ho...   $15.63        $27.80     78%
Ajax: The Definitive Guide      $36.19        $66.75     84%
Star Wars Clone Wars Chara...   $13.27        $23.40     76%
Heads                           $14.06        $19.95     42%

So, for example, the best price from Booktopia in Australia for the Wealth of Networks book is 1.78 times the US price expressed in AUD. In other words, Australians pay 78% more for the book.
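For anyone who wants to check the arithmetic, the percentages are just the Booktopia price divided by the Amazon price (both in AUD), less one. A quick sketch using the prices quoted above (the function name is mine):

```python
def markup_percent(amazon_aud: float, booktopia_aud: float) -> int:
    """How much more the Booktopia price is, as a whole percentage."""
    return round((booktopia_aud / amazon_aud - 1) * 100)


# (Amazon price in AUD, Booktopia price) as quoted in the table above.
PRICES = {
    "The Wealth of Networks: Ho...": (15.63, 27.80),
    "Ajax: The Definitive Guide": (36.19, 66.75),
    "Star Wars Clone Wars Chara...": (13.27, 23.40),
    "Heads": (14.06, 19.95),
}

if __name__ == "__main__":
    for title, (amazon, booktopia) in PRICES.items():
        print(f"{title:32}{markup_percent(amazon, booktopia)}%")
```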

Prices exclusive of delivery.  Delivery is calculated differently, but price is roughly equivalent, with longer nominal delivery time for Amazon (weeks vs days).

I also looked up the first two at The Nile and Emporium Books, but their prices were slightly higher.
