Sun Wiki Publisher
By Adrian Sutton
Kevin Gamble pointed me towards the Sun Wiki Publisher for publishing documents to MediaWiki servers straight from OpenOffice/StarOffice. The key problem with these types of integrations is that wiki markup simply can’t handle anywhere near the same level of expressiveness as even HTML, let alone a word processor document. Hence the description mentions:
All important text attributes such as headings, hyperlinks, lists and simple tables are supported. Even images are supported as long as they have already been uploaded to the wiki site. An automatic upload of images is currently not supported.
The lack of image upload is just due to the early stages of development, but the loss of formatting is going to be permanent. Generally wiki markup can’t handle things like nested tables, and there’s a big difference between tables and “simple tables”. Those users who don’t use heading styles (and there are, unfortunately, a lot of them) will probably end up with something akin to a plain text dump.
Of course, deciding which formatting to keep and which to drop when moving content between systems is incredibly difficult, but it’s always nice to actually have a choice, which wiki markup just doesn’t provide. For example, EditLive! has three modes for cleaning pasted content from Microsoft Word (and from any other source really):
- Clean – structural information only, like headings, tables, lists etc.
- Inline Formatting – preserve the formatting as best as possible with HTML by using inline styles.
- Embedded Formatting – preserve the formatting as best as possible by adding CSS styles to the head of the document and using classes.
There are also some options for pasting as plain text and for prompting the user when they paste, but they’re so unpopular we may as well just ignore them.
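To make the difference between the three modes concrete, here is a rough Python sketch. It is not how EditLive! is implemented: the `Block` type, the `render` function and the `pasted0`-style class names are all made up for illustration, and the pasted content is modelled as already-parsed blocks rather than real Word HTML.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Block:
    """One pasted element (hypothetical model, not an EditLive! type)."""
    tag: str                                   # structural element, e.g. "h1" or "p"
    text: str
    style: dict = field(default_factory=dict)  # visual formatting from the source


def css(style):
    """Serialise a style dict as CSS declarations in a stable order."""
    return "; ".join(f"{prop}: {value}" for prop, value in sorted(style.items()))


def render(blocks, mode):
    """Return (head_css, body_html) for mode 'clean', 'inline' or 'embedded'."""
    if mode not in ("clean", "inline", "embedded"):
        raise ValueError(f"unknown paste mode: {mode}")
    classes = {}  # CSS declarations -> generated class name (embedded mode only)
    body = []
    for block in blocks:
        attr = ""
        if block.style and mode == "inline":
            # Keep the formatting on the element itself.
            attr = f' style="{escape(css(block.style))}"'
        elif block.style and mode == "embedded":
            # Move the formatting into a shared class; identical styles reuse one class.
            name = classes.setdefault(css(block.style), f"pasted{len(classes)}")
            attr = f' class="{name}"'
        # In clean mode only the structure (the tag) survives.
        body.append(f"<{block.tag}{attr}>{escape(block.text)}</{block.tag}>")
    head = "\n".join(f".{name} {{ {rules} }}" for rules, name in classes.items())
    return head, "\n".join(body)


pasted = [
    Block("h1", "Report", {"color": "red"}),
    Block("p", "Plain paragraph"),
    Block("p", "Fake heading", {"font-size": "20pt", "font-weight": "bold"}),
]

for mode in ("clean", "inline", "embedded"):
    head, body = render(pasted, mode)
    print(f"--- {mode} ---")
    if head:
        print(f"<style>\n{head}\n</style>")
    print(body)
```

Note what happens to the “Fake heading” paragraph: in clean mode it comes out as an ordinary `<p>`, because the only thing that made it look like a heading was its formatting.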
Towards the end of many EditLive! releases someone goes through the default configuration to make sure it’s got all the new settings and is the best set of defaults we can find. Inevitably, this leads to a discussion about which of the above three options is best, and after 6 years or so of having this discussion there still isn’t a completely clear answer. Lately the trend among our clients has been towards pasting clean, but there are still plenty that want pixel-perfect rendering accuracy. In large part, it depends on how well structured the original documents are. If they don’t use heading styles, they simply won’t import well with “clean”; if they use heading styles well but are full of gratuitous formatting that you want to get rid of, then “clean” works well for you.
Bottom line, having a choice about these things makes importing existing content so much easier – even if you hide all the inline formatting functions in the editor so users are still encouraged to just use headings and CSS etc.