I’ve suddenly found myself pondering whether to use plain text or simple HTML for the “readme” files, licence texts and other such simple documents in my “ObMimic” software product.
Traditionally, such documents are delivered as plain text files, and this still seems to be how it’s usually done. In the past, this was the safest way to ensure that these files would be readable on all systems, whatever tools the user might or might not have. But I’m not sure that’s still the best approach.
I guess there could be mobile and embedded environments where plain text might still be necessary or preferable, but I’d kind of expect to know when this is going to be relevant, and even then I’m not sure HTML for these particular files would necessarily be a problem.
So, what are the main pros and cons?
- First of all, even plain text isn’t without its problems, the most obvious of which is line endings: for anything cross-platform, which line endings do you use? I’m forever encountering “readme” files under MS Windows that have UNIX-style line endings and need to be opened in something that sorts this out. OK, it’s not a big problem, but with HTML this just goes away – no decision to make, no build-time adjustments to make, no worries about how it will really look on the other platforms.
- Similarly, for file extensions “.html” seems a more reliable cross-platform bet than “.txt” or the absence of an extension.
- I’d also expect HTML to be preferable for accessibility, character-set issues (an HTML file can declare its own encoding, which a plain text file can’t), and general user control over how the text is displayed (e.g. font size).
- More generally, even a simple HTML page usually looks better than a text file.
- In addition, using HTML means you can have real links to other documentation, websites, etc. (even if these didn’t work for some reason, you’d be no worse off than if you just had the plain text).
- The risk with HTML is that if an organization starts using HTML files for this, they will let them get progressively more and more complex until they break and you can’t read them (which could be as silly as black text on a black background – it does happen, I used to regularly get a marketing e-mail of this kind from my former ISP!). If everybody starts doing this, you can guarantee that someday you’ll encounter a “readme” file that you can’t read without hacking.
- It’s conventional for these files to be plain text, so to at least some extent that’s what everyone expects.
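To make the line-endings point a bit more concrete, here is a minimal sketch of the sort of build-time adjustment that plain text files tend to need if you want them to look right on each platform. The class and method names are purely illustrative (nothing here is taken from ObMimic's actual build), and it assumes the file content has already been read into a string.

```java
import java.util.regex.Matcher;

final class LineEndings {
    private LineEndings() {}

    /**
     * Rewrites every CRLF, lone CR, or lone LF in the text as the given target ending.
     * Hypothetical helper, for illustration only.
     */
    static String normalize(String text, String target) {
        // "\r\n" is listed first so that a CRLF pair is treated as one line break, not two.
        return text.replaceAll("\r\n|\r|\n", Matcher.quoteReplacement(target));
    }
}
```

For a Windows distribution you might call `LineEndings.normalize(content, "\r\n")`, and for a UNIX one `LineEndings.normalize(content, "\n")`. With HTML none of this is needed, because the browser treats any of these line breaks as ordinary whitespace.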
There doesn’t seem to be any killer reason to decide this either way. On the whole I’m inclined to switch to using HTML for these files so as to avoid the “line endings” problem and provide slightly better-looking and more “accessible” content, despite plain text files being more conventional.
Personally I’d want to stick to hand-written, basic HTML with just headings, text, simple lists and some simple links – the same kind of thing as used within Javadoc comments. And I’d want to keep it XHTML compliant, with validation as part of the build process. Ideally this needs an explicitly defined, guaranteed “safe” subset of XHTML – something else to ponder… It would probably be accompanied by an optional stylesheet, also kept as simple and safe as possible.
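As a rough idea of what a build-time check for such a “safe” subset could look like, here is a sketch using the JDK's standard SAX parser. The list of allowed elements is just my guess at a reasonable subset, not any defined standard, and the class name is made up. It only checks well-formedness, element names and the absence of inline `style` attributes rather than doing full XHTML validation. It also assumes the files avoid named entities such as `&nbsp;`, because any DTD reference is resolved to an empty document to avoid fetching it over the network. It needs Java 9 or later for `Set.of`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SafeXhtmlCheck {
    // Hypothetical "safe" subset: headings, text, simple lists and simple links.
    static final Set<String> ALLOWED = Set.of(
            "html", "head", "title", "meta", "link", "body",
            "h1", "h2", "h3", "p", "ul", "ol", "li",
            "a", "em", "strong", "code", "pre", "br");

    private SafeXhtmlCheck() {}

    /** Returns a list of problems found; an empty list means the document passed. */
    static List<String> findProblems(String xhtml) {
        List<String> problems = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // The default parser is not namespace-aware, so qName is the tag name as written.
                if (!ALLOWED.contains(qName)) {
                    problems.add("element not in subset: <" + qName + ">");
                }
                if (atts.getValue("style") != null) {
                    problems.add("inline style on <" + qName + ">");
                }
            }

            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Substitute an empty document for any external DTD instead of downloading it.
                return new InputSource(new StringReader(""));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (Exception e) {
            // Parsing stops at the first well-formedness error.
            problems.add("not well-formed XML: " + e.getMessage());
        }
        return problems;
    }
}
```

A build step could run `findProblems` over each “readme” or licence file and fail the build if the returned list is non-empty, which would at least guard against the files creeping towards the “progressively more complex” problem described above.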
Or are there any good arguments for sticking to plain text files? Can anyone see any other pros and cons?