On formatting poems for e-readers — Part III. E-reader formats and conversion routines.


Different e-readers require documents in different formats.  There is no universal standard.  Once a text has been formatted and laid out on a page using an appropriate editor, it must be converted into a suitable format for display on an e-reader.  If you want your text to be available on a wide range of e-readers, you have to convert it into a wide range of formats.  Here are some of them:

HTML [Hypertext Markup Language]. This is the standard format for displaying web pages in browsers such as Microsoft Internet Explorer, Firefox and Google Chrome.  An HTML document can be viewed on any desktop or laptop computer and on many other devices: any device that supports a web browser.  There are several versions of HTML, and new ones come into circulation from time to time.  HTML has no difficulty displaying diacritical marks.  It does have facilities for precisely controlling indents, but they can be a nuisance to use.

Advantages. HTML documents can be displayed on many devices.  Browsers typically make lines of text in HTML format flow to fit the display space available.

Disadvantages. Because browsers may make an HTML document flow to fit the available space, the author or publisher may lose control of line lengths and indents.
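One way to keep some control over indents when the browser reflows the text is to put each line of verse in its own block and set the indent with CSS, using a hanging indent so that a wrapped line is visibly a continuation. The following Python sketch shows the idea. The function name, the use of leading tab characters to mark indent levels, and the em values are my own assumptions for illustration; this is not how any of the tools mentioned in this posting generate their HTML.

```python
import html

HANG_EM = 3.0  # extra indent applied to the wrapped part of a long line


def poem_to_html(lines, indent_em=2.0):
    """Render poem lines as HTML, one <div> per line of verse.

    Leading tab characters in a line set its indent level. Every line gets
    a hanging indent, so if the browser has to wrap it, the continuation
    sits further right than the start of the line.
    """
    out = ['<div class="poem">']
    for line in lines:
        text = line.lstrip("\t")
        level = len(line) - len(text)
        if not text.strip():
            out.append("<div>&nbsp;</div>")  # blank line = stanza break
            continue
        left = level * indent_em + HANG_EM
        style = f"padding-left:{left:g}em;text-indent:-{HANG_EM:g}em"
        out.append(f'<div style="{style}">{html.escape(text)}</div>')
    out.append("</div>")
    return "\n".join(out)


print(poem_to_html([
    "Glory be to God for dappled things",
    "\tFor skies of couple-colour as a brinded cow;",
]))
```

The browser is still free to break a long line, but the break then shows up as a deeper indent rather than as what looks like a new line of verse.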

Conversion Routines. As mentioned above, many web-hosting sites provide WYSIWYG editors for HTML.  Each of the word-processing systems mentioned in the preceding posting will save documents in HTML format.  The Smashwords web site, to be described in detail below, will convert MSWord documents into HTML free of charge.  The MobiPocket Creator converts an MSWord document to HTML format on its way to creating a MobiPocket version.  See further remarks on MobiPocket Creator below.

PDF [Portable Document Format]. This format is sponsored by Adobe.  According to the Adobe website, PDF “is the global standard for capturing and reviewing rich information from almost any application on any computer system and sharing it with virtually anyone, anywhere.”  It gives excellent control of all aspects of formatting and display.  PDF viewers (Adobe Readers) for PCs are available free of charge from Adobe.  Many e-readers can display PDF documents.  A reader for Windows Mobile phones is available free here.  Cerience sells a PDF reader for Blackberries and Android phones that has some excellent capabilities, including the ability to reformat a PDF document for easy reading on a small screen.  PDF documents are widely accepted as input for publishing in print.

Advantages. PDF can give precise control of page layout.  There are some display technologies, like the Cerience Reader, that can reformat PDF documents for easy reading on small (cell phone) displays.

Disadvantages. Unless your reader can reformat the document for a small screen, a PDF document can be difficult to read on a cell phone or other small-screen device because, when the original layout is retained, the print is very small.
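A little arithmetic shows how small the print becomes. The Python sketch below scales a page down to fit the width of a screen and reports the apparent type size. The numbers (12-point type, a page 8.5 inches wide, a screen 2 inches wide) are illustrative assumptions of mine, not measurements of any particular device.

```python
def effective_point_size(font_pt, page_width_in, screen_width_in):
    """Apparent type size when the full page width is scaled to fit the screen."""
    return font_pt * screen_width_in / page_width_in


# 12-point type on an 8.5-inch-wide page, shown on a 2-inch-wide screen
print(round(effective_point_size(12, 8.5, 2.0), 1))  # 2.8
```

Under those assumptions, type that was comfortable on paper is displayed at under 3 points, which is why a reader that reflows the text makes such a difference.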

Conversion Routines. Each of the word-processing systems mentioned in the preceding posting can produce PDF documents simply by saving a document in the PDF format.   TeXnicCenter is particularly good at formatting and producing PDF documents. Smashwords will convert MSWord documents into PDF free of charge.

RTF [Rich Text Format]. RTF documents can be produced on most word-processing systems (including those mentioned in the preceding posting) and can be displayed on most word processors.

Advantages. RTF documents give good control of page layout.  They are easy to produce, and can be displayed on almost any word processor.  Most word processors will save documents in RTF format with good preservation of the original layout.

Disadvantages. On very small screens, such as the screens of cell phones, RTF documents must flow, changing line lengths, unless the text displayed is very small.  Some e-readers do not accept RTF.

Conversion Routines. Most word processors, including those mentioned in the preceding posting, can save documents in RTF format with good preservation of layout.  TeXnicCenter, however, does not.  Smashwords will convert MSWord documents into RTF format free of charge.  However, I have found the Smashwords conversion to RTF to be particularly vulnerable to hidden characters in the MSWord source document, and even to the presence of an em dash character.  The screen shot below shows the rather grotesque result of converting one MSWord document to RTF via Smashwords, when the conversions to other formats worked out well.  Although the original document was entirely formatted in Times New Roman font, like this blog posting, the Smashwords conversion changed the font to something else, displayed the poem and the following paragraph as small capitals, and boldfaced the entire paragraph following the poem.  None of these changes was intended, and none of them showed up in any of the other Smashwords conversions.  Eventually, after strenuous efforts to remove hidden formatting characters and take control of the representation of em dashes, I did succeed in obtaining a faithful conversion of this document with Smashwords, which is available here.  The purification was achieved by opening the MSWord original in Word Perfect, using the Reveal Codes feature of Word Perfect to disclose and delete the hidden codes, saving the resulting document in Word Perfect format, then letting MSWord convert it back to MSWord format.
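A quick check before submitting a document can be sketched in Python: scan the text for em dashes, no-break spaces and invisible characters, and report where they are. This is only a rough aid under stated assumptions. It works on text copied or exported from the document, so it cannot see the internal formatting codes that Reveal Codes exposes, and the function name and the choice of which characters to flag are my own.

```python
import unicodedata

EM_DASH = "\u2014"
NO_BREAK_SPACE = "\u00a0"


def find_suspect_characters(text):
    """Return (line, column, character name) for characters worth a second look.

    Flags control and format characters (Unicode categories Cc and Cf,
    except the tab), plus em dashes and no-break spaces.
    """
    suspects = []
    lines = text.replace("\r\n", "\n").split("\n")
    for line_no, line in enumerate(lines, start=1):
        for col, ch in enumerate(line, start=1):
            invisible = unicodedata.category(ch) in ("Cc", "Cf") and ch != "\t"
            if invisible or ch in (EM_DASH, NO_BREAK_SPACE):
                # Control characters have no Unicode name; fall back to the code point.
                name = unicodedata.name(ch, f"U+{ord(ch):04X}")
                suspects.append((line_no, col, name))
    return suspects


sample = "Glory be\u2014\n\u200bFor skies of couple-colour"
for line_no, col, name in find_suspect_characters(sample):
    print(f"line {line_no}, column {col}: {name}")
```

A clean report does not guarantee a clean conversion, but anything the scan does flag is a reasonable first place to look.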

PRC [Palm Resource Code, used by the Amazon Kindle and MobiPocket readers]. Amazon.com purchased MobiPocket and has used its technology to support the Kindle reader.

Advantages. Free MobiPocket readers for a variety of devices are available here.  A free Kindle reader for PCs is available from www.amazon.com.  These readers do a reasonably good job of displaying texts on small-screen devices.  At this moment, the Kindle reader has a very large market share, although it is being challenged by other readers.

Disadvantages. Current versions of the Kindle device do not display colors, although this is probably a minor concern for poetry.  Partly because the user can change fonts and line lengths, the author and publisher lose a measure of control of page layout, as illustrated in the first posting in this series.

Conversion Routines. The MobiPocket Creator is a free conversion routine that will convert MSWord or HTML documents to MobiPocket format, suitable for use on the Amazon Kindle reader and the MobiPocket readers.  The Kindle edition of Farewell Rio was created using this conversion routine.  The MobiPocket Creator did a good job of preserving indents and diacritical marks in the poem Pied Beauty as illustrated with screen shots of my cell phone in the first posting of this series.  The Smashwords website will convert MSWord documents to MobiPocket format free of charge.  However, as detailed in the document Poetry Test, available free at http://www.smashwords.com, the Smashwords conversion was less faithful to the original indents than the MobiPocket Creator conversion unless the indents were coded as tab characters.  For faithful reproduction of the original layout, I recommend the MobiPocket Creator in preference to the Smashwords service.  The MobiPocket Creator first converts an MSWord source document to HTML format, then converts the HTML to MobiPocket format.  It made a small number of errors in its conversion of Farewell Rio, which I was able to correct by direct manipulation of the HTML version using the editor provided for the purpose in the MobiPocket Creator.

EPUB [Electronic Publication] is an open standard document format that can be displayed on many e-readers on many devices including PCs, the iPad, and Android phones.

Advantages. EPUB is an open (non-proprietary) standard that is widely supported by readers for many devices.  With a properly formatted source document, it can do a good job of representing indents and diacritical marks.

Conversion Routines. Smashwords provides a free conversion service that will convert an MSWord source document to EPUB format.  Provided that indents are represented either as paragraph indents or by tab characters, it did a good job of preserving the original layout.
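If a poem has been indented by typing runs of spaces, those indents can be turned into tab characters before the text goes into the source document. Here is a minimal Python sketch of that step. The function name and the figure of four spaces per indent level are assumptions of mine, and a partial indent is rounded up to a full tab so that it is not silently dropped.

```python
def spaces_to_tabs(line, spaces_per_indent=4):
    """Replace a leading run of spaces with one tab per indent level."""
    text = line.lstrip(" ")
    n_spaces = len(line) - len(text)
    levels, remainder = divmod(n_spaces, spaces_per_indent)
    if remainder:
        levels += 1  # round a partial indent up rather than losing it
    return "\t" * levels + text


poem = [
    "Glory be to God for dappled things",
    "    For skies of couple-colour as a brinded cow;",
    "        Praise him.",
]
for line in poem:
    print(repr(spaces_to_tabs(line)))
```

Lines that already begin with a tab, or have no indent at all, pass through unchanged.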

LRF [Library Reader Format (?)] is an undocumented proprietary format used by Sony readers.

Advantages. It works on the Sony reader.  It can do a good job of representing indents and diacritical marks when one starts from a properly formatted MSWord source document.

Disadvantages. It is not supported on most other readers.   The Smashwords LRF conversion inserted gratuitous blank lines between the lines of the test poem Pied Beauty, as detailed in the Poetry Test document.

Conversion Routines. Smashwords provides a free conversion service that will convert an MSWord source document to LRF format.  Provided that the indents are represented either as paragraph indents or by tab characters, it did a good job of preserving the indents in the converted document, but it introduced gratuitous blank lines in the test poem.

PDB [Palm Database] is a file format that is supported by Palm reading devices and some other readers.

Advantages. This format works on Palm reading devices.

Disadvantages. Of all the conversions offered by Smashwords (excepting TXT; see below), the PDB conversion produced the least satisfactory results.  It was the only conversion in which diacritical marks were not displayed well, and the em dash in the original appeared in the PDB document as a letter ‘C’.  The PDB conversion also introduced gratuitous blank lines between the lines of the original poem.

Conversion Routines. Smashwords provides a free conversion service that will convert an MSWord source document to PDB format.  The conversion of the test poem Pied Beauty was highly unsatisfactory.

TXT [Text] is a very basic format with hardly any page layout capabilities.

Advantages. This format is supported on almost every device.

Disadvantages. This format supports only minimal control of display layout.  Such effects as underlining and boldfacing are lost.

Conversion Routines. All the word processors described in the preceding posting can save a document in this format.  When MSWord saved the Poetry Test document as a TXT file, indents were lost when they were specified in the original as paragraph indents, but they were displayed correctly when they were coded in the original by inserting initial spaces or tab characters.  Diacritical marks were mostly preserved, although the em dash was not.  The Smashwords conversion to TXT format loses all formatting, producing an undifferentiated string of text without paragraphs or indents of any kind.  If you want to produce a TXT document, I would recommend not using the Smashwords conversion service.  Let your word processor do it instead.
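Since a TXT file keeps only the characters actually present on each line, one way to be sure of the indents is to write them out explicitly as leading spaces. The Python sketch below does that for a poem described as pairs of indent level and text. The function name, the pair representation and the four-space indent are assumptions made for illustration.

```python
def render_plain_text(lines, spaces_per_indent=4):
    """Build plain text from (indent_level, text) pairs.

    Indents become literal leading spaces; an empty text is a stanza break.
    """
    rendered = []
    for level, text in lines:
        rendered.append(" " * (level * spaces_per_indent) + text if text else "")
    return "\n".join(rendered) + "\n"


poem = [
    (0, "Glory be to God for dappled things"),
    (1, "For skies of couple-colour as a brinded cow;"),
    (0, ""),
    (2, "Praise him."),
]
print(render_plain_text(poem), end="")
```

To save the result, something like `pathlib.Path("poem.txt").write_text(text, encoding="utf-8")` (the file name is hypothetical) writes it as UTF-8, which can represent diacritical marks and the em dash, provided the program that later opens the file also reads it as UTF-8.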

REMARKS ON THE SMASHWORDS SERVICE. If you want to make a document available on a wide variety of readers, the Smashwords service has much to recommend it.

Advantages. It is free of charge.  It supports a wide variety of readers and devices, as detailed above.  With a single submission, you can convert an MSWord source document into multiple formats.  Smashwords provides excellent advice and documentation, notably the free Smashwords Style Guide, which will explain how to format an MSWord source document in order to get good results from the Smashwords conversion process.  Note that the Smashwords conversion process is very sensitive to hidden formatting characters in the MSWord source document.   Good results are likely to depend on eliminating them.  As I have mentioned above, the Reveal Codes feature of Word Perfect can be used for this purpose.  Hidden characters can also be eliminated by saving a document in TXT format, then re-editing it to restore the proper page layout.

Disadvantages. Smashwords accepts source documents only in the MSWord .doc format.  It does not accept the newer .docx format as input, nor does it accept source documents in HTML, PDF, or Word Perfect format.  Most of its conversion routines do a good job, but a few do not.  If you want to produce an RTF or TXT document, you are likely to get a better result by using your word processor to save in the appropriate format.  If you want a MobiPocket (Kindle) document, the free MobiPocket Creator might do a better job.

The next posting in this series will describe ways of choosing a form for a poem so that it will display well in e-readers, and ways of using MSWord effectively to lay out poems for conversion to e-reader formats.


About corcovadopress

I am the manager of Corcovado Press, which publishes works in English on Brazilian themes. We do not accept unsolicited manuscripts.