By Julio Vazquez
posted on July 08, 2010 09:39
(Image courtesy of http://www.freeimages.co.uk/)
A number of people have encouraged me to make an epub version of Practical DITA. I have resisted this request for a while because I was not pleased with the transform of PDF to the Kindle format and really didn't want to put out anything ugly. I got prodded again a few weeks ago by Sarah O'Keefe and decided that when I had some spare time, I'd give it a shot. Yesterday was that day.
I'm not going to mention the names of the tools I used because my test may not have been fair to them, but both are readily available for no fee. Both had some nice features and both were quick on the transforms. My initial feeling that using the PDF to generate the epub would give less than stellar results was quickly confirmed.
So I took the extra step of generating XHTML from the DITA source and running that through the tools. Both tools ignored the fact that the content already had a toc that they might have been able to modify or expand on to meet their needs. This resulted in structures that were far off the path of the original source (which forces hand editing to get the structure correct). Worse yet, both tools created duplicate ids in the HTML by adding anchors to the HTML files with ids that already existed. That made absolutely no sense to me, as the target already existed.
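For anyone who wants to check their own converted files for the duplicate id problem, here is a minimal sketch in Python using only the standard library. It is not what either tool does internally; the class name `IdCollector`, the function `find_duplicate_ids`, and the sample markup are all made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects every id attribute value seen in a document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <a id="x"/>.
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def find_duplicate_ids(markup):
    """Return a sorted list of id values that occur more than once."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(value for value, n in counts.items() if n > 1)


# Invented sample: a heading that already carries an id, plus an anchor
# added with the same id, which is the pattern described above.
sample = (
    '<html><body>'
    '<h1 id="intro">Introduction</h1>'
    '<a id="intro"></a>'
    '<p id="first">Some text.</p>'
    '</body></html>'
)

print(find_duplicate_ids(sample))  # ['intro']
```

Running a check like this over each generated file before validation would at least show which ids need hand editing.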
When I ran the output from both tools through the Threepress epub validator, both failed for various reasons, mostly, I think, because the tools did not remove things from the content that they should have known were not supported in the standard. In my mind, that's a failure, because what may be correct HTML that can be viewed in browsers (and even in the tools' own viewers) does not cut the mustard as an epub. Of course, how to correct or prevent the issues that came up was not described in either user's guide.
I'll be looking at a different way of transforming the source to an epub. Hopefully I'll have more success in a day or so. Why do tools have to be so hard?
About the Author
Julio Vazquez is a Senior Information Architect at SDI with over 30 years of experience in technical communications and information technology. As one of the members of the initial DITA task force, he takes his share of blame for the current architecture and language structure. Julio holds a bachelor’s degree in computers and information systems from Empire State College of the State University of New York. He has spoken at technical communication and STC conferences about DITA and information architecture, and he is the author of Practical DITA.