Alcides Fonseca

40.197958, -8.408312

Portable EPUBs

A simple answer is to improve the PDF format. After all, we already have billions of PDFs — why reinvent the wheel?

— Will Crichton, in Portable EPUBs

I met Will last year at SPLASH and, besides having awesome game host presentation skills, he is also very passionate about this topic. LaTeX was made for a world where paper is king. But I don’t read papers on paper anymore: I read them on my laptop, frequently on external screens, and sometimes even on my phone. And let me tell you that most of the time I have to pan around to read a single line. We desperately need responsive layouts for most written content. eBooks got it right (though not all books were ported properly, and some never will be, and that’s okay).


I learned a lot from his post, mainly about the advanced PDF capabilities that open-source software usually doesn’t support. With those, you wouldn’t even need to extend the PDF format.

He proposes that the best practical solution is a self-contained EPUB written in a safe subset of HTML, CSS and JavaScript. His notion of safe is left too open to interpretation for my liking, but the overall idea is a good one.
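For readers who have never opened one, an EPUB is just a ZIP archive with a few fixed entries. Here is a minimal Python sketch of that packaging, using only the standard library. The file names under `EPUB/`, the title, the identifier and the `build_epub` helper are my own placeholders, not anything taken from Will’s post, and I have not run the result through a validator, so treat it as an illustration of the container layout rather than a finished tool.

```python
import io
import zipfile

# Tells the reading system where the package document lives.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Package document: metadata, manifest and reading order (all values are placeholders).
PACKAGE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="uid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Hypothetical paper</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2024-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="paper" href="paper.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="paper"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc"><ol><li><a href="paper.xhtml">Paper</a></li></ol></nav>
  </body>
</html>
"""

PAPER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Hypothetical paper</title></head>
  <body><p>Text that reflows to whatever screen it is read on.</p></body>
</html>
"""


def build_epub(files: dict[str, str]) -> bytes:
    """Pack the given files into an EPUB container and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry has to come first and be stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for name, content in files.items():
            zf.writestr(name, content, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


FILES = {
    "META-INF/container.xml": CONTAINER_XML,
    "EPUB/package.opf": PACKAGE_OPF,
    "EPUB/nav.xhtml": NAV_XHTML,
    "EPUB/paper.xhtml": PAPER_XHTML,
}

if __name__ == "__main__":
    with open("paper.epub", "wb") as out:
        out.write(build_epub(FILES))
```

Everything the reader needs sits inside that one archive, which is the self-contained part. Deciding which HTML, CSS and JavaScript features are allowed inside `paper.xhtml` is the part the sketch does not even try to answer.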

And while ACM is looking into improving the accessibility of PDF papers and whitelisting packages that support HTML export, antagonizing computer scientists who have relied on advanced macros for decades, arXiv simply did it without asking anyone’s permission.

I’m still not sure that an HTML-based format is the solution. I don’t think we have the proper authoring tools. Yes, we have TinyMCE and friends, but they have limited support for templating. Heck, even Microsoft FrontPage would give you better control over the layout, at the cost of unreadable source code. But designers want Adobe InDesign and QuarkXPress so they can have some control over pagination and whitespace. Maybe we need a new generation of those tools that also targets responsive HTML views?

But what convinces me the least is how quickly HTML, CSS and JavaScript evolve. Those languages have evolved, and will continue to evolve, at a faster pace than PDF or PostScript. I argue for the tradeoff of having a very basic layout and content language with ActiveX-style plugins that authors can use, at the cost of those plugins being lost in time, just like those awesome little Flash games that no one can play anymore.

1 Curiously, I couldn’t hotlink this image from his own post, probably due to the way the EPUB is being dynamically uncompressed.