gut has been tested on GNU/Linux and SPARC Solaris with Perl 5.6.1, and on Windows 98 with ActivePerl 5.6.1.
You may also find pluckbook useful. It is a convenience front-end script to Plucker that converts an HTML file to a Plucker-compatible pdb file.
unzip -p etext10.zip | gut > etext.html
gut < etext10.txt > etext.html
(NOTE: pluckbook is a Perl script. Before using it, please modify the path to your perl binary, the path to plucker-build, and the location where your pdb files are to be stored.)
gut currently only understands Etexts that contain paragraphs. If you attempt to mark up plays or poems written in verse, the original formatting will not be preserved.
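To see why verse loses its formatting under a paragraph-only model, here is a minimal sketch in Python. It is not gut's actual code (gut is written in Perl and its real heuristics are not shown here); the function name paragraphs_to_html and the rule "a blank line separates paragraphs" are assumptions made purely for illustration.

```python
import html
import re


def paragraphs_to_html(text: str) -> str:
    """Wrap each blank-line-separated block of text in a <p> element.

    Hypothetical helper for illustration only; not part of gut.
    """
    # Normalise Windows line endings so the blank-line split works everywhere.
    text = text.replace("\r\n", "\n")
    # Assumption: one or more blank lines mark a paragraph boundary.
    blocks = re.split(r"\n\s*\n", text.strip())
    out = []
    for block in blocks:
        # Join the hard-wrapped lines of a block into one running line.
        # This is exactly the step that destroys the line breaks of verse.
        joined = " ".join(
            line.strip() for line in block.splitlines() if line.strip()
        )
        if joined:
            out.append("<p>" + html.escape(joined) + "</p>")
    return "\n".join(out)


prose = (
    "It was the best of times,\n"
    "it was the worst of times.\n"
    "\n"
    "Second paragraph."
)
verse = "Tyger Tyger, burning bright,\nIn the forests of the night;"

print(paragraphs_to_html(prose))
print(paragraphs_to_html(verse))
```

In this sketch the two lines of verse come out as a single run-on paragraph, which is the loss of formatting described above. Preserving verse would need an extra heuristic, for example detecting short, unjustified lines and keeping their breaks.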
If you add new heuristics or improve the current ones and wish to share them with others, email me your changes and I will incorporate them into the next release.
When I purchased my first palm computer, a Visor Deluxe, in October 2000, one of my first searches was for free Etexts and for tools to convert them into formats that could be read on my Visor. Memoware was the premier site, but the books marked up in Doc (the native Palm text format) were rather poorly done, and books in other formats required proprietary readers and converters (Peanut Reader, iSilo, MobiPocket).
I finally settled on Plucker because it was non-proprietary, it used HTML, and the Plucker data format was documented. Initially, the Gutenberg Etexts I read were marked up manually using Emacs macros. This technique soon became tedious and I began envisioning an automated way of marking up these files.
gut is the manifestation of this dream.