Please note: this page is obsolete.


These are the steps for up-translating Genesis from its source RTF format into sensible TEI Lite.

  1. This is the source RTF file.
  2. It was run through the rtf2xml converter, which produced this XML file. The TRANSDOC file carries information about the text and the styles used to format it in the input RTF file.
  3. Then it was run through a custom-made CoST script, transdoc2tei, which maps those styles and their combinations onto TEI Lite elements and their attributes. The output is this TEI Lite file.
  4. Voila, we're there. Well, the truth is that the result was only almost perfect, so hand editing was still required, but it took just several hours, not the several days it would have taken to mark up plain (that is, styleless) text.
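
Step 3 can be pictured as a lookup from style names to TEI Lite elements. The following is only a minimal Python sketch of that idea, not the actual transdoc2tei script (which is written in CoST and not reproduced on this page). The style names, the `STYLE_MAP` table and the `style_to_tei` function are all hypothetical, invented for illustration.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mapping from RTF style names to (TEI Lite element, attributes).
# The real transdoc2tei rules are not shown on this page; these entries are
# invented purely for illustration.
STYLE_MAP = {
    "Heading 1": ("head", {}),
    "Normal": ("p", {}),
    "Emphasis": ("hi", {"rend": "italic"}),
    "Footnote Text": ("note", {"place": "foot"}),
}


def style_to_tei(style, text):
    """Wrap text in the TEI Lite element assigned to the given style name."""
    # Unknown styles fall back to a plain paragraph (an assumption of this sketch).
    name, attrs = STYLE_MAP.get(style, ("p", {}))
    attr_text = "".join(f" {key}={quoteattr(value)}" for key, value in attrs.items())
    return f"<{name}{attr_text}>{escape(text)}</{name}>"


if __name__ == "__main__":
    print(style_to_tei("Emphasis", "In the beginning"))
```

A table like this is also why styled input is so much cheaper to mark up than plain text: the styles already say what each run of text is, so only the cases the table gets wrong need hand editing.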

Now what can we do with this result? The short answer is: whatever we want. We can down-translate it to LaTeX, for example, to get a PDF suitable for printing or for on-screen viewing. To facilitate that I wrote a custom tei2tex stylesheet, which supports a small subset of TEI Lite, namely those elements and attributes used in the transdoc2tei output. This is the output of PDFLaTeX.
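
To give a feel for what such a down-translation involves, here is a rough Python sketch. It is not the tei2tex stylesheet itself. The `LATEX_MAP` table, the choice of LaTeX commands and the `tei_to_latex` function are assumptions made for this example, and the sketch does not escape LaTeX special characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: TEI Lite element name -> (LaTeX text before, LaTeX text after).
# The real tei2tex stylesheet may map these elements quite differently.
LATEX_MAP = {
    "head": ("\\section*{", "}\n\n"),
    "p": ("", "\n\n"),
    "hi": ("\\emph{", "}"),
    "note": ("\\footnote{", "}"),
}


def tei_to_latex(elem):
    """Recursively turn an element and its children into LaTeX source."""
    # Elements missing from the table contribute only their content.
    before, after = LATEX_MAP.get(elem.tag, ("", ""))
    parts = [before, elem.text or ""]
    for child in elem:
        parts.append(tei_to_latex(child))
        parts.append(child.tail or "")  # text that follows the child element
    parts.append(after)
    return "".join(parts)


if __name__ == "__main__":
    sample = ET.fromstring("<p>In <hi>the</hi> beginning</p>")
    print(tei_to_latex(sample))
```

Supporting only the handful of elements that the up-translation actually emits keeps such a table small, which is the same trade-off the custom stylesheet makes.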

Another custom stylesheet, tei2html, and its output: HTML.

What now? Plain text? A speech synthesizer? An eBook? You name it.

For example, we can try to extract the commented text from the markup. This is done essentially by the following CoST script fragment, which generates HTML:

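# For every S (span) element, print its text in bold ...
foreachNode doctree element S {
    puts "<strong>[filteredContent]</strong>:--"
    # ... then print the NOTE whose TARGET matches this span's ID
    withNode doctree element NOTE withattval TARGET [query attval ID] {
        puts "<em>[filteredContent]</em><br>"
    }
}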

that is, find all S elements (spans), for each one find the corresponding NOTE element that comments on that span, and show both in HTML. The resulting page looks like this. Looks promising, eh?
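
For readers without CoST at hand, the same pairing of spans and notes can be sketched in Python. This is an approximation under stated assumptions, not a translation of the real script. The sample document and the `commented_spans` function are invented, the input is assumed to be well-formed XML, and `filteredContent` is approximated by simply joining the element's text.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented sample; the element and attribute names (S, NOTE, ID, TARGET)
# follow the CoST fragment, the content is made up for illustration.
SAMPLE = """<TEXT>
<P><S ID="s1">In the beginning</S> God created the heaven and the earth.</P>
<NOTE TARGET="s1">A comment on the opening words.</NOTE>
</TEXT>"""


def commented_spans(xml_text):
    """Return HTML lines pairing each S span with the NOTEs that target it."""
    root = ET.fromstring(xml_text)
    # Index the notes by the ID they point at; notes without TARGET are skipped.
    notes = {}
    for note in root.iter("NOTE"):
        target = note.get("TARGET")
        if target is not None:
            notes.setdefault(target, []).append(note)
    lines = []
    for span in root.iter("S"):
        lines.append(f"<strong>{escape(''.join(span.itertext()))}</strong>:--")
        for note in notes.get(span.get("ID"), []):
            lines.append(f"<em>{escape(''.join(note.itertext()))}</em><br>")
    return lines


if __name__ == "__main__":
    print("\n".join(commented_spans(SAMPLE)))
```

Building the index of notes first means the document is walked only twice, instead of once per span as a nested query would do.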

We can make this functionality accessible to users, like I did here (example at the very bottom)...

Boris Tobotras
Last modified: Sun Oct 17 10:47:54 MSD 1999