Conversion of Knuth's WEB to HTML.
Currently only “Phase 1” and “Phase 2” (see a comment I wrote elsewhere) are implemented.
Prerequisites: You must have a Common Lisp implementation and ASDF (which likely comes with your Lisp).
First, download the project into a place where ASDF can see it (such as ~/common-lisp/ on *nix). Next, start a Lisp session and do
(require "asdf")
(asdf:load-system "web-to-html")
(in-package :web-to-html)
Now you have the program loaded. The entry point for Phase 1 is the Phase-1 function, which has the following syntax:
(Phase-1 WEB-file [change-file])
where WEB-file and change-file are streams. By default, change-file is (make-string-input-stream ""), i.e., an empty stream. An example usage would be
(with-open-file (input "tex.web")
  (Phase-1 input))
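If you also have a change file, the syntax given earlier suggests passing it as a second stream. The following is only a sketch: the file name "tex.ch" and the variable names are placeholders, not something the project prescribes.

```lisp
;; Sketch only: "tex.ch" is a hypothetical change-file name.
;; Both arguments to Phase-1 are streams, so each file is opened first.
(with-open-file (web "tex.web")
  (with-open-file (changes "tex.ch")
    (Phase-1 web changes)))
```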
After Phase-1 returns, you can use the following functions to inspect the collected data:
- (get-nth-section n) returns the representation of the nth section.
- (section-text section) returns the token list associated with a section object.
- (lookup-module name) returns the module object corresponding to the module named name.
- (lookup-prefix prefix) returns the module object uniquely identified by the given prefix (don't include “...”).
- (module-name module) returns the name associated with a given module object.
- (module-definitions module) returns a list of section numbers where the given module is defined.
- (map-modules function) calls the function on each module. They are traversed in tree order, but you shouldn't depend on this.
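As a rough illustration of how these functions fit together, the sketch below prints every module name followed by the section numbers where it is defined. It assumes that Phase 1 has already run, that map-modules passes each module object as the only argument, and that module names print sensibly with ~A; none of this is guaranteed by the description above.

```lisp
;; Sketch: list each module with its defining sections.
;; Assumes Phase-1 has completed and that module-definitions
;; returns a list of integers, as described above.
(map-modules
 (lambda (module)
   (format t "~A: ~{~D~^, ~}~%"
           (module-name module)
           (module-definitions module))))
```

Since the traversal order is not something to rely on, collect and sort the names yourself if you need a stable listing.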
To get the full text of a module module, you might run
(let ((definitions (module-definitions module)))
  (loop for definition in definitions
        appending (section-Pascal-part (get-nth-section definition))))
Running Phase 2 is easier; all you need to do is evaluate (Phase-2), provided that Phase 1 has already completed. However, most of Knuth's programs will need to be changed, because they have meta-comments (@{…@}) that confuse Phase 2. For WEAVE and TANGLE it suffices to remove the reference to the “Compiler directives” module, and the 'BREAKPOINT' comment in TeX's debug_help must also be removed.
Here are some things you might notice about the code:
- There are no abbreviations, except when the abbreviation is a single character long. This is a stylistic preference.
- Comments use the “incorrect” quotation marks `` and ''. I would just write “ and ”, but I've decided to make the program Unicode-agnostic, and I can't bear the sight of ".
- The condition system isn't used for errors in Phase 1. Eventually there will be a whole hierarchy of condition types, but for now errors are simply reported.
- There is a mysterious :long-distance token type, apparently corresponding to an @n control code. This is intended to be used to specify the section in which an identifier should be identified, in case it is ambiguous.