7 Scribble’s Extensibility
Scribble’s foundation on PLT Scheme lets us implement a number of features as libraries that ordinarily must be built into a documentation tool. More importantly, users can experiment with new and more interesting ways to write documentation without having to modify Scribble’s implementation.
In this section, we describe several libraries that we have already built: for stand-alone API documentation, for automatically running examples when building documentation, for combining code with documentation in the style of JavaDoc, and for literate programing.
7.1 API Specification
Targets for code hyperlinks are defined by defproc (for functions), defform (for syntactic forms), defstruct (for structure types), defclass (for classes in the object system), and other such forms – one for each kind of binding. When a library defines a new kind of binding, an associated documentation library can define a new form for documenting the bindings.
As we demonstrated in Scribbling Code, the defproc form documents a function given its name, information about its arguments, and a contract expression for its result. Information for each argument includes a contract expression, the keyword (if any) for the argument, and the default value (if any). For example, a louder function that consumes and produces a string might be documented as follows:
Adds “!” to the end of @scheme[str]. |
} |
The description of the function refers back to the formal argument str using scheme. In the typeset result, the reference to str is typeset in a slanted font both in the function prototype and description.
(louder str) → string? str : string? Adds “!” to the end of str.
As usual, lexical scope provides the connection between the formal-argument str and the reference. The defproc form expands to a combination of Scribble functions to construct a table representing the documentation and Scheme local-macro bindings to control the expansion and typesetting of the procedure description.
For the above defproc, the for-label binding of louder partly determines the library binding that is documented by this defproc form. A single binding, however, can be re-exported by many modules. On the reference side, the scheme and schemeblock forms follow re-export chains to discover the first exporting module for which a binding is documented; on the definition side, defproc needs a declaration of the module that is being documented. The module declaration is no extra burden on the document author, because the reader of the document needs some indication of which module is being documented, anyway.
The defmodule form both generates the user-readable explanation of the module being documented and declares that all definitions within the enclosing section (and sub-sections, unless overridden) correspond to exports from the declared module. Thus, if louder is exported by the comics/string library, it is documented as follows:
@(require scribble/manual |
(for-label scheme/base |
comics/string)) |
|
@title{String Manipulations} |
|
@defmodule[comics/string] |
|
Adds “!” to the end of @scheme[str]. |
} |
The defproc form is implemented by a scribble/manual layer of Scribble, which provides many functions and forms for typesetting PLT Scheme documentation. The scribble/manual layer is separate from the core Scribble engine, however, and other libraries can build up defproc-like abstractions on top of the core typesetting and cross-referencing capabilities described in Core Datatypes.
7.2 Examples and Tests
In the documentation for a function or syntactic form, concrete examples help a reader understand how a function works, but only if the examples are reliable. Ensuring that examples are correct is a significant burden in a conventional approach to documentation, because the example expressions must be carefully checked against the implementation (often by manual cut and paste), and a small edit can easily introduce a bug.
The examples form of the scribble/eval library typesets an example along with its result using the style of a read-eval-print loop. For example,
produces the output
Examples:
> (/ 1 2) 1/2
> (/ 1 2.0) 0.5
> (/ 1 +inf.0) 0.0
Since building the documentation runs the examples every time, the typeset results are reliable. When an author makes a mistake, or when an implementation changes so that the documentation is out of sync, the example remains correct – though it may not reflect what the author intended. For example, if we misspell +inf.0 in the example, then the output is still accurate, though unhelpful in describing the behavior of division:
Examples:
> (/ 1 +infinity.0) reference to undefined identifier: +infinity.0
To guard against such mistakes, an example expression can be wrapped with eval:check to combine it with an expected result:
Instead of typesetting an error message, this checked example will raise an exception when the document is built, since the expression does not produce the expected result 0.0. In this way, documentation source can serve partly as a test suite.
Evaluation of example code mingles two phases that we have otherwise worked to keep separate: the time at which a library is executed, and the time at which its documentation is produced. For simple functional expressions, such as (/ 1 2), the separation does not matter, and examples could simply duplicate its argument in both an expression position and a typeset position. More generally, however, examples involve temporary definitions and side-effects. To prevent examples from interfering with each other while building a large document, examples uses a sandboxed environment, for which PLT Scheme provides extensive support (Flatt et al. 1999).
7.3 In-Code Documentation
For some libraries, the programmer may want to write documentation with the source instead of in a separate document. To support such documentation, we have created a Scheme/Scribble extension that is used to document some libraries in the PLT Scheme distribution.
Using this extension, the comics/string module could be implemented as follows:
(require scheme/contract |
scribble/srcdoc) |
(require/doc scheme/base |
scribble/manual) |
|
(define (louder s) |
(string-append s "!")) |
|
[louder |
@{Adds “!” to the end of @scheme[str].}]) |
The #lang at-exp scheme/base line declares that the module uses scheme/base language extended with @-notation. The imported scribble/srcdoc library binds require/doc and provide/doc. The require/doc form imports bindings into a “documentation” phase, and provide/doc form exports louder, annotates it with a contract for run-time checking, and records the contract and description for inclusion in documentation. The description is an expression in the documentation phase; it is dropped by normal compilation of the module, but combined with the require/doc imports and inferred (require (for-label ...)) imports to generate the module’s documentation.
The documentation part of this module is extracted using include-extracted, which is provided by the scribble/extract module in cooperation with scribble/srcdoc. The extracted documentation might provide the entire text of the document directly, or it might be incorporated into a larger document:
@(require scribble/manual |
scribble/extract |
(for-label comics/string)) |
|
@title{Strings} |
|
@defmodule[comics/string] |
|
The @schememodname[comics/string] library |
provides functions for creating punchlines. |
|
@include-extracted[comics/string] |
An advantage of using scribble/srcdoc and scribble/extract is that the description of the function is with the implementation, and the function contract need not be duplicated in the source and documentation. Similarly, the fact that string? in the contract gets its binding from scheme/base is specified once in the code and inferred for the documentation. At the same time, a phase separation prevents document-generating expressions from polluting the library’s run-time execution, and vice versa.
7.4 Literate Programming
The techniques used for in-source documentation extend to the creation of WEB-like literate programming tools. Figure 3 shows an example use of our literate-programming library; the left-hand side shows a screenshot of DrScheme editing the source code for a short, literate discussion of the Collatz conjecture, while the right-hand side shows the rendered output.
Literate programs written with our library look like ordinary Scribble documents, except that they start with #lang scribble/lp and use chunk to introduce a piece of the implementation. A use of chunk consists of a name followed by definitions and/or expressions:
@chunk[<name-of-chunk> |
The definitions and expressions in a chunk can refer to other chunks by their names.
Unlike a normal Scribble program, running a scribble/lp program ignores the prose exposition and instead evaluates the program in the chunks. In literate programming terminology, this process is called tangling the program. Thus, to a client module, a literate program behaves just like its illiterate variant. The compiled form of a literate program does not contain any of the documentation, nor does it depend on the runtime support for Scribble, just as if an extra-linguistic tangler had been used. Consequently, the literate implementation suffers no overhead due to the prose.
To recover the prose, the
@lp-include[filename] |
form extracts a literate view of the program from filename. In literate programming terminology, this process is called weaving the program. The right-hand side of Figure 3 shows the woven version of the code in the screenshot.
Both weaving and tangling with scribble/lp work at the level of syntactic extensions, and not in terms of manipulating source text. As a result, the language for writing prose is extensible, since Scribble libraries such as scribble/manual can be imported into the document. The language for implementing the program is also obviously extensible, since a chunk can include imports from other PLT Scheme libraries. Finally, even the bridge between the prose and the implementation is extensible, because the document author can create new syntactic forms that expand to a mixture of prose, implementation, and uses of chunk.
Tangling via syntactic extension also enables many tools for Scheme programs to automatically apply to literate Scheme programs. The arrows in Figure 3’s screenshot demonstrate how DrScheme can draw arrows from chunk bindings to chunk references, and from the binding occurrence of an identifier to its bound occurrences, even across chunks. These latter arrows are particularly helpful with literate programs, where lexical scope is sometimes obscured by the way that textually disparate fragments of a program are eventually tangled into the same scope. DrScheme’s interactive REPL, test-case coverage support, module browser, executable generation, and other tools also work on literate programs.
To gain some experience with non-trivial literate programming in Scribble, we have written a 34-page literate program that describes our implementation of the Chat Noir game, which is distributed with PLT Scheme. The source is included in the distribution as "chat-noir-literate.ss", and the rendered output is in the help system and online at http://docs.plt-scheme.org/games/chat-noir.html.
Consider a function that, starting from (collatz n), recurs with
<even> ::=
(collatz (/ n 2))
if n is even and recurs with
<odd> ::=
(collatz (+ (* 3 n) 1))
if n is odd.
We can package that up into the collatz function:
<collatz> ::=
(define (collatz n) (unless (= n 1) (if (even? n) <even> <odd>))) The Collatz conjecture is true if this function terminates for every input.
Thanks to the flexibility of literate programming, we can package up the code to compute orbits of Collatz numbers too:
(define (collatz n) (cond [(= n 1) '(1)] [(even? n) (cons n <even>)] [(odd? n) (cons n <odd>)])) Finally, we put the whole thing together, after establishing different scopes for the two functions.
<*> ::=
(require scheme/local) (local [<collatz-sequence>] (collatz 18)) (local [<collatz>] (collatz 18))