78-lightweight-testing/srfi-78.txt

Title

Title
-----

Lightweight testing

Author
------

Sebastian.Egner@philips.com

Abstract
--------

A simple mechanism is defined for testing Scheme programs.
As a most primitive example, the expression

   (check (+ 1 1) => 3)

evaluates the expression (+ 1 1) and compares the result
with the expected result 3 provided after the syntactic
keyword =>. Then the outcome of this comparison is reported
in human-readable form by printing a message of the form

   (+ 1 1) => 2 ; *** failed ***
   ; expected result: 3

Moreover, the outcome of any executed check is recorded
in a global state counting the number of correct and failed
checks and storing the first failed check. At the end of a
file, or at any other point, the user can print a summary.

In addition to the simple test above, it is also possible
to execute a parametric sequence of checks. Syntactically,
this takes the form of an eager comprehension in the sense
of SRFI 42 [5]. For example,

   (check-ec (:range e 100)
             (:let x (expt 2.0 e)) 
             (= (+ x 1) x) => #f (e x))

This statement runs the variable e through {0..99} and
for each binding defines x as (expt 2.0 e). Then it is 
checked if (+ x 1) is equal to x, and it is expected that
this is not the case (i.e. expected value is #f). The
trailing (e x) tells the reporting mechanism to print
the values of both e and x in case of a failed check.
The output could look like this:

   (let ((e 53) (x 9007199254740992.0)) (= (+ x 1) x)) => #t ; *** failed ***
    ; expected result: #f

The specification of bindings to report, (e x) in the 
example, is optional but very informative.

Other features of this SRFI are:
* A way to specify a different equality predicate (default is equal?).
* Controlling the amount of reporting being printed.
* Switching off the execution and reporting of checks entriely.
* Retrieving a boolean if all checks have been executed and passed.

Rationale
---------

The mechanism defined in this SRFI should be available in
every Scheme system because it has already proven useful
for interactive development---of SRFIs. 

Although it is extremely straight-forward, the origin of the
particular mechanism described here is the 'examples.scm' file
accompanying the reference implementation of SRFI 42.
The same mechanism has been reimplemented for the reference
implementation of SRFI 67, and a simplified version is yet
again found in the reference implementation of SRFI 77.

The mechanism in this SRFI does not replace more sophisticated
approaches to unit testing, like SRFI 64 [1] or SchemeUnit [2].
These systems provide more control of the testing, separate
the definition of a test, its execution, and its reports, and
provide several other features.

Neil Van Dyke's Testeez library [3] is very close in spirit 
to this SRFI. In Testeez, tests are disabled by (re-)defining a
macro. The advantage of this method is that the code for the
test cases can be removed entirely, and hence even the dependency
on the Testeez library. This SRFI on the other hand, uses a
Scheme conditional (COND, IF) to prevent execution of the
testing code. This method is more dynamic but retains dead
testing code, unless a compiler and a module system are used
to apply constant folding and dead code elimination. The only
major addition in SRFI over Testeez is the comprehension for
formulating parametric tests.

Design considerations for this SRFI include the following:
* Reporting is human-readable and as specific as possible,
  i.e. not just "assertion failed" but the expression with
  actual and expected value, and if possibly the relevant 
  part of the bindings environment.
* An effort is made to print closed Scheme expressions, i.e. 
  expressions that can directly be copy/pasted into a REPL
  for further analysis (e.g. the let expression in the abstract).
* By default the checks report both correct and failed checks.
  However, it is possible to reduce the output---or even to 
  switch off the execution of checks. It has turned out useful
  to be able to run only some subset checks for the features
  currently under development. This can be done by changing
  the reporting mode between differnt sections.
* The global state (correct/failed count) is not made available
  to the user program. This reduces the dependencies between
  different checks because it is not possible to use the state.
* Ocassionally, it is useful to check that a certain expression 
  does *not* yield an ordinary result but raises an error. However, 
  R5RS [4] does not specify the mechanism by which this occurs
  (e.g. raising exception, breaking into a REPL, aborting the
  program, etc.). For this reason, this SRFI is restricted to 
  the case that the checked expressions evaluate normally.
* Though usually I am very much in favor of strictly prefix
  syntax, for this SRFI I make an exception because the 
  infix "=>" syntax is widely used and intuitive.

Specification
-------------

(check <expr> (=> <equal>) <expected>)                                   MACRO
(check <expr>  =>          <expected>)
   evaluates <expr> and compares the value to the value
   of <expected> using the predicate <equal>, which is
   equal? when omitted. Then a report is printed according
   to the current mode setting (see below) and the outcome
   is recorded in a global state to be used in check-report.
      The precise order of evaluation is that first <equal>
   and <expected> are evaluated (in unspecified order) and
   then <expr> is evaluated.
   Example: (check (+ 1 1) => 2)

(check-ec <qualifier>^* <expr> (=> <equal>) <expected> (<argument>^*))   MACRO
(check-ec <qualifier>^* <expr>  =>          <expected> (<argument>^*))
(check-ec <qualifier>^* <expr> (=> <equal>) <expected>)
(check-ec <qualifier>^* <expr>  =>          <expected>)
   an eager comprehension for executing a parametric set of checks.
      Enumerates the sequence of bindings specified by <qualifier>^*.
   For each binding evaluates <equal> and <expected> in unspecified
   order. Then evalues <expr> and compares the value obtained to the
   value of <expected> using the value of <equal> as predicate, which
   is equal? when omitted.
      The comprehension stops after the first failed check, if there
   is any. Then a report is printed according to the current mode 
   setting (see below) and the outcome is recorded in a global state 
   to be used in check-report. The entire check-ec counts as a single 
   check.
      In case the check fails <argument>^* is used for constructing an
   informative message with the argument values. Use <argument>^* to
   list the relevant free variables of <expr> (see examples) that you
   want to have printed.
      A <qualifier> is any qualifier of an eager comprehension as
   specified in SRFI 42 [1].

   Examples:
     (check-ec (: e 100) (positive? (expt 2 e)) => #t (e)) ; fails on fixnums
     (check-ec (: e 100) (:let x (expt 2.0 e)) (= (+ x 1) x) => #f (x)) ; fails
     (check-ec (: x 10) (: y 10) (: z 10)
               (* x (+ y z)) => (+ (* x y) (* x z))
               (x y z)) ; passes with 10^3 cases checked

(check-report)                                                     PROCEDURE
   prints a summary and the first failed check, if there is any,
   depending on the current mode settings.

(check-set-mode! mode)                                             PROCEDURE
   sets the current mode to mode, which must be a symbol in
   '(off summary report-failed report), default is 'report.
   The mode symbols have the following meaning:
     off:           do not execute any of the checks
     summary:       print only summary in (check-report) and nothing else
     report-failed: report failed checks when they happen, and in summary
     report:        report every example executed
   Note that you can change the mode at any time, and that check,
   check-ec and check-report use the current value.

(check-reset!)                                                     PROCEDURE
   resets the global state (counters of correct/failed examples)
   to the state immediately after loading the module for the
   first time, i.e. no checks have been executed.

(check-passed? expected-total-count)                               PROCEDURE
   #t if there were no failed checks and expected-total-count 
   correct checks, #f otherwise. 
     Rationale: This procedure can be used in automatized
   tests by terminating a test program with the statement
   (exit (if (check-passed? <n>) 0 1)).

Reference implementation
------------------------

'check.scm': 
  implementation in R5RS + SRFI 23 (error) + SRFI 42 (comprehensions);
  tested under PLT 208p1 and Scheme 48 1.3.

'examples.scm':
  a few examples.

References
----------

[1] SRFI 64 by Per Bothner: A Scheme API for test suites. January 2005.
    http://srfi.schemers.org/srfi-64 

[2] Noel Welsh: SchemeUnit. February 2003.
    http://schematics.sourceforge.net/schemeunit.html

[3] Neil Van Dyke: 
    Testeez, Lightweight Unit Test Mechanism for Scheme. May 2005.
    http://www.neilvandyke.org/testeez

[4] Revised^5 Report on the Algorithmic Language Scheme (R5RS).
    http://www.schemers.org/Documents/Standards/R5RS/

[5] SRFI 42 by Sebastian Egner: Eager Comprehensions.
    http://srfi.schemers.org/srfi-42 

Copyright
---------

Copyright (C) Sebastian Egner (2005-2006). All Rights Reserved. 

Permission is hereby granted, free of charge, to any person obtaining 
a copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions: 

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software. 

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
DEALINGS IN THE SOFTWARE.