defs-parser.ss
#lang scheme

; TODO:
; - Pouvoir choisir le préfixe des commentaires spéciaux
; (certains préfèrent ";;", voire on pourrait choisir ";")
; -> en faire un paramètre

;: @(require (planet cce/scheme:6:0/planet)
;:           (this-package-in defs-parser)
;:           (for-label (this-package-in package)))

;[:title Scribble Definition Parser]

;: @(define (comm . txt) (litchar (apply string-append ";: " txt)))
;: @(define (ccom . txt) (litchar (string-append ";[:" (apply string-append txt) "]")))
;: @(define scrbl @filepath{.scrbl})

;: @(define (my-prog . str) (apply verbatim #:indent 5 str))

;: This module provides tools to easily write documentation for planet packages.
;: It builds on the Simple Parser (see @secref{simple-parser}).
;:
;: The Scribble Definition Parser reads scheme plain text files
;: and generates a corresponding documentation.
;: It currently lacks many features and is certainly not perfect
;: to build fully reliable documentations,
;: but it can be used at least as a helper to semi-automate the process.
;:
;: This parser is meant to be used by either lazy people
;: who don't like documenting sources, or think it takes too much time,
;: or those that want to document their sources incrementally (i.e.,
;: do some doc now, do the rest later).
;: Consider using the built-in inline source documentation of Scribble
;: if your are not as lazy as me.
; (lien ??)
;:
;: Contracts and descriptions of functions and forms
;: are added as comments in the (plain text) source files.
;: Most of the job is done automatically by the parser, like writing
;: function and form names, argument names, keywords and default values,
;: and even some of the contracts.
;:
;: It allows for incremental documenting, ranging from writing nothing
;: to writing restrictive contracts and text.
;: When contracts are not provided,
;: the parser tries to use some default conventions (e.g., $l is given the contract $list?),
;: and if that fails too, then the most general ones are used.
;:
;: It is always possible to fall-back to the default Scribble documentation
;: by using @ccom{skip} before a given definition and then documenting
;: it as Scribble code inside @comm{} comments.
;:
;: This parser does not yet use the real contracts used in Scheme sources
;: and the one used are written inside comments and are
;: for documentation purposes only (i.e., they are not added to Scheme code).
;:

; In-source scribble documenting:
;file:///C:/Users/orseau/AppData/Roaming/PLT%20Scheme/planet/300/4.2.1/cache/mflatt/scribble-paper.plt/2/1/planet-docs/scribble/extensions.html
;file:///C:/Program%20Files/PLT/doc/scribble/srcdoc.html

; Documenting forms, procs, ids, params, ... :
; file:///C:/Program%20Files/PLT/doc/scribble/doc-forms.html


;[:section In-Source Documenting]
;: The parser provides some comment-commands to document source-code
;: inside a module file.
;:
;: To write a line of text that will be added to the @scrbl file,
;: just write a comment starting with @comm{} followed by the text,
;: which can of course contain scribble commands that will be interpreted
;: when PLaneT will compile the @scrbl files.
;:
;: To add a title to the module, just use the command @ccom{title My Title}.
;: The $#:tag is automatically set to the name of the module.
;: This will also add the @scheme[defmodule] declaration (from my experience,
;: it must be set after the title, otherwise it can cause obscur errors when installing
;: the package).
;:
;: To add a section, use @ccom{section My Section}. Same for subsection.
;: If you want to use @(litchar "; my text") instead of @comm{my text}
;: for several lines, surround
;: the paragraph with the lines @ccom{text} and @ccom{end-text}.

;[:subsection Documenting Definitions]

;: An important part of documenting is to describe definitions (of functions, forms, etc.)
;: and to give contracts to the arguments.
;: This parser provides tools to make it simpler in many cases.
;: It automatically recognizes functions with keywords and default arguments,
;: some forms (when the head of the form provides information
;: on how it works), and parameters.
;: Furthermore, it only parses definitions that the module provides.
;: It also parses $require to correctly output relative paths in the @scrbl file.
;:
;: See @secref{package} for functions that augment the
;: @scrbl file with useful information.
;:
;: If the parser does not (yet) recognize a definition, you can still
;: document it with scribble functions, preceded by @comm{}.
;: If the parser fails to properly recognize a definition, you can precede it with the
;: @ccom{skip} command, and then document it like if the definition is not recognized.
;:
;[:subsection Documenting Functions]
;: To document a given function, text must be added after the head of the function and before
;: its body, and must be preceded by @comm{}.
;: In such text, if the parser reads a $ followed by a scheme identifier (but with no dot in it),
;: say @(litchar (string-append "$" "foo")),
;: it is translated into @(litchar "@scheme[foo]").
;: More complex expressions can of course use directly @(litchar "@scheme[]").
;:
;: Here is an example on how to write the description text of a function:
;: @(define prog1
;: (string-append
;:"(define (foo x [name 'me])
;: ;: Returns the string \"foo\" followed by $" "x
;: ;: and the $" "name.
;: ( .... ))"))
;: @(my-prog prog1)
;:
;[:subsection Contracts]
;:
;: It is possible to write contracts for the function arguments
;: and for the return value:
;: @(define prog2
;: (string-append
;:  "(define (foo x [name 'me]) ;:-> string?
;: ;: [x number?]
;: ;: [name symbol?]
;: ;: Returns the string \"foo\" followed by $" "x
;: ;: and the $" "name.
;: ( .... ))"))
;: @(my-prog prog2)
;: The return value is preceded by the special comment @(litchar ";:->").
;:
;: The parser creates only documentation contracts, which do not interfere
;: with actual contracts defined in Scheme for the module.
;:
;: A more concise contract definition can be used, without repeating the argument ids:
;: @(define prog3
;: (string-append
;:"  (define (foo x          ;: number?
;:               [name 'me] ;: symbol?
;:               )          ;:-> string?
;:   ;: Returns the string \"foo\" followed by $" "x
;:   ;: and the $" "name.
;:   ( .... ))"))
;: @(my-prog prog3)
;: This will be translated to:
;: @(my-prog (scrbl-parse-text prog3))
;:
;: If a contract is not given, it defaults to $any or $any/c, except
;: when a convention can be used.

;[:subsection Conventions]
;:
;:
;: Sometimes the programmer uses untold conventions for
;: function argument ids, like $lst for a list, $str for a string, etc.
;: The parser can take advantage of such conventions to avoid writing such
;: obvious contracts.
;: For example:
;: @(define conv-example
;: (string-append
;: ";[:convention x number?]
;:;[:convention z (listof string?)]
;:
;:(define (foo x) ;:-> z
;: ;: This is $" "foo.
;: (....))
;:(define (bar x) ;:-> z
;: ;: This is $" "bar.
;: (....))"))
;: @(my-prog conv-example)
;: will be translated to:
;: @(my-prog (scrbl-parse-text conv-example))
;: The convention is applied to all remaining functions (but is not applied backwards).
;: Conventions added when parsing the source file do not persist
;: for future parsing of other source files.
;: @(remove-convention "x")
;: @(remove-convention "z")
;:
;: There also exists the @ccom{remove-convention x} to remove a convention that was previously
;: bound to $x, and also @ccom{remove-all-conventions}.
;:
;: Some default conventions are already defined:
;[:text]
;@(tabular
;  (map (λ(row)(list (verbatim (car row)) (verbatim " : ") (verbatim " " (cdr row))))
;       (reverse (conventions))))
;[:end-text]
;: They have less priority than local conventions, which have less priority than
;: per-definition contracts.
;:
;: Others may be added in future versions or upon request.
;:
;: Conventions can also be managed programmatically using $add-convention,
;: $remove-convention, $remove-all-conventions, and the $conventions parameter.

;[:subsection Writing Examples]
;: To insert examples of interactions, surround your examples
;: with @ccom{examples} and @ccom{end-examples}. It uses the @scheme[@examples]
;: scribble function, but automatically creates an evaluator that requires the current module.
;: Examples must be preceded by @(litchar ";") and not @comm{}.
;: For example:
;: @(define example-example
;: ";[:examples]
;:; (+ 2 7)
;:; (display \"plop\")
;:;[:end-examples]")
;: @(my-prog example-example)
;: will be translated to:
;: @(my-prog (scrbl-parse-text example-example))
;: where @scheme[(make-my-eval)] is defined in the generated
;: @scrbl file when using functions of the module @secref{package}.

;[:subsection Documenting Parameters]
;: For a parameter, the comments must be placed after the closing ")"
;: matching the "(make-parameter" parenthese.
;: For now, "(make-parameter" must be on the same line as "(define ....".
;: A @comm{argid my-arg-id} special comment can be used to rename
;: the argument of the parameter (default $x).
;:
;: For example:
;: @(define prog-param1
;: (string-append
;:"(define my-param (make-parameter \"something\") ;:-> string?
;: ;[:arg-id the-param-arg]
;: ;: A parameter that controls something.
;: ;: Default: \"something\"
;: )"))
;: @(my-prog prog-param1)
;: is translated to:
;: @(my-prog (scrbl-parse-text prog-param1))

;[:subsection Documenting Forms]
;: Currently, the parser only supports correctly $define-syntax-rules forms.
;: Other forms have to be documented by hand in @comm{} comments.

;[:section Definitions]

;: The following definitions are provided
;: to allow for possible extensions or modifications of the parser's default behavior
;: but are not necessary to generate scribble documentation.
;: To generate @scrbl files with this parser,
;: see the @secref{package} module.

(require "simple-parser.ss"
         "common.ss"
         )

(provide ;scrbl-parser
         add-convention
         remove-convention
         remove-all-conventions
         conventions
         scrbl-parse-text
         scrbl-parse-file
         )

(define scrbl-parser (new-parser))

(define scheme-word-re (string-append "[^\\s" (regexp-quote "()[]{},.;'\"") "]+"))
(define scheme-keyword-re (string-append "#:" scheme-word-re))
(define default-pair-re 
  (string-append "\\[(" scheme-word-re ")\\s+([^\\]]+)\\s*\\]"))
(define define-syntax-re 
  (string-append "\\(define-syntax[^\\s\\(]*\\s*\\((" scheme-word-re ")"))
(define keyword-default-re (string-append "(" scheme-keyword-re ")\\s+"
                                          default-pair-re))

(define open-paren-re "(?:\\(|\\[|\\{)")
(define close-paren-re "(?:\\)|\\]|\\})")
  
(define (comm-re re) 
  ; regexp for parsing data comments
  (string-append "^\\s*;\\[\\:" re "\\s*\\]\\s*$"))
(define comm-line-re "^\\s*;:(.*)$") ; ;: bla bla bla


;[:convention w string?]
;[:convention con string?]
(define conventions (make-parameter '())
  ;: A parameter that holds the current dictionary of conventions.
  )
(define (add-convention w con) ;:-> void?
  ;: Adds a module-wise convention.
  ;: For example, if the parser reads @ccom{convention text string?}
  ;: then all following argument named $text will be given the contract $string? by default.
  ;: This behavior does not have the priority on per-definition contracts.
  (conventions (cons (cons w con) (conventions))))
(define (remove-convention w) ;:-> void?
  ;: Removes a convention module-wise.
  (conventions (dict-remove (conventions) w)))
(define (remove-all-conventions) ;:-> void?
  ;: Removes all conventions module-wise.
  (conventions '()))

(define (get-contract w)
  (dict-ref (conventions) w "any/c"))
(define (get-out-contract w)
  (dict-ref (conventions) w w))

; default conventions
(for-each 
 (λ(pair) (add-convention (first pair) (second pair)))
 '(("l" "list?")
   ("lst" "list?")
   ("ll"  "(listof list?)")
   ("n" "number?")
   ("num" "number?")
   ("str" "string?")
   ("s" "string?")
   ("sym" "symbol?")
   ("vec" "vector?")
   ("proc" "procedure?")
   ("fun" "procedure?")
   ("file" "path-string?")
   ("path" "path-string?")
   ))


(define skip-next? #f)
(define (skipped w)
  (set! skip-next? #f)
  (string-append "@;SKIP[" w "]\n\n"))

; what the read module provides
; modified by parse-module
(define provided (make-parameter #t))
(define (provided? str)
  (or (equal? (provided) #t)
      (member (string->symbol str) (provided))))

(define (not-provided w)
  (string-append "@;NOT-PROVIDED[" w "]\n\n"))

;(define-struct one-def
;  (name args out-contract text)
;  #:mutable)

(define one-def% 
  (class object%
    (super-new)
    (init-field [name #f] [text ""])
    (define/public (set-name n)(set! name n))
    (define/public (set-text t) (set! text t))
    (define/public (header-string) (string-append "@defform[" name "]"))
    (define/public (to-string)
      (cond [skip-next? (skipped name)]
            [(provided? name) (string-append (header-string)
                                             "{\n\n" text "\n}\n\n")]
           [else (not-provided name)]))
    ))
(define one-defform%
  (class one-def%
    (super-new)
    (init-field [header ""])
    (inherit-field name)
    (define/public (set-header h)(set! header h))
    (define/override (header-string)
      (string-append "@defform[(" name header ")]"))
    ))
(define one-defparam%
  (class one-def%
    (super-new)
    (init-field [out-contract "any/c"] [arg-id "x"])
    (inherit-field name)
    (define/public (set-out-contract con)(set! out-contract con))
    (define/public (set-arg-id a) (set! arg-id a))
    (define/override (header-string)
      (string-append "@defparam[" name " " arg-id " " out-contract "]"))
    ))

(define (arg->string arg)
  (let-values ([(num kw arg con val) (apply values arg)])
    (string-append " [" (if kw kw "") " "
                   arg " "
                   (if con con (get-contract arg)) " "
                   (if val (first val) "") "]")))
(define (modif-arg arg new-arg)
  (map (λ(a na)(if na na a)) arg new-arg))
  
(define one-defproc%
  (class one-def%
    (super-new)
    (inherit-field name)
    (init-field [args '()]  ; args : (num #:keyword arg contract (default-value)))
                ; the rest argument has number -1
                [out-contract "any"])
    (define/public (set-out-contract con)(set! out-contract con))
    (define arg-num 0)
    (define/public (add-arg . nonum-arg) ; arg: (#:keyword arg contract (default-value))) (no num)
      ; to use in args-id-phase
      (set-arg-numed (cons arg-num nonum-arg))
      (++ arg-num))
    (define/public (set-arg . nonum-arg) ; to use before or after args-id-phase
      (set-arg-numed (cons #f nonum-arg)))
    (define/public (set-arg-numed new-arg)
      (let ([found #f])
        (set! args
              (map (λ(arg)(if (equal? (third arg) (third new-arg))
                              (begin (set! found #t) (modif-arg arg new-arg))
                              arg))
                   args))
        (unless found (set! args (cons new-arg args)))))
    (define/public (set-last-arg-contract con)
      (let ([last-arg (argmax first args)])
        (set-arg-numed (list-set last-arg 3 con))))
    (define/override (header-string)
      (let ([sort-args (sort args < #:key first)])
        (let-values ([(rest-arg args) (if (and (not (empty? sort-args))
                                               (equal? -1 (first (first sort-args))))
                                          (values (first sort-args) (rest sort-args))
                                          (values #f sort-args))])
      (string-append "@defproc[(" name
                     (apply string-append (map arg->string args))
                     (if rest-arg (string-append (arg->string rest-arg) " ...") "")
                     ") " (get-out-contract out-contract) "]"))))
    ))

(define next-def #f)

(define ident #f)

;;;;;;;;;;;;;;;;;;
;;;   Parser   ;;;
;;;;;;;;;;;;;;;;;;

(add-items 
 scrbl-parser 
 (#t 
  [#t identity]
  )
 
 ('start
  [#t ""] ; skips lines (no line-break)
  ["^\\s*\\(require\\s+" 
   (λ(s)(sub-parse 'paren 
                   (λ(text)(string-append "@(require (for-label " 
                                          (parse-text scrbl-parser text 
                                                      #:phase 'require)
                                          ")\n\n"))))]
  [(string-append "^\\s*\\(define\\s*\\((" scheme-word-re ")")
   (λ(s w)(set! next-def (new one-defproc% [name w]))
     (sub-parse 'args-id
                (λ(header)
                  (sub-parse 'desc
                             (λ(text)(send next-def set-text text)
                               (send next-def to-string))))))]
  ; next-def will be filled with the correct values
                   
  [(string-append "^\\s*\\(define\\s+(" scheme-word-re ")\\s*\\(\\s*make\\-parameter")
   ;.*;:->\\s*(.*)\\s*$")
   (λ(s w)(set! next-def (new one-defparam% [name w]))
     (sub-parse 'paren ; end the (make-parameter parenthese
                (λ(useless)
                  (sub-parse 'desc
                             (λ(text)(send next-def set-text text)
                               (send next-def to-string))))))]


  [define-syntax-re
    ; récupérer jusqu'à la parenthèse fermante et afficher le texte tel quel
    (λ(s w)(set! next-def (new one-defform% [name w]))
      (sub-parse 'paren ; get the header until the next matching parenthese
                 (λ(header) (send next-def set-header (trim header 0 1))
                   (sub-parse 'desc ; get the description text
                              (λ(text)(send next-def set-text text)
                                (send next-def to-string))))))]
  ; processes comments :
  [(comm-re "title\\s+(.*)") (λ(s t)(string-append "@title[#:tag \"" 
                                                   (this-filename) "\"]{" t "}\n"
                                                   "@(defmodule/this-package " (this-filename) ")\n\n"))]
  ; WARNING!! The [:title] field is mandatory! Otherwise it wont declare the module!
  ; Also: the @defmodule must be AFTER the title, otherwise there is an error on installing the package (not on building it though)
  [(comm-re (string-append "convention\\s+(" scheme-word-re ")\\s+(.*)"))
   (λ(s w con)(add-convention w con)"")]
  [(comm-re (string-append "remove-convention\\s+(" scheme-word-re ")"))
   (λ(s w)(remove-convention w)"")]
  [(comm-re "remove-all-conventions")
   (λ(s)(remove-all-conventions)"")]
  [(comm-re "section\\s+(.*)")
   (λ(s t)(string-append "@section{" t "}\n\n"))]
  [(comm-re "subsection\\s+(.*)")
   (λ(s t)(string-append "@subsection{" t "}\n\n"))]
  [comm-line-re (λ(s t) ; line-comment
                      ; calls the same parser on a subtext in a different phase
                      (string-append (parse-text scrbl-parser t #:phase 'desc-text) "\n"))] 
  [(comm-re "text") (λ(s)(switch-phase 'text))]
  [(comm-re "skip") ; skip next definition
   (λ(s)(set! skip-next? #t)"")]
  [(comm-re "examples") 
   (λ(s)(switch-phase 'examples)"@(examples #:eval (make-my-eval)\n")]
;   (λ(s)(switch-phase 'examples)"@(examples #:eval (evaluator)\n")]
  )
 
 ('require
  [#t identity]
  ["\"([^\"]*)\"" 
   (λ(s f)(let-values ([(filename ext) (file->name-ext f)])
            (string-append  "(this-package-in " filename ")")))]
  )
 
 ('examples
  ["^\\s*;(.*)" (λ(s t)t)]
  [(comm-re "end-examples") (λ(s)(switch-phase 'start) ")\n\n")]
  ; instead of switch-phase, a return-to-previous-phase would be great!
  )
 
 ('text
  ["^\\s*;(.*)" 
   ; calls the same parser on a subtext in a different phase
   (λ(s t)(parse-text scrbl-parser t #:phase 'desc-text))] 
  [(comm-re "end-text") (λ(s)(switch-phase 'start))]
  )
 
 ('desc ; description subparser
  ; after the ")" ending the function header ; text-description sub-parser
  [#t ""]
  [";:(.*)" (λ(s t)
              ; calls the same parser on a subtext in a different phase
              (string-append (parse-text scrbl-parser t #:phase 'desc-text) "\n"))] 
  [";:\\->\\s*(.*)"  ;:-> out-contract?
   (λ(s con)(send next-def set-out-contract con) "")]
  [(string-append ";:\\s*\\[(" scheme-word-re ")\\s*(.*)\\]") ; ;: [f procedure?]
   (λ(s w con)(send next-def set-arg #f w con #f) "")]
  [(comm-re "arg-id\\s+(.*)")
   (λ(s arg-id)(send next-def set-arg-id arg-id)"")]
  ["^\\s*[^;\\s]" (λ(s)(sub-parse-return))]
  )
 
 ('desc-text ; phase for the description phase
  [#t identity] ; useless because inherited, but for easier comprehension
  [(string-append "\\$(" scheme-word-re ")") ; $var -> @scheme[var]
   (λ(s v) (string-append "@scheme[" v "]"))]
  [(txt "\\n") "\n"]
  )
 
 ('args-id  ;parses the arguments of the function
  ; most generic matchers first (less priority)
  ["\\)"  ; close the definition
   (λ(s)(sub-parse-return))]
  [scheme-word-re (λ(s)(send next-def add-arg #f s #f #f) "")]
  [(string-append "\\.\\s+(" scheme-word-re ")") ; . rest
   (λ(s w)(send next-def set-arg-numed (list -1 #f w #f #f)) "")]
  [(string-append "\\[(" scheme-word-re ")") ;default-pair-re ; [arg 5]
   (λ(s w)(sub-parse 'paren 
                      (λ(val)(send next-def add-arg #f w #f (list (trim val 0 1))) "")))]
  [(string-append "(" scheme-keyword-re ")\\s+(" scheme-word-re ")") ; #:arg arg
   (λ(s kw w) (send next-def add-arg kw w #f #f) "")]
  [(string-append "(" scheme-keyword-re ")\\s+\\[(" scheme-word-re ")") ; #:p [p 5]
   (λ(s kw w)(sub-parse 'paren 
                        (λ(val)(send next-def add-arg kw w #f (list (trim val 0 1))) "")))]
  ; parse args-id phase comments:
  [";(.*)" ""]
  [(string-append ";:\\s*(.*)\\s*") ; (define (plop v ;:number? ...
   (λ(s con)(send next-def set-last-arg-contract con) "")]
  )
 
 ('paren ; parenthesis sub-parser
  [close-paren-re (λ(s)(sub-parse-return s))]
  [open-paren-re (λ(s)(sub-parse 'paren)s)]
  ; recursive call to the sub-parser (ok because there is a stack of sub-parsers)
  [";.*" ""] ; deletes comments
  ["\"" (λ(s)(sub-parse 'string) s)]
  [(map txt '("#\\(" "#\\)" "#\\[" "#\\]" "#\\{" "#\\}")) identity] 
  )
 
 ('string ; string sub-parser
  ["[^\\\\\"]*" identity]
  ["\"" (λ(s)(sub-parse-return s))]
  ["\\\\." identity]
  )
 )

(define (scrbl-parse-text #:phase [phase 'start] 
                          #:prov [prov #t] . text) ;:-> string?
  ;: [text string?]
  ;: Parses $text with the scribble definition parser.
  ;: A list of provided definitions can be given to $prov so that
  ;: only them are parsed.
  (parameterize ([provided prov])
    (apply parse-text scrbl-parser text)))

(define this-filename (make-parameter "UNKNOWN-FILE"))
  
(define (scrbl-parse-file filename ext [prov #t]) ;:-> string?
  ;: [filename string?]
  ;: [ext string?]
  ;: Parses the scheme plain text source file $filename.$ext and returns its documentation string.
  ;: The parsed-file should be in the $current-directory.
  ;: By default, all definitions are parsed, but only a subset is currently supported,
  ;: like functions, parameters, and forms of the type @scheme[(id . arg)].
  ;: Like for $scrbl-parse-text, a list of provided definitions can be supplied.
  (parameterize ([this-filename filename])
    (apply scrbl-parse-text #:prov prov
           (file->lines (string-append filename "." ext)))))

; un nom finissant par "?" renvoie un boolean

;(display (scrbl-parse-file "common.ss"))
;(display (scrbl-parse-file "defs-parser.ss"))
;(display (scrbl-parse-file "package.ss"))