basic-blocks: compute basic blocks from a list of statements and labels
Danny Yoo <dyoo@hashcollision.org>
1 Introduction
This package takes a list of statements and labels, and breaks them down into basic blocks. As a quick example:
> (require (planet dyoo/basic-blocks))
> (define a-block (fracture '(entry (blah) (baz) (goto entry))))
> (length a-block) 1
> (bblock-name (first a-block)) 'entry
> (bblock-stmts (first a-block)) '((blah) (baz) (goto entry))
> (bblock-succs (first a-block)) (set 'entry)
See A larger example for a more substantial use of this library.
2 API
(require (planet dyoo/basic-blocks:1:=0)) |
The main structure that this library produces is the bblock basic-block structure.
(struct bblock (name entry? stmts succs next-succ) #:extra-constructor-name make-bblock) name : symbol entry? : boolean stmts : (listof statement) succs : (set/c (or/c symbol DYNAMIC)) next-succ : (or/c symbol #f)
bblock also stores what blocks are successors of a given block; the set of these jump targets can be accessed with bblock-succs. If the block has a dynamic (computed) jump, then DYNAMIC is a member of bblock-succs.
For convenience, if the block ends with a conditional jump, then bblock-next-succ refers to the block that follows immediately next.
(fracture stmts [ #:entry-names entry-names #:label? label? #:label-name label-name #:jump? jump? #:jump-targets jump-target #:fresh-block-name fresh-block-name]) → (listof bblock) stmts : (listof (or/c statement label)) entry-names : (listof symbol) = '() label? : (any/c -> boolean) = default-label? label-name : (label -> symbol) = default-label-name jump? : (any/c -> boolean) = default-jump?
jump-target : (jump -> (listof (or/c symbol? NEXT DYNAMIC))) = default-jump-targets fresh-block-name : (-> symbol) = default-fresh-block-name
The very first statement of stmts must be a label; it’s assumed to be the central entry point. It’s important to provide a #:entry-names that names all entry points we care about. Although fracture does not do much optimization, it does omit blocks that can not be reached by any entry-point basic block.
> (fracture '(entry (hello world) (goto done) another-entry-point (this is another block) done (bye)))
(list
(bblock 'entry #t '((hello world) (goto done)) (set 'done) #f)
(bblock 'done #f '((bye)) (set) #f))
Note that another-entry-point does not occur in the outputted set of basic blocks, because fracture could not find a path from any basic block to it.
> (fracture #:entry-names '(another-entry-point) '(entry (hello world) (goto done) another-entry-point (this is another block) done (bye)))
(list
(bblock 'entry #t '((hello world) (goto done)) (set 'done) #f)
(bblock 'another-entry-point #t '((this is another block)) (set 'done) 'done)
(bblock 'done #f '((bye)) (set) #f))
All of the other default values assume a particular structure for labels and jump statements. The default can be overriden by providing for the keyword arguments. The example in A larger example shows how to use fracture on a somewhat different statement structure than the default.
2.1 Labels
By default, a label is defined to be a symbol, and getting its name is just the identity.
(define (default-label? x) (symbol? x)) (define (default-label-name a-label) (cond [(symbol? a-label) a-label] [else (raise-type-error 'default-label-name "symbol" a-label)]))
2.2 Jumps
A jump is either conditional or unconditional, and may jump to a static or dynamic target. #:jump? consumes a statement, and produces true if the statement is a conditional or unconditional jump.
(define (default-jump? x) (match x [(list 'goto target) #t] [(list 'if condition 'goto target) #t] [else #f]))
Given a jump statement, #:jump-targets produces a list of the targets that the jump can go to. We can specify that it’s a conditional jump by including the constant NEXT as one of its targets. We can also use the constant DYNAMIC to indicate that the jump has a runtime-dependent target.
NEXT : jump-target
DYNAMIC : jump-target
(define (default-jump-targets x) (match x [(list 'goto target) (cond [(symbol? target) (list target)] [else (list DYNAMIC)])] [(list 'if condition 'goto target) (list (cond [(symbol? target) target] [else DYNAMIC]) NEXT)] [else (raise-type-error 'default-jump-targets "Statement with jump targets" x)]))
3 A larger example
As a more substantial example, we can use fracture to construct the basic blocks from one of the examples of Structure and Interpretation of Computer Programs. We pass in additional keyword arguments to teach fracture what statements are labels and jumps.
> (define factorial-snippet '(START (assign val (op make-compiled-procedure) (label entry2) (reg env)) (goto (label after-lambda1)) entry2 (assign env (op compiled-procedure-env) (reg proc)) (assign env (op extend-environment) (const (n)) (reg argl) (reg env)) (save continue) (save env) (assign proc (op lookup-variable-value) (const =) (reg env)) (assign val (const 1)) (assign argl (op list) (reg val)) (assign val (op lookup-variable-value) (const n) (reg env)) (assign argl (op cons) (reg val) (reg argl)) (test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch17)) compiled-branch16 (assign continue (label after-call15)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) primitive-branch17 (assign val (op apply-primitive-procedure) (reg proc) (reg argl)) after-call15 (restore env) (restore continue) (test (op false?) (reg val)) (branch (label false-branch4)) true-branch5 (assign val (const 1)) (goto (reg continue)) false-branch4 (assign proc (op lookup-variable-value) (const *) (reg env)) (save continue) (save proc) (assign val (op lookup-variable-value) (const n) (reg env)) (assign argl (op list) (reg val)) (save argl) (assign proc (op lookup-variable-value) (const factorial) (reg env)) (save proc) (assign proc (op lookup-variable-value) (const -) (reg env)) (assign val (const 1)) (assign argl (op list) (reg val)) (assign val (op lookup-variable-value) (const n) (reg env)) (assign argl (op cons) (reg val) (reg argl)) (test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch8)) compiled-branch7 (assign continue (label after-call6)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) primitive-branch8 (assign val (op apply-primitive-procedure) (reg proc) (reg argl)) after-call6 (assign argl (op list) (reg val)) (restore proc) (test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch11)) compiled-branch10 (assign continue (label after-call9)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) primitive-branch11 (assign val (op apply-primitive-procedure) (reg proc) (reg argl)) after-call9 (restore argl) (assign argl (op cons) (reg val) (reg argl)) (restore proc) (restore continue) (test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch14)) compiled-branch13 (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) primitive-branch14 (assign val (op apply-primitive-procedure) (reg proc) (reg argl)) (goto (reg continue)) after-call12 after-if3 after-lambda1 (perform (op define-variable!) (const factorial) (reg val) (reg env)) (assign val (const ok))))
> (require racket/match)
> (define blocks (fracture factorial-snippet #:entry-names '(START entry2 after-call15 after-call6 after-call9) #:fresh-block-name (let ([counter 0]) (lambda () (set! counter (add1 counter)) (string->symbol (format "l~a" counter)))) #:label? symbol? #:label-name (lambda (x) x) #:jump? (lambda (stmt) (match stmt [(list 'goto place) #t] [(list 'branch place) #t] [else #f])) #:jump-targets (lambda (a-jump) (match a-jump [(list 'goto place) (match place [(list 'label name) (list name)] [else (list DYNAMIC)])] [(list 'branch place) (match place [(list 'label name) (list name NEXT)] [else (list DYNAMIC NEXT)])]))))
> (for ([b blocks]) (printf "~a -> ~a\n" (string-append (symbol->string (bblock-name b)) (if (bblock-entry? b) "*" "")) (set->list (bblock-succs b))))
START* -> (after-lambda1)
entry2* -> (compiled-branch16 primitive-branch17)
compiled-branch16 -> (#<dynamic>)
primitive-branch17 -> (after-call15)
after-call15* -> (true-branch5 false-branch4)
true-branch5 -> (#<dynamic>)
false-branch4 -> (compiled-branch7 primitive-branch8)
compiled-branch7 -> (#<dynamic>)
primitive-branch8 -> (after-call6)
after-call6* -> (compiled-branch10 primitive-branch11)
compiled-branch10 -> (#<dynamic>)
primitive-branch11 -> (after-call9)
after-call9* -> (primitive-branch14 compiled-branch13)
compiled-branch13 -> (#<dynamic>)
primitive-branch14 -> (#<dynamic>)
after-lambda1 -> ()