Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)

Google   


Peephole Refactoring As Program Analysis and Decompilation Method

News Recommended Books Recommended Links Compilation Decompilation  The decomplation of control flow
XPL Slicing Program understanding History Humor Etc

Peephole optimization is a very simple and powerful method for both refactoring and reverse engineering of program control flow.  It was invented by Bill McKeeman  and published in 1965 in his famous article MCKEEMAN, W.M. Peephole optimization. Commun. ACM 8, 7 (July 1965), 443-444.  It was used in XPL complier that McKeeman  developed. XPL language was specialized compiler writing languages that has strings with garbage collection.  The text of the compiler was freely available (and still is). Later peephole optimizers were developed for many compilers in all major hardware architectures and intermediate languages. See roster

The concept of peephole is somewhat collected with the concept of a slice and in general case nothing prevents  peephole from being applied to a selective view of the intermediate code or assembler code. Such selective view might have,  for example,  only all control related statements in assembler.  Slicing proved its great value in program understanding and as such is a valuable tool in any program reconstruction activity.

Stony Brook Pascal, a popular Pascal compiler for S/370 from the State University of New York at Stony Brook was built with XPL.

The peephole is a small moving window on the target program. Instructions are optimized only considering the instructions in the peephole. Peephole optimization is applied over the while target program, moving the peephole window. Several passes can be used if necessary.  It can be adapted to intermediate simplified marking language representation (XML or XHTML+styleseet ).  Some people even tried to used XSLT to generate code in an application software project like "template-driven code generator". See Soumen Sarkar experience of using XSLT and the book Program Generators with XML and Java for more information.

Peephole optimization should be probably called refactoring to make it sound more modern and fashionable :-)  

Refactoring is typically applied to the  source code and is a program transformation that improves the design of a program while preserving its behavior. The term "factoring" has been used in the Forth community since at least the early 1980s. Chapter Six of Leo Brodie's book Thinking Forth (1984) is dedicated to the subject. If we understand refactoring
  • a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior
then optimization is just one case of refactoring :-). 

Peephole refactoring is essentially a the pattern matching and multistage conditional replacement technique that reminds Floyd-Evans production parsing algorithms as described by David Gries in his famous 1971 book Compiler Construction for Digital Computers .  It is performed on small sliding window called peephole. By pattern matching the tuples of the control flow graph we can determine if they can be replaced by an equivalent higher level construction. This approach can be used for the decomplation of control flow  in assembler programs. Regular expressions are the most natural candidate for recognition of simple control structures. The most relevant paper for this usage of regular expressions is:

GOTO Removal Based On Regular Expressions - Paul Morris Ronald     

......our GOTO removal program served as a bridge to extend the range of the software. We have also applied this work to the process of "levitating" IBM 370 assembly language code to higher-level forms (Morris et al., 1996). Previous work (Ashcroft et al., 1972, Peterson et al., 1973, Ramshaw, 1988, Erosa et al., 1994), while highly developed and abstract in nature, treats GOTO removal as an isolated problem in the area of programming languages. This paper is based on the observation that GOTO removal for flowcharts resembles the problem of converting finite-state transition networks to regular expressions, and ......

......preferred over maintaining the existing structure. The algorithm presented here is applicable to nonreducible programs, and meets the lesser path-equivalence standard. Subjectively, this standard appears to produce more readable results than approaches that introduce auxiliary variables, such as (Erosa et al., 1994), and is easier to relate to the source presented for restructuring. These considerations are important for reverse-engineering. While the algorithm discussed in this paper has worked well in practical contexts, we feel its most significant feature is the link to important concepts in computer ......

Erosa, A. M. and Hendron, L. J. (1994). `Taming Control Flow: A structured Approach to Eliminating Goto Statements,' Proc. IEEE International Conference on Computer Languages, 229--240.

 

Notes:
  • Those pages are written by people for whom English is not a native language. Some amount of grammar and spelling errors should be expected.
  • This is a Spartan WHYFF (We Help You For Free) site. It cannot replace the best teachers and the best books.
  • The site contain some obsolete pages as it develops like a living tree... Some links on older pages are broken. Please try to use Google, Open directory, etc. to find a replacement link (see HOWTO search the WEB for details). We would appreciate if you can mail us a correct link.

Search Amazon by keywords:

Google   
Open directory

Research Index

 

Old News ;-)

A Peephole Optimizer for Python

This paper describes the implementation of a peephole optimizer for Python byte codes. Python's byte code compiler currently generates code that can easily be improved. The peephole optimizer implemented presented here implements a number of common optimizations, including jump chaining, constant expression evaluation and elimination of unecessary loads. Some optimizations rely on specific properties of Python or its virtual machine. Some optimizations common to statically typed languages, such as algebraic simplification and expression rearrangement, are prevented by Python's dynamic typing. Preliminary results obtained using pybench [Lemb] suggest that the optimizer has a positive benefit for specific operations. With the current measurement tools available however, it is difficult to quantify benefits that can be derived by using the optimizer. Situations in which the benchmarks may not yield reliable results are considered. The limitations Python places on the optimizer are discussed, especially restrictions caused by the dynamic nature of the language. Other performance improvements to the Python interpreter are discussed briefly.

Declarative Peephole Optimization Using String Pattern Matching

Peephole optimisation as a last step of a compilation sequence capitalises on significant opportunities for removing low level code slackness left over by the code generation process. We have designed a peephole optimiser that operates using a declarative specification of the optimisations to be performed. The optimiser's implementation is based on string pattern matching using regular expressions. We used this approach to prototype an optimiser to convert target machine instruction sequences containing conditional execution of instructions inside loop bodies into code that adaptively executes the optimum branch instructions according to the program's branch behaviour.

Peephole Optimization - Untitled

Recursive descend query compilers easily miss opportunities for better code generation, because limited context is retained or lookahead available. The peephole optimizer is built around such recurring patterns and compensates for the compilers 'mistakes'. The collection of peephole patterns should grow over time and front-end specific variations are foreseen.

The SQL frontend heavily relies on a pivot table, which is a generated oid sequence. Unfortunately, this is not seen and the pattern '$i := calc.oid(0@0); $j:= algebra.markT($k,$i);' occurs often. This can be replaced with '$j:= algebra.markT($k)';

Newsgroups: comp.compilers
From: Mark Leone <mleone+@cs.cmu.edu>
Keywords: optimize
Organization: School of Computer Science, Carnegie Mellon
References: 95-03-006
Date: Fri, 3 Mar 1995 03:00:53 GMT

<Steven.R.Walk@att.com> wrote:
> In the article listed below Peter Kessler describes a method for automatically
> analyzing a machine description to discover 'machine idioms'. He mentions
> near the end of the article that over 300 idioms for the 68000 and a similar
> number for the VAX were found.
 

There's a related paper by Massalin in ASPLOS '87:
Superoptimizer --- A look at the smallest program. Proceedings of the
Second International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS II), Oct. 1987, IEEE.
 

Abstract:
        The superoptimizer is a tool that, given an instruction set, finds
        the shortest program to compute a function. Startling programs
        have been generated, many of them engaging in convoluted bit-
        fiddling bearing little resemblance to the source programs which
        defined the functions. The key idea in the superoptimizer is a
        probabilistic test that makes exhaustive searches practical for
        programs of useful size. The search space is defined by the
        processor's instruction set, which may include the whole set, but
        it is typically restricted to a subset. By constraining the
        instructions and observing the effect on the output program, it is
        possible to gain insight into the design of instruction sets. In
        addition, superoptimized programs may be used by peephole
        optimizers to improve the quality of generated code, or by
        assembly language programmers to improve manually written code
 

--
Mark Leone <mleone@cs.cmu.edu>
School of Computer Science, Carnegie Mellon University
Pittsburgh, PA 15213 USA

 

Another example of a 2-way instruction sequence produced is then '$j:= algebra.markT($k); $l:= bat.reverse($j);', which can be replaced by '$l:= algebra.markH($k);'.

The reverse-reverse operation also falls into this category. Reversal pairs may result from the processing scheme of a front-end compiler or from a side-effect from other optimization steps. Such reversal pairs should be removed as quickly as possible, so as to reduce the complexity of finding alternative optimization opportunities. As in all cases we should ensure that the intermediates dropped are not used for other purposes as well.

	r:bat[:int,:int]:= bat.new(:int,:int);
	o:= calc.oid(0@0);
	z:= algebra.markT(r,o);
	rr:= bat.reverse(z);
	s := bat.reverse(r);
	t := bat.reverse(s);
	io.print(t);
	optimizer.peephole();
which is translated by the peephole optimizer into:
	r:bat[:int,:int] := bat.new(:int,:int);
	rr := algebra.markH(r);
	io.print(r);

The peephole optimizer

Last updated 3 June 1996.

The peephole optimizer is now available for both C and Scheme.

PDF] Peephole Optimization

File Format: PDF/Adobe Acrobat - View as HTML
Peephole Optimization. • It is applied to low-level code, ... More peephole optimizations will be uncovered if we (a) merge multiple adjacent labels into ...
www.csc.uvic.ca/~csc435/ slides/PeepholeOptimization-2up.pdf - Similar pages

Quick Compilers Using Peephole Optimization - Davidson, Whalley (ResearchIndex)

Abstract: machine modeling is a popular technique for developing portable compilers. A compiler can be quickly realized by translating the abstract machine operations to target machine operations. The problem with these compilers is that they trade execution efficiency for portability. Typically the code emitted by these compilers runs two to three times slower than the code generated by compilers that employ sophisticated code generators. This paper describes a C compiler that uses abstract machine... (Update)

Recommended Links


In case of broken links please try to use Google search. If you find the page please notify us about new location
Google     

Peephole optimization - Wikipedia, the free encyclopedia

Peephole Optimization

[PDF] Using Peephole Optimization on Intermediate Code ANDREW S. TANENBAUM, HANS van STAVEREN, and JOHAN W. STEVENSON  Vrije Universiteit, Amsterdam, The Netherlands

Many portable compilers generate an intermediate code that is subsequently translated into the Target machine's assembly language. In this paper a stack-machine-based intermediate code suitable for algebraic languages (e.g., PASCAL, C, FORTRAN) and most byte-addressed mini- and microcomputers is described. A table-driven optimizer that improves this intermediate code is then discussed in detail and compared with other local optimization methods. Measurements show an improvement of about 15 percent, depending on the precise metric used.

Categories and Subject Descriptors: D.3.4 [Programming Languages]:

Using and Porting GNU CC - Peephole Definitions

Code Compaction Bibliography Keywords code compaction, code compression, code size, code size reduction, code density, compression, compact code, compiler, code size o

Internal documentation on the peephole optimizer

There is only one peephole optimizer, so the substitutions to be made for the ... This is an instruction to the peephole optimizer to perform one of its ...
tack.sourceforge.net/doc/peep.html - 34k - Cached - Similar pages

Peephole Optimization from Optimized Code Generation for Digital Signal Processors Thomas Brüggen - Andreas Ropers 11.08.1999

KOPI Classfile API Java peephole optimizer

Larceny Note #6 Larceny on the SPARC

Peephole “Optimization”

Peephole optimization

Tucows Linux perlguts.1

Compile pass 3: peephole optimization

After the compile tree for a subroutine (or for an eval or a file) is created, an additional pass over the code is performed. This pass is neither top-down or bottom-up, but in the execution order (with additional complications for conditionals). These optimizations are done in the subroutine peep(). Optimizations performed at this stage are subject to the same restrictions as in the pass 2.

Peephole Optimisation -- Peephole optimization is a local form of optimization. It can be defined as 'The pattern matching and conditional replacement performed on small sections' [Allan 1990]. By pattern matching the tuples of the computational graph we can determine if they can be replaced by an equivalent instruction or set of instructions that are more efficient.

Peephole optimization as a targeting and coupling tool

A Preliminary Exploration of Optimized Stack Code Generation

Optimizing Alpha Executables on Windows NT with Spike Windows NT-based applications, Spike Optimizer, optimization system for Alpha executables

Delivering Binary Object Modification Tools for Program Analysis and Optimization

A Peephole Optimizer for Python -- Skip Montanaro Automatrix, Inc. Rexford, NY

Using and Porting GNU CC - Peephole Definitions

Amortizing 3D Graphics Optimization Across Multiple Frames

www.partner.digital.comwww-swdevpagesHomeTECHdocumentsalpha_cookbooklccport.htm

Java Bytecode to Native Code Translators October, 1997

Catalog of compilers y+po 

Peephole Log Optimization - Huston, Honeyman (ResearchIndex)

http://www.ics.uci.edu/~kistler/ics-tr-96-54.ps Kistler, Thomas. "Dynamic Runtime Optimization", UC Irvine Technical Report 96-54, November 1996.

Chambers, Craig. "The Design and Implementation of the Self Compiler, an Optimizing Compiler for Object-Oriented Programming Languages". Ph. D. dissertation, Computer Science Department, Stanford University, March 1992.

An Approach for Exploring Code Improving Transformations - Whitfield, Soffa (ResearchIndex)

Abstract: : Although code transformations are routinely applied to improve the performance of programs for both scalar and parallel machines, the properties of code improving transformations are not well understood. In this paper we present a framework that enables the exploration, both analytically and experimentally, of properties of code improving transformations. The major component of the framework is a specification language, Gospel, for expressing the conditions needed to safely apply a... (Update)

[PDF] The Design and Application of a Retargetable Peephole Optimizer

File Format: PDF/Adobe Acrobat - View as HTML
This paper describes PO, a peephole optimizer that uses a symbolic machine ... PO is a retargetable peephole optimizer. Given an assembly language program ...
www.well.com/~cwf/pro/Davidson%20and%20Fraser. %20The%20design%20and%20application%20of%20a%20retargetable... - Similar pages

[PDF] A Compact, Machine-Independent Peephole Optimizer

File Format: PDF/Adobe Acrobat - View as HTML
PEEPHOLE. OPTIMIZER. Christopher. W. Fraser. Abstract. Department ... independent. peephole. optimizer. Introduction. Of. all. optimizations, ...
www.well.com/~cwf/pro/ Fraser.%20A%20compact,%20machine-independent%20peephole%20optimizer.pdf - Similar pages

[PDF] Optimizer paper

File Format: PDF/Adobe Acrobat - View as HTML
Techniques for increasing the throughput of a peephole optimizer for ... common task and this paper focuses on the peephole optimizer in the Amsterdam ...
www.cosc.canterbury.ac.nz/research/ reports/TechReps/1988/tr_8803.pdf - Similar pages

Raisonance CodeCompressor technology - Using Peephole scripts

CodeCompressor will first execute the "standard" peephole optimizations defined for the architecture (for example, replacing LJMP by SJMP when possible, ...
www.raisonance.com/products/CCpeepholescript.php - 18k - Cached - Similar pages
 

Static Analysis for a Software Transformation Tool - Morgenthaler (ResearchIndex)

Abstract: Software is difficult and costly to modify correctly. Automating tiresome mechanical tasks such as program restructuring is one approach to reducing the burden of software maintenance. Several restructuring tools have been proposed and prototyped, all centered on the concept of meaning-preserving transformations similar in spirit to compiler optimizations. Like optimizing compilers, these tools rely on static analysis to reason about the correctness of program changes. However, the cost (in... (Update)

Active bibliography (related documents):   More   All
1.1:   Building an Efficient Software Manipulation Tool - Morgenthaler (1998)   (Correct)
0.7:   Program Restructuring as an Aid to Software Maintenance - Griswold (1991)   (Correct)
0.6:   Supporting the Restructuring of Data Abstractions through.. - Bowdidge (1995)  
(Correct)

 


Copyright © 1996-2008 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. Submit comments This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License(OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified: June 02, 2008 Programming, vol. 28, no. 6, pp. 563-588, 2000.

[9] C. Cifuentes, Structuring Decompiled Graphs Lecture Notes in Computer Science, vol. 1060, pp. 91-104, 1996.
[10] B. De Bus, D. Kästner, D. Chanet, L. Van Put, and B. De Sutter, Post-Pass Compaction Techniques Comm. ACM, vol. 46, no. 8, pp. 41-46, 2003.
[11] E.H. D'Hollander, and F. Zhang, Restructuring Program Transformations with Proof Verification Using PVS Technical Report RUG/ELIS01, p. 47, 2001.
[12] E.H. D'Hollander, F. Zhang, and Q. Wang, The Fortran Parallel Transformer and Its Programming Environment Information Sciences, vol. 106, nos. 3-4, pp. 293-317, 1998.
[13] A.M. Erosa, and L.J. Hendren, Taming Control Flow: A Structured Approach to Eliminating Goto Statements Proc. Fifth Int'l Conf. Computer Languages, pp. 229-240, 1994.
 
[14] J. Ferrante, K.J. Ottenstein, and J.D. Warren, The Program Dependence Graph and Its Use in Optimization ACM Trans. Programming Languages and Systems, vol. 9, no. 3, pp. 319-349, 1987.
[15] P. Havlak, Nesting of Reducible and Irreducible Loops ACM Trans. Programming Languages and Systems (TOPLAS), vol. 19, no. 4, pp. 557-567, 1997.
[16] M.S. Hecht, and J.D. Ullman, Characterizations of Reducible Flow Graphs J. ACM, vol. 21, no. 3, pp. 367-375, 1974.
[17] C.A.R. Hoare, An Axiomatic Basis of Computer Programming Comm. ACM, vol. 12, pp. 576-580, 1969.
[18] C.A.R. Hoare, An Axiomatic Definition of the Programming Language Pascal Acta Informatca, vol. 2, pp. 335-355, 1973.
[19] J. Janssen, and H. Corporaal, Making Graphs Reducible with Controlled Node Splitting ACM Trans. Programming Languages and Systems, vol. 19, no. 6, pp. 1031-1052, 1997.
[20] V.N. Kas'janov, Distinguishing Hammocks in a Directed Graph Soviet Math. Doklady, vol. 16, no. 5, pp. 448-450, 1975.
[21] A. Klauser, T. Austin, D. Grunwald, and B. Calder, Dynamic Hammock Predication for Non-Predicated Instruction Set Architectures Proc. 1998 Int'l Conf. Parallel Architectures and Compilation Techniques (PACT '98), pp. 278-285, Oct. 1998.
[22] L. Lamport, How to Make a Multiprocessor Computer that Correctly Executes Multiprocess Programs IEEE Trans. Computers, vol. 28, no. 9, pp. 690-691, 1979.
[23] P. Morris, R.A. Gray, and R.E. Filman, GOTO Removal Based on Regular Expressions J. Software Maintenance Research and Practice, vol. 9, no. 1, pp. 47-66, Jan./Feb. 1997.
[24] H.R. Nielson, and F. Nielson, Semantics with Applications. John Wiley and Sons, 1993.
[25] G. Oulsnam, Unravelling Unstructured Programs The Computer J., vol. 25, no. 3, pp. 379-387, Aug. 1982.
[26] G. Oulsnam, The Algorithmic Transformation of Schemas to Structured Form The Computer J., vol. 30, no. 1, pp. 43-51, Feb. 1987.
[27] S. Owre, J. Rushby, N. Shankar, and F. von Henke, "Formal Verification for Fault-Tolerant Architectures: Prolegomena to the Design of PVS," IEEE Trans. Software Eng., vol. 21, pp. 107-125, Feb. 1995.
 
[28] S. Owre, J.M. Rushby, and N. Shankar, PVS: A Prototype Verification System Proc. 11th Int'l Conf. Automated Deduction (CADE-11), D. Kapur, ed., June 1992.
[29] W.W. Peterson, T. Kasami, and N. Tokura, On the Capabilities of While, Repeat, and Exit Statements Comm. ACM, vol. 16, no. 8, pp. 503-512, Aug. 1973.
[30] L. Ramshaw, Eliminating Goto's While Preserving Program Structure J. ACM, vol. 35, no. 4, pp. 893-920, Oct. 1988.
[31] N. Shankar, Steps Towards Mechanizing Program Transformations Using PVS Science of Computer Programming, vol. 26, nos. 1-3, pp. 33-57, May 1996.
[32] S. Unger, and F. Mueller, Handling Irreducible Loops: Optimized Node Splitting Versus DJ-Graphs ACM Trans. Programming Languages and Systems (TOPLAS), vol. 24, no. 4, pp. 299-333, 2002.
[33] M. Weiser, Program Slicing IEEE Trans. Software Eng., vol. 10, no. 4, pp. 352-357, July 1984.
[34] M.H. Williams, Generating Structured Flow Diagrams: The Nature of Unstructuredness The Computer J., vol. 20, no. 1, pp. 45-50, Feb. 1977.
[35] M.H. Williams, and G. Chen, Restructuring Pascal Programs Containing Goto Statements The Computer J., vol. 28, no. 2, pp. 134-137, 1985.
[36] M.H. Williams, and H.L. Ossher, Conversion of Unstructured Flow Diagrams to Structured Form The Computer J., vol. 21, no. 2, pp. 161-167, 1978.
[37] F. Zhang, FPT: A Parallel Programming Environment PhD thesis, Dept. of Electrical Eng., Univ. of Ghent, 1996.
[38] F. Zhang, and E.H. D'Hollander, Extracting the Parallelism in Programs with Unstructured Control Statements Proc. Int'l Conf. Parallel and Distributed Systems, (ICPADS '94), pp. 264-270, Dec. 1994.
 
  Index Terms- Program transformation, structured programming, compilers, optimization, parallel processing, software/program verification, correctness proofs.

Citation:  Fubo Zhang, Erik H. D'Hollander, "Using Hammock Graphs to Structure Programs," IEEE Transactions on Software Engineering, vol. 30,  no. 4,  pp. 231-245,  Apr.,  2004.
The Program Structure Tree: Computing Control Regions in.. - Johnson, Pearson, Pingali (1994)   (28 citations)  (Correct)

....applications of the PST to parallel and incremental program analysis. 2 Single entry single exit regions and the program structure tree In the literature, the term single entry single exit region is not used consistently there appear to be several related constructs aliased to this term [Kas75, Val78, TV80, GPS90] Therefore, we begin this section with a formal definition of single entry single exit regions as used in this paper . This definition is motivated in part by considerations of control dependence, as will be made precise in Section 5. We then show that single entry single ....

V. N. Kas'janov. Distinguishing hammocks in a directed graph. Soviet Math. Doklady, 16(5):448--450, 1975.


Range Searching Over Tree Cross Products - Buchsbaum, Goodrich, Westbrook (2000)   (7 citations)  (Correct)

....node, t. A node u dominates a node v if every path from s to v goes through u. A node v post dominates a node u if every path from u to t goes through v. The hammock between two nodes u and v is the set of nodes dominated by u and post dominated by v. This modi es the de nition due to Kas janov [18]. Johnson, Pearson, and Pingali [17] de ne a canonical, nested hammock structure, which is useful in compiler control ow analysis, and devise an O(m) time algorithm to discover it. n = jV j, and m = jEj. No previous result, however, allows ecient, on line queries of the form: return the ....

V. N. Kas'janov. Distinguishing hammocks in a directed graph. Sov. Math. Dokl., 16(2):448-50, 1975.


Finding Regions Fast: Single Entry Single Exit and.. - Johnson, Pearson.. (1993)   (8 citations)  (Correct)

....behavior. The topmost SESE region on this stack when DFS reaches the entry node of a SESE region R 1 is the name of the smallest SESE region containing R 1 . 5. 3 Discussion As a final remark, we note that a concept related to SESE regions called a hammock has been defined in the literature [Kas75]. Although it appears common practice to confuse hammocks and SESE regions, they are in fact different. The exit node of a hammock can be the target of edges outside the hammock; therefore, there are hammocks that are not SESE regions, as shown in Figure 7. Also, the entry node of a hammock cannot ....

....node of a hammock cannot be the target of an edge from the exit of the hammock; therefore, there are SESE regions that are hammocks. For example, a repeat until loop in a structured program is a SESE region but it is not a hammock. The standard algorithm for finding hammocks has complexity O(EN ) [Kas75]. It may be possible to use the ideas of this paper to find hammocks in O(E) time, but we have not explored this further. 6 Extensions We conclude this paper by describing three open problems related to this work. A small program transformation can increase the number of SESE regions in the ....

V. N. Kas'janov. Distinguishing hammocks in a directed graph. Soviet Math. Doklady, 16(5):448-- 450, 1975.


Dependence-Based Program Analysis - Johnson, Pingali (1993)   (52 citations)  (Correct)

....ffl e 1 dominates e 2 , e 2 postdominates e 1 , and every cycle containing e 1 also contains e 2 and vice versa. ffl e 1 and e 2 have the same control dependences. For lack of space, we omit the proof of this theorem. A related structure called a hammock has been discussed in the literature [Kas75] Hammocks are not the same as singleentry single exit regions since the exit node in a hammock can be the target of edges outside the hammock; besides, the algorithm for finding hammocks is O(EN ) Just as def use chains are intercepted by OE functions in the SSA representation, they are ....

V. N. Kas'janov. Distinguishing hammocks in a directed graph. Soviet Math. Doklady, 16(5):448--450, 1975.

Efficient Parallel Shortest-Paths in Digraphs with a Separator Decomposition - group of 7 »
E Cohen - J. Algorithms, 1996 - portal.acm.org
... Their algorithm is based on paralleliz- ing the hammock decomposition technique
introduced by Fredrickson [5]. Their algorithm runs in 0(log2 n) time and ...

Dependence-Based Program Analysis - group of 6 »
R Johnson, K Pingali - PLDI, 1993 - portal.acm.org
... node in a hammock can be the target of edges outside the hammock; besides, the ... Thus
these regions give a hierarchical decomposition of a control flow graph’s ...
Cited by 58 - Web Search - BL Direct

Function Recovery Based on Program Slicing - group of 4 »
F Lanubile, G Visaggio - ICSM, 1993 - ieeexplore.ieee.org
... In order to extract our components we need a decomposition method which uses both
data ... Definition 3: An hammock graph HG is a quadruple (N, E, IQ, ne), where (N ...
Cited by 16 - Web Search

Oulsnam  Works

G. Oulsnam: Unravelling Unstructured Programs. Comput. J. 25(3): 379-387 (1982)

The Computer Journal, Volume 25, Issue 3, pp. 379-387 Abstract.

Department of Computer Science, University of Queensland, St. Lucia, Australia

A method is presented for converting unstructured program schemas to strictly equivalent structured form. The predicates of the original schema are left intact with structuring being achieved by the duplication of the original decision nodes without the introduction of compound predicate expressions, or, where possible, by function duplication alone. It is shown that structured schemas must have at least as many decision nodes as the original unstructured schema, and must have more when the original schema contains schema contains branches out of alternation constructs. The structuring method allows the complete avoidance of function duplication, but only at the expense of decision node duplication. It is shown that structured schemas always require an increase in space-time requirements, and it is suggested that this increase can be used as a complexity measure for the original schema.

The Algorithmic Transformation of Schemas to Structured Form -- Oulsnam 30 (1) 43 -- The Computer Journal

G. Oulsnam: The Algorithmic Transformation of Schemas to Structured Form. Comput. J. 30(1): 43-51 (1987)

G. Oulsnam *

Department of Computer Science, University College, Cork, Republic of Ireland, UK

Algorithms are given to transform unstructured program schemas into equivalent structured forms. These algorithms are shown to have a computational complexity which is linearly related to schema size for almost all schemas, but at worst exponential with an exponent greater than but asymptotically close to one for large problems. Structuring is achieved by first identifying the forward paths of the schema and reducing them to an equivalent structured elementary path E. Each back path of E is then recursively structured to an equivalent single back arc, thus reducing the original schema to a single elementary path together with a set of possibly overlapping back arcs. The remaining unstructuredness is removed by recursive application of a loop-structuring algorithm. The algorithms are illustrated by application to a complex hypothetical schema and to two practical problems.
Received August 1984.

G. Oulsnam: Cyclomatic Numbers Do Not Measure Complexity of Unstructured Programs. Inf. Process. Lett. 9(5): 207-211 (1979)

Interactive Control Restructurin g - group of 2 » DJ Rosenkrantzl, JM Bruno, GE Co - portal.acm.org
... Our algorithm uses Oulsnam's technique of repeatedly structurin g and condensing subgraphs of the flowgraph until the graph has ... Web Search

Baker Decomposition

Extracting the Parallelism in Program with non-structured.. - Fubo Zhang And   (Correct)

....be applied to generate a doall loop. The program structuring is based on a hammock transformation which makes the resulting program readable and clarifies the control scope. A lot of related work has been done on the program structuring. These techniques are applied either on reducible graphs [8, 15, 19, 20] or on irreducible graphs [7] New variables are introduced to register the state of the condition of a if goto statement. The newly proposed technique focuses on the discussion of forward, backward and exit branches. The processing of reducible and irreducible flow graph is not distinguished. ....

....are structured by flow diagrams transformations. A reducible flow graph is a graph to which an interval analysis can be applied to structure the graph. R.E. Tarjan [17] has discussed the properties of a reducible flow graph and an algorithm is proposed to reduce a reducible graph. B.S. Baker [8] proposed an algorithm to transform a flow graph into a structured program. L. Ramshaw [15] gave theorems and rules to eliminate goto s and preserve the program structure. These techniques are only applied on reducible graphs. However, unstructured forms may remain in the irreducible flow graph of ....

Baker, B., "An algorithm for structuring flowgraphs," JACM, vol 24, no. 1, 1977.

Recognizing a Program's Design: A Graph-Parsing Approach - group of 4 »
C Rich, LM Wills - IEEE Software, 1990 - portal.acm.org
... Recognizing a Program's Design: A Graph-Parsing Approach. Full text, Full
text available on the Publisher sitePublisher Site. Source, ...
Cited by 115 - Web Search

 

Etc

[PS] Good and semi-strong colorings of oriented planar graphs - group of 5 »
A Raspaud, E Sopena - Information Processing Letters, 1994 - labri.fr
... Ak coloring of an oriented graph G = (V; A) is an assignment c of one of the colors
1; 2; : : :; k to each vertex of the graph such that, for every arc (x; y ...
Cited by 45 - View as HTML - Web Search - BL Direct


Copyright © 1996-2008 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. Submit comments This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License(OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified: June 02, 2008 immediate post-dominator and isBackEdge(v,w) returns whether or not (v ; w ) is an edge.

ffl procedure calls are written in their usual style with the distinguishing

feature of having an uppercase first letter. Return values are implemented as var (i.e. reference) parameters.

4.2. GRAPH STRUCTURING 41

ffl the nesting levels are represented through the use of indentation. ffl a test for equality uses `=' and assignment uses the `:=' symbol.

While it is presumed that the meaning of each node attribute is self-explanatory, a reference for all these attributes is provided in Appendix A.1. This reference relies upon the definitions given in the following sections. Section A.2 defines any predicates used while Section A.3 defines an attribute of a loop when referred to as an object.

4.2 Graph Structuring To structure a control flow graph, a set of high level constructs is required to give the structuring some context. For example, the structuring of control flow graphs could be performed very easily if we were only aiming to describe the flow of control using two-way conditionals (if..then..else) and goto statements. However, it is questionable whether this leaves us much better off than the original assembly program.

The set of high level constructs chosen for structuring needs to be general enough to cater for some of the more common high level languages. This allows any implementation of the assembly structuring process to be easily retargeted for any high level language that at least provides the set of constructs used to do the structuring. Further analysis could then be performed on the output to use additional constructs provided by the language in question. Alternatively, the algorithms could be modified slightly to support new constructs.

A comprehensive review of some common high level languages and the control flow mechanisms they provide is given on pages 138 - 140 in [5]. The constructs embodied in each of the languages is a subset of the constructs shown in Figure 4.1 along with their control flow graph representations. Note that some of the other common control structures such as for loops have not been shown. These omitted structures can usually be implemented using those already shown. For example, in the case of the for loop, it can be implemented by using a sequential statement followed by a while loop.

The set of structures used for our purposes include all those shown in Figure 4.1 except for the last two, multilevel exits and multilevel cycles. Instead,

42 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

Goto Statement

Repeat loop

Multilevel exit Multilevel cycle Multiexit loop

Multicycle loop

While loop Endless loopSequence

n-way branch conditionalConditionalSingle branch conditional

Figure 4.1: HLL control structures and examples of their control flow graph representations.

goto's will be used to emulate these structures should they appear as most languages provide the former but very few (e.g. Ada) provide the latter. However, the inclusion of gotos also excludes at least one common language, Modula2.

4.3 Control Flow Graph Terminology Before discussing how we gather information about the underlying structures present in a control flow graph, we first define the relevant terminology and definitions used. These definitions are taken directly from [5] (apart from

4.3. CONTROL FLOW GRAPH TERMINOLOGY 43 the post-dominator definitions which are a modification of the immediate dominator definitions given in the same piece of work) and are given in terms of a directed graph G = (N ; E ; h; e). This is an extension of the normal representation of a graph in that we now explicity mention the exit node (see Definition 3.8).

Definition 4.1 A closed path or cycle is a path n1 ! n

v

where n

1

= n

v

.

Definition 4.2 The successors of n

i

2 N are fn

j

2 N j n

i

! n

j

g. The

immediate successors of n

i

2 N are fn

j

2 N j (n

i

; n

j

) 2 E g.

Definition 4.3 The predecessors of n

j

2 N are fn

i

2 N j n

i

! n

j

g. The

immediate predecessors of n

j

2 N are fn

i

2 N j (n

i

; n

j

) 2 E g. A node

n

i

2 N post-dominates a node n

k

2 N if n

i

is on every path n

k

! e.

Definition 4.4 A node n

i

2 N immediately post-dominates n

k

2 N if

@n

j

2 N ffl n

j

post-dominates n

k

and n

i

post-dominates n

j

(i.e. n

i

is the

closest post-dominator of n

k

).

Definition 4.5 A strongly connected region (SCR) is a subgraph S = (N

S

; E

S

; h

S

) ffl 8 n

i

; n

j

2 N

S

ffl 9 n

i

! n

j

and n

j

! n

i

.

Definition 4.6 A strongly connected component of G is a subgraph S = (N

S

; E

S

; h

S

) ffl S is a strongly connected region and @S

2

2 SCR(G) ffl

S ae S

2

(i.e. it is the smallest connect region involving h

S

).

Definition 4.7 A depth first search (DFS) is a method of traversing a graph that selects edges to traverse from the most recently visited node that still has unvisited nodes on at least one of its edges. A depth first spanning tree (DFST) of a flow graph G is a directed, rooted, ordering spanning tree of G grown by a DFS algorithm.

As described on page 482 of [7], the DFST T of a graph partitions the edges of the graph into four categories; tree edges, forward edges, backward edges and crossedges. For the purpose of structuring CFGs, the differentiation between tree, forward and cross edges is uninteresting and so we collapse them all to the one category, forward edges. We end up with the following two categories:

44 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

1. Backward edges = f(v ; w ) j (v ; w ) 2 E ^ w ! v 2 T g. 2. Forward edges are all the remaining edges.

The backward edges (also referred to as back edges) are used when structuring loops. The forward edges are used for structuring all other constructs as well as loops.

4.3.1 Interval Theory and the Derived Sequence of

Graphs

For the purpose of the describing the structuring algorithm we aim to improve upon, we define the two data structures it uses. The first structure was defined by Cocke in [6] and the algorithm used to build it developed by Cifuentes in [4].

Definition 4.8 Given some node, n 2 N , the interval I (n) is the maximal, single-entry subgraph in which n is the only entry node (i.e. the only node with a predecessor outside the subgraph) and in which all closed paths contain n. The unique interval node n is called the interval head or simply the header node.

We can use intervals to partition the nodes of G into a unique set of disjoint intervals I = fI (h

1

); : :; I (h

n

)g for some N ? 1. The algorithm presented

in Figure 4.2 is used to derive the sequence of intervals for a graph. This algorithm is derived from work done by Allen [2] in the area of control flow analysis. Defining I as a sequence imposes an ordering between the intervals of a graph which is required when building the other data structure required for this algorithm, the derived sequence of graphs.

Interval theory alone is not sufficient for the complete structuring of loops. We use an additional concept called the derived sequence of graphs which was described by Allen [2] and is based upon the intervals of a graph. The construction of the graphs is an iterative method that collapses intervals into nodes. The nodes formed by this process are then the nodes of the next order graph. The edges for this next order graph are both the in edges to the header of the interval from which the new node was derived as well as the out edges from nodes within the same interval whose destination was not

4.3. CONTROL FLOW GRAPH TERMINOLOGY 45

procedure FindIntervals (G = (N ; E ; s); var I) * Pre: G is a graph * Post: the intervals of G are contained in the sequence I

I:= h i H := fhg for (all unprocessed n 2 H ) do

I(n) := fng repeat

I (n) := I (n) + fm 2 N j 8 p 2 pred(m) ffl p 2 I (n)g until (no more nodes can be added to I (n)) H := H + fm 2 N j m 62 H ^ m 62 I (n) ^ (9 p 2 pred(m) ffl p 2 I (n))g I:= I + I (n) end procedure

Figure 4.2: Interval building algorithm.

within the interval. The process of building the intervals is then performed on the new graph after which the whole process starts again. This continues until either the trivial graph is derived or the graph derived is the same as the graph from which it was derived. If the latter is the case, then the graph is said to be irreducible.

An adaptation of the algorithm given by Cifuentes in [4] is shown in Figure 4.3. The adaptation is a result of using an object oriented design. Under this paradigm, an interval is represented by a class derived from the class used to represent an ordinary graph node with a few additional attributes (see Appendix A.4). For readability, we do not explicitly show the necessary type conversion required when treating a sequence as a set. The nodes and edges of the derived graph G

i

are denoted by N

i

and E

i

respectively.

Examples of the derived sequences for two programs are shown in Figures 4.4 and 4.5. They demonstrate how the intervals are defined for each graph within the sequence. Figure 4.5 is an example of an irreducible graph. The final graph in its derived sequence is the canonical irreducible graph. All irreducible graphs will reduce to containing only or more instances of this graph. An irreducible graph will occur when there is a forward edge into a loop subgraph.

For convenience, we use the term DSG (derived sequence of graphs) when describing the loop structuring algorithm that uses these two data structures.

46 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

procedure BuildDerivedSequences(G = (N ; E ; h); var G) * Pre: G is a graph * Post: the derived sequence of graphs G

1

; : :; G

n

; n ? 1 has

* been constructed and is returned in G

G

1

:= G

G:= hG

1

i

i := 2 repeat /* Construction of G

i

*/

FindIntervals(G

i\Gamma 1

,I)

N

i

:= I

h

i

:= I (1)

E

i

:= f g

if (#N

i\Gamma 1

6= # I) then

for (all I 2 I) do

for (all n 2 nodesOf(I )) do

for (all m 2 (succ(n) - nodesOf(I ))) do

E

i

:= E

i

+ (I ; inInterval(m))

G:= G + G

i

i := i + 1 until (G

i

is the trivial graph or #N

i\Gamma 1

= # I)

end procedure

Figure 4.3: Derived Sequence of Graphs building algorithm.

4.3.2 Parenthesis Theory The proposed alternative loop structuring algorithm does not require any additional data structures beyond the control flow graph itself. However, it requires an extra property to be given to each node within the graph. Given that this extra property can also be used to make another improvement through 2 way conditional restructuring to the structuring algorithms, we define it for the nodes in the graph irrespective of what loop structuring algorithm is used.

We define this property in terms of a graph G = (N ; E ; h). We also rely on a DFS traversal of the graph from h to increment a `time' counter (initialised to 1) after the first and last visit to each node during the traversal. The first of these properties was given by Cormen et al. on pages 477-481 of [7]. The other two are derived from this first one.

Definition 4.9 A parenthesis of n 2 N is denoted t

1

! t

2

and t

1

is the

4.3. CONTROL FLOW GRAPH TERMINOLOGY 47

I(1)

I(2)

G2

G3 1

2

3 4

5

6

G1

2-6

1 1-6

I(1)

I(1)

Figure 4.4: The Derived Sequence of a graph with the intervals of the initial graph and each derived graph shown by the dotted boxes.

4

1 2 1 2

3 3-4

I(2)

I(1) I(1)

I(2) 1(3)

G1

G2

Figure 4.5: An example of an irreducible graph. The final graph is the canonical irreducible graph.

time at which n was first visited during a DFS traversal from h and t

2

is the

time at which n was last visited during the same traversal.

Definition 4.10 A parenthesis n

1

! n

2

is enclosed by parenthesis m

1

!

48 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS m

2

iff both were defined during the same traversal of a graph and m

1

! n

1

!

n

2

! m

2

.

Definition 4.11 9 n

1

; n

2

2 N ffl (n

1

O/ n

2

j (9 P

1

; P

2

ffl P

1

is a parenthesis

of n

1

^ P

2

is a parenthesis of n

2

^ P

1

is enclosed by P

2

)). The O/ relationship

is pronounced "is in".

It is relevant to note that the above definitions imply that there is more than one possible DFS traversal of a graph from the same starting point. This is important in our case as will be shown later. It is sufficient to state now that the two (or more) different traversals are a result of the ordering defined between the out edges of a node. Recall that for a 2way (or cond) node, the edges represent the flow of control taken when the condition is true or false respectively. This gives us a means of performing two symmetrical DFS traversals of the graph, one traversing the true branches first, the other traversing the false branches first. For an nway node, the symmetrical case involves working through the out edges in reverse.

Figure 4.6 shows the recursive algorithm used to set the parenthesis of each node in a graph. Note that it has a boolean parameter, indicating which edges are to be traversed first. For an nway node, when doing the `true' traversal, we work through the sequence of out edges in ascending order and descending order for the `false' traversal. In the context of parenthesis theory, the nodes have a few extra attributes which are described in Appendix A.5.

procedure SetParenthesis(G = (N ; E ; h); brch;var time) * Pre: G is a graph * Post: The brch traversal parenthesis for h has been determined.

traversed(h) := true parenthesis

brch

(h)(1) := time

time := time + 1 for (n 2succ(h) ^ : traversed(n)) do

SetParenthesis(G; brch; time) parenthesis

brch

(h)(2) := time

time := time + 1 end procedure

Figure 4.6: Algorithm used to set the parentheses for each node.

For convenience, we use the term PT (Parenthesis theory) when describing the loop structuring algorithm that uses the parenthesis properties of nodes.

4.4. STRUCTUREDNESS 49 4.4 Structuredness Having defined the context for structuring, we can now classify control flow graphs as either structured or unstructured. Structured graphs are those in which the flow of control represented can be explained completely in terms of a set of structured control structures. All of the structures in our chosen set (see Section 4.2) apart from the goto are structured control structures. Therefore, any control flow graph we can reduce in terms of these structures is a structured graph. In the context of assembly programs, the control flow graphs will quite often be unstructured. We are not recognising subgraphs representing short circuit evaluation of compound conditions

1

, so the graphs

containing these subgraphs will appear to be unstructured.

To deal with cases of unstructuredness in the CFGs, it is important to first identify the types of unstructured cases that we may come across. In the following sections, we describe the categories of structures we want to detect and what information we aim to derive for instances within each of these categories. The type of unstructuredness that may arise in each category is mentioned.

4.4.1 Loops The structured forms of loops are subgraphs that display a single entry/single exit nature. This includes the while, repeat and do loops. We also include multiexit and multicycle loops in this case but this decision is qualified in the following section on unstructured loops.

Examples of structured loops are the pretested, postested, endless, multiexit and multicycle loops shown in Figure 4.1.

For any loop, we want to derive the following information:

header - the node at the entrance to the subgraph representing the loop. latch - the node in the loop whose edge to the header defines the loop. follow - the first node reached after termination of the loop (if any).

1

See Section 2.4 for an explanation of why we make this restriction.

50 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS loop nodes - the nodes that are within the subgraph representing the loop. loop type - is it a preTested (while), postTested (repeat) or endless (do)

loop?

We introduce the notation h \Phi l to represent a loop with the header node h and the latching node l .

Unstructured Loops Some of the categories defined by Williams [11] that were discussed in Chapter 2 are shown in Figure 4.7. We ommit his category of parallel loops as we can structure these as multicycle loops. This is influenced by the target language we have chosen, C. However, both break's and continue's within a loop are only detected when generating the high level code. This means that we consider them as forms of unstructuredness when doing the structuring. When modifying the assembly structuring process for another high level language that provides neither of these control structures, all that is required is a small modification to the code generation phase. The type of control flow transfer occurring in these two structures is often considered to be a restricted form of unstructuredness. That is, they can both be seen as gotos with a limitation on the destinations to which control can be transfered.

Multientry

Overlapping Multiexit Figure 4.7: Sample unstructured loops.

4.4. STRUCTUREDNESS 51 4.4.2 2way Conditionals The control flow graph representation of structured 2way conditionals will also display a single entry/single exit nature. For each 2way conditional header, there will be two branches (one for each out edge from the header). For a structured 2way conditional the nodes reachable along each of the branches will be disjoint from each other except for the one node upon which they must both converge.

Examples of structured 2way conditionals are the two middle graphs in the first row in Figure 4.1.

For any 2way conditional, we need to derive the following information:

header - the node at the entrance to the subgraph representing the conditional.

follow - the unique node at which all paths from each branch converge. conditional type - is it a single clause (if..then) or double clause (if..then..else)

2way conditional?

We introduce the notation h3f to represent a 2way conditional with the header node h and the follow node l .

Unstructured 2way conditional Unstructured 2way conditionals are those that have an abnormal entry or exit to or from a branch. Both of these cases are represented in the canonical abnormal selection path graph, shown in Figure 4.8. Graph (a) contains two potential if..then..else structures, headed by nodes 1 and 3. The former is shown in graph (c) and contains an abnormal exit from its right branch whereas the latter is shown in graph (b) which has an abnormal entry into its left branch. Both types of abnormal exit can also be along a back edge. In that case, the conditional will be overlapping a loop subgraph.

52 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

1 2 3

4 5

6

4

3

6

5 2

1

3 4

(c)(b)(a) Figure 4.8: Unstructured 2way conditional graphs.

4.4.3 Nway conditionals Similarly, an nway conditional is structured if the nodes reachable from each of the branches corresponding to the out edges from the nway header are all disjoint. Once again, there will also be a unique follow node on which all branches converge. The same set of properties is required for an nway conditional as is required for a 2way conditional plus one additional property. As with loops, we also want to know the set of nodes within the nway subgraph. This helps with unstructured cases.

We use the same notation to represent an nway conditional as a 2way conditional.

Unstructured Nway conditionals Being a generalisation of a 2way conditional, an unstructured nway conditional is similar to an unstructured 2way conditional. The three forms of unstructured nway conditionals we can encounter is shown in Figure 4.9. In graph (a) the nway conditional A3E has an unstructured entry from the back edge (F ; D ). Graph (b) has an unstructured exit from B 3F via the back edge (C ; A). The nway conditional B 3F in graph (c) has an unstructured entry via the edge (A; C ). We cannot have an unstructured exit from an nway conditional via a forward edge. If we did, then the node reached by that edge would be lower in the graph than the immediate post-dominator of the nway header. As this contradicts the definition of an immediate postdominator, such a case cannot occur.

4.4. STRUCTUREDNESS 53 Graphs (a) and (b) raise an interesting issue that must be resolved. The back edge in both of these can potentially be structured as the defining back edge for a loop. At the HLL level, overlapping nway conditionals (switch) and loops will result in a large number of gotos being used. To prevent this, we must make a choice between one or the other when they overlap. We choose to prioritise nways above loops as an nway can have a large number of clauses. If we were not to structure these as clause of an nway, then each one would involve generating a goto. Loops, on the otherhand, will only involve generating one goto for the backward edge.

A

B C D E

FF

A

DCB E

(a) (b) (c)

A

B C D E

F

Figure 4.9: Unstructured nway conditionals.

4.4.4 Multi-structure headers Unlike the original algorithm given by Cifuentes in [4], a node in a control flow graph is not restricted to being a header of only one type of control structure. However, the only case in which this can happen is when a node is the header of a repeat or do loop header and is also the header of a 2way conditional. This exception can occur in both structured and unstructured graphs as shown in Figure 4.10. Node 1 in graph (a) is both a do loop and a single-branch conditional header. This intuitively makes sense as the header of an endless loop does not actually control the loop iterations. It is merely the first node inside the loop. If this kind of multi-structuring was not enabled for a single node, a goto would have to be used instead. The same applies to graph (b). Node 1 here can be the header of both a repeat loop and a 2way conditional. The edge (2,5) will generate a goto in this case

54 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS but without the multi-structuring, another goto would also be required for the edge (4,1).

1

2 3

1 2 3

4

5 (b)(a) Figure 4.10: (a) A structured and (b), an unstructured graph in which node 1 is the header of two different control structures.

4.5 Structuring Algorithms This section presents the algorithms designed to determine the underlying control structures of an arbitrary control flow graph (i.e. structured or unstructured, reducible or irreducible). The nodes within the graph are augmented with the structuring information determined which is used during the following phase when generating the high level code.

As we are considering two variations on the algorithm used to structure loops, the treatment of both algorithms is given in separate sections. It also means that the phase diagram in Figure 4.11 describing the order in which the structuring algorithms are performed is not the usual simple sequence. Instead, there is a branch in the path followed by the structuring process when it uses different algorithms depending on how it does the loop structuring.

The determine ordering phase imposes an ordering between the nodes of the graphs. The second phase, determine reverse ordering, uses the concept of a reverse graph and imposes a similar ordering between the nodes in this graph. The set parenthesis phase prepares the nodes for loop structuring under the PT algorithm of Section 4.3.2. However as mentioned earlier, the parenthesis of a node can be used to make improvement to both quality of the generated code when 2way conditionals overlap with either loops or

4.5. STRUCTURING ALGORITHMS 55

Determine

Ordering

Determine Reverse

Ordering

Conditionals

Structure

Determine Immediate

Post-dominators

Build DerivedSequences

StructuringPhases Unstructured CFG

Restructure Strucured CFG

Structure Structure

ParenthesesSet

LoopsLoops DSG DSGPT Figure 4.11: The phases performed during structuring. Branches taken by PT algorithms are labelled `PT' and branches taken by DSG algorithms are labelled `DSG'.

nway conditionals and so this phase is common to both variations of the overall structuring algorithms. The build derived sequences phase, builds the data structures required for loop structuring under the DSG algorithm Section 4.3.1. The next phase, determine immediate post-dominators does exactly that for each node in the graph. The structure conditionals phase detects the potential conditionals represented within the CFG. The phases diverge again here depending on which loop structuring algorithm is being used. Each alternative, structure loops (PT or DSG), detects the loops represented within the CFG using one of the two loop structuring algorithms. Finally, the restructure phase gives the aforementioned improvement to the overall structuring process. It reanalyses the 2way conditionals detected in

56 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS the conditional structuring phase to see if they can be restructured when they overlap with an nway conditional or a loop. Each phase is next explained in detail.

4.5.1 Determine Node Ordering The first step in structuring a CFG is to define an ordering between the nodes within the graph. Apart from other uses, this provides an efficient means of traversing the graph in a linear but partially-ordered fashion. The ordering we define is the post-order of the nodes. The node that was last visited first during a DFS traversal will be first in the ordering and the start node will have an order equal to the number of nodes within the graph. That is, the nodes are ordered according to when they were last visited. A traversal of the nodes according to this ordering can be visualised as starting from the bottom of the graph and working to the top of the graph.

During the traversal used to determine this ordering, an ordering is also imposed on the in edges to the node from its predecessors. This is the ordering used when later algorithms include a traversal of the in edges in ascending order.

The algorithm designed to perform this traversal is given in Figure 4.12.

procedure DefineOrdering (G = (N ; E ; s); var ord) * Pre: G is a graph * Post: the reverse post ordering has been determined

traversed(s) := true for (n 2 succ(s))

append s to the end of the predecessor ordering for n if (: traversed(s))

DefineOrdering (G = (N ; E ; n); ord) order(s) := ord ord := ord + 1 end procedure

Figure 4.12: Algorithm to establish the reverse post-order within the nodes of the original CFG.

4.5. STRUCTURING ALGORITHMS 57 4.5.2 Determine Reverse Ordering The reverse graph G

rev

= (N

rev

; E

rev

; h

rev

) of the graph G = (N ; E ; h) can be

defined by the following set of equations (we assume that a unique eit node has been added if it did not already exist):

1:N

rev

= N

2:E

rev

= f(n; m) j (m; n) 2 E g

3: 9

1

h

rev

2 N ffl @n 2 N ffl (h

rev

; n) 2 E

The order between the predecessors established by the algorithm in Figure 4.12 gives us an ordering between the successors of each node in the reverse graph. The importance of having this relationship between a node's predecessors in the original graph and it successors in the reverse graph becomes apparent when determining the immediate post-dominator of each node. This process relies upon the reverse ordering of the nodes. There are multiple possible reverse orderings and so the aforementioned relationship enables us to constrain the reverse ordering to be directly related to the forward ordering.

The algorithm for determining the post-ordering of the nodes in the reverse graph in shown in Figure 4.13. Note that the input to this algorithm is the reverse graph. We never actually need to construct this graph as it is just the original graph, viewed in a different way (as shown by the above equations).

procedure DefineReverseOrdering (G

rev

= (N

rev

; E

rev

; h

rev

); var ord)

* Pre: G

rev

is the reverse of graph G and the ordering

* between the predecessors of each n 2 G has been established * Post: the reverse post ordering has been determined

traversed(s) := true for (n 2 succ(h

rev

) in ascending order)

if (: traversed(h

rev

))

DefineReverseOrdering (G = (N

rev

; E

rev

; n); ord)

revOrder(s

rev

) := ord

ord := ord + 1 end procedure

Figure 4.13: Algorithm to establish the reverse post-order for the nodes within the reverse CFG.

58 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

2 43 11 12

13

5

7 8

9

6

1

10

Node Post Order in

Reverse Graph

1 13 2 11 3 4 4 5 5 8 6 10 7 7 8 9 9 6 10 3 11 2 12 12 13 1

Figure 4.14: A control flow graph with the nodes identified by their post-order value. The table gives the post-order of each node in the reverse graph. The nodes in the `Node' column correspond with the nodes in the graph shown.

The control flow graph shown in Figure 4.14 shows how the post-ordering of the nodes enables a top to bottom (or bottom to top) traversal of the graph. The nodes are labeled by their post-order value. This convention of identifying nodes by their post-order value will be followed for the graphs presented hereafter. In this graph, the true edges for a 2way node are on the left. The accompanying table displays the post-order for each node in the reverse graph.

4.5.3 Immediate Post Dominators The intuitive interpretation of Definition 4.4 is that the immediate postdominator, ipd , of a node, n, is the closest node to n such that all paths from n to the graph's exit node pass through ipd . Hecht and Ullman in [12] give an algorithm for finding the dominator tree of a control flow graph. If we consider the graph to be the reverse of another graph, then this algo 4.5. STRUCTURING ALGORITHMS 59 rithm gives us the post-dominator tree. As we are only concerned with the immediate post-dominators, a simplification of their algorithm is fine for our purposes. This simplified version is given in Figure 4.15 and Figure 4.16.

procedure FindImmPostDominators(G = (N ; E ; s); RevG) * Pre: RevG is the reverse of G and the post-ordering of its nodes has been determined * Post: the immediate post-dominator for each node has been determined.

/* first pass */ for (all n 2 N in descending order in RevG)

iPDom(n) := undefined for (all c 2 pred

rev

(n))

if (revOrder(c) ? revOrder(n))

iPDom(n) := CommPostDom(iPDom(n),c)

/* second pass */ for (all n 2 N in ascending order in G ffl # succ(n) ? 1)

for (all c 2 succ(n))

iPDom(n) := CommPostDom(iPDom(n),c)

/* third and final pass */ for (all n 2 N in ascending order in G ffl # succ(n) ? 1)

for (all c 2 succ(n))

if (isBackEdge(n; c) ^ order(iPDom(c)) ! order(iPDom(n))

iPDom(n) := CommPostDom(iPDom(n),iPDom(c)) else

iPDom(n) := CommPostDom(iPDom(n),c) end procedure

Figure 4.15: Algorithm for calculating the immediate post-dominators.

The first algorithm, FindImmPostDominators, makes use of the ordering defined between the nodes of the reverse graph. To make sure that the immediate-post dominator for each node is correct with respect to the original CFG, we require three passes over the nodes in the graph.

The first pass traverses the nodes using the ordering defined within the reverse graph. Intuitively, we traverse the nodes in the reverse graph from top to bottom. For each node, we consider its predecessors that are higher in the reverse graph (i.e. they are not reached via a back edge). If there is only one such predecessor, then it is determined to be the immediate postdominator. Otherwise, the immediate post-dominator is initialised to be the common post-dominator of all the predecessors. Finding this common postdominator is performed by the CommPostDom algorithm. The result of this

60 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

procedure CommPostDom (var ipd; s) * Pre: ipd is the current immediate post-dominator and s is a successor of the * node being considered and is lower in the graph * Post: ipd is the immediate post-dominator of the node under consideration.

if (ipd = undefined)

ipd := s else if (s = undefined)

ipd := ipd else

while (ipd and s are defined ^ ipd 6= s)

if (revOrder(ipd) ? revOrder(s))

s := iPDom(s) else

ipd := iPDom(ipd) end procedure

Figure 4.16: Common post-dominator algorithm.

first pass is that every node except for the exit node has now been given an immediate post-dominator.

The second pass is required because the immediate post-dominator information determined in the first pass is not necessarily correct with respect to the original control flow graph. This occurs only when the original graph includes loops, which is the case for most control flow graphs. The CFG shown in Figure 4.17 is an example where we cannot rely upon the immediate postdominator information gained from only one pass. The three graphs depict the state of each node after each pass. Each node is given an alphanumeric identifier. The parenthesised data for each node is its position in the reverse ordering of the graph and its immediate post-dominator after the relevant phase. Shaded nodes are those whose immediate post-dominator changed after the previous pass.

The immediate post-dominators determined for nodes B,C and E after pass 1 are wrong. For node B, there is a path (B,D,H) to the exit node (H) that does not pass through C. Therefore, C is not a post-dominator of B, let alone the immediate post-dominator. The reason for this error is that the reverse ordering algorithm is `confused' by the presence of loops. For instance, one would expect node D to have a higher reverse order than both A and B as it is `higher' than both of these in the reverse graph. The problem arises from the fact that it is impossible to include in the reverse ordering algorithm any

4.5. STRUCTURING ALGORITHMS 61

(a) (b)

(c)

H(8,-) G(7,H)(4,B)F (6,G)E

(3,E)C (1,H)D

B(5,H)

A(2,B)

H(8,-) G(7,H)(4,B)F (6,G)E

(3,E)C (1,H)D

B(5,C)

A(2,B)

H(8,-) G(7,H)(4,B)F (6,H)E

(3,H)C (1,H)D

B(5,H)

A(2,B)

Figure 4.17: A CFG that has different reverse graphs. capabilities for dealing with loops in the reverse graph as it does not know what are back edges until the reverse ordering has been completed.

The second pass attempts to correct the remaining errors by doing another traversal based on the original ordering of the CFG. This time, only nodes that have 2 or more out edges are considered as the immediate postdominators determined for single out edges in pass one will be correct (there is no other choice of immediate post-dominators for these nodes). In the example, only node B's immediate post-dominator is corrected. Nodes C and

62 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS E are left uncorrected as they were traversed in the second pass before B was and they both rely on B to be correct.

After the second pass, there is still a possibility (as shown in the example) that some nodes within a loop (including the latching node) will still have the incorrect immediate post-dominator. The third pass corrects these remaining nodes. Preliminary analysis seems to indicate that a third pass is only required when the graph contains overlapping loops such as the two loops in the example induced by the back edges (D,A) and (F,B). The table in Figure 4.18 gives the immediate post-dominators determined for the nodes in the graph of Figure 4.14.

Node 1 2 3 4 5 6 7 8 9 10 11 12 13 ImmPDom - 12 2 2 2 2 2 6 2 2 2 1 1

Figure 4.18: The immediate post-dominators for each node in the graph of Figure 4.14.

4.5.4 Structuring Conditionals When structuring conditionals, we are analysing nodes that have more than two out edges. For the subgraph headed by one of these nodes, there will be a unique node such that all paths from each of the out edges from the conditional header converge upon it. As mentioned in Sections 4.4.2 and 4.4.3, this is referred to as follow node. Defined in this way, the follow node corresponds with the conditional header's immediate post-dominator and is already determined once this immediate post-dominator is known.

The set of nodes on all paths starting from some successor of a conditional header is called a branch. This set is defined in terms of the conditional h3f and one of the succesors of h such that it is not f .

Definition 4.12 For 9 s 2 (succ(h)nff g), a branch of the conditional h3f is fn 2 N j n is on a path s ! f ^ n 62 fh; f gg.

Note that the above definition implies that a branch may be empty. If a subgraph represents a structured conditional h3f , then it will satisfy the following two properties:

4.5. STRUCTURING ALGORITHMS 63

1. 8 b

i

2 branches of h3f ffl

"

i

(b

i

) = f g

2. 8 b 2 branches of h3f ffl (@n 2 b ffl (m; n) 2 E ^ m 62 (fhg [ b))

The first property states that the membership of the branches must be mutually exclusive (i.e no node may belong to more than one branch of a given conditional). The second property says the predecessor of any node within a branch must be either another node within the same branch or the conditional header. An unstructured conditional will violate at least one of these properties.

Whenever, conditional subgraphs are nested within each other, it may be the case that they share the same follow node. This will mean that both the outer and inner conditional headers will share the same immediate post-dominator. The same will also be true for overlapping (unstructured) conditionals.

2 43 11 12

13

5

7 8

9

6

1

10

Figure 4.19: A control flow graph with the potential conditional nodes shaded.

64 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS Consider the graph in Figure 4.19. The shaded nodes indicate the potential conditional header nodes. Even though some of these may later be structured as loop headers or latching nodes, we still consider them when structuring conditionals. The reason for this is that a node may be both a conditional and a loop header (as discussed in Section 4.4.4).

Determining the type of a 2way conditional For a 2way conditional, we seek to determine exactly what type of 2way conditional it is. The possibilities are:

ffl if..then..else - neither successor of the header is the follow node or at

least one successor is reached by a back edge.

ffl if..then - the false successor is the follow node. ffl if..else - the true successor is the follow node.

Figure 4.19 gives an example of these 2way conditional types. The conditionals (1331) and (1231) are if..else type of 2way conditionals as their true branches (i.e. the left branches) lead directly to their follow nodes. When generating HLL code for this type of conditional, we have the option of either generating an if..then..else that has an empty then clause or we can negate the condition in the header node and generate an if..then statement. The latter is a more natural looking statement and so we choose it. The conditionals (732), (932) and (1132) are if..then..else's as neither of the successors of the respective header nodes are the follow node. The conditional (532) is an interesting case as one of its out edges is a back edge. This type of node will always be initially structured as an if..then..else header. As conditional structuring is performed before loop structuring, there is still a chance it may be detected as the latching node for a loop.

Determining the nodes within an nway conditional For an nway conditional, we want to tag all the nodes on a branch of the conditional with the header node of their most enclosing nway conditional. This information will be used later when performing loop structuring.

4.5. STRUCTURING ALGORITHMS 65 The algorithm used to tag each node within a given nway conditional h3f is essentially a modified depth first traversal from h. All nodes (except for the nway header node and its follow) visited during the traversal are tagged with h as their nway header unless it has already been tagged with another nway header node or it was reached via a back edge. The base case for the recursive call is when the follow node is reached. If h3f encloses another nway conditional, h

2

3f

2

(where f

2

may be f ), then all the nodes within h

2

3f

2

will already have been tagged. This is because we use the post-ordering of

the nodes to traverse the graph in such a way that inner nway conditional will be structured first. Therefore, when we come across the header of an inner (or nested) nway conditional, we can jump directly to the follow node of the inner conditional. This means that each node will only be traversed once during all the nway tagging traversals for the complete CFG. The only nway conditional in Figure 4.19 is (1032). The nway tagging traversal for this conditional will tag nodes 3 : : 9 as having node 10 as their most enclosing nway conditional header. The algorithm that performs this traversal is shown in Figure 4.20.

procedure TagNodesInCase (G = (N ; E ; n); h; f ) * Pre: (h3f ) is an nway conditional and n = h or is in a branch of (n3f ) * Post: all the nodes within the branches of (h3f ) * that are not already members of another nway conditional have been * tagged as belonging to this nway

if (n 6= h)

caseHead(n) := h if (nodeType(n) = nway ^ condFollow(n) 6= f )

TagNodesInCase(G = (N ; E ;condFollow(n)); h; f ) else

for (all c 2 succ(n) ^ caseHead(c) is undefined)

if (c 6= f ^ : isBackEdge(n; c)

TagNodesInCase(G = (N ; E ; c); h; f ) end procedure

Figure 4.20: Algorithm to Tag the Nodes within an Nway Conditional.

Complete Conditional Structuring Algorithm The complete algorithm for structuring both types of conditionals is given in Figure 4.21. The nodes of the graph are traversed by an ascending post 66 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS order traversal. This ensures that inner conditionals are structured before outer ones. Only nodes with at least two out edges are analysed during the traversal.

procedure StructConds (G = (N ; E ; h)) * Pre: The immediate post-dominator has been determined for each node * Post: The conditional headers have been tagged, their follows have been determined. * Each node has been tagged with the header of its most enclosing nway conditional * and the type of a 2way conditional has been determined.

for (all n 2 N in ascending order ffl #succ(n) ? 1)

structType(n) := Cond if (9 m ffl (n; m) 2 E ^ isBackEdge(n; m))

condType(n) := ifThenElse else

condFollow(n) := iPDom(n)

if (n is an nway node)

condType(n) := Case TagNodesInNway(n,condFollow(n)) elseif (succ(n,Then) = condFollow(n))

condType(n) := ifElse elseif (succ(n,Else) = condFollow(n))

condType(n) := ifThen else

condType(n) := ifThenElse end procedure

Figure 4.21: Conditional Structuring Algorithm

4.5.5 Loop Structuring using Derived Sequences of Graphs This section describes how loops are structured using the DSG theory. It is based upon the work on loop structuring given in [5]. Slight modifications have been made as a result of further development on the original algorithm.

The first step in structuring loops is to determine how a loop is represented within a CFG. A loop must involve a back edge from its latching node to its header node. Therefore, we want to know both of these nodes for any loop. We also want to know which nodes are within a given loop as well as a nesting order between the loops. Hecht in [8] made the point that the

4.5. STRUCTURING ALGORITHMS 67 representation of loops by cycles is too fine as not all loops are properly nested or disjoint. The use of strongly connect components is too coarse as there is no nesting order whereas strongly connected regions do not provide a unique coverage of the graph and does not cover the whole graph. The interval and derived sequence of graphs theory discussed in Section 4.3.1 provides us the information we require: one loop per interval and a nesting order given by the derived sequence of graphs.

2 43 11 12

13

5

7 8

9

6

10

I3={3,4,10..12}

I1={13}

I5={2} 1 I2={1}

I4={5..9}

Figure 4.22: The intervals of the Control Flow Graph of Figure 4.14 For some interval I (h) (where h is the interval header), there is a loop headed by h if there is an edge from some node l 2 I (h) to h. These edges will invariably be a back edge and l will be the potential latching node for the loop. Figure 4.22 shows the intervals determined for the graph of Figure 4.14. The interval I

4

is the only interval so far that has a loop in it. This loop,

9 \Phi 5 results from the back edge from 5 to 9.

To detect the loop 12 \Phi 2, we make use of the derived sequence of graphs. To find the next graph in the sequence, we collapse each interval into a single node and form the new graph from these nodes. Figure 4.23 shows the derived sequence of graphs for the original flow graph of Figure 4.22. In the

68 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS first derived graph, G

1

, the loop 12 \Phi 2 is now detectable by the back edge

from the node I

5

(which contains node 2) to the node I

3

(which contains node

12). The derived graph G

2

does not contain any back edges and therefore,

there are no more loops in the graph.

I1 I2 I3

I4 I5

I8={I3,I4,I5} I6={I1}I7={I2} I6

I7 I9={I6,I7,I8} I9

I8

G1

G2

G3

Figure 4.23: Derived Sequence of Graphs for Figure 4.22 The nesting of loop 9 \Phi 5 within loop 12 \Phi 2 is derived from the fact that the former is was detected earlier in the sequence of derived graphs.

We made a note earlier that a back edge from within an interval to the interval header is only a potential latching node. It may be the case that a more suitable latching node can be found. Consider the example in Figure 4.24. In the first graph, G

0

, the interval I

1

contains a back edge to the interval

header, node 1, from node 2 (which is within the same interval). This will result in the initial structuring of the loop 1 \Phi 2. When considering the next graph in the sequence, G

1

, we are presented another possible latching

node, node 4 (from interval I

2

) for a loop headed by node 1. We therefore

restructure the loop to be 1 \Phi 4 as it encloses more nodes. We generalise this decision and state that the latching node found latest in a sequence of derived graphs will always overide an alternative latching node for the same loop header found earlier in the sequence.

Given a control flow graph G = G

0

that has been annotated with its intervals,

its derived sequence of graphs G

0

: : G

n

and the set of intervals of these

graphs, I

0

: : I

n

, we can use the following algorithm to structure loops: each

header of an interval in G

0

is checked for having a back edge from a latching

node that belongs to the same interval and is enclosed by the same nway

4.5. STRUCTURING ALGORITHMS 69

1 2 3

I1={1,2} I3={I1,I2}

G1 I1 I2

4

5

I2={3,4,5}

G0 Figure 4.24: A loop header with two potential latching nodes.

conditional

2

. We choose the latching with the least post-order value if there

is more than one. If a latching node was found, then we check to see if the header has already been structured as a loop header (from a derived graph earlier in the sequence). If so, then this latching node is set back to what is was determined to be (an if..then..else conditional header) before being detected as a latch node and all the nodes within the now defunct loop have their loop header marker set back to undefined. We now set the latch node to be the most recently found latch node and together with the header node, a loop has been found. We now determine its type (preTested, postTested or endless) and mark all the nodes belonging to it. Next, the intervals of G

1

, I

1

are checked for loops and the process is repeated until the intervals in

I

n

have been processed. Whenever there is a potential loop (i.e. an interval

header that has a back edge into from a node within the same interval) such that the loop header and latch node are currently marked as belonging to different loops, the new loop is disregarded as it belongs to an unstructured loop. This back edge will always generate a goto during code generation. The complete algorithm is given in Figure 4.25. This algorithm correctly determines the appropriate nesting levels between the loops. Section A.3 describes the loopNodes attribute used in this and other loop structuring algorithms.

2

See Section 4.4.3 for an explanation of why this check is made.

70 CHAPTER 4. STRUCTURING ASSEMBLY GRAPHS

procedure StructLoops (G = (N ; E ; h)) * Pre: G

1

: : G

n

and I

1

: : I

m

have been constructed.

* Post: all nodes of G have been tagged with their most enclosing loop and all * loop headers have been given their loop type, loop follow (if any) and latching node

for (G

i

:= G

1

: : G

n

)

for (I

j

with header h

j

:= each of the intervals in G

i

)

if (9 x 2 nodesOf(I

j

) ffl

(x ; h

j

) 2 E ^ loopHead(x ) is undefined ^ caseHead(x ) = caseHead(h

j

) ^

@y 2 nodesOf(I

j

) ffl (y; h

j

) 2 E ^ order(y) ! order(x ))

if (h

j

is already a loop header with latch l

j

)

for (n 2 loopNodes(h

j

\Phi l

j

))

loopHead(n) := undefined reset condType(l

j

) if necessary

structType(h

j

) := loop

FindLoopNodes(h

j

; x ; I

j

)

DetermineLoopType(h

j

; x )

FindLoopFollow(h

j

; x )

end procedure

Figure 4.25: Loop Structuring Algorithm.

Determining the Nodes in a Loop The graph in Figure 4.14 contains two loops that will be detected under the DSG algorithm; 12 \Phi 2 and 9 \Phi 5. It can be seen that for both of these loops, the set of nodes given by the following equality is an intuitively valid membership for each loop:

loopNodes(h \Phi l ) = fn 2 N ffl order(l ) ? order(n) ! order(h)g That is, the loop membership is determined to be all the nodes that have a post-order value in between the post-order values