Expression Trees - Musings page

CMIS 240
Notes on Expression Trees
By Nicholas Duchon (copyright)
And you can at least check your work with tool I wrote:

Examples

See the infix, prefix, postfix and expression trees associated with the following expressions:
The expression (a+b)/(c+d)+(e-f-g)/(h+j)

Infix to Postfix Algorithm

The main algorithm is the one presented by Edsger Dijkstra

And here's a simpler version, from Stallings:

// input: infix expression, tokenized
// data structure: temporary stack of operators
// output: postfix expression
// TOS = abbreviation for top of stack, the result of peek()

while there are more tokens
  if token is "("
    push token onto stack
  else if token is ")"
    while ((v = pop stack) != "(")
      output v
  else if token is operator
    while stack not empty AND TOS != "(" AND priority of operator <= priority of TOS operator
      output pop stack
    push operator
  else
    output token // assume token is a variable or number
end while there are more tokens

while stack not empty
  output pop stack
end while there are more items on the stack


Notes

Expression trees and the related topics of infix, prefix and postfix notation are particularly interesting applications of the relatively simple binary tree data structure and the traversal algorithms.

The basic problem with arithmetic expressions is already apparent in 3rd or 4th grade, when you were introduced to the mnemonic "Please Excuse My Dear Aunt Sally", meaning that arithmetic operations should be performed in the following order:
Mathematical notation typically makes it quite clear what is under the division sign, but in any case, the mathematical order is the same as the order used in programming languages. Thus, in C, C++, Java, Fortran, Ada, Basic and most other programming languages:
But this is only a minor problem compared to the following, which is the basic problem in arithmetic:
As you can see from this example, parentheses are the way to make sure the operations are done in the order the programmer wants. By taking any arithmetic expression and adding a full set of parentheses, one can be absolutely certain that the operations will occur in exactly the order desired:
Mathematicians proposed an interesting pair of notations for getting rid of parentheses: prefix and postfix. Both of these notations should be thought of as working with a stack, where a string is processed left to right and entries are either put on a stack or two, and when matching entries are encountered, an arithmetic operation is performed. This is pretty abstract, so let's consider how to interpret a few prefix and postfix examples:

Postfix


Numbers are pushed onto a stack, and whenever an operation is encountered, the top two numbers on the stack are used as operands to the operation. The semantics is: (TOS-1) op (TOS). This last makes a difference for the asymmetric operations subtraction and division. For example, consider the following expression:
Token
TOS


a
a


b
b
a

+
a+b


c
c
a+b

d
d
c
a+b
*
c*d
a+b

-
(a+b)-(c*d)



As another example, consider the stack evaluation of the expression used in the second and third sequences at the start of this lecture:
Using the postfix version of this expression, we get the following table:
Token
TOS



a
a



b
b
a


+
a+b



c
c
a+b


d
d
c
a+b

+
c+d
a+b


/
(a+b)/(c+d)



e
e
(a+b)/(c+d)

f
f
e
(a+b)/(c+d)
-
e-f
(a+b)/(c+d)

g
g
e-f
(a+b)/(c+d)
-
(e-f)-g
(a+b)/(c+d)

h
h
(e-f)-g
(a+b)/(c+d)
j
j
h
(e-f)-g
(a+b)/(c+d)
+
h+j
(e-f)-g
(a+b)/(c+d)
/
((e-f)-g)/(h+j)
(a+b)/(c+d)

+
(a+b)/(c+d)+((e-f)-g)/(h+j)



Prefix

The evaluation of prefix expressions is best understood by using a stack in essentially the reverse of the postfix example, except that operators are also put on the stack. If there are two operands (TOS and TOS-1) followed by an opertor (TOS-2), evalution is performed, otherwise, push the token. The order matters for subtraction and division, and the semantics is: (TOS-1) (op = TOS-2) (TOS). Again, we will look at the expression in the second sequence that started this lecture:

Token
TOS






+
+






/
/
+





+
+
/
+




a
a
+
/
+



b
b
a
+
/
+


(eval)
a+b /
+




+
+
a+b
/
+



c
c
+
a+b
/
+


d
d
c
+ a+b
/
+

(eval)
c+d a+b /
+



(eval)
(a+b)/(c+d)
+





/
/
(a+b)/(c+d)
+




-
-
/
(a+b)/(c+d) +



-
-
-
/
(a+b)/(c+d) +


e
e
-
-
/
(a+b)/(c+d) +

f
f
e
-
-
/
(a+b)/(c+d) +
(eval)
e-f -
/
(a+b)/(c+d) +


g
g
e-f - / (a+b)/(c+d) +

(eval)
(e-f)-g /
(a+b)/(c+d) +



+
+
(e-f)-g
/
(a+b)/(c+d) +


h
h
+
(e-f)-g
/
(a+b)/(c+d) +

j
j
h
+
(e-f)-g / (a+b)/(c+d) +
(eval)
h+j (e-f)-g
/
(a+b)/(c+d) +


(eval)
((e-f)-g)/(h+j)
(a+b)/(c+d) +




(eval)
(a+b)/(c+d)+((e-f)-g)/(h+j)






Comparisons

Advantages of the different notations:
The expression tree is evaluated using a post-order traversal of the expression tree:
  1. If this node has no children, it shoud return the value of the node
  2. Evaluate the left hand child
  3. Evaluate the right hand child
  4. Then evaluate the operation indicated by the node and return this value
As exercises, you should use the examples I have created to create algorithms to convert any of the four representations of expressions to any of the others:
  1. Infix to tree
  2. Infix to postfix
  3. Infix to prefix
  4. Postfix to tree
  5. Postfix to infix
  6. Postfix to prefix
  7. Prefix to tree
  8. Prefix to infix
  9. Prefix to postfix
  10. Tree to infix
  11. Tree to prefix
  12. Tree to postfix
You could solve this problem by simply creating a circular set of algorithms, for example:
  1. Infix to tree
  2. Tree to postfix
  3. Postfix to prefix
  4. Prefix to infix


Last updated: November 8, 2003.