Lecture: Evaluating Trees
Dear students:
We can't do much with a program in its text form. But if we turn it into a tree, then we can act on it. We can evaluate it, turn it back into nicely formatted source code, turn it into machine code, and so on. Today we consider a sustainable approach to defining tree operations. Instead of fragmenting the code for an operation across the many node classes, we'll put them in one place: a visitor.
Ruby Investigation
Before we dive into the problem of operating on trees, let's investigate the Ruby way of accomplishing certain programming tasks. We'll number off as tables. Answer your table's question from this list:
- What's weird about returning values in Ruby?
- What's the idiomatic way to loop through a collection in Ruby?
- How can I form a string in Ruby without ugly concatenation?
- How do we write getters and setters in Ruby?
Consult several sources to confirm your answer. Discuss it with your tablemates.
Node Operations
Once a program is in a tree form, we can do things with it. In general, we can feed it into a machine and get back something else.

What sort of operations might we want to perform on the tree? I can think of the following tasks:
- execute the program
- reformat the code (or serialize the tree back into text)
- check the types
- rename a variable in a particular scope
- generate documentation
- lint the code
- generate machine code
To implement these operations across a tree, our object-oriented intuition tells us to put a method in every class of our node hierarchy. The code would look something like this:
module Ast
class Integer
def initialize(raw_value)
@raw_value = raw_value
end
def evaluate
# ...
end
def serialize
# ...
end
def check_types
# ...
end
end
class Vector2
def initialize(raw_x, raw_y)
@raw_x = raw_x
@raw_y = raw_y
end
def evaluate
# ...
end
def serialize
# ...
end
def check_types
# ...
end
end
class Print
def initialize(parameter_node)
@parameter_node = parameter_node
end
def evaluate
# ...
end
def serialize
# ...
end
def check_types
# ...
end
end
end
module Ast
class Integer
def initialize(raw_value)
@raw_value = raw_value
end
def evaluate
# ...
end
def serialize
# ...
end
def check_types
# ...
end
end
class Vector2
def initialize(raw_x, raw_y)
@raw_x = raw_x
@raw_y = raw_y
end
def evaluate
# ...
end
def serialize
# ...
end
def check_types
# ...
end
end
class Print
def initialize(parameter_node)
@parameter_node = parameter_node
end
def evaluate
# ...
end
def serialize
# ...
end
def check_types
# ...
end
end
endThis approach makes for some very scattered logic. If there are \(n\) classes in our hierarchy, we'll have \(\frac{1}{n}\) of each algorithm in each class. Wouldn't it be nice if we could put all the evaluate logic together? And all the serialization logic together in a separate structure? We can, using a little indirection.
Visitor Pattern
Let's take all the serialize methods out of the individual classes and put them into one Serializer class that is focused entirely on serialization. As we extract the methods, we rename so they are unique. Since we're not in the classes anymore, the nodes must come in as a parameter. We end up with something like this:
class Serializer
def visit_integer(node)
# ...
end
def visit_vector2(node)
# ...
end
def visit_print(node)
# ...
end
end
class Serializer
def visit_integer(node)
# ...
end
def visit_vector2(node)
# ...
end
def visit_print(node)
# ...
end
endLikewise, all the evaluate methods go in a class named Evaluator:
class Evaluator
def visit_integer(node)
# ...
end
def visit_vector2(node)
# ...
end
def visit_print(node)
# ...
end
end
class Evaluator
def visit_integer(node)
# ...
end
def visit_vector2(node)
# ...
end
def visit_print(node)
# ...
end
endWe'll make a class for each operation that we want to perform on a tree. The classes will be focused and easier to maintain. We will like our code and our smart selves.
We have lost something by moving these methods away from the objects. No longer can we ask an object to go serialize itself or evaluate itself. To serialize a node now, we have to know its type so we can call the appropriate method.
node = Ast::Print.new(Ast::Vector2.new(6, 3))
# The old way was flexible.
# node.serialize
# The new way is not flexible.
serializer = Serializer.new
serializer.visit_print(node)
node = Ast::Print.new(Ast::Vector2.new(6, 3)) # The old way was flexible. # node.serialize # The new way is not flexible. serializer = Serializer.new serializer.visit_print(node)
However, we are not done. All our tree algorithm machines have the same interface. They visit nodes. They are visitors. We can give each object exactly one method that tells it how to be visited:
module Ast
class Integer
def initialize(raw_value)
@raw_value = raw_value
end
def visit(visitor)
visitor.visit_integer(self)
end
end
class Vector2
def initialize(raw_x, raw_y)
@raw_x = raw_x
@raw_y = raw_y
end
def visit(visitor)
visitor.visit_vector2(self)
end
end
class Print
def initialize(parameter_node)
@parameter_node = parameter_node
end
def visit(visitor)
visitor.visit_print(self)
end
end
end
module Ast
class Integer
def initialize(raw_value)
@raw_value = raw_value
end
def visit(visitor)
visitor.visit_integer(self)
end
end
class Vector2
def initialize(raw_x, raw_y)
@raw_x = raw_x
@raw_y = raw_y
end
def visit(visitor)
visitor.visit_vector2(self)
end
end
class Print
def initialize(parameter_node)
@parameter_node = parameter_node
end
def visit(visitor)
visitor.visit_print(self)
end
end
endWith these bridging methods, we again put the authority back in the hands of the objects themselves. This node knows how to get itself visited:
node = Ast::Print.new(Ast::Vector2.new(6, 3))
node.visit(Serializer.new)
node = Ast::Print.new(Ast::Vector2.new(6, 3)) node.visit(Serializer.new)
This way of organizing objects without polluting the hierarchy with fragments of their operations is called the visitor pattern. In some languages where types are checked early, like Java, we would need to make a Visitor interface or abstract class. Not so in Ruby. A visitor visits if it has the right visit methods.
Serializer
We're ready to fully implement our serializer. Each visit method must return a string representation of the node. The primitives aren't too bad:
class Serializer
def visit_integer(node)
node.raw_value.to_s
end
def visit_vector2(node)
"[#{node.x_value}, #{node.y_value}]"
end
end
class Serializer
def visit_integer(node)
node.raw_value.to_s
end
def visit_vector2(node)
"[#{node.x_value}, #{node.y_value}]"
end
endOh, but serialization fails when we test this. We can't access instance variables from outside a class in Ruby. They are private, and that can't be changed. We must add getters to all our classes, like this:
class Integer
attr_reader :raw_value
def initialize(raw_value)
@raw_value = raw_value
end
end
class Integer
attr_reader :raw_value
def initialize(raw_value)
@raw_value = raw_value
end
endThe remaining visit methods are implemented similarly, but most of them have other nodes as children. We will need to recurse across the node hierarchy to fully serialize a tree. We might serialize an Add node like this:
class Serializer
# ...
def visit_add(node)
"(#{node.left_node.visit(self)} + #{node.right_node.visit(self)})"
end
end
class Serializer
# ...
def visit_add(node)
"(#{node.left_node.visit(self)} + #{node.right_node.visit(self)})"
end
endWe put parentheses around the string to make the association clear.
In the end, we'll have a machine that can turn trees into text.
Evaluator
Each method of the Evaluator must produce a tree with just a single primitive node, an Integer or a Vector2. That means the visit methods for the primitives have it easy:
class Evaluator
def visit_integer(node)
node
end
def visit_vector2(node)
node
end
end
class Evaluator
def visit_integer(node)
node
end
def visit_vector2(node)
node
end
endThe print command is not an expression. It doesn't yield a value but instead achieves a side effect. It has a parameter node, which could be any arbitrary tree. How do we turn that parameter into something we can print? We evaluate it down to a primitive. Then we check the type of that primitive to see what it is. Like this:
class Evaluator
# ...
def visit_print(node)
primitive = node.parameter_node.visit(self)
if primitive.is_a?(Ast::Integer)
puts primitive.raw_value
elsif primitive.is_a?(Ast::Vector2)
puts "[#{primitive.raw_x_value}, #{primitive.raw_y_value}]"
else
raise "type can't be printed"
end
end
end
class Evaluator
# ...
def visit_print(node)
primitive = node.parameter_node.visit(self)
if primitive.is_a?(Ast::Integer)
puts primitive.raw_value
elsif primitive.is_a?(Ast::Vector2)
puts "[#{primitive.raw_x_value}, #{primitive.raw_y_value}]"
else
raise "type can't be printed"
end
end
endThe visit methods for Add and Multiply are implemented similarly. Variables require a bit more work. We need the evaluator to keep track of what primitives are associated with what names. Let's create a helper data structure for tracking information that must be managed as a program executes. We'll call it Runtime. Currently it will just have a dictionary (hash) of the variables:
class Runtime
def initialize
@variables = {}
end
def get_variable(identifier)
@variables[identifier]
end
def set_variable(identifier, primitive)
@variables[identifier] = primitive
end
end
class Runtime
def initialize
@variables = {}
end
def get_variable(identifier)
@variables[identifier]
end
def set_variable(identifier, primitive)
@variables[identifier] = primitive
end
endLater on we may choose to add extra execution state. We pass a Runtime along to our Evaluator to do lookups and assignments:
class Evaluator
def initialize(runtime)
@runtime = runtime
end
def visit_variable_reference(node)
# TODO: raise error if unknown
@runtime.get_variable(node.identifier)
end
def visit_assignment(node)
primitive = node.right_node.visit(self)
@runtime.set_variable(node.identifier, primitive)
end
end
class Evaluator
def initialize(runtime)
@runtime = runtime
end
def visit_variable_reference(node)
# TODO: raise error if unknown
@runtime.get_variable(node.identifier)
end
def visit_assignment(node)
primitive = node.right_node.visit(self)
@runtime.set_variable(node.identifier, primitive)
end
endIn the end, we'll have a machine that can turn a tree into a value.
The task of developing model classes and writing these two visitors is exactly what you will be doing in milestone 1 of the spreadsheet project.
TODO
Here's your list of things to do before we meet next:
See you for our first, in which we continue exploring lexers, grammars, and precedence.
Sincerely,
P.S. It's time for a haiku!