Lecture: Visiting

Dear students:

In this class, we continue to learn about programming languages in two ways: by building an interpreter for a small language and by building it in Ruby, a language that few of you have used before. The first step of working with programs in an invented language is to model them—to translate the program into a tree structure in a language we already know. The second step is to operate on that tree. That's what we'll do today. So far we have been building these trees by hand, which is painful.

What we talk about today is meant to mirror what you will need to do for milestone 1 in the project. You should take copious notes because everything we do today is on purpose.

Visiting

Last class we started writing a language for performing fraction arithmetic. We ended with several different kinds of nodes in our tree representation. There may be other operations we'd like to perform on ASTs, like these:

execute the program
reformat the code
translate to a different language
check the types
rename a variable in a particular scope
generate documentation
lint the code
generate machine code

Here's one way we could implement these operations: we could add a method to each node class. For example, to translate our AST to Ruby, we could write these methods:

Ruby

class Fraction
  # ...

  def translate
    "Rational(#{@top}, #{@bottom})"
  end
end

class Add
  # ...

  def translate
    "#{@leftNode.translate} + #{@rightNode.translate}"
  end
end

class Fraction
  # ...

  def translate
    "Rational(#{@top}, #{@bottom})"
  end
end

class Add
  # ...

  def translate
    "#{@leftNode.translate} + #{@rightNode.translate}"
  end
end

Okay, this doesn't look too bad. But let's look farther forward. Suppose the final version of our language has 30 different node types, and maybe 8 different tree operations we want to perform. This approach makes for some very scattered logic. If there are \(n\) classes in our hierarchy, we'll have \(\frac{1}{n}\) of each tree algorithm in each class. Wouldn't it be nice if we could put all the translation logic together? And all the evaluation logic together? We can, using a little indirection.

Visitor Pattern

Let's take all the translate methods out of the individual classes and put them into one Translator class that is focused entirely on translation. As we extract the methods, we rename them so they are unique. Since we're not in the classes anymore, the nodes must come in as a parameter. We end up with something like this:

Ruby

class Translator
  def visit_fraction(node)
    # ...
  end

  def visit_add(node)
    # ...
  end
end

class Translator
  def visit_fraction(node)
    # ...
  end

  def visit_add(node)
    # ...
  end
end

Likewise, all the evaluate methods go in a class named Evaluator:

Ruby

class Evaluator
  def visit_fraction(node)
    # ...
  end

  def visit_add(node)
    # ...
  end
end

class Evaluator
  def visit_fraction(node)
    # ...
  end

  def visit_add(node)
    # ...
  end
end

We'll make a class for each tree operation. The classes will be focused and easier to maintain. We will like our code and our smart selves.

We have lost something by moving these methods away from the objects. With something like node.translate, the object itself figures out what code to call. If these methods are moved out of the objects, we lose that knowing. No longer can we ask a node to go translate itself or evaluate itself. To translate a node now, we have to know its type so we can call the appropriate method, like visit_fraction or visit_add.

However, we are not done. All our tree processors have the same interface. They visit nodes. They are visitors. We can give each object exactly one method that tells it how to be visited:

Ruby

class Fraction
  # ...
  def visit(visitor)
    visitor.visit_fraction(self)
  end
end

class Add
  # ...
  def visit(visitor)
    visitor.visit_add(self)
  end
end

class Fraction
  # ...
  def visit(visitor)
    visitor.visit_fraction(self)
  end
end

class Add
  # ...
  def visit(visitor)
    visitor.visit_add(self)
  end
end

With these bridging methods, we again put the authority back in the hands of the objects themselves. This node knows how to get itself visited:

Ruby

node.visit(Translator.new)

node.visit(Translator.new)

This way of organizing objects without polluting the hierarchy with fragments of their operations is called the visitor pattern. In some languages where types are checked early, like Java, we would need to make a Visitor interface or abstract class. Not so in Ruby. A visitor visits if it has the right visit methods.

Object-oriented programming is neat, but it makes classes the organizing principle of our code. Sometimes we want to organize by behavior rather than class. That's when the visitor pattern comes in handy. It lets us factor out all the behaviors into their own class. The other two languages we look at this semester are not strictly object-oriented. Organizing by operation is much easier.

The job of every visit method in Translator is to produce what? A string. The job of every visit method in Evaluator is to produce what? A primitive. Here's how we might evaluate a Fraction node:

Ruby

class Evaluator
  def visit_fraction(node)
    node
  end
end

class Evaluator
  def visit_fraction(node)
    node
  end
end

Primitive nodes are already primitive nodes, so we can just give them right back. Operation nodes require more work. We must evaluate the children—which are arbitrary nodes—and then perform the operation:

Ruby

class Evaluator
  def visit_add(node)
    left_primitive = node.left_node.visit(self)
    right_primitive = node.right_node.visit(self)
    new_top = left_primitive.top * right_primitive.bottom + right_primitive.top * left_primitive.bottom
    new_bottom = left_primitive.bottom * right_primitive.bottom
    Fraction.new(new_top, new_bottom)
  end
end

class Evaluator
  def visit_add(node)
    left_primitive = node.left_node.visit(self)
    right_primitive = node.right_node.visit(self)
    new_top = left_primitive.top * right_primitive.bottom + right_primitive.top * left_primitive.bottom
    new_bottom = left_primitive.bottom * right_primitive.bottom
    Fraction.new(new_top, new_bottom)
  end
end

The other operations are implemented similarly.

Typechecking

What if we want to write an expression like 1/8 + 2? Can we represent this program using our model classes? Not exactly. We have a Fraction primitive, but not an Int primitive. Let's add this one:

Ruby

class Int
  attr_reader :raw_value

  def initialize(raw_value)
    @raw_value = raw_value
  end
end

class Int
  attr_reader :raw_value

  def initialize(raw_value)
    @raw_value = raw_value
  end
end

I'm going with the name Int because Integer is taken. Generally, I favor spelling things out over abbreviating them. To avoid the collision, we'd have to use namespaces.

With two primitive types in our language, we now have a dilemma. Our operations have to be sensitive to the kinds of values they are getting back from evaluating the child nodes. We can't assume they are fractions. That's nothing a good conditional statement can't fix:

Ruby

class Evaluator
  def visit_add(node)
    # Evaluate children
    left_primitive = node.left_node.visit(self) 
    right_primitive = node.right_node.visit(self) 

    # Typecheck and perform operation
    if left_primitive.is_a?(Int) && right_primitive.is_a?(Int)
      Int.new(left_primitive.raw_value + right_primitive.raw_value)
    elsif left_primitive.is_a?(Int) && right_primitive.is_a?(Fraction)
      left_primitive = Fraction.new(left_primitive.raw_value, 1)
      # sum two fractions
    elsif left_primitive.is_a?(Fraction) && right_primitive.is_a?(Int)
      right_primitive = Fraction.new(right_primitive.raw_value, 1)
      # sum two fractions
    else
      raise "+ expects ints or fractions" 
    end
  end
end

class Evaluator
  def visit_add(node)
    # Evaluate children
    left_primitive = node.left_node.visit(self) 
    right_primitive = node.right_node.visit(self) 

    # Typecheck and perform operation
    if left_primitive.is_a?(Int) && right_primitive.is_a?(Int)
      Int.new(left_primitive.raw_value + right_primitive.raw_value)
    elsif left_primitive.is_a?(Int) && right_primitive.is_a?(Fraction)
      left_primitive = Fraction.new(left_primitive.raw_value, 1)
      # sum two fractions
    elsif left_primitive.is_a?(Fraction) && right_primitive.is_a?(Int)
      right_primitive = Fraction.new(right_primitive.raw_value, 1)
      # sum two fractions
    else
      raise "+ expects ints or fractions" 
    end
  end
end

In C, when are types checked? At compile time. Here, compiling has already happened; we already have a tree. We are checking types as the code is being run, which is called dynamic typing. Checking types as the code is being compiled is called static typing. They are two different approaches. Both are wonderful and terrible. More on that throughout the semester.

We'll need to add such type checks to every operation. There are ways to simplify the code, but we'll leave it verbose.

TODO

Here's your list of things to do before we meet next:

Read the the first chapter of the textbook and complete the front quiz before Tuesday.

Start working on milestone 1. Devote a little time each day, and it will be done. Commit to your GitHub repository after every work session.

Keep learning Ruby through YouTube or a book.

Default to #general in Discord for messages. I am not the star of this show. We are a community. Use a DM if you need to share personal matters. Don't share graded work like screenshots of book questions in #general.

Tell me through a DM if you are partnering with someone on the project.

See you next time.

Sincerely,

P.S. It's time for a haiku!

I learned English first But not till I learned Spanish Did I learn language

Dear Computer

Lecture: Visiting

Visiting

Visitor Pattern

Typechecking

TODO