Lecture: Programming Language

Dear students:

Welcome to CS 430: Programming Languages. I have two primary computing interests: computer graphics and programming languages. They don't necessarily go together, but I like them both and I try to find ways to unite them so I can think about both of them.

I'm Chris, and you can call me Chris. Or Dr. Chris. Or Dr. Johnson. I'm still kind of new to this department and this part of the country. My family fled the Midwest because of the cold and other reasons that we can talk about some time. When I say family, I'm referring to my wife and four sons. I'll show you a picture so you know that you aren't the only thing I've got going on in my life.

Course Structure

My aim is for our weekly schedule to be predictable. On Tuesdays, we'll have lecture. I will talk. I will try to get you to talk. We'll write a fair bit of code together. On Thursdays we'll have a lab. Some labs will be on paper, some on a computer.

The textbook for our is course is one that I am working on. I have chosen to write it not because I think I can do it better than others. I have other reasons: so that I myself can better understand the material, so that I can give it to you for free, so that I can distill the book down to the material that we'll actually cover, so that I can integrate exercises directly into the reasons, and so that I can improve the writing and the exercises semester after semester. The book is a work-in-progress, and I welcome your constructive criticism and error reports.

Each week will look something like this: read → front quiz → lecture → middle quiz → lab. Readings will generally be assigned on Thursdays. You will take a front quiz that accompanies the reading, all before Tuesday's lecture. If you like your score, you can keep it. If not, you can take a middle quiz and replace it. You don't have as much time for that one; it's due before Thursday. Then on Thursday you'll work on a lab exercise in teams. These labs will generally be due by Monday noon. I will grade them almost immediately and move on with my life. I will not accept late labs. Please don't contribute to the unsustainability of my profession. Your late work hurts you, me, and the other people I am here to serve. The quizzes aren't timed because I want to eliminate panic where I can.

We'll be discussing three languages throughout the semester: Ruby, Haskell, and Rust. They are chosen for their differences from what what you are used to. For each of them, we will have a programming exam.

I don't technically take attendance because you are adults. However, if you don't attend, our whole mission seems sort of pointless. We're here to talk to each other about programming languages—together. So, while I'm not going to waste my time on attendance itself, I am going to assign in-class exercises each time we meet, both Tuesdays and Thursdays. These will be short and related to the course topics. Your participation will count toward your grade. If you complete the exercise correctly, you'll get four points.

We'll try one of these exercises on this first day. In the first round, you do not talk but think quietly to yourself. Once all have answered, you'll briefly confer with your neighbors. Then you'll answer again.

For programming assignments, we will have a single project with four milestones. You will build a coding challenge application that runs in the terminal. A challenge is presented to the user as a black box. They enter parameters and examine the return value. When they've formed a hypothesis, then reverse engineer the box by writing code in a language that you design. We'll talk more about this next week.

I push due dates right up to the point at which I'm going to start grading. There won't be extensions. I have a job that does not end and easily destroys relationships and physical and mental health. If you find that you don't have enough time to complete something, please just accept the very appropriate consequence of not receiving credit.

The four project milestones will not have exact due dates. Instead I have scheduled six ready dates. These are dates at which you may turn in your work for a milestone. You will get credit if you've met all the requirements of the milestone. If you don't, then you can roll over to the next ready date. This is my scheme for accommodating emergencies, illness, breakups, breakdowns, and other trying times. I do not otherwise grant extensions because I'm a finite being.

You will turn in all your work on GitHub using a repository that I have made for you. I need to get you access to that repository. Also, we will use Discord for all communication. Please do not use email. If you send me an email, I will passive-aggressively wait one day before responding, and then my response will be, “Please ask on Discord.” Email is terrible because I cannot easily revisit the history of our conversation and because the university floods us with noise. Emoji is also easier on Discord.

Let's take a moment right now to make sure I have your GitHub and Discord usernames. Find the link on Canvas and submit these to me. Make accounts if you need to.

Language to Language

In this class, we approach our subject from two directions: we implement a programming language and we study several different mainstream languages. In implementing a language, we'll see how humans make choices about the programming experience. You may develop some empathy for the folks who wrote those error messages. In studying several languages, we'll see how many different choices there are and how our options can't be ranked in a simple manner from right to wrong. Each option has tradeoffs. Making some computation easy, fast, or safe generally makes something other computation hard, slow, or dangerous.

Today we begin work on both of these fronts. We're going to start implementing an interpreter for a fraction arithmetic language so we can compute values like \(\frac{1}{4} + \frac{5}{6}\) and keep the results as fractions. What should we call this language?

The computer doesn't know this language. Our job as language designers is to translate the language it doesn't know into a language it does know. For now, our target language is Ruby, the first language we will discuss. The computer does know Ruby because someone taught it how to translate Ruby into machine code.

What we talk about today is meant to mirror some of what you will need to do for milestone 1 in the project. You should take copious notes. Translating a program is a bit of work, and we'll spend several lectures talking about it.

Abstract Syntax Tree

Language tools first translate a program from an unknown language into a tree in a known language. Specifically an abstract syntax tree (AST). We'll draw a tree for this fraction program:

1/4 + 5/6

1/4 + 5/6

An AST is syntactic in the sense that represents the structure of the original program—before anything has been evaluated. It is abstract in the sense that not every single syntactic detail is represented. For instance, an AST likely doesn't have nodes for whitespace, semi-colons, or parentheses.

A tree is made of nodes. In an object-oriented language, each node is an instance of a class in a hierarchy of program anatomy. From this tree, I see the following node types:

Fraction
Add

These classes form the model of the translated program. We will implement them in Ruby. Here's how we might implement Fraction:

Ruby

class Fraction
  def initialize(top, bottom)
    @top = top
    @bottom = bottom
  end
end

f = Fraction.new(1, 4)

class Fraction
  def initialize(top, bottom)
    @top = top
    @bottom = bottom
  end
end

f = Fraction.new(1, 4)

This program makes a node. Let's try printing it:

Ruby

puts node

puts node

The output is not helpful. If we call inspect on the node, we'll get a debugging form of the object:

Ruby

puts node.inspect

puts node.inspect

In fact, we might want to write a function that printed a debugging view of an object automatically. But this function has already been written. It's called p. The puts statement above is written in this shorthand:

Ruby

p node

p node

Suppose we want to print just the top value. We might write this:

Ruby

puts f.top

puts f.top

The Ruby interpreter gives an error about top being an undefined method. Why a method? Because Ruby doesn't allow direct access to instance variables. It thinks we are trying to call a method named top, but it doesn't exist. We could write it:

Ruby

class Fraction
  # ...

  def top
    @top
  end
end

class Fraction
  # ...

  def top
    @top
  end
end

But writing getters like this is a waste of time and space. Ruby offers a shortcut with attr_reader:

Ruby

class Fraction
  attr_reader :top, :bottom

  # ...
end

class Fraction
  attr_reader :top, :bottom

  # ...
end

The nodes of our tree can be categorized in several different ways. Nodes that aren't built out of other nodes are primitives. Nodes that yield a value consumed by other nodes are expressions. Primitives are expressions, but so are the arithmetic operations. Nodes that don't yield a value but do achieve some effect—like printing or assigning memory—are statements. Thinking about these terms will help us implement the rest of our model classes.

Let's write an Add class. It is not primitive, as it accepts two nodes as parameters:

Ruby

class Add
  def initialize(left, right)
    @left = left
    @right = right
  end
end

class Add
  def initialize(left, right)
    @left = left
    @right = right
  end
end

Now we can represent more sophisticated fraction programs, like this one:

Ruby

f = Fraction.new(1, 4)
g = Fraction.new(5, 6)
sum = Add.new(f, g)

f = Fraction.new(1, 4)
g = Fraction.new(5, 6)
sum = Add.new(f, g)

Okay, so we have a way to represent a program from an unknown language as a tree in a known language. That's great, but I'm feeling disappointed in two ways. First, making the tree by hand is tedious. We'll fix this next week. Second, we can't run the tree. Let's fix that today.

To evaluate a tree, we give every node across the hierarchy an evaluate method. The job of each evaluate is to turn the tree into a primitive node. For primitives, that's easy:

Ruby

class Fraction
  # ...

  def evaluate
    return self
  end
end

class Fraction
  # ...

  def evaluate
    return self
  end
end

Non-primitives are more work. We must evaluate the operand trees down to primitives and perform the node's operation. For adding, we do the math to combine the fractions, which means finding a common denominator. We also reduce the resulting fraction. Adding might look like this:

Ruby

class Add
  # ...

  def evaluate
    # Reduce child trees to primitives.
    leftPrimitive = @leftNode.evaluate
    rightPrimitive = @rightNode.evaluate

    # Find a common denominator and add.
    newBottom = leftPrimitive.bottom * rightPrimitive.bottom
    newTop = leftPrimitive.top * rightPrimitive.bottom + rightPrimitive.top * leftPrimitive.bottom
    gcd = top.gcd(bottom)
    Fraction.new(newTop / gcd, newBottom / gcd)
  end
end

class Add
  # ...

  def evaluate
    # Reduce child trees to primitives.
    leftPrimitive = @leftNode.evaluate
    rightPrimitive = @rightNode.evaluate

    # Find a common denominator and add.
    newBottom = leftPrimitive.bottom * rightPrimitive.bottom
    newTop = leftPrimitive.top * rightPrimitive.bottom + rightPrimitive.top * leftPrimitive.bottom
    gcd = top.gcd(bottom)
    Fraction.new(newTop / gcd, newBottom / gcd)
  end
end

There's our a model. It's small right now. We have only one primitive and one non-primitive. We could add nodes for multiplication, variable assignments, conditionals, and so on. We'll expand it a bit over the next two weeks. The model is what we translate to. We'll build some tools called a lexer and parser to do this translation so we don't have to built trees by hand.

Note on AI

Let's take a quick moment to discuss AI. It has some benefits: it can eliminate drudgery, it can give you ideas, it has entered the lexicon of modern society, and it is disrupting a world that probably needed disrupting. Knowing something about it is like knowing about Minecraft and Beyoncé. You can participate in more conversations. However, it also has costs. Not just the environmental ones, but also the opportunity ones. You are in school to develop yourself, not produce right answers. The activities in this class are designed for you to learn, and it's very easy to lose that opportunity by using AI in the wrong way. I think AI can help you learn if you use it in the following manner:

Disable inline suggestions.
Hamstring it by not giving assignment details. Ask “How do I add getters in Ruby?” Don't say “I need a Fraction class.”
Reject code that you don't understand.

There's a lot of pressure from industry to use AI. That doesn't mean it wants to hire fresh graduates who got through a CS degree by having AI doing their work.

The exams we complete in this course will be completed in-class through the LockDown browser. In the past I allowed folks to use any browser and IDE but not AI. They used AI anyway, and they ruined it for everyone else.

TODO

Here's your list of things to do before we meet next:

Send your GitHub and Discord usernames to me via the Canvas widgets if you didn't do this during class.

Read through the syllabus.

Work through Managing Your Repository and Installing Ruby in Dear Computer.

Read the the first chapter of the textbook and complete the front quiz before Tuesday.

See you next time, when we start looking at how to represent a program written in a language we don't know in a language that we do know.

Sincerely,

P.S. It's time for a haiku!

She's French but I'm not There we were on our big day When she said "Adieu"

← Parsing