Lecture: Types

Dear students:

When our programmer ancestors wrote assembly, how did they know if the memory they were looking at held an integer, a float, or a character? The computer itself offered no help. The programmers had to hold that information in their heads, and it was easy to get wrong. So, we started telling the computer what kind of data a memory cell could hold. Then it made sure the instructions we applied to that cell were appropriate. That's the essence of a language's type system.

When we cook, our steps are shaped by a small set of standard measurements: teaspoons, tablespoons, and cups. When we program, our steps are guided by a small set of builtin types: ints, floats, characters, pointers, booleans, and strings. But we can combine these builtin types in sophisticated ways. We can make tuples, arrays, unions, and records.

Usually types and typechecking are good things, but sometimes find ourselves fighting the type system. Today we'll work through some type-related exercises in C++ and Ruby. First we'll look at unions in C++ as we inspect a bitfield using Ncurses. Then we'll investigate Ruby's duck-typing system by finding mispelled method names and making a hash have methods for its keys.

Static Typing

With static typing, types come in early. The compiler knows about them. Variables, functions, and operations are all fixed in what types they can work with, and the compiler can sense when we're violating an interface. That's good for preempting bugs, but also restricting. A humane type system lets us target a family of types instead of individual types.

Unions in C and C++ provide a container that can hold many different kinds of data, but only one at a time. They are used to add generics to collections in a language that doesn't otherwise support generics. But they can also be used for other, stranger purposes. You can put a value of type X in a union but then read it back out as a a value of type Y. The object is not cast. Rather the bits of the value are just interpreted according to the other type.

Binary numbers are confusing, especially binary floats, so I thought maybe we could make a sandbox for inspecting a bitfield. Let's call it Bitwise. We want to see 32 individual bits and the float and integer values they produce, and we want to be able to toggle individual bits. Bitwise operations are only supported on integer types, but we have a float. So, we define a union that gives us a bitwise view of an int or float:

C++

union view_t {
  float f;
  int i;
};;

union view_t {
  float f;
  int i;
};;

With a handy type defined, we declare and initialize a bitfield:

C++

int main() {
  view_t view;
  view.i = 3;
  return 0;
}

int main() {
  view_t view;
  view.i = 3;
  return 0;
}

Now we are ready to dissect the view. We want to be able to get individual bits. We can only get individual bits from the integer view. The general strategy for extracting a bit is to slide it into the ones place and then filter out just that single digit with a bitmask. To get the leftmost bit, which is bit 31, we slide over 31 bits and then mask out the single rightmost bit:

C++

(view.i >> 31) & 1

(view.i >> 31) & 1

In general, we get bit i with an expression like this:

C++

(view.i >> i) & 1

(view.i >> i) & 1

We also want to be able to toggle bits. When the bit to toggle is 0, we flip it to 1. When the bit is 1, we flip it to 0. What operation flips? Well, the negation operator ~ does, but it flips all the bits. To flip one at a time, we use the exclusive-or operator ^. We pit each untoggled bit against 0 and the toggled bit against 1. This opponent is built by left-shifting 1 into the correct place. The 0s will stay the same, and the toggled bit will flip. This statement flips bit i:

C++

view.i = view.i ^ (1 << i);

view.i = view.i ^ (1 << i);

Now we're ready to make this application interactive. We want to be able to toggle bits in the terminal! This example will be a teaser of the Ncurses library that you'll use in milestone 3. The simplest Ncurses program that we can write clears the terminal and waits for a keypress before exiting:

C++

#include <ncurses.h>

int main(int argc, char **argv) {
  initscr();
  clear();
  int command = getch();
  endwin();

  return 0;
}

#include <ncurses.h>

int main(int argc, char **argv) {
  initscr();
  clear();
  int command = getch();
  endwin();

  return 0;
}

For Bitwise, we want to keep getting keypresses until the user says they want to quit. Let's make an event loop and only exit when they hit the Q key:

C++

while (true) {
  int command = getch();
  if (command == 'q') {
    break;
  }
}

while (true) {
  int command = getch();
  if (command == 'q') {
    break;
  }
}

We want to show the bits of the view. To print characters with Ncurses, we move the cursor to the desired row and column with the move function and then echo the character with the addch function. This code prints a question mark at line 5, column 7:

C++

move(5, 7);
addch('?');

move(5, 7);
addch('?');

The line and column numbers are 0-based.

Now we are ready to print all the bits of the float on line 0 and the float itself on line 2 with this function:

C++

void draw(const view_t &view) {
  clear();

  move(0, 0);
  for (int i = 31; i >= 0; --i) {
    int bit = (view.i >> i) & i; 
    addch(bit ? '1' : '0');
  }

  // Float
  move(2, 0);
  printw("%.25f", view.f);

  // Int
  move(3, 0);
  printw("%d", view.i);
}

void draw(const view_t &view) {
  clear();

  move(0, 0);
  for (int i = 31; i >= 0; --i) {
    int bit = (view.i >> i) & i; 
    addch(bit ? '1' : '0');
  }

  // Float
  move(2, 0);
  printw("%.25f", view.f);

  // Int
  move(3, 0);
  printw("%d", view.i);
}

The printw function is Ncurses's version of printf.

We can see how the bits are interpreted. For the user to be able to change the bits, we must respond to key events. This requires some extra initialization of the Ncurses library:

C++

keypad(stdscr, TRUE);               // refine inputs
curs_set(0);                        // disable cursor

keypad(stdscr, TRUE);               // refine inputs
curs_set(0);                        // disable cursor

Let's have a notion of a current bit, whose index we track with this global:

C++

int bit_index = 0;

int bit_index = 0;

The cursor keys will move the selection left or right. The space key flip the current bit. We add these reactions to the event loop:

C++

if (command == 'q') {
  break;
} else if (command == KEY_LEFT) {
  bit_index = (bit_index + 1) % 32;
} else if (command == KEY_RIGHT) {
  bit_index = (bit_index - 1 + 32) % 32;
} else if (command == ' ') {
  view.i = view.i ^ (1 << bit_index);
}

if (command == 'q') {
  break;
} else if (command == KEY_LEFT) {
  bit_index = (bit_index + 1) % 32;
} else if (command == KEY_RIGHT) {
  bit_index = (bit_index - 1 + 32) % 32;
} else if (command == ' ') {
  view.i = view.i ^ (1 << bit_index);
}

Showing the selection would be nice. We can enable reverse video for just that bit with this code:

C++

int bit = (view.i >> bit_index) & 1;
attron(A_REVERSE);
addch(bit ? '1' : '0');
attroff(A_REVERSE);

int bit = (view.i >> bit_index) & 1;
attron(A_REVERSE);
addch(bit ? '1' : '0');
attroff(A_REVERSE);

That's it for the bitfield viewer. We have just relaxed the type system that is meant to help us from hurting people and their data. Unions made this possible. C and C++ are the only languages I know of that have unions that can be interpreted in multiple ways. This can be unsafe, but it can also be useful.

Dynamic Typing

With static typing, types are known early—at compile time. With dynamic typing, they are known late—at runtime. Each has its advantages. Knowing types early means finding errors early and optimizing code early. Knowing types late means functions are less pinned down and we can work with them as the program executes just like regular data.

To make Bitwise work, we had to trick the static type system to ensure we could perform bitwise operations. In a dynamically typed language with duck typing, we don't have to resort to such measures. That's because there is no notion of a type that determines which operations are supported. Instead, we ask a value directly if it supports an operation.

Languages with duck typing often provide a means of asking a program about itself. Such a feature is called reflection. In a language with reflection, we can ask these kinds of questions:

What functions or methods can we call?
What are their parameters?
What variables are in scope?
What exceptions might a function raise?

Method Misnaming

Let's use Ruby's reflective features to catch typos in method names. We can measure the similarity between two strings using Levenshtein distance, which effectively measures how many edits are needed to transform one string into another. The three possible edit operations include adding a character, deleting a character, or replacing a character.

Ruby ships with a builtin distance function in the did_you_mean library:

Ruby

require 'did_you_mean'

puts DidYouMean::Levenshtein::distance("abdominable", "abominable")

require 'did_you_mean'

puts DidYouMean::Levenshtein::distance("abdominable", "abominable")

Now, how do we detect that a bad method called occurred? All objects in Ruby have a special method that will get called if they don't support a method. It's called method_missing:

Ruby

def method_missing(symbol, *args)
  # ...
end

def method_missing(symbol, *args)
  # ...
end

Inside this method, we use reflection to get a list of legal methods and then compare their Levenshtein distance to the name of the unknown method. We collect up the ones that are close and suggest them to the user. This method_missing does the trick:

Ruby

def method_missing(symbol, *args)
  suggestions = methods.filter do |method|
    DidYouMean::Levenshtein::distance(symbol.to_s, method.to_s) < 4
  end

  if suggestions.empty?
    raise "I dunno #{symbol}."
  else
    raise "I dunno #{symbol}. Maybe you meant: #{suggestions.join(", ")}"
  end
end

def method_missing(symbol, *args)
  suggestions = methods.filter do |method|
    DidYouMean::Levenshtein::distance(symbol.to_s, method.to_s) < 4
  end

  if suggestions.empty?
    raise "I dunno #{symbol}."
  else
    raise "I dunno #{symbol}. Maybe you meant: #{suggestions.join(", ")}"
  end
end

Ruby's reflection power is pretty limited because dynamic languages don't have a lot of knowledge about parameters and types. Java's reflection, on the other hand, is quite sophisticated, enabling things like testing frameworks and documentation generators. In lab this week, we'll continue to explore the ideas of dynamic typing and reflection.

GravyHash

After writing a lot of JavaScript, there's one thing about Ruby that I don't love. I wish I could access values with the . operator, like this:

Ruby

h = {
  color: 'red',
  width: 10,
  height: 20,
}

puts h.color   # bad, must use h[:color]

h = {
  color: 'red',
  width: 10,
  height: 20,
}

puts h.color   # bad, must use h[:color]

The Ruby interpreter sees .color as a method call. There is no color method. Thanks to Ruby's very dynamic type system, we can fix this in a universal way. We can make one hash wrapper that dynamically gives the wrapper the methods it needs based on the hash that it wraps. Like this:

Ruby

class GravyHash
  def initialize(hash)
    @hash = hash
    @hash.keys.each do |key|
      define_singleton_method(key) do
        @hash[key]
      end
    end
  end
end

h = GravyHash.new({
  color: 'red',
  width: 10,
  height: 20,
})

class GravyHash
  def initialize(hash)
    @hash = hash
    @hash.keys.each do |key|
      define_singleton_method(key) do
        @hash[key]
      end
    end
  end
end

h = GravyHash.new({
  color: 'red',
  width: 10,
  height: 20,
})

When we say h.color, Ruby doesn't check the type of h. Instead it asks h if it has a method named color. It does thanks to the wrapper constructor.

Static typing is great for keeping a large codebase in good repair. Dynamic typing is great for doing a lot with a little. Programmers are presented with many different challenges within a day and should have both type systems at their disposal.

TODO

Here's your list of things to do before we meet next:

Complete the middle quiz as desired.

Keep working on milestones 1 and 2.

Be writing Ruby. Our first coding exam will be September 30. It will be six Ruby coding questions drawn from the pool of old exams in the Ruby Exams section of the textbook.

See you next time.

Sincerely,

P.S. It's time for a haiku!

Black, woman, and young voter_t's no static type It got amended

← Monkey Patching Lab: Dynamic Typing →

Dear Computer

Lecture: Types

Static Typing

Dynamic Typing

Method Misnaming

GravyHash

TODO