Overriding Behaviors

Dear Computer

Chapter 5: Objects

A subclass normally gains all the behaviors of its superclass. Consider this List class in Ruby, which defines a constructor, an add method, and a getter and setter for the items instance variable:

Ruby
class List
  attr_accessor :items

  def initialize
    @items = []
  end

  def add(item)
    @items.push(item)
  end
end

list = List.new
list.add(11)
list.add(19)
puts list.items.to_s     # puts [11, 19]

This ReversedList subclass retains the constructor, getter, and setter but overrides the add behavior to do something completely different from the superclass:

Ruby
class ReversedList < List
  def add(item)
    items.unshift(item)
  end
end

list = ReversedList.new
list.add(11)
list.add(19)
puts list.items.to_s     # puts [19, 11]

The push method appends, while the unshift method prepends.

An overriding method can choose to invoke the superclass behavior instead of replacing it completely as ReversedList does. In many languages, this is done by invoking the method on the super receiver. In Ruby, super is a keyword that calls the superclass method of the same name, so no method name follows it. This subclass adds each element twice using super calls:

Ruby
class DoubledList < List
  def add(item)
    super(item)
    super(item)
  end
end

list = DoubledList.new
list.add(11)
list.add(19)
puts list.items.to_s     # puts [11, 11, 19, 19]

Some language communities use the term base class instead of superclass. For example, in C#, we invoke the superclass behavior with base instead of super. C++ uses neither super nor base because it allows multiple superclasses. We must disambiguate which behavior we want by qualifying the call with the appropriate superclass name, as in this code:

C++
class DoubledList : public List, SomeOtherSuperclass {
  public:
    void add(int item) {
      List::add(item);
      List::add(item);
    }
};
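
The fragment above leaves its superclasses undefined. As a minimal sketch that compiles on its own, we can assume a List that stores its items in a vector and a second superclass, here given the made-up name Tally, that also defines add. Both classes are hypothetical stand-ins, not definitions taken from the text:

```cpp
#include <vector>

// Hypothetical stand-in for the List superclass.
class List {
  public:
    std::vector<int> items;

    void add(int item) {
      items.push_back(item);
    }
};

// Hypothetical second superclass that also has an add method.
class Tally {
  public:
    int total = 0;

    void add(int amount) {
      total += amount;
    }
};

class DoubledList : public List, public Tally {
  public:
    void add(int item) {
      // Each call names the superclass whose behavior we want.
      List::add(item);
      List::add(item);
      Tally::add(item);
    }
};
```

An unqualified call to add inside DoubledList::add would refer to DoubledList::add itself, so the qualifiers are what select each superclass behavior.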

Sealing

Suppose we are writing an application that shows styled text. We want an abstraction that bundles together the text and its color, so we write this ColorfulString class:

Java
public class ColorfulString extends String {
  private Color color;

  public ColorfulString(String text, Color color) {
    super(text);
    this.color = color;    
  }

  // ...
}

But our IDE and compiler reject this code. That's because the String class has been sealed in Java, prohibiting it from having any subclasses. A class is sealed in Java by adding the final modifier to the class header:

Java
public final class String ... {
  // ...
}

There are at least two reasons to seal a class. The first reason is demonstrated by Java's Math class, which contains only static methods. Since it's not used to make objects, there is neither state nor behavior to inherit. Subclasses would gain nothing from this inheritance, so Math is sealed.

The second reason is that if we write a method and declare a formal parameter of a certain type, we can't be certain what type will be passed in as an actual parameter. The value could be the declared type, or it could be any of its subclasses. If we must be certain of the actual parameter's type, then we seal the type. Unsealed types permit subtype polymorphism, which is often good, but not always.

The designers of Java chose to block polymorphic behavior for String. Subclasses, if they were possible, might not make the same guarantees as String. Recall that String is immutable. The hash code of an immutable String never changes, so it may be computed once and cached. This leads to very fast lookups in hashed collections. If we could create a subclass of String, we could add mutable state. Any time its state changed, the hash code would need to be recomputed, thereby degrading performance.
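
The caching argument can be sketched in C++. The ImmutableText class, its hashCode method, and the computation counter are hypothetical names invented for this illustration; this is not how Java's String is actually implemented:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical immutable text class. It is marked final, so no subclass
// can add mutable state that would invalidate the cached hash.
class ImmutableText final {
  public:
    explicit ImmutableText(std::string text) : text(std::move(text)) {}

    std::size_t hashCode() const {
      if (!cached) {
        hash = std::hash<std::string>{}(text);
        cached = true;
        ++computations;
      }
      return hash;
    }

    // For demonstration only: how many times the hash was computed.
    int timesComputed() const { return computations; }

  private:
    const std::string text;         // never changes after construction
    mutable std::size_t hash = 0;   // cache, filled on first request
    mutable bool cached = false;
    mutable int computations = 0;
};
```

However many times hashCode is called on one object, the hash is computed only on the first call. That shortcut is safe only because the text cannot change.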

Sealing a class is an extreme action, a bit like sterilizing a pet or a human, in that we are forbidding entire lines of descendants from ever existing. But we don't have to seal an entire class. We may instead seal individual methods, which would allow subclasses but prevent them from overriding the sealed methods.
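
The difference between sealing a class and sealing a method can be sketched in C++, which since C++11 uses the final specifier for both. The Account classes are hypothetical. C++ only allows final on virtual methods, so this sketch assumes the methods are virtual:

```cpp
#include <string>

// Hypothetical hierarchy: the class is open, but one method is sealed.
class Account {
  public:
    virtual ~Account() = default;

    // Sealed: subclasses may not override this method.
    virtual std::string id() const final { return "acct-001"; }

    // Open: subclasses may override this method.
    virtual std::string label() const { return "basic"; }
};

// Sealed class: it may extend Account, but nothing may extend it.
class PremiumAccount final : public Account {
  public:
    std::string label() const override { return "premium"; }

    // Uncommenting the next line is a compile error because id is final.
    // std::string id() const override { return "other"; }
};

// Uncommenting the next line is a compile error because PremiumAccount is final.
// class EliteAccount : public PremiumAccount {};
```

Code that holds an Account reference can rely on id behaving as Account defined it, while label remains free to vary by subclass.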

Virtual

The word virtual is used to describe an illusion that approaches or simulates reality. A virtual machine is like a real CPU, but it's implemented in software. A virtual tour is like an in-person tour, but we are sitting at home. A virtual method is a real method, but when we call it, we might be calling not the supertype's implementation but an overriding implementation defined by a subclass.

When we declare a method as virtual in C++, we allow an overriding method in the subclass to be called from polymorphic code that knows only the supertype. Consider this inheritance hierarchy:

C++
#include <iostream>

using namespace std;

class Message {
  public:
    /* virtual */ void deliver() {
      cout << "breathe" << endl;
    }
};

class LoudMessage : public Message {
  public:
    void deliver() {
      cout << "BREATHE" << endl;
    }
};

int main() {
  LoudMessage loudMessage;
  Message& message = loudMessage;

  loudMessage.deliver();          // prints "BREATHE"
  message.deliver();              // prints "breathe"

  return 0;
}

Both the superclass and subclass define the deliver method. This is always possible for a non-virtual method, which the subclass version hides rather than overrides. Once the method is virtual, it is possible only if Message does not seal it with the final modifier. The main function creates a single object, and its type is LoudMessage. Calling deliver on the loudMessage receiver calls LoudMessage::deliver, as we'd expect. However, the message reference behaves differently, even though it refers to the exact same object. It has the type Message&. Since the deliver method is not virtual, the second call is to Message::deliver. If a method is not virtual, the compiler binds the call to the method associated with the receiver's declared type, not the underlying type of its actual value. If we uncomment the virtual modifier on Message::deliver, both calls will invoke LoudMessage::deliver.
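
One way to check this reasoning is to place a non-virtual and a virtual method side by side in the same hierarchy and have them return strings instead of printing. The method names plainDeliver and virtualDeliver are made up for this sketch, and the override specifier is an optional C++11 check that the method really overrides something:

```cpp
#include <string>

class Message {
  public:
    virtual ~Message() = default;

    // Non-virtual: the call is bound using the receiver's declared type.
    std::string plainDeliver() const { return "breathe"; }

    // Virtual: the call is bound using the object's actual type.
    virtual std::string virtualDeliver() const { return "breathe"; }
};

class LoudMessage : public Message {
  public:
    // Hides Message::plainDeliver.
    std::string plainDeliver() const { return "BREATHE"; }

    // Overrides Message::virtualDeliver.
    std::string virtualDeliver() const override { return "BREATHE"; }
};
```

Given a LoudMessage object referenced through a Message&, plainDeliver returns the lowercase string and virtualDeliver returns the uppercase one.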

You might be wondering why we would ever want a non-virtual method. This is a fair question. The Java designers wondered the same thing. They didn't find a compelling answer, so they made all instance methods virtual, no matter what. C++ requires us to opt in to polymorphic behavior because virtual methods have a slight performance cost and introduce some metadata into an object's memory that may cause incompatibilities when objects are shared with libraries written in languages like C. C++ objects and C structs can have the same memory layout, but only if the objects have no virtual behaviors.
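
The layout claim can be probed with the standard type trait std::is_standard_layout, which holds only for types whose layout is meant to be compatible with C. The two point types below are hypothetical and differ only in whether sum is virtual:

```cpp
#include <type_traits>

// Hypothetical point type with no virtual methods.
struct PlainPoint {
  int x;
  int y;
  int sum() const { return x + y; }
};

// The same type, except that sum is virtual.
struct VirtualPoint {
  int x;
  int y;
  virtual int sum() const { return x + y; }
};

static_assert(std::is_standard_layout<PlainPoint>::value,
              "no virtual methods, so the layout is compatible with C");
static_assert(!std::is_standard_layout<VirtualPoint>::value,
              "a virtual method rules out a standard layout");
```

On typical implementations, sizeof(VirtualPoint) is also larger than sizeof(PlainPoint) because each object carries a hidden pointer to its class's table of virtual methods. The exact sizes are implementation-defined.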

Abstract

When we factor out a common interface to a superclass, we may find that we want all subclasses to have a certain behavior, yet they have no common implementation. This is the case with the evaluate method in the BinaryOperator hierarchy in Ruby. Each subclass defines its own version of the evaluate method, but there is no mention of it in the superclass. The expression e.evaluate will work no matter what subclass e refers to because Ruby has dynamic typing.

A compiler for a language with static typing, on the other hand, will need to typecheck e.evaluate. It will look at the type of e and ensure that it has an evaluate method. The type of e is the supertype over all expressions: Expression. For typechecking to succeed, Expression must declare this method, even if it has no definition. A method imposed by a supertype but not defined by it is abstract. An abstract method is defined in Java using the abstract modifier:

Java
public abstract class Expression {
  public abstract int evaluate();
}

Note that evaluate has a header but no body. The class itself is also marked abstract, which outlaws any instances of it from being constructed. There are many kinds of Expression, like Add and Number, but there's no entity that is just an Expression and nothing more. It would be an error to instantiate one.

Any class with an abstract method is necessarily abstract. A class without any abstract methods may still be marked abstract if it is too general to be instantiated. Java's MouseAdapter class, for example, is abstract but has no abstract methods.

A Java class that is entirely abstract and has no state is a good candidate for being turned into an interface, which has simpler syntax:

Java
public interface Expression {
  int evaluate();
}

Abstract behaviors in an interface are implicitly public. As of Java 8, interfaces are less abstract than they were in the earlier versions of Java in that they may contain default method definitions.

An abstract method in C++ is called a pure virtual function. It must be marked virtual to defer any calls to the subclass implementation and assigned 0:

C++
class Expression {
  public:
    virtual int evaluate() = 0;
};

Assigning an abstract method 0 is a strange syntactic move. Stroustrup chalked this non-intuitive syntax up to expediency:

The curious =0 syntax was chosen over the obvious alternative of introducing pure or abstract because at the time I saw no chance of getting a new keyword accepted. Had I suggested pure, Release 2.0 would have shipped without abstract classes. Given the choice between a nicer syntax and abstract classes, I chose abstract classes. Rather than risking delay and incurring certain fights over pure, I used the traditional C and C++ notation of using 0 to represent “not there”.
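
To see a pure virtual function at work, here is a small sketch of the Expression hierarchy in C++. Number and Add are named in the text, but their fields and constructors here are assumptions made for illustration:

```cpp
#include <memory>
#include <utility>

class Expression {
  public:
    virtual ~Expression() = default;
    virtual int evaluate() = 0;   // pure virtual: declared but not defined
};

// Hypothetical leaf expression that wraps a single integer.
class Number : public Expression {
  public:
    explicit Number(int value) : value(value) {}

    int evaluate() override { return value; }

  private:
    int value;
};

// Hypothetical binary expression that sums its two operands.
class Add : public Expression {
  public:
    Add(std::unique_ptr<Expression> left, std::unique_ptr<Expression> right)
      : left(std::move(left)), right(std::move(right)) {}

    int evaluate() override {
      return left->evaluate() + right->evaluate();
    }

  private:
    std::unique_ptr<Expression> left;
    std::unique_ptr<Expression> right;
};

// Uncommenting the next line is a compile error because Expression is abstract.
// Expression e;
```

Add typechecks even though it knows its operands only as Expression, because the supertype promises that every concrete subclass defines evaluate.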

Supercasting

Consider again this superclass and subclass with their now virtual deliver methods:

C++
#include <iostream>

using namespace std;

class Message {
  public:
    virtual void deliver() {
      cout << "breathe" << endl;
    }
};

class LoudMessage : public Message {
  public:
    void deliver() {
      cout << "BREATHE" << endl;
    }
};

Thanks to subtype polymorphism, we can assign an instance of a subclass to a variable of a superclass type. We don't even need to cast it. This main function assigns an instance of LoudMessage to three different kinds of superclass variables:

C++
int main() {
  LoudMessage loudMessage;
  Message* message1 = &loudMessage;
  Message& message2 = loudMessage;
  Message message3 = loudMessage;

  loudMessage.deliver();
  message1->deliver();
  message2.deliver();
  message3.deliver();

  return 0;
}

Predict the output of this function. Then find yourself a C++ compiler and compile and run it.

The first two assignments are to a pointer and a reference, respectively, but message3 is a plain old variable of type Message. When we assign a subclass instance to a plain old superclass variable, we have object slicing. Any extra state introduced by the subclass gets sliced off, and the object degenerates to an instance of the supertype. If we are slicing on purpose, this is not a problem. But slicing is often accidentally introduced when we pass a subclass instance to a function that expects a superclass instance, as in this C++ program:

C++
void take(TreasureChest chest) {
  string item = chest.open();
  inventory.push(item);
}

int main() {
  LockableTreasureChest chest("Powdered Hens' Teeth");
  take(chest);
  return 0;
}

The LockableTreasureChest is sliced into a TreasureChest instance when it's passed to the take function. The function opens it, even though it should have been locked. Generally we want to avoid slicing. Non-primitive parameters in C++ should almost always be references or pointers, which won't be sliced and will be cheaper to pass.
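
The fragment above does not define the chest classes. A minimal sketch that fills them in, under the assumption that a locked chest yields an empty string when opened, makes the difference between the two parameter styles visible. The class bodies and the functions takeByValue and takeByReference are hypothetical:

```cpp
#include <string>

// Hypothetical superclass: opening the chest yields its item.
class TreasureChest {
  public:
    explicit TreasureChest(const std::string& item) : item(item) {}
    virtual ~TreasureChest() = default;

    virtual std::string open() const { return item; }

  protected:
    std::string item;
};

// Hypothetical subclass: a locked chest yields nothing.
class LockableTreasureChest : public TreasureChest {
  public:
    explicit LockableTreasureChest(const std::string& item)
      : TreasureChest(item) {}

    std::string open() const override {
      return isLocked ? "" : item;
    }

  private:
    bool isLocked = true;
};

// Pass by value: the argument is copied into a plain TreasureChest,
// so the lock and the overriding open are sliced off.
std::string takeByValue(TreasureChest chest) {
  return chest.open();
}

// Pass by reference: nothing is copied or sliced, and open is
// dispatched to the subclass.
std::string takeByReference(const TreasureChest& chest) {
  return chest.open();
}
```

Passing the same locked chest to both functions gives different results: takeByValue returns the item, while takeByReference returns the empty string because the lock is still in effect.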
