Overriding Behaviors
A subclass normally gains all the behaviors of its superclass. Consider this List class in Ruby, which defines a constructor, an add method, and a getter and setter for the items instance variable:
class List
  attr_accessor :items
  def initialize
    @items = []
  end
  def add(item)
    @items.push(item)
  end
end
list = List.new
list.add(11)
list.add(19)
puts list.items.to_s # puts [11, 19]
This ReversedList subclass retains the constructor, getter, and setter but overrides the add behavior to do something completely different from the superclass:
class ReversedList < List
  def add(item)
    items.unshift(item)
  end
end
list = ReversedList.new
list.add(11)
list.add(19)
puts list.items.to_s # puts [19, 11]
The push method appends, while the unshift method prepends.
An overriding method can choose to invoke the superclass behavior instead of replacing it completely as ReversedList does. In many languages, such as Java, this is done by invoking the method on the super receiver. In Ruby, super is itself the call: it invokes the superclass method of the same name. This subclass adds each element twice using two super calls:
class DoubledList < List
  def add(item)
    super(item) # runs List#add
    super(item)
  end
end
list = DoubledList.new
list.add(11)
list.add(19)
puts list.items.to_s # puts [11, 11, 19, 19]
Some language communities use the term base class instead of superclass. For example, in C#, we invoke the superclass behavior with base instead of super. C++ uses neither super nor base because it allows multiple superclasses. We must disambiguate which behavior we want by qualifying the call with the appropriate superclass name, as in this code:
class DoubledList : public List, SomeOtherSuperclass {
public:
  void add(int item) {
    List::add(item);
    List::add(item);
  }
};
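The fragment above leaves List undefined. A minimal self-contained sketch follows, assuming a hypothetical C++ List that stores its items in a std::vector<int>. The items accessor and the doubledExample helper are illustrative names, not part of the original. The add method is left non-virtual to mirror the fragment; the Virtual section explains what that choice changes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical C++ counterpart of the Ruby List class.
class List {
public:
  void add(int item) { items_.push_back(item); }
  const std::vector<int>& items() const { return items_; }

private:
  std::vector<int> items_;
};

class DoubledList : public List {
public:
  void add(int item) {
    List::add(item);  // the qualified name selects the superclass version
    List::add(item);
  }
};

// Builds a DoubledList, adds two numbers, and returns a copy of its contents.
std::vector<int> doubledExample() {
  DoubledList list;
  list.add(11);
  list.add(19);
  return list.items();
}
```

Calling doubledExample() yields the same four elements as the Ruby version: 11, 11, 19, 19.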
Sealing
Suppose we are writing an application that shows styled text. We want an abstraction that bundles together the text and its color, so we write this ColorfulString class:
public class ColorfulString extends String {
  private Color color;
  public ColorfulString(String text, Color color) {
    super(text);
    this.color = color;
  }
  // ...
}
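But our IDE and compiler reject this code. That's because the String class has been sealed in Java, prohibiting it from having any subclasses. A class is sealed in Java by adding the final modifier to the class header. (The sealed modifier introduced in Java 17 is a different feature: it limits subclassing to an explicitly listed set of classes rather than forbidding it.) Here is the header of String: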
public final class String ... {
  // ...
}
There are at least two reasons to seal a class. The first reason is demonstrated by Java's Math class, which contains only static methods. Since it's not used to make objects, there is neither state nor behaviors to inherit. Subclasses would gain nothing from this inheritance, so Math is sealed.
The second reason is that if we write a method and declare a formal parameter of a certain type, we can't be certain what type will be passed in as an actual parameter. The value could be the declared type, or it could be any of its subclasses. If we must be certain of the actual parameter's type, then we seal the type. Unsealed types permit subtype polymorphism, which is often good, but not always.
The designers of Java chose to block polymorphic behavior for String. Subclasses, if they were possible, might not make the same guarantees as String. Recall that String is immutable. The hash code of an immutable String never changes, so it may be computed once and cached. This leads to very fast lookups in hashed collections. If we could create a subclass of String, we could add mutable state. Any time its state changed, the hash code would need to be recomputed, thereby degrading performance.
Sealing a class is an extreme action, a bit like sterilizing a pet or a human, in that we are forbidding entire lines of descendants from ever existing. But we don't have to seal an entire class. We may instead seal individual methods, which would allow subclasses but prevent them from overriding the sealed methods.
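C++ happens to use the same final keyword for both kinds of sealing, so the distinction can be sketched in one place. The Shape, Circle, and Token classes below are hypothetical, chosen only for illustration. The lines that the compiler would reject are left as comments.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

class Shape {
public:
  virtual ~Shape() = default;
  virtual std::string name() const { return "shape"; }
  // Sealed method: subclasses inherit describe but may not override it.
  virtual std::string describe() const final { return "a " + name(); }
};

class Circle : public Shape {
public:
  std::string name() const override { return "circle"; }
  // std::string describe() const override;  // error: describe is final
};

// Sealed class: no subclasses at all.
class Token final {
public:
  explicit Token(std::string text) : text_(std::move(text)) {}
  const std::string& text() const { return text_; }

private:
  std::string text_;
};
// class ColorfulToken : public Token {};  // error: Token is final

static_assert(std::is_final<Token>::value, "Token is sealed");
static_assert(!std::is_final<Shape>::value, "Shape still permits subclasses");
```

Circle can still customize name, and the sealed describe method picks up that customization, so sealing one method does not freeze the whole class.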
Virtual
The word virtual is used to describe an illusion that approaches or simulates reality. A virtual machine is like a real CPU, but it's implemented in software. A virtual tour is like an in-person tour, but we are sitting at home. A virtual method is a real method, but when we call it, we might be calling not the supertype's implementation but an overriding implementation defined by a subclass.
When we declare a method as virtual in C++, we allow an overriding method in the subclass to be called from polymorphic code that knows only the supertype. Consider this inheritance hierarchy:
#include <iostream>
using namespace std;
class Message {
public:
  /* virtual */ void deliver() {
    cout << "breathe" << endl;
  }
};
class LoudMessage : public Message {
public:
  void deliver() {
    cout << "BREATHE" << endl;
  }
};
int main() {
  LoudMessage loudMessage;
  Message& message = loudMessage;
  loudMessage.deliver(); // prints "BREATHE"
  message.deliver(); // prints "breathe"
  return 0;
}
Both the superclass and subclass define the deliver method. The main function creates a single object, and its type is LoudMessage. Calling deliver on the loudMessage receiver calls LoudMessage::deliver, as we'd expect. However, the message reference behaves differently, even though it refers to the exact same object. It has the type Message&. Since the deliver method is not virtual, LoudMessage::deliver merely hides Message::deliver rather than overriding it, and the second call is to Message::deliver. If a method is not virtual, the compiler binds the call to the method associated with the receiver's declared type, not the underlying type of its actual value. If we uncomment the virtual modifier on Message::deliver, LoudMessage::deliver becomes a true override, which is allowed because Message does not seal the method with the final modifier, and both calls will invoke LoudMessage::deliver.
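The two kinds of dispatch can be placed side by side in one hierarchy. This sketch is a variation on the Message example, not the original code: the text and label methods and the viaBaseReference helper are hypothetical, and the methods return strings instead of printing so that the results are easy to inspect.

```cpp
#include <cassert>
#include <string>

class Message {
public:
  virtual ~Message() = default;
  // Virtual: the call is resolved by the object's actual type at run time.
  virtual std::string text() const { return "breathe"; }
  // Non-virtual: the call is resolved by the declared type at compile time.
  std::string label() const { return "Message"; }
};

class LoudMessage : public Message {
public:
  std::string text() const override { return "BREATHE"; }  // overrides
  std::string label() const { return "LoudMessage"; }      // only hides
};

// The parameter's declared type is Message, whatever object is passed in.
std::string viaBaseReference(const Message& message) {
  return message.label() + ": " + message.text();
}
```

Passing a LoudMessage to viaBaseReference produces "Message: BREATHE": the non-virtual label call follows the declared type, while the virtual text call follows the actual object.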
You might be wondering why we would ever want a non-virtual method. This is a fair question. The Java designers wondered the same thing. They didn't find a compelling answer, so they made instance methods virtual by default. Only methods that cannot be overridden anyway, such as private and final ones, escape dynamic dispatch. C++ requires us to opt in to polymorphic behavior because virtual methods have a slight performance cost and introduce some metadata into an object's memory that may cause incompatibilities when objects are shared with libraries written in languages like C. C++ objects and C structs can have the same memory layout, but only if the objects have no virtual behaviors.
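The layout claim can be checked with standard type traits. In this sketch, PlainPoint and VirtualPoint are hypothetical types that differ only in whether they have virtual members. The sizes are implementation-defined, so the code records them without asserting a particular difference; on typical compilers the virtual version is larger because each object carries a hidden pointer to its class's table of virtual functions.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// No virtual members: laid out like the equivalent C struct.
struct PlainPoint {
  int x;
  int y;
};

// Same fields plus virtual behavior.
struct VirtualPoint {
  int x;
  int y;
  virtual ~VirtualPoint() = default;
  virtual int sum() const { return x + y; }
};

// Standard-layout types are the ones C++ guarantees to be C-compatible.
static_assert(std::is_standard_layout<PlainPoint>::value,
              "PlainPoint can be shared with C code");
static_assert(!std::is_standard_layout<VirtualPoint>::value,
              "virtual members break C-compatible layout");
static_assert(std::is_polymorphic<VirtualPoint>::value,
              "VirtualPoint carries run-time type metadata");

// Sizes in bytes; the exact values depend on the compiler and platform.
constexpr std::size_t plainSize = sizeof(PlainPoint);
constexpr std::size_t virtualSize = sizeof(VirtualPoint);
```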
Abstract
When we factor out a common interface to a superclass, we may find that we want all subclasses to have a certain behavior, yet they have no common implementation. This is the case with the evaluate method in the BinaryOperator hierarchy in Ruby. Each subclass defines its own version of the evaluate method, but there is no mention of it in the superclass. The expression e.evaluate will work no matter what subclass e refers to because Ruby has dynamic typing.
A compiler for a language with static typing, on the other hand, will need to typecheck e.evaluate. It will look at the type of e and ensure that it has an evaluate method. The type of e is the supertype over all expressions: Expression. For typechecking to succeed, Expression must declare this method, even if it has no definition. A method imposed by a supertype but not defined by it is abstract. An abstract method is declared in Java using the abstract modifier:
public abstract class Expression {
  public abstract int evaluate();
}
Note that evaluate has a header but no body. The class itself is also marked abstract, which outlaws any instances of it from being constructed. There are many kinds of Expression, like Add and Number, but there's no entity that is just an Expression and nothing more. It would be an error to instantiate one.
Any class with an abstract method is necessarily abstract. A class without any abstract methods may still be marked abstract if it is too general to be instantiated. Java's MouseAdapter class, for example, is abstract but has no abstract methods. Each of its listener methods has an empty body that subclasses override as needed.
A Java class that is entirely abstract and has no state is a good candidate for being turned into an interface, which has simpler syntax:
public interface Expression {
  int evaluate();
}
Behaviors in an interface are public by default. As of Java 8, interfaces are less abstract than they were in the earlier versions of Java in that they may contain default method definitions.
An abstract method in C++ is called a pure virtual function. It must be marked virtual to defer any calls to the subclass implementation and assigned 0:
class Expression {
public:
  virtual int evaluate() = 0;
};
Assigning an abstract method 0 is a strange syntactic move. Stroustrup chalked this non-intuitive syntax up to expediency:
The curious =0 syntax was chosen over the obvious alternative of introducing pure or abstract because at the time I saw no chance of getting a new keyword accepted. Had I suggested pure, Release 2.0 would have shipped without abstract classes. Given the choice between a nicer syntax and abstract classes, I chose abstract classes. Rather than risking delay and incurring certain fights over pure, I used the traditional C and C++ notation of using 0 to represent “not there”.
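To see a pure virtual function at work, here is a minimal sketch of an Expression hierarchy in C++. The Number and Add subclasses are named in the text, but their members, the use of std::unique_ptr, and the evaluateSample helper are assumptions made for illustration. The evaluate method is also declared const here, which the original declaration is not.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

class Expression {
public:
  virtual ~Expression() = default;
  virtual int evaluate() const = 0;  // pure virtual: declared but not defined
};

class Number : public Expression {
public:
  explicit Number(int value) : value_(value) {}
  int evaluate() const override { return value_; }

private:
  int value_;
};

class Add : public Expression {
public:
  Add(std::unique_ptr<Expression> left, std::unique_ptr<Expression> right)
      : left_(std::move(left)), right_(std::move(right)) {}
  int evaluate() const override {
    return left_->evaluate() + right_->evaluate();
  }

private:
  std::unique_ptr<Expression> left_;
  std::unique_ptr<Expression> right_;
};

// Expression e;  // error: cannot instantiate an abstract class
static_assert(std::is_abstract<Expression>::value, "Expression is abstract");

// Builds the tree for 2 + 3 and evaluates it through the supertype.
int evaluateSample() {
  std::unique_ptr<Expression> e = std::make_unique<Add>(
      std::make_unique<Number>(2), std::make_unique<Number>(3));
  return e->evaluate();
}
```

The call e->evaluate() typechecks because Expression declares the method, and it runs Add::evaluate because the method is virtual.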
Supercasting
Consider again this superclass and subclass with their now virtual deliver methods:
#include <iostream>
using namespace std;
class Message {
public:
  virtual void deliver() {
    cout << "breathe" << endl;
  }
};
class LoudMessage : public Message {
public:
  void deliver() {
    cout << "BREATHE" << endl;
  }
};
Thanks to subtype polymorphism, we can assign an instance of a subclass to a variable of its superclass type. We don't even need to cast it. This main function assigns an instance of LoudMessage to three different superclass types:
int main() {
  LoudMessage loudMessage;
  Message* message1 = &loudMessage;
  Message& message2 = loudMessage;
  Message message3 = loudMessage;
  loudMessage.deliver();
  message1->deliver();
  message2.deliver();
  message3.deliver();
  return 0;
}
Predict the output of this function. Then find yourself a C++ compiler and compile and run it.
The first two assignments are to a pointer and a reference, respectively, but message3 is a plain old variable of type Message. When we assign a subclass to a plain old superclass variable, we have object slicing. Any extra state introduced by the subclass gets sliced off, and the object degenerates to an instance of the supertype. If we are slicing on purpose, this is not a problem. But slicing is often accidentally introduced when we pass a subclass to a function that expects a superclass, as in this C++ program:
// TreasureChest, LockableTreasureChest, and inventory are defined elsewhere.
void take(TreasureChest chest) {
  string item = chest.open();
  inventory.push(item);
}
int main() {
  LockableTreasureChest chest("Powdered Hens' Teeth");
  take(chest);
  return 0;
}
The LockableTreasureChest is sliced into a TreasureChest instance when it's passed to the take function. The function opens it, even though it should have been locked. Generally we want to avoid slicing. Non-primitive parameters in C++ should almost always be references or pointers, which won't be sliced and will be cheaper to pass.
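The treasure chest program omits its class definitions. The following sketch fills them in under stated assumptions: a chest holds one item as a string, a locked chest yields an empty string when opened, and the locked_ flag, takeByValue, and takeByReference are hypothetical names. It contrasts passing by value with passing by reference.

```cpp
#include <cassert>
#include <string>
#include <utility>

class TreasureChest {
public:
  explicit TreasureChest(std::string item) : item_(std::move(item)) {}
  virtual ~TreasureChest() = default;
  virtual std::string open() const { return item_; }

protected:
  std::string item_;
};

class LockableTreasureChest : public TreasureChest {
public:
  using TreasureChest::TreasureChest;  // reuse the superclass constructor
  // A locked chest gives up nothing.
  std::string open() const override { return locked_ ? "" : item_; }

private:
  bool locked_ = true;
};

// By value: the argument is copied into a plain TreasureChest, so the lock
// and the overriding open method are sliced away.
std::string takeByValue(TreasureChest chest) { return chest.open(); }

// By reference: no copy is made, so the overriding open method runs.
std::string takeByReference(const TreasureChest& chest) { return chest.open(); }
```

Given a LockableTreasureChest, takeByValue returns the item while takeByReference returns the empty string, which is the behavior the lock was meant to enforce.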