Generics in Java: Adventures in Type Erasure

Marcus • May 04, 2020

The ends of five pencils with erasers.

Trying to use Generics in Java for reflection is an odd beast: You’re trying to tame a ghost, and crossing your fingers you’ll get what you expected.

There’s a certain illusion at play here, to be sure. Java 1.5 came around over 15 years ago, with a huge chunk of benefits over previous versions. Generics improved the readability of code plenty, reducing casts, but what are the common pitfalls?

And, more importantly in almost all cases: How do you use Reflection to get the original type?

Generics are a trick

The internals of how the type system was adjusted in Java 1.5 had a very good goal, actually: They didn’t want to break other people’s code, and code that previously relied on the Java 1.4-way of writing containers simply shouldn’t require a change.

Thus, existing types were retrofitted. While C# has two separate types with the non-generic System.Collections.ArrayList and the generic System.Collections.Generic.List<T>, the same cannot be said for Java. There’s only one java.util.List, and whether or not you use it generically depends on your language level.

Even Vector was retrofitted: it gained a generic type parameter, and it also implements List (which it has done since Java 1.2) to improve interoperability between old and new code.

Generics? Not in MY bytecode.

Here’s a short example, using two very simple lists.

class FirstExample {
  static void createLists() {
    List<Number> firstList = new ArrayList<>();
    List<String> secondList = new ArrayList<>();
    List thirdList = new ArrayList();
  }
}

Now here’s what this looks like in bytecode.

static createLists()V
   L0
    LINENUMBER 19 L0
    NEW java/util/ArrayList
    DUP
    INVOKESPECIAL java/util/ArrayList.<init> ()V
    ASTORE 0
   L1
    LINENUMBER 20 L1
    NEW java/util/ArrayList
    DUP
    INVOKESPECIAL java/util/ArrayList.<init> ()V
    ASTORE 1
   L2
    LINENUMBER 21 L2
    NEW java/util/ArrayList
    DUP
    INVOKESPECIAL java/util/ArrayList.<init> ()V
    ASTORE 2
   L3
    LINENUMBER 22 L3
    RETURN
   L4
    LOCALVARIABLE firstList Ljava/util/List; L1 L4 0
    LOCALVARIABLE secondList Ljava/util/List; L2 L4 1
    LOCALVARIABLE thirdList Ljava/util/List; L3 L4 2
    MAXSTACK = 2
    MAXLOCALS = 3

None of this suggests we’re actually aware of what type any of those instances is. I’ve removed a comment or two from the bytecode output which contained the generic signature metadata, but at runtime the element type is unknown.

It’s a mirage, created by the compiler, to make us think we’re working with generics. That very same compiler enforces the rules that make up valid source code — and doing anything illegal with how generics should work in theory is forbidden. Obviously.

Can you retrieve the type of a local variable, easily? Probably not. If you’ve ever wondered why, despite calling a generic method, you have a Class<?> parameter on the method, the simple answer is that you need it to determine the actual type contained in your List<T>, Set<T> or CustomObject<T>. No one else can do it for you.
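Here is a minimal sketch of that Class parameter at work. The names TypeTokenExample and filterByType are made up for illustration; the point is only that the Class<T> argument carries the element type into the runtime, where T itself is gone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TypeTokenExample {
  // T is erased at runtime, so the caller hands over a Class<T>
  // that can be used for the actual instance checks.
  static <T> List<T> filterByType(List<?> source, Class<T> type) {
    List<T> result = new ArrayList<>();
    for (Object item : source) {
      if (type.isInstance(item)) {
        result.add(type.cast(item)); // checked cast, no compiler warning
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Object> mixed = Arrays.asList(1, "two", 3.0, 4);
    // prints [1, 4]
    System.out.println(filterByType(mixed, Integer.class));
  }
}
```

Without the Class<T> argument, the method would have no way of telling an Integer from a String inside the loop.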

Type Erasure

Generic types at runtime are an awkward thing to work with, since they don’t exist to the extent you would sometimes hope. For example, the following comparison is always true:

new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass()

A modern IDE will probably suggest that both are very complicated ways to retrieve ArrayList.class.

Not all is lost, to be sure, but there is some awkwardness. If you can compile your code against class files that you don’t have source code to, you can still figure out what generic fields and parameters were in the code — with the types the original author put in.

That is, in the following example, you can get some type information.

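import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.util.ArrayList;
import java.util.List;

class SecondExample {
  public List<? extends Number> fourthList = new ArrayList<Double>();

  // getField() throws a checked exception if the field does not exist
  public void printTypes() throws NoSuchFieldException {
    Field field = SecondExample.class.getField("fourthList");
    ParameterizedType declaredType =
      (ParameterizedType) field.getGenericType();

    // output: interface java.util.List
    System.out.println(declaredType.getRawType());

    // output: ? extends java.lang.Number
    System.out.println(declaredType.getActualTypeArguments()[0]);
  }
}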

But here, you’re just given what was declared in the class file. You know it’s some kind of Number, not that, in practice, it’s a Double.
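The same goes for method parameters and return types. The following is a small sketch, with a hypothetical class SignatureExample and method lengths, of reading a declared generic signature back through reflection. Note that the lookup itself still uses the erased parameter type List.class.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

class SignatureExample {
  // A made-up method; only its declaration matters here.
  public List<Integer> lengths(List<String> words) {
    List<Integer> result = new ArrayList<>();
    for (String word : words) {
      result.add(word.length());
    }
    return result;
  }

  static String describe() {
    try {
      // Methods are looked up by their erased parameter types.
      Method method = SignatureExample.class.getMethod("lengths", List.class);
      Type parameter = method.getGenericParameterTypes()[0];
      Type result = method.getGenericReturnType();
      return parameter + " -> " + result;
    } catch (NoSuchMethodException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // prints java.util.List<java.lang.String> -> java.util.List<java.lang.Integer>
    System.out.println(describe());
  }
}
```

Again, this is only what the author wrote down in the declaration, not what any particular list contains at runtime.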

And if you try really hard, types don’t matter as much. The JVM doesn’t seem to keep track of the type, and luckily, neither do you have to. If you ignore compiler warnings, you can actually put any kind of number into your ArrayList<Double>.

class ThirdExample {
  public static List<? extends Number> fourthList =
    new ArrayList<Double>();

  @SuppressWarnings("unchecked")
  void add(Integer i) {
    ((List<Integer>) fourthList).add(i);

    // output: class java.lang.Integer 7
    System.out.println(fourthList.get(0).getClass() + " "
      + fourthList.get(0));
  }
}

But hold up a minute. We’re talking about a List<? extends Number> here. That’s not the same as a List<Number>, because the wildcard actually prevents you from putting blatantly wrong types in. If you try the same trick with a String element, suddenly you’re looking at a ClassCastException. Why?
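To see where such a ClassCastException can come from, here is a minimal sketch. It assumes the String is smuggled in through a raw List reference, and the class and method names are made up for illustration. The add itself goes through without complaint; it is the read that blows up, because the compiler has inserted a cast there.

```java
import java.util.ArrayList;
import java.util.List;

class PollutionExample {
  @SuppressWarnings({"unchecked", "rawtypes"})
  static String pollute() {
    List<Double> doubles = new ArrayList<>();
    List raw = doubles;   // raw view of the very same list
    raw.add("seven");     // unchecked warning at compile time, fine at runtime

    try {
      // The compiler inserts a cast to Double here, and that cast fails.
      Double first = doubles.get(0);
      return "read " + first;
    } catch (ClassCastException e) {
      return "ClassCastException on read";
    }
  }

  public static void main(String[] args) {
    // prints ClassCastException on read
    System.out.println(pollute());
  }
}
```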

Wildcard Types

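Turns out the upper bound survives erasure, and the compiler (rather than the JVM on its own) makes use of it. Wherever you read an element out of a List<? extends Number> and use it as a Number, the compiler has quietly inserted a cast to Number. An Integer smuggled into an ArrayList<Double> passes that cast; a String does not. The bound is even more visible when it sits on a type parameter.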

Let’s imagine you are a typical software engineer carrying a backpack of numbers around, just in case you need them: doubles, floats, shorts, signed bytes, anything! A single slot in this backpack can thus be defined as follows:

class ItemSlot<T extends Number> {
  T value;

  T get() {
    return value;
  }

  void set(T newValue) {
    value = newValue;
  }
}

Now that we have the backpack to carry numbers, we can fill each slot by calling the set method, and look at the item it contains by calling get.

But something perhaps unexpected happened with this implementation: Our bytecode refers to Numbers everywhere, not Objects.

class ItemSlot {
  // access flags 0x0
  // signature TT;
  // declaration: value extends T
  Ljava/lang/Number; value

  // access flags 0x0
  // signature ()TT;
  // declaration: T get()
  get()Ljava/lang/Number;
   L0
    LINENUMBER 79 L0
    ALOAD 0
    GETFIELD ItemSlot.value : Ljava/lang/Number;
    ARETURN
   L1
    LOCALVARIABLE this LItemSlot; L0 L1 0
    // signature LItemSlot<TT;>;
    // declaration: this extends ItemSlot<T>
    MAXSTACK = 1
    MAXLOCALS = 1

  // access flags 0x0
  // signature (TT;)V
  // declaration: void set(T)
  set(Ljava/lang/Number;)V
   L0
    LINENUMBER 83 L0
    ALOAD 0
    ALOAD 1
    PUTFIELD ItemSlot.value : Ljava/lang/Number;
   L1
    LINENUMBER 84 L1
    RETURN
   L2
    LOCALVARIABLE this LItemSlot; L0 L2 0
    // signature LItemSlot<TT;>;
    // declaration: this extends ItemSlot<T>
    LOCALVARIABLE newValue Ljava/lang/Number; L0 L2 1
    // signature TT;
    // declaration: newValue extends T
    MAXSTACK = 2
    MAXLOCALS = 2
}

Naturally, all you know at runtime is that it’s a Number, but that’s now enforced. Trying to pass a non-Number where one is expected is now an error the compiler won’t let you make.
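You don’t need a bytecode viewer to check this. As a rough sketch, using two hypothetical nested classes BoundedSlot and UnboundedSlot, reflection reports the erased field type directly:

```java
import java.lang.reflect.Field;

class ErasureBoundExample {
  static class BoundedSlot<T extends Number> { T value; }
  static class UnboundedSlot<T> { T value; }

  // Field.getType() returns the erased type of the "value" field.
  static Class<?> erasedFieldType(Class<?> owner) {
    try {
      Field field = owner.getDeclaredField("value");
      return field.getType();
    } catch (NoSuchFieldException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // prints class java.lang.Number
    System.out.println(erasedFieldType(BoundedSlot.class));
    // prints class java.lang.Object
    System.out.println(erasedFieldType(UnboundedSlot.class));
  }
}
```

The type variable with a bound erases to that bound, the one without a bound erases to Object.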

Building bridges

There’s one last oddity with wildcard types we should talk about. What happens if we want to override a method from a generic interface, but that interface has no wildcard?

Our bag of numbers is just the backpack of a software engineer. If you’ve studied a language, perhaps you would only bring dictionaries to a picnic, so we need a more generic Slot<T>.

interface Slot<T> {
  void set(T newValue);
}

class ItemSlot<T extends Number> implements Slot<T> {
  // value + getter removed for brevity
  public void set(T newValue) {
    value = newValue;
  }
}

As we’ve seen previously, ItemSlot.set() takes a Number as its parameter, but after erasure Slot.set() accepts any Object. That’s where the compiler creates a bridge method for us to span the gap between the two signatures: it is just a type check followed by a call to the real method.

class ItemSlot implements Slot {
  public set(Ljava/lang/Number;)V
    // ... same as above

  // access flags 0x1041
  public synthetic bridge set(Ljava/lang/Object;)V
   L0
    LINENUMBER 75 L0
    ALOAD 0
    ALOAD 1
    CHECKCAST java/lang/Number
    INVOKEVIRTUAL ItemSlot.set (Ljava/lang/Number;)V
    RETURN
   L1
    LOCALVARIABLE this LItemSlot; L0 L1 0
    // signature LItemSlot<TT;>;
    // declaration: this extends ItemSlot<T>
    MAXSTACK = 2
    MAXLOCALS = 2
}
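The bridge is also visible through reflection. The sketch below nests copies of Slot and ItemSlot inside a hypothetical BridgeExample class so that it compiles on its own; countBridges and rawCallRejectsString are made-up helper names.

```java
import java.lang.reflect.Method;

class BridgeExample {
  interface Slot<T> {
    void set(T newValue);
  }

  static class ItemSlot<T extends Number> implements Slot<T> {
    T value;

    @Override
    public void set(T newValue) {
      value = newValue;
    }
  }

  // Counts the compiler-generated bridge methods named "set".
  static int countBridges() {
    int bridges = 0;
    for (Method method : ItemSlot.class.getDeclaredMethods()) {
      if (method.getName().equals("set") && method.isBridge()) {
        bridges++;
      }
    }
    return bridges;
  }

  @SuppressWarnings({"unchecked", "rawtypes"})
  static boolean rawCallRejectsString() {
    Slot raw = new ItemSlot<Integer>();
    try {
      raw.set("not a number"); // lands in the bridge, whose CHECKCAST fails
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // prints 1
    System.out.println(countBridges());
    // prints true
    System.out.println(rawCallRejectsString());
  }
}
```

Calling set through the raw Slot type is the only way to get a String that far, and the CHECKCAST in the bridge is what stops it.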

Key Takeaways
