Friday, May 16, 2008

What's a type conversion?

What is a type conversion? What's the difference between an implicit conversion, an explicit conversion, a coercion, or a cast? You've no doubt heard these terms thrown around and you probably know they all loosely describe one (or several) methods to enable a programmer to treat one type as another type. But what do they mean, precisely? Even seasoned programmers may be a little confused about this topic, and this may be a root cause of why many programmers don't know the difference between static, dynamic, strong, and weak typing.

Basically, if you think of types in a computer program as mathematical sets, its much easier to understand and remember what all these terms mean. This is not a far stretch if you consider that Type Theory is just one of several versions of Set Theory invented to circumvent Russel's Paradox. Basically, suppose you have two elements a and b, where:

a ∈ A
b ∈ B


This is congruous to saying an object c is of class C, and an object d is of class D. In Java, we would write:

C c = new C();
D d = new D();


Note: The equivalent of "member of" in Java would be the instanceOf operator, as a redditor has pointed out. C c = new C(); is sufficient for c instanceOf C to be true.

Coercions

A function f:A->B is an implicit conversion of an element in A to an element in B. This is also called a coercion, because you are starting with an element in A but you really want an element of type B. You can apply f to obtain that element (call it b). In Java, we have a toString method that converts (or more precisely coerces) an Object to a String:

System.out.println(c.toString());


Casts

On the other hand, an explicit conversion (or cast) is slightly different, but also converts one type to another. Except imagine sets A and C, where A is a subset of C:

a ∈ A
A ⊆ C


Obviously, if a is a member of A, and A is a subset of C, then a must be a member of C, and you may use a whenever a member of set A or set C is expected. In programming, when you treat an object as one of its sub-types or super-types, you must perform an action called a cast, and the object itself never changes - just like how the element a never changes.

In the following Java example, suppose we have a String. Well, obviously that String is also an Object (a superclass of String), so we can perform a cast to convert a String into an Object:

String s = new String("Hello World!");
Object o = (Object) s;


Confusion

Knowing all this, it probably doesn't make sense to say "explicit coercion" or "implicit casting"; however, I do see these terms thrown around a lot. Specifically, "explicit coercion" is usually construed to mean "manual coercion"; eg the developer must manually coerce an object of one type to another. When the compiler does it, people say it is an "implicit coercion" or "implicit conversion"; these terms seem redundant and confusing, so I urge readers to use the terms "automatic coercion" and "manual coercion" instead (unless your intention is to confuse). Some people also draw a distinction between conversion and coercion (defining conversion to mean casting, and coercion to mean ... well, coercion), as if we computer scientists haven't already made enough of a mess by sloppily defining "=" to mean "member of" in Big-Oh notation. Sigh.

Summary

So in summary, we know that implicit conversions aka coercions change one type to another type, explicit conversions aka casting change one type to another sub-type or super-type. This is important because the way languages handle type conversions play a part in determining what kind of type system they are using - be it static, dynamic, strong, or weak. I know I wrote a little about type systems earlier, but I'd like to revisit and expand on the topic in the future, with this article at hand to reference.