In 2001, Paul Graham wrote that his success with Viaweb (now Yahoo Store) was due to a “secret weapon” that enabled the rapid development and deployment of code: Lisp. Designed with functional programming in mind, the language offers abstractions that make code dense and therefore quick to write and maintain. Although Lisp was impractical for desktop applications at the time, Graham’s use of it for web-based software allowed a small team to kick ass.
Viaweb launched in 1995. It’s now 2010, and the Lisp family of languages is alive and well, thanks to Clojure, a Lisp that runs on the Java Virtual Machine. Ideas from Lisp and functional programming have also filtered into languages like Ruby and Python, which are far more powerful than C++, the lingua franca of the ’90s. Functional programming is, of course, still as powerful and excellent as it was in 1995, but languages as a whole have improved, making the use of a powerful language like Clojure less of a relative advantage than Lisp was for Graham. Using a functional language is still a good business decision, especially for a startup, but it wouldn’t qualify as a “secret weapon” in an era when every developer worth his salt has heard of Ruby on Rails, which may not be a “functional” language, but comes close enough for many purposes. So what is the next secret weapon? No one knows, but I’ve got a good guess: strong static typing.
Plenty of programmers love to hate static typing, and not without reason: most of them have only been exposed to shitty static typing, such as Java’s. If one compares the static typing of Java to the dynamic typing of Python or Ruby, most programmers will prefer to program in the dynamic languages. It’s hard to blame them, since Java, as it is often used, requires an explicit type for every variable. This becomes painful rapidly, because complex programs use “inner” variables and functions within API-level functions all the time, and having to explicitly declare a type for each slows development substantially. This leads us to:
Misconception #1: Java is a representative of “static typing”.
If we want to compare static and dynamic typing, we need to make best-in-class comparisons: what is the best that can be done in each paradigm? Comparing Ruby or Lisp to Java or C++ is not fair; a better comparison would use Haskell or OCaml to represent static typing, both of which have implicit typing as well as parameterized and algebraic data types (these concepts are not as “advanced” as they sound).
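To show how un-“advanced” these concepts are, here is a minimal OCaml sketch (the type and function names are mine, purely illustrative): an algebraic data type is just a value that is exactly one of several labeled cases, and a parameterized type is one that works over any element type.

```ocaml
(* An algebraic data type: a shape is exactly one of these cases. *)
type shape =
  | Circle of float                (* radius *)
  | Rect of float * float         (* width, height *)

(* A parameterized ("generic") type: a binary tree of any element type. *)
type 'a tree =
  | Leaf
  | Node of 'a tree * 'a * 'a tree

(* The compiler checks that every case of shape is handled. *)
let area = function
  | Circle r -> 3.14159265 *. r *. r
  | Rect (w, h) -> w *. h
```

If a new case is later added to shape, every match that forgets to handle it becomes a compiler warning, which is exactly the kind of interface discipline this essay is about.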
Java’s type system simply isn’t very powerful or useful. Java actually contains two type systems, meshed together into an unsettling chimera. One of its type systems is bottom-up and vaguely algebraic, consisting of a few primitive types (integers of varying sizes, and single- and double-precision floating-point numbers) and arrays of existing types. That’s it. It has int and double and char but cannot represent anything more interesting (tuples, strings, variants, parameterized types). To represent those more structured types, one has to use Java’s other type system, which is top-down: everything is a subtype of Object, and each variable stands not for the object itself but for a reference to what might be an object of that type, or nothing (null). The nastiness of this cannot be overstated. Java’s notorious NullPointerException, a runtime failure caused by dereferencing a null pointer, is both a common and (from a debugging perspective) a usually-quite-useless error. Java’s type system lacks the power to verify, at compile time, that a String variable will actually hold a string and not null, a value that must be handled specially by any method that takes a String (ditto for any other object type) as input. The string type is simply not available; you must make do with what is, in fact, a (string option) ref: a mutable reference that may or may not hold a string.
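In OCaml, by contrast, the possibility of absence is spelled out in the type: a string is always a string, and a value that may be missing has the distinct type string option, which the compiler forces you to handle. A minimal sketch (function names are mine):

```ocaml
(* A plain string can never be "null"; no defensive check is needed. *)
let greet (name : string) : string = "Hello, " ^ name

(* A possibly-missing value has a different type, and the compiler
   rejects any pattern match that forgets the None case. *)
let greet_opt (name : string option) : string =
  match name with
  | Some n -> greet n
  | None -> "Hello, whoever you are"
```

The NullPointerException class of bug cannot arise here: passing a possibly-absent value where a plain string is expected is a type error at compile time, not a crash in production.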
Java’s type system catches some errors at compile time, but not enough of them, most programmers feel, to justify the pain of using it. No language can eliminate runtime failures, but statically-typed languages, used properly, can make them very, very rare. Java’s type system doesn’t have enough power to achieve this; it merely makes them somewhat less common, which is not enough to justify the mandatory, explicit use of an ugly, underpowered type system.
The type systems of languages like OCaml and Haskell are far more powerful, allowing user-specified algebraic types. Moreover, explicit typing is usually not required; the compiler infers most types using the Hindley-Milner algorithm. Although OCaml and Haskell programmers usually explicitly type their API-level functions (this is just good practice), they do not suffer the overhead associated with explicitly typing inner functions and variables, as in Java. Code development is almost as fast in Haskell or OCaml as in a language like Ruby: no worse than 10 or 20 percent slower, a difference easily made up for by the reduced debugging time.
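A sketch of what Hindley-Milner inference buys you (the function is mine, for illustration): only the top-level, API-level function carries a written signature; none of the inner bindings do, yet all of them are fully statically typed and checked.

```ocaml
(* Only the API-level signature is spelled out, as documentation.
   Every inner binding's type is inferred by the compiler. *)
let mean_square (xs : float list) : float =
  let square x = x *. x in                            (* inferred: float -> float *)
  let sum = List.fold_left (+.) 0.0 (List.map square xs) in
  sum /. float_of_int (List.length xs)
```

This reads nearly as tersely as the equivalent Ruby, but a typo such as passing an int list where a float list is expected is caught before the program ever runs.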
Misconception #2: Static typing’s main justification is the faster performance of executables.
Executables generated by statically-typed languages do generally perform better than programs, even compiled ones, in dynamic languages, but this is one of the weaker arguments for static typing, maybe fourth or fifth down the list for most programmers. On most applications, human performance matters far more: human time is valuable, and computer time is cheap. We want to write good programs, with minimal maintenance overhead, fast.
On large projects, one of the greatest benefits of static typing is interface control. Although many programmers in dynamic languages are disciplined, one “rock star” with no respect for interfaces can spoil a project, and the errors he produces can go undetected until they occur in runtime testing or (worse yet) in production. He may, for example, change the return type of an interface-level function and fail to inform anyone. In a statically-typed language, this breaks the build, and he’s expected to fix it. In a dynamic language, it can produce a difficult-to-detect runtime error. Worse yet, the failure this change produces can occur far from the function that is in error, after it has finished and is no longer on the call stack.
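In OCaml this interface control is explicit: a module signature pins down the types that other code may rely on, and an implementation that contradicts it fails to compile. A hypothetical sketch (module and function names are mine):

```ocaml
(* The published interface: callers may depend only on this. *)
module type USER_STORE = sig
  val find_name : int -> string option
end

(* The implementation must match the signature. If a "rock star"
   quietly changed find_name to return a bare string, this module
   would no longer compile against USER_STORE: the build breaks
   here, not in production. *)
module UserStore : USER_STORE = struct
  let find_name id = if id = 1 then Some "alice" else None
end
```

The signature acts as a contract that the compiler enforces on every commit; no one can silently change a return type out from under the rest of the team.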
As for smaller projects with one developer, interface control may not be so important, but ease of debugging is, simply because of the enormous amount of time programmers spend debugging and testing. At any scale, compile-time bugs are less painful than runtime bugs, and do-the-wrong-thing errors are even worse than program-terminating runtime bugs. I would argue that, on average, one runtime bug equals 15 to 50 compile-time bugs in terms of costliness. This is not only because runtime bugs take more time and effort to find and fix. It’s also because of the cognitive state called flow, on which programmers rely in order to be productive. Fixing an error caught by the compiler, with a known line number, does not break flow much more than a quick trip to the bathroom (most bugs are trivial and, once caught, can be quickly fixed). A 30-minute forensic caper required to determine the source of runtime misbehavior will break flow, because the programmer has to drop what he’s doing and solve a different problem.
It’s often stated that 50% of a programmer’s time is spent debugging. In dynamically-typed languages and languages with weak type systems, I’d bump that percentage to 80, including unit testing, development and study of debugging tools, and defensive measures that must be taken to prevent possibly unknown bugs (“unknown unknowns”) from entering production. In statically-typed languages, this percentage is appreciably lower. It’s probably 30 to 40 percent, not because programmers in statically-typed languages produce fewer bugs, but because so many of those bugs are confronted immediately and quick to fix.
This is a simple economic argument based on human time. The fact that statically-typed languages produce faster executables is merely an added bonus.
Misconception #3: Static typing only catches trivial bugs.
First, it’s surprising how many bugs are trivial. Occasionally they are the result of deep, intrinsic errors that follow from faulty reasoning about the system one is building, and those take serious time and energy to fix no matter what language one is using, but most of the time, they are the result of mistakes like creating records with a field named “public” and, later in the code, reading a field named “pubic”. In a statically-typed language, properly used, this error will be caught by the compiler, noting that the record type of the data does not have such a field. In dynamically-typed languages, where records are usually represented using map (dictionary) types, the range of possible behaviors is greater. Lisps, for example, tend to return a special value nil when a nonexistent key is queried from a map, meaning that the error will not occur until another function, possibly much later, tries to do something with this null value.
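The “public”/“pubic” example, in OCaml terms (field names taken from the text above): with a record type, the misspelled field is rejected at compile time, at the exact line of the typo, rather than silently flowing onward as nil.

```ocaml
type account = { name : string; public : bool }

let is_public (a : account) = a.public
(* Writing [a.pubic] instead is a compile-time error
   ("Unbound record field pubic"): the typo is reported at
   this line, and the bug never reaches runtime. *)
```

In a dynamic language where the record is a dictionary, the same typo typically produces a nil that surfaces as a failure somewhere else entirely, often several calls away.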
The painfulness of a bug is not a function of whether it is “trivial” in origin, but how long it takes to detect and fix the bug. “Trivial” bugs, by definition, are fairly easy to fix once found; this does not mean they are always easy to find. In dynamically-typed languages, certain classes of trivial bugs take minutes to find at best and hours at worst. That time adds up very quickly.
Second, the usefulness of static typing is a function of the programmer’s knowledge of how to use it. Types provide a language through which programmers can specify certain constraints, but don’t require that the programmer use it. An undisciplined programmer could represent dates as, say, integer arrays or tuples– a bad idea, due to ambiguity in date formats. By contrast, a good programmer would create a record type with fields labeled “day”, “month”, and “year”, thereby eliminating certain classes of ambiguity.
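A sketch of the disciplined version (field names are mine, for illustration): where an int * int * int tuple could be read as either day-month-year or month-day-year, the record’s labeled fields make the ordering question impossible to get wrong.

```ocaml
type date = { day : int; month : int; year : int }

(* 4/7 can no longer be misread as April 7th or July 4th:
   the construction site names each component. *)
let independence_day = { day = 4; month = 7; year = 1776 }
```

The type system did not force this choice; the tuple version would also compile. Using types well is a discipline the language rewards but cannot impose.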
Strong, static typing, properly used, can catch the vast majority of bugs in compilation. Using the type system to do so is an art more than it is a science, but most programmers can learn enough to get started within a couple of weeks.
I’ve only scratched the surface of the benefits of static typing, and there’s much I’ve left out. In sum, I believe the strongest benefit of static typing is that it offers a set of tools through which programmers can dramatically reduce the incidence of costly runtime bugs. Since type inference is automatic in languages like OCaml and Haskell, it provides, essentially for free, a large suite of unit tests that never have to be written, and automatic, error-free documentation. It’s no silver bullet, obviously– no tool could entirely eliminate the need for unit testing and documentation– but in my experience, it’s still damn useful. If I’m right (and I may not be; these are estimates based on anecdotal experience) in my claim that debugging overhead in large projects reaches 80% in dynamically-typed languages, as opposed to 40% for statically-typed ones, this indicates the potential for a threefold increase in the amount of time spent moving a project forward, and the potential for a dramatic improvement in real productivity.