Learning C, reducing fear.

I have a confession to make. At one point in my career, I was a mediocre programmer. I might say that I still am, only in the context of being a harsh grader. I developed a scale for software engineering for which I can only, in intellectual honesty, assign myself 1.8 points out of a possible 3.0. One of the signs of my mediocrity is that I haven’t a clue about many low-level programming details that, thirty years ago, people dealt with on a regular basis. I know what L1 and L2 cache are, but I haven’t built the skill set yet to make use of this knowledge.

I love high-level languages like Scala, Clojure, and Haskell. The abstractions they provide make programming more productive and fun than it is in a language like Java and C++, and the languages have a beauty that I appreciate as a designer and mathematician. Yet, there is still quite a place for C in this world. Last July, I wrote an essay, “Six Languages to Master“, in which I advised young programmers to learn the following languages:

  • Python, because one can get started quickly and Python is a good all-purpose language.
  • C, because there are large sections of computer science that are inaccessible if you don’t understand low-level details like memory management.
  • ML, to learn taste in a simple language often described as a “functional C” that also teaches how to use type systems to make powerful guarantees about programs.
  • Clojure, because learning about language (which is important if one wants to design good interfaces) is best done with a Lisp and because, for better for worse, the Java libraries are a part of our world.
  • Scala, because it’s badass if used by people with a deep understanding of type systems, functional programming, and the few (very rare) occasions where object-oriented programming is appropriate. (It can be, however, horrid if wielded by “Java-in-Scala” programmers.)
  • English (or the natural language of one’s environment) because if you can’t teach other people how to use the assets you create, you’re not doing a very good job.

Of these, C was my weakest at the time. It still is. Now, I’m taking some time to learn it. Why? There are two reasons for this.

  • Transferability. Scala’s great, but I have no idea if it will be around in 10 years. If the Java-in-Scala crowd adopts the language without upgrading its skills and the language becomes associated with Maven, XMHell, IDE culture, and commodity programmers, in the way that Java has, the result will be piles of terrible Scala code that will brand the language as “write-only” and damage its reputation for reasons that are not Scala’s fault. These sociological variables I cannot predict. I do, however, know that C will be in use in 10 years. I don’t mind learning new languages– it’s fun and I can do it quickly– but the upshot of C is that, if I know it, I will be able to make immediate technical contributions in almost any programming environment. I’m already fluent in about ten languages; might as well add C. 
  • Confidence. High-level languages are great, but if you develop the attitude that low-level languages are “unsafe”, ugly, and generally terrifying, then you’re hobbling yourself for no reason. C has its warts, and there are many applications where it’s not appropriate. It requires attention to details (array bounds, memory management) that high-level languages handle automatically. The issue is that, in engineering, anything can break down, and you may be required to solve problems in the depths of detail. Your beautiful Clojure program might have a performance problem in production because of an issue with the JVM. You might need to dig deep and figure it out. That doesn’t mean you shouldn’t use Clojure. However, if you’re scared of C, you can’t study the JVM internals or performance considerations, because a lot of the core concepts (e.g. memory allocation) become a “black box”. Nor will you be able to understand your operating system.

For me, personally, the confidence issue is the important one. In the functional programming community, we often develop an attitude that the imperative way of doing things is ugly, unsafe, wrong, and best left to “experts only” (which is ironic, because most of us are well into the top 5% of programmers, and more equipped to handle complexity than most; it’s this adeptness that makes us aware of our own limitations and prefer functional safeguards when possible). Or, I should not say that this is a prevailing attitude, so much as an artifact of communication. Fifty-year-old, brilliant functional programmers talk about how great it is to be liberated from evils like malloc and free. They’re right, for applications where high-level programming is appropriate. The context being missed is that they have already learned about memory management quite thoroughly, and now it’s an annoyance to them to keep having to do it. That’s why they love languages like Ocaml and Python. It’s not that low-level languages are dirty or unsafe or even “un-fun”, but that high-level languages are just much better suited to certain classes of problems.

Becoming the mentor

I’m going to make an aside that has nothing to do with C. What is the best predictor of whether someone will remain at a company for more than 3 years? Mentorship. Everyone wants “a Mentor” who will take care of his career by providing interesting work, freedom from politics, necessary introductions, and well-designed learning exercises instead of just-get-it-done grunt work. That’s what we see in the movies: the plucky 25-year-old is picked up by the “star” trader, journalist, or executive and, over 97 long minutes, his or her career is made. Often this relationship goes horribly wrong in film, as in Wall Street, wherein the mentor and protege end up in a nasty conflict. I won’t go so far as to call this entirely fictional, but it’s very rare. You can find mentors (plural) who will help you along as much as they can, and should always be looking for people interested in sharing knowledge and help, but you shouldn’t look for “The Mentor”. He doesn’t exist. People want to help those who are already self-mentoring. This is even more true in a world where few people stay at a job for more than 4 years.

I’ll turn 30 this year, and in Silicon Valley that would entitle me to a lawn and the right to tell people to get off of it, but I live in Manhattan so I’ll have to keep using the Internet as my virtual lawn. (Well, people just keep fucking being wrong. There are too many for one man to handle!) One of the most important lessons to learn is the importance of self-mentoring. Once you get out of school where people are paid to teach you stuff, people won’t help people who aren’t helping themselves. To a large degree, this means becoming the “Mentor” figure that one seeks. I think that’s what adulthood is. It’s when you realize that the age in which there were superior people at your beck and call to sort out your messes and tell you what to do is over. Children can be nasty to each other but there are always adults to make things right– to discipline those who break others’ toys, and replace what is broken. The terrifying thing about adulthood is the realization that there are no adults. This is a deep-seated need that the physical world won’t fill. There’s at least 10,000 recorded years of history that shows people gaining immense power by making “adults-over-adults” up, and using the purported existence of such creatures to arrogate political power, because most people are frankly terrified of the fact that, at least in the observable physical world and in this life, there is no such creature.

What could this have to do with C? Well, now I dive back into confessional mode. My longest job tenure (30 months!) was at a startup that seems to have disappeared after I left. I was working in Clojure, doing some beautiful technical work. This was in Clojure’s infancy, but the great thing about Lisps is that it’s easy to adapt the language to your needs. I wrote a multi-threaded debugger using dynamic binding (dangerous in production, but fine for debugging) that involved getting into the guts of Clojure, a test harness, an RPC client-server infrastructure, and a custom NoSQL graph-based database. The startup itself wasn’t well-managed, but the technical work itself was a lot of fun. Still, I remember a lot of conversations to the effect of, “When we get a real <X>”, where X might be “database guy” or “security expert” or “support team”. The attitude I allowed myself to fall into, when we were four people strong, was that a lot of the hard work would have to be done by someone more senior, someone better. We inaccurately believed that the scaling challenges would mandate this, when in fact, we didn’t scale at all because the startup didn’t launch.

Business idiots love real X’s. This is why startups frequently develop the social-climbing mentality (in the name of “scaling”) that makes internal promotion rare. The problem is that this “realness” is total fiction. People don’t graduate from Expert School and become experts. They build a knowledge base over time, often by going far outside of their comfort zones and trying things at which they might fail, and the only things that change are that the challenges get harder, or the failure rate goes down. As with the Mentor that many people wait for in vain, one doesn’t wait to “find a Real X” but becomes one. That’s the difference between a corporate developer and a real hacker. The former plays Minesweeper (or whatever Windows users do these days) and waits for an Expert to come from on high to fix his IDE when it breaks. The latter shows an actual interest in how computers really work, which requires diving into the netherworld of the command line interface.

That’s why I’m learning C. I’d prefer to spend much of my programming existence in high-level languages and not micromanaging details– although, this far, C has proven surprisingly fun– but I realize that these low-level concerns are extremely important and that if I want to understand things truly, I need a basic fluency in them. If you fear details, you don’t understand “the big picture”. The big picture is made up of details, after all. This is a way to keep the senescence of business FUD at bay– to not become That Executive who mandates hideous “best practices” Java, because Python and Scala are “too risky”.

Fear of databases? Of operating systems? Of “weird” languages like C and Assembly? Y’all fears get Zero Fucks from me.

Six languages to master.

Eric Raymond, in “How to Become a Hacker“, recommended five languages: C, Java, Python, Perl, and Lisp. Each he recommended for different reasons: Python and Java as introductory languages, C to understand and hack Unix, Perl because of its use in scripting, and Lisp for, to use his words which are so much better than anything I can come up with, the profound enlightenment experience you will have when you finally get it. That experience will make you a better programmer for the rest of your days, even if you never actually use LISP itself a lot.

It’s 2012. Many languages have come to the fore that didn’t exist when this essay was written. Others, like Perl, have faded somewhat. What is today’s five-language list? I won’t pretend that my answer is necessarily the best; it’s biased based on what I know. That said, I’d think the 5 highest-return languages for people who want to become good engineers are the following, and in this order: Python, C, ML, Clojure, and Scala.

Why these 5? Python I include because it’s easy to learn and, in-the-small, extremely legible. I’d rather not use it for a large system, but people who are just now learning to program are not going to be writing huge systems. They’re not going to be pushing major programs into production. At least, they shouldn’t be. What they should be able to do is scrape a webpage or build a game or investigate an applied math problem and say, “Shit, that’s cool.” Python gets people there quickly. That will motivate them to get deeper into programming. Python is also a language that is not great at many things, but good at one hell of a lot of them. It’s quick to write, legible in the small, and expressive. It allows imperative and functional styles. It has great libraries, and it has strong C bindings, for when performance is needed.

People who are getting started in programming want to do things that are macroscopically interesting from a beginner’s perspective. They don’t just want to learn about algorithms and how compilers work, because none of that’s interesting to them until they learn more of the computer science that motivates the understanding of why these things are important. Compilers aren’t interesting until you’ve written programs in compiled languages. At the start, people want to be able to write games, scrape webpages, and do simple systems tasks that come up. Python is good because it’s relatively easy to do most programming tasks in it.

After Python, C is a good next choice, and not because of its performance. That’s largely irrelevant to whether it will make someone a better programmer (although the confidence, with regard to understanding performance, that can come with knowing C is quite valuable). C is crucial because there’s a lot of computer science that becomes inaccessible if one sticks to higher-level languages (and virtual machines) like Java, C#, and Python. Garbage collection is great, but what is the garbage collector written in? C. As is Unix, notably. For all this, I think C is a better choice than C++ because there’s another important thing about C: C++ is a mess and it’s not clear whether it’s a good language for more than 1% of the purposes to which it’s put, but C, on the other hand, has utterly dominated the mid-level language category. For all its flaws, C is (like SQL for database query languages) a smashing, undisputed success, and for good reasons. The high-level language space is still unsettled, with no clear set of winners, but the mid-level language used to write the runtimes and garbage collectors of those high level languages is usually C, and will be for some time.

Python and C give a person coverage of the mid- and very-high levels of language abstraction. I’m avoiding including low-level (i.e. assembly) languages because I don’t think any of them have the generalist’s interest that would justify top-5 placement. Familiarity with assembly language and how it basically works is a must, but I don’t think mastery of x86 intricacies is necessary for most programmers.

Once a programmer’s fluent in Python and C, we’re talking about someone who can solve most coding problems, but improvement shouldn’t end there. Taste is extremely important, and it’s lack of taste rather than lack of intellectual ability that has created the abundance of terrible code in existence. Languages can’t inherently force people to learn taste, but a good starting point in this direction is ML: SML or OCaml with the “O” mostly not used.

ML has been described as a “functional C” for its elegance. It’s fast, and it’s a simple language, but its strong functional programming support makes it extremely powerful. It also forces people to program from the bottom up. Instead of creating vague “objects” that might be hacked into bloated nonsense over the lifespan of a codebase, they create datatypes (mostly, records and discriminated unions, with parameterized types available for polymorphism) out of simpler ones, and use referentially transparent functions as the basic building blocks of most of their programs. This bottom-up structure forces people to build programs on sound principles (rather than the vague, squishy abstractions of badly-written object-oriented code) but ML’s high-level capability brings people into the awareness that one can write complex software using a bottom-up philosophy. Python and C teach computer science at higher and lower levels, but ML forces a programmer to learn how to write good code.

There’s also something philosophical that Python, C, and Ocaml tend to share that C++ and Java don’t: small-program philosophy, which is generally superior. I’ve written at length about the perils of the alternative. In these languages, it’s much more uncommon to drag in the rats’ nest of dependencies associated with large Java projects. For an added bonus, you never have to look at those fucking ugly singleton directories called “com”. Once a person has used these three languages to a significant extent, one gets a strong sense of how small-program development works and why immodular, large-project orientation is generally a bad thing.

When you write C or Ocaml or Python, you get used to writing whole programs that accomplish something. There’s a problem, you solve it, and you’re done. Now you have a script, or a library, or a long-running executable. You may come back to it to improve it, but in general, you move on to something else, while the solution you’ve created adds to the total stored value of your code repository. That’s what’s great about small-program development: problems are actually solved and work is actually “done” rather than recursively leading to more work without any introspection on whether the features being piled on the work queue make sense. Developers who only experience large-program development– working on huge, immodular Java projects in IDEs a million metaphorical miles from where the code actually runs for real– never get this experience of actually finishing a whole program.

Once a person has grasped ML, we’re talking about a seriously capable programmer, even though ML isn’t a complex language. Learned in the middle of one’s ML career is a point to which I’ll return soon, but for now leave hanging: types are interesting. One of the most important things to learn from ML is how to use the type system to enforce program correctness: it generates a massive suite of implicit unit tests that (a) never have to be written, and (b) don’t contribute to codebase size. (Any decent programmer knows that “lines of code” represent expenditure, not accomplishment.)

The fourth language to learn is Clojure, a Lisp that happens to run on the JVM. The JVM has its warts, but it’s powerful and there are a lot of good reasons to learn that ecosystem, and Clojure’s a great entry point. A lot of exciting work is being done in the JVM ecosystem, and languages like Clojure and Scala keep some excellent programmers interested in it. Clojure is an excellent Lisp, but with its interactive “repl” (read-eval-print-loop) and extremely strong expressivity, it is (ironically)  arguably the best way to learn Java. It has an outstanding community, a strong feature set, and some excellent code in the open-source world.

Lisp is also of strong fundamental importance, because its macro system is unlike anything else in any other language and will fundamentally alter how an engineer thinks about software, and because Lisp encourages people to use a very expressive style. It’s also an extremely productive language: large amounts of functionality can be delivered in a small amount of time. Lisp is a great language for learning the fundamentals of computing, and that’s one reason why Scheme has been traditionally used in education. (However, I’d probably advocate starting with Python because it’s easier to get to “real stuff” quickly in it. Structure and Interpretation of Computer Programs and Scheme should be presented when people know they’re actually interested in computing itself.)

When one’s writing large systems, Lisp isn’t the best choice, because interfaces matter at that point, and there’s a danger that people will play fast-and-loose with interfaces (passing nested maps and lists and expecting the other side to understand the encoding) in a way that can be toxic. Lisp is great if you trust the developers working on the project, but (sadly) I don’t think many companies remain in such a state as they grow to scale.

Also, static typing is a feature, not a drawback. Used correctly, static typing can make code more clear (by specifying interfaces) and more robust, in addition to the faster performance usually available in compiled, statically typed languages. ML and Haskell (which I didn’t list, but it’s a great language in its own right) can teach a person how to use static typing well.

So after Lisp, the 5th language to master is Scala. Why Scala, after learning all those others, and having more than enough tools to program in interesting ways? First of all, it has an incredible amount of depth in its type system, which attempts to unify the philosophies of ML and Java and (in my opinion) does a damn impressive job. The first half of Types and Programming Languages is, roughly speaking, the theoretic substrate for ML. But ML doesn’t have a lot of the finer features. It doesn’t have subtyping, for example. Also, the uniqueness constraint on record and discriminated union labels (necessary for full Hindley-Milner inference, but still painful) can have a negative effect on the way people write code. The second half of TAPL, which vanilla ML doesn’t really support, is realized in Scala. Second, I think Scala is the language that will salvage the 5 percent of object-oriented programming that is actually useful and interesting, while providing such powerful functional features that the remaining 95% can be sloughed away. The salvage project in which a generation of elite programmers selects what works from a variety of programming styles– functional, object-oriented, actor-driven, imperative– and discards what doesn’t work, is going to happen in Scala. So this is a great opportunity to see first-hand what works in language design and what doesn’t.

Scala’s a great language that also requires taste and care, because it’s so powerful. I don’t agree with the detractors who claim it’s at risk of turning into C++, but it definitely provides enough rope for a person to hang himself by the monads.

What’s most impressive about Clojure and Scala is their communities. An enormous amount of innovation, not only in libraries but also in language design, is coming out of these two languages. There is a slight danger of Java-culture creep in them, and Scala best practices (expected by the leading build environments) do, to my chagrin, involve directories called “src” and “main” and even seem to encourage singleton directories called “com”, but I’m willing to call this a superficial loss and, otherwise, the right side seems to be winning. There’s an incredible amount of innovation happening in these two languages that have now absorbed the bulk of the top Java developers.

Now… I mentioned “six languages” in this post’s title but named five. The sixth is one that very few programmers are willing to use in source code: English. (Or, I should say, the scientifically favored natural language of one’s locale.) Specifically, technical English, which requires rigor as well as clarity and taste. Written communication. This is more important than all of the others. By far. For that, I’m not complaining that software engineers are bad at writing. Competence is not a problem. Anyone smart enough to learn C++ or the finer points of Lisp is more than intelligent enough to communicate in a reasonable way. I’m not asking people to write prose that would make Faulkner cry; I’m asking them to explain the technical assets they’ve created at, at the least, the level of depth and rigor expected in a B+ undergraduate paper. The lack of writing in software isn’t an issue of capability, though, but of laziness.

Here’s one you hear sometimes: “The code is self-documenting.” Bullshit. It’s great when code can be self-documenting, making comments unnecessary, but it’s pretty damn rare to be solving a problem so simple that the code responsible for solving it is actually self-explanatory. Most problems are custom problems that require documentation of what is being solved, why, and how. People need to know, when they read code, what they’re looking at; otherwise, they’re going to waste a massive amount of time focusing on details that aren’t relevant. Documentation should not be made a crutch– you should also do the other important things like avoiding long functions and huge classes– but it is essential to write about what you’re doing. People need to stop thinking about software as machinery that “explains itself” and start thinking of it as writing a paper, with instructions for humans about what is happening alongside the code actually doing the work.

One of the biggest errors I encounter with regard to commenting is the tendency to comment minutiae while ignoring the big picture. There might be a 900-line program with a couple comments saying, “I’m doing this it’s O(n) instead of O(n^2)” or “TODO: remove hard-coded filename”, but nothing that actually explains why these 900 lines of code exist. Who does that help? These types of comments are useless to people who don’t understand what’s happening at all, which they generally won’t in the face of inadequate documentation. Code is much easier to read when one knows what one is looking at, and microcomments on tiny details that seemed important when the code was written are not helpful.

Comments are like static typing: under-regarded if not ill-appreciated because so few people use them properly, but very powerful (if used with taste) in making code and systems actually legible and reusable. Most real-world code, unfortunately, isn’t this way. My experience is that about 5 to 10 percent of code in a typical codebase is legible, and quite possibly only 1 percent is enjoyable to read (which good code truly is). The purpose of a comment should not be only to explain minutiae or justify weird-looking code. Comments should also ensure that people always know what they’re actually looking at.

The fallacy that leads to a lack of respect for documentation is that writing code is like building a car or some other well-understood mechanical system. Cars don’t come with a bunch of labels on all the pieces, because cars are fairly similar under the hood and a decent mechanic can figure out what is what. With software, it’s different. Software exists to solve a new problem; if it were solving an old problem, old software could be used. Thus, no two software solutions are going to be the same. In fact, programs tend to be radically different from one another. Software needs to be documented because every software project is inherently different, at least in some respects, from all the others.

There’s another problem, and it’s deep. The 1990s saw an effort, starting with Microsoft’s visual studio, to commoditize programmers. The vision was that, instead of programming being a province of highly-paid, elite specialists with a history of not working well with authority, software could be built by bolting together huge teams of mediocre, “commodity” developers, and directing them using traditional (i.e. pre-Cambrian) management techniques. This has begun to fail, but not before hijacking object-oriented programming, turning Java’s culture poisonous, and creating some of the most horrendous spaghetti code (MudballVisitorFactoryFactory) the world has ever seen. Incidentally, Microsoft is now doing a penance by having its elite research division investigate functional programming in a major way, the results being F# and a much-improved C#. Microsoft, on the whole, may be doomed to mediocrity, but they clearly have a research division that “gets it” in an impressive way. Still, that strikes me as too little, too late. The damage has be done, and the legacy of the commodity-developer apocalypse still sticks around.

The result of the commodity-programmer world is the write-only code culture that is the major flaw of siloized, large-program development. That, I think, is the fundamental problem with Java-the-culture, IDE-reliance, and the general lack of curiosity observed (and encouraged) among the bottom 80 percent of programmers. To improve as programmers, people need to read code and understand it, in order to get a sense of what good and bad code even are, but almost no one actually reads code anymore. IDEs take care of that. I’m not going to bash IDEs too hard, because they’re pretty much essential if you’re going to read a typical Java codebase, but IDE culture is, on the whole, a major fail that makes borderline-employable programmers out of people who never should have gotten in in the first place.

Another problem with IDE culture is that the environment becomes extremely high maintenance, between plugins that often don’t work well together, build system idiosyncracies that accumulate over time, and the various menu-navigation chores necessary to keep the environment sane (as opposed to command-line chores, which are easily automated). Yes, IDEs do the job: bad code becomes navigable, and commodity developers (who are terrified of the command line and would prefer not to know what “build systems” or “version control” even are) can crank out a few thousand lines of code per year. However, the high-maintenance environment requires a lot of setup work, and I think this is culturally poisonous. Why? For a contrast, in the command-line world, you solve your own problems. You figure out how to download software (at the command line using wget, not clicking a button) and install it. Maybe it takes a day to figure out how to set up your environment, but once you’ve suffered through this, you actually know a few things (and you usually learn cool things orthogonal to the problem you were originally trying to solve). When a task gets repetitive, you figure out how to automate it. You write a script. That’s great. People actually learn about the systems they’re using. On the other hand, in IDE-culture, you don’t solve your own problems because you can’t, because there it would take too long. In the big-program world, software too complex for people to solve their own problems is allowed to exist. Instead of figuring it out on your own, you flag someone down who understands the damn thing, or you take a screenshot of the indecipherable error box that popped up and send it to your support team. This is probably economically efficient from a corporate perspective, but it doesn’t help people become better programmers over time.

IDE culture also creates a class of programmers who don’t work with technology outside of the office– the archetypal “5:01 developers”– because they get the idea that writing code requires an IDE (worse yet, an IDE tuned exactly in line with the customs of their work environment). If you’re IDE-dependent, you can’t write code outside of a corporate environment, because when you go home, you don’t have a huge support team to set the damn thing up in a way that you’re used to and fix things when the 22 plugins and dependencies that you’ve installed interact badly.

There are a lot of things wrong with IDE culture, and I’ve only scratched the surface, but the enabling of write-only code creation is a major sticking point. I won’t pretend that bad code began with IDEs because that’s almost certainly not true. I will say that the software industry is in a vicious cycle, which the commodity-developer initiative exacerbated. Because most codebases are terrible, people don’t read them. Because “no one reads code anymore”, the bulk of engineers never get better, and continue to write bad code.

Software has gone through a few phases of what it means for code to actually be “turned in” as acceptable work. Phase 1 is when a company decides that it’s no longer acceptable to horde personal codebases (that might not even be backed up!) and mandates that people check their work into version control. Thankfully, almost all companies have reached that stage of development. Version control is no longer seen as “subversive” by typical corporate upper management. It’s now typical. The second is when a company mandates that code have unit tests before it can be relied upon, and that a coding project isn’t done until it has tests. Companies are reaching this conclusion. The third milestone for code-civilizational development, which very few companies have reached, is that the code isn’t done until you’ve taught users how to use it (and how to interact with it, i.e. instantiate the program and run it or send messages to it, in a read-eval-print-loop appropriate to the language). That teaching can be supplied at a higher level in wikis, codelabs, and courses… but it also needs to be included with the source code. Otherwise, it’s code out-of-context, which becomes illegible after a hundred thousand lines or so. Even if the code is otherwise good, out-of-context code without clear entry-points and big-picture documentation becomes incomprehensible around this point.

What do I not recommend? There’s no language that I’d say is categorically not worth learning, but I do not recommend becoming immersed in Java (except well enough to understand the innards of Clojure and Scala). The language is inexpressive, but the problem isn’t the language, and in fact I’d say that it’s unambiguously a good thing for an engineer to learn how the JVM works. It’s that Java-the-Culture (VisitorSelectionFactories, pointless premature abstraction, singleton directories called “com” that betray dripping contempt for the command line and the Unix philosophy, and build environments so borked that it’s impossible not to rely on an IDE) that is the problem; it’s so toxic that it reduces an engineer’s IQ by 2 points per month.

For each of these five programming languages, I’d say that a year of exposure is ideal and probably, for getting a generalist knowledge, enough– although it takes more than a year to actually master any of these. Use and improvement of written communication, on the other hand, deserves more. That’s a lifelong process, and far too important for a person not to start early on. Learning new programming languages and, through this, new ways of solving problems, is important; but the ability to communicate what problem one has solved is paramount.

REPL or Fail

I wrote a post last week on the idealized trajectory of a software engineer in general professional ability, and I find the 3-point scale (with one decimal point) that I developed to be quite useful. It describes the transition from additive to multiplicative contributions to a team, with 1.0 representing the baseline competence (a net adder, rather than a subtracter) of a professional programmer, 1.2 being about average for the industry, 1.5 being seriously good (“senior” in most contexts) and 2.0 representing a consistent multiplier and technical leader. Sadly, one of the more common roadblocks occurs early (around 1.0 to 1.2) and for most developers, and it’s often associated with-established programming languages like Java, C++, and Visual Basic. What’s going on? Why do so many programmers reach a hard ceiling, with some persisting there for decades, while others pass through this barrier with ease? What is it about certain languages or technologies that holds programmers back?

Programming is a two-class society. We have the mere “coders” who use only one language, hate the command line, and don’t program outside of work. They typically work on bland, “enterprise” projects and solve annoyingly detailed, but not difficult, problems. They generally find programming to be a boring task, but “it pays the bills” and the other smart-person route to management (actuarial science) involves hard exams. If they remain “in programming”, they’re lucky to break six figures when they’re 40, and they are likely to face age discrimination and layoffs in the decade after that. On my scale, they plateau around 1.2 because, in enterprise programming, those who reach 1.3-1.4 are usually brought into management. For a contrast, the other class is comprised of elite “hackers” who prefer languages like Python, Scala and Erlang (although some might have to use Java) and who program outside of work, who are hotly desired by huge companies and startups alike, and who continue growing even into old age. These usually reach 1.3 in their first few years as professional programmers, and reliably break 1.5 by mid-career. What separates the two? What is it about a programming language that makes it highly indicative of a software engineer’s future progress (or lack thereof)?

It’s evident that this problem is outside of languages themselves, because a 1.5 programmer can, after some adjustment, program at a comparable level in Java. So it’s actually not the case (aside from an opportunity cost argument) that Java and C++ make people worse programmers. Rather, something happens in other, more modern, languages that makes people improve faster and helps them smash through that 1.2 barrier. So, what is it? I spent much time trying to figure this out, and when I came upon the answer, it was so simple it shocked me: The Mighty REPL.

Modern programming languages supply an interactive mode, also known as a “read-eval-print loop” or REPL, as part of the programming environment. The REPL allows programmers to try out code and get immediate results, and to explore existing software modules interactively by calling their functions and seeing what they do. Although typically associated with interpreted languages (notably, Lisp) REPLs are supplied for compiled languages such as Scala, Ocaml, and Haskell. (They do not provide the speed of compiled code, but code is not run for speed in the interactive mode.) This tight feedback loop facilitates a style of development and code exploration that is far more engaging and effective than the processes of writing and reading code would be without a REPL, and far superior to anything available from an IDE. (A REPL is, in some sense, a simple but highly effective IDE. Properly built, it makes IDEs unnecessary.)

Programmer productivity is binary, in that a programmer is either in a productive, engaged state of “flow” or in a disorganized, unpleasant, and unproductive state out of flow where 10% as much (if that) is accomplished. (A significant amount of work stress, in my estimation, is caused by a self-inflicted sense of pressure to be productive while one is out of flow.) It takes programmers about 15 to 30 minutes to enter flow, but once in it, they are immensely productive and moreover, quite happy. Developers usually write the most code and their best code when in flow, and are best at reading code when in this state as well. There’s a problem: reading other peoples’ code, if the code presents a lot of accidental complexity obscuring the question the reader is trying to answer, often shatters flow (which is why bad code is hated with a passion that only programmers understand).

Programmers understand flow and its importance from experience, and flow becomes more important as one increases one’s programming skills (to the point that 2.0+ programmers, rather than negotiating for higher salaries, tend to negotiate perks oriented toward flow and engagement, such as a quiet working space and an unconditional right to turn down meetings). An experienced programmer can work at a 1.0 level (cranking out code of nominal additive value) without inspiration, but 1.5+ and especially 2.0+ level contributions require creativity and focus. So experienced, elite programmers know that a REPL-less language is generally a dead-end. In a “green field” environment where the programmer controls the entire context in which is work exists, engaged writing of code in languages like Java is still possible– but engaged reading of code is out of the question. The engaging way to read code is to get a big-picture sense of it through interaction, and then to examine the code for implementation details strictly after this has been achieved.

This, in my mind, singularly explains the ceiling that Java developers hit around 1.2. By some non-satisfactory definition akin to Turing-completeness for programming languages, a 1.2 programmer has the knowledge and resources to “solve any programming problem” (ignoring performance and feasibility concerns). The solution may be inelegant, slow, unmaintainable, and even bug-prone, and it might take a long time for the solution to be delivered, but there aren’t programming problems that a 1.2 (or even 1.0) engineer “can’t solve”. On the other hand, far more interesting is how people solve problems, and some of the things that programmers must do if they want to become elite (1.5+) programmers are (a) figure out which problems are worth solving, and (b) learn how to read solutions that other people have created. To become a great programmer, one must be able to read code (in an engaged state of flow). Moreover, one must read good code, a commodity that is depressingly rare in this industry.

Reading code can be an immense joy that produces “Aha!” moments, or it can be hellishly tedious and unfruitful. Sadly, most real-world code (especially in languages like Java and C++) is closer to the latter extreme– it’s probably over 90 percent. Code rots for a variety of reasons. One is the wine/sewage problem (“a teaspoon of sewage in a barrel of wine makes a barrel of sewage, a teaspoon of wine in a barrel of sewage makes a barrel of sewage”): if a system is corrupted and the nastiness isn’t aggressively refactored, the kludges will beget counterkludges in “maintenance” and destroy the whole system. A related issue is the “broken windows” effect: tolerance of ugliness leads to a sense of abandon, and this is more common than most programmers will admit. Modifying code in a reasonable way (i.e. one that doesn’t, despite solving an immediate bug or adding a specific feature, make the general quality of the code worse) requires understanding it, and that usually involves reading it, and most code is so terrible that programmers who have to use, extend or maintain it just give up on comprehension and “hack” it as far as they can. Programming in this style is akin to the” Jenga” game, where players must remove planks from a tower and place them on top of it, making the structure less sturdy and higher as they go (until it collapses, and the player whose turn it is loses).

There’s no silver bullet for code comprehension, but the REPL is the closest thing. The worst code may remain impenetrable, but few real projects begin their existence as incomprehensible legacy nightmares; they usually start out as average code for a project of that size. REPLs make it possible to explore average-case code and comprehend it without dedicating massive amounts of time to the process, and this makes a huge difference. Aging modules can be refactored in the earliest stages of decay, long before they get anywhere near the “legacy horror” state. By enabling interactive peeking and poking of functions, REPLs allow programmers to explore libraries and get a sense of their interfaces. For example, in Ocaml, it’s possible to get the full type signature of any module:

# module L = List;;
module L :
sig
val length : 'a list -> int
val hd : 'a list -> 'a
val tl : 'a list -> 'a list
val nth : 'a list -> int -> 'a
val rev : 'a list -> 'a list
val append : 'a list -> 'a list -> 'a list
val rev_append : 'a list -> 'a list -> 'a list
val concat : 'a list list -> 'a list
val flatten : 'a list list -> 'a list
val iter : ('a -> unit) -> 'a list -> unit
val map : ('a -> 'b) -> 'a list -> 'b list
[...]
end

From these type signatures, it’s relatively easy to get a sense of what these functions do and to test those intuitions:

# List.length [1; 1; 2; 3; 5; 8];;
- : int = 6
# List.map (fun x -> x*x) [1; 1; 2; 3; 5; 8];;
- : int list = [1; 1; 4; 9; 25; 64]

With Ocaml’s powerful REPL, a person can explore code and get a sense of the big picture before starting to read it. That makes a huge difference: reading code is an order of magnitude easier and more engaging when one understands what one is looking at. Moreover, many Lisps such as Clojure and Common Lisp provide documentation functions at the REPL that allow the user to read a function’s documentation without having to leave the command-line. This provides all of the benefits of an IDE, without the flow-breaking drawbacks.

For an aside, there’s something that elite programmers (1.5+) call “keyboard snobbery”: IDEs are scorned, while the key-combos of emacs and vim are venerated, even with anachronistic names like “meta” for the escape key.  The command line interface is highly valued. This doesn’t apply to all computer use (when web surfing, keyboard snobs use the mouse like anyone else) but it does apply to writing and reading code. Why? Because the mouse is physical, continuous and imprecise, while the keyboard is  cerebral, discrete and exact (and therefore a better tool when programming). When we use web pages, we trust the developer to handle the imprecision (in determining whether a button was clicked, and in interpreting the mouse event). This is fine for this purpose, but when we’re writing code, we want exactitude and total control of our interaction with the machine. We want exactly the result we expect at all times. So that’s why we prefer the keyboard when coding, but there’s something else going on as well. Switching from keyboard to mouse doesn’t only involve a move of the hand. It reframes the interaction between the human and the machine, and that’s a context switch. Seemingly benign context switches inflict major drag on programmer productivity. Switching to the mouse because one’s IDE requires it? That’s 3-5 minutes. Pinging about the filesystem because of some stupid requirement that each class live in its own file? That’s about ten minutes. Managerial interruption? An hour, and half a day if the meeting is unexpected and intense. Programmers hate being nickel-and-dimed by context switches, and they hate being out of flow. This is why seriously good programmers prefer “archaic” tools like the command-line interface, vim and emacs, while considering typical mouse-driven modern IDEs (which are necessary if one is developing in a verbose basketcase of a language like Java, but unnecessary in better languages) to be useless.

The REPL, served at the commandline, allows a person to interact with code as if it were live, and see what the pieces do. At least half of what we must do as programmers is comprehension of assets that other people have created, and the REPL allows us to do this without the painful context switch associated with having to read code cold. It enables “flowful” (that is, engaging) exploration and, later, “lazy reading” of code. (Lazy, in this sense, is a non-pejorative computer science term associated with doing only the work needed to solve a problem.) That’s something REPL-less languages can’t provide, because in them, code is a dead static thing that might be run against some dead static tests, not something a developer can interact with as he works.

Reading code is part of the job description of any programmer, and yet it’s rarely done well because enterprise languages like Java make the process so dismal that most people just give up, falling into abominable development practices. When 90 percent of the code is tedious boilerplate (accidental complexity) that isn’t worth the eye strain, it’s easy to miss crucial details. The accumulation of missed details leads to frank incomprehension quickly, and then development practices akin to “throwing mud at the wall and seeing what sticks” become the norm.

This, I believe, is what holds back most Java developers’ progress. Not only does code in such languages become horrible quickly, but the environment makes it unpleasant to read even “good” (by which, I mean “above average for the language) code. In fact, most IDEs tacitly assume that no one is going to bother to read code after it is first written, and adjust accordingly.

For a contrast, this is something Ocaml got right in a major way. Ocaml is an obscure “niche” language, but it has the highest average quality of programmers that I’ve ever seen (even higher than Haskell and Lisp, although those are close). I don’t believe the reason for this is that only good developers can use Ocaml. Instead, what Ocaml achieves is that it makes it a joy to read average-case code– no small feat. Pattern matching, a core feature of the language, is explicitly designed to make what would otherwise be complex control flows human-readable. Haskell is an excellent language as well, and extremely terse, but in my belief it’s optimized (more than Ocaml) for writers of code (although still far better, from a reader’s perspective, than Java). I would guess that the ML family of languages (which are elegantly simple) are the only languages on earth that go so far to make almost all code readable, even in large systems. Of course, it’s still absolutely possible to write horrible, illegible Ocaml code– the language puts up more of a fight against bad practices than most, but it can be done. The difference, relevant for economic rather than purist discussions, is that average-case Ocaml code is attractive, whereas even very good Java code looks only 20 percent less ugly than typical “bad” Java code.

In a language like Ocaml, there’s so little boilerplate and accidental complexity that one can look at the code and actually see the problem being solved. For larger systems, one can test one’s intuitions at the REPL. No one needs to rifle through 300-page design documents to understand what a well-written Ocaml program does. The consequence of this is that Ocaml has libraries of generally very clean code that people can read as they learn the language. Since it’s not an unpleasant process to read code, they do so, and they grow as programmers at a rate that would be unheard-of in Java or C++: rising from 0.8 to 1.5 in about two to three years is typical. Ocaml isn’t some “hard” language that only 1.5+ programmers have a chance of understanding. It’s a language that turns ordinary programmers into 1.5ers rapidly.

There’s one language that can be cited as a counterexample to the “REPL or Fail” rule, and that’s C. C, invented in the 1970s, doesn’t have a REPL. Why was this acceptable for C? First, the language grew up in a different time, when small programs (that would today be replaced by “scripts” unless performance were an issue) were the norm. Programmers could “grow up” on C in 1985 because the programs they’d be reading were small and had well-defined semantics. Second, to say that “C lacks a REPL” is a bit strict. It doesn’t have a language-native REPL, but the Unix/C environment does have a (rudimentary, but sufficient) REPL: the command-line console. This was the environment in which C programs were run and explored: first you run wc and cat to see what they do, and then you could look at their C code and discover how they do it. C was designed with a “small-program” model of development (because large, megalithic programs were simply untenable in 1975) in mind. If complex behaviors were desired, they could be established by composing independent C programs and having them communicate through pipes and sockets. In this world, one could read “a whole C program” (a small, independent module, usually in one file) in one sitting. One only needed a REPL (command-line console) to understand the bigger environment: Unix.

How’d we end up with these disengaging, REPL-less languages? As I said; speaking superficially and strictly, C has no REPL. This was not a problem for C because large programs were so rarely written in it, and enough small, well-written C programs were distributed in every Linux environment that a programmer could learn the language from those. Where C++ differs is that large, complex, and monolithic programs are written in it, because the language has just enough in the way of high-level support to let people attempt them. The result is that C++ supports beasts of complexity (such as 200-line functions, 1000-line class definitions, and 1-million-line whole programs spanning several directories) that would be unconscionable in C, and yet fails to provide the one tool that might enable a programmer to make sense of such things. Although writing a C++ REPL is possible, it wouldn’t be easy: the language is so deeply imperative and crystalline that overcoming the mismatch between the two models of programming would be a monumental task. Java, as a descendant of C++ in syntax and culture, inherited most of these illnesses from it while becoming the default language for enterprise programming, and was also launched without a REPL. The result is that millions of people are stuck in a REPL-less language and don’t know why, while hacking on monolithic projects of intractable complexity that are doomed to get worse over time.

The REPL isn’t just a tool. It’s an engaging classroom in which one learns how to be a programmer. It’s absolutely necessary for a person assigned a task that involves comprehending a complex piece of code. And unlike the training wheels of an IDE, it doesn’t attempt to hide “difficult” details from the developer; it allows her to explore them to arbitrary depth when she is ready.

For these reasons, the interactive mode can’t be considered a luxury of those who are privileged enough to work in “elite” languages. There’s no reason programming should be that way. If we want to democratize programming (and there’s no reason we can’t have at least ten times as many 1.5+ programmers as are alive now, and 10x is a conservative goal; considering the world population) we need to begin orienting ourselves toward modern languages. And there is one rule that seems more fundamental than any argument about static vs. dynamic typing or imperative vs. functional programming: REPL or fail.

How any software company can cross “The Developer Divide”

Nir Eyal wrote a blog post, The Developer Divide: When Great Companies Can’t Hire, on this conundrum: there are a lot of excellent technology companies that haven’t managed to attract the brightest (and generally pickiest) engineers in sufficient numbers to hire as fast as they grow, a problem that forces a company either to lower its hiring standards or curtail growth, neither of which is desirable.

What makes this problem especially hard is that it’s not about money. In marketing terminology, it’s a problem of “reach”. Many great startups, even if they offered $200,000 per year, would simply be unable to hire 25 elite (top 5%) programmers in a year. Finding one per quarter is pretty good. Compounding this difficulty is the fact that the best software engineers’ job searches are infrequent and short. They usually find jobs through social connections and word-of-mouth before they start officially “looking”. They almost never cold-spam their CVs. Therefore, a company that’s consistently hiring top-5% programmers is probably extending offers to 1 applicant per several hundred CVs that makes its way to the hiring manager.

What is the Developer Divide? It comes down to this: it’s easier to get top developers if your product is something that top developers already use. Google was unequivocally the best search engine even in 2000, and nerds like new search engines, so it established itself as a desirable place to work. Facebook and Foursquare may be targeted toward “non-engineers” (i.e. mass market) but the products are used by enough of the people the firms want to hire as to generate name recognition that accumulates faster than the company needs to grow. That makes it very easy for them to attract so much talent they turn away more 5-percenters than they hire. But for a company whose product isn’t used already, every day, by top programmers, it becomes harder. Much harder. That’s unfortunate, because a lot of great businesses
(most, actually) need to start out doing something lower on the established-product-sexiness scale (enterprise work and products with well-defined markets, rather than speculative “social” projects) before branching out into projects of more general interest. These companies might become sexy in the future, but sometimes the first clientele has to be an unsexy but reliable one, like large corporations or suburban soccer moms.

So, how do these great companies continue to find great talent? I’m going to state one simple (but not easy) thing that any software company can do to increase its attractiveness to top engineers by at least 3 binary orders of magnitude. Here it is: ditch Java/C++ and use a decent programming language.

What’s a decent programming language, for this purpose? It’s one that a 5-percenter would use for a personal side project, or if she were calling the shots. That language might Scala, Python, or Ocaml. It might (for a project such as an operating system) be C. It would amost certainly not, in 2012, be Java or C++.

A clarification must be made, because it’s confusing to people outside of technology: C++ is not a substitute, nor  an improvement on, C. In fact, C is great for programs where low-level concerns are critical in producing a quality system: operating systems, device drivers, hard real-time, and runtime environments for the high-level garbage-collected languages we all love (such as Ocaml and Haskell). Much of the world is built on C. It’s an excellent and immensely successful mid-level language,  as opposed to C++ which is a miserably failed attempt at a C-like high-level language. So do not confuse the two.

If you ask a 5-percenter for his opinions on C and C++, he’ll probably praise the former while trashing the latter. From a non-technical business perspective, this might seem inconsistent, given that C is a proper subset of C++. “Anything you can do in C, you can do in C++.” That makes C++ “strictly better”, right? Well, no. A 5-percenter has the professional maturity to realize that he or she will not be coding in a vacuum, and that reading code is as important an act as writing it, and that therefore adding ill-considered features to a good language doesn’t improve it, but ruins it if people are stupid enough to actually use them.

I’ll stop talking about C++, because it’s not even relevant to most startups. Startups can’t afford it, because they need to accomplish big things with small teams and C++ doesn’t make it possible. C++ is mostly that skulking monster that lives in the bowels of legacy systems at banks, something you expect to fight in the 2300 AD (post-apocalyptic) world of Chrono Trigger.

Java’s problem is somewhat different and new. C++ is a bad language because it was poorly designed, but it was at least designed for good programmers (and it’s disastrous when used by inept ones). Java, in contrast, was explicitly designed to favor the interests of massive (100+) teams of mediocre developers over excellent individual contributors, whom it slows down to a small fraction of their typical speed. It came out of a failed experiment to create a home for low-skill “commodity” developers who haven’t learned a new skill since college and who need to consult the One Smart Person on the team if (God forbid) their RAM-munching IDE breaks. Java has succeeding in making commodity developers marginally effective (as opposed to negatively productive, as they are in more powerful languages) but at the expense of hobbling the best, forcing them to endure ugliness (inherent to a language designed to be used exclusively through IDEs, while 5-percenters overwhelmingly prefer the command-line interface and “classic” editors like vim and emacs) and accidental complexity. Needless to say, 5-percenters despise Java. More correctly, they despise the Java language.  (I emphasize “the language” because the Java Virtual Machine itself is a pretty powerful tool, and because there are superior languages– Scala and Clojure coming to mind– that also run on the JVM.)

If you want to build a large team of 50+ “commodity” developers, content to maintain legacy code or work on mind-numbingly boring stuff, use Java or C++. If you’re a startup, you need to build a small team of excellent developers, so use something else. If you want to hire 5-percenters now but might need to hire Java jockeys later, strongly consider Scala and Clojure, which are highly powerful languages but run in the JVM environment.

Why is this so important? It’s not just about the language. It’s about signaling. As a startup, you have to show that you get it, and that the opportunity you offer isn’t Yet Another Java Job. You can get 5-percenters to use C++ or Java (Google has made this happen, and so have many investment banks, which have huge legacy codebases in C++) but you pretty much have to be a big-name company to pull this off, and you can expect to shell out an obscene of money– a 50-100 percent markup to account for the negatives of maintaining code in a terrible language, and the career stagnation this kind of work invites. If you want a 5-percenter to work 60 hours per week for $100,000 per year, use a great language. If you want him to work 9-to-5, with two-hour lunches and personal errands deducted from the workday, for $250,000 per year, then Java and C++ are options.

More than anything else, 5-percenters want to work with other 5-percenters. This is far more important to them than prestige or money (the reason 5-percenters default to rich, prestigious companies when nothing more interesting crosses their transom is because these companies have other 5-percenters). It is, moreover, even more important than programming language selection itself. Language selection, as I’ve said, an objective mechanism of signaling. This is what creative writing calls the “show, don’t tell” principle (don’t say “Eric was honorable”; have him do something honorable). Every company says (“telling”) it has world-class talent, but language choice is an objective decision that shows that a company is, at the very least, interested in hiring 5-percenters. For that reason, there’s a good chance that it employs some.

To reiterate, because I don’t want a flame war, is it possible to find 5-percenters who will write Java and C++ on a full-time basis? Absolutely, and if you want to compete with Goldman Sachs on salary, you might be able to hire one. Will they take on the risk of working for $5,000 (pre-tax) per month at a risky, seven-person startup in these languages? Not a chance. Five-percenters tolerate C++ jobs at Google because they know that company’s existence doesn’t rely on their individual productivity. In a startup, individual productivity is an existential concern, and the Aspergerian “pathological honesty” that top programmers almost invariably have precludes them from working in low-productivity languages that they believe will retard and destroy their employer.

I’ve said enough about the awfulness of C++ and Java. What are some good languages? I’ll give a list, which is not at all inclusive, of highly-powerful languages that the best developers love. Ocaml, Haskell, Erlang, Scala, Lisp, Clojure. Less strong but still formidable are Python and Ruby. (Many 5-percenters love Python, and almost all will tolerate it, but the median Python programmer is closer to a “10-percenter”.) Why is it this way? First, great developers program and learn technology in their spare time, and a 1-person, 15-hour-per-week project must be written in a real language if it is going to amount to anything. Second, most of these languages are only used by the best employers and only taught by the best universities, so a person deeply familiar with one of them is either (a) coming from elite exposure, or (b) possessive of enough individual curiosity to indicate a high likelihood of skill and success as a computer programmer. People who want to become great programmers quickly discover languages in which it’s possible for them to be 10 times as productive as they would be in Java– and they never look back.

Are all programmers in great languages 5-percenters? The answer is no. Users of languages like Ocaml and Scala tend to fall into two categories: (a) the 5-percenters, and (b) those who are becoming 5-percenters. Not all of them are there yet, but they’re almost all improving at a rapid pace. I’ve worked for almost 4 years in JVM languages (Java, Scala, and Clojure) and what I’ve learned (perhaps astonishingly) is that, per unit time, the Scala and Clojure developers learn about Java faster than those using Java! Because they are in high-productivity languages, they can accomplish more, and because they’re achieving more, they’re learning more along the way.

From a practical standpoint: with so many great languages to choose from, which one of those awesome languages should a person pick? CTOs making this decision have some idea, but this would be an impossible decision for a non-technical CEO to make on direct experience. For a very small team, the answer is easy: whatever the best programmers want to use. I like Scala much better than Python, but if I were in a non-coding role and tasked with hiring a great programmer, and if she preferred Python, I’d have her use that, because the benefit of letting her use the language she thought was best for the job outweighs (for a small team and with no maintenance burden) any benefit conferred by using one powerful language over another. So my answer to the language-selection question to a non-programmer CEO is: ask your best developer.

For my part, I’d recommend Scala. It’s not the best language for all purposes, but it’s a great general-purpose language and (among mainstream languages) may be best choice overall for most purposes. Because it runs in the Java Virtual Machine (JVM) it has full access to all of Java’s assets. (Clojure, an excellent JVM lisp, has the same advantage, but is not as performant and does not have static typing, a feature I find invaluable.)

A person versed in economics might find my argument tenuous. If a set of “elite” languages can be used by programmers and employers as a signal of high competency, what’s to stop the less competent from “faking” this signal once they catch on to the fact that it exists? A few answers come to mind. The first is that programming in a high-power language requires an adjustment that mediocre, 9-to-5, programmers are not likely to want to make. From first principles, functional programming isn’t intellectually harder than object-oriented programming: a 120 IQ is more than enough. It’s actually simpler. (Doing object-oriented programming correctly, and rigorously understanding what one is doing with it, is much harder and much more intellectually complex than succeeding in functional programming; this is a rant for another time, but 99% of people doing “object-oriented-programming” are like crude teenagers with regard to sex– loud about it, but doing it badly.) Nonetheless, the intellectual difficulty of re-learning software on sound principles is not an easy one to make. Like the transitions from memorization to pattern recognition, and then from pattern recognition to rigorous proof in mathematics, these context-switches require a lot of work (months of serious study). Successfully learning Scala or Clojure is a sign of a very strong work ethic. (It goes without saying that your interview process should establish that the candidates actually know these languages, and aren’t playing “buzzword bingo”.)

What about mediocre businesses using language selection as a false signal? That’s even more unlikely. A recruiter (for an elite startup, currently at less than 20 people) I spoke to told me that about 70% of Clojure candidates, 40% of Python candidates, and 5% of Java candidates that he invites to an in-office, full-day interview are good enough to hire. For a startup, interviews are far more expensive (in terms of opportunity cost) than for large companies: it costs about $100 to run a technical phone screen and $1000 to conduct a full-cycle interview, because startups have to involve senior people in their interview process in order to assess quality. Put another way, this means that it costs $1429 to hire a 5-percenter in Clojure and $2500 to hire one in Python– chump change as far as recruiting expenses go. But it costs $20,000 to hire a Java developer, if you insist on the “5-percenter” standard of quality.

There’s one variable I haven’t mentioned, though, and that’s the number of CVs he gets in each language: several hundred times as many Java developers than developers in Clojure or Ocaml. Most companies need (or think they need) warm bodies in large numbers to maintain legacy horrors, not top-talent and the attitude that comes with it. Also, it’s awful that it’s this way, and I think it will change in the future, with the limiting factor being work ethic and engagement rather than innate ability, but the software world is a pyramid, with a few stars at the pinnacle and a large number of incompetents at the base. This holds for programmers and for programmer jobs, of which 90 to 95 are mind-numbingly boring. It’s also seen in the distribution of language preference (and I say “preference” because there are tens of thousands of excellent programmers writing C++ and Java right now, but very few prefer them). The result of this is that the mediocre languages have the most programmers. If a company needs to hire 200 programmers per month, it simply cannot choose Haskell as its main development language; within a couple years, it would have absorbed the entire Haskell community!

Historically, that has been a serious concern for companies when it comes to language selection. Because there are hundreds of times more Java developers than Haskell or Ocaml hackers, there are at least fives of times more half-decent ones. Thus, the worst languages paradoxically have the strongest library support. This is compounded by the fact that powerful tools (such as IDEs, which are a mixed bag of neatness and horror, but sometimes quite useful and outright required when developing in Java) must be written to compensate the shortcomings of hobbled languages. There’s no Lisp IDE because emacs does just fine, but there are a slew of Java IDEs because it’s a revolting experience to write Java without one. From a non-technical CEO’s perspective, this makes Java look better because it has the best supporting tools.

What all this means is that mediocre software shops are not going to switch over to Haskell in order to ape this signaling mechanism. They can’t, because the bottom contingent of their software staff with drop like flies, and because their leadership is unlikely to understand the language selection problem in the first place. There might be some unestablished software companies using elite languages, and of them, I make the same argument that I’d make of “15-percenters” who nonetheless show interest and competence in “elite” languages– they may not be 5-percenters now, but if they keep at it, they will be shortly!

What about the (admitted) shortcomings of elite languages? Except for Scala and Clojure, which have access to the JVM and interoperate cleanly with Java, these languages don’t have the breadth of tooling that C++ and Java do. The answer to that is almost stereotypically “hackerish”: write them! This effort is not wasted, not in the least. Writing high-quality open-source tools to support elite languages is one of the best things a software company can do to establish its reputation. This, again, is an opportunity to show, not tell.

For a small company attempting to define and establish itself, attracting top talent is hard. The best programmers are on the job market so rarely and for such short intervals that attracting them takes concerted effort. Growing companies do not have the name recognition, and cannot afford the immense salaries (over $250,000 for a senior developer) that would attract these “5-percenters” using economic means, so they must win on technical grounds. Language selection is a simple (but not easy, because it’s difficult to adopt new and dramatically more powerful tools) way to do so. The best languages (Ocaml, Scala, Haskell, Python) are “shibboleths” that elite programmers use to identify each other, so adopting one of them is a one-stop choice that instantly establishes “hacker cred” or, to use Nir Eyal’s terminology, bridges “The Developer Divide”.

Object Disoriented Programming

It is my belief that what is now called “object-oriented programming” (OOP) is going to go down in history as one of the worst programming fads of all time, one that has wrecked countless codebases and burned millions of hours of engineering time worldwide. Though a superficially appealing concept– programming in a manner that is comprehensible the “big picture” level– it fails to deliver on this promise, it usually fails to improve engineer productivity, and it often leads to unmaintainable, ugly, and even nonsensical code.

I’m not going to claim that object-oriented programming is never useful. There would be two problems with such a claim. First, OOP means many different things to different people– there’s a parsec of daylight between Smalltalk’s approach to OOP and the abysmal horrors currently seen in Java. Second, there are many niches within programming with which I’m unfamiliar and it would be arrogant to claim that OOP is useless in all of them. Almost certainly, there are problems for which the object-oriented approach is one of the more useful. Instead, I’ll make a weaker claim but with full confidence: as the default means of abstraction, as in C++ and Java, object orientation is a disastrous choice.

What’s wrong with it?

The first problem with object-oriented programming is mutable state. Although I’m a major proponent of functional programming, I don’t intend to imply that mutable state is uniformly bad. On the contrary, it’s often good. There are a not-small number of programming scenarios where mutable state is the best available abstraction. But it needs to be handled with extreme caution, because it makes code far more difficult to reason about than if it is purely functional. A well-designed and practical language generally will allow mutable state, but encourages it to be segregated only into places where it is necessary. A supreme example of this is Haskell, where any function with side effects reflects the fact in its type signature. On the contrary, modern OOP encourages the promiscuous distribution of mutable state, to such a degree that difficult-to-reason-about programs are not the exceptional rarity but the norm. Eventually, the code becomes outright incomprehensible– to paraphrase Boromir, “one does not simply read the source code”– and even good programmers (unknowingly) damage the codebase as they modify it, adding complexity without full comprehension. These programs fall into an understood-by-no-one state of limbo and become nearly impossible to debug or analyze: the execution state of a program might live in thousands of different objects!

Object-oriented programming’s second failing is that it encourages spaghetti code. For an example, let’s say that I’m implementing the card game, Hearts. To represent cards in the deck, I create a Card object, with two attributes: rank and suit, both of some sort of discrete type (integer, enumeration). This is a struct in C or a record in Ocaml or a data object in Java. So far, no foul. I’ve represented a card exactly how it should be represented. Later on, to represent each player’s hand, I have a Hand object that is essentially a wrapper around an array of cards, and a Deck object that contains the cards before they are dealt. Nothing too perverted here.

In Hearts, the person with the 2 of clubs leads first, so I might want to determine in whose hand that card is. Ooh! A “clever” optimization draws near! Obviously it is inefficient to check each Hand for the 2 of clubs. So I add a field, hand, to each Card that is set when the card enters or leaves a player’s Hand. This means that every time a Card moves (from one Hand to another, into or out of a Hand) I have to touch the pointer– I’ve just introduced more room for bugs. This field’s type is a Hand pointer (Hand* in C++, just Hand in Java). Since the Card might not be in a Hand, it can be null sometimes, and one has to check for nullness whenever using this field as well. So far, so bad. Notice the circular relationship I’ve now created between the Card and Hand classes.

It gets worse. Later, I add a picture attribute to the Card class, so that each Card is coupled with the name of an image file representing its on-screen appearance, and ten or twelve various methods for the number of ways I might wish to display a Card. Moreover, it becomes clear that my specification regarding a Card’s location in the game (either in a Hand or not in a Hand) was too weak. If a Card is not in a Hand, it might also be on the table (just played to a trick), in the deck, or out of the round (having been played). So I rename the hand attribute, place, and change its type to Location, from which Hand and Deck and PlaceOnTable all inherit.

This is ugly, and getting incomprehensible quickly. Consider the reaction of someone who has to maintain this code in the future. What the hell is a Location? From its name, it could be (a) a geographical location, (b) a position in a file, (c) the unique ID of a record in a database, (d) an IP address or port number or, what it actually is, (e) the Card’s location in the game. From the maintainer’s point of view, really getting to the bottom of Location requires understanding Hand, Deck, and PlaceOnTable, which may reside in different files, modules, or even directories. It’s just a mess. Worse yet, in such code the “broken window” behavior starts to set in. Now that the code is bad, those who have to modify it are tempted to do so in the easiest (but often kludgey) way. Kludges multiply and, before long, what should have been a two-field immutable record (Card) has 23 attributes and no one remembers what they all do.

To finish this example, let’s assume that the computer player for this Hearts game contains some very complicated AI, and I’m investigating a bug in the decision-making algorithms. To do this, I need to be able to generate game states as I desire as test cases. Constructing a game state requires that I construct Cards. If Card were left as it should be– a two-field record type– this would be a very easy thing to do. Unfortunately, Card now has so many fields, and it’s not clear which can be omitted or given “mock” values, that constructing one intelligently is no longer possible. Will failing to populate the seemingly irrelevant attributes (like picture, which is presumably connected to graphics and not the internal logic of the game) compromise the validity of my test cases? Hell if I know. At this point, reading, modifying, and testing code becomes more about guesswork than anything sound or principled.

Clearly, this is a contrived example, and I can imagine the defenders of object-oriented programming responding with the counterargument, “But I would never write code that way! I’d design the program intelligently in advance.” To that I say: right, for a small project like a Hearts game; wrong, for real-world, complex software developed in the professional world. What I described  is certainly not how a single intelligent programmer would code a card game; it is indicative of how software tends to evolve in the real world, with multiple developers involved. Hearts, of course, is a closed system: a game with well-defined rules that isn’t going to change much in the next 6 months. It’s therefore possible to design a Hearts program intelligently from the start and avoid the object-oriented pitfalls I intentionally fell into in this example. But for most real-world software, requirements change and the code is often touched by a number of people with widely varying levels of competence, some of whom barely know what they’re doing, if at all. The morass I described is what object-oriented code devolves into as the number of lines of code and, more importantly, the number of hands, increases. It’s virtually inevitable.

One note about this is that object-oriented programming tends to be top-down, with types being subtypes of Object. What this means is that data is often vaguely defined, semantically speaking. Did you know that the integer 5 doubles as a DessertToppingFactoryImpl? I sure didn’t. An alternative and usually superior mode of specification is bottom-up, as seen in languages like Ocaml and Haskell. These languages offer simple base types and encourage the user to build more complex types from them. If you’re unsure what a Person is, you can read the code and discover that it has a name field, which is a string, and a birthday field, which is a Date. If you’re unsure what a Date is, you can read the code and discover that it’s a record of three integers, labelled year, month, and day. If you want to get “to the bottom” of a datatype or function when types are built from the bottom-up, you can do so, and it rarely involves pinging across so many (possibly semi-irrelevant) abstractions and files as to shatter one’s “flow”. Circular dependencies are very rare in bottom-up languages. Recursion, in languages like ML, can exist both in datatypes and functions, but it’s hard to cross modules with it or create such obscene indirection as to make comprehension enormously difficult. By contrast, it’s not uncommon to find circular dependencies in object-oriented code. In the atrocious example I gave above, Hand depends on Card depends on Location inherits from Hand.

Why does OOP devolve?

Above, I described the consequences of undisciplined object-oriented programming. In limited doses, object-oriented programming is not so terrible. Neither, for that matter, is the much hated “goto” statement. Both of these are tolerable when used in extremely disciplined ways with reasonable and self-evident intentions. Yet when used by any but the most disciplined programmers, OOP devolves into a mess. This is hilarious in the context of OOP’s original promise to business types in the 1990s– that it wound enable mediocre programmers to be productive. What it actually did is create a coding environment in which mediocre programmers (and rushed or indisposed good ones) are negatively productive. It’s true that terrible code is possible in any language or programming paradigm; what makes object orientation such a terrible default abstraction is that, as with unstructured programming, bad code is an asymptotic inevitability as an object-oriented program grows. In order to discuss why this occurs, it’s necessary to discuss object orientation from a more academic perspective, and pose a question to which thousands of answers have been given.

What’s an object? 

On first approximation, one can think of an object as something that receives messages and performs actions, which usually include returning data to the sender of the message. Unlike a pure function, the response to each message is allowed to vary. In fact, it’s often required to do so. The object often contains state that is (by design) not directly accessible, but only observable by sending messages to the object. In this light, the object can be compared to a remote-procedure call (RPC) server. Its innards are hidden, possibly inaccessible, and this is generally a good thing in the context of, for example, a web service. When I connect to a website, I don’t care in the least about the state of its thread pooling mechanisms. I don’t want to know about that stuff, and I shouldn’t be allowed access to it. Nor do I care what sorting algorithm an email client uses to sort my email, as long as I get the right results. On the other hand, in the context of code for which one is (or, at least, might be in the future) responsible for comprehending the internals, such incomplete comprehension is a very bad thing.

To “What is an object?” the answer I would give is that one should think of it as a miniature RPC server. It’s not actually remote, nor as complex internally as a real RPC or web server, but it can be thought of this way in terms of its (intentional) opacity. This shines light on whether object-oriented programming is “bad”, and the question of when to use objects. Are RPC servers invariably bad? Of course not. On the other hand, would anyone in his right mind code Hearts in such a way that each Card were its own RPC server? No. That would be insane. If people treated object topologies with the same care as network topologies, a lot of horrible things that have been done to code in the name of OOP might never have occurred.

Alan Kay, the inventor of Smalltalk and the original conception of “object-oriented programming”, has argued that the failure of what passes for OOP in modern software is that objects are too small and that there are too many of them. Originally, object-oriented programming was intended to involve large objects that encapsulated state behind interfaces that were easier to understand than the potentially complicated implementations. In that context, OOP as originally defined is quite powerful and good; even non-OOP languages have adopted that virtue (also known as encapsulation) in the form of modules.

Still, the RPC-server metaphor for “What is an Object?” is not quite right, and the philosophical notion of “object” is deeper. An object, in software engineering, should be seen as a thing which the user is allowed to have (and often supposed to have) incomplete knowledge. Incomplete knowledge isn’t always a bad thing at all; often, it’s an outright necessity due to the complexity of the system. For example, SQL is a language in which the user specifies an ad-hoc query to be run against a database with no indication of what algorithm to use; the database system figures that out. For this particular application, incomplete knowledge is beneficial; it would be ridiculous to burden everyone who wants to use a database with the immense complexity of its internals.

Object-orientation is the programming paradigm based on incomplete knowledge. Its purpose is to enable computation with data of which the details are not fully known. In a way, this is concordant with the English use of the word “object” as a synonym for “thing”: it’s an item of which one’s knowledge is incomplete. “What is that thing on the table?” “I don’t know, some weird object.” Object-oriented programming is designed to allow people to work with things they don’t fully understand, and even modify them in spite of incomplete comprehension of it. Sometimes that’s useful or necessary, because complete knowledge of a complex program can be humanly impossible. Unfortunately, over time the over-tolerance of incomplete knowledge leads to an environment where important components can elude the knowledge of each individual responsible for creating them; the knowledge is strewn about many minds haphazardly.

Modularity

Probably the most important predictor of whether a codebase will remain comprehensible as it becomes large is whether it’s modular. Are the components individually comprehensible, or do they form an irreducibly complex tangle of which it is required to understand all of it (which may not even be possible) before one can understand any of it? In the latter case, software quality grinds to a halt, or even backslides, as the size of the codebase increases. In terms of modularity, the object oriented paradigm generally performs poorly, facilitating the haphazard growth of codebases in which answering simple questions like “How do I create and use a Foo object?” can require days-long forensic capers.

The truth about “power”

Often, people describe programming techniques and tools as “powerful”, and that’s taken to be an endorsement. A counterintuitive and dirty secret about software engineering “power” is not always a good thing. For a “hacker”– a person writing “one-off” code that is unlikely to ever be require future reading by anyone, including the author– all powerful abstractions, because they save time, can be considered good. However, in the more general software engineering context, where any code written is likely to require maintenance and future comprehension, power can be bad. For example, macros in languages like C and Lisp are immensely powerful. Yet it’s obnoxiously easy to write incomprehensible code using these features.

Objects are, likewise, immensely powerful (or “heavyweight”) beasts when features like inheritance, dynamic method dispatch, open recursion, et cetera are considered. If nothing else, one notes that objects can do anything that pure functions can do– and more. The notion of “object” is both a very powerful and a very vague abstraction.

“Hackers” like power, in which case a language can be judged based on the power of the abstractions it offers. But real-world software engineers spend an unpleasantly large amount of time reading and maintaining others’ code.  From an engineer’s perspective, a language is good based on what it prevents other programmers from doing to us, those of us who have to maintain their code in the future. In this light, the unrestrained use of Lisp macros and object-oriented programming is bad, bad, bad. From this perspective, a language like Ocaml or Haskell– of middling power but beautifully designed to encourage deployment of the right abstractions– is far better than a more “powerful” one like Ruby.

As an aside, a deep problem in programming language design is that far too many languages are designed with the interests of code writers foremost in mind. And it’s quite enjoyable, from a writer’s perspective, to use esoteric metaprogramming features and complex object patterns. Yet very few languages are designed to provide a beautiful experience for readers of code. In my experience, ML does this best, and Haskell does it well, while most of the mainstream languages fall short of being even satisfactory. In most real-world software environments, reading code is so unpleasant that it hardly gets done at all with any detail. Object-oriented programming, and the haphazard monstrosities its “powerful” abstractions enable, is a major culprit.

Solution?

The truth, I think, about object-oriented programming is that most of its core concepts– inheritance, easy extensibility of data, proliferation of state– should be treated with the same caution and respect that a humble and intelligent programmer gives to mutable state. These abstractions can be powerful and work beautifully in the right circumstances, but they should be used very sparingly and only by people who understand what they are actually doing. In Lisp, it is generally held to be good practice never to write a macro when a function will do. I would argue the same with regard to object-oriented programming: never write an object-oriented program for a problem where a functional or cleanly imperative approach will suffice.  Certainly, to make object orientation the default means of abstraction, as C++ and Java have, is a proven disaster.

Abstractions, and especially powerful ones, aren’t always good. Using the right abstractions is of utmost importance. Abstractions that are too vague, for example, merely clutter code with useless nouns. As the first means of abstraction in high-level languages, higher-order functions suffice most of the time– probably over 95% of the time, in well-factored code. Objects may come into favor for being more general than higher-order functions, but “more general” also means less specific, and for the purpose of code comprehension, this is a hindrance, not a feature. If cleaner and more comprehensible pure functions and algebraic data types can be used in well over 90 percent of the places where objects appear in OO languages, they should be used in lieu of objects, and they should be supported by languages– which C++ and Java don’t do.

In a better world, programmers would be required to learn how to use functions before progressing to objects, and object-oriented features would hold the status of being available but deployed only when needed, or in the rare cases where such features make remarkably better code. To start, this change needs to come about at a language level. Instead of Java or C++ being the first languages to which most programmers are introduced, that status should be shifted to a language like Scheme, Haskell, or my personal favorite for this purpose: Ocaml.