Learning C, reducing fear.

I have a confession to make. At one point in my career, I was a mediocre programmer. I might say that I still am, only in the context of being a harsh grader. I developed a scale for software engineering for which I can only, in intellectual honesty, assign myself 1.8 points out of a possible 3.0. One of the signs of my mediocrity is that I haven’t a clue about many low-level programming details that, thirty years ago, people dealt with on a regular basis. I know what L1 and L2 cache are, but I haven’t built the skill set yet to make use of this knowledge.

I love high-level languages like Scala, Clojure, and Haskell. The abstractions they provide make programming more productive and fun than it is in a language like Java and C++, and the languages have a beauty that I appreciate as a designer and mathematician. Yet, there is still quite a place for C in this world. Last July, I wrote an essay, “Six Languages to Master“, in which I advised young programmers to learn the following languages:

  • Python, because one can get started quickly and Python is a good all-purpose language.
  • C, because there are large sections of computer science that are inaccessible if you don’t understand low-level details like memory management.
  • ML, to learn taste in a simple language often described as a “functional C” that also teaches how to use type systems to make powerful guarantees about programs.
  • Clojure, because learning about language (which is important if one wants to design good interfaces) is best done with a Lisp and because, for better for worse, the Java libraries are a part of our world.
  • Scala, because it’s badass if used by people with a deep understanding of type systems, functional programming, and the few (very rare) occasions where object-oriented programming is appropriate. (It can be, however, horrid if wielded by “Java-in-Scala” programmers.)
  • English (or the natural language of one’s environment) because if you can’t teach other people how to use the assets you create, you’re not doing a very good job.

Of these, C was my weakest at the time. It still is. Now, I’m taking some time to learn it. Why? There are two reasons for this.

  • Transferability. Scala’s great, but I have no idea if it will be around in 10 years. If the Java-in-Scala crowd adopts the language without upgrading its skills and the language becomes associated with Maven, XMHell, IDE culture, and commodity programmers, in the way that Java has, the result will be piles of terrible Scala code that will brand the language as “write-only” and damage its reputation for reasons that are not Scala’s fault. These sociological variables I cannot predict. I do, however, know that C will be in use in 10 years. I don’t mind learning new languages– it’s fun and I can do it quickly– but the upshot of C is that, if I know it, I will be able to make immediate technical contributions in almost any programming environment. I’m already fluent in about ten languages; might as well add C. 
  • Confidence. High-level languages are great, but if you develop the attitude that low-level languages are “unsafe”, ugly, and generally terrifying, then you’re hobbling yourself for no reason. C has its warts, and there are many applications where it’s not appropriate. It requires attention to details (array bounds, memory management) that high-level languages handle automatically. The issue is that, in engineering, anything can break down, and you may be required to solve problems in the depths of detail. Your beautiful Clojure program might have a performance problem in production because of an issue with the JVM. You might need to dig deep and figure it out. That doesn’t mean you shouldn’t use Clojure. However, if you’re scared of C, you can’t study the JVM internals or performance considerations, because a lot of the core concepts (e.g. memory allocation) become a “black box”. Nor will you be able to understand your operating system.

For me, personally, the confidence issue is the important one. In the functional programming community, we often develop an attitude that the imperative way of doing things is ugly, unsafe, wrong, and best left to “experts only” (which is ironic, because most of us are well into the top 5% of programmers, and more equipped to handle complexity than most; it’s this adeptness that makes us aware of our own limitations and prefer functional safeguards when possible). Or, I should not say that this is a prevailing attitude, so much as an artifact of communication. Fifty-year-old, brilliant functional programmers talk about how great it is to be liberated from evils like malloc and free. They’re right, for applications where high-level programming is appropriate. The context being missed is that they have already learned about memory management quite thoroughly, and now it’s an annoyance to them to keep having to do it. That’s why they love languages like Ocaml and Python. It’s not that low-level languages are dirty or unsafe or even “un-fun”, but that high-level languages are just much better suited to certain classes of problems.

Becoming the mentor

I’m going to make an aside that has nothing to do with C. What is the best predictor of whether someone will remain at a company for more than 3 years? Mentorship. Everyone wants “a Mentor” who will take care of his career by providing interesting work, freedom from politics, necessary introductions, and well-designed learning exercises instead of just-get-it-done grunt work. That’s what we see in the movies: the plucky 25-year-old is picked up by the “star” trader, journalist, or executive and, over 97 long minutes, his or her career is made. Often this relationship goes horribly wrong in film, as in Wall Street, wherein the mentor and protege end up in a nasty conflict. I won’t go so far as to call this entirely fictional, but it’s very rare. You can find mentors (plural) who will help you along as much as they can, and should always be looking for people interested in sharing knowledge and help, but you shouldn’t look for “The Mentor”. He doesn’t exist. People want to help those who are already self-mentoring. This is even more true in a world where few people stay at a job for more than 4 years.

I’ll turn 30 this year, and in Silicon Valley that would entitle me to a lawn and the right to tell people to get off of it, but I live in Manhattan so I’ll have to keep using the Internet as my virtual lawn. (Well, people just keep fucking being wrong. There are too many for one man to handle!) One of the most important lessons to learn is the importance of self-mentoring. Once you get out of school where people are paid to teach you stuff, people won’t help people who aren’t helping themselves. To a large degree, this means becoming the “Mentor” figure that one seeks. I think that’s what adulthood is. It’s when you realize that the age in which there were superior people at your beck and call to sort out your messes and tell you what to do is over. Children can be nasty to each other but there are always adults to make things right– to discipline those who break others’ toys, and replace what is broken. The terrifying thing about adulthood is the realization that there are no adults. This is a deep-seated need that the physical world won’t fill. There’s at least 10,000 recorded years of history that shows people gaining immense power by making “adults-over-adults” up, and using the purported existence of such creatures to arrogate political power, because most people are frankly terrified of the fact that, at least in the observable physical world and in this life, there is no such creature.

What could this have to do with C? Well, now I dive back into confessional mode. My longest job tenure (30 months!) was at a startup that seems to have disappeared after I left. I was working in Clojure, doing some beautiful technical work. This was in Clojure’s infancy, but the great thing about Lisps is that it’s easy to adapt the language to your needs. I wrote a multi-threaded debugger using dynamic binding (dangerous in production, but fine for debugging) that involved getting into the guts of Clojure, a test harness, an RPC client-server infrastructure, and a custom NoSQL graph-based database. The startup itself wasn’t well-managed, but the technical work itself was a lot of fun. Still, I remember a lot of conversations to the effect of, “When we get a real <X>”, where X might be “database guy” or “security expert” or “support team”. The attitude I allowed myself to fall into, when we were four people strong, was that a lot of the hard work would have to be done by someone more senior, someone better. We inaccurately believed that the scaling challenges would mandate this, when in fact, we didn’t scale at all because the startup didn’t launch.

Business idiots love real X’s. This is why startups frequently develop the social-climbing mentality (in the name of “scaling”) that makes internal promotion rare. The problem is that this “realness” is total fiction. People don’t graduate from Expert School and become experts. They build a knowledge base over time, often by going far outside of their comfort zones and trying things at which they might fail, and the only things that change are that the challenges get harder, or the failure rate goes down. As with the Mentor that many people wait for in vain, one doesn’t wait to “find a Real X” but becomes one. That’s the difference between a corporate developer and a real hacker. The former plays Minesweeper (or whatever Windows users do these days) and waits for an Expert to come from on high to fix his IDE when it breaks. The latter shows an actual interest in how computers really work, which requires diving into the netherworld of the command line interface.

That’s why I’m learning C. I’d prefer to spend much of my programming existence in high-level languages and not micromanaging details– although, this far, C has proven surprisingly fun– but I realize that these low-level concerns are extremely important and that if I want to understand things truly, I need a basic fluency in them. If you fear details, you don’t understand “the big picture”. The big picture is made up of details, after all. This is a way to keep the senescence of business FUD at bay– to not become That Executive who mandates hideous “best practices” Java, because Python and Scala are “too risky”.

Fear of databases? Of operating systems? Of “weird” languages like C and Assembly? Y’all fears get Zero Fucks from me.

IDE Culture vs. Unix philosophy

Even more of a hot topic than programming languages is the interactive development environment, or IDE. Personally, I’m not a huge fan of IDEs. As tools, standing alone, I have no problem with them. I’m a software libertarian: do whatever you want, as long as you don’t interfere with my work. However, here are some of the negatives that I’ve observed when IDEs become commonplace or required in a development environment:

  • the “four-wheel drive problem”. This refers to the fact that an unskilled off-road driver, with four-wheel drive, will still get stuck. The more dexterous vehicle will simply have him fail in a more inaccessible place. IDEs pay off when you have to maintain an otherwise unmanageable ball of other people’s terrible code. They make unusable code merely miserable. I don’t think there’s any controversy about this. The problem is that, by providing this power, then enable an activity of dubious value: continual development despite abysmal code quality, when improving or killing the bad code should be a code-red priority. IDEs can delay code-quality problems and defer macroscopic business effects, which is good for manageosaurs who like tight deadlines, but only makes the problem worse at the end stage. 
  • IDE-dependence. Coding practices that require developers to depend on a specific environment are unforgivable. This is true whether the environment is emacs, vi, or Eclipse. The problem with IDEs is that they’re more likely to push people toward doing things in a way that makes use of a different environment impossible. One pernicious example of this is in Java culture’s mutilation of the command-line way of doing things with singleton directories called “src” and “com”, but there are many that are deeper than that. Worse yet, IDEs enable the employment of programmers who don’t even know what build systems or even version control are. Those are things “some smart guy” worries about so the commodity programmer can crank out classes at his boss’s request.
  • spaghettification. I am a major supporter of the read-only IDE, preferably served over the web. I think that code navigation is necessary for anyone who needs to read code, whether it’s crappy corporate code or the best-in-class stuff we actually enjoy reading. When you see a name, you should be able to click on it and see where that name is defined. However, I’m pretty sure that, on balance, automated refactorings are a bad thing. Over time, the abstractions which can easily be “injected” into code using an IDE turn it into “everything is everywhere” spaghetti code. Without an IDE, the only way to do such work is to write a script to do it. There are two effects this has on the development process. One is that it takes time to make the change: maybe 30 minutes. That’s fine, because the conversation that should happen before a change that will affect everyone’s work should take longer than that. The second is that only adept programmers (who understand concepts like scripts and the command line) will be able to do it. That’s a good thing.
  • time spent keeping up the environment. Once a company decides on “One Environment” for development, usually an IDE with various in-house customizations, that IDE begins to accumulate plugins of varying quality. That environment usually has to be kept up, and that generates a lot of crappy work that nobody wants to do.

This is just a start on what’s wrong with IDE culture, but the core point is that it creates some bad code. So, I think I should make it clear that I don’t dislike IDEs. They’re tools that are sometimes useful. If you use an IDE but write good code, I have no problem with you. I can’t stand IDE culture, though, because I hate hate hate hate hate hate hate hate the bad code that it generates.

In my experience, software environments that rely heavily on IDEs tend to be those that produce terrible spaghetti code, “everything is everywhere” object-oriented messes, and other monstrosities that simply could not be written by a sole idiot. He had help. Automated refactorings that injected pointless abstractions? Despondency infarction frameworks? Despise patterns? Those are likely culprits.

In other news, I’m taking some time to learn C at a deeper level, because as I get more into machine learning, I’m realizing the importance of being able to reason about performance, which requires a full-stack knowledge of computing. Basic fluency in C, at a minimum, is requisite. I’m working through Zed Shaw’s Learn C the Hard Way, and he’s got some brilliant insights not only about C (on which I can’t evaluate whether his insights are brilliant) but about programming itself. In his preamble chapter, he makes a valid insight in his warning not to use an IDE for the learning process:

An IDE, or “Integrated Development Environment” will turn you stupid. They are the worst tools if you want to be a good programmer because they hide what’s going on from you, and your job is to know what’s going on. They are useful if you’re trying to get something done and the platform is designed around a particular IDE, but for learning to code C (and many other languages) they are pointless. […]
Sure, you can code pretty quickly, but you can only code in that one language on that one platform. This is why companies love selling them to you. They know you’re lazy, and since it only works on their platform they’ve got you locked in because you are lazy. The way you break the cycle is you suck it up and finally learn to code without an IDE. A plain editor, or a programmer’s editor like Vim or Emacs, makes you work with the code. It’s a little harder, but the end result is you can work with any code, on any computer, in any language, and you know what’s going on. (Emphasis mine.)

I disagree with him that IDEs will “turn you stupid”. Reliance on one prevents a programmer from ever turning smart, but I don’t see how such a tool would cause a degradation of a software engineer’s ability. Corporate coding (lots of maintenance work, low productivity, half the day lost to meetings, difficulty getting permission to do anything interesting, bad source code) does erode a person’s skills over time, but that can’t be blamed on the IDE itself. However, I think he makes a strong point. Most of the ardent IDE users are the one-language, one-environment commodity programmers who never improve, because they never learn what’s actually going on. Such people are terrible for software, and they should all either improve, or be fired.

The problem with IDEs is that each corporate development culture customizes the environment, to the point that the cushy, easy coding environment can’t be replicated at home. For someone like me, who doesn’t even like that type of environment, that’s no problem because I don’t need that shit in order to program. But someone steeped in cargo cult programming because he started in the wrong place is going to falsely assume that programming requires an IDE, having seen little else, and such novice programmers generally lack the skills necessary to set one up to look like the familiar corporate environment. Instead, he needs to start where every great programmer must learn some basic skills: at the command-line. Otherwise, you get a “programmer” who can’t program outside of a specific corporate context– in other words, a “5:01 developer” not by choice, but by a false understanding of what programming really is.

The worst thing about these superficially enriched corporate environments is their lack of documentation. With Unix and the command-line tools, there are man pages and how-to guides all over the Internet. This creates a culture of solving one’s own problems. Given enough time, you can answer your own questions. That’s where most of the growth happens: you don’t know how something works, you Google an error message, and you get a result. Most of the information coming back in indecipherable to a novice programmer, but with enough searching, the problem is solved, and a few things are learned, including answers to some questions that the novice didn’t yet have the insight (“unknown unknowns”) yet to ask. That knowledge isn’t built in a day, but it’s deep. That process doesn’t exist in an over-complex corporate environment, where the only way to move forward is to go and bug someone, and the time cost of any real learning process is at a level that most managers would consider unacceptable.

On this, I’ll crib from Zed Shaw yet again, in Chapter 3 of Learn C the Hard Way:

In the Extra Credit section of each exercise I may have you go find information on your own and figure things out. This is an important part of being a self-sufficient programmer. If you constantly run to ask someone a question before trying to figure it out first then you never learn to solve problems independently. This leads to you never building confidence in your skills and always needing someone else around to do your work. The way you break this habit is to force yourself to try to answer your own questions first, and to confirm that your answer is right. You do this by trying to break things, experimenting with your possible answer, and doing your own research. (Emphasis mine.)

What Zed is describing here is the learning process that never occurs in the corporate environment, and the lack of it is one of the main reasons why corporate software engineers never improve. In the corporate world, you never find out why the build system is set up in the way that it is. You just go bug the person responsible for it. “My shit depends on your shit, so fix your shit so I can run my shit and my boss doesn’t give me shit over my shit not working for shit.” Corporate development often has to be this way, because learning a typical company’s incoherent in-house systems doesn’t provide a general education. When you’re studying the guts of Linux, you’re learning how a best-in-class product was built. There’s real learning in mucking about in small details. For a typically mediocre corporate environment that was built by engineers trying to appease their managers, one day at a time, the quality of the pieces is often so shoddy that not much is learned in truly comprehending them. It’s just a waste of time to deeply learn such systems. Instead, it’s best to get in, answer your question, and get out. Bugging someone is the most efficient and best way to solve the problem.

It should be clear that what I’m railing against is the commodity developer phenomenon. I wrote about “Java Shop Politics” last April, which covers a similar topic. I’m proud of that essay, but I was wrong to single out Java as opposed to, e.g. C#, VB, or even C++. Actually, I think any company that calls itself an “<X> Shop” for any language X is missing the point. The real evil isn’t Java the language, as limited as it may be, but Big Software and the culture thereof. The true enemy is the commodity developer culture, empowered by the modern bastardization of “object-oriented programming” that looks nothing like Alan Kay’s original vision.

In well-run software companies, programs are build to solve problems, and once the problem is finished, it’s Done. The program might be adapted in the future, and may require maintenance, but that’s not an assumption. There aren’t discussions about how much “headcount” to dedicate to ongoing maintenance after completion, because that would make no sense. If people need to modify or fix the program, they’ll do it. Programs solve well-defined problems, and then their authors move on to other things– no God Programs that accumulate requirements, but simple programs designed to do one thing and do it well. The programmer-to-program relationship must be one-to-many. Programmers write programs that do well-defined, comprehensible things well. They solve problems. Then they move on. This is a great way to build software “as needed”, and the only problem with this style of development is that the importance of small programs is hard to micromanage, so managerial dinosaurs who want to track efforts and “headcount” don’t like it much, because they can never figure out who to scream at when things don’t go their way. It’s hard to commoditize programmers when their individual contributions can only be tracked by their direct clients, and when people can silently be doing work of high importance (such as making small improvements to the efficiencies of core algorithms that reduce server costs). The alternative is to invert the programmer-to-program relationship: make it many-to-one. Then you have multiple programmers (now a commodity) working on Giant Programs that Do Everything. This is a terrible way to build software, but it’s also the one historically favored by IDE culture, because the sheer work of setting up a corporate development environment is enough that it can’t be done too often, and this leads managers to desire Giant Projects and a uniformity (such as a one-language policy, see again why “<X> Shops” suck) that managers like but that often makes no sense.

The right way of doing things– one programmer works on many small, self-contained programs– is the core of the so-called “Unix philosophy“. Big Programs, by contrast, invariably have undocumented communication protocols and consistency requirements whose violation leads not only to bugs, but to pernicious misunderstandings that muddle the original conceptual integrity of the system, resulting in spaghetti code and “mudballs”. The antidote is for single programs themselves to be small, for large problems to be solved with systems that are given the respect (such as attention to fault tolerance) that, as such, they deserve.

Are there successful exceptions to the Unix philosophy? Yes, there are, but they’re rare. One notable example is the database, because these systems often have very strong requirements (transactions, performance,  concurrency, durability, fault-tolerance) that cannot be as easily solved with small programs and organic growth alone. Some degree of top-down orchestration is required if you’re going to have a viable database, because databases have a lot of requirements that aren’t typical business cruft, but are actually critically important. Postgres, probably the best SQL out there, is not a simple beast. Indeed, databases violate one of the core tenets of the Unix philosophy– store data in plain text– and they do so for good reasons (storage usage). Databases also mandate that people be able to use them without having to keep up with the evolution of such a system’s opaque and highly-optimized internal details, which makes the separation of implementation from interface (something that object-oriented programming got right) a necessary virtue. Database connections, like file handles, should be objects (where “object” means “something that can be used with incomplete knowledge of its internals”.) So databases, in some ways, violate the Unix philosophy, and yet are still used by staunch adherents. (We admit that we’re wrong sometimes.) I will also remark that it has taken decades for some extremely intelligent (and very well-compensated) people to get databases right. Big Projects win when no small project or loose federation thereof will do the job.

My personal belief is that almost every software manager thinks he’s overseeing one of the exceptions: a Big System that (like Postgres) will grow to such importance that people will just swallow the complexity and use the thing, because it’s something that will one day be more important than Postgres or the Linux kernel. In almost all cases, they are wrong. Corporate software is an elephant graveyard of such over-ambitious systems. Exceptions to the Unix philosophy are extremely rare. Your ambitious corporate system is almost certainly not one of them. Furthermore, if most of your developers– or even a solid quarter of them– are commodity developers who can’t code outside of an IDE, you haven’t a chance.

Six languages to master.

Eric Raymond, in “How to Become a Hacker“, recommended five languages: C, Java, Python, Perl, and Lisp. Each he recommended for different reasons: Python and Java as introductory languages, C to understand and hack Unix, Perl because of its use in scripting, and Lisp for, to use his words which are so much better than anything I can come up with, the profound enlightenment experience you will have when you finally get it. That experience will make you a better programmer for the rest of your days, even if you never actually use LISP itself a lot.

It’s 2012. Many languages have come to the fore that didn’t exist when this essay was written. Others, like Perl, have faded somewhat. What is today’s five-language list? I won’t pretend that my answer is necessarily the best; it’s biased based on what I know. That said, I’d think the 5 highest-return languages for people who want to become good engineers are the following, and in this order: Python, C, ML, Clojure, and Scala.

Why these 5? Python I include because it’s easy to learn and, in-the-small, extremely legible. I’d rather not use it for a large system, but people who are just now learning to program are not going to be writing huge systems. They’re not going to be pushing major programs into production. At least, they shouldn’t be. What they should be able to do is scrape a webpage or build a game or investigate an applied math problem and say, “Shit, that’s cool.” Python gets people there quickly. That will motivate them to get deeper into programming. Python is also a language that is not great at many things, but good at one hell of a lot of them. It’s quick to write, legible in the small, and expressive. It allows imperative and functional styles. It has great libraries, and it has strong C bindings, for when performance is needed.

People who are getting started in programming want to do things that are macroscopically interesting from a beginner’s perspective. They don’t just want to learn about algorithms and how compilers work, because none of that’s interesting to them until they learn more of the computer science that motivates the understanding of why these things are important. Compilers aren’t interesting until you’ve written programs in compiled languages. At the start, people want to be able to write games, scrape webpages, and do simple systems tasks that come up. Python is good because it’s relatively easy to do most programming tasks in it.

After Python, C is a good next choice, and not because of its performance. That’s largely irrelevant to whether it will make someone a better programmer (although the confidence, with regard to understanding performance, that can come with knowing C is quite valuable). C is crucial because there’s a lot of computer science that becomes inaccessible if one sticks to higher-level languages (and virtual machines) like Java, C#, and Python. Garbage collection is great, but what is the garbage collector written in? C. As is Unix, notably. For all this, I think C is a better choice than C++ because there’s another important thing about C: C++ is a mess and it’s not clear whether it’s a good language for more than 1% of the purposes to which it’s put, but C, on the other hand, has utterly dominated the mid-level language category. For all its flaws, C is (like SQL for database query languages) a smashing, undisputed success, and for good reasons. The high-level language space is still unsettled, with no clear set of winners, but the mid-level language used to write the runtimes and garbage collectors of those high level languages is usually C, and will be for some time.

Python and C give a person coverage of the mid- and very-high levels of language abstraction. I’m avoiding including low-level (i.e. assembly) languages because I don’t think any of them have the generalist’s interest that would justify top-5 placement. Familiarity with assembly language and how it basically works is a must, but I don’t think mastery of x86 intricacies is necessary for most programmers.

Once a programmer’s fluent in Python and C, we’re talking about someone who can solve most coding problems, but improvement shouldn’t end there. Taste is extremely important, and it’s lack of taste rather than lack of intellectual ability that has created the abundance of terrible code in existence. Languages can’t inherently force people to learn taste, but a good starting point in this direction is ML: SML or OCaml with the “O” mostly not used.

ML has been described as a “functional C” for its elegance. It’s fast, and it’s a simple language, but its strong functional programming support makes it extremely powerful. It also forces people to program from the bottom up. Instead of creating vague “objects” that might be hacked into bloated nonsense over the lifespan of a codebase, they create datatypes (mostly, records and discriminated unions, with parameterized types available for polymorphism) out of simpler ones, and use referentially transparent functions as the basic building blocks of most of their programs. This bottom-up structure forces people to build programs on sound principles (rather than the vague, squishy abstractions of badly-written object-oriented code) but ML’s high-level capability brings people into the awareness that one can write complex software using a bottom-up philosophy. Python and C teach computer science at higher and lower levels, but ML forces a programmer to learn how to write good code.

There’s also something philosophical that Python, C, and Ocaml tend to share that C++ and Java don’t: small-program philosophy, which is generally superior. I’ve written at length about the perils of the alternative. In these languages, it’s much more uncommon to drag in the rats’ nest of dependencies associated with large Java projects. For an added bonus, you never have to look at those fucking ugly singleton directories called “com”. Once a person has used these three languages to a significant extent, one gets a strong sense of how small-program development works and why immodular, large-project orientation is generally a bad thing.

When you write C or Ocaml or Python, you get used to writing whole programs that accomplish something. There’s a problem, you solve it, and you’re done. Now you have a script, or a library, or a long-running executable. You may come back to it to improve it, but in general, you move on to something else, while the solution you’ve created adds to the total stored value of your code repository. That’s what’s great about small-program development: problems are actually solved and work is actually “done” rather than recursively leading to more work without any introspection on whether the features being piled on the work queue make sense. Developers who only experience large-program development– working on huge, immodular Java projects in IDEs a million metaphorical miles from where the code actually runs for real– never get this experience of actually finishing a whole program.

Once a person has grasped ML, we’re talking about a seriously capable programmer, even though ML isn’t a complex language. Learned in the middle of one’s ML career is a point to which I’ll return soon, but for now leave hanging: types are interesting. One of the most important things to learn from ML is how to use the type system to enforce program correctness: it generates a massive suite of implicit unit tests that (a) never have to be written, and (b) don’t contribute to codebase size. (Any decent programmer knows that “lines of code” represent expenditure, not accomplishment.)

The fourth language to learn is Clojure, a Lisp that happens to run on the JVM. The JVM has its warts, but it’s powerful and there are a lot of good reasons to learn that ecosystem, and Clojure’s a great entry point. A lot of exciting work is being done in the JVM ecosystem, and languages like Clojure and Scala keep some excellent programmers interested in it. Clojure is an excellent Lisp, but with its interactive “repl” (read-eval-print-loop) and extremely strong expressivity, it is (ironically)  arguably the best way to learn Java. It has an outstanding community, a strong feature set, and some excellent code in the open-source world.

Lisp is also of strong fundamental importance, because its macro system is unlike anything else in any other language and will fundamentally alter how an engineer thinks about software, and because Lisp encourages people to use a very expressive style. It’s also an extremely productive language: large amounts of functionality can be delivered in a small amount of time. Lisp is a great language for learning the fundamentals of computing, and that’s one reason why Scheme has been traditionally used in education. (However, I’d probably advocate starting with Python because it’s easier to get to “real stuff” quickly in it. Structure and Interpretation of Computer Programs and Scheme should be presented when people know they’re actually interested in computing itself.)

When one’s writing large systems, Lisp isn’t the best choice, because interfaces matter at that point, and there’s a danger that people will play fast-and-loose with interfaces (passing nested maps and lists and expecting the other side to understand the encoding) in a way that can be toxic. Lisp is great if you trust the developers working on the project, but (sadly) I don’t think many companies remain in such a state as they grow to scale.

Also, static typing is a feature, not a drawback. Used correctly, static typing can make code more clear (by specifying interfaces) and more robust, in addition to the faster performance usually available in compiled, statically typed languages. ML and Haskell (which I didn’t list, but it’s a great language in its own right) can teach a person how to use static typing well.

So after Lisp, the 5th language to master is Scala. Why Scala, after learning all those others, and having more than enough tools to program in interesting ways? First of all, it has an incredible amount of depth in its type system, which attempts to unify the philosophies of ML and Java and (in my opinion) does a damn impressive job. The first half of Types and Programming Languages is, roughly speaking, the theoretic substrate for ML. But ML doesn’t have a lot of the finer features. It doesn’t have subtyping, for example. Also, the uniqueness constraint on record and discriminated union labels (necessary for full Hindley-Milner inference, but still painful) can have a negative effect on the way people write code. The second half of TAPL, which vanilla ML doesn’t really support, is realized in Scala. Second, I think Scala is the language that will salvage the 5 percent of object-oriented programming that is actually useful and interesting, while providing such powerful functional features that the remaining 95% can be sloughed away. The salvage project in which a generation of elite programmers selects what works from a variety of programming styles– functional, object-oriented, actor-driven, imperative– and discards what doesn’t work, is going to happen in Scala. So this is a great opportunity to see first-hand what works in language design and what doesn’t.

Scala’s a great language that also requires taste and care, because it’s so powerful. I don’t agree with the detractors who claim it’s at risk of turning into C++, but it definitely provides enough rope for a person to hang himself by the monads.

What’s most impressive about Clojure and Scala is their communities. An enormous amount of innovation, not only in libraries but also in language design, is coming out of these two languages. There is a slight danger of Java-culture creep in them, and Scala best practices (expected by the leading build environments) do, to my chagrin, involve directories called “src” and “main” and even seem to encourage singleton directories called “com”, but I’m willing to call this a superficial loss and, otherwise, the right side seems to be winning. There’s an incredible amount of innovation happening in these two languages that have now absorbed the bulk of the top Java developers.

Now… I mentioned “six languages” in this post’s title but named five. The sixth is one that very few programmers are willing to use in source code: English. (Or, I should say, the scientifically favored natural language of one’s locale.) Specifically, technical English, which requires rigor as well as clarity and taste. Written communication. This is more important than all of the others. By far. For that, I’m not complaining that software engineers are bad at writing. Competence is not a problem. Anyone smart enough to learn C++ or the finer points of Lisp is more than intelligent enough to communicate in a reasonable way. I’m not asking people to write prose that would make Faulkner cry; I’m asking them to explain the technical assets they’ve created at, at the least, the level of depth and rigor expected in a B+ undergraduate paper. The lack of writing in software isn’t an issue of capability, though, but of laziness.

Here’s one you hear sometimes: “The code is self-documenting.” Bullshit. It’s great when code can be self-documenting, making comments unnecessary, but it’s pretty damn rare to be solving a problem so simple that the code responsible for solving it is actually self-explanatory. Most problems are custom problems that require documentation of what is being solved, why, and how. People need to know, when they read code, what they’re looking at; otherwise, they’re going to waste a massive amount of time focusing on details that aren’t relevant. Documentation should not be made a crutch– you should also do the other important things like avoiding long functions and huge classes– but it is essential to write about what you’re doing. People need to stop thinking about software as machinery that “explains itself” and start thinking of it as writing a paper, with instructions for humans about what is happening alongside the code actually doing the work.

One of the biggest errors I encounter with regard to commenting is the tendency to comment minutiae while ignoring the big picture. There might be a 900-line program with a couple comments saying, “I’m doing this it’s O(n) instead of O(n^2)” or “TODO: remove hard-coded filename”, but nothing that actually explains why these 900 lines of code exist. Who does that help? These types of comments are useless to people who don’t understand what’s happening at all, which they generally won’t in the face of inadequate documentation. Code is much easier to read when one knows what one is looking at, and microcomments on tiny details that seemed important when the code was written are not helpful.

Comments are like static typing: under-regarded if not ill-appreciated because so few people use them properly, but very powerful (if used with taste) in making code and systems actually legible and reusable. Most real-world code, unfortunately, isn’t this way. My experience is that about 5 to 10 percent of code in a typical codebase is legible, and quite possibly only 1 percent is enjoyable to read (which good code truly is). The purpose of a comment should not be only to explain minutiae or justify weird-looking code. Comments should also ensure that people always know what they’re actually looking at.

The fallacy that leads to a lack of respect for documentation is that writing code is like building a car or some other well-understood mechanical system. Cars don’t come with a bunch of labels on all the pieces, because cars are fairly similar under the hood and a decent mechanic can figure out what is what. With software, it’s different. Software exists to solve a new problem; if it were solving an old problem, old software could be used. Thus, no two software solutions are going to be the same. In fact, programs tend to be radically different from one another. Software needs to be documented because every software project is inherently different, at least in some respects, from all the others.

There’s another problem, and it’s deep. The 1990s saw an effort, starting with Microsoft’s visual studio, to commoditize programmers. The vision was that, instead of programming being a province of highly-paid, elite specialists with a history of not working well with authority, software could be built by bolting together huge teams of mediocre, “commodity” developers, and directing them using traditional (i.e. pre-Cambrian) management techniques. This has begun to fail, but not before hijacking object-oriented programming, turning Java’s culture poisonous, and creating some of the most horrendous spaghetti code (MudballVisitorFactoryFactory) the world has ever seen. Incidentally, Microsoft is now doing a penance by having its elite research division investigate functional programming in a major way, the results being F# and a much-improved C#. Microsoft, on the whole, may be doomed to mediocrity, but they clearly have a research division that “gets it” in an impressive way. Still, that strikes me as too little, too late. The damage has be done, and the legacy of the commodity-developer apocalypse still sticks around.

The result of the commodity-programmer world is the write-only code culture that is the major flaw of siloized, large-program development. That, I think, is the fundamental problem with Java-the-culture, IDE-reliance, and the general lack of curiosity observed (and encouraged) among the bottom 80 percent of programmers. To improve as programmers, people need to read code and understand it, in order to get a sense of what good and bad code even are, but almost no one actually reads code anymore. IDEs take care of that. I’m not going to bash IDEs too hard, because they’re pretty much essential if you’re going to read a typical Java codebase, but IDE culture is, on the whole, a major fail that makes borderline-employable programmers out of people who never should have gotten in in the first place.

Another problem with IDE culture is that the environment becomes extremely high maintenance, between plugins that often don’t work well together, build system idiosyncracies that accumulate over time, and the various menu-navigation chores necessary to keep the environment sane (as opposed to command-line chores, which are easily automated). Yes, IDEs do the job: bad code becomes navigable, and commodity developers (who are terrified of the command line and would prefer not to know what “build systems” or “version control” even are) can crank out a few thousand lines of code per year. However, the high-maintenance environment requires a lot of setup work, and I think this is culturally poisonous. Why? For a contrast, in the command-line world, you solve your own problems. You figure out how to download software (at the command line using wget, not clicking a button) and install it. Maybe it takes a day to figure out how to set up your environment, but once you’ve suffered through this, you actually know a few things (and you usually learn cool things orthogonal to the problem you were originally trying to solve). When a task gets repetitive, you figure out how to automate it. You write a script. That’s great. People actually learn about the systems they’re using. On the other hand, in IDE-culture, you don’t solve your own problems because you can’t, because there it would take too long. In the big-program world, software too complex for people to solve their own problems is allowed to exist. Instead of figuring it out on your own, you flag someone down who understands the damn thing, or you take a screenshot of the indecipherable error box that popped up and send it to your support team. This is probably economically efficient from a corporate perspective, but it doesn’t help people become better programmers over time.

IDE culture also creates a class of programmers who don’t work with technology outside of the office– the archetypal “5:01 developers”– because they get the idea that writing code requires an IDE (worse yet, an IDE tuned exactly in line with the customs of their work environment). If you’re IDE-dependent, you can’t write code outside of a corporate environment, because when you go home, you don’t have a huge support team to set the damn thing up in a way that you’re used to and fix things when the 22 plugins and dependencies that you’ve installed interact badly.

There are a lot of things wrong with IDE culture, and I’ve only scratched the surface, but the enabling of write-only code creation is a major sticking point. I won’t pretend that bad code began with IDEs because that’s almost certainly not true. I will say that the software industry is in a vicious cycle, which the commodity-developer initiative exacerbated. Because most codebases are terrible, people don’t read them. Because “no one reads code anymore”, the bulk of engineers never get better, and continue to write bad code.

Software has gone through a few phases of what it means for code to actually be “turned in” as acceptable work. Phase 1 is when a company decides that it’s no longer acceptable to horde personal codebases (that might not even be backed up!) and mandates that people check their work into version control. Thankfully, almost all companies have reached that stage of development. Version control is no longer seen as “subversive” by typical corporate upper management. It’s now typical. The second is when a company mandates that code have unit tests before it can be relied upon, and that a coding project isn’t done until it has tests. Companies are reaching this conclusion. The third milestone for code-civilizational development, which very few companies have reached, is that the code isn’t done until you’ve taught users how to use it (and how to interact with it, i.e. instantiate the program and run it or send messages to it, in a read-eval-print-loop appropriate to the language). That teaching can be supplied at a higher level in wikis, codelabs, and courses… but it also needs to be included with the source code. Otherwise, it’s code out-of-context, which becomes illegible after a hundred thousand lines or so. Even if the code is otherwise good, out-of-context code without clear entry-points and big-picture documentation becomes incomprehensible around this point.

What do I not recommend? There’s no language that I’d say is categorically not worth learning, but I do not recommend becoming immersed in Java (except well enough to understand the innards of Clojure and Scala). The language is inexpressive, but the problem isn’t the language, and in fact I’d say that it’s unambiguously a good thing for an engineer to learn how the JVM works. It’s that Java-the-Culture (VisitorSelectionFactories, pointless premature abstraction, singleton directories called “com” that betray dripping contempt for the command line and the Unix philosophy, and build environments so borked that it’s impossible not to rely on an IDE) that is the problem; it’s so toxic that it reduces an engineer’s IQ by 2 points per month.

For each of these five programming languages, I’d say that a year of exposure is ideal and probably, for getting a generalist knowledge, enough– although it takes more than a year to actually master any of these. Use and improvement of written communication, on the other hand, deserves more. That’s a lifelong process, and far too important for a person not to start early on. Learning new programming languages and, through this, new ways of solving problems, is important; but the ability to communicate what problem one has solved is paramount.

Java Shop Politics

Once, I was at a company that was considering (and eventually did so) moving its infrastructure over to Java, and there was a discussion about the danger of “Java Shop Politics”. It would seem strange to any non-programmer that a company’s choice of programming language would alter the political environment– these languages are just tools, right? Well, no. In this case, almost all of us knew exactly what was being talked about. Most software engineers have direct experience with Java Shop Politics, and it has a distinct and unpleasant flavor.

When Unix and C were developed, they were designed by people who had already experienced firsthand the evils of Big Software, or monolithic systems comprised of hundreds of thousands of lines of code without attention paid to modularity, that had often swelled to the point where no one understood the whole system. (In the 1970s, these monoliths were often written in assembly language, in which macroscopically incomprehensible code is not hard to create.) The idea behind the Unix environment, as a reaction to this, was to encourage people to write small programs and build larger systems using simple communication structures like pipes and files. Although C is a strictly compiled language and no Lisp-style REPL existed for it, C programs were intended to be small enough that the Unix operating system was an acceptable REPL. This was not so far from the functional programming vision, and far more practical in its time. The idea behind both is to write small programs (functional “building blocks”) that are easy to reason about, and build more complex systems out of them, while retaining the ability to piecewise debug simple components in event of failure. This style of development works extremely well, because it encourages people to build tools for general use, rather than massive projects that, if they fail, render almost all of the effort put into the project useless. As more code is written, this leads to the growth of generally useful libraries and executable utilities. In the big-program model of development, for a contrast, it leads to increasing complexity within one program, which can easily make whole-program comprehension impossible, and make decay (in the absence of forward refactoring) inevitable, regardless of programming language or engineer talent.

I adhere to this small-program mentality. I’m not saying “don’t be ambitious”. Be ambitious, but build large systems while keeping individual modules small. If a file is too big to read (for full comprehension) in one sitting, break it up. If “a program” is too big to read in a week, then it should be respected (even if it runs as one executable) as a system, and systems are harder to manage than single modules. While it is harder, up front, to develop in a modular style, the quality of the product is substantially higher, and this saves time in the long run.

Many software managers, unfortunately, like Big Projects. They like huge systems with names like cool-sounding names like “Rain Man” and “Dexter” that swell to a hundred thousand lines of code and provide features no one asked for. They like having the name of something Big, something their bosses might have heard of, in their story. Big Projects also provide a lot more in the way of managerial control. A manager can’t control the “chaotic”, often as-needed, growth of small-program development, whereas directing a Big Project is relatively straightforward. Make it Object-Oriented, use this set of analytic tools, complete at least 20 of the 34 feature-request tickets in the queue (it doesn’t matter which 20, just do 20) and have it done by Friday.

Beginning around 1990 was a pernicious attempt by software managers to commoditize programming talent. Enter C++, designed with unrelated intent, but a language that made big-program development seem palatable for C. C’s not high-level enough for most application development in 2012, but C++ is not the solution (except in very specific cases from numeric programming where templates allow very-fast code to be written once for a range of numeric types, with no runtime performance overhead) either. For over 90 percent of the applications that use it, it’s the wrong language. Here’s the metaphor I like to use: assembly coding is infantry: as fine-grained as one could want to be, but slow (to write), lumbering along at 3.1 miles per hour. C is a tank. It’s robust, it’s powerful, and it’s extremely impressive and well-designed, but it moves at ground speeds: about 50 miles per hour. That’s what it’s designed to do. Languages like Python and Ocaml are airplanes– very fast, from a development perspective, but not fine-grained at all. C++ exists because someone had a fever dream in which these two classes of vehicles got mixed up and thought, “I’m going to put wings on a fucking tank”. The drag and awkwardness imposed by the wings made it terrible as a tank, but it doesn’t fly well either. Java was invented after a few horrible tank-plane crashes, the realization being that the things are too powerful and fly too fast. It’s a similarly ridiculous “tank-icopter”. It’s not as fast as the tank-plane, and few people enjoy flying them, but it’s less likely to kill people.

It’s not that C++ or Java, as languages, are evil. They’re not. Languages are just tools. Java was designed to run in embedded systems like automatic coffee pots and cable-TV “set top boxes”, so closures were cut for time in the first release, because these use cases don’t require high-level programming features. I haven’t map-reduced a toaster cluster for years. Contrary to popular history, Java wasn’t designed with an ideological, “enterprise” distrust of the programmer. That trend, in the Java community, came later, when the 1990s attempt to commoditize programming talent (a dismal FailureFactory) co-opted the language.

Java and C++ became languages in which “object-oriented programming” was the norm. The problem is that what is currently called OOP is nothing like Alan Kay’s vision. His inspiration was the cell, which hides (encapsulates) immense mechanical complexity behind a simpler interface of chemical and electrical signals. The idea was that, when one needs complexity, simpler interfaces are invaluable, and that complex systems generally should have comprehensible interfaces. Kay was not saying, “go out there and create giant objects” or “use object-oriented programming everywhere”. He was attempting to provide tools for dealing with complexity when it becomes inevitable. Unfortunately, a generation of software managers took “object-oriented” magic and immodularity as virtues. This is similar to the “waterfall” software methodology, named by a person clearly stating it was the worst idea, and yet taken by suits as having been “recommended by some smart guy” once it was given a name.

What’s wrong with Big Project development? First, it encourages reliance on internal vaporware. Important work can be delayed “until Magneto is done”. When Magneto is done, that will solve all our problems. This works for managers seeking to allocate blame for slow progress– that fucking Magneto team can never get their shit done on time– but it’s a really bad way to structure software. In the small-program model, useful software is being continually released, and even an initiative that fails will provide useful tools. In the big-program arena, the project is either delivered in toto or not at all. What if half the Magneto team quits? What if the project fails for other reasons? Then Magneto est perdu. It’s a huge technical risk.

[ETA: when I wrote this essay, I was unaware that a company named Magneto existed. “Magneto”, in this essay, was a potential name for a large project in a hypothetical software company. There is absolutely no connection between these project names used here and real companies. Consider names like “Cindarella” and “Visigoth” to be gensyms.]

Second, under a big-program regime, people are trackable, because most programmers are “on” a single program. This is also something that managerial dinosaurs love, because it provides implicit time-tracking. Mark is on Cindarella, which has a headcount of 3. Sally is on Visigoth, which has a headcount of 5. Alan is on the project that used to be called 4:20, until Corporate said that name wasn’t okay, and that now can’t decide whether it wants to be called 4:21 or BikeShed.

What this kills, however, is extra-hierarchical collaboration– the lifeblood of a decent company, despite managerial objections. In a small-program software environment, people frequently help out other teams and contribute to a number of efforts. A person might be involved in more than 20 programs, and take ownership of quite a few, in a year. Those programs end up having general company-wide use, and that creates a lot of cross-hierarchical relationships. MBA dinosaurs, alas, hate cross-hierarchical, unmetered work. That kind of work is impossible to measure, and a tightly-connected company makes it hard to fire people. On the other hand, if a programmer only works on one Big Project at a time, and has no other interaction with other teams, it’s much easier to “reduce headcount”.

The third problem with big-program methodology is that it’s inefficient. Six months can be spent “on-boarding” an engineer into the complexities of the massive Big Project he’s been assigned to work on. In light of the average job lasting two to three years, that’s just intolerable. The on-boarding problem also restricts internal mobility. If someone is a bad fit for his first Big Project, moving him to another means he could spend up to a year just on-boarding, accomplishing zilch. Transfers become unacceptable, from a business perspective, so people who don’t fit well with their first projects are Just Fucked.

Why is the on-boarding problem so severe? Big Projects, like large Java programs in general, tend to turn into shitty, ad-hoc domain-specific languages (DSLs). Greenspun’s Tenth Rule sets in, because programmers tend to compensate for underpowered tools by adding power, but in hasty ways, to the ones they have. They end up developing a terminology that no one outside the team understands. They become systems in which understanding any of it requires understanding all of it. This means that people hired to modify or expand these Big Projects have to spend months understanding the existing system, whose intricacies are only known to a few people in the company, before they can accomplish anything.

The fourth problem with the big-program methodology is the titular Java Shop Politics. In a small-program development environment, engineers write programs. Plural. Several. An engineer can be judged based on whether he or she writes high-quality software that is useful to other people in the organization, and this knowledge (talent discovery) is redundant throughout the organization because the engineer is continually writing good code for a wide array of people. What this means is that technology companies can have the lightweight political environment to which they claim to aspire, in which a person’s clout is a product of (visible) contribution.

On the other hand, in a big-program shop, an engineer only works on one Project, and that project is often a full-time effort for many people. Most people in the company– managers especially– have no idea whether an individual engineer is contributing appropriately. If John is chugging away at 5 LoC per day on Lorax, is that because he sucks, because the team failed to on-board him, or because Lorax is a badly structured project? In a small-program environment where John could establish himself, such a question could be answered. The good programmers and bad projects (and vice versa) could be objectively discovered. In a big-program world, none of that will ever be known. The person with the most clout on Lorax, usually the technical lead, gets to make that assessment. Naturally, he’s going to choose the theory that benefits him, not John. He’ll never admit that Lorax is badly designed or, worse yet, was a mistake in the first place. So under the bus John goes.

In general, it is this fog of war that creates “office politics”. When no one knows who the good and bad contributors are, there are major incentives toward social manipulation. Eventually, these manipulations take more of people’s emotional energy than the actual work, and the quality of the latter declines. This is not limited to technology; it’s the norm in white-collar environments. Small-program development is an antidote. Large-program development accelerates (though it does not necessarily cause) the poisoning.

The solution to this is simple: Don’t become a Java Shop. I’m not saying that Java or the Java Virtual Machine (JVM) is evil. Far from it: I think Clojure and Scala (which run on the JVM) are excellent languages, and the JVM itself is a great piece of software. Writing an occasional (small) Java or C++ program won’t destroy a company, obviously. On the other hand, fully becoming a Java Shop (where the vast majority of development is done on large Java programs, and where success relies on understanding defective “design patterns” and flawed “best practices” instead of being able to code) will. There is no avoiding it; politically speaking, Java Shops go straight down. The quality of engineering, predictably, follows.

The company I discussed earlier, which was one of this country’s most promising startups before this happened, did become a Java Shop, despite furious protest from existing talent. Within weeks, the politics of the organization became toxic. There was an “old team” that adhered to the Unix philosophy, and a “new team” that was all-Java. (No Scala or Clojure; Scala was flirted with and had serious managerial support at first, but they killed their Scala efforts for being “too close” to the old team.) This old/new cleavage ruptured the company, and led to an epic talent bleed– a small company, it lost several engineers in a month. Java Shop Politics had arrived, and there was no turning back.

Here’s what that looks like. First, it’s not that Java (or C++) code is inherently evil. Any decent software engineer will have a passing competency in half a dozen languages (or more) by my age, and these languages are worth knowing. Some Clojure and Scala programs require classes to be written in Java for performance. That’s fine. Where a company starts to slide is when it becomes clear that “the real code” is to be written in Java (or, as at Google, C++) and when, around the same time, big-program methodologies become the norm. This makes the company “feel” managerially simpler, because headcount and project efforts can be tracked, but it also means that the company has lost touch with the ability to assess or direct individual contribution. Worse yet, because big-program methodologies and immodular projects are usually defective, the usual result is that people with bad taste thrive. Then the company becomes less like a software enterprise and more like a typical corporation, in which all the important decisions are made by the wrong people and decline becomes inevitable.

How any software company can cross “The Developer Divide”

Nir Eyal wrote a blog post, The Developer Divide: When Great Companies Can’t Hire, on this conundrum: there are a lot of excellent technology companies that haven’t managed to attract the brightest (and generally pickiest) engineers in sufficient numbers to hire as fast as they grow, a problem that forces a company either to lower its hiring standards or curtail growth, neither of which is desirable.

What makes this problem especially hard is that it’s not about money. In marketing terminology, it’s a problem of “reach”. Many great startups, even if they offered $200,000 per year, would simply be unable to hire 25 elite (top 5%) programmers in a year. Finding one per quarter is pretty good. Compounding this difficulty is the fact that the best software engineers’ job searches are infrequent and short. They usually find jobs through social connections and word-of-mouth before they start officially “looking”. They almost never cold-spam their CVs. Therefore, a company that’s consistently hiring top-5% programmers is probably extending offers to 1 applicant per several hundred CVs that make their way to the hiring manager.

What is the Developer Divide? It comes down to this: it’s easier to get top developers if your product is something that top developers already use. Google was unequivocally the best search engine even in 2000, and nerds like new search engines, so it established itself as a desirable place to work. Facebook and Foursquare may be targeted toward “non-engineers” (i.e. mass market) but the products are used by enough of the people the firms want to hire as to generate name recognition that accumulates faster than the company needs to grow. That makes it very easy for them to attract so much talent that they turn away more 5-percenters than they hire. But for a company whose product isn’t used already, every day, by top programmers, it becomes harder. Much harder. That’s unfortunate, because a lot of great businesses (most, actually) need to start out doing something lower on the established-product-sexiness scale (enterprise work and products with well-defined markets, rather than speculative “social” projects) before branching out into projects of more general interest. These companies might become sexy in the future, but sometimes the first clientele has to be an unsexy but reliable one, like large corporations or suburban soccer moms.

So, how do these great companies continue to find great talent? I’m going to state one simple (but not easy) thing that any software company can do to increase its attractiveness to top engineers by at least 3 binary orders of magnitude. Here it is: ditch Java/C++ and use a decent programming language.

What’s a decent programming language, for this purpose? It’s one that a 5-percenter would use for a personal side project, or if she were calling the shots. That language might be Scala, Python, or Ocaml. It might (for a project such as an operating system) be C. It would almost certainly not, in 2012, be Java or C++.

A clarification must be made, because it’s confusing to people outside of technology: C++ is not a substitute for, nor an improvement on, C. In fact, C is great for programs where low-level concerns are critical in producing a quality system: operating systems, device drivers, hard real-time systems, and runtime environments for the high-level garbage-collected languages we all love (such as Ocaml and Haskell). Much of the world is built on C. It’s an excellent and immensely successful mid-level language, as opposed to C++, which is a miserably failed attempt at a C-like high-level language. So do not confuse the two.

If you ask a 5-percenter for his opinions on C and C++, he’ll probably praise the former while trashing the latter. From a non-technical business perspective, this might seem inconsistent, given that C is (very nearly) a proper subset of C++. “Anything you can do in C, you can do in C++.” That makes C++ “strictly better”, right? Well, no. A 5-percenter has the professional maturity to realize that he or she will not be coding in a vacuum, that reading code is as important an act as writing it, and that adding ill-considered features to a good language therefore doesn’t improve it, but ruins it if people are stupid enough to actually use them.

I’ll stop talking about C++, because it’s not even relevant to most startups. Startups can’t afford it, because they need to accomplish big things with small teams, and C++ doesn’t make that possible. C++ is mostly that skulking monster that lives in the bowels of legacy systems at banks, something you expect to fight in the 2300 AD (post-apocalyptic) world of Chrono Trigger.

Java’s problem is somewhat different and newer. C++ is a bad language because it was poorly designed, but it was at least designed for good programmers (and it’s disastrous when used by inept ones). Java, in contrast, was explicitly designed to favor the interests of massive (100+ person) teams of mediocre developers over excellent individual contributors, whom it slows down to a small fraction of their typical speed. It came out of a failed experiment to create a home for low-skill “commodity” developers who haven’t learned a new skill since college and who need to consult the One Smart Person on the team if (God forbid) their RAM-munching IDE breaks. Java has succeeded in making commodity developers marginally effective (as opposed to negatively productive, as they are in more powerful languages) but at the expense of hobbling the best, forcing them to endure ugliness (inherent to a language designed to be used exclusively through IDEs, while 5-percenters overwhelmingly prefer the command-line interface and “classic” editors like vim and emacs) and accidental complexity. Needless to say, 5-percenters despise Java. More correctly, they despise the Java language. (I emphasize “the language” because the Java Virtual Machine itself is a pretty powerful tool, and because there are superior languages– Scala and Clojure coming to mind– that also run on the JVM.)

If you want to build a large team of 50+ “commodity” developers, content to maintain legacy code or work on mind-numbingly boring stuff, use Java or C++. If you’re a startup, you need to build a small team of excellent developers, so use something else. If you want to hire 5-percenters now but might need to hire Java jockeys later, strongly consider Scala and Clojure, which are highly powerful languages that also run on the JVM.

Why is this so important? It’s not just about the language. It’s about signaling. As a startup, you have to show that you get it, and that the opportunity you offer isn’t Yet Another Java Job. You can get 5-percenters to use C++ or Java (Google has made this happen, and so have many investment banks, which have huge legacy codebases in C++) but you pretty much have to be a big-name company to pull this off, and you can expect to shell out an obscene amount of money– a 50-100 percent markup to account for the negatives of maintaining code in a terrible language, and the career stagnation this kind of work invites. If you want a 5-percenter to work 60 hours per week for $100,000 per year, use a great language. If you want him to work 9-to-5, with two-hour lunches and personal errands deducted from the workday, for $250,000 per year, then Java and C++ are options.

More than anything else, 5-percenters want to work with other 5-percenters. This is far more important to them than prestige or money (the reason 5-percenters default to rich, prestigious companies when nothing more interesting crosses their transom is that these companies have other 5-percenters). It is, moreover, even more important than programming language selection itself. Language selection, as I’ve said, is an objective mechanism of signaling. This is what creative-writing teachers call the “show, don’t tell” principle (don’t say “Eric was honorable”; have him do something honorable). Every company says (“telling”) it has world-class talent, but language choice is an objective decision that shows that a company is, at the very least, interested in hiring 5-percenters. For that reason, there’s a good chance that it employs some.

To reiterate, because I don’t want a flame war, is it possible to find 5-percenters who will write Java and C++ on a full-time basis? Absolutely, and if you want to compete with Goldman Sachs on salary, you might be able to hire one. Will they take on the risk of working for $5,000 (pre-tax) per month at a risky, seven-person startup in these languages? Not a chance. Five-percenters tolerate C++ jobs at Google because they know that company’s existence doesn’t rely on their individual productivity. In a startup, individual productivity is an existential concern, and the Aspergerian “pathological honesty” that top programmers almost invariably have precludes them from working in low-productivity languages that they believe will retard and destroy their employer.

I’ve said enough about the awfulness of C++ and Java. What are some good languages? I’ll give a list, by no means exhaustive, of highly powerful languages that the best developers love: Ocaml, Haskell, Erlang, Scala, Lisp, Clojure. Less strong but still formidable are Python and Ruby. (Many 5-percenters love Python, and almost all will tolerate it, but the median Python programmer is closer to a “10-percenter”.) Why is it this way? First, great developers program and learn technology in their spare time, and a 1-person, 15-hour-per-week project must be written in a real language if it is going to amount to anything. Second, most of these languages are only used by the best employers and only taught by the best universities, so a person deeply familiar with one of them is either (a) coming from elite exposure, or (b) possessed of enough individual curiosity to indicate a high likelihood of skill and success as a computer programmer. People who want to become great programmers quickly discover languages in which it’s possible for them to be 10 times as productive as they would be in Java– and they never look back.

Are all programmers in great languages 5-percenters? The answer is no. Users of languages like Ocaml and Scala tend to fall into two categories: (a) the 5-percenters, and (b) those who are becoming 5-percenters. Not all of them are there yet, but they’re almost all improving at a rapid pace. I’ve worked for almost 4 years in JVM languages (Java, Scala, and Clojure) and what I’ve learned (perhaps astonishingly) is that, per unit time, the Scala and Clojure developers learn about Java faster than those using Java! Because they are in high-productivity languages, they can accomplish more, and because they’re achieving more, they’re learning more along the way.

From a practical standpoint: with so many great languages to choose from, which one should a person pick? CTOs making this decision have some idea, but this would be an impossible decision for a non-technical CEO to make based on direct experience. For a very small team, the answer is easy: whatever the best programmers want to use. I like Scala much better than Python, but if I were in a non-coding role and tasked with hiring a great programmer, and if she preferred Python, I’d have her use that, because the benefit of letting her use the language she thought was best for the job outweighs (for a small team with no existing maintenance burden) any benefit conferred by using one powerful language over another. So my answer to the language-selection question, for a non-programmer CEO, is: ask your best developer.

For my part, I’d recommend Scala. It’s not the best language for all purposes, but it’s a great general-purpose language and (among mainstream languages) may be the best overall choice for most purposes. Because it runs on the Java Virtual Machine (JVM), it has full access to all of Java’s assets. (Clojure, an excellent JVM lisp, has the same advantage, but is not as performant and does not have static typing, a feature I find invaluable.)

A person versed in economics might find my argument tenuous. If a set of “elite” languages can be used by programmers and employers as a signal of high competency, what’s to stop the less competent from “faking” this signal once they catch on to the fact that it exists? A few answers come to mind. The first is that programming in a high-power language requires an adjustment that mediocre, 9-to-5 programmers are not likely to want to make. From first principles, functional programming isn’t intellectually harder than object-oriented programming: a 120 IQ is more than enough. It’s actually simpler. (Doing object-oriented programming correctly, and rigorously understanding what one is doing with it, is much harder and much more intellectually complex than succeeding in functional programming; this is a rant for another time, but 99% of people doing “object-oriented programming” are like crude teenagers with regard to sex– loud about it, but doing it badly.) Nonetheless, the transition to re-learning software development on sound principles is not an easy one to make. Like the transitions from memorization to pattern recognition, and then from pattern recognition to rigorous proof in mathematics, these context-switches require a lot of work (months of serious study). Successfully learning Scala or Clojure is a sign of a very strong work ethic. (It goes without saying that your interview process should establish that the candidates actually know these languages, and aren’t playing “buzzword bingo”.)

What about mediocre businesses using language selection as a false signal? That’s even more unlikely. A recruiter (for an elite startup, currently at less than 20 people) I spoke to told me that about 70% of Clojure candidates, 40% of Python candidates, and 5% of Java candidates that he invites to an in-office, full-day interview are good enough to hire. For a startup, interviews are far more expensive (in terms of opportunity cost) than for large companies: it costs about $100 to run a technical phone screen and $1000 to conduct a full-cycle interview, because startups have to involve senior people in their interview process in order to assess quality. Put another way, this means that it costs $1429 to hire a 5-percenter in Clojure and $2500 to hire one in Python– chump change as far as recruiting expenses go. But it costs $20,000 to hire a Java developer, if you insist on the “5-percenter” standard of quality.
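
Those per-hire figures are just the $1,000 full-day interview cost divided by each language’s hit rate. For anyone who likes to see arithmetic as code, here is a throwaway sketch in Ocaml (the function name is mine; the dollar amounts and hit rates are the recruiter’s, quoted above):

    (* Cost per hire = cost of a full-day interview divided by the hit rate. *)
    let cost_per_hire ~interview_cost ~hit_rate = interview_cost /. hit_rate

    let () =
      List.iter
        (fun (lang, rate) ->
          Printf.printf "%-8s $%.0f per hire\n" lang
            (cost_per_hire ~interview_cost:1000.0 ~hit_rate:rate))
        [ ("Clojure", 0.70); ("Python", 0.40); ("Java", 0.05) ]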

There’s one variable I haven’t mentioned, though, and that’s the number of CVs he gets in each language: several hundred times as many Java developers as Clojure or Ocaml developers. Most companies need (or think they need) warm bodies in large numbers to maintain legacy horrors, not top talent and the attitude that comes with it. It’s awful that it’s this way, and I think it will change in the future (with the limiting factor being work ethic and engagement rather than innate ability), but the software world is a pyramid, with a few stars at the pinnacle and a large number of incompetents at the base. This holds for programmers and for programmer jobs, of which 90 to 95 percent are mind-numbingly boring. It’s also seen in the distribution of language preference (and I say “preference” because there are tens of thousands of excellent programmers writing C++ and Java right now, but very few prefer them). The result of this is that the mediocre languages have the most programmers. If a company needs to hire 200 programmers per month, it simply cannot choose Haskell as its main development language; within a couple of years, it would have absorbed the entire Haskell community!

Historically, that has been a serious concern for companies when it comes to language selection. Because there are hundreds of times more Java developers than Haskell or Ocaml hackers, there are at least several times as many half-decent ones. Thus, the worst languages paradoxically have the strongest library support. This is compounded by the fact that powerful tools (such as IDEs, which are a mixed bag of neatness and horror, but sometimes quite useful and outright required when developing in Java) must be written to compensate for the shortcomings of hobbled languages. There’s no Lisp IDE because emacs does just fine, but there are a slew of Java IDEs because it’s a revolting experience to write Java without one. From a non-technical CEO’s perspective, this makes Java look better because it has the best supporting tools.

What all this means is that mediocre software shops are not going to switch over to Haskell in order to ape this signaling mechanism. They can’t, because the bottom contingent of their software staff will drop like flies, and because their leadership is unlikely to understand the language selection problem in the first place. There might be some unestablished software companies using elite languages, and about them I make the same argument that I’d make about “15-percenters” who nonetheless show interest and competence in “elite” languages– they may not be 5-percenters now, but if they keep at it, they will be shortly!

What about the (admitted) shortcomings of elite languages? Except for Scala and Clojure, which have access to the JVM and interoperate cleanly with Java, these languages don’t have the breadth of tooling that C++ and Java do. The answer to that is almost stereotypically “hackerish”: write them! This effort is not wasted, not in the least. Writing high-quality open-source tools to support elite languages is one of the best things a software company can do to establish its reputation. This, again, is an opportunity to show, not tell.

For a small company attempting to define and establish itself, attracting top talent is hard. The best programmers are on the job market so rarely and for such short intervals that attracting them takes concerted effort. Growing companies do not have the name recognition, and cannot afford the immense salaries (over $250,000 for a senior developer) that would attract these “5-percenters” using economic means, so they must win on technical grounds. Language selection is a simple (but not easy, because it’s difficult to adopt new and dramatically more powerful tools) way to do so. The best languages (Ocaml, Scala, Haskell, Python) are “shibboleths” that elite programmers use to identify each other, so adopting one of them is a one-stop choice that instantly establishes “hacker cred” or, to use Nir Eyal’s terminology, bridges “The Developer Divide”.

A problem with the term programming “language”

I’m going to make a radical and perhaps offensive assertion: most of the time, when programmers use the phrase “programming language”, they are using it inappropriately, and creating a dangerously fallacious analogy, one that few programmers, but many decision-makers in business, fall for: that programming languages can be compared and assessed as one would compare natural languages.

The problem is twofold. First, natural languages don’t have the massive variations in capability observed in programming languages. French is not superior to English (or vice versa) in the way that Scala is superior to Java. Second, “switching costs” associated with acquiring a new natural language are high, because it’s hard for an adult human to learn a new natural language well: it takes years of exposure, preferably in the context of contact with native speakers. This is not as true of programming languages; learning programming is hard, but a skilled programmer can become capable in a new language in a few weeks and be proficient after months. Because of this clumsy analogy, decision-makers in the software industry often make a decision that is the obvious right one with regard to natural languages, and almost always the wrong one (because “the standard” is usually an underpowered legacy language like Java or C++) in selecting a programming language: they default to the standard.

My intention is not to discuss specific programming languages per se, but the problem inherent in the phrase “programming language”. It is a correct phrase: a programming language is a form of language, with grammar, syntax and semantics. On the other hand, it admits a certain confusion inherited from something we know about natural languages like English and Spanish: despite their evident differences, human languages are much more alike than they are different. Natural languages have their quirks and may be more capable in specific ways– character-based languages are more compact on the page, while alphabetic languages’ smaller symbol sets make it easier to convert written words into spoken form– but none is uniformly or astronomically superior to any other.

For example, there’s no natural language in which expressing an average English sentence (about 15 words) requires two hundred words, but there are languages (Scala, Python) that are 15 times more concise than Java. There’s also no language that is 100 times easier for the human brain to convert into meaningful instructions than English, but there are programming languages in which programs run 100 times faster than others. Nor is there a natural language where it’s impossible for a syntactically correct sentence (e.g. “the moon ate the moon”) to be nonsensical; this assurance, to a large degree, can be achieved in statically typed languages. While it might be offensive even to consider that natural languages might vary in “quality”, I think it quite reasonable to assert that they don’t. To my knowledge, there aren’t major, order-of-magnitude variations in capability among natural languages. For this reason, it’s appropriate to communicate (e.g. to conduct business) in the language the involved parties comprehend best. The right decision, in selecting a natural language, is to use “the standard”. In New York, that’s usually English. In Moscow, it’s Russian. Every locale has a small set (sometimes only one) of natural languages that are likely to be well-comprehended by most people. For this reason, it would be absurd for a Silicon Valley firm to make the decision to conduct all business in Swahili, even if Swahili were 10 or 50 percent better suited to that company’s needs.

Programming languages are different: there are order-of-magnitude variations. Rewrite a C program in Java, and the program becomes immune to a wide class of errors (due to Java’s automatic memory management), but one loses the ability to explicitly manage memory, and program performance becomes less predictable. Sometimes this is a desirable change; sometimes not. Rewrite the Java program in Python and the source code usually becomes one-tenth of its previous size, making the program easier to maintain and (in the long run) better, but one forgoes access to the Java libraries (using Python’s, which are strong but less developed) and programs take 10 to 20 times longer to run. Rewrite the Python program in Scala and one regains access to the Java libraries and Java’s speed, but loses access to Python’s libraries and will have trouble transliterating some of Python’s most dynamic (controversial, but sometimes powerful) features.

Order-of-magnitude differences exist among programming languages, with the winners depending largely on one’s definition of “quality”. Python is expressive but the interpreter is slow. Java is fast but verbose and horrendous to read, making source code difficult to maintain and “code rot” inevitable. Scala is fast and concise, with a powerful type system, but few people understand the language or its type system well, so it’s not really possible to hire 50 Scala developers per month. C is verbose (for large-scale, complex software) and light on features, but the best or only choice for a variety of problems, such as writing device drivers, real-time programs, and operating systems. No programming language can be categorized as the uniform “best”; all have their strengths and weaknesses (except for C++, which is a bastardization of C and should never be used except to troll people).

This essay isn’t about which languages are good and which are not, but it answers a question: why do most businesses use the wrong programming languages? Why are so many development shops using C++ and Java, when they could accomplish four times as much per developer-month using a more expressive language like Python or Scala? I think a major part of it comes from the fact that we call them programming languages. For an analogous question: Why do most American businesses use English? Not because English is the “best” language (if such a notion could even be defined for a natural language, and I doubt it can) but because it’s “the standard”: it’s what most other Americans use to communicate. With natural languages, there aren’t order-of-magnitude differences in capabilities, and the difficulty of learning a new natural language is high (legend has it that some of the American “founding fathers” despised the British so much that they wanted to replace English with French or Hebrew as the new country’s “official language”; the idea, of course, went nowhere) so choosing the default natural language is such an obvious choice that it’s often not even a conscious decision.

Java and C++, analogously, are favored in the software industry because they’re “the standard”, like English. To a businessperson unfamiliar with technology, a proposal to implement software in Scala seems like the suggestion that all business should be conducted in Swahili– for a New York firm, ridiculous. This is why many software shops are far more skeptical of “other languages” than they would be if they understood the situation and the potential gains.

How do we dispel these notions? I think one change should be in our terminology. Often, when we debate languages, we’re actually discussing technologies built on languages. For example, sentences like “C is fast” and “Python is slow” are ridiculous as stated. A language (formally speaking) is just a set of strings of symbols, with no intrinsic concept of “speed”. What people mean, in fact, is that “compiled C programs have excellent performance due to advanced compilers and the language’s support for powerful optimizations” and “Python programs run about 10 to 20 times more slowly than analogous programs in C” (but still fast, because all computers are “fast” these days). Speed isn’t an intrinsic trait of the language, but of the technologies that compile and interpret it. Likewise, the common justification for using Java is that “it has the largest set of active libraries.” Well, not exactly. There are a variety of exciting (and superior) JVM languages– most notably Scala and Clojure– that can use the bulk of these tools equally seamlessly, which means that there’s no excuse (in most cases) for preferring Java.

Programming languages are languages, but the decision to adopt one programming language over another has no similarity to such a decision over natural languages. The relevant comparison, in technology, is one of platforms, programming styles, capabilities, and supporting technologies. Referring to this decision-making process as a “language debate” is, in this context, myopic. Far more is at stake than the choice of “language” alone; the chosen programming language influences the types of programs one can write, and the variation in program quality that is staked on language selection alone is quite high.

Object Disoriented Programming

It is my belief that what is now called “object-oriented programming” (OOP) is going to go down in history as one of the worst programming fads of all time, one that has wrecked countless codebases and burned millions of hours of engineering time worldwide. Though a superficially appealing concept– programming in a manner that is comprehensible at the “big picture” level– it fails to deliver on this promise, it usually fails to improve engineer productivity, and it often leads to unmaintainable, ugly, and even nonsensical code.

I’m not going to claim that object-oriented programming is never useful. There would be two problems with such a claim. First, OOP means many different things to different people– there’s a parsec of daylight between Smalltalk’s approach to OOP and the abysmal horrors currently seen in Java. Second, there are many niches within programming with which I’m unfamiliar and it would be arrogant to claim that OOP is useless in all of them. Almost certainly, there are problems for which the object-oriented approach is one of the more useful. Instead, I’ll make a weaker claim but with full confidence: as the default means of abstraction, as in C++ and Java, object orientation is a disastrous choice.

What’s wrong with it?

The first problem with object-oriented programming is mutable state. Although I’m a major proponent of functional programming, I don’t intend to imply that mutable state is uniformly bad. On the contrary, it’s often good. There are a not-small number of programming scenarios where mutable state is the best available abstraction. But it needs to be handled with extreme caution, because it makes code far more difficult to reason about than if it is purely functional. A well-designed and practical language will generally allow mutable state, but encourage it to be segregated into the places where it is necessary. A supreme example of this is Haskell, where any function with side effects reflects the fact in its type signature. By contrast, modern OOP encourages the promiscuous distribution of mutable state, to such a degree that difficult-to-reason-about programs are not the exceptional rarity but the norm. Eventually, the code becomes outright incomprehensible– to paraphrase Boromir, “one does not simply read the source code”– and even good programmers (unknowingly) damage the codebase as they modify it, adding complexity without full comprehension. These programs fall into an understood-by-no-one state of limbo and become nearly impossible to debug or analyze: the execution state of a program might live in thousands of different objects!
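
To make the “segregate it, don’t ban it” point concrete, here is a trivial sketch in Ocaml (Ocaml rather than Haskell, so I don’t have to explain monads). The mutable reference is confined to a single function body, so callers see something indistinguishable from a pure function:

    (* The mutable counter never escapes sum_list, so from the outside this is
       indistinguishable from a pure function. *)
    let sum_list xs =
      let total = ref 0 in
      List.iter (fun x -> total := !total + x) xs;
      !total

    let () = Printf.printf "%d\n" (sum_list [ 1; 2; 3; 4 ])   (* prints 10 *)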

Object-oriented programming’s second failing is that it encourages spaghetti code. For example, let’s say that I’m implementing the card game Hearts. To represent cards in the deck, I create a Card object, with two attributes: rank and suit, both of some sort of discrete type (integer, enumeration). This is a struct in C or a record in Ocaml or a data object in Java. So far, no foul. I’ve represented a card exactly how it should be represented. Later on, to represent each player’s hand, I have a Hand object that is essentially a wrapper around an array of cards, and a Deck object that contains the cards before they are dealt. Nothing too perverted here.
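
In Ocaml (the “record” case just mentioned), that entire starting point fits in a few lines. The type and field names below are my own sketch, but they follow the description above:

    type suit = Clubs | Diamonds | Hearts | Spades

    (* A card is exactly two pieces of data: a rank (2..14, where 11..14 stand for
       J, Q, K, A) and a suit. *)
    type card = { rank : int; suit : suit }

    type hand = card list   (* a player's hand: just a collection of cards *)
    type deck = card list   (* the cards not yet dealt *)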

In Hearts, the person with the 2 of clubs leads first, so I might want to determine in whose hand that card is. Ooh! A “clever” optimization draws near! Obviously it is inefficient to check each Hand for the 2 of clubs. So I add a field, hand, to each Card that is set when the card enters or leaves a player’s Hand. This means that every time a Card moves (from one Hand to another, into or out of a Hand) I have to touch the pointer– I’ve just introduced more room for bugs. This field’s type is a Hand pointer (Hand* in C++, just Hand in Java). Since the Card might not be in a Hand, it can be null sometimes, and one has to check for nullness whenever using this field as well. So far, so bad. Notice the circular relationship I’ve now created between the Card and Hand classes.
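
For contrast, here is the lookup that the back-pointer was supposed to speed up, done the boring way against the plain types sketched above. Scanning four hands of thirteen cards is instantaneous, so the circular Card-to-Hand reference buys nothing:

    (* Which hand (by index) holds the 2 of clubs?  Just scan.  Uses the card and
       hand types from the earlier sketch. *)
    let holder_of_two_of_clubs (hands : hand list) : int option =
      let is_two_of_clubs c = c.rank = 2 && c.suit = Clubs in
      let rec go i = function
        | [] -> None
        | h :: rest -> if List.exists is_two_of_clubs h then Some i else go (i + 1) rest
      in
      go 0 hands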

It gets worse. Later, I add a picture attribute to the Card class, so that each Card is coupled with the name of an image file representing its on-screen appearance, plus ten or twelve methods for the various ways I might wish to display a Card. Moreover, it becomes clear that my specification regarding a Card’s location in the game (either in a Hand or not in a Hand) was too weak. If a Card is not in a Hand, it might also be on the table (just played to a trick), in the deck, or out of the round (having been played). So I rename the hand attribute to place, and change its type to Location, from which Hand, Deck, and PlaceOnTable all inherit.

This is ugly, and getting incomprehensible quickly. Consider the reaction of someone who has to maintain this code in the future. What the hell is a Location? From its name, it could be (a) a geographical location, (b) a position in a file, (c) the unique ID of a record in a database, (d) an IP address or port number or, what it actually is, (e) the Card’s location in the game. From the maintainer’s point of view, really getting to the bottom of Location requires understanding Hand, Deck, and PlaceOnTable, which may reside in different files, modules, or even directories. It’s just a mess. Worse yet, in such code the “broken window” behavior starts to set in. Now that the code is bad, those who have to modify it are tempted to do so in the easiest (but often kludgey) way. Kludges multiply and, before long, what should have been a two-field immutable record (Card) has 23 attributes and no one remembers what they all do.

To finish this example, let’s assume that the computer player for this Hearts game contains some very complicated AI, and I’m investigating a bug in the decision-making algorithms. To do this, I need to be able to generate arbitrary game states as test cases. Constructing a game state requires that I construct Cards. If Card were left as it should be– a two-field record type– this would be a very easy thing to do. Unfortunately, Card now has so many fields, and it’s not clear which can be omitted or given “mock” values, that constructing one intelligently is no longer possible. Will failing to populate the seemingly irrelevant attributes (like picture, which is presumably connected to graphics and not the internal logic of the game) compromise the validity of my test cases? Hell if I know. At this point, reading, modifying, and testing code becomes more about guesswork than anything sound or principled.
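
Compare what testing looks like if Card had stayed a two-field record: a fixture is a one-liner, with nothing to mock. Again, this is a sketch built on the types from above:

    (* A hand-built game state for testing the AI: no picture paths, no Location
       pointers, nothing to mock. *)
    let test_hand : hand =
      [ { rank = 2; suit = Clubs }; { rank = 12; suit = Hearts }; { rank = 14; suit = Spades } ]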

Clearly, this is a contrived example, and I can imagine the defenders of object-oriented programming responding with the counterargument, “But I would never write code that way! I’d design the program intelligently in advance.” To that I say: right, for a small project like a Hearts game; wrong, for the complex software developed in the professional world. What I described is certainly not how a single intelligent programmer would code a card game; it is indicative of how software tends to evolve in the real world, with multiple developers involved. Hearts, of course, is a closed system: a game with well-defined rules that isn’t going to change much in the next 6 months. It’s therefore possible to design a Hearts program intelligently from the start and avoid the object-oriented pitfalls I intentionally fell into in this example. But for most real-world software, requirements change and the code is often touched by a number of people with widely varying levels of competence, some of whom barely know what they’re doing, if at all. The morass I described is what object-oriented code devolves into as the number of lines of code and, more importantly, the number of hands, increases. It’s virtually inevitable.

One note about this is that object-oriented programming tends to be top-down, with types being subtypes of Object. What this means is that data is often vaguely defined, semantically speaking. Did you know that the integer 5 doubles as a DessertToppingFactoryImpl? I sure didn’t. An alternative and usually superior mode of specification is bottom-up, as seen in languages like Ocaml and Haskell. These languages offer simple base types and encourage the user to build more complex types from them. If you’re unsure what a Person is, you can read the code and discover that it has a name field, which is a string, and a birthday field, which is a Date. If you’re unsure what a Date is, you can read the code and discover that it’s a record of three integers, labelled year, month, and day. If you want to get “to the bottom” of a datatype or function when types are built from the bottom up, you can do so, and it rarely involves pinging across so many (possibly semi-irrelevant) abstractions and files as to shatter one’s “flow”. Circular dependencies are very rare in bottom-up languages. Recursion, in languages like ML, can exist both in datatypes and functions, but it’s hard to cross modules with it or create such obscene indirection as to make comprehension enormously difficult. By contrast, it’s not uncommon to find circular dependencies in object-oriented code. In the atrocious example I gave above, Hand depends on Card, Card depends on Location, and understanding Location requires understanding Hand, which inherits from it.
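
In Ocaml, that bottom-up reading is literally two lines; this sketch mirrors the Person and Date description above:

    (* Getting "to the bottom" of person takes two lines of reading: a date is
       three labelled integers, and a person is a name plus a date. *)
    type date = { year : int; month : int; day : int }
    type person = { name : string; birthday : date }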

Why does OOP devolve?

Above, I described the consequences of undisciplined object-oriented programming. In limited doses, object-oriented programming is not so terrible. Neither, for that matter, is the much-hated “goto” statement. Both of these are tolerable when used in extremely disciplined ways with reasonable and self-evident intentions. Yet when used by any but the most disciplined programmers, OOP devolves into a mess. This is hilarious in the context of OOP’s original promise to business types in the 1990s– that it would enable mediocre programmers to be productive. What it actually did is create a coding environment in which mediocre programmers (and rushed or indisposed good ones) are negatively productive. It’s true that terrible code is possible in any language or programming paradigm; what makes object orientation such a terrible default abstraction is that, as with unstructured programming, bad code is an asymptotic inevitability as an object-oriented program grows. In order to discuss why this occurs, it’s necessary to discuss object orientation from a more academic perspective, and pose a question to which thousands of answers have been given.

What’s an object? 

To a first approximation, one can think of an object as something that receives messages and performs actions, which usually include returning data to the sender of the message. Unlike a pure function’s, the response to each message is allowed to vary. In fact, it’s often required to do so. The object often contains state that is (by design) not directly accessible, but only observable by sending messages to the object. In this light, the object can be compared to a remote-procedure call (RPC) server. Its innards are hidden, possibly inaccessible, and this is generally a good thing in the context of, for example, a web service. When I connect to a website, I don’t care in the least about the state of its thread pooling mechanisms. I don’t want to know about that stuff, and I shouldn’t be allowed access to it. Nor do I care what sorting algorithm an email client uses to sort my email, as long as I get the right results. On the other hand, for code whose internals one is (or, at least, might one day be) responsible for comprehending, such incomplete comprehension is a very bad thing.

To “What is an object?” the answer I would give is that one should think of it as a miniature RPC server. It’s not actually remote, nor as complex internally as a real RPC or web server, but it can be thought of this way in terms of its (intentional) opacity. This sheds light on whether object-oriented programming is “bad”, and on the question of when to use objects. Are RPC servers invariably bad? Of course not. On the other hand, would anyone in his right mind code Hearts in such a way that each Card were its own RPC server? No. That would be insane. If people treated object topologies with the same care as network topologies, a lot of horrible things that have been done to code in the name of OOP might never have occurred.

Alan Kay, the inventor of Smalltalk and the original conception of “object-oriented programming”, has argued that the failure of what passes for OOP in modern software is that objects are too small and that there are too many of them. Originally, object-oriented programming was intended to involve large objects that encapsulated state behind interfaces that were easier to understand than the potentially complicated implementations. In that context, OOP as originally defined is quite powerful and good; even non-OOP languages have adopted that virtue (also known as encapsulation) in the form of modules.

Still, the RPC-server metaphor for “What is an Object?” is not quite right, and the philosophical notion of “object” is deeper. An object, in software engineering, should be seen as a thing of which the user is allowed to have (and is often supposed to have) incomplete knowledge. Incomplete knowledge isn’t always a bad thing; often, it’s an outright necessity due to the complexity of the system. For example, SQL is a language in which the user specifies an ad-hoc query to be run against a database with no indication of what algorithm to use; the database system figures that out. For this particular application, incomplete knowledge is beneficial; it would be ridiculous to burden everyone who wants to use a database with the immense complexity of its internals.

Object-orientation is the programming paradigm based on incomplete knowledge. Its purpose is to enable computation with data of which the details are not fully known. In a way, this is concordant with the English use of the word “object” as a synonym for “thing”: it’s an item of which one’s knowledge is incomplete. “What is that thing on the table?” “I don’t know, some weird object.” Object-oriented programming is designed to allow people to work with things they don’t fully understand, and even to modify them in spite of incomplete comprehension. Sometimes that’s useful or necessary, because complete knowledge of a complex program can be humanly impossible. Unfortunately, over time the over-tolerance of incomplete knowledge leads to an environment where important components can elude the knowledge of each individual responsible for creating them; the knowledge is strewn about many minds haphazardly.

Modularity

Probably the most important predictor of whether a codebase will remain comprehensible as it becomes large is whether it’s modular. Are the components individually comprehensible, or do they form an irreducibly complex tangle in which one must understand all of it (which may not even be possible) before one can understand any of it? In the latter case, progress grinds to a halt, and quality even backslides, as the size of the codebase increases. In terms of modularity, the object-oriented paradigm generally performs poorly, facilitating the haphazard growth of codebases in which answering simple questions like “How do I create and use a Foo object?” can require days-long forensic capers.

The truth about “power”

Often, people describe programming techniques and tools as “powerful”, and that’s taken to be an endorsement. A counterintuitive and dirty secret about software engineering is that “power” is not always a good thing. For a “hacker”– a person writing “one-off” code that is unlikely ever to require future reading by anyone, including the author– all powerful abstractions, because they save time, can be considered good. However, in the more general software engineering context, where any code written is likely to require maintenance and future comprehension, power can be bad. For example, macros in languages like C and Lisp are immensely powerful. Yet it’s obnoxiously easy to write incomprehensible code using these features.

Objects are, likewise, immensely powerful (or “heavyweight”) beasts when features like inheritance, dynamic method dispatch, open recursion, et cetera are considered. If nothing else, one notes that objects can do anything that pure functions can do– and more. The notion of “object” is both a very powerful and a very vague abstraction.

“Hackers” like power, and from that standpoint a language can be judged on the power of the abstractions it offers. But real-world software engineers spend an unpleasantly large amount of time reading and maintaining others’ code. From an engineer’s perspective, a language is also judged by what it prevents other programmers from doing to the rest of us, who have to maintain their code in the future. In this light, the unrestrained use of Lisp macros and object-oriented programming is bad, bad, bad. From this perspective, a language like Ocaml or Haskell– of middling power but beautifully designed to encourage deployment of the right abstractions– is far better than a more “powerful” one like Ruby.

As an aside, a deep problem in programming language design is that far too many languages are designed with the interests of code writers foremost in mind. And it’s quite enjoyable, from a writer’s perspective, to use esoteric metaprogramming features and complex object patterns. Yet very few languages are designed to provide a beautiful experience for readers of code. In my experience, ML does this best, and Haskell does it well, while most of the mainstream languages fall short of being even satisfactory. In most real-world software environments, reading code is so unpleasant that it hardly gets done with any rigor at all. Object-oriented programming, and the haphazard monstrosities its “powerful” abstractions enable, is a major culprit.

Solution?

The truth, I think, about object-oriented programming is that most of its core concepts– inheritance, easy extensibility of data, proliferation of state– should be treated with the same caution and respect that a humble and intelligent programmer gives to mutable state. These abstractions can be powerful and work beautifully in the right circumstances, but they should be used very sparingly and only by people who understand what they are actually doing. In Lisp, it is generally held to be good practice never to write a macro when a function will do. I would argue the same with regard to object-oriented programming: never write an object-oriented program for a problem where a functional or cleanly imperative approach will suffice. Certainly, to make object orientation the default means of abstraction, as C++ and Java have, is a proven disaster.

Abstractions, and especially powerful ones, aren’t always good. Using the right abstractions is of utmost importance. Abstractions that are too vague, for example, merely clutter code with useless nouns. As the first means of abstraction in high-level languages, higher-order functions suffice most of the time– probably over 95% of the time, in well-factored code. Objects may find favor for being more general than higher-order functions, but “more general” also means less specific, and for the purpose of code comprehension, this is a hindrance, not a feature. If cleaner and more comprehensible pure functions and algebraic data types can be used in well over 90 percent of the places where objects appear in OO languages, they should be used in lieu of objects, and languages should support them well– something C++ and Java don’t do.
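
To illustrate what I mean by higher-order functions sufficing: here is an Ocaml sketch of the kind of thing that, in Java, tends to get dressed up as a Comparator or “strategy” object (the sort_by name is my own):

    (* The "strategy object" reduced to a function argument: pass the key function
       directly instead of wrapping it in a one-method object. *)
    let sort_by key xs = List.sort (fun a b -> compare (key a) (key b)) xs

    let () =
      List.iter print_endline (sort_by String.length [ "pear"; "fig"; "banana" ])
      (* prints: fig, pear, banana *)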

In a better world, programmers would be required to learn how to use functions before progressing to objects, and object-oriented features would hold the status of being available but deployed only when needed, or in the rare cases where such features make remarkably better code. To start, this change needs to come about at a language level. Instead of Java or C++ being the first languages to which most programmers are introduced, that status should be shifted to a language like Scheme, Haskell, or my personal favorite for this purpose: Ocaml.