REPL or Fail

I wrote a post last week on the idealized trajectory of a software engineer in general professional ability, and I find the 3-point scale (with one decimal point) that I developed to be quite useful. It describes the transition from additive to multiplicative contributions to a team, with 1.0 representing the baseline competence (a net adder, rather than a subtracter) of a professional programmer, 1.2 being about average for the industry, 1.5 being seriously good (“senior” in most contexts) and 2.0 representing a consistent multiplier and technical leader. Sadly, one of the more common roadblocks occurs early (around 1.0 to 1.2) for most developers, and it’s often associated with established programming languages like Java, C++, and Visual Basic. What’s going on? Why do so many programmers reach a hard ceiling, with some persisting there for decades, while others pass through this barrier with ease? What is it about certain languages or technologies that holds programmers back?

Programming is a two-class society. We have the mere “coders” who use only one language, hate the command line, and don’t program outside of work. They typically work on bland, “enterprise” projects and solve annoyingly detailed, but not difficult, problems. They generally find programming to be a boring task, but “it pays the bills” and the other smart-person route to management (actuarial science) involves hard exams. If they remain “in programming”, they’re lucky to break six figures when they’re 40, and they are likely to face age discrimination and layoffs in the decade after that. On my scale, they plateau around 1.2 because, in enterprise programming, those who reach 1.3-1.4 are usually brought into management. For a contrast, the other class is comprised of elite “hackers” who prefer languages like Python, Scala and Erlang (although some might have to use Java) and who program outside of work, who are hotly desired by huge companies and startups alike, and who continue growing even into old age. These usually reach 1.3 in their first few years as professional programmers, and reliably break 1.5 by mid-career. What separates the two? What is it about a programming language that makes it highly indicative of a software engineer’s future progress (or lack thereof)?

It’s evident that this problem is outside of languages themselves, because a 1.5 programmer can, after some adjustment, program at a comparable level in Java. So it’s actually not the case (aside from an opportunity cost argument) that Java and C++ make people worse programmers. Rather, something happens in other, more modern, languages that makes people improve faster and helps them smash through that 1.2 barrier. So, what is it? I spent much time trying to figure this out, and when I came upon the answer, it was so simple it shocked me: The Mighty REPL.

Modern programming languages supply an interactive mode, also known as a “read-eval-print loop” or REPL, as part of the programming environment. The REPL allows programmers to try out code and get immediate results, and to explore existing software modules interactively by calling their functions and seeing what they do. Although typically associated with interpreted languages (notably, Lisp), REPLs are also supplied for compiled languages such as Scala, Ocaml, and Haskell. (They do not provide the speed of compiled code, but code is not run for speed in the interactive mode.) This tight feedback loop facilitates a style of development and code exploration that is far more engaging and effective than the processes of writing and reading code would be without a REPL, and far superior to anything available from an IDE. (A REPL is, in some sense, a simple but highly effective IDE. Properly built, it makes IDEs unnecessary.)

Programmer productivity is binary, in that a programmer is either in a productive, engaged state of “flow” or in a disorganized, unpleasant, and unproductive state out of flow where 10% as much (if that) is accomplished. (A significant amount of work stress, in my estimation, is caused by a self-inflicted sense of pressure to be productive while one is out of flow.) It takes programmers about 15 to 30 minutes to enter flow, but once in it, they are immensely productive and moreover, quite happy. Developers usually write the most code and their best code when in flow, and are best at reading code when in this state as well. There’s a problem: reading other peoples’ code, if the code presents a lot of accidental complexity obscuring the question the reader is trying to answer, often shatters flow (which is why bad code is hated with a passion that only programmers understand).

Programmers understand flow and its importance from experience, and flow becomes more important as one increases one’s programming skills (to the point that 2.0+ programmers, rather than negotiating for higher salaries, tend to negotiate perks oriented toward flow and engagement, such as a quiet working space and an unconditional right to turn down meetings). An experienced programmer can work at a 1.0 level (cranking out code of nominal additive value) without inspiration, but 1.5+ and especially 2.0+ level contributions require creativity and focus. So experienced, elite programmers know that a REPL-less language is generally a dead-end. In a “green field” environment where the programmer controls the entire context in which his work exists, engaged writing of code in languages like Java is still possible– but engaged reading of code is out of the question. The engaging way to read code is to get a big-picture sense of it through interaction, and then to examine the code for implementation details strictly after this has been achieved.

This, in my mind, singularly explains the ceiling that Java developers hit around 1.2. By some non-satisfactory definition akin to Turing-completeness for programming languages, a 1.2 programmer has the knowledge and resources to “solve any programming problem” (ignoring performance and feasibility concerns). The solution may be inelegant, slow, unmaintainable, and even bug-prone, and it might take a long time for the solution to be delivered, but there aren’t programming problems that a 1.2 (or even 1.0) engineer “can’t solve”. On the other hand, far more interesting is how people solve problems, and some of the things that programmers must do if they want to become elite (1.5+) programmers are (a) figure out which problems are worth solving, and (b) learn how to read solutions that other people have created. To become a great programmer, one must be able to read code (in an engaged state of flow). Moreover, one must read good code, a commodity that is depressingly rare in this industry.

Reading code can be an immense joy that produces “Aha!” moments, or it can be hellishly tedious and unfruitful. Sadly, most real-world code (especially in languages like Java and C++) is closer to the latter extreme– it’s probably over 90 percent. Code rots for a variety of reasons. One is the wine/sewage problem (“a teaspoon of sewage in a barrel of wine makes a barrel of sewage, a teaspoon of wine in a barrel of sewage makes a barrel of sewage”): if a system is corrupted and the nastiness isn’t aggressively refactored, the kludges will beget counterkludges in “maintenance” and destroy the whole system. A related issue is the “broken windows” effect: tolerance of ugliness leads to a sense of abandon, and this is more common than most programmers will admit. Modifying code in a reasonable way (i.e. one that doesn’t, despite solving an immediate bug or adding a specific feature, make the general quality of the code worse) requires understanding it, and that usually involves reading it, and most code is so terrible that programmers who have to use, extend or maintain it just give up on comprehension and “hack” it as far as they can. Programming in this style is akin to the “Jenga” game, where players must remove planks from a tower and place them on top of it, making the structure less sturdy and higher as they go (until it collapses, and the player whose turn it is loses).

There’s no silver bullet for code comprehension, but the REPL is the closest thing. The worst code may remain impenetrable, but few real projects begin their existence as incomprehensible legacy nightmares; they usually start out as average code for a project of that size. REPLs make it possible to explore average-case code and comprehend it without dedicating massive amounts of time to the process, and this makes a huge difference. Aging modules can be refactored in the earliest stages of decay, long before they get anywhere near the “legacy horror” state. By enabling interactive peeking and poking of functions, REPLs allow programmers to explore libraries and get a sense of their interfaces. For example, in Ocaml, it’s possible to get the full type signature of any module:

# module L = List;;
module L :
  sig
    val length : 'a list -> int
    val hd : 'a list -> 'a
    val tl : 'a list -> 'a list
    val nth : 'a list -> int -> 'a
    val rev : 'a list -> 'a list
    val append : 'a list -> 'a list -> 'a list
    val rev_append : 'a list -> 'a list -> 'a list
    val concat : 'a list list -> 'a list
    val flatten : 'a list list -> 'a list
    val iter : ('a -> unit) -> 'a list -> unit
    val map : ('a -> 'b) -> 'a list -> 'b list
    [...]
  end

From these type signatures, it’s relatively easy to get a sense of what these functions do and to test those intuitions:

# List.length [1; 1; 2; 3; 5; 8];;
- : int = 6
# List.map (fun x -> x*x) [1; 1; 2; 3; 5; 8];;
- : int list = [1; 1; 4; 9; 25; 64]
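The same kind of guess can be written down as a check instead of being eyeballed. What follows is only a sketch of that habit, using three functions from the signature above; the list name fib is my own, and each assert passes silently if the guess read off the type was right.

```ocaml
(* Guesses read off the type signatures, written as checks. *)
let fib = [1; 1; 2; 3; 5; 8]

let () =
  (* nth : 'a list -> int -> 'a
     Guess: indexing is zero-based. *)
  assert (List.nth fib 0 = 1);
  assert (List.nth fib 5 = 8);
  (* rev_append : 'a list -> 'a list -> 'a list
     Guess: reverse the first list, then put it in front of the second. *)
  assert (List.rev_append [3; 2; 1] [4; 5] = [1; 2; 3; 4; 5]);
  (* concat : 'a list list -> 'a list
     Guess: flatten one level of nesting. *)
  assert (List.concat [[1; 1]; [2; 3]; [5; 8]] = fib)
```

A wrong guess raises Assert_failure with the file and line, which is about as quick a correction as one can hope for.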

With Ocaml’s powerful REPL, a person can explore code and get a sense of the big picture before starting to read it. That makes a huge difference: reading code is an order of magnitude easier and more engaging when one understands what one is looking at. Moreover, many Lisps such as Clojure and Common Lisp provide documentation functions at the REPL that allow the user to read a function’s documentation without having to leave the command-line. This provides all of the benefits of an IDE, without the flow-breaking drawbacks.

As an aside, there’s something that elite programmers (1.5+) call “keyboard snobbery”: IDEs are scorned, while the key-combos of emacs and vim are venerated, even with anachronistic names like “meta” for the escape key. The command line interface is highly valued. This doesn’t apply to all computer use (when web surfing, keyboard snobs use the mouse like anyone else) but it does apply to writing and reading code. Why? Because the mouse is physical, continuous and imprecise, while the keyboard is cerebral, discrete and exact (and therefore a better tool when programming). When we use web pages, we trust the developer to handle the imprecision (in determining whether a button was clicked, and in interpreting the mouse event). This is fine for this purpose, but when we’re writing code, we want exactitude and total control of our interaction with the machine. We want exactly the result we expect at all times. So that’s why we prefer the keyboard when coding, but there’s something else going on as well. Switching from keyboard to mouse doesn’t only involve a move of the hand. It reframes the interaction between the human and the machine, and that’s a context switch. Seemingly benign context switches inflict major drag on programmer productivity. Switching to the mouse because one’s IDE requires it? That’s 3-5 minutes. Pinging about the filesystem because of some stupid requirement that each class live in its own file? That’s about ten minutes. Managerial interruption? An hour, and half a day if the meeting is unexpected and intense. Programmers hate being nickel-and-dimed by context switches, and they hate being out of flow. This is why seriously good programmers prefer “archaic” tools like the command-line interface, vim and emacs, while considering typical mouse-driven modern IDEs (which are necessary if one is developing in a verbose basketcase of a language like Java, but unnecessary in better languages) to be useless.

The REPL, served at the commandline, allows a person to interact with code as if it were live, and see what the pieces do. At least half of what we must do as programmers is comprehension of assets that other people have created, and the REPL allows us to do this without the painful context switch associated with having to read code cold. It enables “flowful” (that is, engaging) exploration and, later, “lazy reading” of code. (Lazy, in this sense, is a non-pejorative computer science term associated with doing only the work needed to solve a problem.) That’s something REPL-less languages can’t provide, because in them, code is a dead static thing that might be run against some dead static tests, not something a developer can interact with as he works.

Reading code is part of the job description of any programmer, and yet it’s rarely done well because enterprise languages like Java make the process so dismal that most people just give up, falling into abominable development practices. When 90 percent of the code is tedious boilerplate (accidental complexity) that isn’t worth the eye strain, it’s easy to miss crucial details. The accumulation of missed details leads to frank incomprehension quickly, and then development practices akin to “throwing mud at the wall and seeing what sticks” become the norm.

This, I believe, is what holds back most Java developers’ progress. Not only does code in such languages become horrible quickly, but the environment makes it unpleasant to read even “good” (by which I mean “above average for the language”) code. In fact, most IDEs tacitly assume that no one is going to bother to read code after it is first written, and adjust accordingly.

For a contrast, this is something Ocaml got right in a major way. Ocaml is an obscure “niche” language, but it has the highest average quality of programmers that I’ve ever seen (even higher than Haskell and Lisp, although those are close). I don’t believe the reason for this is that only good developers can use Ocaml. Instead, what Ocaml achieves is that it makes it a joy to read average-case code– no small feat. Pattern matching, a core feature of the language, is explicitly designed to make what would otherwise be complex control flows human-readable. Haskell is an excellent language as well, and extremely terse, but in my belief it’s optimized (more than Ocaml) for writers of code (although still far better, from a reader’s perspective, than Java). I would guess that the ML family of languages (which are elegantly simple) are the only languages on earth that go so far to make almost all code readable, even in large systems. Of course, it’s still absolutely possible to write horrible, illegible Ocaml code– the language puts up more of a fight against bad practices than most, but it can be done. The difference, relevant for economic rather than purist discussions, is that average-case Ocaml code is attractive, whereas even very good Java code looks only 20 percent less ugly than typical “bad” Java code.
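To give a feel for what pattern matching buys the reader, here is a minimal sketch. The expr type and the eval and simplify functions are invented for illustration and are not taken from any library; simplify applies only a handful of rewrites in a single pass and makes no claim to completeness.

```ocaml
(* A tiny arithmetic expression type, invented for this example. *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr
  | Neg of expr

(* One line per shape of data; the compiler warns if a case is forgotten. *)
let rec eval e =
  match e with
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b
  | Neg a -> -(eval a)

(* Nested patterns stand in for what would otherwise be nested if/else
   chains. Cases are tried top to bottom, so specific rewrites come first. *)
let rec simplify e =
  match e with
  | Add (Num 0, x) | Add (x, Num 0) -> simplify x
  | Mul (Num 1, x) | Mul (x, Num 1) -> simplify x
  | Mul (Num 0, _) | Mul (_, Num 0) -> Num 0
  | Neg (Neg x) -> simplify x
  | Add (a, b) -> Add (simplify a, simplify b)
  | Mul (a, b) -> Mul (simplify a, simplify b)
  | Neg a -> Neg (simplify a)
  | Num _ -> e
```

Each rewrite rule reads almost like the algebraic identity it encodes, and that is the property I mean by human-readable control flow.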

In a language like Ocaml, there’s so little boilerplate and accidental complexity that one can look at the code and actually see the problem being solved. For larger systems, one can test one’s intuitions at the REPL. No one needs to rifle through 300-page design documents to understand what a well-written Ocaml program does. The consequence of this is that Ocaml has libraries of generally very clean code that people can read as they learn the language. Since it’s not an unpleasant process to read code, they do so, and they grow as programmers at a rate that would be unheard-of in Java or C++: rising from 0.8 to 1.5 in about two to three years is typical. Ocaml isn’t some “hard” language that only 1.5+ programmers have a chance of understanding. It’s a language that turns ordinary programmers into 1.5ers rapidly.

There’s one language that can be cited as a counterexample to the “REPL or Fail” rule, and that’s C. C, invented in the 1970s, doesn’t have a REPL. Why was this acceptable for C? First, the language grew up in a different time, when small programs (that would today be replaced by “scripts” unless performance were an issue) were the norm. Programmers could “grow up” on C in 1985 because the programs they’d be reading were small and had well-defined semantics. Second, to say that “C lacks a REPL” is a bit strict. It doesn’t have a language-native REPL, but the Unix/C environment does have a (rudimentary, but sufficient) REPL: the command-line console. This was the environment in which C programs were run and explored: first you run wc and cat to see what they do, and then you could look at their C code and discover how they do it. C was designed with a “small-program” model of development (because large, megalithic programs were simply untenable in 1975) in mind. If complex behaviors were desired, they could be established by composing independent C programs and having them communicate through pipes and sockets. In this world, one could read “a whole C program” (a small, independent module, usually in one file) in one sitting. One only needed a REPL (command-line console) to understand the bigger environment: Unix.

How’d we end up with these disengaging, REPL-less languages? As I said, speaking superficially and strictly, C has no REPL. This was not a problem for C because large programs were so rarely written in it, and enough small, well-written C programs were distributed in every Linux environment that a programmer could learn the language from those. Where C++ differs is that large, complex, and monolithic programs are written in it, because the language has just enough in the way of high-level support to let people attempt them. The result is that C++ supports beasts of complexity (such as 200-line functions, 1000-line class definitions, and 1-million-line whole programs spanning several directories) that would be unconscionable in C, and yet fails to provide the one tool that might enable a programmer to make sense of such things. Although writing a C++ REPL is possible, it wouldn’t be easy: the language is so deeply imperative and crystalline that overcoming the mismatch between the two models of programming would be a monumental task. Java, as a descendant of C++ in syntax and culture, inherited most of these illnesses from it while becoming the default language for enterprise programming, and was also launched without a REPL. The result is that millions of people are stuck in a REPL-less language and don’t know why, while hacking on monolithic projects of intractable complexity that are doomed to get worse over time.

The REPL isn’t just a tool. It’s an engaging classroom in which one learns how to be a programmer. It’s absolutely necessary for a person assigned a task that involves comprehending a complex piece of code. And unlike the training wheels of an IDE, it doesn’t attempt to hide “difficult” details from the developer; it allows her to explore them to arbitrary depth when she is ready.

For these reasons, the interactive mode can’t be considered a luxury of those who are privileged enough to work in “elite” languages. There’s no reason programming should be that way. If we want to democratize programming (and there’s no reason we can’t have at least ten times as many 1.5+ programmers as are alive now, and 10x is a conservative goal, considering the world population) we need to begin orienting ourselves toward modern languages. And there is one rule that seems more fundamental than any argument about static vs. dynamic typing or imperative vs. functional programming: REPL or fail.

Object Disoriented Programming

It is my belief that what is now called “object-oriented programming” (OOP) is going to go down in history as one of the worst programming fads of all time, one that has wrecked countless codebases and burned millions of hours of engineering time worldwide. Though a superficially appealing concept– programming in a manner that is comprehensible at the “big picture” level– it fails to deliver on this promise, it usually fails to improve engineer productivity, and it often leads to unmaintainable, ugly, and even nonsensical code.

I’m not going to claim that object-oriented programming is never useful. There would be two problems with such a claim. First, OOP means many different things to different people– there’s a parsec of daylight between Smalltalk’s approach to OOP and the abysmal horrors currently seen in Java. Second, there are many niches within programming with which I’m unfamiliar and it would be arrogant to claim that OOP is useless in all of them. Almost certainly, there are problems for which the object-oriented approach is one of the more useful. Instead, I’ll make a weaker claim but with full confidence: as the default means of abstraction, as in C++ and Java, object orientation is a disastrous choice.

What’s wrong with it?

The first problem with object-oriented programming is mutable state. Although I’m a major proponent of functional programming, I don’t intend to imply that mutable state is uniformly bad. On the contrary, it’s often good. There are a not-small number of programming scenarios where mutable state is the best available abstraction. But it needs to be handled with extreme caution, because it makes code far more difficult to reason about than if it is purely functional. A well-designed and practical language generally will allow mutable state, but encourages it to be segregated only into places where it is necessary. A supreme example of this is Haskell, where any function with side effects reflects the fact in its type signature. On the contrary, modern OOP encourages the promiscuous distribution of mutable state, to such a degree that difficult-to-reason-about programs are not the exceptional rarity but the norm. Eventually, the code becomes outright incomprehensible– to paraphrase Boromir, “one does not simply read the source code”– and even good programmers (unknowingly) damage the codebase as they modify it, adding complexity without full comprehension. These programs fall into an understood-by-no-one state of limbo and become nearly impossible to debug or analyze: the execution state of a program might live in thousands of different objects!
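Haskell’s approach puts effects into function types. As a weaker but related illustration in OCaml (my substitution, not a rendering of the Haskell mechanism), the sketch below shows a language that permits mutation but makes the programmer ask for it by name. The point and counter types and both functions are hypothetical.

```ocaml
(* Immutable by default: fields cannot change after construction. *)
type point = { x : int; y : int }

(* "Changing" a point means building a new one; the original is untouched. *)
let move_right p = { p with x = p.x + 1 }

(* Mutation must be requested explicitly, so it shows in the type definition. *)
type counter = { mutable hits : int }

let record_hit c = c.hits <- c.hits + 1

let () =
  let p = { x = 0; y = 0 } in
  let q = move_right p in
  assert (p.x = 0 && q.x = 1);   (* p is unchanged *)
  let c = { hits = 0 } in
  record_hit c;
  record_hit c;
  assert (c.hits = 2)            (* c really was modified in place *)
```

A reader scanning the type definitions can see at a glance that counter is the one place where state can change, and that kind of segregation is what I am arguing for.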

Object-oriented programming’s second failing is that it encourages spaghetti code. For an example, let’s say that I’m implementing the card game, Hearts. To represent cards in the deck, I create a Card object, with two attributes: rank and suit, both of some sort of discrete type (integer, enumeration). This is a struct in C or a record in Ocaml or a data object in Java. So far, no foul. I’ve represented a card exactly how it should be represented. Later on, to represent each player’s hand, I have a Hand object that is essentially a wrapper around an array of cards, and a Deck object that contains the cards before they are dealt. Nothing too perverted here.
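For concreteness, here is roughly what that innocent starting point might look like as OCaml records. This is a sketch under my own assumptions: rank is an int from 2 to 14 with the ace high, and the names all_suits and full_deck are invented.

```ocaml
type suit = Clubs | Diamonds | Hearts | Spades

(* Assumption: rank is an int in 2..14 (ace high); an enumeration would also do. *)
type card = { rank : int; suit : suit }

(* A hand is just a list of cards; no wrapper object is needed. *)
type hand = card list

let all_suits = [Clubs; Diamonds; Hearts; Spades]

(* Thirteen ranks for each of the four suits. *)
let full_deck : card list =
  List.concat
    (List.map
       (fun suit -> List.init 13 (fun i -> { rank = i + 2; suit }))
       all_suits)
```

Everything a card is can be read in one line, and a card, once made, never changes.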

In Hearts, the person with the 2 of clubs leads first, so I might want to determine in whose hand that card is. Ooh! A “clever” optimization draws near! Obviously it is inefficient to check each Hand for the 2 of clubs. So I add a field, hand, to each Card that is set when the card enters or leaves a player’s Hand. This means that every time a Card moves (from one Hand to another, into or out of a Hand) I have to touch the pointer– I’ve just introduced more room for bugs. This field’s type is a Hand pointer (Hand* in C++, just Hand in Java). Since the Card might not be in a Hand, it can be null sometimes, and one has to check for nullness whenever using this field as well. So far, so bad. Notice the circular relationship I’ve now created between the Card and Hand classes.
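For comparison, the “inefficient” version that the back-pointer was meant to replace is only a few lines. The sketch below repeats the two-field card type so that it stands alone; who_leads is a hypothetical name, and players are identified simply by their position in the list of hands.

```ocaml
type suit = Clubs | Diamonds | Hearts | Spades
type card = { rank : int; suit : suit }

(* Return the index of the first hand holding the 2 of clubs, if any.
   At most 52 cards are inspected, so there is nothing worth optimizing. *)
let who_leads (hands : card list list) : int option =
  let two_of_clubs = { rank = 2; suit = Clubs } in
  let rec scan i = function
    | [] -> None
    | h :: rest ->
        if List.mem two_of_clubs h then Some i else scan (i + 1) rest
  in
  scan 0 hands
```

There is no pointer to keep in sync and no null to check; the case where the card is in nobody’s hand is an ordinary None.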

It gets worse. Later, I add a picture attribute to the Card class, so that each Card is coupled with the name of an image file representing its on-screen appearance, and ten or twelve various methods for the number of ways I might wish to display a Card. Moreover, it becomes clear that my specification regarding a Card’s location in the game (either in a Hand or not in a Hand) was too weak. If a Card is not in a Hand, it might also be on the table (just played to a trick), in the deck, or out of the round (having been played). So I rename the hand attribute, place, and change its type to Location, from which Hand and Deck and PlaceOnTable all inherit.

This is ugly, and getting incomprehensible quickly. Consider the reaction of someone who has to maintain this code in the future. What the hell is a Location? From its name, it could be (a) a geographical location, (b) a position in a file, (c) the unique ID of a record in a database, (d) an IP address or port number or, what it actually is, (e) the Card’s location in the game. From the maintainer’s point of view, really getting to the bottom of Location requires understanding Hand, Deck, and PlaceOnTable, which may reside in different files, modules, or even directories. It’s just a mess. Worse yet, in such code the “broken window” behavior starts to set in. Now that the code is bad, those who have to modify it are tempted to do so in the easiest (but often kludgey) way. Kludges multiply and, before long, what should have been a two-field immutable record (Card) has 23 attributes and no one remembers what they all do.

To finish this example, let’s assume that the computer player for this Hearts game contains some very complicated AI, and I’m investigating a bug in the decision-making algorithms. To do this, I need to be able to generate game states as I desire as test cases. Constructing a game state requires that I construct Cards. If Card were left as it should be– a two-field record type– this would be a very easy thing to do. Unfortunately, Card now has so many fields, and it’s not clear which can be omitted or given “mock” values, that constructing one intelligently is no longer possible. Will failing to populate the seemingly irrelevant attributes (like picture, which is presumably connected to graphics and not the internal logic of the game) compromise the validity of my test cases? Hell if I know. At this point, reading, modifying, and testing code becomes more about guesswork than anything sound or principled.

Clearly, this is a contrived example, and I can imagine the defenders of object-oriented programming responding with the counterargument, “But I would never write code that way! I’d design the program intelligently in advance.” To that I say: right, for a small project like a Hearts game; wrong, for real-world, complex software developed in the professional world. What I described is certainly not how a single intelligent programmer would code a card game; it is indicative of how software tends to evolve in the real world, with multiple developers involved. Hearts, of course, is a closed system: a game with well-defined rules that isn’t going to change much in the next 6 months. It’s therefore possible to design a Hearts program intelligently from the start and avoid the object-oriented pitfalls I intentionally fell into in this example. But for most real-world software, requirements change and the code is often touched by a number of people with widely varying levels of competence, some of whom barely know what they’re doing, if at all. The morass I described is what object-oriented code devolves into as the number of lines of code and, more importantly, the number of hands, increases. It’s virtually inevitable.

One note about this is that object-oriented programming tends to be top-down, with types being subtypes of Object. What this means is that data is often vaguely defined, semantically speaking. Did you know that the integer 5 doubles as a DessertToppingFactoryImpl? I sure didn’t. An alternative and usually superior mode of specification is bottom-up, as seen in languages like Ocaml and Haskell. These languages offer simple base types and encourage the user to build more complex types from them. If you’re unsure what a Person is, you can read the code and discover that it has a name field, which is a string, and a birthday field, which is a Date. If you’re unsure what a Date is, you can read the code and discover that it’s a record of three integers, labelled year, month, and day. If you want to get “to the bottom” of a datatype or function when types are built from the bottom up, you can do so, and it rarely involves pinging across so many (possibly semi-irrelevant) abstractions and files as to shatter one’s “flow”. Circular dependencies are very rare in bottom-up languages. Recursion, in languages like ML, can exist both in datatypes and functions, but it’s hard to cross modules with it or create such obscene indirection as to make comprehension enormously difficult. By contrast, it’s not uncommon to find circular dependencies in object-oriented code. In the atrocious example I gave above, Hand depends on Card, Card depends on Location, and Hand in turn inherits from Location.
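The Person and Date description translates almost word for word into OCaml. In this sketch the sample person and the age_on helper are my own additions, made up purely to show the types in use.

```ocaml
(* A date is a record of three integers, exactly as described. *)
type date = { year : int; month : int; day : int }

(* A person is a name plus a birthday; nothing is hidden behind it. *)
type person = { name : string; birthday : date }

(* Made-up sample data. *)
let alice = { name = "Alice"; birthday = { year = 1980; month = 6; day = 15 } }

(* Age in whole years on a given day: subtract the years, then take one off
   if this year's birthday has not yet arrived. Tuples compare
   lexicographically, so (month, day) pairs order as calendar dates do. *)
let age_on (today : date) (p : person) : int =
  let b = p.birthday in
  let had_birthday = (today.month, today.day) >= (b.month, b.day) in
  today.year - b.year - (if had_birthday then 0 else 1)
```

Getting to the bottom of person takes two type definitions, both of which can sit on the same screen.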

Why does OOP devolve?

Above, I described the consequences of undisciplined object-oriented programming. In limited doses, object-oriented programming is not so terrible. Neither, for that matter, is the much hated “goto” statement. Both of these are tolerable when used in extremely disciplined ways with reasonable and self-evident intentions. Yet when used by any but the most disciplined programmers, OOP devolves into a mess. This is hilarious in the context of OOP’s original promise to business types in the 1990s– that it would enable mediocre programmers to be productive. What it actually did is create a coding environment in which mediocre programmers (and rushed or indisposed good ones) are negatively productive. It’s true that terrible code is possible in any language or programming paradigm; what makes object orientation such a terrible default abstraction is that, as with unstructured programming, bad code is an asymptotic inevitability as an object-oriented program grows. In order to discuss why this occurs, it’s necessary to discuss object orientation from a more academic perspective, and pose a question to which thousands of answers have been given.

What’s an object?

On first approximation, one can think of an object as something that receives messages and performs actions, which usually include returning data to the sender of the message. Unlike a pure function, the response to each message is allowed to vary. In fact, it’s often required to do so. The object often contains state that is (by design) not directly accessible, but only observable by sending messages to the object. In this light, the object can be compared to a remote-procedure call (RPC) server. Its innards are hidden, possibly inaccessible, and this is generally a good thing in the context of, for example, a web service. When I connect to a website, I don’t care in the least about the state of its thread pooling mechanisms. I don’t want to know about that stuff, and I shouldn’t be allowed access to it. Nor do I care what sorting algorithm an email client uses to sort my email, as long as I get the right results. On the other hand, in the context of code for which one is (or, at least, might be in the future) responsible for comprehending the internals, such incomplete comprehension is a very bad thing.
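The contrast between a pure function and a message-receiving object can be shown in a few lines using OCaml’s own object syntax. This is an illustrative sketch; double, make_ticket_machine, and the take method are invented names.

```ocaml
(* A pure function: the same argument always gives the same answer. *)
let double x = 2 * x

(* An object: the answer to the same message depends on hidden state. *)
let make_ticket_machine () =
  object
    (* Instance variable: not reachable from outside the object. *)
    val mutable next = 1

    (* The only way to observe the state is to send this message. *)
    method take =
      let n = next in
      next <- next + 1;
      n
  end

let () =
  let m = make_ticket_machine () in
  let first = m#take in
  let second = m#take in
  assert (double 21 = double 21);      (* always equal *)
  assert (first = 1 && second = 2)     (* same message, different replies *)
```

From the caller’s side there is no way to ask the machine what next currently holds. That opacity is exactly what one wants from a server and exactly what one does not want from code one must maintain.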

To “What is an object?” the answer I would give is that one should think of it as a miniature RPC server. It’s not actually remote, nor as complex internally as a real RPC or web server, but it can be thought of this way in terms of its (intentional) opacity. This shines light on whether object-oriented programming is “bad”, and the question of when to use objects. Are RPC servers invariably bad? Of course not. On the other hand, would anyone in his right mind code Hearts in such a way that each Card were its own RPC server? No. That would be insane. If people treated object topologies with the same care as network topologies, a lot of horrible things that have been done to code in the name of OOP might never have occurred.

Alan Kay, the inventor of Smalltalk and of the original conception of “object-oriented programming”, has argued that the failure of what passes for OOP in modern software is that objects are too small and that there are too many of them. Originally, object-oriented programming was intended to involve large objects that encapsulated state behind interfaces that were easier to understand than the potentially complicated implementations. In that context, OOP as originally defined is quite powerful and good; even non-OOP languages have adopted that virtue (also known as encapsulation) in the form of modules.

Still, the RPC-server metaphor for “What is an Object?” is not quite right, and the philosophical notion of “object” is deeper. An object, in software engineering, should be seen as a thing of which the user is allowed to have (and often supposed to have) incomplete knowledge. Incomplete knowledge isn’t always a bad thing; often, it’s an outright necessity due to the complexity of the system. For example, SQL is a language in which the user specifies an ad-hoc query to be run against a database with no indication of what algorithm to use; the database system figures that out. For this particular application, incomplete knowledge is beneficial; it would be ridiculous to burden everyone who wants to use a database with the immense complexity of its internals.
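The SQL case can be made concrete with a small sketch using Python’s standard `sqlite3` module and an in-memory database. The `emails` table and its columns are invented for the illustration.

```python
import sqlite3

# An in-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emails (id INTEGER PRIMARY KEY, sender TEXT, size_kb INTEGER)"
)
conn.executemany(
    "INSERT INTO emails (sender, size_kb) VALUES (?, ?)",
    [("alice", 12), ("bob", 40), ("alice", 7), ("carol", 3)],
)

# The query says what is wanted, not how to compute it: no join strategy,
# no sorting algorithm, no index choice. The database decides all of that.
rows = conn.execute(
    "SELECT sender, SUM(size_kb) FROM emails GROUP BY sender ORDER BY sender"
).fetchall()
conn.close()
```

Here `rows` comes back as `[("alice", 19), ("bob", 40), ("carol", 3)]`, and the caller never learns, or needs to learn, how the grouping and ordering were carried out.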

Object-orientation is the programming paradigm based on incomplete knowledge. Its purpose is to enable computation with data of which the details are not fully known. In a way, this is concordant with the English use of the word “object” as a synonym for “thing”: it’s an item of which one’s knowledge is incomplete. “What is that thing on the table?” “I don’t know, some weird object.” Object-oriented programming is designed to allow people to work with things they don’t fully understand, and even modify them in spite of incomplete comprehension of them. Sometimes that’s useful or necessary, because complete knowledge of a complex program can be humanly impossible. Unfortunately, over time the over-tolerance of incomplete knowledge leads to an environment where important components can elude the knowledge of each individual responsible for creating them; the knowledge is strewn about many minds haphazardly.

Modularity

Probably the most important predictor of whether a codebase will remain comprehensible as it becomes large is whether it’s modular. Are the components individually comprehensible, or do they form an irreducibly complex tangle, all of which one must understand (which may not even be possible) before one can understand any of it? In the latter case, software quality stalls, or even backslides, as the size of the codebase increases. In terms of modularity, the object-oriented paradigm generally performs poorly, facilitating the haphazard growth of codebases in which answering simple questions like “How do I create and use a Foo object?” can require days-long forensic capers.

The truth about “power”

Often, people describe programming techniques and tools as “powerful”, and that’s taken to be an endorsement. A counterintuitive, dirty secret about software engineering is that “power” is not always a good thing. For a “hacker”– a person writing “one-off” code that is unlikely ever to require future reading by anyone, including the author– all powerful abstractions, because they save time, can be considered good. However, in the more general software engineering context, where any code written is likely to require maintenance and future comprehension, power can be bad. For example, macros in languages like C and Lisp are immensely powerful. Yet it’s obnoxiously easy to write incomprehensible code using these features.

Objects are, likewise, immensely powerful (or “heavyweight”) beasts when features like inheritance, dynamic method dispatch, open recursion, et cetera are considered. If nothing else, one notes that objects can do anything that pure functions can do– and more. The notion of “object” is both a very powerful and a very vague abstraction.

“Hackers” like power, and from their perspective a language can be judged by the power of the abstractions it offers. But real-world software engineers spend an unpleasantly large amount of time reading and maintaining others’ code. From an engineer’s perspective, a language is good based on what it prevents other programmers from doing to us, those of us who have to maintain their code in the future. In this light, the unrestrained use of Lisp macros and object-oriented programming is bad, bad, bad. From this perspective, a language like OCaml or Haskell– of middling power but beautifully designed to encourage deployment of the right abstractions– is far better than a more “powerful” one like Ruby.

As an aside, a deep problem in programming language design is that far too many languages are designed with the interests of code writers foremost in mind. And it’s quite enjoyable, from a writer’s perspective, to use esoteric metaprogramming features and complex object patterns. Yet very few languages are designed to provide a beautiful experience for readers of code. In my experience, ML does this best, and Haskell does it well, while most of the mainstream languages fall short of being even satisfactory. In most real-world software environments, reading code is so unpleasant that it hardly gets done in any detail at all. Object-oriented programming, and the haphazard monstrosities its “powerful” abstractions enable, is a major culprit.

Solution?

The truth, I think, about object-oriented programming is that most of its core concepts– inheritance, easy extensibility of data, proliferation of state– should be treated with the same caution and respect that a humble and intelligent programmer gives to mutable state. These abstractions can be powerful and work beautifully in the right circumstances, but they should be used very sparingly and only by people who understand what they are actually doing. In Lisp, it is generally held to be good practice never to write a macro when a function will do. I would argue the same with regard to object-oriented programming: never write an object-oriented program for a problem where a functional or cleanly imperative approach will suffice. Certainly, to make object orientation the default means of abstraction, as C++ and Java have, is a proven disaster.

Abstractions, and especially powerful ones, aren’t always good. Using the right abstractions is of utmost importance. Abstractions that are too vague, for example, merely clutter code with useless nouns. As the first means of abstraction in high-level languages, higher-order functions suffice most of the time– probably over 95% of the time, in well-factored code. Objects may come into favor for being more general than higher-order functions, but “more general” also means less specific, and for the purpose of code comprehension, this is a hindrance, not a feature. If cleaner and more comprehensible pure functions and algebraic data types can be used in well over 90 percent of the places where objects appear in OO languages, they should be used in lieu of objects, and they should be supported by languages– which C++ and Java don’t do.
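To make the contrast concrete, here is a small Python sketch that solves the same filtering task twice: once with a higher-order function, and once with the kind of object scaffolding that OO-by-default languages tend to encourage. All of the names (`keep_if`, `is_even`, `EvenFilterStrategy`, `FilterRunner`) are hypothetical, invented for the comparison.

```python
from typing import Callable, Iterable, List


# Higher-order version: the behavior is passed in as a plain function.
def keep_if(predicate: Callable[[int], bool], items: Iterable[int]) -> List[int]:
    return [item for item in items if predicate(item)]


def is_even(n: int) -> bool:
    return n % 2 == 0


evens = keep_if(is_even, range(10))


# Object-heavy version of the same task: two extra nouns, one piece of
# stored state, and no additional capability.
class EvenFilterStrategy:
    def accepts(self, n: int) -> bool:
        return n % 2 == 0


class FilterRunner:
    def __init__(self, strategy: EvenFilterStrategy) -> None:
        self._strategy = strategy

    def run(self, items: Iterable[int]) -> List[int]:
        return [item for item in items if self._strategy.accepts(item)]


evens_via_objects = FilterRunner(EvenFilterStrategy()).run(range(10))
```

Both versions produce `[0, 2, 4, 6, 8]`. The difference is in what a reader has to hold in mind: one function signature in the first case, and two classes plus the relationship between them in the second.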

In a better world, programmers would be required to learn how to use functions before progressing to objects, and object-oriented features would hold the status of being available but deployed only when needed, or in the rare cases where such features make remarkably better code. To start, this change needs to come about at a language level. Instead of Java or C++ being the first languages to which most programmers are introduced, that status should be shifted to a language like Scheme, Haskell, or my personal favorite for this purpose: OCaml.