It is my belief that what is now called “object-oriented programming” (OOP) is going to go down in history as one of the worst programming fads of all time, one that has wrecked countless codebases and burned millions of hours of engineering time worldwide. Though a superficially appealing concept– programming in a manner comprehensible at the “big picture” level– it fails to deliver on this promise, it usually fails to improve engineer productivity, and it often leads to unmaintainable, ugly, and even nonsensical code.
I’m not going to claim that object-oriented programming is never useful. There would be two problems with such a claim. First, OOP means many different things to different people– there’s a parsec of daylight between Smalltalk’s approach to OOP and the abysmal horrors currently seen in Java. Second, there are many niches within programming with which I’m unfamiliar, and it would be arrogant to claim that OOP is useless in all of them. Almost certainly, there are problems for which the object-oriented approach is among the more useful available. Instead, I’ll make a weaker claim but with full confidence: as the default means of abstraction, as in C++ and Java, object orientation is a disastrous choice.
What’s wrong with it?
The first problem with object-oriented programming is mutable state. Although I’m a major proponent of functional programming, I don’t intend to imply that mutable state is uniformly bad. On the contrary, it’s often good. There are a not-small number of programming scenarios where mutable state is the best available abstraction. But it needs to be handled with extreme caution, because it makes code far more difficult to reason about than purely functional code. A well-designed and practical language will generally allow mutable state, but encourage it to be segregated into only those places where it is necessary. A supreme example of this is Haskell, where any function with side effects reflects that fact in its type signature. By contrast, modern OOP encourages the promiscuous distribution of mutable state, to such a degree that difficult-to-reason-about programs are not the exceptional rarity but the norm. Eventually, the code becomes outright incomprehensible– to paraphrase Boromir, “one does not simply read the source code”– and even good programmers (unknowingly) damage the codebase as they modify it, adding complexity without full comprehension. These programs fall into an understood-by-no-one state of limbo and become nearly impossible to debug or analyze: the execution state of a program might live in thousands of different objects!
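The distinction can be shown in a minimal sketch (Python here, purely for illustration; Haskell’s enforcement lives in its type system and can’t be reproduced in a few lines):

```python
# Pure: the result depends only on the arguments. Calling it twice
# with the same input always gives the same answer, and the input
# list is left untouched.
def add_pure(xs, x):
    return xs + [x]

# Stateful: the reply depends on everything that has happened to
# self.items anywhere else in the program, so reasoning about one
# call requires reasoning about the object's entire history.
class Collector:
    def __init__(self):
        self.items = []

    def add(self, x):
        self.items.append(x)
        return self.items
```

The pure version can be understood in isolation; the stateful one cannot, and a large program built from thousands of `Collector`-like objects multiplies that difficulty.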
Object-oriented programming’s second failing is that it encourages spaghetti code. For example, let’s say that I’m implementing the card game Hearts. To represent cards in the deck, I create a Card object with two attributes: rank and suit, both of some sort of discrete type (integer, enumeration). This is a struct in C, a record in Ocaml, or a plain data class in Java. So far, no foul. I’ve represented a card exactly how it should be represented. Later on, to represent each player’s hand, I have a Hand object that is essentially a wrapper around an array of cards, and a Deck object that contains the cards before they are dealt. Nothing too perverted here.
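In a language with lightweight records, that representation is a few lines. Here is a minimal sketch in Python (used purely for illustration; the names mirror the ones above):

```python
from typing import NamedTuple

class Card(NamedTuple):
    """An immutable two-field record: nothing but rank and suit."""
    rank: int  # e.g. 2..14, with 11-14 standing for J, Q, K, A
    suit: str  # "clubs", "diamonds", "hearts", or "spades"

# A Hand is essentially a wrapper around a list of cards, and a
# Deck holds the cards before they are dealt.
two_of_clubs = Card(rank=2, suit="clubs")
```

Two fields, structural equality, no hidden state: there is nothing more to a Card than what is written here.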
In Hearts, the person with the 2 of clubs leads first, so I might want to determine in whose hand that card is. Ooh! A “clever” optimization draws near! Obviously it is inefficient to check each Hand for the 2 of clubs. So I add a field, hand, to each Card that is set when the card enters or leaves a player’s Hand. This field’s type is a Hand pointer (Hand* in C++, just Hand in Java). Now every time a Card moves (from one Hand to another, into or out of a Hand) I have to touch that pointer– I’ve just introduced more room for bugs. Since the Card might not be in a Hand, the field can be null, and one has to check for nullness every time it is used. So far, so bad. Notice the circular relationship I’ve now created between the Card and Hand classes.
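A sketch of what this “optimization” costs, again in illustrative Python, makes the extra bookkeeping visible:

```python
class Card:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit
        self.hand = None  # back-pointer; None when not in any Hand

class Hand:
    def __init__(self):
        self.cards = []

    def add(self, card):
        # Every movement of a Card must now keep the back-pointer
        # in sync; forget one of these lines anywhere in the codebase
        # and the two views of the game state silently diverge.
        if card.hand is not None:
            card.hand.cards.remove(card)
        card.hand = self
        self.cards.append(card)

def holder_of(card):
    # Every caller must remember that the field may be None.
    return card.hand
```

The lookup is now O(1), but correctness depends on every mutation site in the program doing its bookkeeping perfectly.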
It gets worse. Later, I add a picture attribute to the Card class, so that each Card is coupled with the name of an image file representing its on-screen appearance, and ten or twelve methods for the various ways I might wish to display a Card. Moreover, it becomes clear that my specification regarding a Card’s location in the game (either in a Hand or not in a Hand) was too weak. If a Card is not in a Hand, it might also be on the table (just played to a trick), in the deck, or out of the round (having been played). So I rename the hand attribute to place, and change its type to Location, from which Hand and Deck and PlaceOnTable all inherit.
This is ugly, and getting incomprehensible quickly. Consider the reaction of someone who has to maintain this code in the future. What the hell is a Location? From its name, it could be (a) a geographical location, (b) a position in a file, (c) the unique ID of a record in a database, (d) an IP address or port number or, what it actually is, (e) the Card’s location in the game. From the maintainer’s point of view, really getting to the bottom of Location requires understanding Hand, Deck, and PlaceOnTable, which may reside in different files, modules, or even directories. It’s just a mess. Worse yet, in such code the “broken windows” effect starts to set in. Now that the code is bad, those who have to modify it are tempted to do so in the easiest (but often kludgey) way. Kludges multiply and, before long, what should have been a two-field immutable record (Card) has 23 attributes, and no one remembers what they all do.
To finish this example, let’s assume that the computer player for this Hearts game contains some very complicated AI, and I’m investigating a bug in the decision-making algorithms. To do this, I need to be able to generate game states at will, as test cases. Constructing a game state requires that I construct Cards. If Card were left as it should be– a two-field record type– this would be a very easy thing to do. Unfortunately, Card now has so many fields, and it’s so unclear which can be omitted or given “mock” values, that constructing one intelligently is no longer possible. Will failing to populate the seemingly irrelevant attributes (like picture, which is presumably connected to graphics and not the internal logic of the game) compromise the validity of my test cases? Hell if I know. At this point, reading, modifying, and testing code becomes more about guesswork than anything sound or principled.
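For contrast, had Card stayed a two-field record, constructing test fixtures would be trivial– a sketch, with illustrative values:

```python
from typing import NamedTuple

class Card(NamedTuple):
    rank: int
    suit: str

# An arbitrary game state for a test case is a few literals: every
# field is meaningful, nothing can be left half-initialized, and
# equality is structural, so no mocks are needed.
hand = [Card(2, "clubs"), Card(14, "spades"), Card(12, "hearts")]
```

There is no question of which fields matter, because there are only two, and both do.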
Clearly, this is a contrived example, and I can imagine the defenders of object-oriented programming responding with the counterargument, “But I would never write code that way! I’d design the program intelligently in advance.” To that I say: right, for a small project like a Hearts game; wrong, for real-world, complex software developed in the professional world. What I described is certainly not how a single intelligent programmer would code a card game; it is indicative of how software tends to evolve in the real world, with multiple developers involved. Hearts, of course, is a closed system: a game with well-defined rules that isn’t going to change much in the next 6 months. It’s therefore possible to design a Hearts program intelligently from the start and avoid the object-oriented pitfalls I intentionally fell into in this example. But for most real-world software, requirements change and the code is often touched by a number of people with widely varying levels of competence, some of whom barely know what they’re doing, if at all. The morass I described is what object-oriented code devolves into as the number of lines of code and, more importantly, the number of hands, increases. It’s virtually inevitable.
One reason for this devolution is that object-oriented type systems tend to be top-down, with every type a subtype of Object. What this means is that data is often vaguely defined, semantically speaking. Did you know that the integer 5 doubles as a DessertToppingFactoryImpl? I sure didn’t. An alternative and usually superior mode of specification is bottom-up, as seen in languages like Ocaml and Haskell. These languages offer simple base types and encourage the user to build more complex types from them. If you’re unsure what a Person is, you can read the code and discover that it has a name field, which is a string, and a birthday field, which is a Date. If you’re unsure what a Date is, you can read the code and discover that it’s a record of three integers, labelled year, month, and day. If you want to get “to the bottom” of a datatype or function when types are built from the bottom up, you can do so, and it rarely involves pinging across so many (possibly semi-irrelevant) abstractions and files as to shatter one’s “flow”. Circular dependencies are very rare in bottom-up languages. Recursion, in languages like ML, can exist both in datatypes and functions, but it’s hard to cross modules with it or create such obscene indirection as to make comprehension enormously difficult. By contrast, it’s not uncommon to find circular dependencies in object-oriented code. In the atrocious example I gave above, Hand depends on Card, which depends on Location, from which Hand inherits.
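The Person/Date example can be sketched even in Python, though record-and-variant languages like Ocaml make the style more natural:

```python
from typing import NamedTuple

class Date(NamedTuple):
    """A record of three integers -- nothing more to learn."""
    year: int
    month: int
    day: int

class Person(NamedTuple):
    """Built from simpler pieces: a string and a Date."""
    name: str
    birthday: Date

# Getting "to the bottom" of Person means reading two short,
# self-contained definitions, in order, with no file-hopping.
alice = Person(name="Alice", birthday=Date(1990, 4, 1))
```

Each type depends only on types simpler than itself, so the dependency graph is a short, acyclic chain rather than a tangle.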
Why does OOP devolve?
Above, I described the consequences of undisciplined object-oriented programming. In limited doses, object-oriented programming is not so terrible. Neither, for that matter, is the much-hated “goto” statement. Both of these are tolerable when used in extremely disciplined ways with reasonable and self-evident intentions. Yet when used by any but the most disciplined programmers, OOP devolves into a mess. This is hilarious in the context of OOP’s original promise to business types in the 1990s– that it would enable mediocre programmers to be productive. What it actually did is create a coding environment in which mediocre programmers (and rushed or indisposed good ones) are negatively productive. It’s true that terrible code is possible in any language or programming paradigm; what makes object orientation such a terrible default abstraction is that, as with unstructured programming, bad code is an asymptotic inevitability as an object-oriented program grows. In order to discuss why this occurs, it’s necessary to discuss object orientation from a more academic perspective, and pose a question to which thousands of answers have been given.
What’s an object?
To a first approximation, one can think of an object as something that receives messages and performs actions, which usually include returning data to the sender of the message. Unlike a pure function, an object’s response to the same message is allowed to vary– in fact, it’s often required to. The object often contains state that is (by design) not directly accessible, but only observable by sending messages to the object. In this light, the object can be compared to a remote procedure call (RPC) server. Its innards are hidden, possibly inaccessible, and this is generally a good thing in the context of, for example, a web service. When I connect to a website, I don’t care in the least about the state of its thread pooling mechanisms. I don’t want to know about that stuff, and I shouldn’t be allowed access to it. Nor do I care what sorting algorithm an email client uses to sort my email, as long as I get the right results. On the other hand, in the context of code whose internals one is (or might in the future be) responsible for comprehending, such incomplete comprehension is a very bad thing.
To “What is an object?” the answer I would give is that one should think of it as a miniature RPC server. It’s not actually remote, nor as complex internally as a real RPC or web server, but it can be thought of this way in terms of its (intentional) opacity. This sheds light on whether object-oriented programming is “bad”, and on the question of when to use objects. Are RPC servers invariably bad? Of course not. On the other hand, would anyone in his right mind code Hearts in such a way that each Card were its own RPC server? No. That would be insane. If people treated object topologies with the same care as network topologies, a lot of horrible things that have been done to code in the name of OOP might never have occurred.
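The metaphor can be made concrete with a toy example (the class here is mine, not drawn from any real system): the object’s state is hidden, and all interaction happens by sending it messages.

```python
class Counter:
    """A miniature 'server': its state is hidden, and the only way
    to interact with it is to send it messages (call its methods)."""

    def __init__(self):
        self._count = 0  # internal state, deliberately opaque

    def increment(self):
        self._count += 1

    def value(self):
        # The same message can yield a different reply each time --
        # exactly what makes objects unlike pure functions.
        return self._count
```

This opacity is fine for one counter; a program whose every small value behaves this way is a network of thousands of tiny opaque servers.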
Alan Kay, the inventor of Smalltalk and the original conception of “object-oriented programming”, has argued that the failure of what passes for OOP in modern software is that objects are too small and that there are too many of them. Originally, object-oriented programming was intended to involve large objects that encapsulated state behind interfaces that were easier to understand than the potentially complicated implementations. In that context, OOP as originally defined is quite powerful and good; even non-OOP languages have adopted that virtue (also known as encapsulation) in the form of modules.
Still, the RPC-server metaphor for “What is an Object?” is not quite right, and the philosophical notion of “object” is deeper. An object, in software engineering, should be seen as a thing of which the user is allowed (and often supposed) to have incomplete knowledge. Incomplete knowledge isn’t always a bad thing; often, it’s an outright necessity due to the complexity of the system. For example, SQL is a language in which the user specifies an ad-hoc query to be run against a database with no indication of what algorithm to use; the database system figures that out. For this particular application, incomplete knowledge is beneficial; it would be ridiculous to burden everyone who wants to use a database with the immense complexity of its internals.
Object orientation is the programming paradigm based on incomplete knowledge. Its purpose is to enable computation with data whose details are not fully known. In a way, this is concordant with the English use of the word “object” as a synonym for “thing”: it’s an item of which one’s knowledge is incomplete. “What is that thing on the table?” “I don’t know, some weird object.” Object-oriented programming is designed to allow people to work with things they don’t fully understand, and even to modify them in spite of incomplete comprehension of them. Sometimes that’s useful or necessary, because complete knowledge of a complex program can be humanly impossible. Unfortunately, over time, this tolerance of incomplete knowledge leads to an environment where important components can elude the knowledge of each individual responsible for creating them; the knowledge is strewn about many minds haphazardly.
Probably the most important predictor of whether a codebase will remain comprehensible as it becomes large is whether it’s modular. Are the components individually comprehensible, or do they form an irreducibly complex tangle that one must understand in full (which may not even be possible) before one can understand any part of it? In the latter case, progress grinds to a halt, and software quality often backslides, as the size of the codebase increases. In terms of modularity, the object-oriented paradigm generally performs poorly, facilitating the haphazard growth of codebases in which answering simple questions like “How do I create and use a Foo object?” can require days-long forensic capers.
The truth about “power”
Often, people describe programming techniques and tools as “powerful”, and that’s taken to be an endorsement. A counterintuitive and dirty secret of software engineering is that “power” is not always a good thing. For a “hacker”– a person writing “one-off” code that is unlikely ever to require future reading by anyone, including the author– all powerful abstractions, because they save time, can be considered good. However, in the more general software engineering context, where any code written is likely to require maintenance and future comprehension, power can be bad. For example, macros in languages like C and Lisp are immensely powerful. Yet it’s obnoxiously easy to write incomprehensible code using these features.
Objects are, likewise, immensely powerful (or “heavyweight”) beasts when features like inheritance, dynamic method dispatch, open recursion, et cetera are considered. If nothing else, one notes that objects can do anything that pure functions can do– and more. The notion of “object” is both a very powerful and a very vague abstraction.
“Hackers” like power, and for them a language can be judged by the power of the abstractions it offers. But real-world software engineers spend an unpleasantly large amount of time reading and maintaining others’ code. From an engineer’s perspective, a language is also good based on what it prevents other programmers from doing to those of us who have to maintain their code in the future. In this light, the unrestrained use of Lisp macros and object-oriented programming is bad, bad, bad. From this perspective, a language like Ocaml or Haskell– of middling power but beautifully designed to encourage deployment of the right abstractions– is far better than a more “powerful” one like Ruby.
As an aside, a deep problem in programming language design is that far too many languages are designed with the interests of code writers foremost in mind. And it’s quite enjoyable, from a writer’s perspective, to use esoteric metaprogramming features and complex object patterns. Yet very few languages are designed to provide a beautiful experience for readers of code. In my experience, ML does this best, and Haskell does it well, while most of the mainstream languages fall short of being even satisfactory. In most real-world software environments, reading code is so unpleasant that it hardly gets done in any detail. Object-oriented programming, and the haphazard monstrosities its “powerful” abstractions enable, is a major culprit.
The truth, I think, about object-oriented programming is that most of its core concepts– inheritance, easy extensibility of data, proliferation of state– should be treated with the same caution and respect that a humble and intelligent programmer gives to mutable state. These abstractions can be powerful and work beautifully in the right circumstances, but they should be used very sparingly and only by people who understand what they are actually doing. In Lisp, it is generally held to be good practice never to write a macro when a function will do. I would argue the same with regard to object-oriented programming: never write an object-oriented program for a problem where a functional or cleanly imperative approach will suffice. Certainly, to make object orientation the default means of abstraction, as C++ and Java have, is a proven disaster.
Abstractions, and especially powerful ones, aren’t always good. Using the right abstractions is of utmost importance. Abstractions that are too vague, for example, merely clutter code with useless nouns. As the first means of abstraction in high-level languages, higher-order functions suffice most of the time– probably over 95% of the time, in well-factored code. Objects may win favor for being more general than higher-order functions, but “more general” also means less specific, and for the purpose of code comprehension this is a hindrance, not a feature. If cleaner and more comprehensible pure functions and algebraic data types can be used in well over 90 percent of the places where objects appear in OO languages, they should be used in lieu of objects, and languages should encourage this– which C++ and Java don’t.
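A small sketch (the names and the sorting task are my own illustration) of a higher-order function doing the work that OO style often assigns to a one-method “strategy” object:

```python
# Object-oriented version: a one-method "strategy" object whose
# only job is to carry a function.
class ByRank:
    def key(self, card):
        return card[0]

def sort_hand_oo(hand, strategy):
    return sorted(hand, key=strategy.key)

# Higher-order-function version: the function itself is the
# abstraction, with no class ceremony around it.
def sort_hand(hand, key):
    return sorted(hand, key=key)

hand = [(12, "hearts"), (2, "clubs"), (14, "spades")]
oo_result = sort_hand_oo(hand, ByRank())
fn_result = sort_hand(hand, key=lambda card: card[0])
```

Both produce the same ordering; the second says what it means in one line and introduces no new noun.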
In a better world, programmers would be required to learn how to use functions before progressing to objects, and object-oriented features would be available but deployed only when needed, or in the rare cases where they make for remarkably better code. To start, this change needs to come about at the language level. Instead of Java or C++ being the first languages to which most programmers are introduced, that status should shift to a language like Scheme, Haskell, or my personal favorite for this purpose: Ocaml.