What is spaghetti code?

One of the easiest ways for an epithet to lose its value is for it to become over-broad, which causes it to mean little more than “I don’t like this”. A case in point is the term “spaghetti code”, which people often use interchangeably with “bad code”. The problem is that not all bad code is spaghetti code. Spaghetti code is an especially virulent but specific kind of bad code, and its particular badness is instructive about how we develop software. Why? Because individual people rarely write spaghetti code on their own. Rather, certain styles of development process make it increasingly common as time passes. To assess this, it’s important first to address the original context in which “spaghetti code” was defined: the dreaded (and mostly archaic) goto statement.

The goto statement is a simple and powerful control flow mechanism: jump to another point in the code. It’s what a compiled assembly program actually does in order to transfer control, even if the source code is written using more modern structures like loops and functions. Using goto, one can implement whatever control flows one needs. We also generally agree, in 2012, that goto is flat-out inappropriate for source code in most modern programs. Exceptions to this policy exist, but they’re extremely rare. Most modern languages don’t even have it.
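To make this concrete, here is a minimal sketch (in C++, and not part of the original argument) of the same countdown written with goto and with a structured loop; at this size, both are perfectly readable:

    #include <cstdio>

    // The unstructured version: an explicit label and a jump.
    void countdown_goto(int n) {
    top:
        if (n <= 0) return;
        std::printf("%d\n", n);
        --n;
        goto top;
    }

    // The structured version: the same control flow, named as a loop.
    void countdown_loop(int n) {
        while (n > 0) {
            std::printf("%d\n", n);
            --n;
        }
    }

At twenty lines the two are interchangeable; the difference only matters at scale, as discussed below.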

Goto statements can make it difficult to reason about code, because if control can bounce about a program, one cannot make guarantees about what state a program is in when it executes a specific piece of code. Goto-based programs can’t easily be broken down into component pieces, because any point in the code can be wormholed to any other. Instead, they devolve into an “everything is everywhere” mess where to understand a piece of the program requires understanding all of it, and the latter becomes flat-out impossible for large programs. Hence the comparison to spaghetti, where following one thread (or noodle) often involves navigating through a large tangle of pasta. You can’t look at a bowl of noodles and see which end connects to which. You’d have to laboriously untangle it.

Spaghetti code is code in which “everything is everywhere”, and in which simple questions, such as (a) where a certain piece of functionality is implemented, (b) where an object is instantiated and how to create it, and (c) whether a critical section is correct, to name a few examples of questions one might want to ask about code, require understanding the whole program, because answering them demands relentless pinging about the source code. It’s code that is incomprehensible unless one has the discipline to follow each noodle through from one end to the other. That is spaghetti code.

What makes spaghetti code dangerous is that it, unlike other species of bad code, seems to be a common byproduct of software entropy. If code is properly modular but some modules are of low quality, people will fix the bad components if those are important to them. Bad or failed or buggy or slow implementations can be replaced with correct ones while using the same interface. It’s also, frankly, just much easier to define correctness (which one must do in order to have a firm sense of what “a bug” is) over small, independent functions than over a giant codeball designed to do too much stuff. Spaghetti code is evil because (a) it’s a very common subcase of bad code, (b) it’s almost impossible to fix without causing changes in functionality, which will be treated as breakage if people depend on the old behavior (potentially by abusing “sleep” methods, thus letting a performance improvement cause seemingly unrelated bugs!) and (c) it seems, for reasons I’ll get to later, not to be preventable through typical review processes.

The reason I consider it important to differentiate spaghetti code from the superset, “bad code”, is that I think a lot of what makes “bad code” is subjective. A lot of the conflict and flat-out incivility in software collaboration (or the lack thereof) seems to result from the predominantly male tendency to lash out in the face of unskilled creativity (or a perception of such, and in code this is often an extremely biased perception): to beat the pretender to alpha status so badly that he stops pestering us with his incompetent displays. The problem with this behavior pattern is that, well, it’s not useful and it rarely makes people better at what they’re trying to do. It’s just being a prick. There are also a lot of anal-retentive wankbaskets out there who define good and bad programmers based on cosmetic traits so that their definition of “good code” is “code that looks like I wrote it”. I feel like the spaghetti code problem is better-defined in scope than the larger but more subjective problem of “bad code”. We’ll never agree on tabs-versus-spaces, but we all know that spaghetti code is incomprehensible and useless. Moreover, as spaghetti code is an especially common and damaging case of bad code, assessing causes and preventions for this subtype may be generalizable to other categories of bad code.

People usually use “bad code” to mean “ugly code”, but if it’s possible to determine why a piece of code is bad and ugly, and to figure out a plausible fix, it’s already better than most spaghetti code. Spaghetti code is incomprehensible and often unfixable. If you know why you hate a piece of code, it’s already above spaghetti code in quality, since the latter is just featureless gibberish.

What causes spaghetti code? Goto statements were the leading cause at one time, but goto has fallen so far out of favor that it’s a non-concern. Now the culprit is something else entirely: the modern bastardization of object-oriented programming. Inheritance is an especially bad offender, and so is premature abstraction: using a parameterized generic with only one use case in mind, or adding unnecessary parameters. I recognize that this claim, that OOP as practiced produces spaghetti code, is controversial. But it was also controversial, at one time, to say that goto was considered harmful.

One of the biggest problems in comparative software (that is, the art of comparing approaches, techniques, languages, or platforms) is that most comparisons focus on simple examples. At 20 lines of code, almost nothing shows its evilness, unless it’s contrived to be dastardly. A 20-line program written with goto will usually be quite comprehensible, and might even be easier to reason about than the same program written without goto. At 20 lines, a step-by-step instruction list with some explicit control transfer is a very natural way to envision a program. For a static program (i.e. a platonic form that need never be changed and incurs no maintenance) that can be read in one sitting, that might be a fine way to structure it. At 20,000 lines, the goto-driven program becomes incomprehensible. At 20,000 lines, the goto-driven program has been hacked and expanded and tweaked so many times that the original vision holding the thing together has vanished, and the fact that control can arrive at any piece of code “from anywhere” means that modifying the code safely requires confidence about “everywhere”. Everything is everywhere. Not only does this make the code difficult to comprehend, but it means that every modification is likely to make it worse, due to unforeseeable chained consequences. Over time, the software becomes “biological”, by which I mean that it develops behaviors that no one intended but that other software components may depend on in hidden ways.

Goto failed, as a programming-language construct, because of the problems imposed by the unrestricted pinging about a program that it allowed. Less powerful, but therefore more specifically targeted, structures came into favor: procedures, functions, and well-defined data structures. For the one case where people genuinely needed non-local control transfer (error handling), exceptions were developed. This was progress from the extreme universality and abstraction of a goto-driven program toward the concreteness and specificity of pieces (such as procedures) solving specific problems. In unstructured programming, you can write a Big Program that does all kinds of stuff, add features on a whim, and alter the flow of the thing as you wish. It doesn’t have to solve “a problem” (so pedestrian…) but can be a meta-framework with an embedded interpreter! Structured programming encouraged people to factor their programs into specific pieces that solved single problems, and to make those solutions reusable where possible. It was a precursor of the Unix philosophy (do one thing and do it well) and of functional programming (make it easy to define precise, mathematical semantics by eschewing global state).

Another thing I’ll say about goto is that it’s rarely needed as a language-level primitive. One can achieve the same effect with a while-loop, a “program counter” variable defined outside that loop which the body either increments (step) or resets (goto), and a switch-case statement dispatching on it. This could, if one wished, be expanded into a giant program that runs as one such loop, but code like this is never written, and the fact that it is almost never done indicates how rarely goto is actually needed. Structured programming thereby points out the insanity of what one is doing when attempting severely non-local control flows.
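As a minimal sketch (an illustration of the idea, not anything one should ship), here is that construction in C++, computing a factorial “the goto way”: case labels play the role of program lines, and assigning to the counter is the jump.

    #include <cstdio>

    int main() {
        int pc = 0;                       // the "program counter"
        int n = 5, acc = 1;
        while (pc >= 0) {
            switch (pc) {
            case 0:                       // the test
                pc = (n > 1) ? 1 : 2;     // conditional "goto"
                break;
            case 1:                       // the loop body
                acc *= n;
                --n;
                pc = 0;                   // "goto" the test
                break;
            case 2:                       // the exit
                std::printf("%d\n", acc); // prints 120
                pc = -1;                  // halt
                break;
            }
        }
        return 0;
    }

Writing even this much machinery to express a three-line loop makes the point: the structured form is almost always what one actually wants.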

Still, there was a time when abandoning goto was extremely controversial, and this structured programming idea seemed like faddish nonsense. The objection was: why use functions and procedures when goto is strictly more powerful?

Analogously, why use referentially transparent functions and immutable records when objects are strictly more powerful? An object, after all, can have a method called run or call or apply so it can be a function. It can also have static, constant fields only and be a record. But it can also do a lot more: it can have initializers and finalizers and open recursion and fifty methods if one so chooses. So what’s the fuss about this functional programming nonsense that expects people to build their programs out of things that are much less powerful, like records whose fields never change and whose classes contain no initialization magic?
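As a concrete (and hedged) C++ sketch of this point, consider an object that impersonates a function and another that impersonates a record, next to the plain alternatives; the names are invented for illustration:

    #include <cstdio>

    // An object acting as a function: callable, but it could also
    // carry mutable state, fifty methods, initializers, and so on.
    struct Adder {
        int amount;
        int operator()(int x) const { return x + amount; }
    };

    // An object acting as an immutable record.
    struct Point {
        const double x, y;
    };

    // The less powerful alternatives: a function that is only a
    // function, and a record that is only data.
    int add_three(int x) { return x + 3; }
    struct Pair { double x, y; };

    int main() {
        Adder add5{5};
        std::printf("%d %d\n", add5(2), add_three(2)); // 7 5
        Point p{1.0, 2.0};
        std::printf("%.1f\n", p.x + p.y);              // 3.0
        return 0;
    }

The object versions work, but a reader must rule out everything else an object is allowed to do; the plain versions leave nothing to rule out.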

The answer is that power is not always good. Power, in programming, often advantages the “writer” of code and not the reader, but maintenance (i.e. the need to read code) begins subjectively around 2000 lines or 6 weeks, and objectively once there is more than one developer on a project. On real systems, no one gets to be just a “writer” of code. We’re readers, of our own code and of that written by others. Unreadable code is just not acceptable, and only accepted because there is so much of it and because “best practices” object-oriented programming, as deployed at many software companies, seem to produce it. A more “powerful” abstraction is more general, and therefore less specific, and this means that it’s harder to determine exactly what it’s used for when one has to read the code using it. This is bad enough, but single-writer code usually remains fairly disciplined: the powerful abstraction might have 18 plausible uses, but only one of those is actually used. There’s a singular vision (although usually an undocumented one) that prevents the confusion. The danger sets in when others who are not aware of that vision have to modify the code. Often, their modifications are hacks that implicitly assume one of the other 17 use cases. This, naturally, leads to inconsistencies and those usually result in bugs. Unfortunately, people brought in to fix these bugs have even less clarity about the original vision behind the code, and their modifications are often equally hackish. Spot fixes may occur, but the overall quality of the code declines. This is the spaghettification process. No one ever sits down to write himself a bowl of spaghetti code. It happens through a gradual “stretching” process and there are almost always multiple developers responsible. In software, “slippery slopes” are real and the slippage can occur rapidly.

Object-oriented programming, originally designed to prevent spaghetti code, has become (through a “design pattern”-ridden misunderstanding of it) one of the worst sources of it. An “object” can mix code and data freely and conform to any number of interfaces, while a class can be subclassed freely throughout the program. There’s a lot of power in object-oriented programming, and used with discipline it can be very effective. But most programmers don’t handle it well, and it seems to turn to spaghetti over time.

One of the problems with spaghetti code is that it forms incrementally, which makes it hard to catch in code review, because each change that leads to “spaghettification” seems, on balance, to be a net positive. The plus is that a change that a manager or customer “needs yesterday” gets in, and the drawback is what looks like a moderate amount of added complexity. Even in the Dark Ages of goto, no one ever sat down and said, “I’m going to write an incomprehensible program with 40 goto statements flowing into the same point.”  The clutter accumulated gradually, while the program’s ownership transferred from one person to another. The same is true of object-oriented spaghetti. There’s no specific point of transition from an original clean design to incomprehensible spaghetti. It happens over time as people abuse the power of object-oriented programming to push through hacks that would make no sense to them if they understood the program they were modifying and if more specific (again, less powerful) abstractions were used. Of course, this also means that fault for spaghettification is everywhere and nowhere at the same time: any individual developer can make a convincing case that his changes weren’t the ones that caused the source code to go to hell. This is part of why large-program software shops (as opposed to small-program Unix philosophy environments) tend to have such vicious politics: no one knows who’s actually at fault for anything.

Incremental code review is great at catching the obvious bad practices: mixing tabs and spaces, bad variable naming, lines that are too long. That’s why the more cosmetic aspects of “bad code” are less interesting (reading “interesting” as “worrisome”) than spaghetti code: we already know how to catch them in incremental review, and we can even configure our continuous-integration servers to reject such code. Spaghetti code, which has no mechanically checkable definition, is difficult if not impossible to catch that way. Whole-program review is necessary to catch it, but I’ve seen very few companies willing to invest the time and political will necessary to make whole-program reviews actionable. Over the long term (10+ years) I think it’s next to impossible, except among teams writing life- or mission-critical software, to sustain that level of discipline in perpetuity.

The answer, I think, is that Big Code just doesn’t work. Dynamic typing falls down in large programs; static typing fails in a different way. The same is true of object-oriented programming, of imperative programming, and, to a lesser but still noticeable degree (manifest in the increasing number of threaded state parameters), of functional programming. The problem with goto wasn’t that it was inherently evil so much as that it allowed code to become Big Code very quickly (i.e. the threshold of incomprehensible “bigness” shrank). The frigid-earth reality of Big Code is that there’s no silver bullet: large programs simply become incomprehensible. Complexity and bigness aren’t “sometimes undesirable”; they’re always dangerous. Steve Yegge got this one right.

This is why I believe the Unix philosophy is inherently right: programs shouldn’t be vague, squishy things that grow in scope over time and are never really finished. A program should do one thing and do it well. If it becomes large and unwieldy, it’s refactored into pieces: libraries and scripts and compiled executables and data. Ambitious software projects shouldn’t be structured as all-or-nothing single programs, because every programming paradigm and toolset breaks down horribly on those. Instead, such projects should be structured as systems and given the respect typically given to such. This means that attention is paid to fault-tolerance, interchangeability of parts, and communication protocols. It requires more discipline than the haphazard sprawl of big-program development, but it’s worth it. In addition to the obvious advantages inherent in cleaner, more usable code, another benefit is that people actually read code, rather than hacking it as-needed and without understanding what they’re doing. This means that they get better as developers over time, and code quality gets better in the long run.

Ironically, object-oriented programming was originally intended to encourage something looking like small-program development. The original vision behind object-oriented programming was not that people should go and write enormous, complex objects, but that they should use object-oriented discipline when complexity is inevitable. An example of success in this arena is in databases. People demand so much of relational databases in terms of transactional integrity, durability, availability, concurrency and performance that complexity is outright necessary. Databases are complex beasts, and I’ll comment that it has taken the computing world literally decades to get them decent, even with enormous financial incentives to do so. But while a database can be (by necessity) complex, the interface to one (SQL) is much simpler. You don’t usually tell a database what search strategy to use; you write a declarative SELECT statement (describing what the user wants, not how to get it) and let the query optimizer take care of it. 

Databases, I’ll note, are somewhat of an exception to my dislike of Big Code. Their complexity is well-understood as necessary, and there are people willing to devote their careers entirely to mastering it. But people should not have to devote their careers to understanding a typical business application. And they won’t. They’ll leave, accelerating the slide into spaghettification as the code changes hands.

Why Big Code? Why does it exist, in spite of its pitfalls? And why do programmers so quickly break out the object-oriented toolset without asking first whether the power and complexity are needed? I think there are several reasons. One is laziness: people would rather learn one set of general-purpose abstractions than study the specific ones and learn when each is appropriate. Why should anyone learn about linked lists and arrays and all those weird tree structures when we already have ArrayList? Why learn how to program using referentially transparent functions when objects can do the trick (and so much more)? Why learn how to use the command line when modern IDEs can protect you from ever seeing the damn thing? Why learn more than one language when Java is already Turing-complete? Big Code comes from a similar attitude: why break a program down into small modules when modern compilers can easily handle hundreds of thousands of lines of code? Computers don’t care if they’re forced to contend with Big Code, so why should we?

However, more to the point, I think, is hubris with a smattering of greed. Big Code comes from a belief that a programming project will be so important and successful that people will just swallow the complexity: the idea that one’s own DSL is going to be as monumental as C or SQL. It also comes from an unwillingness to declare a problem solved and a program finished even when the meaningful work is complete, and from a misconception about what programming is. Rather than solving a well-defined problem and then getting out of the way, as programs do under a small-program methodology, Big Code projects become more than that. They often have an overarching and usually impractical “vision” that involves generating software for software’s sake. This becomes a mess, because “vision” in a corporate environment is usually bike-shedding that quickly becomes political. Big Code programs always reflect the political environment that generated them (Conway’s Law), and this means that they invariably look more like collections of parochialisms and inside humor than like the more universal languages of mathematics and computer science.

There is another problem in play. Managers love Big Code, because when the programmer-to-program relationship is many-to-one instead of one-to-many, efforts can be tracked and “headcount” can be allocated. Small-program methodology is superior, but it requires trusting the programmers to allocate their time appropriately across more than one problem, and most executive tyrannosaurs aren’t comfortable doing that. Big Code doesn’t actually work, but it gives managers a sense of control over the allocation of technical effort. It also plays into the conflation of bigness and success that managers often make (cf. the interview question for executives, “How many direct reports did you have?”). The long-term spaghettification that results from Big Code is rarely an issue for such managers. They can’t see it happen, and they’re typically promoted away from the project before it becomes an issue.

In sum, spaghetti code is bad code, but not all bad code is spaghetti. Spaghetti is a byproduct of industrial programming that is usually, but not always, an entropic result of too many hands passing over code, and an inevitable outcome of large-program methodologies and the bastardization of “object-oriented programming” that has emerged out of these defective, executive-friendly processes. The antidote to spaghetti is an aggressive and proactive refactoring effort focused on keeping programs small, effective, clean in source code, and most of all, coherent.

49 thoughts on “What is spaghetti code?”

  1. We Scheme programmers like gotos, but we call them “proper tail calls”, because they pass arguments before doing the goto. We even typically write loops using gotos, though not conditionals.

    The really dangerous part of a conditional goto is not the goto, it’s the “doesn’t goto”.

  2. You’re covering a lot of ground, so let me see if I got everything. There’s three core constraints here:

    a) Merely reading code never conveys the mental model of the original writer. Peter Naur explored this ‘second law’ of programming (http://alistair.cockburn.us/ASD+book+extract%3A+%22Naur,+Ehn,+Musashi%22). Successful programs last, long-lived programs change hands, and their underlying coherence is subject to a lengthy game of telephone that no amount of code-level refactoring will fix. It takes face time to convey the theory of a program, to make explicit the grain of its texture. But face time is all too often too expensive to be viable.[1]

    b) In combating gradual decay, organizations reach for rules. Eschew goto’s, comply with the style guide, all languages not permitted are forbidden, thou shalt this and thou shalt not that. But rules are subject to the same sorts of decay as code. They have a way of gradually growing and multiplying, of focusing the attention on the superficial ‘source code’ of the rule rather than the underlying ‘coherence’ of insight. Soon they fill the horizon; rule makers limit themselves to enforcing compliance, and followers’ jobs are increasingly defined by what they cannot do.

    c) Evaluating code and rules is hard and takes time[2]. They create secondary effects and perverse incentives. There’s usually no way to tell (short of doing it again yourself) if a design is as complex as it needs to be, or over-engineered. Especially if you’re paying your programmers by how impressive their code seems, or how many lines of code they contribute. Or your managers by how much they’re chaperoning/steering their programmers[3]. Often the only way to evaluate something is to use it for a while, keep an open mind and consciously put off a decision until things become clear. You can’t do this to a schedule; it’s hard to incent people to do this at scale, rather than just go through the motions. Perhaps it just isn’t something everyone can put up with[4].

    This all seems really hard. I reread the last paragraph of your post, and suddenly it seems a bland anodyne — it’s well-known that keeping programs clean, coherent and effective requires proactive refactoring. The conclusion also undersells the rest of your post — your description of the structure of incentives that encourage Big Code is far more important to my mind.

    I’m not sure what solution I would propose instead. Institutions are valuable to the extent that they focus on the tangible and deal in certainties. Until we find the right metrics for knowledge work, perhaps we should back off and give way to human nature a bit. Programmers seem to tend to take ownership of programs that fit in their heads. Give up on ‘maintaining coherence’ as a lost cause, and focus instead on comprehensive tests. At least then you have a prayer of periodically refactoring into a new coherence. Tests are the ultimate documentation.

    Are goto’s always bad eventually? Who cares? There’s far bigger problems with gradual change in codebases, let’s think harder about them. My peer reviewers, hold off review until you’ve used what I’ve built and found problems with it. Managers and programmers, wrestle with judgement everyday, the stuff not easily put into rules. Leave the comforting shallows and engage with the abyss.

    [1] This is a problem not just for programming, but writing in general: http://www.ribbonfarm.com/2012/01/11/seeking-density-in-the-gonzo-theater.

    [2] http://www.youtube.com/watch?v=lU5OgrHQd7s

    [3] http://news.ycombinator.com/item?id=4339424

    [4] http://www.newyorker.com/reporting/2012/07/30/120730fa_fact_gladwell

  5. While goto certainly has a greater potential for ‘spaghettification’, any branching statement adds to that problem. That includes if, switch, break, continue, and any kind of loop as well. The only reason for goto to be worse than the other control statements is that it may jump out of or into a scope, and, consequently, that the flow of control is harder to follow because goto does not relate to scope.

    You could measure spaghettification like this:
    spaghetti_level = 2^(number of branching statements inside a function).
    That is the (minimum) number of cases you need to verify when refactoring. A 2000-line function with one or two gotos is not a problem (2^2 = 4 cases). A 100-line function with 20 if-statements is (2^20, over a million)!

    Of course, the frequency of control statements in code is usually quite stable for each programmer, so the conclusion is to just write shorter functions, as that will automatically reduce the number of branches.

    Disclaimer: I don’t use goto, and discourage its use. I merely wanted to point out that other control statements may cause just as much of a problem when overused.

    • Having read quite a bit of spaghetti code, I would have to respectfully disagree. There are doubtless ways to abuse the other control statements, but the use of goto statements in an unstructured program is qualitatively different and in my opinion the resulting code is much worse to understand, modify, and test. Those other control statements are the alternative to goto and the cure for spaghetti code. Also, if I had the stomach for it, I’ll bet I could write a 2000 line program with 2 gotos that would take you hours to figure out.

  6. One reason for spaghetti code is as follows:

    There was a guy who knew nothing about programming until he started to work for us. He came to us with a Maths degree.

    To everything he would reply “That should be pretty easy”, which was an insult to his colleagues, some of whom had been working in the field before he started university. But in reality nothing was easy for him, as we later discovered.

    He taught himself C/C++, really poorly. He got hypnotised by design-pattern books. To solve the simplest of problems he used superfluous design patterns. Design patterns have a place, but only when used correctly.

    He used his role here as a training platform and peppered all his deliverables with poorly understood (by him) design-pattern techniques, with no consideration for his colleagues who would have to maintain his poorly constructed code.

    He’s gone on to work in The City for a well-known company that provides financial products to the financial-market sector. Goodness help us.

    • Sounds pretty bitter, patrick_a.
      Many people come into software from other disciplines, particularly maths, science and engineering, and even Software Engineering and CS degrees don’t necessarily provide a graduate with all of the skills required for a career in software; as with most careers, much of the learning is on the job.
      Sounds like the guy was trying to learn some good software practices with design patterns, even if they were badly applied. Guidance from more senior mentors, established practices and procedures, code reviews etc. can shape mediocre programmers into good software devs as long as they’re smart and willing to learn. Was anyone mentoring this guy or reviewing his code?
      “He came to us with a Maths degree” – Maths degrees are hard and require an aptitude for structured, logical thinking: good attributes for a programmer, I’d say. On the other hand, if this guy was some smart-ass unwilling to listen or learn, you’re probably better off without him.

      Anyway, good article, really got me thinking about the UNIX vs Big Code philosophies on a system level but also on a project level, i.e. within a large code base having discrete blocks of code (packages, classes, namespaces) that can exist on their own as independent artefacts.
      I think that to an extent a good unit-testing procedure can help facilitate this. If you think of code in terms of testability at the unit level, you’re pretty much forced to break code down into small, self-contained chunks while minimising reliance on class-level state variables (which, in large spaghettified classes, can be the new global variables), so that each piece of code is in some way meaningful in its own context. Spaghetti code is very hard to write unit tests for.

    • Interesting comment about that Maths guy who used superfluous design patterns. I took a similar attitude to yours on a project: I just solved the problem rather than thinking about how to make it all object-orientedy. I then interviewed for a gig where I had to explain how I designed that previous project, and the interviewer said I should have used more object-oriented designs, and that I probably would not be a good fit in his shop.

  7. It’s what a compiled assembly program actually does in order to transfer control,

    Er… I think you’ll find most people/compilers use CALL/RET for function calls. Branches, loops etc. use JMP or more commonly conditional branch/jump instructions.

  8. Even worse was the COBOL Altered GO TO. The actual destination of this branching instruction was set in various places in the program. The effect was a dynamic redirection of code that was almost impossible to debug visually. The ultimate in pasta code…

  9. “Spaghetti is only as good as the chef who prepares it.”
    “Some people do not like spaghetti because they are lazy.”

    Spaghetti westerns:
    First unloved, then ridiculed, and eventually the same people who spat on them declared them pure art. Amazing, isn’t it?

    Bad computer program code is code with errors in it, syntax errors or semantic errors!

    There is no such thing as ugly or beautiful computer program code.
    There is functional and non-functional computer program code.
    There is readable and understandable computer program code, or not.

    Any computer program code that does the job is good code, “spaghetti” or not.
    Everything else is a matter of taste, which I do not often discuss.
    “Spaghetti code” is perhaps just someone’s style of programming computers.
    If you do not like his style, do not hire him.
    Twenty-five years ago, when I was learning the BASICs of programming, my professor hated the ‘go to’ statement (or was it ‘goto’?). At that time OOP was science fiction to me.
    Some problems could not be solved without it, and why should they be?
    Perhaps we should find out who first invented such a statement and ask him what its purpose is.

    I do not think that the ‘go to’ statement itself has anything to do with “spaghetti code”.

    Inadequate knowledge of programming techniques, and of solving problems by creating algorithms and implementing them in an unsuitable programming language, has everything to do with generating “spaghetti code”.
    If you want an obvious example of “spaghetti code”, look at the link below: a computer program written in BASIC for solving a problem in combinatorics (the program was written on 27 June 1987).

    http://www.dejanristanovic.com/refer/kombin.htm

    Now, what is the spaghetti here?
    Perhaps you do not like it, or can’t understand or read it, because it was not written in English. Shall we call it ugly, or non-functional?
    Try to follow the ‘gosub’ statements (or was it ‘go sub’?) through the code and you shall find why we call it “spaghetti code”. In this example the ‘go to’ statement has nothing to do with it.
    From the time that program was written to this particular moment, I find its code amusing, funny, but also functional and educational, and hard to track as well. Perhaps I shall write some “spaghetti code” for fun, or for a future competition organized by someone who today is “too lazy”.

    It seems that Michael O. Church is stretching the term “spaghetti code” to something that is, for me, theoretically impossible.

    Perhaps his explanation of present-day “spaghetti code” should be called
    “shredded code”. That is as far as I can follow his article.

    I am still learning programming techniques, because computer programming keeps developing as time passes. Some techniques I don’t like, so I do not use them. But I do not spit on them or curse them; it may come back in my face.

    The article is good, since it provokes a reaction.

    All the best,
    Perić N. Željko
    periczeljkosmederevo@yahoo.com

  10. I was reviewing a slew of stored procs just yesterday, looking for ways to speed them up, and was surprised that one contained a couple of go to statements. It made me queasy but it was plain enough what they were doing.

    The simplest definition I know for “spaghetti code” is this: “Code someone else wrote!” Or, an alternative, “Code you wrote yourself but over a year ago!” Just kidding…

    “Macaroni Code” uses meaningless identifiers, such as “A1” or “pxg”.

    It is interesting that because objects retain state and parameter info, and thus have unpredictable side effects, they can’t safely be multi-threaded, which makes them unsuitable for big-data situations. To the threading manager they are “spaghetti code”.

  11. Software is fairly unique in that the final product is not immutable. A vehicle designer has to ‘get it right’ before the product goes to manufacturing, while a software product can continue to evolve. The vehicle designer will learn how to improve the product over time, but those improvements will rarely be retrofitted into an existing product, unlike in software.

    And it’s not that developers are guilty of writing code without a plan either; managers and business leaders see the mutability of code as a reason to shorten schedules. A formal ‘design’ stage is unnecessary as the developer will just ‘make it work’.

    And the problem domain will shift with changing business requirements and the addition of features, so that the nice OOP framework that the professional developer spent his time on will have to be hacked to make it work.

    UNIX, with its dedicated component model, is not immune to this either.

  12. Judicious use of gotos can be infinitely preferable to multiple interjections of external calls.

    External calls should not be used to create procedural brevity, but to reflect functionality and to reduce duplication.

    If it helps to keep decision blocks short and labels are meaningfully named, gotos can be by far the best way of structuring choices that can more efficiently share the same local declarations and assignments.

    The overriding objective always has to be to keep all procedure blocks as succinct as possible, to reduce duplication to the absolute minimum and to express logical flows as clearly as possible. If goto does that best in a particular situation, then it’s stupid not to use it, just like it’s stupid to use any device if you don’t need to.

    You can create impenetrable spaghetti without using goto at all, and it’s a programmer’s central craft to make their program flows easily intelligible both to their peers and, in six months’ time, to themselves! That’s the acid test, not some anti-goto doctrine.

  13. There are situations when you have to use GOTO to speed things up, and that is when you are programming an embedded PLC and you need the quickest real-time program you can make.
    I work in automation, where a delay of a few milliseconds can cause two or more PLCs and robots to lose synchronization and damage the product; if they are mechanically intertwined it can damage the machine, and at worst it can injure a technician.
    When I am dealing with real time on a PLC, I prefer programmable ladder logic to structured text because it is easier to troubleshoot and monitor: put most of the variables in globals so you can access them from any location with ease, avoid functions with parameters and return statements, and use GOTO (which in the PLC case is JMP) to bypass code that under certain conditions need not be executed.

    A PLC is a different beast: you can write 100 separate routines and all 100 can run in parallel, independent of each other. That is also the beauty of a real-time PLC.

    Because this kind of parallel programming uses global variables and GOTO (JMP) on a large scale, it comes with a huge disadvantage: it is a pain to troubleshoot, especially when a variable changes when it is not supposed to, because every function that accesses it and uses GOTO is a suspect.

    The only solution is to be good at real-time programming.

  14. I like the notion that spaghetti is Big Code, or, as I understand the author, code where it is necessary to read the whole program to understand any part. That is my bête noire with many ill-constructed object-oriented programs. I am all in favor of object orientation. But when I have to open and read the source code of 40 methods in 20 classes across 10 assemblies in 10 different solutions to understand the intent of a 5-line method in the class I am troubleshooting, then object orientation has gone too far.

  15. In C, I sometimes use ‘goto’s to transfer control to the end of a function instead of returning immediately in mid-function, where common cleanup or logging can be done. This reduces the possibility of errors during code maintenance.
    In C++, this isn’t necessary, thanks to RAII.
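    A minimal sketch of both styles (function and file names invented for illustration):

        #include <stdio.h>
        #include <fstream>

        /* C style: every failure path funnels through one cleanup
           and logging point at the end of the function. */
        int process(const char *path) {
            int rc = -1;
            FILE *f = fopen(path, "r");
            if (!f) goto done;
            /* ... work that may fail, each failure doing goto done ... */
            rc = 0;
        done:
            if (f) fclose(f);
            printf("process: rc=%d\n", rc);  /* common logging point */
            return rc;
        }

        /* C++ style: RAII makes the goto unnecessary. */
        int process_raii(const char *path) {
            std::ifstream f(path);  // closed automatically on any return
            if (!f) return -1;
            /* ... work that may fail; early returns are safe ... */
            return 0;
        }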

    • This one exception was also acceptable to many fine developers in COBOL and probably FORTRAN: To drop to the end of a function before exiting, perhaps due to an error or because no further processing is required. The alternative just creates more convoluted and lengthy IF or ELSE conditions. GOTO xxx-EXIT becomes an obvious and useful convention. I’ve coded both ways to stay consistent with different code sets, but probably prefer to not use the GOTO.

  16. The absolute worst program I ever had to debug not only contained GOTO, but also had ALTER GOTO. That meant the original GOTO could be changed anywhere in the program.

  17. This observation has been made by others elsewhere: in the construction trade we recognize the difference between architects, contractors and tradespeople.

    Why do we as software professionals persist in the insanity that “coding” is the only craft necessary to our discipline, and that we can reason about our work without any representation of structure other than the final product?

    To illustrate: If people use patterns inappropriately, it’s because they don’t go through the formal process of mapping the context of application to the description of the pattern. Doh!

  18. I agree that the issue is program size. “Linuxing” of modern programs is difficult because the compilers/Windows generally don’t allow for attaching executable code on demand.

    For example, our primary product does payroll. This involves dozens of different routines that may or may not be generally applicable, for example printing a report for a specific state.

    In earlier times BASIC compilers existed that created a sort of “subordinate” executable called a chain file. Modern-day PowerBasic still retains that feature.

    The chain file was self-contained and callable on demand. It was neither a DLL nor an OOP construct. Because it was self-contained, it never grew as requirements changed – you just rewrote it to the new requirements and let the calling program decide which one to chain to.

    (I know, I know – BASIC is not C so it must be bad.)

    I also completely agree with your comments about OOP. I’d like to share a story that exemplifies just how insane OOP can be.

    When OOP was first being propounded, there was a 3-page article in one of the (now-defunct) programming magazines describing, in great detail and with about 1000 lines of code, how to create an object to read a text file.

    The same process could (and still can) be accomplished in 6 lines of BASIC!

  19. Never underestimate the perverseness of old-time programmers. As recently as 2007 I worked with a senior developer who refused to let me change spaghetti code into a structured program because they wouldn’t be able to read it. The spaghetti code – COBOL with impressively bad GOTOs – was only a few thousand lines and entirely the product of that one programmer. The person had been trained in structured programming but never adopted the practice because they perceived that it took longer. In the same time period I worked for multiple software shops where I couldn’t have held a job with work of that quality.

  20. If you take the fundamental precept that a programmer must write a clear, concise and articulate statement of a problem solution, you start to build a good base. The language doesn’t matter, the constructs don’t matter, as long as it’s clear and correct.

    Spaghetti code is often a result of the programmer not understanding the problem being solved before they start coding.

    Reasonable code can degrade to spaghetti code by inexpert modification.

    If you have to deal with crap code (and we all do in our careers) it is really, really helpful to leave comments about what you have discovered, and what you have done to resolve the defect you’ve been called on to correct, or the added function you’ve included. That’s the sauce that can make spaghetti code sort of palatable for someone else.

    As I’ve grown old in this game, I have the confidence and courage to take my time, because I know from sometimes bitter experience that if I don’t, I’ll end up taking much longer than I should’ve, and the product I produce will not be as good.

    My personal mantra:
    Understand clearly what the task at hand is. Write a good articulate and elegant solution. And don’t be afraid to edit your own, or others, work because no one gets it right first time.

    I’m pretty new to this game. I only started in 1966. But over the last 46 years the debate has continued with each new generation thinking it has discovered something new. Bit like sex really.

  21. I would think that the measure of spaghettification of code is related to the inter-modular dependencies. Given modules A,B,C then a totally pastafied program has A=>B, A=>C, B=>A, B=>C, C=>A, C=>B where => means “depends on” (in that it calls code or uses definitions contained in the translation unit). Note that the dependencies here are two-way; whereas a significantly less Italian codebase might have A=>B, A=>C, B=>C.

    If this is the case, then it has nothing to do with “goto” per se. I’m currently mired in a C program where, instead of the OO paradigm in which functions that are logically related are also physically related, functions are located either by the type of thing they do or are (e.g. I’m a “click handler” so I belong in “handlers.c”) or simply by convenience (write the function at the site you need it, even though it replicates existing functionality and totally doesn’t belong there). This tends to result in exactly the sort of bi-directional dependencies that mean – in short – that to understand A, you need to understand B and C too. It also means that A is not independently (unit) testable, and cannot be reused without B and C.

    Dependencies. They screw you over.

  22. IMO, the essential reason for spaghetti code, besides the big code syndrome, is that often the original, initial design isn’t really fit for the purpose of the application – in fact, it actually creates very sound and solid conditions for spaghetti code to develop.

    What I mean, in more detail: IME, avoiding spaghetti code is only possible if the architecture/design/metaphor/whatever you like to call it of the application introduces some (very few) essential interfaces which actually model the basic notions from the domain-specific terminology. For example, when you have a state machine, you should have a State and a Transition interface, when you have an ATM app you should have interfaces such as Account, Transaction or Slip. This way, you force programmers to develop to those interfaces, and, if your interfaces specify small enough objects, chances are you won’t get spaghetti code which would be difficult to refactor out of the way. Your very few and basic interfaces become the equivalent of pipes on *n*x, only limited to the comparatively extremely small world of your app.

    By contrast, if you implement a state machine by simply wiring together unrelated classes with no common interface, sharing or not sharing state across objects on a case by case basis, you open the door for two sources of spaghettification: for one, everybody can implement as much or as little of the state machine as he desires inside a single class, and second, instead of reading several implementations of a few interfaces you have to read several unrelated classes. (I intentionally chose a state machine because it’s usually the devil for OO nazis – it doesn’t lend itself to a very clean OO implementation because you usually share state among objects implementing functionality, rather than grouping state and functionality together and hiding details away.)
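    (A hypothetical sketch of such interfaces, names invented for illustration:)

        #include <string>

        // Small interfaces modeling the domain vocabulary. Every
        // implementation is forced through them, so each class stays
        // small and readable on its own.
        struct State {
            virtual ~State() = default;
            virtual std::string name() const = 0;
        };

        struct Transition {
            virtual ~Transition() = default;
            virtual bool applies(const State& from, char input) const = 0;
            virtual const State& target() const = 0;
        };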

    Another solid foundation on which spaghetti code develops is a sick data structure. You can usually maintain an application with badly implemented functionality reasonably well as long as its data structure is sound, but no matter how sound the code, if the data structure is bad, maintenance will be full of ugliness. Whereas functionality may change often, data structures change seldom. Often enough, the database of a LOB app never reaches version 2.0. Provided the initial designer of the data structures did a poor job, code becomes riddled with all sorts of navigation through data and conversion from one format to another, long chains of getters and the like, leading not just to spaghettification but also to highly brittle code.

    OTOH, big code in itself isn’t bad, even when there’s no high complexity involved – i.e. relying on the compiler’s capability of handling millions of lines of code reliably isn’t bad in itself. What’s bad in big code isn’t its size, it’s its organization. A source tree with sources for thousands of small binaries, intended to interact and be used collaboratively is IMO not better and not worse than the same source tree getting compiled into a single huge binary, as long as the source tree is organized as a large set of small libraries. If you look at a multi-module maven project in Java, you might get what I mean. You may have hundreds of dependencies, but you don’t really care. Having these many dependencies is orthogonal to the quality of _your_ code, and the tool takes care of keeping rigid boundaries between different small sets of source code, allowing each library/package/project to be maintained separately.

  28. I used Basic when I first learned programming. Now, when I was in college the professors did some lessons on Goto’s and how they were bad and whatnot. We were taught to think along the lines of OOP. Anyway, as I read all this, I think we shouldn’t put all the blame on GOTO’s.

    GOTO’s, as I understand them right now, are essentially a branching statement. They’re – in effect – like function calls, except they don’t require a RETURN, and thus the similarity seems to end there. They also may not have built-in parameters, which adds to the confusion, as you’ll have to hunt the values down. Overall, it seems to me they’re just a poor means of creating subroutine functionality. It also seems the generally poor graphical user interfaces of those days contributed to this, as scrolling to find the subroutine may have been deemed easier than having to open a different file to look at it.

    A programmer can use lots of global variables and/or make confusing parameters and essentially duplicate the problems associated with GOTO/GOSUB. In fact, I think anytime a programmer doesn’t have a firm understanding of what they’re doing they will make bad choices and make unreadable code in much the same way.

    In my view, it’s the organization of code that matters and it must be minimal and concise. Adding unnecessary layers is counterproductive and also adds extra confusion. I think this can happen as software grows and the design loses focus. However, adding too few common interfaces can also confuse things by not keeping things orderly. It’s the same thing that happens if you stuck everything in one file rather than separating it according to its likeness.

    IMHO, a better way to think of “spaghettification” isn’t as “DO NOT USE GOTOS” but rather “MAKE SIMPLE UNDERSTANDABLE CODE.” Brevity is the soul of wit!!!! Cut out the needless processing, combine like terms, reuse common pieces of code, and above all, make it readable!!!!

    I myself have made many mistakes in my projects. I think as projects grow it becomes harder to figure out how to keep them organized and minimal and easy to read, especially because there’re so many nuances in the programming language and/or the compiler. This is also true when you have multiple people coming in and going out on the project. It’s engineering, and it’s hard, hard, hard to do really well, so you just have to work at it.

    • Wow you have really got that wrong, Jon.

      A GOTO is a direct jump to another place in your program, nothing at all like a FUNCTION or GOSUB.

      I’ve been programming for a very long time now and GOTOs were very common when I began. Today they are vilified, in some cases rightly, but still can be quite useful. Most commonly within procedures to trap ERROR conditions but they can also be helpful in densely nested loops.

      The secret to using them is to keep the jumps short.

  29. Oh and one other thing in regards to the last paragraph in my previous comment….

    Another nuance is the IDE you’re using! Believe it or not, it does factor in.
