On programmers, deadlines, and “Agile”

One thing programmers are notorious for is their hatred of deadlines. They don’t like making estimates either, and that makes sense, because so many companies use the demand for an estimate as a “keep ‘em on their toes” managerial microaggression rather than out of any real scheduling need. “It’ll be done when it’s done” is the programmer’s gruff, self-protecting, and honest reply when asked to estimate a project. Whether this is an extreme of integrity (those estimates are bullshit, we all know it) or a lack of professionalism is hotly debated by some. I know what the right answer is, and it’s the first, most of the time.

Contrary to stereotype, good programmers will work to deadlines (by which I mean work extremely hard to complete a project by a certain time) under specific circumstances. First, those deadlines need to exist in some external sense, e.g. a rocket launch whose date has been set in advance and can’t be changed. They can’t be arbitrary milestones set for emotional rather than practical reasons. Second, programmers need to be compensated for the pain and risk. Most deadlines, in business, are completely arbitrary and have more to do with power relationships and anxiety than anything meaningful. Will a good programmer accept business “deadline culture”, and the attendant risks, at a typical programmer’s salary? No, not for long. Even with good programmers and competent project management, there’s always a risk of a deadline miss, and the great programmers tend to be, at least, half-decent negotiators (without negotiation skills, you don’t get good projects and don’t improve). My point is only that a good programmer can take on personal deadline responsibility (in financial or career terms) and not find it unreasonable. That usually involves a consulting rate starting around $250 per hour, though.

Worth noting is that there are two types of deadlines in software: there are “this is important” deadlines (henceforth, “Type I”) and “this is not important” deadlines (“Type II”). Paradoxically, deadlines are attached to the most urgent (mission-critical) projects but also to the least important ones (to limit the use of resources), while the projects of middling criticality tend to have relatively open timeframes.

A Type I deadline is one with substantial penalties to the client or the business if the deadline’s missed. Lawyers see a lot of this type of deadline; they tend to come from judges. You can’t miss those. They’re rarer in software, especially because software is no longer sold on disks that come in boxes, and because good software engineers prefer to avoid the structural risk inherent in setting hard deadlines, preferring continual releases. But genuine deadlines exist in some cases, such as in devices that will be sent into space. In those scenarios, however, because the deadlines are so critical, you need professional project-management muscle. When you have that kind of a hard deadline, you can’t trust 24-year-old Ivy grads turned “product managers” or “scrotum masters” to pull it out of a hat. It’s also very expensive. You’ll need, at least, the partial time of some highly competent consultants who will charge upwards of $400 per hour. Remember, we’re not talking about “Eric would like to see a demo by the 6th”; we’re talking about “our CEO loses his job or ends up in jail or the company has to delay a satellite launch by a month if this isn’t done”. True urgency. This is something best avoided in software (because even if everything is done right, deadlines are still a point of risk) but sometimes unavoidable and, yes, competent software engineers will work under such high-pressure conditions. They’re typically consultants starting around $4,000 per day, but they exist. So I can’t say something so simple as “good programmers won’t work to deadlines”, even if it applies to 99 percent of commercial software. They absolutely will– if you pay them 5 to 10 times the normal salary, and can guarantee them the resources they need to do the job. That’s another important note: don’t set a deadline unless you’re going to give the people expected to meet it the power, support, and resources to achieve it if at all possible. Deadlines should be extremely rare in software so that, when true hard deadlines exist for external reasons, they’re respected.

Most software “deadlines” are Type II. “QZX needs to be done by Friday” doesn’t mean there’s a real, external deadline. It means, usually, that QZX is not important enough to justify more than a week of a programmer’s time. It’s not an actual deadline but a resource limit. That’s different. Some people enjoy the stress of a legitimate deadline, but no one enjoys an artificial deadline, which exists more to reinforce power relationships and squeeze free extra work out of people than to meet any pressing business need. More savvy people use the latter kind as an excuse to slack: QZX clearly isn’t important enough for them to care if it’s done right, because they won’t budget the time, so slacking will probably be tolerated as long as it’s not egregious. If QZX is so low a concern that the programmer’s only allowed to spend a week of his time on it, then why do it at all? Managers of all stripes seem to think that denying resources and time to a project will encourage those tasked with it to “prove themselves” against adversity (“let’s prove those jerks in management wrong… by working weekends, exceeding expectations and making them a bunch of money”) and work hard to overcome the gap in support and resources between what is given and what is needed. That never happens; not with anyone good, at least. (Clueless 22-year-olds will do it; I did, when I was one. The quality of the code is… suboptimal.) The signal sent by a lack of support and time to do QZX right is: QZX really doesn’t matter. Projects that are genuinely worth doing don’t have artificial deadlines thrown on them. They only have deadlines if there are real, external deadlines imposed by the outside world, and those are usually objectively legible. They aren’t deadlines that come from managerial opinion “somewhere” but real-world events. It’s marginal, crappy pet projects that no one has faith in that have to be delivered quickly in order to stay alive. For those, it’s best to not deliver them and save energy for things that matter– as far as one can get away with it. Why work hard on something the business doesn’t really care about? What is proved, in doing so?

Those artificial deadlines may be necessary for the laziest half of the programming workforce. I’m not really sure. Such people, after all, are only well-suited to unimportant projects, and perhaps they need prodding in order to get anything done. (I’d argue that it’s better not to hire them in the first place, but managers’ CVs and high acquisition prices demand headcount, so it looks like we’re stuck with them.) You’d be able to get a good programmer to work to such deadlines at around $400 per hour– but because the project’s not important, management will (rightly) never pay that amount for it. But the good salaried programmers (who are bargains, by the way, if properly assigned or, better yet, allowed to choose their projects) are likely to leave. No one wants to sacrifice weekends and nights on something that isn’t important enough for management to budget a legitimate amount of calendar time.

Am I so bold as to suggest that most work people do, even if it seems urgent, isn’t at all important? Yes, I am. I think a big part of it is that headcount is such an important signaling mechanism in the mainstream business world. Managers want more reports because it makes their CVs look better. “I had a team of 3 star engineers and we were critical to the business” is a subjective and unverifiable claim. “I had 35 direct reports” is objective and, therefore, more valued. When people get into upper-middle management, they start using terms like “my organization” to describe the little empires they’ve built inside their employers. Companies also like to bulk up in size and I think the reason is signaling. No one can really tell, from the outside, whether a company has hired good people. Hiring lots of people is an objective and aggressive bet on the future, however. The end result is that there are a lot of people hired without real work to do, put on the bench and given projects that are mostly evaluative: make-work that isn’t especially useful, but allows management to see how the workers handle artificial pressure. Savvy programmers hate being assigned unimportant/evaluative projects, because they have all the personal downsides that important ones do (potential for embarrassment, loss of job) but no career upside. Succeeding on such a project gets one nothing more than a grade of “pass”. By contrast, genuinely important projects can carry some of the same downside penalties if they fail, obviously, but they also come with legitimate upside for the people doing them: promotions, bonuses, and improved prospects in future employment. The difference between an ‘A’ and a ‘B’ performance on an important project (as opposed to evaluative make-work) actually matters. That distinction is really important to the best people, who equate mediocrity with failure and strive for excellence alone.

All that said, while companies generate loads of unimportant work, for sociological reasons it’s usually very difficult for management to figure out which projects are waste. The people who know that have incentives to hide it. But the executives can’t let those unimportant projects take forever. They have to rein them in and impose deadlines, with more scrutiny than important work gets. If an important project overruns its timeframe by 50 percent, it’s still going to deliver something massively useful, so that’s tolerated. What tends to happen is that the important projects are, at least over time, given the resources (and extensions) they need to succeed. Unimportant projects have artificial deadlines imposed to prevent waste of time. That being the case, why do them at all? Obviously, no organization intentionally sets out to generate unimportant projects. The problem, I think, is that when management loses faith in a project, resources and time budget are either reduced, or just not expanded even when necessary. That would be fine if workers had the same mobility and could also vote with their feet. The unimportant work would just dissipate. It’s political forces that hold the loser project together. The people staffed on it can’t move without risking an angry (and, often, prone to retaliation) manager, and the manager of it isn’t likely to shut it down because he wants to keep headcount, even if nothing is getting done. The result is a project that isn’t important enough to confer the status that would allow the people doing it to say, “It’ll be done when it’s done”. The unimportant project is in a perennial race against the business’s loss of faith in it, and it truly doesn’t matter to the company as a whole whether it’s delivered or not, but there’s serious personal embarrassment if it isn’t delivered.

It’s probably obvious that I’m not anti-deadline. The real world has ‘em. They’re best avoided, as points of risk, but they can’t be removed from all kinds of work. As I get older, I’m increasingly anti-“one size fits all”. This, by the way, is why I hate “Agile” so vehemently. It’s all about estimates and deadlines, simply couched in nicer terms (“story points”, “commitments”) and using the psychological manipulation of mandatory consensus. Ultimately, the Scrum process is a well-oiled machine for generating short-term deadlines on atomized microprojects. It also allows management to ask detailed questions about the work, reinforcing the sense of low social status that “conventional wisdom” says will keep the workers on their toes and most willing to work hard, but that actually has the opposite effect: depression and disengagement. (Open-back visibility in one’s office arrangement, which likewise projects low status, is used to the same effect and, empirically, it does not work.) It might be great for squeezing some extra productivity out of the bottom half– how to handle them is just not my expertise– but it demotivates and drains the top half. If you’re on a Scrum team, you’re probably not doing anything important. Important work is given to trusted individuals, not to Scrum teams.

Is time estimation on programming projects difficult and error-prone? Yes. Do programming projects have more overrun risk than other endeavors? I don’t have the expertise to answer that, but my guess would be: probably. Of course, no one wants to take on deadline risk personally, which is why savvy programmers (and almost all great programmers are decent negotiators, as discussed) demand compensation and scope assurance. (Seasoned programmers only take personal deadline risk with scope clearly defined and fixed.) However, the major reason for programmers’ hatred of deadlines and estimates isn’t the complexity and difficulty-of-prediction in this field (although that’s a real issue) but the fact that artificial deadlines are an extremely strong signal of a low-status, unimportant, career-wasting project. Anyone good runs the other way. And that’s why Scrum shops can’t have nice things.

Software engineer salaries aren’t inflated– at least, not for the 99%

It’s autumn 2013, and there’s a lot of discussion around the current bubble (now obviously one) in the VC-funded technology world and how it will end. Business Insider acknowledges that a bubble exists, but gets some crucial details wrong. Let’s talk about one that most of us actually care about. Business Insider claims: “It’s not just tech asset prices that are high. Salaries are high, too.” Them’s fighting words. Is it true? Well, sort of. Here’s the evidence, from tech recruiter Matt Allen:

Instead, we’re seeing sign-on bonuses for individuals five-years out of school in the $60,000 range. Candidates queuing-up six, eight or more offers and haggling over a few thousand-dollar differences among the offers. Engineers accepting offers and then fifteen minutes before they’re supposed to start on a Monday, emailing (not calling) to explain they found something better elsewhere.

Ok, let’s dissect this. One: a few people (and it’s not clear that they’re engineers) are getting huge signing bonuses. $60,000 isn’t a number to sneeze at, but it’s not that extreme. Management-level hires typically get signing/relocation bonuses that cover the cost of a cross-country move (easily over $20,000, for people with families) and there’s no reason software engineers shouldn’t get the same. Additionally, signing bonuses usually have clawback provisions if the employee leaves (even involuntarily) in the first year, penalizing the job-hopping for which the worst of our generation is known. Given the tax penalty associated with receiving a bonus and risking having to pay it back, I’m not sure I’d want a $60,000 bonus under typical terms. Two: some candidates are queuing up 6 to 8 job offers. I call bullshit on that one, if only because of the scheduling difficulties in a startup ecosystem where 7-day exploding offers are the norm. I’m sure there are people getting 6-8 offers in the course of an entire job search (I’ve had that) and that people are claiming to have portfolios of excellent offers in negotiation, but the logistics of getting 6 active, credible job offers at one time are unfavorable, to say the least. Three: people are being unprofessional dickbags, pulling out of accepted offers on their start date. I’m sure that that is happening, but how is an occasional episode in which a privileged young hotshot acts like a jackass newsworthy, much less the sign of a bubble? It’s not.

Managers and product executives are making a killing in the present-day startup economy, no doubt, and some of those people might be able to pass as programmers due to some PHP scripts they wrote in their teens, but when one actually studies the contemporary startup economy, there are not a lot of software engineers making over $200,000 per year outside of finance– and those who are tend to be either very good or unusually lucky. For a VC-funded startup to offer $200,000 to an engineer would be incredibly rare, even in the Bay Area, and equity allotments after VCs are involved are notoriously stingy.

Twenty years ago, when startups were underdogs almost by definition, the scene had a “Revenge of the Nerds” feel. A bunch of ragtag computer aficionados, typically from middle-class backgrounds and far away from the East Coast’s financial and corporate elite, were showing up the old guard. New, powerful technologies were being developed, and power shifted (temporarily, perhaps) to those few who understood them at a deep level. There was slight subversion of the 1%; they weren’t destroyed or even harmed, but they were visibly outperformed. By contrast, the hot properties of the current VC-funded world almost entirely come from the 1%. Behind almost every single one of the VC darlings, there’s a series of strings pulled by powerful people repaying favors to the rich daddies of the founders. There’s no meritocracy in it. It’s not a challenge to the established and rich; it’s a sideshow for the supercapitalists. In a surprising reversal, the old-style corporate world (and the enterprise companies existing and being formed to serve its needs) has a much more middle-class culture, because the current-day rich find it boring.

Software engineer salaries in the VC-funded world are not especially low (nor are they high). They’re usually 75 to 95 percent of what more typical employers are offering. Equity distributions, on the other hand, are extremely lopsided. I worked for a company once where the board refused to allow more than 0.04% to go to an engineer. (Why? Because fuck the people doing the work, that’s why.) There’s something that needs to be discussed here, because it applies to the age-old question of why people who do actual work are modestly compensated, while vacuous celebrity types take the lion’s share. It’s the Teacher-Executive Problem.

The Teacher-Executive Problem

As a society, we need teachers, police officers, park rangers, and other such people who are modestly compensated. We don’t need celebrities, business executives, or professional athletes. I’m not going to argue that the latter are overpaid, insofar as it’s difficult and competitive to get to the top ranks in any field. That would be a subjective argument; all I intend to say is that, objectively, the need for the latter class of labor is smaller. If we didn’t have teachers or police, society would fall apart. If we didn’t have corporate executives, companies would find other ways to survive. So why are the more necessary people paid less? Because being necessary means that more workers will be drawn into the field, and that limits individual compensation. We probably pay more, as a society, for teachers and police than we do for corporate executives (as we should) but the individual slices for the larger, more necessary, job categories are smaller.

We have 3 million teachers in the US, and we need that large number of them, because individual attention per student is important. The functioning of society would be greatly impaired if that number dropped to 2 or 1 million. One might argue that competent teachers are “worth” $200,000 (or much more) per year– and I’d say that the best are worth several times that– but can society afford to pay that much for teaching? Three million $200,000 paychecks is a $600-billion annual liability. Taxes would go up substantially– in a time when the base of political power is (unfortunately) divided between a structurally disadvantaged (read: mostly fucked) emerging-adult cohort and retiring Boomers whose children are out of school– and society would likely determine that $200,000 annual paychecks for teachers “can’t be afforded” (especially given the claim that “they get off work at 3:00”). $200,000 isn’t a large amount of money for a single person, but for people who are actually needed in significant numbers, the multiplier of 3 million makes it seem unacceptable. (I am not arguing that teachers don’t deserve $200,000 salaries; only that it would be politically impossible to get them there.)

By contrast, the social need for corporate executives (excluding entrepreneurs) is pretty minimal, and society recognizes this in a rational way: there aren’t a large number of slots; title inflation aside, there might be ten thousand truly executive roles in powerful companies. However, when the number of people performing a job is low, gigantic salaries (if those people control the distribution of resources) become socially affordable. Three million somewhat-high salaries are a problem; ten thousand enormous ones are not. This is paradoxical because the middle-class conceit is that the way to become wealthy is to make oneself valuable (or, better yet, necessary) to society. What the Teacher-Executive Problem shows us is that there’s more potential for outlier compensation in doing things that aren’t necessary, because asking for more compensation doesn’t carry the implicit multiplier based on the size of the labor base. Society “can’t afford” to pay the 3 million teachers such high salaries, but it can afford the huge salaries of corporate executives, and the $850-million acquisition that enriches the top executives of IUsedThisToilet.com.

Why do so few software engineers get a fair shake in the VC-funded world? They’re on the wrong side of the Teacher-Executive Problem. They’re actually necessary. They’re required in order for technology firms to function.

What about 10X?

The generally accepted consensus (even among software engineers) is that average programmers aren’t very valuable. They write all that buggy, hideous legacy code. There’s little that software engineers and business executives agree upon, but the low status of the average programmer is probably not a point of disagreement. I don’t care to speculate on what the “average” software engineer is like, because while I have seen a ton of incompetents (and a smaller number of good engineers) out there in the world, I don’t have a representative sample. I also think that most of the engineering incompetence comes not from a lack of ability, but from an anti-intellectual culture originating in business culture at large, as well as nonexistent mentoring, so it’s not programmers who are mostly at fault. However, I will agree readily that the bulk of software engineers don’t deserve high ($200,000) salaries. They might have the talent, but few have that level of skill.

However, there is the concept of the “10x” software engineer, one who is 10 times as productive as an average engineer. It reflects a truth of software engineering, which is that excellence and peak productivity are tens to hundreds of times more powerful than the average-case output. (In fact, that ratio is often infinite, because there are problems that require top talent to solve at all.) Moreover, groups of engineers often scale poorly, so a team of 10 isn’t really (most of the time) 10 times as productive as an individual, but maybe 2 or 3 times as strong. So it’s not surprising that a great engineer would be 10 times as valuable. The degree to which “10x” is real depends on the type of work, the context, project-person fit, and the competence of the engineer. It’s highly context-dependent, it’s not always the same people, and it’s quite unpredictable. The national average salary for a software engineer is about $90,000. The 10x-ers are not earning 10 times that and, to be honest about it, they probably shouldn’t. You can’t know, when hiring someone, whether the context that supports 10x output for that person is going to exist in the role. The bona fide 10x engineers typically earn 1.5 to 2 times that amount ($135,000 to $180,000) in the U.S. I’m not going to argue that they’re underpaid at this level– although, at least in comparison to MBA graduates earning twice that before age 30, I think they clearly are– but they’re far from overpaid at that level.

Why don’t 10x engineers get paid astronomical sums? In large part, I think it’s because of the context-dependent nature of “10x”. It doesn’t require only a good engineer, but a good engineer connected with the right kind of work. Companies can’t afford (obviously) to pay $900,000 salaries to senior engineers just on the hunch that those (typically highly specialized) talents will find a use. When engineers do find environments in which they can deliver 10x output, they’re happy– and they’re not liable to clamor for huge raises, or to move quickly and risk starting over in a defective environment. This isn’t especially wrong; engineers would rather have interesting work at “merely” 1.5x salaries than risk happiness and growth for a chance at more. It’s just worth pointing out to establish that, in general, software engineers (and especially the most capable ones) are not overpaid. Moreover, the people commanding $500,000+ salaries in technology are typically not real engineers, but managers who might occasionally “drop down” and hack on one of the sexier projects to keep their skills sharp. Finally, the few (very few) software engineers making that kind of money are generally worth it: we’re not talking about top-1% talent at that level, but top-0.05% (and a level almost never achieved before age 40). There are plenty of people drawing undeserved high salaries in the Valley, but they aren’t the ones writing code.

Why must I point this out?

This bubble, too, shall pass. The era when a well-connected rich kid can raise a $1-million seed round rather than eating his lumps in an investment bank’s analyst program will end. I don’t think that that’s a controversial assumption. Timing the end of a bubble is nearly impossible, and I don’t think anyone has shown reliability in that particular skill; but predicting that it will die off is trivial– they always do. When this happens, there will be a lot of job losses and belt-tightening. There always is. It’ll get ugly, and that’s fine. Most of these businesses getting funded and acquired don’t deserve to exist, and the economy will inevitably purge them. What I don’t want to see is the bubble made into an argument against the middle-class, hard-working software engineers. When the bubble ends, there will be losses to eat and austerity to go around, and it ought to go right to the people who reaped the benefits while the bubble existed. The end of the bubble should not be used to reduce the compensation of software engineers as a whole, whose pay is currently (I would argue) not unfair, but on the low side of the fair range.

For the 99 percent, there is no software engineer salary bubble.

VC-istan 1: What is VC-istan?

This is the first part of a series that was originally intended as one post, but will take several. For obvious reasons, I’d rather run five 1000-word posts than one 5000-word post. This assumes that I can keep a post short, brevity not being my strong suit. So, let’s get into it. What the fuck is VC-istan?

I’ve used that term a lot, and many take it to be pejorative, as if there were something negative about “stan”. That’s not correct. I frequently use “stan” to describe a metaphorical space of people, and it can be a group that I’m fond of (e.g. “Nerdistan”) or one I dislike. After all, “stan” only means a place where people live. The suffix itself is utterly neutral.

So what is VC-istan? It’s not just a snarky name for venture capitalists and the toxic technology ecosystem that some of them (some, I reiterate, not all) have created. VC-istan means something, and I’m going to explain just what.

The VC-istan Hypothesis

The VC-istan Hypothesis is that the leading venture capitalists, the “cool kids” in that set who all know each other, have created a postmodern corporate entity– possibly the first of its kind. It functions as one company at the top echelon, but has recognized that corporations themselves are disposable (much more so than projects or departments within a company, which are more difficult to cut). This insight isn’t new (shell companies are disposable by design) but the VC startup is a fully fledged, single-product enterprise designed entirely to take on a specific high-risk business gambit (usually in a red ocean where first-mover advantages dominate, making VC “rocket fuel” especially powerful). The disposable company inside VC-istan is a true startup, almost free-standing if you don’t see the strings.

The “marquee” investors function as the executives of this larger, shadowy, not-exactly-a-corporation entity that I’ve described. Middle managers, maligned as inefficient, corrupt, and often stupid, have been replaced with founders and startup executives who inhabit about the same economic and social stratum as their gray-suited forebears, but have sexier job titles: Senior VP of something no one outside California has heard of, as opposed to Associate Director at a household name. (In truth, I think those founders are no more virtuous, as a class, than the maligned middle managers of the old companies.) The tech press functions as a bizarre HR organization, as do the fully corporate bean-counters who sign off on acquisitions (the bonus committee). To attract top talent, one has to pay for it; the generous annual compensation for which banking is known has been replaced by less generous (but possibly more variable) lottery-ticket allocations that go by the name of employee equity.

The “one corporation” dynamic is held together by the collusive– and almost certainly illegal– reputation economy that the venture capitalists have created. As a group, they decide whether they like a company or not. If appropriate laws were enforced, stopping all of that anti-competitive note-sharing and forcing them to think independently, the VC-funded world would behave much more like a fair market. Then, there probably wouldn’t be a palpable entity or “scene” that one could call “VC-istan”. There would be a much more heterogeneous array of new businesses, not all concentrated in one sector or geographical area.

Like the Efficient Market Hypothesis, the VC-istan Hypothesis is neither fully true nor fully false. There is truth in it, but it is not absolute. It seems to be mostly true, now. Its truth might change as conditions do.

Why is VC-istan bad?

I find it important to describe VC-istan and why it emerged because, to be blunt, I want to kill it. In order to defeat it, we must know it. I want to bring truth forward, because its particular truth is ugly and, if proper information percolates, it will render the VC-istan route a second-tier career, starving it of talent unless it changes its ways. Every time a smart person learns what VC-istan truly is, its position becomes weaker, because it becomes that much harder for its established caste to peddle unreasonable dreams and broken promises.

It’s not that I want to end venture capital as a financing process. On the contrary, I think it’s a very good concept, if poorly implemented now. So, what went wrong? The feudalistic reputation economy of VC-istan has a simple (and disgusting) rule: the VCs are your owners. They are not partners; they can destroy your whole career by picking up a phone. If you say anything negative about a venture capitalist, much less sue one in the (admittedly, fairly rare) event that he robs you, you’ll never raise money again. This sort of blacklisting could not exist on a fair market, because a small set of people taking the “protect our own at all moral costs” approach could only exclude their enemies from a small social circle, and not preclude venture investment outright.

It is, of course, this collusion that enables investors to establish themselves as the executive suite, to which even the most talented entrepreneurs and technologists (who are too numerous and unruly to hold any kind of collusive arrangement together) are subordinate. On a fair market, investors, entrepreneurs, and technologists would not separate so cleanly into these distinct strata with the bottom of one still above or near the top of the next. Rather, the talented technologists would outrank the untalented investors. The parties involved would have to find mutual respect for one another as doing fundamentally different, but important, jobs.

Investors have always been the top jocks in VC-istan, but the distribution of power between business founders (i.e. people offering connections to investors) and technologists (i.e. those doing the actual work) seems to have shifted against the technologists. Of course, there are plenty of computer programmers (myself included) who frequently turn away solicitations from “idea guys” who “just need a programmer”, so it might seem that tech is hot right now. Not quite; most of those are cases of “8” technologists turning down “5” business guys. If you compare at parity, the business side has much higher status. A Business 8 is on the partner track at a top-5 venture firm and can raise a six- or seven-figure seed round on an idea; a Technology 8 cannot afford to buy a house in the Bay Area.

This lexicographic rank-ordering replicates the old industrial regime in which being an “8 worker” just didn’t matter, because a 2 manager outranked you. That led to a collapse of motivation at the bottom, caused a bilateral loss of trust, and produced the Theory X culture of a hundred years ago. It wasn’t pretty. I hope that the software industry doesn’t go that way and, to tell the truth, I don’t think that it will; I take it as obvious that technology is much stronger than VC-istan and will resurrect itself no matter what goes down in the Valley.

That said, VC-istan has replicated all of the worst aspects of corporate life, but almost none of its virtues. The old corporate regime was bureaucratic, slow, and prone to corruption and inefficiency, but there were good things about it. Companies invested in their people, and job losses were a rarity, taken as a sign of business failure, rather than an artifact of the mean-spirited “stack ranking” for which companies like Welch’s GE, Enron, and Microsoft have become notorious. The negatives of the old corporate regime were conformism, short-sighted profit-seeking behavior (externalized costs, mostly), subordination, political corruption and inefficiency, and the emergence of an exclusive, sexist and classist “old boys’ club” at the top. That those failures existed (and continue to exist) in many powerful corporations is not controversial.

Most VC-istan companies, as it turns out, have all of these negative traits as well! Startups that no-hire (or fire) over “culture fit” (often, synonymous with “being old, female, or of low socioeconomic origin”) are taking conformity to a far worse extreme than a 1970s-era corporation, which would make much more of an effort to find a place for a “nontraditional” but talented person. The evidence, in truth, is pretty strong that the stodgy bureaucracies of yesteryear were much fairer (if frustratingly impersonal, since that is usually a prerequisite for fairness) than the new VC-funded “lean” regime.

Sure, there are excellent startups that are free of the above-mentioned ills. They exist, no doubt. However, it’s no easier to find one of those than it was (or is, now) to find a bullshit-free niche of a large company. On the whole, I think trading in the large companies of the 1960s and ’70s for VC-istan has not been a good deal.

Then, there’s the issue of age discrimination. This will fuck all of us in the ass, and sooner than most of us think, so we need to address it head on. More importantly, ageism is just terrible for technology. Yes, the modeling profession has an expiration date as well; it’s not only programmers who face that. On the other hand, models get to start working in their teens, and there isn’t much of a learning curve, so the loss to the industry is minimal. In technology, on the other hand, it takes years of dedicated full-out work-your-ass-off experience to get any good. It’s an insult to programmers and technology as a whole that a group of talentless, superficial people who have no fucking idea what it takes to be any good at this work, have decided to impose age-grading that begins to close opportunities almost as soon as one has developed a passable level of skill.

In a later essay, I’ll reveal the true and secret cause of VC-istan ageism. It’s not lower wages or some belief that younger people are more creative. It’s far more damning. Here’s a hint. Watch The Office; pay attention to Michael Scott and Ryan Howard.

I don’t know if I will succeed in killing VC-istan. I don’t know what will replace it if I do. I do know that the guys at the top need us more than we need them. I’ll cover that in a future essay, as well.

What is not VC-istan?

Having described what VC-istan is and why I don’t like it, let me give a few comments on what it’s not. First, it’s not all of venture capital. I don’t consider biotechnology or clean energy to be part of VC-istan. The rules in those industries are different, because one actually has to know something about biology, for example, to launch a medical-device startup. The superficiality, ageism, celebrity culture, and lack of respect for hard work that characterize the current bubble crop are not found there; in fact, it’s the opposite, because there are objective goals to be met. VC-istan is focused on light technology, a term I use to describe the marketing experiments using technology that, in 2013, seem to outnumber and outcompete (for funding and attention) true technology companies. In light technology, the sole technical challenge is “scaling”, which can be back-filled once flush with cash and able to hire the Valley’s best engineers; but the glory goes to the investors and founders who “had the vision”. So perhaps I might say “venture-funded light technology” instead of VC-istan; the former sounds less pejorative, but VC-istan has fewer characters and it’s 10:40 at night so, fuck it, brevity wins.

What is it that marries VC-istan to light technology? After all, shouldn’t profit-maximizing venture capitalists go in search of meatier ideas that actually need the investment? Well, many do, and I’m not writing about them. I’m writing about the starfucker types who want to be profiled in business magazines, the ones who want social access more than success on the market, and who thereby sell out not only their own chances at success, but also human decency by creating the collusion and celebrity culture.

Where this ultimately leads is the Parable of the Bikeshed, or what Freud called the narcissism of small differences. People (even those in power) are generally willing to defer to the experts on the big, infrastructural, but usually aesthetically unpleasing (due to their complexity) matters, like how to design a nuclear power plant. “Just bring me the sausage; don’t tell me how it’s made.” On the other hand, on those matters that seem accessible, people form strong opinions. There is little power in the complex and objective, where those who are blatantly wrong will be punished regardless of who they are, and those who have knowledge generally got it through that rank-middling practice of working very hard; but much more power in the simple and subjective, over which simply having made the decision is victory. (The association of arrogant simplism with power seems to be present in literalistic religion, too. Notice how many historical deities had strong and explicit opinions about mandatory metaphysical beliefs and about gender roles; but said not a damn thing about how to make antibiotics, which would have actually been useful.) The consumer-oriented light technology that’s in vogue in modern VC-istan is a realm where it’s easy to debate what color to paint the shed, and that’s why it attracts the biggest narcissists.

Venture-funded light technology is, to put it bluntly, a multi-billion-dollar bike shed. In biotech, one can only fund companies whose founders actually understand the science, and pumping money into smiling, stylish idiots is equivalent to incinerating it. By contrast, the upshot of VC-istan light tech is that any idiot can come up with a plausible scheme, which is much to the benefit of sad-but-established men who “see things in” mediocre but superficially attractive suitors.

One might prefer that this arrogance remain in light technology, which would render it fairly harmless and, in fact, useful. Light tech isn’t a bad thing. Often, it does a much better job at marketing real infrastructural improvements than the inventors ever could. It also can disrupt established and parasitic rent-seekers. I’m not a fan of Uber, a service used mostly because it confers the value of being able to say one uses it (i.e. is able to afford it) but I do welcome anything that threatens to take down the Medallion Mafia. The problem with light technology’s bubbles is that they overreach and, when they get beyond their natural territory, the marketing wizards become devastatingly incompetent. We see a once-great city ruined by absurd rents due to the complete inability of the supposed wunderkinder to solve the city’s biggest problem: housing scarcity. We see that horrific, Aspergerian foray into politics that is Mark Zuckerberg’s FWD.us (which I call “fweed-oos” because I refuse to call it “forward” anything). We see a whole society beginning to hate technology because a few hundred overprivileged celebrity jackasses (most of whom haven’t written code for decades, if ever) are going out and making a bad name for all of us. That’s bad for the future of technology. It’s bad for those of us who are coming up.

So… fuck it, let’s see if we can end this shit on our own terms, or at least take it down a peg or few. My job is to bring truth, and then it’s up to us as a group to decide what to do with it.

Dimensionality– and also, a theory about the rarity of female programmers.

People have invented a number of theories regarding the relative lack of women in the software industry. I don’t intend to opine on each one of those theories. I’m pretty sure that it’s not a lack of ability, because the women who are in software tend to be quite good, and seem to be more capable on average than men in the industry. I’ve met very good male and female programmers, but the bad ones tend to be men. Now, some people attribute the rarity of female programmers to the macho cultures existing in badly managed firms, and I think that that’s part of it, but I also see the negative cultural aspects of this field resulting from the gender imbalance, rather than the reverse. After all, women had to overcome cultures that were just as bigoted and consumptive (if not more so) as the sort found in the worst software environments, in order to break into medicine and law; but eventually they did so and now are doing quite well. What’s different about software? I offer a partial explanation: social and industrial dimensionality. Perhaps it is a new explanation, and perhaps it’s a spin on an old one, but let’s dive into it.

In machine learning, dimensionality refers to the number of factors a problem presents. High dimensionality makes a problem far more complicated and difficult, because it’s usually impossible to tell which dimensions are relevant. Many problems can be framed as attempting to estimate a function (the predicted value according to the pattern the machine is trained to learn) over an N-dimensional space, where N might be very large. For example, Bayesian spam filtering amounts to a K-dimensional linear classifier (naive Bayes being the generative cousin of logistic regression), where K is the number of text strings considered words (usually about 50,000) and the inputs are the counts of each.
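
To make the “one dimension per word” framing concrete, here’s a minimal sketch; the four-document corpus and the scikit-learn calls are purely illustrative, not taken from any real filter.

```python
# Toy spam filter: each distinct word becomes one dimension of the feature space.
# (Hypothetical data; a real filter would see on the order of 50,000 word dimensions.)
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["win a free prize now", "meeting notes attached",
        "free money, click now", "agenda for tomorrow"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()          # builds the word-count feature space
X = vectorizer.fit_transform(docs)      # shape: (n_documents, K distinct words)
model = LogisticRegression().fit(X, labels)

print("dimensions (K):", X.shape[1])
print("prediction:", model.predict(vectorizer.transform(["free prize money now"])))
```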

A 10- or even 100-dimensional space is much larger than any human can visualize– we struggle with three or four, and more than 5 is impossible– but algebraic methods (e.g. linear regression) still work; at a million dimensions, almost all of our traditional learning methods fail us. Straightforward linear algebra can become computationally intractable, even with the most powerful computers available. Also, we have to be much more careful about what we consider to be true signal (worth including in a model) because the probability of false-positive findings becomes very high. For some intuition on this, let’s say that we have a million predictors for a binary (yes/no) response (output) variable and we’re trying to model it as a function of those. What we don’t know is that the data is actually pure noise: there is no pattern, the predictors have no relationship to the output. However, just by chance, a few thousand of those million meaningless predictors will seem very highly correlated to the response. Learning in a space of extremely high dimensionality is an art with a lot of false positives; even the conservative mainstay of least-squares linear regression will massively overfit the data (that is, mistake random artifacts in the data for true signal) unless there is a very large amount of it.
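
To see that false-positive effect directly, here’s a small simulation; the sizes are scaled down (100,000 noise predictors, 50 observations) so it runs in seconds, and the 0.4 cutoff is arbitrary.

```python
# Pure-noise predictors vs. a random binary response: with enough dimensions,
# some predictors look strongly "correlated" purely by chance.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_predictors = 50, 100_000        # far more dimensions than observations
X = rng.normal(size=(n_samples, n_predictors))
y = rng.integers(0, 2, size=n_samples)       # no predictor has any real relationship to y

# Pearson correlation of every predictor with the response, computed in one pass
y_centered = y - y.mean()
corr = (X - X.mean(axis=0)).T @ y_centered / (n_samples * X.std(axis=0) * y.std())

# Hundreds of "hits" appear from noise alone
print("predictors with |r| > 0.4:", int((np.abs(corr) > 0.4).sum()))
```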

We see this problem (of overfitting in a space of high dimension) in business and technology. We have an extreme scarcity of data. There are perhaps five or six important, feedback-providing events per year, and there are a much larger number of potentially relevant decisions that lead to them, making it very hard to tell which of those contributed to the result. If 20 decisions were made– choice of programming language, product direction, company location– and the result was business failure, people are quick to conclude that all 20 of those decisions played some role in the failure and were bad, just because it’s impossible to tell, in a data-driven way, which of those decisions were responsible. However, it could be that many of those decisions were good ones. Maybe 19 were good and one was terrible. Or, possibly, all 20 decisions were good ones in isolation but had some interaction that led to catastrophe. It’s impossible to know given the extreme lack of data.

Dimensionality has another statistical effect, which is that it makes each point an outlier, an “edge case”, or “special” in a way. Let’s start with the example of a circular disk and define “the edge” to mean all points that are more than 0.9 radii from the center. Interior points, generally, aren’t special. Most models assume them to be similar to neighbors in expected behavior, and their territory is likely to be well-behaved from a modeling perspective. In two dimensions– the circular disk– 81% of the area is in the interior, and 19% is on the edge. Most of the activity is away from the edge. That changes as the number of dimensions grows. For a three-dimensional sphere, those numbers are 72.9% in the interior, and 27.1% at the edge. However, for a 100-dimensional sphere, well over 99% of the volume is on the edge, and only 0.9^100, or approximately 0.0027%, is in the interior. At 1000 dimensions, for all practical purposes, nearly every point (randomly sampled) will be at the edge. Each data point is unique or “special” insofar as almost no other will be similar to it in each dimension.
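
Since the volume of a ball scales as the d-th power of its radius, the interior fraction under this definition of “edge” is simply 0.9 raised to the d-th power; a few lines reproduce the numbers above.

```python
# Fraction of a d-dimensional ball lying within 0.9 radii of the center is 0.9**d,
# so as d grows, essentially all of the volume sits "on the edge".
for d in (2, 3, 10, 100, 1000):
    interior = 0.9 ** d
    print(f"d = {d:>4}: interior {interior:.6%}, edge {1 - interior:.6%}")
```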

Dimensionality comes into play when people are discussing performance at a task, but also social status. What constitutes a good performance versus a bad one? What does a group value, and to what extent? Who is the leader, the second in charge, all the way down to the last? Sometimes, there’s only one dimension of assessment (e.g. a standardized test). Other times, there can be more dimensions than there are people, making it possible for each individual to be superior in at least one. Dimensionality isn’t, I should note, the same thing as subjectivity. For example, figure-skating performance is subjective, but it’s not (in practice) highly dimensional. The judges are largely in agreement regarding what characterizes a good performance, and disagree mainly on (subjective) assessments of how each individual matched expectations. But there aren’t (to my knowledge, at least) thousands of credible, independent definitions of what makes a good figure skater. Dimensionality invariably begets subjectivity regarding which dimensions are important, i.e. on the matter of what should be the relative weights for each, but not all subjective matters are highly dimensional, just as perceived color is technically subjective but generally held to have only three dimensions (in one model: red, green, and blue).

Social organizations also can have low or high dimensionality. The lowest-dimensional organization is one with a strict linear ordering over the people. There’s a chief, a second-in-command, a third and fourth, all the way down to the bottom. If you’re 8th out of 18 people, you know that you’re not 7th or 9th. Social status has one legible dimension: the specific assigned rank. Typical hierarchies aren’t that way; while there is a rigid status pyramid, same-rank people are not comparable against one another. In most organizations and groups, leaders are visible, and omegas might be legible because they serve a social purpose, too; but in the middle, it’s almost impossible to tell. Venkatesh Rao goes into a lot of detail on this, but the general rule is that every social group will have a defined alpha and omega, while members 2 to N-1 are incomparable, and the cohesion of the group and the alpha’s positional stability often require this. An independent #2, after all, would represent eventual danger to the chief, which is why proteges are only selected by alphas who plan on graduating to a higher-status group. What is it that keeps social status illegible within the group? Dimensionality. People have been comparing themselves against each other forever; what prevents the emergence of a total linear ordering is the fact that different dimensions of capability or status will produce different rankings, and there’s uncertainty about which matter more.

The group might have one main social status variable, and will usually arrange it so that only one person (or, at scale, a few) have truly elevated status in that dimension, because that’s necessary for stability and morale. Fights over who is #2 vs. #3 are an unwanted distraction. This leaves it to the people within the group to define social status how they will, and the good news for them is that most people can find things or combinations of things at which they’re uniquely good. People find niches. In The Office, people who will never be managers take solace in joining clubs like the Party Planning Committee and the Finer Things Club. People like to become “the <X> guy” (or gal) for some X that makes them important to the group, e.g. “I’m the Friday Cupcake Guy”. It gives each person an additional dimension of social status at which he or she can be an undisputed local chieftain– possibly of territory no one wants, but still the owner of something.

What might this have to do with programming? Well, I’ve often asked (myself and others) what makes a great programmer, and the conclusion that I’ve come to is that it’s very hard, at an individual level, to tell. Speaking broadly, I can say that Clojure (or Lisp) programmers are better than Java programmers, who are in turn better than VB programmers. I know the patterns and the space, and that’s clearly true (in the aggregate). Better programmers like more challenging, but ultimately more powerful, languages. But there are excellent Java programmers and bad Lisp hackers. Also, if you bring a crack Clojure or Haskell developer into a typical enterprise Java environment where things are done in a wrong but common way, she’ll struggle, just because she’s not familiar with the design patterns. Moreover, a person’s reputation in a new job (and, in the long term, status and trajectory) depends heavily on his performance in the first few months, during which familiarity with the existing technology practices and tools (“tech stack”) has more of an impact than general ability. In the short term, it can be very hard to tell who the good and bad programmers are, because so much is project-specific. People are often judged and slotted into the typical software company’s pseudo-meritocracy before sufficient data can be collected about their actual abilities.

Assessing programmers is, to put it simply, a high-dimensional problem. There are more important and possibly powerful technologies out there (and plenty of duds, as well) than there is time to learn even a small fraction of them, and a lot of the know-how is specific to subdomains of “programming” in which one can have a long, fruitful, and deep career. Machine learning requires a dramatically different skill set from compiler design or web development; a top machine-learning engineer might have no idea, for example, how to build a webpage. People in business are judged shallowly (indeed, 95% of success in “business”, in the U.S., is figuring out how to come out on top of others’ shallow– but usually predictably so– judgements) and programming is rarely an exception, so when a person tries something new, there will be “embarrassing” gaps in his or her knowledge, no matter how capable that person is on his or her own territory. There might be 1000 dimensions that one could use to define what a good vs. bad programmer is, and no one excels at all of them.

Given the extreme dimensionality of assessing programmers, I also contend that self-assessment is very difficult. Good programmers don’t always know that they’re good (it can be frustrating and difficult even for the best) and many bad ones certainly think that they’re good. I don’t think that there are huge differences in self-confidence between men and women, individually speaking. Differences between groups are often smaller than those within groups, and I think that this applies to self-efficacy. However, I think the effect of dimensionality is that it can create a powerful feedback loop out of small personal biases over self-efficacy– and I do believe that men are slightly more inclined to overconfidence while women, in the aggregate, are slightly biased in the other direction. Dimensionality gives leverage to these tendencies. A person slightly inclined to view herself as competent will find reasons to believe she’s superior by selecting her strongest dimensions as important; one inclined the other way will emphasize her shortcomings. Dimensionality admits such a large number of ways to define the rules that a person’s view of him- or herself as a competent programmer is extremely flexible and can be volatile. It’s very easy to convince oneself that one is a master of the craft, or simply no good.

When starting in the software career, women (and men) have to deal with, for just one irritating example, socially undeveloped men who display obnoxious surprise when a newcomer doesn’t know some piece of programming trivia. (“You don’t know what xargs is?”) They also had to deal with those assholes when breaking into medicine and law, but there was a difference. The outstanding female doctors, for the most part, knew that they were competent and, often, better than the jerks hazing them. That was made obvious by their superior grades in medical school. In software, newcomers deal with environments where the dimensions of assessment often change, can sometimes (e.g. in design-pattern-happy Java shops) even be negatively correlated with actual ability, and are far outside of their control. Dimensionality is benevolent insofar as it gives people multiple avenues toward success and excellence, but it can also be used against a person: the dimensions in which the person is weak might be selected as the “important” ones.

A piece of ancient wisdom, sometimes attributed to Eleanor Roosevelt although it seems to pre-date her, is: great minds discuss ideas, middling minds discuss events, and small minds discuss people. This is especially true of programming, and it relates to dimensionality quite strongly.

Weak-minded programmers are actually the most politically dangerous; they don’t understand fuck-all, so they fall back on gossip about who’s “delivering” and who’s not, and hope to use others’ fear of them to extort their way into credibility, then engineer a lateral promotion into a soft role before their incompetence is discovered. As expected, the weak-minded ones discuss people.

The great programmers tend to be more interested in the ideas (concurrency, artificial intelligence, algorithms, mathematical reasoning, technology) than in the tools themselves, which is why they’re often well-versed in many languages and technologies. Through experience, they know that it’s impossible to deal with all of computer science while limited to one language or toolset. Anyway, Hadoop isn’t, on its own, that interesting; distributed programming is. It’s what you can do with these tools, and what it feels like to use them on real problems, that’s interesting. Great minds in programming are more attracted to the fundamentals of what they are doing and how they do it than the transient specifics.

However, most minds are middling; and most programmers are the middling kind. Here, discussion of “events” refers to industry trends and the more parochial trivia. Now, great programmers want to narrow in on the core ideas behind the technologies they use and generally aren’t interested in measuring themselves in relation to other people. They just want to understand more and get better. Bad programmers (who usually engineer a transition to an important and highly compensated, but not technically demanding, soft role) play politics in a way that is clearly separate from the competence of the people involved; because they are too limited to grapple with abstract ideas, they focus on people, which often serves them surprisingly well. In other words, the strongest minds believe in the competition of ideas, the eventual consistency thereof, but stay away in general from the messy, ugly process of evaluating people. They shy away from that game, knowing it’s too highly dimensional for anyone to do it adequately. The weak minds, on the other hand, don’t give fuck-all about meritocracy, or might be averse to it, since they aren’t long in merit. They charge in to the people-evaluating parts (“Bob’s commit of April 6, 2012 was in direct violation of the Style Guide”) without heeding the dimensionality, because getting these judgments right just isn’t important to them; participating in that discussion is just a means to power.

Middling programmers, however, understand meritocracy as a concept and are trying to figure out who’s worth listening to and who’s not (the “bozo bit”). They genuinely want the most competent people to rise, but they get hung up on superficial details. Oh, this guy used Java at his last job. He must be a moron. Or: he fucking wrote a git commit line over 65 characters. Has he never worked as a programmer before? They get tricked by the low-signal dimensions and spurious correlations, and conclude people to be completely inexperienced, or even outright morons, when their skill sets and stylistic choices don’t match their particular expectations. These middling minds are the ones who get tripped up by dimensionality. Let’s say, for the sake of argument, that a problem domain has exactly 4 relevant concepts. Then there might be 25 roughly equivalent, but superficially different, technologies or methods or resume buzzwords that have been credibly proposed, at some time, as solutions. Each class of mind ends up in a separate space with differing dimensionalities. Great minds apprehend the 4 core concepts that really matter and focus on the tradeoffs between those. That means there are 4 dimensions. Low minds (the ones that discuss and focus on people) have a natural affinity for political machinations and dominance-submission narratives, which are primordial and very low in dimensionality (probably 1 or 2 dimensions of status and well-liked-ness). The middling minds, however? Remember I said that there are 25 slightly different tools for which familiarity can be used as a credible assessor of competence, and so we end up with a 25-dimensional space! Of course, those middling minds are no more agile in 25 dimensions than anyone else– we just can’t visualize more than two or three at a given time– which is why they tend to narrow in on a few of them, resulting in tool zealotry as they zero in on their local high grounds as representing the important dimensions. (“<X> is the only web framework worth using; you’re a moron if you use anything else.”)

I’ve been abstract and theoretical for the most part, but I think I’m hitting on a real problem. The mediocre programmers– who are capable of banging out code, but not insightful enough to be great at it, and far from deserving any credibility in the evaluation of others– are often the most judgmental. These are the ones who cling to tightly-defined Right Ways of, for example, using version control or handling tabs in source code. One who deviates even slightly from their parochial expectations is instantly judged to be incompetent. Since those expectations are the result of emotionally-laden overfitting (“that project failed because Bob insisted on using underscores instead of camelCase!”) they are stances formed essentially from random processes– often pure noise. But as I said before, it’s easy with a high-dimensional problem to mistake sparse-data artifacts (noise) for signal.

In other words, if you go into programming as a career, you’ll probably encounter at least one person who thinks of you as an idiot (and makes the sentiment clear) for no reason other than the fact that specific dimensions of competence (out of thousands of candidate dimensions) that he’s pinned his identity on happen to be the ones in which you’re weak. It’s shitty, it’s “random”, but in the high-dimensional space of software it’s almost guaranteed to happen at least once– especially when you’re starting out and making a lot of genuine mistakes– for everyone. This isn’t gendered. Men and women both deal with this. Obnoxious people, especially early in one’s career, are just an occupational annoyance in software.

Where I think there is a gendered difference is in the willingness to accept that kind of social disapproval. Whether this is innate or a product of culture, I have no idea, and I don’t care to speculate on that. But the people who will tolerate frank social disapproval (e.g. being looked down upon for doing a garage startup instead of having a paying corporate job) for long periods of time seem to be men. I would argue that most people can’t deal with the false starts, confusing and extreme dimensionality, and improbability of fair evaluation in most companies that characterize the software economy. This is becoming even more true as volatile startup methodologies displace the (deeply flawed, but stable) larger corporate regime, and as tools like programming languages and databases diversify and proliferate– a great thing on the whole, but a contributor to dimensionality. People who can go through that wringer, keep their own sense of self-efficacy strong in spite of all the noise, and do all this over the several years that it actually takes to become a highly competent programmer, are very uncommon. Sociopathy is too strong a word for it (although I would argue that many such people fall under the less pejorative “MacLeod Sociopath” category) but it takes an anti-authoritarian “fuck all y’all” flair to not only keep going, but to gain drive and actually heighten performance, amid negative social signals. Like full-on sociopathy, that zero-fucks-itude exists in both genders, but seems to be more common in men. It’s a major part of why most inventors, but also most criminals, are men.

Women have, in general, equal intellectual talents to ours and I would argue that (on average) they have superior abilities in terms of empathy and judgment of character; but they don’t seem as able to tolerate long-term social disapproval. For the record, I don’t mean to imply that this is a virtue of men. Quixotry, also more often a male trait, is the dangerous and often self-destructive flip side of this. Many things bring social disapproval (e.g. producing methamphetamine) because they deserve condemnation. I’m only making an observation, and I may not be right, but I think that the essentially random social disapproval that programmers endure in their early years (and the fact that it is hard, in a massively high-dimensional space, to gain objective proof of performance that might support a person against that) is a major part of what pushes a large number of talented women to leave, or even to avoid getting involved in the first place. I also think that it is dimensionality, especially of the kind that emerges when middling programmers define assessment and parochial trivia become more important than fundamental understanding, that creates that cacophony of random snark and judgment.

Why Clojure will win

I’m going to make a bold proclamation. I’m not going to claim that Clojure will ever become the most popular language, but it will win in the next 15 years in a major way, because it is already one of the most interesting, and all signs show that it will continue to build momentum. This is independent of what happens with the Java ecosystem; Clojure will be ready to go off of the JVM if it needs to do so. It is not in a hurry to make that change, but if that becomes necessary, it will happen. Why am I so confident about Clojure? Partly, it’s the community. I started using Clojure in late 2008, and the language has improved by leaps and bounds, whereas many languages seem to have gone sideways over the past few years. That, however, isn’t the full story. There are a lot of good languages out there, and most remain in relative oblivion. What makes Clojure different? It’s not just one thing about it, because what I’ve realized about Clojure of late is that it’s not just a language. It’s a vision. Programming should be interactive, as beautiful as possible, modular, and it should generate assets that are easy to use and learn. The Clojure community gets that in a way that many language communities, within the enterprise, don’t.

On a fundamental level, this is different from the enterprise Java vision that has grown up over the past 18 years. The language itself (Java) isn’t so awful– it’s the C analogue of a very powerful garbage-collected virtual machine– but the culture that has grown up around it is conducive to gigantic programs, low interactivity, and manager-friendly conservatism. That doesn’t make it impossible to build good software– far from it– but it creates an environment that’s dissatisfying to the few people capable of writing good software. Clojure has gone the other way. Its build tools and frameworks (although not as fully featured as some in other languages) are simple, idiomatic to the language, and coherent with the vision. While the most “dangerous” feature of Clojure (macros) certainly does allow “cowboy coders” to run wild with ill-conceived DSLs, the truth is that one spends far less time contending with these parochial DSLs in a Clojure program than one does on a typical enterprise Java project; the latter tends to be full of ad-hoc DSLs built to overcome shortfalls in the language. Clojure, on the other hand, has little in the language that needs to be overcome; one can extend it rather than making deeper alterations. Features are added to the language orthogonally, through libraries.
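
To make the macro point concrete, here is a minimal sketch (a toy example of my own, not taken from any particular library) of how a control construct can be added to Clojure as ordinary library code rather than by altering the language itself:

    ;; A hypothetical `unless` form: evaluates its body only when the test is falsey.
    ;; Nothing here touches the compiler; the "feature" is just code in a namespace.
    (defmacro unless [test & body]
      `(if ~test nil (do ~@body)))

    (unless (empty? ["item"])
      (println "processing items"))   ; prints, since the collection is non-empty

Used well, this is exactly the orthogonal, library-level extension described above; used badly, it is how the ill-conceived DSLs mentioned earlier get started.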

The reason why I say Clojure is a vision more than a language is that it strives to go into messy places that ivory-tower language inventors would avoid, while refusing to compromise on its core vision. It’s visually evident, when looking at a Clojure program, what aspects of the language are idiomatic and which are (usually temporary) tolerated compromises (e.g. the difference between using count or .length on a String). The language strives to remain attractive, but its purpose is to be a fully-fledged commercial language, and it will favor practicality when it’s prudent to do so, and rejoin the prevailing aesthetic over time. Clojure is, in particular, designed to go beyond its most typical platform (Java). The reference implementation runs on the (typically server-side) JVM, but ClojureScript runs in the browser, and both dialects get core support. Also in accord with this expanding vision, Pedestal is one of many web frameworks that have been built for the language, and Incanter is bringing it into statistical computing. Then there’s Datomic, which applies Rich Hickey’s refined aesthetic and engineering senses to the database world. In Clojure, there’s a movement to restore a certain aesthetic integrity, modularity, and empowerment of the individual programmer that has, for a couple of dark decades, been lost in the enterprise.
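
As a small illustration of that idiomatic-versus-interop distinction, both of the following work at a Clojure REPL; the first is the idiomatic form, the second a direct (and tolerated) call into the host platform:

    (count "Clojure")     ;=> 7  (idiomatic; works on strings and any Clojure collection)
    (.length "Clojure")   ;=> 7  (Java interop; tied to java.lang.String)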

Does Clojure have weaknesses? Of course. In 2008, the warts were evident, because the language was so new (for one thing, JVM interop was very painful). They’re less so now; one has to do a bit of work to find the issues. Yet it has very few fundamental weaknesses. The core language itself is very sound, and it’s only getting better as the vision is refined.

One argument made against it is its lack of a static type system. How damning is this? To be honest, I like static typing a lot, and used to be a die-hard defender of it. My view has softened a bit. My experiences with Scala– well-designed to its purpose, but solving a very difficult (and possibly impossible) problem, which is to unify the Haskell and Java worldviews– have convinced me that compile-time typing, strictly speaking, isn’t needed for most purposes. Interface integrity (type guarantees, contracts) is important, and Haskell and Clojure come loaded with different tool sets to guarantee such things. In the hard-line statically typed world, there’s a type system you get, implicitly, out of the box. It comes with the language, and can be extended in some powerful ways, but it has limits. There are some desired type-system features that require hacking the compiler or making fundamental “wizards only” alterations to the type system that, if done wrongly, can be dangerous. Type systems limit the kinds of programs one can write (and that’s a feature, because reasoning about arbitrary code is literally impossible). They provide automated reasoning about code that is better than the hand-rolled test suites found in 99% of web applications, for sure, but they also (perhaps surprisingly) limit some of the things one can do to reason about code. (If you make a type system too powerful, type-checking becomes undecidable and you’re back into the wild.) Clojure, on the other hand, certainly makes it possible to write sloppy code; however, it also makes it possible to prevent sloppy code in ways that are very difficult in its static counterparts. An enterprise Scala project, for example, is typically going to present the frustrations that come with components that live outside the language. Learning Scala isn’t that bad, but Maven and Spring and Hibernate? Ouch. In Clojure, most of the typically “external” needs are provided with idiomatic libraries that can be reasoned about within the language.
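
As one concrete sketch of what “preventing sloppy code” can look like inside the language, Clojure functions accept :pre and :post conditions as runtime contracts. The function below is purely illustrative:

    ;; Runtime contracts on an ordinary function; violations throw an AssertionError.
    (defn withdraw
      "Debits amount from balance; both the inputs and the result are checked."
      [balance amount]
      {:pre  [(number? balance) (pos? amount) (<= amount balance)]
       :post [(not (neg? %))]}
      (- balance amount))

    (withdraw 100 30)    ;=> 70
    ;; (withdraw 100 200) fails the :pre check instead of silently going negative.

This isn’t compile-time checking, but it is interface integrity expressed in the language itself rather than bolted on from outside.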

This underscores what I think is the biggest advantage to the dynamically typed world. Static typing is great when one is solving a known problem, requirements don’t change, and quality and correctness are extremely important. While dynamic typing can theoretically be used to achieve the same level of correctness, I don’t think it’s always economical; it requires more error-handling code, and there will be performance costs that, in the static world, are paid only once per executable (at compilation). However, I think dynamic typing wins when the requirements are unknown or changing, rapid prototyping is essential, and (most relevantly) integration of new technologies, not always known from the outset, is mandatory. Interactivity and exploration become a core part of the development process at that point, and while every static language worth its salt has an interactive mode (“REPL”) it doesn’t have the same first-class-citizen feel. For example, Clojure’s REPL (which performs compilation of each statement) provides identical performance to what will be achieved in production, making it possible to time functions interactively.
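
A minimal sketch of that last point about interactive timing (the function is just a stand-in; the exact figures depend on the machine):

    ;; Each form entered at the REPL is compiled to JVM bytecode before it runs,
    ;; so timings taken interactively are representative of compiled performance.
    (defn sum-of-squares [n]
      (reduce + (map (fn [x] (* x x)) (range n))))

    (time (sum-of-squares 1000000))
    ;; prints "Elapsed time: ... msecs" and then returns the result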

On the static front, there’s nothing that stops the Clojure community and vision from confronting the typical commercial need set that is fulfilled by static typing, and it will. For research value, it will probably be steps behind Haskell on that front forever– Haskell’s a leader in that, and Clojure’s a leader in other things– and such features will always be optional, but I think it will meet most of the needs of typical programming. The shortcomings associated with dynamic languages are, in any case, more often cultural than intrinsic to the language, and Clojure’s culture is of extremely high quality, so I think there will always be enough awareness of those risks, and that the Clojure world will continue to produce technical assets of very high quality.

What makes Clojure– and I’m talking about the vision and community even more than the language– unique, in my mind, is that it takes a hard-line approach to both aesthetics and practicality, perhaps in recognition of the fact that those two need sets were never in opposition at all. The era in which one could “learn programming” in 6 months at age 22 and have lifelong job security is over. People constantly need to learn new things, just to survive, and that’s only feasible when newly-generated technical assets are coherent, attractive, and interactive. That, I think, is the biggest advantage of the Clojure approach to technology growth. Unlike the typical enterprise Java stack, it’s optimized for lifelong learners. That, above all, is why it will win.

The shodan programmer

The belt-color meritocracy

“Nothing under the sun is greater than education. By educating one person and sending him into the society of his generation, we make a contribution extending a hundred generations to come.” — Dr. Kano Jigoro, founder of judo.

Colored belts, in martial arts, are a relatively modern tradition, having begun in the late 19th century. It started informally, with the practice in which the teacher (sensei) would wear a black sash in contrast against the white uniform (gi) in order to identify himself. This was later formalized by Dr. Kano with a series of ranks, and the black sash (worn in addition to the white belt that held the gi together) was replaced by a black belt. Beginners were assigned descending kyu ranks (traditionally, 6th to 1st) while advanced ranks were dan (from 1st up to 10th). At a dan rank, you earned the right to wear a black belt that would identify you, anywhere in the world, as a qualified teacher of the art. Contrary to popular opinion, a black belt doesn’t mean that you’ve fully mastered the sport. Shodan is taken, roughly, to mean “beginning master”. It means that, after years of work and training, you’ve arrived. There’s still a lot left to learn.

Over time, intermediate belt colors between white and black were introduced. Brown belts began to signify nearness to black-belt level mastery, and green belts signified strong progress. Later, an upper-division white belt began to be recognized with a yellow belt, while upper-division green belts were recognized with blue or purple. While it’s far from standard, there seems to be a general understanding of belt colors, approximately, as follows:

  • White: beginner.
  • Yellow: beginner, upper division.
  • Green: intermediate.
  • Purple: intermediate, upper division.
  • Brown: advanced. Qualified to be senpai, roughly translated as “highly senior student”.
  • Black: expert. Qualified to be sensei, or teacher.

Are these colored belts, and ranks, good for martial arts? There’s a lot of debate about them. Please note that martial arts are truly considered to be arts, in which knowledge and perfection of practice (rather than mere superiority of force) are core values. An 8th dan in judo doesn’t mean you’re the most vicious fighter out there (since you’re usually in your 60s when you get it; you are, while still formidable, probably not winning Olympic competitions) because that’s not the point. These belts qualify you as a teacher, not only as a fighter. At that level, knowledge, dedication and service to the community are the guidelines of promotion.

Now, back to our regularly scheduled programming (pun intended)

Would colored belts (perhaps as a pure abstraction) make sense for programming? The idea seems nutty. How could we possibly define a rank system for ourselves as software engineers? I don’t know. I consider myself a 1.8-ish ikkyu (1-kyu; brown belt) at my current level of programmer development. At a typical pace, it takes 4-6 years to go from 1.8 to 2.0 (shodan); I’d like to do it in the next two or three. But we’ll see. Is there a scalable and universally applicable metric for programmer expertise assessment? I don’t know. 

To recap the 0.0-to-3.0 scale that I developed for assessing programmers, let me state the most important points:

  • Level 1 represents additive contributions that produce some front-line business value, while level-2 contributions are multiplicative and infrastructural. Level-3 contributions are global multipliers, or multiplicative over multipliers. Lisps, for example, are languages designed to gift the “mere” programmer with full access to multiplicative power. The Lispier languages are radically powerful, to the point that corporate managers dread them. Level-2 programmers love Lisps and languages like Haskell, however; and level-3 programmers create them.
  • X.0 represents 95% competence (the corporate standard for “manager doesn’t need to worry”) at level X. In other words, a 1.0 programmer will be able to complete 95% of additive tasks laid before him. The going assumption is that reliability is a logistic “S-curve” where a person is 5% competent on tasks 1.0 levels higher, 50% at 0.5 above, and 95% at-level (see the sketch just after this list). So a 1.8 engineer like me is going to be about 85% competent at Level-2 work, meaning that I’d probably do a good job overall but you’d want light supervision (design review, stability analysis) if you were betting a company on my work.
  • 1.0 is the threshold for typical corporate employability, and 2.0 is what we call a “10x programmer”; but the truth is that the actual difference in value creation is highly variable: 20x to 100x on green-field development, 3x to 5x in an accommodating corporate environment such as Google, and almost no gain in a less accommodating one.
  • About 62% of self-described professional software engineers are above 1.0. Only about 1 percent exceed 2.0, which typically requires 10-20 years of high-quality experience. The median is only 1.1, and 1.4 is the 85th percentile.
  • At least in part, the limiting factor that keeps most software engineers mediocre is the extreme rarity of high-quality work experience. Engineers between 1.5 and 1.9 are manager-equivalent in terms of their potential for impact, and 2.0+ are executive-equivalent (they can make or break a company). Unfortunately, our tendency toward multiplicative contribution leads us into direct conflict with “real” managers, who consider multiplicative effects their “turf”.
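
Here is a minimal sketch of the logistic reliability curve referenced in the list above. The parameterization is my own back-fit to the three stated anchor points (95% at-level, 50% for a task 0.5 levels above, 5% for a task 1.0 levels above), not a formula taken from the original write-up of the scale:

    ;; Probability that an engineer at level `person` completes a task at level `task`.
    ;; The curve's midpoint sits half a level above the engineer; the steepness is
    ;; chosen so that the three anchor points quoted above come out exactly.
    (defn completion-probability [person task]
      (let [delta (- person task)
            k     (/ (Math/log 19) 0.5)]
        (/ 1.0 (+ 1.0 (Math/exp (- (* k (+ delta 0.5))))))))

    (completion-probability 1.8 2.0)   ;=> ~0.85, the “85% competent at Level-2 work” figure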

Programming– like a martial art or the board game Go, both being uncommonly introspective on the measurement of skill and progress– is a field in which there’s a vast spectrum of skill. 2.0 is a clear candidate for shodan (1st dan). What does shodan mean? It means you’re excellent, and a beginner. You’re a beginner at being excellent. You’re now also, typically, a teacher, but that doesn’t mean you stop learning. In fact, while you can’t formally admit to this too often (lest they get cocky) you often learn as much from your students as they do from you. Multiplicative (level 2) programming contributions are fundamentally about teaching. When you build a Lisp macro or DSL that teaches people how to think properly about (and therefore solve) a problem, you are a teacher. If you don’t see it this way, you just don’t get the point of programming. It’s about instructing computers while teaching humans how the systems work.

In fact, I think there is a rough correlation between the 0.0 to 3.0 programmer competence scale and appropriate dan/kyu ranks, like so:

  • 0.0 to 0.4: 8th kyu. Just getting started. Still needs help over minor compilation errors. Can’t do much without supervision.
  • 0.5 to 0.7: 7th kyu. Understands the fundamental ideas behind programming, but still takes a lot of time to implement them.
  • 0.8 to 0.9: 6th kyu. Reaching “professional-grade” competence but only viable in very junior roles with supervision. Typical for an average CS graduate.
  • 1.0 to 1.1: 5th kyu. Genuine “white belt”. Starting to understand engineering rather than programming alone. Knows about production stability, maintenance, and code quality concerns. Can write 500+ line programs without supervision.
  • 1.2 to 1.3: 4th kyu. Solidly good at additive programming tasks, and can learn whatever is needed to do most jobs, but not yet showing leadership or design sense. Capable but rarely efficient without superior leadership.
  • 1.4 to 1.5: 3rd kyu. Developing a mature understanding of computer science, aesthetics, programming and engineering concerns, and the trade-offs involved in each. May or may not have come into functional programming (whose superiority depends on the domain; it is not, in high-performance domains, yet practical) but has a nuanced opinion on when it is appropriate and when not.
  • 1.6 to 1.7: 2nd kyu. Shows consistent technical leadership. Given light supervision and permission to fail, can make multiplier-level contributions of high quality. An asset to pretty much any engineering organization, except for those that inhibit excellence (e.g. corporate rank cultures that enforce subordinacy and disempower engineers by design).
  • 1.8 to 1.9: 1st kyu. Eminently capable. Spends most of his time on multiplier-type contributions and performs them well. Can be given a role equivalent to VP/Engineering in impact and will do it well.
  • 2.0 to 2.1: 1st dan. She is consistently building high-quality assets and teaching others how to use them. These are transformative software engineers who don’t only make other engineers more productive (simple multiplierism) but actually make them better. Hire one, give her autonomy, and she will “10x” your whole company. Can be given a CTO-equivalent role.
  • 2.2 to 2.3+: Higher dan ranks. Having not attained them, I can’t accurately describe them. I would estimate Rich Hickey as being at least a 2.6 for Clojure: he built one of the best language communities out there, created a beautiful language on top of an ugly but important and powerful ecosystem (Java), and the code quality of the product is shockingly high. (If you look into the guts of Clojure, you will forget to hate Java. That’s how good the code is!) However, I’m too far away from these levels (as of now) to have a clear vision of how to define them or what they look like.

Is formal recognition of programmer achievement through ranks and colored belts necessary? Is it a good idea? Should we build up the infrastructure that can genuinely assess whether someone’s a “green belt engineer”, and direct that person toward purple, brown, and black? I used to think that this was a bad idea. Why? Well, to be blunt about it, I fucking hate the shit out of resume culture, and the reason I fucking hate it is that it’s an attempt to collate job titles, prestige of institutions, recommendations from credible people, and dates of employment into a distributed workplace social status that simply has no fucking right to exist. Personally, I don’t lie on my resume. While I have the career of a 26-year-old at almost 30 (thanks to panic disorder, bad startup choices, and a downright evil manager when I was at Google) I feel like I still have more to lose by lying than to gain. So I don’t. But I have no moral qualms about subverting that system and I encourage other people, in dire circumstances, to engage in “creative career repair” without hesitance. Now, job fraud (feigning a competency one does not have) is unacceptable, unethical, and generally considered to be illegal (it is fraud). That’s different, and it’s not what I’m talking about. Social status inflation, such as “playing with dates” to conceal unemployment, or improving a title, or even having a peer pose as manager during a reference check? Fair game, bitches. I basically consider the prestige-title-references-and-dates attempt to create a distributed workplace social status to be morally wrong, extortionate (insofar as it lets the manager continue to fuck up a subordinate’s life even after they separate) and just plain fucking evil. Subverting it, diluting its credibility, and outright counterfeiting it in the effort to destroy it: all of these are, for lack of a better word, fucking awesome.

So I am very cynical about anything that might be used to create a distributed social status, because the idea just disgusts me on a visceral level. Ranking programmers (which is inherently subjective, no matter how good we are at the assessment) seems wrong to me. I have a natural aversion to the concept. I also just don’t want to do the work. I’d rather learn to program at a 2.0+ level, and then go off and do it, than spend years trying to figure out how to assess individuals in a scalable and fair way. Yeah, there might be a machine learning problem in there that I could enjoy; but ultimately, the hero who solves that problem is going to be focused mostly on people stuff. Yet, I am starting to think that there is no alternative but to create an organization-independent ranking system for software engineers. Why? If we don’t rank ourselves in a smart way, then business assholes will step in and rank us anyway, and they’ll do a far shittier job of it. We know this to be true. We can’t deny it. We see it in corporate jobs on a daily basis.

A typical businessman can’t tell the difference between a 2.0 engineer and a 1.2 who’s great at selling his ideas. We tend to be angry at managers over this fact, and over the matter of what is supposed to be a meritocracy (the software industry) being one of the most politicized professional environments on earth; but when we denigrate them for their inability to understand what we do, we’re the ones being assholes. They police and measure us because we can’t police and measure ourselves.

So this may be a problem that we just need to solve. How does one get a black belt in programming? Most professional accreditations are based on churning out commodity professionals. We can’t take that approach, because under the best conditions it takes a decade to become a black belt/2.0+, and some people don’t even have the talent. This is a very hard problem, and I’m going to punt on it for now.

Brawlers and Expert Experts

Let’s peer, for a little while, into why Corporate Programming sucks so much. As far as I’m concerned, there are two categories of degeneracy that merit special attention: Brawlers and Expert Experts.

First I will focus on the Brawlers (also known as “rock stars” or “ninjas”). They write hideous code, and they brag about their long hours and their ability to program fast. There’s no art in what they do. They have only a superficial comprehension of the craft. They can’t be bothered to teach others what they are doing, and don’t have enough insight to be passable at it anyway. What they bring is a superhuman dedication to showing off, slogging through painful tasks, and kludging their way to something that works just enough to support a demo. They have no patience for the martial art of programming, and fight using brute strength.

Brawlers tend, in fact, to be a cut above the typical “5:01” corporate programmers. Combine that with their evident will to be alpha males and you get something that looks like a great programmer to the stereotypical business douche. Brawlers tend to burn themselves out by 30, they’re almost always men, and they share the “deadlines is deadlines” mentality of over-eager 22-year-old investment banking “analysts”. There is no art in what they do, and what they build is brittle, but they can do it fast and they’re impressive to people who don’t understand programming.

Let’s think of corporate competition as a fight that lasts for five seconds, because power destroys a person’s attention span and most executives are like toddlers in that regard. In a three-minute fight, the judoka would defeat the brawler; but, in a 5-second fight, the brawler just looks more impressive. He’s going all out, posturing and spitting and throwing feint punches while the judoka seems passive and conservative with his energy (because he is conserving it, until the brawler makes a mistake, which won’t take long). A good brawler can demolish an untrained fighter in 5 seconds, but the judoka will hold his own for much longer, and the brawler will tire out.

With the beanbag coming in after 5 seconds, no one really lands a blow, as the judoka has avoided getting hit but the brawler hasn’t given enough of an opening for the judoka to execute a throw. Without a conclusive win or loss, victory is assessed by the people in chairs. However, the judges (businessmen, not programmers) don’t have a clue what the fuck they just watched, so they award the match to the brawler who “threw some really good punches” even though he failed to connect and would have been thrown to the ground had the fight lasted 5 seconds more.

Where are Brawlers on the engineer competence scale? It’s hard to say. In terms of exposure and knowledge they can be higher, but they tend to put so much of their energy and time into fights for dominance that the quality of their work is quite low: 1.0 at best. In terms of impressions, though, they seem to be “smart and gets things done” to their superiors. Managers tend to like Brawlers because of their brute-force dedication and unyielding willingness to shift blame, take credit, and kiss ass. Ultimately, the Brawler is the one who no longer wishes to be a programmer and wants to become more like an old-style “do as I say” manager who uses intimidation and extortion to get what he wants.

Brawlers are a real problem in VC-istan. If you don’t have a genuine 1.5+ engineer running your technical organization, they will often end up with all the power. The good news about these bastards (Brawlers) is that they burn themselves out. Unless they can rapidly cross the Effort Thermocline (the point at which jobs become easier and less accountable with increasing rank) by age 30, they lose the ability to put a coherent sentence together, and they just aren’t as good at fighting as they were in their prime.

The second category of toxicity is more long-lived. These are the people called Expert Beginners by Erik Dietrich, but I prefer to call them Expert Experts (“beginner” has too many positive and virtuous connotations, if one either takes a Zen approach, or notes that shodan means “beginner”). No, they’re not actual experts on anything aside from the social role of being an Expert. That’s part of the problem. Mediocrity wants to be something– an expert, a manager, a credible person. Excellence wants to do things– to create, to build, and to improve running operations.

The colored-belt metaphor doesn’t apply well to Brawlers, because even a 1.1 white belt could defeat a Brawler (in terms of doing superior work) were it not for the incompetence of the judges (non-technical businessmen) and the short duration of the fight. That’s more an issue of attitude than capability; I’ve met some VC-istani Brawlers who would be capable of programming at a 1.4 level if they had the patience and actually cared about the quality of their work. It’s unclear what belt color applies; what is clearer is that they take their belts off because they don’t care.

Expert Experts, however, have a distinct level of competence that they reach, and rarely surpass, and it’s right around the 1.2 level: good enough to retain employment in software, not yet good enough to jeopardize it. They’re career yellow belts at 1.2-1.3. See, the 1.4-1.5 green belts have started exposing themselves to hard-to-master concepts like functional programming, concurrency and parallelism, code maintainability, and machine learning. These are hard; you can be 2.0+ and you’ll still have to do a lot of work to get any good at them. So, the green belts and higher tend to know how little they know. White belts similarly know that they’re beginners, but corporate programming tends to create an environment where yellow belts can perceive themselves to be masters of the craft.

Of course, there’s nothing wrong with being a yellow belt. I was a novice, then a white belt, then yellow and then green, at some point. (I hadn’t invented this metaphor yet, but you know what I mean.) The problem is when people get that yellow belt and assume they’re done. They start calling themselves experts early on and stop learning or questioning themselves; so after a 20-year career, they have 18 years of experience in Being Experts! Worse yet, career yellow belts are so resistant to change that they never get new yellow belts, and time is not flattering to bright colors, so their belts tend to get a bit worn and dirty. Soot accumulates and they mistake it (as their non-technical superiors do, too) for a merit badge. “See! It’s dark-gray in spots! This must be what people mean when they talk about black belts!”

There’s a certain environment that fosters Expert Experts. People tend toward polarization of opinion surrounding IDEs, but the truth is that they’re just tools. IDEs don’t kill code; people kill code. The evil is Corporate Programming. It’s not Java or .NET, but what I once called “Java Shop Politics”, and if I were to write that essay now, I’d call it something else, since the evil is large, monolithic software and not a specific programming language. Effectively, it’s what happens when managers get together and decide that (a) programmers can’t be trusted with multiplicative work, so the goal becomes to build a corporate environment tailored toward mediocre adders (1.0 to 1.3) but with no use for superior skill, and (b) because there’s no use for 1.4+ (green-belt and higher) levels of competence, it is useless to train people up to it; in fact, those who show it risk rejection because they are foreign. (Corporate environments don’t intentionally reject 1.4+ engineers, of course, but those tend to be the first targets of Brawlers.) It becomes a world in which software projects are large and staffed by gigantic teams of mediocre developers taking direct orders with low autonomy. It generates sloppy spaghetti code that would be unaffordable in its time cost were it not for the fact that no one is expected, by that point, to get anything done anyway.

Ultimately, someone still has to make architectural decisions, and that’s where the Expert Experts come in. The typical corporate environment is so stifling that 1.4+ engineers leave before they can accumulate the credibility and seniority that would enable them to make decisions. This leaves the Expert Experts to reign over the white-belted novices. “See this yellow belt? This means that I am the architect! I’ve got brown-gray ketchup stains on this thing that are older than you!”

Connecting the Dots

It goes without saying that there are very few shodan-level programmers. I’d be surprised if there are more than 15,000 of them in the United States. Why? What makes advancement to that level so rare? Don’t get me wrong: it takes a lot of talent, but it doesn’t take so much talent as to exclude 99.995% of the population. Partly, it’s the scarcity of high-quality work. In our War on Stupid against the mediocrity of corporate programming, we often find that Stupid has taken a lot of territory. When Stupid wins, multiplicative engineering contributions become impossible, which means that everyone is siloized into get-it-done commodity work explicitly blessed by management, and everything else gets thrown out.

Brawlers, in their own toxic way, rebel against this mediocrity, because they recognize it as a losing arrangement they don’t want; if they continue as average programmers in such an environment, they’ll have mediocre compensation and social status. They want to be alpha males. (They’re almost always men.) Unfortunately, they combat it by taking an approach that involves externalized costs that are catastrophic in the long term. Yes, they work 90 hours per week and generate lots of code, and they quickly convince their bosses that they’re “indispensable”. Superficially, they seem to be outperforming their rivals– even the 1.4+ engineers who are taking their time to do things right.

Unfortunately, Brawlers tend to be the best programmers when it comes to corporate competition, even though their work is shitty. They’re usually promoted away before the externalized costs induced by their own sloppy practices can catch up with them. Over time, they get more and more architectural and multiplier-level responsibilities (at which they fail) and, at some point, they make the leap into real management, about which they complain-brag (“I don’t get to write any code anymore; I’m always in meetings with investors!”) while they secretly prefer it that way. The nice thing, for these sociopaths, about technology’s opacity in quality is that it puts the Effort Thermocline quite low in the people-management tier.

Managers in a large company, however, end up dealing with the legacy of the Brawlers and, even though blame has been shifted away from those who deserve it, they get a sense that engineers have “too much freedom”. It’s not sloppy practices that damaged the infrastructure; it’s engineer freedom in the abstract that did it. Alien technologies (often superior to corporate best practices) often get smeared, and so do branch offices. “The Boston office just had to go and fucking use Clojure. Does that even have IDE support?”

This is where Expert Experts come in. Unlike Brawlers, they aren’t inherently contemptible people– most Expert Experts are good people weakened by corporate mediocrity– but they’re expert at being mediocre. They’ve been yellow belts for decades and just know that green-belt levels of achievement aren’t possible. They’re professional naysayers. They’re actually pretty effective at defusing Brawlers, and that’s the scary bit. Their principled mediocrity and obstructionism (“I am the expert here, and I say it can’t be done”) actually serves a purpose!

Both Brawlers and Expert Experts are an attempt at managerial arrogation over a field (computer programming) that is utterly opaque to non-technical managers. Brawlers are the tough-culture variety who attempt to establish themselves as top performers by externalizing costs to the future and “the maintenance team” (which they intend never to be on). Expert Experts are their rank-culture counterparts who dress their mediocrity and lack of curiosity up as principled risk-aversion. So, we now understand how they differ and what their connection is.

Solve It!

I did not intend to do so when I started this essay, in which I only wanted to focus on programming, but I’ve actually come upon (at least) a better name for the solution to the MacLeod Organizational Problem: shodan culture. It involves the best of the guild and self-executive cultures. Soon, I’ll get to exactly what that means, and how it should work.

Blub vs. engineer empowerment

No, I’m not quitting the Gervais / MacLeod Series. Part 23, which will actually be the final one because I want to get back to technology in how I spend spare time, is half-done. However, I am going to take a break in it to write about something else. 

I’ve written about my distaste for language and framework wars, at least when held for their own sake. I’m not fading from my position on that. If you go off and tell someone that her favorite language is a U+1F4A9 because it’s (statically|dynamically) typed, then you’re just being a jerk. There are a few terrible languages out there (especially most corporate internal DSLs) but C, Python, Scala, Lisp and Haskell were all designed by very smart people and they all have their places. I’ve seen enough to know that. There isn’t one language to rule them all. Trust me.

Yet, I contend that there is a problem of Blub in our industry. What’s Blub? Well, it’s often used as an epithet for an inferior language, coined in this essay by Paul Graham. As tiring as language wars are, Blubness is real. I contend, however, that it’s not only about the language. There’s much more to Blub.

Let’s start with the original essay and use Graham’s description of Blub:

Programmers get very attached to their favorite languages, and I don’t want to hurt anyone’s feelings, so to explain this point I’m going to use a hypothetical language called Blub. Blub falls right in the middle of the abstractness continuum. It is not the most powerful language, but it is more powerful than Cobol or machine language.

And in fact, our hypothetical Blub programmer wouldn’t use either of them. Of course he wouldn’t program in machine language. That’s what compilers are for. And as for Cobol, he doesn’t know how anyone can get anything done with it. It doesn’t even have x (Blub feature of your choice).

As long as our hypothetical Blub programmer is looking down the power continuum, he knows he’s looking down. Languages less powerful than Blub are obviously less powerful, because they’re missing some feature he’s used to. But when our hypothetical Blub programmer looks in the other direction, up the power continuum, he doesn’t realize he’s looking up. What he sees are merely weird languages. He probably considers them about equivalent in power to Blub, but with all this other hairy stuff thrown in as well. Blub is good enough for him, because he thinks in Blub.

When we switch to the point of view of a programmer using any of the languages higher up the power continuum, however, we find that he in turn looks down upon Blub. How can you get anything done in Blub? It doesn’t even have y.

By induction, the only programmers in a position to see all the differences in power between the various languages are those who understand the most powerful one. (This is probably what Eric Raymond meant about Lisp making you a better programmer.) You can’t trust the opinions of the others, because of the Blub paradox: they’re satisfied with whatever language they happen to use, because it dictates the way they think about programs.

So what is Blub? Well, some might read that description and say that it sounds like Java (has garbage collection, but not lambdas). So is Java Blub? Well, not quite. Sometimes (although rarely) Java is the right language to use. As a general-purpose language, Java is a terrible choice; but for high-performance Android development, Java’s the best. It is not James Gosling’s fault that it became the go-to language for clueless corporate managers and a tool-of-choice for mediocre “commodity developers”. That fact may or may not be related to weaknesses of the language, but it doesn’t make the language itself inferior.

Paul Graham looks at languages from a language-designer’s viewpoint, and also with an emphasis on aesthetics. Given that he is an amateur painter whose original passion was art, that shouldn’t surprise us. And in my opinion, Lisp is the closest thing out there to an aesthetically beautiful language. (You get used to the parentheses. Trust me. You start to like them because they are invisible when you don’t want to see them, but highlight structure when you do.) Does this mean that it’s right for everything? Of course not. If nothing else, there are cases when you don’t want to be working in a garbage-collected language, or when performance requirements make C the only game in town. Paul Graham seems to be focused on level of abstraction, and equating the middle territory (Java and C# would take that ground, today) with mediocrity. Is that a fair view?

Well, the low and high ends of the language-power spectrum tend to harbor a lot of great programmers, while the mediocre developers tend to be Java (or C#, or VB) monoglots. Good engineers are not afraid to go close to the metal, or far away from it into design-your-own-language land, if the problem calls for it. They’re comfortable in the whole space, so you’re more likely to find great people at the fringes. Those guys who write low-latency trading algorithms that run on GPUs have no time to hear about “POJOs“, and the gals who blow your mind with elegant Lisp macros have no taste for SingletonVisitorFactories. That said, great programmers will also operate at middling levels of abstraction when that is the right thing to do.

The problem of Blubness isn’t about a single language or level of abstraction. Sometimes the C++/Java level of abstraction is the right one to work at. So there certainly are good programmers using those languages. Quite a large number of them, in fact. I worked at Google, so I met plenty of good programmers using these generally unloved languages.

IDEs are another hot topic in the 10xers-versus-commodity-engineers flamewar. I have mixed feelings about them. When I see a 22-year-old settling in to his first corporate job and having to use the mouse, that “how the other half programs” instinct flares up and I feel compelled to tell him that, yes, you can still write code using emacs and the command line. My honest appraisal of IDEs? They’re a useful tool, sometimes. With the right configuration, they can be pretty neat. My issue with them is that they tend to be symptomatic. IDEs really shine when you have to read large amounts of other peoples’ poorly-written code. Now, I would rather have an IDE than not have one (trust me; I’ve gone both ways on that) but I would really prefer a job that didn’t involve trudging through bad legacy code on a daily basis. When someone tells me that “you have to use an IDE around here” I take it as a bad sign, because it means the code quality is devastatingly bad, and the IDE’s benefit will be to reduce Bad Code’s consumption of my time from 98% to 90%– still unacceptable.

What do IDEs have to do with Blub? Well, IDEs seem to be used often to support Blubby development practices. They make XML and Maven slightly less hideous, and code navigation (a valuable feature, no disagreement) can compensate, for a little while, for bad management practices that result in low code quality. I don’t think that IDEs are inherently bad, but I’ve seen them take the most hold in environments of damaged legacy code and low engineer empowerment.

I’ve thought a lot about language design and languages. I’ve used several. I’ve been in a number of corporate environments. I’ve seen good languages turn bad and bad languages become almost tolerable. I’ve seen the whole spectrum of code quality. I’ve concluded that it’s not generally useful to yell at people about their choices of languages. You won’t change, nor will they, and I’d rather work with good code in less-favored languages than bad code in any language. Let’s focus on what’s really at stake. Blub is not a specific language, but it is a common enemy: engineer disempowerment.

As technologists, we’re inclined toward hyperrationality, so we often ignore people problems and dress them up as technical ones. Instead of admitting that our company hired a bunch of terrible programmers who refuse to improve, we blame Java, as if the language itself (rather than years of terrible management, shitty projects, and nonexistent mentorship) somehow jammed their brains. Well, that doesn’t make sense because not every Java programmer is brain damaged. When something goes to shit in production, people jump to the conclusion that it wouldn’t have happened in a statically-typed language. Sorry, but that’s not true. Things break in horrible ways in all kinds of languages. Or, alternatively, when development is so slow that every top-25% engineer quits, people argue that it wouldn’t have happened in a fast-prototyping, dynamically-typed language. Wrong again. Bad management is the problem, not Scala or Python or even Java.

Even terrible code isn’t deserving of the anger that’s directed at it. Hell, I’ve written terrible code, especially early in my career. Who hasn’t? That anger should be directed against the manager who is making the engineer use shitty code (because the person who wrote it is the manager’s favorite) and not at the code itself. Terrible romance novels are written every day, but they don’t anger me because I never read them. But if I were forced to read Danielle Steele novels for 8 hours per day, I would fucking explode.

Ok, that’s enough negativity for a while…

I had a bit of a crisis recently. I enjoy computer science and I love solving hard problems. I enjoy programming. That said, the software industry has been wearing me down, this past couple of years. The bad code, low autonomy, and lack of respect for what we do is appalling. We have the potential to add millions of dollars per year in economic value, but we tend to get stuck with fourth quadrant work that we lack the power to refuse. I’ve seen enough of startups to know that most of them aren’t any better. The majority of those so-called “tech startups” are marketing experiments that happen to involve technology because, in the 21st century, everything does. I recently got to a point where I was considering leaving software for good. Computer science is fine and I have no problem with coding, but the corporate shit (again, just as bad in many startups) fries the brain and weakens the soul.

For some positivity, I went to the New York Clojure Meetup last night. I’ve been to a lot of technology Meetups, but there was a distinct feel at that one. The energy was more positive than what I’ve seen at most technical gatherings. The crowd was very strong, but that’s true of many technical meetups; here, there was a flavor of “cleaner burning” on top of the high intelligence that is always present at such events. People weren’t touting one corporate technology at the expense of another, and there was real code– good code, in fact– in a couple of the presentations. The quality of discussion was high, in addition to the quality of the people.

I’d had this observation before about certain language communities: the differences between the communities are much greater than the differences between the languages. People who intend to be lifelong programmers aren’t happy having New Java Despondency Infarction Framework X thrown at them every two years by some process-touting manager. They want more. They want a language that actually improves understanding of deep principles pertaining to how humans solve problems. It’s not that functional programming is inherently and universally superior. Pure functional programming has strong merits, and is often the right approach (and sometimes not), but most of what makes FP great is the community it has generated. It’s a community of engineers who want to be lifelong programmers or scientists, and who are used to firing up a REPL and trying out a new library. It’s a community of people who still use the command line and who still believe that to program is a virtue. The object-oriented world is one in which every programmer wants to be a manager, because object-orientation is how “big picture guys” think.

I’m very impressed with Clojure as a language, and that community has made phenomenally good decisions over the past few years. I started using it in 2008, and the evolution has been very positive. It’s not that I find Clojure (or Lisp) to be inerrant, but the community (and some others, like Haskell’s) stands in stark contrast against the anti-intellectualism of corporate software development. And I admire that immensely. It’s a real sacrifice that we 1.5+ engineers make on an ongoing basis when we demand that we keep learning, do things right, and build on sound principles. It doesn’t come easy. It can demand unusual hours, costs us jobs, and can put us in the ghetto, but there it is.

In the mean time, though, I don’t think it’s useful to mistake language choice as the prevailing or most important issue. If we do that, we’re just as guilty of cargo cultism as the stereotypical Java-happy IT managers. No, the real issue that matters is engineer empowerment, and we need to keep up our culture around that.

Learning C, reducing fear.

I have a confession to make. At one point in my career, I was a mediocre programmer. I might say that I still am, only in the context of being a harsh grader. I developed a scale for software engineering for which I can only, in intellectual honesty, assign myself 1.8 points out of a possible 3.0. One of the signs of my mediocrity is that I haven’t a clue about many low-level programming details that, thirty years ago, people dealt with on a regular basis. I know what L1 and L2 cache are, but I haven’t built the skill set yet to make use of this knowledge.

I love high-level languages like Scala, Clojure, and Haskell. The abstractions they provide make programming more productive and more fun than it is in languages like Java and C++, and they have a beauty that I appreciate as a designer and mathematician. Yet there is still quite a place for C in this world. Last July, I wrote an essay, “Six Languages to Master”, in which I advised young programmers to learn the following languages:

  • Python, because one can get started quickly and Python is a good all-purpose language.
  • C, because there are large sections of computer science that are inaccessible if you don’t understand low-level details like memory management.
  • ML, to learn taste in a simple language often described as a “functional C” that also teaches how to use type systems to make powerful guarantees about programs.
  • Clojure, because learning about language (which is important if one wants to design good interfaces) is best done with a Lisp and because, for better or worse, the Java libraries are a part of our world.
  • Scala, because it’s badass if used by people with a deep understanding of type systems, functional programming, and the few (very rare) occasions where object-oriented programming is appropriate. (It can be, however, horrid if wielded by “Java-in-Scala” programmers.)
  • English (or the natural language of one’s environment) because if you can’t teach other people how to use the assets you create, you’re not doing a very good job.

Of these, C was my weakest at the time. It still is. Now I’m taking some time to learn it. Why? Two reasons.

  • Transferability. Scala’s great, but I have no idea if it will be around in 10 years. If the Java-in-Scala crowd adopts the language without upgrading its skills and the language becomes associated with Maven, XMHell, IDE culture, and commodity programmers, in the way that Java has, the result will be piles of terrible Scala code that will brand the language as “write-only” and damage its reputation for reasons that are not Scala’s fault. These sociological variables I cannot predict. I do, however, know that C will be in use in 10 years. I don’t mind learning new languages– it’s fun and I can do it quickly– but the upshot of C is that, if I know it, I will be able to make immediate technical contributions in almost any programming environment. I’m already fluent in about ten languages; might as well add C. 
  • Confidence. High-level languages are great, but if you develop the attitude that low-level languages are “unsafe”, ugly, and generally terrifying, then you’re hobbling yourself for no reason. C has its warts, and there are many applications where it’s not appropriate. It requires attention to details (array bounds, memory management) that high-level languages handle automatically. The issue is that, in engineering, anything can break down, and you may be required to solve problems in the depths of detail. Your beautiful Clojure program might have a performance problem in production because of an issue with the JVM. You might need to dig deep and figure it out. That doesn’t mean you shouldn’t use Clojure. However, if you’re scared of C, you can’t study the JVM internals or performance considerations, because a lot of the core concepts (e.g. memory allocation) become a “black box”. Nor will you be able to understand your operating system.

For me, personally, the confidence issue is the important one. In the functional programming community, we often develop an attitude that the imperative way of doing things is ugly, unsafe, wrong, and best left to “experts only” (which is ironic, because most of us are well into the top 5% of programmers, and more equipped to handle complexity than most; it’s this adeptness that makes us aware of our own limitations and prefer functional safeguards when possible). Or, I should not say that this is a prevailing attitude, so much as an artifact of communication. Fifty-year-old, brilliant functional programmers talk about how great it is to be liberated from evils like malloc and free. They’re right, for applications where high-level programming is appropriate. The context being missed is that they have already learned about memory management quite thoroughly, and now it’s an annoyance to them to keep having to do it. That’s why they love languages like Ocaml and Python. It’s not that low-level languages are dirty or unsafe or even “un-fun”, but that high-level languages are just much better suited to certain classes of problems.

Becoming the mentor

I’m going to make an aside that has nothing to do with C. What is the best predictor of whether someone will remain at a company for more than 3 years? Mentorship. Everyone wants “a Mentor” who will take care of his career by providing interesting work, freedom from politics, necessary introductions, and well-designed learning exercises instead of just-get-it-done grunt work. That’s what we see in the movies: the plucky 25-year-old is picked up by the “star” trader, journalist, or executive and, over 97 long minutes, his or her career is made. Often this relationship goes horribly wrong in film, as in Wall Street, wherein the mentor and protege end up in a nasty conflict. I won’t go so far as to call this entirely fictional, but it’s very rare. You can find mentors (plural) who will help you along as much as they can, and you should always be looking for people interested in sharing knowledge and offering help, but you shouldn’t look for “The Mentor”. He doesn’t exist. People want to help those who are already self-mentoring. This is even more true in a world where few people stay at a job for more than 4 years.

I’ll turn 30 this year, and in Silicon Valley that would entitle me to a lawn and the right to tell people to get off of it, but I live in Manhattan, so I’ll have to keep using the Internet as my virtual lawn. (Well, people just keep fucking being wrong. There are too many for one man to handle!) One of the most important lessons to learn is the importance of self-mentoring. Once you get out of school, where people are paid to teach you things, nobody will help someone who isn’t helping himself. To a large degree, this means becoming the “Mentor” figure that one seeks. I think that’s what adulthood is. It’s when you realize that the age in which there were superior people at your beck and call to sort out your messes and tell you what to do is over. Children can be nasty to each other, but there are always adults to make things right– to discipline those who break others’ toys, and to replace what is broken. The terrifying thing about adulthood is the realization that there are no adults. That leaves a deep-seated need that the physical world won’t fill. There are at least 10,000 years of recorded history showing people gaining immense power by making up “adults-over-adults” and using the purported existence of such creatures to arrogate political power, because most people are frankly terrified of the fact that, at least in the observable physical world and in this life, no such creature exists.

What could this have to do with C? Well, now I dive back into confessional mode. My longest job tenure (30 months!) was at a startup that seems to have disappeared after I left. I was working in Clojure, doing some beautiful technical work. This was in Clojure’s infancy, but the great thing about Lisps is that it’s easy to adapt the language to your needs. I wrote a multi-threaded debugger using dynamic binding (dangerous in production, but fine for debugging) that involved getting into the guts of Clojure, a test harness, an RPC client-server infrastructure, and a custom NoSQL graph-based database. The startup itself wasn’t well-managed, but the technical work was a lot of fun. Still, I remember a lot of conversations to the effect of “When we get a real <X>”, where X might be “database guy” or “security expert” or “support team”. The attitude I allowed myself to fall into, when we were four people strong, was that a lot of the hard work would have to be done by someone more senior, someone better. We inaccurately believed that the scaling challenges would mandate this, when in fact we didn’t scale at all, because the startup never launched.

Business idiots love real X’s. This is why startups frequently develop the social-climbing mentality (in the name of “scaling”) that makes internal promotion rare. The problem is that this “realness” is total fiction. People don’t graduate from Expert School and become experts. They build a knowledge base over time, often by going far outside of their comfort zones and trying things at which they might fail, and the only things that change are that the challenges get harder, or the failure rate goes down. As with the Mentor that many people wait for in vain, one doesn’t wait to “find a Real X” but becomes one. That’s the difference between a corporate developer and a real hacker. The former plays Minesweeper (or whatever Windows users do these days) and waits for an Expert to come from on high to fix his IDE when it breaks. The latter shows an actual interest in how computers really work, which requires diving into the netherworld of the command line interface.

That’s why I’m learning C. I’d prefer to spend much of my programming existence in high-level languages, not micromanaging details– although, thus far, C has proven surprisingly fun– but I realize that these low-level concerns are extremely important, and that if I want to understand things truly, I need a basic fluency in them. If you fear details, you don’t understand “the big picture”; the big picture is made up of details, after all. This is a way to keep the senescence of business FUD at bay– to not become That Executive who mandates hideous “best practices” Java because Python and Scala are “too risky”.

Fear of databases? Of operating systems? Of “weird” languages like C and Assembly? Y’all fears get Zero Fucks from me.

IDE Culture vs. Unix philosophy

Even more of a hot topic than programming languages is the integrated development environment, or IDE. Personally, I’m not a huge fan of IDEs. As tools, standing alone, I have no problem with them. I’m a software libertarian: do whatever you want, as long as you don’t interfere with my work. However, here are some of the negatives that I’ve observed when IDEs become commonplace or required in a development environment:

  • the “four-wheel drive problem”. This refers to the fact that an unskilled off-road driver, with four-wheel drive, will still get stuck; the more capable vehicle just lets him fail in a more inaccessible place. IDEs pay off when you have to maintain an otherwise unmanageable ball of other people’s terrible code. They make unusable code merely miserable. I don’t think there’s any controversy about this. The problem is that, by providing this power, they enable an activity of dubious value: continual development despite abysmal code quality, when improving or killing the bad code should be a code-red priority. IDEs can delay code-quality problems and defer their macroscopic business effects, which is good for manageosaurs who like tight deadlines, but this only makes the problem worse at the end stage.
  • IDE-dependence. Coding practices that require developers to depend on a specific environment are unforgivable. This is true whether the environment is emacs, vi, or Eclipse. The problem with IDEs is that they’re more likely to push people toward doing things in a way that makes using a different environment impossible. One pernicious example of this is Java culture’s mutilation of the command-line way of doing things with singleton directories called “src” and “com”, but there are many examples that run deeper than that. Worse yet, IDEs enable the employment of programmers who don’t even know what build systems or version control are. Those are things “some smart guy” worries about so the commodity programmer can crank out classes at his boss’s request.
  • spaghettification. I am a major supporter of the read-only IDE, preferably served over the web. I think that code navigation is necessary for anyone who needs to read code, whether it’s crappy corporate code or the best-in-class stuff we actually enjoy reading. When you see a name, you should be able to click on it and see where that name is defined. However, I’m pretty sure that, on balance, automated refactorings are a bad thing. Over time, the abstractions which can easily be “injected” into code using an IDE turn it into “everything is everywhere” spaghetti code. Without an IDE, the only way to do such work is to write a script to do it. There are two effects this has on the development process. One is that it takes time to make the change: maybe 30 minutes. That’s fine, because the conversation that should happen before a change that will affect everyone’s work should take longer than that. The second is that only adept programmers (who understand concepts like scripts and the command line) will be able to do it. That’s a good thing.
  • time spent keeping up the environment. Once a company decides on “One Environment” for development, usually an IDE with various in-house customizations, that IDE begins to accumulate plugins of varying quality. That environment usually has to be kept up, and that generates a lot of crappy work that nobody wants to do.

This is just a start on what’s wrong with IDE culture, but the core point is that it creates some bad code. So, I think I should make it clear that I don’t dislike IDEs. They’re tools that are sometimes useful. If you use an IDE but write good code, I have no problem with you. I can’t stand IDE culture, though, because I hate hate hate hate hate hate hate hate the bad code that it generates.

In my experience, software environments that rely heavily on IDEs tend to be those that produce terrible spaghetti code, “everything is everywhere” object-oriented messes, and other monstrosities that simply could not be written by a sole idiot. He had help. Automated refactorings that injected pointless abstractions? Despondency infarction frameworks? Despise patterns? Those are likely culprits.

In other news, I’m taking some time to learn C at a deeper level, because as I get more into machine learning, I’m realizing the importance of being able to reason about performance, which requires full-stack knowledge of computing. Basic fluency in C, at a minimum, is requisite. I’m working through Zed Shaw’s Learn C the Hard Way, and he’s got some brilliant insights not only about C (where I’m not yet equipped to judge how brilliant they are) but about programming itself. In his preamble chapter, he makes a valid point in warning against using an IDE for the learning process:

An IDE, or “Integrated Development Environment” will turn you stupid. They are the worst tools if you want to be a good programmer because they hide what’s going on from you, and your job is to know what’s going on. They are useful if you’re trying to get something done and the platform is designed around a particular IDE, but for learning to code C (and many other languages) they are pointless. [...]
Sure, you can code pretty quickly, but you can only code in that one language on that one platform. This is why companies love selling them to you. They know you’re lazy, and since it only works on their platform they’ve got you locked in because you are lazy. The way you break the cycle is you suck it up and finally learn to code without an IDE. A plain editor, or a programmer’s editor like Vim or Emacs, makes you work with the code. It’s a little harder, but the end result is you can work with any code, on any computer, in any language, and you know what’s going on. (Emphasis mine.)

I disagree with him that IDEs will “turn you stupid”. Reliance on one prevents a programmer from ever turning smart, but I don’t see how such a tool would cause a degradation of a software engineer’s ability. Corporate coding (lots of maintenance work, low productivity, half the day lost to meetings, difficulty getting permission to do anything interesting, bad source code) does erode a person’s skills over time, but that can’t be blamed on the IDE itself. However, I think he makes a strong point. Most of the ardent IDE users are the one-language, one-environment commodity programmers who never improve, because they never learn what’s actually going on. Such people are terrible for software, and they should all either improve, or be fired.

The problem with IDEs is that each corporate development culture customizes the environment, to the point that the cushy, easy coding environment can’t be replicated at home. For someone like me, who doesn’t even like that type of environment, that’s no problem, because I don’t need that shit in order to program. But someone steeped in cargo cult programming because he started in the wrong place is going to falsely assume that programming requires an IDE, having seen little else, and such a novice generally lacks the skills necessary to set one up to look like the familiar corporate environment. Instead, he needs to start where every great programmer learns the basic skills: at the command line. Otherwise, you get a “programmer” who can’t program outside of a specific corporate context– in other words, a “5:01 developer” not by choice, but by a false understanding of what programming really is.

The worst thing about these superficially enriched corporate environments is their lack of documentation. With Unix and the command-line tools, there are man pages and how-to guides all over the Internet. This creates a culture of solving one’s own problems. Given enough time, you can answer your own questions. That’s where most of the growth happens: you don’t know how something works, you Google an error message, and you get a result. Most of the information coming back is indecipherable to a novice programmer, but with enough searching, the problem is solved and a few things are learned along the way, including answers to some questions the novice didn’t yet have the insight (“unknown unknowns”) to ask. That knowledge isn’t built in a day, but it’s deep. That process doesn’t exist in an over-complex corporate environment, where the only way to move forward is to go and bug someone, and the time cost of any real learning process is at a level that most managers would consider unacceptable.

On this, I’ll crib from Zed Shaw yet again, in Chapter 3 of Learn C the Hard Way:

In the Extra Credit section of each exercise I may have you go find information on your own and figure things out. This is an important part of being a self-sufficient programmer. If you constantly run to ask someone a question before trying to figure it out first then you never learn to solve problems independently. This leads to you never building confidence in your skills and always needing someone else around to do your work. The way you break this habit is to force yourself to try to answer your own questions first, and to confirm that your answer is right. You do this by trying to break things, experimenting with your possible answer, and doing your own research. (Emphasis mine.)

What Zed is describing here is the learning process that never occurs in the corporate environment, and the lack of it is one of the main reasons why corporate software engineers never improve. In the corporate world, you never find out why the build system is set up the way that it is. You just go bug the person responsible for it. “My shit depends on your shit, so fix your shit so I can run my shit and my boss doesn’t give me shit over my shit not working for shit.” Corporate development often has to be this way, because learning a typical company’s incoherent in-house systems doesn’t provide a general education. When you’re studying the guts of Linux, you’re learning how a best-in-class product was built, and there’s real learning in mucking about in the small details. In a typically mediocre corporate environment, built by engineers trying to appease their managers one day at a time, the quality of the pieces is often so shoddy that not much is gained by truly comprehending them. It’s just a waste of time to learn such systems deeply. Instead, it’s best to get in, answer your question, and get out; bugging someone is the most efficient way to solve the problem.

It should be clear that what I’m railing against is the commodity developer phenomenon. I wrote about “Java Shop Politics” last April, which covers a similar topic. I’m proud of that essay, but I was wrong to single out Java as opposed to, e.g. C#, VB, or even C++. Actually, I think any company that calls itself an “<X> Shop” for any language X is missing the point. The real evil isn’t Java the language, as limited as it may be, but Big Software and the culture thereof. The true enemy is the commodity developer culture, empowered by the modern bastardization of “object-oriented programming” that looks nothing like Alan Kay’s original vision.

In well-run software companies, programs are built to solve problems, and once the problem is solved, the program is Done. The program might be adapted in the future, and may require maintenance, but that’s not an assumption. There aren’t discussions about how much “headcount” to dedicate to ongoing maintenance after completion, because that would make no sense. If people need to modify or fix the program, they’ll do it. Programs solve well-defined problems, and then their authors move on to other things– no God Programs that accumulate requirements, but simple programs designed to do one thing and do it well. The programmer-to-program relationship must be one-to-many. Programmers write programs that do well-defined, comprehensible things well. They solve problems. Then they move on. This is a great way to build software “as needed”, and the only problem with this style of development is that small, important programs are hard to micromanage, so managerial dinosaurs who want to track efforts and “headcount” don’t like it much, because they can never figure out whom to scream at when things don’t go their way. It’s hard to commoditize programmers when their individual contributions can only be tracked by their direct clients, and when people can silently be doing work of high importance (such as making small improvements to the efficiency of core algorithms that reduce server costs). The alternative is to invert the programmer-to-program relationship: make it many-to-one. Then you have multiple programmers (now a commodity) working on Giant Programs that Do Everything. This is a terrible way to build software, but it’s the one historically favored by IDE culture, because the sheer work of setting up a corporate development environment is enough that it can’t be done too often, and this leads managers to desire Giant Projects and a uniformity (such as a one-language policy; see again why “<X> Shops” suck) that often makes no sense.

The right way of doing things– one programmer works on many small, self-contained programs– is the core of the so-called “Unix philosophy”. Big Programs, by contrast, invariably have undocumented communication protocols and consistency requirements whose violation leads not only to bugs, but to pernicious misunderstandings that muddle the original conceptual integrity of the system, resulting in spaghetti code and “mudballs”. The antidote is for individual programs to stay small, and for large problems to be solved by systems of such programs that are given the respect (such as attention to fault tolerance) that systems deserve.
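
(To make that concrete, here’s a minimal sketch, in Clojure, of the kind of small, single-purpose program I mean. The file name, the “ERROR” filter, and the invocation are purely illustrative; the point is that the program does one thing and composes with other small programs through ordinary pipes.)

    ;; grep-errors.clj -- a deliberately tiny, single-purpose program (illustrative).
    ;; It reads lines from stdin and prints only the ones containing "ERROR".
    ;; Run with something like:  cat app.log | clojure -M grep-errors.clj | wc -l
    (doseq [line (line-seq (java.io.BufferedReader. *in*))]
      (when (.contains ^String line "ERROR")
        (println line)))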

Are there successful exceptions to the Unix philosophy? Yes, there are, but they’re rare. One notable example is the database, because these systems often have very strong requirements (transactions, performance, concurrency, durability, fault tolerance) that cannot be as easily solved with small programs and organic growth alone. Some degree of top-down orchestration is required if you’re going to have a viable database, because databases have a lot of requirements that aren’t typical business cruft but are actually critically important. Postgres, probably the best SQL database out there, is not a simple beast. Indeed, databases violate one of the core tenets of the Unix philosophy– store data in plain text– and they do so for good reasons (storage efficiency). Databases also mandate that people be able to use them without having to keep up with the evolution of such a system’s opaque and highly optimized internal details, which makes the separation of implementation from interface (something that object-oriented programming got right) a necessary virtue. Database connections, like file handles, should be objects (where “object” means “something that can be used with incomplete knowledge of its internals”). So databases, in some ways, violate the Unix philosophy, and yet are still used by staunch adherents. (We admit that we’re wrong sometimes.) I will also remark that it has taken decades for some extremely intelligent (and very well-compensated) people to get databases right. Big Projects win when no small project, or loose federation thereof, will do the job.
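
(That definition of “object”– something usable with incomplete knowledge of its internals– is easy to show in code. A minimal Clojure sketch, with protocol and record names that are entirely my own:)

    ;; Callers program against a small interface and never touch the internals.
    (defprotocol KeyValueStore
      (put-val [store k v])
      (get-val [store k]))

    ;; One possible implementation: internally it's just an atom wrapping a map.
    ;; A caller never needs to know that, and it could be swapped for a
    ;; disk-backed version without changing any calling code.
    (defrecord InMemoryStore [state]
      KeyValueStore
      (put-val [_ k v] (swap! state assoc k v))
      (get-val [_ k] (get @state k)))

    (def store (->InMemoryStore (atom {})))
    (put-val store :answer 42)
    (get-val store :answer)   ;=> 42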

My personal belief is that almost every software manager thinks he’s overseeing one of the exceptions: a Big System that (like Postgres) will grow to such importance that people will just swallow the complexity and use the thing, because it’s something that will one day be more important than Postgres or the Linux kernel. In almost all cases, they are wrong. Corporate software is an elephant graveyard of such over-ambitious systems. Exceptions to the Unix philosophy are extremely rare. Your ambitious corporate system is almost certainly not one of them. Furthermore, if most of your developers– or even a solid quarter of them– are commodity developers who can’t code outside of an IDE, you haven’t a chance.

Functional programs rarely rot

The way that most of us sell functional programming is all wrong. The first thing we say about it is that the style lacks mutable state (although most functional languages allow it). So we’re selling it by talking about what it doesn’t have. Unfortunately, it’s not until late in a programmer’s career (and some never get there) that he realizes that less power in a language is often better, because “freedom-from” can be more important than “freedom-to” in systems that will require maintenance. This makes for an ineffective sales strategy, because the people we need to reach are going to see the statelessness of the functional style as a deficit. Besides, we need to be honest here: sometimes (but not often) mutable state is the right tool for the job.

Often, functional programming is sold with side-by-side code snippets: the imperative example is about 30 lines long and relatively inelegant, while the functional example requires only six. The intent is to show that functional programming is superior, but the problem is that people tend to prefer whatever is most familiar to them. A person who is used to the imperative style is more likely to prefer the imperative code simply because it’s familiar. What good is a reduction to six lines if those six lines seem impenetrable? Besides, in the real world, we use mutable state all the time: for performance, and because the world actually is stateful. It’s just that we’re mindful enough to manage it well.
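
For concreteness, here’s the kind of toy comparison I have in mind, written in Clojure (my own example, not anyone’s sales pitch): the “imperative” version uses an explicit mutable accumulator, while the functional version is a pipeline of pure transformations. To someone raised on the first style, the second is shorter but not obviously clearer.

    ;; Sum of the squares of the even numbers in a collection.

    ;; Imperative-ish: an explicit mutable accumulator.
    (defn sum-even-squares-imperative [xs]
      (let [acc (atom 0)]
        (doseq [x xs]
          (when (even? x)
            (swap! acc + (* x x))))
        @acc))

    ;; Functional: a pipeline of pure transformations.
    (defn sum-even-squares [xs]
      (->> xs
           (filter even?)
           (map #(* % %))
           (reduce + 0)))

    (sum-even-squares-imperative [1 2 3 4 5 6])  ;=> 56
    (sum-even-squares [1 2 3 4 5 6])             ;=> 56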

So what is it that makes functional programming superior? Or what is it that makes imperative code so terrible? The issue isn’t that state is inherently evil, because it’s not. Small programs are often easier to read in an imperative style than a functional one, just as goto can improve the control flow of a small procedure. The problem, rather, is that stateful programs evolve in bad ways as they grow large.

A referentially transparent function is one that returns the same output for the same input. Such functions, along with immutable data, are the atoms of functional programming. They have precise (and usually obvious) intended semantics, which means it’s clear what is and is not a bug, and unit testing is relatively straightforward. Contrast this with typical object-oriented software, where an object’s semantics are the code, and it’s easy to appreciate why the functional approach is better. There’s something else that’s nice about referentially transparent functions: they can’t be changed (in a meaningful way) without altering their interfaces.
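
A minimal sketch of the difference, in Clojure (the function names are just illustrative):

    ;; Referentially transparent: same input, same output, so a unit test is
    ;; just an equality check on values.
    (defn cart-total [prices]
      (reduce + 0 prices))

    (assert (= 60 (cart-total [10 20 30])))

    ;; Not referentially transparent: the answer depends on hidden mutable
    ;; state, so "what should this return?" can't be answered from the
    ;; interface alone -- the semantics live in whatever has happened so far.
    (def ^:private running-total (atom 0))

    (defn add-to-cart! [price]
      (swap! running-total + price))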

Object-oriented software development is contrived to make sense to non-technical businessmen, and the vague squishiness associated with “objectness” appeals to that crowd. Functional programming is how mathematicians would prefer to program, and programming is math. When you actually care about correctness, the mathematician’s insistence on integrity even in the smallest details is preferable.

A functional program computes something, and intermediate computations are returned from one function and passed directly into another as a parameter. Behaviors that aren’t reflected in a returned value might matter for performance, but not for semantics. An imperative or object-oriented program does things, but the intermediate results are thrown away. What this means is that an unbounded amount of intermediate stuff can be shoved into an imperative program with no change to its interface. Or, to put it another way, with a referentially transparent function, the interface-level activity is all one needs to know about its behavior. The flip side is that for a referentially transparent function to become complex, its interface must become equally complex. Sometimes the simplicity of an imperative function’s interface is desirable.
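
Here’s a small Clojure sketch of what I mean (the names and the CSV-ish format are purely illustrative): every intermediate result is a value handed from one function to the next, so anything the caller needs to know has to show up in the returned values.

    (require '[clojure.string :as str])

    (defn parse-line [s] (str/split s #","))
    (defn valid? [fields] (= 3 (count fields)))
    (defn to-record [[id name amount]]
      {:id id :name name :amount (Long/parseLong amount)})

    (defn load-records [lines]
      (->> lines
           (map parse-line)
           (filter valid?)
           (map to-record)))

    (load-records ["1,alice,30" "garbage" "2,bob,45"])
    ;=> ({:id "1", :name "alice", :amount 30} {:id "2", :name "bob", :amount 45})

    ;; If load-records ever needs to report *why* lines were rejected, that
    ;; information has to appear in its return value (or its arguments); it
    ;; can't be silently shoved into some shared mutable corner.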

When is imperative programming superior? First, when a piece of code is small and guaranteed to stay small, the additional plumbing involved in deciding what-goes-where that functional programming requires can make the program harder to write. A common functional pattern, when state is needed, is to “thread” it through referentially transparent functions by documenting state effects (which may be executed later) in the interface, and this can complicate interfaces. It’s more intuitive to perform a series of actions than to build up a list of actions (the functional style) that is executed later. What if that list is very large? How will this affect performance? That said, my experience is that the crossover point at which functional programming becomes strictly preferable is low: about 100 lines of code at most, and often below 50. The second case of imperative programming’s superiority is when the imperative style is needed for performance reasons, or when an imperative language with manual memory management (such as C or C++) is strictly required for the problem being solved. Sometimes all of those intermediate results must be thrown away because there isn’t space to hold them. Mutable state is an important tool, but it should almost always be construed as a performance-oriented optimization.
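
Here’s a toy Clojure illustration of the state-threading point (the bank-account names are mine, not a real API): each step takes the current state and returns the next one, and the “actions” are just data applied at the end.

    (defn deposit  [account amt] (update account :balance + amt))
    (defn withdraw [account amt] (update account :balance - amt))

    ;; The actions are plain data; running them is a fold over pure functions.
    (def actions [[deposit 100] [withdraw 30] [deposit 5]])

    (defn run [account actions]
      (reduce (fn [acct [f amt]] (f acct amt)) account actions))

    (run {:balance 0} actions)
    ;=> {:balance 75}

    ;; The imperative version -- (swap! account-atom + 100) and so on -- is
    ;; more direct, and for a handful of lines it's arguably clearer; the
    ;; functional version pays off as the program grows.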

The truth is that good programmers mix the styles quite a bit. We program imperatively when needed, and functionally when possible.

Imperative and object-oriented programming are different styles, and the latter is, in my view, more dangerous. Few programmers write imperative code anymore. C is imperative, Java is full-blown object-oriented, and C++ sits between the two, depending on who wrote the style guide. The problem with imperative code, in most business environments, is that it’s slow to write and not very dense. Extremely high-quality imperative code can be written, but it takes a very long time to do so. Companies writing mission-critical systems can afford it, but most IT managers prefer the fast-and-loose, sloppy vagueness of modern OOP. Functional and object-oriented programming both improve on the imperative style in the ability to write code fast (thanks to abstraction), but object-oriented code is more manager-friendly, favoring factories over monads.

Object-oriented programming’s claimed virtue is the ability to encapsulate complexity behind a simpler interface, usually with configurability of the internal complexity as an advanced feature. Some systems reach a state where that is necessary. One example is the SQL database, where the typical user specifies what data she wants but not how to get it. In fact, although relational databases are often thought of in opposition to the “object-oriented” style, this is a prime example of an “object-oriented” win. Alan Kay’s original vision for object-oriented programming was not at all a bad one: when you require complexity, encapsulate it behind a simpler interface so that (a) the product is easier to use, and (b) internals can be improved without disruption at the user level. He was not saying, “go out and write a bunch of highly complex objects.” Enterprise Java is not what he had in mind.

So what about code rot? Why is the functional style more robust against software entropy than object-oriented or imperative code? The answer is inherent in functional programming’s visible (and sometimes irritating) limitation: you can’t add direct state effects; you have to change interfaces. What this means is that adding complexity to a function expands its interface, and it quickly reaches a point where it’s visibly ugly. What happens then? A nice thing about functional programming is that programs are built up using function composition, which means that large functions can easily be broken into smaller ones. It’s rare, in a language like Ocaml or Clojure written by a competent practitioner, to see functions longer than 25 lines. People break functions up (encouraging code reuse) when they get to that point. That’s a really great thing! Complexity sprawl still happens, because that’s how business environments are, but it’s horizontal: as functional programs grow in size, there are simply more functions. This can be ugly (often, in a finished product, half of the functions are unused and can be discarded) but it’s better than the alternative, which is the vertical complexity sprawl that happens within “God methods”, where it’s unclear what is essential and what can be discarded.
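
A small Clojure sketch of what that horizontal growth looks like (the function names are mine): the “big” function is just a composition of small, independently testable ones, so when it gets ugly, it splits naturally.

    (require '[clojure.string :as str])

    (defn strip-comments [lines] (remove #(str/starts-with? % "#") lines))
    (defn trim-all       [lines] (map str/trim lines))
    (defn non-blank      [lines] (remove str/blank? lines))

    ;; comp applies right-to-left: strip comments, trim, then drop blank lines.
    (def clean-config (comp vec non-blank trim-all strip-comments))

    (clean-config ["# demo config" "  host = example.com  " "" "port = 8080"])
    ;=> ["host = example.com" "port = 8080"]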

With imperative code, the additional complexity is shoved into a series of steps. For-loops get bigger, and branching gets deeper. At some point, procedures reach several hundred lines in length, often with components written by different authors, and conceptual integrity is lost. Much worse is object-oriented code, with its non-local inheritance behaviors (note: inheritance is the 21st-century goto) and its native confusion of data types with namespaces. Here, the total complexity of an object can reach tens of thousands of lines, at which point spaghettification has occurred.

Functional programming is like the sonnet. There’s nothing inherently superior about iambic pentameter and rhyming requirements, but the form tends to elevate one’s use of language. People find a poetic capability that would be much harder (for a non-poet) to find in free verse. Why? Because constraint– the right kind of constraint– breeds creativity. Functional programming is the same way. No, there’s nothing innately superior about it, but the style forces people to deal with the complexity they generate by forcing it to live at the interface level. You can’t, as easily, throw a dead rat in the code to satisfy some dipshit requirement and then forget about it. You have to think about what you’re doing, because it will affect interfaces and likely be visible to other people. The result is that the long-term destruction of code integrity that happens in most business environments is a lot less likely to occur.