Why I wiped my LinkedIn profile

I wiped my LinkedIn profile recently. It now says:

I don’t reveal history without a reason, so my past jobs summary is blank.

I’m a New York-based software engineer who specializes in functional programming, machine learning, and language design.

This might not be the best move for my career. I’m mulling over whether I should delete the profile outright, rather than leaving a short note that appears cagey. I have a valid point– it really isn’t the rest of the world’s business what companies I have worked for– but I’m taking an unusual position that leaves me looking like a “tinfoiler”. I’m honestly not one, but I do believe in personal privacy. Privacy’s value is insurance against low-probability, high-impact harms. I don’t consider it likely that I’ll ever damage myself by publicly airing past employment history. It’s actually very unlikely. But why take the chance? I am old enough to know that not all people in the world are good, and this fact requires caution in the sharing of information, no matter how innocuous it might seem.

Consistency risk

My personal belief is that more people will damage their careers through respectable avenues such as LinkedIn than on Facebook, the more classic “digital dirt” culprit. For most jobs, no one is going to care what a now-35-year-old software engineer said when he was 19 about getting drunk. Breaking news: all adults were teenagers, and teenagers are sometimes stupid! On the other hand, people could be burned by inconsistencies between two accounts of their career histories. Let’s say that someone’s CV says “March 2003 – February 2009” while his LinkedIn profile says “March 2003 – November 2008”. Uh-oh. HR catches this discrepancy, flags it, and brings the candidate in for a follow-on interview, and the candidate discloses that he was on severance (and technically employed, but with no responsibilities) for 3 months. There was no lie. It was a benign difference of accounting. Still, the candidate has now disclosed receipt of a severance payment. There’s a story there. Whoops. In a superficial world, that could mean losing the job offer.

This isn’t a made-up story. The dates were different, but I know someone who ended up having to disclose a termination because of an inconsistency of this kind. (LinkedIn, in the case of which I’m aware, wasn’t the culprit.) So consistency risk is real.

Because the white-collar corporate world has so little in the way of actual ethics, the appearance of being ethical is extremely important. Even minor inconsistencies admit a kind of scrutiny that no one wishes to tolerate. This career oversharing that a lot of young people are participating in is something I find quite dangerous. Not everything that can damage a person’s reputation is a drunk picture. Most threats and mistakes are more subtle than that, and consistency risk is a big deal.

Replicating a broken system

My ideological issue, however, with LinkedIn isn’t the risk that’s involved. I’ll readily concede that those risks are very mild for the vast majority of people. The benefits of using such a service quite possibly outweigh them. The bigger problem I have with it is that it exists to replicate broken ways of doing things.

In 2013, the employment market is extremely inefficient in almost all domains, whether we’re talking about full-time jobs, consulting gigs, or startup funding. It’s a system so broken that no one trusts it, and when people distrust front-door channels or find them clogged and unusable, they retreat to back-door elitism and nepotism. Too much trust is given to word-of-mouth references (that are slow to travel, unreliable, and often an artifact of a legal settlement) and low-quality signals such as educational degrees, prestige of prior employers, and durations of employment. Local influences have a pernicious effect, the result of which is unaffordable real estate in virtually any location where a career can be built. Highly-qualified people struggle to find jobs– especially their first engagements– while companies complain of a dearth of appropriate talent. They’re both right, in a way. This is a matching problem related to the “curse of dimensionality”. We have a broken system that no one seems to know how to fix.

LinkedIn, at least in this incarnation, is an online implementation of the old-style, inefficient way of doing things. If you want an impressive profile, you have to troll for, trade, and if you’ve had a bad separation, use the legal system to demand in a settlement, recommendations and endorsements. You list the companies where you worked, job titles, and dates of employment, even if you honestly fucking hate some of those companies. We’ve used the Internet to give wings to an antiquated set of mechanics for evaluating other people, when we should be trying to do something better.

None of this is intended as a slight against LinkedIn itself. It’s a good product, and I’m sure they’re a great company. I just have an ideological dislike– and I realize that I hold a minority opinion– for the archaic and inefficient way we match people to jobs. It doesn’t even work anymore, seeing as most resumes are read for a few seconds then discarded.

Resumes are broken in an especially irritating way, because they often require people to retain a lasting association with an organization that may have behaved in a tasteless way. I have, most would say, a “good” resume. It’s better than what 98 percent of people my age have: reputable companies, increasing scope of responsibility. Yet, it’s a document through which I associate my name with a variety of organizations. Some of these I like, and some I despise. There is one for which I would prefer for the world never to know that I was associated with it. Of course, if I’m asked, “Tell me about your experience at <X>” in a job interview, for certain execrable values of X, social protocol forbids me from telling the truth.

I’ll play by the rules, when I’m job searching. I’ll send a resume, because it’s part of the process. Currently, however, I’m not searching. This leaves me with little interest in building an online “brand” in a regime vested in the old, archaic protocols. Trolling for endorsements, in my free time, when I’m employed? Are you kidding me?

The legitimacy problem

Why do I so hate these “old, archaic protocols”? It’s not that I have a problem, personally. I have a good resume, strong accomplishments for someone of my age, and I can easily get solid recommendations. I have no need to have a personal gripe here. What bothers me is something else, something philosophical that doesn’t anger a person until she thinks of it in the right way. It’s this: any current matching system between employers and employees has to answer questions regarding legitimacy, and the existing one gets some core bits seriously wrong.

What are the most important features of a person’s resume? For this exercise, let’s assume that we’re talking about a typical white-collar office worker, at least 5 years out of school. Then I would say that “work experience” trumps education, even if that person has a Harvard Ph.D. What constitutes “work experience”? There’s some degree of “buzzword compliance”, but that factor I’m willing to treat as noise. Sometimes, that aspect will go in a candidate’s favor, and sometimes it won’t, but I don’t see it conferring a systemic advantage. I’m also going to say that workplace accomplishments mean very little. Why? Because an unverifiable line on a resume (“built awesome top-secret system you’ve never heard of”) is going to be assumed, by most evaluators, to be inflated and possibly dishonest. So the only bits of a resume that will be taken seriously are the objectively verifiable ones. This leaves:

  • Company prestige. That’s the big one, but it’s also ridiculously meaningless, because prestigious companies hire idiots all the time. 
  • Job titles. This is the trusted metric of professional accomplishment. If you weren’t promoted for it, it didn’t happen.
  • Length of tenure. This one’s nonlinear, because short tenures are embarrassing, but long stints without promotions are equally bad.
  • Gaps in employment. Related to the above, large gaps in job history make a candidate unattractive.
  • Salary history, if a person is stupid enough to reveal it.
  • Recommendations, preferably from management.

There are other things that matter, such as overlap between stated skills and what a particular company needs, but when it comes to “grading” people, look no farther than the above. Those factors determine where a person’s social status starts in the negotiation. Social status isn’t, of course, the only thing that companies care about in hiring… but it’s always advantageous to have it in one’s favor.

What’s disgusting and wrong about this regime is that all of these accolades come from a morally illegitimate source: corporate management. That’s where job titles, for example, come from. They come from a caste of high priests called “managers” who are anointed by a higher caste called “executives” who derive their legitimacy from a pseudo-democracy of shareholders who (while their financial needs and rights deserve respect) honestly haven’t a clue how to run a company. Now, I wouldn’t advise people to let most corporate executives around their kids, because I’ve known enough in my life to know that most of them aren’t good people. So why are we assigning legitimacy to evaluations coming from such an unreliable and often corrupt source? It makes no sense. It’s a slave mentality.

First scratch at a solution

I don’t think resumes scale. They provide low-signal data, and that fails us in a world where there are just so many of the damn things around that a sub-1% acceptance rate is inevitable. I’m not faulting companies for discarding most resumes that they get. What else would they be expected to do? Most resumes come from unqualified candidates who bulk-mail them. Now that it’s free to send a resume anywhere in the world, a lot of people (and recruiters) spam, and that clogs the channels for everyone. The truth, I think, is that we need to do away with resumes– at least of the current form– altogether.

That’s essentially what has happened in New York and Silicon Valley. You don’t look for jobs by sending cold resumes. You can try it, but it’s usually ineffective, even if you’re one of those “rock star” engineers who is always in demand. Instead, you go to meetups and conferences and meet people in-person. That approach works well, and it’s really the only reliable way to get leads. This is less of an option for someone in Anchorage or Tbilisi, however. What we should be trying to do with technology is to build these “post-resume” search avenues on the Internet– not the same old shit that doesn’t work.

So, all of this said, what are resumes good for? I’ve come to the conclusion that there is one very strong purpose for resumes, and one that justifies not discarding the concept altogether. A resume is a list of things one is willing to be asked about in the context of a job interview. If you put Scala on your resume, you’re making it clear that you’re confident enough in your knowledge of that language to take questions about it, and possibly lose a job offer if you actually don’t know anything about it. I think the “Ask me about <X>” feature of resumes is probably the single saving grace of this otherwise uninformative piece of paper.

If I were to make a naive first scratch at solving this problem, here’s how I’d “futurize” the resume. Companies, titles, and dates all become irrelevant. Leave that clutter off. Likewise, I’d ask that companies drop the requirement nonsense where they put 5 years of experience in a 3-year-old technology as a “must have” bullet point. Since requirement sprawl is “free”, it occurs, and few people actually meet any sufficiently long requirement set to the letter, so that seems to select against people who actually read the requirements. Instead, here’s the lightweight solution: allocate 20 points. (The reason for the number 20 is to impose granularity; fractional points are not allowed.) For example, an engineering candidate might put herself forward like so:

  • Machine learning: 6
  • Functional programming: 5
  • Clojure: 3
  • Project management: 3
  • R: 2
  • Python: 1

These points might seem “meaningless”, because there’s no natural unit for them, but they’re not. What they show, clearly, is that a candidate has a clear interest (and is willing to be grilled for knowledge) in machine learning and functional programming, moderate experience in project management and with Clojure, and a little bit of experience in Python and R. There’s a lot of information there, as long as the allocation of points is done in good faith; if it isn’t, that person won’t pass many interviews. Job requirements would be published in the same way: assign importance to the things according to how much they really matter, and keep the total at 20 points.

Since the points have different meanings on each side– for the employee, they represent fractions of experience; for the company, they represent relative importance– it goes without saying that a person who self-assigns 5 points in a technology isn’t ineligible for a job posting that places an importance of 6 for that technology. Rather, it indicates that there’s a rough match in how much weight each party assigns to that competency. This data could be mined to match employees to job listings for initial interviews and, quite likely, this approach (while imperfect) would perform better than the existing resume-driven regime. What used to involve overwhelmed gatekeepers is now a “simple matter” of unsupervised learning.
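To make the matching idea concrete, here is a minimal sketch in Python. The candidate's allocation is the one from the example above. The two job postings, the helper names (validate, match_score), and the use of cosine similarity as the scoring rule are all my own assumptions for illustration, not a worked-out design. Cosine similarity rewards a similar relative weighting of skills, which fits the point that a 5 on one side and a 6 on the other are a rough match and not a shortfall.

```python
from math import sqrt

TOTAL_POINTS = 20  # the budget proposed in the post


def validate(allocation):
    """Check that an allocation uses whole, positive points summing to 20."""
    for skill, points in allocation.items():
        if not isinstance(points, int) or points <= 0:
            raise ValueError(f"{skill}: points must be a positive integer")
    if sum(allocation.values()) != TOTAL_POINTS:
        raise ValueError(f"points must total {TOTAL_POINTS}")
    return allocation


def match_score(candidate, posting):
    """Cosine similarity between two allocations (an assumed scoring rule)."""
    shared = set(candidate) & set(posting)
    dot = sum(candidate[s] * posting[s] for s in shared)
    norm_c = sqrt(sum(p * p for p in candidate.values()))
    norm_p = sqrt(sum(p * p for p in posting.values()))
    return dot / (norm_c * norm_p)


# The candidate allocation from the example above.
candidate = validate({
    "Machine learning": 6, "Functional programming": 5, "Clojure": 3,
    "Project management": 3, "R": 2, "Python": 1,
})

# Hypothetical job postings, invented purely for illustration.
postings = {
    "ml_role": validate({"Machine learning": 8, "Python": 6, "R": 4, "Statistics": 2}),
    "web_role": validate({"JavaScript": 10, "CSS": 5, "Project management": 5}),
}

# Rank postings from best to worst match for this candidate.
ranked = sorted(postings, key=lambda name: match_score(candidate, postings[name]), reverse=True)
print(ranked)
```

A real system would presumably learn which skills are near-substitutes (Clojure and Scala, say) instead of treating every label as unrelated, but even this crude version puts the machine-learning posting ahead of the web one.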

There is, of course, an obvious problem with this, which is that some people have more industry experience and “deserve” more points. An out-of-college candidate might only deserve 10 points, while a seasoned veteran should get 40 or 50. I’ll admit that I haven’t come up with a good solution for that. It’s a hard problem, because (a) one wants to avoid ageism, while (b) the objective here is sparseness in presentation, and I can’t think of a quick solution that doesn’t clutter the process up with distracting details. What I will concede is that, while some people clearly deserve more points than others do, there’s no fair way to perform that evaluation at an individual level. The job market is a distributed system with numerous adversarial agents, and any attempt to impose a global social status over it will fail, both practically and morally speaking.

Indeed, if there’s something that I find specifically despicable about the current resume-and-referral-driven job search culture, it’s in the attempt to create a global social status when there’s absolutely no good reason for one to exist.

The call for rational economy

If I were to wrap the causes of humanity’s recent political progress (and, quite likely, the scientific and industrial progress that followed) up into two words, I would use rational government. Starting with Machiavelli’s championing of republics (despite his better-known satire, The Prince) in the 16th century, and culminating in the Enlightenment of the 18th, political philosophers began to approach governmental problems with a structural and proto-scientific mindset. The concept that monarchs could rule by “divine right” was discarded, secular governments replaced religious ones, and constitutional government became requisite. It’s easy to take this for granted, but for most of human history, political bodies were ruled by charismatic leaders who would allow no checks against their accumulation of power. “Don’t you trust me, as your rightful king?” (“Well, even if I did, I don’t trust your asshole son who’ll reign after you knock off.”) What makes the 18th-century political philosophers so brilliant is their insight that trusting in people was the wrong way to go, because they tend toward unreliability in the long run as power corrupts and the throne changes hands, and that it was better to build robust structures. Governance, previously existing in the context of a paternalistic reign handed down “from above” and usually justified with incredible supernatural claims, was something people could debate and vote on.

Since then, what we’ve had in the West have been mostly libertarian governments, at least compared to most of human history. It hasn’t been monotonic progress, but the ideal of rational government is clearly winning. We don’t burn heretics. In fact, most governments recognize the concept of a “heretic” as meaningless. This isn’t to say we got it right from the start, and it’s still not perfect– “sodomy” laws and the opposition to gay marriage are one example of pre-rational hangover– but the ideal is well-understood and people are working toward it. Libertarian government is, by and large, the accepted norm among educated people.

Rational government likely emerged as Europeans became more mobile. Interactions among people from different countries and with radically different experiences with governments fostered an interest in comparison. What are the English doing right and wrong? How about the French? The Italian states? As Europeans developed a more complete knowledge of their history and the variety of political structures, the existing patterns began to look ridiculous. The view of hereditary divine right evolved from seeing it as a component of a fixed, natural order to considering it a dangerous, reactionary superstition. Not to overstate my country’s importance as an American, but the United States played a major role in this trend as well. In the late 18th century, a mix of mostly English Europeans attempted to experiment with rational government on the new continent, designing a country that was, from first principles, devoid of hereditary aristocrats or state religion.

What happens when rational, libertarian government (with low corruption) becomes the norm? The good news is that these governments tend to be fair and stable. There isn’t a lot of corruption or rent-seeking by government officials. It exists, but it’s less severe than it would be in a typical theocratic or aristocratic oligarchy. So you get an industrial, capitalistic economy. (For a contrast, the extortions and bribes required in a corrupt oligarchy retard industrial and entrepreneurial progress.) That’s a good thing, but it brings its own sets of problems. One of the major and perennial ones is the ability for businesses to profit, at the expense of the world, if they’re able to externalize costs (e.g. to the environment). There is also the instability, as observed in the 1930s, of hard-line industrial capitalism. Poverty, we learned in the Great Depression, is not some “moral medicine” that makes people better. It’s a cancer that can devour an entire nation. The third problem is that a libertarian government has a hard time curtailing an unchecked corporate elite that emerges in the power vacuum.

Over time, people begin to realize that laissez-faire capitalism is not desirable. This leads to a class of government interventions (social welfare programs, regulation, high taxation rates) typically associated with socialism and, in small doses, there’s no question that they’re an improvement. However, a number of supposedly “socialist” governments have proven themselves to be immensely corrupt, brutal, left-wing authoritarian regimes no better than the right-wing dictatorships of old. I don’t think anyone educated would prefer that extreme (statist, command economies) over the current system. Empirically, they don’t work well (see: North Korea). The question of where on this purported spectrum between statist socialism (left) and laissez-faire capitalism (right) an economy should be remains open.

What is the answer? Well, I think it’s important to look at this with a scientific, data-oriented mindset. I don’t have the kind of data that it would take to find a “closed-form” answer, but let me draw some insights from machine learning and statistics. Modeling approaches tend to be global, local, or some combination of the two. Global methods assert that there is some kind of underlying structure to the problem, and use all the data to build a model. For example, if one were relating latitude to average temperature, a global model would capture the relevant global relationship: polar latitudes tend to be cold, and equatorial latitudes tend to be warm. The vast majority of the earth’s surface would be well-classified by this model. It would, however, misclassify Rome (high latitude, warm) and Mount Kilimanjaro (low latitude, cold). You’d need a richer model (altitude, ocean currents, marine west-coast effects). Linear regression is one of the simplest effective global models, and for a wide variety of problems, it does well. On the other hand, local methods of inference give a high weight to nearby data. The archetypical local method is the “nearest neighbors” approach to inference. This is what real estate appraisers use when they attempt to find a fair value for a house or plot of land: what seems to be the market rate for nearby, comparable property? There’s clearly no simple global model relating positional coordinates to land or location value, so nearby data must be used. The disadvantage that local models have (as opposed to global ones) is the paucity of useful data. For many problems, there just isn’t enough data for local methods to perform well, either because the data is hard to collect or because the space is too convoluted (high dimensionality)– or both. The lesson is that both local and global methods of inference and modeling are valuable, and neither category is uniformly superior. Solving a complex problem usually requires both.
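For readers who haven’t seen the two styles side by side, here is a small Python sketch of the latitude-and-temperature example. The eight data points are made up for illustration (they are not real climate measurements), and the function names fit_line and knn_predict are my own. The global model is an ordinary least-squares line fitted to all of the data; the local model averages only the two nearest neighbors of the query.

```python
# Made-up (latitude in degrees, mean temperature in degrees C) pairs.
# These are illustrative values only, not real measurements.
data = [(0, 27.0), (10, 26.0), (20, 24.0), (30, 20.0), (40, 14.0),
        (50, 9.0), (60, 3.0), (70, -6.0)]


def fit_line(points):
    """Global model: ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def knn_predict(points, x, k=2):
    """Local model: average the y values of the k points nearest to x."""
    nearest = sorted(points, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


intercept, slope = fit_line(data)
query = 45  # a latitude that is not in the data
print("global estimate:", intercept + slope * query)
print("local estimate: ", knn_predict(data, query))
```

The line uses every point and summarizes the whole trend in two numbers, so it cannot represent a Rome or a Kilimanjaro. The neighbor average can follow local quirks, but only where there happens to be nearby data.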

So what do these concepts of global versus local inference have to do with economics? Well, archetypical socialism is a global-minded approach. Certain social justice constraints (“no one should have less than <X>”, “total environmental pollution cannot exceed <Y>”) are set with the intention (at least, the stated intention, as many left-wing governments have been execrably corrupt) of keeping society fair, stable, and sane. The problem is that this can lead toward a command economy, and those do a poor job of solving the fundamental wealth-creation problem: building things people want, but that they don’t have the vision yet to know that they want. Command economies can produce commodities “to spec”, but no command economy could have come up with Google. Capitalism, on the other hand, is fiercely local. It has its own intellectually defensible brand of fairness (right-libertarianism) but no interest in enforcing global social-justice constraints. It doesn’t have the tools, and it has no interest in developing them. What it does extremely well is enable the individual to exploit local information (that command-economy bureaucracies would never acquire, and over which they would never agree on an interpretation) for personal benefit. This is, in effect, what markets are: distributed, computational methods for aggregating trillions of bits of local information into a signal drawn from millions of self-interested actors.

What I intend strongly to convey, of course, is that a modern economy must draw from both columns. Socialist command economies degenerate rapidly, in large part because they must curtail individual freedoms in order to maintain the global structure to which they’re committed. Laissez-faire, on the other hand, diverges. For a while, the use of local information conveys a computational benefit and better economic decisions are made. Unfortunately, this also has a tendency to generate inequality among individuals that, in the long term, has a pernicious effect. Inequality among ideas and companies (aggregations of effort) is a good thing, because it means that bad ideas die and good ones grow in importance. When that’s applied to people, it’s not desirable. A class of economically disenfranchised people emerges, and so does an entrenched, wealthy aristocracy. The modern corporate elite is of the latter category. The incompetence and attitude of entitlement that reside at the top of the American corporate world are truly terrifying.

One of the issues with capitalism and socialism both is that they tend to generate defective versions of the other. It seems to be a natural tendency. Supposedly socialist Russia had crime-ridden, violent black markets– the kind associated with illegal psychoactive drugs in North America– over commodities as staid as light bulbs. A command economy will not eradicate the very natural will to trade, and this creates a market. Making that illegal simply denies participation to law-abiding people, making what markets will exist unregulated and inefficient. On the other hand, American capitalism has generated a perverse socialism-for-the-rich. CEOs’ kids don’t “work their way up” in a meritocracy. Their wages aren’t set by a real market, but via favor-trading within a socially closed network of self-dealing corporate officials. Their daddies buy their educational admissions and resumes and, if they’re truly too stupid to make it on their own despite immense assistance, board-position sinecures at large corporations.

Right-libertarians (the “Tea Party”) blame corporatism on the government– it’s all this damn regulation that creates the corporate problem, they say– but that’s not a useful assessment. What actually happens is that the existing elite wants badly to stay elite and will use its immense resources in order to do so. They aren’t ideologically capitalistic. They would be just as comfortable as the ruling party in a left-wing, nominally “socialistic” tyranny as long as they were at the social apex. What they are is self-protecting. If corrupting governmental and educational institutions is an option to them (and it always is, because most modern corruption is in the form of invitations to parties, not actual wads of cash) they will do it.

Corporate America has generated its own royalty. What is different about 2012 from five or ten or fifty years ago is that people are now cognizant of it. The most interesting right-wing movement in the United States is the nascent Tea Party. While I disagree with them vehemently (as a left-libertarian, and also as one who favors science over emotional argument) I will give them credit for this: at their intellectual core (and, yes, there is one) they are aggressively anti-corporate. Post-2008, Americans get that the Corporate System is not a meritocracy, not rational, and not even real capitalism. It’s designed to provide the best of two systems (socialism and capitalism) for a well-connected social and increasingly hereditary elite, regardless of merit, and the worst of both systems for everyone else. For themselves, they create an economic arrangement in which they can derive enormous personal benefit from random variables that exist in the economy, but at the same time build and jealously guard a private social-welfare system that ensures they stay rich, well-positioned, and well-connected even if they fail. For the rest, they provide mostly downside, displacement, and discomfort. A perfect metaphor for this is air travel. Well-connected people get discounted or free air travel, special lounges in the airport, and access to comfortable private aviation. The rest of us get Soviet-style service and capitalistic price volatility: the worst from both systems.

What’s changing is that people all over the world are beginning to see that we don’t have a rational economy. We have a priesthood caste of executives who rule by their own version of “divine right”, claiming that the (invisible, to most people) network of social support that has placed them represents the “wisdom of the market”. We have a world where the transference of money into power is not only politically accepted, but increasingly seen as socially normal. It’s not called “corruption” anymore when journalists and government officials attend depraved parties in Davos, La Jolla or Aspen; it’s “self-interest”.

So what are we going to do? How do we overthrow the tyranny of position, especially in a world where such entrenchment can masquerade as “reputation”? We now have a world in which private social assistance can be presented as a “talent acquisition” (or “acqui-hire”) when our forefathers at least had the insight to call it “welfare for rich people”. These people are very well-connected and extremely adept at corrupting press and educational institutions in order to make their positions seem legitimate. They’ve created their own variety of rule by divine right, with “God’s will” ascertained in accord with how much money a person has (regardless of how he got it). For one concrete example, people are usually evaluated in a professional context according to job titles. Well, what are these but knighthoods and baronies assessed “from above”, and “up” points toward an entrenched, never-elected social elite who are not so much capitalism’s “market winners” as those best positioned to exploit an increasingly industrial economy. Is there really a difference between “Senior VP at BigCo” and “Thane of Cawdor”? I don’t see one. So why is the former resume gold, while the latter is a laughable anachronism?

I’m running out of time, so I’ll stop bashing the corporates and cut to the chase. The 18th century was when the idea of rational government came to the fore, and it changed everything. People argue that the French Revolution “failed” because it led to Napoleon, but the truth is that Napoleon was quite restrained in comparison to almost all feudal lords, much less absolute monarchs. Progress toward rational government was not monotonic, but once the ideas reached implementation, they couldn’t be rolled back outright. The ideals lived on. They continue, even now, in the darkest and most irrationally governed corners of the earth, such as the Middle East. These concepts of rational government may not be implemented yet, but they are well-known and considered superior among a large number of educated people. I believe that the 21st century is when we’ll start to see real progress toward the rational economy. Why? Because it will be the only thing that can compete in the technological world. Only societies with rational economies and true “meritocracy” will be able to grow their prosperity at a technological (possibly 10+ percent per year) rate.

The Industrial Revolution required rational government, because the theocracies and monarchies of old would never have tolerated the social and economic rise of these upstarts. Change to a technological world will meet similar opposition from our entitled social, nominally “corporate”, elite. I don’t believe in a “Singularity”, but there are phase changes in growth, and the fast-evolving new entrants frequently “win”. Immensely powerful reptiles (dinosaurs) died out, while the small, fast-evolving creatures with mutant sweat glands (mammals) were able to adapt. Tool-using animals were able to control their environment in a way that their predecessors could not, and eventually evolved into the first humans. Awareness of time and future-orientation led to the agrarian revolution, characterized by 0.05 to 1% annual economic growth, and rational government made the industrial (1 to 10% annual growth) world, emerging in the late 18th century, possible. Now, the world is pregnant with a new possibility: a technological world characterized by rapid economic growth, general prosperity instead of poverty and, if we do it right, an end to this sickening tyranny of geography (physical and social) that has rendered most of the world’s population poor. However, we’ll need a different kind of thought to make this possible. We’ll need a world where the right people– technologically-minded people– are making the decisions, and we need an economy that is not only rational, but protects its own rationality. This requires both the protection against divergence (poverty and self-perpetuating, entitled wealth) provided by socialism and the individual, local liberty of capitalism, but it requires something more: a technologically-minded commitment to solving hard problems using approaches (such as, in software, open allocation) that would previously be considered radical.

The Great Discouragement, and how to escape it.

I’ve recently taken an interest in the concept of the technological “Singularity”, referring to the acceleration of economic growth and social change brought along by escalating technological growth, and the potential for extreme growth (thousands of times faster than what exists now) in the future. People sometimes use “exponential” to refer to fast growth, but the reality is that (a) exponential curves do not always grow fast, and (b) economic growth has actually been faster than exponential to this point.

Life is estimated to be nearly 4 billion years old, but sexual reproduction and multicellular life are only about a billion years old. In other words, for most of its time in existence, life was relatively primitive, and growth itself was slow. Organisms themselves could reproduce quickly, but they died just as fast, and the overall change was minimal. This was true until the Cambrian Explosion, about 530 million years ago, when it accelerated. Evolution has been speeding up over time. If we represent “growth” in terms such as energy capture, energy efficiency, and neural complexity, we see that biological evolution has a faster-than-exponential “hockey stick” growth pattern. Growth was very slow for a long time, then the rate sped up.

One might model pre-Cambrian life’s growth rate at below 0.0000001% (note: these numbers are all estimates) per year, but by the age of animals it was closer to 0.000001% per year, or a doubling (of neural sophistication) every 70 million years or so, and several times faster than that in the primate era. Late in the age of animals, creatures such as birds and mammals could adapt rapidly, taking appreciably different forms in a mere few hundred thousand years. With the advent of tools and especially language (which had effects on assortative mating, and created culture) the growth rate, now factoring in culture and organization as well as evolutionary changes, skyrocketed to a blazing 0.00001% per year, in the age of hominids. Then came modern humans.

Data on the economic growth of human society paint a similar picture: accelerating exponential growth. Neolithic humans plodded along at about 0.0004% per year (still an order of magnitude faster than evolutionary change) and with the emergence of agriculture around 10000 B.C.E., that rate sped up, again, to 0.006% per year. This fostered the growth of urban, literate civilization (around 3000 B.C.E.) and that boosted the growth rate to a whopping 0.1% per year, which was the prevailing economic growth rate for the world up until the Renaissance (1400 C.E.).

This level of growth– a doubling every 700 years– is rapid by the standards of most of the Earth’s history. It’s so obscenely fast that many animal and plant species have, unfortunately, been unable to adapt. They’re gone forever, and there’s a credible risk that we do ourselves in as well (although I find that unlikely). Agricultural humans increased their range by miles per year and increased the earth’s carrying capacity by orders of magnitude. Despite this progress, such a rate would be invisible to the people living in this 4,400-year span. No one had the global picture, and human lives aren’t long enough for anyone to have seen the underlying trend of progress, as opposed to the much more severe, local ups and downs. Tribes wiped each other out. Empires rose and fell. Religions were born, died, and were forgotten. Civilizations that grew too fast faced enemies (such as China, which likely would have undergone the Industrial Revolution in the 13th century had it not been susceptible to Mongol invasions). Finally, economic growth that occurred in this era was often absorbed entirely (and then some) by population growth. A convincing case can be made that the average person’s quality of life changed very little from 10000 B.C.E. to 1800 C.E., when economic growth began (for the first time) to outpace population growth.

In the 15th to 17th centuries, growth accelerated to about 0.3 percent per year: triple the baseline agricultural rate. In the 18th century, with the early stages of the Industrial Revolution, the Age of Reason, and the advent of rational government (as observed in the American experiment and French Revolution) it was 0.8 percent per year. By this point, progress was visible. Whether this advancement is desirable has never been without controversy, but by the 18th century, that it was occurring was without question. At that rate of progress, one would see a doubling of the gross world product in a long human life.

Even Malthus, the archetypical futurist pessimist, observed progress in 1798, but he made the mistake of assuming agrarian productivity to be a linear function of time, while correctly observing population growth to be exponential. In fact, economic growth has always been exponential: it was just a very slow (at that time, about 1% per year) exponential function that looked linear. On the other hand, his insight– that population growth would outpace food production capacity, leading to disaster– would have been correct, had the Industrial Revolution (then in its infancy) not accelerated. (Malthusian catastrophes are very common in history.) The gross world product increased more than six-fold in the 19th century, rising at a rate of 1.8 percent per year. Over the 20th, it continued to accelerate, with economic growth at its highest in the 1960s, at 5.7 percent per year– or a doubling every 150 months. We’re now a society that describes lower-than-average but positive growth as a “recession”.
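
The doubling times quoted so far (70 million years, 700 years, 150 months) can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming constant compound annual growth; the function name and the labels in the table are made up for illustration, and the rates are the estimates given in the text.

```python
import math

def doubling_time_years(rate_percent: float) -> float:
    """Years needed to double at a constant compound annual growth rate."""
    r = rate_percent / 100.0
    # log1p keeps precision for the extremely small evolutionary rates.
    return math.log(2) / math.log1p(r)

# Growth rates quoted in the text, in percent per year (all estimates).
quoted_rates = {
    "age of animals": 0.000001,
    "agrarian civilization": 0.1,
    "18th century": 0.8,
    "1960s peak": 5.7,
}

for label, rate in quoted_rates.items():
    years = doubling_time_years(rate)
    print(f"{label}: {rate}% per year doubles in about {years:,.1f} years")
```

Under that assumption, 0.1% per year works out to a doubling in a little under 700 years, and 5.7% per year to a doubling in roughly 12.5 years, which is the 150 months cited above.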

In that sense, we’re also “in decline”. We’ve stopped growing at anything near our 1960s peak rate. We’re now plodding along at about 4.2 percent per year, if the last three decades are any indication. Most countries in the developed world would be happy to grow at half that rate.

The above numbers, and the rapid increase in the growth rate itself, describe the data behind the concept of “The Singularity”. Exponential growth emerges as a consequence of the differential equation, dy/dx = a * y, whose solution is an exponential function. Logistic growth is derived from the related equation dy/dx = a * y * (1 – y/L), where L is an upper limit or “carrying capacity”. Such limitations always exist, but I think that, with regard to economic growth, that limit is very far away– far enough away that we can ignore it for now. However, what we’ve observed is much faster than exponential growth, since the growth rate itself seems to be accelerating (also at a faster than exponential rate). So what is the correct way to model it?

One class of models for such a phenomenon is derived from the differential equation, dy/dx = a*y^(1+b), where b > 0. The solution to this differential equation (power law) is of the form y = C/(D-x)^(1/b), the result of which is that as x -> D, growth becomes infinite. Hence, the name “Singularity”. No one actually believes that economic progress will become literally infinite, but that is a point at which it is assumed we will land comfortably in a post-scarcity, indefinite-lifespan existence. These two concepts are intimately connected and I would consider them identical. Time is the only scarce element in the life of a person middle-class or higher, but extremely so as long as our lifespans are so short compared to the complexity of the modern world (a person only gets to have one or two careers). Additionally, if people live “forever” (by which I mean millions of years, if they wish) then there will be an easy response to not being able to afford something: wait until you can. There will still be differences in status among post-scarcity people (some being at the end of a five-year waiting list for lunar tourism, and with the richest paying a premium for the prestige of having human servants) and probably some people will care deeply about them, but on the whole, I think these differences will be trivial and people will (over time) develop an immunity to the emotional problems of extreme abundance.
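
The finite-time blow-up in this model can be verified by separating variables. This is a sketch of the standard derivation, writing x for time as in the differential equation; the explicit expressions for C and D (and the starting value y_0 at x = 0) are worked out here and do not appear in the text above.

```latex
\begin{align*}
\frac{dy}{dx} &= a\,y^{1+b}, \qquad a > 0,\ b > 0 \\
\int y^{-(1+b)}\,dy &= \int a\,dx \\
-\frac{1}{b}\,y^{-b} &= a\,x - a\,D
  \qquad \text{(constant of integration written as } -aD\text{)} \\
y^{-b} &= a\,b\,(D - x) \\
y &= \frac{C}{(D - x)^{1/b}}, \qquad C = (a\,b)^{-1/b}, \qquad
  D = \frac{1}{a\,b\,y_0^{\,b}} \ \text{ when } y(0) = y_0
\end{align*}
```

Because the exponent 1/b is positive, the denominator goes to zero as x approaches D, so y grows without bound in finite time. Plain exponential growth (the case b = 0) never does this.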

I should note that there are also dystopian Singularity possibilities, such as in The Matrix, in which machines become sentient and overthrow humans. I find this extremely far-fetched, because most artificial intelligence (to date) is still human intelligence applied to difficult statistical problems. We use machines to do things that we’re bad at, like multiply huge matrices in fractions of a second, and analyze game trees at 40-ply depth. I don’t see machines becoming “like us” because we’ll never have a need for them to be so. We’ll replicate functionality we want in order to solve menial tasks (with an increasingly sophisticated category of tasks being considered “menial”) but we won’t replicate the difficult behaviors and needs of humans. I don’t think we’ll fall into the trap of creating a “strong AI” that overthrows us. Sad to say it, but we’ve been quite skilled, over the millennia, at dehumanizing humans (slavery) in the attempt to make ideal workers. The upshot of this is that we’re unlikely to go to the other extreme and attempt to humanize machines. We’ll make them extremely good at performing our grunt work and leave the “human” stuff to ourselves.

Also, I don’t think a “Singularity” (in the sense of infinite growth) is likely, because I don’t think the model that produces a singularity is correct. I think that economic and technical growth are accelerating, and that we may see a post-scarcity, age-less world as early as 2100. That said, the data show deceleration over the past 50 years (from 5-6 percent to 3-4 percent annual growth) so rather than rocketing toward such a world, we seem to be coasting. I would be willing to call the past 40 years, in the developed world, an era of malaise and cultural decline. It’s the Great Discouragement, culminating a decade (2000s) of severe sociological contraction despite economic growth in the middle years, ending with a nightmare recession. What’s going on?

Roughly speaking, I think we can examine, and classify, historical periods by their growth rate, like so:

  • Evolutionary (below 0.0001% per year): 3.6 billion to 1 million BCE. Modern humans not yet on the scene.
  • Pre-Holocene (0.0001% to 0.01% per year): 1 million to 10,000 BCE.
  • Agrarian (0.01 to 1.0% per year): 10,000 BCE to 1800 CE. Most of written human history occurred during this time. Growth was slower than population increase, hence frequent Malthusian conflict. Most labor was coerced.
  • Industrial (1.0 to 10.0% per year): 1800 CE to Present. Following the advent of rational government, increasing scientific literacy, and the curtailment of religious authority, production processes could be measured and improved at rapid rates. Coercive slavery was replaced by semi-coercive wage labor.
  • Technological (10.0 to 100.0+% per year): Future. This rate of growth hasn’t been observed in the world economy as a whole, ever, but we’re seeing it in technology already (Moore’s Law, cost of genome sequencing, data growth, scientific advances). We’re coming into a time where things that were once the domain of wizardry (read: impossible) such as reading other people’s dreams can now be done. In the technological world, labor will be non-coercive, because the labor of highly motivated people is going to be worth 10 to 100 times more than that of poorly motivated people.

Each of these ages has a certain mentality that prospers in it, and that characterizes successful leadership in such a time. In the agrarian era, the world was approximately zero-sum, and the only way for a person to become rich was to enslave others and capture their labor, or kill them and take their resources. In the early industrial era, growth became real, but not fast enough to accommodate people’s material ambitions, creating a sense of continuing necessity for hierarchy, intimidation, and injustice in the working world. In a truly technological era (which we have not yet entered) the work will be so meaningful and rewarding (materially and subjectively) that such control structures won’t be necessary.

In essence, these economic eras diverge radically in their attitudes toward work. Agrarian-era leaders, if they wanted to be rich, could only do so by controlling more people. Kings and warlords were assessed on the size of their armies, chattel, and harems. Industrial-era leaders focused on improving mechanical processes and gaining control of capital. They ended slavery in favor of a freer arrangement, and workplace conditions improved somewhat, but were still coarse. Technological-era leadership doesn’t exist yet, in most of the world, but its focus seems to be on the deployment of human creativity to solve novel problems. In the technological world, a motivated and happy worker isn’t 25 or 50 percent more productive than an average one, but 10 times as effective. As one era evolves into the next, the leadership of the old one proves extremely ineffective.

The clergy and kings of antiquity were quite effective rulers in a world where almost no one could afford books, land was the most important form of wealth, and people needed a literate, historically-aware authority to direct them over what to do with it. Those in authority had a deep understanding of the limitations of the world and the slow rate of human progress: much slower than population growth. They knew that life was pretty close to a zero-sum struggle, and much of religion focuses on humanity’s attempts to come to terms with such a nasty reality. These leaders also knew, in a macabre way, how to handle such a world: control reproduction, gain dominion over land through force, use religion to influence the culture and justify land “ownership”, and curtail population growth in small-scale massacres called “wars” instead of suffering famines or revolutions.

People like Johannes Gutenberg, Martin Luther, John Locke, Adam Smith, and Voltaire, who came late in the agrarian era, changed all that. Books became affordable to middle-class Europeans, and the Reformation happened a couple centuries later. This culminated in the philosophical movement known as The Enlightenment, in which Europe and North America disavowed rule based on “divine right” or heredity and began applying principles of science and philosophy to all areas of life. By 1750, there was a world in which the clerics and landlords of the agrarian era were terrible leaders. They didn’t know the first thing about the industrial world that was appearing right in front of them. Over the next couple hundred years, they were either violently overthrown (as in France) or allowed to decline gracefully out of influence (as in England).

The best political, economic, and scientific minds in that time could see a world that grew at industrial rates that were unheard of until that time. The landowning dinosaurs from the agrarian era died out or lost power. This was not always an attractive picture, of course. One of the foremost conflicts between an industrial and an agrarian society was the American Civil War, an extremely traumatic conflict for both sides. Then there were the nightmarish World Wars of the early 20th century, which established that industrial societies can still be immensely barbaric. That said, the mentalities underlying these wars were not novel, and it wasn’t the industrial era that caused them, so much as it was a case of pre-industrial mentalities combining with industrial power, to very dangerous results.

For example, before Nazism inflamed it, racism in Germany was (although hideous) not unusual by European or world standards, then or at any point up to then. In fact, it was a normal attitude in England, the United States, Japan, and probably all of the other nation-states that were forming around that time. Racism, although I would argue it to be objectively immoral in any era, was a natural byproduct of a world whose leaders saw it necessary, for millennia, to justify dispossession, enslavement, and massacre of strangers. What the 1940s taught us, in an extreme way, is that this hangover from pre-industrial humanity, an execrable pocket of non-Reason that had persisted into industrial time, could not be accepted.

The First Enlightenment began when leading philosophers and statesmen realized that industrial rates of growth were possible in a still mostly agrarian world, and they began to work toward the sort of world in which science and reason could reign. Now we have an industrial economy, but our world is still philosophically, culturally and rationally illiterate, even in the leading ranks. Still, we live on the beginning fringe of what might be (although it is too early to tell) a “Second Enlightenment”. We now have an increasing number of technological thinkers in science and academia. We see such thinking on forums like Hacker News, Quora, and some corners of Reddit. It’s “nerd culture”. However, by and large, the world is still run by industrial minds (and the mentality underlying American religious conservatism is distinctly pre-industrial). This is the malaise that top computer programmers face in their day jobs. They have the talent and inclination to work to turn $1.00 into $2.00 on difficult, “sexy” problems (such as machine learning, bioinformatics, and the sociological problems solved by many startups) but they work for companies and managers that have spent decades perfecting the boring, reliable processes that turn $1.00 into $1.04, and I would guess that this is the kind of work with which 90% of our best technical minds are engaged: boring business bullshit instead of the high-potential R&D work that can actually change the world. The corporate world still thinks in industrial (not technological) terms, and it always will. It’s an industrial-era institution, as much as baronies and totalitarian religion are agrarian-era beasts.

Modern “nerd culture” began in the late 1940s when the U.S. government and various corporations began funding basic research and ambitious engineering and scientific projects. This produced immense prosperity, rapid growth, and an era of optimism and peace. It enabled us to land a man on the moon in 1969. (We haven’t been back since 1972.) It built Silicon Valley. It looked like the transition from industrial to technological society (with 10+ percent annual economic growth) was underway. An American in 1969 might have perceived that the Second Enlightenment was underway, with the Civil Rights Act, enormous amounts of government funding for scientific research, and a society whose leaders were, by and large, focused on ending poverty.

Then… something happened. We forgot where we came from. We took the great infrastructure that a previous generation had built for granted, and let it decay. As the memory of the Gilded Age (brought to us by a parasitic elite) and Great Depression faded, elitism became sexy again. Woodstock, Civil Rights, NASA and “the rising tide that lifts all boats” gave way to Studio 54 and the Reagan Era. Basic research was cut for its lack of short-term profit, and because the “take charge” executives (read: demented simians) that raided their companies couldn’t understand what those people did all day. (They talk about math over their two-hour lunches? They can’t be doing anything important! Fire ’em all!) Academia melted down entirely, with tenure-track jobs becoming very scarce. America lost its collective vision entirely. The 2001 vision of flying cars and robot maids for all was replaced with a shallow and nihilistic individual vision: get as rich as you can, so you have a goddamn lifeboat when this place burns the fuck down.

The United States entered the post-war era as an industrial leader. It rebuilt Europe and Japan after the war, lifted millions out of poverty, made a concerted (if still woefully incomplete) effort to end its own racism, and had enormous technical accomplishments. Yet now it’s in a disgraceful state, with people dying of preventable illnesses because they lack health insurance, and business innovation stagnant except in a few “star cities” with enormous costs of living, where the only things that can get funded are curious but inconsequential sociological experiments. Funding for basic research has collapsed, and the political environment has veered to the far right wing. Barack Obama– who clearly has a Second Enlightenment era mind, if a conservative one in such a frame– has done an admirable job of fighting this trend (and he’s accomplished far more than his detractors, on the left and right, give him credit for) but one man alone cannot hold back the waterfall. The 2008 recession may have been the nadir of the Great Discouragement, or the trough may still be ahead of us. Right now, it’s too early to tell. We’re clearly not out of the mess, however.

How do we escape the Great Discouragement? To put it simply, we need different leadership. If the titans of our world and our time are people who can do no better than to turn $1.00 into $1.04, then we can’t expect more of them. If we let such people dominate our politics, then we’ll have a mediocre world. This is why we need the Second Enlightenment. The First brought us the idea of rational government: authority coming from laws and structure rather than charismatic personalities, heredity, or religious claims. In the developed world, it worked! We don’t have an oppressive government in the United States. (We may have an inefficient one, and we have some very irrational politicians, but the system is shockingly robust when one considers the kinds of charismatic morons who are voted into power on a fairly regular basis.) To the extent that the U.S. government is failing, it’s because the system has been corrupted by the unchecked corporate power that has stepped into the power vacuum created by a limited, libertarian government. Solving the nation’s economic and sociological problems, and the cultural residue associated with a lack of available, affordable education, will take us a long way toward fixing the political issues we have.

The Second Enlightenment will focus on a rational economy and a fair society. We need to apply scientific thought and philosophy to these domains, just as we did for politics in the 1700s when we got rid of our kings and vicars. I don’t know what the solution will end up looking like. Neither pure socialism nor pure capitalism will do: the “right answer” is very likely to be a hybrid of the two. It is clear to me, to some extent, what conditions this achievement will require. We’ll have to eliminate the effects of inherited wealth, accumulated social connection, and the extreme and bizarre tyranny of geography in determining a person’s economic fortune. We’ll have to dismantle the current corporate elite outright; no question on that one. Industrial corporations will still exist, just as agrarian institutions do, but the obscene power held by these well-connected bureaucrats, whose jobs involve no production, will have to disappear. Just as we ended the concept of a king’s “divine right” to rule, turning such people into mere figureheads, we’ll have to do the same with corporate “executives” and their similarly baseless claims to leadership.

We had the right ideas in the Age of Reason, and the victories from that time benefit us to this day, but we have to keep fighting to keep the lights on. If we begin to work at this, we might see post-scarcity humanity in a few generations. If we don’t, we risk driving headlong into another dark age.

Don’t waste your time in crappy startup jobs.

What I’m about to say is true now, as of July 2012. It wasn’t necessarily true 15 years ago, and it may not be true next year. Right now, for most people, it’s utterly correct– enough that I feel compelled to say it. The current VC-funded startup scene, which I’ve affectionately started calling “VC-istan”, is– not to be soft with it– a total waste of time for most of the people involved.

Startups. For all the glamour and “sexiness” associated with the concept, the truth is that startups are no more and no less than what they sound like: new, growing businesses. There are a variety of good and bad reasons to join or start businesses, but for most of human history, it wasn’t viewed as a “sexy” process. Getting incorporated, setting up a payroll system, and hiring accountants are just not inspiring duties for most people. They’re mundane tasks that people are more than willing to do in pursuit of an important goal, but starting a business has not typically been considered to be inherently “sexy”. What changed, after about 1996, is that people started seeing “startups” as an end in themselves. Rather than an awkward growth phase for an emerging, risky business, “startup” became a lifestyle. This was all fine because, for decades, positions at established businesses were systemically overvalued by young talent, and those at growing small companies were undervalued. It made economic sense for ambitious young people to brave the risk of a startup company. Thus, the savviest talent gravitated toward the startups, where they had access to responsibilities and career options that they’d have to wait for years to get in a more traditional setting.

Now, the reverse seems to be true. In 1995, a lot of talented young people went into large corporations because they saw no other option in the private sector– when, in fact, there were credible alternatives, startups being a great option. In 2012, a lot of young talent is going into startups for the same reason: a belief that it’s the only legitimate work opportunity for top talent, and that their careers are likely to stagnate if they work in more established businesses. They’re wrong, I think, and this mistaken belief allows them to be taken advantage of. The typical equity offer for a software engineer is dismally short of what he’s giving up in terms of reduced salary, and the career path offered by startups is not always what it’s made out to be.

For all this, I don’t intend to argue that people shouldn’t join startups. If the offer’s good, and the job looks interesting, it’s worth trying out. I just don’t think that the current, unconditional “startups are awesome!” mentality serves us well. It’s not good for any of us, because there’s no tyrant worse than a peer selling himself short, and right now there are a lot of great people selling themselves very short for a shot at the “startup experience”– whatever that is.

Here are 7 misconceptions about startups that I’d like to dispel.

1. A startup will make you rich. True, for founders, whose equity shares are measured in points. Not true for most employees, who are offered dimes or pennies.

Most equity offerings for engineers are, quite frankly, tiny. A “nickel” (0.05 percent) of an 80-person business is nothing to write home about. It’s not partnership or ownership. Most engineers have the mistaken belief that the initial offering is only a teaser, and that it will be improved once they “prove themselves”, but it’s pretty rare that this actually happens.

Moreover, raises and bonuses are very uncommon in startups. It’s typical for high performers to be making the same salary after 3 years as they earned when they started. (What happens to low performers, and to high performers who fail politically? They get fired, often with no warning or severance.) Substantial equity improvements are even rarer. When things are going well in a startup, the valuation of the equity package is increasing and that is the raise. When things are going badly, that’s the wrong time to be asking for anything.

There are exceptions. One is that, if the company finds itself in very tough straits and can’t afford to pay salaries at all, it will usually grant more equity to employees in order to make up for the direct economic hardship it’s causing them by not being able to pay a salary. This isn’t a good situation, because the equity is usually offered at-valuation (more specifically, at the valuation of the last funding round, when the company was probably in better shape) and typically employees would be better off with the cash. Another is that it’s not atypical for a company to “refresh” or lengthen a vesting period with a proportionate increase. A 0.1% grant, vesting over four years, can be viewed as compensation at 0.025% per year. It’s not atypical for a company to continue that same rate in the years after that. That means that a person spending six years might get up to 0.15%. What is atypical is for an employee brought in with 0.1% to be raised to 1% because of good performance. The only time that happens is when there’s a promotion involved, and internal promotions (more on this, later) are surprisingly rare in startups.
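
The refresh arithmetic above is simple enough to write down. This is a minimal sketch, assuming linear vesting with no cliff and refresh grants that continue at exactly the original annual rate; the function name is made up for illustration, and real grants will have terms this ignores.

```python
def cumulative_equity_percent(initial_grant_percent: float,
                              vesting_years: int,
                              years_employed: int) -> float:
    """Total equity (in percent of the company) accumulated over time.

    Assumes linear vesting and that refreshes after the first vesting
    period continue at the same annual rate as the initial grant.
    """
    annual_rate = initial_grant_percent / vesting_years
    return annual_rate * years_employed

# The example from the text: a 0.1% grant over four years, held for six years.
print(f"{cumulative_equity_percent(0.1, 4, 6):.3f}%")
```

The six-year figure comes out to 0.15%, as stated. Nothing in this schedule gets an employee anywhere near 1% without a new, much larger grant.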

2. The “actual” valuation is several times the official one. This is a common line, repeated both by companies in recruiting and by engineers justifying their decision to work for a startup. (“My total comp. is actually $250,000 because the startup really should be worth $5 billion.”) People love to think they’re smarter than markets. Usually, they aren’t. Moreover, the few who are capable of being smarter than markets are not taking (or trying to convince others to take) junior-level positions where the equity allotment is 0.05% of an unproven business. People who’ve legitimately developed that skill (of reliably outguessing markets) deal at a much higher level than that.

So, when someone says, “the actual valuation should be… “, it’s reasonable to conclude with high probability that this person doesn’t know what the fuck he or she is talking about.

In fact, an engineer’s individual valuation should, by rights, be substantially lower than the valuation at which the round of funding is made. When a VC offers $10 million for 20% of a business, the firm is stating that it believes the company (pre-money) is worth $40 million to them. Now, startup equity is always worth strictly more (and by a substantial amount) to a VC than it is worth to an engineer. So the fair economic value (for an engineer) of a 0.1% slice is probably not $40,000. It might be $10-20,000.

There are several reasons for this disparity of value. First, the VC’s stake gives them control. It gives them board seats, influence over senior management, and the opportunity to hand out a few executive positions to their children or to people whom they owe favors. An engineer’s 0.1% slice, vesting over four years, doesn’t give him any control, respect, or prestige. It’s a lottery ticket, not a vote. Second, startup equity is a high-risk asset, and VCs have a different risk profile from average people. An average person would rather have a guarantee of $2 million than a 50% chance of earning $5 million, even though the expected value of the latter offer is higher. VCs, in general, wouldn’t, because they’re diversified enough to take the higher-expectancy, riskier choices. Third, the engineer has no protection against dilution, and will be on the losing side of any preference structure that the investors have set up (and startups rarely volunteer information pertaining to what preferences exist against common stock, which is what the engineers will have). Fourth, venture capitalists who invest in highly successful businesses get prestige and huge returns on investment, whereas mere employees might get a moderate-sized windfall, but little prestige unless they achieved an executive position. Otherwise, they just worked there.

In truth, startup employees should value equity and options at about one-fourth the valuation that VCs will give it. If they’re giving up $25,000 per year in salary, they should only do so in exchange for $100,000 per year (at current valuation) in equity. Out of a $40-million company with a four-year vesting cycle, that means they should ask for 1%.
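
The numbers in this section can be reproduced with a short calculation. This is a rough sketch, not a valuation method: the one-fourth discount is the rule of thumb argued for above, the function names are made up for illustration, and dilution and liquidation preferences are ignored entirely.

```python
def pre_money_valuation(investment: float, stake_fraction: float) -> float:
    """Pre-money valuation implied by buying `stake_fraction` for `investment`."""
    post_money = investment / stake_fraction
    return post_money - investment

def employee_value(equity_percent: float, valuation: float,
                   employee_discount: float = 0.25) -> float:
    """Rough worth of an equity slice to an employee, applying the
    one-fourth rule of thumb (an assumption, not a market price)."""
    return equity_percent / 100.0 * valuation * employee_discount

def equity_percent_to_ask(salary_cut_per_year: float, vesting_years: int,
                          valuation: float,
                          employee_discount: float = 0.25) -> float:
    """Equity (percent of company) that offsets a salary cut over the vesting period."""
    nominal_equity_per_year = salary_cut_per_year / employee_discount
    return 100.0 * nominal_equity_per_year * vesting_years / valuation

valuation = pre_money_valuation(10_000_000, 0.20)   # $10M for 20%
print(f"pre-money valuation: ${valuation:,.0f}")
print(f"0.1% slice, to an employee: ${employee_value(0.1, valuation):,.0f}")
print(f"ask for: {equity_percent_to_ask(25_000, 4, valuation):.2f}%")
```

With the figures from the text, this gives a $40 million pre-money valuation, about $10,000 of employee-side value for a 0.1% slice, and a 1% ask to offset a $25,000 annual salary cut over four years.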

3. If you join a startup early, you’re a shoo-in for executive positions. Nope.

Points #1-2 aren’t going to surprise many people. Most software engineers know enough math to know that they won’t get filthy rich on their equity grants, but join startups under the belief that coming into the company early will guarantee a VP-level position at the company (at which point compensation will improve) once it’s big. Not so. In fact, one of the best ways not to get a leadership position in a startup is to be there early.

Startups often involve, for engineers, very long hours, rapidly changing requirements, and tight deadlines, which means the quality of the code they write is generally very poor in comparison to what they’d be able to produce in saner conditions. It’s not that they’re bad at their jobs, but that it’s almost impossible to produce quality software under those kinds of deadlines. So code rots quickly in a typical startup environment, especially if requirements and deadlines are being set by a non-technical manager. Three years and 50 employees later, what they’ve built is now a horrific, ad-hoc, legacy system hacked by at least ten people and built under intense deadline pressure, and even the original architects don’t understand it. It may have been a heroic effort to build such a powerful system in so little time, but from an outside perspective, it becomes an embarrassment. It doesn’t make the case for a high-level position.

Those engineers should, by rights, get credit and respect for having built the system in the first place. For all its flaws, if the system works, then the company owes no small part of its success to them. Sadly, though, the “What have you done for me lately?” impulse is strong, and these engineers are typically associated with how their namesake projects end (as deadline-built legacy monstrosities) rather than what it took to produce them.

Moreover, the truth about most VC-funded startups is that they aren’t technically deep, so it seems to most people that it’s marketing rather than technical strength that determines which companies get off the ground and which don’t. The result of this is that the engineer’s job isn’t to build great infrastructure that will last 10 years… because if the company fails on the marketing front, there will be no “in 10 years”. The engineer’s job is to crank out features quickly, and keep the house of cards from falling down long enough to make the next milestone. If this means that he loads up on “technical debt”, that’s what he does.

If the company succeeds, it’s the marketers, executives, and biz-dev people who get most of the glory. The engineers? Well, they did their jobs, but they built that disliked legacy system that “just barely works” and “can’t scale”. Once the company is rich and the social-climbing mentality (of always wanting “better” people) sets in, the programmers will be replaced with more experienced engineers brought in to “scale our infrastructure”. Those new hires will do a better job, not because they’re superior, but because the requirements are better defined and they aren’t working under tight deadline pressure. When they take what the old-timers did and do it properly, with the benefit of learning from history, it looks like they’re simply superior, and managerial blessing shifts to “the new crowd”. The old engineers probably won’t be fired, but they’ll be sidelined, and more and more people will be hired above them.

Furthermore, startups are always short on cash and they rarely have the money to pay for the people they really want, so when they’re negotiating with these people in trying to hire them, they usually offer leadership roles instead. When they go into the scaling phase, they’re typically offering $100,000 to $150,000 per year for an engineer– but trying to hire people who would earn $150,000 to $200,000 at Google or on Wall Street. In order to make their deals palatable, they offer leadership roles, important titles and “freedom from legacy” (which means the political pull to scorched-earth existing infrastructure if they dislike it or it gets in their way) to make up for the difference. If new hires are being offered leadership positions, this leaves few for the old-timers. The end result of this is that the leadership positions that early engineers expect to receive are actually going to be offered away to future hires.

Frankly put, being a J.A.P. (“Just A Programmer”) in a startup is usually a shitty deal. Unless the company makes unusual cultural efforts to respect engineering talent (as Google and Facebook have) it will devolve into the sort of place where people doing hard things (i.e. software engineers) get the blame and the people who are good at marketing themselves advance.

4. In startups, there’s no boss. This one’s patently absurd, but often repeated. Those who champion startups often say that one who goes and “works for a company” ends up slaving away for “a boss” or “working for The Man”, whereas startups are a path to autonomy and financial freedom.

The truth is that almost everyone has a boss, even in startups. CEOs have the board, the VPs and C*Os have the CEO, and the rest have actual, you know, managers. That’s not always a bad thing. A competent manager can do a lot for a person’s career that he wouldn’t realistically be able to do on his own. Still, the idea that joining a startup means not having a boss is just nonsense.

Actually, I think founders often have the worst kind of “boss” in venture capitalists. To explain this, it’s important to note that the U.S. actually has a fairly low “power distance” in professional workplaces– this is not true in all cultures– by which I mean bosses aren’t typically treated as intrinsic social superiors to their direct reports. Yes, they have more power and higher salaries, but they’re also older and typically have been there for longer. A boss who openly treats his reports with contempt, as if he were innately superior, isn’t going to last for very long. Also, difficult bosses can be escaped: take another job. And the most adverse thing they can (legally) do is fire someone, which has the same effect. Beyond that, bosses can’t legally have a long-term negative effect on someone’s career.

With VCs, the power distance is much greater and the sense of social superiority is much stronger. For example, when a company receives funding it is expected to pay both parties’ legal fees. This is only a minor expenditure in most cases, but it exists to send a strong social message: you’re not our kind, dear, and this is what you’ll deal with in order to have the privilege of speaking with us at all. 

This is made worse by the incestuous nature of venture capital, which leads to the worst case of groupthink ever observed in a supposedly progressive, intelligent community. VCs like a startup if other VCs like it. The most well-regarded VCs all know each other, they all talk to each other, and rather than competing for the best deals, they collude. This leaves the venture capitalists holding all the cards. A person who turns down a term sheet with multiple liquidation preferences and participating preferred (disgusting terms that I won’t get into because they border on violence, and I’d prefer this post to be work-safe) is unlikely to get another one.

A manager who presents a prospective employee with a lowball offer and says, “If you don’t take this, I’ll make a phone call and no one in the industry will hire you” is breaking the law. That’s extortion. In venture capital? They don’t have to say this. It’s unspoken that if you turn down a terrible term sheet with a 5x liquidation preference, you’re taking a serious risk that a phone call will be made and that supposedly unrelated interest will dry up as well. That’s why VCs can get away with multiple liquidation preferences and participating preferred.

People who really don’t want to have “a boss” should not be looking into VC-funded startups. There are great, ethical venture capitalists who wouldn’t go within a million miles of the extortive shenanigans I’ve described above. It’s probably true that most are. Even still, the power relationship between a founder and investor is far more lopsided than that between a typical employee and manager. No manager can legally disrupt an employee’s career outside of one firm; but venture capitalists can (and sometimes do) block people from being fundable.

Instead, those who really want not to have a boss should be thinking about smaller “lifestyle” businesses in which they’ll maintain a controlling interest. VC has absolutely no interest in funding these sorts of companies, so this is going to require angel investment or personal savings, but for those who really want that autonomy, I think this is the best way to go.

For all this, what I’ve said here about the relationship between founders and VCs isn’t applicable to typical engineers. An engineer joining a startup larger than about 20 people will have a manager, in practice if not in name. That’s not a bad thing. It’s no worse or better than it would be in any other company. It does make the “no boss” vs. “working for The Man” selling point of startups a bit absurd, though.

5. Engineers at startups will be “changing the world”. With some exceptions, startups are generally not vehicles for world-changing visions. Startups need to think about earning revenue within the existing world, not “changing humanity as we know it”.

“The vision thing” is an aspect of the pitch that is used to convince 22-year-old engineers to work for 65 percent of what they’d earn at a more established company, plus some laughable token equity offering. It’s not real.

The problem with changing the world is that the world doesn’t really want to change, and to the extent that it’s willing to do so, few people have the resources necessary to push for improvements. What fundamental change does occur is usually gradual– not revolutionary– and requires too much cooperation to be forced through by a single agent.

Scientific research changes the world. Large-scale infrastructure projects change the world. Most businesses, on the other hand, are incremental projects, and there’s nothing wrong with that. Startups are not a good vehicle for “changing the world”. What they are excellent at is finding ways to profit from inexorable, pre-existing trends by doing things that (a) have recently become possible, but that (b) no one had thought of doing (or been able to do) before. By doing so, they often improve the world incrementally: they wouldn’t survive if they didn’t provide value to someone. In other words, most of them are application-level concepts that fill out an existing world-changing trend (like the Internet) but not primary drivers. That’s fine, but people should understand that their chances of individually effecting global change, even at a startup, are very small.

6. If you work at a startup, you can be a founder next time around. What I’ve said so far is that it’s usually a shitty deal to be an employee at a startup: you’re taking high risk and low compensation for a job that (probably) won’t make you rich, lead to an executive position, bring great autonomy, or change the world. So what about being a founder? It’s a much better deal. Founders can get rich, and they will make important connections that will set up their careers. So why aren’t more people becoming founders of VC-funded startups? Well, they can’t. Venture capital acceptance rates are well below 1 percent.

The deferred dream is probably the oldest pitch in the book, so this one deserves address. A common pitch delivered to prospective employees in VC-istan is that “this position will set you up to be a founder (or executive) at your next startup”. Frankly, that’s just not true. The only thing that a job can offer that will set a person up with the access necessary to be a founder in the future is investor contact, and a software engineer who insists on investor contact when joining an already-funded startup is going to be laughed out the door as a “prima donna”.

A non-executive position without investor contact at a startup provides no more of the access that a founder will need than any other office job. People who really want to become startup founders are better off working in finance (with an aim at venture capital) or pursuing MBA programs than taking subordinate positions at startups.

7. You’ll learn more in a startup. This last one can be true; I disagree with the contention that it’s always true. Companies tend to regress to the mean as they get bigger, so the outliers on both sides are startups. And there are things that can be learned in the best small companies when they are small that can’t be learned anywhere else. In other words, there are learning opportunities that are very hard to come by outside of a startup.

What’s wrong here is the idea that startup jobs are inherently more educational simply because they exist at startups. There’s genuinely interesting work going on at startups, but there’s also a hell of a lot of grunt work, just like anywhere else. On the whole, I think startups invest less in career development than more established companies. Established companies have had great people leave after 5 years, so they’ve had more than enough time to “get it” on the matter of their best people wanting more challenges. Startups are generally too busy fighting fires, marketing themselves, and expanding to have time to worry about whether their employees are learning.

So… where to go from here?

I am not trying to impart the message that people should not work for startups. Some startups are great companies. Some pay well and offer career advancement opportunities that are unparalleled. Some have really great ideas and, if they can execute, actually will make early employees rich or change the world. People should take jobs at startups, if they’re getting good deals.

Experience has led me to conclude that there isn’t much of a difference in mean quality between large and small companies, but there is a lot more variation in the small ones, for rather obvious reasons. The best and worst companies tend to be startups. The worst ones don’t usually live long enough to become big companies, so there’s a survivorship bias that leads us to think of startups as innately superior. It’s not the case.

As I said, the worst tyrant in a marketplace is a peer selling himself short. Those who take terrible deals aren’t just doing a disservice to themselves, but to all the rest of us as well. The reason young engineers are being offered subordinate J.A.P. jobs with 0.03% equity and poorly-defined career tracks is that there are others who are unwise enough to take them.

In 2012, is there “a bubble” in internet startups? Yes and no. In terms of valuations, I don’t think there’s a bubble. Or, at least, it’s not obvious to me that one exists. I think it’s rare that a person who’s relatively uninformed (such as myself, when it comes to pricing technology companies) can outguess a market, and I see no evidence that the valuations assigned to these companies are unreasonable. Where there is undeniably a bubble is in the extremely high value that young talent is ascribing to subordinate positions at mediocre startups.

So what is a fair deal, and how does a person get one? I’ll give some very basic guidelines.

1. If you’re taking substantial financial risk to work at the company, you’re a Founder. Expect to be treated like one. By “substantial financial risk”, I mean earning less than (a) the baseline cost-of-living in one’s area or (b) 75% of one’s market compensation.

If you’re taking that kind of risk, you’re an investor and you better be seen as a partner. It means you should demand the autonomy and respect given to a founder. It means not to take the job unless there’s investor contact. It means you have a right to know the entire capitalization structure (an inappropriate question for an employee, but a reasonable one for a founder) and determine if it’s fair, in the context of a four-year vesting period. (If the first technical hire gets 1% for doing all the work and the CEO gets 99% because he has the connections, that’s not fair. If the first technical hire gets 1% while the CEO gets 5% and the other 94% has been set aside for employees and investors, and the CEO has been going without salary for a year already, well, that’s much more fair.) It means you should have the right to represent yourself to the public as a Founder.

2. If you have at least 5 years of programming experience and the company isn’t thoroughly “de-risked”, get a VP-level title. An early technical hire is going to be spending most of his time programming– not managing or sitting in meetings or talking with the press as an “executive” would. Most of us (myself included) would consider that arrangement, of getting to program full-time at high productivity, quite desirable. This might make it seem like “official” job titles (except for CEO) don’t matter and that they aren’t worth negotiating for. Wrong.

Titles don’t mean much when there are 4 people at the company. Not in the least. So get that VP-level title locked in now, before it’s valuable and much harder to get. Once there are more than about 25 people, titles start to have real value and for a programmer to ask for a VP title might seem like an unreasonable demand.

People may claim that titles are old-fashioned and useless and elitist, and they often have strong points behind their claims. Still, people in organizations place a high value on institutional consistency (meaning that there’s additional cognitive load for them to contradict the company’s “official” statements, through titles, about the status of its people) and the high status, however superficial and meaningless, conferred by an impressive title can easily become self-perpetuating. As the company becomes larger and more opaque, the benefit conferred by the title increases.

Another benefit of having a VP-level title is the implicit value inherent in being VP of something. It means that one will be interpreted as representing some critical component of the company. It also makes it embarrassing to the top executives and the company if this person isn’t well treated. For an example, let’s take “VP of Culture”. Doesn’t it sound like a total bullshit title? In a small company, it probably is. So meaningless, in fact, that most CEOs would be happy to give it away. “You want to be ‘VP of Culture’, but you’ll be doing the same work for the same salary? By all means.” Yet what does it mean if a CEO berates the VP of Culture? That culture isn’t very important at this company. What about if the VP of Culture is pressured to resign or fired? From a public view, the company just “lost” its VP of Culture. That’s far more indicative than if a “J.A.P.” engineer leaves.

More relevantly, a VP title puts an implicit limit on the number of people who can be hired above a person, because most companies don’t want the image of having 50 of their 70 people being “VP” or “SVP”. It dilutes the title, and makes the company look bloated (except in finance, where “VP” is understood to represent a middling and usually not executive level). If you’re J.A.P., the company is free to hire scads of people above you. If you’re a VP, anyone hired above you has to be at least a VP, if not an SVP, and companies tend to be conservative with those titles once they start to actually matter.

The short story to this is that, yes, titles are important and you should get one if the company’s young and not yet de-risked. People will say that titles don’t mean anything, and that “leadership is action, not position”, and there’s some truth in that, but you want the title nonetheless. Get it early when it doesn’t matter, because someday it will. And if you’re a competent mid-career (5+ years) software engineer and the company’s still establishing itself, then having some VP-level title is a perfectly reasonable term to negotiate.

3. Value your equity or options at one-fourth of the at-valuation level.  This has been discussed above. Because this very risky asset is worth much more to diversified, rich investors than it is to an employee, it should be discounted by a factor of 3-4. This means that it’s only worth it to take a job at $25,000 below market in exchange for $100,000 per year in equity or options (at valuation).

Also worth keeping in mind is that raises and bonuses are uncommon in startups, and that working at a startup can have an effect on one’s salary trajectory. Realistically, a person should assess a startup offer in the light of what he expects to earn over the next 3 to 5 years, not what he can command now.

4. If there’s deferred cash involved, get terms nailed down. This one doesn’t apply to most startups, because it’s an uncommon arrangement after a company is funded for it to be paying deferred cash. Usually, established startups pay a mix of salary and equity.

If deferred cash is involved in the package, it’s important to get a precise agreement on when this payment becomes due. Deferred cash is, in truth, zero-interest debt of the company to the employee. Left to its own devices, no rationally acting company would ever repay a zero-interest loan. So this is important to get figured out. What events make deferred cash due? (Startups never have “enough” money, so “when we have enough” is not valid.) What percentage of a VC-funding round is dedicated to pay off this debt? What about customer revenue? It’s important to get a real contract to figure this out; otherwise, the deferred payment is just a promise, and sadly those aren’t always worth much.

The most important matter to address when it comes to deferred cash is termination, because being owed money by a company one has left (or been fired from) is a mess. No one ever expects to be fired, but good people get fired all the time. In fact, there’s more risk of this in a small company, where transfers tend to be impossible on account of the firm’s small size, and where politics and personality cults can be a lot more unruly than they are in established companies.

Moreover, severance payments are extremely uncommon in startups. Startups don’t fear termination lawsuits, because those take years and startups assume they will either be (a) dead, or (b) very rich by the time any such suit would end– and either way, it doesn’t much matter to them. Being fired in established companies usually involves a notice (“improvement plan”) period (in which anyone intelligent will line up another job) or severance, or both, because established companies really don’t want to deal with termination lawsuits. In startups, people who are fired usually get neither notice nor severance.

People tend to think that the risk of startups is limited to the threat of them going out of business, but the truth is that they also tend to fire a lot more people, and often with less justification for doing so. This isn’t always a bad thing (firing too few people can be just as corrosive as firing too many) but it is a risk people need to be aware of.

I wouldn’t suggest asking for a contractual severance arrangement in negotiation with a startup; that request will almost certainly be denied (and might be taken as cause to rescind the offer). However, if there’s deferred cash involved, I would ask for a contractual agreement that it becomes due immediately in the event of involuntary termination. Day-of, full amount, with the last paycheck.

5. Until the company’s well established (e.g. IPO) don’t accept a “cliff” without a deferred-cash arrangement in event of involuntary termination. The “cliff” is a standard arrangement in VC-funded startups whereby no vesting occurs if the employee leaves or is fired in the first year. The problem with the cliff is that it creates a perverse incentive for the company to fire people before they can collect any equity.

Address the cliff as follows. If the employee is involuntarily terminated, and the cliff is enforced, whatever equity would have vested is converted (at the most recent valuation) to cash and is due upon the date of termination.
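As a rough sketch of what that term would pay out: the code below assumes that “would have vested” means a straight pro-rata share of a four-year grant for the months actually worked. That reading, the function name, and the example figures are all assumptions for illustration; a real contract would have to spell the formula out.

```python
def cliff_cash_due(grant_fraction, company_valuation, months_served,
                   vesting_months=48, cliff_months=12):
    """Cash owed if an employee is involuntarily terminated before the cliff.

    Assumes linear monthly vesting over vesting_months, which is one
    possible reading of "whatever equity would have vested".
    """
    if months_served >= cliff_months:
        return 0.0  # cliff already passed; equity vests normally, nothing to convert
    would_have_vested = grant_fraction * months_served / vesting_months
    return would_have_vested * company_valuation

# Hypothetical case: 1% grant, $40 million valuation, fired after 9 months.
print(f"{cliff_cash_due(0.01, 40_000_000, 9):,.0f}")  # 75,000
```

In this example the company would owe $75,000 on the day of termination, which removes most of the incentive to fire someone in month eleven.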

This is a non-conventional term, and many startups will flat-out refuse it. Fine. Don’t work for them. This is important; the last thing you want is for the company to have an incentive to fire you because of a badly-structured compensation package.

6. Keep moving your career forward. Just being “at a startup” is not enough. The most credible appeal of working at a startup is the opportunity to learn a lot, and while one can, it’s not a guarantee. Startups tend to be more “self-serve” in terms of career development. People who go out of their way to explore and use new technologies and approaches to problems will learn a lot. People who let themselves get stuck with the bulk of the junior-level grunt work won’t.

I think it’s useful to explicitly negotiate project allocation after the first year– once the “cliff” period is over. Raises being rare at startups, the gap between an employee’s market value and actual compensation is only growing as time goes by. The moment a request for a raise is denied is a good time to bring up the fact that you really would like to be working on that neat machine learning project or that you’re really interested in trying out a new approach to a problem the company faces.

7. If blocked on the above, then leave. The above are reasonable demands, but they’re going to meet some refusal because there’s no shortage of young talent that is right now willing to take very unreasonable terms for the chance to work “at a startup”. So expect some percentage of these negotiations to end in denial, even to the point of rescinded job offers. For example, some startup CEOs will balk at the idea that a “mere” programmer, even if he’s the first technical hire, wants investor contact. Well, that’s a sign that he sees you as “J.A.P.” Run, don’t walk, away from him.

People tend to find negotiation to be unpleasant or even dishonorable, but everyone in business negotiates. It’s important. Negotiations are indicative, because in business politeness means little, and so only when you are negotiating with someone do you have a firm sense of how he really sees you. The CEO may pay you a million compliments and make a thousand promises about your bright future in the company, but if he’s not willing to negotiate a good deal, then he really doesn’t see you as amounting to much. So leave, instead of spending a year or two in a go-nowhere startup job.

In the light of this post’s alarmingly high word count, I think I’ll call it here. If the number of special cases and exceptions indicates a lack of a clear message, it’s because there are some startup jobs worth taking, and the last thing I want to do is categorically state that they’re all a waste of time. Don’t get me wrong, because I think most of VC-istan (especially in the so-called “social media” space) is a pointless waste of talent and energy, but there are gems out there waiting to be discovered. Probably. And if no one worked at startups, no one could found startups and there’d be no new companies, and that would suck for everyone. I guess the real message is: take good offers and work good jobs (which seems obvious to the point of uselessness) and the difficulty (as observed in the obscene length of this post) is in determining what’s “good”. That is what I, with my experience and observations, have attempted to do.

Talent has no manager

For a store clerk, his “manager” is a boss: someone who can fire the clerk. For a contrast, an actor’s “manager” works for the actor and can be fired by him. In one context, the manager holds power; in the other, he’s a subordinate. This isn’t a misused word so much as an overloaded one: the same term is being applied to two different relationships. In both cases the person is accurately described as a “manager”, but the relationships are utterly different– opposite, in fact. Why is this? It requires analysis of what it means to be a manager.

A manager is a person entrusted with decisions related to a resource that its owner cannot as easily handle. A hotel’s manager operates the hotel day-to-day; if the owner is a different person, he passively reaps the benefits. For an entertainment personality’s manager, the asset being managed is the person’s reputation and career. In both cases, if the owner of the asset decides that the manager is doing a poor job, the manager is replaced. Managers work for owners. That much is clear. In the context of a low-level employee like a store-clerk, the clerk doesn’t really own anything of value. His labor is replaceable. Although his supervisor is introduced as “his manager”, the reality is that this person is a manager for the store. This person is the store’s manager, and his boss.

The word “boss” (from the Dutch) replaced “master” in the early stages of the Industrial Revolution, because of the latter’s association with chattel slavery. In accord with the euphemism treadmill, “boss” eventually went out of favor as well, replaced by “supervisor”, which was replaced in turn by “manager”. Having a personal manager sounds a lot better than having a boss, but the latter is a more accurate and better term. It may be a blunt word, but it works well for the purpose.

The job of the boss is to represent the interests of the company, not the employee. He or she cannot be expected to serve two masters, just as it would be inappropriate for an attorney to represent both sides of a lawsuit. The result of this is that employees often feel shafted when their “managers” fail to act as their advocates, instead preferring the company’s interests (or, at least, what the manager represents as the corporate interest) over the employee’s own. They shouldn’t feel this way. The boss is doing his job: the one he gets paid for. What’s unfair to the employee is not that his boss prefers the company’s interests over his, but that the employee has no advocate (except himself) with any power. Full responsibility for managing his talent falls on the employee.

Who is talent’s advocate? Generally, there’s none. Talent alone, one might argue, is not very valuable: experience, reputation, and relationships are usually required to unlock it. Because of this disadvantaged initial position, the person with the talent is expected to advocate for himself. Just as it’s dangerous to represent oneself in a court of law, it can be hard to negotiate on one’s own behalf when it comes to career matters. It helps to have an advocate who isn’t risking his personal relationships and reputation in the career process. So a lot of people don’t bother. Most people are underpaid by 10 to 50 percent because they are uncomfortable negotiating better compensation. Their bosses aren’t being evil; these people simply have no advocate and fail to represent themselves. For that, I think compensation is an arena in which employees are actually more fairly treated than in intangibles. Companies can’t legally renege on promised compensation, and basic negotiating skills are often all it takes to get a fair shake there, but they can (and frequently do) use bait-and-switch tactics to lure the best people with promises of more interesting projects than what those people actually end up working on. This is a common way for companies to mislead employees into working for them, protected by the fact that no one wants a 5-month job on his CV.

In the workplace, talent is of high long-term importance. A company that can’t retain talent will face a creeping devaluation of its prestige, mission, and ultimately, its ability to succeed as a business. For this reason, there are a few progressive managers who advocate on the behalf of talent, at least in the abstract, because they know it to be important to the general interest of the company as much as it is for talented subordinates. This is admirable, but it should be considered an “extracurricular” effort, as it’s one that these managers take on at their own risk. When these efforts fail to show short-term (one quarter) results, the jobs of those who pushed for them end up on the line.

The reality is that this progressive attitude is quite rare. Most managers (who themselves lack advocates, except themselves) are just as worried about keeping their jobs as the people they manage, and aren’t comfortable advocating for interests other than those that they’re required to represent. Companies give lip service to “mentorship” and career development, but often these are just ad copy, not real commitments. What looks like a progressive company is usually an adept marketing department. Moreover, most workplace perks are pure vanity. “Catered lunches” are a nice benefit worth a few thousand dollars per year, largely provided to reduce lunch times and portions (people who eat out are served large portions and become measurably less productive for two hours). That’s not a bad thing, but it’s not given out of altruism. Moreover, perks like an in-office XBox or foosball table are just clumsily-applied band-aids. Real professionals go to work for the work, not the diversions.

As I said, the boss cannot (even if he’d so desire) advocate for subordinate talent because this would cause a conflict of interest between his professional duty to the company’s owners (or their proxies, who are his managers) and this ancillary role. It is also difficult, in a “lean” (euphemism for “we overwork managers”) environment where it’s typical for a manager to have 15 to 20 reports, for the manager to represent the interests of all the people under him. In practice, these “flat” organizations lead to necessary favoritism imposed by the clogged communication channels, while bosses who take “proteges” usually find that their disfavored subordinates decline in productivity and loyalty, which reduces the team’s performance on the whole. The result is that the manager must be disinterested and impersonal with all reports, so career advancement through typical channels is difficult if not impossible. “Extra-hierarchical” work (collaboration with people outside of one’s reporting structure) can be far more effective, because people tend to favor those who help them out but aren’t required to do so, but this effort also makes many managers feel threatened (it seems disloyal, and creates the appearance of someone attempting to engineer a transfer, and managers whose best reports are transferring lose face with their bosses).

If talent has no advocate, does this mean that the interests of talent are ignored? No, but they’re addressed in an often ineffective, far-too-late, way. A talented person’s best move, in 90 percent of organizations, is to find another job in another company. Of course, people are free to do this, and often should, but constant churn is bad for the organization, and leads to a long-term arrangement in which the needs and desires of talent are ignored: if employees are going to leave after 6 months, why invest in them? Alternatively, a talent revolt is often manifest in reduced productivity, which reduces talent’s leverage in negotiation and leads an organization to conclude that talented people are “troublemakers” and that hiring the best people isn’t worth it in the first place.

The position of talent is especially tenuous because it’s a dangerous asset to hold. If every thousand dollars in cash increased a person’s risk of mental illness and interpersonal failure by 0.01 percent just by virtue of existing, those who might be billionaires would either give the shit away or burn it. Of course, this isn’t the case. Tangible financial assets– real estate, wealth, ownership in productive enterprises– are largely inert in terms of “mana burn” (the tendency to inflict harm if unused). They are at constant risk of being diminished on the market, and this may be a source of anxiety for some people, but the only thing they can lose is their own value. Talent, on the other hand, becomes extremely detrimental if unused. A millionaire “trust fund kid” working jobs below his means (as an underpaid arts worker in Williamsburg, when his father could easily get him a “boring” but cushy and lucrative position as a junior executive) is not going to be especially unhappy working jobs that are “below him”, especially because the situation can be improved at any time. On the other hand, a person of high talent trapped in a mediocre career will only fall farther. Perversely, although it’s easier to find an advocate or manager for a building or a business, talent needs it more.

The role of “talent advocate”, I believe, is unfulfilled. A boss cannot fulfill this role without entering into a conflict of interest that endangers his career. Companies’ HR departments, I believe, are useless toward this purpose as well. HR has an “eros” (hiring and advancement) and a “thanatos” (firing and risk mitigation) component. The first of these sub-departments works for the company’s management: often they mislead people into joining teams or companies with undeliverable promises of career advancement and work quality, not because they are malicious, but because they do not have the resources (or the duty) to investigate promises made by the managers for whom they work. An in-house recruiter can’t be expected to know that a position being advertised as “a Clojure job” is 90% Java. The second half of HR works for the firm’s attorneys, finance department, and public relations office, and its purpose is (a) to encourage failing employees to leave the company before formal termination, and (b) to prevent disgruntled or terminated employees from suing or disparaging the company in the future. As for the advancement of talented people already in the company, managers are trusted (not always wisely) to handle this on their own. This leaves nothing in a company’s HR department that can advocate for talent. It would, arguably, go against their professional duty for them to do so.

Talent needs an advocate independent of any specific company, since its best move is often to leave a disloyal or detrimental company outright. I believe that requirement of independence is quite clear, since companies’ obligations are to shareholders only and managers’ obligations are solely to their companies. (That most middle managers, in practice, place their career interests above both those of their subordinates and of their companies is an issue I will not address for now.) Independent recruiters, one thinks, might fulfill this role. Do they? My experience has been that I do better as my own advocate than when using a recruiter. As recruiters collect a percentage of a first-year salary, they aren’t incented to act in the employee’s or even the employer’s long term interests. They are paid for putting people in roles that last at least 12 months, but not for looking out for the employee’s career interests (which may involve a 10-year career at one company, or it might involve jumping ship almost immediately). Of course, there are good recruiters out there who truly value the long-term interests of the people they place; it’s just that my memory (and, to be fair, I haven’t used one since I was a 23-year-old nobody) is that there are far more ineffective or just plain bad ones, focused on quantity in job placement rather than quality. It’s not surprising for it to be this way, since job quality (holding a person’s level of skill constant) is only loosely correlated to compensation, based on which recruiters are paid. Since it’s companies that write recruiters’ checks, it shouldn’t be surprising where their alliances lie.

Talent may be more valuable than financial resources, but it’s harder to discover and it’s far more illiquid. A company can write a $25,000 check to a recruiter, while a talented person can’t easily pay the recruiter with “$25,000 worth” of talent. Financial assets can be sliced into pieces of any desired size that are useful to anyone, so recruiters can be paid with those. Talent can’t. A recruiter cannot feed his family with 100 hours’ worth of server software. (“Tonight, we’re having fried Scala with NoSQL for dessert.”)

A possible improvement would be for recruiters to be compensated based on the “delta”, or the amount by which they improve their clients’ salaries. This would be like the pay-for-performance model by which hedge fund managers are compensated: a small percentage of assets (usually 2%) and a larger percentage of profits (often 20%). In other words, instead of collecting a flat percentage of the first year salary (15%) the recruiter could be compensated based on the hire’s long-term performance. This might give recruiters a long-term incentive to place people in positions where they are likely to succeed for the long term. Would it encourage recruiters to fill the badly-needed role of talent advocate? I’m not sure. It might just incent recruiters to find high-paying but awful jobs for their clients.
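To make the two fee models concrete, here is a minimal sketch in OCaml. The 15% flat rate comes from the paragraph above; the 20% delta rate is borrowed from the hedge-fund analogy as an assumption, and the salaries and the function names `flat_fee` and `delta_fee` are hypothetical, chosen only for illustration.

```ocaml
(* Fees in whole dollars; rates in percent. All figures are illustrative. *)

(* Flat model: a fixed percentage of the first-year salary. *)
let flat_fee ~rate_pct ~new_salary = new_salary * rate_pct / 100

(* Delta model: the recruiter is paid only on the improvement, and
   nothing at all if the new salary is not higher than the old one. *)
let delta_fee ~rate_pct ~old_salary ~new_salary =
  max 0 (new_salary - old_salary) * rate_pct / 100

let () =
  let old_salary = 100_000 and new_salary = 120_000 in
  Printf.printf "flat 15%% fee:  $%d\n" (flat_fee ~rate_pct:15 ~new_salary);
  Printf.printf "delta 20%% fee: $%d\n"
    (delta_fee ~rate_pct:20 ~old_salary ~new_salary)
```

Under these assumed numbers the flat fee is $18,000 and the delta fee is $4,000, which hints at why a delta model would need a much higher rate (or a longer payout horizon) before recruiters would accept it.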

One of the difficulties associated with the talent-advocate role is that it requires the ability to assess talent. Having a talent is generally a necessary, but not sufficient, condition for being able to detect it in others. What this means is that the best talent advocates are going to be people who, themselves, have those skills and abilities. Since currency of technical skills is highly relevant, it’s best that they keep their skills up-to-date as well. Talent advocates, in other words, need to have the talent they intend to represent in order to understand what people with that kind of creativity (a) are and are not capable of, and (b) need from an employer to be motivated and successful. This requirement that the talent advocate be involved in the work for which he advocates makes a full-time recruiting effort unlikely, but without a full-time effort, it’s unlikely that the talent advocate can acquire the connections (to employers) that are necessary to place people in the best positions. In short, this is a very hard role to fill. I can’t see an easy solution.

For the time being, talent must be its own advocate and its own “manager”. This leaves us with what we already know.

Why you can’t hire good Java developers.

Before I begin, the title of my essay deserves explanation. I am not saying, “There are no good Java developers.” That would be inflammatory and false. Nor am I saying it’s impossible to hire one, or even three, strong Java developers for an especially compelling project. What I will say is this: in the long or even medium term, if the house language is Java, it becomes nearly impossible to establish a hiring process that reliably pulls in strong developers with the very-low false-positive rates (ideally, below 5%) that technology companies require.

What I won’t discuss (at least, not at length) are the difficulties in attracting good developers for Java positions, although those are significant, because most skilled software developers have been exposed to a number of programming languages and Java rarely emerges as the favorite. That’s an issue for another post. In spite of that problem, Google and the leading investment banks have the resources necessary to bring top talent to work with uninspiring tools, and one willing to compete with them on compensation will find this difficulty surmountable. Nor will I discuss why top developers find Java uninspiring and tedious; that also deserves its own post (or five). So I’ll assume, for simplicity, that attracting top developers is not a problem for the reader, and focus on the difficulties Java creates in selecting them.

In building a technology team, false positives (in hiring) are considered almost intolerable. If 1 unit represents the contribution of a median engineer, the productivity of the best engineers is 5 to 20 units, and that of the worst can be -10 to -50 (in part, because the frankly incompetent absorb the time and morale of the best developers). In computer programming, making a bad hire (and I mean a merely incompetent one, not even a malicious or unethical person) isn’t a minor mistake as it is in most fields. Rather, a bad hire can derail a project and, for small businesses, sink a company. For this reason, technical interviews at leading companies tend to be very intensive. A typical technology company will use a phone screen as a filter (a surprising percentage of people with impressive CVs can’t think mathematically or solve problems in code, and phone screens shut them out) followed by a code sample, and, after this, an all-day in-office interview involving design questions, assessment of “fit” and personality, and quick problem-solving questions. “White board” coding questions may be used, but those are generally less intensive (due to time constraints) than even the smallest “real world” coding tasks. Those tend to fall closer to the general-intelligence/”on-your-feet” problem-solving questions than to coding challenges.
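A quick back-of-the-envelope calculation shows why even a modest false-positive rate hurts. The sketch below is only an illustration: it assumes every sound hire is a median engineer (1 unit) and every bad hire sits at the mild end of the range above (-10 units). The function name `expected_contribution` and the rates tried are hypothetical.

```ocaml
(* Expected contribution of one hire, in "median engineer" units.
   p_bad : probability that the hire is a false positive
   good  : contribution of a sound hire
   bad   : contribution of a bad hire *)
let expected_contribution ~p_bad ~good ~bad =
  ((1.0 -. p_bad) *. good) +. (p_bad *. bad)

let () =
  List.iter
    (fun p_bad ->
      (* Median sound hire (1 unit) against the mildest bad hire (-10). *)
      let e = expected_contribution ~p_bad ~good:1.0 ~bad:(-10.0) in
      Printf.printf "false-positive rate %.2f -> expected %.2f units\n" p_bad e)
    [ 0.01; 0.05; 0.10 ]
```

With these assumptions, a 5% false-positive rate cuts the expected value of a hire to roughly 0.45 units, and at 10% the expectation is slightly negative: on average, each hire costs the team more than it adds.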

For this reason, a code sample is essential in a software company’s hiring process. It can come from an open-source effort, a personal “side project”, or even a (contrived) engineering challenge. It will generally be between 100 and 500 lines of code (any more than 500 can’t be read in one sitting by most people). The code’s greater purpose is irrelevant– but the scope of the sample must be sufficient to determine whether the person writes quality code “in the large” as well as for small projects. Does the person have architectural sense, or use brute-force inelegant solutions that will be impossible for others to maintain? Without the code sample, a non-negligible false-positive rate (somewhere around 5 to 10%, in my experience) is inevitable.

This is where Java fails: the code sample. With 200 lines of Python or Scala code, it’s generally quite easy to tell how skilled a developer is and to get a general sense of his architectural ability, because 200 lines of code in these languages can express substantial functionality. With Java, that’s not the case: a 200-line code sample (barely enough to solve a “toy” problem) provides absolutely no information about whether a job candidate will solve problems in an infrastructurally sound way, or will instead create the next generation’s legacy horrors. The reasons for this are as follows. First, Java is tediously verbose, which means that 200 lines of code in it contain as much information as 20-50 lines of code in a more expressive language. There just isn’t much there there. Second, in Java, bad and good code look pretty much the same: one actually has to read an implementation of “the Visitor pattern” for detail to know if it was used correctly and soundly. Third, Java’s “everything is a class” ideology means that people don’t write programs but classes, and that even mid-sized Java programs are, in fact, domain-specific languages (DSLs)– usually promiscuously strewn about the file system because of Java’s congruence requirements around class and package names. Most Java developers solve larger problems by creating utterly terrible DSLs, but this breakdown behavior simply doesn’t show up on the scale of a typical code sample (at most, 500 lines of code).

The result of all this is that it’s economically infeasible to separate good and bad Java developers based on their code. White-board problems? Code samples? Not enough signal, if the language is Java. CVs? Even less signal there. The result is that any Java shop is going to have to filter on something other than coding ability (usually, the learned skill of passing interviews). In finance, that filter is general intelligence as measured by “brainteaser” interviews. The problem here is that general intelligence, although important, does not guarantee that someone can write decent software. So that approach works for financial employers because they have uses (trading and “quant” research) for high-IQ people who can’t code, but not for typical technology companies that rely on a uniformly high quality in the software they create.

Java’s verbosity makes the most critical aspect of software hiring– reading the candidates’ code not only for correctness (which can be checked automatically) but architectural quality– impossible unless one is willing to dedicate immense and precious resources (the time of the best engineers) to the problem, and to request very large (1000+ lines of code) code samples. So for Java positions, this just isn’t done– it can’t be done. This is to the advantage of incompetent Java developers, who with practice at “white-boarding” can sneak into elite software companies, but to the deep disadvantage of companies that use the language.

Of course, strong Java engineers exist, and it’s possible to hire a few. One might even get lucky and hire seven or eight great Java engineers before bringing on the first dud. Stranger things have happened. But establishing a robust and reliable hiring process requires that candidate code be read for quality before a decision is made. In a verbose language like Java, it’s not economical (few companies can afford to dedicate 25+ percent of engineering time to reading job candidates’ code samples) and therefore, it rarely happens. This makes an uncomfortably high false-positive rate, in the long term, inevitable when hiring for Java positions.

REPL or Fail

I wrote a post last week on the idealized trajectory of a software engineer in general professional ability, and I find the 3-point scale (with one decimal point) that I developed to be quite useful. It describes the transition from additive to multiplicative contributions to a team, with 1.0 representing the baseline competence (a net adder, rather than a subtracter) of a professional programmer, 1.2 being about average for the industry, 1.5 being seriously good (“senior” in most contexts) and 2.0 representing a consistent multiplier and technical leader. Sadly, one of the more common roadblocks occurs early (around 1.0 to 1.2) and for most developers, and it’s often associated with established programming languages like Java, C++, and Visual Basic. What’s going on? Why do so many programmers reach a hard ceiling, with some persisting there for decades, while others pass through this barrier with ease? What is it about certain languages or technologies that holds programmers back?

Programming is a two-class society. We have the mere “coders” who use only one language, hate the command line, and don’t program outside of work. They typically work on bland, “enterprise” projects and solve annoyingly detailed, but not difficult, problems. They generally find programming to be a boring task, but “it pays the bills” and the other smart-person route to management (actuarial science) involves hard exams. If they remain “in programming”, they’re lucky to break six figures when they’re 40, and they are likely to face age discrimination and layoffs in the decade after that. On my scale, they plateau around 1.2 because, in enterprise programming, those who reach 1.3-1.4 are usually brought into management. For a contrast, the other class is comprised of elite “hackers” who prefer languages like Python, Scala and Erlang (although some might have to use Java) and who program outside of work, who are hotly desired by huge companies and startups alike, and who continue growing even into old age. These usually reach 1.3 in their first few years as professional programmers, and reliably break 1.5 by mid-career. What separates the two? What is it about a programming language that makes it highly indicative of a software engineer’s future progress (or lack thereof)?

It’s evident that this problem is outside of languages themselves, because a 1.5 programmer can, after some adjustment, program at a comparable level in Java. So it’s actually not the case (aside from an opportunity cost argument) that Java and C++ make people worse programmers. Rather, something happens in other, more modern, languages that makes people improve faster and helps them smash through that 1.2 barrier. So, what is it? I spent much time trying to figure this out, and when I came upon the answer, it was so simple it shocked me: The Mighty REPL.

Modern programming languages supply an interactive mode, also known as a “read-eval-print loop” or REPL, as part of the programming environment. The REPL allows programmers to try out code and get immediate results, and to explore existing software modules interactively by calling their functions and seeing what they do. Although typically associated with interpreted languages (notably, Lisp) REPLs are supplied for compiled languages such as Scala, Ocaml, and Haskell. (They do not provide the speed of compiled code, but code is not run for speed in the interactive mode.) This tight feedback loop facilitates a style of development and code exploration that is far more engaging and effective than the processes of writing and reading code would be without a REPL, and far superior to anything available from an IDE. (A REPL is, in some sense, a simple but highly effective IDE. Properly built, it makes IDEs unnecessary.)

Programmer productivity is binary, in that a programmer is either in a productive, engaged state of “flow” or in a disorganized, unpleasant, and unproductive state out of flow where 10% as much (if that) is accomplished. (A significant amount of work stress, in my estimation, is caused by a self-inflicted sense of pressure to be productive while one is out of flow.) It takes programmers about 15 to 30 minutes to enter flow, but once in it, they are immensely productive and moreover, quite happy. Developers usually write the most code and their best code when in flow, and are best at reading code when in this state as well. There’s a problem: reading other peoples’ code, if the code presents a lot of accidental complexity obscuring the question the reader is trying to answer, often shatters flow (which is why bad code is hated with a passion that only programmers understand).

Programmers understand flow and its importance from experience, and flow becomes more important as one increases one’s programming skills (to the point that 2.0+ programmers, rather than negotiating for higher salaries, tend to negotiate perks oriented toward flow and engagement, such as a quiet working space and an unconditional right to turn down meetings). An experienced programmer can work at a 1.0 level (cranking out code of nominal additive value) without inspiration, but 1.5+ and especially 2.0+ level contributions require creativity and focus. So experienced, elite programmers know that a REPL-less language is generally a dead-end. In a “green field” environment where the programmer controls the entire context in which his work exists, engaged writing of code in languages like Java is still possible– but engaged reading of code is out of the question. The engaging way to read code is to get a big-picture sense of it through interaction, and then to examine the code for implementation details strictly after this has been achieved.

This, in my mind, singularly explains the ceiling that Java developers hit around 1.2. By some non-satisfactory definition akin to Turing-completeness for programming languages, a 1.2 programmer has the knowledge and resources to “solve any programming problem” (ignoring performance and feasibility concerns). The solution may be inelegant, slow, unmaintainable, and even bug-prone, and it might take a long time for the solution to be delivered, but there aren’t programming problems that a 1.2 (or even 1.0) engineer “can’t solve”. On the other hand, far more interesting is how people solve problems, and some of the things that programmers must do if they want to become elite (1.5+) programmers are (a) figure out which problems are worth solving, and (b) learn how to read solutions that other people have created. To become a great programmer, one must be able to read code (in an engaged state of flow). Moreover, one must read good code, a commodity that is depressingly rare in this industry.

Reading code can be an immense joy that produces “Aha!” moments, or it can be hellishly tedious and unfruitful. Sadly, most real-world code (especially in languages like Java and C++) is closer to the latter extreme– it’s probably over 90 percent. Code rots for a variety of reasons. One is the wine/sewage problem (“a teaspoon of sewage in a barrel of wine makes a barrel of sewage, a teaspoon of wine in a barrel of sewage makes a barrel of sewage”): if a system is corrupted and the nastiness isn’t aggressively refactored, the kludges will beget counterkludges in “maintenance” and destroy the whole system. A related issue is the “broken windows” effect: tolerance of ugliness leads to a sense of abandon, and this is more common than most programmers will admit. Modifying code in a reasonable way (i.e. one that doesn’t, despite solving an immediate bug or adding a specific feature, make the general quality of the code worse) requires understanding it, and that usually involves reading it, and most code is so terrible that programmers who have to use, extend or maintain it just give up on comprehension and “hack” it as far as they can. Programming in this style is akin to the “Jenga” game, where players must remove planks from a tower and place them on top of it, making the structure less sturdy and higher as they go (until it collapses, and the player whose turn it is loses).

There’s no silver bullet for code comprehension, but the REPL is the closest thing. The worst code may remain impenetrable, but few real projects begin their existence as incomprehensible legacy nightmares; they usually start out as average code for a project of that size. REPLs make it possible to explore average-case code and comprehend it without dedicating massive amounts of time to the process, and this makes a huge difference. Aging modules can be refactored in the earliest stages of decay, long before they get anywhere near the “legacy horror” state. By enabling interactive peeking and poking of functions, REPLs allow programmers to explore libraries and get a sense of their interfaces. For example, in Ocaml, it’s possible to get the full type signature of any module:

# module L = List;;
module L :
  sig
    val length : 'a list -> int
    val hd : 'a list -> 'a
    val tl : 'a list -> 'a list
    val nth : 'a list -> int -> 'a
    val rev : 'a list -> 'a list
    val append : 'a list -> 'a list -> 'a list
    val rev_append : 'a list -> 'a list -> 'a list
    val concat : 'a list list -> 'a list
    val flatten : 'a list list -> 'a list
    val iter : ('a -> unit) -> 'a list -> unit
    val map : ('a -> 'b) -> 'a list -> 'b list
    [...]
  end

From these type signatures, it’s relatively easy to get a sense of what these functions do and to test those intuitions:

# List.length [1; 1; 2; 3; 5; 8];;
- : int = 6
# List.map (fun x -> x*x) [1; 1; 2; 3; 5; 8];;
- : int list = [1; 1; 4; 9; 25; 64]
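The same kind of exploration can be written down as a small program. The sketch below checks a few more guesses about the functions in the signature above; the list `fib` and the particular guesses are my own illustration, and each `assert` could just as well be typed at the toplevel one expression at a time.

```ocaml
(* Guesses about List functions, inferred from their type signatures
   and then checked by running them. *)
let fib = [1; 1; 2; 3; 5; 8]

let () =
  assert (List.hd fib = 1);                      (* first element *)
  assert (List.tl fib = [1; 2; 3; 5; 8]);        (* everything after it *)
  assert (List.nth fib 4 = 5);                   (* indexing starts at 0 *)
  assert (List.rev fib = [8; 5; 3; 2; 1; 1]);
  (* rev_append reverses its first argument onto the front of the second. *)
  assert (List.rev_append [3; 2; 1] [4; 5] = [1; 2; 3; 4; 5]);
  assert (List.concat [[1; 1]; [2; 3]; [5; 8]] = fib);
  print_endline "all intuitions held"
```

A wrong guess (say, expecting `List.nth` to count from 1) fails immediately with an `Assert_failure`, which is exactly the kind of fast feedback the REPL provides.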

With Ocaml’s powerful REPL, a person can explore code and get a sense of the big picture before starting to read it. That makes a huge difference: reading code is an order of magnitude easier and more engaging when one understands what one is looking at. Moreover, many Lisps such as Clojure and Common Lisp provide documentation functions at the REPL that allow the user to read a function’s documentation without having to leave the command-line. This provides all of the benefits of an IDE, without the flow-breaking drawbacks.

For an aside, there’s something that elite programmers (1.5+) call “keyboard snobbery”: IDEs are scorned, while the key-combos of emacs and vim are venerated, even with anachronistic names like “meta” for the escape key. The command line interface is highly valued. This doesn’t apply to all computer use (when web surfing, keyboard snobs use the mouse like anyone else) but it does apply to writing and reading code. Why? Because the mouse is physical, continuous and imprecise, while the keyboard is cerebral, discrete and exact (and therefore a better tool when programming). When we use web pages, we trust the developer to handle the imprecision (in determining whether a button was clicked, and in interpreting the mouse event). This is fine for this purpose, but when we’re writing code, we want exactitude and total control of our interaction with the machine. We want exactly the result we expect at all times. So that’s why we prefer the keyboard when coding, but there’s something else going on as well. Switching from keyboard to mouse doesn’t only involve a move of the hand. It reframes the interaction between the human and the machine, and that’s a context switch. Seemingly benign context switches inflict major drag on programmer productivity. Switching to the mouse because one’s IDE requires it? That’s 3-5 minutes. Pinging about the filesystem because of some stupid requirement that each class live in its own file? That’s about ten minutes. Managerial interruption? An hour, and half a day if the meeting is unexpected and intense. Programmers hate being nickel-and-dimed by context switches, and they hate being out of flow. This is why seriously good programmers prefer “archaic” tools like the command-line interface, vim and emacs, while considering typical mouse-driven modern IDEs (which are necessary if one is developing in a verbose basketcase of a language like Java, but unnecessary in better languages) to be useless.

The REPL, served at the commandline, allows a person to interact with code as if it were live, and see what the pieces do. At least half of what we must do as programmers is comprehension of assets that other people have created, and the REPL allows us to do this without the painful context switch associated with having to read code cold. It enables “flowful” (that is, engaging) exploration and, later, “lazy reading” of code. (Lazy, in this sense, is a non-pejorative computer science term associated with doing only the work needed to solve a problem.) That’s something REPL-less languages can’t provide, because in them, code is a dead static thing that might be run against some dead static tests, not something a developer can interact with as he works.

Reading code is part of the job description of any programmer, and yet it’s rarely done well because enterprise languages like Java make the process so dismal that most people just give up, falling into abominable development practices. When 90 percent of the code is tedious boilerplate (accidental complexity) that isn’t worth the eye strain, it’s easy to miss crucial details. The accumulation of missed details leads to frank incomprehension quickly, and then development practices akin to “throwing mud at the wall and seeing what sticks” become the norm.

This, I believe, is what holds back most Java developers’ progress. Not only does code in such languages become horrible quickly, but the environment makes it unpleasant to read even “good” (by which I mean “above average for the language”) code. In fact, most IDEs tacitly assume that no one is going to bother to read code after it is first written, and adjust accordingly.

For a contrast, this is something Ocaml got right in a major way. Ocaml is an obscure “niche” language, but it has the highest average quality of programmers that I’ve ever seen (even higher than Haskell and Lisp, although those are close). I don’t believe the reason for this is that only good developers can use Ocaml. Instead, what Ocaml achieves is that it makes it a joy to read average-case code– no small feat. Pattern matching, a core feature of the language, is explicitly designed to make what would otherwise be complex control flows human-readable. Haskell is an excellent language as well, and extremely terse, but in my belief it’s optimized (more than Ocaml) for writers of code (although still far better, from a reader’s perspective, than Java). I would guess that the ML family of languages (which are elegantly simple) are the only languages on earth that go so far to make almost all code readable, even in large systems. Of course, it’s still absolutely possible to write horrible, illegible Ocaml code– the language puts up more of a fight against bad practices than most, but it can be done. The difference, relevant for economic rather than purist discussions, is that average-case Ocaml code is attractive, whereas even very good Java code looks only 20 percent less ugly than typical “bad” Java code.
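As a small illustration of the readability claim, here is a sketch of a tree walk written with pattern matching, the sort of task for which Java code often reaches for the Visitor pattern. The `expr` type and the `eval` and `simplify` functions are hypothetical examples of mine, not taken from any particular codebase.

```ocaml
(* A tiny arithmetic expression tree. *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr
  | Neg of expr

(* Evaluate an expression: one case per constructor. *)
let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b
  | Neg e -> -(eval e)

(* Remove additions of 0 and multiplications by 1 or 0.
   Each algebraic rule is one line; the compiler warns if a case is missed. *)
let rec simplify = function
  | Add (Num 0, e) | Add (e, Num 0) -> simplify e
  | Mul (Num 1, e) | Mul (e, Num 1) -> simplify e
  | Mul (Num 0, _) | Mul (_, Num 0) -> Num 0
  | Add (a, b) -> Add (simplify a, simplify b)
  | Mul (a, b) -> Mul (simplify a, simplify b)
  | Neg e -> Neg (simplify e)
  | Num _ as e -> e

let () =
  let e = Add (Num 0, Mul (Num 1, Add (Num 2, Num 5))) in
  Printf.printf "%d\n" (eval (simplify e))
```

The shape of each function mirrors the shape of the data, so a reader can check the rules against the type definition at a glance. Note that `simplify` makes a single pass, so it is a sketch of the style rather than a complete simplifier.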

In a language like Ocaml, there’s so little boilerplate and accidental complexity that one can look at the code and actually see the problem being solved. For larger systems, one can test one’s intuitions at the REPL. No one needs to rifle through 300-page design documents to understand what a well-written Ocaml program does. The consequence of this is that Ocaml has libraries of generally very clean code that people can read as they learn the language. Since it’s not an unpleasant process to read code, they do so, and they grow as programmers at a rate that would be unheard-of in Java or C++: rising from 0.8 to 1.5 in about two to three years is typical. Ocaml isn’t some “hard” language that only 1.5+ programmers have a chance of understanding. It’s a language that turns ordinary programmers into 1.5ers rapidly.

There’s one language that can be cited as a counterexample to the “REPL or Fail” rule, and that’s C. C, invented in the 1970s, doesn’t have a REPL. Why was this acceptable for C? First, the language grew up in a different time, when small programs (that would today be replaced by “scripts” unless performance were an issue) were the norm. Programmers could “grow up” on C in 1985 because the programs they’d be reading were small and had well-defined semantics. Second, to say that “C lacks a REPL” is a bit strict. It doesn’t have a language-native REPL, but the Unix/C environment does have a (rudimentary, but sufficient) REPL: the command-line console. This was the environment in which C programs were run and explored: first you run wc and cat to see what they do, and then you could look at their C code and discover how they do it. C was designed with a “small-program” model of development (because large, megalithic programs were simply untenable in 1975) in mind. If complex behaviors were desired, they could be established by composing independent C programs and having them communicate through pipes and sockets. In this world, one could read “a whole C program” (a small, independent module, usually in one file) in one sitting. One only needed a REPL (command-line console) to understand the bigger environment: Unix.

How’d we end up with these disengaging, REPL-less languages? As I said, speaking superficially and strictly, C has no REPL. This was not a problem for C because large programs were so rarely written in it, and enough small, well-written C programs were distributed in every Unix environment that a programmer could learn the language from those. Where C++ differs is that large, complex, and monolithic programs are written in it, because the language has just enough in the way of high-level support to let people attempt them. The result is that C++ supports beasts of complexity (such as 200-line functions, 1000-line class definitions, and 1-million-line whole programs spanning several directories) that would be unconscionable in C, and yet fails to provide the one tool that might enable a programmer to make sense of such things. Although writing a C++ REPL is possible, it wouldn’t be easy: the language is so deeply imperative and crystalline that overcoming the mismatch between the two models of programming would be a monumental task. Java, as a descendant of C++ in syntax and culture, inherited most of these illnesses from it while becoming the default language for enterprise programming, and was also launched without a REPL. The result is that millions of people are stuck in a REPL-less language and don’t know why, while hacking on monolithic projects of intractable complexity that are doomed to get worse over time.

The REPL isn’t just a tool. It’s an engaging classroom in which one learns how to be a programmer. It’s absolutely necessary for a person assigned a task that involves comprehending a complex piece of code. And unlike the training wheels of an IDE, it doesn’t attempt to hide “difficult” details from the developer; it allows her to explore them to arbitrary depth when she is ready.

For these reasons, the interactive mode can’t be considered a luxury of those who are privileged enough to work in “elite” languages. There’s no reason programming should be that way. If we want to democratize programming (and there’s no reason we can’t have at least ten times as many 1.5+ programmers as are alive now, and 10x is a conservative goal, considering the world population) we need to begin orienting ourselves toward modern languages. And there is one rule that seems more fundamental than any argument about static vs. dynamic typing or imperative vs. functional programming: REPL or fail.

How any software company can cross “The Developer Divide”

Nir Eyal wrote a blog post, The Developer Divide: When Great Companies Can’t Hire, on this conundrum: there are a lot of excellent technology companies that haven’t managed to attract the brightest (and generally pickiest) engineers in sufficient numbers to hire as fast as they grow, a problem that forces a company either to lower its hiring standards or curtail growth, neither of which is desirable.

What makes this problem especially hard is that it’s not about money. In marketing terminology, it’s a problem of “reach”. Many great startups, even if they offered $200,000 per year, would simply be unable to hire 25 elite (top 5%) programmers in a year. Finding one per quarter is pretty good. Compounding this difficulty is the fact that the best software engineers’ job searches are infrequent and short. They usually find jobs through social connections and word-of-mouth before they start officially “looking”. They almost never cold-spam their CVs. Therefore, a company that’s consistently hiring top-5% programmers is probably extending offers to 1 applicant per several hundred CVs that make their way to the hiring manager.

What is the Developer Divide? It comes down to this: it’s easier to get top developers if your product is something that top developers already use. Google was unequivocally the best search engine even in 2000, and nerds like new search engines, so it established itself as a desirable place to work. Facebook and Foursquare may be targeted toward “non-engineers” (i.e. mass market) but the products are used by enough of the people the firms want to hire as to generate name recognition that accumulates faster than the company needs to grow. That makes it very easy for them to attract so much talent they turn away more 5-percenters than they hire. But for a company whose product isn’t used already, every day, by top programmers, it becomes harder. Much harder. That’s unfortunate, because a lot of great businesses (most, actually) need to start out doing something lower on the established-product-sexiness scale (enterprise work and products with well-defined markets, rather than speculative “social” projects) before branching out into projects of more general interest. These companies might become sexy in the future, but sometimes the first clientele has to be an unsexy but reliable one, like large corporations or suburban soccer moms.

So, how do these great companies continue to find great talent? I’m going to state one simple (but not easy) thing that any software company can do to increase its attractiveness to top engineers by at least 3 binary orders of magnitude. Here it is: ditch Java/C++ and use a decent programming language.

What’s a decent programming language, for this purpose? It’s one that a 5-percenter would use for a personal side project, or if she were calling the shots. That language might be Scala, Python, or Ocaml. It might (for a project such as an operating system) be C. It would almost certainly not, in 2012, be Java or C++.

A clarification must be made, because it’s confusing to people outside of technology: C++ is not a substitute for, nor an improvement on, C. In fact, C is great for programs where low-level concerns are critical in producing a quality system: operating systems, device drivers, hard real-time, and runtime environments for the high-level garbage-collected languages we all love (such as Ocaml and Haskell). Much of the world is built on C. It’s an excellent and immensely successful mid-level language, as opposed to C++, which is a miserably failed attempt at a C-like high-level language. So do not confuse the two.

If you ask a 5-percenter for his opinions on C and C++, he’ll probably praise the former while trashing the latter. From a non-technical business perspective, this might seem inconsistent, given that C is a proper subset of C++. “Anything you can do in C, you can do in C++.” That makes C++ “strictly better”, right? Well, no. A 5-percenter has the professional maturity to realize that he or she will not be coding in a vacuum, and that reading code is as important an act as writing it, and that therefore adding ill-considered features to a good language doesn’t improve it, but ruins it if people are stupid enough to actually use them.

I’ll stop talking about C++, because it’s not even relevant to most startups. Startups can’t afford it, because they need to accomplish big things with small teams and C++ doesn’t make it possible. C++ is mostly that skulking monster that lives in the bowels of legacy systems at banks, something you expect to fight in the 2300 AD (post-apocalyptic) world of Chrono Trigger.

Java’s problem is somewhat different and new. C++ is a bad language because it was poorly designed, but it was at least designed for good programmers (and it’s disastrous when used by inept ones). Java, in contrast, was explicitly designed to favor the interests of massive (100+) teams of mediocre developers over excellent individual contributors, whom it slows down to a small fraction of their typical speed. It came out of a failed experiment to create a home for low-skill “commodity” developers who haven’t learned a new skill since college and who need to consult the One Smart Person on the team if (God forbid) their RAM-munching IDE breaks. Java has succeeded in making commodity developers marginally effective (as opposed to negatively productive, as they are in more powerful languages) but at the expense of hobbling the best, forcing them to endure ugliness (inherent to a language designed to be used exclusively through IDEs, while 5-percenters overwhelmingly prefer the command-line interface and “classic” editors like vim and emacs) and accidental complexity. Needless to say, 5-percenters despise Java. More correctly, they despise the Java language. (I emphasize “the language” because the Java Virtual Machine itself is a pretty powerful tool, and because there are superior languages– Scala and Clojure coming to mind– that also run on the JVM.)

If you want to build a large team of 50+ “commodity” developers, content to maintain legacy code or work on mind-numbingly boring stuff, use Java or C++. If you’re a startup, you need to build a small team of excellent developers, so use something else. If you want to hire 5-percenters now but might need to hire Java jockeys later, strongly consider Scala and Clojure, which are highly powerful languages but run in the JVM environment.

Why is this so important? It’s not just about the language. It’s about signaling. As a startup, you have to show that you get it, and that the opportunity you offer isn’t Yet Another Java Job. You can get 5-percenters to use C++ or Java (Google has made this happen, and so have many investment banks, which have huge legacy codebases in C++) but you pretty much have to be a big-name company to pull this off, and you can expect to shell out an obscene amount of money– a 50-100 percent markup to account for the negatives of maintaining code in a terrible language, and the career stagnation this kind of work invites. If you want a 5-percenter to work 60 hours per week for $100,000 per year, use a great language. If you want him to work 9-to-5, with two-hour lunches and personal errands deducted from the workday, for $250,000 per year, then Java and C++ are options.

More than anything else, 5-percenters want to work with other 5-percenters. This is far more important to them than prestige or money (the reason 5-percenters default to rich, prestigious companies when nothing more interesting crosses their transom is that these companies have other 5-percenters). It is, moreover, even more important than programming language selection itself. Language selection, as I’ve said, is an objective mechanism of signaling. This is what creative writing calls the “show, don’t tell” principle (don’t say “Eric was honorable”; have him do something honorable). Every company says (“telling”) it has world-class talent, but language choice is an objective decision that shows that a company is, at the very least, interested in hiring 5-percenters. For that reason, there’s a good chance that it employs some.

To reiterate, because I don’t want a flame war, is it possible to find 5-percenters who will write Java and C++ on a full-time basis? Absolutely, and if you want to compete with Goldman Sachs on salary, you might be able to hire one. Will they take on the risk of working for $5,000 (pre-tax) per month at a risky, seven-person startup in these languages? Not a chance. Five-percenters tolerate C++ jobs at Google because they know that company’s existence doesn’t rely on their individual productivity. In a startup, individual productivity is an existential concern, and the Aspergerian “pathological honesty” that top programmers almost invariably have precludes them from working in low-productivity languages that they believe will retard and destroy their employer.

I’ve said enough about the awfulness of C++ and Java. What are some good languages? I’ll give a list, which is by no means exhaustive, of highly-powerful languages that the best developers love: Ocaml, Haskell, Erlang, Scala, Lisp, Clojure. Less strong but still formidable are Python and Ruby. (Many 5-percenters love Python, and almost all will tolerate it, but the median Python programmer is closer to a “10-percenter”.) Why is it this way? First, great developers program and learn technology in their spare time, and a 1-person, 15-hour-per-week project must be written in a real language if it is going to amount to anything. Second, most of these languages are only used by the best employers and only taught by the best universities, so a person deeply familiar with one of them is either (a) coming from elite exposure, or (b) possessed of enough individual curiosity to indicate a high likelihood of skill and success as a computer programmer. People who want to become great programmers quickly discover languages in which it’s possible for them to be 10 times as productive as they would be in Java– and they never look back.

Are all programmers in great languages 5-percenters? The answer is no. Users of languages like Ocaml and Scala tend to fall into two categories: (a) the 5-percenters, and (b) those who are becoming 5-percenters. Not all of them are there yet, but they’re almost all improving at a rapid pace. I’ve worked for almost 4 years in JVM languages (Java, Scala, and Clojure) and what I’ve learned (perhaps astonishingly) is that, per unit time, the Scala and Clojure developers learn about Java faster than those using Java! Because they are in high-productivity languages, they can accomplish more, and because they’re achieving more, they’re learning more along the way.

From a practical standpoint: with so many great languages to choose from, which one of those awesome languages should a person pick? CTOs making this decision have some idea, but this would be an impossible decision for a non-technical CEO to make on direct experience. For a very small team, the answer is easy: whatever the best programmers want to use. I like Scala much better than Python, but if I were in a non-coding role and tasked with hiring a great programmer, and if she preferred Python, I’d have her use that, because the benefit of letting her use the language she thought was best for the job outweighs (for a small team and with no maintenance burden) any benefit conferred by using one powerful language over another. So my answer to the language-selection question to a non-programmer CEO is: ask your best developer.

For my part, I’d recommend Scala. It’s not the best language for all purposes, but it’s a great general-purpose language and (among mainstream languages) may be the best choice overall for most purposes. Because it runs in the Java Virtual Machine (JVM) it has full access to all of Java’s assets. (Clojure, an excellent JVM Lisp, has the same advantage, but is not as performant and does not have static typing, a feature I find invaluable.)

A person versed in economics might find my argument tenuous. If a set of “elite” languages can be used by programmers and employers as a signal of high competency, what’s to stop the less competent from “faking” this signal once they catch on to the fact that it exists? A few answers come to mind. The first is that programming in a high-power language requires an adjustment that mediocre, 9-to-5 programmers are not likely to want to make. From first principles, functional programming isn’t intellectually harder than object-oriented programming: a 120 IQ is more than enough. It’s actually simpler. (Doing object-oriented programming correctly, and rigorously understanding what one is doing with it, is much harder and much more intellectually complex than succeeding in functional programming; this is a rant for another time, but 99% of people doing “object-oriented programming” are like crude teenagers with regard to sex– loud about it, but doing it badly.) Nonetheless, the adjustment of re-learning software on sound principles is not an easy one to make. Like the transitions from memorization to pattern recognition, and then from pattern recognition to rigorous proof in mathematics, these context-switches require a lot of work (months of serious study). Successfully learning Scala or Clojure is a sign of a very strong work ethic. (It goes without saying that your interview process should establish that the candidates actually know these languages, and aren’t playing “buzzword bingo”.)

What about mediocre businesses using language selection as a false signal? That’s even more unlikely. A recruiter (for an elite startup, currently at less than 20 people) I spoke to told me that about 70% of Clojure candidates, 40% of Python candidates, and 5% of Java candidates that he invites to an in-office, full-day interview are good enough to hire. For a startup, interviews are far more expensive (in terms of opportunity cost) than for large companies: it costs about $100 to run a technical phone screen and $1000 to conduct a full-cycle interview, because startups have to involve senior people in their interview process in order to assess quality. Put another way, this means that it costs $1429 to hire a 5-percenter in Clojure and $2500 to hire one in Python– chump change as far as recruiting expenses go. But it costs $20,000 to hire a Java developer, if you insist on the “5-percenter” standard of quality.
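The per-hire figures above follow from dividing the cost of one full-cycle interview by the recruiter’s pass rate. Here is a minimal sketch of that arithmetic in Python; the function and variable names are mine, and I’m assuming that every full-cycle interview costs the same $1000 and that phone-screen costs are left out, which is the only way I can make the quoted numbers come out.

```python
# Pass rates quoted by the recruiter for candidates invited to a full-day interview.
PASS_RATE = {"Clojure": 0.70, "Python": 0.40, "Java": 0.05}

# Approximate cost of one full-cycle interview at a startup, from the text above.
FULL_INTERVIEW_COST = 1000


def cost_per_hire(pass_rate, interview_cost=FULL_INTERVIEW_COST):
    """Expected interview spend per successful hire.

    Assumption: every interview costs the same and phone-screen costs are
    ignored, so on average 1 / pass_rate interviews are needed per hire.
    """
    return interview_cost / pass_rate


if __name__ == "__main__":
    for language, rate in PASS_RATE.items():
        print(f"{language}: ${cost_per_hire(rate):,.0f} per hire")
```

Run as a script, this prints $1,429 for Clojure, $2,500 for Python, and $20,000 for Java. Adding the $100 phone screens would raise all three numbers, but how much depends on phone-screen pass rates that the recruiter didn’t give me.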

There’s one variable I haven’t mentioned, though, and that’s the number of CVs he gets in each language: several hundred times as many Java developers as developers in Clojure or Ocaml. Most companies need (or think they need) warm bodies in large numbers to maintain legacy horrors, not top talent and the attitude that comes with it. Also, it’s awful that it’s this way, and I think it will change in the future, with the limiting factor being work ethic and engagement rather than innate ability, but the software world is a pyramid, with a few stars at the pinnacle and a large number of incompetents at the base. This holds for programmers and for programmer jobs, of which 90 to 95 percent are mind-numbingly boring. It’s also seen in the distribution of language preference (and I say “preference” because there are tens of thousands of excellent programmers writing C++ and Java right now, but very few prefer them). The result of this is that the mediocre languages have the most programmers. If a company needs to hire 200 programmers per month, it simply cannot choose Haskell as its main development language; within a couple years, it would have absorbed the entire Haskell community!

Historically, that has been a serious concern for companies when it comes to language selection. Because there are hundreds of times more Java developers than Haskell or Ocaml hackers, there are at least five times as many half-decent ones. Thus, the worst languages paradoxically have the strongest library support. This is compounded by the fact that powerful tools (such as IDEs, which are a mixed bag of neatness and horror, but sometimes quite useful and outright required when developing in Java) must be written to compensate for the shortcomings of hobbled languages. There’s no Lisp IDE because emacs does just fine, but there are a slew of Java IDEs because it’s a revolting experience to write Java without one. From a non-technical CEO’s perspective, this makes Java look better because it has the best supporting tools.

What all this means is that mediocre software shops are not going to switch over to Haskell in order to ape this signaling mechanism. They can’t, because the bottom contingent of their software staff will drop like flies, and because their leadership is unlikely to understand the language selection problem in the first place. There might be some unestablished software companies using elite languages, and of them, I make the same argument that I’d make of “15-percenters” who nonetheless show interest and competence in “elite” languages– they may not be 5-percenters now, but if they keep at it, they will be shortly!

What about the (admitted) shortcomings of elite languages? Except for Scala and Clojure, which have access to the JVM and interoperate cleanly with Java, these languages don’t have the breadth of tooling that C++ and Java do. The answer to that is almost stereotypically “hackerish”: write them! This effort is not wasted, not in the least. Writing high-quality open-source tools to support elite languages is one of the best things a software company can do to establish its reputation. This, again, is an opportunity to show, not tell.

For a small company attempting to define and establish itself, attracting top talent is hard. The best programmers are on the job market so rarely and for such short intervals that attracting them takes concerted effort. Growing companies do not have the name recognition, and cannot afford the immense salaries (over $250,000 for a senior developer) that would attract these “5-percenters” using economic means, so they must win on technical grounds. Language selection is a simple (but not easy, because it’s difficult to adopt new and dramatically more powerful tools) way to do so. The best languages (Ocaml, Scala, Haskell, Python) are “shibboleths” that elite programmers use to identify each other, so adopting one of them is a one-stop choice that instantly establishes “hacker cred” or, to use Nir Eyal’s terminology, bridges “The Developer Divide”.

A problem with the term, programming “language”

I’m going to make a radical and perhaps offensive assertion: most of the time, when programmers use the phrase “programming language”, they are using it inappropriately, and creating a dangerously fallacious analogy, one for which few programmers but many decision-makers in business fall: that programming languages can be compared and assessed as one would for natural languages.

The problem is twofold. First, natural languages don’t have the massive variations in capability observed in programming languages. French is not superior to English (or vice versa) in the way that Scala is superior to Java. Second, “switching costs” associated with acquiring a new natural language are high, because it’s hard for an adult human to learn a new natural language well: it takes years of exposure, preferably in the context of contact with native speakers. This is not as true of programming languages; learning programming is hard, but a skilled programmer can become capable in a new language in a few weeks and be proficient after months. Because of this clumsy analogy, decision-makers in the software industry often make a decision that is the obvious right one with regard to natural languages, and almost always the wrong one (because “the standard” is usually an underpowered legacy language like Java or C++) in selecting a programming language: they default to the standard.

My intention is not to discuss specific programming languages per se, but the problem inherent in the phrase “programming language”. It is a correct phrase: a programming language is a form of language, with grammar, syntax and semantics. On the other hand, it admits a certain confusion inherited from something we know about natural languages like English and Spanish: despite their evident differences, human languages are much more alike than they are different. Natural languages have their quirks and may be more capable in specific ways– character languages are more compact, while letter languages’ smaller alphabets make it easier to convert written words into spoken form– but none is uniformly or astronomically superior to any other.

For example, there’s no natural language where to express an average English sentence (15 words) requires two hundred words, but there are languages (Scala, Python) that are 15 times more concise than Java. There’s also no language that is 100 times easier for the human brain to convert into meaningful instructions than English, but there are programming languages in which programs run 100 times faster than others. Nor is there a natural language where it’s impossible for a syntactically correct sentence (e.g. “the moon ate the moon”) to be nonsensical; this assurance, to a large degree, can be achieved in statically typed languages. While it might be offensive even to consider that natural languages might vary in “quality”, I think it quite reasonable to assert that they don’t. To my knowledge, there aren’t major, order-of-magnitude, variations in capability among natural languages. For this reason, it’s appropriate to communicate (e.g. to conduct business) in the language of which the involved parties will have the best comprehension. The right decision, in selecting a natural language, is to use “the standard”. In New York, that’s usually English. In Moscow, it’s Russian. Every locale has a small set (sometimes only one) of natural languages that are likely to be well-comprehended by most people. For this reason, it would be absurd for a Silicon Valley firm to make the decision to conduct all business in Swahili, even if Swahili were 10 or 50 percent better suited to that company’s needs.

Programming languages are different: there are order-of-magnitude variations. Rewrite a C program in Java, and the program becomes immune to a wide class of errors (due to Java’s automatic memory management) but one loses the ability to explicitly manage memory, and program performance becomes nondeterministic. Sometimes this is a desirable change; sometimes not. Rewrite the Java program in Python and the source code usually becomes one-tenth of its previous size, making the program easier to maintain and (in the long run) better, but one forgoes access to the Java libraries (using Python’s, which are strong but less developed) and programs take 10 to 20 times longer to run. Rewrite the Python program in Scala and one regains access to the Java libraries and Java’s speed, but loses access to Python’s libraries and will have trouble transliterating some of Python’s (controversial, but sometimes powerful) most dynamic features.

Order-of-magnitude differences exist among programming languages, with the winners depending largely on one’s definition of “quality”. Python is expressive but the interpreter is slow. Java is fast but verbose and horrendous to read, making source code difficult to maintain and “code rot” inevitable. Scala is fast and concise, with a powerful type system, but few people understand the language or its type system well, so it’s not really possible to hire 50 Scala developers per month. C is verbose (for large-scale, complex software) and without many features, but the best or only choice for a variety of problems, such as writing device drivers, real-time programs, and operating systems. No programming language can be categorized as the uniform “best”; all have their strengths and weaknesses (except for C++, which is a bastardization of C and should never be used except to troll people).

This essay isn’t about what languages are good and which not, but it answers a question: why do most businesses use the wrong programming languages? Why are so many development shops using C++ and Java, when they could accomplish four times as much per developer-month using a more expressive language like Python or Scala? I think a major part of it comes from the fact that we call them programming languages. For an analogous question: Why do most American businesses use English? Not because English is the “best” language (if such a notion could even be defined for a natural language, and I doubt it can) but because it’s “the standard”: it’s what most other Americans use to communicate. With natural languages, there aren’t order-of-magnitude differences in capabilities, and the difficulty of learning a new natural language is high (a popular, and probably apocryphal, story holds that the American “founding fathers” despised the British so much that they tried to replace English with French or Hebrew as the new country’s “official language”, and failed) so choosing the default natural language is such an obvious choice that it’s often not even a conscious decision.

Java and C++, analogously, are favored in the software industry because they’re “the standard”, like English. To a businessperson unfamiliar with technology, a proposal to implement software in Scala seems like the suggestion that all business should be conducted in Swahili– for a New York firm, ridiculous. This is why many software shops are far more skeptical of “other languages” than they would be if they understood the situation and the potential gains.

How do we dispel these notions? I think one change should be in our terminology. Often, when we debate languages, we’re actually discussing technologies built on languages. For example, sentences like “C is fast” and “Python is slow” are ridiculous as stated. A language (formally speaking) is just a set of strings of symbols, with no intrinsic concept of “speed”. What people mean, in fact, is that “compiled C programs have excellent performance due to advanced compilers and the language’s support for powerful optimizations” and “Python executables run about 10 to 20 times more slowly than analogous programs in C” (but still fast, because all computers are “fast” these days). Speed isn’t an intrinsic trait of the language, but rather one of the technologies used to compile and interpret it. Likewise, the common justification for using Java is that “It has the largest set of active libraries.” Well, not exactly. There are a variety of exciting (and superior) JVM languages– most notably Scala and Clojure– that can use the bulk of these tools equally seamlessly, which means that there’s no excuse (in most cases) for preferring Java.

Programming languages are languages, but the decision to adopt one programming language over another has no similarity to such a decision over natural languages. The relevant comparison, in technology, is one of platforms, programming styles, capabilities, and supporting technologies. Referring to this decision-making process as a “language debate” is, in this context, myopic. Far more is at stake than the choice of “language” alone; the chosen programming language influences the types of programs one can write, and the variation in program quality that is staked on language selection alone is quite high.