Big Tech Sees Like a State

Plus! Pirate’s Treasure, Redux; Takedowns; Antitrust and Laggy Beliefs; Money, The High-Order Bit; More...

Welcome back to The Diff.

This is the once-a-week free edition of The Diff, the newsletter  about inflections in finance and technology. The free edition goes out  to 16,308 subscribers, up 136 week-over-week.

In this issue:

Big Tech Sees Like a State

One of the classic slogans of 1960s protests, found on placards and buttons and in chants, was: “I am a human being. Do not fold, spindle, or mutilate me.” The phrase referenced a warning on IBM’s punched cards. At one level, it was just a slightly geeky catchphrase about new technology, akin to wearing an “Error 404: Democracy Not Found” t-shirt. But at another level, it represented a deep anxiety. The people who used this slogan weren’t worried about punch card computers as an abstract force; they were worried about the punch card that had their name and draft number.

This anxiety is part of a long, long process. In Seeing Like a State,  the anarchist-leaning historian James C. Scott describes this as the  fundamental process of government: governments alter behavior in  order to tax, conscript, and prevent the rebellion of their  citizens/subjects. Scott uses the term “legibility” to understand this. A  fully-legible citizen:

This is not the default state of human beings. It’s a multi-generational process. Seeing Like a State  describes some examples, mostly failed, of governments trying to impose  rules. In some cases, they work, but only at great cost—the USSR was  able to collectivize farms and acquire grain, but millions starved in  the process. In other cases, like the Meiji Restoration and France,  the legibility-inducing process worked relatively well. And it may be  an understated part of the Frontier Thesis in the US, too: it’s much  easier to give people a fixed address and insist that they recognize  property rights if you can give them, as an inducement, some new  property at that address.

Scott assembles a panoply of examples: Russian Bolsheviks, Prussian  foresters, American tomato entrepreneurs, medieval Thai tax collectors,  post-colonial African reformers, modern farmers, ancient farmers,  Brasilia, and Bruges.

Why is imposing legibility hard? The obvious reason is that most people do not want to pay taxes, serve in the military, or be prevented from mistreating the capital city’s plenipotentiaries at will. Another reason, though, is that all this legibility overrides systems that function perfectly well and rely heavily on local knowledge. Scott uses the term metis to describe this knowledge born of practical experience. Many of the examples he gives involve the very local business of growing things. A traditional farm might mix a number of different crops, grown according to ad hoc rules, without much in the way of non-natural fertilizers or pesticides. The result is a balanced diet and a hardy, pest-resistant farm, but very little taxable income. When governments and large companies try to create their own farms, they usually grow monocultures, plant them in rectangular fields, use pesticides and heavy equipment, and, in many historical cases, achieve much lower yields or lower-quality products than the old ways.

Taxes, too, replace webs of local obligations, which are often more  attuned to a community’s needs and limitations. A top-down taxation  system looks more efficient on paper, but compared to an organically  evolved system, it won’t match the precise needs or preferences of the  smaller groups it’s imposed on.

So there’s a strong theoretical argument against imposing legibility, and there are many historical examples of it going wrong. And yet, very few people emigrate from the highly legible societies of the United States and Western Europe in order to live in more informal ones. Meanwhile, many people around the world do choose to immigrate to those hyper-legible societies. Clearly, as Scott concedes, legibility is not all bad.

One reason for this is that highly abstract, ultra-legible systems have more advantages at scale.

Theoretical knowledge faces a constant uphill battle against much more applicable metis. Heliocentrism had to contend with the observed lack of stellar parallax; the germ theory had to contend with the fact that miasma theory produced pretty good advice for staying healthy during a plague; Kahneman and Tversky’s bank teller example must contend with the fact that in a normal conversation, deliberately offering a misleading detail and then pointing out that it’s misleading is fairly rude. Theory can go further than practice, because it’s built on a firmer foundation. But there’s often a long gap between when a theory is elegant and when it makes better predictions than a lifetime’s worth of time-tested heuristics in the same domain.[1]

But metis is a hill-climbing algorithm. If it’s based on experience rather than theory, it’s limited by experience. Meanwhile, theory is not  limited by direct experience. By the 1930s, many physicists were quite  convinced that an atomic bomb was possible, though of course none of  them had ever seen one. Because some things can’t be discovered by trial  and error, but can be created by writing down some first principles and  thinking very hard about their implications (followed by lots of trial  and error), the pro-legibility side has an advantage in inventing new  things.
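To make the hill-climbing metaphor concrete, here is a toy sketch. The payoff landscape, the function names, and the numbers are all invented for illustration; nothing here comes from Scott. A searcher who only ever tries the adjacent option settles on the nearest peak, while someone who can reason about the whole landscape finds the higher one.

```python
def payoff(x: int) -> int:
    """Hypothetical landscape: a low peak at x=2 (value 5) and a higher peak at x=8 (value 10)."""
    return max(5 - 2 * abs(x - 2), 10 - 2 * abs(x - 8), 0)


def hill_climb(start: int, lo: int = 0, hi: int = 10) -> int:
    """Greedy local search: move to the better neighbor until no neighbor improves."""
    x = start
    while True:
        neighbors = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbors, key=payoff)
        if payoff(best) <= payoff(x):
            return x  # stuck: every step from here looks worse
        x = best


# Experience-driven search starting from x=0 stops at the nearby low peak.
local_peak = hill_climb(0)

# "Theory" here is just the ability to evaluate the whole landscape at once.
global_peak = max(range(0, 11), key=payoff)

print(local_peak, payoff(local_peak))
print(global_peak, payoff(global_peak))
```

The greedy searcher is not making a mistake at any individual step. Each move is the best one available from where it stands, which is exactly why it never crosses the valley between the two peaks.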

In a society that prizes legibility, more complicated forms of  production can function. Instead of just farms and workshops, you can  have factories. A factory is naturally more legible than a farm, because  the factory takes external inputs (raw materials, machines, workers)  and then produces uniform outputs.[2] A factory isn’t “native” to a  particular physical location, and tolerates a wide range of climates and  soil qualities. So the industrial revolution made society more legible  through the simple expedient of making a larger proportion of it legible  by default. And this legibility was self-reinforcing: if the factory  makes tractors, the farms that buy them have more consistent outputs, so  those farms, too, are more legible. More workers at factories meant  more people who had to be at particular places at particular times,  which meant that even people who didn’t work at factories still needed  watches and clocks. And since factories require steady inputs, the arm  of legibility reached outward: when factories are the main source of  demand for mines, mines need to run at the cadence and variance of a  factory. So do the stores that buy from them; it’s hard to amortize  fixed equipment costs without continuous production, which requires  continuous sales to keep itself going.

The most effective institutions tend to reshape society in their own  image, and the more effective they are, the more profound the reshaping.  Manufacturing and finance had a feedback loop here: complex supply  chains can only function with reliable courts, uniform weights and  measures, and a trusted currency. So industrialization drove all of the  above. Scott talks at length about the politics of measurement: since  feudal dues were set by custom, but defined by vague measurements, tax  increases often took the form of enlarging the bag of grain used to  measure rent denominated in a given number of bags, or arguments over  what constituted filling a bag or basket. These informal systems might  have given primitive political systems some fiscal flexibility, but  they’d make any complex agreements untenable. A variable measure is poor  collateral and makes it hard to hedge an obligation in one place with a  future deliverable somewhere else. The modern system is less flexible,  but the benefit is that it can produce much, much more.

Large-scale manufacturing imposes legibility in another sense. Any  business with economies of scale works better the larger the market it  serves, so a more manufacturing-based economy means less tolerance for  parts of the economy that aren’t somehow plugged into the factory  system.

The industrial revolution is a compelling example of economic growth imposing legibility, but it’s hardly the only one. A practical way to determine longitude at sea was found thanks to a prize, which was offered as a way to subsidize trade. Railroads imposed uniform timetables across different cities: if a train is expected to arrive at 2 in the afternoon and depart again at 2:05, those times need to mean the same thing to the conductor, the passengers, and every other train conductor in the same network, as well as anyone expecting to meet a passenger at the station. More recently, global trade has made much more of the world legible: the shipping container, the dollar, and the ubiquity of English as a second language are all legibility-improving consequences.

But by far the biggest legibility imposers today are big tech companies. This is a recurring theme in The Diff,  and may be the single most common one. Large tech companies create  uniform identifiers for everyone they can: if you’re on Facebook, you  have a unique ID in their system. If you’re not on Facebook, you’re still in the system,  and they are no doubt assiduously trying to come up with reasons for  you to finally join. These companies can do better than trying to teach  everyone the same language; they can translate on the fly. They compile  categorized, tagged, thoroughly-described data on products, people, and  pages, and constantly analyze it.

And, for the average person, this is a material quality of life  improvement. If you meet someone through a work function and don’t catch  their full name, LinkedIn’s advanced search is very likely to narrow  the list down; if you meet them socially, Facebook’s friend search,  weighted by network proximity and other factors, will also help. Google,  of course, surfaces all of the information on the public Internet in a  convenient format, and Twitter gives you a real-time feed of it. Amazon  makes their merchants use consistent descriptors within a given  category, so satisficing on some criterion—cheapest laptop with a  particular graphics card, for example—is straightforward.

These companies generally use legibility the same way governments do:  to collect taxes. Governments try to price-discriminate when they tax  people; charge too much, and you may discourage work or encourage tax  avoidance. Charge too little, and, well, you could have collected more.  Many tech companies spend their efforts getting ever closer to perfect price discrimination.  The mechanics of ad auctions encourage bidders to pay their expected  marginal profit for traffic, and even companies that start out with a  non-ad model, like Amazon, end up using ads to capture the last bit of  extra margin their suppliers were keeping. It’s a testament to big  tech’s state-like capacity that Indonesia basically outsourced sales tax collection to them ($, Nikkei).  Collecting taxes from individuals is a challenge that some states  aren’t up to, but they can adopt the feudal model of granting the  powerful privileges, like the right to do business in a country, in  exchange for feudal dues.

Once you look for legibility, you start to see it everywhere. Every  big tech company wants to control a measurement system, to ensure that  the fundamental unit of some kind of communication is owned by them. And  you see it across companies, too. Long supply chains work poorly when  data is fragmented and hard to join, but they can work wonderfully when  it’s all in the same format, with the same primary keys. Safegraph, for  example, has a guide to data standards, which notes that they’ve made their own,  a unique identifier for real-world locations. A system of long supply  chains based on theoretical constructs about a world more complex than  theory sounds brittle, but it’s more flexible than it looks; P&L  discipline is a good way to keep dreamers grounded.

Getting taxed by big tech is a lot better than getting conscripted by them. (So far, the closest companies come to that is strongly promoting volunteerism).  But it still feels like a worrisome trend. Governments expanded their  state capacity, but they’ve often used it for ominous ends.

Fortunately for anyone who shares Scott’s skepticism of the legibility project, the end state for tech ends up creating a weird echo of the metis-driven illegible system we started with. The outer edges of ad targeting, product recommendations, search results, People You May Know, and For You Page are driven by machine learning algorithms that consume unfathomable amounts of data and output a uniquely well-targeted result. The source code and the data exist, in human-readable formats, but the actual process can be completely opaque. There is probably not a single human being at Google who can answer a question like “Why, when I search for X, is this site #4 while that site is #5?” An engineer might know what signals Google uses, and perhaps roughly what their weightings are, but every new signal adds new complexity, and the sum of a long tail of tiny signals can outweigh the human-tractable ones.

An ML-driven approach is only possible at large scale, and scale is  only possible through legibility. But it’s the fate of all these  legibility-imposers to move past legibility. They impose order  on the world, and then they automate the order-imposing process, the  order-imposer-refining process, and so on, until the end result is  determined by a metis available to nobody.

This is an echo of how older legible systems worked. There were  rules-based systems, and bureaucracies to implement those rules, but  those bureaucracies worked through unspoken and informal systems. There  are whole books explaining how specific bureaucracies work, and even those are not fully descriptive.

As a general rule in economics (and perhaps in every domain), most of  the interesting results are embodied in the residuals. Equity is the  residual claimant of whatever revenue a company can produce that isn’t  claimed by employees, suppliers, creditors, and the government. Alpha is  also a residual; it’s whatever part of an investor’s performance can’t  be explained by purely statistical factors. Legibility is a continued  effort to refine models so the residual is small and, ideally, normally  distributed. But the process has limits, based on human bandwidth. We’ve  reached a strange point in history where we reinvent at global scale an  approach that worked at sub-Dunbar size. And it works well at its stated goals, even if every year it’s a little less legible.

[1] For example, it’s mathematically true that the aggregate returns  investors get from the stock market must equal the market’s aggregate  return, less fees, so an index fund with lower-than-average fees is  guaranteed to be an above-average investment, at least as long as prices  are being set by non-indexers. This is conventional wisdom today, and  was theoretically true long ago, but it took Vanguard a while to  convince anyone.
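The arithmetic behind this footnote can be sketched in a few lines. Every number below is made up for illustration (returns and fees are in basis points, and the two active investors are hypothetical). The only assumption carried over from the text is that active investors, taken together, hold the market and so earn the market's return before fees.

```python
MARKET_RETURN_BPS = 700  # assumed gross market return: 7.00%
INDEX_FEE_BPS = 10       # assumed index fund fee: 0.10%

# Hypothetical active investors: (capital, gross return in bps, fee in bps).
# One beats the market and one lags it; weighted by capital they ARE the market.
active_investors = [
    (100, 1000, 100),
    (100, 400, 100),
]


def weighted_avg(pairs):
    """Capital-weighted average of (weight, value) pairs."""
    total = sum(weight for weight, _ in pairs)
    return sum(weight * value for weight, value in pairs) / total


gross_active = weighted_avg([(c, g) for c, g, _ in active_investors])
net_active = weighted_avg([(c, g - f) for c, g, f in active_investors])
net_index = MARKET_RETURN_BPS - INDEX_FEE_BPS

print(gross_active)  # matches the market before fees
print(net_active)    # the average active investor, after fees
print(net_index)     # the index fund, after its lower fee
```

Note what the sketch does and does not show: the index fund beats the average active investor after fees, but the first hypothetical investor still nets more than the index. "Above average" is the guarantee, not "best."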

[2] Scott has an anecdote in Seeing Like a State about how  even factories require local knowledge. A brand new machine may function  exactly the way its specs indicate, but once it’s been in use for a  while, small imperfections throw it off. A skilled operator can adjust  for these, and for variations in the quality of materials. This kind of  skill, though, gets less useful as the rest of the world gets more  legible: better-quality equipment, cheaper replacement parts, more  diagnostics to spot those imperfections, and tighter quality control on  materials all diminish the value of local knowledge.


Pirate’s Treasure, Redux

In yesterday’s issue, I pointed out that accessing almost a billion dollars’ worth of Bitcoin once controlled by the “Dread Pirate Roberts” may, technically, be the first instance in human history of someone finding a pirate’s treasure map and discovering actual treasure. Two updates today:

  1. “Over,” not “Almost,” $1bn, and
  2. We now have the backstory:  the money was stolen from Silk Road by a hacker, whom the government  later identified. They prevailed on this unknown hacker to forfeit the  money.

This appears to be a record for the proceeds of one hack, and one of the most valuable stolen goods of all time (it ranks below the Empire State Building).  It’s also good evidence for what Bitcoin’s short-term bears and  long-term bulls have always argued: any currency with a permanent ledger  is an unwise one to use for criminal activities.


Takedowns

A legal strategy that goes in and out of vogue is to take down damaging online content by filing dubious copyright claims. The black-hat version of this is to write a copy of the offending content, backdate it, and then claim the original version is a copyright violation. This has worked ($, WSJ), but Google is catching on. The other technique, used in the last few days by online test-taking software company Proctorio and by Netflix, is to find cases where people complaining about a given company cite material that the company has copyrighted, and then claim it’s an infringement. The default content moderation approach for most companies is to presume that content doesn’t violate copyright, but also to presume that copyright claims aren’t frivolous. In this state, anyone with a loose interpretation of the law has a first-mover advantage.

Antitrust and Laggy Beliefs

The US government has sued to block  Visa’s acquisition of spending data aggregator Plaid. The argument is  interesting: it’s not that Plaid competes with, is a supplier to, or is a  customer of Visa, but that Plaid’s product gives them the capability to  launch a Visa competitor in the future.

As a general rule, antitrust actions lag investors in terms of what they think matters about a given company or industry. Microsoft, IBM, and AT&T all got in trouble over the part of their business investors viewed as a melting ice cube. This case shows some progress in antitrust state capacity, since it actually does match what a smart Plaid investor would see as Plaid’s long-term upside.

Money, The High-Order Bit

Facebook now lets WhatsApp users make payments in India. This is something they’ve planned for a while. I wrote about it in June:

Communication, identity, and payments are all  fundamentally tied together. Any payment system that doesn’t involve  hard currency has some form of identity verification. Phone and  messaging companies have an incentive to do a moderate amount of  identity verification in the course of their business; someone who signs  up for WhatsApp and spams a thousand people but never gets a reply is  probably a bad actor, whereas someone who signs up and starts a series  of back-and-forth conversations is more likely to be real. The messaging  product is its own proof-of-work. Looking at WhatsApp today, you could  imagine that this was the plan all along: bootstrap a universal identity  system and communications network, and use it to create  smartphone-based payment rails that skip legacy payment systems and  enable more transactions.

Perceptual Arbitrage in Crowdfunding

For any consumer electronics, made-in-Shenzhen is the null hypothesis  unless there’s a very good reason it should be made somewhere else. The  Chinese manufacturing base has scale and flexibility that other  locations can’t match. But it also has a (somewhat dated) reputation  problem. This has changed the economics of crowdfunding,  by encouraging China-based project founders to market their product as  not-necessarily-from-China. This is not simply because of Western  biases, though:

“Even if we’re doing something that’s purely for a  domestic audience, sometimes we’ll have it with Western actors,” he  says, which he credits to Chinese companies often facing prejudice  against their products. They want these Western actors so “they’re  perceived as being more high end, or reliable, or maybe in a higher  price bracket,” he says.

Some economic forces are stronger than individual companies, and end  up reshaping those companies to look like what they replaced. Chinese  companies have a comparative advantage at manufacturing, and US  companies have a comparative advantage at marketing. As a channel like  crowdfunding matures, it increasingly looks like what it replaced.

Covid and the Talent Shift

Jason Lemkin says that for the first time, small companies outside of the Bay Area can tap into the Bay’s talent network.

Just in the past few weeks, 3 top, brand-name CMOs I know  joined $10m+ ARR start-ups HQ’d in Berlin, in Atlanta, and in Colorado.   That never, ever would have happened before Covid. Because most top  talent wants to be close to the CEO.  You would.  It’s just so much  easier to excel that way, and so much riskier to be far from the CEO if  you’re a top executive.

This is an underrated aspect of cities' network effects. Talent  density is self-sustaining, because the companies that hire colocate  with the employees they want to hire, and job negotiating leverage is  partly driven by the number and quality of next-best opportunities. This  trend cuts against the argument that future tech companies will have a  “mullet” approach, with their senior executives located in the Bay Area  and the rest of their employees working somewhere with a low cost of  living. As has been pointed out many times, network effects cut both  ways: every time someone leaves the network, it makes staying behind a worse idea.