LLMs and Communicating in More Dimensions

Plus! Cruise's Asymptotes; Shrinkage; Open Source Economics; Derivatives and the Underlying; Shorting; Diff Jobs

In this issue:

  * LLMs and Communicating in More Dimensions
  * Cruise's Asymptotes
  * Shrinkage
  * Open Source Economics
  * Derivatives and the Underlying
  * Shorting
  * Diff Jobs

Today's newsletter is brought to you by Tegus, which offers expert interviews on countless companies—now with AI summaries that highlight key questions and answers.

LLMs and Communicating in More Dimensions

Some documents are hard to read solo: an academic paper in a field you don't know especially well, a prospectus for a hard tech company where you're unfamiliar with the underlying technology, or just an email from a busy professional in a different field who knows more acronyms than you and isn't afraid to use them.

There used to be two ways to get through them: struggle a lot on your own, or ask someone for help. LLM chats are a new kind of "someone," with infinite knowledge, near-infinite patience, and, yes, a tendency to tell you what they think you want to hear.

One way to think of them is that they're an implementation of the old "enhance" trope, where a blurry image from a security camera can be zoomed in to reveal a license plate number, a name on a business card, a distinguishing facial feature, etc. "Enhance" doesn't work that way—you can't break one gray pixel into four or sixteen black-and-white pixels—but given enough data, you can map the vague images to precise ones. This is more or less a description of facial recognition: take an image of a face, identify the distinguishing metrics that describe it, search a database of existing photos where those metrics have been identified, and reproduce the best match. So it's not enhancing one image, but turning it into a search query for other existing images.[1]
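
To make the "search, not enhancement" point concrete, here's a minimal sketch in Python. The face_embedding() function is a placeholder, not a real API: in practice a model like FaceNet or ArcFace plays this role, mapping a face crop to a unit-length feature vector.

```python
import numpy as np

def best_match(query_img, gallery: dict[str, np.ndarray], face_embedding):
    """Facial recognition as retrieval: nothing is 'enhanced.' We embed the
    query face and return the stored identity whose (precomputed, unit-length)
    embedding is closest to it by cosine similarity."""
    q = face_embedding(query_img)                     # (d,) unit vector
    names = list(gallery)
    sims = np.array([gallery[name] @ q for name in names])
    i = int(np.argmax(sims))
    return names[i], float(sims[i])                   # best guess + score
```

Note that the output is a best guess plus a similarity score, not a certainty, which is why the statistical caveats in footnote 1 matter.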

This works in both directions; a newly-standard use case for LLMs is, more or less, "Explain it like I'm not five," i.e. stripping as much extraneous verbiage as possible from a text to get at the fundamental point. Some of this can be done programmatically; there's a ritual on quarterly conference calls where the call starts with a sort of opening benediction, in which the company informs investors that the call will contain forward-looking statements that may or may not come to pass, and sometimes enumerates the magical verbs (like "can," "should," "may," "plans to") that should be understood to immunize the executives from accusations of securities fraud. The information content of this section of the call is zero, and it's easy to strip out of a transcript. Other parts of quarterly calls also carry minimal information, but they're more situational. Summarizing an earnings call, or uploading a bunch of them to a vector database in order to interrogate them, is a way to convert reading and memorization into a search problem.
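
A sketch of both steps, hedged accordingly: the regex assumes one common phrasing of the safe-harbor language, and embed() is a stand-in for whatever embedding model you'd actually use.

```python
import re
import numpy as np

# Assumes the boilerplate opens with a standard "forward-looking statements"
# sentence and runs until the next "Speaker:" line -- a common transcript
# format, not a universal one.
SAFE_HARBOR = re.compile(
    r"this call (?:will contain|contains|may contain) forward-looking statements"
    r".*?(?=\n\S+:)",
    re.IGNORECASE | re.DOTALL,
)

def strip_safe_harbor(transcript: str) -> str:
    """Drop the zero-information opening benediction, if it's found."""
    return SAFE_HARBOR.sub("", transcript, count=1)

def search(chunks: list[str], query: str, embed, k: int = 3) -> list[str]:
    """Reading a pile of transcripts becomes a similarity query. Assumes
    embed() returns unit-length vectors, so dot product = cosine similarity."""
    matrix = np.array([embed(c) for c in chunks])     # (N, d)
    scores = matrix @ embed(query)
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]
```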

AI is also improving our resolution of history. This was something The Diff discussed a bit a few weeks ago, but it seems to be accelerating. For example, The Beatles broke up in 1970, but released their latest album last Thursday. We're still early; you can read all of Goethe's works in German or Chinese, but not English. But over time, the idea of an untranslated work will be an anachronism; if it's been digitized in any language, it will be available in every language.[2]

What this ultimately means is that more of your asynchronous information consumption will happen at your preferred resolution. In person, things will still be tricky; you can't live-compress a five-minute monologue into a fifteen-second summary (or a two-hour lecture into a five-minute podcast episode) without waiting for it to end.[3] I've personally seen this with The Diff—there are readers who run long pieces through ChatGPT to summarize them (hence our decision to start offering summaries ourselves), and there are other readers who read articles and ask ChatGPT to expand on excessively terse bits.

You can imagine other instances of this. For example, if an essay seems like it's making an interesting point, but the author has wildly different premises than you—they're a Marxist, an Evangelical Christian, whatever—translating the essay into something disconnected from those premises can be a useful exercise. We already do this implicitly when we read anything sufficiently old; the worldviews of people even eighty years ago are sometimes alien. Of course, this is a tricky exercise, especially when the LLM has been optimized to be inoffensive in the context of the modern US (or China ($, WSJ) or Abu Dhabi).

This creates a more profound change, too: the slow death of incomprehensibility. Sometimes a conversation will involve jargon, or the application of some domain-specific concept in a new area, and this completely flummoxes people who are less familiar with it. Understanding was binary: if someone talking about politics said "this is just like 1994," or someone talking about macroeconomics said "it's 1997 all over again," the listener either got it or they didn't. But LLM chatbots mean that instead of a boolean, the datatype in question is time: jargon speeds up communication for people who share it, and for people who don't, it now just means a brief detour to ask ChatGPT what it could possibly mean.

The net result of this is that more obscure interests are accessible to more people. It's safer to pick up an obscure book on an esoteric topic. It's less risky to read an intimidating paper. And it's less of a waste of time to email the paper's author with follow-up questions—for both of you. LLMs expand the surface area of all human knowledge, and, conveniently, represent a map of that same surface. We've barely started exploring.


  1. Facial recognition is a naturally controversial topic, but it's unclear how much of the controversy comes from users not thinking statistically, and journalists doing the same. Models can be set to different accuracy thresholds, and if you are a police department in a city of 100,000 people and your model identifies a murder suspect with 99% accuracy, you have simultaneously reduced your potential suspect count by two orders of magnitude and ensured that, if you actually arrest the person who matches, your odds of getting the wrong person are 99.9%: a 1% false-positive rate still flags roughly a thousand innocent people, among whom the one true match is badly outnumbered (see the worked example just after these footnotes). In general, the UI for a statistical process that isn't being used by professional statisticians treats these probabilities as binary; Gmail does not have a p(spam) tag next to every email, just an inbox for p(spam) below a certain threshold and a spam folder for p(spam) above it. The way you know they've gotten the interface right for a given accuracy level is that you have both false positives and false negatives. Of course, it's better to have a higher accuracy level, but for whatever reason this is harder than it seems like it should be. ↩︎

  2. This will have significant social effects, but with a long lag, because the changes will be downstream from the behavior of people who like to nerd out. And that nerding out takes time. A more prosaic near-term effect will be a smaller valuation gap between companies that don't publish financial statements and investor communications in English and the ones that do. And one can imagine other kinds of AI-based machine translation, like creating truly comparable valuations for companies that use GAAP and IFRS, or looking at how insurers would be valued under different accounting regimes. ↩︎

  3. One of the downstream effects of this may be that heavy AI users will be more rude in person. You can see a bit of this when interacting with other people whose job performance hinges on the transmission of as many timely bytes as possible in as few syllables as necessary. ↩︎
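
To make the arithmetic in footnote 1 concrete, here's a worked version, reading "99% accuracy" as a 1% false-positive rate (an assumption; the footnote doesn't pin down the error model):

```python
population = 100_000    # the city from footnote 1
fpr = 0.01              # "99% accurate" read as a 1% false-positive rate
true_matches = 1        # assume the actual perpetrator is in the database
false_matches = fpr * (population - 1)      # ~1,000 innocent matches
suspects = true_matches + false_matches     # 100,000 -> ~1,000 suspects:
p_wrong = false_matches / suspects          # two orders of magnitude fewer
print(f"{suspects:,.0f} suspects; P(arrested match is innocent) = {p_wrong:.1%}")
# -> 1,001 suspects; P(arrested match is innocent) = 99.9%
```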

A Word From Our Sponsors

Manual search and data collection means time you’re NOT spending drawing insights, gaining perspective and making bold investment decisions.

Let Tegus do the busy work for you. Powerful AI and machine learning algorithms summarize each expert transcript to put key questions and themes front and center. And you get deep-dives into a single company, industry, or trend with auto-tagging across transcripts. Get better insights — faster — with Tegus' latest innovations in AI.

Elsewhere

Cruise's Asymptotes

The NYT has a good in-depth look at self-driving car company Cruise's difficulties with both safety and economics. The company was in a rush to get to market, which can be read as a classic move-fast-and-break-things attitude, but applied to human lives. On the other hand, human-driven cars also move fast and break things, with almost 43,000 deaths in 2021. The death number has been declining (with a reversion in the last few years), as population growth is offset by better safety features and behavioral changes like laws and stigma against drunk driving. (It's surprising how fast this has changed; in 1983, a New Yorker profile of the writer of "Hints from Heloise" could talk about driving drunk to another bar, and could treat the whole thing as a joke.) But it's still a lot of death, and if the last few years have taught us anything, it's that you don't want to bet that humans will always be better than computers at any GPU-bound task.

There has been some pushback against the article. For example, Swiss Re recently did a study with Waymo (via @kane) which found that Waymo had roughly one quarter the property damage per million miles driven that human drivers did; an injury rate can't be calculated yet because the sample doesn't include a single injury. This is based on a tiny number of incidents (three cases of property damage, zero injuries), but it suggests that autonomous driving can, in general, deliver better-than-human performance.

Of course, that doesn't necessarily speak to Cruise's ability to do so. Their CEO argues that the article overcounted staff and overcounted cases where humans intervene. The same comment also notes that this is another case where queuing theory leads to operating leverage: if the goal is to always have people on staff to handle emergencies, the relevant staffing level is based on potential peak demand, not average demand. Since those peaks are driven by chance, they shrink relative to the fleet as the fleet grows, so peak demand for human intervention as a ratio of fleet size declines. With a large enough fleet, the ratio of human staffers to vehicles slowly approaches the percentage of the time that vehicles need human intervention, but the initial ratio is always higher.
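
A toy version of that queuing logic, with invented numbers (nothing here comes from Cruise): assume each vehicle independently needs remote help 2% of the time, and you staff for the level of simultaneous demand you'll see in 99.9% of moments. The staff-to-fleet ratio falls from roughly 20% for a ten-car fleet toward the underlying 2% intervention rate as the fleet grows:

```python
import math

def log_binom_pmf(k: int, n: int, p: float) -> float:
    """log P(X = k) for X ~ Binomial(n, p), computed in log space so that
    large fleets don't underflow."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def staff_needed(fleet: int, p: float = 0.02, coverage: float = 0.999) -> int:
    """Smallest staff count s with P(simultaneous interventions <= s) >= coverage."""
    cdf = 0.0
    for s in range(fleet + 1):
        cdf += math.exp(log_binom_pmf(s, fleet, p))
        if cdf >= coverage:
            return s
    return fleet

for n in (10, 100, 1_000, 10_000):
    s = staff_needed(n)
    print(f"fleet={n:>6}  staff={s:>4}  staff/fleet={s / n:.1%}")
# staff/fleet falls from ~20% at fleet=10 toward the underlying 2% rate
```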

Shrinkage

Vox has a good piece on retail shrinkage that manages to interview loss-prevention personnel at stores and professional shoplifters (in both cases, interviewees are reluctant to use their real names). One thing the piece highlights is that loss prevention is always a balance between the cost impact of theft and the cost-and-revenue impact of preventing it: locking up high-value items does make them harder to steal, but also makes them harder to shop for (and isn't free), so some stores accept that it's better to lose $x from theft than to lose $2x from higher costs and lower purchases. (That, of course, ignores an important externality—the norm against shoplifting affects small stores and large ones, but it's the large ones that can calculate the IRR of a security camera or extra guard to the last basis point.)
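
As a toy version of that calculation (every number below is made up for illustration): treat an extra camera or guard as an investment whose return is avoided shrink minus the sales it costs you.

```python
# Toy loss-prevention math; all inputs are invented for illustration.
camera_cost = 2_000        # upfront cost of the camera
theft_prevented = 900      # expected annual reduction in shrink
sales_lost = 300           # annual revenue friction (locked cases, hassle)
net_annual = theft_prevented - sales_lost

# Treating the net savings as a perpetuity, NPV = -cost + net/r, which is
# zero at the IRR, so IRR = net / cost.
irr = net_annual / camera_cost
print(f"IRR = {irr:.0%}")  # 30% on these numbers -- worth doing
```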

The piece raises an interesting possibility: higher shrink is downstream from higher wages at the bottom of the wage distribution. Everyone in the piece agrees that having more people walk the floor of the store will at least discourage theft, and sometimes prevent it. But when labor gets more expensive relative to the inventory it protects, the profit-maximizing store buys less deterrence; if retailers experience faster inflation in labor costs than in inventory costs, shrinkage will increase unless some other force offsets it.

Open Source Economics

In July, Red Hat changed its licensing to make it more difficult for other organizations to create Red Hat-compatible distributions. In response, a consortium of other companies (CIQ, SUSE, and Oracle) has launched its own. A consistent problem in open source economics is the question of how to monetize: keeping a product open means that customers can feel more comfortable adopting it, since they'll be able to retain access even if the original developer stops offering it or makes a change they don't like. But that also makes it tricky to charge for. One option is to make source code free and charge for implementation; another is to have an open version that's a delayed release of the paid version. But running this model is tough when a product is not just a product but also a standard. Is the standard the newest version or the most public one? And if the newest version is a comparatively small departure, what stops someone who benefits from the standard, and doesn't care much about the seller's economics, from rolling their own replacement?

Derivatives and the Underlying

If there's an asset that's hard to own directly, but that investors want to bet on, one option is to get exposure synthetically through a derivative that tracks some index that includes, or is made of, the asset. But this has risks. For example: S&P Dow Jones removed Nigeria from an index of frontier markets, setting its value at zero, so derivatives tracking the index, and ETFs that own those derivatives, show a loss ($, FT). The assets in question still exist, but are hard to sell (Russian equities are in a similar limbo). Most of the time, the details of implementing a trade matter for tax or financing purposes, but give investors the same exposure they expected. But not always.

Shorting

South Korea has banned many short sales until the summer of 2024 in response to illegal short-selling among locals. This is close to an echo of the Nigeria story above: in Korea, there are logistical hurdles to putting on short-sale trades, which institutions can surmount but individual investors can't. When the market dips (earlier last week, it was down 15% from its August 1st peak), this has two effects: individual investors wish they'd been able to short easily, and they speculate that short sellers pushed stock prices down. That problem, at least, has somewhat abated; Korean stocks rallied, up 5.6% today, led by the sorts of tech companies that often get targeted by short sellers ($, FT).

Diff Jobs

Companies in the Diff network are actively looking for talent. A sampling of current open roles:

If you’re at a company that's looking for talent, we should talk! Diff Jobs works with companies across fintech, hard tech, consumer software, enterprise software, and other areas—any company where finding unusually effective people is a top priority.