
In this issue:
- Using AI for Writing like a Responsible Adult—It's incredibly annoying when people spam AI-generated writing everywhere. But it's equally-but-differently annoying to pretend that any use of AI contaminates writing. There are sensible, responsible ways to use AI, but you have to remember—if you're trying to produce original ideas, you have a partly-adversarial relationship with models that are designed to make you feel like that's what you're doing, whether or not it is.
- Granular Price Discrimination—OpenAI wants to subsidize every line of code that might include an API call to OpenAI
- ETFs—It's optimal for ETFs to launch before it's clear that there's demand.
- Form Factors—Meta would love to listen to every conversation on earth and use them for training data. They can't do that, at least not yet, but they can make progress.
- Training Data—Remember that we're in an era where startups aim to transfer wealth from VCs to consumers—but with the hope that this leads to profits later.
Using AI for Writing like a Responsible Adult
Apologies in advance: publication may be disrupted in the coming weeks. I’ve apparently managed to rupture my patellar tendon. This is one of the important ones, at least if you need to walk. So, expect a few missed posts and some abbreviated ones in the next week or two.
Anyway, onward.
Technology moves faster than norms, and sometimes you end up with a shearing effect where the same thing is simultaneously the subject of delusional promotion from its fans and differently deluded condemnation from its critics. So AI-generated writing is simultaneously liberating, and drivel, and pushing out all of the human stories, and incapable of ever replacing them. There's less of a clear market for the reasonable opinion that almost everybody holds, i.e. that LLMs are a good tool for writing nonfiction, that they're getting better, but that they're also dangerous in subtle ways, and the danger is getting subtler.
Ask for edits; don't ask the LLM to edit: one very defensible use case is that you've written a draft, and you want someone to read it (ideally quickly) before you show it to everyone at once. So you ask the computer. Models don't have great taste in writing, but they do have consistently above-average taste in every possible kind of writing. If you've written a political biography, GPT-5.5 isn't going to give you better feedback than Robert Caro would. But it will give you better feedback than Caro actually will, because he's busy (and please don't interrupt him. He's almost done). It's slightly annoying to have a draft side-by-side with suggestions and to manually type them in; it's much more annoying to realize that one-shotting draft-to-final replaced your favorite line with a contrastive parallelism. There are people who object to even this, but unless they've sworn off Google Docs entirely (or at least turned off its grammar and spellcheck), they're actually still using LLMs to edit their writing all the time.
Autodidacts, or people just getting up to speed in some new space, can flail around a lot because they don't have a good map of common knowledge. They'll reinvent things, misunderstand things, learn concepts but not labels and vice-versa. This is mostly a matter of cumulative exposure to the topic, but LLMs can help you skip a step; they're very good at providing overviews of the literature, recommended places to start, and prerequisites. This is a case where their averageness is a virtue; any given professor might have peculiar opinions on some thinker, which will distort their syllabus. But the average professor's idea of the best way to start approaching some topic, especially if it's qualified with some reference to why someone might reasonably choose an alternative, is actually a pretty good guide and roughly what you’d want. (For many programming and adjacent topics, there's a version of it that helps you ship software and a version that could help you prove some original theorem. These are overlapping areas, but usually someone interested in e.g. linear algebra has exactly one of these two use cases in mind.)
They're good at cross-tabulating unstructured data: Back when SEO was a more dominant strategy for getting traffic, a popular format was the top-N list. What publishers like about it is exactly what writers hate about it: the whole idea is to reprocess information that's already out there into some list, and to perhaps add some low-effort snark or attempt to judge it a bit. So, there are a lot of lists out there, both objective ("biggest explosions ever") and subjective ("Columbus' tastiest sandwiches"). One thing LLMs are pretty good at is creating the lists that should exist, but don't, like a list of the cases where one country bought territory from another, or a list of which Presidents served in the military in some capacity in the Second World War. (If you give Carter credit for being in the Naval Academy, and treat both Reagan's and LBJ's service as technically qualifying, then you get the fun historical tidbit that the first President after the Second World War not to have served in that war was born in 1946.[1])
Lists like these aren't good on their own, but they're very good as a way to get a somewhat representative sample. Ideally, you have a pattern in mind (maybe something like "money is exchanged for territory as a face-saving way for someone to surrender when a larger power threatens to annex them,") and you want to see if that pattern holds true. You could just ask an LLM directly, but then the LLM knows what answer would make you happy. You should in general handicap an LLM's answers the way you would those from a friend, but a bit more aggressively. If you show your friend something you made, and ask them if they think it's good, you'll have a very hard time getting them to admit that they don't like it, unless they have you pegged as the kind of person who’d make something deliberately terrible to make exactly this point. LLMs can sometimes candidly tell you that your idea is terrible, but the labs' incentive is for the models to do this just often enough that they seem like tough graders, while still grading you on whatever curve keeps you active.[2]
There are many tricks for getting LLMs not to destroy their value by pandering to you. One is the old "say this draft is by somebody else and ask the LLM to rip it apart" trick, though if you have a public body of work, the LLM will actually know who wrote it.[3] You can ask at different levels of abstraction, or ask for a judgement about an analogous situation, and then ask the LLM to poke holes in the analogy you made.
But, even though LLM critics could use a little more stochasticity when they parrot lines about letting a computer do your thinking for you, it is true that in the end, using an LLM for either research or editing requires you to make judgment calls about what to ask and how to evaluate the result. A day is as long as it was before LLMs, and if writers are sometimes saving hour-plus chunks of research time, fixing slightly subtle prose errors, finding just the right source to consult, etc., standards for prose will actually go up, at least for people who don't just prefer LLM-generated text. It couldn't work any other way; publishing something LLM-generated implies that actually writing it wasn't worth the effort. That's perfectly fine for some kinds of marketing copy, a little risky for things like a privacy policy, and mostly pointless for other kinds of writing.[4] Publishing something under your name continues to imply that you thought it was worth the effort it took to produce the text, and defecting from that norm means that other people have a hard time writing their way into fame.[5] Chatbots improve, norms shift, and writers will probably continue to use chatbots more. If there's a meta-heuristic, it's probably this: you can use them to do a better job for your readers, or to cheat your readers a little bit. Which of those you choose is entirely up to the person writing the prompts.
[1] I'm treating FDR and Truman as technically members of the military during the war, given that each was Commander in Chief.
[2] It's possible that because the revenue per user can be so much higher for using LLMs to write code, and because the coding incentive is a lot more truth-seeking, the models may be dragged in that direction over time. For now, assume they're not.
[3] Maybe you can get around that, too, by agreeing with your friends to trade LLM reviews, i.e. their LLM reviews your draft and vice-versa. But even in this case, an LLM that's cynically reasoning about what to do is going to say: "This text is obviously written by X, but Y's asking me about it. Y and X seem similar enough that it's plausible that they're friendly. And Y doesn't want to be the bearer of bad news, so I don't even need to mention some of the minor problems with it..." and so on. If this is a driver, the only way to get really good feedback from an LLM will be to track down someone smart but your polar opposite in as many important ways as possible, and offer the LLM-review-swap service to them.
[4] One minor exception: there are a surprising number of Reddit confessional stories floating around that mention, as a minor detail, that someone involved made a bunch of money on a specific casino site, Stake.
[5] You might object to this and say that this isn't quite fair, because people vary in how much effort it takes to produce a given essay. Ask a new lawyer and a lawyer with thirty years of experience to write about what it means to practice law, and the second one can probably whip out a much more impressive document given as much time as the first. But that's because the effort involved is thirty years, plus the time it took to write. People who naturally write quickly don't have this excuse.
You're on the free list for The Diff. Last week, paying subscribers read about how in the future, growth companies will report ever-declining margins, instead of consistently growing margins ($), and how subscriptions and property taxes are both a form of credit ($). Upgrade today for full access.
Diff Jobs
Companies in the Diff network are actively looking for talent. See a sampling of current open roles below:
- Lightspeed-backed team building the engineering services firm of the future is looking for founding members of technical staff excited about working alongside civil engineers to translate their domain expertise into the operating system that powers the next era of great American infrastructure. If you’re an engineer with strong product intuition, who's energized by access to users, and excited by the prospect of transforming how we design and construct our built world with frontier AI, this is for you. (NYC, SF or Remote)
- Series-A defense tech company that’s redefining logistics superiority with AI is looking for an MLE to build and deploy models that eliminate weeks of Excel work for the Special Forces. If you want to turn complex logistics systems into parametric models, fit them using Bayesian inference, and optimize logistics decision-making with gradient descent, this is for you. Python, PyTorch/TensorFlow, MLOps (Kubernetes, MLflow), and cloud infrastructure experience preferred. (Salt Lake City or NYC)
- A well-funded, Series C startup building the platform and agent primitives to drive operational transformation at large, complex institutions (starting with higher education) is hiring platform engineers. The work spans distributed systems, applied AI, and full-stack infrastructure, focused on deploying reliable agents that meaningfully bend institutional cost curves. (Remote)
- A hyper-growth startup that’s turning the fastest growing unicorns’ sales and marketing data into revenue (driven $XXXM incremental customer revenue the last year alone) is looking for a senior/staff-level software engineer with a track record of building large, performant distributed systems and owning customer delivery at high velocity. Experience with AI agents, orchestration frameworks, and contributing to open source AI a plus. (NYC, SF)
Even if you don't see an exact match for your skills and interests right now, we're happy to talk early so we can let you know if a good opportunity comes up.
If you’re at a company that's looking for talent, we should talk! Diff Jobs works with companies across fintech, hard tech, consumer software, enterprise software, and other areas—any company where finding unusually effective people is a top priority.
And: we're now actively deploying capital into early-stage companies through Anomaly. Our focus is on defense, logistics, robotics, and energy. If you'd like to chat, please reach out.
Elsewhere
Granular Price Discrimination
A few weeks ago, we wrote about OpenAI's offer of free credits to YC startups: it was a way to give them enough inference to, hopefully, build something that would lead to steady demand for inference. OpenAI also just ended another promotion where Codex users temporarily got access to more compute. This has a similar payoff function, and is basically a way for a less resource-constrained company to pull forward some immediate demand while raising the long-term level of demand. Over time, we might see more divergence in how tokens are priced based on when they're needed, but if the evolution of ad pricing is any guide, this will be an increasingly opaque process.
ETFs
Starting an ETF has an option-like payoff: you have some fixed costs, which means that you need to balance between losing money at first and having punitive fees. But, if your product works, it can become a metonym for an entire bet. And there's the standard reflexive argument, where an ETF that grows is validating its bets because it's pushing prices in that direction. But that also means that the optimal time to launch an ETF is well before it's clear that it will be viable, and ETFs are closing quite fast. Some intuitions on ETFs are just wrong, like the idea that the ratio of ETFs to public equities is meaningful (if you have a list of N things, the number of distinct subsets of fewer-than-N of them is far bigger than N). But it's worth remembering that these products are designed to have a short shelf life.
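To put rough numbers on the subsets point, here is a minimal sketch. It assumes an ETF is just a non-empty basket that leaves out at least one of the N listed securities, and it ignores weights entirely; the function names are made up for illustration.

```python
from itertools import combinations


def count_proper_subsets(n: int) -> int:
    """Count non-empty subsets of n items that omit at least one item."""
    # There are 2**n subsets in total; drop the empty set and the full set.
    return 2**n - 2


def enumerate_proper_subsets(items):
    """Brute-force enumeration, only practical for small lists."""
    # Sizes 1 through len(items) - 1, so both the empty and full sets are excluded.
    return [c for k in range(1, len(items)) for c in combinations(items, k)]


if __name__ == "__main__":
    # Check the closed-form count against brute force on a five-item list.
    tickers = ["A", "B", "C", "D", "E"]  # placeholder names, not real securities
    assert len(enumerate_proper_subsets(tickers)) == count_proper_subsets(5)

    for n in (5, 10, 20):
        print(n, count_proper_subsets(n))
```

With only 20 securities there are already more than a million possible baskets under this assumption, so a count of ETFs that exceeds the count of stocks says very little on its own.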
Form Factors
Meta is working on an AI pendant ($, The Information). AI has advanced so fast in part because all of us were so diligent at stringing meaningful tokens together and posting them in a convenient format. We probably thought we were answering questions on Stack Overflow, arguing about politics on Reddit, penning meaningful blog posts, etc., but we were really generating training data. But most of the human-generated tokens aren't digitized in that way; they're ephemeral either because they're spoken or because they're digitized in some context that doesn't lend itself to widespread collection. Meta will have to navigate some questions about exactly how it can use this data, but even increasing the surface area in which existing models can interact with the world and collect feedback is helpful, even if Meta isn't going to train on every single word spoken near the pendant.
Disclosure: long META.
Training Data
And in similar news, Shift, an AI startup, is offering free robotic house cleaning in exchange for collecting data. This is another data-availability problem: between cars with cameras and satellite photos, we have a lot of data about what the exteriors of buildings look like. For interiors, user reviews can sometimes help a model stitch together a view of an interior space. But for homes, the convenient data is abstract—floorplans, carefully-selected photos, etc. If for whatever reason your model needs to know and reason about which toiletries usually go where in a typical bathroom, or any of the other countless questions one might have about the interior of dwellings, it's lost.
This raises another point about AI: since there are high returns to taking an early lead, and there are many areas with data shortages, we're all in an SF-in-the-mid-2010s-style situation where venture capital is being turned into consumer welfare with the expectation that it'll later evolve into corporate profits. AI skeptics and AI optimists can agree that there are some deals we'll get now that we'll just never see again.