This might sound like a hot take but it’s not: In 50 years, when historians look back on the crazy 2020s, they might point to advances in artificial intelligence as the most important long-term development of our time. We are building machines that can mimic human language, human creativity, and human thought. What will that mean for the future of work, morality, and economics? Bestselling author Steven Johnson joins the podcast to talk about the most exciting and scary ideas in artificial intelligence and an article he wrote for The New York Times Magazine about the frontier of AI. Part of their conversation is excerpted below.
Derek Thompson: What I wanna do in the next 30 minutes to an hour is offer people a tour of the horizon of AI and ask some big, hard questions about what is exciting about this horizon, what is scary about this horizon, and how we can move the frontier of this technology toward the less dystopian implications of it. And I wanna start off with a very specific example of AI and that is GPT-3. Steven, what is GPT-3 and why are you excited about it?
Steven Johnson: Well, GPT-3 is a kind of a subset of AI. It’s a specific implementation of a category known as large language models, and it also belongs to the family of neural nets and the family of deep learning. So those are a bunch of buzzwords right there that we’ll want to unpack.
DT: We’re gonna unpack those buzzwords in just a second.
SJ: Yeah. It is basically a neural net that is modeled very vaguely on the structure of the human brain, though we should not take that biological analogy too far. It goes through what’s called a training process, where it is shown a massive corpus of text: basically a curated version of the open web, Wikipedia, and a body of digitized books that are in the public domain. It ingests all of that information, and this training process is really kind of fascinating. We can get into the details of it, but basically it learns to associate connections between all the words in that body of text. And through that training process, it is able to respond when you give it prompts—initially in the form of “here’s a sentence” or “here’s a paragraph; continue writing in this mode for another paragraph, or another five paragraphs.”
And if you have a big enough corpus of text and a deep enough neural network, it turns out that computers over the last couple of years have gotten quite good at continuing human-authored text. Initially it was a bit of a parlor trick: you would write a paragraph, earlier versions of this software would continue it, and you’d look at the result and think, “Yeah, that sounds vaguely like something a human could have written”—but it was also obviously nonsensical in all these ways, and it wasn’t particularly interesting. Most users encounter this technology in things like autocomplete: when you’re writing a sentence in Gmail and it suggests a little word at the end, that’s basically built on top of the same kind of model.
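The “continue the text” behavior Johnson describes can be sketched at toy scale with a simple lookup table of word pairs. This is nothing like GPT-3’s neural net—it’s a minimal, invented illustration of the core idea of repeatedly picking a likely next word, using a made-up corpus of a few sentences:

```python
from collections import Counter, defaultdict

# Tiny invented corpus; real systems learn from billions of sentences.
corpus = "it was great to see you last night it was great to catch up".split()

# For each word, count which words follow it in the corpus.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def autocomplete(prompt, n_words=3):
    """Greedily extend a prompt by repeatedly choosing the most common next word."""
    words = prompt.split()
    for _ in range(n_words):
        candidates = follows[words[-1]].most_common(1)
        if not candidates:
            break  # never saw this word followed by anything
        words.append(candidates[0][0])
    return " ".join(words)

print(autocomplete("it was great to see you last", n_words=1))
# → "it was great to see you last night"
```

In this miniature version, the Gmail-style suggestion is just the most frequent follower of the last word; the real systems condition on far more context than a single word.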
DT: Oh, right. “It was great to see you last—” And then Gmail suggests in light-gray font, “night.”
SJ: It’s the same idea.
DT: It is using its understanding of millions and millions of emails already sent to predict the next word in the email that you are sending. And just to add a little 101 context to your first answer: neural nets—we’re not gonna get into the full definition here—are basically sets of algorithms, loosely modeled on the human brain, that learn to identify patterns or solve problems through repeated cycles of trial and error. It’s a domain of AI that is very popular, shows a lot of promise, and is behind the large language models you just talked about. One of these large language models is GPT-3. And the reason I think GPT-3 is so interesting is that it’s not just the sort of technology that can add the word “night” when you type “It was great to see you last” in Gmail. It can go much further than that: it can summarize books, it can summarize papers, it can write entire essays in response to very complicated prompts. Can you give us some examples of the implications of GPT-3 that are most thrilling to you?
SJ: Yeah. So let me say one more thing about the structure of it, which I think is kind of fascinating. And I agree we don’t wanna go too far down the rabbit hole of how it actually works, but on some fundamental level, it is trained on this very elemental act of next-word prediction. And to me this is one of the things I find kind of mind-blowing about it. There’s a lot of complexity to what’s going on in the neural net, but fundamentally the training process is: ingest all the history of the web and Wikipedia, and then endlessly run through training examples where the model is shown a real-world paragraph that some human has written, with one word missing. And in the initial stages of the training process, the software is instructed, in effect: “Come up with the missing word. Come up with a statistically ranked list of the most likely words in this particular paragraph.”
And in the initial pass, it’ll give you, you know, 30,000 words that might be that missing word. And it’ll be terrible at it. It’ll be awful. But somewhere at the bottom of the list, like word no. 29,000, will be the right word. And so the training process says, “OK, whatever set of neural connections led you to make guess no. 29,000, strengthen all of those connections and weaken all of the others in your neural net.” And it just plays that game a trillion times, and eventually it gets incredibly good at predicting the next word—and, in fact, whole sentences or paragraphs. That’s what seems to have happened over the last three or four years: GPT-3’s predecessor, GPT-2, came out a couple of years ago, and over this period the software has become much better, as you say, at constructing larger thoughts and making arguments and summarizing and doing things like that. So to me, it’s just mind-blowing that it all fundamentally comes out of this act of next-word prediction—that that’s the fundamental unit of the whole exercise.
This excerpt was lightly edited for clarity.
Host: Derek Thompson
Guest: Steven Johnson
Producer: Devon Manze