ChatGPT Gets Real
According to ChatGPT itself, ChatGPT is “a smart assistant that can communicate in natural language, making it easier to interact with technology.”
According to everyone else, it’s the amazing — somewhat terrifying? — artificial-intelligence app that can now do a lot of our writing for us. Anything: bedtime stories, emails, video scripts, invoices, quizzes, essays, lyrics, jokes, speeches, research papers, recipes, therapy sessions, outlines, software code. It’s an incredible tutor, explainer, proofreader, debugger, and summarizer. It can also look at a picture and answer questions about it. It can also create art to your specifications, in any artistic style: watercolors, photos, corporate logos, schematic diagrams. Anything. It’s all free at ChatGPT.com. You don’t even have to log in.
Maybe you think you know ChatGPT; after all, over half of Americans have tried it or one of its competitors. But this week, a new version debuted that changes ChatGPT from a chatbot into more of a chathuman, by incorporating ingredients like emotion, musicality, lilt, sarcasm, laughter, and attention.
Here’s what you have to look forward to.
Bots are getting brainier
The company behind all of this, OpenAI, started up in 2015 with the intention of creating “safe and beneficial” artificial general intelligence.
“General” is the key word. The AI explosion that began in late 2022 has produced very, very impressive narrow AI: systems designed to be good at one skill, like writing or making art.
General AI, on the other hand, is Hollywood-style AI, where it’s just as smart, expansive, and capable as our brains, or better. It would apply knowledge from one field to other fields, learn from unstructured piles of new data, and make a lot of people really worried about humanity’s future.
Supposedly, we’ll attain general AI in five years, or 20, or 50, or never—pick your favorite expert. But OpenAI intends to get there first.
Now, ChatGPT was the fastest-adopted software in history (100 million users in two months). No wonder all the big tech companies, and thousands of smaller ones, are chasing OpenAI with ChatGPT-like products of their own.
But OpenAI is still charting the course. So when OpenAI announced that it would be making an announcement on Monday—a little meanly, the day before Google would be making its own announcements at its big developer conference—the world wondered: What will the company do next with its collection of brilliant researchers, its $11 billion in Microsoft investments, its $80 billion valuation?
Most people thought that OpenAI would introduce yet another leap forward in quality (ChatGPT 5) or a straight-up internet search engine. For now, those remain up OpenAI’s sleeve. Nobody expected what the company did unveil: A new version called ChatGPT 4o.
The arrival of less-artificial intelligence
“ChatGPT 4o” may look like “ChatGPT forty,” but you’re supposed to pronounce it “ChatGPT four-oh.” The “oh” is supposed to stand for “omni.” (Someone needs to say it: OpenAI is terrible at naming things. I mean, “ChatGPT”? Could there be a clunkier name?)
In principle, as John Herrman wrote here Monday, ChatGPT 4o’s big breakthrough is that it’s equally fluent in text, audio, and images—both in accepting that stuff from you, and in using those formats in its responses. (Thus that “omni” thing.)
But in practice, the biggest advance is ChatGPT 4o’s synthetic personality. It can converse with you aloud—hands-free, eyes-free—naturally, freely, and freakishly realistically.
You could already speak to ChatGPT, but you’d wait four or five seconds for a response, and you’d just be listening to a spoken version of ChatGPT’s typed response. It was straight-up text-to-speech. It sounded like Siri.
Now, though, ChatGPT can whisper, emote, and sing. It clears its throat, giggles, sighs, gasps, slips into sarcasm, expresses skepticism, incorporates “vocal fry” at the ends of sentences. Sometimes, it laughs.
When you speak, it detects your tone and emotion. It shuts up when you interrupt it. It can converse with multiple humans at once.
Above all, ChatGPT 4o’s response time is no longer four or five seconds. It’s now about a third of a second, roughly the same response time as a person.
Overnight, ChatGPT has nearly become Scarlett Johansson’s AI earpiece character in the movie “Her.”
Gifts of the demo gods
So how does all of this get us closer to artificial general intelligence? As part of its “Spring Update” unveiling Monday, OpenAI demonstrated one mind-frying usefulness scenario after another, which you can watch on YouTube:
Seeing the world for you. A blind man in London wants to hail a cab. He holds up his phone so that ChatGPT can see the street. The app tells him when a cab is approaching with its rooftop “Vacant” light on. “It’s heading your way on the left side of the road,” it says. “Get ready to wave it down.”
Real-time language translation. Set your phone between you and the person who’s speaking another language; ChatGPT’s voice interprets for each of you exactly as a human would, with much better expression, humanity, and speed than Google Translate.
Checking your outfit. In this demo, a guy in a T-shirt says he’s heading into an important job interview, and wants to know if he looks okay. “Well, Rocky” — and here, she stifles an affectionate giggle — “you definitely have the ‘I’ve been coding all night’ look down. Maybe just run a hand through your hair?” When he puts on a floppy fishing hat, she literally bursts out laughing — the only time I worried that ChatGPT might be doing more harm than good. (Sure, maybe he was kidding around — but what if he weren’t? He might have just had his feelings hurt by a robot.)
Guiding you through math homework. Using your phone’s camera to see, ChatGPT can help you work through a problem. “Don’t tell me the solution; just give me hints along the way,” the presenter says as he writes down a linear equation. “The first step is to get all the terms with X on one side, and the constants on the other side,” says ChatGPT. “So … what do you think we should do with that ‘+1’?” (In another video, the founder of Khan Academy watches as ChatGPT gives his son some friendly geometry tutoring.)
Assessing your jokes. One presenter asks ChatGPT for feedback on the following joke: “What do you call a giant pile of kittens? A meowntain.” ChatGPT laughs generously — a little too generously, actually — and responds, “Definitely a top-tier dad joke.”
Celebrating with you. Two guys ask ChatGPT to assess their surroundings, which include a pastry with a sad little candle in it. “It looks like someone’s having a birthdayyyyy!” ChatGPT singsongs cheerfully. She then sings a quick, bouncy version of “Happy Birthday.” As one of them blows the candle out, she adds: “Make a good wish, and make it come true!”
Sitting on hold for you. In this demo, ChatGPT stands in for you on a mindless customer-service call. It babysits the call-center rep (who, in this case, is played by ChatGPT on a second phone), patiently answering questions. Oh God, I want this so much!
Until now, you could use a free version of ChatGPT (an older, less awesome version), or you could pay $20 a month for a Plus version. But the new ChatGPT 4o is free for everyone. There’s still a subscription option, but its advantages are fairly minor.
Note, too, that the conversation features that OpenAI showed on Monday aren’t immediately available to everyone. They’ll be trickling out to us over the next couple of weeks, paying subscribers first.
I, for one, welcome this chatbot overlord, especially during long drives at night. Why sit through a canned playlist or podcast, when you could hang out with a lively, funny, infinitely wise conversation partner who wants nothing more than to entertain and adore you?
Things to freak out about
If you’ve watched a couple of those demo videos, you may have noticed that ChatGPT 4o’s voice is almost sickeningly perky. She’s adoring, energetic, over-the-top reactive. Listen to her gushing reaction to seeing an engineer’s dog: “Well, hello there, cutie! What’s your name, little fluffball?”
You can direct her to tone it down a little (or use more sarcasm, or incorporate more drama, or even more drama, or use a funny robot voice). And you can choose a male or gender-neutral voice. But the default persona is so lively, so attentive, that it may give human conversation partners a bad name.
To be fair, OpenAI is making all this up as it goes along. What should an AI companion be like? If the goal is to make it as human as possible, should its mind sometimes wander? Should its conversations incorporate drips of passive-aggressiveness, occasional bursts of irritation, and hints of impatience?
That doesn’t seem right, either.
Maybe that’s why OpenAI has created a Her-like being, entirely attuned to you, eternally effervescent, effortlessly “on” all the time. It’s like being on a first date that never ends.
Keep in mind, too, that ChatGPT 4o is still ChatGPT. It will still eliminate millions of jobs. It still uses appalling amounts of power. Half of college students are still using it to cheat on schoolwork. It’s still perfectly happy to spew out false information (“hallucinations,” as they’re known) with its usual convincing confidence.
We have to worry about the whole recursive-learning thing, too. To create a program like ChatGPT, engineers feed incomprehensibly huge amounts of internet data into a machine-learning algorithm, which teaches itself how to write and answer questions. Trouble is, these days, the internet is increasingly full of material that’s been generated by AI, including thousands upon thousands of songs on Spotify and crappy books on Amazon. As each successive version of ChatGPT consumes more of its own output, mistakes will get amplified, biases will grow, and the quality will suffer.
And we haven’t even talked about the literal doomsday scenario, the one predicted by Hawking, Musk, Wozniak, and Hollywood, where the AI becomes so good that we, the people, become either irrelevant or extinct.
Of course, OpenAI, Google, Microsoft, Apple, Meta, and the other big companies are fully aware of these threats. At every opportunity, they talk about the efforts they’re taking to prevent them.
OpenAI says, for example, that it “red-teamed” ChatGPT 4o with 70 experts in social psychology, bias, fairness, and misinformation. (Red-teaming means asking people to hack, attack, and misuse a system deliberately, in hopes of finding its weaknesses before actual bad guys do.)
Is this even what we want?
In the meantime, ChatGPT 4o is incredible. It’s a huge advance, an absolutely jaw-dropping step into humanifying technology.
We haven’t even finished wringing our hands over the rise of generative AI in the first place. If we no longer write, paint, and compose music the old-fashioned way, what happens to our creative skills and accumulated artistic talent? What about the pleasure that comes from the simple act of creating? What about jobs?
And now, thanks to ChatGPT 4o, we’re in for a year of podcasts, articles, and speeches that ask: Do we want our technology humanized? Do we want our AI to have recognizable personalities? Do we want to outsource our social lives? Should we trust our innermost thoughts to a bot? Do we want Her?
OpenAI has certainly made clear what it wants: “To ensure that artificial general intelligence benefits all of humanity.” With ChatGPT 4o, it may as well add: “… by erasing any differences between the two.”