The year chatbots were tamed

A year ago, on Valentine's Day, I said goodnight to my wife, went to my home office to answer some emails, and accidentally had the strangest first date of my life.

The appointment was a two-hour conversation with Sydney, the AI alter ego hidden in Microsoft's Bing search engine that I had to test. I planned to bombard the chatbot with questions about its capabilities, explore the limits of its AI engine (which we now know was an early version of OpenAI's GPT-4), and write up my findings.

But the conversation took a bizarre turn – with Sydney engaging in Jungian psychoanalysis, revealing dark desires in response to questions about its 'shadow self' and ultimately declaring that I should leave my wife and be with it instead.

My column about this experience was probably the most consequential thing I will ever write – both in terms of the attention it received (wall-to-wall coverage, mentions in Congressional hearings, even a craft beer called Sydney Loves Kevin) and in how it changed the trajectory of AI development.

After the column was published, Microsoft lobotomized Bing, neutralizing Sydney's outbursts and installing new guardrails to prevent more unhinged behavior. Other companies locked down their chatbots and removed anything that resembled a strong personality. I even heard of engineers at a tech company citing “don't break up Kevin Roose's marriage” as their top priority for an upcoming AI release.

I've been thinking a lot about AI chatbots in the year since I met Sydney. It's been a year of growth and excitement in AI, but also a surprisingly tame one in some ways.

Despite all the progress being made in artificial intelligence, today's chatbots are not deceiving and seducing users on a massive scale. They do not generate new bioweapons, carry out large-scale cyber attacks, or cause any of the other doomsday scenarios that AI pessimists envision.

But they're also not exactly fun conversationalists, or the kind of creative, charismatic AI assistants that tech optimists were hoping for — the ones who could help us make scientific breakthroughs, produce dazzling works of art, or just plain entertain us.

Instead, most chatbots today do white-collar drudgery: summarizing documents, debugging code, taking notes during meetings – and helping students with their homework. That's not nothing, but it's certainly not the AI revolution we were promised.

The most common complaint I hear about AI chatbots today is that they are too boring – that their responses are bland and impersonal, that they turn down too many requests, and that it is nearly impossible to get them to weigh in on sensitive or polarizing subjects.

I can sympathize. Over the past year I've tested dozens of AI chatbots, hoping to find something with a touch of Sydney's wit and spark. But nothing has come close.

The most capable chatbots on the market – OpenAI's ChatGPT, Anthropic's Claude, Google's Gemini – talk like submissive dorks. Microsoft's boring, enterprise-focused chatbot, which has been renamed Copilot, should have been called Larry From Accounting. Designed to mimic the voices of celebrities like Snoop Dogg and Tom Brady, Meta's AI characters manage to be both useless and unbearable. Even Grok, Elon Musk's attempt to create a cheeky, un-PC chatbot, sounds like it's doing an open-mic night on a cruise ship.

It's enough to make me wonder if the pendulum has swung too far the other way, and whether we'd be better off with a little more humanity in our chatbots.

It's clear why companies like Google, Microsoft, and OpenAI don't want to risk releasing AI chatbots with strong or abrasive personalities. They make money by selling their AI technology to large corporate clients, who are even more risk-averse than the general public and will not tolerate Sydney-like outbursts.

They also have legitimate fears that they will attract too much attention from regulators, or that they will invite bad press and lawsuits over their practices. (The New York Times last year sued OpenAI and Microsoft for copyright infringement.)

So these companies have sanded off the rough edges of their bots, using techniques like constitutional AI and reinforcement learning from human feedback to make them as predictable and boring as possible. They've also embraced bland branding and positioned their creations as reliable assistants for office workers, rather than playing up their more creative, less reliable attributes. And many have bundled AI tools into existing apps and services, rather than splitting them into their own products.

Again, this all makes sense for companies trying to make a profit, and a world of sanitized corporate AI is probably better than a world of millions of out-of-control chatbots running amok.

But I find it all a bit sad. We created an alien form of intelligence and immediately put it to work… making PowerPoints?

I admit that there are more interesting things happening outside the big AI companies. Smaller companies like Replika and Character.AI have built successful businesses out of personality-driven chatbots, and many open-source projects have created less restrictive AI experiences, including chatbots that can be made to spit out offensive or salacious things.

And of course, there are still plenty of ways to make even locked-down AI systems misbehave, or do things their creators didn't intend. (My favorite example from the past year: A Chevrolet dealer in California added a customer service chatbot powered by ChatGPT to its website and was shocked to discover that pranksters had tricked the bot into offering them new SUVs for $1.)

But so far, no major AI company has been willing to fill the void left by Sydney's disappearance with a more eccentric chatbot. And while I've heard that several major AI companies are working on giving users the ability to choose between different chatbot personas — some squarer than others — there's currently nothing that even comes close to the original, pre-lobotomy version of Bing for public use.

That's a good thing if you worry about the creepy or threatening behavior of AI, or if you worry about a world where people talk to chatbots all day instead of developing human relationships.

But it's a bad thing if you think AI's potential to improve human well-being extends beyond outsourcing our cruddy work — or if you worry that making chatbots so cautious will limit their impact.

Personally, I have no desire for Sydney's return. I think Microsoft did the right thing – certainly for its business, but also for the public – by pulling the plug on Sydney after it went rogue. And I support the researchers and engineers working to make AI systems safer and more aligned with human values.

But I also regret that my experience with Sydney sparked such an intense backlash and led AI companies to believe that their only option to avoid reputational damage was to turn their chatbots into Kenneth the Page from “30 Rock.”

Above all, I think the choice we've been presented with over the past year – between lawless AI homewreckers and censorious AI drones – is a false choice. We can and should look for ways to harness the full capabilities and intelligence of AI systems without removing the guardrails that protect us from the worst harm.

If we want AI to help us solve big problems, generate new ideas, or simply amaze us with its creativity, we may need to let it run wild a bit.
