Google announced a supercharged update to its Bard chatbot Tuesday: The tech giant will integrate the generative AI into the company’s most popular services, including Gmail, Docs, Drive, Maps, YouTube, and more. Together with a new feature that tells you when Bard provides potentially inaccurate answers, the new version of the AI is neck-and-neck with ChatGPT for the most useful and accessible large language model on the market.
Google is calling the generative features “Bard Extensions,” the same name as the user-selected additions to Chrome. With the AI extensions, you’ll be able to send Bard on a mission that pulls in data from all the disparate parts of your Google account for the very first time. If you’re planning a vacation, for example, you can ask Bard to find the dates a friend sent you on Gmail, look for flights and hotel options on Google Flights, and devise you a daily itinerary of things to do based on information from YouTube. Google promises it won’t use your private data to train its AI, and that these new features are opt-in only.
Perhaps just as significant is a new accuracy tool Google calls “Double Check the Response.” After you ask Bard a question, you can hit the “G” button, and the AI will check to see if answers are backed up by information on the web and highlight information that it may have hallucinated. The feature makes Bard the first major AI tool that fact-checks itself on the fly.
This new, souped-up version of Bard is a tool in its infancy, and it may be buggy and annoying. But it’s a glimmer of the kind of technology we’ve been promised since the early days of science fiction. Today, you have to train yourself to ask questions in the extremely limited terms a computer can understand. It’s nothing like the tools you see on a show like Star Trek, where you can bark “computer” at a machine and give instructions for any task with the same language you’d use to ask a human being. With these updates to Bard, we come one tiny but meaningful step closer to that dream.
Gizmodo sat down for an interview with Jack Krawczyk, Product Lead for Google Bard, to talk about the new features, chatbot problems, and what the near future of AI looks like for you.
(This interview has been edited for clarity and consistency.)
Jack Krawczyk: Two things that we hear pretty consistently about language models in general are, first, that “it sounds really cool, but it isn’t really useful in my day-to-day life.” And second, you hear that it makes things up a lot, what savvier people call “hallucination.” Starting tomorrow, we have an answer to both of those things.
We’re the first language model that will integrate directly into your personal life. Through the announcement of Bard extensions, you finally have the ability to opt in and allow Bard to retrieve information from your Gmail, or Google Docs, or elsewhere and help you collaborate with it. And with Double Check the Response, we’re the only language model product out there that’s willing to admit when it’s made a mistake.
Thomas Germain: You summed up my reaction to the last year of AI news pretty well. These tools are amazing, but in my experience, fundamentally useless for most people. By roping in all of the other Google apps, it’s starting to feel like less of a party trick and more like a tool that makes my life easier.
JK: At its core, we believe interacting with language models lets us change the mindset that we have with technology. We’re so used to thinking of technology as a tool that does things for you, like tell me how to get from point A to point B. We’ve found people naturally gravitate towards that. But it’s really inspiring to see it as technology that does things with you, which isn’t intuitive in the beginning.
I’ve seen people use it for things that I would have never expected. We actually had someone snap a photo of their living room, and ask, “how can I move my furniture around to improve feng shui?” It’s the collaborative bit that I’m excited about. We call it “augmented imagination,” because the ideas and curiosity are in your head. We’re trying to help you at a moment where ideas are really fragile and brittle.
TG: We’ve seen a lot of examples where Bard or some other chatbot spits out something racist, or gives dangerous instructions. It’s been about a year since we all met ChatGPT. Why is this problem so hard to solve?
JK: This is where I think the Double Check feature is really helpful to understand that at a deeper level. So the other day I cooked swordfish, and one of the things that’s challenging about cooking swordfish is that it can make your whole house smell for several days. I asked Bard what to do. One of the suggestions it gave was “wash your pet more frequently.” That’s a surprising solution, but it sort of makes sense. But if I use the Double Check feature, it tells me it got that wrong, and results from the web say washing your pet too frequently can remove the natural oils they need for healthy skin.
We’ve evolved the app, so it goes sentence by sentence and searches on Google to see if it can find things that validate its answers or not. In the pet washing case, it’s a pretty good response, and it’s not like there’s necessarily a right or wrong answer, but it requires nuance and context.
TG: Bard has a little disclaimer that says it might provide inaccurate or offensive information and it doesn’t represent the company’s views. More context is good, but the obvious criticism is, “why is Google releasing a tool that might give offensive or inaccurate answers in the first place?” Isn’t that irresponsible?
JK: What these tools are really useful for is exploring the possibilities. Sometimes when you’re in a collaborative state you make guesses, right? We think that’s the value of technology, and there is no tool for that. We can give people tools for brittle situations. We heard feedback from a person who has autism and they said, “I can tell when someone who writes me an email is angry, but I don’t know if the response that I’m going to give them will make them more angry.”
For that issue, you need to interpret rather than analyze. You have this tool that has potential to solve problems that no other technology can solve today. That’s why we have to strike this balance. We’re six months into Bard. It’s still an experiment, and this problem isn’t solved. But we believe there is so much profound good that we don’t have answers for today in our lives, and that’s why we feel it’s critical to get this into people’s hands and collect feedback.
The question that you’re asking is, “why put out technology that makes mistakes?” Well, it’s collaborative and part of collaboration is making mistakes. You want to be bold here, but you also have to balance it with responsibility.
TG: I imagine the goal is that someday, there won’t be a difference between Bard and Google Search, it will just be Google and you’ll get whatever is most useful at the moment. How far away is that?
JK: Well, an interesting analogy is the tool belt versus the tools. You’ve got a hammer and screwdriver, but then there’s the belt itself. Is that also a tool? That’s probably a semantic debate. But right now, most of our technology works something like, well, I go to this site to get this job done. I go to that site to get that other job done. We’ve got all individual tools, and I think they will be supercharged by generative AI. You’re still using the different tools, but now they’re working together. That’s kind of how we see having a standalone generative experience, and I think we’re taking the first step towards that today.
TG: This probably isn’t what you’re planning on talking about today. But I want to ask you about sentience. What do you think it is? Is that even an important question for us to be asking people like you right now?
JK: I think the fact that people are asking it means that it’s an important question. Is what we’re building today sentient? Categorically, I would say the answer is no. But there’s a discussion to be had about whether it has the opportunity to be sentient. With sentience, I think in many forms it centers around compassion. I have not seen any signals that suggest that computers can have compassion. And pulling from Buddhist principles here, in order to have compassion, you need to have suffering.
TG: So you haven’t given Bard any pain sensors yet?
JK: [Laughing] No.
TG: Can you share anything about Google’s plans to integrate Bard with Android?
JK: For the time being, Bard remains a standalone web app at bard.google.com. And the reason that we’re keeping it there is it’s still an experiment. For an experiment to be useful, you want to minimize the variables that you put into it. At this phase, our first hypothesis is a language model connected with your personal life is going to be extremely helpful. The second hypothesis is a language model that’s willing to admit when it’s made a mistake and how confident it is in its own responses is going to build a deeper truth about the ways people can engage with this idea. Those are the two hypotheses that we’re testing. There are plenty more that we want to test. But for now, we’re trying to minimize the variables.