Large language models like Google’s Bard are trained on all kinds of data, most of which was collected without anyone’s knowledge or consent. Now, website owners can choose whether Google may use their web content as material to feed its Bard AI and any future models it decides to make. In this article, we will discuss how to opt your website out of Google’s Bard and future AIs.
To opt your website out of Google’s Bard and future AIs, add a rule for the “Google-Extended” user agent to your site’s robots.txt and disallow the paths you want to keep out of AI training (a “Disallow: /” line covers the whole site). The robots.txt file tells automated web crawlers what content they may access. Disallowing “Google-Extended” tells Google not to use the covered content for Bard or its future models.
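As a rough illustration, the sketch below holds a two-line robots.txt rule in a string and checks it with Python’s standard-library robots.txt parser. The helper name `is_allowed` and the example.com URL are placeholders invented for this example, and the rule blocks the whole site, which may be broader than you need.

```python
from urllib.robotparser import RobotFileParser

# The two lines you would place in robots.txt at the root of your site.
# They target only the Google-Extended token; other crawlers are unaffected.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/post.html"  # placeholder URL
    for agent in ("Google-Extended", "Googlebot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, page))
```

With these rules, the check should report Google-Extended as blocked and Googlebot as still allowed, so ordinary search indexing is not affected by the opt-out.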
Google claims to develop its AI in an ethical, inclusive way, but training AI is a meaningfully different use case from indexing the web. “We’ve also heard from web publishers that they want greater choice and control over how their content is used for emerging generative AI use cases,” the company’s VP of Trust, Danielle Romain, writes in a blog post, as if this came as a surprise.
Google argues that this data collection is necessary to improve Bard’s performance. Bard is used to generate text, translate languages, write different kinds of creative content, and answer questions in an informative way. Google says that Bard is still under development, and that it needs access to a wide variety of data in order to learn and improve.
While Google’s Bard AI has been designed to improve content planning and support SEO initiatives, it has faced criticism for being trained on Infiniset, a dataset built from internet content; very little is known about where that data came from or how Google obtained it. By opting your website out of Google’s Bard and future AIs, you gain greater control over how your web content is used.