# Google-Extended

Google-Extended is not a web crawler but a control token that lets website owners manage whether their content can be used to train Google's generative AI models, such as Gemini and the Vertex AI generative APIs. By adding a specific directive to a `robots.txt` file, publishers can opt out of AI training without affecting their site's visibility or ranking in Google Search results. Google introduced it to give creators more control over their content in the age of AI.

## What is Google-Extended?

Google-Extended is a special token that can be used in a `robots.txt` file. It does not fetch pages itself and has no user-agent string of its own; it works in conjunction with Google's existing crawlers, like Googlebot. While Googlebot still crawls the site for search indexing, the Google-Extended token in `robots.txt` tells Google whether that crawled content can also be used to train and improve AI systems like Google Gemini and the Vertex AI APIs.

## Why is Google-Extended relevant to my site's crawling?

Google-Extended does not crawl your site independently. Instead, it is a rule that Google's regular crawlers (like Googlebot) look for when they visit. If you have not explicitly blocked Google-Extended in your `robots.txt` file, Google considers your content eligible for use in training its AI models.

This check is part of Google's standard crawling operations, so it adds no extra requests. How often those crawls happen is determined by your site's normal crawl budget, which is influenced by factors like your content update schedule and site authority.
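To see how a `robots.txt` file treats the token, the rules can be evaluated locally. The following is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `is_ai_training_blocked` and the `example.com` URL are illustrative assumptions, and the standard-library parser only approximates how Google itself interprets the file.

```python
from urllib.robotparser import RobotFileParser


def is_ai_training_blocked(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if the given robots.txt text disallows Google-Extended for `url`.

    Hypothetical helper for illustration; it relies on Python's generic
    robots.txt matching, not on Google's own parser.
    """
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("Google-Extended", url)


# A robots.txt that opts out of AI training while leaving other agents alone.
opt_out = "User-agent: Google-Extended\nDisallow: /\n"

# A robots.txt with no Google-Extended rule at all.
no_rule = "User-agent: Googlebot\nAllow: /\n"

print(is_ai_training_blocked(opt_out))
print(is_ai_training_blocked(no_rule))
```

With the opt-out rules the helper reports the token as blocked, while the same rules leave Googlebot unaffected because the group names only Google-Extended.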
## What is the purpose of Google-Extended?

The primary purpose of Google-Extended is to act as a consent and control mechanism, giving publishers a clear choice about whether to contribute their content to the development of Google's AI ecosystem. It specifically governs the use of content for training Google Gemini and the Vertex AI generative APIs. Google introduced it in response to publisher concerns about the use of their content for AI training. Importantly, opting out via Google-Extended is designed to have no negative impact on a site's inclusion or ranking in standard Google Search results.

## How do I use Google-Extended to block AI training?

To prevent your website's content from being used to train Google's generative AI models, add a rule for the Google-Extended token to your `robots.txt` file. This opts your site out of content use for this purpose without affecting your Google Search ranking.

Add the following lines to your `robots.txt` file:

```
User-agent: Google-Extended
Disallow: /
```

## Canonical

A human-friendly reader version of this article is available at [Google-Extended](https://plainsignal.com/agents/google-extended)

## Copyright

(c) 2025 [PlainSignal](https://plainsignal.com/ "Privacy-focused, simple website analytics")