In early October, Google introduced Google-Extended, a new robots.txt user-agent token that lets webmasters opt their content out of being used for AI training.
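As a rough illustration of the opt-out, the sketch below parses a hypothetical robots.txt that disallows the Google-Extended token while leaving all other crawlers unrestricted. The file contents, the `example.com` URL, and the `is_allowed` helper are assumptions made for this example. The check uses Python's standard `urllib.robotparser`, whose matching rules are simpler than Google's own parser, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, leave everything else open.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


URL = "https://example.com/article"  # placeholder URL

print(is_allowed(ROBOTS_TXT, "Google-Extended", URL))  # False: AI-training token is blocked
print(is_allowed(ROBOTS_TXT, "Googlebot", URL))        # True: regular search crawling still allowed
```

Note that Googlebot remains allowed under this file, which is the situation the rest of this article is concerned with.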
Concerns with Google-Extended
Google says Google-Extended lets site owners control whether their content is used to improve AI products such as “Bard, Vertex AI, and others.” It does not, however, cover SGE (Google Search Generative Experience), so it has no effect on the “AI snapshots” that SGE generates. Barry Schwartz of Search Engine Roundtable observed this when SGE drew on content from The Rolling Stones website even though that site’s robots.txt blocked Google-Extended.
SGE’s Usage Explained
A Google spokesperson clarified that because SGE is classified as a search experiment, it does not currently honor Google-Extended restrictions. As a result, SGE’s AI-generated responses can draw on content from any website that Googlebot is permitted to crawl.
Google’s position appears to be that, because SGE is an integral part of search, webmasters should not be concerned about their content being used in its AI snapshots, notes NIX Solutions. The only effective way to prevent such usage is to block Googlebot entirely.
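The trade-off can be sketched in code. The hypothetical helper `sge_may_use` below encodes only the rule described in this article, namely that SGE eligibility follows Googlebot access and ignores Google-Extended. It is not a Google API, and the two robots.txt strings are illustrative assumptions. It again relies on Python's standard `urllib.robotparser`, which only approximates Google's parsing.

```python
from urllib.robotparser import RobotFileParser


def sge_may_use(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Hypothetical check based on the article's description:
    SGE can use a page whenever Googlebot is allowed to crawl it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# Illustrative policies (assumed for this example).
ONLY_EXTENDED_BLOCKED = "User-agent: Google-Extended\nDisallow: /\n"
GOOGLEBOT_BLOCKED = "User-agent: Googlebot\nDisallow: /\n"

print(sge_may_use(ONLY_EXTENDED_BLOCKED))  # True: blocking Google-Extended alone does not affect SGE
print(sge_may_use(GOOGLEBOT_BLOCKED))      # False: only a Googlebot block keeps content out
```

The second policy also stops Googlebot from crawling the site for ordinary search, so it is a far blunter instrument than the Google-Extended opt-out.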