Quick summary: Starting mid-November 2020, Googlebot will crawl some sites over HTTP/2.

Ever since mainstream browsers started supporting the next major revision of HTTP, HTTP/2 (or h2 for short), web professionals have asked us whether Googlebot can crawl over the upgraded, more modern version of the protocol. Today we're announcing that starting mid-November 2020, Googlebot will support crawling over HTTP/2 for select sites.

What is HTTP/2?

As we said, it's the next major version of HTTP, the protocol the internet primarily uses for transferring data. HTTP/2 is much more robust, efficient, and faster than its predecessor, due to its architecture and the features it implements for clients (for example, your browser) and servers. If you want to read more about it, we have a long article on the HTTP/2 topic on developers.google.com.

Why we're making this change

In general, we expect this change to make crawling more efficient in terms of server resource usage. With h2, Googlebot is able to open a single TCP connection to the server and efficiently transfer multiple files over it in parallel, instead of requiring multiple connections.
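If you're curious whether your own server already speaks h2, you can check which protocol version a response was negotiated over. Here is a minimal sketch using the third-party httpx library; the library and the URL are illustrative choices, not something this announcement prescribes:

```python
# pip install "httpx[http2]"
import httpx

# HTTP/2 is negotiated via ALPN during the TLS handshake,
# so this check only makes sense over HTTPS.
with httpx.Client(http2=True) as client:
    response = client.get("https://example.com/")
    # Prints "HTTP/2" if the server negotiated h2,
    # or "HTTP/1.1" if it fell back to the older protocol.
    print(response.http_version)
```

On the command line, `curl -sI --http2 https://example.com/` gives a similar answer: the status line of the response shows which protocol version was used.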
Last year we released the robots.txt parser and matcher that we use in our production systems to the open source world. Since then, we've seen people build new tools with it, contribute to the open source library (effectively improving our production systems, thanks!), and release new language versions like Go and Rust, which make it easier for developers to build new tools.

With the intern season ending here at Google, we wanted to highlight two new releases related to robots.txt that were made possible by two interns working on the Search Open Sourcing team, Andreea Dutulescu and Ian Dolzhanskii.

Robots.txt Specification Test

First, we are releasing a testing framework for robots.txt parser developers, created by Andreea. The project provides a testing tool that can validate whether a robots.txt parser follows the Robots Exclusion Protocol, or to what extent. Currently there is no official and thorough way to assess the correctness of a parser, so Andreea created a tool that can be used to test any parser against the protocol's specification.
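To give a flavor of the kind of edge case such a specification test exercises, here is a small sketch using Python's standard library robots.txt parser. This is purely for illustration; it is neither the open source parser nor the testing framework discussed here:

```python
from urllib import robotparser

# A tiny robots.txt with overlapping rules: the Robots Exclusion
# Protocol resolves such conflicts by longest-path match, but not
# every parser implementation does.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /private/public-page.html
"""

parser = robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Under longest-match precedence the Allow rule wins for this URL;
# a parser that applies rules in file order would answer differently.
print(parser.can_fetch("Googlebot",
                       "https://example.com/private/public-page.html"))
```

Running many such canned cases against a parser and comparing its answers to the ones the protocol expects is, in essence, what a specification test framework automates.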