Google Wants To Make Robots.txt An Internet Standard

Google Search has helped businesses both small and large to be found more easily on the internet. A key tool that lets Google pull that off is a simple text file that sits on a website, known as robots.txt. Robots.txt is over 25 years old, which is almost the same as forever when it comes to the internet. However, Google’s recent announcement about standardising it might be the first time some people are hearing of it. So, let us explain further.

Robots.txt, officially called the Robots Exclusion Protocol (REP), is the set of directives that tells search engine crawlers which parts of your website they may visit. The concept dates back to 1994, when a webmaster by the name of Martijn Koster felt his website was being overrun with crawler traffic. However, it was only when other webmasters began to join in that robots.txt truly gained recognition. Soon after, internet search engines began to adopt it as a sort of de facto standard.

Robots.txt Today

Nowadays, robots.txt is a fundamental part of search engine optimisation (SEO). Crawlers from search engines, such as Googlebot and Bingbot, scour the internet for websites. When they find a site, its robots.txt file tells them which pages of that website they may crawl, which in turn affects how the site is listed. Having the right robots.txt directives is a critical part of a website’s SEO and ranking.
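To make this concrete, here is a minimal sketch of how a crawler might consult a robots.txt file before fetching a page. It uses Python’s standard-library `urllib.robotparser`, which is not Google’s own parser and may resolve conflicting rules differently. The file contents, the `example.com` URLs and the `build_parser` helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site (not from a real website).
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /admin/
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Load robots.txt text into the standard-library parser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Googlebot has its own group, so only the /private/ rule applies to it.
googlebot_private = parser.can_fetch("Googlebot", "https://example.com/private/report.html")
googlebot_blog = parser.can_fetch("Googlebot", "https://example.com/blog/post.html")

# Bingbot has no group of its own here, so it falls back to the "*" group.
bingbot_admin = parser.can_fetch("Bingbot", "https://example.com/admin/login")

print(googlebot_private, googlebot_blog, bingbot_admin)
```

In this sketch the more specific `Disallow` rule is listed before the broad `Allow: /`, so a parser that takes the first matching rule and one that takes the longest matching rule should reach the same answer.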

That said, robots.txt currently has only a ‘sort of’ standard attached to its name, and that has splintered into several different versions. Over the years, developers have haphazardly added and removed elements from the original version. As a result, webmasters have to get to grips with multiple flavours of robots.txt, which can be a pain. That’s why Google wants to simplify matters by establishing a single standard.

As Google owns one of the world’s largest search engines, this move will in practice make Google’s version of robots.txt the standard across the internet. Google probably knows this, and so, alongside the launch of the new standard, it has listed a set of rules for webmasters and developers to follow. This means that webmasters who fail to apply the rules may fall out of Google’s good graces and rank further down its search results.

New Rules

These new rules include, to name a few: no longer accepting typos or variations of commands in the <field> element, e.g. “useragent” instead of “user-agent”; changes to how Google deals with ‘redirect hops’; a size limit of 500 KiB on content; the renaming of ‘records’ to ‘lines’ or ‘rules’; and, most notably, the inclusion of all URL protocols rather than only HTTP. In addition, Google says it expects the file to be plain text encoded in UTF-8, consisting of lines separated by CR, CR/LF, or LF. Each line follows the format <field>:<value><#optional-comment>. Note that whitespace at the start and at the end of a line is ignored.
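The line format described above can be sketched in a few lines of Python. This is only an illustration of the rules as summarised here, not Google’s parser: the `KNOWN_FIELDS` set, the function names and the choice to drop everything past the size limit are assumptions made for the example.

```python
import re
from typing import List, Optional, Tuple

MAX_BYTES = 500 * 1024  # the 500 KiB size limit mentioned above

# Assumed set of recognised fields for this sketch; misspellings are not tolerated.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def parse_line(line: str) -> Optional[Tuple[str, str]]:
    """Parse one <field>:<value><#optional-comment> line.

    Returns (field, value), or None for blank, comment-only or unrecognised lines.
    """
    # Drop the optional comment, then ignore whitespace at both ends of the line.
    content = line.split("#", 1)[0].strip()
    if ":" not in content:
        return None
    field, value = content.split(":", 1)
    field = field.strip().lower()
    if field not in KNOWN_FIELDS:
        # e.g. "useragent" instead of "user-agent" is rejected rather than guessed at.
        return None
    return field, value.strip()


def parse_robots(raw: bytes) -> List[Tuple[str, str]]:
    """Decode a UTF-8 robots.txt body and return its recognised (field, value) pairs."""
    # Assumption: content beyond the size limit is simply dropped in this sketch.
    text = raw[:MAX_BYTES].decode("utf-8", errors="replace")
    # Lines may be separated by CR/LF, CR, or LF.
    lines = re.split(r"\r\n|\r|\n", text)
    return [pair for pair in map(parse_line, lines) if pair is not None]


sample = (
    b"User-agent: *  # all crawlers\r\n"
    b"useragent: typo\r"
    b"Disallow: /tmp/ \n"
    b"  Sitemap: https://example.com/sitemap.xml\n"
)
rules = parse_robots(sample)
print(rules)
```

The value is split off at the first colon only, so URLs such as the hypothetical sitemap address keep their own `https:` prefix intact.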

Conclusion

The new robots.txt standard is still in its draft stage, with Google working with developers to refine it. The hope is that these developments will amount to simpler and better ways of working for the modern needs of websites and their owners. The internet has changed greatly since 1994, so it makes sense that robots.txt follows suit.

Here at Wiredelta, we understand the importance of robots.txt and other SEO tools for improving your website’s ranking on search engines. If you need help with your new website and you are not sure where to start, why not reach out to us? In addition, we have plenty of articles about web and digital marketing, so feel free to sign up for our newsletter to keep yourself updated about these topics and other developments in tech.
