What is robots.txt?

The robots.txt file tells search engine crawlers which URLs on your site they may access and which they may not. It is mostly used to prevent crawlers from overloading your site with requests. It is not a mechanism for keeping a web page out of Google: to keep a page out of search results, use a noindex directive instead. In short, robots.txt is a set of instructions for robots.
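
For illustration, here is a minimal robots.txt sketch (the /private/ directory is a hypothetical placeholder, not a path every site has). It lets every crawler access the site except for one section:

    User-agent: *
    Disallow: /private/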

What is robots.txt used for?

Robots.txt is only a text file with no HTML markup code. With it, you can stop crawlers from crawling unimportant pages on your site. The file is hosted on the web server like any other file, but it is not linked from anywhere else on the website. Instead, it always sits at the root: you can view a site's robots.txt by taking the homepage URL and adding /robots.txt, for example https://www.minnionstech.com/robots.txt. Its main purpose is to manage crawling traffic.
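
To sketch how this manages crawling traffic, the example below keeps all crawlers out of unimportant pages (the /search/ path and the sessionid parameter are hypothetical placeholders; adjust them to your own site's URLs):

    User-agent: *
    # Keep crawlers out of internal search result pages
    Disallow: /search/
    # Major crawlers such as Googlebot also support * wildcards in paths
    Disallow: /*?sessionid=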

Robots.txt manages traffic and hides media files from Google

You can use robots.txt to manage crawling traffic and to prevent video, image, and audio files from appearing in Google search results. Keep in mind, however, that it does not prevent other pages or users from linking directly to the video, image, and audio files on your server.
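
For instance, to keep Google Images from crawling a media folder, you could add a group for Googlebot-Image, Google's image crawler (the /media/ directory here is a hypothetical example):

    User-agent: Googlebot-Image
    Disallow: /media/

Anyone with a direct link to a file under /media/ can still open it; the rule only stops the crawler from fetching it.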

Robots.txt manages traffic and hides resource files from Google

Robots.txt can also be used to block resource files such as scripts, unimportant images, or style files. Be careful, though: once a resource is blocked, the crawler cannot fetch it, and if the page needs that resource to render properly, the crawler may no longer be able to understand the page. That can hurt how the page is inspected and indexed.
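
A sketch of such a block, assuming hypothetical /scripts/ and /styles/ directories:

    User-agent: *
    # Crawlers will not fetch anything under these paths;
    # pages that need these files to render may be misunderstood
    Disallow: /scripts/
    Disallow: /styles/

Only block resources you are sure Google does not need in order to render and evaluate your pages.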

If you want to learn more about robots.txt, you can visit wikipedia.org.

How to create a robots.txt file?

You can create a new robots.txt file with any plain text editor of your choice. If you have already created a robots.txt file, make sure you have deleted its existing text before you start.

  • Set the user-agent. Start the file by setting the user-agent.
  • You can do this by writing the user-agent term followed by an asterisk, which addresses every crawler:
    User-agent: *
  • After that, type “Disallow:”. Don’t type anything else after it:
    Disallow:
  • Because nothing follows “Disallow:”, web bots are directed to crawl the whole website. At this point the robots.txt file looks like this:
    User-agent: *
    Disallow:
  • You can also link your XML sitemap here; it is entirely your preference (a fuller example follows after this list):
    Sitemap: https://yoursite.com/sitemap.xml
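
Putting the steps together, a complete robots.txt might look like the sketch below; the /search/ and /media/ paths are hypothetical placeholders carried over from the earlier examples, and yoursite.com stands in for your own domain:

    # Rules for all crawlers
    User-agent: *
    Disallow: /search/

    # Rules for Google's image crawler
    User-agent: Googlebot-Image
    Disallow: /media/

    Sitemap: https://yoursite.com/sitemap.xml

Save the file as robots.txt and upload it to the root of your site so that it is reachable at https://yoursite.com/robots.txt.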

How to test a robots.txt file?

  • First, open the tester tool for your website, then scroll down through the robots.txt code to locate logic errors and highlighted syntax warnings. The number of logic errors and syntax warnings is shown below the editor.
  • Type the URL of a page on your website in the text box at the bottom of the page.
  • Choose the user-agent you would like to simulate from the dropdown list to the right of the text box.
  • Press the TEST button to test access.
  • Check whether the TEST button now reads ACCEPTED or BLOCKED; this tells you whether the URL you typed is blocked from web crawlers (see the example after this list).
  • You can now edit the file on the page and retest as required. Note that changes made on the page are not saved to your website.
  • Copy any changes into the robots.txt file on your site yourself. The tool cannot change the real file on your website; it only tests against the copy hosted in the tool.
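
As a concrete illustration, suppose your robots.txt contains the hypothetical rule from the earlier examples:

    User-agent: *
    Disallow: /search/

Testing a URL such as https://yoursite.com/search/results should then read BLOCKED, while testing your homepage should read ACCEPTED.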

To know more about search engine optimization, you can connect with us at “MINNIONS TECH”.