robots.txt

On this page

  1. What is a robots.txt?
  2. Where to access the robots.txt function
  3. Commands to use

Easily manage your site's robots.txt file from your site's settings page.

What is a robots.txt?

Robots.txt is a text file that webmasters and SEOs use to instruct web bots on how to crawl pages on their website. You can give instructions for a single page, a subdirectory, or the whole site on how search engines should crawl your pages or treat links (such as follow or nofollow). However, if a page that you wish to exclude is included in your sitemap, it is recommended to block that page with a “noindex” robots meta tag instead. Crawl instructions work by disallowing or allowing the behavior of certain (or all) user agents, which are often, but not only, search engines.

Where to access the robots.txt function

When logged in, go to More > SEO Manager > robots.txt tab in the top menu. The tab has a single field, where you place your new or edited commands.

Commands to use

The "/robots.txt" file is a text file with one or more records. It usually contains a single record that looks like this:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~resources/

The three Disallow lines tell robots to ignore the listed directories and not crawl them.
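As a rough check of what this record does, the sketch below feeds it to Python's standard urllib.robotparser module and asks whether a crawler may fetch a few paths. The crawler name ExampleBot and the sample paths are assumptions made for illustration, and real crawlers may treat edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The sample record shown in this article.
RECORD = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~resources/
"""

parser = RobotFileParser()
parser.parse(RECORD.splitlines())

# "ExampleBot" and the paths below are hypothetical, chosen for illustration.
for path in ("/tmp/cache.html", "/~resources/logo.png", "/about.html"):
    verdict = "allowed" if parser.can_fetch("ExampleBot", path) else "blocked"
    print(f"{path}: {verdict}")
```

Only /about.html should be reported as allowed, because the other two paths start with a disallowed prefix.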

To exclude all robots from the entire site

User-agent: *
Disallow: /

To allow all robots complete access

User-agent: *
Disallow:

Alternatively, you can leave your robots.txt empty, but this is not considered best practice.
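To compare the exclude-all record, the allow-all record, and an empty file, here is a minimal sketch using Python's standard urllib.robotparser module. The helper can_crawl, the crawler name ExampleBot, and the path /products.html are hypothetical names introduced for this example.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


EXCLUDE_ALL = "User-agent: *\nDisallow: /"
ALLOW_ALL = "User-agent: *\nDisallow:"
EMPTY = ""  # an empty robots.txt file

for name, rules in (
    ("exclude all", EXCLUDE_ALL),
    ("allow all", ALLOW_ALL),
    ("empty file", EMPTY),
):
    print(name, "->", can_crawl(rules, "/products.html"))
```

In this parser, the empty file behaves like the allow-all record: both permit the fetch, while the exclude-all record blocks it.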

To exclude all robots from part of the site

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /resources/

To exclude a single robot

User-agent: BadBot
Disallow: /

To allow a single robot

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
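The per-robot records above can be sanity-checked in the same way. This sketch assumes Python's standard urllib.robotparser module; the helper load and the crawler name OtherBot are hypothetical, and Googlebot is used as the name of Google's main crawler.

```python
from urllib.robotparser import RobotFileParser


def load(robots_txt):
    """Parse robots.txt text and return the populated parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Exclude a single robot: only BadBot is blocked.
exclude_one = load("User-agent: BadBot\nDisallow: /")

# Allow a single robot: Googlebot may crawl, every other robot is blocked.
allow_one = load("User-agent: Googlebot\nDisallow:\n\nUser-agent: *\nDisallow: /")

for agent in ("BadBot", "Googlebot", "OtherBot"):
    print(
        agent,
        "| exclude-one:", exclude_one.can_fetch(agent, "/page.html"),
        "| allow-one:", allow_one.can_fetch(agent, "/page.html"),
    )
```

A record that names a robot takes precedence over the catch-all * record for that robot, which is why Googlebot stays allowed in the second case.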

To exclude all files except one

The original robots.txt standard has no "Allow" field, although many modern crawlers now support one. The easiest way to deal with this is to put all files to be disallowed into a separate directory, e.g. "misc", and leave the single file outside of that directory.

User-agent: *
Disallow: /~resources/misc/

Alternatively, you can explicitly disallow each page you want to exclude:

User-agent: *
Disallow: /~resources/index.html
Disallow: /~resources/business.html
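The two approaches can be compared with a short sketch based on Python's standard urllib.robotparser module. The helper blocked_paths, the crawler name ExampleBot, and the files old.html and contact.html are hypothetical and exist only for this illustration.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, agent="ExampleBot"):
    """Return the subset of `paths` that `agent` may not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


# Approach 1: move everything to be hidden into a "misc" directory.
BY_DIRECTORY = "User-agent: *\nDisallow: /~resources/misc/"

# Approach 2: list each disallowed page explicitly.
BY_PAGE = """\
User-agent: *
Disallow: /~resources/index.html
Disallow: /~resources/business.html
"""

paths = [
    "/~resources/misc/old.html",   # hypothetical file moved into "misc"
    "/~resources/index.html",
    "/~resources/business.html",
    "/~resources/contact.html",    # hypothetical page left crawlable
]

print("by directory:", blocked_paths(BY_DIRECTORY, paths))
print("by page:", blocked_paths(BY_PAGE, paths))
```

The directory approach blocks only what sits under misc, while the per-page approach blocks exactly the listed pages and leaves everything else crawlable.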

Learn more here.
