SEO tip #6: What is the robots.txt file and why is it so important for your SEO efforts?


The search engines can’t – and won’t – help you to expose your content if your site is not 100% accessible and understandable. And when we are talking about accessibility, the very first important factor is always the robots.txt file. So, let’s see …

What is the purpose of the robots.txt file?

When your website is indexed by the search engines, it is first crawled by robot programs called bots, crawlers or spiders (Googlebot, Bingbot, Yahoo Slurp, etc.) that find and categorize all the content on your site. The bots will automatically index whatever they can find and “read”. If you have any sections or content pieces (for example, expired offers, duplicate content, non-public pages, etc.) that you don’t want crawled, you’ll have to tell the crawlers about these “banned” areas. To do that, you need a robots.txt file.

So, what is the robots.txt file? Put simply, it’s a plain text document placed in the root of your website that tells the search engine crawlers which parts of your site they may and may not crawl. Additionally, if you want to save some bandwidth, you can use the robots.txt file to keep crawlers away from certain scripts, stylesheets or images. Be careful with JavaScript and CSS, though: Google recommends leaving the files a page needs for rendering crawlable, because blocking them can make the page harder for the search engine to understand.

When the spiders visit your site, the very first thing they do is check whether your robots.txt file exists and what it contains. If you have created a robots.txt file with your own rules, well-behaved crawlers will follow your requests and won’t crawl the pages that you have disallowed. In theory, you could use the robots meta tag too in order to keep the spiders away from certain pages, but the meta tag works page by page and only after the page has been fetched, so for whole folders or file types the robots.txt file is the more practical choice.

As I already said, the robots.txt file must be placed in the root directory of your website. The spiders won’t search your site to find a document with that name. If they can’t find it in the root directory (yourdomain.com/robots.txt), they will simply assume that your site doesn’t have a robots.txt file, and as a result they will crawl everything along their way.
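
To make the location rule concrete, here is a minimal Python sketch of how a crawler could work out where to look for the file. The helper name robots_txt_url and the example URLs are my own assumptions for illustration; real crawlers are not necessarily implemented this way.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the one address where robots.txt is expected for this page's host."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page URLs, used only for illustration.
print(robots_txt_url("https://yourdomain.com/blog/some-post.html?ref=home"))
print(robots_txt_url("https://shop.yourdomain.com/cart"))
```

Whatever page the crawler starts from, the answer is always the root of that host. A copy saved in a subfolder such as /blog/robots.txt is never consulted, and a subdomain needs its own file.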

The structure and syntax of a robots.txt file are extremely simple. Basically, it’s a list of groups, each pairing one or more user-agents (crawlers) with the files or directories that are disallowed or allowed for them. In addition to the “User-agent:”, “Disallow:” and “Allow:” directives, you can include any comments you want by putting the “#” sign at the beginning of the line. Technically speaking, the user-agent can be any party that requests web pages, including command line utilities, web browsers and, of course, search engine spiders. If the “User-agent:” directive is followed by the wildcard operator “*”, the rules in that group apply to all crawlers that don’t have a group of their own.
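
Here is a small sketch of what such a file can look like and how a crawler might read it. It uses Python's standard urllib.robotparser module; the rules, paths and domain are hypothetical and only there for illustration. One caveat: Python's parser applies the first matching rule in file order, while Google applies the most specific one, so I put the “Allow:” line before the “Disallow:” line to get the same answer either way.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only (the paths are assumptions).
SAMPLE_ROBOTS_TXT = """\
# Rules for every crawler without a group of its own
User-agent: *
Disallow: /private/
Disallow: /offers/expired/

# Rules for Googlebot only
User-agent: Googlebot
Allow: /private/press-kit.html
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)
base = "https://yourdomain.com"  # placeholder domain

checks = [
    ("Bingbot", "/blog/post.html"),
    ("Bingbot", "/private/notes.html"),
    ("Bingbot", "/offers/expired/deal.html"),
    ("Googlebot", "/private/notes.html"),
    ("Googlebot", "/private/press-kit.html"),
    ("Googlebot", "/offers/expired/deal.html"),
]

for agent, path in checks:
    allowed = parser.can_fetch(agent, base + path)
    print(f"{agent:10} {path:28} {'allowed' if allowed else 'blocked'}")
```

Notice the last check: Googlebot is allowed into /offers/expired/ because a crawler that finds a group with its own name uses only that group and ignores the “*” group. If a rule should apply to everyone, repeat it in every named group.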

And of course, once you have created or edited your robots.txt file in any regular text editor, you’ll have to upload it to the root directory of your website.


Recent Comments

32

This is the first time I've heard of the robots.txt file. Is that something that automatically gets created by using WP, or do we have to do something? I don't think it's in any of Kyle's training.
Debbie

Hi Debbie,

Yes, usually it's automatically created ... Check out the conversation I had below with Joe (@newmarketpro)

Thank goodness, that hurt my brain. SEO Tips are great.

I really didn't want to hurt you Mark :)

Thanks for your time!

What hurts is I am going to have that Culture Club song in my head all nite. OUCH! lol. Thanks for sharing all your knowledge, you have helped me immensely.

That's the goal Mark!

And this is only the beginning ... I hope ...

Thanks for the share Zed.

Thanks for reading Michael!

Great! More good info, ha ha. I will wait for your tutorial and look for these robots on my site next time I log in.
So much info and so little time amen, thx !

Thanks for your time John!

Thank you for this useful info

Thanks for your time Felix!

Excellent Zed, could you give us an example so that we could apply.

Hi Claudio,

I'm working on a topic-centered tutorial right now ...

Great Zed Always I want to know more about robots.txt.. Many thanks!!

Hi Zsolt
I have never modified the robots.txt file for my site. I assume the file is standard. Is there any circumstance where you need to modify the file? Maybe you can create a simple tutorial on how to correctly modify the file, e.g. what to include and exclude, etc.
Thanks for sharing this interesting topic.

Joe:)

Thanks for your reply Joe!

Yes, it's a standard file, but many plugins, widgets or user actions could - and usually will - alter the original content. This is why many users are facing crawling errors, indexing issues etc.

Yes, if you have the needed knowledge, it's recommended to make certain "improvements" and, more importantly, it's always a good idea to double-check the content of your robots.txt file every time you install a new plugin, etc., in order to remove any automatically inserted commands.

Yep, I'm planning to create a few tutorials and this topic is also on the list :)

I am looking forward to the tutorial, Zsolt.
For us in WA, changing (modifying) the robots.txt file is only possible through FTP since we are not able to access cPanel. Is there any other way we can do it without accessing the file via FTP?
Thank you in advance.

Joe.

Thank you Smartketeer for providing information that I, for one, didn't have
a clue about. - I'm always afraid that I am doing something wrong,
leaving something technical in nature out, or otherwise not being
savvy enough to know for certain that my website is set up
correctly.

Thanks to people like you all I have to do is "follow" you and
learn from you. - God Bless you.

All the best,

John
beginshere

Yes sure ... And in my opinion it's the easiest solution.

You can easily edit - and re-submit - your robots.txt file in Google Search Console -> Crawl

Certain SEO plugins (I use Yoast) will also allow you to directly edit the robots.txt file

Thanks. I am still looking forward to the tutorial. :) Joe

Thanks for your time and your interest Joe!

Thanks for sharing.

Thanks for reading :)

No problem.

:)

Thank you for this informative post.
I wondered what all that robot speak meant...hehehe now I know!

And now you know :)

Thanks for your time Alenka!

Thanks for the information Zed.

Derek

Thanks for your time Derek!

Hopefully mine has it because it sounds a bit difficult for me to do. Jim

It's really not that difficult Jim ...

And you can check your robots.txt in Google Search Console

