OK. Now that you know what the heck the robots.txt file is, let’s see …
The 10 most common robots.txt mistakes that can ruin your SEO efforts
As you have seen, the structure and syntax of the robots.txt file are ridiculously simple, but always keep in mind that you are wielding a double-edged sword! A single misused wildcard can keep every search engine away from your website, so pay close attention to every single character in your robots.txt file, and don’t try to outsmart the search engines with uncommon directives or shady practices picked up from random forums, YouTube videos, etc. Believe me, I have seen many, many webmasters and site owners pulling their hair out after trying to enhance their robots.txt with some awesome new directives or smart-aleck tricks.
To keep you away from a potential epic disaster, I have put together a small collection of the most common robots.txt mistakes. If you want to do SEO successfully for a WordPress website, you’ll have to pay close attention to these issues, otherwise your entire website could simply vanish from the search results.
1. Using the robots.txt file inadequately
Your file MUST be named “robots.txt”, all in lower case, and every single instruction must go on its own line!
Also, a robots.txt file placed in a sub-directory will not work at all; it will be completely ignored. If you don’t have access to the root directory and you want to block some pages in a sub-directory (for example, your own individual directory on a membership site), you’ll have to use other methods, such as a robots meta tag or an “.htaccess” file.
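To be crystal clear, here is a rough sketch of a correctly placed and formatted file, served from www.yourdomain.com/robots.txt (the blocked paths below are just placeholder examples, not recommendations for your site):

User-agent: *
Disallow: /wp-admin/
Disallow: /members/private-stuff/

And if you only control a sub-directory, a robots meta tag placed in the head section of each page you want to hide does a similar job, for example: <meta name="robots" content="noindex, nofollow">.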
2. Targeting subdomains
Let’s assume that you have a website with multiple subdomains:
www.yourdomain.com
www.home.yourdomain.com
www.blog.yourdomain.com
www.resources.yourdomain.com
If you create one “main” robots.txt file and upload it to the root folder of your website (www.yourdomain.com/robots.txt) in order to block the subdomains, it won’t work. A “Disallow: blog.yourdomain.com” directive included in the “main” robots.txt file will have no effect.
In plain English: to block some subdomains – and not others! – you need a separate robots.txt file served from each subdomain.
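To make this concrete, here is a rough sketch (the directives are just placeholders for illustration): to hide the blog subdomain but leave the main site fully crawlable, the file served from www.yourdomain.com/robots.txt would allow everything,

User-agent: *
Disallow:

while a separate file served from www.blog.yourdomain.com/robots.txt blocks crawlers entirely:

User-agent: *
Disallow: /

Crawlers request robots.txt from the host they are currently visiting, so the file on the main domain never applies to your subdomains.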