3. Unintentionally blocking unrelated pages
Check out the following example:
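Imagine that a robots.txt contains something along these lines (a minimal illustration; the “User-agent: *” line and the exact rule are only examples):

    User-agent: *
    Disallow: /custom

Because robots.txt rules are matched as prefixes, this rule blocks not only “/custom” but also every URL that merely starts with it, such as “/customized-email-templates”. To block only the exact “/custom” path, you would end the rule with a “$” sign:

    Disallow: /custom$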
The “$” sign is a so-called end-of-string operator, which basically tells the spider that “this URL ends here”. As a result, the given directive will match “/custom”, but not “/customized-email-templates”.
The worst thing about this type of mistake is that it usually goes unnoticed for a very long time. Meanwhile, as far as the crawlers are concerned, the page with your customized email templates simply won’t exist anymore…
4. Using incorrect letter case in URL paths
URL paths are case-sensitive! “Disallow: /Temp” will not block “/temp” or “/TEMP”. If you have made the big mistake of using similar filenames or a confusing directory structure, you’ll have to block each of those pages or folders with its own “Disallow:” line.
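For example, assuming (purely for illustration) that the conflicting folders are literally named “/Temp”, “/temp” and “/TEMP”, the rules would look something like this:

    User-agent: *
    Disallow: /Temp
    Disallow: /temp
    Disallow: /TEMP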
5. Forgetting the user-agent directive
It may seem ridiculous, but it happens very often. If there is no “User-agent:” directive before the usual “Disallow:”, “Allow:”, etc. directives, nothing will actually happen!
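Every group of rules has to begin with a “User-agent:” line that tells the crawlers who the rules are meant for. A minimal, correct block looks something like this (the blocked path is just a placeholder):

    User-agent: *
    Disallow: /temp/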
6. Forgetting the slash character
Any URL path must start with a slash character! The “Disallow: any-page” directive won’t block anything. The correct syntax is this: “Disallow: /any-page”.
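In other words (using a placeholder page name):

    # Wrong: this rule won’t block anything
    Disallow: any-page
    # Right
    Disallow: /any-page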
7. Using the robots.txt file to protect sensitive data
This is one of the biggest and most dangerous mistakes, and for obvious reasons it will do much more harm than good. The only reliable way to protect your sensitive content is to use some sort of password-based security solution! If you have any files or directories that must be kept protected and hidden from the public, do not ever just list them in your robots.txt file with some “Disallow:” directives!
Why? Because you are going to give hostile crawlers a precise road-map to find the folders and files that you don’t want them to find! More than that, your robots.txt is publicly accessible! Anybody can – and will – see the things you’ve said you don’t want indexed, simply by typing yourdomain.com/robots.txt into their browser!
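As a rough illustration only: on an Apache server, that kind of password protection can be as simple as dropping an “.htaccess” file into the private directory (the realm name and file paths below are placeholders, and other web servers offer equivalent mechanisms):

    AuthType Basic
    AuthName "Private area"
    AuthUserFile /home/youruser/.htpasswd
    Require valid-user

The matching “.htpasswd” file is created with the “htpasswd” utility (for example, “htpasswd -c /home/youruser/.htpasswd yourname”), so the protected files are never advertised in a publicly readable file.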