
[Solved] Problem with robots.txt


laurapessoa


Hello,

 

I have seen in Firefox that my web site has some SEO problems. The worst one is:

 

* Page is not indexable because of robots.txt exclusion (User-agent: * Disallow: /it/)

 

HTML: <meta name='robots' content='noindex'>

HTTP: X-Robots-Tag: 'noindex'

robots.txt: User-agent: * Disallow: /

HTML: <link rel='canonical'>

 

How can I fix that? Is there a simple way to do it?

 

 

 

Thank you very much in advance


Hello,

 

I also have some sort of an issue with robots and indexing. I wanted to remove indexing of sendtoafriend pages so I added this to the robots.txt (Files section):

Disallow: /modules/sendtoafriend2/sendtoafriend-form.php

 

I added this after I found out that all these pages were indexed even though /modules is disallowed...

 

In Webmaster Tools, testing access to the file with the integrated Google checker reports the page as allowed :(

 

 

Thanks!


Hi Laura,

 

I have only one language so that is not the issue in my case.

 

My trouble is that Google's robots checking tool tells me that a page is indexable when it shouldn't be. I've already modified the meta tag of the page so that it is no longer indexed, but I'm afraid I did something wrong and all the other disallowed files and directories will eventually get indexed.

 

Thanks for your support though.
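For anyone who wants to double-check that a page really carries a noindex signal after editing its meta tag, here is a minimal sketch in Python. It only inspects HTML text and a headers dictionary that you pass in yourself; the helper names `RobotsMetaParser` and `is_noindex` are made up for this illustration and are not part of any shop software. Keep in mind that a crawler can only see a noindex tag on pages it is allowed to fetch, so a page that is also blocked in robots.txt may never have its meta tag read.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def is_noindex(html: str, headers: dict) -> bool:
    """Return True if the HTML or the X-Robots-Tag header asks for noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_directives = []
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_directives += [d.strip().lower() for d in value.split(",")]
    return "noindex" in parser.directives or "noindex" in header_directives


if __name__ == "__main__":
    # Hypothetical page source, not taken from a real shop.
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(is_noindex(page, {}))  # True
```

To use it on a live page you would fetch the HTML and response headers with whatever HTTP client you prefer and pass them in.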


Hi Shhhh, as long as /modules is disallowed in your robots.txt, you can request that the folder (including the send to a friend module) be removed from the index. In Webmaster Tools, go to Crawler access, then Remove URL, and enter www.yoursite.com/modules. Google will then remove the directory (and all its subdirectories) within a day or so. See https://support.google.com/webmasters/bin/answer.py?hl=en&answer=59819


Hi pel,

 

Thanks for the info. I know I can remove URLs from the Google index. My issue is that although I have /modules disallowed in the robots.txt file, testing my robots file against this URL shows that Google is allowed to crawl it... Really strange...
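One way to test a robots.txt file outside of Google's tool is Python's standard `urllib.robotparser` module. The sketch below is only an illustration: the two robots.txt texts and the example.com URL are hypothetical, not the poster's real file, and Python's parser is not guaranteed to match Google's behaviour in every detail. It shows one possible reason for an "allowed" result: when a file contains a group for a specific crawler such as Googlebot, that crawler follows its own group and the rules under `User-agent: *` no longer apply to it.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical file contents used only for this illustration.
ONLY_WILDCARD = """\
User-agent: *
Disallow: /modules/
"""

WITH_GOOGLEBOT_GROUP = """\
User-agent: *
Disallow: /modules/

User-agent: Googlebot
Disallow: /it/
"""

URL = "https://www.example.com/modules/sendtoafriend2/sendtoafriend-form.php"

if __name__ == "__main__":
    # Only a wildcard group: the /modules/ rule applies to every crawler.
    print(is_allowed(ONLY_WILDCARD, URL, "Googlebot"))         # False
    # A Googlebot group exists and does not mention /modules/, so it is allowed.
    print(is_allowed(WITH_GOOGLEBOT_GROUP, URL, "Googlebot"))  # True
```

If the real file turns out to have a crawler-specific group like this, repeating the `Disallow: /modules/` line inside that group would be the thing to try.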

