KenSTJ, posted February 3, 2013 (edited):
Has anyone seen this before? This is a new site and a first-time sitemap submission to Google. Given that index is involved, this can't be good for the crawler. The warnings/examples given by Google are:
"Value: http://www.mydomain.com/en/index.php?controller=authentication"
"Value: http://www.mydomain.com/es/index.php?controller=authentication"
"Value: http://www.mydomain.com/fr/index.php?controller=authentication"
I noticed some other issues in this forum dealing with languages. Is this related?
guest*, posted February 3, 2013:
Are you using friendly URLs? With friendly URLs this does not happen, but that is another problem with PS 1.5, and one that is fatal for shops that are already running. See here: http://forge.prestas.../browse/PNM-971
We haven't released our upgraded shop on PS 1.5 because exactly one year ago we had the same problem with the sitemap. Google removed all our links from the index, and we lost all our good positions.
KenSTJ, posted February 3, 2013:
As a new shop, we went with the latest version, 1.5.3.1. Yes, we are using friendly URLs. We had the friendly URL for the index page set to "home", but I just removed it so that the friendly URL is blank. Do you know what I should look for in robots.txt to figure out what Google is warning about? Thanks for the quick response!
KenSTJ, posted February 3, 2013:
Hello again. What about removing the robots.txt file completely?
sscardefield, posted April 12, 2013 (edited):
So what is the answer to this? I am running into the same issue on 1.5.3. I have removed "Disallow: /*controller=authentication" from robots.txt and resubmitted the sitemap, but it still gives the same warning:
Sitemap contains urls which are blocked by robots.txt. https://shop.sunstat...=authentication
I have gone through the entire sitemap and compared it to robots.txt, and https://shop.sunstatetechnology.com/index.php?controller=authentication is the only URL in the sitemap that matches anything in robots.txt. So why, with that entry removed from robots.txt, am I still receiving the errors?
FME_Modules, posted April 12, 2013 (in reply to sscardefield):
Hello sscardefield, I can help you with this. Please share the links to both your robots.txt and your sitemap.xml, and list the URLs for which you are receiving these errors.
sscardefield, posted April 12, 2013 (edited):
Hello alastairbrian, thanks for the response. Here are my robots.txt and sitemap.xml:
http://shop.sunstate....com/robots.txt
http://shop.sunstate...com/sitemap.xml
And here are the errors I'm getting from Google. The robots.txt used to contain "Disallow: /*controller=authentication", but I removed it and still get the same error. I wouldn't care if those URLs didn't get indexed, but this is preventing Google from going forward with the indexing. The indexing just stays as "pending".
FME_Modules, posted April 12, 2013:
Please look at the robots.txt entries in Webmaster Tools, under Health > Blocked URLs. There you will see your robots.txt as fetched by Google. Check whether it still contains a "Disallow: /*controller=authentication" line. If it does, wait two to three days, because Google has not yet fetched your updated robots.txt. The error will go away once Google re-fetches the updated file. Let me know when you are done with it. If the problem persists, I will spend more time reviewing it in depth, as I can't currently see any major culprit in robots.txt.
skyavis, posted April 20, 2013:
Remove "Disallow: /*controller=authentication" from your robots.txt, then resubmit your sitemap to Google. You should be fine after that.
Andrej Stas, posted April 24, 2013:
Quoting KenSTJ: "Hello again - what about removing the robots.txt file completely?"
This is certainly not the solution. It's important to have both robots.txt and sitemap.xml.
cutecat, posted June 9, 2013:
Maybe you can try installing the Google Sitemap module. This resolved my robots errors. But in Google Webmaster Tools I can see that I have 1260 URLs indexed and a total of 1222 blocked by robots. Is that normal? Here is my robots.txt:
User-agent: *
# Private pages
Disallow: /*orderby=
Disallow: /*orderway=
Disallow: /*tag=
Disallow: /*id_currency=
Disallow: /*search_query=
Disallow: /*back=
Disallow: /*utm_source=
Disallow: /*utm_medium=
Disallow: /*utm_campaign=
Disallow: /*n=
Disallow: /*controller=addresses
Disallow: /*controller=address
Disallow: /*controller=authentication
Disallow: /*controller=cart
Disallow: /*controller=discount
Disallow: /*controller=footer
Disallow: /*controller=get-file
Disallow: /*controller=header
Disallow: /*controller=history
Disallow: /*controller=identity
Disallow: /*controller=images.inc
Disallow: /*controller=init
Disallow: /*controller=my-account
Disallow: /*controller=order
Disallow: /*controller=order-opc
Disallow: /*controller=order-slip
Disallow: /*controller=order-detail
Disallow: /*controller=order-follow
Disallow: /*controller=order-return
Disallow: /*controller=order-confirmation
Disallow: /*controller=pagination
Disallow: /*controller=password
Disallow: /*controller=pdf-invoice
Disallow: /*controller=pdf-order-return
Disallow: /*controller=pdf-order-slip
Disallow: /*controller=product-sort
Disallow: /*controller=search
Disallow: /*controller=statistics
Disallow: /*controller=attachment
Disallow: /*controller=guest-tracking
# Directories
Disallow: /*classes/
Disallow: /*config/
Disallow: /*download/
Disallow: /*mails/
Disallow: /*modules/
Disallow: /*translations/
Disallow: /*tools/
# Files
Disallow: /*en/password-recovery
Disallow: /*en/address
Disallow: /*en/addresses
Disallow: /*en/authentication
Disallow: /*en/cart
Disallow: /*en/discount
Disallow: /*en/order-history
Disallow: /*en/identity
Disallow: /*en/my-account
Disallow: /*en/order-follow
Disallow: /*en/order-slip
Disallow: /*en/order
Disallow: /*en/search
Disallow: /*en/quick-order
Disallow: /*en/guest-tracking
# Sitemap
Sitemap: http://site.com/sitemap.xml
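Editor's note: the manual comparison described in this thread (checking each sitemap URL against the Disallow rules in robots.txt) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how Google evaluates robots.txt: it applies wildcard matching in the style Google documents (`*` matches any run of characters, a trailing `$` anchors the end), and it ignores User-agent groups and Allow rules entirely. The function names `parse_disallows`, `rule_to_regex`, and `blocking_rules` are hypothetical, and the sample rules and URLs are taken from the posts above.

```python
import re
from urllib.parse import urlsplit


def parse_disallows(robots_txt):
    """Collect every non-empty Disallow value, ignoring comments.

    Simplification: User-agent groups and Allow rules are not considered.
    """
    rules = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.lower().startswith("disallow:"):
            value = line.split(":", 1)[1].strip()
            if value:
                rules.append(value)
    return rules


def rule_to_regex(rule):
    """Translate a wildcard rule into a regex matched from the path start."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def blocking_rules(url, rules):
    """Return the Disallow rules that match the URL's path plus query string."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return [rule for rule in rules if rule_to_regex(rule).match(target)]


# Sample data taken from the thread; replace with your own files.
ROBOTS_TXT = """User-agent: *
# Private pages
Disallow: /*controller=authentication
Disallow: /*en/cart
"""

SITEMAP_URLS = [
    "http://www.mydomain.com/en/index.php?controller=authentication",
    "http://www.mydomain.com/en/",
]

if __name__ == "__main__":
    rules = parse_disallows(ROBOTS_TXT)
    for url in SITEMAP_URLS:
        print(url, "->", blocking_rules(url, rules) or "not blocked")
```

Running this against the robots.txt that Google has actually fetched (as shown in Webmaster Tools), not only the copy on the server, shows whether a stale rule is still responsible for the warning.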