thezey (posted October 11, 2013):

Hello,

My target customers are mostly English speakers, and I did SEO only on the English pages of my website. However, I noticed that Google indexed the other languages that PrestaShop provides natively. As a result, it may lower my Google ranking because the pages appear as duplicate content. I don't want to remove the other languages from PrestaShop, as I don't want to exclude potential customers who may need a different language setting. I would just like Google to see only my English pages. How can I do that?

Thanks

P.S.: It seems PrestaShop is creating two sets of pages in English, though. One looks like www.example.com/stuff, and the other is www.example.com/en/stuff. I guess one of them is not needed either!

vekia (posted October 11, 2013):

Have you got any additional languages defined in your store? If so, you can just block the unwanted languages in the robots.txt file.

thezey (Author, posted October 11, 2013):

Well, there is a drop-down menu on my website where people can choose between 5 different languages. How can I configure PrestaShop so that the robots file lets only one language be indexed?

vekia (posted October 11, 2013):

User-agent: *
Disallow: /dir1/
Disallow: /dir2/
Disallow: /dir3/

where dir1, dir2 and dir3 are the language codes that you want to disallow for crawling purposes.

El Patron (posted October 11, 2013):

If you are only 'concerned about Google', then log in to Google Webmaster Tools (or create an account) and request removal of the URL(s) from search results: http://screencast.com/t/vU1KvrATp

Here is what Google says on this matter: https://support.google.com/webmasters/answer/59819?hl=en

Please note: good bots are supposed to follow robots.txt rules, but they don't have to.

El Patron (posted October 11, 2013):

Now, can I ask this question: why exactly are you doing this? Are the translations for the other languages complete? Are you worried about duplicate content? If there is an issue with your sitemap, please post in the SEO section and post the link here so we can review it.

El Patron (posted October 11, 2013):

I found something interesting that pertains to noindexing. Header.tpl:

<meta name="robots" content="{if isset($nobots)}no{/if}index,{if isset($nofollow) && $nofollow}no{/if}follow" />

But I don't see where PrestaShop sets nobots. Vekia?

thezey (Author, posted October 12, 2013):

Quote (vekia):
    User-agent: *
    Disallow: /dir1/
    Disallow: /dir2/
    Disallow: /dir3/
    where dir1, dir2 and dir3 are the language codes that you want to disallow for crawling purposes.

Vekia, I didn't find the module for that. Could you tell me how you disallow languages?

vekia (posted October 12, 2013):

Hello,

There is no module for this; just change the contents of the robots.txt file located in the root directory of your PrestaShop installation. If you don't have it, open the Preferences > SEO & URLs tab and generate it there.

thezey (Author, posted October 12, 2013):

Just one more question: how do I know which language codes correspond to which languages? I don't want to disallow the English part (which may be dir1).

vekia (posted October 12, 2013):

The language ISO codes are defined in the Localization > Languages tab in the back office.

thezey (Author, posted October 12, 2013):

So, to sum up, should I write

Disallow: /fr/
Disallow: /ru/

instead of

Disallow: /dir2/
Disallow: /dir3/

Plus, under which headline: #Private pages, #Directories, #Files, or #Sitemap?

Edited October 12, 2013 by thezey

El Patron (posted October 12, 2013):

https://support.google.com/webmasters/answer/156449

thezey (Author, posted October 12, 2013):

Quote (El Patron):
    https://support.google.com/webmasters/answer/156449

I guess that's for Google only, not for Bing. My question still stands. Thank you, though.

El Patron (posted October 12, 2013):

Quote (thezey):
    I guess it's for google only. Not for Bing. My question still stands. Thank you though

It works for them all... and you never mentioned Bing. robots.txt is a standard that 'good' search engines use.

thezey (Author, posted October 12, 2013):

In the link you shared, Google mentions specific patterns designed only for Googlebot. The example for blocking all subdirectories is "/private*/". However, in my robots.txt file, directories appear to be disallowed as follows: "/*mails/". That's the reason I am asking the question in my previous post.

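On the pattern question: under the original robots.txt convention a Disallow value is a plain path prefix, so a rule such as "/fr/" needs no wildcard. The "*" (any run of characters) and "$" (end of URL) forms are the extensions that Google documents. The Python sketch below only illustrates that matching logic; the function names pattern_to_regex and matches and the sample paths are assumptions made up for this example, not code from any crawler or from PrestaShop.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the
    end of the URL. Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matches(pattern, path):
    """Return True if the rule pattern applies to the URL path."""
    # re.match anchors at the start, which gives prefix semantics.
    return pattern_to_regex(pattern).match(path) is not None


# Hypothetical sample paths, for illustration only.
for rule, path in [
    ("/fr/", "/fr/stuff"),
    ("/fr/", "/en/stuff"),
    ("/*mails/", "/modules/x/mails/en/a.html"),
    ("/private*/", "/private-area/page"),
]:
    print(rule, path, matches(rule, path))
```

Read this way, "/*mails/" covers any path containing a "mails/" directory at any depth, while "/fr/" covers only paths that start with that language directory.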
thezey (Author, posted October 13, 2013):

Vekia, could you let me know please?

El Patron (posted October 13, 2013):

In the future, when you are asked particular questions, like "why are you doing this?", you should answer, because it might be in your best interest.

Adding a disallow to robots.txt is very simple, much as the links explained. Here is what you would do; if you already have the User-agent: * line, you will not need to add it again.

User-agent: *
Disallow: /folder1/

vekia (posted October 13, 2013):

Quote (thezey):
    Vekia, could you let me know please?

El Patron explained all the details. If you want to block languages:

Disallow: /ru/
Disallow: /es/
Disallow: /de/
Disallow: /pl/
Disallow: /it/

etc.

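Rules like these can be checked locally before the file is uploaded. The following is a minimal Python sketch, assuming a list of language ISO codes copied from the back office; it builds the Disallow lines and tests sample URLs with the standard library's urllib.robotparser. The code list, the example.com URLs and the helper names build_rules and is_allowed are assumptions for illustration and are not part of PrestaShop.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical ISO codes to block; English ("en") is deliberately left out.
BLOCKED_LANGS = ["ru", "es", "de", "pl", "it"]


def build_rules(lang_codes):
    """Return robots.txt lines that disallow each language directory."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: /{code}/" for code in lang_codes]
    return lines


def is_allowed(rules, url, agent="Googlebot"):
    """Check a URL against the rules with the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(rules)  # parse() accepts an iterable of lines
    return parser.can_fetch(agent, url)


rules = build_rules(BLOCKED_LANGS)
print("\n".join(rules))

# Sample URLs are placeholders; substitute real shop URLs when testing.
for url in (
    "http://www.example.com/en/stuff",
    "http://www.example.com/ru/stuff",
    "http://www.example.com/stuff",
):
    print(url, is_allowed(rules, url))
```

The check only shows what a rule-following crawler is permitted to fetch; it says nothing about pages a search engine has already indexed.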
DArnaez (posted October 16, 2014):

I did it... I mean, I added Disallow: /es/ in robot.txt, but I still have the same problem. Google doesn't let me index the page "http://www.ohmyicons.com". Any other cause?

Thanks!