Dan1 Posted July 30, 2013 (edited)
I want to block or redirect certain bots, such as Baidu. When I searched for how to do this with PrestaShop, I found threads suggesting the robots.txt file. My understanding is that many of these bots do not respect robots.txt, so that advice is useless. I also understand that bots can be blocked with the .htaccess file, but I have read that this adds server load. I have a forum where I implemented a modification that blocks and redirects bots by user agent: a unique part of the UA string is added to a block list, and any bot whose UA contains it gets redirected. Is there a PS module with this functionality?
swsindonesia Posted July 30, 2013
Hi, if you're avoiding both the robots.txt and .htaccess solutions, then instead of bloating your PS installation with another module just for this purpose, you can hack your index.php file by adding:
    if ($_SERVER['HTTP_USER_AGENT'] == 'baidu...') {
        header('location:404.html');
        die;
    }
That way you avoid unnecessary PS load.
Dan1 (Author) Posted July 30, 2013
Thank you. How would I change this code to redirect to an external site? Why is there ... after baidu? Will this block all instances of baidu, such as baiduspider, botbaidu, etc.? How do I change the code you provided to add more bots?
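One possible way to cover the three questions above, as a minimal sketch rather than swsindonesia's own code: it assumes a case-insensitive substring test with stripos(), so any user agent containing "baidu" (baiduspider, botbaidu, and so on) matches, whereas the == comparison only matches when the entire UA string is identical. The function name ua_is_blocked, the fragment list, and the target http://example.com/ are hypothetical placeholders.

```php
<?php
// Hypothetical helper: true if the user agent contains any blocked fragment.
function ua_is_blocked($userAgent, array $fragments)
{
    foreach ($fragments as $fragment) {
        // stripos() is case-insensitive. Compare with !== false because a
        // match at the very start of the string returns position 0.
        if ($fragment !== '' && stripos($userAgent, $fragment) !== false) {
            return true;
        }
    }
    return false;
}

// Assumed block list: add one unique UA fragment per bot.
$blockedFragments = array('baidu', 'ahrefs');

// The header can be absent (for example on the command line), so default to ''.
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

if (ua_is_blocked($userAgent, $blockedFragments)) {
    // Placeholder target: an absolute URL redirects to an external site.
    header('Location: http://example.com/');
    exit;
}
```

Adding another bot is then a matter of appending a fragment to $blockedFragments. A short fragment can also match legitimate visitors, so each entry is worth checking against the access log first.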
c.carlos.s Posted August 13, 2013
It doesn't work in my PS 1.3.2.3. I modified my index.php with the address I found in my log:
    if ($_SERVER['HTTP_USER_AGENT'] == 'http://www.baidu.com/search/spider.html') {
        header('location:404.html');
        die;
    }
Is there a syntax mistake, perhaps? Thanks.
vekia Posted August 13, 2013
Bots have different user agents, different referrer URLs, and different IP addresses; this is the main problem. The best way to block unwanted bots is to block their IP addresses, using the $_SERVER['REMOTE_ADDR'] variable.
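A rough sketch of IP-based blocking, under stated assumptions: in PHP the visitor's address is available in $_SERVER['REMOTE_ADDR'], while $_SERVER['SERVER_ADDR'] holds the server's own address. The function name ip_is_blocked and the addresses below are hypothetical (they come from ranges reserved for documentation); real entries would have to be taken from your own access log.

```php
<?php
// Hypothetical helper: true if the client IP is on the block list.
function ip_is_blocked($ip, array $blockedIps)
{
    // Third argument true = strict comparison, so only exact string matches count.
    return in_array($ip, $blockedIps, true);
}

// Assumed block list; these are documentation-only example addresses.
$blockedIps = array('192.0.2.10', '203.0.113.25');

// REMOTE_ADDR is not set on the command line, so default to ''.
$clientIp = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '';

if (ip_is_blocked($clientIp, $blockedIps)) {
    // Refuse the request before any shop code is loaded.
    header('HTTP/1.1 403 Forbidden');
    exit;
}
```

If the shop sits behind a proxy or CDN, REMOTE_ADDR may contain the proxy's address instead of the bot's, in which case a list like this would not match as intended.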
c.carlos.s Posted August 13, 2013 (edited)
Quoting vekia: "Bots have different user agents, different referrer URLs, and different IP addresses; this is the main problem. The best way to block unwanted bots is to block their IP addresses, using the $_SERVER['REMOTE_ADDR'] variable."
Thanks for the answer. I'm trying to block several IP addresses, but I don't understand how to do it. Could you tell me the exact syntax for these bots that appear in my log files, please?
    http://www.baidu.com/search/spider.htm
    http://www.bing.com/bingbot.htm
    http://ahrefs.com/robot/
All of them are creating phantom carts using different IP addresses. Thanks!
swsindonesia Posted August 14, 2013
It's kinda tricky to block these Baidu spiders, since they use different names, IPs, and user-agent identifiers. You might try matching with a regex against any user agent that contains "baidu".
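The regex idea could look roughly like this. It is a sketch, assuming PHP's preg_match() with the i (case-insensitive) modifier; the function name ua_matches_bot and the pattern are illustrative, and the 404.html target is the one used earlier in the thread.

```php
<?php
// Hypothetical helper: true if the user agent matches the bot pattern.
function ua_matches_bot($userAgent, $pattern)
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error,
    // so only an exact 1 is treated as a hit.
    return preg_match($pattern, $userAgent) === 1;
}

// Assumed pattern: matches "baidu" or "ahrefs" anywhere in the string,
// in any letter case. Extend the alternation to cover more bots.
$botPattern = '/baidu|ahrefs/i';

// The header can be absent, so default to ''.
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

if (ua_matches_bot($userAgent, $botPattern)) {
    header('Location: 404.html');
    exit;
}
```

Adding bingbot to the pattern would also keep Bing's search crawler away from the shop, so that one is probably worth deciding separately.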
c.carlos.s Posted August 14, 2013
Well, I keep trying to modify my index.php file with different words. I have realized that I have two different index.php files.
The first one is in www.myshop/index.php, with this code:
    <?php
    header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
    header("Last-Modified: ".gmdate("D, d M Y H:i:s")." GMT");
    header("Cache-Control: no-store, no-cache, must-revalidate");
    header("Cache-Control: post-check=0, pre-check=0", false);
    header("Pragma: no-cache");
    header("Location: ../");
    exit;
    ?>
The second one is in www.myshop/themes/mytheme/index.php:
    <?php
    include(dirname(__FILE__).'/config/config.inc.php');
    if(intval(Configuration::get('PS_REWRITING_SETTINGS')) === 1)
        $rewrited_url = __PS_BASE_URI__;
    include(dirname(__FILE__).'/header.php');
    $smarty->assign('HOOK_HOME', Module::hookExec('home'));
    $smarty->display(_PS_THEME_DIR_.'index.tpl');
    include(dirname(__FILE__).'/footer.php');
    ?>
Do you know which one I must modify with:
    if ($_SERVER['HTTP_USER_AGENT'] == 'different expressions containing BAIDU, AHREFS, BING...') {
        header('location:404.html');
        die;
    }
Thanks a lot!
vekia Posted August 14, 2013
You need to edit the file located in the root directory of your store.
swsindonesia Posted August 14, 2013
Hi, the only place to do this is your main index.php file, located at www.myshop/index.php. The index.php in the theme folder won't do any good, since www.myshop/index.php is the main bootstrap file for PS.