How to stop Google from Indexing Some Pages Along With Page Header Elements?


Hi All

Not sure how to stop this from happening but here's an example:-

Mysite - Weather Station Photo Frame
Mysite. $. Currency. English (English). contact · sitemap. Welcome, Log in · Cart: productproducts (empty); Your Account ...

It's not happening to all the pages so I'm not sure what's going on.

I have an auto-generated PrestaShop robots.txt in place and thought the following line would stop this from happening: "Disallow: /header.php"

Anyone else experienced it / fixed it?

Thanks
deepee

Disallow: /header.php tells search engines not to crawl that file.
However, that file is never requested on its own; it is only included as part of a full page, so the rule has no effect on what appears in the indexed page.

I have done a little trick before to hide certain parts from search engines.

You can check the USER_AGENT to detect whether the visitor is a search engine or a regular user, and set a variable accordingly (let's say $_is_search_engine).
Then in your template just wrap the content so it is only output for regular users: {if !$_is_search_engine}part you wish to hide from search engines{/if}

Hi tomerg3

Thanks very much for your suggestions.
Apologies for the delay in my reply as this is the first chance I have had to have a look at it.

I'm quite new to this so please excuse me if the following questions seem a bit obvious.
- How do you set up USER_AGENT to check if it is a search engine?
- In which file and whereabouts in the code does the condition get added?

Thanks
deepee

Add the following lines to init.php anywhere above the smarty assign lines (which are around line #130)

$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// stripos() is case-insensitive, so 'slurp' also matches 'Slurp'; $isSearchEngine is always set to 0 or 1
$isSearchEngine = (stripos($userAgent, 'bot') !== false || stripos($userAgent, 'baidu') !== false || stripos($userAgent, 'spider') !== false || stripos($userAgent, 'Ask Jeeves') !== false || stripos($userAgent, 'slurp') !== false || stripos($userAgent, 'crawl') !== false) ? 1 : 0;



This checks the USER_AGENT for substrings used by popular search engine crawlers (you can add more if you want).
You can check whether $isSearchEngine == 1 in PHP files if you need to.
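If the list of crawlers grows, the same check can be kept readable by moving it into a small function. The following is only a sketch: the function name isSearchEngineUserAgent and the token array are illustrative and not part of PrestaShop, and it assumes PHP 7.1 or later (for the nullable type and the ?? operator). It matches the same substrings as the check above, case-insensitively.

```php
<?php
// Hypothetical helper, not part of PrestaShop: the name and token list are assumptions.
function isSearchEngineUserAgent(?string $userAgent): bool
{
    // No User-Agent header sent: treat the visitor as a regular user.
    if ($userAgent === null || $userAgent === '') {
        return false;
    }

    // Substrings taken from the check above; add more as needed.
    $tokens = ['bot', 'baidu', 'spider', 'Ask Jeeves', 'slurp', 'crawl'];

    foreach ($tokens as $token) {
        // stripos() is case-insensitive, so 'slurp' also matches 'Slurp'.
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }

    return false;
}

// Same 0/1 flag as before, always defined even when the header is missing.
$isSearchEngine = isSearchEngineUserAgent($_SERVER['HTTP_USER_AGENT'] ?? null) ? 1 : 0;
```

Keeping the tokens in one array means a new crawler only needs one extra entry rather than another strpos clause.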

Then add a Smarty variable to the smarty assign block so it can be used in .tpl files:

'is_search_engine' => $isSearchEngine,
