Guesmi koussay Posted November 1, 2024

I'm seeing a sudden spike in the "Indexed, though blocked by robots.txt" status on my PrestaShop-based e-commerce site. It indicates that certain pages are being indexed by search engines even though they are blocked in the robots.txt file. I'm concerned that this could hurt the site's visibility and SEO performance, since it may confuse search engines and affect rankings.

Looking at the details in Google Search Console, I noticed that most of the affected pages include URL parameters. When I reviewed the robots.txt file, however, I confirmed that these specific parameters are disallowed there.

I would appreciate any insights on potential causes or steps to resolve this issue.
ST-THEMES Posted December 24, 2024

The "Indexed, though blocked by robots.txt" status in Google Search Console means that Google has indexed URLs that your robots.txt file tells it not to crawl. A robots.txt rule blocks crawling, not indexing, so search engines may still index a blocked page if they find its URL through other means, such as backlinks. Here's how you can approach the problem:

Potential Causes

URL Parameters Indexation: Google might be indexing parameterized URLs if it finds them through external links or internal navigation. Even though they are blocked in robots.txt, Google can index a page's URL without crawling it.

Misconfigured Robots.txt Rules: The robots.txt rules might not be specific or comprehensive enough to handle all variations of the parameterized URLs.

Canonical Tags Confusion: If canonical tags point to blocked pages, or canonicalization is used inconsistently, it can mislead search engines.

Sitemaps Contain Blocked URLs: If your sitemap includes URLs that are disallowed in the robots.txt file, it sends mixed signals to search engines.

Backlinks to Blocked Pages: External links pointing to parameterized URLs can lead to indexing, even if those URLs are blocked in robots.txt.
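To check the robots.txt and sitemap causes above, it can help to test URL paths against your Disallow patterns offline. The following is only a minimal sketch in Python, not an official tool: the function names, the rules and the paths are all made up for illustration, so substitute the Disallow lines from your own robots.txt and the paths from your own sitemap. It assumes Google-style pattern matching (`*` matches any run of characters, a trailing `$` anchors the end of the URL) and deliberately ignores Allow rules and per-user-agent groups.

```python
import re


def rule_to_regex(rule):
    """Translate one Disallow pattern into a compiled regex.

    Assumes Google-style matching: '*' is a wildcard and a trailing
    '$' anchors the end of the URL. Anything else is matched literally
    as a prefix of the path (including the query string).
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal parts and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, disallow_rules):
    """Return True if any non-empty Disallow pattern matches the path."""
    return any(rule_to_regex(r).match(path) for r in disallow_rules if r)


def blocked_paths(paths, disallow_rules):
    """Return the paths (e.g. taken from a sitemap) that are disallowed."""
    return [p for p in paths if is_blocked(p, disallow_rules)]


# Hypothetical rules and paths, for illustration only.
example_rules = ["/*?order=", "/*&order=", "/cart"]
example_paths = [
    "/shoes?order=price.asc",
    "/shoes",
    "/shoes?page=2&order=name",
    "/cart",
    "/category?q=x",
]

if __name__ == "__main__":
    for path in blocked_paths(example_paths, example_rules):
        print("disallowed but listed:", path)
```

Any path this reports is one that a sitemap would be advertising while robots.txt disallows it. Running the affected URLs from Search Console through `is_blocked` also shows whether a parameter variation (for example `&order=` rather than `?order=`) slips past the rules.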