
Controller=404 Automatically generated junk pages


David Li

Google has crawled a lot of junk pages on my site. These pages seem to generate themselves, one after another, and there are currently more than 2,000 of them. Stranger still, these pages can also be opened in a browser. What is the problem, and how can I solve it?

Here are some examples.

https://www.heegermaterials.com/34-evaporation-materials?%25252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525253Fcontroller=404&p=8

https://www.heegermaterials.com/34-evaporation-materials?%2525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525252525253Fcontroller=404&p=5

 

 

QQ图片20210104133805.png


  • 1 month later...

First issue.

The only thing I can think of: %25 is the percent-encoded form of the % sign in a URL. Check in Shop Settings - Traffic & SEO - URL Schemes (last section) whether any of your URL skeletons contain a % sign. If so, replace it with another separator. If that solves the issue, file a Change Request to disallow % in URL skeletons.

Try clearing the cache: Advanced settings - Performance - Empty Cache. If you have access to your server, remove all the directories in <document root>/app/cache (probably only prod, but perhaps also dev or others).

Second issue:

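1. Turn debugging off: Advanced settings - Performance - Debug Mode - Debug Mode switch.

Redirects then happen transparently.

The core problem is that in Catalog - Products - "cdte-target" - SEO tab - Friendly URL, you have capital letters. Capitals are allowed in a URL, but URL paths are case-sensitive, so a friendly URL with capitals is a different address from its lowercase form, which can lead to redirects and duplicate URLs. Try not to put capitals in your friendly URLs.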

Hope this helps,

Rg,

Leo

 

Thank you, I will check and try it.

Hi Leo,

First issue:

There is no "%" in the URL schemes; they are the default settings. I have many other categories, but only https://www.heegermaterials.com/34-evaporation-materials? has this problem. It is weird. I have attached screenshots of my settings for your reference, together with a list of the URLs.

 

Second issue:

Yes, you are correct. I added the products by CSV, and the URLs contain capitals. That is not good for SEO. But Google has already crawled all the pages, so I'm not sure what I can do now. Maybe the best option for the moment is to leave things as they are.

Thank you very much for your help!

 

image.png

image.png

URLs.csv


Strange. The pattern I see in the URL is:

%25 => %

%3F => ?

&p=1

The last one indicates pagination in your category browsing. Which Sitemap.xml module are you using, the standard PS one? Have you regenerated the Sitemap.xml (best to put it under a cron job)? What is the URL of your Sitemap.xml?
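To make the pattern concrete, here is a minimal Python sketch that peels the encoding layers off one at a time. The `fully_decode` helper and the shortened `sample` string are illustrative assumptions, not taken from the shop; the real junk URLs simply have far more `25` pairs.

```python
from urllib.parse import unquote


def fully_decode(text, max_rounds=1000):
    """Percent-decode repeatedly until the text stops changing.

    Returns the decoded text and the number of decoding rounds needed.
    """
    rounds = 0
    while rounds < max_rounds:
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
        rounds += 1
    return text, rounds


# Shortened stand-in for the junk query string: "%" + five "25" pairs + "3F".
sample = "%" + "25" * 5 + "3F" + "controller=404&p=8"

decoded, rounds = fully_decode(sample)
print(decoded)  # ?controller=404&p=8
print(rounds)   # 6
```

Each round turns one `%25` back into `%`, and the last round turns `%3F` into `?`. So the number of `25` pairs in a junk URL suggests how many times the same fragment was escaped again.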

Are there any hidden characters after the 'evaporation-materials' Friendly URL?

 

Second issue: it is best to write a little script to batch-update all friendly URLs (link_rewrite in yourDBPrefix_product_lang) to lowercase. Then regenerate your sitemap.xml, your search indexes, etc.


Hi,

I suppose the %25 comes from a wrongly rendered pagination link on the evaporation-materials category page. The strange thing is that you have controller=404, which means that the link/page was a 404 page. Did you have any URL rewrite modules, or canonical/hreflang modules?
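As a rough illustration of that suspicion (this is not PrestaShop code, and the `reencode` helper is purely hypothetical), here is what happens if a link builder escapes an already-escaped fragment every time a page is rendered and crawled again:

```python
from urllib.parse import quote


def reencode(fragment):
    """Escape a fragment once more, leaving "=" and "&" untouched."""
    return quote(fragment, safe="=&")


# Hypothetical starting point: a "?controller=404" fragment ends up inside
# a pagination link and is escaped again on every later render.
fragment = "?controller=404"
history = []
for _ in range(3):
    fragment = reencode(fragment)
    history.append(fragment)

for step in history:
    print(step)
# %3Fcontroller=404
# %253Fcontroller=404
# %25253Fcontroller=404
```

If a crawler follows each newly rendered link, every visit adds one more `25`, which would explain both the growing length and the ever-increasing number of pages.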

Kind regards, Leo

Yes, it is strange. The URL contains Fcontroller=404, but the link works: I can open the URL and it shows my products instead of a 404 page. I don't have any URL rewrite modules or canonical/hreflang modules. What I do have is an "enquiry" module, which was built by my friend. Do you need to check that module? BTW, it is also strange that the other categories are fine, and these links are not in my sitemap.

Here's my sitemap for your reference. It seems that these links are not in the sitemap.

https://www.heegermaterials.com/1_en_0_sitemap.xml

There are no hidden characters after the 'evaporation-materials' Friendly URL. If I browse my website directly, the link is correct. It is very strange.

https://www.heegermaterials.com/34-evaporation-materials?p=3

 

Second issue: do you mean I should update all URLs to lowercase, then 301-redirect all uppercase links to the lowercase links?

In your database (phpMyAdmin?) run the following SQL script.

-- Lowercase the friendly URL (link_rewrite) of every row in the product language table.
UPDATE `ps_product_lang`
  SET `link_rewrite` = LOWER(`link_rewrite`)
  WHERE 1=1;

This only works if you have the default _DB_PREFIX_ of 'ps_'; otherwise change ps_ to your prefix.
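If you want to see what the statement does before touching the live database, here is a small dry-run sketch in Python. It uses an in-memory SQLite table as a stand-in for MySQL; the sample rows are made up, the table is cut down to three columns, and every column name other than link_rewrite is an assumption.

```python
import sqlite3

# In-memory stand-in for the real table; nothing here touches the shop.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ps_product_lang ("
    "id_product INTEGER, id_lang INTEGER, link_rewrite TEXT)"
)
conn.executemany(
    "INSERT INTO ps_product_lang VALUES (?, ?, ?)",
    [(1, 1, "CdTe-Target"), (2, 1, "zinc-oxide")],  # invented sample rows
)

# Preview: which friendly URLs would actually change?
to_change = conn.execute(
    "SELECT id_product, link_rewrite FROM ps_product_lang "
    "WHERE link_rewrite <> LOWER(link_rewrite)"
).fetchall()
print(to_change)  # [(1, 'CdTe-Target')]

# The same UPDATE as above, applied to the stand-in table.
conn.execute("UPDATE ps_product_lang SET link_rewrite = LOWER(link_rewrite)")
after = [
    row[0]
    for row in conn.execute(
        "SELECT link_rewrite FROM ps_product_lang ORDER BY id_product"
    )
]
print(after)  # ['cdte-target', 'zinc-oxide']
```

Either way, take a database backup before running the UPDATE on the real table.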

That converts all your friendly URLs (on products) to lowercase. Regenerate your search index, sitemaps, and any other search modules you might have, to reflect those changes. If necessary, update your menus to use lowercase as well.

 

Since they are basically the same, PrestaShop will 301-redirect the ones with capitals to the lowercase ones.

Edited by elburgl69 (see edit history)

Could it be that your shop once generated wrong URLs (a bug or something)? Google indexed your page at that time and found all those URLs. Since those URLs are basically valid category URLs, they return a valid page, and Google keeps them in its database.
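A quick way to see why such a URL still looks like a normal category page is to split it the way any URL parser would. This is a sketch in Python, with the junk URL shortened to four `25` pairs; whether PrestaShop really ignores the unknown parameter is my assumption.

```python
from urllib.parse import urlsplit, parse_qsl

# Shortened stand-in for one of the junk URLs, next to a normal page link.
junk = (
    "https://www.heegermaterials.com/34-evaporation-materials?"
    + "%" + "25" * 4 + "3Fcontroller=404&p=8"
)
clean = "https://www.heegermaterials.com/34-evaporation-materials?p=8"


def describe(url):
    """Return the path and the query parameters, decoded once."""
    parts = urlsplit(url)
    return parts.path, dict(parse_qsl(parts.query))


junk_path, junk_params = describe(junk)
clean_path, clean_params = describe(clean)

print(junk_path == clean_path)        # True: same category path
print(junk_params["p"])               # 8: same page number
print("controller" in junk_params)    # False: the key is the garbled string
```

Both URLs point at the same path with the same `p` value. The garbled part is just one extra parameter with a strange name, so the category page renders as usual.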

In SEO & URL, try changing your category route to:

{id}/{rewrite}

And the layered nav route to:

{id}/{rewrite}{/:selected_filters}

 

Your category view is then reached via: <home>/34/evaporation-materials

The old routes then become invalid, so Google gets real 404s on those pages.

But beware: this invalidates all your current category links, which affects your Google ranking for those pages.
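To show the effect of the route change, here is a simplified Python sketch. It is not PrestaShop's dispatcher: `route_to_regex` is a hypothetical helper, and the current route `{id}-{rewrite}` is my assumption based on your existing URLs.

```python
import re


def route_to_regex(rule):
    """Turn a route rule with {id} and {rewrite} into a compiled regex."""
    pattern = rule.replace("{id}", r"(?P<id>[0-9]+)")
    pattern = pattern.replace("{rewrite}", r"(?P<rewrite>[A-Za-z0-9_-]*)")
    return re.compile(pattern)


old_route = route_to_regex("{id}-{rewrite}")  # assumed current rule
new_route = route_to_regex("{id}/{rewrite}")  # rule suggested above

old_url = "34-evaporation-materials"
new_url = "34/evaporation-materials"

print(bool(old_route.fullmatch(old_url)))  # True: works today
print(bool(new_route.fullmatch(old_url)))  # False: old links stop matching
print(bool(new_route.fullmatch(new_url)))  # True: new link format
```

Once the old format no longer matches any route, the junk URLs built on it should stop resolving, but so do all the good links in that format, which is the ranking risk mentioned above.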

Rg,

Leo

 

Edited by elburgl69 (see edit history)

Hi Leo,

 

Thank you very much for your help. I will try it. 
