Pippo3000 Posted November 3, 2010

hi, my robots.txt is already pretty long, but I realized that many but not all directories are listed in it. It says, for example:

# Directories
Disallow: /classes/
Disallow: /config/

but what about folders like /js, /docs or /css? Are these left out on purpose, or would it make sense to add them to the disallow list too? Or, e.g., my custom /cms directory with PDF files for terms & conditions (for download).

Also, I read somewhere the tip to disallow specific crawlers completely, e.g.:

User-agent: EmailCollector
Disallow: /

User-agent: GagaRobot
Disallow: /

But where in the robots.txt would I put this? Anywhere? I am just wondering whether these blocks need to come before the part with User-agent: * or after it. I mean, are those crawlers really excluded if I add them at the top of my robots.txt and then, some lines further down, allow all crawlers again? Or are they still excluded?

thanks
phil
Bubblemaker Posted November 4, 2010

Hello Pippo, have a look at this page, it helps a lot...
Pippo3000 Posted November 6, 2010

Quoting Bubblemaker: "Hello Pippo, have a look at this page, it helps a lot..."

To some extent, yes. But let me rephrase my question: can I, as a limited safety feature, disallow all folders and subfolders for crawlers, i.e. also those that the PS generator does not include in the default robots.txt? I mean, if I generate and submit a sitemap to Google, is that enough to get my site into Google? Or where are, e.g., the friendly URLs 'stored', i.e. which subfolder do I need to leave open/allowed to ensure that Google crawls and lists my site?

best
phil
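On the ordering question from the first post (do the per-crawler blocks go before or after User-agent: *), one way to get a feel for it is to run both layouts through a robots.txt parser. The sketch below uses Python's standard-library urllib.robotparser. The helper name is_allowed, the two sample files and the test paths are made up for illustration. It only shows how this one parser reads the file: real crawlers may behave differently, and badly behaved ones such as email harvesters can ignore robots.txt altogether.

```python
from urllib.robotparser import RobotFileParser

# Two hypothetical robots.txt layouts: same groups, different order.
BLOCKED_FIRST = """\
User-agent: EmailCollector
Disallow: /

User-agent: *
Disallow: /classes/
Disallow: /config/
"""

BLOCKED_LAST = """\
User-agent: *
Disallow: /classes/
Disallow: /config/

User-agent: EmailCollector
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if this parser lets `user_agent` fetch `path`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    layouts = (("blocked first", BLOCKED_FIRST), ("blocked last", BLOCKED_LAST))
    for label, text in layouts:
        for agent in ("EmailCollector", "Googlebot"):
            for path in ("/index.html", "/classes/Foo.php"):
                print(label, agent, path, is_allowed(text, agent, path))
```

With this parser, both layouts give the same answers: a crawler that has its own named group follows that group, and the User-agent: * group is only the fallback for crawlers with no named group. So EmailCollector stays excluded wherever its block sits, while a crawler such as Googlebot is only kept out of /classes/ and /config/.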