I have a hard time keeping my logs tidy. They are basically full of 404 CHttpException traces. Many of them are requests for files (mostly images) that are not present in my installation, and I don’t link to them anywhere.
I thought that changing the .htaccess file would do the trick:
Does the site replace another that was on the same domain?
If the URLs don’t look right, perhaps they are remnants of the old site or, in the case of images, they could be referenced by some old CSS classes that are still lying around.
You might also see the old URLs if, for example, someone views a page from Google’s cache.
Ah, yes, this is a clean, new installation on an old domain.
I have been running it for a couple of weeks now and have requested reindexing through Webmaster Tools. Do you know how long these old files will keep being requested, or whether there is anything I can do to force Google to flush the old information?
I replaced a site back in August and the old links still appear in Webmaster tools.
They say over time they will disappear but I’m still waiting lol. Good luck with that.
EDIT:
Actually I lied… I last looked at Webmaster about 3 weeks ago, checked again now and the links are gone, wow!
I’m glad about that, because the old site must have had some weird loop in the paging for its product lists: there were over 20,000 links that all went to the same 100 products.
If most of the URLs in the log share a common format, you could put a rule in .htaccess that 301 redirects them to whichever existing page is the most suitable replacement.
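Something along these lines might work as a starting point. It is only a sketch: the `old-catalog/` prefix and the `/products` target are made-up placeholders, so swap in whatever pattern actually shows up in your log and whichever page fits best. It also assumes mod_rewrite is enabled on your server.

```apache
# Sketch only: "old-catalog/" and "/products" are hypothetical placeholders.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Permanently (301) redirect every request under the old prefix
    # to a single existing page. In .htaccess the pattern is matched
    # against the path without its leading slash.
    RewriteRule ^old-catalog/ /products [R=301,L]
</IfModule>
```

If your .htaccess already has a rule that sends everything to index.php, this one probably needs to sit above it, otherwise the request reaches the application and gets logged as a 404 before the redirect ever applies.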
Yes, I was aware of that. The thing is, when you have incoming links from other sites, which help SEO, you don’t want to lose those links. It is much better to 301 them if that is the case.
Might be a good idea to try and track where those links are coming from.