Why are there so many dead links in the directory?
A few days ago I downloaded the raw directory data and loaded it into our web-safety bot, which scans pages for popups, unsafe scripts, third-party cookies, and tracking.
I am now about 60% done fetching all the pages in the directory, and so far more than 50,000 of them return 404. I have personally tested half of those, and every one is a genuine 404!
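For anyone who wants to reproduce this, here is a minimal sketch of the kind of check I am running (hypothetical code, not our actual bot): it issues a HEAD request per URL and keeps the ones that really come back 404, rather than trusting a cached or intermediate result.

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

def check_url(url, timeout=10):
    """Return the HTTP status code for url, or None if unreachable."""
    req = Request(url, method="HEAD", headers={"User-Agent": "link-checker/0.1"})
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.status
    except HTTPError as e:
        return e.code   # a real 404 surfaces here
    except URLError:
        return None     # DNS failure, refused connection, etc.

def dead_links(urls, fetch=check_url):
    """Return the subset of urls whose status is exactly 404."""
    return [u for u in urls if fetch(u) == 404]
```

The `fetch` parameter is there so the 404 filter can be tested against recorded statuses without hitting the network.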
So my question is...
When is the directory cleared of 404s, and do you check links before releasing new data? I ask because over 80% of these dead links also appeared in previous raw data releases.
Thanks
j