An Outsider's Report of ODP Multiple Listings

We have been researching the ODP multiple-listing problem. Just browsing through, we see tons of spammy sites with hundreds of listings. We don't have the time to report them all ourselves, so we are making this resource available to everyone.

DMOZ Top Listed Domains

We hope it helps in the fight against abuse. :)
 

apeuro

Member
Joined
Mar 1, 2002
Messages
1,424
Re: Multiple Listings in sorted order

Thanks for the list - it's fantastic!

I already caught a really slick hotel reservation affiliate outfit. Would it be possible for you to update the list on a regular basis? Secondly, would it be possible for you to drop the inclusion threshold to 5 listings instead of 10? It's really among the domains with 5-20 listings that abusive sites are most likely to lurk.
 

windharp

Meta/kMeta
Curlie Meta
Joined
Apr 30, 2002
Messages
9,204
Re: Multiple Listings in sorted order

If you generate the list with Perl or PHP, we have some editor-owned servers we could run it on, if you want to have less work with it. :)
 

Re: Multiple Listings in sorted order

Very useful list. Thanks for making it available. :D
 

dfy

Member
Joined
Aug 2, 2002
Messages
2,044
Re: Multiple Listings in sorted order

Excellent resource, thanks very much.

You might want to include some method for approving certain sites. Most of the top fifty are hosts or information sites and are obvious candidates for multiple listings (like Geocities, Tripod, AOL, etc.). You also suffer from the same disease that caused SpamCop to block all mail from the UK one day: you only go down to second-level domains. This means that at number 17 you have edu.au, which is of course the umbrella domain for all educational establishments in Australia. At number 60 you have sch.uk, which is where all UK school sites live. I notice that co.uk (the umbrella for all UK companies) is missing, so either the scan was broken or they are already catered for in some way.
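To illustrate the second-level domain point, here is a minimal Python sketch of counting listings per site while treating umbrella domains such as edu.au, sch.uk and co.uk as roots. The function names, the tiny suffix set and the example.* hosts are assumptions made for illustration; nothing here reflects how the actual list is generated, and a real run would need a complete public-suffix list.

```python
from collections import Counter
from urllib.parse import urlsplit

# Illustrative only: the three umbrella domains named in this thread.
# A real run would load a complete public-suffix list instead.
MULTI_LABEL_SUFFIXES = {"edu.au", "sch.uk", "co.uk"}


def registrable_domain(url):
    """Return the domain a listing should be counted under."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    labels = host.split(".")
    if len(labels) < 2:
        return host
    last_two = ".".join(labels[-2:])
    # If the last two labels are an umbrella domain, keep one more label.
    if last_two in MULTI_LABEL_SUFFIXES and len(labels) >= 3:
        return ".".join(labels[-3:])
    return last_two


def count_listings(urls, threshold=5):
    """Count listings per registrable domain, keeping those at or above threshold."""
    counts = Counter(registrable_domain(u) for u in urls)
    return [(domain, n) for domain, n in counts.most_common() if n >= threshold]
```

With this grouping, http://www.example.edu.au/ is counted under example.edu.au rather than under edu.au, so the umbrella domains no longer show up as single heavily listed sites.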
 

beebware

Member
Joined
Mar 25, 2002
Messages
1,070
Re: Multiple Listings in sorted order

We do also have a similar system at http://dmoz.org/edoc/addurl.txt which lists all domains with more than 100 links (however, as you may be able to tell from the list, it includes all listings in all parts of the directory - including some 'editor only' areas).
 

apeuro

Member
Joined
Mar 1, 2002
Messages
1,424
Re: Multiple Listings in sorted order

The problem with addurl.txt is that it hasn't been updated in an extremely long time - over a year IIRC.
 

Re: Multiple Listings in sorted order

Yes, we plan on updating the list every time an RDF is available. The second-level domain problem should also improve over time; we just need to add these extensions to our lists as roots. So hopefully you will not see the "edu.au" type problems after a new RDF dump.

Yes, we can make the list go down to 5 listings. Done. :)
We are happy to help. We just want to see the abuse stop.

>> You might want to include some method for approving
>> certain sites. Most of the top fifty are hosts or information
>> sites and are obvious candidates for multiple listings (like
>> Geocities, Tripod, AOL, etc.).

I don’t believe these sites are pure either. Geocities has 98,373 listings. I believe a script that went through and checked those 98,373 for the “This account has been terminated” message would cut those listings in half. If a meta would use that list, I would be glad to crawl these bulk free hosting places! I think DMOZ could be really clean itself if more attention was given to automating the checking of the bulk top 100 hosts. Check for 404s on those hosts.
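As a rough sketch of that check, the Python below flags a listing as dead if it returned a 404 or if its page body contains a known "terminated account" string. Only the first marker comes from this thread; the function names and the (url, status, body) input shape are assumptions, and fetching the pages is deliberately left to the caller.

```python
# The only marker string taken from this thread; real hosts would
# each need their own strings added after checking by hand.
TERMINATED_MARKERS = ("this account has been terminated",)


def looks_terminated(body, markers=TERMINATED_MARKERS):
    """True if the page body contains any known terminated-account string."""
    text = body.lower()
    return any(marker in text for marker in markers)


def dead_listings(pages, markers=TERMINATED_MARKERS):
    """pages: iterable of (url, http_status, body) tuples already fetched.

    Returns the URLs that are 404s or that serve a terminated-account page.
    """
    return [
        url
        for url, status, body in pages
        if status == 404 or looks_terminated(body, markers)
    ]
```

The point of the string check is that many free hosts answer with a normal 200 page for a closed account, so a plain 404 check never sees them.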
 

hutcheson

Curlie Meta
Joined
Mar 23, 2002
Messages
19,136
Re: Multiple Listings in sorted order

>If a meta would use that list, I would be glad to crawl these bulk free hosting places! I think DMOZ could be really clean itself if more attention was given to automating the checking of the bulk top 100 hosts. Check for 404s on those hosts.

We check for 404s, but I believe you are right in thinking that a check for known "terminated account" strings would catch a lot more dead "free sites."

If you produced this, I believe I can guarantee that it would be used (and not just by metas, this is prime editall-permissions territory.)

Please try it out and let us know what you find.
 

dfy

Member
Joined
Aug 2, 2002
Messages
2,044
Re: Multiple Listings in sorted order

>> the second level domain problem should get better over time, we will be correcting that and getting better in the future <<

Great stuff, thanks.


>> I believe a script that went through and checked ... for the “This account has been terminated” message would cut those listings in half. <<

That's much better than just 'approving' sites to get them removed from the listings. Thanks once again. I'm looking forward to the RDF problem being fixed so that we can get another run with this information available.
 

sabre23t

Member
Joined
Mar 26, 2002
Messages
252
Re: Multiple Listings in sorted order

Hi nameintel. Great resource. ;-)

When looking at, for example ...

http://www.whois.sc/geocities.com
GEOCITIES.COM
Website Title: Yahoo! GeoCities
DMOZ: 98373 listings
Website Status: Active
Web server hosts: 7 other websites hosted
IP Address: 66.218.77.68
Visit Website: www.geocities.com
Name Server:
ICANN Registrar:
<no nslookup info?>

and

http://www.whois.sc/bcsports.net
BCSPORTS.NET
Website Title: BC Gaming
Server Type: Microsoft-IIS/5.0
DMOZ: 2 listings
Website Status: Active
Web server hosts: 2 other websites hosted
IP Address: 196.40.39.253
Visit Website: www.bcsports.net
Name Server: NS1.DIGSOLUTIONS.NET NS2.DIGSOLUTIONS.NET
ICANN Registrar: NETWORK SOLUTIONS, INC.
<and rest of nslookup info>

... can you make that "7" and "2 other websites hosted" clickable to a list of the domains hosted on that same IP?

I think that would make your resource useful even while reviewing a new domain submission not yet listed in DMOZ.
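For what it's worth, the "other websites hosted" list could be derived along these lines. This is a sketch in Python under the assumption that (domain, IP) pairs are already available; the function names are hypothetical, and the example.* domains and 192.0.2.x addresses in the usage note are placeholders, not real data from the site.

```python
from collections import defaultdict


def domains_by_ip(records):
    """Invert (domain, ip) pairs into {ip: sorted list of unique domains}."""
    grouped = defaultdict(list)
    for domain, ip in records:
        grouped[ip].append(domain.lower())
    return {ip: sorted(set(names)) for ip, names in grouped.items()}


def others_on_same_ip(domain, records):
    """List the other domains that share an IP address with `domain`."""
    domain = domain.lower()
    ip_of = {name.lower(): ip for name, ip in records}
    ip = ip_of.get(domain)
    if ip is None:
        return []
    return [d for d in domains_by_ip(records)[ip] if d != domain]
```

Given records such as ("a.example.com", "192.0.2.1") and ("b.example.com", "192.0.2.1"), calling others_on_same_ip("a.example.com", records) returns the neighbours on that address, which is the list the count would link to.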
 

giz

Member
Joined
May 26, 2002
Messages
3,112
Re: Multiple Listings in sorted order



Is there any merit in also making the list, which is currently sorted numerically by number of published listings, available as a straight alphabetical sort by domain, ignoring (but still quoting) the numbers? I can think of a few uses for that, such as looking for similar domain names where one is listed 10 times and the other some other number of times, which currently makes them harder to find.
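In code terms the request is small. A sketch in Python, assuming the list is held as (domain, listing count) pairs; the function name is hypothetical and the two sample domains are the ones quoted earlier in this thread:

```python
def sort_alphabetically(counts):
    """Order (domain, listings) pairs by domain name, keeping the counts."""
    return sorted(counts, key=lambda item: item[0].lower())


# Listing counts as quoted earlier in the thread.
by_count = [("geocities.com", 98373), ("bcsports.net", 2)]
by_name = sort_alphabetically(by_count)
```

Similar names then sit next to each other regardless of how many listings each one has.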
 

Re: Multiple Listings in sorted order

sabre23t,
Thanks for pointing that out. We just corrected that bug; there were multiple domains listed at Verisign for that one. People who register hosts with the same name as a domain caused the bug, but we have fixed it. Whew!

GEOCITIES.COM.JCHOLLOWAY.COM
GEOCITIES.COM

>> ... can you make that "7" and "2 other websites hosted" clickable
>> to a list of the domains hosted on that same IP?

Yes, we can make that clickable. We are still deciding on the best way to do this.

>> straight alphabetical sort by domain

Looks like you have been peeking over our shoulders; something like that is coming very soon. Not that plain, though.
 

Re: Multiple Listings in sorted order

Bumping up. Now that we've had a handful of new database updates, can we have a new list of multiple domains? :)
 

Re: Multiple Listings in sorted order

The DMOZ database on our site will now be automatically updated within 24 hours of a new RDF. Hope that helps.

BTW, we have also added Yahoo Directory Listings. So it is possible to see how many listings there are in other directories as well now.
 

Re: Multiple Listings in sorted order

Very cool. This should maybe be reposted in a more appropriate forum. Which one would be best?
 

hutcheson

Curlie Meta
Joined
Mar 23, 2002
Messages
19,136
Re: Multiple Listings in sorted order

Well, mon, I suppose you're not still wondering whether anyone ever uses it.

Thanks again.
 