How many categories and subcategories are there?

calande

Member
Joined
Mar 9, 2005
Messages
10
Hello,

I had a look at the About page and was wondering how many categories and subcategories there are on DMOZ. I didn't find the information in the sticky topics here either. I am currently running a script that retrieves categories from the RDF file, and it has already extracted more than 30,000 categories... It's crazy :eek:
How many are there?
Thanks,
 

aeclark

Curlie Meta
Joined
Sep 30, 2004
Messages
68
The precise figure varies, as editors are regularly creating/merging/deleting categories. As an approximation though, the front page (dmoz.org) states at the bottom of the page "4,581,289 sites ... over 590,000 categories"
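
For a count taken from the dump itself rather than the front page, a minimal sketch follows. It assumes each category in the structure dump opens with a `<Topic ...>` tag at the start of its own line; that layout and the sample lines are illustrative assumptions, so check them against your copy of the file. `count_topics` is a hypothetical helper name.

```python
from typing import Iterable


def count_topics(lines: Iterable[str]) -> int:
    """Count the lines that open a <Topic ...> element (assumed one per category)."""
    return sum(1 for line in lines if line.lstrip().startswith("<Topic "))


# Illustrative sample only; the real dump is far larger.
sample = [
    '<Topic r:id="Top/Arts">',
    '  <d:Title>Arts</d:Title>',
    '</Topic>',
    '<Topic r:id="Top/Arts/Music">',
    '  <d:Title>Music</d:Title>',
    '</Topic>',
]

print(count_topics(sample))
```

Because the function only needs an iterable of lines, an open file handle can be passed in directly, for example `open("structure.rdf.u8", encoding="utf-8", errors="replace")`, so the dump is streamed line by line instead of being loaded into memory.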

Regards;

aeclark
 

calande

Member
Joined
Mar 9, 2005
Messages
10
Oh, wow... It is going to take days to finish inserting the categories into my database :(
 

hansfn

Curlie Meta
Joined
Aug 4, 2005
Messages
26
Nope, it should take you less than an hour.

Stats from running the insert script included with phpODPWorld on an old machine with 256 MB RAM (!) and an AMD Athlon XP 1.7GHz CPU:

Code:
======================================================================
======================================================================
Inserting RDF into MySQL using phpODPWorld
======================================================================
======================================================================

CONTENT:
# time ./phpodpworld.pl content2db config-mysql-test.pl content.rdf.u8
Info: Loading content records
Info: Record 1000
Info: Record 2000
[...]
Info: Record 4825000
Info: Record 4826000
Info: 4826113 loaded

real    42m7.930s
user    34m26.488s
sys     0m33.846s

======================================================================

STRUCTURE:
# time ./phpodpworld.pl structure2db config-mysql-test.pl structure.rdf.u8
Info: Loading structure records
Info: Record 1000
Info: Record 2000
[...]
Info: Record 724000
Info: 724598 loaded

real    13m25.975s
user    10m7.791s
sys     0m16.187s

======================================================================
======================================================================
Inserting RDF into PostgreSQL using phpODPWorld
======================================================================
======================================================================

CONTENT:
# time ./phpodpworld.pl content2db config-mysql-test.pl content.rdf.u8
Info: Loading content records
Info: Record 1000
Info: Record 2000
[...]
Info: Record 4825000
Info: Record 4826000
Info: 4826113 loaded

real    42m28.023s
user    35m8.967s
sys     0m36.682s

======================================================================

STRUCTURE:
Info: Loading structure records
Info: Record 1000
Info: Record 2000
[...]
Info: Record 724000
Info: 724598 loaded

real    16m58.283s
user    10m15.220s
sys     0m19.324s
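
To put the figures above in perspective, here is a small sketch that converts the `real` times into seconds, derives a rough records-per-second rate, and adds up the content and structure runs for each database. The record counts and times are copied from the output above; the helper names `parse_time` and `records_per_second` are made up for this example.

```python
import re


def parse_time(text: str) -> float:
    """Convert a `time`-style duration such as '42m7.930s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def records_per_second(records: int, real_time: str) -> float:
    """Average insert rate over the whole run, based on wall-clock time."""
    return records / parse_time(real_time)


# (database, step, records loaded, wall-clock time), taken from the logs above.
RUNS = [
    ("MySQL", "content", 4826113, "42m7.930s"),
    ("MySQL", "structure", 724598, "13m25.975s"),
    ("PostgreSQL", "content", 4826113, "42m28.023s"),
    ("PostgreSQL", "structure", 724598, "16m58.283s"),
]

totals = {}
for database, step, records, real_time in RUNS:
    rate = records_per_second(records, real_time)
    print(f"{database} {step}: {rate:.0f} records/s")
    totals[database] = totals.get(database, 0.0) + parse_time(real_time)

for database, seconds in totals.items():
    print(f"{database} total: {seconds / 60:.1f} minutes")
```

Both totals come out below 60 minutes, which is where the "less than an hour" estimate comes from.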

PS: I'm planning to release a new version of phpODPWorld in the coming week that fixes some minor quirks.
 