calande Posted March 26, 2009

Hello, I had a look at the About page and was wondering how many categories and subcategories there are on DMOZ. I didn't find the information in the sticky topics here either. I am currently running a script that retrieves categories from the RDF file, and it has extracted more than 30,000 categories so far. It's crazy! How many are there? Thanks,
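For anyone who just wants the count without loading a database, a minimal sketch of such a counting script follows. It assumes that each category in structure.rdf.u8 opens with a tag of the form `<Topic r:id="Top/...">`; that shape, the regular expression, and the function names are illustrative assumptions, not something confirmed in this thread. It scans line by line rather than using an XML parser, so it stays cheap on memory.

```python
import re
from typing import Iterable

# Assumed shape of a category opening tag in the structure dump, e.g.
#   <Topic r:id="Top/Arts/Music">
# This pattern is an assumption about the file format, not a documented API.
TOPIC_RE = re.compile(r'<Topic\s+r:id="([^"]*)"')


def count_topics(lines: Iterable[str]) -> int:
    """Count lines that open a Topic element."""
    return sum(1 for line in lines if TOPIC_RE.search(line))


def count_topics_in_file(path: str) -> int:
    """Stream the dump from disk so the whole file is never held in memory."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return count_topics(fh)


if __name__ == "__main__":
    # Small in-memory sample standing in for the real dump.
    sample = [
        '<Topic r:id="Top">',
        "  <catid>1</catid>",
        "</Topic>",
        '<Topic r:id="Top/Arts">',
        '  <narrow r:resource="Top/Arts/Music"/>',
        "</Topic>",
    ]
    print(count_topics(sample))
```

To run it against the real dump, call `count_topics_in_file("structure.rdf.u8")` with the path to your local copy.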
Meta aeclark Posted March 26, 2009

The precise figure varies, as editors regularly create, merge, and delete categories. As an approximation, though, the bottom of the front page (dmoz.org) states "4,581,289 sites ... over 590,000 categories".

Regards,
aeclark
calande Posted March 26, 2009

Oh, wow... Inserting the categories into my database is going to take days to complete.
Meta hansfn Posted March 29, 2009

Nope, it should take you less than an hour. Stats from running the insert script included with phpODPWorld on an old machine with 256 MB RAM (!) and an AMD Athlon XP 1.7GHz CPU:

======================================================================
======================================================================
Inserting RDF into MySQL using phpODPWorld
======================================================================
======================================================================
CONTENT:
# time ./phpodpworld.pl content2db config-mysql-test.pl content.rdf.u8
Info: Loading content records
Info: Record 1000
Info: Record 2000
[...]
Info: Record 4825000
Info: Record 4826000
Info: 4826113 loaded
real 42m7.930s
user 34m26.488s
sys 0m33.846s
======================================================================
STRUCTURE:
# time ./phpodpworld.pl structure2db config-mysql-test.pl structure.rdf.u8
Info: Loading structure records
Info: Record 1000
Info: Record 2000
[...]
Info: Record 724000
Info: 724598 loaded
real 13m25.975s
user 10m7.791s
sys 0m16.187s
======================================================================
======================================================================
Inserting RDF into PostgreSQL using phpODPWorld
======================================================================
======================================================================
CONTENT:
# time ./phpodpworld.pl content2db config-mysql-test.pl content.rdf.u8
Info: Loading content records
Info: Record 1000
Info: Record 2000
[...]
Info: Record 4825000
Info: Record 4826000
Info: 4826113 loaded
real 42m28.023s
user 35m8.967s
sys 0m36.682s
======================================================================
STRUCTURE:
Info: Loading structure records
Info: Record 1000
Info: Record 2000
[...]
Info: Record 724000
Info: 724598 loaded
real 16m58.283s
user 10m15.220s
sys 0m19.324s

PS! I'm planning to make a release of phpODPWorld the coming week which fixes some minor quirks.
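To put the timings above in perspective, here is a small sketch that converts the `real` values reported by `time` into records per second. The record counts and times are copied from the logs in this post; the helper names `parse_real_time` and `records_per_second` are made up for illustration.

```python
def parse_real_time(text: str) -> float:
    """Convert a `time` value such as '42m7.930s' into seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def records_per_second(records: int, real: str) -> float:
    """Average load rate over the whole run."""
    return records / parse_real_time(real)


# (records loaded, wall-clock time) taken from the logs above.
RUNS = {
    "MySQL content": (4826113, "42m7.930s"),
    "MySQL structure": (724598, "13m25.975s"),
    "PostgreSQL content": (4826113, "42m28.023s"),
    "PostgreSQL structure": (724598, "16m58.283s"),
}

if __name__ == "__main__":
    for label, (records, real) in RUNS.items():
        rate = records_per_second(records, real)
        print(f"{label}: {rate:.0f} records/s")
```

The content load works out to roughly 1,900 records per second on both databases, so even on that old machine the full dump fits comfortably within an hour.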