
Posted

Hello everybody

 

I am trying to put up a website that uses ODP data. I know that credit must be given when the data is used for commercial purposes.

Could anyone please point me in the right direction about what is found in the dump file? I know I will have to convert it to fill a MySQL database, but I need a few guidelines about it.

1. What content is in the file, the entire ODP content?

2. How can I make future updates? Do I have to download the entire file each time?

3. How often is a new file available?

4. What type of database can work with it? Is MySQL powerful enough?

 

Thanks in advance for any suggestions!

  • Meta
Posted

There are two major files containing the data, the content.rdf and structure.rdf files, which are VERY large, even when they're gzipped. There is a small example of each on the RDF server:

http://rdf.dmoz.org/rdf/content.example.txt

http://rdf.dmoz.org/rdf/structure.example.txt

(Those files may be a little out of date; some of the data has changed.)

The structure file contains all the categories, links, related categories, alternate language links, descriptions, etc. The content file contains all the sites. It can be put into a MySQL database, though queries on the content table of my own database often take 7-9 seconds apiece on a moderately fast server.
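As a rough illustration of the conversion step, here is a minimal Python sketch that streams a gzipped dump one site at a time and loads it into a table. Several things in it are assumptions, not confirmed details of the real dump: the element names (ExternalPage, Title, Description, topic) and the tiny sample data should be checked against content.example.txt, and sqlite3 is used only as a stand-in so the sketch runs without a MySQL server. The index on the category column is just one thing that might help with slow lookups; it is not a tested fix for the query times mentioned above.

```python
import gzip
import io
import sqlite3
import xml.etree.ElementTree as ET

# Tiny stand-in for the content dump. The element names are assumptions;
# compare them with content.example.txt before relying on them.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<RDF xmlns:r="http://www.w3.org/TR/RDF/"
     xmlns:d="http://purl.org/dc/elements/1.0/"
     xmlns="http://dmoz.org/rdf/">
  <ExternalPage about="http://example.com/a">
    <d:Title>Site A</d:Title>
    <d:Description>First sample site.</d:Description>
    <topic>Top/Computers</topic>
  </ExternalPage>
  <ExternalPage about="http://example.com/b">
    <d:Title>Site B</d:Title>
    <d:Description>Second sample site.</d:Description>
    <topic>Top/Arts</topic>
  </ExternalPage>
</RDF>
"""


def local(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_sites(stream):
    """Yield (url, title, description, topic) one site at a time."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "ExternalPage":
            continue
        fields = {local(child.tag): (child.text or "") for child in elem}
        yield (
            elem.get("about"),
            fields.get("Title", ""),
            fields.get("Description", ""),
            fields.get("topic", ""),
        )
        elem.clear()  # release the element's contents once it has been read


def load(stream, conn):
    """Create the table (hypothetical schema) and insert every site."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS site ("
        "url TEXT, title TEXT, description TEXT, topic TEXT)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS site_topic ON site (topic)")
    conn.executemany("INSERT INTO site VALUES (?, ?, ?, ?)", iter_sites(stream))
    conn.commit()


conn = sqlite3.connect(":memory:")
# For a real dump, pass the path of the downloaded .gz file to gzip.open
# instead of this in-memory buffer.
with gzip.open(io.BytesIO(gzip.compress(SAMPLE)), "rb") as stream:
    load(stream, conn)

rows = conn.execute(
    "SELECT url, title FROM site WHERE topic = ?", ("Top/Arts",)
).fetchall()
print(rows)
```

Reading through gzip.open avoids unpacking the whole file to disk first, and handling one site at a time keeps the parser from building the full document tree in memory.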

 

New RDF files are usually available every week, barring technical difficulties. The only way to update the data at this time is to download all the data again. We hope to be able to supply some alternative methods in the future.
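Since every update means a full download, one possible way to avoid rebuilding the database from scratch is to compare the new dump with the previous one and apply only the differences. A rough Python sketch, assuming each dump has already been reduced to a mapping from site URL to category (the function name and sample data are made up for illustration, and both mappings are held in memory, which may not suit the full dump):

```python
def diff_snapshots(old, new):
    """Compare two {url: category} mappings taken from successive dumps."""
    # URLs that appear only in the new dump
    added = {url: new[url] for url in new.keys() - old.keys()}
    # URLs that appear only in the old dump
    removed = sorted(old.keys() - new.keys())
    # URLs present in both dumps whose category changed
    moved = {
        url: new[url]
        for url in old.keys() & new.keys()
        if old[url] != new[url]
    }
    return added, removed, moved


# Hypothetical data standing in for last week's and this week's dump
old = {
    "http://example.com/a": "Top/Arts",
    "http://example.com/b": "Top/Computers",
    "http://example.com/c": "Top/Science",
}
new = {
    "http://example.com/a": "Top/Arts",
    "http://example.com/b": "Top/Business",
    "http://example.com/d": "Top/Games",
}

added, removed, moved = diff_snapshots(old, new)
print(added)    # rows to insert
print(removed)  # rows to delete
print(moved)    # rows to update
```

The three results map onto INSERT, DELETE and UPDATE statements, so only the changed rows need to be touched after each weekly download.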

Posted
Thank you for your reply, it solves a lot of the questions we had!
Posted

Does the ODP file contain ranking information too?

Will we have a ranking-based result or an alphabetical listing?

  • Meta
Posted

We don't do any ranking (that is a Google feature).

You can present the sites in any order you like.

I will not answer PMs or emails sent to me. If you have anything to ask, please use the forum.

  • Meta
Posted

The order of the sites is not a specific "ranking". I don't know whether it is random, by date added, or something else. We can't provide you with additional information since we don't have any, sorry.

If I think about it, given the process used to generate the search database, it seems likely that it is the same order the entries have in the RDF file. But that's just a guess.
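Since the data carries no ranking, the display order is up to whoever presents it. A small Python sketch of the two options mentioned in this thread, keeping the order in which entries were read or sorting them alphabetically by title (the sample entries are made up):

```python
# Hypothetical (title, url) pairs in the order they were read from the dump
sites = [
    ("Zebra Crafts", "http://example.com/z"),
    ("apple Farm", "http://example.com/a"),
    ("Mango Tools", "http://example.com/m"),
]

# Option 1: keep the order in which the entries were read
file_order = list(sites)

# Option 2: alphabetical by title, ignoring upper/lower case
alphabetical = sorted(sites, key=lambda site: site[0].casefold())

for title, url in alphabetical:
    print(title, url)
```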

Curlie Meta/kMeta Editor windharp

 
