Posted

Hi,

I would like to use the DMOZ data (the content and structure RDF files) to find the best possible category for a term.

 

For example, when I enter "bmw" in the DMOZ online search, the first result is:

# Recreation: Autos: Makes and Models: BMW

This is the most relevant category for the term "bmw", compared with others such as:

 

# Recreation: Motorcycles: Makes and Models: BMW (95)

# World: Deutsch: Freizeit: Auto: Marken: BMW (72)

# Business: Automotive: Motorcycles: Makes and Models: Retailers: BMW (31)

# Home: Consumer Information: Automobiles: Purchasing: By Make: BMW (11)

 

How can I achieve this using the RDF files? Out of all the possible categories, how do I pick the right one automatically?

 

Any help is appreciated. Thanks.

Posted
I guess you would need to design a search algorithm that looks at the contents of each category (category name, path, titles of listed sites, descriptions of listed sites) and gives it a score based on occurrences of the search term(s) entered. Then you just return the category with the highest score. That's pretty much what the search function on http://www.dmoz.org/ does to come up with the category choices.
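
To make the idea concrete, here is a rough Python sketch of that kind of scoring. It assumes you have already parsed the RDF files into one record per category; the record layout (`path`, `titles`, `descriptions`), the field weights and the sample records are all made up for illustration and are not taken from the DMOZ dump or from how the dmoz.org search is actually implemented.

```python
import re
from collections import Counter

# Hypothetical weights: a hit in the category path counts more than a hit
# in a listed site's title, which counts more than a hit in a description.
WEIGHTS = {"path": 5.0, "titles": 2.0, "descriptions": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_category(category, terms):
    """Weighted count of search-term occurrences across a category's fields."""
    score = 0.0
    for field, weight in WEIGHTS.items():
        value = category.get(field, "")
        # A field is either a single string (path) or a list of strings.
        text = value if isinstance(value, str) else " ".join(value)
        counts = Counter(tokenize(text))
        score += weight * sum(counts[term] for term in terms)
    return score


def best_category(categories, query):
    """Return the path of the highest-scoring category, or None if nothing matches."""
    terms = tokenize(query)
    scored = [(score_category(c, terms), c["path"]) for c in categories]
    score, path = max(scored)
    return path if score > 0 else None


# Toy records standing in for parsed RDF data (invented for this example).
categories = [
    {
        "path": "Recreation/Autos/Makes_and_Models/BMW",
        "titles": ["BMW Car Club", "BMW 3 Series Fans"],
        "descriptions": ["Club for BMW owners.", "Photos and specs of BMW cars."],
    },
    {
        "path": "Recreation/Motorcycles/Makes_and_Models/BMW",
        "titles": ["Boxer Riders"],
        "descriptions": ["Forum for BMW motorcycle riders."],
    },
    {
        "path": "Recreation/Autos/Makes_and_Models/Audi",
        "titles": ["Audi World"],
        "descriptions": ["News about Audi."],
    },
]

print(best_category(categories, "bmw"))
```

With real data you would probably want to tune the weights, and perhaps normalise by the number of listed sites so that very large categories do not win on sheer volume.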
