Guest DrBradford Posted March 31, 2003

Hello, I am a researcher at Bradford University. Here's my question. I map people's bookmarks to DMOZ to do some comparisons. Currently I map URLs to DMOZ catids like 468769 (for example, Top/Kids_and_Teens/Pre-School). However, what I would like to do is transform those ids into tree addresses, e.g. 1.4.7 (I made this one up, but it could stand for Top/Kids_and_Teens/Pre-School). I would like to do this to get a better idea of 'parent' / 'children' relationships in classes and people's tastes.

What would be the easiest way of doing that? Has this been done before? I was thinking of reading out the category name, like Top/Kids_and_Teens/Pre-School, but working with that would be much more tedious. A numerical address like 1.4.7 would be much easier.

Let me know what you think, much appreciated!

Best, Uwe. (u.aickelin@bradford.ac.uk)
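One possible way to get from category names to tree addresses is sketched below. It is only an illustration, not an existing DMOZ tool: the function names `assign_tree_addresses` and `is_ancestor` are made up here, and the sibling numbers are simply assigned in sorted order of first appearance, so they carry no meaning beyond position in the tree.

```python
from collections import defaultdict


def assign_tree_addresses(paths):
    """Map each category path, and each of its ancestors, to a dotted address.

    Assumption: sibling numbers are handed out in the order categories are
    first seen after sorting, so the numbering is arbitrary but repeatable.
    """
    children = defaultdict(dict)  # parent path -> {child name: sibling number}
    addresses = {}
    for path in sorted(paths):
        parent = ""
        address = []
        for name in path.split("/"):
            siblings = children[parent]
            if name not in siblings:
                siblings[name] = len(siblings) + 1
            address.append(str(siblings[name]))
            parent = f"{parent}/{name}" if parent else name
            addresses[parent] = ".".join(address)
    return addresses


def is_ancestor(a, b):
    """True if address a is a proper ancestor of address b."""
    # The trailing dot keeps "1.2" from matching "1.21".
    return b.startswith(a + ".")


addresses = assign_tree_addresses([
    "Top/Kids_and_Teens/Pre-School",
    "Top/Arts",
    "Top/Kids_and_Teens",
])
print(addresses["Top/Kids_and_Teens/Pre-School"])  # 1.2.1
```

With addresses in this form, a 'parent' / 'children' check is a string prefix test, which is the main convenience over comparing raw catids.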
beebware Posted March 31, 2003

Ok, here are a few things to think about:

- The ODP cannot change the catids in the RDFs. This is because several downstream partners may use the catids as a unique identification field (which is why it is in there). Changing the catid may break downstream users. Ok, the ODP doesn't "care" whether a downstream user uses the data or not, or in what format (as long as they follow the licence), but since the catid was put in the RDF for that reason it would be pointless to change it. Coincidentally, this is why the RDF production stalled last September, as the catids were mainly inconsistent.

- It would be a reasonable idea to map, say, Regional=6, Regional/Europe=6.3, Regional/Europe/United_Kingdom=6.3.12, Regional/Europe/United_Kingdom/England=6.3.12.2, but what happens if you have an @ link, such as in the UK dependent areas? Would the "Isle of Man" be Regional/Europe/Isle_of_Man 6.3.26 and/or Regional/Europe/United_Kingdom/Dependent_Areas/Isle_of_Man 6.3.12.5.12?

- If you are doing this "privately" (and I think it would be an innovative use of the data), then you may have to decide how to cope with a "catmv" (category move). What happens when category 'X' from R/E/UK/DA/X is moved to R/E/X: what happens to the relationships, and does it get a new 'tree id'?

I'm not saying you've got a bad idea (I actually quite like it), but I'm just pointing out some of the reasons why it may not have been done before.
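To make the @ link and catmv questions concrete, here is a rough sketch of one way a private mapping could handle them. Everything in it is an assumption for illustration: the `TreeIndex` class, the catids 100 to 102, the addresses 6.3.12.5.7 and 6.3.30, and the choice of the Dependent_Areas location as the "real" home of the Isle of Man. The idea is to key everything on the catid, treat an @ link as an alias address that resolves to the real one, and re-root a whole subtree when a category moves.

```python
class TreeIndex:
    """Tree addresses keyed by catid, with aliases for @ links (a sketch)."""

    def __init__(self):
        self.primary = {}  # catid -> canonical tree address
        self.aliases = {}  # alias tree address -> catid of the real category

    def add(self, catid, address):
        self.primary[catid] = address

    def add_link(self, alias_address, catid):
        """Register an @ link: a second address pointing at an existing category."""
        self.aliases[alias_address] = catid

    def resolve(self, address):
        """Return the canonical address for either a canonical or alias address."""
        catid = self.aliases.get(address)
        return self.primary[catid] if catid is not None else address

    def move(self, catid, new_address):
        """Move a category and every descendant under it to a new address."""
        old = self.primary[catid]
        for cid, addr in self.primary.items():
            if addr == old or addr.startswith(old + "."):
                self.primary[cid] = new_address + addr[len(old):]


idx = TreeIndex()
idx.add(100, "6.3.12.5.12")    # Isle of Man under Dependent_Areas (assumed canonical)
idx.add_link("6.3.26", 100)    # the Regional/Europe/Isle_of_Man @ link
idx.add(101, "6.3.12.5.7")     # hypothetical category 'X'
idx.add(102, "6.3.12.5.7.1")   # hypothetical child of 'X'
idx.move(101, "6.3.30")        # catmv: 'X' moves up to Regional/Europe
```

In this design a moved category does get a new 'tree id', but its catid stays the same, so older bookmark mappings can still be followed to the new address.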
stevesliva Posted July 1, 2003

I'd probably try something with an XML parser that allows tree traversal, but then I'd probably run out of memory.
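A streaming parser is one way around the memory worry, since it hands over one element at a time instead of building the whole tree. The sketch below uses Python's standard `xml.etree.ElementTree.iterparse`. Note that the `SAMPLE` layout (a `Topic` element with an `id` attribute and a `catid` child) is a simplified stand-in invented for this example, not the actual DMOZ RDF schema, and catids 1 and 2 are made up; only 468769 comes from the question above.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified layout; the real dump uses its own tags and namespaces.
SAMPLE = b"""<RDF>
  <Topic id="Top"><catid>1</catid></Topic>
  <Topic id="Top/Kids_and_Teens"><catid>2</catid></Topic>
  <Topic id="Top/Kids_and_Teens/Pre-School"><catid>468769</catid></Topic>
</RDF>"""


def iter_topics(source):
    """Yield (catid, category path) pairs without keeping parsed topics around."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "Topic":
            yield int(elem.findtext("catid")), elem.get("id")
            # Drop this topic's children, text and attributes once it is used.
            elem.clear()


topics = dict(iter_topics(io.BytesIO(SAMPLE)))
```

For a real dump, `source` would be an open file rather than an in-memory sample, and the tag and attribute names would need adjusting to whatever the file actually contains.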
giz Posted July 1, 2003

I believe that the actual CatID for any given category might change from dump to dump.

Categories can be @links, which are "virtual subcategories" of another branch, and I don't know how you could cater for that. For example, Science/Widgets/Blue/Circular might show a subcategory Science/Widgets/Blue/Circular/History, but that is actually the category Reference/Widgets/History.

You might be interested in pages linked from http://rodan.ncc.com/rdf/ for other useful snippets.
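Whether catids really do change between dumps could be checked rather than guessed. A minimal sketch, assuming each dump has already been reduced to a dictionary from category path to catid (the function name `changed_catids` and the catids in the example are made up for illustration):

```python
def changed_catids(old_dump, new_dump):
    """Return {path: (old_catid, new_catid)} for paths whose catid differs."""
    return {
        path: (old_dump[path], new_dump[path])
        for path in old_dump.keys() & new_dump.keys()
        if old_dump[path] != new_dump[path]
    }


# Made-up catids, only to show the shape of the result.
old = {"Reference/Widgets/History": 10, "Science/Widgets/Blue/Circular": 11}
new = {"Reference/Widgets/History": 10, "Science/Widgets/Blue/Circular": 57}
changes = changed_catids(old, new)
```

An empty result across two real dumps would suggest the catids are stable enough to key a private tree-address table on.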