heroine Posted August 12, 2007

Hi all, I have successfully parsed the dmoz dump files into MySQL. There are four tables: structure, content_description, content_links and datatypes. The structure table has the fields catid, name, title, ... I set catid as the primary key there and it works fine. When I tried to do the same for the other tables, such as datatypes (fields: catid, type, resource), I always got this error:

SQL query: ALTER TABLE `datatypes` ADD PRIMARY KEY ( `catid` ) ;
MySQL said: #1062 - Duplicate entry '1' for key 1

The same happens with the other tables, for example content_links (fields: catid, topic, type, resource). Another problem: the third table, content_description, has the fields externalpage, title, description, ages, mediadate, priority. Which of these fields should be the primary key here? I hope to get your feedback. Thanks a lot.
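Error #1062 means that some catid value occurs in more than one row of datatypes. A minimal sketch of how to confirm that, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a few made-up rows (the sample values are hypothetical, not taken from the real dump); the SELECT itself should run unchanged in MySQL:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datatypes (catid INTEGER, type TEXT, resource TEXT)")
conn.executemany(
    "INSERT INTO datatypes VALUES (?, ?, ?)",
    [
        (1, "narrow", "Top/Arts"),
        (1, "narrow", "Top/Business"),
        (2, "related", "Top/Arts/Music"),
    ],
)

# List every catid that occurs more than once. Any row returned here means
# catid alone cannot be the primary key of this table.
duplicates = conn.execute(
    "SELECT catid, COUNT(*) FROM datatypes "
    "GROUP BY catid HAVING COUNT(*) > 1 ORDER BY catid"
).fetchall()
print(duplicates)
```

If the query returns no rows for a table, the column is unique there and can be used as the primary key directly.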
windharp (Meta) Posted August 12, 2007

A primary key must be unique, so it can't be a field that has the same value in different rows. None of the fields "catid, type, resource" qualifies on its own. I see three solutions:

a) You don't use a primary key. That might cause performance issues, but would be the easiest thing to do.
b) You add an additional field containing a unique numerical value. This would be the most straightforward approach in my eyes.
c) You could try to generate a unique new field by joining the three fields you already have. While none of them is unique by itself, a concatenation of all three should be, at least if there are no inconsistencies in the RDF file. But I would prefer either of the previous options over this one.

Curlie Meta/kMeta Editor windharp
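Option c) does not necessarily need a new concatenated field: a composite primary key over the three existing columns gives the same uniqueness guarantee. This is a variation on what was suggested above, not the method described in the post. The sketch below assumes Python's sqlite3 module as a stand-in for MySQL and uses invented rows; in MySQL the corresponding statement would be ALTER TABLE `datatypes` ADD PRIMARY KEY (`catid`, `type`, `resource`), which may need a prefix length if resource is a long text column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The primary key spans all three columns, so only a fully identical
# (catid, type, resource) combination counts as a duplicate.
conn.execute(
    "CREATE TABLE datatypes ("
    " catid INTEGER NOT NULL,"
    " type TEXT NOT NULL,"
    " resource TEXT NOT NULL,"
    " PRIMARY KEY (catid, type, resource))"
)

# Hypothetical sample rows: the same catid twice is accepted
# because the resource value differs.
conn.execute("INSERT INTO datatypes VALUES (1, 'narrow', 'Top/Arts')")
conn.execute("INSERT INTO datatypes VALUES (1, 'narrow', 'Top/Business')")

# An exact repeat of an existing row violates the composite key.
try:
    conn.execute("INSERT INTO datatypes VALUES (1, 'narrow', 'Top/Arts')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM datatypes").fetchone()[0]
print(rejected, row_count)
```

If the parsed data contains fully identical rows, adding such a key will still fail with a duplicate-entry error, which is the inconsistency case mentioned in option c).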
heroine (Author) Posted August 12, 2007

Thanks for the reply. I tried solution a) by removing the primary key from the structure table, but it doesn't work: the mediator I use to query MySQL reports that there is no primary key, so no export is possible. I also tried setting the primary key separately on the other tables, but got the same duplicate-entry error. As for solution b), I am not sure I understood it fully. Should the additional field containing a unique numerical value be added to all four tables? And what would it look like? Please elaborate. Thanks once again for your time. Cheers!
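One way solution b) could look is an auto-numbered id column that serves as the primary key. A hedged sketch, again assuming Python's sqlite3 module as a stand-in for MySQL, with invented rows and a hypothetical table name datatypes_keyed. In MySQL the column can be added to the existing table in a single statement, for example ALTER TABLE `datatypes` ADD COLUMN `id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY; SQLite cannot add a primary key to an existing table, so the sketch copies the rows into a new table instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original table without any unique column (sample rows are invented).
conn.execute("CREATE TABLE datatypes (catid INTEGER, type TEXT, resource TEXT)")
conn.executemany(
    "INSERT INTO datatypes VALUES (?, ?, ?)",
    [
        (1, "narrow", "Top/Arts"),
        (1, "narrow", "Top/Business"),
        (2, "related", "Top/Arts/Music"),
    ],
)

# New table with a surrogate key: id is assigned automatically on insert
# and has no meaning beyond identifying the row.
conn.execute(
    "CREATE TABLE datatypes_keyed ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " catid INTEGER, type TEXT, resource TEXT)"
)
conn.execute(
    "INSERT INTO datatypes_keyed (catid, type, resource) "
    "SELECT catid, type, resource FROM datatypes"
)

rows = conn.execute(
    "SELECT id, catid, type, resource FROM datatypes_keyed ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

Only tables that lack a unique column need such a field. The structure table already accepted catid as its primary key, so it can stay as it is.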
sdang Posted January 17, 2008

parsing odp rdf

heroine wrote: "Hi All, I have successfully parsed the dmoz dump files into mysql. [...]"

Hi, would you mind sharing with us how you successfully parsed the data? I have been looking at several PHP scripts and am running into a lot of problems. I am trying to parse the ChefMoz RDF. Thanks