Posted

Hi All,

 

I have successfully parsed the dmoz dump files into MySQL.

There are 4 tables: structure, content_description, content_links and datatypes.

 

The structure table has the fields catid, name, title.....

I set catid as the primary key and it works fine.

 

When I tried to do the same for the other tables, such as datatypes (with fields catid, type, resource), I always got this error:

 

SQL query:

 

ALTER TABLE `datatypes` ADD PRIMARY KEY ( `catid` ) ;

 

MySQL said:

#1062 - Duplicate entry '1' for key 1

 

The same goes for the other tables (for example content_links, with fields catid, topic, type, resource).
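
One way to see where error #1062 comes from is to count how often each catid occurs in the table. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are invented for illustration (only the table and column names come from this post). The SELECT itself is plain SQL and should run unchanged in MySQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datatypes (catid INTEGER, type TEXT, resource TEXT)")
conn.executemany(
    "INSERT INTO datatypes VALUES (?, ?, ?)",
    [
        (1, "narrow", "Top/Arts"),
        (1, "narrow", "Top/Business"),
        (2, "related", "Top/Games"),
    ],
)

# List every catid that occurs more than once, together with its row count.
duplicates = conn.execute(
    "SELECT catid, COUNT(*) FROM datatypes "
    "GROUP BY catid HAVING COUNT(*) > 1 ORDER BY catid"
).fetchall()
print(duplicates)  # [(1, 2)]
```

Any catid returned by this query appears in several rows, so catid alone cannot serve as the primary key of that table.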

 

 

Another problem: the third table (content_description) has the following fields: externalpage, title, description, ages, mediadate, priority.

 

The question is: which of these fields should be the primary key here?

 

I hope to get your feedback.

Thanks a lot.

  • Meta
Posted

A primary key must be unique, so it can't be a field whose value repeats across different rows. None of the fields "catid, type, resource" qualifies on its own. I see three solutions:

 

a) You don't use a primary key. That might cause performance issues, but would be the easiest thing to do.

b) You add a field containing a (unique) numerical value. This would be the most straightforward thing in my eyes.

c) You could try to generate a unique new field by joining the three fields you already have. While none of them is unique by itself, a concatenation of all three should be, at least if there are no inconsistencies in the RDF file. But I would prefer either of the previous solutions over this one.
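
Options b) and c) can be sketched as follows. This is an illustration under assumptions, not a tested recipe for the real dump: it runs on Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows and the table names `datatypes_b` and `datatypes_c` are invented, and the MySQL statements in the comments assume column types that the thread does not state. For c) the sketch uses a composite primary key over the three columns, which is a variant of the idea rather than a literally concatenated field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Invented sample rows: catid repeats, but each full row is distinct.
rows = [
    (1, "narrow", "Top/Arts"),
    (1, "narrow", "Top/Business"),
    (2, "related", "Top/Games"),
]

# Option b): an extra numeric column that the database fills in itself.
# Assumed MySQL counterpart for an existing table:
#   ALTER TABLE `datatypes`
#     ADD COLUMN `id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
conn.execute(
    "CREATE TABLE datatypes_b ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "catid INTEGER, type TEXT, resource TEXT)"
)
conn.executemany(
    "INSERT INTO datatypes_b (catid, type, resource) VALUES (?, ?, ?)", rows
)
ids = [r[0] for r in conn.execute("SELECT id FROM datatypes_b ORDER BY id")]
print(ids)  # [1, 2, 3]

# Option c) variant: all three columns together form the key.
# Assumed MySQL counterpart (TEXT/BLOB columns would need a prefix length):
#   ALTER TABLE `datatypes` ADD PRIMARY KEY (`catid`, `type`, `resource`);
conn.execute(
    "CREATE TABLE datatypes_c ("
    "catid INTEGER, type TEXT, resource TEXT, "
    "PRIMARY KEY (catid, type, resource))"
)
conn.executemany("INSERT INTO datatypes_c VALUES (?, ?, ?)", rows)

# A row that repeats all three values is rejected by the composite key.
try:
    conn.execute("INSERT INTO datatypes_c VALUES (1, 'narrow', 'Top/Arts')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

With b) the repeated catid values no longer matter, because uniqueness comes from the generated `id`. With the composite key, loading fails only if the data contains two rows that agree in all three fields.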

Curlie Meta/kMeta Editor windharp

 


Posted

Thanks for the reply.

 

I tried solution a) by removing the primary key from the structure table, but it doesn't work. The mediator I use to query MySQL complains that there is no primary key, so no export is possible. I also tried setting the primary key on the other tables separately, but got the same duplicate entry error.

 

As for solution b), I'm not sure I understood it fully. Should the additional field containing a unique numerical value be in all four tables? And what would it look like? Please elaborate.

 

Thanks once again for your time.

Cheers!

  • 5 months later...
Posted


Hi, would you mind sharing with us how you successfully parsed the data? I have been looking at multiple PHP scripts and running into a lot of problems. I am trying to parse the ChefMoz RDF. Thanks.
