Posted

This is the news you've all been waiting for: we've finally been able to generate an RDF dump. The data is a week out of date and does not include any catids, because most of the RDF generation problems have involved duplicate catids. This will be fixed in later dumps.

 

You can download the files from this new URL: http://rdf.dmoz.org/ . Please bear in mind that demand may be high initially, so your downloads may be slow.

Posted

Is there any particular reason why this page hasn't been updated?

 

(Says RDFs NOT pushed)

 

Thanks, Gringo.

Guest dargo21
Posted

The RDF wasn't pushed there because there were duplicate catids. The dump at the URL totalxsive posted has no catids; the data itself is there, just without them.

 

It's good to have something come out.

Guest hughprior
Posted

Thanks for the information. That's very helpful.

 

Recently I checked the page:

http://dmoz.org/rdf/

(from http://dmoz.org/rdf.html)

and I saw that the old dump was still the only one available.

 

Q. Why is the new dump not at this place ( http://dmoz.org/rdf.html )? I should not have to do detective work (looking in this forum) to find it! It can be in the correct place with 10 pages of caveats, but it should be in the correct place!

 

Q. Every RDF dump I have worked with contains lines that make XML parsers fail. Why? For example:

 

216071.
216072. <Topic r:id="Top/Arts/Music/Musicology/Ethnomusicology/Ethnomusicologists">
216073. <d:Title>Ethnomusicologists</d:Title>
216074. <d:Description>Contains personal pages of professional Ethnomusicologists. Most pages include an online form of their research in the field.
216075.
216076. � </d:Description>
216077. <lastUpdate>2002-12-09 06:38:22</lastUpdate>

XML parser error at line 216076 (� ): not well-formed (invalid token)

 

Thanks!
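
One possible workaround, offered as a rough sketch and not as an official fix: if the "invalid token" comes from a byte that is not valid UTF-8 (an assumption; the excerpt above does not show which byte it actually is), the data can be cleaned before it reaches the parser. The function name `clean_xml_bytes`, the sample document, and its namespace URIs are all made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Characters that XML 1.0 forbids even when the bytes are valid UTF-8:
# C0 control characters other than tab, LF and CR, plus U+FFFE and U+FFFF.
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def clean_xml_bytes(raw: bytes) -> bytes:
    """Return UTF-8 bytes with undecodable bytes replaced and illegal chars dropped."""
    # Bytes that are not valid UTF-8 become U+FFFD instead of raising an error.
    text = raw.decode("utf-8", errors="replace")
    # Remove characters that XML 1.0 does not allow at all.
    text = ILLEGAL_XML_CHARS.sub("", text)
    return text.encode("utf-8")


# Hypothetical sample modelled on the excerpt above. The namespace URIs are
# placeholders, and b"\xa0" stands in for whatever stray byte is in the dump.
SAMPLE = (
    b'<RDF xmlns:r="urn:example:r" xmlns:d="urn:example:d">'
    b'<Topic r:id="Top/Arts/Music/Musicology/Ethnomusicology/Ethnomusicologists">'
    b"<d:Title>Ethnomusicologists</d:Title>"
    b"<d:Description>Contains personal pages.\n\xa0 </d:Description>"
    b"</Topic></RDF>"
)

if __name__ == "__main__":
    try:
        ET.fromstring(SAMPLE)
    except ET.ParseError as err:
        print("raw input rejected:", err)
    root = ET.fromstring(clean_xml_bytes(SAMPLE))
    print("cleaned input parsed, root tag:", root.tag)
```

This reads the whole input into memory, so for a full-size dump the same cleaning would need to be applied in chunks or line by line. Replacing the bad byte with U+FFFD keeps the damage visible in the description text instead of silently hiding it.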

  • Meta
Posted
The first question is answered by "and does not include any catids" above: it's not a fully valid RDF dump that every user can use without thinking. [Edit] Apart from that, staff may have put the files on another server so that editing is not affected by everyone downloading them at the same time.

Curlie Meta/kMeta Editor windharp

 


Guest hughprior
Posted

I don't think that hiding the RDF dump somewhere other than the expected place, in order to help those who are too stupid to use it properly, is a good solution. It may help them, but it also greatly reduces the chance that people who check the official place from time to time will find it.

 

I also don't think the RDF dump (in the perfect version or otherwise) is something which anybody can use "without thinking". By the very nature of the beast, it is something which only pretty technical people (who of course do not fear dragons in any file) are going to be using.

 

I would have thought a name like .gz.catidsmissing is enough of a warning.

 

If it's worth having at all, let's have it in the official place, with any caveats required stated.

  • Meta
Posted

_Think_, mon!

 

Do you really believe Google assigns a highly paid technician to inspect the RDF dump manually, mentally parsing it according to his expert knowledge of XML, to make sure it's the right one?

 

Or, just possibly, might they have a script that automatically runs each week, and automatically checks the official place to see if they can find a good one?

 

The people who want to pore over a broken or crippled RDF file by hand can find it in an odd place -- it's hardly been kept secret. The scripts that expect a fully functional one will keep not finding it in the expected place, until there is one to find in the expected place.

 

That's not even elementary, that's downright particular.
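
To make the point concrete, here is a minimal sketch of how such an automated check might behave. It is purely illustrative: nothing here is known about how any real consumer's script works, the file name `content.rdf.u8.gz` is a hypothetical example, and `find_usable_dump` is a made-up helper.

```python
from typing import Iterable, Optional

# Hypothetical name an automated consumer might be configured to fetch.
EXPECTED_NAME = "content.rdf.u8.gz"


def find_usable_dump(listing: Iterable[str], expected: str = EXPECTED_NAME) -> Optional[str]:
    """Return the expected file name if it is listed exactly, else None."""
    for name in listing:
        # Exact match only: a name with an extra suffix such as
        # ".catidsmissing" is deliberately not treated as usable.
        if name == expected:
            return name
    return None


if __name__ == "__main__":
    print(find_usable_dump(["content.rdf.u8.gz.catidsmissing"]))
    print(find_usable_dump(["content.rdf.u8.gz"]))
```

A script written this way skips a crippled file only if that file is absent or named differently; it cannot read a page of caveats.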

  • Meta
Posted

The new place for the RDFs _is_ also on a different server. That would appear to be all part of various changes that are being made, and will continue to be made for 6 months at least, to take some of the load off the main server.

 

I suspect that once the RDF is generating correctly we may see RDFs at both places, or one will redirect to the other. In any event, we hope to see RDF generation return to normal soon.

Posted
If at all possible, please use the new URL. As I understand it, the new URL points to a different server, so not only should it be faster for you but it'll reduce the server load so we'll be able to edit faster.
  • 4 weeks later...
Guest 000000
Posted
That is wicked, glad I checked in here just for that. Nice one, guys, glad to see things getting back to normal.
