Graph (posted April 19, 2017, edited)
As you all know, DMOZ closed last month, and I want a copy of the DMOZ data as it stood before the closure. I searched DMOZ.org for the latest data dump, but all it shows is the "We're closed" page. Where can I get the last dump made before it closed? I'm creating a successor to DMOZ, so I need this data. Also, is the content on dmoztools.net based on the latest data dump, or does it contain updates made after that dump? And does dmoztools.net contain anything besides the latest dump?
Graph (author, posted April 19, 2017, edited)
I have heard that some people scraped the data before DMOZ shut down: seobook.com/dmoz-shut-down So I'm unsure whether the scraped data is more up-to-date than the RDF data dump. I've also heard that link submissions on the old DMOZ that were not yet approved will be passed on to the new official successor for approval. Is that true? If so, that's data that is not available to me.
pvgool (Meta, posted April 19, 2017)
In this thread you can find a link to a copy of the last RDF dump.
> is the content from this site, dmoztools.net, using the latest data dump
Yes. But please do not scrape that website.
> or does it contain updates after the latest dump?
> does dmoztools.net contain anything besides the latest dump?
No, it is just a copy based on the last dump.
> So I'm unsure whether the scraped data is more up-to-date than the RDF data dump.
The dump is probably more up-to-date, as it was created just before the shutdown.
> I've also heard that the links submissions on the old DMOZ that were not yet approved
> will be passed on the new official successor to be approved? Is it true?
It is our (the editor community's) intention to start a new directory based on the old DMOZ data, including all suggestions that were not yet approved. So yes, it is true.
> If it's true then that's data that is not available to me.
Correct, that data is not available to you.
Signature: I will not answer PMs or emails sent to me. If you have anything to ask, please use the forum.
Rz Roth (posted July 27, 2017)
The link in that thread points to curlz.org/dmoz_rdf/, which is no longer a valid domain, and archive.org does not archive the data files. Are there any other suggestions for data dumps? I have been using minimoz to add DMOZ links as supplementary information on some of my websites, and I missed the closing, so I did not grab the last data dump. Someone please help. I will also host those dumps if that is helpful to the project.
stillbuyvhs (posted July 27, 2017)
https://web.archive.org/web/20170317132728/http://rdf.dmoz.org/rdf/ Really? None of the links there work? They got the text documents, but I can't tell from my cell phone whether the RDF files are there. I know I've seen RealPlayer .ra files there, apparently archived because they were referenced in RealPlayer's .ram files.
informator (Meta, posted July 28, 2017)
The data files are intact at archive.org (content, structure, categories)...
Signature: Curlie (Dmoz) Meta editor informator
clement116 (posted July 29, 2017)
Hello everyone, neither the links on dmoztools.net nor the mirror link at curlz.org work for me. Where can I find the latest structure.rdf.u8.gz? Could anyone help with this? Best wishes, Clement
informator (Meta, posted July 29, 2017)
See the link in post #5; it is working. We have no connection with curlz.org or with how it operates. Dmoztools is only a static copy of the directory listings, not the whole old DMOZ site.