Using Directory Data
Also for information on the license and attribution requirements.
262 topics in this forum
- 0 replies
- 2.8k views
Hi, I found http://www.katalogn.net-ip.info/: no attribution, no mention of DMOZ. I'm not sure whether http://www.site-directory.org/ has proper attribution either... They should, because they are offering a free script to display ODP data, but...
Last reply by pozmu
Why isn't there anything like this yet? Or at least nothing open!
Last reply by bitingduck
- 1 reply
- 11.4k views
I have downloaded the content.rdf.u8 file and unzipped it to get content.rdf.u8. Now how do I get the content.rdf file? I can't see any option in WinZip to extract it to a .rdf file. Please help.
Last reply by addy
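A minimal sketch in Python, assuming the download is the gzip-compressed dump (a file such as content.rdf.u8.gz) and that the .u8 suffix only marks UTF-8 encoding, so the unzipped file is already RDF/XML text and can simply be written out under a .rdf name. The function names and file names are illustrative and not part of any ODP tool.

```python
import gzip
import shutil
from itertools import islice


def gunzip_file(src_path, dst_path):
    """Decompress a gzip file to dst_path in chunks, without loading it all into memory."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def peek_lines(path, count=5):
    """Return the first `count` lines of a text file, decoded as UTF-8."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, count)]


if __name__ == "__main__":
    # Hypothetical file names; adjust to wherever the dump was saved.
    gunzip_file("content.rdf.u8.gz", "content.rdf")
    print(peek_lines("content.rdf"))
```

If the file has already been unzipped, renaming content.rdf.u8 to content.rdf should be enough under the same assumption.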
Thumbshots
by Guest martin30
- 4 replies
- 4.7k views
Does anybody know how to write a script like anaconda or odp++ that also uses the thumbshots from thumbshots.org? Martin
Last reply by addy
- 2 replies
- 1.7k views
Can I have the ODP attribution on all of the category pages except the main category page (the Top category), like http://www.thumbshots.net/ and lots of other sites? Can I remove the ODP attribution from the main category page only, while all of the other pages keep it?
Last reply by james
- 2 replies
- 2.9k views
I downloaded the ODP data from the DMOZ French site today and loaded it all into a MySQL database: no problem! I wanted to extract only the French data from the ODP, but I could not find any French descriptions in my database (except for categories)! > Does the ODP data include data for all countries? > Do you know of any specific tools to do this job? Thanks for your help!
Last reply by bobrat
A pearl
by Guest alcoholico
- 1 reply
- 2.5k views
Just to show how well DMOZ is managed by its editors, I found this pearl in their own category /Computers/Internet/Searching/Directories/Open_Directory_Project/Use_of_ODP_Data/Upload_Tools/ http://dmoz.org/Computers/Internet/Searching/Directories/Open_Directory_Project/Use_of_ODP_Data/Upload_Tools/ The last site listed there is: Xhoo A collaborative project whose goal is to create Open Source Web Catalog software for various platforms and languages that can selectively import data from the Open Directory Project. Early releases are currently available for Perl, ASP, and PHP. This site in reality has nothing to do with DMOZ data; you can find there only samples of…
Last reply by motsa
- 0 replies
- 2.1k views
I have a travel directory based on a free script at www.olga.co.za. It mainly uses data imported from dmoz.org, but I changed the subcategories a bit. I'm using the dmoz copyright notice (as per the policy) at the bottom of my pages, but I can't figure out (I'm not a programmer) how to set up the 'Become an editor' link correctly. As per the requirements, it needs to point to the specific category in Dmoz. On my site it shows as http://dmoz.org/cgi-bin/apply.cgi?where=$cat. I need to replace $cat with the proper category. As an example, my page for Africa needs to link to "http://dmoz.org/cgi-bin/apply.cgi?where=Regional/Africa/Travel_and_Tourism", the one for Europe to "http://dmoz.o…
Last reply by dalis
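The $cat substitution described in this post could be sketched as follows in Python. This assumes the page already knows the dmoz category path it was built from; the function name is hypothetical, and the only URL pattern used is the apply.cgi one quoted in the post.

```python
from urllib.parse import quote

# Base of the 'Become an editor' link as quoted in the post.
APPLY_BASE = "http://dmoz.org/cgi-bin/apply.cgi?where="


def become_editor_url(category_path):
    """Build the editor-application link for one category path.

    `category_path` is the dmoz path without the leading 'Top/',
    for example 'Regional/Africa/Travel_and_Tourism'.
    """
    # Keep '/' as-is so the path stays readable; escape anything unusual.
    return APPLY_BASE + quote(category_path.strip("/"), safe="/")


if __name__ == "__main__":
    print(become_editor_url("Regional/Africa/Travel_and_Tourism"))
```

In a template-driven script, the same idea amounts to filling $cat from the original dmoz path of each page rather than from the renamed local subcategory.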
- Meta
- 2 replies
- 2.6k views
This one: http://www.dominion-web.com/products/dwodp_live/ They pull data directly from DMOZ pages and re-serve it through your site; I think that's called page scraping. Is that a legal use of DMOZ data that complies with the DMOZ license?
Last reply by fischermx
- 2 replies
- 2.4k views
If you are using the dmoz database, is it OK to use nofollow on dmoz links and on links to other people's sites?
Last reply by linklink
- 2 replies
- 2.8k views
Hi there everyone, I have been grappling with importing ODP RDF files into a MySQL database and massaging them into an appropriate schema. I have successfully managed to parse the data, dump it into a database and port it over into a suitable schema. I am using a commercial application called Powerseek SQL to manage the directory. However, there is a problem which is giving me huge headaches and I cannot find an answer anywhere. Despite the SQL code being perfectly valid and the database returning the records when you perform a direct SQL query through a client such as phpMyAdmin, Powerseek refuses to output some records. I have been in touch with the people who make it an…
Last reply by dataferret
ODP compatibility with Windows
by Guest swisstony
- Meta
- 15 replies
- 6.7k views
I am putting together an ODP "mirror" site at the moment, using an XP machine to do all of the crunch work and then uploading the entire directory to the *nix server. It is English language only. However, I came across 3 different types of category that are simply incompatible with the Windows XP file system (and I presume other versions of Windows). If there are any techies here who are interested... 1. There are two categories that contain '...' in the title. '...' cannot be used in a Windows folder name; it is simply ignored, thus actually creating a misnamed folder! 2. A bunch of educational categories have such ridiculously long category names that they breac…
Last reply by motsa
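One possible workaround for the folder-name problems described in this post, sketched in Python. It relies on two Windows rules: trailing dots and spaces are dropped from folder names, and the characters < > : " / \ | ? * are not allowed. The function name, the underscore replacement and the 100-character limit are all assumptions chosen for illustration, not values taken from the thread.

```python
import re

# Characters Windows does not allow in a file or folder name.
_RESERVED = re.compile(r'[<>:"/\\|?*]')


def windows_safe_segment(name, max_len=100):
    """Map one category name to a folder name Windows will keep unchanged.

    Reserved characters become '_', the name is cut to `max_len`
    (an arbitrary illustrative limit), and trailing dots or spaces,
    which Windows would silently drop, are replaced one-for-one by '_'.
    """
    cleaned = _RESERVED.sub("_", name)[:max_len]
    stripped = cleaned.rstrip(". ")
    return stripped + "_" * (len(cleaned) - len(stripped))


if __name__ == "__main__":
    for raw in ["...", "Arts", "What is...", "a:b"]:
        print(repr(raw), "->", repr(windows_safe_segment(raw)))
```

Two different category names can map to the same folder name under this scheme, so a real mirror script would also need a lookup table from original names to folder names.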
- Meta
- 10 replies
- 4.9k views
I would like to download the RDF dump and generate static HTML pages (with customizable headers and footers). I have only found one program, called iHierarchy, that claims to do this ( http://simiax.com/ihier.html ); however, it is $199 and a demo to test is not available. Are there any other applications that will do this? Also, does anyone know or care to guesstimate the size of the final static HTML output? I have a dedicated server with about 60 GB of free space, so hopefully there would be room for the output. I would need the program to maintain the exact ODP directory structure, i.e. http://www.domain.com/ODP/Recreation/Outdoors/
Last reply by Dardo
What tools, scripts, etc. are available for cleaning up the RDF representation of the ODP?
Last reply by Dardo
- Meta
- Editall/Catmv
- 15 replies
- 6.2k views
Although DMOZ has 5 million links (or more?), its search engine is very weak. Say I search for paypal: I cannot even get PayPal's official site, although it is listed in this category: /Business/Financial_Services/Banking_Services/Electronic_Cash/ Search for TOEFL (a popular ESL test): No Open Directory Project results found! But there are lots of sites, for instance in: /Arts/Education/Language_Arts/English/English_as_a_Second_Language/Examinations/ Why?
Last reply by bobrat
- 0 replies
- 3k views
Hi, is there any tool to do this? I found one that exports the data to MySQL, but it seemed pretty buggy: it halted halfway with a fatal parsing error, and anyway, I would be exporting the data to SQL Server. So I'm looking for any kind of tool, either commercial or free, to extract the data to SQL Server.
Last reply by fischermx
- 3 replies
- 2.6k views
Hi, I want to extract the DMOZ categories and I'm wondering what tables are needed to hold the data. I understand it is hierarchical; however, I've never before seen the multi-parent capability shown by DMOZ. Could someone shed some light on this?
Last reply by lazydog
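One common way to store a hierarchy where a category can appear under more than one parent is a plain tree table plus a separate link table. The sketch below uses Python with SQLite purely for illustration; the table names, column names, the 'symbolic' link kind and the sample rows are all hypothetical, and this is not a description of how DMOZ itself stores its data.

```python
import sqlite3

SCHEMA = """
CREATE TABLE category (
    id        INTEGER PRIMARY KEY,
    path      TEXT NOT NULL UNIQUE,
    parent_id INTEGER REFERENCES category(id)   -- the one 'real' parent
);
CREATE TABLE category_link (                    -- extra parents / cross links
    from_id INTEGER NOT NULL REFERENCES category(id),
    to_id   INTEGER NOT NULL REFERENCES category(id),
    kind    TEXT NOT NULL,
    PRIMARY KEY (from_id, to_id, kind)
);
"""


def build_demo_db():
    """Create an in-memory database with a tiny made-up hierarchy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO category (id, path, parent_id) VALUES (?, ?, ?)",
        [(1, "Top", None), (2, "Top/Arts", 1),
         (3, "Top/Regional", 1), (4, "Top/Arts/Music", 2)],
    )
    # Top/Regional also lists Top/Arts/Music, giving it a second parent.
    conn.execute(
        "INSERT INTO category_link (from_id, to_id, kind) VALUES (3, 4, 'symbolic')"
    )
    return conn


def parents_of(conn, path):
    """Return every category that lists `path`: its tree parent plus any linkers."""
    rows = conn.execute(
        """
        SELECT p.path FROM category c
        JOIN category p ON p.id = c.parent_id
        WHERE c.path = ?
        UNION
        SELECT f.path FROM category_link l
        JOIN category f ON f.id = l.from_id
        JOIN category t ON t.id = l.to_id
        WHERE t.path = ?
        """,
        (path, path),
    ).fetchall()
    return sorted(row[0] for row in rows)


if __name__ == "__main__":
    print(parents_of(build_demo_db(), "Top/Arts/Music"))
```

Keeping the single tree parent in the category table makes ordinary browsing cheap, while the link table can absorb any number of additional cross-references.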
- RZ Admin
- 4 replies
- 3.7k views
Hello; I'm thinking about designing a search engine. I somehow need to get a list of domain names or URLs for the search engine to crawl and add to the index. Does anyone know how I can get a listing of just the raw URLs in the dmoz database? Thanks.
Last reply by shilpesh
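A rough sketch in Python of pulling just the raw URLs out of the content dump. It assumes each listed site appears in the dump as an element of the form <ExternalPage about="URL">, and it scans line by line with a regular expression instead of a full XML parser, since other posts here report parse errors in the dump. The function name and the file name in the usage section are illustrative.

```python
import re
from html import unescape

# Assumed shape of a site entry in the dump: <ExternalPage about="URL">
_EXTERNAL_PAGE = re.compile(r'<ExternalPage\s+about="([^"]*)"')


def iter_urls(lines):
    """Yield the URL of every ExternalPage element found in `lines`."""
    for line in lines:
        match = _EXTERNAL_PAGE.search(line)
        if match:
            # Attribute values are XML-escaped (for example &amp;), so undo that.
            yield unescape(match.group(1))


if __name__ == "__main__":
    # Hypothetical path to the unpacked dump.
    with open("content.rdf.u8", encoding="utf-8", errors="replace") as dump:
        for url in iter_urls(dump):
            print(url)
```

Because the function takes any iterable of lines, it streams through the file and never holds the whole dump in memory.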
- 11 replies
- 4.4k views
I am starting up a specialized pay per click search engine, and I'm wondering if there is a way to include the ODP results in my search engine. Since this will be a more specialized search, I don't need to download the entire directory, just certain parts of it. I didn't see any information on this. Can anyone help? Thank you. M11
Last reply by bobrat
- Meta
- 1 reply
- 2.7k views
I'm wondering where I should report errors in the downloadable RDF data. I can see a lot of logical consistency errors while parsing it. Here is an example: <Topic r:id="Top/Arts/Animation/Artists/Directors/Freleng,_Friz"> <catid>78966</catid> <d:Title>Freleng,_Friz</d:Title> <related r:resource="Top/Arts/Animation/Studios/Warner_Bros/Looney_Tunes/Characters/Tweety"/> <related r:resource="Top/Arts/Animation/Studios/Warner_Bros/Looney_Tunes/Characters/Yosemite_Sam"/> <related r:resource="Top/Arts/Animation/Cartoons/Titles/P/Pink_Panther,_The"/> <related r:resource="Top/Arts/Animation/Studios/DePatie-Freleng"…
Last reply by windharp
I'm a little curious. Does anyone know how many different sites use dmoz data in their results, or pull its data to fill in their own directory?
Last reply by Crichey
- Meta
- Editall
- 9 replies
- 4.1k views
A user of my script (phpODP) has reported getting this message: <center> <table width="75%"><tr><td> <b><font size="+1">Access denied (for you)</font></b> <p> Possible reasons are: <ul> <li><b>You are trying to mirror the page via HTTP. We dislike this!</b> It's a total waste of bandwidth, CPU and time! Please <a href="mailto:dmoz@teamix.net">contact</a> us and use <a href=http://rsync.samba.org/>rsync</a> in the future. <li><b>We blocked you by mistake.</b> Sorry for that. Please get in <a href="mailto:dmoz@teamix.net">contact</a> wit…
Last reply by Callimachus
- Meta
- 4 replies
- 4.1k views
Currently (and for at least a week now), live users of the data, including my site http://www.localpin.com are completely stuffed because DMOZ for one reason or another is not returning pages before a timeout occurs. This is not just my site, but for example, all users of the DWodp live program (http://www.dominion-web.com/products/dwodp_live/) such as http://www.lightmysite.it/eng/search_engine.html which frequently give the error message: This is extremely frustrating, and because Netscape holds a de-facto monopoly on DMOZ level of service, there is nothing, absolutely nothing, that live data users like me can do about it. I think this shows a low regard for t…
Last reply by leer
- 1 reply
- 2.7k views
Hello All, I haven't seen much in the way of parsers for ms sql. I would like to parse the dump and save it to an ms sql db. Also anything out there for asp.net or vb.net that is related to odp dumps, site crawls, anything of that nature? Any help you could provide would be great. This is for a test project so my budget is very low so anything free would be awesome. Thanks in advance! Joshua joshua@joshuaz.com
Last reply by dizzlewizzle
ODP credit
by Guest Awb
- Meta
- 6 replies
- 3.8k views
We will be using the ODP data, with some slight alterations and additions to the content. I know that because of this we have to say that we're using the ODP data with our own modifications, as Google does. Besides that, I think we also have to display the following credit: http://rdf.dmoz.org/become_an_editor/ If so, do we have to use it exactly as provided, or may some CSS styles be applied to it? It interferes a lot with our current color scheme and it would be great to use the same CSS for it. I am not referring to any other text or structure, just to its color scheme.
Last reply by windharp