Using RDF feed to generate static HTML pages?

jjohnstn

Member
Joined
Aug 6, 2004
Messages
34
I would like to download the RDF dump and generate static HTML pages (with customizable headers and footers). I have found only one program that claims to do this, iHierarchy ( http://simiax.com/ihier.html ), but it costs $199 and no demo is available to test.

Are there any other applications that will do this? Also, does anyone know, or care to guesstimate, the size of the final static HTML output? I have a dedicated server with about 60 GB of free space, so hopefully there would be room for the output.

I would need the program to maintain the exact ODP directory structure:

i.e. www.domain.com/ODP/Recreation/Outdoors/
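
To make the requirement concrete, here is a minimal sketch of the mapping from a category name to a file path. It assumes each topic in the dump is a slash-separated path that begins with `Top`, which is an assumption about the dump layout; `category_path` and the `ODP` root folder are illustrative names only.

```python
from pathlib import PurePosixPath


def category_path(topic: str, root: str = "ODP") -> str:
    """Map a topic such as 'Top/Recreation/Outdoors' to a relative file path."""
    parts = [p for p in topic.split("/") if p]
    # Drop the leading 'Top' component (assumed to prefix every topic).
    if parts and parts[0] == "Top":
        parts = parts[1:]
    # Refuse components that could escape the output root.
    if any(p in (".", "..") for p in parts):
        raise ValueError(f"unsafe topic: {topic!r}")
    return str(PurePosixPath(root, *parts, "index.html"))


print(category_path("Top/Recreation/Outdoors"))
# prints ODP/Recreation/Outdoors/index.html
```

Writing each category as `index.html` inside its own folder is one way to make the URL www.domain.com/ODP/Recreation/Outdoors/ resolve on a typical web server without any rewriting rules.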
 

bobrat

Member
Joined
Apr 15, 2003
Messages
11,061
Based on some work I'm currently doing on a subset - here is a rough idea.

45,000 categories need about 70 MB. I don't know what the actual full counts are, but the front page says 590,000 categories, so do the math. If you are displaying full site descriptions, you would need more than this, maybe 25% more.

That will, however, depend on exactly what you generate for the HTML and what you are displaying.
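
"Doing the math" here is a linear extrapolation from the subset. A small sketch, assuming output size scales in proportion to the number of categories and reading the 25% as extra space on top of the base figure (both are assumptions, and `estimate_mb` is an illustrative name):

```python
SAMPLE_CATEGORIES = 45_000    # size of the subset measured
SAMPLE_SIZE_MB = 70           # HTML generated for that subset
TOTAL_CATEGORIES = 590_000    # figure quoted from the front page
DESCRIPTION_OVERHEAD = 0.25   # rough allowance for full site descriptions


def estimate_mb(total_categories: int, overhead: float = 0.0) -> float:
    """Scale the sample size linearly to the full category count."""
    per_category = SAMPLE_SIZE_MB / SAMPLE_CATEGORIES
    return per_category * total_categories * (1 + overhead)


base = estimate_mb(TOTAL_CATEGORIES)
with_descriptions = estimate_mb(TOTAL_CATEGORIES, DESCRIPTION_OVERHEAD)
print(f"{base:.0f} MB without full descriptions")
print(f"{with_descriptions:.0f} MB with full descriptions")
```

That works out to a little over 900 MB for the base case and roughly 1.15 GB with the allowance for descriptions, subject to the caveat about what the pages actually contain.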
 

jjohnstn

Member
Joined
Aug 6, 2004
Messages
34
So probably around 1 GB or so. That shouldn't be a problem. Thanks for the response.
 

jjohnstn

Member
Joined
Aug 6, 2004
Messages
34
windharp said:
We collect those tools in http://dmoz.org/Computers/Internet/Searching/Directories/Open_Directory_Project/Use_of_ODP_Data/Upload_Tools/ - maybe you can find something there.

Thanks. I did check there, but all I saw (except for iHierarchy, mentioned in my original post) were MySQL importers or screen scrapers.

Still looking....
 

dizzlewizzle

Member
Joined
Sep 21, 2004
Messages
14
Go here, http://www.cgiexpo.com/Scripts/36_CGI_Scripts_Search_Portals/20/

There are open source scripts there that do the exact same thing. The script you were looking at only parses the data; you don't actually host the data itself.

You don't need to pay $200 for a script. Open source is always the way to go if you can. Besides, if I'm not mistaken, just from looking at his small demo code, it's based on some of the open source scripts found at the URL above.

Good luck!
 

jjohnstn

Member
Joined
Aug 6, 2004
Messages
34
Thanks, but the ones I saw there (except iHierarchy) were screen scrapers or did not use ODP data. The ODP keeps going down due to server issues, so I need a complete local copy of the ODP, built from the RDF dump.
 

dizzlewizzle

Member
Joined
Sep 21, 2004
Messages
14
Hi there again,

If you need to have the actual content on your server and are having problems figuring it out, let me know and I will help you do it for free. I am currently working on the nutch.org code, revamping it, and have been able not only to parse the data but also to create SQL databases from it.

Once the actual parse is done and the SQL database is built, the rest of the script is very easy. We can use the free version of the script you were looking at and simply add an admin interface, along with SQL support, so that all search requests are served from your server and database, not the ODP.

Let me know if you would like my help. I could do it in about a day, free.

Let me know buddy.
 

jjohnstn

Member
Joined
Aug 6, 2004
Messages
34
DizzleWizzle,

Many thanks for the offer, but can it be done without using MySQL? I would like to use completely static HTML pages on my server (no database calls) to keep the server load down. I've looked at writing a Perl script to parse the RDF dump and create the HTML pages, but I didn't want to "reinvent the wheel".

Again, thank you for your generous offer!
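
For reference, the database-free approach described above (parse the dump, group listings by category, emit one HTML string per category) can be sketched roughly as follows. This is written in Python rather than Perl purely for illustration. It assumes the dump stores each listing as an `ExternalPage` element with an `about` attribute and `Title`, `Description`, and `topic` children; those names, the `SAMPLE` data, and all function names are assumptions for the sketch and should be checked against the real dump.

```python
import html
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up sample in the assumed layout of the dump.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<RDF xmlns:r="http://www.w3.org/TR/RDF/"
     xmlns:d="http://purl.org/dc/elements/1.0/"
     xmlns="http://dmoz.org/rdf/">
<Topic r:id="Top/Recreation/Outdoors"><catid>1</catid></Topic>
<ExternalPage about="http://example.com/">
  <d:Title>Example Trails</d:Title>
  <d:Description>Hiking routes &amp; maps.</d:Description>
  <topic>Top/Recreation/Outdoors</topic>
</ExternalPage>
</RDF>"""


def local(tag: str) -> str:
    """Strip an XML namespace: '{ns}Title' -> 'Title'."""
    return tag.rsplit("}", 1)[-1]


def read_pages(stream):
    """Yield (topic, url, title, description) for each ExternalPage."""
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event != "end" or local(elem.tag) != "ExternalPage":
            continue
        fields = {local(c.tag): (c.text or "").strip() for c in elem}
        yield (fields.get("topic", ""), elem.get("about", ""),
               fields.get("Title", ""), fields.get("Description", ""))
        root.clear()  # drop finished elements so memory use stays bounded


def render_category(topic, pages, header="", footer=""):
    """Build one static page; all dump text is HTML-escaped."""
    items = "\n".join(
        f'<li><a href="{html.escape(url, quote=True)}">{html.escape(title)}</a>'
        f" - {html.escape(desc)}</li>"
        for url, title, desc in pages
    )
    return f"{header}\n<h1>{html.escape(topic)}</h1>\n<ul>\n{items}\n</ul>\n{footer}\n"


def build_pages(stream, header="", footer=""):
    """Return {topic: html} for every topic with at least one listing."""
    by_topic = defaultdict(list)
    for topic, url, title, desc in read_pages(stream):
        by_topic[topic].append((url, title, desc))
    return {t: render_category(t, p, header, footer) for t, p in by_topic.items()}


pages = build_pages(io.BytesIO(SAMPLE), "<html><body>", "</body></html>")
print(pages["Top/Recreation/Outdoors"])
```

The sketch streams the XML instead of loading it whole, but it still holds every listing in memory while grouping, so the full dump might need the grouping step moved to disk. Writing each string to a file under the category's directory, and rendering subcategory links, are left out.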
 

bobrat

Member
Joined
Apr 15, 2003
Messages
11,061
I have done this one with completely static pages [as an experiment in progress]

Cool Sites Project

However, there are many issues to consider when doing it this way as opposed to generating pages dynamically from SQL tables. E.g., when the RDF data updates, unless you keep careful track of what changed, you would have to regenerate all pages every week.
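
One possible way to "keep careful track" is a manifest of content hashes, so that only pages whose HTML actually changed are rewritten. The sketch below is an illustration under that assumption, not the method used for the experiment above; `write_changed` and the `.manifest.json` file name are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def write_changed(pages, out_dir, manifest_name=".manifest.json"):
    """Write only pages whose content hash changed; return the paths written."""
    out = Path(out_dir)
    manifest_file = out / manifest_name
    old = json.loads(manifest_file.read_text()) if manifest_file.exists() else {}
    new, written = {}, []
    for rel_path, html_text in pages.items():
        digest = hashlib.sha256(html_text.encode("utf-8")).hexdigest()
        new[rel_path] = digest
        if old.get(rel_path) != digest:
            target = out / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(html_text, encoding="utf-8")
            written.append(rel_path)
    # Remove pages for categories that vanished from the new dump.
    for rel_path in old.keys() - new.keys():
        (out / rel_path).unlink(missing_ok=True)
    out.mkdir(parents=True, exist_ok=True)
    manifest_file.write_text(json.dumps(new))
    return written


with tempfile.TemporaryDirectory() as tmp:
    week1 = {"ODP/Arts/index.html": "<p>v1</p>", "ODP/Games/index.html": "<p>v1</p>"}
    week2 = {"ODP/Arts/index.html": "<p>v2</p>", "ODP/Games/index.html": "<p>v1</p>"}
    first = write_changed(week1, tmp)
    second = write_changed(week2, tmp)
    print(first)   # both pages are new
    print(second)  # only the Arts page changed
```

Note that this still renders every page in memory to compute its hash; it only avoids rewriting (and re-uploading) unchanged files. Skipping the rendering step as well would require comparing the dump data itself between weeks.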
 

Dardo

Member
Joined
Dec 25, 2004
Messages
8
I'm also interested in this topic. Why did the discussion stop here? What is the answer?
 
This site has been archived and is no longer accepting new content.