parsing help, please?

cssaddict

Member
Joined
Jul 20, 2004
Messages
6
I'm testing the Extreme dmoz extractor, located here:
http://www.nicecoder.com/dmoz_extractor.php
However, it doesn't seem to be working. I'm not sure what I'm doing wrong, as I am following the instructions it gives.

If anyone can help me I would greatly appreciate it.

Thanks in advance
 

cssaddict

Member
Joined
Jul 20, 2004
Messages
6
Hi, thank you for your quick replies. Yes, I did email their support; however, I have not yet received a reply from them.

I followed the directions on the page and tried the category top/shopping/weddings, but all I keep getting is top/arts/ and the categories under it.
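For anyone who wants to check a dump by hand, here is a minimal sketch of a category filter. It is not how the Extreme dmoz extractor works internally; it assumes the dump marks each category with a line like `<Topic r:id="Top/Shopping/Weddings">`, and the names `TOPIC_RE` and `topics_under` are made up for illustration. It compares path segments case-insensitively, so a lowercase request such as top/shopping/weddings still matches.

```python
import re

# Assumed line format for category entries in the dump (illustrative only).
TOPIC_RE = re.compile(r'<Topic r:id="([^"]*)">')


def topics_under(lines, wanted):
    """Yield category ids that equal `wanted` or sit below it.

    Matching is done per path segment and ignores case, so
    "top/shopping/weddings" matches "Top/Shopping/Weddings/Favors"
    but not "Top/Shopping/Weddings_Abroad".
    """
    wanted_parts = [part.lower() for part in wanted.strip("/").split("/")]
    for line in lines:
        match = TOPIC_RE.search(line)
        if not match:
            continue
        parts = match.group(1).split("/")
        prefix = [part.lower() for part in parts[:len(wanted_parts)]]
        if prefix == wanted_parts:
            yield match.group(1)


# Hypothetical sample lines, not taken from a real dump.
sample = [
    '<Topic r:id="Top/Arts">',
    '<Topic r:id="Top/Shopping/Weddings">',
    '<Topic r:id="Top/Shopping/Weddings/Favors">',
    '<Topic r:id="Top/Shopping/Weddings_Abroad">',
]
found = list(topics_under(sample, "top/shopping/weddings"))
```

If a filter like this returns only Top/Arts entries, the requested category is probably not reaching the filter at all, which is a different problem from a corrupted download.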

They make it sound so easy. I even re-downloaded the dump files in case they were corrupted during download, and re-downloaded the program as well.

I hope to hear from support today.

firestorm:
I just tried their link and it appears to be working again. I thought maybe I had posted it incorrectly here, but I clicked the link above and it loaded.

Thanks for the help, guys. I appreciate it.
 

giz

Member
Joined
May 26, 2002
Messages
3,112
Be aware that the whole of the ODP data was recently converted to UTF-8. Make sure that your script can deal with that. Early scripts might not be able to.


Also be aware that, despite our very best efforts, each data dump in the last two months has had between 0 and 10 invalid UTF-8 characters in it (in data totalling more than 2GB per dump, that is). Dumps prior to that, whilst the changes were still in progress, had a LOT more errors, though.

Converting data that was originally compiled in dozens of different languages and character sets has been a challenge, but we think that it has now been perfected, as have the filters that stop invalid characters and byte sequences from finding their way into the data in the future.
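As a rough illustration of what "dealing with" the occasional invalid byte can look like, here is a minimal Python sketch. It reads the data as raw bytes, tries a strict UTF-8 decode on each line, and reports the lines that fail, decoding them again with the invalid bytes replaced by U+FFFD so the script can carry on. The function name `scan_invalid_utf8` and the sample lines are made up for illustration; a real dump would be opened in binary mode rather than built in memory.

```python
import io


def scan_invalid_utf8(stream):
    """Yield (line_number, text) for each line that is not valid UTF-8.

    `stream` must yield bytes (for example, a file opened with mode "rb").
    The returned text has every invalid byte replaced by U+FFFD.
    """
    for number, raw in enumerate(stream, start=1):
        try:
            raw.decode("utf-8")  # strict decode: raises on any bad byte
        except UnicodeDecodeError:
            yield number, raw.decode("utf-8", errors="replace")


# Hypothetical sample: line 1 is valid UTF-8, line 2 contains a lone
# Latin-1 byte (0xE9) that is not a valid UTF-8 sequence.
sample = io.BytesIO(
    b"<d:Title>Caf\xc3\xa9</d:Title>\n"
    b"<d:Title>Caf\xe9</d:Title>\n"
)
bad_lines = list(scan_invalid_utf8(sample))
```

Reading in binary mode and decoding line by line means one bad byte costs a single replacement character instead of aborting the whole run, which matters with dumps of more than 2GB.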
 

cssaddict

Member
Joined
Jul 20, 2004
Messages
6
Giz,
thanks for your reply.

How accurate are the dumps from dmoz? Is it better to use a program that gives you partial searches?

As for the program I mentioned earlier in this thread, I never received any response from the company about how it works. I even signed up on their forum, and no one really gave accurate answers.

Another company I tried, and whose product I did purchase, is this one:
http://www.pjltechnology.com/dmoz.htm

I am now working my way through the program and learning how to use it and save the searches.

Thanks again for the replies, they are appreciated! ;)
 

whats_up_skip

Member
Joined
Aug 16, 2004
Messages
46
I have tried both of these.

The Extreme dmoz extractor does not seem to be able to change the category it is working on (as mentioned earlier).

The program http://www.pjltechnology.com/dmoz.htm does appear to be closer to the solution. With the demo version I was able to get it to work sometimes, but at other times it would stall or crash and just sit there using 98% of the CPU time, with nothing added to the database.
 