ODP compatibility with Windows

swisstony

I am putting together an ODP "mirror" site at the moment, using an XP machine to do all of the crunch work and then uploading the entire directory to the *nix server. It is English language only.

However, I came across 3 different types of category that are simply incompatible with the Windows XP file system (and I presume other versions of Windows). If there are any techies here who are interested...

1. There are two categories that contain '...' in the title. '...' cannot be used in a Windows folder name; it is simply ignored, thus actually creating a misnamed folder!

2. A bunch of educational categories have such ridiculously long category names that they breach the 255 character limit for a folder tree.

3. The Internet/Content_Filtering category contains a category called 'pro' and a cat called 'con'. Windows does not allow folders called 'con'.

I can come up with workarounds for each of these of course, but if you are looking to set up some helpful guidelines for future reference, then these may be points to keep in mind.
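
The three cases above can be checked mechanically before any folder is created. What follows is only a minimal Python sketch: the function name `windows_issues` is hypothetical, the 255-character figure is taken from the post as-is rather than verified, and the example paths are made up apart from the 'pro' and 'con' categories mentioned above.

```python
# Device names that Windows refuses as file or folder names.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)
MAX_TREE_LENGTH = 255  # assumption: the limit as stated in the post


def windows_issues(category_path):
    """Return the reasons a category path is awkward as a Windows folder tree."""
    issues = []
    for part in category_path.split("/"):
        # Trailing dots and spaces are silently dropped by Windows.
        if part.endswith(".") or part.endswith(" "):
            issues.append(f"trailing dot or space in {part!r}")
        # Reserved names match case-insensitively, with or without an extension.
        if part.split(".")[0].upper() in RESERVED:
            issues.append(f"reserved device name {part!r}")
    if len(category_path) > MAX_TREE_LENGTH:
        issues.append(f"path is {len(category_path)} characters long")
    return issues
```

Running every category path through a check like this gives a list of the exceptions that need a workaround, without having to discover them one failed folder at a time.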
 

Alucard

Member
Joined
Mar 25, 2002
Messages
5,920
Interesting stuff. Just shows what you get if you use an operating system which is so limited ;-)

Seriously, though - I would question the design of relying on the naming of folders to mirror the content of the ODP - every operating system has its naming conventions and restrictions, and ODP category names don't follow any of them. I think with that design you're always going to run into issues.
 

bobrat

Member
Joined
Apr 15, 2003
Messages
11,061
Although you say you are limiting it to English, ODP in fact contains World non-English categories with sets of characters that would give Windows XP a stomach ache if you tried to name folders the same way. I'm afraid you have a flawed design. Folder names should never reflect data, no matter where the data came from, ODP or not.
 
swisstony

Well, yes, the non-English categories would be an absolute nightmare to try and deal with, so I haven't bothered for the moment. All the non-English categories are grouped into three main groups, which is very useful:

World
Adult/World
Kids and Teens/International

Unfortunately, in my experience, when building a directory, folder names HAVE to reflect the data. Unless I want the server to do a file lookup and redirect for every request, then I have to actually use the ODP file structure, just as I would think the ODP does.

If you know of another way, do let me know; I would be genuinely interested. I have never seen the server side implementation of the ODP, but then Unix doesn't have the same file restrictions that Windows does, so I doubt that any of these issues arises there.

As I said, there are in fact only three types of category within the English language section that cause any issues at all. Of those, the only annoying one is the use of filenames over 255 characters, which could be easily avoided.

Anyway, the idea wasn't to cause any trouble, merely to state the compatibility issues.

The reason that I am doing the processing on a different box is to reduce the strain on the server. The reason that the box is running XP rather than Linux is that I can't stand Linux - I have tried installing it, but lack of familiarity and time merely made it a painful and very frustrating process. "Better the devil you know" and all that.
 

dfy

Member
Joined
Aug 2, 2002
Messages
2,044
Hmmmm. Would you walk a mile to the bus stop to catch a bus, and then a mile from the bus terminal to your place of work every day, on the grounds that learning to drive a car was too difficult and the bus was 'the devil you knew'? I know I wouldn't.

While learning Linux isn't easy, it would certainly be worth all the effort involved, because it would remove the problem you are currently looking at. It would probably remove a whole bunch of other problems too. Try one of the easier Linux versions, like SuSE, which can be set up to look almost exactly like Windows, but without the limitations.


As Swiss Tony says:
"You know, learning linux is like making love to a beautiful woman. First you have to learn which buttons to press, then you have to learn which order to press them in, then finally, all the hard work pays off when you make her purr like a kitten and store your files just where you want them."
 

Alucard

Member
Joined
Mar 25, 2002
Messages
5,920
In my opinion, the only real way to store the data like the ODP is in a database. ODP uses some sort of custom-built flat files, as far as I know, but if they were doing it all over again, I'm sure they would use some sort of relational db. That way you stand the best chance of making it not specific to the quirks of any particular operating system, and it makes it much more scalable - foreign characters, lengths of category names, easier searches, etc.
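
As a rough illustration of the database approach, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, the catid values and the sample rows are all assumptions made up for the example; this is not how the ODP actually stores its data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table keyed by catid, with the full path as plain data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE categories (catid INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL)"
    )
    rows = [
        (1, "Internet/Content_Filtering/con"),            # reserved name on Windows
        (2, "World/Espa\u00f1ol"),                        # non-ASCII character
        (3, "Reference/Education/" + "Long_Name_" * 40),  # far longer than 255 chars
    ]
    conn.executemany("INSERT INTO categories (catid, path) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def find_catid(conn, path):
    """Look up a category by its full path; return None if it is unknown."""
    row = conn.execute(
        "SELECT catid FROM categories WHERE path = ?", (path,)
    ).fetchone()
    return row[0] if row else None
```

Because the path is just a text value here, none of the folder naming rules of the operating system come into play.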

If you are really stuck on doing it with file folders, then you should be developing on your target operating system, which I think you said was UN*X - then you need to look at the ODP naming conventions for categories and deal with how you map those into directory names which are compatible with the OS.

...and you're not causing trouble - we've said many times, what users do with the ODP data is their business ;)
 
swisstony

It was in fact SuSE that I tried for a couple of months on a second box... but I just couldn't get anywhere with it. I am not yet convinced that the time I would have to spend learning a new OS is good time management. Yes, it will be useful, but in the short term that time is much better spent dealing with the few exceptions.

I have of course used a database for the basis of the directory - that is how the pages are created in the first place. However, it would not be feasible to create all the directory pages on the fly from the database. Every Google crawl would create over a million calls to the database... and it would significantly slow down the page display for users.

I prefer to take the load off the database so that it can be used solely for searching. By having the static HTML pages, it dramatically reduces the load on the server.

If the ODP switched to a completely DB run basis, it would die almost instantly. I imagine that the static files are the only reason it runs at all at the moment.
 

sfromis

Member
Joined
Mar 25, 2002
Messages
202
Well, you do not *HAVE* to make the directory names match the category path - you could work around the Windows limitation by assigning shorter names, and also use them in the links. One possible way of deriving shorter names would be to use the catid instead of the category name.
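
A minimal Python sketch of the catid idea, assuming a lookup table from category path to catid is already available. The catid values, the `cat` prefix and the function names below are hypothetical; real catids would come from the ODP data.

```python
# Hypothetical catids, for illustration only.
CATIDS = {
    "Internet/Content_Filtering/con": 1001,
    "Internet/Content_Filtering/pro": 1002,
}


def short_dir(category_path, catids=CATIDS):
    """Return a short directory name that is safe on both Windows and *nix."""
    return f"cat{catids[category_path]}"


def link_for(category_path, catids=CATIDS):
    """Build the link the static pages would use for this category."""
    return f"/{short_dir(category_path, catids)}/index.html"
```

With this mapping, `link_for("Internet/Content_Filtering/con")` points at `/cat1001/index.html`, so the reserved name never reaches the file system, and the links stay static with no lookup at request time.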

ODP itself handles World/ language category names by URL-encoding them. Of course, this makes the names much longer, which would be problematic to shoehorn into the brief Windows filenames. This can also be avoided by using the catid.
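
To see how much URL-encoding lengthens a name, here is a small Python sketch using the standard `urllib.parse.quote`. It assumes UTF-8 as the underlying encoding, which may not match what the ODP uses, and the category name is only an example.

```python
from urllib.parse import quote, unquote


def encoded_name(name):
    """Percent-encode a category name as UTF-8, leaving '/' unescaped."""
    return quote(name, safe="/")


name = "World/Espa\u00f1ol"    # example name with one non-ASCII character
encoded = encoded_name(name)   # "World/Espa%C3%B1ol"
growth = len(encoded) - len(name)
```

Here a single accented letter turns into six characters. In scripts where every character is non-ASCII, the encoded name grows several times over, which is how the length limit gets hit so quickly.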
 
schoik

Brawrrr, I suggest using "Mandrake Linux 9.1" if you're new to Linux, that or Redhat 9.0B... Don't use SuSE unless you have some idea of Linux.
 

giz

Member
Joined
May 26, 2002
Messages
3,112
Linux versions, just like buses, none for ages then three almost all at once. Tsk.
 

hutcheson

Curlie Meta
Joined
Mar 23, 2002
Messages
19,136
Ya, if you want guaranteed daily updates for your OS, you just gotta go with the Beast from Redmond and its critical security patch program.
 

TheAbsorbant

Member
Joined
Jan 15, 2005
Messages
4
Help!!!

You guys seem to know what you are talking about. I found this thread while googling for a solution to my problem, which is this:

I have a Creative mp3 player. It utilizes the Creative Media Source Organizer. I tried to transfer the Megadeth album "Killing is My Business..." from the player to my hard drive, not realizing it would use invalid characters when creating the folder name (who would expect such a thing, most win apps substitute the .'s with _'s or the like). Now I'm stuck with a malfunctioning folder named "Killing Is My Business..", which won't let me open it nor delete it!! Does anyone know how to repair/delete that damn thing??!!

I'm on WinXP, if you hadn't already figured out by the desperation...
 

pvgool

kEditall/kCatmv
Curlie Meta
Joined
Oct 8, 2002
Messages
10,093
Sorry but this forum is only for questions related to ODP aka DMOZ
 

TheAbsorbant

Member
Joined
Jan 15, 2005
Messages
4
pvgool said:
Sorry but this forum is only for questions related to ODP aka DMOZ
The helpful and humane thing to do would be to help out anyway, rather than coldly come with a "You're off topic. Go away!" kind of remark.
 

motsa

Curlie Admin
Joined
Sep 18, 2002
Messages
13,294
TheAbsorbant said:
The helpful and humane thing to do would be to help out anyway, rather than coldly come with a "You're off topic. Go away!" kind of remark.
Except that this forum is not here to be helpful and humane about anything but the Open Directory Project. You need to sign up somewhere else to get help with your question.
 
This site has been archived and is no longer accepting new content.