ODP: a library or an index?

tomnorian

Member
Joined
Jul 19, 2004
Messages
40
After reading the guidelines for categories multiple times and spending a good 12 hours perusing threads, I'm not sure I understand this main issue.

Some of this stuff is a bit rusty for me, and I hope I won't be attacked if I get a detail or two wrong.

I seem to remember, in learning to do research when I was younger, the idea that in libraries they had found that what worked to help people find the information they needed (in an academic setting too!) was to index each book a number of ways.

A library might only buy one copy of a book. Which books it bought, or even which books it chose to keep, would be based upon their content.

Where a book (or paper, or historical exhibit) was placed probably depended upon what was deemed its primary designation. I suppose in a large university library system (Berkeley) with multiple colleges with separate libraries, you might find multiple copies of the same book classified a bit differently, though my memory is a bit rusty... that might be a Library of Congress thing.

At any rate, I do know that there were two or three Rolodexes with hand-typed cards in the days before computers when I was learning this stuff; they only began to have computers by the time I got to college.

I remember them classifying things separately by subject, author and title.
In addition to those three entries, a book might have multiple entries in those three main indexes. A journal of Fremont's exploration of California might be listed under geography and history, with maybe a third classification for exploration. If the book was co-authored, it would be indexed under each author's name.
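
To make the card-file idea concrete, here is a rough sketch of three parallel indexes (subject, author, title) where one book can show up under several cards. The book titles, author names and the build_catalog function are made up purely for illustration; this is not how any real library or the ODP stores its data.

```python
from collections import defaultdict


def build_catalog(books):
    """Build subject, author and title indexes.

    One physical book may be filed under several subject cards
    and under each co-author's name.
    """
    catalog = {
        "subject": defaultdict(list),
        "author": defaultdict(list),
        "title": defaultdict(list),
    }
    for book in books:
        title = book["title"]
        catalog["title"][title].append(title)
        for author in book["authors"]:
            catalog["author"][author].append(title)
        for subject in book["subjects"]:
            catalog["subject"][subject].append(title)
    return catalog


# Hypothetical records, invented for this example.
books = [
    {
        "title": "Journal of Fremont's Exploration of California",
        "authors": ["Fremont"],
        "subjects": ["Geography", "History", "Exploration"],
    },
    {
        "title": "A Co-Authored Survey",
        "authors": ["Author A", "Author B"],
        "subjects": ["History"],
    },
]

catalog = build_catalog(books)
# Each book is stored once but can be found from several cards.
history_cards = catalog["subject"]["History"]
```

The point of the sketch is only that the review work happens once per book, while the extra cards are cheap to add afterwards.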

So my question. Back in those days before computers, when such information was much more tedious to produce and to index, they still found a way.

A good part of the real work was the initial review of the book. I would think that, particularly with the issue of subject matter, the decision to classify the same book a couple of different ways both aided research and made the choice easier to make.

Is there a place where I might access the decision making that went into deciding that the ODP should generally categorize in one category only, except in exceptional cases?

Also, any thoughts upon this "index" versus "library" issue? I understand (I think) the ODP editorial guidelines from the "library" point of view. I mean, if someone puts a bit of fluff into print and sends it for free to a library, they aren't going to take up room on their shelves with it. I can see how, from an indexing point of view, even with no physical space being taken up, the "mental" space of a piece of fluff filling up indexes certainly inhibits research.

However, given the ease of storing information on computers, I should think that there would be a way of allowing a secondary inclusion, and that multiple subcategories could be accessed by choice of the viewer with the knowledge of possible redundancy.

Is the aim not to make an index of useful information (not all information) so that people could find stuff like you did in those card files?
 

motsa

Curlie Admin
Joined
Sep 18, 2002
Messages
13,294
We are not a rolodex or a card catalog or an index in the filing sense of the word, i.e. we are not aiming to cross link every site in every possible place someone could be looking for its content.

We're the library, shelving the books (or sites) where they belong according to our criteria.
 

tomnorian

Member
Joined
Jul 19, 2004
Messages
40
thanks... got it, but this library doesn't have an index!

motsa said:
We're the library, shelving the books (or sites) where they belong according to our criteria.

This library doesn't have an index though.

I suppose the idea is that the search engines are to be the indexes.

Yet they are denied an important tool which, it seems to me, you have already done the work for.

Humans, knowledgeable in given fields, have taken the time to select what belongs in the library. That makes sense to me.

Now, after having selected what belongs there, they put it on the best shelf they can, but in order to find something a person must either get the dominant place correct, or they must rely upon the impressions of computer robots that stumble through the library scanning tables of contents, sometimes finding the time to read more text, but having no ability to understand the relative value of pictures and diagrams in the various books.

To gauge which books are more important, the computers building the "indexes" look more frequently at books that others are checking out, or that appear on best seller lists, or that they overhear others recommending to each other.

This may give some good idea of "importance" of a site...and probably something the search engines will be quite useful for.

However, the engines are at a severe disadvantage in not having the ability to comprehend what is distinct and how things are related. They are particularly at a loss for the unique value of diagrams and pictures, which are very often more valuable than text in many fields.

So a librarian, having dutifully selected books (pages... pages vs sites?) of merit, leaves the cataloguing of them largely to less comprehending robots, when by checking off a few more boxes after review, they would provide the computers information helping them understand the relevance of areas of content.

I guess that's what I find curious. A great deal of work and effort is often already done in a way it should be done at the ODP. But the rules, and information and stuff don't allow the same effort to give a more useful base of knowledge in ways the bots might understand.

No way everything could be mentioned, and it's not the job to be absolute. However, I'd think it would be a legitimate job of the library keeper to toss a bone to those who would be doing the searching, especially when the ideas were contemplated, understood and considered by the editors doing the choosing.

I guess I don't really understand the obstacles clearly because obviously some pretty darn smart and committed people came up with the rules!

(and thank you for the work...I'd like to get involved in a subject or two but I'm having a hard time understanding this issue....and without understanding it I might have a hard time feeling I was doing good in applying it?)
 

hutcheson

Curlie Meta
Joined
Mar 23, 2002
Messages
19,136
In a sense, we do some of what you say. A widget maker in Wisconsin is listed both under Arts/Crafts/Useless/Widgets and under Regional/Wisconsin/Localities. The celebrated essay by Mark Twain about James F. Cooper is listed, IIRC, under both Clemens and Cooper. But there are differences. We don't list every author separately -- there's no reason to have a category for me even though I once had an article published in a magazine. And so any material I write and publish on the web won't be indexed under "Hutcheson, Stephen", although it might conceivably be listed under the relevant topic. The Fremont book you mention might be listed under Exploration and California, although note well: we might get by with one listing in "California Exploration", a category linked from both Exploration and California. (The Library of Congress cataloging system has this same feature.)

But we aren't going to be tied down to the limitations, or constrained to imitate all the idiosyncrasies, of any medieval indexing system. The ODP isn't and doesn't want to be and can't be the only way of finding stuff on the internet. There are some things the ODP does better than any other way; and there are things Google will always do better regardless of how hard we try.

But no, the thinking behind that decision isn't documented, nor does it need to be. The decision itself is in the editorial guidelines, and has been there since the beginning. And the rationale is obvious. For a human-edited directory, there are only two practicable options: (1) deeplink never; (2) deeplink only in exceptional cases. Any other approach, and we'd be spending the next 50 years of editing time doing nothing whatsoever but processing Amazon.com deeplinks.
 

tomnorian

Member
Joined
Jul 19, 2004
Messages
40
deeplink versus more info?

I think I've got the deep link issue.
:)
I still don't quite get the danger you face in categorizing stuff in multiple categories, or how including a parallel by-author (or by-submitter?!) index would cause any great harm or effort.

But I don't understand data storage and stuff like that. Still, I'd think that such information would be quite useful for researchers to sort and screen information for pertinence in different ways, and I guess I think that the Web is really begging for more of that sort of stuff.

I know most of that will be the engines' work but, gee, if you looked at a page and saw it was an Amazon page and marked it as such, then some search engine might

OHHHHH! I'm getting the deep link part better now....! You'd be required to index each Amazon page! Ahah!

Still, I'd think that a catalog by submitters might be less cumbersome... you probably already have the information. And I still don't get why a site on Yosemite Valley geography might not show up under glaciers and under Regional, California, Yosemite... but like you said, you do some of that.

It just seems you've got a strong bias against it even if two editors felt the same page (not even talking deep links here) was of interest to differing categories.

I think, though, that I am understanding better the practical issues you are facing. Not wanting to be caught up in a deluge of submissions, which are already difficult to handle, a somewhat arbitrary but more easily and *more* evenly enforced rule has been put in place.

Perhaps over time a protocol will develop. I'd think, though, that you might give your subject editors more latitude in sorting the information they are expert in to appear on multiple subject lists... however, it sounds like systems are still an issue.

Thanks Tom
 

hutcheson

Curlie Meta
Joined
Mar 23, 2002
Messages
19,136
As it happens, I spent some time deeplinking the USGS site in various national park categories.

But the protocol is established, and pretty simple: submitters submit a site in the one most appropriate category (with a couple of exceptions involving Regional and/or World categories); editors are encouraged to deeplink exceptional sites.

Editors can also communicate with webmasters of exceptional sites, saying, "please submit xxx as deeplinks."

The ODP is not big on enforcing rules, though. It doesn't even have ten commandments -- maybe 5, IIRC. All the rest is guidelines. It looks for editors with good judgment and the ability to develop better judgment, and provides forums for discussing difficult cases.

Talking about ODP principles without some specific case in mind generally isn't very productive. You can read the editors' guidelines, and they may make a little sense. But after you've reviewed a half-dozen sites and done a couple of dozen edits and asked yourself, "was that done right; was it done well?" -- then you can go read the guidelines again and get a lot more out of them.

And after a few hundred edits, you can read the guidelines again, and many things begin to make sense.

After a few thousand edits, you often don't have to go back to look at the guidelines, because you've seen why most of them have to be the way they are.

But that doesn't mean you don't still ask occasionally, "where should this site go?" "is this site worth deep-linking?" "is this site listable at all?" "can't this be described better?" "how should these sites be subcategorized?" The most difficult questions are where several different conflicting principles are almost in balance.

All the hypothetical cases in the world don't help you with the real cases. But the real cases tell you exactly what you need to ask in hypothetical cases. And -- there are a dozen or so questions that may well be pertinent depending on the situation.

"how much content is available for this subject?" (if not enough, we shouldn't even have a category; if lots, then we can be more picky about what we put in it.) "how authoritative is this material?" "how informative?" "how unique?" "how accessible?" "how important?" "how focused?" "how targeted?" "what is its perspective?" "what is its purpose?" "what is the source?" "how much weight should be given to each question?" and so on. An experienced editor will look at each site and judge which are the most critical factors: without a specific site and category and timestamp, you can't even begin to answer most questions -- because the answer can well be different if any of those are changed.
 

tomnorian

Member
Joined
Jul 19, 2004
Messages
40
got you

I understand the complications you raise in generalising.

(But I'm also glad that you put the USGS in multiple places... what a great free (we already paid for it!) resource, and one where web searches might cause people on limited budgets to pay for what they could get almost as good for free... it belongs in Maps, in each region, in hiking supplies etc. as far as I am concerned. But would I say the same about anyone *selling* an atlas? I don't want to see every atlas listed in every region any more than you do.)

Ok finally getting it here.

And I guess if you said what I just said there are legal issues involved... so the default must be to allow only one placement, except for those which the editors find extremely worthy of exception.
 

giz

Member
Joined
May 26, 2002
Messages
3,112
What makes the ODP more accessible than a library shelf or a Rolodex index is the fact that the categories are linked - not just the up and down hierarchy of wider or finer granularity of topic, but the sideways linking from deep in one branch to a category at some, or any, level in some other branch.

So, to the surfer looking at one starting category there will be links to more general and more defined topics up and down from there, as well as sideways to related topics in another branch. The latter set of linking gets you to another place that is far away in the category structure, but via a direct one-click expressway with a big signpost as to what the destination will be. That is the power of the ODP structure, and one that is often overlooked. Editors are always looking for better ways to link related topics, as well as to rationalise categories when it is found that there are two or more categories that are attempting to do the same thing.
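
As a rough illustration of the up, down and sideways linking described above, here is a minimal sketch. The category paths, the Category class and the neighbours function are all invented for the example and assume nothing about how the ODP actually stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    path: str
    # Sideways links: paths of related categories in other branches.
    related: list = field(default_factory=list)


def parent_path(path):
    """Return the path one level up, or None for a top-level category."""
    return path.rsplit("/", 1)[0] if "/" in path else None


def neighbours(categories, path):
    """List the categories reachable in one click from the given one."""
    parent = parent_path(path)
    up = [parent] if parent in categories else []
    down = sorted(p for p in categories if parent_path(p) == path)
    sideways = [p for p in categories[path].related if p in categories]
    return {"up": up, "down": down, "sideways": sideways}


# Hypothetical category tree with one sideways link between branches.
categories = {
    c.path: c
    for c in [
        Category("Science"),
        Category("Science/Physics"),
        Category(
            "Science/Physics/Crystallography",
            related=["Science/Chemistry/Crystallography"],
        ),
        Category("Science/Chemistry"),
        Category("Science/Chemistry/Crystallography"),
    ]
}

links = neighbours(categories, "Science/Physics/Crystallography")
```

Here the two Crystallography categories sit far apart in the tree, but the sideways link makes the second one a single click away from the first.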

I think I remember an interesting discussion about whether some rare form of X-ray crystallography (or some such topic) belonged under physics, optics or chemistry, because there were sites about the topic under all three branches, with some weak category interlinking, for example. These are the sort of questions that editors are considering all the time, and then acting on after editor consensus.
 

tomnorian

Member
Joined
Jul 19, 2004
Messages
40
yes indexes themselves are crosslinked

Thank you for that response, Giz.

That is something that I can "own", that makes sense, and that is an effective "why" for me, rather than a "why" based upon resources or "space".

(I could still see editors in categories coming up with rules of must-include or must-not-include keywords that might allow monitored keyword data to be useful in further *User* driven sorting.)

Yes, why list only once? If you are in a broader category, your site will appear more frequently and be less buried in related searches.

While no system is perfect, I did come to understand the logic of it, and I do think that the idea of the index itself cross-referencing categories adds to the argument for choosing the current standard.
 