Reorganization and why dmoz human editing fails

xixtas01 · Jan 13, 2004

Re: Reorganization and why dmoz human editing fail

Some observations:
* Of course the guidelines are limited by the system. That's true of every system where humans and technology interact.
* Nobody's beating anybody with a hosepipe.
* The directory structure is significantly more technologically advanced than your desk drawer.
* Even relatively modest proposals to restructure the ODP's data have to take into account the challenges involved with updating the 4 million existing listings.
* Suggestions for improvement should always be weighed from a cost/benefit point of view.

donaldb · Jan 13, 2004

Re: Reorganization and why dmoz human editing fail

What, and we can't question the guidelines?

We can question the guidelines all we want. As a matter of fact, we have ongoing discussions about them all the time in the internal forums. We can discuss it here, but unless someone follows through and takes the discussed ideas internal, they probably won't have much impact. This is not the place for changing ODP guidelines.

chris22rv · Jan 14, 2004

Re: Reorganization and why dmoz human editing fail

Of course the guidelines are limited by the system. That's true of every system where humans and technology interact.

Yes, but, and this is VERY common in organisations, the work method and technology limitations tend to get confused over time, in a chicken and egg sort of way, and it's important to keep the big picture.

You see this a lot in business analysis:-

Q: "What is it you need from the system?"
A: "A 20 page list of figures."

Q: "What do you do with the 20 page list of figures?"
A: "Print it out and type it into Excel, then make a graph."

Q: "What do you do with the graph?"
A: "Check that the figures A, B, C and D haven't dropped more than 10% since last week."

Q: "So what you NEED is an alert when the figures for A, B, C or D drop?"
A: "Can we do that?"

We can ask for improvements to the system if we want to change the guidelines, we shouldn't consider the guidelines as necessarily leading the technology when partly the reason a guideline exists is a historical shortfall of technology.

Generally speaking, in requesting feature changes of a system, you crystallise precisely what it is you want first, then price the work, before dismissing it as costing too much, or amending your expectations. How can I possibly estimate the cost?

The monkey story is just a parable about how a historical problem can have an unnecessary influence on an organisation's current policy. It wasn't intended to be a perfect analogy of the entire DMOZ organisation; analogies are rarely perfect.

Beating with hosepipes was never mentioned: It's not a suggestion for DMOZ editing policy. Don't try it at home.

lissa · Jan 14, 2004

Re: Reorganization and why dmoz human editing fail

I think what we are skirting around here is that the software, guidelines, community, and actual data are all inextricably tied together at this point. There are some changes that we can't realistically talk about making because they either fundamentally change something critical to the delicate balance we've developed, or just aren't feasible at all.

The editors have quite a pile of feature requests in the internal bugs & features forum. Many of them would significantly enhance how we do our job, but aren't easy (or even possible, in some cases) to implement in the current system. I'm sure the editors would be the first to wish that we could review a site, input its address (if relevant) for geographical placement, and check off a variety of boxes to say what kind of site it is (informational, commercial, etc.) and what topics it belongs to, and then have the software put the site in the appropriate categories. This, however, is an entirely different software package than what we currently have.

I'm amazed at what our current software does with what are apparently flat files (I'm not technical at all.) It scales with the size of the directory, is completely flexible for rearranging the ontology, and somehow manages to let thousands of editors work simultaneously. Perhaps there is newer technology/software that could be used to create a new ODP and incorporate our long wish list, but even if someone had the time to fully spec out what that software would need to do, it would be years in development. So meanwhile, we continue to use what we have, fairly effectively, IMHO.

xixtas01 · Jan 14, 2004

Re: Reorganization and why dmoz human editing fail

Yes, but, and this is VERY common in organisations, the work method and technology limitations tend to get confused over time, in a chicken and egg sort of way, and it's important to keep the big picture.

I agree with this.

We can ask for improvements to the system if we want to change the guidelines, we shouldn't consider the guidelines as necessarily leading the technology when partly the reason a guideline exists is a historical shortfall of technology.

My point is that the two go hand-in-hand. Yes, clearly some portion of the guidelines is influenced by the limitations of the technology. However, the guidelines represent the egg that the technology was designed to crack. The spirit of the current guidelines should be carefully considered before either changing the technology or amending the guidelines. The guidelines clearly point the direction for growth because they have been constantly vetted by the editing community for years, and have stood the test of time.

Generally speaking, in requesting feature changes of a system, you crystallise precisely what it is you want first, then price the work, before dismissing it as costing too much, or amending your expectations. How can I possibly estimate the cost?

I think a "cost no object" type of conversation early in a technology discussion is a good idea, so long as everyone understands that that is what it is. Such a conversation, if not correctly framed, can result in wildly unrealistic expectations. Whenever actual concrete solutions are discussed, ROI should be considered at every step of the process. It's too easy to waste time planning to build a locomotive for a return that only justifies a shopping cart.

P.S. Can someone fix the thread stretch?

tuisp · Jan 14, 2004

Re: Reorganization and why dmoz human editing fail

Can someone fix the thread stretch?

Done

58afw · Jan 14, 2004

Re: Reorganization and why dmoz human editing fail

Would the person/s who vote for or allocate ratings stars to the threads please consider doing so for this thread. Whether one may agree or otherwise with any of the arguments in the posts is not the main reason I make this suggestion, although the topic is apposite IMHO. It's the sheer intellectual vigour of many of the contributions that is so outstanding and warrants lots of stars, again IMHO.

JOE3656 · Jan 19, 2004

Re: Reorganization and why dmoz human editing fail

RE: I am currently tied up for the moment, but I am not shirking the thread. I will post a rather long paper on a set of possible alternatives to the one tree problem, with graphics as well as a list of possible redesigns. These are the types of requirements that are useful.

Magnolia >> The tree structure itself is fundamentally a problem, there's only some point to it if you believe that people use it to drill down. My last company in Holland was in a government database listing companies by type. To find our company you had to first go to one of the 10 top level company types. For us, who were internet database system architects - you had to start by looking under 'Transport'. Because we were under: Transport/Communication/Internet/Developers/ <<

So we really would like a data dimension. Check below to see if I have that right.

lissa >>
The editors have quite a pile of feature requests in the internal bugs & features forum. Many of them would significantly enhance how we do our job, but aren't easy (or even possible, in some cases) to implement in the current system. I'm sure the editors would be the first to wish that we could review a site, input its address (if relevant) for geographical placement, and check off a variety of boxes to say what kind of site it is (informational, commercial, etc.) and what topics it belongs to, and then have the software put the site in the appropriate categories. This, however, is an entirely different software package than what we currently have.
<<

I've had to solve that intersection for the EPA for a number of GIS datasets (river reach, state, county, zip, watershed, polluting facility, discharge points, etc.) for "assisted" location of different types of sites.

I'm thinking a part of the solution will use the overall current dmoz tree, with categories enriched by "attributes" instead of subcategories. An attribute is a data dimension that is an alternative to a category in that it describes a feature of the site, and is tunable to that category. This is especially useful for regionalization, rather than having a subtree of regions off a category using @ links. Some dimensions (like location, or engine part) are shared as a common resource, which a category editor can apply to the category.

This revised structure of dmoz keeps the tree-list paradigm for categories of type, but treats the data dimensions as (optional) attributes of a site listing, which can be set by the category editor. This reduces the reorganization and depth of the tree, allowing an attribute to be set for the site instead of requiring a subtree. The dimension is an optional alternate category (tree) index available to sites in the category, but is not a branch on the main tree.

Most dimensions would be simple grouping for replacing or simplifying @-link categories that are too complex or too deep to maintain easily. (Reduces the depth of the main tree also, a good thing.)

Some dimensions of the site information are so common that a utility could maintain important parts of these common files (XML? undecided) from available geologic and geopolitical data. Possibly a second utility would help submitters resolve street addresses to GIS points on the map for locating a site. (That way, if a country divides, the sites can be updated from their physical location on earth and re-categorized by the utility.)
Additionally, submitters could update their own listed site location if they move, reducing the effort required of editors.

For example, if you are at 1600 Pennsylvania Ave, Washington DC you are in Washington, in the District of Columbia, in the USA, 30 miles from Dumfries, Virginia, and served by PEPCO electric.

I think that most category editors would add filters for shaping the dimension to allow submitted sites to enter appropriate location data for that category. (See the dmoz FAQ on regions as to why I recommend this.) A storefront business may enter a single address (resolved to a lat-long), while an electric company in a different category may be allowed to enter a service area.

A category may have a number of data dimensions. For example, boating categories may have location, boat type (e.g. sailboat, powerboat, canoe/kayak), language, and manufacturer. A subcat of boating like Associations may have a dimension of special interest (e.g. association types could be safety, manufacturer SIG, other SIGs) that relates only to this category.
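As a rough illustration of the idea, here is a minimal Python sketch. It is not how the ODP software works; the class names, field names, category path, and URLs are all hypothetical, made up only to show a category carrying editor-enabled dimensions and listings carrying attribute values instead of living in subtrees.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Dimension:
    """A data dimension: a named set of allowed values (hypothetical model)."""
    name: str
    allowed: frozenset


@dataclass
class Listing:
    url: str
    attributes: dict = field(default_factory=dict)  # dimension name -> value


@dataclass
class Category:
    path: str
    dimensions: dict = field(default_factory=dict)  # dimension name -> Dimension
    listings: list = field(default_factory=list)

    def add_listing(self, listing):
        # Reject attributes the category editor has not enabled,
        # and values outside the dimension's allowed set.
        for name, value in listing.attributes.items():
            dim = self.dimensions.get(name)
            if dim is None:
                raise ValueError(f"{self.path} has no dimension {name!r}")
            if value not in dim.allowed:
                raise ValueError(f"{value!r} is not allowed for {name!r}")
        self.listings.append(listing)

    def group_by(self, name):
        # Alternate index over the same listings; no subtree is created.
        groups = defaultdict(list)
        for listing in self.listings:
            groups[listing.attributes.get(name, "(unset)")].append(listing.url)
        return dict(groups)


# Illustrative data only: the category path and URLs are placeholders.
boat_type = Dimension("boat_type", frozenset({"sailboat", "powerboat", "canoe/kayak"}))
boating = Category("Recreation/Boating", {"boat_type": boat_type})
boating.add_listing(Listing("http://example.com/sail", {"boat_type": "sailboat"}))
boating.add_listing(Listing("http://example.com/paddle", {"boat_type": "canoe/kayak"}))
boating.add_listing(Listing("http://example.com/general"))  # attributes are optional
by_type = boating.group_by("boat_type")
```

Here `group_by` plays the role of the alternate index: the listings stay in one category, and the grouping by boat type is computed from the attribute rather than stored as a branch of the tree.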

- An additional advantage of resolving the address to a geo point (lat, long) is that distances and names can be used in end-user queries. Many sites are geo sensitive, such as hospitals, stores, or government offices. A dimension may be enriched by a dimension editor who may add categories of facts to the dimension, or by a site editor adding a detail to a dimension for their category.
Finding marinas “Within 50 KM of Chicago”, “In Mersey”, or “On the Potomac River” (from the river reach file) is possible with this approach, because both the site and the location have geo data (lat longs). The nearest repair dock may not be in my state or country.
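A "within N km" query of this kind could be sketched roughly as below. This is only an assumption about one way to do it: it uses the standard haversine great-circle formula on a spherical earth, the function names are hypothetical, and the marina names and coordinates are invented, approximate values for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within(listings, center, radius_km):
    """Return names of listings whose geo point lies within radius_km of center."""
    lat0, lon0 = center
    return [
        name
        for name, lat, lon in listings
        if haversine_km(lat0, lon0, lat, lon) <= radius_km
    ]


# Illustrative data only: approximate coordinates, hypothetical marinas.
CHICAGO = (41.88, -87.63)
marinas = [
    ("Marina A (hypothetical)", 42.05, -87.68),  # a short way north of the city
    ("Marina B (hypothetical)", 43.04, -87.91),  # well over 100 km to the north
]
nearby = within(marinas, CHICAGO, 50)
```

Because the test is on distance rather than on a region name, a listing across a state or national border would be found just the same, which is the point about the nearest repair dock.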

lissa: Thanks for the numbers, I'll use them for the proposed RFC (request for comments) for a more powerful dmoz. Also, thanks for the idea of a tunable category 'bot. I'll think more about that.

Alucard · Jan 19, 2004

Re: Reorganization and why dmoz human editing fail

There are some very interesting ideas being talked about here - some of them might make for a better directory, some might not be practical.

What is being called for here is a complete re-write of the ODP application. Based on what I have seen, the AOL/Netscape staff have no resources for a redesign of this magnitude, even if it COULD be hashed out.

So the resources would have to come from somewhere else, which comes down to one of the original comments - if anyone believes they can do it better, then please try - "if you build it, they will come", to misquote a movie line (they being the editors).

To say that the ODP will die if it doesn't change significantly is an interesting prediction and one which has been made time and time again before. So far it has yet to come to fruition.

So unless there is someone who will step up to provide the resources to build this thing, this discussion is a purely academic one.

giz · Jan 19, 2004

Re: Reorganization and why dmoz human editing fail

>> this discussion is a purely academic one <<

... but it is important to have it, as it may provide ideas about things that can be done with the ODP structure and software.

JOE3656 · Jan 19, 2004

Re: Reorganization and why dmoz human editing fail

Not sure I agree about the lack of resources. Why, it could be an open source or PD (see Sourceforge) product, or possibly a DARPA project, an academic project (or a commercial company could support it). The developer community develops lots of free software for general use. Perl, Linux, STRUTS, JESS, Apache, Tomcat, and even web browsers themselves all started as informal software projects supported by a volunteer community. Netscape does not have to be the development resource to create new approaches. Will I try? Enough to develop a first useful approach (even as Linux is no longer written by Linus Torvalds only).

You may think it's just an academic exercise to be hashing it out, but just as important to any effort is collecting user requirements to determine what can and should be done; otherwise I am "just complaining" or "throwing bricks". Sooner or later all software systems have someone like myself trying to modify or mutate them. Perhaps there is an evolutionary force in software that impels those like myself to challenge and attempt change.

As I read through the editor feedback to the post, some themes are emerging.

TestShootCom · Jan 19, 2004

Well, I can understand your frustrations. There is no automated way of working on this because humans need to edit categories, otherwise porn and mlm sites would spam the directory at will.

I had been trying to get listed here under all the photographer categories for 4 years (not at once; when one failed, I moved to another).

Finally last year I got added because a model of mine was on a reality show, which is not my industry for the site in question.

I was an editor for like three days, before I could settle in I was booted, yet I had not done anything in the 72 hours as an editor, that was back in 1999.

Yes there are thousands of sites that are overlooked or not added. Yes dmoz is overworked and understaffed, yes the editors knew it was a time consuming job when they signed on. Yes I empathize, yes I am pissed that I can't get any of my other sites in other sectors added, even after 4 years. Yet I am patient, I sit, wait, and beg to be an editor, just because I don't like the lag, and want to help.

-Patiently waiting

Tim Hunold
www.TestShoot.com (listed in reality shows)
www.FilmSupplies.com (4 years still no listing)
www.TestShoot.net (4 years still no listing)
www.ItsSoBig.com (4 years still no listing)

Alucard · Jan 20, 2004

Re: Reorganization and why dmoz human editing fail

Just to clear up some possible misunderstandings: I'm not saying this isn't a worthwhile discussion. And I'm not saying that there aren't resources worldwide for this sort of work; there may well be.

But, based on what I see, there are no resources WITHIN the ODP for this sort of major redesign. Yes, there may be some minor changes which would improve things, but most of what is being discussed here would be a FUNDAMENTAL change to the ODP.

The ODP is not open source, for example. It's the directory that is open, not the code that runs it.

So in order for a project along these lines to take off, it would have to be an effort separate from the ODP which, if sufficiently successful, may eclipse it in the future. Alternatively, AOL/Time Warner (who owns the directory) would have to make a major financial commitment to a redesign. Based on what I read, I doubt this would happen.

Expectations that the ODP will undergo a major redesign are, in my opinion, unrealistic.

I hope this makes my views on this a little clearer. Now back to your regularly scheduled discussion, already in progress.

lissa · Jan 20, 2004

Re: Reorganization and why dmoz human editing fail

Now, if someone were to work offline developing a prototype of brand new software and demonstrate its effectiveness at handling the data in the RDF (i.e. what the public sees) it might become worthwhile for our tech person to spec out the rest of the dataset requirements and edit-side tools to be encoded into the new software.

If someone is able to develop a whole new software suite, with a different and enhanced method for managing the data, which provides at least the same functionality and look/feel of the existing editor interface, beta testing, and a transition plan, all to be provided to ODP for free, I'm sure we'd be interested.

I would guess that development of something like this would take a year or two (at best!) of concentrated effort on the parts of several people. We'll be looking forward to the results!

sole · Jan 20, 2004

...yes the editors knew it was a time consuming job when they signed on.

Actually, that's not true. I know I didn't sign on for a time consuming job. I just signed up to share some sites that were in my bookmarks. I heard there was this great directory, I went looking for some more sites to add to my collection, and found they didn't have my favorites. So I signed on to share. I didn't do a whole lot of editing at first, and gradually got involved doing more and more.

It's been said before, but I'll say it again. Editors are volunteers.

katapult · Jan 20, 2004

Re: Reorganization and why dmoz human editing fail

If someone is able to develop a whole new software suite, with a different and enhanced method for managing the data, which provides at least the same functionality and look/feel of the existing editor interface, beta testing, and a transition plan, all to be provided to ODP for free, I'm sure we'd be interested.

I have a few ideas to also give the dmoz pages a face-lift, new logo, page layout etc., and I'd be happy to create some designs for free.

I'll publish some examples on my free webspace soon - see what you think.

motsa · Jan 21, 2004

Re: Reorganization and why dmoz human editing fail

Um, the ODP isn't looking to get a facelift, katapult.

katapult · Jan 21, 2004

Re: Reorganization and why dmoz human editing fail

Tell me you're joking... the current dmoz website is about the most out-dated, un-interesting, dull design I've come across. It really does need updating.

yapuka · Jan 21, 2004

Re: Reorganization and why dmoz human editing fail

Hello, and thanks for that.

Actually, we quite like it as it is.

And our objective is to create a dataset of links organized in a tree structure for others to use. We do not really aim at getting 4 million visitors a day (we could not handle it anyway).

So, what we are looking for (and what we have) is a presentation that suits the needs of editors. And what we have does the job quite well.

Of course, it probably could be improved (and it IS being improved all the time), but it really is not a priority.

donaldb · Jan 21, 2004

Re: Reorganization and why dmoz human editing fail

Tell me you're joking... the current dmoz website is about the most out-dated, un-interesting, dull design I've come across. It really does need updating.

Aesthetics are nice, but only where appropriate. We're not trying to be flashy and exciting. We're trying to be user friendly and quick loading. If you could come up with a design that is less graphics-intensive and loads faster than the current design, then we might think about it, but I think that the current design is working well for our target audience.
