Is DMOZ outdated?

thall89553

Member
Joined
Aug 3, 2004
Messages
58
I am a bit challenged by this Internet resource for the following reason. I have followed procedures to a "t" in submitting two web sites for my clients. Both web sites are very sharp, well designed and programmed, and would be a valuable resource for anyone looking for their content. I just finished a several-month "waiting" period for the sites to be reviewed. When I submitted an inquiry, to the initial submission thread mind you, I was told it had not been reviewed yet. OK, fair enough, I thought to myself, I asked and received an answer. I was OK with that, but what really challenged me is that when I politely asked "What is the average time from submission to listing", I got this answer:

Moderator:
Well, arguably there are several averages. The median is something like five years, or infinity (depending on whether the rejected submittals are slightly less than half of all submittals, or slightly more than half, and ... I'd bet small sums on the latter, I think). The geometric and arithmetic means are both infinity, and will be so long as some sites are rejected. Oh, you can count up submittals and divide by number of sites reviewed per day, but ... who knows what the distribution would be?
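
The arithmetic behind that answer can be sketched in a few lines of Python. This is only an illustration: the wait times below are invented, and treating a rejected submittal as an infinite wait (`math.inf`) is an assumption about how the moderator was counting, not something DMOZ publishes.

```python
import math
import statistics


def summarize(waits_days):
    """Return (median, arithmetic mean, geometric mean) of wait times in days.

    Assumption: a rejected submittal is never listed, so its wait is math.inf.
    """
    n = len(waits_days)
    median = statistics.median(waits_days)
    arithmetic = sum(waits_days) / n
    # Geometric mean computed as exp(mean(log(x))); log(inf) is inf.
    geometric = math.exp(sum(math.log(w) for w in waits_days) / n)
    return median, arithmetic, geometric


# Hypothetical wait times (days) for four submittals that were eventually listed.
listed = [3, 30, 400, 1800]

# Slightly fewer than half rejected (3 of 7) versus slightly more (5 of 9).
fewer_rejected = listed + [math.inf] * 3
more_rejected = listed + [math.inf] * 5

print(summarize(fewer_rejected))
print(summarize(more_rejected))
```

With these made-up numbers the median is 1800 days (about five years) when fewer than half of the submittals are rejected and infinite when more than half are, while both means are infinite as soon as a single rejection is included.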


As an organization, is this typical of an answer to a "customer"? I would think an answer such as the following might be more appropriate -

"Great question thall89553. Unfortunately there is not any specific "average" figure available. I encourage you to wait some more and repost your inquiry in a few months."

Finally, if it can take 5 years to infinity to submit and get a site listed on DMOZ, what does that say about the integrity of the sites listed? I would think that this "system" would be giving people "outdated" web sites at best.
 

oneeye

Member
Joined
Aug 2, 2002
Messages
3,512
is this typical of an answer to a "customer"?
You aren't a customer, we don't have customers, we are a volunteer driven "academic" project and we invite suggestions from the public on a no obligation basis. We aim to assist surfers not provide a service for webmasters. You're confusing us with a commercial listing service like Yahoo.

Great question thall89553.
It isn't a great question, it's been asked and answered a thousand times, at least a dozen in the last few days. And there is an FAQ on listing times too. If I visit a forum for the first time to ask a question I always read any FAQs and general information first, then search the forum to see if my question has been answered, before asking someone to spend time holding my hand because I'm too lazy!

I encourage you to ... repost your inquiry in a few months
We don't want you to repost - once you know the site has been received the site will be listed or not and you can tell which by looking at DMOZ for the listing.

if it can take 5 years to infinity to submit and get a site listed on DMOZ, what does that say about the integrity of the sites listed?
If a site is still going 5 years after being suggested then it is clearly stable and likely to run the course. We have methods of cleaning up dead links, etc. one of which is for the public to report them in a thread on this forum.
 

akknoepfel

Member
Joined
Apr 9, 2005
Messages
4
Balance between expectations and reality.

Adding a useful description of a web-site and categorizing the web-site is a valuable contribution to the DMOZ community. Of course, most people do this to promote a web-site in which they take a certain interest. But people like to know what they can expect - if expectations are often disappointed, people will retire from contributing to the DMOZ project.

From this point of view, I understand the question about the average time to get a suggested web-site listed, and I see a substantial danger for DMOZ in off-putting answers like "The median is something like five years, or infinity" - there are statistical means to provide reasonable answers. If the statistical data necessary to answer these questions is not available, perhaps one should think about providing the means to collect or extract this information - technically, there should be no problem.

The value of DMOZ results not only from the review process of checking web-sites, but also very much from the people suggesting web-sites. Frustrating these people has a direct impact on the popularity of DMOZ!

With regards,
Andreas Knöpfel
 

pvgool

kEditall/kCatmv
Curlie Meta
Joined
Oct 8, 2002
Messages
10,093
akknoepfel said:
But people like to know what they can expect - if expectations are often disappointed, people will retire from contributing to the DMOZ project.
The only thing you can expect after suggesting a site to DMOZ is that the site will be reviewed at some point in time. As we (the editors) don't know when this point will be reached, we are unable to give you an answer to questions like "when will my site be reviewed" and "what is the average wait time".

akknoepfel said:
The value of DMOZ results not only from the review process of checking web-sites, but also very much from the people suggesting web-sites. Frustrating these people has a direct impact on the popularity of DMOZ!
You are totally wrong. The value of DMOZ has absolutely no relation to the possibility of suggesting sites. Certainly not if you know that a lot of (in some parts of DMOZ, most) suggested sites will not be listed as they violate our guidelines.
 

oneeye

Member
Joined
Aug 2, 2002
Messages
3,512
there are statistical means to provide reasonable answers.
There aren't! At least not reasonable ones. A site sitting on its own in a heap of one may wait 2 years. A site submitted yesterday to a heap of 10,000 waiting might be listed today. Some listings come from suggestions, many come from link pages on other sites or Google searches. Being in a heap of suggestions is just one of several ways of getting listed and, from an editor's perspective, not the most efficient way of working.

If the statistical data necessary to answer these questions is not available, perhaps one should think about providing the means to collect or extract this information
There is no reason to - any statistics you could get out would be worthless to you and of no interest to editors. Editing activity is completely random, undirected, unprioritised, and unpredictable. To an extreme.
 

Sunanda

Member
Joined
Jun 15, 2003
Messages
248
If the statistical data necessary to answer these questions is not available, perhaps one should think about providing the means to collect or extract this information - technically, there should be no problem.

It's all available to someone who wants to take the time to analyse the data.

Download a year's worth of the RDF files. Analyse them, and you'll have a long list of when and where new sites were added....You'll have 400,000 sites or so.

Next, do a whois on each of them to get when they were first registered.

The difference between the listed-in-dmoz date and the first-registered date gives you a feel for how long it took eligible sites that are listed to be listed.

What you don't have is when (or if) each of those sites was suggested to DMOZ by members of the public. In some ways, that is irrelevant. But to get a statistical sample, run some searches on this site for which of the 400,000 or so have had queries raised.

What it'll all mean is a matter of interpretation. But there are some interesting stats in there from differential RDF analyses, and the data is available to anyone.
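
The date arithmetic in the steps above could look roughly like the following Python sketch. It assumes the two lookups have already been built by hand: `first_listed` (from comparing successive RDF dumps) and `first_registered` (from whois). Both dictionaries, the function name, and the sample domains and dates are hypothetical; no RDF parsing or whois querying is shown.

```python
from datetime import date
from statistics import median


def listing_lags(first_listed, first_registered):
    """Days between a domain's registration and its first appearance in the directory.

    Domains with no known registration date are skipped.
    """
    lags = {}
    for domain, listed_on in first_listed.items():
        registered_on = first_registered.get(domain)
        if registered_on is None:
            continue
        lags[domain] = (listed_on - registered_on).days
    return lags


# Hypothetical sample data standing in for the RDF-diff and whois results.
first_listed = {
    "example.org": date(2005, 3, 1),
    "example.net": date(2005, 1, 10),
    "example.com": date(2005, 2, 1),
}
first_registered = {
    "example.org": date(2003, 3, 1),
    "example.net": date(2004, 12, 1),
}

lags = listing_lags(first_listed, first_registered)
print(lags)
print(median(lags.values()))
```

As the post says, this measures registration-to-listing time only; it says nothing about when, or whether, a site was ever suggested.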
 

pvgool

kEditall/kCatmv
Curlie Meta
Joined
Oct 8, 2002
Messages
10,093
We aren't an organization.
As oneeye wrote 'we are a volunteer driven "academic" project'
 

hutcheson

Curlie Meta
Joined
Mar 23, 2002
Messages
19,136
>there are statistical means to provide reasonable answers.

Yes, there are. And in fact, approximations to those statistical procedures were used to provide those accurate answers.

That you think them not reasonable speaks only to your innumeracy, not the invalidity of either the procedures or the results they provide.
 

oneeye

Member
Joined
Aug 2, 2002
Messages
3,512
I am going to become an editor at your organization...
I hope you show a better understanding of the project, its concepts, its aims and principles, and what it isn't than you have done here when you apply. If that is an affirmative we will welcome you with open arms. If not then I can't see it would be a difficult decision for a meta. It may be of interest to know that editors, when referring to themselves collectively, use the term "community" not organization. Good luck to you.

[Added] As regards the statistics, the maths might be right and therefore the answer and the reasoning reasonable, but reasonable in terms of usefulness to man or beast, and of supplying information that is meaningfully useful to someone making a decision (one of the primary reasons for statistical analysis)? I think not, personally. Analysing the RDF dump, Google, whois, and forum threads might be a fascinating exercise if you are so inclined, but of what possible use could it ever be to anyone? Which might explain why no-one has ever done it and tried to publish the results with a meaningful conclusion. If someone wants to try then I'm sure all the editors here will be happy to point out the flaws when it says the average time for listing is 4 months, 13 days, 9 hours, and 36 minutes. Which it might well be for all I know; it is probably not far off (gut feel, not based on anything else). Spread over the 1 million sites sitting in unreviewed heaps, with an unpredictable variable called editors to foul up any possible conclusion, what does that mean for mysite.com? Nothing, I'm afraid. Its projected time for listing remains 1 minute to infinity, and the only one who really knows is the editor who opens the submission, opens the site, and decides to make a decision on that site at that moment. That split second is the only time editors can predict the time of a review.

It is interesting that at some point the originator of this thread has proved the point. We are busy saying anytime from now to infinity, whilst at the exact same time another editor has found the site waiting and listed it not once but twice. Of the many guesses I as an editor might have made about when it would be listed, "whilst we were posting" was not in the top 100.
 

hutcheson

Curlie Meta
Joined
Mar 23, 2002
Messages
19,136
>if it can take 5 years to infinity to submit and get a site listed on DMOZ, what does that say about the integrity of the sites listed?

First of all, submittal date means nothing whatsoever. If any date at all were relevant, it would be the date the site was published. (Of course, we really don't know either date, so it's a good thing they simply don't matter.)

As to the ethics of the question: if it takes "infinity" to get a site listed, what does that mean for the ODP's integrity? EVERYTHING! It is our whole purpose in life to list sites that contribute to the sum of human knowledge. Most sites don't do that. It is our duty and our delight to ensure that such sites take forever to "get listed." If you're thinking of the ODP as a place where you can "get listed", you've already stepped into the twilight zone. We don't offer a "get listing" service, and the submittal policy specifically says so up front. "NO SITE IS GUARANTEED A LISTING."

Third, as to the utter mathematical naivety of the question, once again you're assuming that "I can expect my site to get listed in the average time." And that would be true for certain kinds of random distributions AND for sufficiently imprecise measures of the "average." But my experience based on a large number of sites and some actual knowledge of statistics is that such a distribution simply doesn't exist. About a tenth of submittals are reviewed within a matter of days. But if a site isn't listed within, say, two or three months, then it is more likely than not to wait for years. In other words, by the time a site has waited half the "average" time, it is already probable that it is effectively NOT in the norm of sites over which the average time was calculated, and the "average" doesn't apply to it. Thus, ANY expectation based on ANY reported average will result in disappointed expectations and frustration.
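
A toy calculation shows the shape of this argument. The distribution below is invented purely for illustration (it is not ODP data), and the helper names are hypothetical: a tenth of sites are reviewed in days, half within two months, and the rest wait for years.

```python
import statistics


def mean_wait(waits_days):
    """Plain arithmetic mean of wait times in days."""
    return sum(waits_days) / len(waits_days)


def median_remaining(waits_days, already_waited):
    """Median further wait for sites still unreviewed after `already_waited` days."""
    remaining = [w - already_waited for w in waits_days if w > already_waited]
    return statistics.median(remaining)


# Hypothetical mix of 100 sites: 10 reviewed in 3 days, 30 in a month,
# 20 in two months, and 40 only after 1200 days.
waits = [3] * 10 + [30] * 30 + [60] * 20 + [1200] * 40

average = mean_wait(waits)
half_average = average / 2

print(average)
print(median_remaining(waits, half_average))
```

Here the average is about 501 days, yet a site that has already waited half of that still faces a median further wait of roughly 949 days, because by then it belongs to the slow group rather than to the population the average was computed over.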

Hence my original answer was based firmly on valid mathematical analysis and ethical considerations -- and I believe it's the only answer that can pass both of those tests.

But as to the question at hand, before the ODP appeared, it was a truism that "directories favor large, stable sites." The ODP has been successful at finding and listing many smaller sites, but the bit about "stable" remains as true as ever.

Sites with "integrity", in the only sense that is meaningful to us, are sites that have shown a consistent determination to provide valuable information whether or not they can attract an ODP listing. Sites of businesses that can operate and provide their own promotion -- and succeed. Sites that last.

The ODP does a better job than any other large directory of weeding out sites that haven't lasted, so we're confident that, six months from now when Google (or anyone else) picks up the current RDF, they will have something worth posting.

That's integrity.

Counting the time from the date a spammy submittal is flung at us till the day they are flushed by us -- is not just an exercise in irrelevancy, it is far far worse.

Because it's a well-known management truism that application of an irrelevant metric of productivity to a group of workers will result in the workers reducing real productivity to increase the value of the metric. The ODP doesn't play those counterproductive games. We don't TRACK even those numbers that (unlike time-from-submittal-to-review) are at least vaguely related to some actual purpose of the ODP. Even "number of sites listed" is an irrelevant metric: the community doesn't know the exact number (although there's an arithmetic process called "counting" to compute it); and the community will qualmlessly take steps to reduce it. We focus on reality, not contrafactual models thereof.
 

LizardGroupie

Member
Joined
Mar 1, 2005
Messages
16
thall89553 said:
As an organization, is this typical of an answer to a "customer"?

I'm glad you were smart enough to put customer in ironic quotes. But that doesn't change the fact that this is trolling. I really must question the integrity of anyone who would make such a post. It stands in stark contrast to the top integrity of the Open Directory Project.
 

foj1971

Member
Joined
Apr 27, 2005
Messages
16
It's a joke

If I wasn't keen to get my own site listed in the directory I'd point out that the only people who would volunteer to become an editor would be someone who has an interest in getting his own site in a category and keeping his competitors out. Or a jumped up little hitler who wants that little bit of power, you know - the type of person who would reply to a genuine query with the sort of smart alec reply like "The median is something like five years, or infinity..."

Anyone brave enough to agree? Thought not...
 

pvgool

kEditall/kCatmv
Curlie Meta
Joined
Oct 8, 2002
Messages
10,093
foj1971 said:
I'd point out that the only people who would volunteer to become an editor would be someone who has an interest in getting his own site in a category
I guess for some editors this was one of the reasons. But luckily not all people are so selfish.

foj1971 said:
and keeping his competitors out. Or a jumped up little hitler who wants that little bit of power
We have had people like this as editors. But they never remain for long. Internal and external control take care of them very quickly.
 

jjwill

Member
Joined
Aug 11, 2004
Messages
422
With all that said, you know the status of the submitted sites, and that's all you need to know. There is nothing else you can do. The best you can do is tell your clients that the sites were submitted and received and leave it at that. Nobody can tell you when they will be reviewed, ever. So at this point, forgetting about it is probably the best route. :)

Of course you are welcome to check back later (6 months after your last status update) if you like, but that really doesn't speed up the process any.
 