jeanmanco Posted May 29, 2006

The idea that Google gives special love to sites listed in the Open Directory is an urban myth. But it is so prevalent that it seems worth starting a thread specifically to tackle this. I am therefore copying here a reply I made on another thread.

Search engines do tend to use the Open Directory to seed their indexes, but that doesn't mean that they rank ODP-listed sites any higher than those not listed. In fact I don't know of any proof that any SE does.

What does seem to be true is that all the major engines use backlinks in some way in their algorithms. So if a site has no links to it and then is listed in the ODP, that is likely to have a noticeable effect on its rankings in the SEs. But you would see the same effect from a listing in Yahoo! and even more of an effect from a backlink from the front page of BBC News.

Google's PageRank patent does not mention the ODP at all. What it does describe is the totally automated method of calculating a value for a page based on the number of incoming links and their value (where value = PR). Google representatives have repeatedly said that no site can have a PR set manually. They have also said in so many words that a link from the ODP is worth exactly the same as any other link.

Google has always been upfront about its PageRank mechanism. It can't calculate PR without any inbound links. So it recommends that sites gain inbound links. Obviously it cannot list all the sites on the web that a site might get an inbound link from. So Google makes a couple of suggestions that will be possibilities for sites of a great range of types: Yahoo! and the ODP. They are both general directories - the largest on the Web.

Yahoo! Search makes a similar recommendation: "Getting your site listed in major directory services such as the Yahoo! Directory and DMOZ is an excellent way to be sure that there are links pointing our crawler to your site."

Obviously Yahoo! would be the best choice for commercial sites willing to pay for rapid listing. But there are other directories too which provide pay-for-placement services.

No site is dependent upon a listing in the ODP to appear in the indexes of any of the major search engines. And it is those major search engines that really matter. They are the ones being used by the great majority of the world's population.

Small engines in the experimental phase may draw exclusively on the Open Directory. But once they get large enough to be widely used, they will have moved beyond simply indexing the sites listed in the ODP.
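The "value based on the number of incoming links and their value" idea can be illustrated with a small sketch. This is a textbook-style PageRank iteration, not Google's production code; the page names, the damping factor of 0.85 and the iteration count are assumptions made for illustration. The point it shows is that the calculation only sees links: a link from a directory page is handled by exactly the same arithmetic as a link from any other page.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank sketch. `links` maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small base share, then collects shares from its inbound links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            # A page with no outbound links spreads its value evenly over all pages.
            targets = links.get(page) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: before and after "new_site" gains one directory link.
before = pagerank({"news": ["directory"], "directory": ["news"], "new_site": []})
after = pagerank({"news": ["directory"], "directory": ["news", "new_site"], "new_site": []})
print(round(before["new_site"], 3), round(after["new_site"], 3))
```

In this toy graph the site's score rises once it has an inbound link, which matches the claim that a first listing "puts it on the map". Nothing in the function knows or cares which page is a directory.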
Expertu Posted May 29, 2006

>The idea that Google gives special love to sites listed in the Open Directory is an urban myth.

I tend to disagree. It gives them somewhat greater importance than other websites. Although it is greater, it will not improve any rankings, as some might dream. It's just a bonus to the trust rating that a website has.

>But it is so prevalent that it seems worth starting a thread specifically to tackle this.

It's good that you are being transparent and want to tackle things head on.

>Search engines do tend to use the Open Directory to seed their indexes, but that doesn't mean that they rank ODP-listed sites any higher than those not listed. In fact I don't know of any proof that any SE does.

As I said above, it will only improve your overall quality rating. That doesn't mean that it will boost your rankings sky high. It's just a small percentage added to your website's quality as far as inbound linkage goes. I would rather be in a quality directory such as DMOZ than on a spammy website with the same characteristics.

>What does seem to be true is that all the major engines use backlinks in some way in their algorithms. So if a site has no links to it and then is listed in the ODP, that is likely to have a noticeable effect on its rankings in the SEs. But you would see the same effect from a listing in Yahoo! and even more of an effect from a backlink from the front page of BBC News.

You actually compare a link from the front page of BBC News with a listing in DMOZ? OK. Compare a link from the front page of DMOZ with a link from the front page of BBC News. I would take DMOZ.

>Google's PageRank patent does not mention the ODP at all.

Of course it doesn't. That's why it uses its content, right? (Descriptions for websites in its search results.) How big is the step from there to considering DMOZ a "better" link? Small?

>They have also said in so many words that a link from the ODP is worth exactly the same as any other link.

They also say a lot of other things, but act completely differently.

>Google has always been upfront about its PageRank mechanism.

You are joking, right? Upfront about its PageRank mechanism? There are 40000 forums out there, searching for bits of info every day. Nothing is made public except the number that results.

>Yahoo! Search makes a similar recommendation: "Getting your site listed in major directory services such as the Yahoo! Directory and DMOZ is an excellent way to be sure that there are links pointing our crawler to your site."

Actually they state it like this: "Getting your site listed in the Yahoo! Directory can increase the likelihood that your site will eventually show in Web Search results. However, a listing in the Directory does not guarantee that it will be listed in Yahoo! Search results, or that it will be listed in any particular manner or order in Search Results."

>No site is dependent upon a listing in the ODP to appear in the indexes of any of the major search engines.

What you wrote couldn't be more true.

>Small engines in the experimental phase may draw exclusively on the Open Directory.

Google and MSN use the descriptions of websites present in DMOZ. Google mirrors the DMOZ directory. Yahoo used Google some time ago, and so it used DMOZ as well.

Conclusion: do not consider what I said above to be an argument, but only a constructive discussion.

Hi. I am Cristian, and I want to be part of this community, because I want to help DMOZ, and I want to help the people who dedicate their time to maintaining it.
jeanmanco (Author) Posted May 29, 2006

Hi Christian. Thanks for your comments.

1. Webmasters don't believe what Google tells them.

What can I say? It is evident that Google has learned caution. Brin and Page published their paper on PageRank and boasted that the method was spam-resistant. (That's what I mean about them being upfront. How much more upfront could they get? The whole methodology was laid bare.) Naturally webmasters and SEOs immediately saw how they could boost their sites in Google. Brin and Page had handed them the key to high rankings. They proceeded to create link farms, FFAs and reciprocal linking, to buy links, spam guest books and forums and generally subvert the system Brin and Page were so proud of.

So now Google engineers are more cagey. No - they are not going to give you the recipe for their 'secret sauce'. They are not going to tell you every one of the 100 other elements in their algorithm. But that does not mean that they are lying. I do not believe that Matt Cutts is a liar. And what he and other Google representatives have said about PageRank makes perfect sense to me. That calculation is automated. The whole idea of search engines was to automate search. Given the billions of pages on the Web, how could Google individually allocate PageRank by hand?
charlesleo Posted May 29, 2006

>Although it is greater, it will not improve any rankings, as some might dream. It's just a bonus to the trust rating that a website has.

A DMOZ/ODP listing by itself does not improve rankings; however, Google does place greater importance on a website listed within DMOZ. That's why Google has a directory built upon the ODP and recommends you submit to ODP. A listing in DMOZ means that over 300 search engines will index your website, thereby leading to more backlinks and thereby boosting a ranking rather significantly.

So anyone thinking that a listing in DMOZ doesn't affect Google rankings in a big way has their information all wrong. ODP doesn't intentionally promote a website's rankings by itself - but many search engines do, by the way they are written and how they reference the Internet in general.
Expertu Posted May 29, 2006

>1. Webmasters don't believe what Google tells them.

This is not a general issue. Although we are webmasters, we are still web users, just more savvy ones.

>But that does not mean that they are lying.

I didn't say they were lying. I said they are withholding information, or they are misleading. There's a difference. But after all, if all that info were made public, I think 5,000,000 people would be out of jobs, and the web would have 500,000,000 fewer pages.
jeanmanco (Author) Posted May 29, 2006

2. TrustRank

It seems to be a common assumption among SEOs that the Open Directory is one of the 'seed sites' involved in calculating TrustRank. But there is no evidence for it. In fact the evidence is against.

In the Stanford White Paper on Trust Rank the authors selected seeds via inverse PageRank (i.e. sites with many outbound links). To get rid of the spam, they removed from their initial set of 25,000 sites all those not listed in any of the major directories, cutting the pool down to 7,900. Then they manually evaluated the top 1,250 sites and selected 178 to be used as good seeds. They selected only sites with a clearly identifiable authority (such as a government or educational institution or company) that controlled the contents of the site.

So all of their 178 seed sites are listed in either Yahoo! or the ODP. But that doesn't mean that every one of the millions of listings in the ODP is a trusted site. The ODP itself was not among the seed set, since it is not controlled by such an authority.
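The seed-selection steps described above can be sketched as a small pipeline. This is only an illustration of the sequence in the post (inverse PageRank, directory filter, manual review), not the paper's code. The site names, the `directory_listed` set and the `is_trusted_authority` callback standing in for the human reviewers are all hypothetical, and inverse PageRank is approximated here by running a PageRank-style iteration on the reversed link graph.

```python
def inverse_pagerank(links, damping=0.85, iterations=50):
    """Score sites on the reversed link graph, so sites with many outbound links score high."""
    sites = sorted(set(links) | {t for targets in links.values() for t in targets})
    # Reverse every edge: if a links to b, then b passes score back to a.
    reversed_links = {s: [] for s in sites}
    for source, targets in links.items():
        for target in targets:
            reversed_links[target].append(source)
    n = len(sites)
    score = {s: 1.0 / n for s in sites}
    for _ in range(iterations):
        new_score = {s: (1.0 - damping) / n for s in sites}
        for s in sites:
            outs = reversed_links[s] or sites  # no reversed links: spread evenly
            for t in outs:
                new_score[t] += damping * score[s] / len(outs)
        score = new_score
    return score


def pick_seeds(links, directory_listed, is_trusted_authority, pool_size, review_limit):
    """Pool by inverse PageRank, keep directory-listed sites, then apply manual review."""
    scores = inverse_pagerank(links)
    pool = sorted(scores, key=lambda s: (-scores[s], s))[:pool_size]
    listed = [s for s in pool if s in directory_listed]
    reviewed = listed[:review_limit]
    return [s for s in reviewed if is_trusted_authority(s)]


# Hypothetical toy web: three hubs with many outbound links, four ordinary sites.
toy_links = {
    "gov_portal": ["a", "b", "c"],
    "big_directory": ["a", "b", "c", "d"],
    "clone_directory": ["a", "b", "c", "d"],
}
seeds = pick_seeds(
    toy_links,
    directory_listed={"gov_portal", "big_directory", "a"},
    # Stand-in for the human judgment "controlled by a clearly identifiable authority".
    is_trusted_authority=lambda site: site == "gov_portal",
    pool_size=3,
    review_limit=2,
)
print(seeds)
```

In this toy run the unlisted clone is dropped by the directory filter, and the big directory is dropped at the manual step, which mirrors the argument that being in the candidate pool is not the same as being chosen as a seed.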
jeanmanco (Author) Posted May 29, 2006

Just to keep all this stuff together, I'm repeating material from another post from the other thread.

In Google's webmaster guidelines, they say:

>Submit your site to relevant directories such as the Open Directory Project and Yahoo!, as well as to other industry-specific expert sites.

>Google does place greater importance on a website listed within DMOZ. That's why Google has a directory built upon the ODP and recommends you submit to ODP.

The two facts:

1) Google's Directory = ODP
2) Google advises getting links and suggests the ODP as one possible source

are the foundation upon which a huge edifice of speculation has been built.

It is true enough that one link in the ODP eventually generates a lot more from ODP clones. However the effect of this may not be as great as you might imagine. Many of these clones have little or no PageRank. In fact Google seems to be declaring war on a lot of them and banning them altogether. Presumably this is part of the relentless drive against spam, in particular duplicate content.
jeanmanco (Author) Posted May 29, 2006

3. Use of ODP titles and descriptions in search engine results.

There seems to be a common misconception that this affects ranking. It doesn't. It is a purely presentational issue.

For more see my article: http://jean.manco.googlepages.com/googleanddmoz
Expertu Posted May 29, 2006

>So all of their 178 seed sites are listed in either Yahoo! or the ODP.

My point too.

>It is true enough that one link in the ODP eventually generates a lot more from ODP clones.

I would not want links from those 3rd party websites. Really. I would prefer that DMOZ didn't offer the RDF feed at all.

>Many of these clones have little or no PageRank.

Not only is the PageRank you will get close to NONE (deep DMOZ internal categories have PR 3-5; the clones will MAYBE get PR 0-5 on their index page, which means 0 for the websites listed in them), but there is also the duplicate content penalty.

>There seems to be a common misconception that this affects ranking. It doesn't. It is a purely presentational issue.

I totally agree. But when Microsoft (which recently offered an opt-out method) and Google use those descriptions as SPAM-free descriptions of the websites in their index (and they did/do this exactly to remove the spam in their index from websites that overzealously stuff the meta description and title), one must conclude: they trust DMOZ.
jeanmanco (Author) Posted May 30, 2006

I don't deny that the Open Directory is useful to the search engines. I'm happy that it is. My concern is that webmasters often expect far too much from a listing in the Open Directory. There has been so much exaggeration and speculation about its effects on search engines.

What it all boils down to is that a listing is a link. And a link should help a site a little. The effect will be most noticeable for a site that has no other links. A listing then puts it on the map. But so would a listing in Yahoo! Or a link from any well-known and heavily-crawled site. There is no magic about the ODP.
Expertu Posted May 30, 2006

>My concern is that webmasters often expect far too much from a listing in the Open Directory.

Real webmasters and marketers know better. It's indeed a quality resource, but some exaggerate.

>What it all boils down to is that a listing is a link.

A quality link (links?). Let's not forget that some websites have thousands of links in the ODP.

>But so would a listing in Yahoo!

Not quite. Yahoo is a commercial directory. They have interests. And did you ever look at the outbound links (how they are constructed) in the Yahoo Directory?
jeanmanco (Author) Posted May 30, 2006

I can't say that I have looked at the Yahoo Directory recently. But I can tell you that Google shows backlinks to my site from the Yahoo! Directory.
jeanmanco (Author) Posted May 30, 2006

>Real webmasters and marketers know better.

I agree that SEOs who really know their stuff have been giving sensible advice for some time. But still the most alarming misconceptions circulate.
Expertu Posted May 30, 2006

>But I can tell you that Google shows backlinks to my site from the Yahoo! Directory.

Sure. It shows.

>I agree that SEOs who really know their stuff have been giving sensible advice for some time. But still the most alarming misconceptions circulate.

That's why Compostannie, so many others and I are writing in forums all day: to educate the rest.
charlesleo Posted May 30, 2006

>In fact the evidence is against. In the Stanford White Paper on Trust Rank the authors selected seeds via inverse PageRank

I think you're referring to sites that had too many outbound links - ones that would be considered 'link farming.' Google does not consider the ODP a 'link farm', otherwise they wouldn't recommend submitting a site to the ODP.

I disagree with your view that an ODP listing should be downplayed. For a variety of reasons, we can all agree it affects ranking. You believe it's not too detrimental and can be circumvented (again, this is speculative), and I believe based upon my searches, website development, and research into the topic that it is (again, my opinion is just as speculative as yours). Without Google officially divulging what criteria they build their results from, we may never know.

Another thing is for certain - Google does place more importance on certain websites - they do rank certain sites according to the quality and importance of a website as well as its content. ODP happens to be one of them, and for all intents and purposes it could easily be considered a link farm. This leads me to believe special consideration was given to ODP, otherwise it would have been automatically filtered out by their algorithms.
hutcheson (Meta) Posted May 30, 2006

>Another thing is for certain - Google does place more importance in certain websites - they do rank certain sites according to the quality and importance of a website as well it's content.

I don't believe this.

(1) Once the algorithm was "seeded", manual intervention is limited to reversing automatic bans. Or at least, so I understand Google to say. Except for bans, the concept of "site" doesn't really seem to apply except in the results presentation phase (where multiple hits within the same domain are hidden by default).

(2) "Quality" is a word that shouldn't be used at all in this context. It has no "meaning" in any normal sense. And certainly in no useful sense at all is it automatically calculable.

>ODP happens to be one of them...

No, the set is empty. There aren't any of them.

>... which for all intents and purposes, could easily be considered a link farm.

Only by an idiot. Or a really lame algorithm. Granted, the affiliate spam business seems to attract more than its share of idiots. And MSN, of course, is more than willing to serve up lame search algorithms all day.

>This leads me to believe special consideration was given to ODP otherwise it would have been automatically filtered out by their algorithms.

The "Hilltop" algorithm is all about algorithmically distinguishing between "link farms" and "directory resources". It's patented (that is, publicly disclosed), and conceptually it's not all that complex. But even the original PageRank algorithm weeded out link farms unsupported by mutual interlinks with other link farms. (Later improvements, having (I suspect) to do with almost-partitionable matrix computations, caught many mutually-supporting link farms.)

(3) "Importance" is only another way of saying "lots of inbound links from pages that also have lots of inbound links..." The all-important element of human judgment comes from the humans that added those links. That is all Google is taking into account in page rank. It is all linear algebra, and conceptually fairly simple. What really makes the difference in what Google has done is that they are doing computations efficiently on extremely large, extremely sparse matrices.

And without a feel for the normal things that can be done with matrices, sure, Google will look like hand-tweaking. But they say it's not -- and the fact that they're hiring mathematicians, not website reviewers, suggests they're telling the truth.
hutcheson (Meta) Posted May 30, 2006

I should add, that PageRank isn't about sites at all. It's about, DOH! pages. Some pages on a site may rank well, others may trigger the "bad neighborhood" death (which is one of Google's ways of spotting link farms.) I think at least one ODP category is (for good reasons) considered part of a bad neighborhood -- another reason not to believe that Google hand-selects "authority" sites.
charlesleo Posted May 30, 2006

It's all speculation on all of our ends, with best guesses at most. I do believe you when you say Google's methods are mostly calculated by algebra - keywords, keyword density, website age, proper markup, naming conventions, inbound and outbound links. I do believe they manually let some very popular sites through that would otherwise be red-flagged, just as they red-flag some sites for inappropriate content.

When I say 'quality', I am referring to site content which is related to the topic at hand and how this interconnects to and from other related websites. 'Quality' in this sense is also determined by how many related sites link back to a particular website.

Being listed here does make a difference - some of you will argue small and others will argue large. I'm not saying it's any responsibility of the editors here by any stretch of the imagination.

The only true test that would give any of us answers is to set up a control group - a couple of 'unknown' sites (sites not submitted to search engines, and assuming they are not listed through backlinks or by other websites) that are about to be listed in the ODP - and monitor how they fare ranking-wise over a couple of months for common keywords pertaining to those sites.
hutcheson (Meta) Posted May 30, 2006

Here's what I'd put down as the one thing that is certain. The search engines keep changing their algorithms. The way they use ODP data, their choice of which ODP data they use, is different this year than it was last year, which was different than the year before that. At Google alone, in the visible part alone, the "link to ODP category" disappeared, and snippets appeared. The old Hotbot and AOL search engines used ODP data in yet other ways. Next year it'll be something different, perhaps some other search engine.

And because of this, it is the absolute responsibility of the ODP editors NOT EVER to take current search engine use into consideration. ODP editors would do everyone a disservice, and would betray the trust given them, if they started worrying about what, why, and how the search engines use the ODP data today.

We're in this for the long run: what directories do best is large stable sites with continuing value. If we and the search engines both do what we do well, the search engines will find uses for our work that we never would have imagined. And that would be the best possible result.
jeanmanco (Author) Posted May 30, 2006

>I think you're referring to sites that had too many outbound links - ones that would be considered 'link farming.' Google does not consider the ODP a 'link farm', otherwise they wouldn't recommend submitting a site to the ODP.

Let's go over this again. The idea of TrustRank is to start with a group of sites which can be trusted. This is the seed set. A link from a trusted site confers 'trust'. The number of links from 'trust seeds' = TrustRank.

>In the Stanford White Paper on Trust Rank the authors selected seeds via inverse PageRank (i.e. sites with many outbound links).

The authors wanted their seed set of sites to have lots of outbound links. Otherwise TrustRank would be very limited indeed! So they used inverse PageRank to find an initial set of 25,000 sites. That set would certainly include Yahoo! Directory and the ODP, which have millions of outbound links. Unfortunately it also included a huge number of ODP clones, which raised the spam factor in the set.

>To get rid of the spam, they removed from their initial set of 25,000 sites all those not listed in any of the major directories, cutting the pool down to 7,900.

So they easily weeded out a lot of the spammier ODP clones, other spam directories and link farms. At this stage the Open Directory itself and Yahoo! Directory were presumably still in the set. But what comes next?

>Then they manually evaluated the top 1,250 sites and selected 178 to be used as good seeds. They selected only sites with a clearly identifiable authority (such as a government or educational institution or company) that controlled the contents of the site.

So that excludes the ODP, since it is not controlled by such an authority. Of course we have Guidelines. But that is not the same thing.

Naturally Google will not reveal the list of 178 seed sites. In fact they won't even say whether they are using TrustRank. So that has left the way open for speculation. But no convincing evidence has been put forward that the ODP is one of a set of "trusted seed sites" used by Google in its algorithm.
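How trust flows outward from a seed set can be sketched in a few lines. "Number of links from trust seeds" is a convenient shorthand; as far as I understand the published TrustRank description, trust is propagated much like PageRank, except that the base share goes only to the seeds and weakens with each hop. The sketch below follows that reading. The page names, the damping value of 0.85 and the choice to let trust simply leak away at pages with no outbound links are assumptions for illustration, and nothing here is known to be what any search engine actually runs.

```python
def trustrank(links, seeds, damping=0.85, iterations=50):
    """Propagate trust from a non-empty seed set along outbound links (illustrative sketch)."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    # Only seed pages receive the base share; every other page starts with zero trust.
    seed_weight = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_weight)
    for _ in range(iterations):
        new_trust = {p: (1.0 - damping) * seed_weight[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            # Trust is split among a page's outbound links and dampened at each hop.
            for target in targets:
                new_trust[target] += damping * trust[page] / len(targets)
        trust = new_trust
    return trust


# Hypothetical graph: one seed, a chain of ordinary sites, and an isolated link farm.
example_links = {
    "seed_university": ["listed_site"],
    "listed_site": ["partner"],
    "link_farm_a": ["link_farm_b"],
    "link_farm_b": ["link_farm_a"],
}
trust = trustrank(example_links, seeds={"seed_university"})
for page in sorted(trust, key=trust.get, reverse=True):
    print(page, round(trust[page], 4))
```

In this toy graph the two link-farm pages end with zero trust because no path from the seed reaches them, however heavily they link to each other. Whether a large directory would be chosen as a seed at all is exactly the question the posts above are arguing about; the sketch takes no position on it.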
jamesensor Posted May 30, 2006

I would be interested to know how you became so prominent in a world institution from a UK base. A great achievement.

I have been trying to register as a potential editor but your return email has been full for 3 hours and AOL is just sending rejects.
jeanmanco (Author) Posted May 30, 2006

James - The ODP servers are in California, but the Open Directory has editors from all over the world and lists sites in many languages. If you are asking how I personally became prominent, I wouldn't call myself that. However we do have a Briton or two in the upper echelons of the Open Directory.
nea (Meta) Posted May 30, 2006

>I have been trying to register as a potential editor but your return email is full for 3 hours and aol is just sending rejects.

James, do you mean that you can't send in the editor application form? There have been some technical problems recently with the applications -- please see the thread at the top of this forum titled "Recurring problems with applications".

Curlie Meta and kMeta editor nea