Guest Posted June 13, 2002 Just curious whether anyone knows if editor bookmarks are included in the 'dumps', i.e. will Google and other engines consider them when assigning PageRank? I understand this question would probably be better answered by the individual engines, but I’m looking for general acknowledgement one way or the other.
Meta yklaw Posted June 13, 2002 They are included in the RDF, but different directories treat them differently. I don't think Google considers them, but some may. However, I'm not sure on the latter point.
Meta hutcheson Posted June 13, 2002 Not included in dumps. There were several reasons for this: honest editors were creating lists of spam sites for their own reference and didn't want them published; dishonest editors were creating lists of spam sites for their own remuneration and DID want them published. In any case, the ODP dump so loved by Googlebot (together with the Yahoo equivalent, forming the core of the seed of its web traversal) does not include Bookmarks. Bookmarks are publicly visible at dmoz.org (so they might have PageRank, and you could pass PageRank to them by linking to them, but you could do the same to any publicly visible page anywhere) but are NOT part of the Open Directory itself. dmoz.org itself has, so far as I know, no particular privileged position among search engine spiders. They do visit its pages, including submitter guidelines, editor profiles and bookmarks. I could, for instance, pay Inktomi to visit my bookmarks daily; and Googlebot already spiders them in one way or another, including "mirrors" at some "ODP licensee" sites that merely use hidden redirection to fetch pages from dmoz.org. You can see how this works by looking at the bookmarks page for some editor and searching google.com for some characteristic text from it.
Meta hutcheson Posted June 13, 2002 (After reading yklaw's post:) That used to be correct, but I think I'll stand by my story for CURRENT RDF dumps. If I were you, though, I'd not bet large sums of money on it before getting a third opinion.
Guest Posted June 13, 2002 According to the dmoz robots.txt file, the Bookmarks/ directories do allow crawling ...
    Disallow: /cgi-bin/
    Disallow: /editors/
One of my colleagues just discovered this after seeing numerous instances of suspicious linking within editor bookmarks. So, assuming Googlebot does spider those links, they will most likely receive a PageRank boost. Hmmmmm :)
apeuro Posted June 13, 2002 Editors are allowed to do whatever they want in their Bookmarks, which are considered a sandbox for editors to play in. This includes all sorts of stuff that would never be allowed in the regular directory.
Guest Posted June 14, 2002 apeuro wrote: "Editors are allowed to do whatever they want in their Bookmarks ... This includes all sorts of stuff that would never be allowed in the regular directory." Thanks for confirming the obvious. ;)
Guest Posted June 14, 2002 To clarify: Bookmarks are NOT included in the RDF dump. Bookmarks are NOT blocked from spiders. You can therefore find Bookmarks categories and/or sites listed in Bookmarks in the SERPs of search engines whose bots crawl these pages.
Guest Posted June 14, 2002 Yes, that's what I was implying ;) I have found bookmarked sites in SERPs ...
Guest Posted June 14, 2002 Uhm... so did I answer your question (or confirm your assumptions)? If not, I have to admit that I didn't understand the question (or the original assumptions) :)
Guest Posted June 14, 2002 Well, I will admit it was a loaded question. I knew prior to asking that the Bookmarks could be crawled, so I guess you could say that you confirmed my assumptions to a point. As for those assumptions, I am not allowed to post them in this forum as per my member agreement, but I’m sure you can figure them out, particularly if you read the post made by apeuro …
Guest Posted June 14, 2002 I don't think bookmarks are included in the dumps, which are big enough as it is. As for PageRank, why not complain to Google about their algorithms? We don't have control over what is done with any ODP data, you know.
sabre23t Posted September 14, 2002 Mmmm ... with reference to kctipton's announcement on 9 Aug 2002 of bookmarks not being crawled, it looks like http://dmoz.org/robots.txt is now back to ...
code:
    # Please do not crawl us faster than 1 hit/second
    #
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /editors/
... as pointed out in this thread on another forum posted 6 Sep 2002. I do remember seeing it disallowing crawling of /Bookmarks/ for a while.
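For anyone who wants to check rules like the ones quoted above mechanically, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is simply the text quoted in this thread, and the helper name `is_allowed` and the default agent string are illustrative assumptions. It only shows what a well-behaved crawler would be permitted to fetch under these rules, not what any particular spider actually does.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content as quoted in this thread (September 2002).
ROBOTS_TXT = """\
# Please do not crawl us faster than 1 hit/second
#
User-agent: *
Disallow: /cgi-bin/
Disallow: /editors/
"""


def is_allowed(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/Bookmarks/S/sabre23t/", "/editors/", "/cgi-bin/"):
        print(path, is_allowed(ROBOTS_TXT, path))
```

Under these rules nothing excludes /Bookmarks/, so a crawler that honours robots.txt is free to fetch those pages; whether a given engine then indexes them is a separate question.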
Meta theseeker Posted September 14, 2002 Search for dmoz.org/Bookmarks in Google. You won't find any. Apparently, they have never been spidered by Google, and therefore cannot affect PageRank. Bookmarks on sites that use a program to scrape the data from dmoz are spidered, but most of those have a low PageRank anyway. For the most part, I see this as a non-issue.
sabre23t Posted September 15, 2002 Hi theSeeker. Google Toolbar on my IE5.5 is showing ...
    http://dmoz.org/Bookmarks/S/sabre23t/ - PageRank 6
    http://www.resource-zone.com/showprofile.php?Cat=&User=sabre23t - PageRank 7
    http://www.webmasterworld.com/viewprofile.cgi?action=view&member=sabre23t - PageRank Nil
    http://www.searchengineforums.com/bin/ubbmisc.cgi?action=getbio&UserName=sabre23t - PageRank Nil
Also ...
    http://dmoz.org/ - PageRank 10
    http://dmoz.org/Bookmarks/ - PageRank 8
    http://dmoz.org/Bookmarks/S/ - PageRank 7
    http://www.resource-zone.com/ - PageRank 8
    http://www.resource-zone.com/ubbthreads.php?Cat= - PageRank 7
    http://www.webmasterworld.com/ - PageRank 7
    http://www.webmasterworld.com/home.htm - PageRank 7
    http://www.searchengineforums.com/ - PageRank 7
... for what it's worth. ;)
Meta theseeker Posted September 15, 2002 >Google Toolbar on my IE5.5 is showing ... http://dmoz.org/Bookmarks/S/sabre23t/ - PageRank 6
Yes, and that's confusing everyone. Having a link on a page with a PageRank of 6 would be a good thing, hence the possibility of abusing bookmarks for the PageRank. However, Google never spiders the Bookmarks page itself (type "dmoz.org/Bookmarks/S/sabre23t" into Google and it won't find it), which means Google does not know that a site listed in Bookmarks is linked from it. Therefore, there is no real advantage to having your site listed in your bookmarks, as far as PageRank is concerned. As a matter of fact, I believe there might be a disadvantage to it. But I'm not an expert on the PageRank subject; there are far more knowledgeable people in the world in that sense. I believe, though, that having a link on a bunch of pages with very low PageRank can lower your PageRank. A site listed in bookmarks is likely to appear on all the sites that skim dmoz using a program, and these sites are likely to have low PageRank.
dstanovic Posted September 15, 2002 sabre23t, << http://dmoz.org/ - PageRank 10 http://dmoz.org/Bookmarks/ - PageRank 8 http://dmoz.org/Bookmarks/S/ - PageRank 7 >> This is quite normal. Since you have the Google Toolbar, click on http://dmoz.org/Bookmarks/ or http://dmoz.org/Bookmarks/S/. Now click on Page Info in the Google Toolbar and select "Cached Snapshot of the Page". You will see that Google does NOT cache these areas of dmoz. If these pages were cached, they would probably pick up an extra point of PR value on the toolbar. What you are seeing for PR is an "estimated" value Google assigns based on the PR value assigned to the root domain. You may notice that, many times, the deeper you go from the root domain, the more the PR drops, by one per level. Hence, http://dmoz.org/Bookmarks/S/ = PageRank 7 and one level below, http://dmoz.org/Bookmarks/S/sabre23t/ = PageRank 6. I hope this explains things :) Dave
Meta hutcheson Posted September 15, 2002 >I believe, though, that having a link on a bunch of pages with very low page rank can lower your page rank.
Based on the published algorithms, I believe that this cannot, mathematically speaking, be true. The usual caveat applies: numerical approximations can very occasionally do weird things with what seem like stable computations. But even so, there are lots worse things to worry about, IMO, than having a link from a low-ranked page.
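To make the point about the published algorithm concrete, here is a small sketch of the formula from Brin and Page's original paper, PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where T1..Tn are the pages linking to A and C(T) is the number of links going out of T. The five-page graph, the page names, and the fixed iteration count are all made up for illustration; this is not a claim about how Google's production ranking works.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over pages T linking to A.

    `links` maps each page to the list of pages it links to. Every page in this
    sketch is assumed to have at least one outgoing link.
    """
    ranks = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_ranks = {}
        for page in links:
            # Each linking page passes on its rank divided by its outlink count.
            inbound = sum(
                ranks[src] / len(outs)
                for src, outs in links.items()
                if page in outs
            )
            new_ranks[page] = (1 - d) + d * inbound
        ranks = new_ranks
    return ranks


# Hypothetical toy web. Nothing links to "low", so it sits at the minimum score.
BEFORE = {
    "home": ["a", "b"],
    "a": ["home", "target"],
    "b": ["home"],
    "target": ["home"],
    "low": ["home"],
}
# Same web, except the low-ranked page now also links to "target".
AFTER = dict(BEFORE, low=["home", "target"])

if __name__ == "__main__":
    before = pagerank(BEFORE)
    after = pagerank(AFTER)
    print("low page score:   ", round(before["low"], 4))
    print("target before link:", round(before["target"], 4))
    print("target after link: ", round(after["target"], 4))
```

In this toy graph the target's score rises once the low-ranked page links to it, because the formula sums contributions from linking pages and every contribution is non-negative. A real web graph is vastly larger, so treat this only as an illustration of the arithmetic.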
old_crone Posted September 15, 2002 This link will help to explain Google's PR methods, if you all are really interested. I think you will find it more complex than it appears, and what you see in the Google toolbar is not necessarily the same PR number that Google actually uses. Often the ranking shown for an internal page is only an estimate based on the main page's ranking, which may or may not be accurate. google page rank
Guest Posted September 16, 2002 As I pointed out back in June after discovering that several editors were abusing this privilege, bookmarks can be crawled if an editor wishes to point Googlebot in their direction! It's nice to see that kctipton took the time to inform us all that the robots.txt was changed. Now that it seems bookmarks are back off the robots.txt disallow list (I never confirmed that they were on it), I imagine it's a matter of timing! You know: put it up to please the crowd, take it back down when it's time to crawl! I reckon I could pass along some very disturbing bookmarks that I have discovered, but now is not the time or place to do so. But as a hint to the curious: adult links shouldn't be in the bookmarks of an editor who is editing children's categories! As I stated before, this is just another way for the editors to use the directory to their own advantage. They will do everything to conceal the truth, hide the evidence, and try to tell you otherwise, but if it looks like a duck, walks like a duck and quacks like a duck ... well, you know!
Guest Posted September 16, 2002 Only staff has control of what is in the robots.txt file, I hope you realize.
Guest wkallander Posted September 26, 2002 PageRank is cumulative, not an average, so there is definitely no disadvantage to being linked to from pages with low PR. In fact, quite the opposite. HTH, --Will