The Web is a Trillion Pages Long: Google

    Techtree News Staff, Jul 27, 2008 2001 hrs IST

    The interweb comprises more than a trillion pages, according to Google's spiders, with several billion pages added each day.

The web is a trillion pages long to Google, and growing by several billion pages per day, the company said in a blog post. The trillion is a count of what Google's crawlers have found, not of what the company indexes: "We don't index every one of those trillion pages -- many of them are similar to each other, or represent auto-generated content..." Many of the pages are duplicates, with multiple URLs serving the same content.

The first Google index in 1998 had 26 million pages, and by 2000 the index had reached the one billion mark. The blog post goes on to chart the nature of the task and the evolution of Google's own methods: "Back then, we did everything in batches: one workstation could compute the PageRank graph on 26 million pages in a couple of hours, and that set of pages would be used as Google's index for a fixed period of time. Today, Google downloads the web continuously, collecting updated page information and re-processing the entire web-link graph several times per day."

The blog post prompted Michael Arrington of TechCrunch to hint that something interesting is coming next week. Quoting Google's claim to have the most 'comprehensive index of any search engine', Arrington adds: "That may be true today, but it probably won't be true next week". A hint at a potential challenger to the search engine crown, if ever there was one.
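
For readers wondering what "computing the PageRank graph" involves, here is a minimal sketch of the textbook iterative method on a four-page toy web. It is an illustration only, not Google's code: the function name `pagerank`, the `toy_web` graph, the damping factor of 0.85 and the fixed 50 iterations are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict of page -> score for a small link graph.

    `links` maps each page to the list of pages it links to.
    All names and default values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the score spread evenly across all pages.
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a baseline share from the "random jump" term.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score among the pages it links to.
                share = damping * ranks[page] / len(targets)
                for target in targets:
                    new_ranks[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * ranks[page] / n
                for other in pages:
                    new_ranks[other] += share
        ranks = new_ranks
    return ranks


# Hypothetical four-page web: "c" is linked to by every other page.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy graph the page that everyone links to ends up with the highest score, and the page nobody links to ends up with the lowest. Each pass touches every link once, which gives a rough sense of why the size of the link graph determines how long a full recomputation takes.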

Discussion Board
Regular Joe, Jo
,chennai, on Jul 27, 2008 10:07 PM
We are just a few steps away from a downfall a trillion miles away. How many people here online read news, rather than some dumb email forward, the re-re-re-re-re-re (a trillion re) fwd email appealing to us to re-forward it or else God will punish us? This news is trash, as people are busy in religious institutions getting brainwashed with beauty bath soap bars. Delete this post, mail and blog, or hide it in a non-degradable, moisture-proof bio-container can. Else the idiots will again start to laugh at us hysterically and empower us with their powerful idiotism. Now the time has come for technology to RIP. May the victims of the Gujarat bomb blasts rest in peace for eternity.
Justin
,Australia, on Jul 28, 2008 05:54 AM
huh!
Confused
,BY his Post, on Aug 01, 2008 10:07 PM
Err... what r u trying to say?
LWM
,South Saint Paul, on Jul 27, 2008 09:11 PM
Humm. All I can say is who cares. There are always going to be new and exciting things on the internet. I enjoy being on the internet.
gsdf
,asfd, on Jul 30, 2008 03:54 PM
Can someone tell me how to post a comment? Thanks :)
hmmm
,New York, on Jul 28, 2008 05:27 AM
The "interweb"? What kind of idiot wrote this article?
Ed Parker
,Spokane, on Jul 28, 2008 12:04 AM
There must be billions of pages which Google does not have access to because of passwords and pay-for-view entry. Does Google even attempt to GUESS how many pages there are to which it does not have access?
Blah
,s, on Jul 28, 2008 04:58 AM
@ed, nice one. How long did the research take for this comment?
dave
,st. paul mn, on Jul 27, 2008 09:23 PM
Because this link came through Google, we don't see that "potential challenger's" name?
Anonymous
,South Saint Paul, on Jul 27, 2008 09:13 PM
Viva Internet. and high speed porn