The Web is a Trillion Pages Long: Google

Techtree News Staff, Jul 27, 2008 2001 hrs IST

The interweb comprises more than a trillion pages, according to Google's spiders, with several billion pages added each day!

Google counts a trillion pages on the web, a figure growing by several billion pages per day, the company said in a blog post. Strictly speaking, the interweb holds more than the trillion pages Google has found, and Google does not index all of those either: "We don't index every one of those trillion pages -- many of them are similar to each other, or represent auto-generated content..." Many of the pages are duplicates, with multiple URLs serving the same content.

The first Google index in 1998 had 26 million pages, and by 2000 the Google index reached the one billion mark. The blog further charts the nature of this task and the evolution of Google's own methods: "Back then, we did everything in batches: one workstation could compute the PageRank graph on 26 million pages in a couple of hours, and that set of pages would be used as Google's index for a fixed period of time. Today, Google downloads the web continuously, collecting updated page information and re-processing the entire web-link graph several times per day."

The blog post prompted Michael Arrington of TechCrunch to hint at something interesting coming next week. Noting Google's pride in having the most 'comprehensive index of any search engine', Arrington adds: "That may be true today, but it probably won't be true next week". A hint at a potential challenger to the search engine crown, if there ever was one.


USER COMMENTS

we are just a few steps away from a downfall trillion miles away. how many here online reads news, than reading some dumb email forward and re-re-re-re-re-re(a trillion re) fwd email appealing us to re-fwd else god will punish me? this news is trash as people are busy in religious institutions getting brain washed with beauty bath soap bar. delete this post, mail and blog or hide it in a non-degradable bio-container moisture proof can. else the idiots will again start to laugh on us hysterically and they will empower us with their powerful idiotism. now the time has come for technology to RIP. gujrat bomb blast dead victims may rest in peace for eternity.

by Regular Joe, Jo, chennai, on Jul 27, 2008 10:07 PM

huh!

by Justin, Australia, on Jul 28, 2008 05:54 AM

Err... what r u trying to say?

by Confused, BY his Post, on Aug 01, 2008 10:07 PM

Humm. All I can say is who cares. There are always going to be new and exciting things on the internet. I enjoy being on the internet.

by LWM, South Saint Paul, on Jul 27, 2008 09:11 PM

can some one tell me how to post a comment . thanks :)

by gsdf, asfd, on Jul 30, 2008 03:54 PM

The "interweb"? What kind of idiot wrote this article?

by hmmm, New York, on Jul 28, 2008 05:27 AM

There must be billions of pages which Google does not have access to because of passwords and pay-for-view entry. Does Google even attempt to GUESS how many pages there are to which it does not have access?

by Ed Parker, Spokane, on Jul 28, 2008 12:04 AM

@ed, nice one. how ling did the research take for this comment?

by Blah, s, on Jul 28, 2008 04:58 AM

because this link was through Google, we don't see that "potential challengers" name?????

by dave, st. paul mn, on Jul 27, 2008 09:23 PM

Viva Internet. and high speed porn

by Anonymous, South Saint Paul, on Jul 27, 2008 09:13 PM
