How Much Information?
A fascinating study (from 2003) on the amount of information stored in, and flowing through, different media (broadcasting, telephony, the Internet). A must-read for anyone serious about informatics. We're talking exabytes (one million terabytes) and zettabytes (1,000 exabytes).
According to this study, 5 exabytes were stored digitally in 2002, and 18 exabytes flowed through electronic channels. The study estimates that the amount of information doubles every 2-3 years.
However, another study reports more explosive growth based on more recent numbers: 280 exabytes stored in 2007, with a yearly growth factor that is itself increasing! This is what has been called the "Exaflood". It is worth mentioning that this second study comes from a data storage company.
Moreover, a study on the infrastructure needed to cope with the Internet's storage and bandwidth demands uses the above results to predict an Internet singularity in the near future, caused by capacity not growing fast enough: Moore's law is being dampened by other factors.
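To see how far apart the two growth claims are, here is a rough back-of-the-envelope sketch in Python. It assumes, and this is a big assumption, that the 5 exabytes of 2002 and the 280 exabytes of 2007 measure comparable quantities, which the two studies may well not do. The function names are made up for illustration.

```python
import math

# Figures quoted in the text, in exabytes.
# Assumption: both numbers measure the same kind of "stored information".
STORED_2002_EB = 5.0
STORED_2007_EB = 280.0
YEARS = 2007 - 2002


def annual_growth_factor(start, end, years):
    """Constant yearly multiplier that takes `start` to `end` in `years` years."""
    return (end / start) ** (1.0 / years)


def doubling_time_years(factor):
    """Years needed to double at a constant yearly growth factor."""
    return math.log(2) / math.log(factor)


implied_factor = annual_growth_factor(STORED_2002_EB, STORED_2007_EB, YEARS)
print(f"implied yearly factor: {implied_factor:.2f}")
print(f"implied doubling time: {doubling_time_years(implied_factor):.2f} years")

# For comparison: the yearly factors behind "doubling every 2-3 years".
for t in (2, 3):
    print(f"doubling every {t} years -> yearly factor {2 ** (1 / t):.2f}")
```

Under that assumption, going from 5 to 280 exabytes in five years means multiplying by a bit more than 2 every year, i.e. doubling in under a year, against a yearly factor of roughly 1.3 to 1.4 for the 2003 estimate.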
The Wikipedia entry on Exabyte links to some further articles.
Indexing the information
Assuming all of this information is publicly available, it has to be indexed before it can be searched. As of 2008, Google reportedly processes 20 petabytes a day (0.020 exabytes/day) and has an index of 20 billion documents.
Now, how long would it take, using the current Google infrastructure, to index the whole digital universe, estimated at 280 exabytes? 280 / 0.020 = 14,000 days ~= 38 years! Even Google is not ready yet...
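The same estimate as a small Python sketch, for anyone who wants to plug in other numbers. It makes the same simplifying assumptions as the calculation above: the processing rate stays constant, processing a byte counts as indexing it, and the digital universe stops growing in the meantime. The function name is hypothetical.

```python
PB_PER_EB = 1000  # decimal units: 1 exabyte = 1,000 petabytes

# Figures quoted in the text.
google_pb_per_day = 20     # reported processing rate in 2008
digital_universe_eb = 280  # estimated size of the digital universe in 2007


def days_to_process(total_eb, rate_pb_per_day):
    """Days needed to go through `total_eb` exabytes at a constant daily rate."""
    rate_eb_per_day = rate_pb_per_day / PB_PER_EB
    return total_eb / rate_eb_per_day


days = days_to_process(digital_universe_eb, google_pb_per_day)
years = days / 365.25
print(f"{days:,.0f} days, about {years:.1f} years")
```

Since the digital universe keeps growing while the indexing runs, the real figure would be worse than this static estimate.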