Saturday, June 24, 2006

Googleplexus

Next week I'm headed down to the Valley on another Robot Genius business trip. Among the meetings with engineers, bankers and VCs is a visit to the Googleplex in Mountain View. I'll let you know how the wasabi-seared ahi is, and whether I managed to get a free massage ;-)

Seriously though, I'm one of Google's biggest fans, and I'm fascinated by the technical aspects of what they're doing. This extends well beyond search, to things like AI and natural language (they are in a good position to use statistical methods on the large corpus of text they have on hand), massive scaling of computing systems (they have over 500k servers deployed) and even how they hire and manage highly talented technologists. I would consider taking my next sabbatical there! (OK, also in the running are the Santa Fe Institute and possibly KITP Beijing.)

Here's an interesting example of the kind of problem Google has to deal with in its core search business. A Russian hacker managed to create billions of fake Web pages using scraped content. He registered numerous domains and hosted the pages on servers in Russia. Through clever linking he got Google to index his fake pages, and made on the order of $100k in AdSense revenue before being discovered. This type of search engine manipulation is an ongoing problem which, along with click fraud, threatens Google's core business.

Below is a hint about something we're working on in the Robot Genius labs. Why shouldn't search engines characterize the executables found on Web sites, not just the text or images?

We're also crawling the web in order to find distributors of spyware, and can automatically identify packages installed by bad guys as soon as they are created on the system.
