SiteProNews has now published my two-part article based on the Webstock 2008 presentation by Google’s Senior Research Scientist, Dr. Craig Nevill-Manning.
In his presentation, Craig, who is New Zealand born and bred, explained how Google uses science to develop more precise search techniques. I found his talk absolutely riveting and typed frantically during the whole thing in my hurry to blog it.
Here are a couple of classic excerpts:
Google used to do a terrible job of defining terms. Craig noticed people were searching for “definition of…” or “what is a…”, so he wanted the search engine to provide better results for these searches. He found lots of web pages that contained glossaries and definitions, so he hacked up a Perl script to get the glossary formats.
The first recall results were only 50 percent accurate. He wanted to improve this rate, so he did some experiments with the data, but he could never reach an accuracy level he was happy with. Only later did he realize that most of the questions people actually needed answers to could be answered with his crappy little Perl script. He concluded that 100 percent accuracy is not important and that scale matters much more.
Craig says that once a week, a person at each data center takes a list of all the failed hard disks and walks around the data center with a pile of hard drives, replacing them one at a time. Velcro is Google’s secret weapon! All of Google’s hard disks are velcroed in, which allows super quick service and replacement. So curiously, there is no downside to hardware failures at Google, because they are expected and managed via scale.
Fascinating stuff!
I loved both articles. What drew me in was the resemblance of the photo to Matt Cutts. I wonder if all people look like Matt Cutts at the Googleplex. Maybe the book “A Wrinkle in Time” was based off the Googleplex. I wonder what the underlying end will be? So maybe I am getting too deep for my own good.
Great articles and I wish I had seen the presentation. I wonder if the presentation is on the net somewhere?
@ Chris - Glad you enjoyed them! I don’t think the presentation is on the net anywhere, as it was just Craig showing some slides as prompts and speaking off the cuff. I was typing as fast as he was talking, so the articles contain the presentation pretty much word for word.