Analyzing Search Logs for Norvig.com

Now that I work for a search engine company, I thought I'd look at the logs for norvig.com (97-99, 00-01) to see what searches are coming my way, and from whom.

Feeling Lucky?

I'm particularly proud that, as of July 2, 2001, norvig.com (or the site for my textbook that I maintain at Berkeley) provided answers in the top 3 for the following Google queries:

#1 teach yourself programming             #2 Silk
#1 lisp python                            #3 AI
#1 powerpoint presentation                #3 design patterns
#1 dynamic design patterns                #3 Java Lisp
#1 artificial intelligence programming    #3 best way to learn programming
#1 AI programming                         #3 learn programming
#1 AI textbook
#1 AI book
#1 AI on the web
#1 Java questions

Query Terms and Phrases

The web log analysis tool Analog provides me with a nice report of search words for my site, including the following picture:

I wrote some scripts of my own to see which individual words and whole phrases were searched for most often:

Query Terms

 22024: powerpoint
 21000: java
 14903: gettysburg
 10590: programming
  8469: presentation
  7343: address
  5697: design
  5602: lisp
  5055: in
  4891: dynamic
  4739: patterns
  3643: of
  3454: artificial
  3395: intelligence
  3177: norvig
  3115: presentations
  2777: teach
  2661: and
  2445: yourself
  2380: to
  2207: battle
  2169: peter
  2126: python
  2062: the
  1940: pattern
  1907: questions
  1730: c
  1722: silk
  1720: scheme
  1580: code

Query Phrases

  7510: powerpoint
  3481: powerpoint presentation
  3229: gettysburg address
  3179: gettysburg
  1887: dynamic programming
  1829: powerpoint presentations
  1717: design patterns
  1220: presentation
   811: battle of gettysburg
   761: peter norvig
   556: gettysburg address powerpoint
   544: critical success factors
   499: java questions
   412: gettysburg powerpoint
   405: the gettysburg address
   320: artificial intelligence programming
   303: teach yourself
   299: learn programming
   280: powerpoint gettysburg address
   280: lisp
   273: cache
   253: norvig
   206: the battle of gettysburg
   195: artificial intelligence
   179: ai programming
   172: silk
   166: gettysburg battlefield
   162: presentations
   141: presentation powerpoint
   135: software license agreement
   134: civil war battle fields
   117: powerpoint gettysburg
   109: decision theory
   108: java
   103: paradigms
    97: java sizeof
    97: how to learn programming
    95: java lisp
    89: python lisp
    88: gettysburg battle field
    84: python
    81: free powerpoint presentations
    79: adaptive software
    76: silk scheme
    76: iaq
    75: powerpointpresentation
    75: peter
    75: gettysburg battle
    74: java substring
    72: gettysburg address power point
    70: gettysburg cemetery
    67: teach yourself java
    67: agent software
    65: paip
    63: eliza source code
    61: programming artificial intelligence
    60: teach yourself programming
    58: teach yourself c in 21 days
    54: teach yourself c
    52: index of photos
    51: lisp python
    50: gettysburg address and powerpoint
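
Counts like the ones above could be produced by a short script. What follows is only a sketch, not the scripts actually used for this page: it assumes the referrer URLs have already been pulled out of the server log, and the parameter names in QUERY_PARAMS and the function names are illustrative guesses (different search engines put the search text in different parameters). Note that a one-word query counts both as a term and as a phrase, which matches how "powerpoint" appears in both lists.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumed parameter names that carry the search text in a referrer URL.
# This list is a guess for illustration and is certainly incomplete.
QUERY_PARAMS = ("q", "p", "query", "search")


def extract_query(referrer):
    """Return the lower-cased search phrase from a referrer URL, or None."""
    params = parse_qs(urlsplit(referrer).query)
    for name in QUERY_PARAMS:
        if name in params:
            # Collapse runs of whitespace so "a  b" and "a b" count together.
            phrase = " ".join(params[name][0].lower().split())
            return phrase or None
    return None


def count_queries(referrers):
    """Count individual words and whole phrases across referrer URLs."""
    terms, phrases = Counter(), Counter()
    for referrer in referrers:
        phrase = extract_query(referrer)
        if phrase is None:
            continue
        phrases[phrase] += 1
        terms.update(phrase.split())
    return terms, phrases


def report(counter, n=30):
    """Format the n most common entries as 'count: item' lines."""
    return "\n".join(f"{count:6d}: {item}"
                     for item, count in counter.most_common(n))
```

Calling count_queries on the list of referrers and then report(terms) and report(phrases) gives two listings in the same layout as the tables above.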
 

Search Engines

I also looked at where the queries were coming from:

Referers
 40540: google.com
 27832: yahoo.com
  4195: altavista.com
  3052: msn.com
  1608: lycos.com
  1499: edwardtufte.com
  1021: goto.com
  1018: excite.com
  1003: alltheweb.com
   807: google.de
   619: netscape.com
   598: fortune.com
   580: plastic.com
   508: dogpile.com
   420: google.fr
   415: ask.com
   415: about.com
   351: aol.com
   350: javasourcecode.com
   346: mamma.com
   340: google.co.uk
   281: metacrawler.com
   278: askjeeves.com
   207: overture.com
   202: search.com
   173: northernlight.com
   144: go.com
   142: pemberley.com
   132: directhit.com
   128: ibm.com
   107: iwon.com
    87: mkp.com
    80: militaryhistoryonline.com
    80: google.ch
    77: msn.co.uk
    70: sun.com
    69: amazon.com
    68: gamers.com
    59: rediff.com
    55: lycos.de
    52: well.com
    51: google.it
    50: looksmart.com
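
A referrer tally like this one can be sketched in the same way. The code below is an illustration under stated assumptions, not the script behind the table: it counts every referrer (which is why non-search sites show up too), and it shortens host names with a rough heuristic. The SECOND_LEVEL set and the function names are hypothetical; a careful tool would consult the Public Suffix List instead of guessing.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumed, incomplete set of second-level labels used under country codes
# (as in google.co.uk). This is a heuristic, not a complete rule.
SECOND_LEVEL = {"co", "com", "ac", "org", "net"}


def site_of(referrer):
    """Reduce a referrer URL to a short site name such as 'google.co.uk'."""
    host = urlsplit(referrer).hostname
    if not host:
        return None  # e.g. the "-" that logs use for a missing referrer
    labels = host.lower().split(".")
    # Keep three labels for names like google.co.uk, otherwise two.
    if len(labels) >= 3 and len(labels[-1]) == 2 and labels[-2] in SECOND_LEVEL:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def count_sites(referrers):
    """Count referrers by short site name, skipping entries with no host."""
    sites = (site_of(referrer) for referrer in referrers)
    return Counter(site for site in sites if site)
```

With this heuristic, www.google.com and google.com are counted together, while google.de and google.co.uk stay separate, as they do in the table.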


Peter Norvig - 2 July, 2001