18/20

What have I missed?

  • Tons!

  • Stemming -- reducing "horses" and "horse" to common form.

  • Normalizing -- everything in lower case, or, strip macrons out for Maaori words.

  • Tokenizing -- removing the junk (HTML tags, etc.)

  • Weighting -- some fields (from lines) more important than others.

  • Handling special fields -- "date:today", for example.

  • Boolean queries.