18/20 |
Michael Robinson
|
Tons!
Stemming -- reducing "horses" and "horse" to common form.
Normalizing -- everything in lower case, or, strip macrons out for Maaori words.
Tokenizing -- removing the junk (HTML tags, etc.)
Weighting -- some fields (from lines) more important than others.
Handling special fields -- "date:today", for example.
Boolean queries.