On Thu, 2007-11-08 at 08:35 +0000, Pedro Melo wrote:
If you are using MySQL, check out Sphinx also.
Can definitely recommend Sphinx for performance on large volumes of
data. Our searches typically took 10 seconds, sometimes much longer,
using MySQL fulltext indices. With Sphinx they went sub-second, and
ranking was better than MySQL's. I wrote the Sphinx::Search Perl
interface; I thought I might write a Catalyst model for it one day, but
haven't had the driving need.
MySQL fulltext can be dangerous on certain types of data because, in
its default natural-language mode, it automatically dismisses
frequently occurring terms - so e.g. if you search on a word that
appears in half or more of your records, you can get zero results!
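
To make the failure mode concrete, here is a toy model of that cutoff in
Python. It is only a sketch: the function name, the sample documents and
the whitespace tokenising are all made up for illustration, and treating
"at least half" as the boundary is an assumption. It is not how MySQL
implements fulltext search internally.

```python
def naive_fulltext_search(term, documents, threshold=0.5):
    """Toy model of a frequency cutoff (hypothetical helper, not MySQL code).

    Returns the indices of documents containing `term`, unless the term
    occurs in at least `threshold` of all documents, in which case it is
    treated as too common and nothing is returned.
    """
    term = term.lower()
    # Whole-word match on whitespace-separated tokens (an assumption).
    matches = [i for i, doc in enumerate(documents)
               if term in doc.lower().split()]
    if documents and len(matches) / len(documents) >= threshold:
        return []  # term is "too common": silently dismissed
    return matches


# Made-up sample data for illustration only.
docs = [
    "catalyst model for search",
    "catalyst controller basics",
    "catalyst view templates",
    "sphinx search daemon",
]

print(naive_fulltext_search("sphinx", docs))    # rare term
print(naive_fulltext_search("catalyst", docs))  # present in 3 of 4 docs
```

The rare term comes back with its one matching document, while the term
that is in most of the documents comes back empty, which is exactly the
surprise you get on a data set dominated by one subject.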
Have seen Xapian in action, producing slow results of poor relevance in
these particular cases. That's not to say it needs to be like that -
Xapian is a highly configurable and fairly complicated beast, so chances
are it wasn't running optimally. In general, whatever product you use,
getting search relevance right for your specific data set can be a