Hi, I'm building a system where I want to show only results indexed in the
past few days. Furthermore, I don't want to maintain a giant index with
millions of documents if I only want to return results from a couple of days
(thousands of documents).
My system heavily relies that the occurrences of terms in documents stored
in the index have a realistic distribution (consequently: realistic IDF).
That said, I would like to use a small index to return results, but I want
to compute documents score using a IDF from a much greater Index (or even an
external source).
The Similarity API doesn't seem to allow me to do this. The *idf* method
does not receive as parameter the term being used.
Another possibility is to use TrieRangeQuery to make sure the documents
shown are within the last couple of days. Again, I rather not mantain a
large index. Also this kind of query is not cheap.
Am I missing something?
Thanks
Felipe Hummel