I'd like to make a very simple change to the idf computation, but I can't seem
to wrap my head around it.
There are very useful hints in the javadocs for "Changing Similarity" for
new tf() and lengthNorm() behavior, but things are a little blurrier for idf().
I'd like to use something beyond the global numDocs.
I'd like to have a modified idf() that gives me the inverse frequency in a
*subset* of the index (e.g. for a specific type of document). I have the
type stored in a field, and I'd need to count how many documents contain
that type for a given term. Since idf() takes numDocs as a parameter, could I
just change the class that calls idf() and pass the number I need? Which
class calls idf()? TermQuery? So should I make the changes there? Or in
Can anybody shed some light on this issue?
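
For illustration, here is a minimal sketch of the quantity described above, written outside Lucene entirely. It counts both numDocs and docFreq within one document type and feeds them to an idf of the form log(numDocs / (docFreq + 1)) + 1, which is assumed here to match the default Similarity. The names SubsetIdf, Doc, and subsetIdf are hypothetical and are not part of the Lucene API; the in-memory list only stands in for the index.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SubsetIdf {
    // Hypothetical stand-in for an indexed document: its type field and its terms.
    static class Doc {
        final String type;
        final Set<String> terms;

        Doc(String type, String... terms) {
            this.type = type;
            this.terms = new HashSet<>(Arrays.asList(terms));
        }
    }

    // Assumed to mirror the default idf formula: log(numDocs / (docFreq + 1)) + 1.
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // idf of a term, with both counts restricted to documents of the given type.
    static float subsetIdf(List<Doc> docs, String type, String term) {
        int numDocsOfType = 0;  // replaces the global numDocs
        int docFreqInType = 0;  // documents of this type that contain the term
        for (Doc d : docs) {
            if (!d.type.equals(type)) {
                continue;
            }
            numDocsOfType++;
            if (d.terms.contains(term)) {
                docFreqInType++;
            }
        }
        if (numDocsOfType == 0) {
            return 0f;  // arbitrary choice for an empty subset, to avoid log(0)
        }
        return idf(docFreqInType, numDocsOfType);
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc("article", "lucene", "index"),
            new Doc("article", "search"),
            new Doc("article", "index"),
            new Doc("article", "query"),
            new Doc("faq", "lucene"),
            new Doc("faq", "lucene", "query"));

        // Global idf for "lucene": docFreq = 3 out of numDocs = 6.
        System.out.println("global idf:  " + idf(3, docs.size()));
        // Subset idf for "lucene" among articles: docFreq = 1 out of numDocs = 4.
        System.out.println("article idf: " + subsetIdf(docs, "article", "lucene"));
    }
}
```

In a real index the two counts would have to come from the index itself (for example by counting documents that match both the type and the term), which is exactly the part that is unclear here.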
Thanks in advance,