Hi all,

I am evaluating Lucene 1.9 for a search application. I am using
MultiFieldQueryParser for searching across fields and everything works fine.
However, we have a new requirement where certain fields need to be boosted
while searching. To complicate matters, users can specify fields while
searching (e.g. "title:hybrid content:vehicle").

I came across this enhancement request
(http://issues.apache.org/bugzilla/show_bug.cgi?id=32115), which appears to
address the boosting issue. Please see the attached sample code, which
illustrates the problem: NewMultiFieldQueryParser is the enhanced version of
MultiFieldQueryParser from the enhancement request, and I tested the code with
lucene-core-1.9.1.jar. For all simple searches (i.e. when I don't mention a
field while searching), boosting appears to work. The explain output follows:

Hits for "hybrid vehicle" were found in quotes by:
1. Big tax savings on hybrid vehicles
0.36948532 = sum of:
  0.35416904 = sum of:
    0.33885276 = weight(title:hybrid^8.0 in 0), product of:
      0.5510778 = queryWeight(title:hybrid^8.0), product of:
        8.0 = boost
        1.4054651 = idf(docFreq=1)
        0.049012046 = queryNorm
      0.614891 = fieldWeight(title:hybrid in 0), product of:
        1.0 = tf(termFreq(title:hybrid)=1)
        1.4054651 = idf(docFreq=1)
        0.4375 = fieldNorm(field=title, doc=0)
    0.015316265 = weight(content:hybrid^2.0 in 0), product of:
      0.09802409 = queryWeight(content:hybrid^2.0), product of:
        2.0 = boost
        1.0 = idf(docFreq=2)
        0.049012046 = queryNorm
      0.15625 = fieldWeight(content:hybrid in 0), product of:
        1.0 = tf(termFreq(content:hybrid)=1)
        1.0 = idf(docFreq=2)
        0.15625 = fieldNorm(field=content, doc=0)
  0.015316265 = weight(content:vehicle^2.0 in 0), product of:
    0.09802409 = queryWeight(content:vehicle^2.0), product of:
      2.0 = boost
      1.0 = idf(docFreq=2)
      0.049012046 = queryNorm
    0.15625 = fieldWeight(content:vehicle in 0), product of:
      1.0 = tf(termFreq(content:vehicle)=1)
      1.0 = idf(docFreq=2)
      0.15625 = fieldNorm(field=content, doc=0)

2. Honda Civic
0.006126506 = product of:
  0.012253012 = weight(content:vehicle^2.0 in 1), product of:
    0.09802409 = queryWeight(content:vehicle^2.0), product of:
      2.0 = boost
      1.0 = idf(docFreq=2)
      0.049012046 = queryNorm
    0.125 = fieldWeight(content:vehicle in 1), product of:
      1.0 = tf(termFreq(content:vehicle)=1)
      1.0 = idf(docFreq=2)
      0.125 = fieldNorm(field=content, doc=1)
  0.5 = coord(1/2)

3. Fuel blends: ethanol
0.017124103 = product of:
  0.034248207 = weight(content:hybrid^2.0 in 2), product of:
    0.09802409 = queryWeight(content:hybrid^2.0), product of:
      2.0 = boost
      1.0 = idf(docFreq=2)
      0.049012046 = queryNorm
    0.34938562 = fieldWeight(content:hybrid in 2), product of:
      2.236068 = tf(termFreq(content:hybrid)=5)
      1.0 = idf(docFreq=2)
      0.15625 = fieldNorm(field=content, doc=2)
  0.5 = coord(1/2)
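
To make the explain tree above easier to read, here is a small sketch that recomputes one leaf, weight(title:hybrid^8.0 in 0), from its parts. It uses the formulas of Lucene's DefaultSimilarity (idf = ln(numDocs / (docFreq + 1)) + 1 and tf = sqrt(termFreq)) and assumes the index holds three documents, which is inferred from idf(docFreq=2) being 1.0 and is not stated in the post. The class and method names are made up for illustration and are not part of Lucene.

```java
// Recomputes one leaf of the explain output with plain float arithmetic.
// ExplainArithmetic is a hypothetical helper, not a Lucene class.
final class ExplainArithmetic {

    // idf as defined by DefaultSimilarity: ln(numDocs / (docFreq + 1)) + 1
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // tf as defined by DefaultSimilarity: sqrt(termFreq)
    static float tf(int termFreq) {
        return (float) Math.sqrt(termFreq);
    }

    // queryWeight = boost * idf * queryNorm
    static float queryWeight(float boost, float idf, float queryNorm) {
        return boost * idf * queryNorm;
    }

    // fieldWeight = tf * idf * fieldNorm
    static float fieldWeight(float tf, float idf, float fieldNorm) {
        return tf * idf * fieldNorm;
    }

    public static void main(String[] args) {
        int numDocs = 3; // assumption: inferred from the idf values, not given in the post
        float idfTitle = idf(1, numDocs);

        // boost, queryNorm and fieldNorm are copied from the explain output
        float qw = queryWeight(8.0f, idfTitle, 0.049012046f);
        float fw = fieldWeight(tf(1), idfTitle, 0.4375f);

        System.out.println("idf         = " + idfTitle);
        System.out.println("queryWeight = " + qw);
        System.out.println("fieldWeight = " + fw);
        System.out.println("weight      = " + (qw * fw));
    }
}
```

The printed values should agree with the explain output up to float rounding. The point is that the boost enters the score only through queryWeight.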



However, when the search terms include a field (e.g. "title:hybrid
content:vehicle"), the boosts do not seem to be applied (note that no boost
factor appears in the explain output):

Hits for "title:hybrid content:vehicle" were found in quotes by:
1. Big tax savings on hybrid vehicles
0.59159887 = sum of:
  0.5010147 = weight(title:hybrid in 0), product of:
    0.81480247 = queryWeight(title:hybrid), product of:
      1.4054651 = idf(docFreq=1)
      0.5797387 = queryNorm
    0.614891 = fieldWeight(title:hybrid in 0), product of:
      1.0 = tf(termFreq(title:hybrid)=1)
      1.4054651 = idf(docFreq=1)
      0.4375 = fieldNorm(field=title, doc=0)
  0.09058417 = weight(content:vehicle in 0), product of:
    0.5797387 = queryWeight(content:vehicle), product of:
      1.0 = idf(docFreq=2)
      0.5797387 = queryNorm
    0.15625 = fieldWeight(content:vehicle in 0), product of:
      1.0 = tf(termFreq(content:vehicle)=1)
      1.0 = idf(docFreq=2)
      0.15625 = fieldNorm(field=content, doc=0)

2. Fuel blends: ethanol
0.036233667 = product of:
  0.072467335 = weight(content:vehicle in 1), product of:
    0.5797387 = queryWeight(content:vehicle), product of:
      1.0 = idf(docFreq=2)
      0.5797387 = queryNorm
    0.125 = fieldWeight(content:vehicle in 1), product of:
      1.0 = tf(termFreq(content:vehicle)=1)
      1.0 = idf(docFreq=2)
      0.125 = fieldNorm(field=content, doc=1)
  0.5 = coord(1/2)
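
As a cross-check on the two explain outputs, the following sketch backs out the boost that each queryWeight implies, using queryWeight = boost * idf * queryNorm. It also recomputes queryNorm for the fielded query as 1 / sqrt(sum of squared (boost * idf)), the DefaultSimilarity definition. All numbers are copied from the output above; the class and method names are hypothetical.

```java
// Backs out the effective boost from the numbers in the explain output.
// ImpliedBoost is a hypothetical helper, not a Lucene class.
final class ImpliedBoost {

    // queryWeight = boost * idf * queryNorm, solved for boost
    static float impliedBoost(float queryWeight, float idf, float queryNorm) {
        return queryWeight / (idf * queryNorm);
    }

    // queryNorm = 1 / sqrt(sum over terms of (boost * idf)^2)
    static float queryNorm(float... boostedIdfs) {
        double sumOfSquares = 0.0;
        for (float w : boostedIdfs) {
            sumOfSquares += (double) w * w;
        }
        return (float) (1.0 / Math.sqrt(sumOfSquares));
    }

    public static void main(String[] args) {
        // "hybrid vehicle": title:hybrid^8.0
        System.out.println("simple query, title:hybrid  -> boost "
                + impliedBoost(0.5510778f, 1.4054651f, 0.049012046f));

        // "title:hybrid content:vehicle": no boost line is printed
        System.out.println("fielded query, title:hybrid -> boost "
                + impliedBoost(0.81480247f, 1.4054651f, 0.5797387f));

        // queryNorm of the fielded query from two unboosted terms
        System.out.println("fielded queryNorm            -> "
                + queryNorm(1.4054651f, 1.0f));
    }
}
```

If the arithmetic holds, the simple query yields a boost of about 8.0 and the fielded query a boost of about 1.0, which matches the observation that the field boosts are not applied when the user names the field.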


Can you please suggest how to tackle the issue?

Thanks

Suman
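
One possible workaround, sketched here and not tested against Lucene: since the query syntax accepts field:term^boost, the query string could be rewritten before parsing so that simple fielded terms carry the per-field boost. The class name and the boost map are assumptions for illustration (8.0 and 2.0 are the boosts shown in the first explain output). The sketch handles only plain field:term pairs; phrases, ranges, wildcards, fuzzy terms and terms that already have a boost are left untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processing step: appends ^boost to simple field:term pairs.
final class FieldBoostRewriter {

    // A field name, a colon, and a plain word that is not followed by
    // '^', '*', '?' or '~' (so boosted, wildcard and fuzzy terms are skipped).
    private static final Pattern FIELDED_TERM =
            Pattern.compile("\\b(\\w+):(\\w+)\\b(?![\\^*?~])");

    static String addBoosts(String query, Map<String, Float> boosts) {
        Matcher m = FIELDED_TERM.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Float boost = boosts.get(m.group(1));
            // Leave the term unchanged if no boost is configured for its field.
            String replacement = (boost == null) ? m.group() : m.group() + "^" + boost;
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Float> boosts = new HashMap<String, Float>();
        boosts.put("title", 8.0f);   // assumed boost, taken from the explain output
        boosts.put("content", 2.0f); // assumed boost, taken from the explain output

        System.out.println(addBoosts("title:hybrid content:vehicle", boosts));
    }
}
```

The rewritten string would then be handed to the parser as usual. Unfielded terms pass through unchanged, so the multi-field expansion still applies to them.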

Posted by Suman Ghosh to the java-user list (lucene.apache.org), May 17, '06 at 6:08p