FAQ
This question has been asked here many times, but the answer has never
been satisfactory. Can someone answer it once and for all, please?

When you get back a document as a search hit, how do you find out which field
in it matched? Just the position of the field would be sufficient!

-thanks

--
View this message in context: http://www.nabble.com/Finding-out-which-field-caused-the-search-hit-tf4412955.html#a12588494
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Mark harwood at Sep 10, 2007 at 9:15 am
    > the answer has never been satisfactory
    Is this the original question?
    http://www.nabble.com/Which-field-matched---tf4141549.html#a11780757


    What actually formed the basis of a document match is hidden in a tree of
    heterogeneous Query objects. To be efficient, their match output is
    limited to document ids and scores, not a detailed analysis of
    which sections of a document matched. It is therefore hard, if not impossible,
    to build a highlighting solution that provides answers for all query
    types. The existing Highlighter relies on a rough heuristic: QueryTermExtractor
    output is used to find the list of query terms, and the fields' TokenStreams
    are then analyzed for content containing those terms.
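
The heuristic described above can also be turned on the original question. The following is a minimal standalone sketch, not the Highlighter's actual code and not Lucene API: the class `FieldMatchSketch`, its `tokenize` helper (a crude stand-in for an analyzer) and the per-field term map are all hypothetical names introduced for illustration. It assumes the query terms have already been extracted per field, then re-tokenizes each stored field of a hit and reports the fields that contain at least one of those terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class FieldMatchSketch {

    // Hypothetical stand-in for an analyzer: lowercase, split on non-alphanumerics.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns the names of the fields whose tokens contain at least one
    // of the query terms registered for that field, in document order.
    public static List<String> matchingFields(Map<String, Set<String>> queryTermsByField,
                                              Map<String, String> document) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, String> field : document.entrySet()) {
            Set<String> terms =
                queryTermsByField.getOrDefault(field.getKey(), Collections.<String>emptySet());
            for (String token : tokenize(field.getValue())) {
                if (terms.contains(token)) {
                    matched.add(field.getKey());
                    break; // one matching token is enough for this field
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Assumed input: the term "fred" was queried against both fields.
        Map<String, Set<String>> queryTerms = new HashMap<>();
        queryTerms.put("title", new HashSet<>(Arrays.asList("fred")));
        queryTerms.put("body", new HashSet<>(Arrays.asList("fred")));

        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "Police report");
        doc.put("body", "Fred West was arrested");

        System.out.println(matchingFields(queryTerms, doc)); // prints [body]
    }
}
```

Like the Highlighter heuristic, this only checks for the presence of individual terms. It knows nothing about phrase, boolean or span semantics, so a field can be reported even though it did not contribute to the actual match.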

    You mention Highlighter performance would be bad for wildcard queries. Have you tried it? If it does turn out to be bad (many wildcard variants produced) might I suggest the following:

    1) Dissect the unrewritten Query and find all WildcardQuery objects.
    2) Create a custom analyzer that re-implements the wildcard logic and produces a highlighter-friendly token stream, i.e.
       given a query of
           Fred W*
       and data of
           Fred West was arrested
       the analyzer would produce:
           Fred [W*|West] was arrested
       where the tokens "W*" and "West" appear at the same position.
    3) Add a special wildcard term (W*) to the list of Query terms given to the Highlighter. This would then match the W* injected into the content in step 2.

    This would avoid the overhead of picking through all the wildcard variants produced by the WildcardQuery, but at the cost of extra coding on your part and the runtime cost of re-executing the wildcard logic on all terms in the selected documents' TokenStreams. The difference in runtime cost may prove minimal.
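
Step 2 of the suggestion above can be illustrated with a small standalone sketch. It is plain Java and deliberately does not use Lucene classes: `WildcardTokenSketch`, its `Token` type and the `analyze` method are hypothetical names, and the whitespace tokenization and case-sensitive matching are simplifying assumptions. In a real Lucene analyzer, "same position" would be expressed by giving the injected token a position increment of 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class WildcardTokenSketch {

    // A token and its position; tokens that share a position are alternatives.
    public static final class Token {
        public final String text;
        public final int position;

        Token(String text, int position) {
            this.text = text;
            this.position = position;
        }

        @Override
        public String toString() {
            return text + "@" + position;
        }
    }

    // Translates a wildcard pattern (* = any run of characters, ? = one character)
    // into a regular expression. Matching is case-sensitive in this sketch.
    public static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // Splits the text on whitespace and, for every word that matches a wildcard,
    // injects the wildcard itself as an extra token at the same position.
    public static List<Token> analyze(String text, List<String> wildcards) {
        Map<String, Pattern> compiled = new LinkedHashMap<>();
        for (String w : wildcards) {
            compiled.put(w, toRegex(w));
        }
        List<Token> out = new ArrayList<>();
        int position = 0;
        for (String word : text.split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            for (Map.Entry<String, Pattern> e : compiled.entrySet()) {
                if (e.getValue().matcher(word).matches()) {
                    out.add(new Token(e.getKey(), position));
                }
            }
            out.add(new Token(word, position));
            position++;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Fred West was arrested", Arrays.asList("W*")));
        // prints [Fred@0, W*@1, West@1, was@2, arrested@3]
    }
}
```

With the literal term "W*" present in the token stream, step 3 becomes an ordinary term comparison: a highlighter that is handed "W*" as one of its query terms matches it at position 1 without enumerating every expansion of the wildcard.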

    Cheers
    Mark

  • Erik Hatcher at Sep 10, 2007 at 11:58 am

    On Sep 10, 2007, at 4:12 AM, makkhar wrote:
    > This question has been asked numerous times here. But the answer has never
    > been satisfactory. Can someone answer it full and final, please ?
    >
    > If you get back a document as a search hit. How do you find out which field
    > in it matched ? Just the position of the field is sufficient !

    The short answer is that you can't. The longer answer: Lucene does match
    on a per-field basis, so it determines this sort of thing
    during the search operation, but it does not keep that information
    around with the Hit.

    Erik


