Hi,

I would like to write a query composed of a BooleanQuery (several
clauses) and a SpanQuery (SpanNearQuery), where both are mandatory.
It sounds simple, but I need to work with the spans returned by this
query.

I know that I could use a Filter, but my goal is to get the spans from
the "combined" query: BooleanQuery + SpanQuery. Even if I filter my
BooleanQuery with the SpanQuery, the spans returned by
SpanQuery.getSpans(reader) are not "filtered" by the BooleanQuery.
Since getting spans from a SpanQuery does not require executing the
query, I understand this behaviour.

My current implementation first runs the BooleanQuery filtered by the
SpanQuery. I then get the spans from the SpanQuery and remove from them
all docs that are not in the score docs returned by the filtered
BooleanQuery. Is there a more efficient, simple or clever way to reach
the same goal?

Thank you very much in advance for your advice.

Best regards,

mercibe

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

  • Grant Ingersoll at Mar 23, 2010 at 1:54 pm

    On Mar 23, 2010, at 12:58 AM, Benoit Mercier wrote:

        [...] Is there a more efficient, simple or clever way to reach the same goal?

    If you are on 3.x:

    I think you could turn this around. Get a Filter from your BooleanQuery, get its DocIdSet, and then advance through the Spans and the DocIdSetIterator together, since both iterate forward only. For each span, check whether its doc is in the filter.
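
As an illustration of the forward-only walk described above (not code from the thread), here is a minimal, Lucene-free sketch. `SpanCursor` and `DocCursor` are hypothetical stand-ins defined only for this example; they loosely mirror the idea of a span enumerator and a doc id iterator, but they are not Lucene's `Spans` or `DocIdSetIterator`, and the array-backed implementations exist only to make the sketch runnable.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a forward-only span enumerator (not Lucene's Spans). */
interface SpanCursor {
    boolean next();              // move to the next span; false when exhausted
    boolean skipTo(int target);  // move to the first span at or after the current one with doc >= target
    int doc();
    int start();
    int end();
}

/** Hypothetical stand-in for a forward-only doc id iterator (not Lucene's DocIdSetIterator). */
interface DocCursor {
    int NO_MORE_DOCS = Integer.MAX_VALUE;
    int advance(int target);     // first remaining doc id >= target, or NO_MORE_DOCS
}

/** Array-backed spans for the demo; each row is {doc, start, end}, sorted by doc. */
final class ArraySpans implements SpanCursor {
    private final int[][] spans;
    private int pos = -1;
    ArraySpans(int[][] spans) { this.spans = spans; }
    public boolean next() { if (pos < spans.length) pos++; return pos < spans.length; }
    public boolean skipTo(int target) {
        if (pos < 0) pos = 0;
        while (pos < spans.length && spans[pos][0] < target) pos++;
        return pos < spans.length;
    }
    public int doc() { return spans[pos][0]; }
    public int start() { return spans[pos][1]; }
    public int end() { return spans[pos][2]; }
}

/** Array-backed doc ids for the demo; the array must be sorted ascending. */
final class ArrayDocs implements DocCursor {
    private final int[] docs;
    private int pos = 0;
    ArrayDocs(int[] docs) { this.docs = docs; }
    public int advance(int target) {
        while (pos < docs.length && docs[pos] < target) pos++;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }
}

final class SpanDocIntersection {
    /** Returns {doc, start, end} for every span whose doc is also produced by the doc cursor. */
    static List<int[]> spansInFilter(SpanCursor spans, DocCursor docs) {
        List<int[]> kept = new ArrayList<>();
        boolean more = spans.next();
        int filterDoc = -1;
        while (more) {
            int spanDoc = spans.doc();
            if (filterDoc < spanDoc) {
                // The filter side is behind: move it up to the span's doc.
                filterDoc = docs.advance(spanDoc);
                if (filterDoc == DocCursor.NO_MORE_DOCS) break;
            }
            if (filterDoc == spanDoc) {
                kept.add(new int[] {spanDoc, spans.start(), spans.end()});
                more = spans.next();             // there may be more spans in the same doc
            } else {
                more = spans.skipTo(filterDoc);  // the span side is behind: jump ahead
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int[][] spans = {{1, 0, 2}, {1, 5, 7}, {3, 1, 4}, {4, 0, 1}, {7, 2, 3}, {9, 0, 5}};
        int[] allowedDocs = {1, 4, 8, 9};
        for (int[] s : spansInFilter(new ArraySpans(spans), new ArrayDocs(allowedDocs))) {
            System.out.println("doc=" + s[0] + " start=" + s[1] + " end=" + s[2]);
        }
    }
}
```

Neither cursor ever moves backwards, so the cost stays proportional to the number of spans and filter docs visited.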

    In 2.x, I think you can get the BitSet from the filter and then directly look up whether the current span's doc is in the bit set.
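
The bit set lookup can be sketched in the same spirit, again without Lucene on the classpath. The example assumes the filter has already been turned into a `java.util.BitSet` with one bit per matching doc id; how that bit set is obtained is left out, and the class name `BitSetSpanFilterSketch` and the `{doc, start, end}` array layout are hypothetical choices made for this illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class BitSetSpanFilterSketch {
    /** Keeps only the spans whose doc id has its bit set. Each span is {doc, start, end}. */
    static List<int[]> keepAllowed(int[][] spans, BitSet allowedDocs) {
        List<int[]> kept = new ArrayList<>();
        for (int[] span : spans) {
            if (allowedDocs.get(span[0])) {  // direct lookup per span
                kept.add(span);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        BitSet allowedDocs = new BitSet();
        allowedDocs.set(1);
        allowedDocs.set(4);
        int[][] spans = {{1, 0, 2}, {3, 1, 4}, {4, 0, 1}};
        for (int[] s : keepAllowed(spans, allowedDocs)) {
            System.out.println("doc=" + s[0] + " start=" + s[1] + " end=" + s[2]);
        }
    }
}
```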

    In either case, I don't think this will be much of a performance hit, since it is all forward-only iteration.

    -Grant
  • Benoit Mercier at Mar 23, 2010 at 3:35 pm
    Thank you, Grant. I will try the approach you suggested. It confirms
    that I wasn't too far off ;-)
    mercibe

