Hi,

I've been going nuts trying to get Lucene's QueryParser to parse
query strings correctly with the default operator set to AND:

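String queryString = getQueryString();
QueryParser parser = new QueryParser("text", new StandardAnalyzer());
// Combine terms that have no explicit operator with AND instead of OR
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
try {
    Query q = parser.parse(queryString);
    // Log the parsed form to compare it with the input string
    LOG.info("q: " + q.toString());
    /* [...] */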

Here are two example queries and the results I get with and
without the `setDefaultOperator()' statement:

Query: hose AND cat:Wohnen cat:Mode OR color:blau

- Default-Op OR: (+text:hose +cat:Wohnen) cat:Mode color:blau
- Default-Op AND: +(+text:hose +cat:Wohnen) cat:Mode color:blau

Query: hose AND ( cat:Wohnen cat:Mode ) OR color:blau

- Default-Op OR: (+text:hose +(cat:Wohnen cat:Mode)) color:blau
- Default-Op AND: (+text:hose +(+cat:Wohnen +cat:Mode)) color:blau

It seems like the parser handles the default case well, but what
I get with the default operator set to AND is completely
incorrect. I've seen this behaviour with both version 2.1.0 and
2.2.0.
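
A minimal sketch of why the default-AND output for the first query matters, in plain Java with no Lucene dependency. It relies on one Lucene rule: in a BooleanQuery that has at least one required (`+`) clause, optional clauses only influence the score and do not decide whether a document matches. The class name `BooleanClauseDemo`, its helpers, and the "intended" reading (whitespace meaning AND, and AND binding tighter than OR) are assumptions made for illustration; the message itself does not spell out the expected parse.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class BooleanClauseDemo {

    // Parsed form with default AND: +(+text:hose +cat:Wohnen) cat:Mode color:blau
    // One required clause exists, so the two optional clauses cannot
    // cause or prevent a match.
    static boolean parsedMatches(Set<String> doc) {
        return doc.contains("text:hose") && doc.contains("cat:Wohnen");
    }

    // Assumed intent: (hose AND cat:Wohnen AND cat:Mode) OR color:blau
    static boolean intendedMatches(Set<String> doc) {
        boolean left = doc.contains("text:hose")
                && doc.contains("cat:Wohnen")
                && doc.contains("cat:Mode");
        return left || doc.contains("color:blau");
    }

    // Hypothetical helper: a document is modelled as a set of field:term strings.
    static Set<String> doc(String... terms) {
        return new HashSet<>(Arrays.asList(terms));
    }

    public static void main(String[] args) {
        Set<String> onlyBlue = doc("color:blau");
        Set<String> hoseWohnen = doc("text:hose", "cat:Wohnen");
        // Prints "false vs true": the parsed query misses a document the intent would find.
        System.out.println(parsedMatches(onlyBlue) + " vs " + intendedMatches(onlyBlue));
        // Prints "true vs false": the parsed query finds a document the intent would reject.
        System.out.println(parsedMatches(hoseWohnen) + " vs " + intendedMatches(hoseWohnen));
    }
}
```

Under these assumptions the two forms disagree in both directions, which is consistent with the observation that the default-AND result is not what was asked for.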

Any hints?

Cheers,

Martin

--
----------- / http://herbert.the-little-red-haired-girl.org / -------------
=+=
I got it good, I got it bad. I got the sweetest sadness I ever had.
--- the The

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Daniel Naber at Oct 9, 2007 at 7:59 pm

    On Tuesday 09 October 2007 09:55, Martin Dietze wrote:

    > I've been going nuts trying to use LuceneParser parse query
    > strings using the default operator AND correctly:

    The operator precedence is known to be buggy. You need to use parentheses,
    e.g. (aa AND bb) OR (cc AND dd)
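
When the query strings come from end users, the parentheses could in principle be added by a small pre-processing step before the string reaches QueryParser. The following is only a sketch under stated assumptions: `PrecedenceParenthesizer` and `rewrite` are hypothetical names, the input is split on whitespace, only uppercase `AND`/`OR` and parentheses are recognised, and quoted phrases, `NOT`, `+`/`-` prefixes and malformed input (for example a trailing operator) are not handled. It is a plain recursive-descent pass in which AND binds tighter than OR, and a missing operator is read as the configured default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PrecedenceParenthesizer {
    private final List<String> tokens;
    private final boolean defaultAnd;
    private int pos = 0;

    private PrecedenceParenthesizer(String query, boolean defaultAnd) {
        // Pad parentheses with spaces so they become tokens of their own.
        String padded = query.replace("(", " ( ").replace(")", " ) ").trim();
        this.tokens = Arrays.asList(padded.split("\\s+"));
        this.defaultAnd = defaultAnd;
    }

    /** Returns the query with explicit operators and parentheses. */
    static String rewrite(String query, boolean defaultAnd) {
        return new PrecedenceParenthesizer(query, defaultAnd).parseOr();
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    // or := and ( ["OR"] and )*
    // A missing operator reaches this level only when the default is OR.
    private String parseOr() {
        List<String> parts = new ArrayList<>();
        parts.add(parseAnd());
        while (peek() != null && !peek().equals(")")) {
            if (peek().equals("OR")) pos++;
            parts.add(parseAnd());
        }
        return join(parts, " OR ");
    }

    // and := unit ( ["AND"] unit )*
    // A missing operator is consumed here only when the default is AND.
    private String parseAnd() {
        List<String> parts = new ArrayList<>();
        parts.add(parseUnit());
        while (peek() != null && !peek().equals(")") && !peek().equals("OR")) {
            if (peek().equals("AND")) pos++;
            else if (!defaultAnd) break;
            parts.add(parseUnit());
        }
        return join(parts, " AND ");
    }

    // unit := term | "(" or ")"
    private String parseUnit() {
        String token = tokens.get(pos++);
        if (!token.equals("(")) return token;
        String inner = parseOr();
        pos++; // skip the closing ")"
        return inner;
    }

    private static String join(List<String> parts, String op) {
        return parts.size() == 1 ? parts.get(0) : "(" + String.join(op, parts) + ")";
    }
}
```

For example, `rewrite("hose AND cat:Wohnen cat:Mode OR color:blau", true)` yields `((hose AND cat:Wohnen AND cat:Mode) OR color:blau)`, and the same call with `false` yields `((hose AND cat:Wohnen) OR cat:Mode OR color:blau)`. Whether the rewritten string then parses as intended would still have to be checked against the QueryParser version in use.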

    regards
    Daniel

    --
    http://www.danielnaber.de

  • Martin Dietze at Oct 10, 2007 at 12:21 pm

    On Tue, October 09, 2007, Daniel Naber wrote:

    > The operator precedence is known to be buggy. You need to use parentheses,
    > e.g. (aa AND bb) OR (cc AND dd)

    This would be fine with me but unfortunately not for my users.
    More precisely, I need to analyze a query string from one search
    engine, filter out a blacklist of facet queries and pass the
    result on to a second search engine. This means that I have no
    control over the way people enter their queries.

    Is there any known query parser which handles this correctly?

    Also, how does solr do this? It uses a parser derived from the
    Lucene QueryParser, and I found it produces the same output,
    however the search queries are still handled correctly, i.e. the
    results I get indicate that deep down inside it seems to get it
    right in the end.

    Cheers,

    Martin

    --
    ----------- / http://herbert.the-little-red-haired-girl.org / -------------
    =+=
    My name is spelled Luxury Yacht but it's pronounced Throatwabbler Mangrove.

  • Mark Miller at Oct 10, 2007 at 12:04 pm
    There is a lot on this topic if you search the archives.

    Things to check out:

    Precedence QueryParser (I think it's in the Lucene contrib packages - I don't
    believe it's perfect, but I have not tried it)

    Qsol: myhardshadow.com/qsol (a query parser I wrote that has fully
    customizable precedence support - don't be fooled by the stale
    website... I am actually working on version 2 as I have time)

    - Mark

  • Martin Dietze at Oct 10, 2007 at 12:27 pm
    Mark,

    this reply was just in time :)
    On Wed, October 10, 2007, Mark Miller wrote:

    > Precedence QueryParser (I think its in Lucene contrib packages - I don't
    > believe its perfect but I have not tried it)

    I checked that one out, and while it improves things with
    default settings I found it to exhibit the same incorrect
    behaviour with default operator AND.

    > Qsol: myhardshadow.com/qsol (A query parser I wrote that has fully
    > customizable precedence support - don't be fooled by the stale website...I
    > am actually working on version 2 as i have time)

    That sounds promising, I will check this out right now!

    Thank you!

    Martin

    --
    ----------- / http://herbert.the-little-red-haired-girl.org / -------------
    =+=
    Die Freiheit ist uns ein schoenes Weib.
    Sie hat einen Ober- und Unterleib.

  • Martin Dietze at Oct 10, 2007 at 2:45 pm
    Mark,
    On Wed, October 10, 2007, Martin Dietze wrote:

    >> Qsol: myhardshadow.com/qsol (A query parser I wrote that has fully
    >> customizable precedence support - don't be fooled by the stale website...I
    >> am actually working on version 2 as i have time)
    > That sounds promising, I will check this out right now!

    As far as I can judge from what I've tested so far, it seems
    like Qsol does handle operator precedence correctly for my
    test cases. However - excuse a possibly dumb question - how
    do I get my query out in a form accepted by Solr?

    Cheers,

    Martin

    --
    ----------- / http://herbert.the-little-red-haired-girl.org / -------------
    =+=
    I now declare this bizarre open!

  • Mark Miller at Oct 10, 2007 at 5:43 pm
    I have only taken passing glances at Solr, so I am afraid I cannot be of
    much help. Certainly one of the Solr guys will be able to be of
    assistance though.

    Since Qsol generates Query objects, you just need to find out how to
    bypass sending solr a query String and instead give it a Query object. I
    assume this must be possible.

    Back in the day you might have been able to call Query.toString() as the
    Query contract says that toString() should output valid QueryParser
    syntax. This does not work for many queries though (most notably Span
    Queries -- QueryParser knows nothing about Span queries).

    - Mark

  • Chris Hostetter at Oct 10, 2007 at 11:58 pm
    : I have only taken passing glances at Solr, so I am afraid I cannot be of much
    : help. Certainly one of the Solr guys will be able to be of assistance though.

    the StandardRequestHandler in solr will accept anything the lucene
    QueryParser will accept ... subclassing StandardRequestHandler to use the
    Qsol parser instead would be fairly easy (there are some open feature
    requests in Jira that will make it trivial, but they're still in flux)

    : Since Qsol generates Query objects, you just need to find out how to bypass
    : sending solr a query String and instead give it a Query object. I assume this
    : must be possible.

    Eh ... not really. it would be easier to just load the Qsol parser in
    solr ... or toString() the query...

    : Back in the day you might have been able to call Query.toString() as the Query
    : contract says that toString() should output valid QueryParser syntax. This

    Back in 1.4.3 it said "The representation used is one that is readable by
    QueryParser" but that wasn't really a "contract" as much as it was a
    statement about how the "core" queries behaved (hence the wording was
    changed) ... a contract would imply that *anyone* subclassing Query must
    obey the contract, and that would be an impossible contract for anyone but
    lucene committers to satisfy.


    -Hoss


  • Mark Miller at Oct 11, 2007 at 2:32 am
    As usual, thank you to the gruff but brilliant Mr Hostetter.

    - Mark

  • Chris Hostetter at Oct 11, 2007 at 2:41 am
    : As usual, thank you to the gruff but brilliant Mr Hostetter.

    Doh! ... sorry if i've been gruffer than usual ... i've been rotating my
    sleep schedule so my days start an hour earlier each day for the last 6
    days. it's been throwing my psyche for a loop.


    -Hoss


  • Martin Dietze at Oct 11, 2007 at 8:10 am

    On Wed, October 10, 2007, Chris Hostetter wrote:

    > Eh ... not really. it would be easier to just load the Qsol parser in
    > solr ... or toString() the query...

    This would be nice, but unfortunately I do not have direct access
    to the Solr server in my application. I need to parse queries,
    filter out blacklisted facets and then pass them on to Solr
    using solrj.

    Maybe I am missing out on something obvious, and there's an
    entirely simple way to accomplish this?

    Cheers,

    Martin

    --
    ----------- / http://herbert.the-little-red-haired-girl.org / -------------
    =+=
    Yoda of Borg I am. Assimilated you will be.

  • Chris Hostetter at Oct 12, 2007 at 12:34 am
    : This would be nice, but unfortunately I do not have direct access
    : to the solr server in my application. I need to parse queries,
    : filter out blacklisted facettes and then parse them on to solr
    : using solrj.

    that depends ... what do you mean by a blacklisted facet?

    facet counts are controlled by separate query params, not by the query string
    ... are you talking about preventing people from including field
    specific queries in their query string? i'm guessing that you mean
    something like this is okay...

    solr title:bobby body:boy

    ...but this isn't...

    solr title:bobby body:boy secret_field:xyzyq

    ...is that the idea?

    the easiest approach is to do your own simple pass over the query string,
    and escape any metacharacters in clauses you don't like ... they'll be
    treated as "terms" and either be ignored (if they are optional) or cause
    the query to not match anything (if they are required)...

    solr title:bobby body:boy secret_field\:xyzyq
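
The simple pass described above could look roughly like the following sketch. It is written in plain Java with no Lucene or Solr dependency; `FieldEscaper` and `escapeBlacklisted` are hypothetical names, and the code assumes clauses are separated by whitespace, so quoted phrases containing spaces and nested field syntax are not covered.

```java
import java.util.Set;

final class FieldEscaper {

    /** Escapes the ':' of every field:value clause whose field is blacklisted. */
    static String escapeBlacklisted(String query, Set<String> blacklist) {
        StringBuilder out = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            // Skip a leading '+', '-' or '(' so that "+secret_field:x" is recognised too.
            int start = 0;
            while (start < token.length() && "+-(".indexOf(token.charAt(start)) >= 0) {
                start++;
            }
            int colon = token.indexOf(':', start);
            if (colon > start && blacklist.contains(token.substring(start, colon))) {
                // Turn "field:value" into "field\:value" so it is parsed as a plain term.
                token = token.substring(0, colon) + "\\:" + token.substring(colon + 1);
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(token);
        }
        return out.toString();
    }
}
```

Called with the query `solr title:bobby body:boy secret_field:xyzyq` and a blacklist containing `secret_field`, this returns the escaped form shown above.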







    -Hoss


  • Martin Dietze at Oct 12, 2007 at 8:15 am
    Chris,
    On Thu, October 11, 2007, Chris Hostetter wrote:

    > ... are you talking about preventing people from including field
    > specific queries in their query string? i'm guessing that you mean
    > something like this is okay...
    >
    > solr title:bobby body:boy
    >
    > ...but this isn't...
    >
    > solr title:bobby body:boy secret_field:xyzyq
    >
    > ...is that the idea?

    Yes, that's just about it. We have two search engines for
    different purposes. The first one indexes more fields than the
    second and we want to prevent "good" search queries from failing
    on the second. Supporting all these fields on the second search engine
    is not a good idea since indexing all this additional data would
    have an impact on performance and index size.
    > the easiest approach is to do your own simple pass over the query string,
    > and escape any metacharacters in clauses you don't like ... they'll be
    > treated as "terms" and either be ignored (if they are optional) or cause
    > the query to not match anything (if they are required)...

    This is a very interesting idea. Yet I wonder how to deal with
    such terms if they are part of an AND query. AND is actually our
    default operator, so a query like "body:boy secret_field\:xyzyq"
    would always fail. It seems obvious that in any case you end up
    parsing the query in some way...
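
One way around the always-failing AND case would be to drop the blacklisted clause entirely instead of escaping it. The sketch below is an assumption-laden illustration, not something proposed in the thread: `BlacklistFilter` and `dropBlacklisted` are hypothetical names, the query is assumed to be flat (no parentheses or quoted phrases), and when a removed clause sat between two operators the first operator is kept, which is a heuristic rather than a correct treatment of precedence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class BlacklistFilter {

    private static boolean isOperator(String token) {
        return token.equals("AND") || token.equals("OR");
    }

    /** Removes field:value clauses for blacklisted fields and any operator left dangling. */
    static String dropBlacklisted(String query, Set<String> blacklist) {
        List<String> kept = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            int colon = token.indexOf(':');
            if (colon > 0 && blacklist.contains(token.substring(0, colon))) {
                continue; // drop the whole clause
            }
            boolean noLeftOperand = kept.isEmpty() || isOperator(kept.get(kept.size() - 1));
            if (isOperator(token) && noLeftOperand) {
                continue; // operator whose left-hand clause was removed
            }
            kept.add(token);
        }
        // Remove an operator whose right-hand clause was removed.
        while (!kept.isEmpty() && isOperator(kept.get(kept.size() - 1))) {
            kept.remove(kept.size() - 1);
        }
        return String.join(" ", kept);
    }
}
```

With a blacklist containing `secret_field`, both `body:boy secret_field:xyzyq` and `body:boy AND secret_field:xyzyq` reduce to `body:boy`. Queries with parentheses would need a real parse, as noted above.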

    Cheers,

    Martin

    --
    ----------- / http://herbert.the-little-red-haired-girl.org / -------------
    =+=
    My family says I'm a psychopath, but the voices in my head disagree

  • Martin Dietze at Oct 11, 2007 at 8:10 am

    On Wed, October 10, 2007, Mark Miller wrote:

    > Back in the day you might have been able to call Query.toString() as the
    > Query contract says that toString() should output valid QueryParser syntax.
    > This does not work for many queries though (most notably Span Queries --
    > QueryParser knows nothing about Span queries).

    I see, so my old code which was based on QueryParser was not
    completely flawed :) Are there any other queries besides span
    queries which can occur with Qsol and do not produce valid
    QueryParser syntax?

    Also I wonder why a facet query like `foo:bar' results in a
    SpanQuery `+spanNear([foo, bar], 0, true)' (I may not understand
    the concept here).

    Cheers,

    Martin

    --
    ----------- / http://herbert.the-little-red-haired-girl.org / -------------
    =+=
    Who the fsck is "General Failure", and why is he reading my disk?

  • Mark Miller at Oct 11, 2007 at 3:33 pm

    Martin Dietze wrote:
    > On Wed, October 10, 2007, Mark Miller wrote:
    >
    >> Back in the day you might have been able to call Query.toString() as the
    >> Query contract says that toString() should output valid QueryParser syntax.
    >> This does not work for many queries though (most notably Span Queries --
    >> QueryParser knows nothing about Span queries).
    > I see, so my old code which was based on QueryParser was not
    > completely flawed :) Are there any other queries besides span
    > queries which can occur with qsol and do not produce valid
    > QueryParser syntax?

    I'm not sure, I'd have to look into it.

    > Also I wonder why a facette query, like `foo:bar' results in a
    > SpanQuery `+spanNear([foo, bar], 0, true)' (I may not understand
    > the concept here).

    Qsol has a different field search syntax: foo(bar).

    If you give something like foo:bar or foo-bar, the results will depend
    on your analyzer. If using the standard analyzer, the ':' or '-' is
    thrown out and two tokens are generated: foo and bar. Like the standard
    Lucene QueryParser, if more than one token is generated from a single
    'queryparser token', they are looked for next to each other. The
    difference is that the standard Lucene QueryParser uses PhraseQuery
    instances for this. Qsol uses SpanQuery instances instead so that results
    are consistent if the clause needs to be in a SpanQuery rather than a
    BooleanQuery (a PhraseQuery cannot be nested in a SpanQuery). This is
    required because Qsol allows the mixing of Span/Non-Span queries.
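
The token-splitting effect can be illustrated without Lucene or Qsol. The sketch below is not the StandardAnalyzer and not Qsol code: `TokenSplitDemo` is a hypothetical class whose `tokenize` method is a simplified stand-in that lowercases the input and splits on anything that is not a letter or digit, and `describe` merely formats the result the way the span query was printed earlier in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class TokenSplitDemo {

    // Simplified stand-in for an analyzer: lowercase, then split on
    // every run of characters that are neither letters nor digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // One token stays a plain term; several tokens are shown as an
    // ordered, zero-slop span, mirroring the output quoted in the thread.
    static String describe(String text) {
        List<String> tokens = tokenize(text);
        if (tokens.size() <= 1) {
            return String.join("", tokens);
        }
        return "spanNear(" + tokens + ", 0, true)";
    }
}
```

Here `describe("foo:bar")` and `describe("foo-bar")` both give `spanNear([foo, bar], 0, true)`, while `describe("foo")` gives `foo`, which matches the explanation that the separator is discarded and the two resulting tokens are searched for next to each other.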

    If you want to get around this, I may be able to help.

    - Mark


Discussion Overview
group: java-user
categories: lucene
posted: Oct 9, '07 at 7:56a
active: Oct 12, '07 at 8:15a
posts: 15
users: 4
website: lucene.apache.org
