[ https://issues.apache.org/jira/browse/LUCENE-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549042 ]

Christoph Bammann commented on LUCENE-167:
------------------------------------------

Hi, I would like to know whether this new parser has been integrated into
current releases, or has even become the standard QueryParser?
[PATCH] QueryParser not handling queries containing AND and OR
--------------------------------------------------------------

Key: LUCENE-167
URL: https://issues.apache.org/jira/browse/LUCENE-167
Project: Lucene - Java
Issue Type: Bug
Components: QueryParser
Affects Versions: unspecified
Environment: Operating System: Linux
Platform: PC
Reporter: Morus Walter
Assignee: Erik Hatcher
Attachments: LuceneTest.java, QueryParser.jj.patch, QueryParser.patch


The QueryParser does not seem to handle boolean queries containing AND and OR
operators correctly. For example,
a AND b OR c AND d gets parsed as +a +b +c +d.
The attached patch fixes this by changing the vector of boolean clauses into a
vector of vectors of boolean clauses in the addClause method of the query
parser. A new sub-vector is created whenever an explicit OR operator is used.
Queries using explicit AND/OR are grouped by precedence of AND over OR. That is,
a OR b AND c becomes a OR (b AND c).
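The grouping idea described above can be sketched roughly as follows. This is a simplified, hypothetical illustration and not the code from QueryParser.patch: the class PrecedenceGrouping and its methods split and group are invented for this example, it handles only bare terms joined by explicit AND/OR, and it ignores implicit operators, NOT, +/- prefixes and fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the code from QueryParser.patch.
class PrecedenceGrouping {

    // Collects terms into a list of lists ("vector of vectors"),
    // starting a new sub-list at every explicit OR.
    static List<List<String>> split(String query) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        groups.add(current);
        for (String token : query.trim().split("\\s+")) {
            if (token.equals("OR")) {
                current = new ArrayList<>();
                groups.add(current);
            } else if (!token.equals("AND")) {
                current.add(token); // AND only joins terms within a group
            }
        }
        return groups;
    }

    // Renders the groups in a +required style: terms inside one group
    // are all required, and the groups themselves are alternatives.
    static String group(String query) {
        List<List<String>> groups = split(query);
        List<String> parts = new ArrayList<>();
        for (List<String> g : groups) {
            if (g.isEmpty()) {
                continue; // skip the empty group left by a dangling OR
            }
            if (g.size() == 1) {
                parts.add(g.get(0));
            } else {
                String required = "+" + String.join(" +", g);
                parts.add(groups.size() == 1 ? required : "(" + required + ")");
            }
        }
        return String.join(" ", parts);
    }
}
```

With this sketch, group("a AND b OR c AND d") yields (+a +b) (+c +d) rather than +a +b +c +d, and group("a OR b AND c") yields a (+b +c).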
Queries using implicit AND/OR (depending on the default operator) are handled as
before (so one can still use a +b -c to create one boolean query, where b is
required, c is forbidden and a is optional).
It's less clear how a query using both explicit AND/OR and implicit operators
should be handled.
Since the patch groups on explicit OR operators, a query
a OR b c is read as a (b c)
whereas
a AND b c is read as +a +b c
(given that the default operator OR is used).
There's one issue left:
The old query parser reads the query
`a OR NOT b' as `a -b', which is the same as `a AND NOT b'.
The modified query parser reads this as `a (-b)'.
While this looks better (at least to me), it does not produce the result of a OR
NOT b. Instead, the (-b) part seems to be silently dropped.
While I understand that this query is illegal (just searching for one negative
term), I don't think that silently dropping this part is an appropriate way to
deal with that. But I don't think that's a query parser issue.
The only question is whether the query parser should take care of that.
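To make the three readings of `a OR NOT b' concrete, here is a small sketch over a toy in-memory index. It does not use Lucene; the class OrNotSemantics, its four-document index and its method names are all hypothetical, and it simply assumes (in line with the observation above) that a subquery consisting only of a negative term matches no documents.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical model of the three readings; no Lucene classes involved.
class OrNotSemantics {

    // Toy index: document 0 is empty, 1 has a, 2 has b, 3 has a and b.
    static final List<Set<String>> DOCS = List.of(
            Set.of(), Set.of("a"), Set.of("b"), Set.of("a", "b"));

    // Returns the ids of all documents accepted by the predicate.
    static List<Integer> match(Predicate<Set<String>> p) {
        return IntStream.range(0, DOCS.size())
                .filter(i -> p.test(DOCS.get(i)))
                .boxed()
                .collect(Collectors.toList());
    }

    // Old parser: `a -b', i.e. a must match and b is forbidden.
    static List<Integer> oldParser() {
        return match(d -> d.contains("a") && !d.contains("b"));
    }

    // Patched parser: `a (-b)'. Assumption: the purely negative
    // subquery matches nothing, so only the a clause contributes.
    static List<Integer> patchedParser() {
        Predicate<Set<String>> negativeOnly = d -> false;
        return match(d -> d.contains("a") || negativeOnly.test(d));
    }

    // What a logical reading of a OR NOT b would return.
    static List<Integer> logicalOrNot() {
        return match(d -> d.contains("a") || !d.contains("b"));
    }
}
```

On this toy index the old reading returns only document 1, the patched reading returns documents 1 and 3, and the logical reading returns documents 0, 1 and 3, so neither parser output corresponds to a true a OR NOT b.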
I attached the patch (made against 1.3rc3 but working for 1.3final as well) and
a test program.
The test program parses a number of queries with the default-or and default-and
operators and reparses the result of the toString method of the created query.
It outputs the initial query, the parsed query with default or, the reparsed
query, the parsed query with default and, and its reparsed query.
If called with the -q option, it also runs the queries against an index
consisting of all documents containing one or none of a, b, c or d.
Using an unpatched and a patched version of Lucene in the classpath, one can look
at the effect of the patch in detail.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


  • Erik Hatcher (JIRA) at Dec 6, 2007 at 2:21 pm
    [ https://issues.apache.org/jira/browse/LUCENE-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549046 ]

    Erik Hatcher commented on LUCENE-167:
    -------------------------------------

    The PrecedenceQueryParser is in the contrib/miscellaneous codebase (in Lucene's repo) and in the released "miscellaneous" JAR. But it has some issues that are documented in the test case, so it is definitely not ready for prime time.

Discussion Overview
group: dev @
categories: lucene
posted: Dec 6, '07 at 1:58p
active: Dec 6, '07 at 2:21p
posts: 2
users: 1
website: lucene.apache.org

1 user in discussion

Erik Hatcher (JIRA): 2 posts
