One other note -- I remember now, when I was developing this code, I
first looked at Nutch's query processing code, but I think Nutch
assumed the queries were always flat to begin with. So I took Nutch's
"query filter" approach, but made it work with nested queries that
aren't always flat.

On 12/18/05, Chris Lamprecht wrote:
Hi Erik,

I ran into the same thing in my work so I created a query utility
class and an interface called QueryFilter (I know this is a bad name)
that is really more of a visitor pattern callback thing. Most of my
methods are convenience methods built on top of this interface and a
few classes that implement it. The main method in the utility class
takes a lucene Query and a QueryFilter, and returns a (possibly new)
Query object:

public static Query filter(Query query, QueryFilter filter) {
    /* typical recursive code to traverse the query if it's
     * a BooleanQuery, or handle the terminal query
     * base case, issuing callbacks to the QueryFilter
     * interface.
     */
    ...
}
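
A minimal sketch of the kind of recursive traversal described above, assuming a toy query model rather than Lucene's real classes. The names Node, TermNode, BoolNode, NodeFilter and QueryWalker are hypothetical stand-ins invented for illustration; the actual QueryFilter interface is not shown in this message and may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-ins for Lucene's Query classes (hypothetical, not the Lucene API).
abstract class Node {}

final class TermNode extends Node {
    final String field;
    final String text;
    TermNode(String field, String text) { this.field = field; this.text = text; }
    @Override public String toString() { return field + ":" + text; }
}

final class BoolNode extends Node {
    final List<Node> clauses;
    BoolNode(List<Node> clauses) { this.clauses = clauses; }
    @Override public String toString() { return clauses.toString(); }
}

// Hypothetical callback interface, loosely modeled on the QueryFilter idea.
interface NodeFilter {
    // Called for each terminal node; returns a replacement, or null to drop it.
    Node visitTerm(TermNode term);
}

final class QueryWalker {
    // Recursively walks the tree and builds a new tree from the callback results.
    static Node filter(Node node, NodeFilter filter) {
        if (node instanceof BoolNode) {
            List<Node> kept = new ArrayList<Node>();
            for (Node child : ((BoolNode) node).clauses) {
                Node result = filter(child, filter);
                if (result != null) kept.add(result);
            }
            return new BoolNode(kept);
        }
        // Terminal base case: hand the term to the callback.
        return filter.visitTerm((TermNode) node);
    }
}
```

Because the walker always builds new nodes instead of mutating the input, a callback that returns a TermNode with a different field gives a "change field" rewrite while leaving the original query untouched.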

Some of the utility methods I've built using this interface and code are:

/** Converts a query into a "sloppy phrase" query for a given field
(optional) and slop */
public static Query getSloppyPhraseQuery(Query query, String field, int slop)

/**
 * Returns a Collection<String> of non-prohibited
 * terms for a given query and optional field.
 */
public static List<String> getTerms(Query q, String field)

/** flattens a query, allowing depth up to d; use d=0 for completely flat. */
public static Query flatten(Query q, int d)

/**
 * Returns an infix String from a {@link Query},
 * optionally including the field in the query.
 */
public static String getInfixString(Query q, String field, boolean includeField)
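
To illustrate how one of these helpers might sit on top of a recursive walk, here is a rough sketch of a getTerms-style method. It assumes a toy query model (Q, TermQ, BoolQ, Clause and TermCollector are hypothetical names, not Lucene classes) and treats a null field as "match every field"; the real utility class may behave differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy query model (hypothetical stand-ins, not Lucene's classes).
abstract class Q {}

final class TermQ extends Q {
    final String field;
    final String text;
    TermQ(String field, String text) { this.field = field; this.text = text; }
}

// One clause of a boolean query; 'prohibited' plays the role of a NOT clause.
final class Clause {
    final Q query;
    final boolean prohibited;
    Clause(Q query, boolean prohibited) { this.query = query; this.prohibited = prohibited; }
}

final class BoolQ extends Q {
    final List<Clause> clauses;
    BoolQ(Clause... clauses) { this.clauses = Arrays.asList(clauses); }
}

final class TermCollector {
    // Collects the text of every non-prohibited term; a null field matches all fields.
    static List<String> getTerms(Q q, String field) {
        List<String> out = new ArrayList<String>();
        collect(q, field, out);
        return out;
    }

    private static void collect(Q q, String field, List<String> out) {
        if (q instanceof BoolQ) {
            for (Clause c : ((BoolQ) q).clauses) {
                // Skip the whole subtree under a prohibited clause.
                if (!c.prohibited) collect(c.query, field, out);
            }
        } else if (q instanceof TermQ) {
            TermQ t = (TermQ) q;
            if (field == null || field.equals(t.field)) out.add(t.text);
        }
    }
}
```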

If others find this useful, I'd like to submit it so the lucene
developer community can help clean up the API for general consumption.
I didn't bother changing any existing Lucene code such as TermQuery.
I haven't noticed any performance problems; I'm pretty sure the whole
query parsing/rewriting phase takes less than 1ms (it registers 0ms;
I guess I could use Java 1.5's System.nanoTime() to get more detailed numbers).
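
For what it's worth, a sub-millisecond measurement could be sketched with System.nanoTime(), which was added in Java 1.5. The NanoTiming class and averageNanos method below are hypothetical names for illustration, and the task passed in stands for whatever parsing/rewriting step is being measured; this is only a rough timing loop, not a proper benchmark (no JIT warm-up handling).

```java
final class NanoTiming {
    // Returns the average elapsed time per run, in nanoseconds.
    // Assumes runs > 0; averaging over many runs smooths out timer granularity.
    static long averageNanos(Runnable task, int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / runs;
    }
}
```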

On 12/16/05, Erik Hatcher wrote:
In my latest project, I've written code to walk a general Query and
do three different manipulations of it: 1) Convert it to a SpanQuery
2) Change a specified field to another field for each nested Query 3)
Rotate (Span)RegexQuery terms.

I have a lot of duplication of this recursive Query processing. I'd
like to create a general way to do this sort of thing and am
wondering if others have created similar routines and have ideas
that might be useful in making a general facility.

I'm also curious if there are changes to Query that could be made to
facilitate this sort of thing, such as setters to allow terms to be
changed without having to construct an entirely new TermQuery for
example. I realize that Queries are considered basically immutable,
and maybe this is an important thing to maintain, or maybe this
convention is worth abolishing?



To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
Discussion Overview
group: java-dev @
posted: Dec 16, '05 at 10:43a
active: Dec 19, '05 at 9:12p


