Probably the easiest way to do this would be to index all
the terms in the same field with a large increment gap between.
See Analyzer.getPositionIncrementGap (you'll have to create
your own analyzer here, probably just subclassing one of the
existing ones).
Once things are indexed that way, then you can do, say, SpanQueries
or even proximity queries (e.g. "yellow sell"~5).
This sounds a bit like gibberish, but bear with me. Let's say you have
subclassed an analyzer so that it returns an increment gap of 100. Now say
you index as follows (pseudocode):
Document doc = new Document()
doc.add(new Field("field", "house", ...))
doc.add(new Field("field", "yellow ball", ...))
doc.add(new Field("field", "yellow sell", ...))
doc.add(new Field("field", "ball star", ...))
doc.add(new Field("field", "home xyz", ...))
Now, here are (roughly) your term positions:
house - 1
yellow - 102
ball - 103
yellow - 204
sell - 205
ball - 306
star - 307
home - 408
xyz - 409
The bump comes because each time you call doc.add for a field that has
already been added to that document, a call is made to
getPositionIncrementGap and the return value is added to the position of
the first token of the new value.
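To make the arithmetic concrete, here is a small plain-Java sketch that reproduces the positions listed above. It does not use Lucene at all: the class PositionGapDemo and its positions method are made up for illustration, whitespace splitting stands in for a real analyzer, and the numbering is the same rough, 1-based numbering used in this message (Lucene's actual position values may be off by one).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: a simplified model of how
// positions grow when several values are added to the same field.
class PositionGapDemo {

    // Returns "term - position" lines for the given field values.
    // The gap is added once before every value except the first.
    static List<String> positions(String[] values, int gap) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                pos += gap; // the "bump" between two doc.add calls
            }
            // Whitespace split is an assumption standing in for the analyzer.
            for (String token : values[i].trim().split("\\s+")) {
                pos += 1; // each token advances the position by one
                out.add(token + " - " + pos);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] values = {
            "house", "yellow ball", "yellow sell", "ball star", "home xyz"
        };
        for (String line : positions(values, 100)) {
            System.out.println(line);
        }
    }
}
```

With a gap of 100 this prints the same nine lines as the list above, from "house - 1" through "xyz - 409".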
Now if you choose a large enough increment gap and make your proximity
require that all the terms are within *less* than that gap, you should be
able to keep a match from spanning two separate entries.
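As a rough illustration of why the gap matters, the following plain-Java sketch indexes the entries from the original question with and without a gap and checks whether two terms fall close together. Everything here is an assumption made for illustration: ProximityGapDemo, index and near are made-up names, whitespace splitting stands in for the analyzer, and near is a simple unordered distance check rather than Lucene's actual slop computation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: simulates positions with a gap
// between entries and a crude "within N positions" proximity test.
class ProximityGapDemo {

    // Maps each term to the positions where it occurs.
    static Map<String, List<Integer>> index(String[] entries, int gap) {
        Map<String, List<Integer>> postings = new HashMap<>();
        int pos = 0;
        for (int i = 0; i < entries.length; i++) {
            if (i > 0) {
                pos += gap; // gap between consecutive entries
            }
            for (String token : entries[i].trim().split("\\s+")) {
                pos += 1;
                postings.computeIfAbsent(token, k -> new ArrayList<>()).add(pos);
            }
        }
        return postings;
    }

    // True if some occurrence of a and some occurrence of b are at most
    // maxDistance positions apart (simplified stand-in for a sloppy query).
    static boolean near(Map<String, List<Integer>> postings,
                        String a, String b, int maxDistance) {
        List<Integer> pa = postings.getOrDefault(a, new ArrayList<>());
        List<Integer> pb = postings.getOrDefault(b, new ArrayList<>());
        for (int x : pa) {
            for (int y : pb) {
                if (Math.abs(x - y) <= maxDistance) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] entries = {
            "home work", "house", "yellow green ball sell",
            "star sports", "tennis ball new", "xyz"
        };
        Map<String, List<Integer>> gapped = index(entries, 100);
        Map<String, List<Integer>> flat = index(entries, 0);
        System.out.println("gap 100, yellow/ball: " + near(gapped, "yellow", "ball", 5));
        System.out.println("gap 100, ball/star:   " + near(gapped, "ball", "star", 5));
        System.out.println("gap 0,   ball/star:   " + near(flat, "ball", "star", 5));
    }
}
```

In this model, with a gap of 100 and a maximum distance of 5, "yellow ball" and "yellow sell" match while "ball star" and "home xyz" do not. With a gap of 0, "ball star" matches because "ball" and "star" end up only two positions apart, which is exactly the cross-entry hit the gap is meant to prevent.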
P.S. Both messages came through, so I have no idea why you got your message,
you might check your local server.
On Sat, Jan 17, 2009 at 2:35 PM, Haroldo Nascimento wrote:
I have a problem searching tokenized fields.
Initially each advertisement had 10 associated terms, each term was stored
in its own field in my index, and the query ORed across the 10 fields.
Now the advertisements have more than 2,000 terms, and the current
solution (creating 2,000 fields) does not work.
I am thinking of creating a single field that contains all the terms,
separated by ";" for example. How can I search a field that contains
tokenized entries like this, or is there another solution to this problem?
advertise_id = "00001"
1- "home work"
2- "house"
3- "yellow green ball sell"
4- "star sports"
5- "tennis ball new"
My single field contains: "home work; house; yellow green ball sell; star
sports; tennis ball new; ... ; xyz;"
If my query is:
query= "house" -> result = 1
query= "yellow ball" -> result = 1
query= "yellow sell" -> result = 1
query= "ball star" -> result = 0 (no result)
query= "home xyz" -> result = 0 (no result)