For your field configuration, the TokenStream you get from getAnyTokenStream is built from the term vectors.

What tokenizer do you use when populating your field? Have you checked with Luke that your term vectors are OK?

And what version of Lucene? A change was made to this code recently for another issue (apparently unrelated, but who knows?). See https://issues.apache.org/jira/browse/LUCENE-2874
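[Editor's note: besides inspecting the index with Luke, the same check can be done programmatically. The sketch below is an assumption against the Lucene 3.x API (getTermFreqVector and TermPositionVector), not code from this thread; it verifies that the "contents" field of a given document actually stores term vectors with positions, which is what the term-vector path of getAnyTokenStream relies on.]

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermFreqVector;
import org.apache.lucene.index.TermPositionVector;

// Sketch (Lucene 3.x API, assumed): check whether a doc's "contents" field
// has term vectors, and whether they carry positions/offsets.
public class CheckTermVectors {
    public static void check(IndexReader reader, int docId) throws Exception {
        TermFreqVector tfv = reader.getTermFreqVector(docId, "contents");
        if (tfv == null) {
            System.out.println("no term vector stored for this doc/field");
        } else if (tfv instanceof TermPositionVector) {
            // positions (and usually offsets) are available, as required
            // for rebuilding a TokenStream from the stored vector
            System.out.println("term vector with positions: " + tfv.size() + " terms");
        } else {
            System.out.println("term vector present, but without positions");
        }
    }
}
```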

Pierre


From: Cescy
Sent: Friday, 18 March 2011 07:32
To: java-user; Pierre GOSSE
Subject: Re: RE: About highlighter


Yes, I only search the "contents" field, and I can print the whole contents with doc.get("contents") whenever it contains any keywords. But when the document has too many words, the highlighter cannot highlight keywords near the end of the contents, as if highlighting had a word limit.

document.add(new Field("contents", value, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));

Thx
Gong

------------------ Original ------------------
From: "Pierre GOSSE" <pierre.gosse@arisem.com>
Date: Thu, Mar 17, 2011 04:25 PM
To: java-user@lucene.apache.org
Subject: RE: About highlighter

500 is the maximum size of the text fragments returned by the highlighter. It shouldn't be the problem here, as far as I understand highlighting.

Gong Li, how is the field "contents" defined? Is it the only field the search is made on?

Pierre

-----Original Message-----
From: Ian Lea
Sent: Wednesday, 16 March 2011 22:29
To: java-user@lucene.apache.org
Subject: Re: About highlighter

I know nothing about highlighting but that 500 looks like a good place
to start investigating.


--
Ian.

On Tue, Mar 15, 2011 at 8:47 PM, Cescy wrote:
Hi,


My highlight code is shown as following:


QueryScorer scorer = new QueryScorer(query);
Highlighter highlighter = new Highlighter(simpleHTMLFormatter, scorer);
highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer, 500));
String contents = doc.get("contents");
TokenStream tokenStream = TokenSources.getAnyTokenStream(searcher.getIndexReader(), topDocs.scoreDocs[i].doc, "contents", doc, analyzer);
String[] snippet = highlighter.getBestFragments(tokenStream, contents, 10);



snippet holds the resulting fragments, which I then print on the screen.
But if I search for a keyword that appears in the last few paragraphs and the essay is long (1000-2000 words), it returns "document found" with snippet.length == 0 (i.e. the document is found but no fragment is produced). Why?


How could I fix the problem?
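[Editor's note: one setting worth ruling out here, an assumption on my part rather than something confirmed in this thread: the Lucene Highlighter only tokenizes the first maxDocCharsToAnalyze characters of the text it is given (50*1024 by default), so a match beyond that point yields no fragment even though the search finds the document. It may or may not be the cause for a 1000-2000 word essay, but raising the limit is a one-line change.]

```java
import org.apache.lucene.search.highlight.Highlighter;

// Sketch (Lucene 2.x/3.x highlighter API): lift the default cap on how
// many characters of the document text the highlighter will analyze,
// so matches late in a long document can still produce fragments.
public class HighlightWholeDoc {
    public static void configure(Highlighter highlighter) {
        highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);
    }
}
```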
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


