Hi,


I am developing an advanced PDF search engine in Java using PDFBox and Lucene. I need to display the context of each matched keyword in the user interface, but I cannot find a method to do so. Most of the methods provided work with the whole content of the specified field, and I only need the context around each keyword (i.e. a specific part of the content in that field).


Is there a way to do this?


Thanks.


Cescy


  • Felipe Lobo at Feb 3, 2011 at 5:55 pm
    If I understand your question correctly, you want to generate a snippet for
    each result document.
    You can do something like the code below:

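    import java.io.StringReader;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.search.highlight.*;

    // Score fragments by how well they match the query terms.
    QueryScorer scorer = new QueryScorer(query);
    Highlighter highlighter = new Highlighter(scorer);
    // Split the text into fragments around the matching spans.
    highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
    // The field must be stored, otherwise there is no text to highlight.
    String text = document.getField(fieldName).stringValue();
    TokenStream tokenStream = analyzer.tokenStream(fieldName, new StringReader(text));
    // Joins the best NUM_FRAGMENTS fragments with TOKEN_DELIMITER (e.g. "...").
    String snippet = highlighter.getBestFragments(tokenStream, text, NUM_FRAGMENTS, TOKEN_DELIMITER);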
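
To make the idea of a "keyword context" concrete without any Lucene dependency, here is a minimal plain-Java sketch. It is not how Lucene's Highlighter works internally (the Highlighter operates on analyzed tokens and scores fragments against the query); it only shows the kind of output the question asks for. The class name `KeywordContext`, the method `contexts`, the `window` parameter, and the `<B>` markers are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene or PDFBox.
class KeywordContext {

    /**
     * Returns one snippet per case-insensitive occurrence of keyword,
     * keeping up to `window` characters of text on each side of the hit
     * and wrapping the hit itself in <B>...</B>.
     */
    static List<String> contexts(String text, String keyword, int window) {
        List<String> snippets = new ArrayList<>();
        if (keyword.isEmpty()) {
            return snippets; // an empty keyword would match everywhere
        }
        // Pattern.quote treats the keyword as a literal, not a regex.
        Matcher m = Pattern
                .compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE)
                .matcher(text);
        while (m.find()) {
            int start = Math.max(0, m.start() - window);
            int stop = Math.min(text.length(), m.end() + window);
            snippets.add(text.substring(start, m.start())
                    + "<B>" + m.group() + "</B>"
                    + text.substring(m.end(), stop));
        }
        return snippets;
    }

    public static void main(String[] args) {
        String text = "Lucene is a search library. PDFBox extracts text for Lucene.";
        for (String s : contexts(text, "lucene", 5)) {
            System.out.println(s);
        }
    }
}
```

A character window like this is the simplest possible fragmenter. For real search results the Highlighter approach above is usually the better fit, because it uses the same analyzer as the index and therefore agrees with what the query actually matched.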


    --
    Felipe Lobo
    www.jusbrasil.com.br

Discussion Overview
group: java-user @
categories: lucene
posted: Feb 3, '11 at 5:42p
active: Feb 3, '11 at 5:55p
posts: 2
users: 2
website: lucene.apache.org