Hi,

I am developing a local PDF search engine using the PDFBox and Lucene APIs.

I need to show the user the PDF page that contains the keywords (highlighted,
if possible) and sort the results by relevance (Lucene's default). How can I
do this?

Also, if a PDF page contains pictures, how can that page be displayed to the
user after indexing and searching only the extracted text?

Thanks


  • Simon Willnauer at Feb 19, 2011 at 11:17 pm
    Hi Gong Li,

    Your question is out of scope for this mailing list.

    thanks,

    simon
    On Fri, Feb 18, 2011 at 7:29 PM, Gong Li wrote:
    Hi,

    I am developing a PDF search engine, locally. I have used API: pdfbox and
    lucene.

    I must show the user the PDF page containing the keywords(if highlight, it's
    great) and sort by relevance(default in lucene). HOW???

    Maybe, if there are some pictures in the PDF page, how could it display to
    the user after index and search the extracted text???

    Thanks
  • Alexander Aristov at Feb 20, 2011 at 6:36 am
    Your search engine would extract only the text content from a PDF file;
    all markup, pictures, etc. would be lost. So when you search, you would
    get back only text, highlighted or not.


    Best Regards
    Alexander Aristov
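
For illustration, the reply above can be sketched without either library: once only text has been extracted, a search can return at most the matching page's text, optionally highlighted. The sketch below assumes the per-page text already exists as plain strings (PDFBox's PDFTextStripper can be limited to a single page with setStartPage and setEndPage to produce it). All class and method names are hypothetical, and the raw match count is only a stand-in for Lucene's relevance scoring, not its actual formula.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: page-level search over text that has already been extracted.
// Every name here is hypothetical; this is not PDFBox or Lucene code.
class PageSearchSketch {

    /** One matching page: file name, 1-based page number, match count, marked-up text. */
    static final class PageHit {
        final String file;
        final int page;
        final int matches;
        final String highlighted;

        PageHit(String file, int page, int matches, String highlighted) {
            this.file = file;
            this.page = page;
            this.matches = matches;
            this.highlighted = highlighted;
        }
    }

    /** Returns the pages containing the keyword, most matches first. */
    static List<PageHit> search(Map<String, List<String>> pagesByFile, String keyword) {
        List<PageHit> hits = new ArrayList<>();
        if (keyword.isEmpty()) {
            return hits; // an empty keyword would match at every position
        }
        // Pattern.quote makes the keyword match literally, not as a regex.
        Pattern pattern = Pattern.compile(Pattern.quote(keyword),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        for (Map.Entry<String, List<String>> entry : pagesByFile.entrySet()) {
            List<String> pages = entry.getValue();
            for (int i = 0; i < pages.size(); i++) {
                Matcher m = pattern.matcher(pages.get(i));
                StringBuffer marked = new StringBuffer();
                int count = 0;
                while (m.find()) {
                    count++;
                    // Wrap each match in brackets as a plain-text highlight.
                    m.appendReplacement(marked,
                            Matcher.quoteReplacement("[" + m.group() + "]"));
                }
                m.appendTail(marked);
                if (count > 0) {
                    hits.add(new PageHit(entry.getKey(), i + 1, count, marked.toString()));
                }
            }
        }
        // Crude relevance: more matches on a page ranks it higher.
        hits.sort(Comparator.comparingInt((PageHit h) -> h.matches).reversed());
        return hits;
    }

    public static void main(String[] args) {
        Map<String, List<String>> pages = new LinkedHashMap<>();
        pages.put("a.pdf", Arrays.asList("Lucene indexes text.", "No match here."));
        pages.put("b.pdf", Arrays.asList("lucene and Lucene again."));
        for (PageHit h : search(pages, "lucene")) {
            System.out.println(h.file + " p." + h.page + " (" + h.matches + "): " + h.highlighted);
        }
    }
}
```

Because each hit carries the file name and page number, an application could in principle open the original PDF at that page in a viewer to show pictures and layout, instead of displaying the extracted text. That step is outside this sketch.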


Discussion Overview
group: java-user @
categories: lucene
posted: Feb 18, '11 at 6:30p
active: Feb 20, '11 at 6:36a
posts: 3
users: 3
website: lucene.apache.org
