Don,

I think you misunderstood Otis. You will need to use some sort of parser
(e.g. pdfbox, xpdf) to extract the title text from the PDF (I assume you
are already indexing the documents, so you have a parser in place). Then
you create a "title" field in your index and store only the title text in
it, so that you can return it with the results. That is independent of
what you do with the rest of the PDF document.
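
To make the stored-versus-indexed distinction concrete, here is a minimal sketch in plain Java. It is a toy in-memory index, not Lucene's API: the class `TitleFieldSketch`, its nested `Index`, and the `add`/`search` methods are all hypothetical names, and the title and body strings are assumed to have already been extracted by a PDF parser. The point is only that the short title is kept verbatim for display while the body text is tokenized for searching and then discarded.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy index (not Lucene's API). It only illustrates the
// difference between a stored field and a field that is indexed but not stored.
class TitleFieldSketch {

    static class Index {
        // "title" field: stored verbatim so it can be returned with each hit.
        private final List<String> storedTitles = new ArrayList<>();
        // Body text: only an inverted index (term -> document ids) is kept.
        private final Map<String, List<Integer>> postings = new HashMap<>();

        // title and bodyText are assumed to come from a PDF parser;
        // the extraction step itself is outside the scope of this sketch.
        void add(String title, String bodyText) {
            int docId = storedTitles.size();
            storedTitles.add(title);
            for (String term : bodyText.toLowerCase().split("\\W+")) {
                if (term.isEmpty()) {
                    continue;
                }
                List<Integer> ids = postings.computeIfAbsent(term, k -> new ArrayList<>());
                // Record each document at most once per term.
                if (ids.isEmpty() || ids.get(ids.size() - 1) != docId) {
                    ids.add(docId);
                }
            }
            // bodyText is not retained anywhere: a multi-megabyte PDF costs
            // only its postings, while the title stays available for display.
        }

        // Returns the stored titles of all documents whose body contains the term.
        List<String> search(String term) {
            List<String> titles = new ArrayList<>();
            List<Integer> ids = postings.getOrDefault(term.toLowerCase(),
                    Collections.<Integer>emptyList());
            for (int docId : ids) {
                titles.add(storedTitles.get(docId));
            }
            return titles;
        }
    }

    public static void main(String[] args) {
        Index index = new Index();
        index.add("Annual Report", "Revenue grew and revenue targets were met.");
        index.add("User Manual", "Press the power button to start the device.");
        System.out.println(index.search("revenue"));
    }
}
```

In a real Lucene index the same split would be expressed through the field types chosen for the title and for the body, so the size of the PDF has no bearing on what is kept for the title.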

-Mike
At 11:17 AM 6/28/2004, Don Vaillancourt wrote:
It can't be that simple because the field will contain the whole PDF and
not just the title. And for PDFs that are 3 or 4 megs, it is really not
reasonable to store the whole PDF in the collection just to get the title.
Discussion Overview
group: java-user @
categories: lucene
posted: Jun 28, '04 at 3:22p
active: Jun 28, '04 at 3:22p
posts: 1
users: 1
website: lucene.apache.org

1 user in discussion

Michael Giles: 1 post
