1. Convert the PDF to a text file with e.g. xpdf.
2. Insert the parsed text into a table of your choice.
3. Make vectors from the text.
Actually, if you're not going to use the headline() function, you can
just store it directly in a vector, cutting down on the size
requirements.
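
The steps above, and the vector-only variant, could be sketched roughly as follows. This is only an illustration under stated assumptions: the table and column names (docs, docs_vec, body, tsv) are hypothetical, the SQL uses the built-in to_tsvector('english', ...) form from PostgreSQL 8.3 and later rather than the tsearch2 contrib module, and the %s placeholders assume a psycopg-style driver. The sketch only builds the command and the statements; executing them is left to whatever driver you use.

```python
import subprocess

# Hypothetical table and column names, used only for this sketch.
# Layout A keeps the raw text (needed for headline-style excerpts) plus the vector.
CREATE_TEXT_AND_VECTOR = (
    "CREATE TABLE docs (id serial PRIMARY KEY, body text, tsv tsvector)"
)
INSERT_TEXT_AND_VECTOR = (
    "INSERT INTO docs (body, tsv) VALUES (%s, to_tsvector('english', %s))"
)

# Layout B stores only the vector, as suggested when headline() is not needed.
CREATE_VECTOR_ONLY = "CREATE TABLE docs_vec (id serial PRIMARY KEY, tsv tsvector)"
INSERT_VECTOR_ONLY = "INSERT INTO docs_vec (tsv) VALUES (to_tsvector('english', %s))"


def pdftotext_command(pdf_path):
    """Build the xpdf command that writes the PDF's text to stdout ('-')."""
    return ["pdftotext", pdf_path, "-"]


def extract_text(pdf_path):
    """Run pdftotext (must be installed) and return the extracted text."""
    done = subprocess.run(
        pdftotext_command(pdf_path), capture_output=True, text=True, check=True
    )
    return done.stdout


def vector_only_insert(text):
    """Return the statement and parameters for layout B."""
    return INSERT_VECTOR_ONLY, (text,)
```

With layout B the original text is gone once the row is inserted, so anything that needs to show excerpts of the document has to get the text from somewhere else.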
What size requirements?
If you store both text and tsvector, that's going to use up a lot more
space than if you just store the tsvector. With a proper lexer and such,
it will be *more* than twice as large, given that the tsvector will be
smaller than the text.

//Magnus
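
The arithmetic behind that claim can be illustrated with a toy model. This is not PostgreSQL's tsvector format: a real tsvector also stems words and can record word positions, so actual sizes will differ. The stop-word list and function names below are made up for the sketch. It only shows that when the deduplicated word list is smaller than the text, storing both takes more than twice the space of storing the word list alone.

```python
import re

# Tiny illustrative stop-word list; PostgreSQL's dictionaries are far larger.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}


def toy_lexemes(text):
    """Lowercase, split into words, drop stop words, and keep each word once."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return sorted(set(words) - STOP_WORDS)


def toy_sizes(text):
    """Return (bytes of raw text, bytes of the space-joined unique lexemes)."""
    raw = len(text.encode("utf-8"))
    vec = len(" ".join(toy_lexemes(text)).encode("utf-8"))
    return raw, vec


sample = "The cat sat on the mat and the cat saw the dog. " * 20
raw, vec = toy_sizes(sample)
# Storing both costs raw + vec; storing only the vector costs vec.
print(raw, vec, raw + vec > 2 * vec)
```

For real numbers, pg_column_size() on the text and tsvector columns of an actual table gives the stored size of each value.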

Posted to pgsql-general (postgresql.org) on Dec 11, '06 at 11:11a.