Nov 17, 2008 at 3:22 pm
It is not difficult to write a Pig Load/Store function pair that
builds/reads a Lucene index -- we have written such code at Yahoo.
Unfortunately, that code is not open source; otherwise I'd be happy to share it.
In terms of passing arguments, Pig Load/Store functions accept one or more
string arguments. You can pass arbitrary Java objects by serializing the
object into a string, passing the encoded string as the argument via Pig,
and having the load/store function deserialize it.
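As a rough illustration of the serialize-to-string approach described above, here is a minimal sketch in plain Java. It is not the Yahoo code mentioned in this message; the class name ArgCodec and its methods are hypothetical, and it assumes java.util.Base64 (Java 8 or later) is available. URL-safe Base64 without padding is used so the encoded string contains only letters, digits, '-' and '_', and so needs no escaping inside a quoted string argument.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

// Hypothetical helper: turns a Serializable object into a plain string
// argument and back again. Not part of Pig.
final class ArgCodec {

    // Serialize the object with Java serialization, then Base64-encode the bytes.
    static String encode(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return Base64.getUrlEncoder().withoutPadding()
                     .encodeToString(bytes.toByteArray());
    }

    // Reverse of encode(): Base64-decode, then deserialize.
    // A load/store function would call this on the string it receives.
    static Object decode(String encoded) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getUrlDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }
}
```

The caller would build the script with the encoded string as the function argument, e.g. USING MyLuceneStorer('<encoded string>'), where MyLuceneStorer is a made-up name, and the function's constructor would call ArgCodec.decode on the string it is given. Only deserialize strings from a source you trust.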
On 11/15/08 2:50 PM, "Ian Holsman" wrote:
I submitted a patch (https://issues.apache.org/jira/browse/PIG-533) to
demonstrate how you would load data into a database.
This could easily be changed to push it into Solr. (Retrieving it from
Solr or a DB would be more of a challenge, I think.)
The only painful thing is that you would need to pass the
fieldname/column# mapping into the function, as I don't see any other
way to get this information through.
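For illustration only, one way such a field-name/column mapping could be packed into a single string argument is a comma-separated list of name=column pairs. The sketch below is not taken from the PIG-533 patch; the "title=0,body=1" format and the FieldMapping class are assumptions made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for a mapping spec such as "title=0,body=1".
final class FieldMapping {

    // Returns field name -> column index, in the order given in the spec.
    static Map<String, Integer> parse(String spec) {
        Map<String, Integer> mapping = new LinkedHashMap<>();
        for (String pair : spec.split(",")) {
            String[] parts = pair.trim().split("=", 2);
            if (parts.length != 2 || parts[0].trim().isEmpty()) {
                throw new IllegalArgumentException("Expected name=column, got: " + pair);
            }
            // Integer.parseInt rejects a non-numeric column index.
            mapping.put(parts[0].trim(), Integer.parseInt(parts[1].trim()));
        }
        return mapping;
    }
}
```

A store function could take this string as its constructor argument and call FieldMapping.parse once, then look up each column index when writing a tuple.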
David Linsin wrote:
Is there any Pig/Lucene integration, in terms of loading an index and
accessing the documents it contains?
With kind regards,
- - - - - - - - - - - - - - - - - - - - - - - -
Christopher Olston, Ph.D.
Sr. Research Scientist