Thank you for your reply.
So can I do the same with Java scripts?
To be more clear, I have a folder with multiple XML files that I want to
read and parse in order to extract the values of some attributes (att1, att2), e.g.:

<elem att1="452" att2="7587">elem1</elem>

On Wed, Sep 14, 2011 at 4:53 PM, wrote:

I do this:

define analyze_unif `analyze_unif_recs.py`
    input (stdin)
    output (stdout USING PigStreaming(','))
    ship ('$scriptDir/analyze_unif_recs.py');

UnifLines = load '$unif_xml'
    using org.apache.pig.piggybank.storage.XMLLoader('REC')
    as (doc:chararray);

UnifXmlByDocId = stream UnifLines through analyze_unif
    as (docid: int,
        xml_comp: chararray);

where analyze_unif_recs.py is a Python script I wrote that does the XML
parsing, and org.apache.pig.piggybank.storage.XMLLoader('REC') finds the
<REC> elements in the XML input, which are then passed to my script.
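
For illustration only, a streaming script of this kind could look roughly like
the sketch below. It is not the actual analyze_unif_recs.py. It assumes that
each XML record arrives on stdin as a single line, and it uses the att1/att2
attribute names from the question above; the function names extract_attrs and
main are made up for this example.

```python
import sys
import xml.etree.ElementTree as ET


def extract_attrs(record, names=("att1", "att2")):
    """Return the named attribute values of the record's root element.

    Returns None when the record is not well-formed XML. Attribute names
    are hypothetical and taken from the question in this thread.
    """
    try:
        root = ET.fromstring(record)
    except ET.ParseError:
        return None
    # Missing attributes become empty strings so every row has the same width.
    return [root.get(name, "") for name in names]


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Assumption: one XML record per input line.
    for line in stdin:
        record = line.strip()
        if not record:
            continue
        values = extract_attrs(record)
        if values is not None:
            # Comma-separated output, matching PigStreaming(',') on the Pig side.
            stdout.write(",".join(values) + "\n")

# To use this as a streaming command, call main() at the bottom of the script.
```

With an input line such as <elem att1="452" att2="7587">elem1</elem>, the
sketch writes the two attribute values as one comma-separated row; the "as"
clause of the stream statement would then need to declare matching fields.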

William F Dowling
Sr Technical Specialist, Software Engineering
Thomson Reuters
0 +1 215 823 3853

-----Original Message-----
From: Baraa Mohamad
Sent: Wednesday, September 14, 2011 10:41 AM
To: user@pig.apache.org
Subject: reading xml file within a UDF

I have a question, please.

How can I read a file in a UDF in Pig?

e.g.: A = load 'xmlFiles' using myXMLParser(xmlfile)

Can I do something like that, so that I can parse the XML file using some
Java library?

Thanks for your help.





