Hello,

While flipping through the Cloud9 collections, I came across an XML
InputFormat class:

http://www.umiacs.umd.edu/~jimmylin/cloud9/docs/api/edu/umd/cloud9/collection/XMLInputFormat.html

I haven't used it myself, but it might be worth a try.


Joerg

On 30.07.2009, at 14:16, Hyunsik Choi wrote:

Hi,

Actually, I don't know of any well-made XML InputFormat or
RecordReader.
To the best of my knowledge, StreamXmlRecordReader (
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/streaming/StreamXmlRecordReader.html
) from Hadoop Streaming is the only solution.
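
For background: HDFS cuts files into blocks purely by byte offset, so an upload cannot keep elements whole. The usual approach is to let the record reader deal with it at read time. Below is a minimal Python sketch of that idea, not the actual code of StreamXmlRecordReader or the Cloud9 class. The function name `read_xml_records` and its parameters are hypothetical, and it assumes records are delimited by fixed start and end tags (no attributes on the start tag, no nesting of the same element).

```python
def read_xml_records(data, split_start, split_end, start_tag, end_tag):
    """Yield every record whose start tag begins inside [split_start, split_end).

    `data` stands in for the whole file as bytes. A real reader would
    stream from HDFS instead, but the boundary rule is the same: a split
    owns a record if the record STARTS inside it, and the reader is
    allowed to read past split_end to finish that record.
    """
    pos = split_start
    while True:
        begin = data.find(start_tag, pos)
        # No more records, or the next one belongs to a later split.
        if begin == -1 or begin >= split_end:
            return
        finish = data.find(end_tag, begin + len(start_tag))
        if finish == -1:
            return  # truncated record at end of file; skip it
        finish += len(end_tag)
        yield data[begin:finish]
        pos = finish


if __name__ == "__main__":
    doc = b"<docs><rec>a</rec><rec>bb</rec><rec>ccc</rec></docs>"
    # Pretend the block boundary falls at byte 20, in the middle of the second record.
    for start, end in [(0, 20), (20, len(doc))]:
        print(start, end, list(read_xml_records(doc, start, end, b"<rec>", b"</rec>")))
```

With this rule each element is processed exactly once, by the split in which it starts, even when it is physically stored across two blocks.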

Good luck!

--
Hyunsik Choi
Database & Information Systems Group, Korea University
http://diveintodata.org



On Thu, Jul 30, 2009 at 5:30 PM, Wasim Bari wrote:


Hi All,

I am looking to store some really big XML files in HDFS and
then process them using MapReduce.

Do we have a utility that uploads XML files to HDFS while making
sure that splitting the file into blocks doesn't break an element (I mean
half an element in one block and half in another)?

Any suggestions to work this out will be greatly appreciated.



Thanks



Bari
Discussion Overview
group: common-user @
categories: hadoop
posted: Jul 30, '09 at 8:31a
active: Aug 11, '09 at 11:02p
posts: 4
users: 4
website: hadoop.apache.org...
irc: #hadoop