Actually, I don't know of any well-made XML InputFormat or
To the best of my knowledge, the StreamXmlRecordReader of Hadoop streaming is the only solution.
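For what it's worth, HDFS cuts files into blocks at plain byte offsets, so the upload itself cannot keep elements whole; it is the record reader that has to deal with an element straddling a boundary. Here is a rough sketch of the usual idea, simulated in plain Python on an in-memory byte string. It is not Hadoop's actual code: the function name `read_xml_records`, the `<rec>` tag, and the tiny split size are all made up for illustration. It also assumes records are not nested and that the begin tag carries no attributes.

```python
def read_xml_records(data, start, end, begin_tag, end_tag):
    """Yield each record whose begin tag starts inside the split [start, end).

    Hypothetical helper for illustration only; it mimics a begin/end-pattern
    record reader but is not the Hadoop implementation.
    """
    pos = start
    while True:
        rec_start = data.find(begin_tag, pos)
        # A record belongs to this split only if it begins inside it.
        if rec_start == -1 or rec_start >= end:
            return
        rec_end = data.find(end_tag, rec_start + len(begin_tag))
        if rec_end == -1:
            return  # unterminated record: nothing more to emit
        rec_end += len(end_tag)
        # rec_end may lie beyond `end`: the reader deliberately reads past
        # the split boundary to finish the record it has started.
        yield data[rec_start:rec_end]
        pos = rec_end


doc = b"<root><rec>a</rec><rec>bb</rec><rec>ccc</rec></root>"
split_size = 20  # tiny on purpose so that records straddle the boundaries
splits = [(s, min(s + split_size, len(doc))) for s in range(0, len(doc), split_size)]

records = []
for s, e in splits:
    records.extend(read_xml_records(doc, s, e, b"<rec>", b"</rec>"))

for r in records:
    print(r.decode())
```

With these inputs the second and third records each cross a split boundary, yet every record comes out exactly once, because a split only claims the records that begin inside it and the next split skips ahead to its own first begin tag.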
Database & Information Systems Group, Korea University
On Thu, Jul 30, 2009 at 5:30 PM, Wasim Bari wrote:
I am looking to store some really big XML files in HDFS and then process them using MapReduce.
Do we have some utility which uploads the XML files to HDFS while making sure that splitting the file into blocks doesn't break an element (I mean half an element on one block and half on some other)?
Any suggestions to work this out will be greatly appreciated.