I have files containing text from all over the world in a mix of character
encodings: UTF-8, Latin-1, Latin-9, and about ten others. They are raw
international IM logs. Is there a way to load these files into Hadoop as-is?
Is it smart enough to interpret each file correctly? The files total
petabytes, and I want to write some Hive queries to find patterns. Please
bear with me, as I am a newbie.

I know I can set the character set at the server level, but I want to make
sure there is no other setting I am missing. For example, in MySQL I can set
the character set at the database level.
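For context on why a setting alone may not be enough: Hadoop stores bytes as-is, but Hive's default text SerDe assumes UTF-8, so files in other encodings will produce garbled query results unless they are either converted up front or read through a SerDe configured for their encoding. A common approach is to normalize everything to UTF-8 before loading. The sketch below is one minimal, assumed way to do that in Python; the `CANDIDATE_ENCODINGS` list and the `to_utf8` helper are illustrative, not part of any Hadoop API, and note that single-byte charsets such as Latin-1 and Latin-9 cannot be reliably told apart without heuristics (e.g. a detection library, not shown here).

```python
# Sketch (assumed approach): normalize mixed-encoding log files to UTF-8
# before loading them into HDFS, since Hive's text SerDe expects UTF-8.

# Order matters: UTF-8 must be tried first, because ISO-8859-1 "successfully"
# decodes any byte sequence and therefore acts as the catch-all last entry.
CANDIDATE_ENCODINGS = ["utf-8", "iso-8859-1"]

def to_utf8(raw: bytes) -> bytes:
    """Return the input re-encoded as UTF-8, trying each candidate in order."""
    for enc in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(enc).encode("utf-8")
        except UnicodeDecodeError:
            continue
    # Unreachable while iso-8859-1 is in the list, but kept as a safety net:
    return raw.decode("utf-8", errors="replace").encode("utf-8")
```

After conversion, the files can be pushed into HDFS (e.g. with `hadoop fs -put`) and queried normally. At petabyte scale you would run the same logic inside a MapReduce job rather than on one machine; the per-record transformation is the same.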

Thanks so much!

Discussion Overview
group: user @
categories: hive, hadoop
posted: Sep 25, '09 at 9:58p
active: Sep 27, '09 at 12:25a

2 users in discussion: Zheng Shao (1 post), Tom kersnick (1 post)


