Grokbase Groups Hive user April 2011

We have a situation where data arriving in Hive from source systems may
itself contain common delimiter characters such as |, tabs, and newline
characters.

We may therefore have to use multi-character delimiters, such as "|#" for
columns and "||#" for rows.

How can we achieve this? In this case our rows may look like the example below
(|# is the column delimiter and ||# is the row delimiter):

row 1 col1 |# row 1 col2 |# row 1 col 3 has
new line characters |# and this is
the last column of row 1 ||# row 2 col1 |# row 2 col2 |# row 2 col 3 has
one tab and one new line character |# and this is
the last column of row 2 ||#

Would a custom SerDe help us handle this situation?
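
To make the intended parsing concrete, here is a minimal sketch of the splitting logic outside Hive, written in Python. It is not a SerDe and does not use any Hive API; the function name parse_records and the choice to trim whitespace around each column are assumptions made for illustration. Note that in Hive the row boundary is normally decided by the InputFormat's record reader before a SerDe sees the record, so the row-level split shown here would probably have to live there rather than in the SerDe itself.

```python
# Delimiters taken from the question above.
ROW_DELIM = "||#"
COL_DELIM = "|#"


def parse_records(text):
    """Split raw text into rows on ROW_DELIM, then each row into columns.

    Rows are split first because ROW_DELIM ends with COL_DELIM; splitting
    on the column delimiter first would cut every row delimiter in half.
    Trimming whitespace around columns is an assumption, not a requirement.
    """
    rows = []
    for raw_row in text.split(ROW_DELIM):
        if not raw_row.strip():
            continue  # skip the empty chunk after the final row delimiter
        rows.append([col.strip() for col in raw_row.split(COL_DELIM)])
    return rows


# The sample from the question, with the embedded tab written explicitly.
sample = (
    "row 1 col1 |# row 1 col2 |# row 1 col 3 has\n"
    "new line characters |# and this is\n"
    "the last column of row 1 ||# row 2 col1 |# row 2 col2 |# row 2 col 3 has\n"
    "one tab\tand one new line character |# and this is\n"
    "the last column of row 2 ||#"
)

records = parse_records(sample)
for record in records:
    print(len(record), record)
```

One caveat with these particular delimiters: a column value that ends in "|" and is immediately followed by "|#" is indistinguishable from a row delimiter, so the source data would have to guarantee that this never occurs.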

Thanks and Regards,

Discussion Overview
group: user @
categories: hive, hadoop
posted: Apr 27, '11 at 6:06a
active: May 7, '11 at 11:51p