FAQ
I am new to Hadoop/Pig. I have two data sets in HDFS, both CSV files: one
holds the Persons and the other holds the Addresses. Both data sets carry
the person's unique id, called personid. I want to load both sets in my
Apache Pig script and produce JSON with a top-level Person object and
nested address objects. I couldn't find an easy way to do this. I would
appreciate any help. Thank you.

Example JSON:

{
  "person": {
    "personid": "1",
    "address": [
      { "city": "A" },
      { "city": "B" }
    ]
  }
}


  • Harsha at Feb 12, 2013 at 7:15 pm
    Hi Satish,
    From what I understand, you are trying to convert your CSV files into JSON objects. Try joining your two data sets on personid (more on JOIN here: http://pig.apache.org/docs/r0.10.0/basic.html#JOIN). Once you have the data in one relation, pass it to a UDF that constructs the JSON object (http://pig.apache.org/docs/r0.10.0/udf.html#udf-java).

    --
    Harsha
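The nesting step that such a UDF would perform can be sketched in plain Java. This is only an illustration under stated assumptions, not code from the thread: the class name PersonJsonSketch and its methods are hypothetical, the joined rows are assumed to be (personid, city) pairs, and the JSON is assembled by hand so the sketch has no dependencies. In an actual Pig job this logic would sit inside a class extending org.apache.pig.EvalFunc, as described in the UDF manual linked above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Pig: it shows only the nesting step.
class PersonJsonSketch {

    // Wrap a value in double quotes, escaping quotes and backslashes.
    // Control characters are not handled in this sketch.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Build one person object containing a list of address objects.
    static String toJson(String personId, List<String> cities) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"person\":{\"personid\":").append(quote(personId));
        sb.append(",\"address\":[");
        for (int i = 0; i < cities.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"city\":").append(quote(cities.get(i))).append('}');
        }
        return sb.append("]}}").toString();
    }

    // Group joined (personid, city) rows by personid, keeping input order.
    static Map<String, List<String>> groupByPerson(List<String[]> joinedRows) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] row : joinedRows) {
            grouped.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Assumed shape of the joined data: one row per (personid, city).
        List<String[]> joined = new ArrayList<>();
        joined.add(new String[] {"1", "A"});
        joined.add(new String[] {"1", "B"});
        for (Map.Entry<String, List<String>> e : groupByPerson(joined).entrySet()) {
            System.out.println(toJson(e.getKey(), e.getValue()));
        }
    }
}
```

A join yields one row per person/address pair, so the rows have to be grouped by personid before the address array can be built; that is why the sketch groups first and serializes second. A production UDF would more likely use a JSON library than hand-built strings.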


Discussion Overview
group: user @
categories: pig, hadoop
posted: Feb 12, '13 at 6:30p
active: Feb 12, '13 at 7:15p
posts: 2
users: 2
website: pig.apache.org

2 users in discussion

Harsha: 1 post Satish Kolli: 1 post
