Grokbase Groups Pig user August 2011
Hi All,

I am trying to join some HBase tables in Pig, and I am using
HBaseStorage to load the data from HBase into Pig.

I was able to load my data using HBaseStorage, but I have one problem. My
HBase tables are large and contain historic data, so I want to load only the
data from the last hour or day. Is there a way I can do this? I
read up on HBaseStorage but couldn't find a way to achieve this.

Kindly suggest if this can be done.

Thanks
Gayatri

  • Bill Graham at Aug 15, 2011 at 6:10 pm
    If your row keys are time-based, you can filter on them in the constructor
    with the -lt and -gt params. If instead you want to filter by cell
    timestamp, PIG-2114 is currently underway to support that, but it's not
    there yet.
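
A minimal sketch of the row-key filtering Bill describes, written as a small Python helper that builds the Pig LOAD statement for a time window. It assumes a hypothetical row key layout of `<yyyyMMddHHmmss>_<id>`; the table name `events`, the column `cf:value`, and the helper names are illustrative and not from the thread. Because the keys are compared as strings, `-gt` on the bare timestamp prefix still includes keys from that exact second, and `-lt` excludes keys at or after the upper bound.

```python
from datetime import datetime, timedelta

# Assumed (hypothetical) row key layout: "<yyyyMMddHHmmss>_<id>".
KEY_TIME_FORMAT = "%Y%m%d%H%M%S"


def row_key_bounds(now, window):
    """Return (lower, upper) row key prefixes covering [now - window, now)."""
    return (now - window).strftime(KEY_TIME_FORMAT), now.strftime(KEY_TIME_FORMAT)


def pig_load_statement(table, columns, now, window):
    """Build a Pig LOAD statement restricted to the given time window."""
    lower, upper = row_key_bounds(now, window)
    # -gt / -lt bound the row key range; -loadKey returns the key as the first field.
    options = f"-loadKey true -gt {lower} -lt {upper}"
    return (
        f"rows = LOAD 'hbase://{table}' USING "
        f"org.apache.pig.backend.hadoop.hbase.HBaseStorage("
        f"'{columns}', '{options}');"
    )


if __name__ == "__main__":
    # Illustrative table and column names; substitute your own.
    print(pig_load_statement(
        "events", "cf:value", datetime(2011, 8, 15, 18, 0, 0), timedelta(hours=1)
    ))
```

This only helps if the timestamp is the leading part of the row key, since the filter is a lexicographic range over keys.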

  • Gayatri Rao at Aug 15, 2011 at 6:27 pm
    Thanks Bill.

    My row keys are currently IDs (alphanumeric).
    By row keys being time-based, did you mean appending the timestamp to the
    keys?

    Thanks for pointing out the JIRA issue; I will check that as well.

    -Gayatri
  • Dmitriy Ryaboy at Aug 15, 2011 at 6:38 pm
    That would have to be *pre*pending, which causes problems (hotspots) on load,
    since keys that sort by time send all new writes to the same region.

    It might be better to use timestamps (support work is underway) or to
    design your schema such that you have separate columns for separate
    epochs, and scan ranges of columns.

    D
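
A rough sketch of the "separate columns for separate epochs" idea, assuming a hypothetical schema in which the row key stays the plain ID and each hour of data is written to its own column named `<family>:<yyyyMMddHH>`. The family `cf`, the bucket size, and the helper name are assumptions for illustration. The helper lists the columns for the most recent hours, which can then be passed as the space-separated column list to HBaseStorage.

```python
from datetime import datetime, timedelta

# Hypothetical layout: one column qualifier per hour, "<family>:<yyyyMMddHH>".
BUCKET_FORMAT = "%Y%m%d%H"


def hourly_columns(family, now, hours):
    """List the column names for the `hours` most recent hourly buckets, oldest first."""
    top = now.replace(minute=0, second=0, microsecond=0)
    buckets = [top - timedelta(hours=h) for h in range(hours - 1, -1, -1)]
    return [f"{family}:{b.strftime(BUCKET_FORMAT)}" for b in buckets]


if __name__ == "__main__":
    # Space-separated, as expected by the HBaseStorage column argument.
    print(" ".join(hourly_columns("cf", datetime(2011, 8, 15, 18, 38), 3)))
```

Row keys remain evenly distributed under this layout, so writes are not funnelled into one region the way they are with time-prefixed keys.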

Discussion Overview
group: user @
categories: pig, hadoop
posted: Aug 15, '11 at 5:58p
active: Aug 15, '11 at 6:38p
posts: 4
users: 3
website: pig.apache.org
