FAQ
Since HBase is tailored to handle a single table very well, we are thinking of putting multiple tables into one big table, but on different column family sets. Our use case is full table scans against single-column-value filters. Since records from different "logical tables" live in different column families, could we speed up scan performance by first checking the column family referenced by these single-column-value filters, before going through all the underlying K-V pairs? It would be great if the HBase code already works that way.


$0.02,
Thomas
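For the record, in the HBase client API this restriction is expressed with `Scan.addFamily(...)` plus a `SingleColumnValueFilter`. The toy model below (plain Python, with made-up store and row names, not HBase code) just illustrates the effect the question asks for: when the filter references a single column family, only that family's store needs to be read.

```python
# Each "store" holds one column family's data: {row_key: {qualifier: value}}.
# Store, row, and column names are invented for illustration.
stores = {
    "cf_orders": {"r1": {"status": "open"}, "r2": {"status": "closed"}},
    "cf_users":  {"r3": {"name": "alice"}, "r4": {"name": "bob"}},
}

def scan_single_cf(family, qualifier, value):
    """Read only the store for `family`; other families are never touched."""
    return sorted(row for row, cols in stores[family].items()
                  if cols.get(qualifier) == value)

print(scan_single_cf("cf_orders", "status", "open"))  # ['r1']
```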


  • Todd Lipcon at Feb 15, 2012 at 10:02 pm
    Hi Thomas,

    The issue with combining multiple tables into different CFs of one
    table is that the tables will get tied together for flush/compact
    operations. If the workload between them differs significantly you
    might introduce bad inefficiency for one or the other. See HBASE-3149.

    -Todd
    On Wed, Feb 15, 2012 at 1:57 PM, Pan, Thomas wrote:



    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Stack at Feb 15, 2012 at 10:07 pm

    On Wed, Feb 15, 2012 at 2:02 PM, Todd Lipcon wrote:
    Are the two column families bulk loaded at the same time, Thomas?

    Do updates come in as a trickle over the API while the main loading is via
    bulk load (across the multiple column families)?

    St.Ack
  • Pan, Thomas at Feb 17, 2012 at 9:27 pm
    Currently, bulk load is used to bootstrap the table(s), while random
    writes are the ongoing pattern; we can assume the operations are evenly
    distributed over time across all the column families. -Thomas
    On 2/15/12 2:07 PM, "Stack" wrote:
  • Pan, Thomas at Feb 17, 2012 at 6:49 pm
    In our case, we have similar updating patterns for completed and live
    items. $0.02, -Thomas
    On 2/15/12 2:02 PM, "Todd Lipcon" wrote:

  • Vladimir Rodionov at Feb 15, 2012 at 10:27 pm
    I think having a unique row prefix for every table is the standard way of
    storing multiple virtual tables inside one BigTable-style table. You get
    data locality for every virtual table, and you can easily specify start
    and stop rows for a Scan.

    Assigning a separate CF to each virtual table is a bad idea, because data
    from different virtual tables will end up interleaved: the CF comes after
    the row key in the default BigTable (HBase) comparison routine.
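Vladimir's row-prefix scheme maps each virtual table to a contiguous key range, so a scan can be bounded with start and stop rows. A minimal sketch of that range computation (plain Python; the handling of trailing 0xFF bytes mirrors the usual prefix-scan trick):

```python
def prefix_range(prefix: bytes):
    """Return the [start, stop) row range covering every key with `prefix`.

    stop is the prefix with its last byte incremented; trailing 0xFF bytes
    are dropped first. An empty stop row means "scan to end of table".
    """
    p = bytearray(prefix)
    while p and p[-1] == 0xFF:
        p.pop()                    # 0xFF cannot be incremented; carry over
    if p:
        p[-1] += 1
        return prefix, bytes(p)
    return prefix, b""             # prefix was all 0xFF: no upper bound

# Scan everything belonging to the virtual table prefixed "orders:".
start, stop = prefix_range(b"orders:")
print(start, stop)   # b'orders:' b'orders;'  (':' + 1 == ';')
```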

    Best regards,
    Vladimir Rodionov
    Principal Platform Engineer
    Carrier IQ, www.carrieriq.com
    e-mail: vrodionov@carrieriq.com



    Confidentiality Notice: The information contained in this message, including any attachments hereto, may be confidential and is intended to be read only by the individual or entity to whom this message is addressed. If the reader of this message is not the intended recipient or an agent or designee of the intended recipient, please note that any review, use, disclosure or distribution of this message or its attachments, in any form, is strictly prohibited. If you have received this message in error, please immediately notify the sender and/or Notifications@carrieriq.com and delete or destroy any copy of this message and its attachments.
  • Jacques at Feb 15, 2012 at 11:45 pm
    Out of curiosity, what do you perceive as the benefit to having only one
    table? Are there reasons that you think one table would perform better
    than a few?

    If you're splitting data within a table because you'd otherwise have
    millions of tables, I understand that and would concur with Vladimir's
    approach below. However, if you're really looking at 10 tables versus one
    table, it seems like HBase is built exactly to make that work well (rather
    than having to make all sorts of application level code to do what HBase
    already does).

    thanks,
    Jacques
    On Wed, Feb 15, 2012 at 1:57 PM, Pan, Thomas wrote:


  • Vladimir Rodionov at Feb 16, 2012 at 12:15 am
    10 tables are fine; 1000 are not, especially when one pre-splits tables to increase write performance.

    Too many regions kill HBase.

    Best regards,
    Vladimir Rodionov
    Principal Platform Engineer
    Carrier IQ, www.carrieriq.com
    e-mail: vrodionov@carrieriq.com

  • Andrew Purtell at Feb 16, 2012 at 1:44 am
    Too many regions kill HBase.
    How many regions do you carry per RS? What was the effective limit you encountered? Curious.

    The available public information is getting old now, but BigTable deployments at Google limited the number of tablets per tablet server to ~100. This was no doubt for a number of reasons related to their specific hardware configuration: having enough RAM to keep in-memory tables resident, the fact that they had only something like 160 or 320 GB of local storage, and so on; but also presumably to limit the scope of failure of a given server, and to keep overheads down.

    I advise our ops people to set notifications for when the number of regions per HBase RegionServer gets above 500. The more regions per server, the more must be relocated per server failure, and the longer some regions will be in transition. When we get close to the limit, it's time to add another RegionServer. (Even if HBase could handle 10,000 regions per RegionServer, that wouldn't be a good idea without a distributed master of some kind.) If you are scaling out for this reason already, then the region-carrying capacity of the cluster is also scaling. We have many thousands of regions and region housekeeping overhead is not an issue, although we are certainly not the largest deployment. Currently the META region isn't split; I think that might impose an effective upper bound at some point, but that can be fixed. There's no architectural limit that I am aware of.

    Best regards,

    - Andy

    Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


  • Pan, Thomas at Feb 17, 2012 at 6:56 pm
    Vladimir and Jacques, thanks for the information! Unless HBase handles
    multiple big tables (with relatively high region counts) well in one
    cluster, it seems to me that one big table is the way to go; otherwise,
    runtime tuning seems to add quite a lot of operational cost. That leads to
    another question: do we see large region size as an issue? If so, what is
    the tipping point where, as region size grows further, scan performance
    starts to degrade exponentially?
    On 2/15/12 4:11 PM, "Vladimir Rodionov" wrote:

  • Jacques at Feb 17, 2012 at 10:47 pm
    You should be fine having multiple tables with high region counts. I would
    avoid making thousands of tables. However, if you have three separate
    business needs, make three different tables.

    You seem to be starting with a perspective that there would be some kind of
    issues with multiple tables. Why do you think this exists? You said
    "Otherwise, runtime tuning seems to add quite amount of operational cost."
    I'm not sure what you are thinking here and where your thoughts are coming
    from. Additionally, if you have separate tables, then you can modify them
    differently (e.g. setting them to different region sizes if it makes
    sense-- for example, some of our tables have smaller region sizes so we'll
    have more maps rather than fewer when we run map reduce jobs).

    Regarding region size: the HFile v1 format in 0.90 and below suffered from
    long transition times as individual regions got too big. With 0.92 and
    HFile v2 that isn't as much of a problem, as I understand it. If I recall
    correctly, there are numerous organizations using 10 GB regions with
    success (among others, I believe this is what Yahoo reported using for
    their web crawl tables on their thousand-node cluster). While I haven't
    run any stats, I believe that there is negligible scan performance impact
    as region size grows. There is definitely no exponential negative
    performance impact.
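One concrete effect of the point about smaller regions yielding more maps: TableInputFormat creates, by default, one map task per region, so region size directly sets MapReduce parallelism. Back-of-the-envelope numbers (the 1 TB table size is purely illustrative):

```python
table_gb = 1024  # a hypothetical 1 TB table

for region_gb in (1, 10):
    maps = table_gb // region_gb   # roughly one map per region by default
    print(f"{region_gb} GB regions -> ~{maps} map tasks")
```

Smaller regions give a finer-grained job at the cost of more regions for the cluster to carry, which is the trade-off discussed in this thread.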


    On Fri, Feb 17, 2012 at 10:55 AM, Pan, Thomas wrote:


  • Pan, Thomas at Feb 18, 2012 at 7:26 am
    Jacques, thanks for the details on region size. We've observed that
    regions per region server can skew heavily at the table level. We do have
    a tool to balance regions; still, it is somewhat annoying to maintain the
    balance. $0.02, -Thomas
    On 2/17/12 2:46 PM, "Jacques" wrote:

  • M. C. Srivas at Feb 19, 2012 at 4:38 pm
    What is the impact when a compaction happens on a large 20G region? Given
    that the FS will do writes at 30 MB/s (over a single 1 GigE link), it will
    take about 1500 seconds to read/write the region. Is the region out of
    service for 25 mins (= 1500 seconds)?
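The arithmetic behind that question, spelled out (the 20 GB region size and 30 MB/s throughput are the assumptions from the message above; a major compaction both reads and rewrites the region):

```python
region_bytes = 20 * 1024**3   # 20 GB region (assumed)
throughput   = 30 * 1024**2   # 30 MB/s effective FS throughput (assumed)

one_way = region_bytes / throughput   # seconds to read OR write once
total   = 2 * one_way                 # compaction does both
print(f"{one_way:.0f} s one way, ~{total / 60:.0f} min total")
```

With binary units this comes out to roughly 683 s each way, about 23 minutes total, on the order of the ~25 minutes cited.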

    On Fri, Feb 17, 2012 at 11:25 PM, Pan, Thomas wrote:


    Jacques, thanks for the details on region size. We've observed that
    regions per region server could skew big time at the table level. We do
    have tool to balance regions. Still, it is sort of annoying to maintain
    the balance. $0.02, -Thomas
    On 2/17/12 2:46 PM, "Jacques" wrote:

    You should be fine having multiple tables with high region counts. I
    would
    avoid making thousands of tables. However, if you have three separate
    business needs, make three different tables.

    You seem to be starting with a perspective that there would be some kind
    of
    issues with multiple tables. Why do you think this exists? You said
    "Otherwise, runtime tuning seems to add quite amount of operational cost."
    I'm not sure what you are thinking here and where your thoughts are coming
    from. Additionally, if you have separate tables, then you can modify them
    differently (e.g. setting them to different region sizes if it makes
    sense-- for example, some of our tables have smaller region sizes so we'll
    have more maps rather than fewer when we run map reduce jobs).

    Regarding region size: the HTable v1 format in 0.90 and below suffered
    from
    taking a long time to transition as individual regions got too big. With
    0.92 and HTablev2 that isn't as much of a problem as I understand it. If
    I
    recall correctly, there are numerous organizations using 10gb regions with
    sucess-- (among others, I believe this what Yahoo reported they were using
    for their web crawl tables on their thousand node cluster). While I
    haven't run any stats, I believe that there is negligible scan performance
    impact as region size grows. There is definitely no exponential negative
    performance impact.


    On Fri, Feb 17, 2012 at 10:55 AM, Pan, Thomas wrote:


    Vladimire and Jacques, Thanks for the information! Unless Hbase well
    handles multiple big sized tables (relatively high region count) in one
    cluster, it seems to me that one big table is the way to go. Otherwise,
    runtime tuning seems to add quite amount of operational cost. That leads
    to another question. Do we see big region size as an issue? If so,
    what's
    the pivot point as region size grows further, the scan performance
    starts
    to degrade exponentially?
    On 2/15/12 4:11 PM, "Vladimir Rodionov" wrote:

    10 tables are fine. 1000 are not, especially when one does table
    pre-splitting to increase write perf.

    Too many regions kill HBase.

    Best regards,
    Vladimir Rodionov
    Principal Platform Engineer
    Carrier IQ, www.carrieriq.com
    e-mail: vrodionov@carrieriq.com

    ________________________________________
    From: Jacques [whshub@gmail.com]
    Sent: Wednesday, February 15, 2012 3:45 PM
    To: dev@hbase.apache.org
    Subject: Re: Scan performance on a big table as combination of multiple
    logic tables

    Out of curiosity, what do you perceive as the benefit to having only one
    table? Are there reasons that you think one table would perform better
    than a few?

    If you're splitting data within a table because you'd otherwise have
    millions of tables, I understand that and would concur with Vladimir's
    approach below. However, if you're really looking at 10 tables versus one
    table, it seems like HBase is built exactly to make that work well (rather
    than having to make all sorts of application level code to do what HBase
    already does).

    thanks,
    Jacques
    On Wed, Feb 15, 2012 at 1:57 PM, Pan, Thomas wrote:


    Since Hbase is tailored to handle one table very well, we are
    thinking
    to
    put multiple tables into one big table but on different column family
    sets.
    Our use case is full table scan against single column value filters.
    As
    records from different "logical tables" are at different column
    families,
    could we speed up the scan performance by simply checking the column
    family
    referenced by these single column value filters first before really
    going
    through all the underlying K-V pairs? It would be great if the Hbase
    code
    is already coded that way.


    $0.02,
    Thomas
    Confidentiality Notice: The information contained in this message,
    including any attachments hereto, may be confidential and is intended to
    be read only by the individual or entity to whom this message is
    addressed. If the reader of this message is not the intended recipient or
    an agent or designee of the intended recipient, please note that any
    review, use, disclosure or distribution of this message or its
    attachments, in any form, is strictly prohibited. If you have received
    this message in error, please immediately notify the sender and/or
    Notifications@carrieriq.com and delete or destroy any copy of this
    message and its attachments.
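
    [Editorial aside: the optimization Thomas asks about can be modeled in a few
    lines of plain Python -- this is a toy sketch, not HBase code. Rows are dicts
    mapping column family -> {qualifier: value}; the "optimized" scan checks the
    filter's column family for the row before touching any K-V pairs.]

    ```python
    # Toy model (NOT HBase code): compare a naive scan that walks every KV pair
    # against one that first checks whether the row has any data in the filter's
    # column family and skips the row otherwise.

    def scan_naive(rows, cf, qual, value):
        matches = []
        for row_key, families in rows.items():
            for fam, cols in families.items():          # touches every family
                if fam == cf and cols.get(qual) == value:
                    matches.append(row_key)
                    break
        return matches

    def scan_family_first(rows, cf, qual, value):
        matches = []
        for row_key, families in rows.items():
            cols = families.get(cf)                     # cheap family membership check
            if cols is None:
                continue                                # row is from another "logical table"
            if cols.get(qual) == value:
                matches.append(row_key)
        return matches

    rows = {
        "r1": {"cfA": {"q": "x"}},
        "r2": {"cfB": {"q": "x"}},   # different logical table, skipped early
        "r3": {"cfA": {"q": "y"}},
    }
    assert scan_naive(rows, "cfA", "q", "x") == scan_family_first(rows, "cfA", "q", "x") == ["r1"]
    ```

    Both scans return the same matches; the family-first variant just avoids
    iterating KV pairs of rows that belong to other logical tables.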
  • Mikael Sitruk at Feb 19, 2012 at 9:46 pm
    During compaction the region is not out of service.
    According to the documentation, the max region size for the v2 format is 20G.
    And now the question: assuming that 20G is the limit and that the number of
    regions on a single RS should stay low (< 500), it means there is no point
    in an RS having more than 10TB of storage for HBase to use (otherwise
    locality will not be achieved for some servers; I also assume that
    compression is used, which compensates for the additional space needed for
    replication)?
    If the max number of regions per RS is smaller, then the usable storage is
    even smaller. Is that correct?
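
    [Editorial aside: a back-of-envelope check of the numbers in the question
    above -- 20G max region size and fewer than 500 regions per region server.
    The figures are the thread's assumptions, not HBase limits.]

    ```python
    # Usable storage per region server under the thread's assumptions:
    # max region size * max region count, before compression and replication.
    GB = 1024**3
    max_region_size = 20 * GB
    max_regions_per_rs = 500
    usable_per_rs = max_regions_per_rs * max_region_size
    print(usable_per_rs / GB)   # 10000 GB, i.e. roughly 10TB per RS
    ```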

    Mikael.S
    On Sun, Feb 19, 2012 at 6:38 PM, M. C. Srivas wrote:

    What is the impact when a compaction happens on a large 20G region? Given
    that the FS will do writes at 30 MB/s (over a single 1 GigE link), it will
    take about 1500 seconds to read/write the region. Is the region out of
    service for 25 mins (= 1500 seconds)?

    On Fri, Feb 17, 2012 at 11:25 PM, Pan, Thomas wrote:


    Jacques, thanks for the details on region size. We've observed that
    regions per region server can skew big time at the table level. We do
    have a tool to balance regions. Still, it is sort of annoying to maintain
    the balance. $0.02, -Thomas
    On 2/17/12 2:46 PM, "Jacques" wrote:

    You should be fine having multiple tables with high region counts. I would
    avoid making thousands of tables. However, if you have three separate
    business needs, make three different tables.

    You seem to be starting from the perspective that there would be some kind
    of issue with multiple tables. Why do you think this exists? You said
    "Otherwise, runtime tuning seems to add quite amount of operational cost."
    I'm not sure what you are thinking here and where your thoughts are coming
    from. Additionally, if you have separate tables, then you can tune them
    differently (e.g. setting them to different region sizes if it makes
    sense -- for example, some of our tables have smaller region sizes so we'll
    have more maps rather than fewer when we run map reduce jobs).

    Regarding region size: the HFile v1 format in 0.90 and below suffered from
    taking a long time to transition as individual regions got too big. With
    0.92 and HFile v2 that isn't as much of a problem, as I understand it. If I
    recall correctly, there are numerous organizations using 10gb regions with
    success (among others, I believe this is what Yahoo reported they were
    using for their web crawl tables on their thousand-node cluster). While I
    haven't run any stats, I believe that there is negligible scan performance
    impact as region size grows. There is definitely no exponential negative
    performance impact.


    On Fri, Feb 17, 2012 at 10:55 AM, Pan, Thomas wrote:


    Vladimir and Jacques, thanks for the information! Unless HBase handles
    multiple big tables (relatively high region count) well in one cluster,
    it seems to me that one big table is the way to go. Otherwise, runtime
    tuning seems to add quite an amount of operational cost. That leads to
    another question. Do we see big region size as an issue? If so, what's
    the pivot point where, as region size grows further, the scan performance
    starts to degrade exponentially?

    --
    Mikael.S
  • Jean-Daniel Cryans at Feb 21, 2012 at 8:09 pm

    On Sun, Feb 19, 2012 at 1:45 PM, Mikael Sitruk wrote:
    During compaction the region is not out of service.
    According to documentation the max region size for V2 format is 20G
    And now the question: Assuming that 20G is the limit and the number of
    regions in a single RS should stay low < 500 it means that there is no mean
    having RS with more than 10TB of storage to use by HBase (otherwise
    locality will not be achieve for some servers, i also assume that
    compression is used and therefore it compensate the need for additional
    space for replication)?
    If the max number of region per RS is smaller then the storage size is even
    smaller. Is it correct?
    In the documentation 20GB is given as an example of a larger size that
    can be supported, but nothing blocks you from going way higher than
    that. I've done some import tests and had 100GB regions. It just takes
    a while to compact the bigger files.

    Also you can go over 500 regions, in fact one of our clusters has
    14,398 regions right now. It's just a pain to reassign everything when
    HBase boots but this is an offline cluster.

    J-D
  • Mikael Sitruk at Feb 21, 2012 at 9:18 pm
    This is interesting J-D. So, is there a limitation on the region size or
    not? Can it really be any number? If so, besides the collection time, is
    there any impact (perhaps the documentation should be updated too)?
    Regarding the number of regions you have (14,398), is it for a single RS?
    What is your number of RS?

    Mikael.S
  • Jean-Daniel Cryans at Feb 21, 2012 at 9:40 pm

    On Tue, Feb 21, 2012 at 1:17 PM, Mikael Sitruk wrote:
    This is interesting J.D. so, is there a limitation on the region size or
    not?
    Your imagination? Like I said nothing blocks you in the code.
    Can it be really any number?
    That's what it implies.
    If so beside the collection time is there
    any impact (perhaps the documentation should be updated too)?
    Collection time? You mean GC? Sorry I don't get what you mean.
    Regarding the number of regions you have (14,398) is it for a single RS?
    What is your number of RS?
    Currently 91 in that cluster. It varies :)

    We have >200 tables coming all in different sizes.

    J-D
  • Mikael Sitruk at Feb 21, 2012 at 9:58 pm
    See inline
    On Feb 21, 2012 11:40 PM, "Jean-Daniel Cryans" wrote:
    On Tue, Feb 21, 2012 at 1:17 PM, Mikael Sitruk wrote:
    This is interesting J.D. so, is there a limitation on the region size or
    not?
    Your imagination? Like I said nothing blocks you in the code.
    Can it be really any number?
    That's what it implies.
    If so beside the collection time is there
    any impact (perhaps the documentation should be updated too)?
    Collection time? You mean GC? Sorry I don't get what you mean.
    *Sorry, typo mistake (from mobile) I meant compaction not collection
    Regarding the number of regions you have (14,398) is it for a single RS?
    What is your number of RS?
    Currently 91 in that cluster. It varies :)

    We have >200 tables coming all in different sizes.
    *Not clear, 91 rs, and 14398 regions in total? Or per RS?
    Mikael.S
  • Jean-Daniel Cryans at Feb 21, 2012 at 10:14 pm

    On Tue, Feb 21, 2012 at 1:57 PM, Mikael Sitruk wrote:
    If so beside the collection time is there
    any impact (perhaps the documentation should be updated too)?
    Collection time? You mean GC? Sorry I don't get what you mean.
    *Sorry, typo mistake (from mobile) I meant compaction not collection
    Ah! Well there's a ton of impacts starting from having less regions :)
    But definitely compactions will take a lot longer the bigger the
    regions are since more and more is done in a single process. The
    documentation could definitely have more info on that.
    Regarding the number of regions you have (14,398) is it for a single RS?
    What is your number of RS?
    Currently 91 in that cluster. It varies :)

    We have >200 tables coming all in different sizes.
    *Not clear, 91 rs, and 14398 regions in total? Or per RS?
    Oh sorry, total. 14k on a single RS is impossible/suicide if you have
    any data in there because it would OOME trying to load the indexes
    (better in 0.92 tho).

    J-D
  • Mikael Sitruk at Feb 21, 2012 at 10:31 pm
    Ok, so this is approx 150 regions per RS.
    What is the math relating the memory (index size) to the number of regions?
    (Btw, at the beginning when I mentioned 500 regions it was per RS.)
    I'm trying to figure out what my cluster configuration should be regarding
    regions, region size, memory size, and number of RS for the volume and
    workload I'm using.
  • Jean-Daniel Cryans at Feb 21, 2012 at 11:32 pm
    This describes how they are written; with your knowledge of your data
    size and average key size you can do the math:

    http://hbase.apache.org/book.html#d0e9542

    J-D
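
    [Editorial aside: a rough sketch of the math J-D refers to. An HFile block
    index holds roughly one entry (key plus offset bookkeeping) per block, so
    index size scales with region size / block size. The block size below is the
    usual 64KB default; the key size and per-entry overhead are illustrative
    assumptions, not HBase constants.]

    ```python
    # Estimate block-index memory for one region; multiply by the region count
    # per server to gauge heap pressure when regions are opened.
    GB, KB = 1024**3, 1024
    region_size = 10 * GB
    block_size = 64 * KB        # common default HFile block size
    avg_key_size = 50           # assumed average key size, bytes
    entry_overhead = 12         # assumed per-entry offset/size bookkeeping, bytes

    index_entries = region_size // block_size
    index_bytes = index_entries * (avg_key_size + entry_overhead)
    # ~10 MB of index per 10 GB region under these assumptions.
    ```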
  • Michael Stack at Feb 22, 2012 at 1:34 am

    On Tue, Feb 21, 2012 at 1:17 PM, Mikael Sitruk wrote:
    This is interesting J.D. so, is there a limitation on the region size or
    not? Can it be really any number? If so beside the collection time is there
    any impact (perhaps the documentation should be updated too)?
    Yes. It should not be read as a hard limit. If that is what it says,
    we need a patch for the doc.

    St.Ack
  • M. C. Srivas at Feb 22, 2012 at 1:45 am

    On Tue, Feb 21, 2012 at 12:08 PM, Jean-Daniel Cryans wrote:
    In the documentation 20GB is given as an example of a larger size that
    can be supported, but nothing blocks you from going way higher than
    that. I've done some import tests and had 100GB regions. It just takes
    a while to compact the bigger files.
    With no impact on Java GC going nuts? FB reported (a few months ago) that
    it was bad to run a region server with -Xmx larger than 15G or 16G. Unless
    that's no longer true, wouldn't that be a limiting factor for how large
    one should make regions?




  • Jean-Daniel Cryans at Feb 22, 2012 at 1:57 am

    In the documentation 20GB is given as an example of a larger size that
    can be supported, but nothing blocks you from going way higher than
    that. I've done some import tests and had 100GB regions. It just takes
    a while to compact the bigger files.
    With no impact on Java GC going nuts? FB reported (a few months ago) that
    it was bad to run a region server with -Xmx larger than 15G or 16G. Unless
    that's no longer true, wouldn't that be a limiting factor for how large
    one should make regions?
    You'll have to explain how having "big regions" means you GC a lot, I
    don't see the relation.

    J-D
  • Michael Stack at Feb 22, 2012 at 2:16 am

    On Tue, Feb 21, 2012 at 5:44 PM, M. C. Srivas wrote:
    With no impact on Java GC going nuts? FB reported (a few months ago) that
    it was bad to run a region server with -Xmx larger than 15G or 16G. Unless
    that's no longer true, wouldn't that be a limiting factor for how large
    one should make regions?
    We don't bring the total region into memory Srivas (Is that what you
    are thinking?).

    The FB recommendation against > 15G heaps was probably the old adage about
    big heaps taking a long time to sweep when GCing?

    Good on you,
    St.Ack
  • M. C. Srivas at Feb 22, 2012 at 5:29 am

    On Tue, Feb 21, 2012 at 6:16 PM, Stack wrote:
    We don't bring the total region into memory Srivas (Is that what you
    are thinking?).
    Yes, that was my thinking --- to do a major compaction the region-server
    would have to load all the flushed files for that region, merge them, and
    then write out the new region. If the region file was 20g in size, the
    region-server would require well over 20g of heap space to do this work.
    Am I completely off?


    The FB recommendation of > 15G heaps was probably the old adage around
    big heaps taking a long time to sweep when GCing?

    Good on you,
    St.Ack
  • Michael Stack at Feb 22, 2012 at 5:59 am

    On Tue, Feb 21, 2012 at 9:29 PM, M. C. Srivas wrote:
    Yes, that was my thinking --- to do a major compaction the region-server
    would have to load all the flushed files for that region, merge them, and
    then write out the new region. If the region file was 20g in size, the
    region-server would require well over 20g of heap space to do this work.
    Am I completely off?
    You are a little off. We open all the hfiles and then stream through each
    of them doing a merge sort, streaming the output to the new compacted
    file.

    Here is where we open a scanner on all the files to compact and then
    as we inch through, we figure what to write to the output:
    http://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/Store.html#1393

    (It's a bit hard to follow what's going on -- file selection is done
    higher up in the call chain.)

    St.Ack
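
    [Editorial aside: the streaming merge Stack describes is an ordinary k-way
    merge over sorted inputs, so heap usage is bounded by one buffered entry per
    input file rather than by region size. A minimal sketch in plain Python
    (toy model, not HBase code; the "keep first version per key" policy is a
    stand-in for HBase's real version handling):]

    ```python
    import heapq

    def compact(sorted_files):
        """Merge several sorted (key, value) streams into one output list.
        heapq.merge is lazy: at any moment it holds only one head entry
        per input stream, which is why a 20G region never needs 20G of heap."""
        merged = heapq.merge(*sorted_files)
        out, last_key = [], None
        for key, value in merged:
            if key != last_key:          # toy policy: keep first entry per key
                out.append((key, value))
                last_key = key
        return out

    hfile1 = [("a", 1), ("c", 3)]        # each "hfile" is already sorted by key
    hfile2 = [("a", 9), ("b", 2)]
    assert compact([hfile1, hfile2]) == [("a", 1), ("b", 2), ("c", 3)]
    ```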
  • M. C. Srivas at Feb 24, 2012 at 6:35 am

    On Tue, Feb 21, 2012 at 9:58 PM, Stack wrote:
    On Tue, Feb 21, 2012 at 9:29 PM, M. C. Srivas wrote:
    Yes, that was my thinking --- to do a major compaction the
    region-server
    would have to load all the flushed files for that region, merge them, and
    then write out the new region. If the region-file was 20g in size, the
    region-server would require well over 20g of heap space to do this work. Am
    I completely off?
    You are a little off. We open all hfiles and then stream through each
    of them doing a merge sort streaming the outputting to the new
    compacted file.
    Doh! Seems obvious once you mention it. Sorry about that.


    Here is where we open a scanner on all the files to compact and then
    as we inch through, we figure what to write to the output:

    http://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/Store.html#1393

    (Its a bit hard to follow whats going on -- file selection is done
    already higher up in call chain).

    St.Ack
  • Jean-Daniel Cryans at Feb 21, 2012 at 8:05 pm

    On Sun, Feb 19, 2012 at 8:38 AM, M. C. Srivas wrote:
    What is the impact when a compaction happens on a large 20G region?   Given
    that the FS will do writes at 30 MB/s (over a single 1 GigE link), it will
    take about 1500 seconds to read/write the region. Is the region out of
    service for 25 mins (= 1500 seconds)?
    It would be awful if it did :)

    And fortunately it does not.

    J-D
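
    [Editorial aside: the 25-minute figure in Srivas's question above is simple
    throughput arithmetic -- a 20G region pushed through ~30 MB/s of filesystem
    bandwidth (one 1 GigE link), counting a read pass and a write pass. As J-D
    notes, the region stays in service while this runs.]

    ```python
    # Where "~1500 seconds" comes from: region size / throughput, twice.
    region_bytes = 20 * 1024**3
    throughput = 30 * 1024**2            # ~30 MB/s over a single 1 GigE link
    one_pass = region_bytes / throughput # ~683 s to read (or write) the region
    total = 2 * one_pass                 # read + write: ~1365 s, i.e. ~23 min
    ```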
  • Pan, Thomas at Feb 24, 2012 at 6:45 pm
    Just a quick heads-up. Ted pointed me to this jira:
    https://issues.apache.org/jira/browse/HBASE-5416
    Max (the author) has confirmed that the patch provides what I want. :-)
  • Michael Stack at Feb 24, 2012 at 6:55 pm

    On Fri, Feb 24, 2012 at 10:44 AM, Pan, Thomas wrote:
    Just a quick heads-up. Ted pointed me to this jira:
    https://issues.apache.org/jira/browse/HBASE-5416
    Max (the author) has confirmed that the patch provides what I want. :-)
    What do you think about what Mikhael says at the end? Have you tried
    doing two scans: one to find the work to do and then another to do the
    work?

    St.Ack
  • Pan, Thomas at Feb 25, 2012 at 12:20 am
    He has a good point on unit test coverage. Atomicity is not a concern for
    the use case mentioned in this email thread. :-)
    The two-scan approach doesn't seem to help, as the second scan would still
    go through all the rows, if my understanding is correct.


    -Thomas

Discussion Overview
group: dev
categories: hbase, hadoop
posted: Feb 15, '12 at 9:57p
active: Feb 25, '12 at 12:20a
posts: 32
users: 9
website: hbase.apache.org
