Grokbase Groups HBase user June 2016
Hello,

We are running 1.2.0-cdh5.7.0 on our server side, and 1.0.0-cdh5.4.5 on the
client side. We're in the process of upgrading the client, but aren't there
yet. I'm trying to figure out how Result.isPartial affects user code when
setMaxResultSize is used.

I've done a little reading of the code, and it looks like isPartial is
mostly used by the internals of ClientScanner. From what I can tell the
user should never get a Result where isPartial == true, because the
ClientScanner will do multiple requests internally to flesh out incomplete
rows.

However, the code is a bit complex so I'd like to verify. Is this correct
for either version of HBase above? Is it safe to use setMaxResultSize
without any more work, or should we be handling the potential isPartial()
Result ourselves in every scan request we make?

I wonder whether this should be added to the docs either way (I didn't see
it there), or whether isPartial should be removed from the public API in
future versions.

Thanks!


  • Enis Söztutar at Jun 18, 2016 at 2:13 am
    You should probably read
    https://blogs.apache.org/hbase/entry/scan_improvements_in_hbase_1 first.

    In HBase 1.1 and later code bases, you can call
    Scan.setAllowPartialResults(true) to instruct the ClientScanner to give you
    partial results. In this case, you can use Result.isPartial() to stitch
    together multiple Result objects into a single row. Unless you explicitly
    request it, the Results returned will never be partial. Why would you want
    to call Scan.setAllowPartialResults(true) in the first place? Because of
    client-side memory allocation. If you have a row with, say, millions of
    columns and GBs of data, you cannot afford to have the ClientScanner
    auto-stitch all the column values into a single Result object, because
    that will cause an OOM.

    Hope this helps.
    Enis
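
For readers who do opt in to partial results, the stitching described above can be sketched roughly as follows. This is a minimal, self-contained model and not HBase client code: `Chunk` is a hypothetical stand-in for `Result`, its `partial` field models `Result.isPartial()`, and `PartialStitcher` and `stitch` are illustrative names. It assumes a row is complete once a non-partial chunk arrives or the row key changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartialStitcher {

    /** Hypothetical stand-in for an HBase Result; not the real class. */
    public static final class Chunk {
        public final String row;          // row key
        public final List<String> cells;  // cell values carried by this chunk
        public final boolean partial;     // models Result.isPartial()

        public Chunk(String row, boolean partial, List<String> cells) {
            this.row = row;
            this.partial = partial;
            this.cells = new ArrayList<>(cells); // defensive copy
        }
    }

    /** Combines consecutive chunks of the same row into one complete row each. */
    public static List<Chunk> stitch(List<Chunk> chunks) {
        List<Chunk> rows = new ArrayList<>();
        String pendingRow = null;
        List<String> pendingCells = new ArrayList<>();
        for (Chunk c : chunks) {
            // Assumption: a new row key means the buffered row is finished.
            if (pendingRow != null && !pendingRow.equals(c.row)) {
                rows.add(new Chunk(pendingRow, false, pendingCells));
                pendingCells.clear();
            }
            pendingRow = c.row;
            pendingCells.addAll(c.cells);
            // A non-partial chunk closes the row it belongs to.
            if (!c.partial) {
                rows.add(new Chunk(pendingRow, false, pendingCells));
                pendingCells.clear();
                pendingRow = null;
            }
        }
        // Flush whatever is still buffered when the input ends.
        if (pendingRow != null) {
            rows.add(new Chunk(pendingRow, false, pendingCells));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Chunk> scanned = Arrays.asList(
            new Chunk("row1", true, Arrays.asList("a", "b")),
            new Chunk("row1", false, Arrays.asList("c")),
            new Chunk("row2", false, Arrays.asList("d")));
        for (Chunk row : stitch(scanned)) {
            System.out.println(row.row + " has " + row.cells.size() + " cells");
        }
    }
}
```

Note that buffering a whole row like this brings back the memory cost that partial results are meant to avoid, so for very wide rows it is usually better to process each chunk as it arrives and only track the row boundary.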
  • Bryan Beaudreault at Jun 18, 2016 at 3:00 pm
    Thanks Enis, I had forgotten about that post! All of this makes sense now.

Discussion Overview
group: user @
categories: hbase, hadoop
posted: Jun 17, '16 at 11:15p
active: Jun 18, '16 at 3:00p
posts: 3
users: 2
website: hbase.apache.org