(sending to user@ and bcc'ing dev@ since this is a user question)

That type of problem can be "fun" to debug. Did you try querying the data
with the shell? Do you get a different result?

BTW, any TTL set on that table?

J-D
On Mon, Aug 29, 2011 at 5:09 PM, Neerja Bhatnagar wrote:
Hi,

I am sorry if this question has been resolved before. Thank you for your
help.

I am seeing really strange behavior with HBase Scan.

I insert 1 row into a table named test, with 1 column family named
testColFam and 3 columns: foo (with value foo), bar (with value bar), and
id (a unique id).

I wait 5 minutes, and run the following code to retrieve the row:

HTablePool htablePool = new HTablePool(config, maxsize);

// "test" is the table name
HTable table = (HTable) htablePool.getTable("test");

Scan scan = new Scan();
scan.addFamily(Bytes.toBytes("testColFam"));
scan.setStartRow(Bytes.toBytes("")); // scan from the first row
scan.setBatch(batchSize);

ResultScanner resultScanner = table.getScanner(scan);
Iterator<Result> resultIterator = resultScanner.iterator();

Result result = resultIterator.next();

result.getMap();

What result.getMap() returns differs depending on how much time has
elapsed. If I run this code as soon as I have inserted the data, the 3
columns in the 1 row are returned as expected.

But after some time elapses, scan returns fewer columns per row each time.

Can anyone please help me with this? Please let me know if you need more
information.

Do I need to set the timerange or something to make sure that all columns
are returned?

Cheers, Neerja

  • Jean-Daniel Cryans at Aug 30, 2011 at 2:07 am
    (Sending to user@ again and bccing dev@ for the last time, please take
    notice and reply to user@)

    Ok, so it should be something about the code... What is batchSize set
    to? I don't see it in that code snippet.

    getMap gives a map of all the families with all the data, whereas
    getFamilyMap gives a map of all the qualifiers and their values for one
    family. Both APIs are fine; they just solve different problems.

    J-D
    On Mon, Aug 29, 2011 at 6:00 PM, Neerja Bhatnagar wrote:
    Hi J-D,

    Thanks! I do scan 'tablename' on the shell, and I can see all 3 columns in
    the 1 column family for a row. I haven't set any TTL on the table or result
    scanner.
    Any other suggestions would be very welcome. I was getting the same response
    with result.getFamilyMap() and I moved to result.getMap() thinking I was
    using the wrong api.

    Cheers, Neerja
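
To make the difference between the two calls concrete, here is a toy model in plain Java with no HBase dependency. It is only a sketch: String keys stand in for the byte[] keys of the real Result class, and the names ResultMapShapes, sampleRow and familyView, the timestamp 1, and the id value are all hypothetical. The nesting is meant to mirror getMap (family, then qualifier, then timestamp) and getFamilyMap (qualifier to value for one family).

```java
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: it mimics the nesting of the maps, it does not call HBase.
final class ResultMapShapes {

    // Shape of getMap(): family -> qualifier -> timestamp -> value.
    static NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> sampleRow() {
        NavigableMap<String, NavigableMap<Long, String>> family = new TreeMap<>();
        String[][] cells = {{"foo", "foo"}, {"bar", "bar"}, {"id", "hypothetical-id"}};
        for (String[] cell : cells) {
            // Sort timestamps newest first, so firstEntry() is the latest version.
            NavigableMap<Long, String> versions = new TreeMap<>(Comparator.<Long>reverseOrder());
            versions.put(1L, cell[1]);
            family.put(cell[0], versions);
        }
        NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> row = new TreeMap<>();
        row.put("testColFam", family);
        return row;
    }

    // Shape of getFamilyMap(family): qualifier -> newest value, for one family.
    static NavigableMap<String, String> familyView(
            NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> row,
            String family) {
        NavigableMap<String, String> view = new TreeMap<>();
        NavigableMap<String, NavigableMap<Long, String>> qualifiers = row.get(family);
        if (qualifiers == null) {
            return view; // toy choice: unknown family gives an empty view
        }
        for (Map.Entry<String, NavigableMap<Long, String>> e : qualifiers.entrySet()) {
            view.put(e.getKey(), e.getValue().firstEntry().getValue());
        }
        return view;
    }

    public static void main(String[] args) {
        System.out.println(sampleRow());
        System.out.println(familyView(sampleRow(), "testColFam"));
    }
}
```

Either shape would show the same set of qualifiers for a given Result, which fits the observation in this thread that switching between the two calls did not change the symptom.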
  • Jean-Daniel Cryans at Aug 30, 2011 at 5:16 pm
    If you want to limit the number of rows you can instead set the
    caching to exactly what you need, or set a stop row.

    J-D
    On Mon, Aug 29, 2011 at 11:38 PM, Neerja Bhatnagar wrote:
    Hi J-D,

    Thank you very much! Hopefully, this iteration clears it up for me.
    The batchSize is set to 1. I also tried the same code with batchSize
    unset, and with it set to the number of columns in my column family. In
    both of those cases, the entire result is returned (as expected) from
    getMap or getFamilyMap.

    Does setBatch set the number of columns to return, rather than the number
    of rows? I am sorry, but it is not clear to me whether setBatch in Scan is
    row-oriented or column-oriented.

    Perhaps I should use the PageFilter to limit the number of rows retrieved
    from HBase?

    setBatch

    public void setBatch(int batch)

    Set the maximum number of values to return for each call to next()

    Parameters: batch - the maximum number of values

    Your help is much appreciated. Cheers, Neerja
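
The behavior reported above (one column per Result with batchSize 1, the whole row when the batch is unset or equals the column count) matches the Javadoc wording "maximum number of values". The following is a minimal sketch of that reading in plain Java, with no HBase dependency. BatchModel and split are hypothetical names, and treating a non-positive batch as "no limit" is an assumption of this toy model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: it mimics how a per-call value limit splits one row's
// columns across successive results. It does not call HBase.
final class BatchModel {

    // Split one row's columns into chunks of at most `batch` columns.
    // Assumption: a non-positive batch means "no limit" (whole row at once).
    static List<List<String>> split(List<String> columns, int batch) {
        List<List<String>> results = new ArrayList<>();
        if (batch <= 0) {
            results.add(new ArrayList<>(columns));
            return results;
        }
        for (int i = 0; i < columns.size(); i += batch) {
            int end = Math.min(i + batch, columns.size());
            results.add(new ArrayList<>(columns.subList(i, end)));
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("bar", "foo", "id");
        System.out.println(split(row, 1)); // [[bar], [foo], [id]]
        System.out.println(split(row, 3)); // [[bar, foo, id]]
    }
}
```

Under this reading, the first call to next() with a batch of 1 carries only one of the three columns, so the batch setting is column-oriented. Limiting rows is a separate concern, which is where the caching and stop-row suggestions in the reply apply.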
