Grokbase Groups HBase user May 2011
Hi all,

I'm doing some work to read records directly from the HFiles of a damaged table. When I scan through the records in the HFile using org.apache.hadoop.hbase.io.hfile.HFileScanner, will I get only the latest version of the record as with a default HBase Scan? Or do I need to do some work to pull out the latest version from several?

Thanks,
Sandy


  • Stack at May 31, 2011 at 8:10 pm

    On Tue, May 31, 2011 at 11:05 AM, Sandy Pratt wrote:
    Hi all,

    I'm doing some work to read records directly from the HFiles of a damaged table.  When I scan through the records in the HFile using org.apache.hadoop.hbase.io.hfile.HFileScanner, will I get only the latest version of the record as with a default HBase Scan?  Or do I need to do some work to pull out the latest version from several?
    It looks like it just returns all entries in the hfile. See tests --
    e.g. TestHFile -- for how to make an HFile Reader instance and
    pull the values. The tail of HFile has some examples too?
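    For reference, reading every entry off an HFile with the 0.20-era API looks roughly like this. This is a sketch along the lines of what TestHFile does, not a tested recipe -- the Reader constructor and scanner signatures vary between releases, and the store file path is a placeholder:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.io.hfile.HFileScanner;

class HFileDump {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Placeholder path: substitute a real store file under the region's family dir.
    Path file = new Path("/hbase/ets.derived.events.pb/1836172434/f1/SOME_STOREFILE");

    // Constructor arguments differ across HBase versions; this follows the 0.20-era shape.
    HFile.Reader reader = new HFile.Reader(fs, file, null, false);
    reader.loadFileInfo();
    HFileScanner scanner = reader.getScanner();
    if (scanner.seekTo()) {            // position at the first key in the file
      do {
        // Raw entries come back in key order: all versions, delete markers included.
        KeyValue kv = scanner.getKeyValue();
        System.out.println(kv);
      } while (scanner.next());
    }
    reader.close();
  }
}
```

    Note the scan hands back every cell in the file in key order; any "latest version only" semantics have to be applied by the caller.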

    Tell us about the 'damaged table'.

    St.Ack.
  • Sandy Pratt at May 31, 2011 at 9:28 pm
    Thanks for the pointers.

    The damage manifested as scanners skipping over a range in our time series data. We knew from other systems that there should be some records in that range that weren't returned. When we looked closely, we saw an extremely improbable jump in rowkeys that should be evenly distributed UUIDs beneath an hourly prefix. We checked the region listing and start/end keys in the regionserver UI and found a region listed that wasn't being served. We traced it back to a couple of possible locations under /hbase, and got some odd results when we tried to point the HFile main method at those files.

    Here's the region we found missing along with the next one and the previous one:

    Previous:
    ets.derived.events.pb,2010-09-28-02:dcba1a8d00d945e6a90442c9561e8ac4,1285667269423 ets-lax-prod-hadoop-10.corp.adobe.com:60030 2010-09-28-02:dcba1a8d00d945e6a90442c9561e8ac4 2010-09-28-05:5457075d4f9345908bdfd89b5b641d3d

    Affected region:
    ets.derived.events.pb,2010-09-28-05:5457075d4f9345908bdfd89b5b641d3d,1285684268773 ets-lax-prod-hadoop-04.corp.adobe.com:60030
    2010-09-28-05:5457075d4f9345908bdfd89b5b641d3d 2010-09-28-11:29664000a226486e9ecb7547a738d101

    Next:
    ets.derived.events.pb,2010-09-28-11:29664000a226486e9ecb7547a738d101,1285687842817 ets-lax-prod-hadoop-07.corp.adobe.com:60030
    2010-09-28-11:29664000a226486e9ecb7547a738d101 2010-09-28-12:f8fa9dc21bfe4091a4864d0adc655b4d


    The affected region on RS UI:
    ets.derived.events.pb,2010-09-28-05:5457075d4f9345908bdfd89b5b641d3d,1285684268773.1836172434
    2010-09-28-05:5457075d4f9345908bdfd89b5b641d3d 2010-09-28-11:29664000a226486e9ecb7547a738d101 stores=1, storefiles=1, storefileSizeMB=45, memstoreSizeMB=0, storefileIndexSizeMB=0
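    Incidentally, holes like this can be found mechanically: with regions sorted by start key, a healthy table's chain has each region's start key equal to the previous region's end key. A self-contained sketch of that check (illustrative names only, not an HBase API):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative region-chain check (not an HBase API): given regions sorted by
// start key, each entry is {startKey, endKey}; report any break in the chain
// where a region's start key is not the previous region's end key.
class RegionChainCheck {
    static List<String> findHoles(List<String[]> regions) {
        List<String> holes = new ArrayList<>();
        for (int i = 1; i < regions.size(); i++) {
            String prevEnd = regions.get(i - 1)[1];
            String start = regions.get(i)[0];
            if (!prevEnd.equals(start)) {
                holes.add("hole between " + prevEnd + " and " + start);
            }
        }
        return holes;
    }

    public static void main(String[] args) {
        List<String[]> regions = List.of(
            new String[]{"", "b"},
            new String[]{"b", "e"},
            new String[]{"f", ""});   // start "f" != previous end "e": a hole
        System.out.println(findHoles(regions));
    }
}
```

    This is essentially what hbck verifies against META; running it by hand only helps when, as here, the META entries themselves are suspect.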


    Directory for region on hdfs (guessing based on suffix from RS UI):
    /hbase/ets.derived.events.pb/1836172434


    Here's what happened when we ran HFile main method on those files:

    Checked with HFile:

    [hadoop@ets-lax-prod-hadoop-01 ~]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -r 'ets.derived.events.pb,2010-09-28-05:5457075d4f9345908bdfd89b5b641d3d,1285684268773.1836172434' -v
    cat: /opt/hadoop/hbase/target/cached_classpath.txt: No such file or directory
    region dir -> hdfs://ets-lax-prod-hadoop-01.corp.adobe.com:54310/hbase/ets.derived.events.pb/107531684
    Number of region files found -> 0

    Note that it resolved to a different directory on HDFS than I expected. Pointing HFile at that path directly, it doesn't like it:

    [hadoop@ets-lax-prod-hadoop-01 ~]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -f hdfs://ets-lax-prod-hadoop-01.corp.adobe.com:54310/hbase/ets.derived.events.pb/107531684 -v -k
    cat: /opt/hadoop/hbase/target/cached_classpath.txt: No such file or directory
    Scanning -> hdfs://ets-lax-prod-hadoop-01.corp.adobe.com:54310/hbase/ets.derived.events.pb/107531684
    ERROR, file doesnt exist: hdfs://ets-lax-prod-hadoop-01.corp.adobe.com:54310/hbase/ets.derived.events.pb/107531684

    Putting in the path I thought it was, HFile can't find it, even though it's there on HDFS:
    [hadoop@ets-lax-prod-hadoop-01 ~]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -f hdfs://ets-lax-prod-hadoop-01.corp.adobe.com:54310/hbase/ets.derived.events.pb/1836172434 -v -k
    cat: /opt/hadoop/hbase/target/cached_classpath.txt: No such file or directory
    Scanning -> hdfs://ets-lax-prod-hadoop-01.corp.adobe.com:54310/hbase/ets.derived.events.pb/1836172434
    java.io.FileNotFoundException: File does not exist: /hbase/ets.derived.events.pb/1836172434
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1586)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.&lt;init&gt;(DFSClient.java:428)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:185)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:431)
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.

    [hadoop@ets-lax-prod-hadoop-01 ~]$ hadoop dfs -ls /hbase/ets.derived.events.pb/1836172434
    Found 2 items
    -rw-r--r-- 3 hadoop hadoop 862 2010-09-28 07:31 /hbase/ets.derived.events.pb/1836172434/.regioninfo
    drwxr-xr-x - hadoop hadoop 0 2011-05-06 16:36 /hbase/ets.derived.events.pb/1836172434/f1


    Ran an hbase hbck, which came back clean. Stopped and restarted HBase, after which hbck gave errors (not sure why it was OK before and not after -- maybe a split happened in the interim -- but we are running durably now, so hopefully a change to META would not get lost). After that I made a backup and tried add_table.rb, which seemed to make the problem worse. We eventually concluded that we must have lost a write to META last year, when we were running Hadoop 0.20.1 and HBase 0.20.3 without durability (we're currently on CDH3b3). This is supported by the fact that other environments running the same code are OK, and hadoop fsck / is also healthy.

    My solution is to create a broadly similar table and read the HFiles from the old one directly into it. So this would be an MR job with an HFileInputFormat I wrote using the HFile API, and a TableOutputFormat into the new table (I didn't want to take on writing directly to HFiles at this time). Once that's done and verified, I'll drop the older table and move on.
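    Since a raw HFile scan hands back every version (per Stack's reply), a copy job like this has to pick the newest cell per row/column itself if the new table shouldn't carry stale versions. A minimal self-contained sketch of that selection, using plain Java stand-ins rather than HBase types:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for HBase KeyValues: keep only the newest version per
// (row, column). A default Scan does this for you; a raw HFile scan does not.
class LatestVersionFilter {
    static class Cell {
        final String row, column, value;
        final long ts;
        Cell(String row, String column, long ts, String value) {
            this.row = row; this.column = column; this.ts = ts; this.value = value;
        }
    }

    static Map<String, Cell> latest(List<Cell> cells) {
        Map<String, Cell> out = new HashMap<>();
        for (Cell c : cells) {
            String key = c.row + "/" + c.column;
            Cell prev = out.get(key);
            if (prev == null || c.ts > prev.ts) out.put(key, c);  // keep highest timestamp
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> cells = List.of(
            new Cell("r1", "f1:a", 100, "old"),
            new Cell("r1", "f1:a", 200, "new"),
            new Cell("r2", "f1:a", 150, "only"));
        Map<String, Cell> m = latest(cells);
        System.out.println(m.get("r1/f1:a").value); // new
        System.out.println(m.get("r2/f1:a").value); // only
    }
}
```

    A real copy job also has to account for delete markers in the store files, which this sketch ignores.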

    Because of the version of HBase we're running, we don't have hbck -fix available, and I assume it's been months since the damage happened, which might mean we have some overlapping regions. It could be hard to manually stitch them back together, so this holistic approach seemed like the best bet.

    One thing I can put as a win in HBase's column is that the damaged table still functions fine in the parts that don't have holes, which is the majority of the table. So we can keep running for the majority of our dataset (and work) and take the time to fix the damage carefully.

    Sandy
