FAQ
Has anyone else run into issues using output compression (in our case LZO) with TestDFSIO, where it fails to read the metrics file? I assumed it would use the correct decompression codec after the job finishes, but it always returns a 'File not found' exception. Is there a simple way around this without spending the time to recompile a cluster/codec-specific version?

Matt


  • Ken Krugler at Sep 2, 2011 at 12:27 am
    Hi Matt,
    On Jun 20, 2011, at 1:46pm, GOEKE, MATTHEW (AG/1000) wrote:

    Has anyone else run into issues using output compression (in our case lzo) on TestDFSIO and it failing to be able to read the metrics file? I just assumed that it would use the correct decompression codec after it finishes but it always returns with a 'File not found' exception.
    Yes, I've run into the same issue on 0.20.2 and CDH3u0.

    I don't see any Jira issue that covers this problem, so unless I hear otherwise I'll file one.

    The problem is that the post-job code doesn't handle fetching the <path>.deflate or <path>.lzo (in your case) file from HDFS and then decompressing it.
    Is there a simple way around this without spending the time to recompile a cluster/codec specific version?

    You can use "hadoop fs -text <path reported in exception>.lzo"

    This will dump out the file, which looks like:

    f:rate 171455.11
    f:sqrate 2981174.8
    l:size 10485760000
    l:tasks 10
    l:time 590537

    If you take f:rate/1000/l:tasks, that should give you the average MB/sec.

    E.g. for the example above, that would be 171455/1000/10 ≈ 17 MB/sec.
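    The calculation above can be sketched in a few lines; the metric keys (f:rate, l:tasks) come from the dump shown, but the helper function itself is just an illustration, not part of TestDFSIO:

    ```python
    def average_mb_per_sec(metrics_text):
        """Compute f:rate / 1000 / l:tasks from a TestDFSIO metrics dump.

        metrics_text is the key/value output of "hadoop fs -text" on the
        (compressed) metrics file, one "key value" pair per line.
        """
        metrics = {}
        for line in metrics_text.splitlines():
            key, value = line.split()
            metrics[key] = float(value)
        return metrics["f:rate"] / 1000 / metrics["l:tasks"]

    # Values from the example dump above:
    dump = """f:rate 171455.11
    f:sqrate 2981174.8
    l:size 10485760000
    l:tasks 10
    l:time 590537"""

    print(average_mb_per_sec(dump))  # ~17.1 MB/sec
    ```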

    -- Ken

    --------------------------
    Ken Krugler
    +1 530-210-6378
    http://bixolabs.com
    custom big data solutions & training
    Hadoop, Cascading, Mahout & Solr

Discussion Overview
group: common-user @ hadoop.apache.org
categories: hadoop
posted: Jun 20, '11 at 8:47p
active: Sep 2, '11 at 12:27a
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop
