Hi,

1 - I'm trying to read parts of a compressed file in order to generate message
digests, but I can't fetch the right parts. I searched for an example
that reads compressed files, but I couldn't find one.
My example has 3 partitions; below are the index entries for the file:
raw bytes: 54632 / offset: 0 / partLength: 20307
raw bytes: 53771 / offset: 20307 / partLength: 19882
raw bytes: 53568 / offset: 40189 / partLength: 19814
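
One thing these numbers suggest: each offset equals the sum of the preceding partLength values (0 + 20307 = 20307, 20307 + 19882 = 40189). So offset and partLength seem to count bytes in the compressed file, while "raw bytes" would be the size after decompression. A minimal check of that arithmetic follows; the class name `IndexCheck` and its arrays are only illustrative and are not part of any Hadoop API.

```java
// Sketch only: IndexCheck is a hypothetical helper, not Hadoop code.
class IndexCheck {
    // Values copied from the index listing above.
    static final long[] RAW_LENGTH  = {54632, 53771, 53568};
    static final long[] OFFSET      = {0, 20307, 40189};
    static final long[] PART_LENGTH = {20307, 19882, 19814};

    /** True if every offset equals the sum of all earlier lengths. */
    static boolean offsetsFollowLengths(long[] offset, long[] length) {
        long expected = 0;
        for (int i = 0; i < offset.length; i++) {
            if (offset[i] != expected) {
                return false;
            }
            expected += length[i];
        }
        return true;
    }

    public static void main(String[] args) {
        // Offsets line up with the compressed lengths (partLength) ...
        System.out.println("partLength: " + offsetsFollowLengths(OFFSET, PART_LENGTH));
        // ... but not with the uncompressed lengths (raw bytes).
        System.out.println("rawLength:  " + offsetsFollowLengths(OFFSET, RAW_LENGTH));
    }
}
```

If that reading is right, an offset cannot be used as a position inside the decompressed stream.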

Here's my code:

[code]
readCompressedFile(InputStream input) {
    decompressor.reset();
    CompressionInputStream input2 = codec.createInputStream(input,
            decompressor);

    IndexRecord index = spillRec.getIndex(part);

    long size = index.rawLength;
    //long size2 = index.partLength;
    long offset = index.startOffset;
    hash[part] = hashGen.generateHash(input2, (int) offset, (int) size);
}



public String generateHash(CompressionInputStream input, int offset,
        int mapOutputLength) {
    MessageDigest md = null;
    StringBuffer buf = new StringBuffer();

    try {
        md = MessageDigest.getInstance("SHA-1");
        int totalBytes = 0;

        int size = mapOutputLength < (60 * 1024) ? mapOutputLength : (60 * 1024);

        byte[] buffer = new byte[size];

        int n = input.read(buffer, 0, size);

        if (n > 0) {
            // Digest only the n bytes actually read; read() may return fewer than size.
            md.update(buffer, 0, n);
        }

        while (n > 0) {
            totalBytes += n;

            mapOutputLength -= n;

            // Handle the case where fewer bytes remain than the default size.
            // We don't want the message digest to contain garbage.
            size = mapOutputLength < (60 * 1024) ? mapOutputLength : (60 * 1024);

            if (size == 0)
                break;

            buffer = new byte[size];
            n = input.read(buffer, 0, size);

            if (n > 0) {
                md.update(buffer, 0, n);
            }
        }
        System.out.println("END: " + totalBytes + " - ");

        // DO THE HASH

    } catch (NoSuchAlgorithmException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

    return HASH;
}

[/code]

I can't get the right portions of the compressed file, and I don't
know why. What am I doing wrong?
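
One possible explanation, sketched under assumptions: if each partition is compressed on its own, and startOffset / partLength are positions in the compressed file, then the underlying stream has to be positioned at the offset and limited to partLength bytes before a decompressing stream wraps it. The digest should also be fed only the n bytes each read actually returned. The sketch below uses the `java.util.zip` Deflater/Inflater streams as a stand-in for the Hadoop codec and CompressionInputStream; `SegmentDigest` and all of its method names are hypothetical.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Sketch only: java.util.zip stands in for the Hadoop codec (an assumption).
class SegmentDigest {
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream z = new DeflaterOutputStream(out)) {
            z.write(raw);
        }
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    static String sha1(byte[] raw) throws NoSuchAlgorithmException {
        return hex(MessageDigest.getInstance("SHA-1").digest(raw));
    }

    /** SHA-1 of the decompressed content of one segment of the compressed file. */
    static String sha1OfSegment(InputStream file, long offset, int partLength)
            throws IOException, NoSuchAlgorithmException {
        DataInputStream in = new DataInputStream(file);
        // skip() may skip fewer bytes than requested, so loop until done.
        long toSkip = offset;
        while (toSkip > 0) {
            long skipped = in.skip(toSkip);
            if (skipped <= 0) {
                throw new EOFException("offset is beyond the end of the stream");
            }
            toSkip -= skipped;
        }
        // Take exactly partLength compressed bytes: the segment and nothing else.
        byte[] compressed = new byte[partLength];
        in.readFully(compressed);

        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] buffer = new byte[8 * 1024];
        try (InputStream plain =
                new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            int n;
            while ((n = plain.read(buffer)) != -1) {
                md.update(buffer, 0, n); // only the bytes actually read
            }
        }
        return hex(md.digest());
    }

    static byte[] sample(String word, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(word).append(i).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Builds a three-segment file in memory and checks every segment digest. */
    static boolean selfCheck() {
        try {
            byte[][] raw = {sample("alpha", 500), sample("beta", 300), sample("gamma", 400)};
            long[] offset = new long[raw.length];
            int[] partLength = new int[raw.length];
            ByteArrayOutputStream file = new ByteArrayOutputStream();
            for (int i = 0; i < raw.length; i++) {
                byte[] c = compress(raw[i]);
                offset[i] = file.size();
                partLength[i] = c.length;
                file.write(c, 0, c.length);
            }
            byte[] bytes = file.toByteArray();
            for (int i = 0; i < raw.length; i++) {
                String got = sha1OfSegment(new ByteArrayInputStream(bytes), offset[i], partLength[i]);
                if (!got.equals(sha1(raw[i]))) {
                    return false;
                }
            }
            return true;
        } catch (IOException | NoSuchAlgorithmException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("all segment digests match: " + selfCheck());
    }
}
```

Whether this carries over to the real spill file depends on how the partitions were written, which I have not verified.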

2 - When I read a compressed file through the CompressionInputStream class,
CompressionInputStream input2 = codec.createInputStream(input, decompressor);

does that mean that calling the "read" method returns uncompressed data?
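
As far as I understand it, yes: a decompressing stream hands back uncompressed bytes from read(), so the total read from it should match the raw length rather than the compressed length. A small sketch of that behaviour, again using `java.util.zip` as a stand-in for the Hadoop codec (an assumption); `ReadIsUncompressed` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Sketch only: InflaterInputStream plays the role of CompressionInputStream.
class ReadIsUncompressed {
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream z = new DeflaterOutputStream(out)) {
            z.write(raw);
        }
        return out.toByteArray();
    }

    /** Counts the bytes that read() returns from the decompressing stream. */
    static long countDecompressedBytes(byte[] compressed) throws IOException {
        long total = 0;
        byte[] buffer = new byte[4096];
        try (InputStream in =
                new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n; // n may be smaller than buffer.length
            }
        }
        return total;
    }

    /** True if the bytes read equal the raw size, not the compressed size. */
    static boolean selfCheck() {
        try {
            byte[] raw = new byte[50000]; // zero-filled, so it compresses well
            byte[] compressed = compress(raw);
            return compressed.length < raw.length
                    && countDecompressedBytes(compressed) == raw.length;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("read() returned the raw length: " + selfCheck());
    }
}
```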




Thanks,


--
Pedro

Group: mapreduce-user
Category: hadoop
Posted: Feb 16, '11 at 9:37p
Author: Pedro Costa
