Hi -

I’m trying to use the Apache Ant tar package (org.apache.tools.tar, version 1.8.2) in a Java program that tars large files in Hadoop. It currently fails on a file that is 17 GB in size. The same code works without any problem on smaller files: I tar smaller HDFS files all day long, and it fails only on that 17 GB file. I have been reading the source code for 3 days now and still have a hard time making sense of the error message. The exact file size at the time of the error is 17456999265 bytes. The exception I’m seeing is:

12/19/11 5:54 PM [BDM.main] EXCEPTION request to write '65535' bytes exceeds size in header of '277130081' bytes
12/19/11 5:54 PM [BDM.main] EXCEPTION org.apache.tools.tar.TarOutputStream.write(TarOutputStream.java:238)
12/19/11 5:54 PM [BDM.main] EXCEPTION com.yahoo.ads.ngdstone.tpbdm.HDFSTar.archive(HDFSTar.java:149)
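
One thing that stands out when checking the numbers: the size reported in the header, 277130081, is exactly the real file size (17456999265) modulo 2^33. That is what you would get if the length were squeezed into the 11 octal digits that the classic tar header reserves for the size field, which tops out at 8589934591 bytes (8 GiB minus one). Below is a minimal sketch of that arithmetic in plain Java. It uses no Ant classes; the class and method names are made up for illustration, and the idea that the library truncates the size this way is an assumption based only on the numbers matching.

```java
// Hypothetical helper, not part of Ant: reproduces the arithmetic only.
class TarSizeCheck {
    // Largest value that fits in 11 octal digits: 2^33 - 1 = 8589934591.
    static final long MAX_CLASSIC_SIZE = (1L << 33) - 1;

    // Assumption: a size that does not fit keeps only its low 33 bits.
    static long truncatedSize(long realSize) {
        return realSize & MAX_CLASSIC_SIZE;
    }

    public static void main(String[] args) {
        long realSize = 17456999265L; // file size from the failing run
        System.out.println("max classic size: " + MAX_CLASSIC_SIZE);
        // prints 277130081, the size quoted in the exception message
        System.out.println("truncated size:   " + truncatedSize(realSize));
    }
}
```

If this reading is right, the write fails as soon as the stream has received 277130081 bytes, because the next buffer would exceed what the header claims.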

My code is:

TarEntry entry = new TarEntry(p.getName());
Path absolutePath = p.isAbsolute() ? p : new Path(baseDir, p); // HDFS Path
FileStatus fileStatus = fs.getFileStatus(absolutePath); // HDFS fileStatus
entry.setNames(fileStatus.getOwner(), fileStatus.getGroup());
entry.setUserName(user);
entry.setGroupName(group);
entry.setName(name);
entry.setSize(fileStatus.getLen());
entry.setMode(Integer.parseInt("0100" + permissions, 8));
out.putNextEntry(entry); // out = TarOutputStream

if (fileStatus.getLen() > 0) {
    InputStream in = fs.open(absolutePath); // large file in HDFS
    try {
        ++nEntries;
        int bytesRead = in.read(buf);
        while (bytesRead >= 0) {
            out.write(buf, 0, bytesRead);
            bytesRead = in.read(buf);
        }
    } finally {
        in.close();
    }
}

out.closeEntry();

Any ideas? Am I missing anything in the way I’m setting up the TarOutputStream or TarEntry? Or does tar have implicit limits that will never work for multi-gigabyte files?

Thanks!

Frank

Discussion Overview
group: user @
categories: ant
posted: Dec 20, '11 at 5:31a
active: Dec 22, '11 at 4:44p
posts: 4
users: 3
website: ant.apache.org
