can't package zip file with hadoop streaming -file argument

Key: HADOOP-3811
URL: https://issues.apache.org/jira/browse/HADOOP-3811
Project: Hadoop Core
Issue Type: Bug
Components: contrib/streaming
Affects Versions: 0.17.0
Reporter: Karl Anderson

I'm unable to ship a file with a .zip suffix to the mapper using the -file argument for hadoop streaming. I am able to ship it if I change the suffix to .zipp. Is this a bug, or does it perhaps have something to do with the jar file format that is used to send files to the instance?

For example, with this hadoop invocation, and local files "/tmp/boto.zip" and "/tmp/boto.zipp" which are copies of each other:

$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.17.0-streaming.jar -mapper $KCLUSTER_SRC/testmapper.py -reducer $KCLUSTER_SRC/testreducer.py -input input/foo -output output -file /tmp/foo.txt -file /tmp/boto.zip -file /tmp/boto.zipp

I see this line in the invocation standard output:

packageJobJar: [/tmp/foo.txt, /tmp/boto.zip, /tmp/boto.zipp, /tmp/hadoop-karl/hadoop-unjar6899/] [] /tmp/streamjob6900.jar tmpDir=null

But in the current directory of the mapper process, "boto.zip" does not exist, while "boto.zipp" does.
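
To narrow down where the file goes missing, one could check both ends: the job jar named in the packageJobJar line, and the mapper's working directory. The following is only a minimal sketch. The helper names jar_entries and missing_files are hypothetical, and it assumes the job jar is still readable on the submitting machine and that the mapper is a Python script.

```python
import os
import zipfile


def jar_entries(jar_path):
    """Return the entry names stored in a job jar (a jar is a zip archive)."""
    with zipfile.ZipFile(jar_path) as jar:
        return jar.namelist()


def missing_files(expected_names, directory="."):
    """Return the expected file names that are absent from `directory`."""
    present = set(os.listdir(directory))
    return [name for name in expected_names if name not in present]
```

Running jar_entries on the job jar shows whether boto.zip was packaged at all. Calling missing_files(["foo.txt", "boto.zip", "boto.zipp"]) from inside the mapper and writing the result to stderr should show, in the task logs, which shipped files did not arrive.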


Group: common-dev
Posted: Jul 22, '08 at 10:17p
