Thanks for the info.
I have not yet verified this with the Hadoop list, but it looks like the CDH3b4 0.20.2 hadoop-core.jar is incompatible with, or at least different from, the hadoop-core.jar that the Pig build script pulls in via Ivy. I was able to solve my problem by building Pig without Hadoop (ant jar-withouthadoop) and then manually including the 'correct' hadoop-core.jar in the classpath. This looks like a bug, but I don't know enough about the community to say whose; perhaps Cloudera's?
I would like to point out one bug I found in the Pig build.xml.
The main jar target (buildJar) has the following dependencies:
<zipfileset src="${ivy.lib.dir}/hadoop-core-${hadoop-core.version}.jar" />
<zipfileset src="${lib.dir}/${automaton.jarfile}" />
<zipfileset src="${ivy.lib.dir}/junit-${junit.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jsch-${jsch.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jline-${jline.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jackson-mapper-asl-${jackson.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jackson-core-asl-${jackson.version}.jar" />
<zipfileset src="${ivy.lib.dir}/joda-time-${joda-time.version}.jar" />
<zipfileset src="${ivy.lib.dir}/${guava.jar}" />
<zipgroupfileset dir="${ivy.lib.dir}" includes="commons*.jar"/>
<zipgroupfileset dir="${ivy.lib.dir}" includes="log4j*.jar"/>
<zipgroupfileset dir="${ivy.lib.dir}" includes="jsp-api*.jar"/>
Yet in the 0.8.0 tag, the non-hadoop target (jar-withouthadoop) has:
<zipfileset src="${ivy.lib.dir}/junit-${junit.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jsch-${jsch.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jline-${jline.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jackson-mapper-asl-${jackson.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jackson-core-asl-${jackson.version}.jar" />
<zipfileset src="${ivy.lib.dir}/joda-time-${joda-time.version}.jar" />
<zipfileset src="${lib.dir}/${automaton.jarfile}" />
Should it not be the same, with the exception of the first line? Among other things, the withouthadoop jar is missing the guava, commons, log4j and jsp-api entries, which means it ships without the logging dependencies.
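To make the suggestion concrete, these are the entries that appear in buildJar but not in jar-withouthadoop, copied from the list above. This is an untested sketch: it assumes they belong inside the same <jar> element of the jar-withouthadoop target, and it is my assumption (not something I have confirmed with the Pig developers) that all four should be bundled there.

```xml
<!-- Untested sketch: entries present in buildJar but absent from jar-withouthadoop. -->
<!-- Assumption: these go alongside the existing zipfileset lines in that target. -->
<zipfileset src="${ivy.lib.dir}/${guava.jar}" />
<zipgroupfileset dir="${ivy.lib.dir}" includes="commons*.jar"/>
<zipgroupfileset dir="${ivy.lib.dir}" includes="log4j*.jar"/>
<zipgroupfileset dir="${ivy.lib.dir}" includes="jsp-api*.jar"/>
```

The hadoop-core line is deliberately left out, since excluding it is the whole point of the withouthadoop jar.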
Dan
-----Original Message-----
From: joshdevins@gmail.com On Behalf Of Josh Devins
Sent: March-22-11 4:59
To: user@pig.apache.org
Subject: Re: Incorrect header or version mismatch
Hey Dan
This usually means that you have mismatched Hadoop jar versions somewhere. I
encountered a similar problem with Oozie trying to talk to HDFS. Maybe try
posting to the Hadoop user list as well. In general, you should just need
the same hadoop-core.jar as on your cluster when you run Pig. From Pig all
you should need is pig.jar (and piggybank, etc.) and the pre-compiled jar
should suffice.
Cheers,
Josh
On 21 March 2011 22:56, Dan Hendry wrote:
First off, I am fairly new to both pig and Hadoop. I am having some problems
connecting pig to a local hadoop cluster. I am getting the following error
in the hadoop namenode logs whenever I try and start up pig:
2011-03-21 17:48:17,299 WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 127.0.0.1:60928 got version 3 expected version 4
I am using the cloudera deb repository (CDH3b4) installed according to
https://docs.cloudera.com/display/DOC/CDH3+Installation+Guide. The hadoop
version is 0.20.2 and running in pseudo-distributed mode. I am using pig
0.8.0, both the provided tarball and a clone of the 0.8.0 tag compiled
locally. Any help would be appreciated. I am getting the following error in
the pig logs:
Error before Pig is launched
----------------------------
ERROR 2999: Unexpected internal error. Failed to create DataStorage
java.lang.RuntimeException: Failed to create DataStorage
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:213)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:133)
at org.apache.pig.impl.PigContext.connect(PigContext.java:183)
at org.apache.pig.PigServer.<init>(PigServer.java:225)
at org.apache.pig.PigServer.<init>(PigServer.java:214)
at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:55)
at org.apache.pig.Main.run(Main.java:462)
at org.apache.pig.Main.main(Main.java:107)
Caused by: java.io.IOException: Call to localhost/127.0.0.1:8020 failed on local exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
at org.apache.hadoop.ipc.Client.call(Client.java:743)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
at $Proxy0.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
... 9 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
================================================================================