Grokbase Groups Pig user March 2011
Hi,
We recently upgraded our Hadoop from CDH2 to CDH3b4 and have had problems
using some old Python UDFs. Running in local mode still works, but in
Hadoop mode it gives errors like "could not instantiate
'org.apache.pig.scripting.jython.JythonFunction' with arguments...".
Has anyone seen a similar error with Python UDFs on this Hadoop distribution?
We are using Pig 0.8.0. Thanks!

Regards
Shawn

  • Aniket Mokashi at Mar 31, 2011 at 10:38 pm
    I think this might be because when you start in hadoop mode, your
    classpath configuration does not have jython.jar. Can you put that
    explicitly in classpath and check it out?

    Thanks,
    Aniket

  • Xiaomeng Wan at Apr 1, 2011 at 4:07 pm
    Hi Aniket,

    We put both jython.jar and myudf.py on the classpath and also register
    jython.jar in our Pig script. It worked well before the upgrade and
    only failed afterwards.

    Regards,
    Shawn

  • Aniket Mokashi at Apr 1, 2011 at 8:24 pm
    Hi Shawn,

    Every time we throw an exception with the 'could not instantiate ..' error
    message, we also pass down the real exception instance, which might
    point to the reason why we fail in this scenario.
    Can you provide the details of the exception message from the log?

    The way this works is: when you register the myudf.py script, we register
    all the function names inside the script with Pig, and when you use these
    functions, we parse and construct them with the JythonFunction constructor.
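
For readers who have not seen a Jython UDF file, here is a minimal sketch of what a script like the one in this thread could look like. The function body, the stop-word list, and the `myudfs` alias are hypothetical; only the file name and the function name `isStopWord` come from the trace in this thread. Pig's documentation shows `@outputSchema` used without an import inside registered Jython scripts; the fallback definition below is an assumption added so the file also loads under plain Python.

```python
# mypyudfs.py: hypothetical stand-in for the script named in this thread.
# In a Pig 0.8 script it would be registered roughly like this:
#   REGISTER 'mypyudfs.py' USING jython AS myudfs;
#   flagged = FOREACH words GENERATE word, myudfs.isStopWord(word);

try:
    outputSchema  # normally supplied by Pig's Jython engine
except NameError:
    # Fallback (an assumption) so the file can be exercised outside Pig.
    def outputSchema(schema):
        def decorate(func):
            func.outputSchema = schema
            return func
        return decorate

# Illustrative list only; the real stop words are not shown in the thread.
STOP_WORDS = frozenset(["a", "an", "and", "of", "the", "to"])


@outputSchema("is_stop:int")
def isStopWord(word):
    """Return 1 if word is a stop word, 0 otherwise, and None for null input."""
    if word is None:
        return None
    if word.strip().lower() in STOP_WORDS:
        return 1
    return 0
```

Each function defined this way is what Pig would later construct through JythonFunction, using the script path and the function name as arguments.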

    Thanks,
    Aniket

  • Xiaomeng Wan at Apr 1, 2011 at 8:43 pm
    Hi Aniket,

    Here is the stack trace of the exception.

    java.io.IOException: Deserialization error: could not instantiate
    'org.apache.pig.scripting.jython.JythonFunction' with arguments
    '[/home/shawn/TESS/code/mypyudfs.py, isStopWord]'
    at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:55)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.setup(PigMapBase.java:151)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:251)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:245)
    Caused by: java.lang.RuntimeException: could not instantiate
    'org.apache.pig.scripting.jython.JythonFunction' with arguments
    '[/home/shawn/TESS/code/mypyudfs.py, isStopWord]'
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:502)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.instantiateFunc(POUserFunc.java:109)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.readObject(POUserFunc.java:451)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at java.util.ArrayList.readObject(ArrayList.java:593)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at java.util.HashMap.readObject(HashMap.java:1030)
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at java.util.ArrayList.readObject(ArrayList.java:593)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at java.util.ArrayList.readObject(ArrayList.java:593)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at java.util.HashMap.readObject(HashMap.java:1030)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:53)
    ... 9 more
    Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:470)
    ... 87 more
    Caused by: java.lang.IllegalStateException: Could not initialize:
    /home/shawn/TESS/code/mypyudfs.py
    at org.apache.pig.scripting.jython.JythonFunction.<init>(JythonFunction.java:86)
    ... 92 more
    2011-04-01 14:31:40,977 INFO org.apache.hadoop.mapred.Task: Runnning
    cleanup for the task
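
In the trace above, the last "Caused by:" entry is the one closest to the actual failure. A small sketch of pulling that entry out of a long task log follows; the helper name `innermost_cause` is made up for illustration, and the sample text is an abbreviated copy of the trace in this post.

```python
def innermost_cause(trace_text):
    """Return the message of the last 'Caused by:' entry, or None if absent."""
    lines = [line.strip() for line in trace_text.splitlines()]
    cause = None
    for i, line in enumerate(lines):
        if not line.startswith("Caused by:"):
            continue
        parts = [line[len("Caused by:"):].strip()]
        # A long message may wrap onto the next lines, up to the first frame.
        for following in lines[i + 1:]:
            if not following or following.startswith(("at ", "...", "Caused by:")):
                break
            parts.append(following)
        cause = " ".join(parts)
    return cause


SAMPLE = """java.io.IOException: Deserialization error: could not instantiate
at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:55)
Caused by: java.lang.reflect.InvocationTargetException
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
... 87 more
Caused by: java.lang.IllegalStateException: Could not initialize:
/home/shawn/TESS/code/mypyudfs.py
at org.apache.pig.scripting.jython.JythonFunction.<init>(JythonFunction.java:86)
... 92 more
"""

print(innermost_cause(SAMPLE))
```

For this trace the innermost entry is the IllegalStateException raised in the JythonFunction constructor while initializing the script.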

    Thanks!

    Regards,
    Shawn

  • Aniket Mokashi at Apr 1, 2011 at 11:11 pm
    Hi Shawn,

    I think this is more of a CDH packaging problem than a Pig problem. I suspect
    it is related to the Java versions of Jython and the other components.
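
One way to test a Java-version suspicion like this is to read the class-file header of the jars involved: every .class file starts with the magic number 0xCAFEBABE followed by a minor and a major version, where major 49 corresponds to Java 5, 50 to Java 6, and 51 to Java 7. The sketch below is meant to be run with a regular Python interpreter on the client or a task node; the helper name and the example path are hypothetical, and nothing in this thread confirms that a version mismatch is the actual cause.

```python
import struct
import zipfile


def class_major_versions(jar_path, limit=20):
    """Return the set of class-file major versions among the first `limit` classes."""
    versions = set()
    with zipfile.ZipFile(jar_path) as jar:
        names = [n for n in jar.namelist() if n.endswith(".class")][:limit]
        for name in names:
            header = jar.read(name)[:8]
            if len(header) < 8:
                continue  # too short to be a valid class file
            # Big-endian layout: u4 magic, u2 minor version, u2 major version.
            magic, _minor, major = struct.unpack(">IHH", header)
            if magic == 0xCAFEBABE:
                versions.add(major)
    return versions


# Example call (the path is a placeholder):
# print(class_major_versions("/path/to/jython.jar"))
```

If a jar reports a major version higher than the JVM running the tasks supports, its classes will fail to load on that JVM.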

    You may look into
    https://docs.cloudera.com/download/attachments/8784980/CDH3b3_Installation_Guide.pdf?version=1&modificationDate=1300229469101
    for more details.

    Thanks,
    Aniket

  • Xiaomeng Wan at Apr 4, 2011 at 3:37 pm
    Thanks, Aniket!
    I will look into it.

    Regards,
    Shawn

Discussion Overview
group: user @
categories: pig, hadoop
posted: Mar 31, '11 at 10:07p
active: Apr 4, '11 at 3:37p
posts: 7
users: 2
website: pig.apache.org

2 users in discussion

Xiaomeng Wan: 4 posts Aniket Mokashi: 3 posts
