[ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated HADOOP-1622:
----------------------------------

Fix Version/s: 0.17.0

Marking this for the 0.17 release.
Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on
--------------------------------------------------------------------------------------------

Key: HADOOP-1622
URL: https://issues.apache.org/jira/browse/HADOOP-1622
Project: Hadoop Core
Issue Type: Improvement
Components: mapred
Reporter: Runping Qi
Assignee: Dennis Kubes
Fix For: 0.17.0

Attachments: hadoop-1622-4-20071008.patch, HADOOP-1622-5.patch, HADOOP-1622-6.patch, HADOOP-1622-7.patch, HADOOP-1622-8.patch, HADOOP-1622-9.patch, multipleJobJars.patch, multipleJobResources.patch, multipleJobResources2.patch


More often than not, a user's job depends on multiple jars.
Right now, when submitting a job through bin/hadoop, there is no way for the user to specify that.
A workaround is to re-package all the dependent jars into a new jar, or to put the dependent jar files in the lib dir of the new jar.
This workaround is an unnecessary inconvenience to the user. Furthermore, if the user does not own the main function
(as is the case when the user uses Aggregate, datajoin, or streaming), the user has to re-package those system jar files too.
It would be very desirable for Hadoop to provide a clean and simple way for the user to specify a list of dependent jar files at the time
of job submission. Something like:
bin/hadoop .... --depending_jars j1.jar:j2.jar
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Mahadev konar (JIRA) at Mar 20, 2008 at 11:33 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Attachment: HADOOP-1622_1.patch

    Attaching a patch for this feature. It does not include unit tests yet; I am still writing them and will upload an updated patch by the end of the day.

    This patch enhances the hadoop command line for job submission, so you can say:

    - bin/hadoop jar -files <comma separated files> -libjars <comma separated libs> -archives <comma separated archives>

    - these options are all optional, and the command line is backwards compatible

    - the patch uses cli for command line parsing

    - it uses DistributedCache for copying files locally to the tasks

    - it supports URIs in the command line arguments

    - if the files are already uploaded to the HDFS used by the jobtracker, it does not recopy them. There is a tiny catch here: since the URIs of the remote file system and of the one the JT uses are matched as strings, the files might be copied even though it is the same DFS (e.g. hdfs://hostname1:port != hdfs://hostname1.fullyqualifiedname:port)

    - the command line files, archives, and libjars are stored temporarily in the HDFS job directory, from where they are copied locally.
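
The string-matching catch described in the list above can be illustrated with a small, self-contained sketch. It is not code from the patch: the class name `FsUriMatch`, both method names, the port, and the `hostname1.example.com` name are hypothetical, and host canonicalization is passed in as a function (here a fixed map) so the example does not depend on real DNS.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

public class FsUriMatch {

    /** Naive check: two URIs name the same file system only if "scheme://authority" matches as a string. */
    static boolean sameFsByString(URI a, URI b) {
        return prefix(a).equals(prefix(b));
    }

    private static String prefix(URI u) {
        return u.getScheme() + "://" + u.getAuthority();
    }

    /**
     * Alternative check (an assumption, not the patch's logic): compare scheme and port,
     * then compare host names after mapping each through a canonicalizing function.
     */
    static boolean sameFsByResolvedHost(URI a, URI b, UnaryOperator<String> canonicalHost) {
        if (a.getScheme() == null || !a.getScheme().equalsIgnoreCase(b.getScheme())) {
            return false;
        }
        if (a.getPort() != b.getPort() || a.getHost() == null || b.getHost() == null) {
            return false;
        }
        String hostA = canonicalHost.apply(a.getHost().toLowerCase(Locale.ROOT));
        String hostB = canonicalHost.apply(b.getHost().toLowerCase(Locale.ROOT));
        return hostA.equals(hostB);
    }

    public static void main(String[] args) {
        // Stand-in for DNS: the short name maps to a made-up fully qualified name.
        Map<String, String> dns = Map.of("hostname1", "hostname1.example.com");
        UnaryOperator<String> resolver = h -> dns.getOrDefault(h, h);

        URI shortName = URI.create("hdfs://hostname1:9000/user/me/lib.jar");
        URI fullName = URI.create("hdfs://hostname1.example.com:9000/user/me/lib.jar");

        System.out.println(sameFsByString(shortName, fullName));                 // false
        System.out.println(sameFsByResolvedHost(shortName, fullName, resolver)); // true
    }
}
```

With plain string matching the two URIs look like different file systems, so the files would be copied again; comparing canonicalized hosts treats them as the same one.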

    Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on
    --------------------------------------------------------------------------------------------

    Key: HADOOP-1622
    URL: https://issues.apache.org/jira/browse/HADOOP-1622
    Project: Hadoop Core
    Issue Type: Improvement
    Components: mapred
    Reporter: Runping Qi
    Assignee: Mahadev konar
    Fix For: 0.17.0

    Attachments: hadoop-1622-4-20071008.patch, HADOOP-1622-5.patch, HADOOP-1622-6.patch, HADOOP-1622-7.patch, HADOOP-1622-8.patch, HADOOP-1622-9.patch, HADOOP-1622_1.patch, multipleJobJars.patch, multipleJobResources.patch, multipleJobResources2.patch


  • Mahadev konar (JIRA) at Mar 22, 2008 at 2:25 am
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Attachment: HADOOP-1622_2.patch

    Attaching a patch with the unit test. It passes the tests on my machine.
  • Mahadev konar (JIRA) at Mar 22, 2008 at 2:25 am
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Status: Patch Available (was: Reopened)
  • Mahadev konar (JIRA) at Mar 24, 2008 at 4:29 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Mahadev konar (JIRA) at Mar 24, 2008 at 9:33 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Mahadev konar (JIRA) at Mar 24, 2008 at 9:33 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Attachment: HADOOP-1622_3.patch

    Fixed findbugs warnings.
  • Mahadev konar (JIRA) at Mar 25, 2008 at 12:03 am
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Mahadev konar (JIRA) at Mar 25, 2008 at 12:03 am
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Mahadev konar (JIRA) at Mar 25, 2008 at 12:03 am
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Attachment: HADOOP-1622_4.patch

    Got rid of the findbugs warning.
  • Robert Chansler (JIRA) at Mar 25, 2008 at 3:06 am
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Robert Chansler updated HADOOP-1622:
    ------------------------------------

    Fix Version/s: (was: 0.17.0)
  • Sameer Paranjpye (JIRA) at Mar 25, 2008 at 7:57 am
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Sameer Paranjpye updated HADOOP-1622:
    -------------------------------------

    Fix Version/s: 0.17.0
  • Mahadev konar (JIRA) at Mar 25, 2008 at 4:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Mahadev konar (JIRA) at Mar 25, 2008 at 4:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Attachment: HADOOP-1622_5.patch

    This patch implements Devaraj's comment about host resolution. I will file another JIRA for this feature to be used by pipes.
  • Mahadev konar (JIRA) at Mar 25, 2008 at 4:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Mahadev konar (JIRA) at Mar 26, 2008 at 5:55 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Attachment: HADOOP-1622_6.patch

    Looks like the previous patch went stale after some commits yesterday. Attaching a new patch.
  • Mahadev konar (JIRA) at Mar 26, 2008 at 5:57 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Mahadev konar (JIRA) at Mar 26, 2008 at 5:57 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mahadev konar updated HADOOP-1622:
    ----------------------------------

    Status: Patch Available (was: Open)
  • dhruba borthakur (JIRA) at Mar 26, 2008 at 9:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    dhruba borthakur updated HADOOP-1622:
    -------------------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    I just committed this. Thanks Mahadev!

Discussion Overview
group: common-dev @
categories: hadoop
posted: Mar 13, '08 at 6:24p
active: Mar 26, '08 at 9:11p
posts: 19
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion

dhruba borthakur (JIRA): 19 posts
