If #link is missing from uri format of -cacheArchive then streaming does not throw error.
-----------------------------------------------------------------------------------------

Key: HADOOP-2879
URL: https://issues.apache.org/jira/browse/HADOOP-2879
Project: Hadoop Core
Issue Type: Bug
Components: contrib/streaming
Reporter: Karam Singh
Priority: Minor


Ran the hadoop streaming command as follows:
bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input in -output out -mapper "xargs cat" -reducer "bin/cat" -cacheArchive hdfs://h:p/pathofJarFile
Streaming submits the job to the jobtracker and the map fails.
For a similar command with -cacheFile:
bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input in -output out -mapper "xargs cat" -reducer "bin/cat" -cacheFile hdfs://h:p/pathofFile
the following error is reported back:
[
You need to specify the uris as hdfs://host:port/#linkname,Please specify a different link name for all of your caching URIs
]


Streaming should check for the presence of #link after the URI of -cacheArchive and should throw a proper error.


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Karam Singh (JIRA) at Feb 22, 2008 at 2:48 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571427#action_12571427 ]

    Karam Singh commented on HADOOP-2879:
    -------------------------------------

    Looking at the code in StreamJob.java (line 845):
    [
    boolean b = DistributedCache.checkURIs(fileURIs, archiveURIs);
    if (!b)
      fail(LINK_URI);
    ]

    StreamJob.java calls checkURIs of DistributedCache.
    Looking at the checkURIs code in org.apache.hadoop.DistributedCache.java (line 716 onwards):
    [
    if (uriFiles != null){
      for (int i = 0; i < uriFiles.length; i++){
        String frag1 = uriFiles[i].getFragment();
        if (frag1 == null)
          return false;
        for (int j=i+1; j < uriFiles.length; j++){
          String frag2 = uriFiles[j].getFragment();
          if (frag2 == null)
            return false;
          if (frag1.equalsIgnoreCase(frag2))
            return false;
        }
        if (uriArchives != null){
          for (int j = 0; j < uriArchives.length; j++){
            String frag2 = uriArchives[j].getFragment();
            if (frag2 == null){
              return false;
            }
            if (frag1.equalsIgnoreCase(frag2))
              return false;
            for (int k=j+1; k < uriArchives.length; k++){
              String frag3 = uriArchives[k].getFragment();
              if (frag3 == null)
                return false;
              if (frag2.equalsIgnoreCase(frag3))
                return false;
            }
          }
        }
      }
    }
    return true;
    ]

    It seems that if uriFiles is null, no checks are performed on uriArchives, because the loop over uriArchives is nested inside the loop over uriFiles. So if the -cacheFile option is not present, the -cacheArchive URIs are not validated at all.
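
One possible way to close this gap is to validate the two arrays independently, so that archive URIs are checked even when no -cacheFile option is given. The following is only a minimal sketch under stated assumptions, not the actual DistributedCache code or the fix that was committed: the class name CacheUriCheck and the helper allFragmentsUnique are hypothetical, and lower-casing with Locale.ROOT is used here as an approximation of the equalsIgnoreCase comparison in the original.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for DistributedCache.checkURIs; names are illustrative only.
final class CacheUriCheck {

    // Returns true only if every URI in both arrays carries a #fragment
    // and no fragment (compared case-insensitively) is used twice.
    static boolean checkURIs(URI[] uriFiles, URI[] uriArchives) {
        Set<String> seen = new HashSet<>();
        // Each array is checked on its own, so a null uriFiles
        // no longer skips the validation of uriArchives.
        return allFragmentsUnique(uriFiles, seen)
            && allFragmentsUnique(uriArchives, seen);
    }

    // Adds each fragment to 'seen'; fails on a missing or repeated fragment.
    private static boolean allFragmentsUnique(URI[] uris, Set<String> seen) {
        if (uris == null) {
            return true; // option not given: nothing to validate
        }
        for (URI uri : uris) {
            String frag = uri.getFragment();
            if (frag == null) {
                return false; // "#linkname" is missing
            }
            if (!seen.add(frag.toLowerCase(Locale.ROOT))) {
                return false; // link name already used by another URI
            }
        }
        return true;
    }
}
```

With a check of this shape, an archive URI with no fragment, such as the one in the report, would be rejected even when uriFiles is null, so StreamJob could fail with the LINK_URI message before submitting the job. Sharing one set across both arrays also keeps the existing rule that file and archive link names must not collide.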

