I'm trying to access a file that I passed with the -files option of my
hadoop jar command.
In my output format, I am doing something like:
    Path[] cacheFiles = DistributedCache.getLocalCacheFiles(conf);
    String file1 = "";
    String file2 = "";
    Path pt = null;
    for (Path p : cacheFiles) {
        if (p != null) {
            if (p.getName().endsWith(".ryp")) {
                file1 = p.getName();
            } else if (p.getName().endsWith(".cpt")) {
                file2 = p.getName();
                pt = p;
            }
        }
    }
    // then read the file, which gives file does not exist exception:
    Path pat = new Path(file2);
    BufferedReader reader = null;
    try {
        FileSystem fs = FileSystem.get(conf);
        reader = new BufferedReader(
            new InputStreamReader(fs.open(pat)));
        String line = null;
        while ((line = reader.readLine()) != null) {
            System.out.println("Now parsing the line: " + line);
        }
    } catch (Exception e) {
        System.out.println("exception" + e.getMessage());
    }
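One possible cause, offered as an assumption rather than a confirmed diagnosis: the snippet above keeps only p.getName() and opens that bare name through FileSystem.get(conf), which returns the job's default filesystem (typically HDFS on a cluster), while getLocalCacheFiles returns paths on the task node's local disk. The sketch below uses only the JDK to illustrate keeping the full local path and reading it with local file APIs. LocalCacheReader, findBySuffix and readLines are hypothetical names, and the list of strings stands in for the paths returned by getLocalCacheFiles.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of Hadoop, JDK classes only.
final class LocalCacheReader {

    // Returns the first path ending with the given suffix, or null if none matches.
    // The full path is kept; reducing it to the bare file name loses the directory.
    static Path findBySuffix(List<String> localCachePaths, String suffix) {
        for (String s : localCachePaths) {
            if (s != null && s.endsWith(suffix)) {
                return Paths.get(s);
            }
        }
        return null;
    }

    // Reads every line of a file on the local disk and closes the reader afterwards.
    static List<String> readLines(Path localFile) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(localFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

In Hadoop code, the analogous step would be to open the full Path (pt in the snippet above) through FileSystem.getLocal(conf) instead of opening new Path(file2) through FileSystem.get(conf). Whether the cache is populated at all at the point where the output format runs is a separate question.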
On Fri, Jul 29, 2011 at 10:50 AM, Alejandro Abdelnur wrote:
Where are you getting the error, in the client submitting the job or in the
MR tasks?
Are you trying to access a file or trying to set a JAR in the
DistributedCache?
How/when are you adding the file/JAR to the DC?
How are you retrieving the file/JAR from your outputformat code?
Thxs.
Alejandro
On Fri, Jul 29, 2011 at 10:43 AM, Mapred Learn wrote:
I am trying to create a custom text output format in which I want to access a
distributed cache file.
On Fri, Jul 29, 2011 at 10:42 AM, Harsh J wrote:
Mapred,
By outputformat, do you mean the frontend, submit-time run of
OutputFormat? Then no, it cannot access the distributed cache, because
the cache is not really set up at that point, and the frontend doesn't
really need the distributed cache when it can access those files directly.
Could you describe in a bit more detail what you're attempting to do?
On Fri, Jul 29, 2011 at 10:57 PM, Mapred Learn wrote:
Hi,
I am trying to access the distributed cache in my custom output format, but it
does not work: opening the file in the custom output format fails with a
"file does not exist" error even though the file physically exists.
It looks like the distributed cache only works for Mappers and Reducers?
Is there a way I can read the distributed cache in my custom output format?
Thanks,
-JJ
--
Harsh J