[ https://issues.apache.org/jira/browse/HADOOP-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-3041:
--------------------------------------------
Attachment: patch-3041.txt
After the discussions, attaching a patch that does the following:
1. Deprecates JobConf.setOutputPath and JobConf.getOutputPath
JobConf.getOutputPath() still returns the same value that it used to return.
2. Deprecates OutputFormatBase and adds FileOutputFormat. Existing output formats that extended OutputFormatBase now extend FileOutputFormat.
3. Adds the following APIs in FileOutputFormat :
public static void setOutputPath(JobConf conf, Path outputDir); // sets mapred.output.dir
public static Path getOutputPath(JobConf conf); // gets mapred.output.dir
public static Path getWorkOutputPath(JobConf conf); // gets mapred.work.output.dir
4. static void setWorkOutputPath(JobConf conf, Path outputDir) is also added to FileOutputFormat. The framework uses it to set mapred.work.output.dir to the task's temporary output dir.
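
The split between the two properties in items 3 and 4 can be sketched roughly as follows. This is a stand-alone model, not code from the patch: OutputPathSketch is a hypothetical stand-in for FileOutputFormat, a plain Map plays the role of JobConf, and paths are plain strings. Only the property names mapred.output.dir and mapred.work.output.dir come from the notes above; the directory values are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for FileOutputFormat; a Map plays the role of JobConf.
class OutputPathSketch {
    static final String OUTPUT_DIR = "mapred.output.dir";
    static final String WORK_OUTPUT_DIR = "mapred.work.output.dir";

    // Client-facing: records the job's output directory.
    static void setOutputPath(Map<String, String> conf, String outputDir) {
        conf.put(OUTPUT_DIR, outputDir);
    }

    // Returns the value the client set, untouched by the framework.
    static String getOutputPath(Map<String, String> conf) {
        return conf.get(OUTPUT_DIR);
    }

    // Framework-facing: records the task's temporary output directory
    // under a separate key, so the client's value is never overwritten.
    static void setWorkOutputPath(Map<String, String> conf, String workDir) {
        conf.put(WORK_OUTPUT_DIR, workDir);
    }

    static String getWorkOutputPath(Map<String, String> conf) {
        return conf.get(WORK_OUTPUT_DIR);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setOutputPath(conf, "/user/foo/myoutput");                    // client
        setWorkOutputPath(conf, "/user/foo/myoutput/_temporary");     // framework (illustrative value)
        System.out.println(getOutputPath(conf));
        System.out.println(getWorkOutputPath(conf));
    }
}
```

Because the two values live under different keys, a task can read either one without the framework's setting masking the client's.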
Within a task, the value of JobConf.getOutputPath() method is modified
---------------------------------------------------------------------
Key: HADOOP-3041
URL: https://issues.apache.org/jira/browse/HADOOP-3041
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Affects Versions: 0.16.1
Environment: all
Reporter: Alejandro Abdelnur
Assignee: Amareshwari Sriramadasu
Priority: Blocker
Fix For: 0.17.0
Attachments: patch-3041-0.16.2.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt, patch-3041.txt
Until 0.16.0 the value of the getOutputPath() method, if queried within a task, pointed to the part file assigned to the task.
For example: /user/foo/myoutput/part_00000
In 0.16.1, it instead returns an internal Hadoop path: the task's temporary output location.
For the above example: /user/foo/myoutput/_temporary/part_00000
This change breaks applications that use getOutputPath() to compute other directories.
IMO, this has always been broken: Hadoop should not change the values of properties injected by the client; instead, it should use private properties or internal helper methods.
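
To illustrate the breakage described above, here is a small sketch of an application that derives a side directory from the path reported inside a task. SideDirSketch, its methods, and the "extras" directory name are hypothetical; the two input paths are the examples given in this description, and paths are handled as plain slash-separated strings rather than Hadoop Path objects.

```java
// Hypothetical application helper: derives a directory next to the task's output.
class SideDirSketch {
    // Returns the parent of a slash-separated absolute path.
    static String parent(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    // Computes a sibling directory of the reported output path.
    static String sideDir(String reportedOutputPath, String name) {
        String p = parent(reportedOutputPath);
        return (p.equals("/") ? "" : p) + "/" + name;
    }

    public static void main(String[] args) {
        // Value reported up to 0.16.0
        System.out.println(sideDir("/user/foo/myoutput/part_00000", "extras"));
        // Value reported in 0.16.1
        System.out.println(sideDir("/user/foo/myoutput/_temporary/part_00000", "extras"));
    }
}
```

With the first input the side directory lands under /user/foo/myoutput; with the second it lands under /user/foo/myoutput/_temporary, a location the application never asked for.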