[ https://issues.apache.org/jira/browse/HADOOP-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670855#action_12670855 ]

Doug Cutting commented on HADOOP-4927:
--------------------------------------

- LazyOutputFormat should keep a field for the nested output format, not create it again for each call, no?
- We might implement generic FilterOutputFormat and FilterRecordReader that LazyOutputFormat and LazyRecordReader extend. This is probably not the last time someone will need to wrap an OutputFormat or a RecordReader.
- JobConf#setOutputFormatClass(Class, boolean) should instead be static LazyOutputFormat#setClass(Job, Class, boolean). This localizes the change, and it's still a one-line change for applications.
- Similarly, JobContext#getLazyOutputFormatClass() should instead be static LazyOutputFormat#getClass(JobContext). This feature can be entirely contained in LazyOutputFormat and should not require changes to the kernel.
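
The points above could be sketched roughly as follows. This is only an illustration under stated assumptions: `SimpleWriter`, `SimpleOutputFormat`, `FilterFormatSketch`, `LazyFormatSketch`, `CountingFormat`, and both configuration keys are hypothetical stand-ins, a plain `Map` replaces the job configuration, and the boolean parameter is left out because its semantics are not spelled out here. None of this is the code in the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for Hadoop's types; NOT the real org.apache.hadoop API.
interface SimpleWriter {
    void write(String key, String value);
}

interface SimpleOutputFormat {
    SimpleWriter getRecordWriter(Map<String, String> conf);
}

// Generic wrapper: forwards every call to a nested format that is
// instantiated once and then kept in a field.
class FilterFormatSketch implements SimpleOutputFormat {
    static final String BASE_CLASS_KEY = "sketch.output.base.class"; // hypothetical key

    private SimpleOutputFormat base; // cached nested format, created on first use

    protected SimpleOutputFormat getBase(Map<String, String> conf) {
        if (base == null) {
            String name = conf.get(BASE_CLASS_KEY);
            if (name == null) {
                throw new IllegalStateException("no nested output format configured");
            }
            try {
                base = (SimpleOutputFormat) Class.forName(name)
                        .getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot instantiate " + name, e);
            }
        }
        return base;
    }

    @Override
    public SimpleWriter getRecordWriter(Map<String, String> conf) {
        return getBase(conf).getRecordWriter(conf);
    }
}

// The feature stays inside this class: static helpers touch the configuration,
// so the job and context classes need no new methods.
class LazyFormatSketch extends FilterFormatSketch {
    static final String FORMAT_CLASS_KEY = "sketch.output.format.class"; // hypothetical key

    static void setClass(Map<String, String> conf, Class<? extends SimpleOutputFormat> cls) {
        conf.put(FORMAT_CLASS_KEY, LazyFormatSketch.class.getName());
        conf.put(BASE_CLASS_KEY, cls.getName());
    }

    static String getBaseClassName(Map<String, String> conf) {
        return conf.get(BASE_CLASS_KEY);
    }
}

// Nested format that counts how often it is constructed.
class CountingFormat implements SimpleOutputFormat {
    static int instances = 0;

    CountingFormat() {
        instances++;
    }

    @Override
    public SimpleWriter getRecordWriter(Map<String, String> conf) {
        return (key, value) -> { /* discard records in this sketch */ };
    }
}

class LazyFormatSketchDemo {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        LazyFormatSketch.setClass(conf, CountingFormat.class); // the one-line change
        LazyFormatSketch format = new LazyFormatSketch();
        format.getRecordWriter(conf);
        format.getRecordWriter(conf);
        // The nested format was constructed once, not once per call.
        System.out.println("nested instances: " + CountingFormat.instances);
    }
}
```

An application would only add the single `setClass` call; the wrapper base class could be reused by any later format that needs to wrap another one.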

Part files on the output filesystem are created irrespective of whether the corresponding task has anything to write there
--------------------------------------------------------------------------------------------------------------------------

Key: HADOOP-4927
URL: https://issues.apache.org/jira/browse/HADOOP-4927
Project: Hadoop Core
Issue Type: New Feature
Components: mapred
Reporter: Devaraj Das
Assignee: Jothi Padmanabhan
Fix For: 0.21.0

Attachments: hadoop-4927-v1.patch, hadoop-4927-v2.patch, hadoop-4927-v3.patch, hadoop-4927.patch


When OutputFormat.getRecordWriter is invoked, a part file is created on the output filesystem. But the created RecordWriter is not used until the OutputCollector.collect call is made by the task (user's code). This results in empty part files even if the OutputCollector.collect is never invoked by the corresponding tasks.
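
One way to picture the intended behaviour is a writer that opens its part file on the first write rather than at construction. The sketch below assumes a local filesystem and a hypothetical `LazyPartWriter` class; it does not use Hadoop's `RecordWriter` or `OutputCollector` and is not taken from the attached patches.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical writer (not Hadoop's RecordWriter): the part file is created
// only when the first record arrives.
class LazyPartWriter implements AutoCloseable {
    private final Path partFile;
    private BufferedWriter out; // stays null until the first write

    LazyPartWriter(Path partFile) {
        this.partFile = partFile; // nothing is created on disk here
    }

    void write(String key, String value) throws IOException {
        if (out == null) {
            // First record: create the part file now.
            out = Files.newBufferedWriter(partFile, StandardCharsets.UTF_8);
        }
        out.write(key + "\t" + value);
        out.newLine();
    }

    @Override
    public void close() throws IOException {
        if (out != null) {
            out.close();
        }
    }
}

class LazyPartWriterDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lazy-out");
        Path unused = dir.resolve("part-00000");
        Path used = dir.resolve("part-00001");

        // A task that never writes: no part file appears.
        LazyPartWriter idle = new LazyPartWriter(unused);
        idle.close();

        // A task that writes one record: the part file is created.
        LazyPartWriter busy = new LazyPartWriter(used);
        busy.write("key", "value");
        busy.close();

        System.out.println(Files.exists(unused)); // false
        System.out.println(Files.exists(used));   // true
    }
}
```

An eager writer would open the file in its constructor, which is what leaves the empty part files described above.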
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Discussion Overview
group: common-dev @
categories: hadoop
posted: Dec 22, '08 at 6:01a
active: Feb 23, '09 at 3:19p
posts: 48
users: 1
website: hadoop.apache.org...
irc: #hadoop