[ https://issues.apache.org/jira/browse/HADOOP-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670028#action_12670028 ]

Doug Cutting commented on HADOOP-4927:
--------------------------------------
Unless there's a non-FileOutputFormat use case [ ... ]
I see Chris's point and agree. Unless there's a strong reason to put features in the kernel we should prefer to put them in library code, keeping the kernel minimal. Are there non-FileOutputFormats that need this feature?

A wrapper implementation is a bit harder to use, since folks would need to both set the job's output format to the wrapper and set the wrapper's parameter to the real output format: two changes instead of a single parameter, although the wrapper is more generic. We could perhaps implement both: a flag for FileOutputFormat, and a wrapper OutputFormat for folks who've not subclassed FileOutputFormat?
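
To make the wrapper idea concrete, here is a minimal, self-contained sketch of a lazily delegating output format. It is not Hadoop code: the RecordWriter and OutputFormat interfaces below are simplified, hypothetical stand-ins, and FakeFileOutputFormat and LazyWrapperOutputFormat are names invented for illustration. The only point it shows is that the real writer (and hence the part file) is created on the first write rather than when the writer is requested.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyOutputSketch {

    /** Hypothetical stand-in for a record writer; not Hadoop's real interface. */
    interface RecordWriter<K, V> {
        void write(K key, V value);
        void close();
    }

    /** Hypothetical stand-in for an output format; not Hadoop's real interface. */
    interface OutputFormat<K, V> {
        RecordWriter<K, V> getRecordWriter(String partName);
    }

    /** Toy output format that records which "part files" it has created. */
    static class FakeFileOutputFormat implements OutputFormat<String, String> {
        final List<String> createdParts = new ArrayList<>();
        final List<String> records = new ArrayList<>();

        @Override
        public RecordWriter<String, String> getRecordWriter(String partName) {
            createdParts.add(partName); // mimics eager part-file creation
            return new RecordWriter<String, String>() {
                @Override public void write(String key, String value) {
                    records.add(key + "\t" + value);
                }
                @Override public void close() { }
            };
        }
    }

    /** Wrapper that defers creating the real writer until the first write. */
    static class LazyWrapperOutputFormat<K, V> implements OutputFormat<K, V> {
        private final OutputFormat<K, V> real;

        LazyWrapperOutputFormat(OutputFormat<K, V> real) {
            this.real = real;
        }

        @Override
        public RecordWriter<K, V> getRecordWriter(String partName) {
            return new RecordWriter<K, V>() {
                private RecordWriter<K, V> delegate; // stays null until first write

                @Override public void write(K key, V value) {
                    if (delegate == null) {
                        delegate = real.getRecordWriter(partName);
                    }
                    delegate.write(key, value);
                }

                @Override public void close() {
                    if (delegate != null) {
                        delegate.close(); // nothing to close if nothing was written
                    }
                }
            };
        }
    }

    public static void main(String[] args) {
        FakeFileOutputFormat real = new FakeFileOutputFormat();
        OutputFormat<String, String> lazy = new LazyWrapperOutputFormat<>(real);

        // A task that never writes: no part file is created.
        RecordWriter<String, String> idle = lazy.getRecordWriter("part-00000");
        idle.close();

        // A task that writes once: its part file is created on demand.
        RecordWriter<String, String> busy = lazy.getRecordWriter("part-00001");
        busy.write("k", "v");
        busy.close();

        System.out.println(real.createdParts); // prints [part-00001]
    }
}
```

The sketch also illustrates the usability cost mentioned above: the caller has to construct the wrapper and hand it the real output format, whereas a flag on FileOutputFormat would be a single setting.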

Part files on the output filesystem are created irrespective of whether the corresponding task has anything to write there
--------------------------------------------------------------------------------------------------------------------------

Key: HADOOP-4927
URL: https://issues.apache.org/jira/browse/HADOOP-4927
Project: Hadoop Core
Issue Type: New Feature
Components: mapred
Reporter: Devaraj Das
Assignee: Jothi Padmanabhan
Fix For: 0.21.0

Attachments: hadoop-4927-v1.patch, hadoop-4927-v2.patch, hadoop-4927.patch


When OutputFormat.getRecordWriter is invoked, a part file is created on the output filesystem. But the created RecordWriter is not used until the task (user's code) calls OutputCollector.collect. This results in empty part files whenever the corresponding task never invokes OutputCollector.collect.

Discussion Overview
group: common-dev
categories: hadoop
posted: Dec 22, '08 at 6:01a
active: Feb 23, '09 at 3:19p
posts: 48