DFSClient writes : DataStreamer thread can be removed
------------------------------------------------------

Key: HADOOP-3325
URL: https://issues.apache.org/jira/browse/HADOOP-3325
Project: Hadoop Core
Issue Type: Improvement
Components: dfs
Affects Versions: 0.16.0
Reporter: Raghu Angadi



When a client writes data to DFS, DFSClient keeps two threads for each open file:
- DataStreamer thread: writes the data to the DataNodes (as 64k packets)
- ResponseProcessor: receives acks from the DataNodes and detects related errors

I think the job of the DataStreamer can be done inside the user's write() (i.e. in the user thread). In the normal case there would then be one less thread per open file. When there is an error in the write pipeline, all the un-acked packets need to be resent; in that case, ResponseProcessor can create a temporary thread to send these packets.
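A minimal sketch of this idea, not actual DFSClient code: the class name InlineStreamerSketch, the PacketSink interface and all method names are hypothetical stand-ins. write() sends each packet on the caller's thread and remembers it until it is acked; on a pipeline error, the ack-processing side starts a short-lived thread that resends whatever is still un-acked.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of writing without a dedicated DataStreamer thread. */
final class InlineStreamerSketch {
    /** Stand-in for the connection to the first DataNode in the pipeline. */
    public interface PacketSink {
        void send(long seqno, byte[] payload);
    }

    private static final class Packet {
        final long seqno;
        final byte[] payload;
        Packet(long seqno, byte[] payload) {
            this.seqno = seqno;
            this.payload = payload;
        }
    }

    private final PacketSink sink;
    private final Deque<Packet> unacked = new ArrayDeque<>();
    private long nextSeqno = 0;

    public InlineStreamerSketch(PacketSink sink) {
        this.sink = sink;
    }

    /** Runs on the user's thread: queue the packet as un-acked and send it directly. */
    public synchronized void write(byte[] payload) {
        Packet p = new Packet(nextSeqno++, payload.clone());
        unacked.addLast(p);
        sink.send(p.seqno, p.payload);
    }

    /** Called by the ack-processing thread: drop every packet up to and including seqno. */
    public synchronized void onAck(long seqno) {
        while (!unacked.isEmpty() && unacked.peekFirst().seqno <= seqno) {
            unacked.removeFirst();
        }
    }

    /** Called by the ack-processing thread on a pipeline error: resend on a temporary thread. */
    public Thread onPipelineError() {
        Thread t = new Thread(this::resendUnacked, "temp-resender");
        t.start();
        return t;
    }

    /** Holds the lock while resending, so new writes wait and packet order is preserved. */
    private synchronized void resendUnacked() {
        for (Packet p : unacked) {
            sink.send(p.seqno, p.payload);
        }
    }

    public synchronized int unackedCount() {
        return unacked.size();
    }
}
```

In this sketch the temporary thread exists only for the duration of the resend, so the steady state has one thread fewer per open file. A real implementation would also have to rebuild the pipeline before resending, which is left out here.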

In the future, the acks for multiple pipelines could be handled by a common thread (at least in the default case, where sockets are non-blocking).
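One way a common ack thread could look, sketched with java.nio's Selector. Everything here is an assumption made for illustration: the class SharedAckReaderSketch, the pipeline ids, and the ack format (a bare 8-byte sequence number) are hypothetical and do not reflect the actual DFS wire protocol.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: one thread reads acks for many write pipelines. */
final class SharedAckReaderSketch {
    /** Per-pipeline state; the buffer collects one 8-byte ack across partial reads. */
    private static final class PipelineState {
        final String id;
        final ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
        PipelineState(String id) {
            this.id = id;
        }
    }

    private final Selector selector;
    private final Map<String, Long> lastAcked = new ConcurrentHashMap<>();

    public SharedAckReaderSketch() throws IOException {
        selector = Selector.open();
    }

    /** Registers the ack channel of one pipeline, switching it to non-blocking mode. */
    public <C extends SelectableChannel & ReadableByteChannel> void register(String pipelineId, C ch)
            throws IOException {
        ch.configureBlocking(false);
        ch.register(selector, SelectionKey.OP_READ, new PipelineState(pipelineId));
    }

    /** One pass of the shared loop; timeoutMs must be positive. Returns the number of acks read. */
    public int pollOnce(long timeoutMs) throws IOException {
        if (selector.select(timeoutMs) == 0) {
            return 0;
        }
        int acks = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            PipelineState st = (PipelineState) key.attachment();
            ReadableByteChannel ch = (ReadableByteChannel) key.channel();
            while (true) {
                int n = ch.read(st.buf);
                if (!st.buf.hasRemaining()) {   // a complete sequence number has arrived
                    st.buf.flip();
                    lastAcked.put(st.id, st.buf.getLong());
                    st.buf.clear();
                    acks++;
                    continue;
                }
                if (n < 0) {
                    key.cancel();               // peer closed this pipeline's ack stream
                }
                if (n <= 0) {
                    break;                      // nothing more to read right now
                }
            }
        }
        return acks;
    }

    /** Highest sequence number seen so far for the pipeline, or null if none. */
    public Long lastAcked(String pipelineId) {
        return lastAcked.get(pipelineId);
    }
}
```

A single thread calling pollOnce() in a loop would then replace one ResponseProcessor thread per open file. Error detection and per-DataNode ack status are omitted from this sketch.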

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Discussion Overview
group: common-dev @
categories: hadoop
posted: Apr 29, '08 at 5:57p
active: Apr 29, '08 at 5:57p
posts: 1
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion

Raghu Angadi (JIRA): 1 post

People

Translate

site design / logo © 2022 Grokbase