Hi,

Every day, each map/reduce job I schedule on my cluster leaves files
behind on all the DataNodes, in a directory named blocksBeingWritten. After one
week, the files left behind amount to 70 GB in the blocksBeingWritten
directory on each DataNode.

I have noticed that once I restart a DataNode, this directory is cleaned up.


Can someone please help me understand what exactly the files in
this directory are, and why the DataNode only seems to delete them when it is
restarted?

Below is an example of the files that I see in the
blocksBeingWritten directory:
-rw-r--r-- 1 hdfs hadoop 2.0K Jun 14 14:24 blk_2226351414820476901_4655671.meta
-rw-r--r-- 1 hdfs hadoop 254K Jun 14 14:24 blk_2226351414820476901
-rw-r--r-- 1 hdfs hadoop  26K Jun 14 14:25 blk_651476714389509127_4655706.meta
-rw-r--r-- 1 hdfs hadoop 3.2M Jun 14 14:25 blk_651476714389509127
-rw-r--r-- 1 hdfs hadoop 182K Jun 14 14:58 blk_1727419676952982071_4659418.meta
-rw-r--r-- 1 hdfs hadoop  23M Jun 14 14:58 blk_1727419676952982071
-rw-r--r-- 1 hdfs hadoop 447K Jun 14 14:59 blk_687415755671726127_4659433.meta
-rw-r--r-- 1 hdfs hadoop  56M Jun 14 14:59 blk_687415755671726127
-rw-r--r-- 1 hdfs hadoop 476K Jun 14 15:02 blk_-1767796325092574815_4659494.meta
-rw-r--r-- 1 hdfs hadoop  60M Jun 14 15:02 blk_-1767796325092574815
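For what it's worth, a quick way to quantify how much of that data is stale is to look for block files whose mtime is older than a day. The snippet below is only a sketch: it runs against a throwaway temp directory standing in for the real blocksBeingWritten directory (the actual path depends on your dfs.data.dir setting), and fakes two leftover blk_* files so the commands have something to find:

```shell
# Stand-in for ${dfs.data.dir}/blocksBeingWritten (path is an assumption;
# substitute your real DataNode data directory).
BBW=$(mktemp -d)/blocksBeingWritten
mkdir -p "$BBW"

# Fake two leftover block files, one of them old enough to count as stale.
touch "$BBW/blk_2226351414820476901"
touch -d '2 days ago' "$BBW/blk_651476714389509127"

# Count the block files older than one day...
find "$BBW" -name 'blk_*' -mtime +1 | wc -l

# ...and total their size (prints nothing if none match).
find "$BBW" -name 'blk_*' -mtime +1 -exec du -ch {} + | tail -1
```

On a real DataNode you would point BBW at the actual blocksBeingWritten directory and drop the two `touch` lines; note that `touch -d` and `du -ch` as used here assume GNU coreutils.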

Thank you,

JP.

Discussion Overview
group: hdfs-user @ hadoop
posted: Jun 30, '11 at 6:38p
active: Jun 30, '11 at 6:38p
posts: 1
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion

Jean-Pierre OCALAN: 1 post
