We have an 11-node Hadoop cluster running 0.20.2 that has been in production for 15 months now. The system is used to process log files that are ingested daily, and the oldest files in HDFS are deleted to free up space as needed, typically when free space drops below 10% (the delete is done using 'hadoop fs -rmr' on the parent directory of the files to be deleted). When the HDFS was originally built, 1TB of the 20TB total was reported as 'Non DFS Used'. This 1TB stayed constant for at least the first year the system was in use.
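To make the cleanup policy concrete, here is a minimal sketch of the logic described above. It is not our actual script: the function names, the `/logs/YYYY-MM-DD` directory layout, and the `dry_run` flag are assumptions for illustration. Only the 10% threshold and the 'hadoop fs -rmr' command come from the setup described here.

```python
import subprocess

FREE_THRESHOLD = 0.10  # clean up when less than 10% of capacity is free


def needs_cleanup(capacity_bytes, remaining_bytes, threshold=FREE_THRESHOLD):
    """Return True when the free fraction of HDFS is below the threshold."""
    return float(remaining_bytes) / capacity_bytes < threshold


def oldest_directory(dated_dirs):
    """Pick the oldest directory.

    Assumes (hypothetically) that directory names sort chronologically,
    e.g. /logs/2010-12-30 sorts before /logs/2011-03-02.
    """
    return min(dated_dirs)


def delete_directory(path, dry_run=True):
    """Build the recursive delete command; run it only if dry_run is False."""
    cmd = ["hadoop", "fs", "-rmr", path]
    if not dry_run:
        subprocess.check_call(cmd)
    return cmd
```

With `dry_run=True` (the default in this sketch) nothing is deleted; the function only returns the command it would run, which makes the policy easy to check before pointing it at a real cluster.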
However, over the last few weeks I have seen the 'Non DFS Used' figure reported by the NameNode dfshealth.jsp page grow to 2TB, and it is still rising. The total number of files/directories and blocks in use has remained fairly constant over this time. I am concerned that Non DFS Used will consume more and more of the HDFS capacity if left unchecked. Running fsck gave "The filesystem under path '/' is HEALTHY".
Questions:
1) What exactly is Hadoop reporting as 'Non DFS Used', and how is it calculated? Is it space taken by files that sit on the same partition(s) as the HDFS files but are not actually part of HDFS?
2) Any ideas on what is driving the growth in Non DFS Used space? I looked for things like growing log files on the datanodes but didn't find anything.
Thanks,
Scott