DF enhancement: performance and win XP support
----------------------------------------------

Key: HADOOP-33
URL: http://issues.apache.org/jira/browse/HADOOP-33
Project: Hadoop
Type: Improvement
Components: fs, dfs
Environment: Unix, Cygwin, Win XP
Reporter: Konstantin Shvachko
Priority: Minor


1. DF is called twice for each heartbeat, which happens every 3 seconds.
There is a simple fix for that in the attached patch.

2. Cygwin is required to run the df program in a Windows environment.
There is a class, org.apache.commons.io.FileSystemUtils, which can return free disk space
for different OSs, but it has no means of getting disk capacity.
In general, Windows has no efficient and uniform way to calculate disk capacity
using a shell command.
The choices are 'chkdsk' and 'defrag -a', but both of them are too slow to be called
every 3 seconds.
Windows XP and Server 2003 have a new tool called fsutil, which provides all the necessary info.
I implemented a call to fsutil for the case where df fails and the OS supports fsutil.
Other Windows versions should still run Cygwin.
I tested this feature on Linux, Windows XP and Cygwin.
See attached patch.
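
For illustration only, the fsutil fallback from item 2 might parse the tool's output roughly as sketched below. This is not the code from the attached patch. The class name FsutilDiskFree is hypothetical, and the line labels are an assumption based on what "fsutil volume diskfree C:" printed on Windows XP; they should be checked against the target system before relying on them.

```java
/** Hypothetical holder for the two numbers DF needs; not taken from the patch. */
class FsutilDiskFree {
    final long freeBytes;      // value of the "Total # of free bytes" line
    final long capacityBytes;  // value of the "Total # of bytes" line

    FsutilDiskFree(long freeBytes, long capacityBytes) {
        this.freeBytes = freeBytes;
        this.capacityBytes = capacityBytes;
    }

    /**
     * Parses text shaped like XP-era "fsutil volume diskfree" output.
     * The labels matched here are an assumption about that format.
     */
    static FsutilDiskFree parse(String output) {
        long free = -1L;
        long capacity = -1L;
        for (String line : output.split("\\r?\\n")) {
            int colon = line.lastIndexOf(':');
            if (colon < 0) {
                continue; // not a "label : value" line
            }
            String label = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (!value.matches("\\d+")) {
                continue; // skip anything that is not a plain byte count
            }
            if (label.equals("Total # of free bytes")) {
                free = Long.parseLong(value);
            } else if (label.equals("Total # of bytes")) {
                capacity = Long.parseLong(value);
            }
        }
        if (free < 0 || capacity < 0) {
            throw new IllegalArgumentException("unrecognized fsutil output");
        }
        return new FsutilDiskFree(free, capacity);
    }
}
```

A caller would run the command only after df has failed, pass the captured standard output to parse(), and treat the exception as "fsutil not usable on this OS".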

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira


  • Konstantin Shvachko (JIRA) at Feb 10, 2006 at 10:46 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: DFpatch.txt
  • Konstantin Shvachko (JIRA) at Feb 14, 2006 at 2:58 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: DF.patch

    DF.patch uses relative file names
    Please disregard the DFpatch.txt attachment
  • Doug Cutting (JIRA) at Feb 23, 2006 at 5:03 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12367534 ]

    Doug Cutting commented on HADOOP-33:
    ------------------------------------

    Fixing things to not call DF twice per heartbeat would be great. But why do we need the DiskUsage class? Can't we just keep the DF instance and reuse it? I don't see the advantage of wrapping the DF inside another class. It just adds code, and less code is better. Also, the logic of getRemaining() is duplicated after your patch.

    Perhaps what's needed is a private getDF() method in FSDataset that checks whether a cached DF instance has been refreshed in the last N milliseconds. If it has not, it is refreshed. Then it is returned. Something like:

    private synchronized DF getDF() {
      long now = System.currentTimeMillis();
      // Re-run df only when the cached result is older than DF_INTERVAL ms.
      if ((now - lastDfTime) > DF_INTERVAL) {
        df = new DF();
        lastDfTime = now;
      }
      return df;
    }

    Then getRemaining() and getCapacity() can be defined in terms of getDF(). Does this make sense?
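
As a rough, self-contained sketch of that idea (not the committed Hadoop code): the names DiskSnapshot, DiskProbe and DatasetDiskStats below are hypothetical stand-ins for DF and FSDataset, and the probe interface replaces the real call to the df program so the caching logic can be tried in isolation.

```java
/** Hypothetical stand-in for one df result. */
class DiskSnapshot {
    final long capacity;
    final long remaining;

    DiskSnapshot(long capacity, long remaining) {
        this.capacity = capacity;
        this.remaining = remaining;
    }
}

/** Hypothetical hook for whatever actually measures the disk (e.g. running df). */
interface DiskProbe {
    DiskSnapshot measure();
}

/** Hypothetical FSDataset-like owner that caches the last measurement. */
class DatasetDiskStats {
    private final DiskProbe probe;
    private final long intervalMillis; // plays the role of DF_INTERVAL
    private DiskSnapshot cached;
    private long lastRefresh;

    DatasetDiskStats(DiskProbe probe, long intervalMillis) {
        this.probe = probe;
        this.intervalMillis = intervalMillis;
    }

    /** Returns the cached snapshot, re-measuring only when it is stale. */
    private synchronized DiskSnapshot getSnapshot() {
        long now = System.currentTimeMillis();
        if (cached == null || (now - lastRefresh) > intervalMillis) {
            cached = probe.measure();
            lastRefresh = now;
        }
        return cached;
    }

    // Both getters share one cached measurement, so a heartbeat that calls
    // both of them triggers at most one probe per interval.
    long getCapacity() {
        return getSnapshot().capacity;
    }

    long getRemaining() {
        return getSnapshot().remaining;
    }
}
```

With an interval of a few seconds, a heartbeat that asks for both capacity and remaining space costs one measurement instead of two.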

    Finally, Hadoop currently requires Cygwin in a number of places, most notably in the startup scripts. The current strategy is not to maintain native Windows versions of these, but rather to rely on Cygwin. This patch increases the code size without removing the dependency on Cygwin. If you like, we could start another bug to entirely remove the dependency on Cygwin, porting all scripts, DF, etc. But that is a low-priority item for me, since Cygwin offers a fine solution with no code duplication.

    In summary, I'd love to see a patch that fixes the DF problem with a minimum of code. Thanks!
  • Konstantin Shvachko (JIRA) at Feb 25, 2006 at 2:40 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12367755 ]

    Konstantin Shvachko commented on HADOOP-33:
    -------------------------------------------

    1. Keeping lastDfTime and updating DF every time DF_INTERVAL passes is definitely a good solution.
    I would go even further and place the DF renew/refresh logic directly into the DF class, so that functions
    calling the DF get-methods are free to assume the data is up to date.
    I don't know whether we need it, but DF_INTERVAL could be made a configurable parameter.
    This will bring in more code, but will make the DF class easier to use in the end.
    Do we want it?

    2. My patch does not remove the dependency on Cygwin overall. What it does is
    remove the dependency on Cygwin in one particular case, without compromising performance on the mainstream OS.
    After that, the whole file system can run (and actually does run) on Windows without the overhead of Cygwin.
    Additional code is justifiable and inevitable in this case until Sun implements this functionality for us using native libraries :-).

    3. What do you mean by minimizing the code?
    Is it "the minimum of changes to the existing code that solve the problem", or is it
    the minimal amount of total code committed to the repository?
    Or is it minimizing the code required in the future to use the feature?
    This is actually an interesting topic for discussion...

  • Doug Cutting (JIRA) at Feb 27, 2006 at 6:40 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12368006 ]

    Doug Cutting commented on HADOOP-33:
    ------------------------------------

    Yes, DF_INTERVAL should be configurable.

    Caching inside DF sounds fine. We'd then want to add a DF field to FSDataset, so that we always reuse the same instance.

    By minimizing the code I primarily mean minimal total code committed to the repository. Minimizing the size of patches is also good, since it makes it easier to understand.

    I do not see how removing the dependency on Cygwin in this one case helps the project: it makes it bigger but adds no functionality and removes no dependencies. Dependencies are also not bad: we don't want to re-invent things. Cygwin has already solved this problem (and some others) for us, permitting us to focus on Hadoop's more critical issues.
  • Tim Patton (JIRA) at Feb 27, 2006 at 8:48 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12368021 ]

    Tim Patton commented on HADOOP-33:
    ----------------------------------

    Konstantin,

    Thank you for submitting that patch. Even though it wasn't accepted, I just copied your tryOtherOS method into my code to get rid of that annoying Unix dependency. I was surprised to see a dependency like that on an operating system command. Frankly, as a developer and a user, a little extra code is a lot less annoying than several megs of Cygwin and a few hours getting it working (especially since it never worked right).
  • Doug Cutting (JIRA) at Feb 27, 2006 at 8:59 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12368023 ]

    Doug Cutting commented on HADOOP-33:
    ------------------------------------

    So, Tim, I take it you're using Hadoop on XP without Cygwin? What are you using for startup scripts?
  • Tim Patton (JIRA) at Feb 27, 2006 at 9:16 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12368027 ]

    Tim Patton commented on HADOOP-33:
    ----------------------------------

    Doug,

    I'm getting there, so far the NameNode and the DataNode are working. I've just got hacky batch files so far, I don't have the nice shell script system you guys developed. Here's my batch file for the NameNode:

    java -Xmx256m -cp c:\java\lib\hadoop-nightly\conf;"c:\program files\java\jdk1.5.0_06\lib\tools.jar";c:\java\lib\hadoop-nightly\hadoop-nightly.jar;c:\java\lib\hadoop-nightly\hadoop-nightly-examples.jar;c:\java\lib\hadoop-nightly\hadoop-nightly-examples.jar;c:\java\lib\hadoop-nightly\hadoop-nightly.jar;c:\java\lib\hadoop-nightly\lib\commons-logging-api-1.0.4.jar;c:\java\lib\hadoop-nightly\lib\jetty-5.1.4.jar;c:\java\lib\hadoop-nightly\lib\junit-3.8.1.jar;c:\java\lib\hadoop-nightly\lib\lucene-core-1.9-rc1-dev.jar;c:\java\lib\hadoop-nightly\lib\servlet-api.jar;c:\java\lib\hadoop-nightly\lib\jetty-ext\ant.jar;c:\java\lib\hadoop-nightly\lib\jetty-ext\commons-el.jar;c:\java\lib\hadoop-nightly\lib\jetty-ext\jasper-compiler.jar;c:\java\lib\hadoop-nightly\lib\jetty-ext\jasper-runtime.jar;c:\java\lib\hadoop-nightly\lib\jetty-ext\jsp-api.jar org.apache.hadoop.dfs.NameNode

    Now I am just trying to figure out why JobTracker won't work. I couldn't get JobTracker working in Cygwin either, so I thought I'd edit the code to give me stack traces and see if I could narrow it down, which led me to try to get it working right in CodeGuide without having to keep going back to Cygwin...
  • Tim Patton (JIRA) at Feb 27, 2006 at 9:18 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12368028 ]

    Tim Patton commented on HADOOP-33:
    ----------------------------------

    Oops, I didn't realize lines were not auto-wrapping. Sorry about that. And I meant to say I knew my batch file doesn't have relative paths, I was just copying and pasting to get it working.
  • Jeff Ritchie at Feb 28, 2006 at 1:12 am
    Does this project have any use/merit?

    http://unxutils.sourceforge.net/

    I'm not sure if all the unix utils needed for hadoop / nutch are in
    there or not.

    Jeff.

  • Konstantin Shvachko (JIRA) at Mar 2, 2006 at 12:12 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12368399 ]

    Konstantin Shvachko commented on HADOOP-33:
    -------------------------------------------

    This patch (DF2.patch) covers DF caching and reuse of the same instance of DF in FSDataset.
    I removed main() from the DF class, and created class TestDF in the test directory.
    Additionally, for those who want Windows XP/2003 df functionality without Cygwin,
    I attach DF.java, which covers that and is ready for adding other OSs, if desired.
    Just replace the committed DF.java with the one attached.
  • Konstantin Shvachko (JIRA) at Mar 2, 2006 at 12:14 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: DF2.patch
  • Konstantin Shvachko (JIRA) at Mar 2, 2006 at 12:16 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: DF.java
  • Konstantin Shvachko (JIRA) at Mar 2, 2006 at 7:46 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: (was: DFpatch.txt)
  • Konstantin Shvachko (JIRA) at Mar 2, 2006 at 8:02 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: (was: DF.java)
  • Konstantin Shvachko (JIRA) at Mar 2, 2006 at 8:10 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: DF3.patch
    DF.java

    Done some code reduction.
    DF3.patch is the latest version now.
  • Doug Cutting (JIRA) at Mar 16, 2006 at 1:06 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12370627 ]

    Doug Cutting commented on HADOOP-33:
    ------------------------------------

    This still needs a few improvements.

    - Why are all the fields protected? Shouldn't they be private?
    - The test case isn't a JUnit test.
    - The toString() method now needs to call doDF().
    - The javadoc now says it uses a Windows command, but it doesn't.


  • Konstantin Shvachko (JIRA) at Mar 16, 2006 at 2:20 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12370631 ]

    Konstantin Shvachko commented on HADOOP-33:
    -------------------------------------------

    - The fields are protected rather than private to keep the class extensible, in case anybody
    wants to support other OSs, as I did in the attached version of DF.java.
    - I would remove this test entirely; I don't see any reason to test DF outside the file system.
    - I don't think toString() should call doDF(). We probably want toString() to reflect the current
    state of the class rather than the current state of the disk drive.
    - The Javadoc should say exactly what it originally said; that was my mistake.
  • Konstantin Shvachko (JIRA) at Mar 17, 2006 at 1:11 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: DF4.patch

    DF4.patch

    - made fields private
    - made the test a JUnit test
    - left toString() unchanged
    - removed the WinXP comment

    Is there anything else holding us up?
  • Sameer Paranjpye (JIRA) at Mar 24, 2006 at 10:50 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Sameer Paranjpye updated HADOOP-33:
    -----------------------------------

    Fix Version: 0.1
    Version: 0.1
  • Sameer Paranjpye (JIRA) at Mar 24, 2006 at 10:50 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Sameer Paranjpye reassigned HADOOP-33:
    --------------------------------------

    Assign To: Konstantin Shvachko
  • Konstantin Shvachko (JIRA) at Mar 25, 2006 at 2:13 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: (was: DF4.patch)
  • Konstantin Shvachko (JIRA) at Mar 25, 2006 at 2:15 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: (was: DF3.patch)
  • Konstantin Shvachko (JIRA) at Mar 25, 2006 at 2:15 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: (was: DF2.patch)
  • Konstantin Shvachko (JIRA) at Mar 25, 2006 at 2:15 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: (was: DF.patch)
  • Konstantin Shvachko (JIRA) at Mar 25, 2006 at 2:40 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: (was: DF.java)
  • Konstantin Shvachko (JIRA) at Mar 25, 2006 at 2:43 am
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: DF.java
    DF5.patch

    Updating the DF patch.
    The constructor now calls df to assign actual values to the members.
    main() is back in place of testDF.
    DF.java is also updated to reflect these changes.
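
The behaviour discussed in this thread (measure once in the constructor, avoid running df twice per heartbeat, and keep toString() free of side effects) can be sketched roughly as follows. This is not the code from DF5.patch: the class name CachedDiskUsage, the Probe interface, and the explicit refresh interval are all assumptions made for illustration, and the current time is passed in explicitly only to keep the sketch easy to test.

```java
/** Hypothetical sketch; names and the refresh policy are assumptions, not DF5.patch. */
final class CachedDiskUsage {

  /** Stand-in for whatever actually runs df; returns {capacity, available}. */
  interface Probe {
    long[] measure();
  }

  private final Probe probe;
  private final long intervalMs;
  private long lastRefreshMs;
  private long capacity;
  private long available;

  CachedDiskUsage(Probe probe, long intervalMs, long nowMs) {
    this.probe = probe;
    this.intervalMs = intervalMs;
    refresh(nowMs); // the constructor assigns actual values to the members
  }

  long getCapacity(long nowMs) {
    refreshIfStale(nowMs);
    return capacity;
  }

  long getAvailable(long nowMs) {
    refreshIfStale(nowMs);
    return available;
  }

  /** Re-runs the probe only when the cached values are older than the interval. */
  private void refreshIfStale(long nowMs) {
    if (nowMs - lastRefreshMs >= intervalMs) {
      refresh(nowMs);
    }
  }

  private void refresh(long nowMs) {
    long[] measured = probe.measure();
    capacity = measured[0];
    available = measured[1];
    lastRefreshMs = nowMs;
  }

  /** Reports the cached state of the object; deliberately does not run the probe. */
  @Override
  public String toString() {
    return "capacity=" + capacity + " available=" + available;
  }
}
```

With this shape, two getter calls made during the same heartbeat share one measurement, and printing the object never launches a subprocess.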
  • Doug Cutting (JIRA) at Mar 29, 2006 at 10:40 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Doug Cutting resolved HADOOP-33:
    --------------------------------

    Resolution: Fixed

    The patch looks great. I just committed it. Thanks, Konstantin!
  • Konstantin Shvachko (JIRA) at Jul 25, 2006 at 8:12 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: (was: DF.java)
  • Konstantin Shvachko (JIRA) at Jul 25, 2006 at 8:20 pm
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=all ]

    Konstantin Shvachko updated HADOOP-33:
    --------------------------------------

    Attachment: DF.java

    I'm updating the universal version of DF.java to reflect changes introduced by HADOOP-344.
    DF is Hadoop's only dependency on Cygwin, not counting the scripts.
    To run Hadoop on Windows XP without Cygwin, you need to replace DF.java with the
    attached file.
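
As a rough illustration of the OS-dependent choice described in this issue, the sketch below picks between df and fsutil from the JVM's os.name value. It is a simplification, not the attached DF.java: the issue describes trying df first and falling back to fsutil, whereas this sketch chooses up front. The class name DiskCommandChooser and the exact os.name strings checked are assumptions; `fsutil volume diskfree` expects a drive such as C: rather than an arbitrary path.

```java
import java.util.Locale;

/** Hypothetical helper; a simplified stand-in for the logic in the attached DF.java. */
final class DiskCommandChooser {

  private DiskCommandChooser() {}

  /**
   * Returns the command line to run for the given OS name and volume.
   * The os.name prefixes tested here are assumptions about what the JVM reports.
   */
  static String[] commandFor(String osName, String volume) {
    String os = osName.toLowerCase(Locale.ROOT);
    boolean hasFsutil = os.startsWith("windows xp") || os.startsWith("windows 2003");
    if (hasFsutil) {
      // fsutil reports free bytes, total bytes, and available free bytes.
      return new String[] {"fsutil", "volume", "diskfree", volume};
    }
    // Everything else, including other Windows versions under Cygwin, uses df.
    return new String[] {"df", "-k", volume};
  }
}
```

A caller would typically pass System.getProperty("os.name") as the first argument.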

  • Jean-François Ménard (JIRA) at Jan 6, 2008 at 7:58 am
    [ https://issues.apache.org/jira/browse/HADOOP-33?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556354#action_12556354 ]

    Jean-François Ménard commented on HADOOP-33:
    --------------------------------------------

    The output of 'fsutil' is localised.

    On my French Windows XP, the output is:

    Nombre total d'octets libres : 23775694848
    Nombre total d'octets : 143913521152
    Nombre total d'octets libres disponibles : 23775694848

    I modified DF.java accordingly for my needs, but if the order of the lines is always the same, I guess that parsing the lines by position rather than by their labels should do the trick.
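
    The position-based parsing suggested above could look roughly like the
    sketch below. It is not the modified DF.java mentioned in the comment. The
    class name FsutilDiskFree and its fields are hypothetical, and the line
    order (free bytes, total bytes, available free bytes) is an assumption taken
    from the French sample only; whether every locale keeps that order is
    unverified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical parser for fsutil disk-free output that ignores the
 * (localised) labels and relies only on the order of the numeric lines.
 */
final class FsutilDiskFree {
    // Captures the run of digits at the end of a line, whatever the label says.
    private static final Pattern TRAILING_NUMBER = Pattern.compile("(\\d+)\\s*$");

    final long freeBytes;       // assumed to be the first numeric line
    final long totalBytes;      // assumed to be the second numeric line
    final long availableBytes;  // assumed to be the third numeric line

    private FsutilDiskFree(long freeBytes, long totalBytes, long availableBytes) {
        this.freeBytes = freeBytes;
        this.totalBytes = totalBytes;
        this.availableBytes = availableBytes;
    }

    /** Parses the three numeric lines by position; throws if fewer than three are found. */
    static FsutilDiskFree parse(String output) {
        List<Long> values = new ArrayList<>();
        for (String line : output.split("\\R")) {
            Matcher m = TRAILING_NUMBER.matcher(line);
            if (m.find()) {
                values.add(Long.parseLong(m.group(1)));
            }
        }
        if (values.size() < 3) {
            throw new IllegalArgumentException(
                "expected 3 numeric lines but found " + values.size());
        }
        return new FsutilDiskFree(values.get(0), values.get(1), values.get(2));
    }
}
```

    Matching only the trailing digits means the same code handles both the
    French labels shown above and any other translation, as long as each value
    still ends its line and the order does not change.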
