[ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michele (aka pirroh) Catasta updated HADOOP-2366:
-------------------------------------------------

Affects Version/s: (was: 0.13.1)
Status: Patch Available (was: Open)

getStrings() now trims leading and trailing whitespace by default.
The patch also includes a simple unit test.
Space in the value for dfs.data.dir can cause great problems
------------------------------------------------------------

Key: HADOOP-2366
URL: https://issues.apache.org/jira/browse/HADOOP-2366
Project: Hadoop Core
Issue Type: Bug
Components: conf
Reporter: Ted Dunning
Assignee: Todd Lipcon

The following configuration causes problems:
<property>
<name>dfs.data.dir</name>
<value>/mnt/hstore2/hdfs, /home/foo/dfs</value>
<description>
Determines where on the local filesystem an DFS data node should store its
blocks. If this is a comma-delimited list of directories, then data will be
stored in all named directories, typically on different devices. Directories
that do not exist are ignored.
</description>
</property>
The problem is that the space after the comma makes the second storage directory " /home/foo/dfs". That is read as a relative path: a directory named <SPACE>, containing a sub-directory named "home", inside the Hadoop datanode's default directory. This will typically cause the user's home partition to fill, and the cause is very hard for the user to track down because a directory whose name is a single space is easy to overlook.
My proposed solution is to left-trim all path names in this and similar properties after splitting on commas. This still allows spaces inside file and directory names but avoids this problem.
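
To make the proposal concrete, here is a minimal Java sketch of splitting on commas and then stripping leading whitespace from each entry. The class and method names (PathListSketch, splitAndTrimLeft) are hypothetical and not part of Hadoop; the actual patch may well differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not Hadoop code.
class PathListSketch {
    /** Splits a comma-delimited value and strips leading whitespace from each entry. */
    static List<String> splitAndTrimLeft(String value) {
        List<String> paths = new ArrayList<>();
        for (String token : value.split(",")) {
            int start = 0;
            // Skip whitespace at the front only; spaces inside the name are kept.
            while (start < token.length() && Character.isWhitespace(token.charAt(start))) {
                start++;
            }
            paths.add(token.substring(start));
        }
        return paths;
    }

    public static void main(String[] args) {
        // Brackets make any stray whitespace visible in the output.
        for (String path : splitAndTrimLeft("/mnt/hstore2/hdfs, /home/foo/dfs")) {
            System.out.println("[" + path + "]");
        }
    }
}
```

With the value from the report, both entries come back as absolute paths, and a name such as "/my dir/b" keeps its interior space.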
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Michele (aka pirroh) Catasta (JIRA) at Jun 10, 2009 at 11:15 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: HADOOP-2366.patch
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 10, 2009 at 11:21 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: HADOOP-2366.patch

    Patch updated; the previous version was adding a warning in StringUtils.
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366.patch, HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 10, 2009 at 11:23 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: (was: HADOOP-2366.patch)
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 10, 2009 at 11:37 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: HADOOP-2366.patch
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 10, 2009 at 11:37 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: (was: HADOOP-2366.patch)
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 14, 2009 at 3:08 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: (was: HADOOP-2366.patch)
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 14, 2009 at 3:08 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: HADOOP-2366.patch

    Patch updated; it now uses split("\\s*,\\s*").

    @tlipcon: Thanks for the comment! To be honest, I wasn't using the regex because I thought my approach was the only way to keep getStrings() behaving as before with regard to trailing empty tokens.
    I took a look at the code that uses getStrings(), and throwing away the trailing empty token should not break anything (and it helps users who leave a final comma with no path after it). Anyway, to make it behave as it did before, just add -1 as the second argument of split(). Hope it's OK now :-)
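
The effect of the limit argument can be seen in a small standalone Java sketch. It uses only java.lang.String.split and is an illustration, not the patch itself; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class; only String.split is real API here.
class SplitLimitSketch {
    /** Default split: trailing empty tokens are discarded. */
    static String[] dropTrailing(String value) {
        return value.split("\\s*,\\s*");
    }

    /** A negative limit keeps trailing empty tokens. */
    static String[] keepTrailing(String value) {
        return value.split("\\s*,\\s*", -1);
    }

    public static void main(String[] args) {
        String value = "/a , /b,";
        System.out.println(dropTrailing(value).length); // 2
        System.out.println(keepTrailing(value).length); // 3
        System.out.println(Arrays.toString(dropTrailing(value)));
    }
}
```

Note that this regex only consumes whitespace around commas. Whitespace at the very start or end of the whole value is left in place unless the value is trimmed separately.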
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 16, 2009 at 3:44 am
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: HADOOP-2366.patch

    Patch modified to make getStrings() behave as before with whitespace-only strings.

    Todd: thanks a lot for the feedback. I'm running 'ant test' on a local server to see whether the tests still fail. Feel free to move the issue back to "In Progress" status; I'll let you know as soon as the build ends.
    Sorry for the naivety, I didn't know that Hudson was triggered by the "Patch Available" status. I skipped an important section of the wiki page, shame on me!
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 16, 2009 at 3:44 am
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: (was: HADOOP-2366.patch)
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 17, 2009 at 12:26 am
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: HADOOP-2366-trimmed.patch

    I attached a separate patch which exposes a getTrimmedStrings() method and modifies only the behavior of the Datanode.
    When the property value is empty, the behavior of getTrimmedStrings() is different from getStrings() on purpose. I'd rather pass an empty array, which is handled correctly by Datanode.instance(), than a null, which causes an NPE.
    Just let me know if you want me to change this behavior to make it consistent with the other methods; mine was just a proposal.
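
A minimal sketch of the empty-array behavior described above, assuming the raw property value arrives as a String that may be null. The class and method are hypothetical stand-ins, not the code in HADOOP-2366-trimmed.patch.

```java
// Hypothetical stand-in for illustration; not the method from the patch.
class TrimmedStringsSketch {
    private static final String[] EMPTY = new String[0];

    /** Returns trimmed comma-separated entries, or an empty array (never null). */
    static String[] trimmedStrings(String value) {
        if (value == null || value.trim().isEmpty()) {
            return EMPTY; // callers can iterate without a null check
        }
        return value.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        // Iterating over the result is safe even when the property is unset.
        for (String dir : trimmedStrings(null)) {
            System.out.println(dir);
        }
        for (String dir : trimmedStrings(" /mnt/hstore2/hdfs , /home/foo/dfs ")) {
            System.out.println("[" + dir + "]");
        }
    }
}
```

Returning an empty array means a caller's for-each loop simply runs zero times, whereas a null return has to be checked at every call site.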

    Todd: I ran a whole 'ant clean test' cycle with the other patch, and it fails anyway. I took a quick look, grepping for getStrings(), when I was attaching the first version, and I had the same impression as you. After the fix you suggested I was expecting the tests to pass, but something is probably still missing. I'll leave the patch attached in case it is useful.

    Tsz Wo: thanks for the hints. run-test-core was successful; I hope the Hudson build will be too.
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366-trimmed.patch, HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 17, 2009 at 10:46 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: (was: HADOOP-2366-trimmed.patch)
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366-trimmed.patch, HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 17, 2009 at 10:46 pm
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: HADOOP-2366-trimmed.patch

    As suggested by Raghu, the patch now also modifies the NameNode behavior.
    The NN was using getStringCollection() directly, so I just added a new getTrimmedStringCollection().
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Core
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Todd Lipcon
    Attachments: HADOOP-2366-trimmed.patch, HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 30, 2009 at 12:27 am
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: (was: HADOOP-2366-trimmed.patch)
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Common
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Michele (aka pirroh) Catasta
    Attachments: HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 30, 2009 at 12:27 am
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: HADOOP-2366.patch

    Same logic as the previous patch, without the HDFS part.
    I'll create a new issue in the HDFS project linked to this one.
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Common
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Michele (aka pirroh) Catasta
    Attachments: HADOOP-2366.patch


  • Michele (aka pirroh) Catasta (JIRA) at Jun 30, 2009 at 12:27 am
    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michele (aka pirroh) Catasta updated HADOOP-2366:
    -------------------------------------------------

    Attachment: (was: HADOOP-2366.patch)
    Space in the value for dfs.data.dir can cause great problems
    ------------------------------------------------------------

    Key: HADOOP-2366
    URL: https://issues.apache.org/jira/browse/HADOOP-2366
    Project: Hadoop Common
    Issue Type: Bug
    Components: conf
    Reporter: Ted Dunning
    Assignee: Michele (aka pirroh) Catasta
    Attachments: HADOOP-2366.patch



Discussion Overview
group: common-dev
categories: hadoop
posted: Jun 10, '09 at 11:13p
active: Jun 30, '09 at 12:27a
posts: 16
users: 1
website: hadoop.apache.org...
irc: #hadoop
