S3 native fs should not create bucket
-------------------------------------

Key: HADOOP-4422
URL: https://issues.apache.org/jira/browse/HADOOP-4422
Project: Hadoop Core
Issue Type: Bug
Components: fs/s3
Affects Versions: 0.18.1
Reporter: David Phillips
Attachments: hadoop-s3n-nocreate.patch

The S3 native file system tries to create the bucket at every initialization. This is bad because:

* Every S3 operation costs money, so these calls are an unnecessary expense.
* The calls can fail when made concurrently, which makes the file system unusable in large jobs.
* Any operation, such as "fs -ls", creates the bucket. This is counter-intuitive and undesirable.

The initialization code should assume the bucket exists:

* Creating a bucket is a very rare operation. Accounts are limited to 100 buckets.
* Any check at initialization for bucket existence is a waste of money.

Per Amazon: "Because bucket operations work against a centralized, global resource space, it is not appropriate to make bucket create or delete calls on the high availability code path of your application. It is better to create or delete buckets in a separate initialization or setup routine that you run less often."
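
A rough sketch of what initialization looks like once bucket creation is removed (an illustration only, not the attached patch; the JetS3t classes RestS3Service, S3Bucket, and AWSCredentials are the ones Hadoop's S3 filesystems build on, while the surrounding class and method signature are simplified placeholders):

    import java.net.URI;
    import org.jets3t.service.S3Service;
    import org.jets3t.service.S3ServiceException;
    import org.jets3t.service.impl.rest.httpclient.RestS3Service;
    import org.jets3t.service.model.S3Bucket;
    import org.jets3t.service.security.AWSCredentials;

    // Illustrative store: it wires up the S3 client and a bucket handle only.
    // The bucket named in the s3n:// URI is assumed to exist already;
    // no createBucket() call is made on this code path.
    class SketchNativeS3Store {
      private S3Service s3Service;
      private S3Bucket bucket;

      void initialize(URI uri, String accessKey, String secretKey) throws S3ServiceException {
        AWSCredentials credentials = new AWSCredentials(accessKey, secretKey);
        this.s3Service = new RestS3Service(credentials);
        // Constructing the S3Bucket model object performs no remote call.
        this.bucket = new S3Bucket(uri.getHost());
      }
    }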



  • David Phillips (JIRA) at Oct 15, 2008 at 9:45 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    David Phillips updated HADOOP-4422:
    -----------------------------------

    Status: Patch Available (was: Open)
  • David Phillips (JIRA) at Oct 15, 2008 at 9:45 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    David Phillips updated HADOOP-4422:
    -----------------------------------

    Attachment: hadoop-s3n-nocreate.patch

    Simple patch that removes bucket creation.
  • Hadoop QA (JIRA) at Oct 15, 2008 at 10:57 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639998#action_12639998 ]

    Hadoop QA commented on HADOOP-4422:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12392205/hadoop-s3n-nocreate.patch
    against trunk revision 705073.

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    -1 patch. The patch command could not apply the patch.

    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3469/console

    This message is automatically generated.
  • Doug Cutting (JIRA) at Oct 16, 2008 at 5:30 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Doug Cutting updated HADOOP-4422:
    ---------------------------------

    Assignee: David Phillips
    Status: Open (was: Patch Available)

    This patch needs to be regenerated without the a/ b/ path prefixes for Hudson to be able to apply it. It must apply with 'patch -p0 < foo.patch' against a trunk checkout.
  • David Phillips (JIRA) at Oct 16, 2008 at 7:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    David Phillips updated HADOOP-4422:
    -----------------------------------

    Status: Patch Available (was: Open)
  • David Phillips (JIRA) at Oct 16, 2008 at 7:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    David Phillips updated HADOOP-4422:
    -----------------------------------

    Attachment: hadoop-s3n-nocreate.patch

    Patch applies with -p0 now (used git diff --no-prefix).
  • David Phillips (JIRA) at Oct 16, 2008 at 8:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    David Phillips updated HADOOP-4422:
    -----------------------------------

    Attachment: (was: hadoop-s3n-nocreate.patch)
  • Tom White (JIRA) at Oct 16, 2008 at 8:39 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tom White updated HADOOP-4422:
    ------------------------------

    Hadoop Flags: [Incompatible change]

    Marked as an incompatible change, since existing code that relies on bucket creation will need to be changed.
  • Hadoop QA (JIRA) at Oct 17, 2008 at 10:50 am
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640490#action_12640490 ]

    Hadoop QA commented on HADOOP-4422:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12392277/hadoop-s3n-nocreate.patch
    against trunk revision 705430.

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3482/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3482/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3482/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3482/console

    This message is automatically generated.
  • Tom White (JIRA) at Oct 21, 2008 at 8:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641566#action_12641566 ]

    Tom White commented on HADOOP-4422:
    -----------------------------------

    For consistency, we should make the same change to Jets3tFileSystemStore.

    Also, regarding the tests, there are two unit tests, Jets3tS3FileSystemContractTest and Jets3tNativeS3FileSystemContractTest, which can be run manually to test the S3 integration. The only difference with this patch is that the buckets they run against must already exist, so I don't think any change to the tests is needed.
  • David Phillips (JIRA) at Nov 5, 2008 at 5:58 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    David Phillips reassigned HADOOP-4422:
    --------------------------------------

    Assignee: (was: David Phillips)
  • Tom White (JIRA) at Nov 12, 2008 at 12:58 am
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tom White updated HADOOP-4422:
    ------------------------------

    Assignee: David Phillips
    Status: Open (was: Patch Available)

    Cancelling patch pending change to Jets3tFileSystemStore.
  • David Phillips (JIRA) at Nov 21, 2008 at 10:26 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    David Phillips updated HADOOP-4422:
    -----------------------------------

    Attachment: hadoop-s3n-nocreate.patch

    Bucket creation also removed from Jets3tFileSystemStore.
  • David Phillips (JIRA) at Nov 21, 2008 at 10:34 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649837#action_12649837 ]

    David Phillips commented on HADOOP-4422:
    ----------------------------------------

    I ran the tests as follows after setting the correct test buckets and keys in src/test/hadoop-site.xml:

    ant -Dtestcase=Jets3tS3FileSystemContractTest test
    ant -Dtestcase=Jets3tNativeS3FileSystemContractTest test

    They seem to pass:

    Testsuite: org.apache.hadoop.fs.s3.Jets3tS3FileSystemContractTest
    Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 131.575 sec
    Testsuite: org.apache.hadoop.fs.s3native.Jets3tNativeS3FileSystemContractTest
    Tests run: 26, Failures: 0, Errors: 0, Time elapsed: 52.694 sec

    However, they both produce hundreds of warnings:

    (s3) 2008-11-21 15:51:28,800 WARN httpclient.RestS3Service (RestS3Service.java:performRequest(317)) - Response '/%2Ftest' - Unexpected response code 404, expected 200
    (s3n) 2008-11-21 15:34:55,646 WARN httpclient.RestS3Service (RestS3Service.java:performRequest(317)) - Response '/test' - Unexpected response code 404, expected 200

    Any ideas?
  • David Phillips (JIRA) at Nov 22, 2008 at 12:27 am
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    David Phillips updated HADOOP-4422:
    -----------------------------------

    Release Note: Never create S3 buckets
    Status: Patch Available (was: Open)
  • David Phillips (JIRA) at Nov 22, 2008 at 12:27 am
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649865#action_12649865 ]

    David Phillips commented on HADOOP-4422:
    ----------------------------------------

    Never mind about the warnings from Jets3t during testing. They are expected.
  • David Phillips (JIRA) at Nov 22, 2008 at 12:29 am
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    David Phillips updated HADOOP-4422:
    -----------------------------------

    Description:
    Both S3 file systems (s3 and s3n) try to create the bucket at every initialization. This is bad because

    * Every S3 operation costs money. These unnecessary calls are an unnecessary expense.
    * These calls can fail when called concurrently. This makes the file system unusable in large jobs.
    * Any operation, such as a "fs -ls", creates a bucket. This is counter-intuitive and undesirable.

    The initialization code should assume the bucket exists:

    * Creating a bucket is a very rare operation. Accounts are limited to 100 buckets.
    * Any check at initialization for bucket existence is a waste of money.

    Per Amazon: "Because bucket operations work against a centralized, global resource space, it is not appropriate to make bucket create or delete calls on the high availability code path of your application. It is better to create or delete buckets in a separate initialization or setup routine that you run less often."


    was:
    S3 native file system tries to create the bucket at every initialization. This is bad because

    * Every S3 operation costs money. These unnecessary calls are an unnecessary expense.
    * These calls can fail when called concurrently. This makes the file system unusable in large jobs.
    * Any operation, such as a "fs -ls", creates a bucket. This is counter-intuitive and undesirable.

    The initialization code should assume the bucket exists:

    * Creating a bucket is a very rare operation. Accounts are limited to 100 buckets.
    * Any check at initialization for bucket existence is a waste of money.

    Per Amazon: "Because bucket operations work against a centralized, global resource space, it is not appropriate to make bucket create or delete calls on the high availability code path of your application. It is better to create or delete buckets in a separate initialization or setup routine that you run less often."


    Summary: S3 file systems should not create bucket (was: S3 native fs should not create bucket)
  • Hadoop QA (JIRA) at Nov 23, 2008 at 12:24 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650020#action_12650020 ]

    Hadoop QA commented on HADOOP-4422:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12394458/hadoop-s3n-nocreate.patch
    against trunk revision 719787.

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3638/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3638/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3638/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3638/console

    This message is automatically generated.
  • Tom White (JIRA) at Nov 25, 2008 at 12:05 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tom White updated HADOOP-4422:
    ------------------------------

    Resolution: Fixed
    Fix Version/s: 0.20.0
    Release Note: S3 buckets are no longer created by the Hadoop filesystem. Applications that relied on this behavior should be changed to create buckets for their S3 filesystems by some other means (e.g. the JetS3t API). (was: Never create S3 buckets)
    Hadoop Flags: [Incompatible change, Reviewed] (was: [Incompatible change])
    Status: Resolved (was: Patch Available)

    I've just committed this. Thanks David!
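
    For code that relied on the old implicit bucket creation, a minimal sketch of creating the bucket once in a separate setup step with the JetS3t API, as the release note suggests (bucket name and credentials are placeholders):

        import org.jets3t.service.S3Service;
        import org.jets3t.service.S3ServiceException;
        import org.jets3t.service.impl.rest.httpclient.RestS3Service;
        import org.jets3t.service.model.S3Bucket;
        import org.jets3t.service.security.AWSCredentials;

        public class CreateBucketOnce {
          public static void main(String[] args) throws S3ServiceException {
            // Placeholder credentials and bucket name; substitute your own.
            AWSCredentials credentials = new AWSCredentials("ACCESS_KEY_ID", "SECRET_ACCESS_KEY");
            S3Service s3 = new RestS3Service(credentials);
            // getOrCreateBucket() returns the bucket if it already exists and
            // creates it otherwise, so this setup step is safe to re-run.
            S3Bucket bucket = s3.getOrCreateBucket("my-hadoop-bucket");
            System.out.println("Bucket ready: " + bucket.getName());
          }
        }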
  • Hudson (JIRA) at Nov 25, 2008 at 6:40 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650674#action_12650674 ]

    Hudson commented on HADOOP-4422:
    --------------------------------

    Integrated in Hadoop-trunk #670 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/670/])
    HADOOP-4422. S3 file systems should not create bucket. Contributed by David Phillips.

  • Robert Chansler (JIRA) at Mar 3, 2009 at 1:54 am
    [ https://issues.apache.org/jira/browse/HADOOP-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Robert Chansler updated HADOOP-4422:
    ------------------------------------

    Release Note: Modified Hadoop file system to no longer create S3 buckets. Applications can create buckets for their S3 file systems by other means, for example, using the JetS3t API. (was: S3 buckets are no longer created by the Hadoop filesystem. Applications that relied on this behavior should be changed to create buckets for their S3 filesystems by some other means (e.g. the JetS3t API).)
    Hadoop Flags: [Incompatible change, Reviewed] (was: [Reviewed, Incompatible change])

    Edit release note for publication.
