Poor IO Performance due to AtomicLong operations
------------------------------------------------

Key: HADOOP-5318
URL: https://issues.apache.org/jira/browse/HADOOP-5318
Project: Hadoop Core
Issue Type: Bug
Affects Versions: 0.19.0
Environment: 2x quad core xeon linux 64 bit
Reporter: Ben Maurer


The AtomicLong operations used to count file system statistics can cause high levels of contention with multiple threads. The following test demonstrates this by having multiple threads write to different sequence files:

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.ByteWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.SequenceFile.Writer;

public class Test {
    public static void main(String[] args) throws IOException {
        final Configuration c = new Configuration();
        final FileSystem fs = FileSystem.get(c);

        final int NUM = 1000 * 1000;
        for (int i = 0; i < Integer.valueOf(args[0]); i++) {
            final int ii = i;
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        // Each thread writes to its own sequence file, so the only
                        // shared state is the FileSystem statistics counters.
                        Writer f = SequenceFile.createWriter(fs, c, new Path("/test/" + ii),
                                ByteWritable.class, ByteWritable.class, CompressionType.NONE);
                        ByteWritable v = new ByteWritable();

                        long time = System.currentTimeMillis();
                        for (int i = 0; i < NUM; i++)
                            f.append(v, v);
                        f.close();
                        long end = System.currentTimeMillis();

                        System.out.printf("%d operations in %d msec. %f/second\n",
                                NUM, end - time, (float) (1000 * NUM) / (end - time));
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
    }
}
{code}

The results of this benchmark are:
{code}
==== 1 threads ====
1000000 operations in 1431 msec. 698812.000000/second
==== 2 threads ====
1000000 operations in 3001 msec. 333222.250000/second
1000000 operations in 2985 msec. 335008.375000/second
==== 3 threads ====
1000000 operations in 4923 msec. 203128.171875/second
1000000 operations in 4924 msec. 203086.921875/second
1000000 operations in 4981 msec. 200762.906250/second
==== 4 threads ====
1000000 operations in 6716 msec. 148898.156250/second
1000000 operations in 7048 msec. 141884.218750/second
1000000 operations in 7342 msec. 136202.671875/second
1000000 operations in 7344 msec. 136165.578125/second
==== 5 threads ====
1000000 operations in 10366 msec. 96469.226563/second
1000000 operations in 11085 msec. 90212.000000/second
1000000 operations in 11121 msec. 89919.968750/second
1000000 operations in 11464 msec. 87229.585938/second
1000000 operations in 11538 msec. 86670.132813/second
==== 6 threads ====
1000000 operations in 16513 msec. 60558.347656/second
1000000 operations in 17704 msec. 56484.410156/second
1000000 operations in 18219 msec. 54887.753906/second
1000000 operations in 18550 msec. 53908.355469/second
1000000 operations in 18605 msec. 53748.992188/second
1000000 operations in 18663 msec. 53581.953125/second
==== 7 threads ====
1000000 operations in 22207 msec. 45030.847656/second
1000000 operations in 23275 msec. 42964.554688/second
1000000 operations in 23484 msec. 42582.183594/second
1000000 operations in 24378 msec. 41020.593750/second
1000000 operations in 24425 msec. 40941.656250/second
1000000 operations in 24533 msec. 40761.421875/second
1000000 operations in 24645 msec. 40576.183594/second
==== 8 threads ====
1000000 operations in 26375 msec. 37914.691406/second
1000000 operations in 26420 msec. 37850.113281/second
1000000 operations in 26532 msec. 37690.335938/second
1000000 operations in 26670 msec. 37495.312500/second
1000000 operations in 29772 msec. 33588.605469/second
1000000 operations in 29859 msec. 33490.738281/second
1000000 operations in 30098 msec. 33224.800781/second
1000000 operations in 30082 msec. 33242.468750/second
{code}

However, if I comment out the file system statistics increments, the benchmark improves to:

{code}
==== 1 threads ====
1000000 operations in 1194 msec. 837520.937500/second
==== 2 threads ====
1000000 operations in 1433 msec. 697836.687500/second
1000000 operations in 1433 msec. 697836.687500/second
==== 3 threads ====
1000000 operations in 1643 msec. 608642.750000/second
1000000 operations in 1643 msec. 608642.750000/second
1000000 operations in 1639 msec. 610128.125000/second
==== 4 threads ====
1000000 operations in 1886 msec. 530222.687500/second
1000000 operations in 1886 msec. 530222.687500/second
1000000 operations in 1886 msec. 530222.687500/second
1000000 operations in 1899 msec. 526592.937500/second
==== 5 threads ====
1000000 operations in 2065 msec. 484261.500000/second
1000000 operations in 2066 msec. 484027.093750/second
1000000 operations in 2067 msec. 483792.937500/second
1000000 operations in 2066 msec. 484027.093750/second
1000000 operations in 2066 msec. 484027.093750/second
==== 6 threads ====
1000000 operations in 2151 msec. 464900.031250/second
1000000 operations in 2111 msec. 473709.156250/second
1000000 operations in 2153 msec. 464468.187500/second
1000000 operations in 2114 msec. 473036.906250/second
1000000 operations in 2113 msec. 473260.781250/second
1000000 operations in 2112 msec. 473484.843750/second
==== 7 threads ====
1000000 operations in 2368 msec. 422297.312500/second
1000000 operations in 2334 msec. 428449.000000/second
1000000 operations in 2332 msec. 428816.468750/second
1000000 operations in 2330 msec. 429184.562500/second
1000000 operations in 2332 msec. 428816.468750/second
1000000 operations in 2375 msec. 421052.625000/second
1000000 operations in 2394 msec. 417710.937500/second
==== 8 threads ====
1000000 operations in 2517 msec. 397298.375000/second
1000000 operations in 2538 msec. 394011.031250/second
1000000 operations in 2538 msec. 394011.031250/second
1000000 operations in 2538 msec. 394011.031250/second
1000000 operations in 2539 msec. 393855.843750/second
1000000 operations in 2614 msec. 382555.468750/second
1000000 operations in 2666 msec. 375093.781250/second
1000000 operations in 2701 msec. 370233.250000/second
{code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Steve Loughran (JIRA) at Feb 24, 2009 at 2:14 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676287#action_12676287 ]

    Steve Loughran commented on HADOOP-5318:
    ----------------------------------------

    Is this 64-bit Java on a 64-bit OS? If so, that's surprising: atomic long operations should be atomic at the x86 opcode level, with minimal contention.
  • Ben Maurer (JIRA) at Feb 24, 2009 at 5:30 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676342#action_12676342 ]

    Ben Maurer commented on HADOOP-5318:
    ------------------------------------

    Yes, it is. Looking at the JDK source, AtomicLong.getAndAdd is implemented as a CAS retry loop rather than a single xadd instruction:

    {code:java}
    public final long getAndAdd(long delta) {
        while (true) {
            long current = get();
            long next = current + delta;
            if (compareAndSet(current, next))
                return current;
        }
    }
    {code}

    So there can easily be contention if this is executed frequently.
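    [Editor's note] The CAS retry loop above makes every thread spin on the same cache line, which is consistent with the collapse seen in the benchmark. One mitigation, sketched here under assumptions (this is not the Hadoop code; the idea was later standardized as java.util.concurrent.atomic.LongAdder in Java 8), is to stripe the counter across slots and sum on read:

    {code:java}
    import java.util.concurrent.atomic.AtomicLongArray;

    // Striped counter sketch: each thread updates a slot chosen from its
    // thread id, so concurrent updates mostly hit different cache lines.
    // Reads sum all slots; the total is weakly consistent, which is fine
    // for statistics counters.
    public class StripedCounter {
        private static final int STRIPES = 64; // assumption: enough slots for the thread count
        private final AtomicLongArray slots = new AtomicLongArray(STRIPES);

        public void add(long delta) {
            int idx = (int) (Thread.currentThread().getId() % STRIPES);
            slots.addAndGet(idx, delta);
        }

        public long sum() {
            long total = 0;
            for (int i = 0; i < STRIPES; i++)
                total += slots.get(i);
            return total;
        }
    }
    {code}

    Note that adjacent slots of an AtomicLongArray can still share a cache line (false sharing); production implementations pad each slot.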

    I think the best path here may be to make sure that buffering is pulled up a few layers of abstraction.
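    [Editor's note] One way to realize that suggestion, sketched under assumptions (hypothetical names, not the eventual patch): accumulate statistics in a plain thread-confined long and publish to the shared AtomicLong only every N updates and on close, cutting the number of contended operations by a factor of N:

    {code:java}
    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical batching wrapper around a shared statistics counter.
    public class BatchedStats {
        private static final long FLUSH_EVERY = 1024; // assumption: batch size
        private final AtomicLong shared;              // the contended global counter
        private long local;                           // thread-confined accumulator

        public BatchedStats(AtomicLong shared) {
            this.shared = shared;
        }

        public void increment(long n) {
            local += n;
            if (local >= FLUSH_EVERY)
                flush();
        }

        public void flush() { // must also be called when the stream closes
            shared.addAndGet(local);
            local = 0;
        }
    }
    {code}

    The trade-off is that the shared counter lags by up to FLUSH_EVERY per thread between flushes, which matters only if the statistics are read while streams are still open.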
  • Bryan Duxbury (JIRA) at Feb 24, 2009 at 5:34 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676343#action_12676343 ]

    Bryan Duxbury commented on HADOOP-5318:
    ---------------------------------------

    I've noticed the FSStats code taking large amounts of CPU time in profiles of our mappers and reducers. I'm not sure why it sucks up so much CPU, but I'd definitely like to see a way to mitigate this effect.
  • Ben Maurer (JIRA) at Feb 24, 2009 at 6:09 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Ben Maurer updated HADOOP-5318:
    -------------------------------

    Attachment: buffer-output.patch

    A quick hack to make the output path buffered -- it'd be nice to see if this helps some real world applications. The input side of this is a bit trickier, still working on it.
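    [Editor's note] The patch itself is attached above and is not reproduced here. As a rough illustration only (the stats hook below is hypothetical, not the patch's actual interface), "making the output path buffered" means the shared statistics counter is touched once per flushed buffer rather than once per write:

    {code:java}
    import java.io.BufferedOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    // Illustration: count bytes locally and report them to a stats hook on
    // flush, so the shared counter sees one update per buffer.
    public class CountingBufferedOutputStream extends BufferedOutputStream {
        public interface StatsSink { // hypothetical statistics callback
            void bytesWritten(long n);
        }

        private final StatsSink sink;
        private long pending;

        public CountingBufferedOutputStream(OutputStream out, StatsSink sink) {
            super(out, 64 * 1024); // assumption: 64 KiB buffer
            this.sink = sink;
        }

        @Override
        public synchronized void write(int b) throws IOException {
            super.write(b);
            pending++;
        }

        @Override
        public synchronized void write(byte[] b, int off, int len) throws IOException {
            super.write(b, off, len);
            pending += len;
        }

        @Override
        public synchronized void flush() throws IOException {
            super.flush();
            sink.bytesWritten(pending); // one shared-counter update per flush
            pending = 0;
        }
    }
    {code}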
    Poor IO Performance due to AtomicLong operations
    ------------------------------------------------

    Key: HADOOP-5318
    URL: https://issues.apache.org/jira/browse/HADOOP-5318
    Project: Hadoop Core
    Issue Type: Bug
    Affects Versions: 0.19.0
    Environment: 2x quad core xeon linux 64 bit
    Reporter: Ben Maurer
    Attachments: buffer-output.patch


    The AtomicLong operations in counting file system statistics can cause high levels of contention with multiple threads. This test demonstrates having multiple threads writing to different sequence files:
    {code:java}
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.ByteWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.SequenceFile.Writer;
    import org.apache.hadoop.io.SequenceFile.CompressionType;
    public class Test {
    public static void main(String[] args) throws IOException {
    final Configuration c = new Configuration();
    final FileSystem fs = FileSystem.get(c);

    final int NUM = 1000*1000;
    for (int i = 0; i < Integer.valueOf(args[0]); i ++) {
    final int ii = i;
    new Thread(new Runnable() {
    @Override
    public void run() {

    try {
    Writer f = SequenceFile.createWriter(fs, c, new Path("/test/" + ii ), ByteWritable.class, ByteWritable.class, CompressionType.NONE);
    ByteWritable v = new ByteWritable();

    long time = System.currentTimeMillis();
    for (int i = 0; i < NUM; i ++)
    f.append(v, v);
    f.close();
    long end = System.currentTimeMillis();

    System.out.printf("%d opartions in %d msec. %f/second\n", NUM, end - time, (float)(1000 * NUM)/(end - time));

    } catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
    }

    }
    }).start();
    }
    }
    }
    {code}
    The results of this benchmark are:
    {code}
    ==== 1 threads ====
    1000000 operations in 1431 msec. 698812.000000/second
    ==== 2 threads ====
    1000000 operations in 3001 msec. 333222.250000/second
    1000000 operations in 2985 msec. 335008.375000/second
    ==== 3 threads ====
    1000000 operations in 4923 msec. 203128.171875/second
    1000000 operations in 4924 msec. 203086.921875/second
    1000000 operations in 4981 msec. 200762.906250/second
    ==== 4 threads ====
    1000000 operations in 6716 msec. 148898.156250/second
    1000000 operations in 7048 msec. 141884.218750/second
    1000000 operations in 7342 msec. 136202.671875/second
    1000000 operations in 7344 msec. 136165.578125/second
    ==== 5 threads ====
    1000000 operations in 10366 msec. 96469.226563/second
    1000000 operations in 11085 msec. 90212.000000/second
    1000000 operations in 11121 msec. 89919.968750/second
    1000000 operations in 11464 msec. 87229.585938/second
    1000000 operations in 11538 msec. 86670.132813/second
    ==== 6 threads ====
    1000000 operations in 16513 msec. 60558.347656/second
    1000000 operations in 17704 msec. 56484.410156/second
    1000000 operations in 18219 msec. 54887.753906/second
    1000000 operations in 18550 msec. 53908.355469/second
    1000000 operations in 18605 msec. 53748.992188/second
    1000000 operations in 18663 msec. 53581.953125/second
    ==== 7 threads ====
    1000000 operations in 22207 msec. 45030.847656/second
    1000000 operations in 23275 msec. 42964.554688/second
    1000000 operations in 23484 msec. 42582.183594/second
    1000000 operations in 24378 msec. 41020.593750/second
    1000000 operations in 24425 msec. 40941.656250/second
    1000000 operations in 24533 msec. 40761.421875/second
    1000000 operations in 24645 msec. 40576.183594/second
    ==== 8 threads ====
    1000000 operations in 26375 msec. 37914.691406/second
    1000000 operations in 26420 msec. 37850.113281/second
    1000000 operations in 26532 msec. 37690.335938/second
    1000000 operations in 26670 msec. 37495.312500/second
    1000000 operations in 29772 msec. 33588.605469/second
    1000000 operations in 29859 msec. 33490.738281/second
    1000000 operations in 30098 msec. 33224.800781/second
    1000000 operations in 30082 msec. 33242.468750/second
    {code}
    However, if I comment out the file system statistics increments, the benchmark improves to:
    {code}
    ==== 1 threads ====
    1000000 operations in 1194 msec. 837520.937500/second
    ==== 2 threads ====
    1000000 operations in 1433 msec. 697836.687500/second
    1000000 operations in 1433 msec. 697836.687500/second
    ==== 3 threads ====
    1000000 operations in 1643 msec. 608642.750000/second
    1000000 operations in 1643 msec. 608642.750000/second
    1000000 operations in 1639 msec. 610128.125000/second
    ==== 4 threads ====
    1000000 operations in 1886 msec. 530222.687500/second
    1000000 operations in 1886 msec. 530222.687500/second
    1000000 operations in 1886 msec. 530222.687500/second
    1000000 operations in 1899 msec. 526592.937500/second
    ==== 5 threads ====
    1000000 operations in 2065 msec. 484261.500000/second
    1000000 operations in 2066 msec. 484027.093750/second
    1000000 operations in 2067 msec. 483792.937500/second
    1000000 operations in 2066 msec. 484027.093750/second
    1000000 operations in 2066 msec. 484027.093750/second
    ==== 6 threads ====
    1000000 operations in 2151 msec. 464900.031250/second
    1000000 operations in 2111 msec. 473709.156250/second
    1000000 operations in 2153 msec. 464468.187500/second
    1000000 operations in 2114 msec. 473036.906250/second
    1000000 operations in 2113 msec. 473260.781250/second
    1000000 operations in 2112 msec. 473484.843750/second
    ==== 7 threads ====
    1000000 operations in 2368 msec. 422297.312500/second
    1000000 operations in 2334 msec. 428449.000000/second
    1000000 operations in 2332 msec. 428816.468750/second
    1000000 operations in 2330 msec. 429184.562500/second
    1000000 operations in 2332 msec. 428816.468750/second
    1000000 operations in 2375 msec. 421052.625000/second
    1000000 operations in 2394 msec. 417710.937500/second
    ==== 8 threads ====
    1000000 operations in 2517 msec. 397298.375000/second
    1000000 operations in 2538 msec. 394011.031250/second
    1000000 operations in 2538 msec. 394011.031250/second
    1000000 operations in 2538 msec. 394011.031250/second
    1000000 operations in 2539 msec. 393855.843750/second
    1000000 operations in 2614 msec. 382555.468750/second
    1000000 operations in 2666 msec. 375093.781250/second
    1000000 operations in 2701 msec. 370233.250000/second
    {code}
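    The contention pattern above is easy to reproduce outside Hadoop. The following is a minimal standalone sketch (illustrative only, not part of the attached patch) contrasting per-operation increments of a shared AtomicLong with per-thread batching that publishes a single total; on multi-core machines the batched variant is typically much faster while both end at the same count:
    {code:java}
    import java.util.concurrent.atomic.AtomicLong;

    public class ContentionDemo {
        static final int THREADS = 4;
        static final int OPS = 1_000_000;

        // Shared counter: every increment is a compare-and-swap on the same
        // memory location, so all threads contend on one cache line.
        static final AtomicLong shared = new AtomicLong();
        // Target for the batched variant: written once per thread.
        static final AtomicLong batchedTotal = new AtomicLong();

        // Runs THREADS copies of body concurrently; returns elapsed msec.
        static long timed(Runnable body) throws InterruptedException {
            Thread[] ts = new Thread[THREADS];
            long start = System.nanoTime();
            for (int i = 0; i < THREADS; i++) {
                ts[i] = new Thread(body);
                ts[i].start();
            }
            for (Thread t : ts) t.join();
            return (System.nanoTime() - start) / 1_000_000;
        }

        public static void main(String[] args) throws InterruptedException {
            long contended = timed(() -> {
                for (int j = 0; j < OPS; j++) shared.incrementAndGet();
            });
            long batched = timed(() -> {
                long local = 0;                 // accumulate privately...
                for (int j = 0; j < OPS; j++) local++;
                batchedTotal.addAndGet(local);  // ...publish once per thread
            });
            System.out.printf("contended: %d msec, batched: %d msec%n",
                contended, batched);
        }
    }
    {code}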
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Ben Maurer (JIRA) at Feb 24, 2009 at 7:35 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Ben Maurer updated HADOOP-5318:
    -------------------------------

    Attachment: buf.patch

    Updated version of the patch. Handles the read and write path. Haven't benchmarked reads yet, but on writes we get the following improvements (1 byte key/values):
    || 1 thread || 4 threads || 8 threads ||
    | 21% | 449% | 1041% |
  • Doug Cutting (JIRA) at Feb 25, 2009 at 8:09 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676756#action_12676756 ]

    Doug Cutting commented on HADOOP-5318:
    --------------------------------------

    Inserting a 256k output buffer in every FSDataOutputStream is probably not good. Each FileSystem implementation already buffers internally. Adding a big new buffer in front increases memory use and also means data sits in memory longer before it is checksummed. I also suspect even a 100-byte buffer would be enough, since, at the top level, much I/O is byte-by-byte. But adding a new small buffer would increase the number of times data is copied, which we should also avoid.

    So I'd suggest that, rather than adding a buffer, PositionCache can just be lazy about reporting statistics. We can add code like:

    {code}
    private static final int REPORT_INTERVAL = 1024;
    private int unreported;

    private void incrementBytesWritten(int bytesWritten) {
      unreported += bytesWritten;
      if (unreported > REPORT_INTERVAL)
        reportBytesWritten();
    }

    private void reportBytesWritten() {
      statistics.incrementBytesWritten(unreported);
      unreported = 0;
    }
    {code}

    Then call incrementBytesWritten() in the write() methods and add a call to reportBytesWritten() to close().

    Does that make sense?
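    As a standalone illustration of the lazy-reporting idea, here is a hypothetical wrapper stream (the real change would live in PositionCache; the class name and the bare AtomicLong statistic are illustrative, not Hadoop's API) that batches byte counts locally and flushes them to the shared counter every REPORT_INTERVAL bytes and on close:
    {code:java}
    import java.io.FilterOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical sketch: report statistics lazily instead of per write.
    class LazyStatsOutputStream extends FilterOutputStream {
        private static final int REPORT_INTERVAL = 1024;
        private final AtomicLong bytesWritten;  // shared, contended statistic
        private int unreported;                 // per-stream, uncontended

        LazyStatsOutputStream(OutputStream out, AtomicLong bytesWritten) {
            super(out);
            this.bytesWritten = bytesWritten;
        }

        @Override public void write(int b) throws IOException {
            out.write(b);
            increment(1);
        }

        @Override public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len);
            increment(len);
        }

        private void increment(int n) {
            unreported += n;
            if (unreported > REPORT_INTERVAL)
                report();
        }

        private void report() {
            bytesWritten.addAndGet(unreported);  // one contended op per batch
            unreported = 0;
        }

        @Override public void close() throws IOException {
            report();  // don't lose the unreported tail
            super.close();
        }
    }
    {code}
    The shared counter lags by at most REPORT_INTERVAL bytes while the stream is open, which is the trade-off Doug's suggestion accepts in exchange for far fewer AtomicLong operations.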
  • Ben Maurer (JIRA) at Feb 25, 2009 at 9:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676780#action_12676780 ]

    Ben Maurer commented on HADOOP-5318:
    ------------------------------------

    When I benchmarked, I saw a performance gain from not going into the CRC routines, etc., for each byte reported. I also saw some gains from using a larger I/O buffer (though I haven't tested that fully).
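    The per-byte cost is visible even in a standalone sketch using java.util.zip.CRC32 (illustrative only, not Hadoop's checksum path): updating the checksum one byte at a time pays per-call overhead that a single bulk update avoids, while both orders of update yield the same value:
    {code:java}
    import java.util.zip.CRC32;

    public class CrcGranularity {
        public static void main(String[] args) {
            byte[] data = new byte[1 << 20];
            for (int i = 0; i < data.length; i++) data[i] = (byte) i;

            CRC32 perByte = new CRC32();
            long t0 = System.nanoTime();
            for (byte b : data) perByte.update(b & 0xff);  // one call per byte
            long perByteNs = System.nanoTime() - t0;

            CRC32 bulk = new CRC32();
            long t1 = System.nanoTime();
            bulk.update(data, 0, data.length);             // one call total
            long bulkNs = System.nanoTime() - t1;

            System.out.printf("per-byte: %d usec, bulk: %d usec, equal=%b%n",
                perByteNs / 1000, bulkNs / 1000,
                perByte.getValue() == bulk.getValue());
        }
    }
    {code}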
  • Todd Lipcon (JIRA) at Apr 10, 2009 at 9:20 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697982#action_12697982 ]

    Todd Lipcon commented on HADOOP-5318:
    -------------------------------------

    {quote}
    Then call incrementBytesWritten() in the write() methods and add a call to reportBytesWritten() to close().

    Does that make sense?
    {quote}

    +1 - I like that approach better than adding a buffer. If a buffer improves performance for CRC, etc., that should be a separate JIRA.
    Poor IO Performance due to AtomicLong operations
    ------------------------------------------------

    Key: HADOOP-5318
    URL: https://issues.apache.org/jira/browse/HADOOP-5318
    Project: Hadoop Core
    Issue Type: Bug
    Affects Versions: 0.19.0
    Environment: 2x quad core xeon linux 64 bit
    Reporter: Ben Maurer
    Attachments: buf.patch, buffer-output.patch


    The AtomicLong operations in counting file system statistics can cause high levels of contention with multiple threads. This test demonstrates having multiple threads writing to different sequence files:
    {code:java}
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.ByteWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.SequenceFile.Writer;
    import org.apache.hadoop.io.SequenceFile.CompressionType;
    public class Test {
    public static void main(String[] args) throws IOException {
    final Configuration c = new Configuration();
    final FileSystem fs = FileSystem.get(c);

    final int NUM = 1000*1000;
    for (int i = 0; i < Integer.valueOf(args[0]); i ++) {
    final int ii = i;
    new Thread(new Runnable() {
    @Override
    public void run() {

    try {
    Writer f = SequenceFile.createWriter(fs, c, new Path("/test/" + ii ), ByteWritable.class, ByteWritable.class, CompressionType.NONE);
    ByteWritable v = new ByteWritable();

    long time = System.currentTimeMillis();
    for (int i = 0; i < NUM; i ++)
    f.append(v, v);
    f.close();
    long end = System.currentTimeMillis();

    System.out.printf("%d opartions in %d msec. %f/second\n", NUM, end - time, (float)(1000 * NUM)/(end - time));

    } catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
    }

    }
    }).start();
    }
    }
    }
    {code}
    The results of this benchmark are
    {code}
    ==== 1 threads ====
    1000000 opartions in 1431 msec. 698812.000000/second
    ==== 2 threads ====
    1000000 opartions in 3001 msec. 333222.250000/second
    1000000 opartions in 2985 msec. 335008.375000/second
    ==== 3 threads ====
    1000000 opartions in 4923 msec. 203128.171875/second
    1000000 opartions in 4924 msec. 203086.921875/second
    1000000 opartions in 4981 msec. 200762.906250/second
    ==== 4 threads ====
    1000000 opartions in 6716 msec. 148898.156250/second
    1000000 opartions in 7048 msec. 141884.218750/second
    1000000 opartions in 7342 msec. 136202.671875/second
    1000000 opartions in 7344 msec. 136165.578125/second
    ==== 5 threads ====
    1000000 opartions in 10366 msec. 96469.226563/second
    1000000 opartions in 11085 msec. 90212.000000/second
    1000000 opartions in 11121 msec. 89919.968750/second
    1000000 opartions in 11464 msec. 87229.585938/second
    1000000 opartions in 11538 msec. 86670.132813/second
    ==== 6 threads ====
    1000000 opartions in 16513 msec. 60558.347656/second
    1000000 opartions in 17704 msec. 56484.410156/second
    1000000 opartions in 18219 msec. 54887.753906/second
    1000000 opartions in 18550 msec. 53908.355469/second
    1000000 opartions in 18605 msec. 53748.992188/second
    1000000 opartions in 18663 msec. 53581.953125/second
    ==== 7 threads ====
    1000000 opartions in 22207 msec. 45030.847656/second
    1000000 opartions in 23275 msec. 42964.554688/second
    1000000 opartions in 23484 msec. 42582.183594/second
    1000000 opartions in 24378 msec. 41020.593750/second
    1000000 opartions in 24425 msec. 40941.656250/second
    1000000 opartions in 24533 msec. 40761.421875/second
    1000000 opartions in 24645 msec. 40576.183594/second
    ==== 8 threads ====
    1000000 opartions in 26375 msec. 37914.691406/second
    1000000 opartions in 26420 msec. 37850.113281/second
    1000000 opartions in 26532 msec. 37690.335938/second
    1000000 opartions in 26670 msec. 37495.312500/second
    1000000 opartions in 29772 msec. 33588.605469/second
    1000000 opartions in 29859 msec. 33490.738281/second
    1000000 opartions in 30098 msec. 33224.800781/second
    1000000 opartions in 30082 msec. 33242.468750/second
    {code}
    However, if I comment out the file system statistics increments, the benchmark improves to:
    {code}
    ==== 1 threads ====
    1000000 opartions in 1194 msec. 837520.937500/second
    ==== 2 threads ====
    1000000 opartions in 1433 msec. 697836.687500/second
    1000000 opartions in 1433 msec. 697836.687500/second
    ==== 3 threads ====
    1000000 opartions in 1643 msec. 608642.750000/second
    1000000 opartions in 1643 msec. 608642.750000/second
    1000000 opartions in 1639 msec. 610128.125000/second
    ==== 4 threads ====
    1000000 opartions in 1886 msec. 530222.687500/second
    1000000 opartions in 1886 msec. 530222.687500/second
    1000000 opartions in 1886 msec. 530222.687500/second
    1000000 opartions in 1899 msec. 526592.937500/second
    ==== 5 threads ====
    1000000 opartions in 2065 msec. 484261.500000/second
    1000000 opartions in 2066 msec. 484027.093750/second
    1000000 opartions in 2067 msec. 483792.937500/second
    1000000 opartions in 2066 msec. 484027.093750/second
    1000000 opartions in 2066 msec. 484027.093750/second
    ==== 6 threads ====
    1000000 opartions in 2151 msec. 464900.031250/second
    1000000 opartions in 2111 msec. 473709.156250/second
    1000000 opartions in 2153 msec. 464468.187500/second
    1000000 opartions in 2114 msec. 473036.906250/second
    1000000 opartions in 2113 msec. 473260.781250/second
    1000000 opartions in 2112 msec. 473484.843750/second
    ==== 7 threads ====
    1000000 opartions in 2368 msec. 422297.312500/second
    1000000 opartions in 2334 msec. 428449.000000/second
    1000000 opartions in 2332 msec. 428816.468750/second
    1000000 opartions in 2330 msec. 429184.562500/second
    1000000 opartions in 2332 msec. 428816.468750/second
    1000000 opartions in 2375 msec. 421052.625000/second
    1000000 opartions in 2394 msec. 417710.937500/second
    ==== 8 threads ====
    1000000 opartions in 2517 msec. 397298.375000/second
    1000000 opartions in 2538 msec. 394011.031250/second
    1000000 opartions in 2538 msec. 394011.031250/second
    1000000 opartions in 2538 msec. 394011.031250/second
    1000000 opartions in 2539 msec. 393855.843750/second
    1000000 opartions in 2614 msec. 382555.468750/second
    1000000 opartions in 2666 msec. 375093.781250/second
    1000000 opartions in 2701 msec. 370233.250000/second
    {code}
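The scaling collapse above is classic cache-line contention: every writer thread CASes the same `AtomicLong` word on each `append`. As a hedged aside, `java.util.concurrent.atomic.LongAdder` (added later, in JDK 8) addresses exactly this pattern by striping the count across per-thread cells so concurrent increments rarely collide, at the cost of a slightly more expensive `sum()`. A minimal sketch contrasting the two (not part of the original report):

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Both counters arrive at the same total; under many threads the LongAdder
// increments land on separate cells, while every AtomicLong increment
// CASes the same word and serializes the threads.
public class CounterDemo {
    static final int THREADS = 8, INCREMENTS = 100_000;

    public static void main(String[] args) throws InterruptedException {
        AtomicLong atomic = new AtomicLong();
        LongAdder adder = new LongAdder();

        Thread[] ts = new Thread[THREADS];
        for (int t = 0; t < THREADS; t++) {
            ts[t] = new Thread(() -> {
                for (int i = 0; i < INCREMENTS; i++) {
                    atomic.incrementAndGet(); // all threads contend here
                    adder.increment();        // threads hit separate cells
                }
            });
            ts[t].start();
        }
        for (Thread t : ts) t.join();

        System.out.println(atomic.get()); // 800000
        System.out.println(adder.sum());  // 800000
    }
}
```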
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • dhruba borthakur (JIRA) at Apr 10, 2009 at 10:16 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698000#action_12698000 ]

    dhruba borthakur commented on HADOOP-5318:
    ------------------------------------------
{quote}
So I'd suggest that, rather than adding a buffer, PositionCache can just be lazy about reporting statistics.
{quote}
+1. I agree.
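The lazy-reporting idea can be sketched as follows. This is a hypothetical illustration, not the actual PositionCache code: byte counts accumulate in a plain thread-confined `long` and are folded into the shared `AtomicLong` only when a threshold is crossed or the stream is closed, so the cross-thread atomic update happens once per batch instead of once per write.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of lazy statistics reporting: the shared AtomicLong
// is touched once per FLUSH_THRESHOLD bytes rather than on every write.
class LazyByteCounter {
    private static final long FLUSH_THRESHOLD = 64 * 1024;
    private final AtomicLong sharedBytesWritten; // shared across threads
    private long localBytes;                     // thread-confined buffer

    LazyByteCounter(AtomicLong shared) { this.sharedBytesWritten = shared; }

    void record(long n) {
        localBytes += n;                  // cheap, no atomic operation
        if (localBytes >= FLUSH_THRESHOLD) flush();
    }

    void flush() {                        // must also be called on close()
        sharedBytesWritten.addAndGet(localBytes);
        localBytes = 0;
    }
}

public class LazyCounterDemo {
    public static void main(String[] args) {
        AtomicLong shared = new AtomicLong();
        LazyByteCounter c = new LazyByteCounter(shared);
        for (int i = 0; i < 1000; i++) c.record(100); // 100,000 bytes total
        c.flush();
        System.out.println(shared.get()); // 100000
    }
}
```

The trade-off, as discussed in the thread, is that the reported statistics lag reality by up to one threshold's worth of bytes until the next flush.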
Attachments: buf.patch, buffer-output.patch
  • Todd Lipcon (JIRA) at Apr 14, 2009 at 1:44 am
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Todd Lipcon updated HADOOP-5318:
    --------------------------------

    Status: Patch Available (was: Open)

    Implemented the lazy reporting of bytes as Doug suggested.

I didn't see any particular speedup, but I'm working on my dual-core laptop at the moment ;-) If someone could give this a try on a real box, it might be worth it. I'll also attach a test program.
  • Todd Lipcon (JIRA) at Apr 14, 2009 at 1:44 am
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Todd Lipcon updated HADOOP-5318:
    --------------------------------

    Attachment: TestWriteConcurrency.java
    hadoop-5318.txt
  • Hadoop QA (JIRA) at Apr 15, 2009 at 5:26 am
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699062#action_12699062 ]

    Hadoop QA commented on HADOOP-5318:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12405373/TestWriteConcurrency.java
    against trunk revision 765025.

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    -1 patch. The patch command could not apply the patch.

    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/196/console

    This message is automatically generated.
  • Todd Lipcon (JIRA) at Apr 15, 2009 at 5:42 am
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699067#action_12699067 ]

    Todd Lipcon commented on HADOOP-5318:
    -------------------------------------

    Looks like the Hadoop QA bot attempted to apply the .java file as a patch... not sure how to convince it to apply the patch file.
  • dhruba borthakur (JIRA) at Apr 15, 2009 at 1:39 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699192#action_12699192 ]

    dhruba borthakur commented on HADOOP-5318:
    ------------------------------------------

    The hadoopqa patch process always picks the last attached file. In this case, that was the .java file. Please re-attach the latest and greatest patch to this JIRA and then cancel and submit the patch once again. Thanks.
  • Todd Lipcon (JIRA) at Apr 15, 2009 at 8:31 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Todd Lipcon updated HADOOP-5318:
    --------------------------------

    Attachment: hadoop-5318.txt

    reattaching patch for QA
  • Todd Lipcon (JIRA) at Apr 15, 2009 at 8:35 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699363#action_12699363 ]

    Todd Lipcon commented on HADOOP-5318:
    -------------------------------------

    Dhruba: I don't appear to have permission to twiddle the "Patch Available" state on this issue. Hopefully someone else can retrigger QA.
  • dhruba borthakur (JIRA) at Apr 15, 2009 at 8:47 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699372#action_12699372 ]

    dhruba borthakur commented on HADOOP-5318:
    ------------------------------------------

    Todd: I added you as a "contributor". Please see if you are able to Cancel and then Submit the patch.
    {code}
    ==== 1 threads ====
    1000000 opartions in 1194 msec. 837520.937500/second
    ==== 2 threads ====
    1000000 opartions in 1433 msec. 697836.687500/second
    1000000 opartions in 1433 msec. 697836.687500/second
    ==== 3 threads ====
    1000000 opartions in 1643 msec. 608642.750000/second
    1000000 opartions in 1643 msec. 608642.750000/second
    1000000 opartions in 1639 msec. 610128.125000/second
    ==== 4 threads ====
    1000000 opartions in 1886 msec. 530222.687500/second
    1000000 opartions in 1886 msec. 530222.687500/second
    1000000 opartions in 1886 msec. 530222.687500/second
    1000000 opartions in 1899 msec. 526592.937500/second
    ==== 5 threads ====
    1000000 opartions in 2065 msec. 484261.500000/second
    1000000 opartions in 2066 msec. 484027.093750/second
    1000000 opartions in 2067 msec. 483792.937500/second
    1000000 opartions in 2066 msec. 484027.093750/second
    1000000 opartions in 2066 msec. 484027.093750/second
    ==== 6 threads ====
    1000000 opartions in 2151 msec. 464900.031250/second
    1000000 opartions in 2111 msec. 473709.156250/second
    1000000 opartions in 2153 msec. 464468.187500/second
    1000000 opartions in 2114 msec. 473036.906250/second
    1000000 opartions in 2113 msec. 473260.781250/second
    1000000 opartions in 2112 msec. 473484.843750/second
    ==== 7 threads ====
    1000000 opartions in 2368 msec. 422297.312500/second
    1000000 opartions in 2334 msec. 428449.000000/second
    1000000 opartions in 2332 msec. 428816.468750/second
    1000000 opartions in 2330 msec. 429184.562500/second
    1000000 opartions in 2332 msec. 428816.468750/second
    1000000 opartions in 2375 msec. 421052.625000/second
    1000000 opartions in 2394 msec. 417710.937500/second
    ==== 8 threads ====
    1000000 opartions in 2517 msec. 397298.375000/second
    1000000 opartions in 2538 msec. 394011.031250/second
    1000000 opartions in 2538 msec. 394011.031250/second
    1000000 opartions in 2538 msec. 394011.031250/second
    1000000 opartions in 2539 msec. 393855.843750/second
    1000000 opartions in 2614 msec. 382555.468750/second
    1000000 opartions in 2666 msec. 375093.781250/second
    1000000 opartions in 2701 msec. 370233.250000/second
    {code}
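The contention pattern behind these numbers can be reproduced without Hadoop: every `f.append()` bumps a single shared `AtomicLong` in the filesystem statistics, so each increment is a compare-and-swap on the same cache line across all threads. The following is a minimal standalone sketch of that pattern (the class and method names are illustrative, not part of Hadoop):

```java
import java.util.concurrent.atomic.AtomicLong;

public class ContentionDemo {
    // One shared counter: every increment is a CAS on the same cache line,
    // analogous to FileSystem.Statistics' per-scheme byte counters.
    static final AtomicLong shared = new AtomicLong();

    // Run `threads` workers, each incrementing the shared counter `perThread` times.
    // The final total is always exact; it is the *throughput* that collapses as
    // the thread count grows, mirroring the benchmark above.
    static long run(int threads, long perThread) {
        shared.set(0);
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (long j = 0; j < perThread; j++) {
                    shared.incrementAndGet();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return shared.get();
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        long total = run(4, 1_000_000L);
        long elapsed = System.currentTimeMillis() - start;
        System.out.printf("%d increments in %d msec%n", total, elapsed);
    }
}
```

Timing the same loop with per-thread local counters (summed once at the end) instead of the shared `AtomicLong` shows the gap the benchmark measures: correctness is identical, but the shared-counter version serializes on the cache line.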
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Todd Lipcon (JIRA) at Apr 15, 2009 at 8:49 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Todd Lipcon updated HADOOP-5318:
    --------------------------------

    Status: Open (was: Patch Available)
    Attachments: buf.patch, buffer-output.patch, hadoop-5318.txt, hadoop-5318.txt, TestWriteConcurrency.java
  • Todd Lipcon (JIRA) at Apr 15, 2009 at 8:49 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Todd Lipcon updated HADOOP-5318:
    --------------------------------

    Status: Patch Available (was: Open)

    Yep, that worked. Thanks Dhruba
  • Hadoop QA (JIRA) at May 7, 2009 at 5:13 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707000#action_12707000 ]

    Hadoop QA commented on HADOOP-5318:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12405569/hadoop-5318.txt
    against trunk revision 772482.

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    -1 contrib tests. The patch failed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/295/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/295/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/295/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/295/console

    This message is automatically generated.
  • Todd Lipcon (JIRA) at May 21, 2009 at 6:19 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711740#action_12711740 ]

    Todd Lipcon commented on HADOOP-5318:
    -------------------------------------

    The test failure reported by Hudson is in the capacity scheduler contrib (unrelated to this patch).

    Given that this is a performance-related patch, I'd like to hear back from Ben that the patch to be committed shows similar performance gains to the original patch.
    Poor IO Performance due to AtomicLong operations
    ------------------------------------------------

    Key: HADOOP-5318
    URL: https://issues.apache.org/jira/browse/HADOOP-5318
    Project: Hadoop Core
    Issue Type: Bug
    Affects Versions: 0.19.0
    Environment: 2x quad core xeon linux 64 bit
    Reporter: Ben Maurer
    Attachments: buf.patch, buffer-output.patch, hadoop-5318.txt, hadoop-5318.txt, TestWriteConcurrency.java


    The AtomicLong operations in counting file system statistics can cause high levels of contention with multiple threads. This test demonstrates having multiple threads writing to different sequence files:
    {code:java}
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.ByteWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.SequenceFile.Writer;
    import org.apache.hadoop.io.SequenceFile.CompressionType;
    public class Test {
    public static void main(String[] args) throws IOException {
    final Configuration c = new Configuration();
    final FileSystem fs = FileSystem.get(c);

    final int NUM = 1000*1000;
    for (int i = 0; i < Integer.valueOf(args[0]); i ++) {
    final int ii = i;
    new Thread(new Runnable() {
    @Override
    public void run() {

    try {
    Writer f = SequenceFile.createWriter(fs, c, new Path("/test/" + ii ), ByteWritable.class, ByteWritable.class, CompressionType.NONE);
    ByteWritable v = new ByteWritable();

    long time = System.currentTimeMillis();
    for (int i = 0; i < NUM; i ++)
    f.append(v, v);
    f.close();
    long end = System.currentTimeMillis();

    System.out.printf("%d opartions in %d msec. %f/second\n", NUM, end - time, (float)(1000 * NUM)/(end - time));

    } catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
    }

    }
    }).start();
    }
    }
    }
    {code}
    The results of this benchmark are
    {code}
    ==== 1 threads ====
    1000000 opartions in 1431 msec. 698812.000000/second
    ==== 2 threads ====
    1000000 opartions in 3001 msec. 333222.250000/second
    1000000 opartions in 2985 msec. 335008.375000/second
    ==== 3 threads ====
    1000000 opartions in 4923 msec. 203128.171875/second
    1000000 opartions in 4924 msec. 203086.921875/second
    1000000 opartions in 4981 msec. 200762.906250/second
    ==== 4 threads ====
    1000000 opartions in 6716 msec. 148898.156250/second
    1000000 opartions in 7048 msec. 141884.218750/second
    1000000 opartions in 7342 msec. 136202.671875/second
    1000000 opartions in 7344 msec. 136165.578125/second
    ==== 5 threads ====
    1000000 opartions in 10366 msec. 96469.226563/second
    1000000 opartions in 11085 msec. 90212.000000/second
    1000000 opartions in 11121 msec. 89919.968750/second
    1000000 opartions in 11464 msec. 87229.585938/second
    1000000 opartions in 11538 msec. 86670.132813/second
    ==== 6 threads ====
    1000000 opartions in 16513 msec. 60558.347656/second
    1000000 opartions in 17704 msec. 56484.410156/second
    1000000 opartions in 18219 msec. 54887.753906/second
    1000000 opartions in 18550 msec. 53908.355469/second
    1000000 opartions in 18605 msec. 53748.992188/second
    1000000 opartions in 18663 msec. 53581.953125/second
    ==== 7 threads ====
    1000000 opartions in 22207 msec. 45030.847656/second
    1000000 opartions in 23275 msec. 42964.554688/second
    1000000 opartions in 23484 msec. 42582.183594/second
    1000000 opartions in 24378 msec. 41020.593750/second
    1000000 opartions in 24425 msec. 40941.656250/second
    1000000 opartions in 24533 msec. 40761.421875/second
    1000000 opartions in 24645 msec. 40576.183594/second
    ==== 8 threads ====
    1000000 opartions in 26375 msec. 37914.691406/second
    1000000 opartions in 26420 msec. 37850.113281/second
    1000000 opartions in 26532 msec. 37690.335938/second
    1000000 opartions in 26670 msec. 37495.312500/second
    1000000 opartions in 29772 msec. 33588.605469/second
    1000000 opartions in 29859 msec. 33490.738281/second
    1000000 opartions in 30098 msec. 33224.800781/second
    1000000 opartions in 30082 msec. 33242.468750/second
    {code}
    However, if I comment out the file system statistics increments, the benchmark improves to:
    {code}
    ==== 1 threads ====
    1000000 opartions in 1194 msec. 837520.937500/second
    ==== 2 threads ====
    1000000 opartions in 1433 msec. 697836.687500/second
    1000000 opartions in 1433 msec. 697836.687500/second
    ==== 3 threads ====
    1000000 opartions in 1643 msec. 608642.750000/second
    1000000 opartions in 1643 msec. 608642.750000/second
    1000000 opartions in 1639 msec. 610128.125000/second
    ==== 4 threads ====
    1000000 opartions in 1886 msec. 530222.687500/second
    1000000 opartions in 1886 msec. 530222.687500/second
    1000000 opartions in 1886 msec. 530222.687500/second
    1000000 opartions in 1899 msec. 526592.937500/second
    ==== 5 threads ====
    1000000 opartions in 2065 msec. 484261.500000/second
    1000000 opartions in 2066 msec. 484027.093750/second
    1000000 opartions in 2067 msec. 483792.937500/second
    1000000 opartions in 2066 msec. 484027.093750/second
    1000000 opartions in 2066 msec. 484027.093750/second
    ==== 6 threads ====
    1000000 opartions in 2151 msec. 464900.031250/second
    1000000 opartions in 2111 msec. 473709.156250/second
    1000000 opartions in 2153 msec. 464468.187500/second
    1000000 opartions in 2114 msec. 473036.906250/second
    1000000 opartions in 2113 msec. 473260.781250/second
    1000000 opartions in 2112 msec. 473484.843750/second
    ==== 7 threads ====
    1000000 opartions in 2368 msec. 422297.312500/second
    1000000 opartions in 2334 msec. 428449.000000/second
    1000000 opartions in 2332 msec. 428816.468750/second
    1000000 opartions in 2330 msec. 429184.562500/second
    1000000 opartions in 2332 msec. 428816.468750/second
    1000000 opartions in 2375 msec. 421052.625000/second
    1000000 opartions in 2394 msec. 417710.937500/second
    ==== 8 threads ====
    1000000 opartions in 2517 msec. 397298.375000/second
    1000000 opartions in 2538 msec. 394011.031250/second
    1000000 opartions in 2538 msec. 394011.031250/second
    1000000 opartions in 2538 msec. 394011.031250/second
    1000000 opartions in 2539 msec. 393855.843750/second
    1000000 opartions in 2614 msec. 382555.468750/second
    1000000 opartions in 2666 msec. 375093.781250/second
    1000000 opartions in 2701 msec. 370233.250000/second
    {code}
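The contention pattern in the numbers above can be reproduced in miniature without SequenceFile at all. This is a hedged, self-contained sketch (class and method names are hypothetical, not from the patch): several threads hammering one shared `AtomicLong` all serialize on the same cache line, which is the cost the statistics counters impose on every write.

```java
// Minimal sketch of the contention described above: N threads incrementing
// one shared AtomicLong. The result is always exact; only the elapsed time
// degrades as threads are added, because every incrementAndGet() contends
// on the same cache line.
import java.util.concurrent.atomic.AtomicLong;

public class ContentionDemo {
    static final AtomicLong shared = new AtomicLong();

    static long timeSharedIncrements(int threads, int opsPerThread) {
        shared.set(0);
        Thread[] ts = new Thread[threads];
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < opsPerThread; j++)
                    shared.incrementAndGet();   // contended atomic update
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return System.nanoTime() - start;       // wall time for all threads
    }

    public static void main(String[] args) {
        long ns = timeSharedIncrements(4, 1_000_000);
        System.out.println(shared.get() + " increments in " + ns + " ns");
    }
}
```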
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Ben Maurer (JIRA) at May 24, 2009 at 3:40 am
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712491#action_12712491 ]

    Ben Maurer commented on HADOOP-5318:
    ------------------------------------

When I tested this out, I saw a boost from buffering that can't be replicated just by grouping the atomic increments -- however, if we're going to go with the simple version, this patch is as good as it gets.

    A similar patch is needed for input streams.
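The buffering idea mentioned above can be sketched as follows. This is an illustrative sketch only, not the attached patch: each writer thread accumulates its byte count in a private field and flushes to the shared `AtomicLong` only every `FLUSH_INTERVAL` updates (all names here are hypothetical), so the contended atomic update becomes rare instead of per-operation.

```java
// Hypothetical sketch of buffered statistics: per-thread pending counts,
// flushed to the shared AtomicLong in batches. The shared total is exact
// after flush(); between flushes it may lag by up to FLUSH_INTERVAL.
import java.util.concurrent.atomic.AtomicLong;

public class BufferedStats {
    static final AtomicLong bytesWritten = new AtomicLong();
    static final int FLUSH_INTERVAL = 1024;

    // One instance per writer thread; 'pending' is never shared.
    static class ThreadLocalCounter {
        private long pending = 0;

        void incr(long delta) {
            pending += delta;                 // cheap, uncontended
            if (pending >= FLUSH_INTERVAL) {  // rare contended update
                bytesWritten.addAndGet(pending);
                pending = 0;
            }
        }

        void flush() {                        // e.g. on stream close()
            bytesWritten.addAndGet(pending);
            pending = 0;
        }
    }

    // Drives one counter through 'increments' single-byte updates.
    public static long demo(int increments) {
        bytesWritten.set(0);
        ThreadLocalCounter c = new ThreadLocalCounter();
        for (int i = 0; i < increments; i++) c.incr(1);
        c.flush();
        return bytesWritten.get();
    }

    public static void main(String[] args) {
        System.out.println(demo(100000));
    }
}
```

The trade-off is staleness: a reader of the counter can be behind by up to one flush interval per thread, which is why this goes beyond simply grouping the atomic increments.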
Attachments: buf.patch, buffer-output.patch, hadoop-5318.txt, hadoop-5318.txt, TestWriteConcurrency.java
  • Todd Lipcon (JIRA) at Jun 1, 2009 at 4:15 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715140#action_12715140 ]

    Todd Lipcon commented on HADOOP-5318:
    -------------------------------------

Ben: sorry, I wasn't clear from your last comment - did you try out the newest patch from this issue, or were you just commenting on what you saw with the original patch?

If you don't have a chance to try it, I can fire it up on an 8-core box somewhere and see what I get.
  • Todd Lipcon (JIRA) at Jun 1, 2009 at 4:17 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Todd Lipcon updated HADOOP-5318:
    --------------------------------

    Assignee: Todd Lipcon
    Status: Open (was: Patch Available)
  • Todd Lipcon (JIRA) at Jun 14, 2009 at 3:48 am
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719220#action_12719220 ]

    Todd Lipcon commented on HADOOP-5318:
    -------------------------------------

I just tried this patch on an EC2 c1.xlarge instance (8 cores) and couldn't reproduce the expected performance improvements. I updated the test program a bit to run 10 trials and call System.gc() between each (since the first couple of trials seem to speed up due to JIT compilation). I was writing into /dev/shm, so actual IO performance shouldn't be a factor - just the contention on the statistics lock. I also changed the test program's output format to be suitable for loading into R for analysis. Here's the t-test, which fails to show a significant improvement (taking rows 20:80 to chop off the first 2 runs, where the JIT kicked in):

    {noformat}
    d.0k <- read.table(file="0k.tsv",header=T)
    d.1k <- read.table(file="1k.tsv",header=T)
    t.test((d.1k$rate - d.0k$rate)[20:80])
    One Sample t-test

    data: (d.1k$rate - d.0k$rate)[20:80]
    t = -0.6754, df = 60, p-value = 0.502
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
    -68205.16 33772.57
    sample estimates:
    mean of x
    -17216.30
    {noformat}

A p-value of 0.502 is pretty unconvincing.

    Any thoughts from the various parties who are seeing this contention in practice?
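The measurement methodology described above (many trials, discard JIT warm-up, GC between trials) can be sketched like this. It is a simplified stand-in for the attached TestWriteConcurrency.java, not its actual contents; the `Workload` interface and method names are invented for illustration.

```java
// Hedged sketch of the trial methodology: run many trials, call System.gc()
// between them so collection pauses from one trial don't bleed into the next,
// and discard the first few trials where JIT compilation inflates the rate.
public class TrialHarness {
    interface Workload { void run(); }

    // Returns ops/second for each trial, in order.
    static double[] measureRates(Workload w, int trials, int opsPerTrial) {
        double[] rates = new double[trials];
        for (int t = 0; t < trials; t++) {
            System.gc();                                  // settle the heap
            long start = System.nanoTime();
            w.run();
            long elapsedNs = System.nanoTime() - start;
            rates[t] = opsPerTrial / (elapsedNs / 1e9);   // ops per second
        }
        return rates;
    }

    public static void main(String[] args) {
        double[] rates = measureRates(() -> {
            long x = 0;
            for (int i = 0; i < 1_000_000; i++) x += i;   // dummy workload
        }, 10, 1_000_000);
        // Skip the first 2 trials (JIT warm-up), then emit TSV rows
        // suitable for read.table() in R, as in the analysis above.
        for (int t = 2; t < rates.length; t++)
            System.out.printf("%d\t%f%n", t, rates[t]);
    }
}
```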
  • Todd Lipcon (JIRA) at Jun 14, 2009 at 3:50 am
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Todd Lipcon updated HADOOP-5318:
    --------------------------------

    Attachment: TestWriteConcurrency.java

Here's the updated test code that runs more trials, calls System.gc() between them, etc.

    Assignee: Todd Lipcon
    Attachments: buf.patch, buffer-output.patch, hadoop-5318.txt, hadoop-5318.txt, TestWriteConcurrency.java, TestWriteConcurrency.java
  • Todd Lipcon (JIRA) at Jun 14, 2009 at 4:02 am
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719222#action_12719222 ]

    Todd Lipcon commented on HADOOP-5318:
    -------------------------------------

    Just tried two more situations with the test code. The first was to comment out all of the increments in PositionCache; again, no improvement. I also tried Ben's patch "buf.patch" with the same test code. The performance increase is quite pronounced:

    {noformat}
    > d.benm <- read.table(file="benm-patch.tsv", header=T)
    > t.test(d.benm$rate - d.0k$rate)
    One Sample t-test

    data: d.benm$rate - d.0k$rate
    t = 32.835, df = 79, p-value < 2.2e-16
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
    2925074 3302593
    sample estimates:
    mean of x
    3113833
    {noformat}

    So, this JIRA should focus on figuring out how we can get that same benefit while addressing Doug's concerns above.
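One way to keep the statistics while avoiding the single hot AtomicLong is to stripe the counter across several cells so concurrent writers rarely CAS the same slot. This is a hedged sketch of that general technique, not the code in buf.patch; `StripedCounter` and its members are illustrative names.

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class StripedCounter {
    private static final int STRIPES = 16;  // power of two for cheap masking
    private final AtomicLongArray cells = new AtomicLongArray(STRIPES);

    // Each thread hashes to a stripe, so CAS retries under contention drop
    // sharply compared to every thread hammering one AtomicLong. (Adjacent
    // cells may still share a cache line; real implementations pad them.)
    public void add(long delta) {
        int idx = (int) (Thread.currentThread().getId() & (STRIPES - 1));
        cells.addAndGet(idx, delta);
    }

    // Sum of all stripes; not an atomic snapshot, but adequate for the
    // periodic statistics reporting this counter feeds.
    public long sum() {
        long total = 0;
        for (int i = 0; i < STRIPES; i++) {
            total += cells.get(i);
        }
        return total;
    }
}
```

JDK 8 later shipped java.util.concurrent.atomic.LongAdder, which implements exactly this idea with cache-line padding; no such class existed in the JDK at the time of this issue.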
  • Todd Lipcon (JIRA) at Jun 14, 2009 at 8:28 am
    [ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719238#action_12719238 ]

    Todd Lipcon commented on HADOOP-5318:
    -------------------------------------

    It seems like the real culprit here is HADOOP-5598. Switching to the pure-Java CRC32 makes the benchmark scale nearly linearly up to 8 threads.
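Why a checksum swap can help here: java.util.zip.CRC32 in the JVMs of that era went through JNI, and the per-call transition cost dominates for the many small updates a SequenceFile writer makes; a pure-Java CRC32 (what HADOOP-5598 provides) implements the same java.util.zip.Checksum interface, so it drops in wherever CRC32 was used. A minimal sketch of coding against that interface (`CrcDemo` and `crcOf` are illustrative names):

```java
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class CrcDemo {
    // One-shot checksum through the Checksum interface; any implementation
    // of the interface (JNI-backed or pure Java) can be passed in unchanged.
    static long crcOf(Checksum c, byte[] data) {
        c.reset();
        c.update(data, 0, data.length);
        return c.getValue();
    }

    public static void main(String[] args) {
        byte[] payload = "123456789".getBytes();
        // 0xCBF43926 is the standard CRC-32 check value for "123456789".
        System.out.printf("%08X%n", crcOf(new CRC32(), payload));
    }
}
```

Because callers depend only on the interface, verifying a replacement implementation reduces to checking that both produce the same values on known inputs.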

Discussion Overview
group: common-dev @ hadoop
posted: Feb 24, '09 at 4:30a
active: Jun 14, '09 at 8:28a
posts: 29
users: 1 (Todd Lipcon (JIRA), 29 posts)
website: hadoop.apache.org...
irc: #hadoop