FAQ
I have been trying to understand how to write a simple custom Writable class,
and I find the available documentation very vague and unclear about certain
things. Okay, so here is the sample Writable implementation from the javadoc
of the Writable interface:

public class MyWritable implements Writable {
    // Some data
    private int counter;
    private long timestamp;

    public void write(DataOutput out) throws IOException {
        out.writeInt(counter);
        out.writeLong(timestamp);
    }

    public void readFields(DataInput in) throws IOException {
        counter = in.readInt();
        timestamp = in.readLong();
    }

    public static MyWritable read(DataInput in) throws IOException {
        MyWritable w = new MyWritable();
        w.readFields(in);
        return w;
    }
}

So in the readFields function we are simply saying: read an int from the
DataInput and put it in counter, then read a long and put it in timestamp.
What doesn't make sense to me is: what is the format of the DataInput here?
What if there are multiple ints and multiple longs - how is the correct int
going to end up in counter? What if the data I am reading in my mapper is a
string line and I am using a regular expression to parse the tokens - how do
I specify which field goes where? Simply saying readInt or readText - how
does that get connected to the right data?

So in my case, like I said, I am reading IIS log files, where my mapper input
is a log line that contains the usual log information: date, time, user,
server, url, query, responseTime, etc. I want to parse these into an object
that can be passed to the reducer instead of dumping all that information
as text.

I would appreciate any help.
Thanks
Adeel

  • Harsh J at Feb 2, 2011 at 5:42 pm
    See it this way:

    readFields(...) provides a DataInput stream that reads bytes from a
    binary stream, and write(...) provides a DataOutput stream that writes
    bytes to a binary stream.

    Now your data structure may be a complex one, perhaps an array of
    items or a mapping of some kind, or just a set of different types of
    objects. All you need to do is to think about how you would
    _serialize_ your data structure into a binary stream, so that you may
    _de-serialize_ it back from the same stream when required.

    About what goes where, I think looking up the definition of
    'serialization' will help. It is all in the ordering. If you wrote A
    before B, you read A before B - simple as that.
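
    For instance, here is a minimal sketch of that idea (the class and
    field names are made up for illustration): a Writable holding a
    variable-length array writes a length before the elements, and
    readFields mirrors the exact same order.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    public class CountsWritable implements Writable {
        private int id;
        private long[] counts = new long[0];

        public void write(DataOutput out) throws IOException {
            out.writeInt(id);                // 1. the id
            out.writeInt(counts.length);     // 2. how many longs follow
            for (long c : counts) {
                out.writeLong(c);            // 3. the longs themselves
            }
        }

        public void readFields(DataInput in) throws IOException {
            id = in.readInt();               // 1. read the id back first
            int n = in.readInt();            // 2. then the length
            counts = new long[n];
            for (int i = 0; i < n; i++) {
                counts[i] = in.readLong();   // 3. then exactly n longs
            }
        }
    }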

    This, or you could use a neat serialization library like Apache Avro
    (http://avro.apache.org) and solve it in a simpler way with a schema.
    I'd recommend learning/using Avro for all
    serialization/de-serialization needs, especially for Hadoop use-cases.

    --
    Harsh J
    www.harshj.com
  • Adeel Qureshi at Feb 2, 2011 at 5:51 pm
    Thanks for your reply. So let's say my input files are formatted like this;
    each line looks like:

    DATE TIME SERVER USER URL QUERY PORT ...

    So to read this I would create a writable mapper, something like:

    public class MyMapper implements Writable {
        Date date;
        long time;
        String server;
        String user;
        String url;
        String query;
        int port;

        readFields() {
            date = readDate(in); // not concerned with the actual date reading function
            time = readLong(in);
            server = readText(in);
            .....
        }
    }

    But I still don't understand how Hadoop is going to know to parse my line
    into these tokens, instead of map using the whole line as one token.

  • Harsh J at Feb 2, 2011 at 6:09 pm
    Hadoop isn't going to magically parse your Text line into anything.
    You'd have to tokenize it yourself and use the tokens to create your
    custom writable within your map call (a constructor, or a set of
    setter methods). The "Writable" is for serialization and
    de-serialization of itself only.
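
    For example, a rough sketch of that (LogEntryWritable and its setters
    are hypothetical names, not something defined in this thread): the
    tokenizing happens inside map(), before the Writable is ever involved.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class LogLineMapper
            extends Mapper<LongWritable, Text, Text, LogEntryWritable> {

        private final LogEntryWritable entry = new LogEntryWritable();

        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Split the raw log line yourself; Hadoop only hands you the Text.
            String[] f = value.toString().split("\\s+"); // DATE TIME SERVER USER ...
            entry.setDate(f[0]);
            entry.setTime(f[1]);
            entry.setServer(f[2]);
            entry.setUser(f[3]);
            // Emit the populated object; write()/readFields() only matter
            // when the framework ships it to the reducer.
            context.write(new Text(f[3]), entry);
        }
    }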

    --
    Harsh J
    www.harshj.com
  • Vijay at Feb 2, 2011 at 6:12 pm
    Hadoop is not going to parse the line for you. Your mapper will take the
    line, parse it and then turn it into your Writable so the next phase can
    just work with your object.

    Thanks,
    Vijay
  • Adeel Qureshi at Feb 2, 2011 at 6:17 pm
    Okay, so then the main question is how do I get the input line so that I
    can parse it. I am assuming it will be passed to me via the DataInput
    stream.

    So in my readFields function, I am assuming I will get the whole line,
    then I can parse it out and set my params, something like this:

    readFields() {
        String line = in.readLine(); // read the whole line

        // now apply the regular expression to parse it out
        data = pattern.group(1);
        time = pattern.group(2);
        user = pattern.group(3);
    }

    Is that right ???


  • David Sinclair at Feb 2, 2011 at 6:33 pm
    Are you storing your data as text or binary?

    If you are storing as text, your mapper is going to get keys of
    type LongWritable and values of type Text. Inside your mapper you would
    parse out the strings and wouldn't be using your custom writable; that is,
    unless you wanted your mapper/reducer to produce these.

    If you are storing as binary, e.g. SequenceFiles, you use
    the SequenceFileInputFormat, and the sequence file reader will create the
    writables that are handed to the mapper.
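
    A minimal driver sketch of that binary case (the class name and paths
    here are placeholders, not from this thread): with
    SequenceFileInputFormat the framework calls readFields() for you and
    hands the mapper ready-made writables.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class BinaryLogDriver {
        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "binary-log-job");
            job.setJarByClass(BinaryLogDriver.class);
            // Records are deserialized by the framework, not parsed in map().
            job.setInputFormatClass(SequenceFileInputFormat.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }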

    dave
  • Adeel Qureshi at Feb 2, 2011 at 6:39 pm
    I'm reading text data and outputting text data, so yeah, it's all text. The
    reason I wanted to use custom writable classes is not for the mapper's
    sake. You are right, the easiest thing is to receive the LongWritable and
    Text input in the mapper, parse the text, and deal with it. Where I am
    having trouble is in passing the parsed information to the reducer. Right
    now I am putting a bunch of things in as text and sending the same
    LongWritable and Text output to the reducer, but my text includes a bunch
    of things, e.g. several fields separated by a delimiter. This is the part
    I am trying to improve: instead of sending a bunch of delimited text, I
    want to send an actual object to my reducer.
  • David Sinclair at Feb 2, 2011 at 6:52 pm
    So create your writable as normal, and Hadoop takes care of the
    serialization/deserialization between mappers and reducers.

    For example, MyWritable is the same as you had previously; then in your
    mapper, output that writable:

    class MyMapper extends Mapper<LongWritable, Text, LongWritable, MyWritable> {

        private MyWritable writable = new MyWritable();

        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // parse text
            writable.setCounter(parseddata);
            writable.setTimestamp(parseddata);

            // don't know what your key is
            context.write(key, writable);
        }
    }

    and make sure you set the key/value output

    job.setMapOutputKeyClass(LongWritable.class);
    job.setMapOutputValueClass(MyWritable.class);
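
    On the reducer side, a matching sketch (the getter name below is assumed
    to mirror the setters above, not given in this thread) receives the
    already-deserialized objects:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class MyReducer
            extends Reducer<LongWritable, MyWritable, LongWritable, Text> {

        protected void reduce(LongWritable key, Iterable<MyWritable> values,
                Context context) throws IOException, InterruptedException {
            long total = 0;
            for (MyWritable w : values) {
                // Each value has already been rebuilt via readFields().
                total += w.getCounter();
            }
            context.write(key, new Text(Long.toString(total)));
        }
    }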

    dave

  • Adeel Qureshi at Feb 2, 2011 at 8:34 pm
    Huh, this is interesting - obviously I am not thinking about this whole
    thing right.

    So in your mapper you parse the line into tokens and set the appropriate
    values on your writable via a constructor or setters, and you let Hadoop
    do all the serialization and deserialization, telling Hadoop how to do
    that through the read and write methods. Okay, that makes more sense. One
    last thing I still don't understand is what the proper implementation of
    the read and write methods should be: if I have a bunch of strings in my
    writable, then what should the read method implementation look like?

    I really appreciate the help from all you guys.
  • David Sinclair at Feb 2, 2011 at 8:58 pm
    You can easily make a custom Writable by delegating to the existing
    writables. For example, if your writable is just a bunch of strings, use
    the existing Text writable for the fields in your class and call it from
    your read/write methods:

    class MyWritable implements Writable {
        // Initialize the fields so readFields() has objects to fill when
        // Hadoop creates an instance via the no-arg constructor.
        private Text fieldA = new Text();
        private Text fieldB = new Text();

        // ...

        public void write(DataOutput dataOutput) throws IOException {
            fieldA.write(dataOutput);
            fieldB.write(dataOutput);
        }

        public void readFields(DataInput dataInput) throws IOException {
            fieldA.readFields(dataInput);
            fieldB.readFields(dataInput);
        }
    }
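
    An equivalent alternative sketch, if you prefer to keep plain String
    fields (the class and field names are just illustrative): Text's static
    helpers take care of the length-prefixed UTF-8 encoding for you.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;

    class LogRecordWritable implements Writable {
        private String user = "";
        private String url = "";

        public void write(DataOutput out) throws IOException {
            // Same ordering rule applies: write user, then url.
            Text.writeString(out, user);
            Text.writeString(out, url);
        }

        public void readFields(DataInput in) throws IOException {
            user = Text.readString(in);
            url = Text.readString(in);
        }
    }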

    dave
