For the time being, I've given up on using object serialization to do
what I want. Instead, I'm going to just marshal and unmarshal the
values of my class myself. I've implemented write() and readFields()
methods in the classes that I want to read and write. (See my
definition of Sample below.)

Unfortunately, Hadoop throws the following exception when my program starts:

Job started: Wed Oct 10 18:04:06 EDT 2007
07/10/10 18:04:06 INFO mapred.InputFormatBase: Total input paths to process : 1
07/10/10 18:04:06 INFO mapred.JobClient: Running job: job_nlx1k6
07/10/10 18:04:06 WARN mapred.LocalJobRunner: job_nlx1k6
java.lang.ExceptionInInitializerError
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:315)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:326)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:339)
at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:411)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:115)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:126)
Caused by: java.lang.RuntimeException: java.lang.InstantiationException: net.intelresearch.cvmHadoop.Sample
at org.apache.hadoop.io.WritableComparator.newKey(WritableComparator.java:74)
at org.apache.hadoop.io.WritableComparator.<init>(Unknown Source)
at net.intelresearch.cvmHadoop.Sample.<clinit>(Unknown Source)
... 9 more
Caused by: java.lang.InstantiationException: net.intelresearch.cvmHadoop.Sample
at java.lang.Class.newInstance0(Class.java:340)
at java.lang.Class.newInstance(Class.java:308)
at org.apache.hadoop.io.WritableComparator.newKey(WritableComparator.java:72)
... 12 more
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
at net.intelresearch.cvmHadoop.KeyedByLocationalCode$Driver.main(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:143)
at net.intelresearch.cvmHadoop.Usage.main(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:155)

If I'm only trying to use the Writable interface (not
WritableComparable), what is the purpose of a WritableComparator?
Values are not sorted, only Keys, so it seems that there is no need to
define a comparator for them. Just to be on the safe side, I did
implement one and called WritableComparator.define() with it in
Sample's initializer. What am I missing here?

Thanks again for the help.

-steve

---

package net.intelresearch.cvmHadoop;

import java.io.*;

import org.apache.hadoop.io.*;

public class Sample implements Writable {

    Address address;
    SampleValue value; // sampled value at this point

    public Sample(Address a, SampleValue v) {
        address = a;
        value = v;
    }

    public SampleValue getValue() { return value; }
    public Address getAddress() { return address; }

    public String toString() {
        return (address.toString() + " " + value.toString());
    }

    public void write(DataOutput out) throws IOException {
        address.write(out);
        value.write(out);
    }

    public void readFields(DataInput in) throws IOException {
        address = new Address();
        address.readFields(in);

        value = new SampleValue();
        value.readFields(in);
    }

    public static class Comparator extends WritableComparator {
        public Comparator() {
            super(Sample.class);
        }

        // Just order by Address for now
        public int compare(Sample a, Sample b) {
            return a.getAddress().compareTo(b.getAddress());
        }
    }

    // register this comparator
    static {
        WritableComparator.define(Sample.class, new Comparator());
    }
}
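
A note on the trace above: the InstantiationException is raised by Class.newInstance(), which WritableComparator.newKey() calls, and Class.newInstance() needs a no-argument constructor. Sample as posted declares only the two-argument one. The sketch below reproduces that behavior in plain Java with no Hadoop dependency. The class names WithArgsOnly and WithNoArgs are hypothetical stand-ins for Sample without and with an empty constructor; they are not part of the original program.

```java
public class NoArgConstructorDemo {

    // Hypothetical stand-in for Sample as posted: only a two-argument constructor.
    public static class WithArgsOnly {
        final String address;
        final String value;

        public WithArgsOnly(String address, String value) {
            this.address = address;
            this.value = value;
        }
    }

    // Hypothetical stand-in for Sample with a public no-argument constructor added.
    public static class WithNoArgs {
        String address;
        String value;

        public WithNoArgs() {
            // Fields stay null here; a readFields-style method would fill them in later.
        }

        public WithNoArgs(String address, String value) {
            this.address = address;
            this.value = value;
        }
    }

    // Creates an instance the way the trace shows, through Class.newInstance().
    // Returns null on success, or the exception's class name on failure.
    @SuppressWarnings("deprecation")
    public static String tryNewInstance(Class<?> cls) {
        try {
            cls.newInstance();
            return null;
        } catch (InstantiationException | IllegalAccessException e) {
            return e.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println("WithArgsOnly: " + tryNewInstance(WithArgsOnly.class));
        System.out.println("WithNoArgs:   " + tryNewInstance(WithNoArgs.class));
    }
}
```

If the same reasoning applies to Sample, adding a public empty constructor (public Sample() {}) would likely be enough to get past this exception, since readFields() already creates the Address and SampleValue objects itself.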

On 10/10/07, Matt Kent wrote:
You're right, Serializable should be sufficient. I was thinking of a
case where you'd sometimes want to write them out as values, but other
times combine them inside Sample.

On 10/10/07, Steve Schlosser wrote:
Is this true? The fact that SampleValue and Address implement
Serializable should be sufficient to write them out to the stream.
They are not ever written out as keys or values themselves.

-steve

On 10/10/07, Matt Kent wrote:
I believe in this case you'll want to make Sample and Address writable as well.

On 10/10/07, Steve Schlosser wrote:
Hello all

Is there a best practice for using my own classes as keys and values?

My first attempt at doing this was successful - I built a
BigIntegerWritable class using IntWritable as a template. It was easy
because BigInteger has methods converting to and from byte arrays,
which I could then write into the DataOutput or read from the
DataInput.
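
The byte-array approach described above can be sketched roughly as follows. This is not the BigIntegerWritable class from the post, which is not shown; it is a plain-Java illustration with no Hadoop dependency. The class name BigIntegerRoundTrip, the roundTrip helper, and the 4-byte length prefix are assumptions made for the example. In a real Writable, the write and read logic would sit in write(DataOutput) and readFields(DataInput).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.math.BigInteger;

public class BigIntegerRoundTrip {

    // Writes a length prefix followed by the value's two's-complement bytes.
    public static void write(BigInteger value, DataOutput out) throws IOException {
        byte[] bytes = value.toByteArray();
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Reads back a value that was written by write().
    public static BigInteger read(DataInput in) throws IOException {
        int length = in.readInt();
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new BigInteger(bytes);
    }

    // Hypothetical helper: serializes to an in-memory buffer and parses it back.
    public static BigInteger roundTrip(BigInteger value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            write(value, new DataOutputStream(buffer));
            return read(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BigInteger original = new BigInteger("-123456789012345678901234567890");
        System.out.println(original.equals(roundTrip(original)));
    }
}
```

The length prefix is there because the number of bytes in a BigInteger varies, so the reader needs to know how many to consume before the next field starts.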

It seems like I should be able to use object serialization to write
to/read from the DataOutput/Input objects and make my own classes
implement the Writable interface. It seems like I should be able to
do something like this:

import java.io.*;

import org.apache.hadoop.io.*;

public class Sample implements Writable {

    Address address;
    SampleValue value; // sampled value at this point

    public Sample(Address a, SampleValue v) {
        address = a;
        value = v;
    }

    public SampleValue getValue() { return value; }
    public Address getAddress() { return address; }

    public String toString() {
        return (address.toString() + " " + value.toString());
    }

    [...]

    public void readFields(DataInput in) throws IOException {
        ObjectInputStream oin = new ObjectInputStream((DataInputBuffer)in);

        try {
            address = (Address)oin.readObject();
            value = (SampleValue)oin.readObject();
        } catch (ClassNotFoundException e) {
            throw new IOException(e.toString());
        }
    }

    public void write(DataOutput out) throws IOException {
        ObjectOutputStream oout = new ObjectOutputStream((DataOutputBuffer)out);

        oout.writeObject(address);
        oout.writeObject(value);
    }
}

This code compiles, but throws exceptions at runtime, complaining that
WritableComparator can not access a member of class Sample with
modifiers "". Can someone tell me what this exception is talking
about?

Do I need to implement a WritableComparator for each class that I want
to implement Writable?

Thanks again for the help.

-steve
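
One way to push Serializable objects through a DataOutput without casting it to a concrete stream class is to serialize into an in-memory byte array first and then write those bytes with a length prefix. The sketch below shows that idea in plain Java with no Hadoop dependency. The class name SerializableOverDataOutput and its methods are hypothetical, and this is a variation on the code in the post, not a confirmed fix for the exception it reports.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializableOverDataOutput {

    // Serializes obj with Java serialization, then writes it length-prefixed.
    public static void writeObject(DataOutput out, Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oout = new ObjectOutputStream(buffer)) {
            oout.writeObject(obj);
        }
        byte[] bytes = buffer.toByteArray();
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Reads one object that was written by writeObject().
    public static Object readObject(DataInput in) throws IOException {
        int length = in.readInt();
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        try (ObjectInputStream oin = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return oin.readObject();
        } catch (ClassNotFoundException e) {
            throw new IOException(e);
        }
    }

    // Hypothetical helper: writes to an in-memory buffer and reads the object back.
    public static Object roundTrip(Serializable obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writeObject(new DataOutputStream(buffer), obj);
            return readObject(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("10 Main St"));
    }
}
```

This only relies on the DataOutput and DataInput interfaces, so it does not depend on which concrete stream the caller passes in. The trade-off is that Java serialization adds a stream header and class metadata to every record, which is larger than hand-written field marshalling.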

Discussion Overview
group: common-user @
categories: hadoop
posted: Oct 10, '07 at 3:03p
active: Oct 11, '07 at 1:29a
posts: 9
users: 5
website: hadoop.apache.org...
irc: #hadoop