Hi,
I ran into a problem while trying to define my own MultipleOutputFormat class. The code is below.
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UnsupportedEncodingException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultipleOutputFormat extends FileOutputFormat<LongWritable, Text> {

    // Writes each record to the task's output stream, one record per line.
    public class LineWriter extends RecordWriter<LongWritable, Text> {
        private DataOutputStream output;
        private byte separatorBytes[];

        public LineWriter(DataOutputStream output, String separator)
                throws UnsupportedEncodingException {
            this.output = output;
            this.separatorBytes = separator.getBytes("UTF-8");
        }

        @Override
        public synchronized void close(TaskAttemptContext context)
                throws IOException, InterruptedException {
            output.close();
        }

        @Override
        public void write(LongWritable key, Text value)
                throws IOException, InterruptedException {
            System.out.println("key:" + key.get());
            System.out.println("value:" + value.toString());
            // Earlier attempts, left commented out:
            //output.writeLong(key.)
            //output.write(separatorBytes);
            //output.write(value.toString().getBytes("UTF-8"));
            //output.write("\n".getBytes("UTF-8"));
            key.write(output);
            value.write(output);

            output.write("\n".getBytes("UTF-8"));
        }
    }

    private Path path;

    protected String generateFileNameForKeyValue(LongWritable key, Text value, String name) {
        return "key" + Math.random();
    }

    @Override
    public RecordWriter<LongWritable, Text> getRecordWriter(TaskAttemptContext context)
            throws IOException, InterruptedException {
        path = getOutputPath(context);
        System.out.println("ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd");
        // Create the default work file for this task attempt; fail if it already exists.
        Path file = getDefaultWorkFile(context, "");
        FileSystem fs = file.getFileSystem(context.getConfiguration());

        FSDataOutputStream fileOut = fs.create(file, false);

        return new LineWriter(fileOut, "\t");
    }
}

However, the output file contains unrecognizable characters.
Has anyone encountered this problem before? Any comments are greatly appreciated. Thanks in advance.


James, Teng (Teng Linxiao)
eRL, CDC, eBay, Shanghai
Extension: 86-21-28913530
MSN: [email protected]
Skype: James,Teng
Email: [email protected]

  • Yaozhen Pan at Jul 18, 2011 at 5:02 pm
    Hi James,

    I'm not sure whether you meant to write both the key and the value as text.
        key.write(output);
    This line writes the long in binary form (eight raw bytes) rather than as
    digits, which is probably why you see unrecognizable characters in the
    output file.

    Yaozhen
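To illustrate the reply, here is a minimal sketch that uses only java.io, so it runs without Hadoop on the classpath. The class and method names (TextVersusBinaryDemo, writeKeyAsBinary, writeRecordAsText) are hypothetical and not part of the original code. It assumes that key.write(output) behaves like DataOutput.writeLong, which is how LongWritable serializes itself; Text.write similarly puts a binary length prefix in front of the UTF-8 bytes. Writing the decimal digits and the string bytes directly yields a plain-text line instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class TextVersusBinaryDemo {

    /** Mimics key.write(output): the long is emitted as 8 raw big-endian bytes. */
    public static byte[] writeKeyAsBinary(long key) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(key);
        }
        return buffer.toByteArray();
    }

    /** Writes one "key<separator>value\n" line as UTF-8 text. */
    public static byte[] writeRecordAsText(long key, String value, String separator)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(Long.toString(key).getBytes(StandardCharsets.UTF_8)); // digits, not raw bytes
            out.write(separator.getBytes(StandardCharsets.UTF_8));
            out.write(value.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] binary = writeKeyAsBinary(42L);
        byte[] text = writeRecordAsText(42L, "hello", "\t");
        System.out.println("binary key length: " + binary.length);
        System.out.print(new String(text, StandardCharsets.UTF_8));
    }
}
```

In the original LineWriter, the equivalent change would presumably be to replace key.write(output) with output.write(Long.toString(key.get()).getBytes("UTF-8")), followed by output.write(separatorBytes), and to replace value.write(output) with output.write(value.getBytes(), 0, value.getLength()).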

Discussion Overview
group: common-user @
categories: hadoop
posted: Jul 18, '11 at 6:01a
active: Jul 18, '11 at 5:02p
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop
