Hi,
I am trying to find a way to change the
key-value field separator in Hadoop Streaming.
The Streaming documentation says it can be
configured with "stream.map.output.field.separator",
but when I tried it, it had no effect.

Am I missing something?

I am using Hadoop 0.18.3.

Thanks in advance


  • Akira Kitada at Mar 21, 2009 at 12:32 pm
    Answering my own question.

    stream.map.output.field.separator does not control how the
    framework writes its output; it controls how the framework
    parses the data the mapper emits.

    All I had to do was have the mapper emit key and value joined by my
    separator and tell Hadoop which separator that is.

    Thanks
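
As a rough illustration of the approach described in this reply, here is a minimal Python sketch of a streaming mapper that joins key and value with a custom separator. The separator "|", the word-count style logic, the sample input, and the function names are all assumptions made for the example; none of them come from the original post. The job would also need stream.map.output.field.separator set to the same string so the framework can split the lines again.

```python
# Assumed separator; it has to match the value configured under
# stream.map.output.field.separator when the job is submitted.
SEPARATOR = "|"


def format_record(key, value, separator=SEPARATOR):
    """Join one key/value pair into a single output line."""
    return f"{key}{separator}{value}"


def map_lines(lines, separator=SEPARATOR):
    """Yield one 'word<separator>1' line per whitespace-delimited word."""
    for line in lines:
        for word in line.split():
            yield format_record(word, 1, separator)


# Demonstration on an in-memory sample (hypothetical input lines).
sample_input = ["hello world", "hello hadoop"]
for record in map_lines(sample_input, SEPARATOR):
    print(record)
```

In an actual mapper script, map_lines would be fed sys.stdin instead of the sample list, and each record would be written to standard output in the same way.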
  • Akira Kitada at Mar 21, 2009 at 1:41 pm
    I thought I had found the way, but I was wrong.
    It seems Hadoop automatically converts the separator
    I told it to use into the default separator, a tab...

    Is there any way to change this default
    and output key-value pairs with a custom separator?
  • Jason hadoop at Mar 21, 2009 at 2:56 pm
    For a job using TextOutputFormat, the final output key/value pairs are
    separated by the string configured under the key
    mapred.textoutputformat.separator, which defaults to TAB.

    The string under stream.map.output.field.separator is used to split the
    lines read back from the mapper into key and value, for use by the
    comparator, partitioner and combiner.
    The string under stream.map.input.field.separator is used to join the
    key/value pairs before the line is written to the streaming mapper.

    The string under stream.reduce.output.field.separator is used to split the
    lines read back from the reducer into key and value. These values are then
    passed to the output collector.
    The string under stream.reduce.input.field.separator is used to join the
    key/value pairs before the line is written to the streaming reducer.

    I have attached a drawing that will hopefully make this clearer; it will
    be one of the figures in the streaming chapter of my book.





    --
    Alpha Chapters of my book on Hadoop are available
    http://www.apress.com/book/view/9781430219422
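
The roles of the separator settings described in the reply above can be sketched with a small Python model. This is not Hadoop code; it only mimics the split and join steps on the reduce side. The function names, the sample records, and the assumption that a line is split at the first occurrence of the separator (with an empty value when the separator is absent) are illustrative choices, and TAB is assumed as the default for both settings.

```python
def split_line(line, separator="\t"):
    """Split a line read back from a streaming process into (key, value).

    Assumption: the split happens at the first occurrence of the separator;
    a line without the separator becomes a key with an empty value.
    """
    key, _, value = line.partition(separator)
    return key, value


def join_pair(key, value, separator="\t"):
    """Join a key/value pair into one line of text."""
    return f"{key}{separator}{value}"


def model_final_output(reducer_lines, conf):
    """Model reducer output being split and then written as final text output."""
    reduce_out_sep = conf.get("stream.reduce.output.field.separator", "\t")
    final_sep = conf.get("mapred.textoutputformat.separator", "\t")
    for line in reducer_lines:
        # Split with the streaming separator, as for the output collector.
        key, value = split_line(line, reduce_out_sep)
        # Rejoin with the TextOutputFormat separator for the final file.
        yield join_pair(key, value, final_sep)


reducer_output = ["apple|3", "pear|1"]  # hypothetical reducer lines

# Only the streaming separator is set: lines are split on "|" and then
# rejoined with the default TAB, so "|" disappears from the final output.
only_stream = {"stream.reduce.output.field.separator": "|"}
print(list(model_final_output(reducer_output, only_stream)))

# Setting the TextOutputFormat separator as well keeps "|" in the output.
both = {**only_stream, "mapred.textoutputformat.separator": "|"}
print(list(model_final_output(reducer_output, both)))
```

Under these assumptions the model reproduces the behaviour reported earlier in the thread: a custom streaming separator alone still yields tab-separated final output, because the final join uses a different setting.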
  • Akira Kitada at Mar 21, 2009 at 8:10 pm
    It's clear now. Thank you, Jason.
    Looking forward to the book being published.

Discussion Overview
group: common-user @
category: hadoop
posted: Mar 21, '09 at 11:46a
active: Mar 21, '09 at 8:10p
posts: 5
users: 2
website: hadoop.apache.org...
irc: #hadoop
