Yes, my key is the IP, and the value is an object (which inherits from the Hadoop Record
class and will be converted to a visualized form), e.g.:
key            field1,field2,field3 (these are properties of the object)
12.121.23.121  121,11,/img/dd.jpg
32.121.23.222  221,11,/img/xx.jpg

1. I want to sort by field1, but the reduce output is sorted by key by default. How can
I do this?
2. By the way, when my value (object) inherits from Record, why does the output come out
in this sequence:
data1

data2
...


--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-sort-the-output-by-vlaue-in-reduce-instead-of-by-key-tp2805541p2805541.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.


  • Leibnitz at Apr 11, 2011 at 9:27 am
    Can anyone give me some tips?

  • Josh Patterson at Apr 11, 2011 at 2:09 pm
    Leibnitz,
    I think you are looking for a "secondary sort" in this case, where the
    values arrive at the reducer in a defined order rather than merely
    grouped by key. Is that the case?
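
As a rough illustration of the secondary-sort idea mentioned above, here is a minimal plain-Python simulation; it is not Hadoop code. The function names (`map_phase`, `shuffle_sort`, `reduce_phase`) and the third sample row are assumptions made for this sketch. The idea is to copy field1 into a composite key, sort on the whole composite key, and group on the ip alone, so each reduce call sees its values already ordered by field1.

```python
from itertools import groupby

# Two rows come from the original question; the third is a hypothetical
# extra row so that one ip has more than one value to order.
records = [
    ("32.121.23.222", (221, 11, "/img/xx.jpg")),
    ("12.121.23.121", (121, 11, "/img/dd.jpg")),
    ("12.121.23.121", (95, 7, "/img/aa.jpg")),  # hypothetical
]

def map_phase(records):
    """Emit ((ip, field1), value): field1 is copied into a composite key."""
    for ip, value in records:
        yield (ip, value[0]), value

def shuffle_sort(pairs):
    """Stand-in for the framework's sort step: order by the full composite key."""
    return sorted(pairs, key=lambda kv: kv[0])

def reduce_phase(sorted_pairs):
    """Group on the natural key (ip) only, as a grouping comparator would."""
    for ip, group in groupby(sorted_pairs, key=lambda kv: kv[0][0]):
        yield ip, [value for _, value in group]

result = list(reduce_phase(shuffle_sort(map_phase(records))))
for ip, values in result:
    print(ip, values)
```

In a real job the same three roles are usually filled by a composite key class, a sort comparator, and a grouping comparator (in the newer API these are set with `Job.setSortComparatorClass` and `Job.setGroupingComparatorClass`), plus a partitioner on the natural key. Note that this orders values within each ip; it does not by itself sort the whole output by field1.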

    For a look at secondary sort I've got a few blog articles:

    http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
    http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
    http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/

    and part 3 includes source code on github.com:

    https://github.com/jpatanooga/Caduceus

    Hope that helps,

    Josh


    --
    Twitter: @jpatanooga
    Solution Architect @ Cloudera
    hadoop: http://www.cloudera.com
    blog: http://jpatterson.floe.tv
  • Leibnitz at Apr 12, 2011 at 2:37 am
    Thanks, all.
    To Josh: I think you are right. I previously tried to use a grouping key of
    field1+ip in the reduce phase, but it failed (the output was not sorted).
    I will look into your suggestion. :)

  • Sumit ghosh at Apr 11, 2011 at 9:31 am
    Your field1 data can be split over multiple reducers. Is it possible to emit
    field1 as the key from the reducer (in case you do not need the ip anymore)?
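
A minimal plain-Python sketch of this suggestion (again a simulation, not Hadoop code): re-key each record by field1 and let the per-reducer sort order it. The helper names, the two extra rows, and the boundary value 150 are assumptions for illustration. The sketch also shows the caveat about multiple reducers: with hash-style partitioning each reducer's output is sorted, but the concatenated output is not, whereas range partitioning (or a single reducer) gives a total order.

```python
from bisect import bisect_right

# The first two rows come from the original question; the last two are
# hypothetical rows added so the partitioning effect is visible.
records = [
    ("12.121.23.121", (121, 11, "/img/dd.jpg")),
    ("32.121.23.222", (221, 11, "/img/xx.jpg")),
    ("10.0.0.1", (95, 7, "/img/aa.jpg")),    # hypothetical
    ("10.0.0.2", (180, 3, "/img/bb.jpg")),   # hypothetical
]

def swap_key(records):
    """Re-key each record by field1; the ip travels along in the value."""
    for ip, (field1, field2, field3) in records:
        yield field1, (ip, field2, field3)

def run_reducers(pairs, num_reducers, partition):
    """Assign pairs to reducers, then sort each reducer's input by key."""
    parts = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        parts[partition(key, num_reducers)].append((key, value))
    return [sorted(part, key=lambda kv: kv[0]) for part in parts]

def hash_partition(key, num_reducers):
    """Stand-in for hash partitioning: keys are scattered across reducers."""
    return key % num_reducers

def make_range_partition(boundaries):
    """Keys up to boundaries[0] go to reducer 0, the next range to 1, etc."""
    def partition(key, num_reducers):
        return bisect_right(boundaries, key)
    return partition

def keys(parts):
    """Concatenate the reducers' outputs in order and keep only the keys."""
    return [key for part in parts for key, _ in part]

pairs = list(swap_key(records))
hashed = run_reducers(pairs, 2, hash_partition)
ranged = run_reducers(pairs, 2, make_range_partition([150]))
print("hash partition: ", keys(hashed))
print("range partition:", keys(ranged))
```

Hadoop ships a `TotalOrderPartitioner` that plays the role of the range partitioner here; whether it fits depends on the job, so treat the comparison above as a sketch of the trade-off rather than a recommendation.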





Discussion Overview
group: common-user @
categories: hadoop
posted: Apr 11, '11 at 6:33a
active: Apr 12, '11 at 2:37a
posts: 6
users: 3
website: hadoop.apache.org...
irc: #hadoop
