FAQ
I have a question regarding map-side joins.
I finally got a copy of your book and tried implementing one, and I have a few
questions about it.

File 1:
31 Rafferty
33 Jones
33 Steinberg
34 Robinson
34 Smith
<null> Jasper

File 2:
31 sales
33 Engg
34 Clerical
35 Marketing

Results I got using a map-side join:

File1 inner join with File2
31 Rafferty
31 sales
33 Jones
33 Engg
33 Steinberg
33 Engg


File2 inner join with File1

31 sales
31 Rafferty
33 Engg
33 Jones
33 Engg
33 Steinberg
34 Clerical
34 Robinson
34 Clerical
34 Smith


But I am looking for a result like the one below:

31 sales Rafferty
33 Engg Jones
33 Engg Steinberg
34 Clerical Robinson
34 Clerical Smith


Is this possible using a map-side join only?

I am looking for a simple join that keeps only the key values present in both files.

Pankil


  • Amogh Vasekar at Jul 14, 2009 at 4:48 am
    Yes, it is. However, I assume File 2 is small enough to be distributed to all computing nodes without much delay; otherwise the whole point of a map-side join is defeated.
    If the keys in File 2 are unique, all you need to implement is a simple lookup. Otherwise, iterate over the matching entries for each key to implement the join.
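A minimal sketch of the lookup approach described above, written as plain Java with no Hadoop classes so it can be run on its own. The class and method names (`MapSideJoinSketch`, `buildLookup`, `joinRecord`, `join`) are hypothetical and not from the thread or the book. It assumes each line is a key followed by whitespace and a value. In an actual job, the small file would typically be shipped to every node (for example with DistributedCache) and loaded once per mapper before the per-record lookup runs; that wiring is left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of a replicated (map-side) join; illustrative names only.
class MapSideJoinSketch {

    // Load the small file ("key value" per line) into memory.
    // A list per key also covers the case where keys are not unique.
    static Map<String, List<String>> buildLookup(List<String> smallFile) {
        Map<String, List<String>> lookup = new HashMap<>();
        for (String line : smallFile) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length < 2) {
                continue; // skip malformed lines
            }
            lookup.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(parts[1]);
        }
        return lookup;
    }

    // The work a single map() call would do for one record of the large file.
    static List<String> joinRecord(String line, Map<String, List<String>> lookup) {
        List<String> out = new ArrayList<>();
        String[] parts = line.trim().split("\\s+", 2);
        if (parts.length < 2) {
            return out;
        }
        List<String> matches = lookup.get(parts[0]);
        if (matches == null) {
            return out; // inner join: keys missing from the small file are dropped
        }
        for (String value : matches) {
            out.add(parts[0] + " " + value + " " + parts[1]);
        }
        return out;
    }

    // Drive joinRecord over every record of the large file, in input order.
    static List<String> join(List<String> largeFile, List<String> smallFile) {
        Map<String, List<String>> lookup = buildLookup(smallFile);
        List<String> out = new ArrayList<>();
        for (String line : largeFile) {
            out.addAll(joinRecord(line, lookup));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("31 Rafferty", "33 Jones", "33 Steinberg",
                "34 Robinson", "34 Smith", "<null> Jasper");
        List<String> file2 = Arrays.asList("31 sales", "33 Engg", "34 Clerical", "35 Marketing");
        for (String row : join(file1, file2)) {
            System.out.println(row);
        }
    }
}
```

Run against the two files from the question, this emits one line per match in the form `key File2-value File1-value`, and it drops both the `<null>` record and key 35, which appear in only one file.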

    -----Original Message-----
    From: Pankil Doshi
    Sent: Tuesday, July 14, 2009 4:49 AM
    To: core-user@hadoop.apache.org
    Subject: Question regarding Map side Join


Discussion Overview
group: common-user @
categories: hadoop
posted: Jul 13, '09 at 11:19p
active: Jul 14, '09 at 4:48a
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop

2 users in discussion

Amogh Vasekar: 1 post Pankil Doshi: 1 post
