Subject: WritableComparable value changing between Map and Reduce
After about a week of researching, logging, etc., I have finally discovered what is happening, but I have no idea why.

I have created my own WritableComparable object so I can emit it as the key from my Mapper. The object contains several Longs, one String, and one Date property. The following code snippets are from the object

private Date SummaryDate;

/**
 * @return the summaryDate
 */
public Date getSummaryDate() {
    return SummaryDate;
}

/**
 * @param summaryDate the summaryDate to set
 */
public void setSummaryDate(Date summaryDate) {
    Calendar cal = Calendar.getInstance();
    cal.setTime(summaryDate);
    cal.set(Calendar.HOUR, 0);
    cal.set(Calendar.MINUTE, 0);
    cal.set(Calendar.SECOND, 0);
    cal.set(Calendar.MILLISECOND, 0);
    cal.set(Calendar.AM_PM, Calendar.AM);
    SummaryDate = cal.getTime();
}

@Override
public void readFields(DataInput arg0) throws IOException {
    ....
    getSummaryDate().setTime(arg0.readLong());
}

@Override
public void write(DataOutput arg0) throws IOException {
    ....
    arg0.writeLong(getSummaryDate().getTime());
}

The intent is for the Summary date to always be as of midnight, thus the use of the Calendar object in the setSummaryDate() method.

I have proven via logging that the Mapper is storing the correct value in the SummaryDate property, but sometimes the value received by the Reducer is the previous day. Does anyone have any idea how this could happen?

My only theory is a precision problem with the Long in which the epoch time is actually stored: it somehow loses a tick and becomes 1 millisecond before midnight, then my code drops the time and the date portion is left with a date that is one day earlier. Has anyone else seen anything like this?
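One way to test the precision theory in isolation is to push a midnight epoch value through the same writeLong/readLong pair outside of Hadoop. The sketch below is only an illustration: the class LongRoundTrip and its roundTrip method are hypothetical names, not part of the key class above, and it uses plain java.io streams rather than the job's actual serialization path. writeLong stores all eight bytes of the value, so the round trip is expected to be exact.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of the original key class.
final class LongRoundTrip {

    // Writes a long with DataOutput.writeLong and reads it back with
    // DataInput.readLong, returning the value that was read.
    static long roundTrip(long value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeLong(value);
            }
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readLong();
            }
        } catch (IOException e) {
            // In-memory streams should not fail; rethrow unchecked if they do.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long midnightUtc = 1340668800000L; // 2012-06-26T00:00:00Z in epoch milliseconds
        System.out.println(roundTrip(midnightUtc) == midnightUtc);
    }
}
```

If the value survives this round trip unchanged, a lost millisecond in the Long itself looks unlikely, and the place to look is how the value is produced or interpreted on each side.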

I am ready to change my code to just store the date as a formatted string. But I'd really like to know if this is a known Java or Hadoop problem.

Thanks,
Dave Shine
Sr. Software Engineer
321.939.5093 direct | 407.314.0122 mobile


  • Todd Lipcon at Jun 26, 2012 at 2:18 pm
    Your usage of the Calendar class doesn't look thread-safe to me. Writables
    need to be thread safe to avoid issues like this.

    -Todd
  • Dave Shine at Jun 26, 2012 at 2:24 pm
    Umm. OK. Thread safety might be a contributing factor, but I just added the Calendar object yesterday while this problem has been occurring since I deployed the code 2 weeks ago.

    Showing my Java ignorance, but how do I make the Calendar object thread safe?

    Dave Shine
  • Todd Lipcon at Jun 26, 2012 at 2:40 pm
    I'd switch to using JodaTime, or just forgo java.util.Date entirely and
    pass around millisecond timestamps in your job.

    Some details here:
    http://stackoverflow.com/questions/6245053/how-to-make-a-static-calendar-thread-safe
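A minimal sketch of the "pass around millisecond timestamps" suggestion might look like the following. It is an illustration under stated assumptions, not the poster's actual key class: SummaryDayKey and truncateToUtcMidnight are hypothetical names, the day boundary is assumed to be midnight UTC (not local midnight), and the class deliberately does not implement Hadoop's WritableComparable so that it compiles without Hadoop on the classpath. In a real job the same write and readFields methods would be the ones the interface requires.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical key class for illustration only. In a real job it would
// also declare "implements org.apache.hadoop.io.WritableComparable<SummaryDayKey>".
final class SummaryDayKey implements Comparable<SummaryDayKey> {

    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Midnight UTC of the summary day, in epoch milliseconds (assumption: UTC days).
    private long summaryDayMillis;

    // Rounds an epoch-millisecond timestamp down to midnight UTC of the same day.
    // Math.floorDiv keeps the rounding direction correct for negative timestamps too.
    static long truncateToUtcMidnight(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    void setSummaryDate(long epochMillis) {
        summaryDayMillis = truncateToUtcMidnight(epochMillis);
    }

    long getSummaryDate() {
        return summaryDayMillis;
    }

    // Same method shapes as Hadoop's Writable: a primitive long, no shared Date object.
    public void write(DataOutput out) throws IOException {
        out.writeLong(summaryDayMillis);
    }

    public void readFields(DataInput in) throws IOException {
        summaryDayMillis = in.readLong();
    }

    @Override
    public int compareTo(SummaryDayKey other) {
        return Long.compare(summaryDayMillis, other.summaryDayMillis);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SummaryDayKey
            && ((SummaryDayKey) o).summaryDayMillis == summaryDayMillis;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(summaryDayMillis);
    }
}
```

Because the field is a primitive long, readFields overwrites the value without touching any mutable Date, and the truncation is plain arithmetic that does not depend on the default time zone of the machine running the task.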
  • Alexey Zotov at Jun 26, 2012 at 4:19 pm
    Dave,
    I think different time zones on the cluster's nodes may be the reason
    for this behavior. Try synchronizing them, or use the
    Calendar.setTimeZone(TimeZone value) method.

    Hope this helps!
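The time zone suggestion can be sketched as follows. This is an illustration, not a confirmed diagnosis: MidnightTruncation and truncateToMidnight are hypothetical names, the zone is passed in explicitly through Calendar.getInstance(TimeZone) (equivalent in effect to calling setTimeZone on a fresh Calendar), and Calendar.HOUR_OF_DAY is used as a more compact way to express the same intent as setting HOUR and AM_PM. The fixed-offset zone "GMT-04:00" is only an example value.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper for illustration; not part of the original key class.
final class MidnightTruncation {

    // Truncates the given instant to midnight in the given zone, so the result
    // does not depend on the default time zone of the node running the task.
    static Date truncateToMidnight(Date date, TimeZone zone) {
        Calendar cal = Calendar.getInstance(zone);
        cal.setTime(date);
        cal.set(Calendar.HOUR_OF_DAY, 0);
        cal.set(Calendar.MINUTE, 0);
        cal.set(Calendar.SECOND, 0);
        cal.set(Calendar.MILLISECOND, 0);
        return cal.getTime();
    }

    public static void main(String[] args) {
        Date instant = new Date(1340676000000L); // 2012-06-26T02:00:00Z

        TimeZone utc = TimeZone.getTimeZone("UTC");
        TimeZone minusFour = TimeZone.getTimeZone("GMT-04:00");

        // The same instant truncates to different epoch values in different zones.
        System.out.println(truncateToMidnight(instant, utc).getTime());
        System.out.println(truncateToMidnight(instant, minusFour).getTime());
    }
}
```

In this example the UTC result is midnight on June 26, while the GMT-04:00 result is midnight on June 25 local time, because 02:00 UTC is still the evening of the previous day in that zone. Two nodes with different default zones would therefore compute different "midnight" values for the same input.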

  • Dave Shine at Jun 26, 2012 at 4:49 pm
    It turns out that one of the servers in the cluster does have the wrong time zone. But I don't believe any of the tasks for the job ran on that box (there were only 2 map tasks and 1 reduce task). Also, since I'm just passing the epoch time around, I wouldn't have thought the time zone of the server would have any effect.
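One way the server's zone could still matter, even when only the epoch value travels between tasks, is at the point where that value is turned back into a calendar date for logging or output. The sketch below is only an illustration of that effect: EpochRendering and formatDay are hypothetical names, and the two zones are example values, not the cluster's actual settings.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for illustration; not part of the original job.
final class EpochRendering {

    // Formats an epoch-millisecond value as a calendar day in the given zone.
    // A new SimpleDateFormat is created per call because the class is not thread-safe.
    static String formatDay(long epochMillis, TimeZone zone) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        fmt.setTimeZone(zone);
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long midnightUtc = 1340668800000L; // 2012-06-26T00:00:00Z

        // The same epoch value renders as different days depending on the zone.
        System.out.println(formatDay(midnightUtc, TimeZone.getTimeZone("UTC")));       // 2012-06-26
        System.out.println(formatDay(midnightUtc, TimeZone.getTimeZone("GMT-04:00"))); // 2012-06-25
    }
}
```

The epoch value itself is identical in both calls. Only the interpretation changes, which would be consistent with a reducer reporting "the previous day" if it formats or truncates the value in a zone west of the one the mapper used.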

    Dave Shine

Discussion Overview
group: cdh-user
categories: hadoop
posted: Jun 26, '12 at 12:59p
active: Jun 26, '12 at 4:49p
posts: 6
users: 3
website: cloudera.com
irc: #hadoop
