Looks like Lohit found a critical bug we should fix for 11.1:
https://issues.apache.org/jira/browse/PIG-3241 (only observed in Hadoop 2.0)

D

On Wed, Mar 6, 2013 at 12:57 PM, Prashant Kommireddi wrote:

Dmitriy, are the gc fixes all in for 0.11.1? PIG-3148 and PIG-3212 are the
2 JIRAs I know were fixed, any others?

I have a patch up for 3194, I think we should be good for a release once
that makes it in.

-Prashant

On Sat, Mar 2, 2013 at 11:16 AM, Prashant Kommireddi <prash1784@gmail.com> wrote:

Great.

I have commented regarding a possible approach for PIG-3194
http://goo.gl/UQ3zs. Please take a look when you folks have a chance.

On Fri, Mar 1, 2013 at 7:00 PM, Dmitriy Ryaboy wrote:

I'd like to get the gc fix in as well, but looks like Rohini is about to
commit it so we are good there.
On Mar 1, 2013, at 11:33 AM, Bill Graham wrote:

+1 to releasing Pig 0.11.1 when this is addressed. I should be able to help
with the release again.



On Fri, Mar 1, 2013 at 11:25 AM, Prashant Kommireddi <prash1784@gmail.com> wrote:

Hey Guys,

I wanted to start a conversation on this again. If Kai is not looking
at PIG-3194 I can start working on it to get 0.11 compatible with 20.2.
If everyone agrees, we should roll out 0.11.1 sooner than usual and I
volunteer to help with it in any way possible.

Any objections to getting 0.11.1 out soon after 3194 is fixed?

-Prashant

On Wed, Feb 20, 2013 at 3:34 PM, Russell Jurney <russell.jurney@gmail.com> wrote:

I stand corrected. Cool, 0.11 is good!


On Wed, Feb 20, 2013 at 1:15 PM, Jarek Jarcec Cecho <jarcec@apache.org> wrote:

Just an unrelated note: CDH3 is closer to Hadoop 1.x than to 0.20.

Jarcec

On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote:

I agree -- this is a good release. The bugs Kai pointed out should be
fixed, but as they are not critical regressions, we can fix them in
0.11.1 (if someone wants to roll 0.11.1 the minute these fixes are
committed, I won't mind and will dutifully vote for the release).

I think the Hadoop 20.2 incompatibility is unfortunate but iirc this is
fixable by setting HADOOP_USER_CLASSPATH_FIRST=true (was that in 20.2?)

FWIW Twitter's running CDH3 and this release works in our environment.

At this point things that block a release are critical regressions in
performance or correctness.

D


On Wed, Feb 20, 2013 at 11:52 AM, Alan Gates <gates@hortonworks.com> wrote:

No. Bugs like these are supposed to be found and fixed after we branch
from trunk (which happened several months ago in the case of 0.11). The
point of RCs is to check that it's a good build, licenses are right,
etc. Any bugs found this late in the game have to be seen as failures
of earlier testing.

Alan.
On Feb 20, 2013, at 11:33 AM, Russell Jurney wrote:

Isn't the point of an RC to find and fix bugs like these?


On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham <billgraham@gmail.com> wrote:

Regarding Pig 11 rc2, I propose we continue with the current vote as is
(which closes today EOD). Patches for 0.20.2 issues can be rolled into
a Pig 0.11.1 release whenever they're available and tested.



On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich <onatkovich@yahoo.com> wrote:

I agree that supporting as much as we can is a good goal. The issue is
who is going to be testing against all these versions? We found the
issues under discussion because of a customer report, not because we
consistently test against all versions. Perhaps when we decide which
versions to support for next release we need also to agree who is going
to be testing and maintaining compatibility with a particular version.

For instance since Hadoop 23 compatibility is important for us at Yahoo
we have been maintaining compatibility with this version for 0.9, 0.10
and will do the same for 0.11 and going forward. I think we would need
others to step in and claim the versions of their interest.

Olga


________________________________
From: Kai Londenberg <kai.londenberg@googlemail.com>
To: dev@pig.apache.org
Sent: Wednesday, February 20, 2013 1:51 AM
Subject: Re: pig 0.11 candidate 2 feedback: Several problems

Hi,

I strongly agree with Jonathan here. If there are good reasons why you
can't support an older version of Hadoop any more, that's one thing.
But having to change 2 lines of code doesn't really qualify as such in
my point of view ;)

At least for me, Pig support for 0.20.2 is essential - without it, I
can't use it. If it doesn't support it, I'll have to branch Pig and
hack it myself, or stop using it.

I guess there are a lot of people still running 0.20.2 clusters. If you
really have lots of data stored on HDFS and a continuously busy
cluster, an upgrade is nothing you do "just because".


2013/2/20 Jonathan Coveney <jcoveney@gmail.com>:

I agree that we shouldn't have to support old versions forever. That
said, I also don't think we should be too blase about supporting older
versions where it is not odious to do so. We have a lot of competition
in the language space and the broader the versions we can support, the
better (assuming it isn't too odious to do so). In this case, I don't
think it should be too hard to change ObjectSerializer so that the
commons-codec code used is compatible with both versions...we could
just in-line some of the Base64 code, and comment accordingly.
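
A minimal sketch of what in-lining Base64 could look like, so that
serialization no longer depends on which commons-codec version Hadoop
puts on the classpath. This is not the actual PIG-3194 patch or Pig's
ObjectSerializer code: the class name InlineBase64 and its encode and
decode methods are hypothetical, and the sketch assumes the standard
RFC 4648 alphabet with '=' padding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper (not part of Pig): a dependency-free Base64 codec
// using the standard RFC 4648 alphabet with '=' padding.
final class InlineBase64 {
    private static final char[] ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
            .toCharArray();

    private InlineBase64() {}

    // Encodes 3 input bytes at a time into 4 output characters.
    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder(((data.length + 2) / 3) * 4);
        for (int i = 0; i < data.length; i += 3) {
            int b0 = data[i] & 0xFF;
            int b1 = (i + 1 < data.length) ? data[i + 1] & 0xFF : 0;
            int b2 = (i + 2 < data.length) ? data[i + 2] & 0xFF : 0;
            int chunk = (b0 << 16) | (b1 << 8) | b2;
            out.append(ALPHABET[(chunk >> 18) & 0x3F]);
            out.append(ALPHABET[(chunk >> 12) & 0x3F]);
            // Pad with '=' where the input ran out of bytes.
            out.append(i + 1 < data.length ? ALPHABET[(chunk >> 6) & 0x3F] : '=');
            out.append(i + 2 < data.length ? ALPHABET[chunk & 0x3F] : '=');
        }
        return out.toString();
    }

    // Decodes padded Base64 text; rejects characters outside the alphabet.
    // Simplification: '=' is not checked for position, only treated as zero.
    static byte[] decode(String text) {
        if (text.length() % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        int pad = text.endsWith("==") ? 2 : text.endsWith("=") ? 1 : 0;
        byte[] out = new byte[text.length() / 4 * 3 - pad];
        int o = 0;
        for (int i = 0; i < text.length(); i += 4) {
            int chunk = 0;
            for (int j = 0; j < 4; j++) {
                char c = text.charAt(i + j);
                chunk = (chunk << 6) | (c == '=' ? 0 : valueOf(c));
            }
            if (o < out.length) out[o++] = (byte) (chunk >> 16);
            if (o < out.length) out[o++] = (byte) (chunk >> 8);
            if (o < out.length) out[o++] = (byte) chunk;
        }
        return out;
    }

    // Linear lookup keeps the sketch short; a real version would use a table.
    private static int valueOf(char c) {
        for (int i = 0; i < ALPHABET.length; i++) {
            if (ALPHABET[i] == c) return i;
        }
        throw new IllegalArgumentException("not a Base64 character: " + c);
    }

    public static void main(String[] args) {
        byte[] original = "foobar".getBytes(StandardCharsets.US_ASCII);
        String encoded = encode(original);
        System.out.println(encoded); // prints "Zm9vYmFy"
        System.out.println(Arrays.equals(original, decode(encoded))); // prints "true"
    }
}
```

Keeping the codec in one small class like this trades a few dozen
lines of duplicated logic for independence from the library version on
the cluster, which is the trade-off suggested above.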

That said, we also should be clear about what versions we support, but
6-12 months seems short. The upgrade cycles on Hadoop are really,
really long.

2013/2/20 Prashant Kommireddi <prash1784@gmail.com>

Agreed, that makes sense. Probably supporting an older Hadoop version
for 1 or 2 Pig releases before moving to a newer/stable version?

Having said that, should we use the 0.11 period to communicate the same
to the community and start moving on from 0.12 onwards? I know we are
way past the 6-12 months (1-2 releases) time frame with 0.20.2, but we
also need to make sure users are aware and plan accordingly.

I'd also be interested to hear how other projects (Hive, Oozie) are
handling this.

-Prashant

On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich <onatkovich@yahoo.com> wrote:

It seems that for each Pig release we need to agree and clearly state
which Hadoop versions it will support. I guess the main question is how
we decide on this. Perhaps we should say that Pig no longer supports
older Hadoop versions once the newer one is out for at least 6-12
months to make sure it is stable. I don't think we can support old
versions indefinitely. It is in everybody's interest to keep moving
forward.

Olga


________________________________
From: Prashant Kommireddi <prash1784@gmail.com>
To: dev@pig.apache.org
Sent: Tuesday, February 19, 2013 10:57 AM
Subject: Re: pig 0.11 candidate 2 feedback: Several problems

What do you guys feel about the JIRA to do with 0.20.2 compatibility
(PIG-3194)? I am interested in discussing the strategy around backward
compatibility as this is something that would haunt us each time we
move to the next Hadoop version. For example, we might be in a similar
situation while moving to Hadoop 2.0, when some of the stuff might
break for 1.0.

I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2
users might be caught unaware. Of course, I must admit there is selfish
interest here and it's probably easier for us to have a workaround on
Pig rather than upgrade Hadoop in all our production DCs.

-Prashant


On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney <russell.jurney@gmail.com> wrote:

I think someone should step up and fix the easy ones, if possible.

On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham <billgraham@gmail.com> wrote:

Thanks Kai for reporting these.

What do people think about the severity of these issues w.r.t. Pig 11?
I see a few possible options:

1. We include some or all of these patches in a new Pig 11 rc. We'd
want to make sure that they don't destabilize the current branch. This
approach makes sense if we think Pig 11 wouldn't be a good release
without one or more of these included.

2. We continue with the Pig 11 release without these, but then include
one or more in a 0.11.1 release.

3. We continue with the Pig 11 release without these, but then include
them in a 0.12 release.

Jon has a patch for the MAP issue
(PIG-3144 <https://issues.apache.org/jira/browse/PIG-3144>)
ready, which seems like the most pressing of the three to me.

thanks,
Bill

On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg <kai.londenberg@googlemail.com> wrote:

Hi,

I just subscribed to the dev mailing list in order to give you some
feedback on pig 0.11 candidate 2.

The following three issues are currently present in 0.11 candidate:

- Erroneous map entry alias resolution leading to "Duplicate schema
  alias" errors
- Changes to ObjectSerializer.java break compatibility with Hadoop
  0.20.2
- Condition in PhysicalOperator leads to ExecException "Error while
  trying to get next result in POStream"

The last two of these are easily solvable (see the tickets for details
on that). The first one is a bit trickier I think, but at least there
is a workaround for it (pass Map fields through a UDF).

In my personal opinion, each of these problems is pretty severe, but
opinions about the importance of the MAP Datatype and STREAM Operator,
as well as Hadoop 0.20.2 compatibility might differ.

so far ..

Kai Londenberg


--
*Note that I'm no longer using my Yahoo! email address. Please email me
at billgraham@gmail.com going forward.*


--
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com
datasyndrome.com


Discussion Overview
group: dev @
categories: pig, hadoop
posted: Feb 18, '13 at 10:27a
active: Mar 16, '13 at 12:13a
posts: 26
users: 11
website: pig.apache.org
