
[P5P] Perl 5.16 and Beyond.

Jesse Vincent
Sep 12, 2011 at 4:28 pm
[ What follows is the prose version of the talk I gave at YAPC::NA, OSCON and YAPC::EU this summer. I'll be giving a similar talk at YAPC::Asia. If you prefer reading big sentence fragments on slides, http://www.slideshare.net/obrajesse/perl-516-and-beyond is my slide deck from YAPC::EU]

Porters,

Over the last two years, I've been thinking a lot about where Perl 5 is headed. Well, that's not quite true. I've been thinking a lot about where Perl 5 is headed since way back in 2005 or 2006 when I was the Perl 6 project manager and it become clear to me that Ponie wasn't likely a horse I'd want to bet Perl 5 on.

As a community, we've done an absolutely stellar job of getting Perl 5 and p5p back into a reasonable groove. We know how to do releases. More individuals have released Perl 5 in the last 18 months than in the first 16 years Perl existed. It shouldn't really surprise anybody that knowing the work you do will be released soon is a huge motivating factor. Releasing frequently has dramatically increased Perl 5's rate of change. That's both good and bad.

I'm thrilled that Perl development is incredibly vibrant. 5.14 is a dramatically better Perl than 5.10 was, thanks to the work of approximately 260 porters. At the same time, I've made (or failed to make) some design choices that, in retrospect, I somewhat regret. There are places where Perl 5 is now more baroque than it was before. There have been changes to Perl 5 that have broken backward compatibility where we might have been better served not doing so. These mistakes are entirely my fault. If I had a do-over, I'd make different mistakes... er, decisions.

Various members of the community have asked me about my vision for the future of Perl 5. I was initially very, very hesitant to make any bold statements about the language, since "we're going to put out a development release every month and a new stable release every year" felt rather...bold itself.

Putting out releases is hard work. At this point, it's hard work we have well in hand.

Much of my thinking about the future of Perl 5 stems from the following principles:

* New versions of Perl 5 should not break your existing software
* Backward compatibility must not stop Perl 5 from evolving

Pay particular attention to "should" and "must" there. It is critically important that we not alienate the people, communities and companies who have invested their time and money in Perl 5. Pulling the rug out from under them isn't good for them and isn't good for us. Wherever possible, we need to preserve backward compatibility with earlier versions of Perl 5. At the same time, it could be argued that _any_ change to Perl 5 breaks backward compatibility. ("But I was depending on that segfault!") If Perl 5 is going to continue to flourish, we're going to need to be able to change the language.

* Why we deprecate things

Perl 5 is not Latin. It is a living language, still borrowing liberally from...just about everything. Sometimes we borrow the wrong things. Sometimes we borrow things and use them
wrong. Sometimes we invent things and later wish we hadn't. Perl has always been something of a packrat, but we're in danger of hitting the point of being diagnosed with a pathological hoarding problem. We need to get better at fixing problems and moving forward without hurting old code.

* How we handle language changes

We've always had a messy relationship with backward compatibility. Because we don't have a spec for the language and we have a single implementation, side-effect bugs and ill-considered accidents have become unchangeable parts of the language.

For a project of Perl's age, scale and diversity, we've been astonishingly conservative about how we make incompatible changes. Even so, it's not enough: every release breaks running code. The more the language evolves, the more legacy code will break.

Standing still is not an option. Perl's internals, syntax and semantics have seen some much-needed improvements in the past few years. There are many additional changes we can't make because they may damage too much legacy code.

To date, version declarations in code have been intended as a marker for "minimum acceptable version of perl" -- they've also been used to enable new language features or semantic changes. Many _other_ new changes in Perl are turned on by default without the need for any declaration. Typically, this is because we don't believe those changes will harm earlier code, though it's also the case that conditionally enabling certain changes would be difficult.

Perl 5 doesn't require that a developer declare which version of Perl 5 a program or library was written to run under. Historically, our default has been "no declaration means latest, sort of."

That changes from here on in. Code that does not declare a 'use v5.16;' or later will be presumed to have been written to target v5.14.

If there is no "use v5.xx" line at the top of the code, the runtime should act as it did on v5.14 without a use v5.14 line.
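
Concretely, nothing about the declarations themselves is new syntax; what changes is the promise behind them. A minimal sketch:

    # -- one file, with no "use v5.xx" declaration at all --
    # under this proposal the runtime presumes it targets v5.14 semantics
    print "v5.14-era defaults\n";

    # -- another file, with an explicit declaration --
    use v5.16;          # opt in to v5.16 syntax and semantics
    say "v5.16 behaviour, including any changed defaults";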

We'll need a nice compact way to declare what version of Perl you want from the commandline for one-liners too.
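
(For what it's worth, the existing -M switch already gets us most of the way there; whether we want something shorter still is an open question:)

    % perl -M5.014 -e 'say "asked for at least 5.14, with its feature bundle"'

    # which is just a compact spelling of
    % perl -e 'use v5.14; say "the same thing"'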

Along with this, there's a change to how we'll be removing deprecated features from the core.

If a core feature (syntactic or semantic) is removed, a "use v5.xx" declaration for an earlier version of Perl 5 should re-enable the feature.

This will be difficult. I fully expect people to start throwing pies, tomatoes or bricks when I say this. I will not require that the forward-ported implementation of the old feature have performance parity with the old implementation. In certain circumstances, I _will_ make exceptions to the mandatory-forward-port rule. Security fixes are one obvious case. I know we'll run into other cases.

It is my strong preference that features granted R&R (removal and reinstatement) be implemented as modules, so as not to bloat the runtime when they're not needed. This isn't a pipe dream. Classic::Perl already does this for a few features removed in 5.10 and 5.12.
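
A sketch of what that looks like from the user's side today (see Classic::Perl's documentation for the exact features and import arguments it supports):

    use Classic::Perl;   # CPAN module: restores selected behaviours removed in 5.10/5.12
    # ... legacy code that still relies on those behaviours ...

    # Hypothetically, a "use v5.12" seen by a much newer perl could pull in a
    # similar forward-ported module behind the scenes, instead of the removed
    # code living in the runtime forever.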

If it's not possible to reinstate a feature we've removed with existing APIs, we'll need to look at the cost of removing the feature vs simply disabling it in the presence of a new-enough v5.xx declaration.

* How quickly can we deprecate things

When I seized power in late 2009, I set us up with a new policy for deprecation timing. Anything we deprecated needed to warn in version .x that it would be removed in version .x+1. At the same time, I put us on a track to ship a new release of Perl 5 each spring. I'm incredibly pleased to see that multiple vendors have already incorporated Perl 5.12 into their stable releases.

There's a reasonable chance that some vendors on a slightly longer release cycle may never ship a given major version of Perl, meaning that
our carefully-crafted deprecation warnings will never be seen.

If we manage to implement the R&R policy for deprecated features, we have little to worry about. Code that declares "use v5.14" should continue to function just as it always has, but for code that declares 'use v5.16', we can make changes without a deprecation cycle.

For cases where we _can't_ implement R&R, I think we need to move to a two-year deprecation cycle, so as to have as little impact as possible on users who are upgrading.

* Language Modularization

There's a lot of stuff that ships as part of "Perl 5 the distribution" that's not part of "Perl 5 the language". Over the years, we've done an increasingly good job of unwinding many dark corners of what we affectionately refer to as "blead" into CPAN modules. Most of those items have been add-on scripts and tools that have been part of the distribution but weren't part of the language.
Most of those modules continue to ship as part of the distribution and may well do so for many years to come. When I talk about modularization, I'm not proposing that we stop shipping the newly modularized code as part of the traditional core distribution.

Over the past few years, a number of people have spent significant effort to make certain parts of the language runtime a good deal more pluggable. I expect that we have a good amount of work to do before the core is flexible enough for us to fully implement this plan.

It's time for us to start extracting parts of what has traditionally been considered the "language" part of Perl 5 into CPANable modules. To do this successfully, it is imperative that _nothing_ appear out of the ordinary to code that expects to use those features. If code doesn't declare use v5.16, it should still get the 5.14ish environment it would expect.

At the same time, it should be possible for a future Perl to declare that certain previously-integral features are optional and run-time loadable/unloadable. There are many situations where Perl is used that don't need a whole set of the blades on our swiss-army chainsaw. Forcing their inclusion makes Perl a less viable candidate for a wide variety of applications.
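
The feature pragma already gives a small-scale taste of this for a handful of named features; the idea is to extend that same opt-in, opt-out spirit to much heavier pieces of the language:

    use feature 'say';            # enable 'say' in this lexical scope
    say "say is enabled here";

    {
        no feature 'say';         # and it can be switched back off, lexically
        print "plain print() in this block\n";
    }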

Once language features are modularized, it also becomes _possible_ to maintain and improve them without requiring a full Perl upgrade.

Making this work for some things, like grammar changes, may take herculean effort but will invariably result in a more robust and flexible Perl.

I don't know what we'll extract or when we'll extract it, but there are a number of language features that seem like they might make sense to make pluggable: Formats, SysV IPC functions, Socket IO functions, Unix user information functions, Unix network information functions and Process and process group functions. Jesse Luehrs has already built us a first version of an extraction, modularization and replacement of smartmatch.

It should be clear from this list that I don't intend for us to kill, banish or exile these functions from the traditional Perl distribution. They're a useful and important part of our language and our culture. But they're not all needed or even desired everywhere we'd want Perl 5 to run. If we can modularize them in a way that doesn't negatively impact performance, Perl 5's manipulexity gets better without damaging its whipitupitude. The increased flexibility of the core should actually improve our whipitupitude at the same time.

* What should ship in the Perl core

When I say "traditional Perl distribution", I'm talking about the conglomeration of "language" and "toolkit" we currently ship. In 5.001, that totalled up to 59 packages. In 5.005, it grew to 176. By 5.008, it was 338 packages. 5.10 saw 541 packages. 5.12 ballooned to 625 packages. 5.14 was the most modest growth we've had in many versions. It ships 655 packages. Many of these are, of course, part of a smaller number of distributions.

The bar for getting a distribution into blead has varied over time. The "point" of the core distribution has often been up for debate. Is it an SDK for software written in Perl 5? If so, then we ought to be shipping what we currently consider to be the best-practice modules for building software in Perl. Is it "stuff we've always shipped with the Perl core?" If so, we're doomed to a life in a house full of stacks of old line-printer paper and TK50 cartridges.

Right now, what we include in the core distribution is a somewhat eclectic set of modules. They're not a particularly good SDK. They're not a particularly good sysadmin toolkit. They do, however, include an excellent set of modules to bootstrap getting more modules.

During my tenure, the guiding principle for new distributions in the core has been "does it help us bootstrap CPAN module installation or test the core?"

In this day and age, most users get Perl from their operating system vendor or a third-party packager. Those vendors often look to us to decide what to ship. And they often ignore us. That's 100% ok. There are vendors who choose not to ship Perl's documentation as part of their "perl" package. There are vendors who have considered removing Perl
from their default installation because it's "too big" -- They've talked about splitting the distribution up according to their own logic or sometimes of removing it entirely. We've done some amazing work paring down the installed footprint. So far, I think we've avoided getting the axe.

And then there's the maintenance issue. It takes a lot of work for us to keep up to date with all the modules we include in the core. We do a pretty good job, but it takes a lot out of us.

It's time we made some more work for ourselves. We will continue to ship a distribution of Perl that contains roughly what we've always shipped, but I'd like it to be our secondary product.

The primary product will be "Perl the language" with the minimal set of modules needed to test the core, offer all the language features we intend to offer (Encode, for example) and to bootstrap the installation of new modules from CPAN. While I intend for us to ship the R&R features in the core distribution, it shouldn't actually be necessary.

Since I became involved in the Perl community, there have been constant requests for various "SDKs" for various uses of Perl. The state of the art in how to do such a thing is better than it was, but isn't really all that useful at this point. It should be possible to take a "Perl the language" distribution, drop some modules into a directory, tar/zip it up and hand it off to your user community. And that's how we should be able to build the "traditional" Perl distribution.

* Documentation changes

Perl's documentation is. Well, it is many things. Current, well-edited, complete and completely accurate are four things it is not. We've been getting a little better at splitting out "introduction for newcomers" documentation from "canonical reference" documentation.

Our introduction documentation has always been opinionated. The particular opinions have varied by author, document and phase of the moon. That's ok. Introductory documentation _should_ be opinionated. Introductory documentation should recommend "at least one good way to do it" and point the curious reader who wants to know more to canonical reference documentation. Canonical reference documentation should try to actually spell out how things work with a minimum of opinion.

As we modularize features, we'll need to consider how we reorganize the docs so that they are maximally useful to users who should neither need nor want to know about how Perl's internals work.

We need a new document, built from previous perldeltas that succinctly describes what, other than bug fixes, came or went in each version of Perl 5. Porters should be discouraged from adding or removing a feature without updating this document (and other documentation).

* Tests

In order to understand how well we're doing on back-compat as we move forward, we're going to need to be able to run a new Perl against older versions of the test suite. One step in that direction would be to be able to run the test suite against an installed Perl.
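
As a sketch of the shape that might take, TAP::Harness (which ships with recent perls) can already aim a set of test files at an arbitrary perl binary; the path below is a placeholder, and the real work is that many core tests currently assume they are run from an uninstalled build tree:

    use TAP::Harness;

    my $harness = TAP::Harness->new({
        # run every test file with a specific installed perl (placeholder path)
        exec      => [ '/usr/local/bin/perl5.14.2' ],
        verbosity => 0,
    });

    # point it at an older suite checked out alongside this script
    $harness->runtests(glob 'perl-5.12.4/t/op/*.t');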

* TL;DR

- New versions of Perl 5 should not break your existing software
- Backward compatibility must not stop Perl 5 from evolving
- From 'use v5.16' forward, Perl should start treating 'use v5.x' statements as "try to give me a Perl that looks like v5.x" rather than "give me at least v5.x"
- We're awesome at modules. Where possible, we should be modularizing core features.


Best,

Jesse

  • David E. Wheeler at Sep 12, 2011 at 5:17 pm

    On Sep 12, 2011, at 9:28 AM, Jesse Vincent wrote:

    * TL;DR

    - New versions of Perl 5 should not break your existing software
    - Backward compatibility must not stop Perl 5 from evolving
    - From 'use v5.16' forward, Perl should start treating 'use v5.x' statements as "try to give me a Perl that looks like v5.x" rather than "give me at least v5.x"
    - We're awesome at modules. Where possible, we should be modularizing core features.
    +1 great plan. Now comes the hard part.

    Best,

    David
  • Tim Bunce at Sep 12, 2011 at 8:22 pm

    On Mon, Sep 12, 2011 at 12:28:47PM -0400, Jesse Vincent wrote:
    [...]
    Excellent!
    - New versions of Perl 5 should not break your existing software
    - Backward compatibility must not stop Perl 5 from evolving
    - From 'use v5.16' forward, Perl should start treating 'use v5.x' statements as "try to give me a Perl that looks like v5.x" rather than "give me at least v5.x"
    - We're awesome at modules. Where possible, we should be modularizing core features.
    +1 * 4

    I've been on perl5-porters for about two decades now.
    I've seen many comings and goings, plans and fallout, highs and lows.

    In recent years I've watched with growing delight at the renewed life
    on perl5-porters. The heart of the community. The very center of the Onion.

    I'd like to thank you Jesse for your commitment and thoughtful leadership.

    And I'd like to thank the many people who have contributed to perl,
    and the many who have made new contributors welcome here.

    Tim [feeling all misty-eyed to be near the center of the Onion for so long]
  • Abigail at Sep 13, 2011 at 11:48 am

    On Mon, Sep 12, 2011 at 12:28:47PM -0400, Jesse Vincent wrote:

    - New versions of Perl 5 should not break your existing software
    - Backward compatibility must not stop Perl 5 from evolving
    - From 'use v5.16' forward, Perl should start treating 'use v5.x'
    statements as "try to give me a Perl that looks like v5.x" rather
    than "give me at least v5.x"
    - We're awesome at modules. Where possible, we should be
    modularizing core features.
    In general, yes, yes, yes. (I've seen Jesse's talk twice, and we've
    talked about it as well, so it isn't all new to me).

    I would, however, like to make a note about the new meaning of "v5.x". At
    first, my reaction was "yes, that's very logical, it's got to be better
    than what we have now".

    Then I wrote a module that does this:

    use 5.006;

    sub match_all {
        my ($subject, $pattern) = @_;

        our @matches = ();

        my $pat = ref $pattern ? $pattern : qr /$pattern/;

        use re 'eval';

        $subject =~
            /(?-x:$pat)
             (?{ push @matches =>
                 [map {substr $subject, $- [$_], $+ [$_] - $- [$_]} 1 .. $#-] })
             (*FAIL)
            /x;

        wantarray ? @matches : [@matches];   # Must copy.
    }


    It takes a pattern with captures, and returns every possible match.

    It has "use 5.006;" there. It really means, "give me 5.006 or newer".

    In Jesse's scheme, from 5.16 onwards, this is going to mean "5.14
    semantics". But if this is run on 5.20, it shouldn't restrict itself
    to patterns with 5.14 semantics. If it's run on 5.20, it should also
    work on regexp constructs that were introduced in 5.18.



    So, the deeper issue here is that if "use 5.x" is going to mean "run with
    5.x semantics", you may end up with different parts of a program that
    use different semantics for the same code, and interpret the same data
    differently - just because one module uses "5.x" and the other "5.y".
    (For instance, do we really want a program to use two modules, where
    one module uses Unicode X.0 semantics, and another module Unicode
    Y.0 semantics, possibly giving different results on "lc $str" for the
    same $str?)


    Now, this doesn't mean I'm against changing the meaning of "use 5.x". In
    general, I think it's a good idea, and it will help Perl move forward.
    But I think there are some issues that need to be addressed, and for
    which I, at this moment, don't have a solution.


    Regards,


    Abigail
  • H.Merijn Brand at Sep 13, 2011 at 12:11 pm

    On Tue, 13 Sep 2011 13:48:41 +0200, Abigail wrote:
    On Mon, Sep 12, 2011 at 12:28:47PM -0400, Jesse Vincent wrote:
    [...]
    Now, this doesn't mean I'm against changing the meaning of "use 5.x". In
    general, I think it's a good idea, and it will help Perl move forward.
    But I think there are some issues that need to be addressed, and for
    which I, at this moment, don't have a solution.
    I agree with your sentiment.

    I suggested

    use ge5.008004; # use any perl version >= 5.8.4
    use v5.10.1-5.12.3; # use >= 5.10.1 but <= 5.12.3

    and the like.

    I have always used 'use 5.008004;' with the *intention* to mean 5.8.4
    *OR NEWER*, as 5.8.4 fixed a bug that would crash my script on older
    versions. I do not want to update that script if 5.38.2 introduces some
    conflicting syntax: I'll fix the script to be compatible and *still*
    require 5.8.4 as a minimum version.

    --
    H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/
    using 5.00307 through 5.14 and porting perl5.15.x on HP-UX 10.20, 11.00,
    11.11, 11.23 and 11.31, OpenSuSE 10.1, 11.0 .. 11.4 and AIX 5.2 and 5.3.
    http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/
    http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/
  • David Golden at Sep 13, 2011 at 1:04 pm

    On Tue, Sep 13, 2011 at 7:48 AM, Abigail wrote:
    In Jesse's scheme, from 5.16 onwards, this is going to mean "5.14
    semantics".  But if this is run on 5.20, it shouldn't restrict itself
    to patterns with 5.14 semantics. If it's run on 5.20, it should also
    work on regexp constructs that were introduced in 5.18.
    I'm very glad that Jesse put forth this ambitious vision, and at the
    same time, I worry that the devil is in the details and that we might
    find it hard to set expectations. Here's another simple example
    inspired by yours:

    {
        use v5.20;
        sub match_it {
            my ($string, $pattern) = @_;
            return $string =~ /$pattern/;
        }
    }
    {
        use v5.16;
        match_it($string, qr/$pattern/);   # case A
        match_it($string, $pattern);       # case B
    }

    Assume that $pattern is a string. In case A, the regex is compiled
    under 5.16 semantics but matched in a scope with 5.20 semantics. In
    case B, the pattern is passed into a scope with 5.20 semantics and
    then compiled and matched there.

    Should semantics be bound to where a regex is compiled or to where it
    is matched? What happens when regexes are combined?

    Imagine:

    {
        use v5.20;
        sub match_it {
            my ($string, $pattern) = @_;
            my $new_pattern = qr/$prefix$pattern/;   # case C
            return $string =~ /$new_pattern/;
        }
    }

    In case C, we are creating a new compiled regex under 5.20 semantics,
    which could contain a compiled regex from 5.16 semantics. Do we
    expect the pattern to preserve 5.16 semantics for some portion of the
    match or does the recompilation change the semantics of $pattern to
    5.20 semantics?

    Either way, this strikes me as having the potential to be insanely confusing.

    A similar (if less convoluted) issue applies to any code compiled into
    existence via string eval.

    This all makes me wonder if the backwards semantics promise needs to
    be more carefully scoped (no pun intended) to a more manageable set of
    behaviors. For example, I could see limiting the guarantee to syntax
    -- thus ensuring that old code still *compiles* on new Perl, but not
    promising that it would have the exact same behaviors. That's not
    sufficient (e.g. promising the same layers on filehandles is probably
    necessary) -- but it would be a start.

    I think it might be easier (wiser?) to explicitly include things into
    the promise as they seem feasible, rather than make a blanket promise
    and then give exceptions.

    -- David
  • Zefram at Sep 13, 2011 at 1:07 pm

    David Golden wrote:
    Should semantics be bound to where a regex is compiled or to where it
    is matched?
    Compiled. That's the only sane way, and we've already answered that
    with respect to the /dual-related pragmata.
    What happens when regexes are combined?
    Each segment should retain its original semantics.
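
    That's also how flags on compiled regexes behave today when they're
    interpolated into a larger pattern; a small illustration:

        my $inner = qr/foo bar/x;       # /x belongs to this segment: whitespace ignored
        my $outer = qr/^$inner$/;       # enclosing pattern compiled without /x
        print "matched\n" if "foobar" =~ $outer;   # inner segment keeps its /x semantics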

    -zefram
  • Abigail at Sep 13, 2011 at 1:21 pm

    On Tue, Sep 13, 2011 at 09:04:12AM -0400, David Golden wrote:
    On Tue, Sep 13, 2011 at 7:48 AM, Abigail wrote:
    In Jesse's scheme, from 5.16 onwards, this is going to mean "5.14
    semantics".  But if this is run on 5.20, it shouldn't restrict itself
    to patterns with 5.14 semantics. If it's run on 5.20, it should also
    work on regexp constructs that were introduced in 5.18.
    I'm very glad that Jesse put forth this ambitious vision, and at the
    same time, I worry that the devil is in the details and that we might
    find it hard to set expectations. Here's another simple example
    inspired by yours:

    Here's an even simpler example:


    use 5.18;
    sub mylc {
    lc $_ [0];
    }


    Normally, I would document that as "mylc returns the lowercase of its
    argument".

    But that should now be documented as "mylc returns the 5.18 lowercase
    version of its argument". After all, lc may change semantics in the
    future. (Of course, I'm just using 'lc' as an example here; it could
    be any other expression). And that would require people to not only
    know the semantics of the version of Perl they are using - but also the
    semantics of all versions since 5.14 up to the one they are working with.



    Abigail
  • Chris Prather at Sep 13, 2011 at 5:47 pm

    On Tue, Sep 13, 2011 at 9:21 AM, Abigail wrote:
    On Tue, Sep 13, 2011 at 09:04:12AM -0400, David Golden wrote:
    On Tue, Sep 13, 2011 at 7:48 AM, Abigail wrote:
    In Jesse's scheme, from 5.16 onwards, this is going to mean "5.14
    semantics".  But if this is run on 5.20, it shouldn't restrict itself
    to patterns with 5.14 semantics. If it's run on 5.20, it should also
    work on regexp constructs that were introduced in 5.18.
    I'm very glad that Jesse put forth this ambitious vision, and at the
    same time, I worry that the devil is in the details and that we might
    find it hard to set expectations.  Here's another simple example
    inspired by yours:

    Here's an even simpler example:


    use 5.18;
    sub mylc {
    lc $_ [0];
    }


    Normally, I would document that as "mylc returns the lowercase of its
    argument".

    But that should now be documented as "mylc returns the 5.18 lowercase
    version of its argument". After all, lc may change semantics in the
    future. (Of course, I'm just using 'lc' as an example here; it could
    be any other expression). And that would require people to not only
    know the semantics of the version of Perl they are using - but also the
    semantics of all versions since 5.14 up to the one they are working with.
    You don't actually escape this problem now. Having worked on legacy
    code, I had to learn the semantics of all of the perl versions
    that the company's code had run under, because some
    things were subtly different across versions. Luckily in my case the
    upgrades were big jumps (Perl 4, Perl 5.6.1, Perl 5.8.8)
    because they upgraded the production perl infrequently, but I did get
    bit by things like a change in split() around 5.8.1 that slowed it
    down about 10%.

    At least with the proposed syntax there should be a flag stating what
    semantics are *expected* so the new guy can go and research what they
    were. Trying to track down that split() thing was a nightmare because
    I had no clue where to start back then.

    It won't be perfect, but it should be better than what we have now.

    -Chris
  • Johan Vromans at Sep 13, 2011 at 6:33 pm

    Abigail writes:

    Here's an even simpler example:

    use 5.18;
    It would be nice if this would work as of 5.16...
    (Note the absence of a smiley.)
    Normally, I would document that as "mylc returns the lowercase of its
    argument".
    That would be:

    sub mylc { lc $_[0] }

    However:

    use v5.18;
    sub mylc { lc $_[0] }
    But that should now be documented as "mylc returns the 5.18 lowercase
    version of its argument".
    Because you *explicitly* require the v5.18 semantics.

    More contrived:

    sub mylc { lc $_[0] }

    {   use v5.16;
        sub mylc_5016 { lc $_[0] }
    }
    {   use v5.18;
        sub mylc_5018 { lc $_[0] }
    }

    Actually, I see more significant problems with Perl trying to carry all
    old semantics into all new versions. You cannot simply make a small
    change to a built-in; you have to make a backup copy and keep it just in
    case. This is a potential maintenance nightmare.

    -- Johan
  • Dave Mitchell at Sep 14, 2011 at 10:36 am

    On Tue, Sep 13, 2011 at 08:33:09PM +0200, Johan Vromans wrote:
    Actually, I see more significant problems with Perl trying to carry all
    old semantics into all new versions. You cannot simply make a small
    change to a built-in; you have to make a backup copy and keep it just in
    case. This is a potential maintenance nightmare.
    +10

    I think the devil is in the detail. If, while running under 5.20,

    use v5.16;

    is just about equivalent to

    no feature 'list of features added in 5.18, 5.20';

    then I'd be happy with that.

    If it's supposed to mean "this code will run in exactly the same way as if
    you have just applied the perl-5.16.0 executable against it", then I will
    run away screaming.

    So... do we backwardly support bug fixes where the bug fix caused a
    visible change in behaviour? Even where the old behaviour was clearly wrong
    and went against the documented behaviour?

    Or... in my current work to fix the mess that is /(?{...})/, should
    'use v5.14' cause this code:

    /(?{ warn "boo" })/

    to output the old-style message

    boo at (re_eval 1) line 1.

    rather than the new style

    boo at file line N.

    (where the change in message is an artifact of the fact that (?{...}) is
    parsed in a completely different way now).

    Etc etc...


    --
    But Pity stayed his hand. "It's a pity I've run out of bullets",
    he thought. -- "Bored of the Rings"
  • David Golden at Sep 14, 2011 at 10:53 am

    On Wed, Sep 14, 2011 at 6:28 AM, Dave Mitchell wrote:
    I think the devil is in the detail. If, while running under 5.20,

    use v5.16;

    is just about equivalent to

    no feature 'list of features added in 5.18, 5.20';

    then I'd be happy with that.

    If it's supposed to mean "this code will run in exactly the same way as if
    you have just applied the perl-5.16.0 executable against it", then I will
    run away screaming.
    This is why I suggested a middle ground of "if it compiled under
    v5.16, it will compile here" -- but not guaranteeing complete behavior
    compatibility. That gives us freedom to fix broken things.

    To use your expression, that's "no features added since ..." plus
    "substitutes for things deprecated/removed since..." and possibly some
    other similar stuff. (E.g., if we deprecated and then removed having
    keys/values/each act on arrays, we'd need to import some replacement
    keywords that still did that with correct prototypes.)
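
    A hedged sketch of what one such replacement could look like -- the module
    and function names here are invented purely for illustration:

        package Legacy::ArrayKeys;     # hypothetical name
        use strict;
        use warnings;
        use Exporter 'import';
        our @EXPORT_OK = ('akeys');

        # Prototype (\@) lets callers keep writing "akeys @array",
        # much as they used to write "keys @array".
        sub akeys (\@) {
            my ($aref) = @_;
            return 0 .. $#{$aref};     # keys on an array yielded its indices
        }

        1;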

    How do you feel about the feasibility/maintainability of that narrower
    definition?

    -- David
  • Dave Rolsky at Sep 14, 2011 at 2:36 pm

    On Wed, 14 Sep 2011, Dave Mitchell wrote:

    I think the devil is in the detail. If, while running under 5.20,

    use v5.16;

    is just about equivalent to

    no feature 'list of features added in 5.18, 5.20';
    I think that realistically, it will have to be somewhere between this and
    "acts exactly like 5.16 in all ways".

    I think each backwards incompatible change will probably need to be
    considered on its own as a candidate for this policy.

    I think there are several categories of backwards incompatible changes.

    For bug fixes, I think we may just ask people to bite the bullet and
    accept the change. It seems unrealistic for anyone to expect us to provide
    infinite bugwards compatibility.

    For additions of new features with new keywords, this use line will block
    the new keywords from being usable in that scope. This seems fairly easy,
    for some value of "fairly" and "easy" which I can't speak to, since I've
    never hacked on the core C code ;)

    The real question is when we change the behavior of existing features. The
    smartmatch operator is a good example. It seems that p5p widely agrees
    that the existing behavior is a mess and needs to be cleaned up. However,
    this behavior has been documented for a number of versions, so preserving
    the old behavior as an option seems like a great idea, if we can do it
    (and apparently we can).

    What about things like updating the Unicode database? That's not a bug
    fix. But does it really make sense to ship every version of Unicode that
    we've ever documented as supported? Isn't getting a new version of this
    database one of the reasons to upgrade? I think this is where things get
    sticky.

    I also wonder if there will be a sunset on maintenance of these old
    features. At what point might we consider removing a smartmatch
    implementation from core? How many implementations would we be willing to
    maintain?


    -dave

    /*============================================================
    http://VegGuide.org http://blog.urth.org
    Your guide to all that's veg House Absolute(ly Pointless)
    ============================================================*/
  • Johan Vromans at Sep 14, 2011 at 2:58 pm

    Dave Rolsky writes:

    For additions of new features with new keywords, this use line will
    block the new keywords from being usable in that scope. This seems
    fairly easy, for some value of "fairly" and "easy" which I can't speak
    to, since I've never hacked on the core C code ;)
    I've always wondered why the new keywords have not been added as "weak
    keywords" (like "lock"). This would have avoided the need for a
    feature pragma.

    -- Johan
  • Rafael Garcia-Suarez at Sep 17, 2011 at 5:12 pm

    On 14 September 2011 16:58, Johan Vromans wrote:
    Dave Rolsky <autarch@urth.org> writes:
    For additions of new features with new keywords, this use line will
    block the new keywords from being usable in that scope. This seems
    fairly easy, for some value of "fairly" and "easy" which I can't speak
    to, since I've never hacked on the core C code ;)
    I've always wondered why the new keywords have not been added as "weak
    keywords" (like "lock"). This would have avoided the need for a
    feature pragma.
    In my mind that was to avoid effects at a distance: if you upgrade a
    module that changes its default export list to contain a sub named
    like a weak keyword you're using, your program's semantics change under
    you.
  • Karl Williamson at Sep 25, 2011 at 8:26 pm

    On 09/17/2011 11:12 AM, Rafael Garcia-Suarez wrote:
    On 14 September 2011 16:58, Johan Vromanswrote:
    Dave Rolsky<autarch@urth.org> writes:
    For additions of new features with new keywords, this use line will
    block the new keywords from being usable in that scope. This seems
    fairly easy, for some value of "fairly" and "easy" which I can't speak
    to, since I've never hacked on the core C code ;)
    I've always wondered why the new keywords have not been added as "weak
    keywords" (like "lock"). This would have avoided the need for a
    feature pragma.
    In my mind that was to avoid effects at a distance: if you upgrade a
    module that changes its default export list to contain a sub named
    like a weak keyword you're using, your program's semantics change under
    you.
    Does this mean this is an issue for 'lock' that should be documented? I
    don't see any other weak keywords mentioned in perlfunc. Are there any
    for which this should be documented?
  • Aristotle Pagaltzis at Sep 14, 2011 at 4:16 pm

    * Abigail [2011-09-13 15:25]:
    Here's an even simpler example:

    use 5.18;
    sub mylc {
    lc $_ [0];
    }

    Normally, I would document that as "mylc returns the lowercase
    of its argument".

    But that should now be documented as "mylc returns the 5.18
    lowercase version of its argument". After all, lc may change
    semantics in the future. (Of course, I'm just using 'lc' as an
    example here; it could be any other expression). And that would
    require people to not only know the semantics of the version of
    Perl they are using - but also the semantics of all versions
    since 5.14 up to the one they are working with.
    * Dave Mitchell [2011-09-14 12:40]:
    I think the devil is in the detail. If, while running under 5.20,

    use v5.16;

    is just about equivalent to

    no feature 'list of features added in 5.18, 5.20';

    then I'd be happy with that.
    This is how I read Jesse’s proposal. However, his slides are clearer
    on that than the prose version he sent to p5p. See slides 127 ff.

    If you declare an old version, you get old syntax and semantics
    …at least to the best of our abilities
    Perfection is not possible
    We can get far closer than we do now
    Breaking existing code should be a last resort
    In limited circumstances we will break backward compatibility
    Some craziness can’t be fixed in an “optional” or lexical way

    The last slide is the key here. The way that reads to me is that Jesse
    means basically what you said. That would address Abigail’s concern.
    Unless a `feature` changes the behaviour of `lc` in the meantime. I think
    those things should be minted with great care.

    But that brings me to another point.

    I think Abigail’s issue arises because there is a conflation here.

    It is one thing to say

    Run this program with semantics as close to 5.16 as possible

    and quite another to say

    Parse this code the way 5.16 would have

    Think about it. If a script was written to run under 5.16, you want to
    keep all of it working as much like it did on 5.16 as possible, even when
    it loads modules written so they could also run on 5.42.

    If a program says it wants 5.16 then the operational semantics for
    all the code it loads should be 5.16, even if parts of the code
    were written to be compatible with 5.42.

    It is perfectly acceptable and sane to mix together different pieces
    of code that expect to be parsed under different rules. I do not
    see how it can be sane to mix together different units of compiled
    code that have different semantics for different operations.

    If you look at it that way, then the issue Abigail raised vanishes.

    So I think there needs to be a separation between these meanings.

    The first idea I had was that `use v5.16` should mean something else at the
    top-most scope of the program than it means in subordinate scopes.
    I am reminded of how Perl 6 draws a distinction between modules and
    programs.

    However, I’m not sure it’s a good idea to use the same mechanism for
    both cases while switching its meaning. We may need a mechanism other
    than `use $VERSION` for that bit.

    Regards,
    --
    Aristotle Pagaltzis // <http://plasmasturm.org/>
  • Jesse Vincent at Sep 29, 2011 at 3:49 am
    On Sep 13, 2011, at 8:04 AM, David Golden wrote:


    This all makes me wonder if the backwards semantics promise needs to
    be more carefully scoped (no pun intended) to a more manageable set of
    behaviors. For example, I could see limiting the guarantee to syntax
    -- thus ensuring that old code still *compiles* on new Perl, but not
    promising that it would have the exact same behaviors. That's not
    sufficient (e.g. promising the same layers on filehandles is probably
    necessary) -- but it would be a start.

    I think it might be easier (wiser?) to explicitly include things into
    the promise as they seem feasible, rather than make a blanket promise
    and then give exceptions.
    Hrm. To me, that feels a lot like what we've already been doing.
    -- David
  • David Golden at Sep 29, 2011 at 11:00 am

    On Wed, Sep 28, 2011 at 10:26 PM, Jesse Vincent wrote:
    I think it might be easier (wiser?) to explicitly include things into
    the promise as they seem feasible, rather than make a blanket promise
    and then give exceptions.
    Hrm. To me, that feels a lot like what we've already been doing.
    The difference is that this would be only true under a version
    stricture: "If you say C<use v5.16>, then if your code compiled under
    v5.16, it will continue to compile under any subsequent Perl."

    That doesn't make any behavior guarantees and thus I suspect the
    number/magnitude of exceptions will be fewer/less.

    -- David
  • David Nicol at Oct 6, 2011 at 10:13 pm
    On Wed, Sep 28, 2011 at 9:26 PM, Jesse Vincent wrote:
    On Sep 13, 2011, at 8:04 AM, David Golden wrote:


    This all makes me wonder if the backwards semantics promise needs to
    Hrm. To me, that feels a lot like what we've already been doing.
    so all he's asking for is documentation of How It Is Already. Pass the
    hubris!
  • Karl Williamson at Sep 13, 2011 at 7:23 pm

    On 09/12/2011 10:28 AM, Jesse Vincent wrote:
    If there is no "use v5.xx" line at the top of the code, the runtime
    should act as it did on v5.14 without a use v5.14 line.
    Does that mean that without a 'use' line that the Unicode version will
    be the one that is in 5.14?
  • Jesse Vincent at Sep 29, 2011 at 3:49 am
    [I'm desperately behind on mail and working through my backlog as fast as I can. There are many very, very important issues raised by the 25 messages in this thread still in my inbox.]
    On Sep 13, 2011, at 2:23 PM, Karl Williamson wrote:
    On 09/12/2011 10:28 AM, Jesse Vincent wrote:
    If there is no "use v5.xx" line at the top of the code, the runtime
    should act as it did on v5.14 without a use v5.14 line.
    Does that mean that without a 'use' line that the Unicode version will be the one that is in 5.14?
    I did give myself that escape hatch of "wherever possible" for this stuff. I'm given to understand that the work to get Perl to support multiple implementations of Unicode in the same runtime is probably further toward the "not" end of the "wherever possible" spectrum and that this would likely be a case where we'd be best served by relying on the Unicode Consortium's own backward-compatibility promises. I don't know that it's worth spending a lot of time exploring the nuances of this one until we have an upgraded version of Unicode that breaks something new and exciting that we can't possibly live without.

    -Jesse
  • Ævar Arnfjörð Bjarmason at Sep 13, 2011 at 8:48 pm
    I agree with all the backward compatibility *goals*, but I think redefining
    "v5.x" to mean "exactly v5.x" instead of "equal or greater than v5.x" is a
    bit contrary to that aim.

    Since exact version semantics are a new feature of the core, we could
    preserve backward compatibility by having some new syntax like "use =v5.18"
    instead of changing what it's always meant (i.e. ">=v5.18").
  • Jesse Vincent at Sep 13, 2011 at 9:07 pm

    On Sep 13, 2011, at 4:48 PM, Ævar Arnfjörð Bjarmason wrote:

    I agree with all the backward compatibility *goals*, but I think redefining "v5.x" to mean "exactly v5.x" instead of "equal or greater than v5.x" is a bit contrary to that aim.
    The point of the new semantics for "use v5.x" is to insulate code from changes to Perl's defaults. Keeping the existing semantics ensures that programs will break as we make changes to Perl's defaults.

    We should certainly have some way for developers who want their code to get "the running Perl's semantics, whether it's 5.16 or 5.36" to hurt themselves if they want to, but it really should not be the default.
  • John Imrie at Sep 13, 2011 at 11:30 pm

    On 13/09/2011 22:07, Jesse Vincent wrote:
    On Sep 13, 2011, at 4:48 PM, Ævar Arnfjörð Bjarmason wrote:

    I agree with all the backward compatibility *goals*, but I think
    redefining "v5.x" to mean "exactly v5.x" instead of "equal or greater
    than v5.x" is a bit contrary to that aim.
    The point of the new semantics for "use v5.x" is to insulate code from
    changes to Perl's defaults. Keeping the existing semantics ensures
    that programs will break as we make changes to Perl's defaults.

    We should certainly have some way for developers who want their code
    to get "the running Perl's semantics, whether it's 5.16 or 5.36" to
    hurt themselves if they want to, but it really should not be the default.
    Would one of the following pragmas work?

    use latest;
    use version::latest;
    use latest::version;

    Then we have

    use 5.x where x < 16  := use at least version 5.x

    use 5.x where x >= 16 := use exactly version 5.x, with all the semantics
    of that version -- except I'd like to propose it really means the highest
    version of 5.x (5.x.00 to 5.x.99) that the interpreter knows about

    use latest := use the semantics of the interpreter's version

    This gives us the opportunity to do something like

    use latest;
    my @features = latest::features;
    die "no given/when" unless grep {$_ eq 'switch'} @features

    John
  • John Imrie at Sep 14, 2011 at 12:23 am

    On 14/09/2011 00:30, John Imrie wrote:
    Would one of the following pragmas work?

    use latest;
    use version::latest;
    use latest::version;

    Then we have

    use 5.x where x < 16  := use at least version 5.x

    use 5.x where x >= 16 := use exactly version 5.x, with all the
    semantics of that version -- except I'd like to propose it really means
    the highest version of 5.x (5.x.00 to 5.x.99) that the interpreter knows about

    use latest := use the semantics of the interpreter's version

    This gives us the opportunity to do something like

    use latest;
    my @features = latest::features;
    die "no given/when" unless grep {$_ eq 'switch'} @features

    John
    I had another thought having sent the above.

    I'd like to see a class method 'feature::where()'

    With no parameters it returns a hash keyed on feature name. Each value
    in the hash is a hash ref keyed on the function name/keyword the feature
    provides and the value is the fully qualified function name. With a
    parameter it returns a hash ref keyed on the function names/keywords
    that that feature provides.

    e.g. for 5.10:

    %features = feature::where();

    %features contains

    (
        say    => { say   => 'CORE::say'   },
        state  => { state => 'CORE::state' },
        switch => {
            given => 'CORE::given',
            when  => 'CORE::when',
        },
    )

    Some future version of Perl may return

    (
        say    => { say   => 'Implement::Say::say'  },
        state  => { state => 'Classic::Perl::state' },
        switch => {
            given => 'CORE::given',
            when  => 'CORE::when',
        },
    )

    The innermost hash should be magical so you can do

    $features{say}{say}->("This is from feature say, no matter where it's implemented");

    or

    feature::where('say')->{say}->("This is from feature say, no matter where it's implemented");

    This could be used internally to move existing keywords out of the core
    without unduly affecting existing code.

    e.g. A new feature called 'sysV' could be introduced to handle all the
    System V keywords, and this mechanism could then be used to move them
    selectively out of the core in future versions without having to modify
    existing scripts more than ensuring that the sysV feature is activated.

    John
  • Aristotle Pagaltzis at Sep 14, 2011 at 2:49 pm

    * John Imrie [2011-09-14 02:25]:
    The inner most hash should be magical so you can do
    $features{say}{say}->('This is from feature say, no matter
    where it's implemented');
    What do you do with the `unicode_strings` feature?

    What about features that enable changes in parsing?

    What if we get some sort of coroutines or CPS in core, or some
    other change in fundamental semantics somewhere?

    This proposal will and can only work for the simplest kind of
    feature.


    * John Imrie [2011-09-14 01:35]:
    use latest;
    my @features = latest::features;
    die "no given/when" unless grep {$_ eq 'switch'} @features
    The following would work just as well:

    use latest;
    use feature 'switch';

    I think that’s a good idea though. Having the option to probe for
    a set of features (and with `no feature`, for the absence of
    certain features) instead of only for a particular interpreter
    version would be nice.


    Regards,
    --
    Aristotle Pagaltzis // <http://plasmasturm.org/>
  • John Imrie at Sep 14, 2011 at 7:06 pm

    On 14/09/2011 15:49, Aristotle Pagaltzis wrote:
    * John Imrie[2011-09-14 02:25]:
    The inner most hash should be magical so you can do
    $features{say}{say}->('This is from feature say, no matter
    where it's implemented');
    What do you do with the `unicode_strings` feature?

    What about features that enable changes in parsing?

    What if we get some sort of coroutines or CPS in core, or some
    other change in fundamental semantics somewhere?

    This proposal will and can only work for the simplest kind of
    feature.
    You're right, this only deals with the keyword-producing features. I was
    writing this about 1am local time, so was not thinking at my best. The
    reason I came up with it was that it tied into another thread that was
    worried about how to split stuff out of the core, and I saw this and the
    feature::where() function as a way to seamlessly move stuff out of the core.
    * John Imrie[2011-09-14 01:35]:
    use latest;
    my @features = latest::features;
    die "no given/when" unless grep {$_ eq 'switch'} @features
    The following would work just as well:

    use latest;
    use feature 'switch';
    That would only work if you could guarantee that the Perl binary 'latest'
    actually implemented feature 'switch'.
    With my code you could have your own module or probe for a compatible
    feature at runtime.

    John
  • H.Merijn Brand at Oct 6, 2011 at 2:20 pm

    On Tue, 13 Sep 2011 17:07:23 -0400, Jesse Vincent wrote:
    On Sep 13, 2011, at 4:48 PM, Ævar Arnfjörð Bjarmason wrote:

    I agree with all the backward compatibility *goals*, but I think
    redefining "v5.x" to mean "exactly v5.x" instead of "equal or greater
    than v5.x" is a bit contrary to that aim.
    The point of the new semantics for "use v5.x" is to insulate code from
    changes to Perl's defaults. Keeping the existing semantics ensures
    that programs will break as we make changes to Perl's defaults.

    We should certainly have some way for developers who want their code
    to get "the running Perl's semantics, whether it's 5.16 or 5.36" to
    hurt themselves if they want to, but it really should not be the
    default.
    Will there be any difference between

    require 5.010;

    and

    use 5.010;

    ?

    If the require means "5.10.0 or newer" and the use means what has been
    discussed everywhere, I can live with any decision made :)

    --
    H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/
    using 5.00307 through 5.14 and porting perl5.15.x on HP-UX 10.20, 11.00,
    11.11, 11.23 and 11.31, OpenSuSE 10.1, 11.0 .. 11.4 and AIX 5.2 and 5.3.
    http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/
    http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/
  • Tom Christiansen at Oct 6, 2011 at 2:46 pm
    "H.Merijn Brand" <h.m.brand@xs4all.nl> wrote
    on Thu, 06 Oct 2011 16:20:03 +0200:
    Will there be any difference between
    require 5.010;
    and
    use 5.010;
    ?
    If the require means "5.10.0 or newer" and the use means what has been
    discussed everywhere, I can live with any decision made :)
    There already is a big difference between use and require with a version number.

    % perl -e 'use 5.10.0; say what'
    vs
    % perl -e 'require 5.10.0; say what'
    Can't locate object method "say" via package "what" (perhaps you forgot to load "what"?) at -e line 1.

    or

    % perl -e 'use 5.10.0; say "what"'
    what
    vs
    % perl -e 'require 5.10.0; say "what"'
    String found where operator expected at -e line 1, near "say "what""
    (Do you need to predeclare say?)
    syntax error at -e line 1, near "say "what""
    Execution of -e aborted due to compilation errors.

    Which is somewhat curious.

    --tom
  • Chris Prather at Oct 6, 2011 at 4:57 pm

    On Thu, Oct 6, 2011 at 10:46 AM, Tom Christiansen wrote:
    "H.Merijn Brand" <h.m.brand@xs4all.nl> wrote
    on Thu, 06 Oct 2011 16:20:03 +0200:
    Will there be any difference between
    require 5.010;
    and
    use 5.010;
    ?
    If the require means "5.10.0 or newer" and the use means what has been
    discussed everywhere, I can live with any decision made :)
    There already is a big difference between use/require version number.

    % perl -e 'use 5.10.0; say what'
    vs
    % perl -e 'require 5.10.0; say what'
    Can't locate object method "say" via package "what" (perhaps you forgot to load "what"?) at -e line 1.

    or

    % perl -e 'use 5.10.0; say "what"'
    what
    vs
    % perl -e 'require 5.10.0; say "what"'
    String found where operator expected at -e line 1, near "say "what""
    (Do you need to predeclare say?)
    syntax error at -e line 1, near "say "what""
    Execution of -e aborted due to compilation errors.

    Which is somewhat curious.
    I don't find it curious, because `use Foo;` and `require Foo;` do
    different things. The first is documented as `BEGIN { require Foo;
    Foo->import() }`. I would expect `use 5.10.0` to be equivalent to `BEGIN {
    require 5.10.0; 5.10.0->import }`, which, if you handwave feature.pm in
    a bit, gives you `BEGIN { require 5.10.0; require feature;
    feature->import(":5.10"); }`.

    Jesse's plan to me has always been making versions roughly equivalent
    to pragmas, that is loadable modules that alter the behavior of the
    core.

    -Chris
  • Zsbán Ambrus at Sep 18, 2011 at 6:23 pm

    On Mon, Sep 12, 2011 at 6:28 PM, Jesse Vincent wrote:
    If a core feature (syntactic or semantic) is removed, a "use v5.xx"
    declaration for a earlier version of Perl 5 should re-enable the feature.
    I wonder, if this is the case, whether there should be a way for a module to
    declare "this code requires perl 5.xx or later (some of my customers
    still haven't upgraded from 5.xx), but I am really developing for a
    newer perl 5.yy, so don't try to emulate any backwards compatibility
    features before that, please".

    Ambrus
  • Nicholas Clark at Sep 19, 2011 at 10:50 am
    tl;dr:

    I like the plan. But the devil will be in the details.

    It's a complex trade off between short, medium and long term
    maintainability, and I don't think that this approach has been taken
    by any comparable project.
    On Tue, Sep 13, 2011 at 01:23:17PM -0600, Karl Williamson wrote:
    On 09/12/2011 10:28 AM, Jesse Vincent wrote:
    If there is no "use v5.xx" line at the top of the code, the runtime
    should act as it did on v5.14 without a use v5.14 line.
    Does that mean that without a 'use' line that the Unicode version will
    be the one that is in 5.14?
    Based on the default of

    If there is no "use v5.xx" line at the top of the code, the runtime
    should act as it did on v5.14 without a use v5.14 line.

    then yes, I'm also assuming that the intent is that "runtime should act"
    also means that the behaviour of Unicode should be the same.


    There is a further section

    * New versions of Perl 5 should not break your existing software
    * Backward compatibility must not stop Perl 5 from evolving

    Pay particular attention to "should" and "must" there.

    On Wed, Sep 14, 2011 at 09:36:17AM -0500, Dave Rolsky wrote:
    On Wed, 14 Sep 2011, Dave Mitchell wrote:

    I think the devil is in the detail. If, while running under 5.20,

    use v5.16;

    is just about equivalent to

    no feature 'list of features added in 5.18, 5.20';
    I think that realistically, it will have to be somewhere between this and
    "acts exactly like 5.16 in all ways".

    I think each backwards incompatible change will probably need to be
    considered on its own as a candidate for this policy.

    I think there are several categories of backwards incompatible changes.

    For bug fixes, I think we may just ask people to bite the bullet and
    accept the change. It seems unrealistic for anyone to expect us to provide
    infinite bugwards compatibility.
    I don't think we get *any* meaningful choice on this.

    The aim is to be able to slim the core distribution, and make [our]
    maintenance easier. But everything we permit to have divergent
    implementations causes growth and makes [our] code maintenance harder.

    Our size and effort scale roughly linearly with "features".
    [as in http://www.google.com/search?q=bug+feature&tbm=isch ]

    So to stand still on size and complexity we have to run very hard on
    finding other savings. That's before we consider our ability to actually
    add things.
    What about things like updating the Unicode database? That's not a bug
    fix. But does it really make sense to ship every version of Unicode that
    we've ever documented as supported? Isn't getting a new version of this
    database one of the reasons to upgrade? I think this is where things get
    sticky.
    In theory we can. In practice, I don't think we can.
    We don't properly have *one* Unicode implementation working yet.
    We don't have the infrastructure to support more than one - we'd have to
    write it. It would definitely bring benefits, but relative to the effort
    needed to actually fix the real bugs we still have, it's certainly a
    distraction. I don't think that we have enough knowledge and time
    [even with money no object] to do both before v5.16, and without unlimited
    money, ever. So that means everyone gets the same Unicode version, which forces
    an either-or choice between "stick with what we have" and "upgrade everyone,
    bug-and-feature alike". This is one of those short/medium/long term trade
    offs.

    Short term, keeping v5.16 on Unicode version 6.0.0 isn't really a problem
    while we work out which approach is viable.

    Medium term, keeping v5.18 on Unicode version 6.0.0 is bad, but if it buys
    a long term of flexibility that's good. Medium term pain for long term gain.
    But if we don't think it's doable, then it's not worth accepting the
    indefinite and increasing medium-term pain of holding back the Unicode
    version, only to change our mind later, abandon the approach, and make a
    big jump. In that case we've denied people Unicode improvements in the
    short term, and produced the pain of a big jump even for people tracking
    yearly releases.

    Tom has been explaining Unicode's stability guarantees. I think that we
    are going to have to rely on these, and assume that changes that the
    Unicode Consortium themselves make to existing behaviour are bug fixes.
    I also wonder if there will be a sunset on maintenance of these old
    features. At what point might we consider removing a smartmatch
    implementation from core? How many implementations would we be willing to
    maintain?
    I don't think that we can credibly consider *removing* any implementation,
    without making a mockery of the whole *point* of the policy. The elevator
    pitch of the policy is that if your code says "use v5.18;" then it will
    keep working indefinitely.

    This means that code saying "use v5.18" is allowed to rely on v5.18
    features, and that (effectively) v5.20, v5.22 etc behave for it as new
    non-binary compatible stable releases. Which means that they *aren't*
    allowed to add new warnings. So how do we then decide to deprecate
    something, let alone remove it, if we aren't allowed new warnings?

    Even if we do allow new warnings in these stable releases, we'd effectively
    end up with having to maintain $n different forks of the language *in one
    codebase*, because each would need to be tracking which features are now
    deprecated in which subversions, and which have been removed.

    Which iteration of use v5.18 was sir asking for when sir typed that?


    And the elevator pitch gains the small print "oh, but not really. When we
    said indefinitely, and implied forever, actually we meant that in five to
    seven years we might remove *some* of the features you've been using. So
    you can't really rely on any of them"



    To make this work, I believe we have to think in timescales of 5 or 10 years
    of active changes to the language. Which means potentially 5 to 10 divergent
    implementations of some things. What's the cost? How do we support this?


    Questions I'm asking myself are:

    1) Are modules shipped in the core covered by the guarantee about what
    v5.18 means?

    For example, in RT #72506 I propose a change to a not-really-supportable
    corner case feature of warnings, which I don't think anyone is relying on.
    But if that's considered a feature, not a bug fix, does that mean that
    we need to start shipping $n+1 copies of warnings.pm?

    2) What happens about invasive changes to the internals?
    For example, Chip's proposals for minimal copying will be visible to some
    code (particularly XS code making too many assumptions).
    They can't be restricted lexically.

    His types proposal possibly *can*, but it makes it way more complex, and
    doing this might actually introduce more bugs (or at least surprises)
    than it solves. [For example, passing a data structure into some other
    code that currently acts as v5.14 and reads values [with caching] would
    mean keeping current flags behaviour. Outer code is written to expect
    this - that a value "becomes" string or numeric. Then that inner code is
    tweaked, and use v5.18 is added. At which point the flags behaviour has
    to change. Only this would be visible to any calling code that happened
    to be relying on it. Hence adding v5.18 alone in one place might break
    other code.]

    Does the desire to minimise "use v5.18" breakage mean that any such
    structural changes to the underlying VM have to go through the exception
    process? If so, that's going to stifle if not kill improvements.

    3) Are we going to assume that all undocumented, warning and error behaviour
    should stay the same?

    For example, Claes has worked on the todo item of accepting 0o42 as octal.
    This needs oct() to accept this format. Currently oct() only documents
    what it *does* accept as valid. So, the short-term-easy solution would be
    that oct() in the scope of v5.16 onwards accept 0o42, and earlier does
    not.

    But how is this implemented?
    As far as the C code goes, the obvious "clean" implementation takes about
    5 lines, adding a feature test.

    Except that this would be a test on the lexical scope of the caller.
    How does that fit with
    a) the desire to take references to builtins as if they are functions?
    b) the desire to be able to introspect builtins using %CORE:: ?

    The "clean" implementation doesn't fit sanely with being able to take a
    reference. It would provide a reference to a function whose behaviour
    changes depending on the calling scope. Whereas what's needed for sanity
    is for the function's behaviour to be consistent with the builtin's
    behaviour at the lexical scope where the reference is taken.
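
    To make that concrete (an entirely hypothetical sketch - neither 0o42
    support nor this particular \&CORE::oct behaviour exists yet):

    my $oct_ref;
    {
        use v5.16;                  # assume 0o42 is accepted in this scope
        $oct_ref = \&CORE::oct;     # reference taken in a 0o-aware scope
    }
    {
        use v5.14;
        oct("0o42");                # old behaviour here: the 0o prefix is not special
        $oct_ref->("0o42");         # but for sanity this should still behave as oct()
                                    # did where the reference was taken, i.e. return 34
    }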

    This makes for a more complicated implementation - does one copy the code
    for pp_oct out into a second place (and fix bugs in two places), have
    conditional code compiled twice, or store state with the function reference?

    Also, how does it work with introspection? Will there be both
    $CORE::{'oct'} and $CORE::{'oct516'}? Or will there just be $CORE::{'oct'},
    but the value that Perl space sees differs depending on lexical scope?


    On Mon, Sep 12, 2011 at 12:28:47PM -0400, Jesse Vincent wrote:

    Standing still is not an option. Perl's internals, syntax and semantics
    have seen some much-needed improvements in the past few years. There are
    many additional changes we can't make because they may damage too much
    legacy code.
    But I read this as you're saying that we should continue to make changes
    that improve the internals, the syntax and the semantics.

    Which I agree with. But for this plan to work, it needs to be sustainable.
    We need to still be able to do these things in 5 years' and 10 years' time.

    Which means I think we need to judge any changes on the basis of does this
    pay off in 5 years? in 10 years? ever?

    We only have finite effort available to us. Time we spend now refactoring
    delays bug fixing and other improvements. If time spent now discounted at
    10% per annum will never actually pay off, it's not worth doing.
    It is my strong preference that features granted R&R (removal and
    reinstatement) be implemented as modules, so as not to bloat the runtime
    when they're not needed. This isn't a pipe dream. Classic::Perl already
    does this for a few features removed in 5.10 and 5.12;

    If it's not possible to reinstate a feature we've removed with existing
    APIs, we'll need to look at the cost of removing the feature vs simply
    disabling it in the presence of a new-enough v5.xx declaration.

    But not everything can be fully implemented as a module. For example, the
    parser isn't pluggable. I don't know if it ever could be (fully), but it
    certainly isn't *yet*.

    Removing $[ meant that the parser code could actually get simpler. I can't
    find the figure, but I think that chromatic measured the proportion of
    the grammar needed to deal with the legacy 'do subroutine' syntax, and it
    was large enough to be justifiable as a simplification worth the effort of
    making, *if it's actually removed*. But we don't have a parser able to
    do that *and* have it be re-instated via a module. So if we wanted to
    remove it from the language with v5.16, we'd actually complicate the
    parser and increase the maintenance burden, because we'd add more
    conditional code to the core.

    So the R&R policy comes with a cost - it makes some subset of the
    deprecated features simply not worth removing, because it's now more
    costly than keeping them. I suspect that this is one.


    On the other hand, FORMATs keep being given as an example of something that
    many people would like not to be there. As the "default is what 5.14 did",
    FORMATs can't go away. Switching them off in the parser achieves most of the
    "visible language simplification" goals of not having FORMATs

    i) less to teach, less to understand
    ii) permitting re-use of $- as the start of syntax, instead of a scalar
    whilst saving the coding effort of re-implementing them

    But is actually removing them a good trade off?

    i) The existing implementation is pretty stable and doesn't get in the
    way of much
    ii) removing them means some mix of
    a) *adding* a lot of hooks to let the existing C code work "outside"
    the core
    b) re-implementing a little used feature in new code, along with the
    resulting cost of having (and fixing) all the bugs this creates

    So in this case, I think that it's worth the cost of making them
    conditionally disabled, but it's not worth the cost of trying to purge
    them from the core implementation, because over the future lifetime of
    Perl 5, I think that we'll spend more effort than we save.
    For cases where we _can't_ implement R&R, I think we need to move to a two
    year deprecation cycle, so as to have as minimal an impact as possible on
    users who are upgrading.
    Yes, this makes sense. Minimum 2 years, preferably longer.

    But I'm not sure how we communicate clearly "this deprecated feature is
    merely deprecated" vs "this deprecated feature is doomed". Particularly
    if we change our mind about what sort of deprecated something is.
    It's time for us to start extracting parts of what has traditionally been
    considered the "language" part of Perl 5 into CPANable modules. To do this
    successfully, it is imperative that _nothing_ appear out of the ordinary
    to code that expects to use those features. If code doesn't declare use
    v5.16, it should still get the 5.14ish environment it would expect.
    By which you mean that the policy should change. To date, new "things" have
    been allowed if previously they were syntax errors. Henceforth, no language
    changes, even those that are backwards compatible, should appear unless
    asked for?

    This does (mostly) remove the "problem" that one can write and test against
    a newer perl interpreter, and not realise that one is relying on a new
    feature. On balance, I think that this is better.

    However, it's never going to be a substitute for actually testing, as bugs
    will be fixed, and I don't think that all behaviour can be hidden. Likewise
    code written on the older interpreter and running under C<use v5.12;> etc is
    going to encounter "artifacts from the future". For example, if we're able
    to move from throwing core exceptions as strings to
    objects-that-stringify-the-old-way, then some existing code is going to spot
    the difference, and may change behaviour in surprising ways.
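
    A sketch of the kind of code that would spot it (the exception objects here
    are hypothetical - today $@ from a core error is a plain string):

    eval { nonexistent_function() };     # dies with a core error
    if (ref $@) {
        # code that uses ref($@) to distinguish "my exception objects" from
        # "core string errors" would suddenly start taking this branch
    }
    print "died: $@\n";                  # stringification could be kept identical,
                                         # but ref(), reftype() and exact string
                                         # comparisons of $@ would change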

    I think we'd be on a hiding to nothing having a policy that we explicitly
    hide examples of "future" such as this, because it will take progressively
    more effort, introduce bodges that make future improvements more costly, and
    eventually we'll hit something we can't conceal thoroughly, at which point
    what gives? The policy, or the improvement. So I don't think we should try
    to "guarantee" any "perfection" better than the level we've managed in the
    17 years to date.
    Once language features are modularized, it also becomes _possible_ to
    maintain and improve them without requiring a full Perl upgrade.
    On the other hand, maintaining something against multiple perl versions is
    harder than just being in the core. As Zefram recently found out with Carp,
    when he rolled it up as a CPAN distribution.

    This might actually cost us more than it saves. We're great at modules.
    But how many modules are XS? How many of those run on more than "both
    kinds of operating system"?
    I don't know what we'll extract or when we'll extract it, but there are a
    number of language features that seem like they might make sense to make
    pluggable: Formats, SysV IPC functions, Socket IO functions, Unix user
    information functions, Unix network information functions and Process and
    process group functions. Jesse Luehrs has already built us a first version
    of an extraction, modularization and replacement of smartmatch.
    This makes sense. We already have some infrastructure to do this. The code for
    $! was generalised, and is now also used to implement %+ and %-.

    dbmopen and glob are both actually implemented as modules.
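
    You can already watch some of that on-demand loading happen (a rough
    demonstration using module names as found in current perls, not a spec):

    % perl -le '"b" =~ /(?<x>b)/; print $+{x}; print for grep /NamedCapture/, keys %INC'
    b
    Tie/Hash/NamedCapture.pm

    i.e. merely touching %+ pulls in Tie::Hash::NamedCapture, just as calling
    glob() pulls in File::Glob.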

    It should be possible to move more out. However, it's not a panacea. I'd
    estimate that the total size savings for the interpreter binary will be no
    more than 10%. Right now, comparing microperl [pretty much all of the above
    missing] versus perl on the same platform [x86_64, gcc -Os, -DNO_MATHOMS]:

    -rwxr-xr-x 1 nick admin 1154360 18 Sep 10:49 microperl
    -rwxr-xr-x 1 nick admin 1258448 18 Sep 10:50 perl

    $ perl -le 'print 1154360/1258448'
    0.91728859674774

    And we need to be very careful about how we autoload - eg, don't push all
    the socket builtins out into Socket. It's tempting to do this (obviously
    when autoloading one does not have Socket import anything), but it's a trap,
    because it will mean that future people write use v5.12 code which they test
    and which works, but will break on a "real" v5.12, because they didn't realise
    that they'd assumed that Socket would be loaded.
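
    As a sketch of that failure mode (hypothetical - this assumes the socket
    builtins had been pushed out and Socket.pm were loaded implicitly):

    use v5.12;
    # raw AF_INET / SOCK_STREAM / IPPROTO_TCP values, to avoid importing Socket:
    socket(my $sock, 2, 1, 6);                   # would implicitly load Socket.pm
    my $addr = Socket::inet_aton('127.0.0.1');   # "works" in the author's testing,
                                                 # but a real v5.12 never loaded
                                                 # Socket, so this dies with an
                                                 # undefined subroutine error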

    * TL;DR

    - New versions of Perl 5 should not break your existing software
    - Backward compatibility must not stop Perl 5 from evolving
    - From 'use v5.16' forward, Perl should start treating 'use v5.x'
    statements as "try to give me a Perl that looks like v5.x" rather
    than "give me at least v5.x"
    - We're awesome at modules. Where possible, we should be
    modularizing core features.


    We're trying to do something which I think no other dynamic language has
    done before - try to support more than one "version" at runtime from the
    same codebase. To the best of my knowledge, no Python implementation can
    support even close versions simultaneously, such as 2.6 and 2.7 or 3.1 and
    3.2 together.

    In 5 or 10 years, is the hope that Perl 5 is free to evolve as far from
    v5.14 as Python 3 is from Python 2? Because even the current proposal for
    supporting Python 3 in PyPy is an "either/or", not a "concurrently":

    http://mail.python.org/pipermail/pypy-dev/2011-September/008288.html

    [estimate $70,000, and that's not for a complete transition - that's for
    augmenting the codebase to be able to provide a Python3 VM/interpreter to
    the end user. The VM is still implemented in Python2. See
    http://mail.python.org/pipermail/pypy-dev/2011-September/008306.html
    and the reply. I wonder who likes PyPy as much as booking.com likes Perl?]



    I think to make this plan work sustainably, we're going to have to
    conceptually split

    * Perl 5 VM
    * compile time (lexer/parser/opcode generator)
    * runtime builtins (whether they are "built in", or in a module)


    so, Unicode version, copying semantics, etc, are that of the VM, and the VM
    may change in newer releases.

    Likewise introspection is a function of the VM, and older code may see
    something new. For example, if we're able to switch to providing NFC for
    symbol tables, then existing code will not be hidden from this.

    And older code can't be isolated from things arriving "from the future",
    such as exceptions thrown to it, or the nuances of objects returned from
    code it calls.


    The parser is built atop the VM, and converts language source code to
    opcodes according to the relevant "grammar" for the version requested,
    using the correct "builtin"s for that version of the language, loading
    them as necessary.

    The runtime functions as documented for the version requested. But it
    provides the same level of change/non-change as currently XS code can
    across perl (release) versions.

    Nicholas Clark
  • Abigail at Oct 6, 2011 at 7:14 pm

    On Mon, Sep 19, 2011 at 11:49:53AM +0100, Nicholas Clark wrote:
    tl;dr:

    I like the plan. But the devil will be in the details.

    It's a complex trade off between short, medium and long term
    maintainability, and I don't think that this approach has been taken
    by any comparable project.
    On Tue, Sep 13, 2011 at 01:23:17PM -0600, Karl Williamson wrote:
    On 09/12/2011 10:28 AM, Jesse Vincent wrote:
    If there is no "use v5.xx" line at the top of the code, the runtime
    should act as it did on v5.14 without a use v5.14 line.
    Does that mean that without a 'use' line that the Unicode version will
    be the one that is in 5.14?
    Based on the default of

    If there is no "use v5.xx" line at the top of the code, the runtime
    should act as it did on v5.14 without a use v5.14 line.

    then yes, I'm also assuming that the intent is that "runtime should act"
    also means that the behaviour of Unicode should be the same.

    And that, I believe, is what we really, really, should *not* want.

    Say, I'm using subs Foo::foo and Bar::bar. Both take a string as argument,
    and as an intermediate step, they apply operation 'X' on it; where 'X'
    uses some Unicode behaviour.

    Do we really want Foo::foo and Bar::bar to get different results after
    applying X, just because one was written in 2011 (when 'use 5.14' was the
    hot thing), and the other a year later (and hence has 'use 5.16')?
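
    Spelled out (uc() standing in for 'X' here; the per-version Unicode
    tables are of course the hypothetical part):

    package Foo { use v5.14; sub foo { uc $_[0] } }   # would get 5.14's Unicode tables
    package Bar { use v5.16; sub bar { uc $_[0] } }   # would get 5.16's Unicode tables

    # For any code point whose case mapping changed between those Unicode
    # versions, Foo::foo($s) and Bar::bar($s) disagree on the very same $s.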


    IMO, that leads to subtle, hard to debug issues - which then will be declared
    to be not-a-bug, as it's working as p5p intended.

    Maybe perl will be more maintainable, but I don't think Perl programs
    become more maintainable that way.


    Abigail
  • Tim Bunce at Sep 19, 2011 at 6:34 pm

    On Mon, Sep 12, 2011 at 12:28:47PM -0400, Jesse Vincent wrote:

    Much of my thinking about the future of Perl 5 stems from the following principles:

    * New versions of Perl 5 should not break your existing software
    * Backward compatibility must not stop Perl 5 from evolving

    Pay particular attention to "should" and "must" there. It is critically important that we not alienate
    the people, communities and companies who have invested their time and money in Perl 5. Pulling the rug
    out from under them isn't good for them and isn't good for us. Wherever possible, we need to preserve
    backward compatibility with earlier versions of Perl 5. At the same time, it could be argued that _any_
    change to Perl 5 breaks backward compatibility. ("But I was depending on that segfault!") If Perl 5 is
    going to continue to flourish, we're going to need to be able to change the language.
    s/ Wherever possible, / Wherever practical, /

    I read the above as effectively meaning longer deprecation cycles for
    some changes. So, where practical, code that would otherwise break can
    continue to run on later releases. And continues to run on later
    releases only until it ceases to be practical to keep it running.

    I'd suggest that a lifespan of one extra major release would be a
    reasonable minimum goal for this mechanism.

    The decision on how practical it is to support versioning for any given
    change must be in the hands of those who would implement and maintain
    the versioning.

    Tim.
  • Nicholas Clark at Sep 19, 2011 at 9:34 pm

    On Mon, Sep 19, 2011 at 07:33:44PM +0100, Tim Bunce wrote:

    The decision on how practical it is to support versioning for any given
    change must be in the hands of those who would implement and maintain
    the versioning.
    Yes. He who pays the piper calls the tune. And as long as the piper in
    question remains volunteer labour, then (effectively) as the volunteer
    pays him or herself, it's their call.

    It's always been like this, and as long as it remains volunteer labour,
    it's going to remain like this. Anyone can ask for anything, but if
    no-one volunteers to do it*, it's not going to happen.

    [And of course, the beauty of *open source* software is that it permits
    any firm with a commercial need for a different policy to implement that
    locally. Unlike closed source, where a vendor can turn you off, however
    much you're prepared to pay the vendor. There's no lock in.]

    Nicholas Clark

    * Strictly, if no-one volunteers to cause it to happen.
  • Father Chrysostomos at Oct 16, 2011 at 7:05 pm

    Nicholas Clark wrote:
    Except that this would be a test on the lexical scope of the caller.
    How does that fit with
    a) the desire to take references to builtins as if they are functions?
    b) the desire to be able to introspect builtins using %CORE:: ?

    The "clean" implementation doesn't fit sanely with being able to take a
    reference. It would provide a reference to a function whose behaviour
    changes depending on the calling scope. Whereas what's needed for sanity
    is for the function's behaviour to be consistent with the builtin's
    behaviour at the lexical scope where the reference is taken.
    While that model might make sense on its own, that’s not Perl’s model. We already have modules with functions that change behaviour based on the caller’s hints, such as charnames::vianame. Why would warning hints be in a different category from feature hints?

    Having \&CORE::lc behave differently based on the caller’s hints (unicode_strings in this case) is a feature if you ask me, as it allows the code that takes the reference the choice of whether hints are to be captured immediately or inherited from the caller:

    \&CORE::lc
    \&charnames::vianame
    *ARGV->can('getline')

    vs

    sub { goto &CORE::lc }
    sub { goto &charnames::vianame }
    sub { goto &{can ARGV 'getline'} }
    By which you mean that the policy should change. To date, new "things" have
    been allowed if previously they were syntax errors. Henceforth, no language
    changes, even those that are backwards compatible, should appear unless
    asked for?

    This does (mostly) remove the "problem" that one can write and test against
    a newer perl interpreter, and not realise that one is relying on a new
    feature. On balance, I think that this is better.
    That last sentence I do not agree with. See below.
    However, it's never going to be a substitute for actually testing, as bugs
    will be fixed, and I don't think that all behaviour can be hidden.
    Precisely. If you do not test with an earlier perl you have *no idea* whether it works with it or not. In fact, the only cases I’ve encountered that involved using ‘future’ perl by mistake were code that relied on perl bug fixes. So I don’t think providing new features only under a pragma solves anything. In fact, it complicates things. Let me give a couple of examples. The &CORE::subs feature cannot be versioned, as we have no system for making package variables appear and disappear based on what pragma is in scope (and let’s not go there). Hence, allowing parentheses after __FILE__, etc., cannot be versioned, because versioning it results in nonsensical behaviour, unless we use some magic tricks to make

    BEGIN { *f = \&CORE::__FILE__ }
    f()

    into a syntax error outside of ‘use v5.16’, but OK within ‘use v5.16’. I know how to do that; I just think it’s wasted effort, and too weird to be of any use. If we allow parentheses on an aliased &CORE::__FILE__ unconditionally, then

    use subs '__FILE__';
    BEGIN { *__FILE__ = \&CORE::__FILE__ }

    will change the syntax under ‘use v5.14’, which I would consider a bug. I’m not willing to introduce a new feature with known unfixable bugs. Some may consider me nuts for worrying about such edge cases, but edge cases have to make sense; otherwise the model is flawed and needs to be rethought. If the model is flawed, it is going to cause surprises, bug reports and future chagrin.

    Also, allowing CORE::say and ‘continue;’ outside of ‘use feature’ and allowing the (\$) prototype to accept any lvalue expression were features I added in preparation for coresubs. Without them, the edge cases don’t make sense.

    My work on coresubs is currently stalled, and has been since this new policy announcement, since I don’t know whether that feature can stay.

    Also, I would like to see a high-precedence ^^ operator to fill the blank slot in this table:

    bitwise               &     |     ^
    logical (high prec)   &&    ||
    logical (low prec)    and   or    xor
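
    For comparison, the workarounds available today in the absence of a
    high-precedence exclusive-or (nothing below is new syntax except the
    last, commented-out line):

    my ($x, $y) = (1, 0);
    my $oops = $x xor $y;       # trap: low-precedence xor binds looser than =,
                                # so this parses as (my $oops = $x) xor $y
    my $ok   = ($x xor $y);     # parentheses needed to get the intended value
    my $ok2  = !$x != !$y;      # boolean exclusive-or via negation and !=
    # my $ok3 = $x ^^ $y;       # what the proposed high-precedence ^^ would allow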

    It would be trivial to implement, but if it requires ‘use v5.16’, it’s not worth the bother.

    So the new policy is *already* impeding development.

    BTW, that is the only part of it I don’t really like. The rest is excellent (in terms of policy; not in terms of which features people now think are going to be disabled).
