FAQ
# New Ticket Created by Alexey Borzenkov
# Please include the string: [perl #119089]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=119089 >


This is a bug report for perl from snaury@gmail.com,
generated with the help of perlbug 1.39 running under perl 5.14.2.


-----------------------------------------------------------------
When using threads::shared on Perl 5.14 using many kinds of
expressions in sub's return becomes very dangerous, as return
values just become undef undef certain circumstances. For example
this program would die in Perl 5.14, but would work normally in
Perl 5.12 or Perl 5.16:

     use strict;
     use threads ();
     use threads::shared;

     sub test_clone {
         my $ref = shared_clone([{a => 1, b => 2}]);
         return $ref->[0];
     }

     die "not defined" unless defined(test_clone);

This causes threaded perl programs on Debian 7.0 or Ubuntu 12.04
to produce really weird failures.

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
     category=library
     severity=high
     module=threads::shared
---
Site configuration information for perl 5.14.2:

Configured by Debian Project at Mon Mar 18 19:16:26 UTC 2013.

Summary of my perl5 (revision 5 version 14 subversion 2) configuration:

   Platform:
     osname=linux, osvers=2.6.42-37-generic,
archname=x86_64-linux-gnu-thread-multi
     uname='linux batsu 2.6.42-37-generic #58-ubuntu smp thu jan 24 15:28:10
utc 2013 x86_64 x86_64 x86_64 gnulinux '
     config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN
-Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr
-Dprivlib=/usr/share/perl/5.14 -Darchlib=/usr/lib/perl/5.14
-Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5
-Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local
-Dsitelib=/usr/local/share/perl/5.14.2
-Dsitearch=/usr/local/lib/perl/5.14.2 -Dman1dir=/usr/share/man/man1
-Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1
-Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1 -Dman3ext=3perl
-Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm
-Ui_libutil -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib
-Dlibperl=libperl.so.5.14.2 -des'
     hint=recommended, useposix=true, d_sigaction=define
     useithreads=define, usemultiplicity=define
     useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
     use64bitint=define, use64bitall=define, uselongdouble=undef
     usemymalloc=n, bincompat5005=undef
   Compiler:
     cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
     optimize='-O2 -g',
     cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing
-pipe -fstack-protector -I/usr/local/include'
     ccversion='', gccversion='4.6.3', gccosandvers=''
     intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
     d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
     ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
     alignbytes=8, prototype=define
   Linker and Libraries:
     ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
     libpth=/usr/local/lib /lib/x86_64-linux-gnu /lib/../lib
/usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/lib
     libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
     perllibs=-ldl -lm -lpthread -lc -lcrypt
     libc=, so=so, useshrplib=true, libperl=libperl.so.5.14.2
     gnulibc_version='2.15'
   Dynamic Linking:
     dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
     cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib
-fstack-protector'

Locally applied patches:


---
@INC for perl 5.14.2:
     /etc/perl
     /usr/local/lib/perl/5.14.2
     /usr/local/share/perl/5.14.2
     /usr/lib/perl5
     /usr/share/perl5
     /usr/lib/perl/5.14
     /usr/share/perl/5.14
     /usr/local/lib/site_perl
     .

---
Environment for perl 5.14.2:
     HOME=/home/dragonfox
     LANG=en_US.UTF-8
     LANGUAGE (unset)
     LC_ADDRESS=en_GB.UTF-8
     LC_IDENTIFICATION=en_GB.UTF-8
     LC_MEASUREMENT=en_GB.UTF-8
     LC_MONETARY=en_GB.UTF-8
     LC_NAME=en_GB.UTF-8
     LC_NUMERIC=en_GB.UTF-8
     LC_PAPER=en_GB.UTF-8
     LC_TELEPHONE=en_GB.UTF-8
     LC_TIME=en_GB.UTF-8
     LD_LIBRARY_PATH (unset)
     LOGDIR (unset)

PATH=/home/dragonfox/perl5/perlbrew/bin:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
     PERLBREW_BASHRC_VERSION=0.64
     PERLBREW_HOME=/home/dragonfox/.perlbrew
     PERLBREW_MANPATH=
     PERLBREW_PATH=/home/dragonfox/perl5/perlbrew/bin
     PERLBREW_ROOT=/home/dragonfox/perl5/perlbrew
     PERLBREW_VERSION=0.64
     PERL_BADLANG (unset)
     SHELL=/bin/bash

Search Discussions

  • Dave Mitchell at Jul 31, 2013 at 12:10 pm

    On Tue, Jul 30, 2013 at 10:58:36PM -0700, Alexey Borzenkov wrote:
    # New Ticket Created by Alexey Borzenkov
    # Please include the string: [perl #119089]
    # in the subject line of all future correspondence about this issue.
    # <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=119089 >


    This is a bug report for perl from snaury@gmail.com,
    generated with the help of perlbug 1.39 running under perl 5.14.2.


    -----------------------------------------------------------------
    When using threads::shared on Perl 5.14 using many kinds of
    expressions in sub's return becomes very dangerous, as return
    values just become undef undef certain circumstances. For example
    this program would die in Perl 5.14, but would work normally in
    Perl 5.12 or Perl 5.16:

    use strict;
    use threads ();
    use threads::shared;

    sub test_clone {
    my $ref = shared_clone([{a => 1, b => 2}]);
    return $ref->[0];
    }

    die "not defined" unless defined(test_clone);

    This causes threaded perl programs on Debian 7.0 or Ubuntu 12.04
    to produce really weird failures.
    This was fixed in perl 5.15.7 with commit
         6f48390ab209d16ee8f795f0a83677c8bd9ac69c,
    which wont be back-ported to 5.14.x as its out of its maintenance period.

    You can work around the issue on older perls by assigning the value to
    a temporary var before returning it; i.e.

         my $ret = $ref->[0];
         return $ret;



    --
    Technology is dominated by two types of people: those who understand what
    they do not manage, and those who manage what they do not understand.
  • Nicholas Clark at Aug 5, 2013 at 2:19 pm

    On Wed, Jul 31, 2013 at 01:10:33PM +0100, Dave Mitchell wrote:
    On Tue, Jul 30, 2013 at 10:58:36PM -0700, Alexey Borzenkov wrote:

    When using threads::shared on Perl 5.14 using many kinds of
    expressions in sub's return becomes very dangerous, as return
    values just become undef undef certain circumstances. For example
    this program would die in Perl 5.14, but would work normally in
    Perl 5.12 or Perl 5.16:

    use strict;
    use threads ();
    use threads::shared;

    sub test_clone {
    my $ref = shared_clone([{a => 1, b => 2}]);
    return $ref->[0];
    }

    die "not defined" unless defined(test_clone);
    This can be simplified to:

    #!./perl
    use strict;
    use threads ();
    use threads::shared;

    sub test_clone {
         my @a :shared;
         $a[0] = 6;
         $a[0];
    }

    die "not defined" unless defined test_clone;
    __END__
    This causes threaded perl programs on Debian 7.0 or Ubuntu 12.04
    to produce really weird failures.
    This was fixed in perl 5.15.7 with commit
    6f48390ab209d16ee8f795f0a83677c8bd9ac69c,
    which wont be back-ported to 5.14.x as its out of its maintenance period.

    You can work around the issue on older perls by assigning the value to
    a temporary var before returning it; i.e.

    my $ret = $ref->[0];
    return $ret;
    Totally true.

    If assertions are enabled the test case triggers an assertion failure on
    5.14.0. Given this, I was suspicious as to what introduced the assertion
    failure, it turns out that there's a heck of a lot more to this than first
    meets the eye.

    So a git bisect (with assertions enabled) shows that commit 6f48390ab209d16e
    (as mentioned above) makes the test case work. It *also* makes the test case
    work if assertions are disabled.

    commit 6f48390ab209d16ee8f795f0a83677c8bd9ac69c
    Author: Father Chrysostomos <sprout@cpan.org>
    Date: Wed Jan 4 23:28:54 2012 -0800

         [perl #95548] Returned magical temps are not copied

         return and leavesub, for speed, were not copying temp variables with a
         refcount of 1, which is fine as long as the fact that it was not cop-
         ied is not observable.

         With magical variables, that *can* be observed, so we have to forego
         the optimisation and copy the variable if's magical.

         This obviously applies only to rvalue subs.



    So, what caused the assertions to start failing between v5.12 and v5.14?
    A simple bisect suggests that it's this commit:


    commit f2338a2e8347fc967ab6b9af21d948258b88e341
    Author: David Mitchell <davem@iabyn.com>
    Date: Thu Apr 15 10:20:50 2010 +0100

         use cBOOL for bool casts

         bool b = (bool)some_int

         doesn't necessarily do what you think. In some builds, bool is defined as
         char, and that cast's behaviour is thus undefined. So this line in mg.c:

             const bool was_temp = (bool)SvTEMP(sv);

         was actually setting was_temp to false even when the SVs_TEMP flag was set.
         Fix this by replacing all the (bool) casts with a new cBOOL() cast macro
         that (hopefully) does the right thing.




    which has nothing directly to do with threads shared, or (not?) copying
    values on subroutine return. It turns out after a bit of digging that it's
    this section of that change that matters:

    diff --git a/mg.c b/mg.c
    index 3fb8ec4..4a8d767 100644
    --- a/mg.c
    +++ b/mg.c
    @@ -193,7 +193,7 @@ Perl_mg_get(pTHX_ SV *sv)
      {
          dVAR;
          const I32 mgs_ix = SSNEW(sizeof(MGS));
    - const bool was_temp = (bool)SvTEMP(sv);
    + const bool was_temp = cBOOL(SvTEMP(sv));
          bool have_new = 0;
          MAGIC *newmg, *head, *cur, *mg;
          /* guard against sv having being freed midway by holding a private



    It's not obvious from the code above, but SvTEMP(sv) returns either 0 or
    0x00080000, and in 2010, C<bool> was typedef'd as C<char>, so the above code
    always set was_temp to 0, due to truncation. The cast was added to avoid a
    warning about the truncation. The (buggy) truncation itself was added when
    the type of was_temp was changed from a suitably large integer to a bool as
    part of this commit:

    commit 35a4481cfdbca4941ab3a4206dc266f3e71c2385
    Author: Andy Lester <andy@petdance.com>
    Date: Sun Mar 13 08:20:05 2005 -0600

         Adding const qualifiers
         Message-ID: <20050313202005.ga23535@petdance.com>

         p4raw-id: //depot/perl@24037


    (note that the commit message doesn't even note that some variables' types
    were changed).


    The variable in question is used to enable Perl_mg_get() to retain the
    SVs_TEMP bit set on values it is processing. This is needed because
    Perl_sv_2mortal() conflates "temporary" with "mortal" and automatically sets
    the SVs_TEMP flag bit on everything. Hence this code seeks to remove
    SVs_TEMP if Perl_sv_2mortal() set it. The upshot of the bug was that
    SVs_TEMP was always being removed, so code in subroutine entry that does
    *not* copy TEMPs is skipped because the value is becoming unmarked. Hence
    the values are always copied.


    So, all that conceals what actually going on. When that truncation bug was
    first introduced, it didn't matter that TEMPs weren't copied. When that bug
    was fixed, it mattered a lot that TEMPs weren't copied. What changed in the
    middle?


    I set out to bisect between the two commits, patching *that* the bug to fix
    it. The code doesn't change much, and test-building on the change points
    reveals that at the revision with the last change to that part of the code
    (changing a line adjacent to C<was_temp>), fixing *that* bug doesn't break
    the test case.

    So I made a patch to fix the bug, and bisected with that:

    Porting/bisect.pl -Dusethreads --start f9c6fee519b868a2e8ef8c5b701c0d3f95565423 --end db63319f533e643ef6aac622fcae9a2f7ceabb0d --late-fixup ../119089-W -Dnoextensions=Encode ../perl/119089


    That reveals that the true trigger of the behaviour regression was this
    change:

    commit fd69380d5d5b95ef16e2521cf4251b34ee0ce151
    Author: David Mitchell <davem@iabyn.com>
    Date: Tue Mar 23 12:11:43 2010 +0000

         Fix assorted bugs related to magic (such as pos) not "sticking" to
         magical array and hash elements; e.g. the following looped infinitely:

             $h{tainted_element} =~ /..../g

         There are two side-effects of this fix.
         First, MGf_GSKIP has been extended to work on tied array
         elements as well as hash elements. This is the mechanism that skips all
         but the first tied element magic gets until after the next set.
         Second, rvalue hash/array element access where the element has get magic,
         now directly returns the element rather than a mortal copy.

         The root cause of the bug was code similar to the following in pp_alem,
         pp_aelemfast, pp_helem and pp_rv2av:

             if (!lval && SvGMAGICAL(sv)) /* see note in pp_helem() */
          sv = sv_mortalcopy(sv);

         According to the note, this was added in 1998 to make this work:

             local $tied{foo} = $tied{foo}

         Since it returns a copy rather than the element, this make //g fail.
         My first attempt, a few years ago, to fix this, took the approach that
         the LHS of the bind should be made an lvalue in the presence of //g, since
         it now modifies its LHS; i.e.

             expr =~ // expr is rvalue
             expr =~ s/// expr is lvalue
             expr =~ //g expr was rvalue, I proposed to change it to lvalue

         Unfortunately this fix broke too much stuff (stuff that was arguably
         already broken, but it upset people). For example, f() ~= s////
         correctly gives the error

             Can't modify non-lvalue subroutine call

         My fix extended f() =~ //g to give the same error. Which is reasonable,
         because the g isn't doing what you want. But plenty of people had code that
         only needed to match once and the g had just been cargo-culted. So it
         broke their working code. So lets not do this.

         My new approach has been to remove the sv_mortalcopy(). It turns out
         that this is no longer needed to fix the local $tied{foo} issue.
         Presumably that went away as a side-effect of my container/value magic
         localisation rationalisation of a few years ago, although I haven't
         analysed it - just noted that the tests still pass (!). However, an issue
         with removing it is that mg_get() no longer gets called. So a plain

             $tied_hash{elem};

         in void context no longer calls FETCH(). Which broke some tests and might
         break some code. Also, there's an issue with the delayed calling of magic
         in @+[n] and %+{foo}; by the time the get magic is called, the original
         pattern may have gone out of scope.

         The solution is to simply replace the original

          sv = sv_mortalcopy(sv);

         with

          mg_get(sv);

         This then caused problems with tied array FETCH() getting called too much.
         I fixed this by extending the MGf_GSKIP mechanism to tied arrays as well
         as hashes. I don't understand why tied arrays have always been treated
         differently than tied hashes, but unifying them didn't seem to break
         anything (except for a Storable test, whose comment indicated that the
         test's author thought FETCH() was being called to often anyway).



    The key part is this:


    diff --git a/pp_hot.c b/pp_hot.c
    index 3371e88..8f8af53 100644
    --- a/pp_hot.c
    +++ b/pp_hot.c
    @@ -658,7 +658,7 @@ PP(pp_aelemfast)
          SV *sv = (svp ? *svp : &PL_sv_undef);
          EXTEND(SP, 1);
          if (!lval && SvGMAGICAL(sv)) /* see note in pp_helem() */
    - sv = sv_mortalcopy(sv);
    + mg_get(sv);
          PUSHs(sv);
          RETURN;
      }



    ie it *gets rid of a copy*. So this removes one place that copied the return
    value, and the cast fix removes the other place, hence why both are needed
    to trigger the problem. And why the identified patch fixes it, by
    re-instating a copy.


    The story is not over.


    So, I assumed that taking blead, and reverting the patch that fixes the test
    case, would cause the test case to fail again. It doesn't. More bisecting
    revealed that subsequently this change also has an influence:

    commit 4bac9ae47b5ad7845a24e26b0e95609805de688a
    Author: Chip Salzenberg <chip@pobox.com>
    Date: Fri Jun 22 15:18:18 2012 -0700

         Magic flags harmonization.

         In restore_magic(), which is called after any magic processing, all of
         the public OK flags have been shifted into the private OK flags. Thus
         the lack of an appropriate public OK flags was used to trigger both get
         magic and required conversions. This scheme did not cover ROK, however,
         so all properly written code had to make sure mg_get was called the right
         number of times anyway. Meanwhile the private OK flags gained a second
         purpose of marking converted but non-authoritative values (e.g. the IV
         conversion of an NV), and the inadequate flag shift mechanic broke this
         in some cases.

         This patch removes the shift mechanic for magic flags, thus exposing (and
         fixing) some improper usage of magic SVs in which mg_get() was not called
         correctly. It also has the side effect of making magic get functions
         specifically set their SVs to undef if that is desired, as the new behavior
         of empty get functions is to leave the value unchanged. This is a feature,
         as now get magic that does not modify its value, e.g. tainting, does not
         have to be special cased.

         The changes to cpan/ here are only temporary, for development only, to
         keep blead working until upstream applies them (or something like them).

         Thanks to Rik and Father C for review input.




    because part of it removes the entire was_temp (micro?) optimisation logic:


    @@ -3302,12 +3285,8 @@ S_restore_magic(pTHX_ const void *p)
              So artificially keep it alive a bit longer.
              We avoid turning on the TEMP flag, which can cause the SV's
              buffer to get stolen (and maybe other stuff). */
    - int was_temp = SvTEMP(sv);
           sv_2mortal(sv);
    - if (!was_temp) {
    - SvTEMP_off(sv);
    - }
    - SvOK_off(sv);
    + SvTEMP_off(sv);
       }
       else
           SvREFCNT_dec(sv); /* undo the inc in S_save_magic() */


    meaning that values will always be copied.

    (Given the nature of the first commit identified as a bug fix, this seems
    reasonable. magic values shouldn't be being short circuited.)


    So that explains the given test case. But the story is still not over.

    Whilst temporary values should be copied (by default), LVALUEs returned from
    :lvalue subs should not. This will fail assertions, SEGV, or worse even on
    blead:

    #!./perl
    use strict;
    use threads ();
    use threads::shared;
    use Devel::Peek;

    sub test_clone :lvalue {
         my @a :shared;
         $a[0];
    }

    #Dump(test_clone);
    test_clone() = 0;
    __END__


    I believe that the problem is because threads::shared wrongly assumes that
    the proxy for a shared aggregate will always outlive any LVALUE proxies for
    its elements. This is generally true, but not for element proxies returned
    from :lvalue subroutines for aggregate proxies which go out of scope with
    the subroutine.

    I believe that the fix is to change threads::shared to deal with cleanup
    differently, so that the last magic struct which frees up the relevant
    mg_obj is responsible for cleaning up shared space. This means adding "free"
    magic to the element proxies:


    diff --git a/dist/threads-shared/shared.xs b/dist/threads-shared/shared.xs
    index d3e859d..07379cc 100644
    --- a/dist/threads-shared/shared.xs
    +++ b/dist/threads-shared/shared.xs
    @@ -1027,6 +1027,27 @@ sharedsv_elem_mg_DELETE(pTHX_ SV *sv, MAGIC *mg)
          return (0);
      }

    +int
    +sharedsv_elem_mg_free(pTHX_ SV *sv, MAGIC *mg)
    +{
    + dTHXc;
    + PERL_UNUSED_ARG(sv);
    + ENTER_LOCK;
    + if (mg->mg_obj) {
    + if (!PL_dirty) {
    + assert(SvROK(mg->mg_obj));
    + }
    + if (SvREFCNT(mg->mg_obj) == 1) {
    + /* If the element has the last pointer to the shared aggregate, then
    + it has to free the shared aggregate. mg->mg_obj itself is freed
    + by Perl_mg_free() */
    + S_sharedsv_dec(aTHX_ SHAREDSV_FROM_OBJ(mg->mg_obj));
    + }
    + }
    + LEAVE_LOCK;
    + return (0);
    +}
    +
      /* Called during cloning of PERL_MAGIC_tiedelem(p) magic in new
       * thread */

    @@ -1044,7 +1065,7 @@ MGVTBL sharedsv_elem_vtbl = {
          sharedsv_elem_mg_STORE, /* set */
          0, /* len */
          sharedsv_elem_mg_DELETE, /* clear */
    - 0, /* free */
    + sharedsv_elem_mg_free, /* free */
          0, /* copy */
          sharedsv_elem_mg_dup, /* dup */
      #ifdef MGf_LOCAL
    @@ -1114,7 +1135,21 @@ int
      sharedsv_array_mg_free(pTHX_ SV *sv, MAGIC *mg)
      {
          PERL_UNUSED_ARG(sv);
    - S_sharedsv_dec(aTHX_ (SV*)mg->mg_ptr);
    + if (!PL_dirty) {
    + assert(mg->mg_obj);
    + assert(SvROK(mg->mg_obj));
    + assert(SvUV(SvRV(mg->mg_obj)) == PTR2UV(mg->mg_ptr));
    + }
    + if (mg->mg_obj) {
    + if (SvREFCNT(mg->mg_obj) == 1) {
    + S_sharedsv_dec(aTHX_ (SV*)mg->mg_ptr);
    + } else {
    + /* An element of this aggregate still has PERL_MAGIC_tied(p)
    + pointing to this shared aggregate. It will take responsibility
    + for freeing the shared aggregate. Perl_mg_free() drops the
    + reference count on mg->mg_obj. */
    + }
    + }
          return (0);
      }


    This change has been stewing on the branch smoke-me/nicholas/RT119089-variant
    and doesn't seem to have any problems. (The commit message is now not correct
    - 9b9aaeae0f62 fixed the problem on 5.12.0. Global destruction*)

    I think that it doesn't introduce any leaks, but I'd appreciate a second pair
    of eyes.

    This fix to threads::shared has the side effect of also fixing the original
    reported bug when running on v5.14

    Nicholas Clark

    * Global destruction. threads::shared does not clean up "shared space", even
       when PL_destruct_level is 2. There is no C<perl_destruct> corresponding to
       the C<perl_alloc> in Perl_sharedsv_init(). I'm about to report that as a
       separate bug.
  • Father Chrysostomos via RT at Aug 5, 2013 at 3:32 pm

    On Mon Aug 05 07:20:37 2013, nicholas wrote:
    I think that it doesn't introduce any leaks, but I'd appreciate a
    second pair
    of eyes.
    It looks correct to me, but:
    • I haven’t looked at threads::shared for a long time
    • I didn’t get enough sleep last night (guess why? :-)

    --

    Father Chrysostomos


    ---
    via perlbug: queue: perl5 status: resolved
    https://rt.perl.org:443/rt3/Ticket/Display.html?id=119089
  • Dominic Hargreaves via RT at Sep 7, 2013 at 5:22 pm

    On Mon Aug 05 08:32:28 2013, sprout wrote:
    On Mon Aug 05 07:20:37 2013, nicholas wrote:
    I think that it doesn't introduce any leaks, but I'd appreciate a
    second pair
    of eyes.
    It looks correct to me, but:
    • I haven’t looked at threads::shared for a long time
    • I didn’t get enough sleep last night (guess why? :-)
    I understand that there won't be another 5.14 release for this, but in
    Debian this is still potentially an issue for us (see
    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=718438). Is there any
    reason we shouldn't look at applying Nicholas's patch in Debian?

    Thanks,
    Dominic.

    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org:443/rt3/Ticket/Display.html?id=119089
  • Dominic Hargreaves at Sep 7, 2013 at 6:25 pm

    On Sat, Sep 07, 2013 at 10:22:20AM -0700, Dominic Hargreaves via RT wrote:
    On Mon Aug 05 08:32:28 2013, sprout wrote:
    On Mon Aug 05 07:20:37 2013, nicholas wrote:
    I think that it doesn't introduce any leaks, but I'd appreciate a
    second pair
    of eyes.
    It looks correct to me, but:
    • I haven’t looked at threads::shared for a long time
    • I didn’t get enough sleep last night (guess why? :-)
    I understand that there won't be another 5.14 release for this, but in
    Debian this is still potentially an issue for us (see
    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=718438). Is there any
    reason we shouldn't look at applying Nicholas's patch in Debian?
    FWIW, I tried applying it to the Debian 5.14 and got:

    dist/threads-shared/t/av_refs..................................../../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--expected 16 tests, saw 15

    dist/threads-shared/t/clone....................................../../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--expected 34 tests, saw 20

    dist/threads-shared/t/hv_refs..................................../../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--expected 22 tests, saw 12

    dist/threads-shared/t/shared_attr................................/../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--non-zero wait status: 32512

    Dominic.
  • Nicholas Clark at Sep 13, 2013 at 1:27 pm

    On Sat, Sep 07, 2013 at 07:25:39PM +0100, Dominic Hargreaves wrote:
    On Sat, Sep 07, 2013 at 10:22:20AM -0700, Dominic Hargreaves via RT wrote:

    I understand that there won't be another 5.14 release for this, but in
    Debian this is still potentially an issue for us (see
    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=718438). Is there any
    reason we shouldn't look at applying Nicholas's patch in Debian?
    FWIW, I tried applying it to the Debian 5.14 and got:

    dist/threads-shared/t/av_refs..................................../../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--expected 16 tests, saw 15

    dist/threads-shared/t/clone....................................../../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--expected 34 tests, saw 20

    dist/threads-shared/t/hv_refs..................................../../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--expected 22 tests, saw 12

    dist/threads-shared/t/shared_attr................................/../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--non-zero wait status: 32512
    You applied the patch to Debian? Or you built the current threads-shared
    for Debian?

    Because that macro was added to threads::shared in version 1.42, which in
    blead was this commit:

    commit 28399f576f6389d20835cad7ee86f458880fdcda
    Author: Jerry D. Hedden <jdhedden@cpan.org>
    Date: Tue Oct 2 18:58:32 2012 -0400

         Upgrade to threads::shared 1.42


    That commit has this:

    -STATIC SV *
    -S_sharedsv_from_obj(pTHX_ SV *sv)
    -{
    - return ((SvROK(sv)) ? INT2PTR(SV *, SvIV(SvRV(sv))) : NULL);
    -}
    +#define SHAREDSV_FROM_OBJ(sv) ((SvROK(sv)) ? INT2PTR(SV *, SvIV(SvRV(sv))) : NULL)

    So you might be OK changing all SHAREDSV_FROM_OBJ(sv) to
    S_sharedsv_from_obj(aTHX_ sv)

    Nicholas Clark
  • Dominic Hargreaves at Sep 14, 2013 at 11:08 pm

    On Fri, Sep 13, 2013 at 02:26:57PM +0100, Nicholas Clark wrote:
    On Sat, Sep 07, 2013 at 07:25:39PM +0100, Dominic Hargreaves wrote:
    On Sat, Sep 07, 2013 at 10:22:20AM -0700, Dominic Hargreaves via RT wrote:

    I understand that there won't be another 5.14 release for this, but in
    Debian this is still potentially an issue for us (see
    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=718438). Is there any
    reason we shouldn't look at applying Nicholas's patch in Debian?
    FWIW, I tried applying it to the Debian 5.14 and got:

    dist/threads-shared/t/av_refs..................................../../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--expected 16 tests, saw 15

    dist/threads-shared/t/clone....................................../../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--expected 34 tests, saw 20

    dist/threads-shared/t/hv_refs..................................../../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--expected 22 tests, saw 12

    dist/threads-shared/t/shared_attr................................/../t/perl: symbol lookup error: ../../lib/auto/threads/shared/shared.so: undefined symbol: SHAREDSV_FROM_OBJ
    FAILED--non-zero wait status: 32512
    You applied the patch to Debian? Or you built the current threads-shared
    for Debian?
    The former.
    Because that macro was added to threads::shared in version 1.42, which in
    blead was this commit:

    commit 28399f576f6389d20835cad7ee86f458880fdcda
    Author: Jerry D. Hedden <jdhedden@cpan.org>
    Date: Tue Oct 2 18:58:32 2012 -0400

    Upgrade to threads::shared 1.42


    That commit has this:

    -STATIC SV *
    -S_sharedsv_from_obj(pTHX_ SV *sv)
    -{
    - return ((SvROK(sv)) ? INT2PTR(SV *, SvIV(SvRV(sv))) : NULL);
    -}
    +#define SHAREDSV_FROM_OBJ(sv) ((SvROK(sv)) ? INT2PTR(SV *, SvIV(SvRV(sv))) : NULL)

    So you might be OK changing all SHAREDSV_FROM_OBJ(sv) to
    S_sharedsv_from_obj(aTHX_ sv)
    Aha. Thanks for the hint! I've updated the patch and the this now
    builds and tests okay against 5.14.2-21:

    http://anonscm.debian.org/gitweb/?p=perl/perl.git;a=shortlog;h=refs/heads/bug-718438

    And the original test case from the reporter on the Debian bug now
    succeeds.

    Cheers,
    Dominic.
  • Jerry D. Hedden via RT at Sep 30, 2013 at 3:43 pm

    On Sat Sep 14 16:09:05 2013, dom wrote:
    Aha. Thanks for the hint! I've updated the patch and the this now
    builds and tests okay against 5.14.2-21:

    http://anonscm.debian.org/gitweb/?
    p=perl/perl.git;a=shortlog;h=refs/heads/bug-
    718438

    And the original test case from the reporter on the Debian bug now
    succeeds.
    It appears that this bug can be resolved.



    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org:443/rt3/Ticket/Display.html?id=119089
  • James E Keenan via RT at Oct 1, 2013 at 1:39 am

    On Mon Sep 30 08:43:03 2013, jdhedden@gmail.com wrote:
    On Sat Sep 14 16:09:05 2013, dom wrote:
    Aha. Thanks for the hint! I've updated the patch and the this now
    builds and tests okay against 5.14.2-21:

    http://anonscm.debian.org/gitweb/?
    p=perl/perl.git;a=shortlog;h=refs/heads/bug-
    718438

    And the original test case from the reporter on the Debian bug now
    succeeds.
    It appears that this bug can be resolved.
    Doing so now. Thanks for the feedback.


    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org:443/rt3/Ticket/Display.html?id=119089
  • Nicholas Clark at Oct 1, 2013 at 1:30 pm

    On Mon, Sep 30, 2013 at 08:43:04AM -0700, Jerry D. Hedden via RT wrote:
    On Sat Sep 14 16:09:05 2013, dom wrote:
    Aha. Thanks for the hint! I've updated the patch and the this now
    builds and tests okay against 5.14.2-21:

    http://anonscm.debian.org/gitweb/?
    p=perl/perl.git;a=shortlog;h=refs/heads/bug-
    718438

    And the original test case from the reporter on the Debian bug now
    succeeds.
    It appears that this bug can be resolved.
    Actually it can't yet, because the patch isn't yet in blead. No-one
    who looked at it to review it was confident enough of it to merge it
    immediately, and Dave raised a further question which I've not yet had time
    to think about.

    Nicholas Clark
  • Dominic Hargreaves via RT at Oct 4, 2013 at 1:10 pm

    On Tue Oct 01 06:30:56 2013, nicholas wrote:
    On Mon, Sep 30, 2013 at 08:43:04AM -0700, Jerry D. Hedden via RT
    wrote:
    On Sat Sep 14 16:09:05 2013, dom wrote:
    Aha. Thanks for the hint! I've updated the patch and the this now
    builds and tests okay against 5.14.2-21:

    http://anonscm.debian.org/gitweb/?
    p=perl/perl.git;a=shortlog;h=refs/heads/bug-
    718438

    And the original test case from the reporter on the Debian bug now
    succeeds.
    It appears that this bug can be resolved.
    Actually it can't yet, because the patch isn't yet in blead. No-one
    who looked at it to review it was confident enough of it to merge it
    immediately, and Dave raised a further question which I've not yet had time
    to think about.
    Hmm, I can't see the subsequent question from Dave on this ticket. I'd
    be interested in knowing about it even if there isn't an answer for it
    yet, since this patch is queued for a Debian point release update in a
    bid to fix the original regression. If there are sufficient remaining
    uncertainties about the patch, it may be better to revert it before the
    point release.

    Cheers,
    Dominic.

    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org:443/rt3/Ticket/Display.html?id=119089
  • Nicholas Clark at Oct 7, 2013 at 12:37 pm

    On Mon, Aug 12, 2013 at 04:41:14PM +0100, Dave Mitchell wrote:

    I had a quick play with trying to break blead in other ways; I though this
    should do it, but it doesn't, which confuses me:

    use threads ();
    use threads::shared;
    #use Devel::Peek;

    my $r;
    {
    my @a :shared;
    $r = \$a[0];
    #Dump $r;
    }
    $r = 1;

    So I feel less confident that I understand the issue fully.
    To crash things, you need $$r not $r.
    $r is a (Perl) reference to the (proxied) element, and it's the proxy that's
    the problematic thing.

    $r = 1; has the effect of dropping the reference to the proxied element.
    Right now proxied elements assume that the aggregate will outlive them, so
    they take no action on cleanup. So the sv_free() of the referent triggered
    by $r = 1; does pretty much nothing.

    Now, $$r = 1; is far more fun:

    $ cat ../test/119089-dm.pl
    use threads ();
    use threads::shared;

    my $r;
    {
         my @a :shared;
         $r = \$a[0];
    }
    $$r = 1;
    __END__

    $ ./perl -Ilib ../test/119089-dm.pl
    perl: shared.xs:969: sharedsv_elem_mg_STORE: Assertion `mg->mg_ptr != 0' failed.
    Aborted


    This is because that that point saggregate has been freed, SvTYPE(saggregate)
    is SV_TYPEMASK, an so it takes the "must be a hash" else block:

         if (SvTYPE(saggregate) == SVt_PVAV) {
             assert ( mg->mg_ptr == 0 );
             SHARED_CONTEXT;
             svp = av_fetch((AV*) saggregate, mg->mg_len, 1);
         } else {
             char *key = mg->mg_ptr;
             I32 len = mg->mg_len;
             assert ( mg->mg_ptr != 0 );

    Boom!


    Which gave me this idea:

    $ cat ../test/119089-dm.pl
    use threads ();
    use threads::shared;

    my $r;
    {
         my @a :shared;
         $r = \$a[0];
    }
    # $$r = 1;

    my $a;
    {
         my @a :shared;
         $a = \@a;
         $a->[0] = 2;
    }

    print "$a->[0]\n";
    $$r = 3;
    print "$a->[0]\n";
    __END__

    $ ./perl -Ilib ../test/119089-dm.pl
    2
    3


    Accessing a different SV via the stale proxy. :-)

    +int
    +sharedsv_elem_mg_free(pTHX_ SV *sv, MAGIC *mg)
    +{
    + dTHXc;
    + PERL_UNUSED_ARG(sv);
    + ENTER_LOCK;
    + if (mg->mg_obj) {
    + if (!PL_dirty) {
    + assert(SvROK(mg->mg_obj));
    + }
    + if (SvREFCNT(mg->mg_obj) == 1) {
    + /* If the element has the last pointer to the shared aggregate, then
    + it has to free the shared aggregate. mg->mg_obj itself is freed
    + by Perl_mg_free() */
    + S_sharedsv_dec(aTHX_ SHAREDSV_FROM_OBJ(mg->mg_obj));
    + }
    + }
    + LEAVE_LOCK;
    + return (0);
    +}
    I think the ENTER_LOCK and LEAVE_LOCK are superfluous in this function;
    its all done in private space apart from S_sharedsv_dec(), which does its
    own locking; and there's no savestack manipulation which would require the
    ENTER.
    Yes, I think you're right. A bit too much un-thought-through cargo-culting on
    my part. There's nothing that needs to be kept "in order" with shared space,
    so no need to lock it against races with other threads.

    I re-did the original commit with this simplification.
    sharedsv_array_mg_free(pTHX_ SV *sv, MAGIC *mg)
    {
    PERL_UNUSED_ARG(sv);
    - S_sharedsv_dec(aTHX_ (SV*)mg->mg_ptr);
    + if (!PL_dirty) {
    + assert(mg->mg_obj);
    + assert(SvROK(mg->mg_obj));
    + assert(SvUV(SvRV(mg->mg_obj)) == PTR2UV(mg->mg_ptr));
    + }
    + if (mg->mg_obj) {
    + if (SvREFCNT(mg->mg_obj) == 1) {
    + S_sharedsv_dec(aTHX_ (SV*)mg->mg_ptr);
    + } else {
    + /* An element of this aggregate still has PERL_MAGIC_tied(p)
    + pointing to this shared aggregate. It will take responsibility
    + for freeing the shared aggregate. Perl_mg_free() drops the
    + reference count on mg->mg_obj. */
    + }
    + }
    return (0);
    }
    IIUC, sharedsv_elem_mg_free() and sharedsv_array_mg_free() should be
    essentially the same code. They are both dealing with with an mg that has
    mg_obj => RV => MG whose IV => shared thing. I'm not sure whether the
    asserts in the two subs should be different (apart from the mg_ptr bit);
    in particular, whether sharedsv_elem_mg_free() should be skipping the
    assert(mg->mg_obj) in the non-dirty case.
    I think that you're right that they can be the same. I don't think that
    mg->mg_obj should ever be NULL in the non-dirty case so I kept that assert.

    In the branch the function now looks like this:

    int
    sharedsv_array_mg_free(pTHX_ SV *sv, MAGIC *mg)
    {
         PERL_UNUSED_ARG(sv);
         if (!PL_dirty) {
             assert(mg->mg_obj);
             assert(SvROK(mg->mg_obj));
             if (mg->mg_type == PERL_MAGIC_tied) {
                 /* Only aggregates have this alternative shortcut. */
                 assert(SvUV(SvRV(mg->mg_obj)) == PTR2UV(mg->mg_ptr));
             }
         }
         if (mg->mg_obj) {
             if (SvREFCNT(mg->mg_obj) == 1) {
                 /* There is only one proxy object per thread, stored in mg->mg_obj,
                    which ends up being referenced by both the aggregate and any
                    elements. Perl_mg_free() drops the reference count on
                    mg->mg_obj, but if this is the last reference (and the proxy is
                    about to be freed) then we need to manually drop the reference
                    on the original aggregate in shared space. */
                 S_sharedsv_dec(aTHX_ SHAREDSV_FROM_OBJ(mg->mg_obj));
             }
         }
         return (0);
    }

    I've re-pushed this as smoke-me/nicholas/RT119089-variant
    On Fri, Oct 04, 2013 at 06:10:27AM -0700, Dominic Hargreaves via RT wrote:

    Hmm, I can't see the subsequent question from Dave on this ticket. I'd
    be interested in knowing about it even if there isn't an answer for it
    yet, since this patch is queued for a Debian point release update in a
    bid to fix the original regression. If there are sufficient remaining
    uncertainties about the patch, it may be better to revert it before the
    point release.
    I think that the approach is still good. But the patch you've queued now
    isn't going to be the one that ends up in blead.


    Also, I've not re-tested my changes on older versions, so help there from
    someone (not necessarily Dominic) would be welcome.

    Nicholas Clark
  • Tony Cook via RT at Nov 15, 2013 at 12:14 am

    On Fri Oct 04 06:10:26 2013, dom wrote:
    Hmm, I can't see the subsequent question from Dave on this ticket. I'd
    be interested in knowing about it even if there isn't an answer for it
    yet, since this patch is queued for a Debian point release update in a
    bid to fix the original regression. If there are sufficient remaining
    uncertainties about the patch, it may be better to revert it before the
    point release.
    I think he meant:

    https://rt.perl.org/Ticket/Display.html?id=119089#txn-1243579

    I'm not sure about the correctness of the patch, but I think Dave's "why doesn't
    this break" code example isn't correct, it should be:

       use threads ();
       use threads::shared;
       #use Devel::Peek;
       my $r;
       {
           my @a :shared;
           $r = \$a[0];
           #Dump $r;
       }

       $$r = 1; # changed this line

    which does break.

    Tony

    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org/Ticket/Display.html?id=119089
  • Dave Mitchell at Nov 15, 2013 at 10:57 am

    On Thu, Nov 14, 2013 at 04:14:37PM -0800, Tony Cook via RT wrote:
    On Fri Oct 04 06:10:26 2013, dom wrote:
    Hmm, I can't see the subsequent question from Dave on this ticket. I'd
    be interested in knowing about it even if there isn't an answer for it
    yet, since this patch is queued for a Debian point release update in a
    bid to fix the original regression. If there are sufficient remaining
    uncertainties about the patch, it may be better to revert it before the
    point release.
    I think he meant:

    https://rt.perl.org/Ticket/Display.html?id=119089#txn-1243579
    There was a followup from Nicholas which doesn't seem to have made it into
    the RT ticket:

    Message-ID: <20131007123705.ge66035@plum.flirble.org>


    --
    Red sky at night - gerroff my land!
    Red sky at morning - gerroff my land!
         -- old farmers' sayings #14

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupperl5-porters @
categoriesperl
postedJul 31, '13 at 5:58a
activeNov 15, '13 at 10:57a
posts15
users4
websiteperl.org

People

Translate

site design / logo © 2021 Grokbase