  • Nicholas Clark at Jan 17, 2012 at 10:36 am

    On Tue, Jan 17, 2012 at 10:33:58AM +0000, Steffen Schwigon wrote:
    Hi!

    Not sure whether Disney patches are my only competency :-), however,
    before 5.16 the copyright should be bumped to 2012, like in the patch
    below.
    It's also needed in the README, and probably some other places.
    diff --git a/perl.c b/perl.c
    index c8e8bfb..0f67ff1 100644
    --- a/perl.c
    +++ b/perl.c
    @@ -3422,7 +3422,7 @@ S_minus_v(pTHX)
    #endif

    PerlIO_printf(PerlIO_stdout(),
    - "\n\nCopyright 1987-2011, Larry Wall\n");
    + "\n\nCopyright 1987-2012, Larry Wall\n");
    #ifdef MSDOS
    PerlIO_printf(PerlIO_stdout(),
    "\nMS-DOS port Copyright (c) 1989, 1990, Diomidis Spinellis\n");
    --
    Would you be able to have a dig through the source tree and find out
    if there are any other places where the copyright year needs updated?

    Nicholas Clark
  • Steffen Schwigon at Jan 17, 2012 at 2:35 pm

    Nicholas Clark writes:
    On Tue, Jan 17, 2012 at 10:33:58AM +0000, Steffen Schwigon wrote:
    Hi!

    Not sure whether Disney patches are my only competency :-), however,
    before 5.16 the copyright should be bumped to 2012, like in the patch
    below.
    It's also needed in the README, and probably some other places.
    Ack. I added README. Didn't find more such public places.
    This is in patch 1, attached.

    Would you be able to have a dig through the source tree and find out
    if there are any other places where the copyright year needs updated?
    A separate patch 2, attached, targets copyrights in files.

    It is the result of grep'ing for “Copyright … Larry Wall” and old dates
    and syncing those places with dates from actual substantial git commits,
    i.e., ignored adding editor meta variables. I also ignored subfolders
    cpan/ and dist/ and files only mentioning non-Larry authors.

    Before I prove that this is an exhaustive approach I want to be sure
    that's what you also want.

    I'm also not sure whether the changes in regen*.pl would require the
    generated files in the same changeset. I propose: No.

    Kind regards,
    Steffen
    --
    Steffen Schwigon <ss5@renormalist.net>
    Dresden Perl Mongers <http://dresden-pm.org/>
  • David Golden at Jan 17, 2012 at 4:05 pm

    On Tue, Jan 17, 2012 at 9:35 AM, Steffen Schwigon wrote:
    It is the result of grep'ing for “Copyright … Larry Wall” and old dates
    and syncing those places with dates from actual substantial git commits,
    i.e., ignored adding editor meta variables. I also ignored subfolders
    cpan/ and dist/ and files only mentioning non-Larry authors.
    FWIW, if you're up for a little extra work, it would be *awesome* if
    you were to automate the process of finding Copyright discrepancies
    into a Porting/checkCopyright.pl script and then worked up a
    t/porting/copyright.t test.

    Then next year, on Jan 1, test_porting will fail and it will be easier
    to fix since the hard work of finding the discrepancies will be done.

    -- David

    P.S. Extra credit for a flag to *fix* the errors as well as find them. :-)
  • Nicholas Clark at Jan 17, 2012 at 5:02 pm

    On Tue, Jan 17, 2012 at 11:05:25AM -0500, David Golden wrote:
    On Tue, Jan 17, 2012 at 9:35 AM, Steffen Schwigon wrote:

    It is the result of grep'ing for “Copyright … Larry Wall” and old dates
    and syncing those places with dates from actual substantial git commits,
    i.e., ignored adding editor meta variables. I also ignored subfolders
    cpan/ and dist/ and files only mentioning non-Larry authors.
    Did you automate this?

    In particular, I think that one can avoid editor meta-variables by ignoring
    changes to the last comment in the file, and copyright date changes by
    ignoring the first comment in the file. The structure of the C files is
    pretty predictable.
    FWIW, if you're up for a little extra work, it would be *awesome* if
    you were to automate the process of finding Copyright discrepancies
    into a Porting/checkCopyright.pl script and then worked up a
    t/porting/copyright.t test.
    If anyone (Steffen or otherwise) wants to do this, I think split it into
    two parts. Testing that the copyright year at the top of C files tallies,
    and testing that the copyright year as reported is sane.
    (perl.c and INSTALL, it seems)

    "Sane" I think is something like "the most recent tag is this year or last
    year".

    Put the task of updating that into the release manager's guide.

    (It can be automated too, but I think it fits best as a release step,
    to be done as part of the preparations to the first blead release of
    the year (eg this Friday's), and cherry-pickable to maint-*)

    Alternatively, "sane" might be "the same year as the last edit on
    pod/perldelta.pod", which again means that it doesn't start failing until
    someone changes something.
    Then next year, on Jan 1, test_porting will fail and it will be easier
    to fix since the hard work of finding the discrepancies will be done.
    I think that that would be extremely unwise. It creates an infinite
    maintenance burden into the future. Having had tests at ex-job fail as
    a result of the calendar moving was *extremely* frustrating. Admittedly,
    most were way more complex to diagnose than what's being suggested here,
    but the principle of the thing is pain - change nothing, and tests start
    failing.

    Testing the copyright years in the C code headers doesn't cause test
    failures until someone changes them, so that's safe to automate.
    P.S. Extra credit for a flag to *fix* the errors as well as find them. :-)
    Usually that's fairly easy once the detecting is done.

    Nicholas Clark
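The suggestion in the post above, ignoring the first comment (copyright dates) and the last comment (editor meta-variables) of a C file, could be sketched roughly as follows. This is only an illustration in Python, not the script discussed in this thread; the function names `strip_edge_comments` and `substantively_changed` are hypothetical, and the regular expression assumes that `/* ... */` markers do not occur inside string literals.

```python
import re

# Assumption: comment markers never appear inside string literals.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_edge_comments(source):
    """Remove the first and the last /* ... */ comment from C source text."""
    matches = list(COMMENT.finditer(source))
    if not matches:
        return source
    first, last = matches[0], matches[-1]
    if first is last:
        # Only one comment in the file: treat it as the copyright header.
        return source[:first.start()] + source[first.end():]
    return (source[:first.start()]
            + source[first.end():last.start()]
            + source[last.end():])


def substantively_changed(old, new):
    """True if two revisions differ outside their first and last comments."""
    return strip_edge_comments(old) != strip_edge_comments(new)
```

With this, a revision that only bumps the header year or tweaks the trailing editor block compares as unchanged, so it would not by itself call for a new copyright year.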
  • David Golden at Jan 17, 2012 at 5:11 pm

    On Tue, Jan 17, 2012 at 12:02 PM, Nicholas Clark wrote:
    I think that that would be extremely unwise. It creates an infinite
    maintenance burden into the future. Having had tests at ex-job fail as
    a result of the calendar moving was *extremely* frustrating. Admittedly,
    most were way more complex to diagnose than what's being suggested here,
    but the principle of the thing is pain - change nothing, and tests start
    failing.
    <cough>dzil for core</cough>

    :-)

    I never have to remember to update copyright anymore in my CPAN dists.

    -- David
  • Steffen Schwigon at Jan 17, 2012 at 5:20 pm

    Nicholas Clark writes:
    On Tue, Jan 17, 2012 at 11:05:25AM -0500, David Golden wrote:
    On Tue, Jan 17, 2012 at 9:35 AM, Steffen Schwigon wrote:

    It is the result of grep'ing for “Copyright … Larry Wall” and
    old dates and syncing those places with dates from actual
    substantial git commits, i.e., ignored adding editor meta
    variables. I also ignored subfolders cpan/ and dist/ and files only
    mentioning non-Larry authors.
    Did you automate this?
    Not completely. A manual series of ad-hoc lines of the style

    for i in $(grep -l foo $(grep -C… bar)) ; do git log --format='…', etc.

    but no rocket science.

    FWIW, if you're up for a little extra work, it would be *awesome* if
    you were to automate the process of finding Copyright discrepancies
    into a Porting/checkCopyright.pl script and then worked up a
    t/porting/copyright.t test.
    The technical part is not that hard.

    I'm mostly unsure about which files to look at or skip.

    Is it really simply ignore dist/, cpan/, and files with “DO NOT EDIT”
    but take everything else that already has “Copyright” in it? I'm
    thankful for all other hints/heuristics about that…

    Kind regards,
    Steffen
    --
    Steffen Schwigon <ss5@renormalist.net>
    Dresden Perl Mongers <http://dresden-pm.org/>
  • David Golden at Jan 17, 2012 at 5:31 pm

    On Tue, Jan 17, 2012 at 12:20 PM, Steffen Schwigon wrote:
    Is it really simply ignore dist/, cpan/, and files with “DO NOT EDIT”
    but take everything else that already has “Copyright” in it? I'm
    thankful for all other hints/heuristics about that…
    I would ignore cpan/

    I'm on the fence about dist/. Technically, those are part of core.
    It's like the version check -- if the file is modified, the copyright
    should be, too.

    -- David
  • Nicholas Clark at Jan 18, 2012 at 12:30 pm

    On Tue, Jan 17, 2012 at 12:31:25PM -0500, David Golden wrote:
    On Tue, Jan 17, 2012 at 12:20 PM, Steffen Schwigon wrote:
    Is it really simply ignore dist/, cpan/, and files with 'DO NOT EDIT'
    but take everything else that already has 'Copyright' in it? I'm
    thankful for all other hints/heuristics about that...
    I would ignore cpan/

    I'm on the fence about dist/. Technically, those are part of core.
    It's like the version check -- if the file is modified, the copyright
    should be, too.
    I can't be sure either - like most of these things, one discovers new stuff
    as one tries to implement it. But:

    1: Remembering that the perfect is the enemy of the good, and we don't have
    anything currently, a partial "solution" is better than what we have.
    So even just README and the part of perl.c responsible for the
    perl -v output, and that copyright block at the start of the top-level
    *.c and *.h files would be a good start. Get that working.

    2: Files in cpan/ aren't "owned" by the core, so can't be updated to fix
    tests, so it's not fair game to test them. Everything else is (pretty
    much), but that doesn't mean that it's sensible. As an obvious (to me)
    counter-example, regen/reentr.pl and regen/regen_lib.pl between them contain the
    copyright text and dates that end up in reentr.c, reentr.h etc, but those
    dates don't refer to modifications made to the scripts. So not all
    /Copyright/i text can be tested.

    3: Any file not in MANIFEST is generated (somehow) during the build process,
    so it's not sane to edit them. It may not be sensible to test them
    (directly) - I don't know. Particularly if the "test" is also intended to
    be able to edit the files to update them.
    (for an example of this see Porting/checkcfgvar.pl, which has a --regen
    option, and is called as a test by t/porting/checkcfgvar.t)

    I'd suggest seeing if you can get something working for the top level C
    files, and worry about everything else later.

    Nicholas Clark
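The file-selection rules in points 2 and 3 above (skip cpan/, and only consider files listed in MANIFEST, since anything else is generated) might be sketched like this. It is a rough illustration, not part of any patch in this thread: the name `checkable_files` is hypothetical, and it assumes each MANIFEST line starts with a path, optionally followed by whitespace and a description.

```python
def checkable_files(manifest_text, skip_prefixes=("cpan/",)):
    """Return MANIFEST paths that a copyright check could reasonably cover."""
    files = []
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: the path is the first whitespace-separated field.
        path = line.split(None, 1)[0]
        if path.startswith(tuple(skip_prefixes)):
            continue
        files.append(path)
    return files
```

Whether dist/ belongs in `skip_prefixes` is left open here, as it is in the discussion; passing `skip_prefixes=("cpan/", "dist/")` would exclude both.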
  • Nicholas Clark at Jan 19, 2012 at 9:16 pm

    On Thu, Jan 19, 2012 at 01:51:34PM +0000, Nicholas Clark wrote:
    On Tue, Jan 17, 2012 at 02:35:25PM +0000, Steffen Schwigon wrote:
    Nicholas Clark <nick@ccl4.org> writes:
    On Tue, Jan 17, 2012 at 10:33:58AM +0000, Steffen Schwigon wrote:
    Hi!

    Not sure whether Disney patches are my only competency :-), however,
    before 5.16 the copyright should be bumped to 2012, like in the patch
    below.
    It's also needed in the README, and probably some other places.
    Ack. I added README. Didn't find more such public places.
    This is in patch 1, attached.
    I've applied this as fdfb76e4ec2559ba. Thanks.
    Would you be able to have a dig through the source tree and find out
    if there are any other places where the copyright year needs updated?
    A separate patch 2, attached, targets copyrights in files.
    This causes 1 test to fail (t/porting/regen.t)

    perly.c etc no longer reflect the perly.y they were generated from, because
    perly.y has changed. There's a hash embedded to allow this to be tested
    without needing bison. A side effect of this is that one needs bison to
    fix this, and I was digging for where I have already got a copy of the
    same version of bison installed as last time, so that I don't generate
    spurious diffs. Found *that*, but there's something else relevant to fix
    first, so I'm working my way back up the stack of yaks before I fix this.
    Thanks, applied

    commit 2eee27d7b177d9896e448afbc693e62df0094ca3
    Author: Steffen Schwigon <ss5@renormalist.net>
    Date: Tue Jan 17 14:17:13 2012 +0000

    Bump several file copyright dates

    Sync copyright dates with actual changes according to git history.

    [Plus run regen_perly.h to update the SHA-256 checksums, and
    regen/regcharclass.pl to update regcharclass.h]
    On Wed, Jan 18, 2012 at 12:30:29PM +0000, Nicholas Clark wrote:

    I can't be sure either - like most of these things, one discovers new stuff
    as one tries to implement it. But:

    1: Remembering that the perfect is the enemy of the good, and we don't have
    anything currently, a partial "solution" is better than what we have.
    So even just README and the part of perl.c responsible for the
    perl -v output, and that copyright block at the start of the top-level
    *.c and *.h files would be a good start. Get that working.
    3: Any file not in MANIFEST is generated (somehow) during the build process,
    so it's not sane to edit them. It may not be sensible to test them
    (directly) - I don't know. Particularly if the "test" is also intended to
    be able to edit the files to update them.
    (for an example of this see Porting/checkcfgvar.pl, which has a --regen
    option, and is called as a test by t/porting/checkcfgvar.t)

    I'd suggest seeing if you can get something working for the top level C
    files, and worry about everything else later.
    I don't know if you've started on a different plan, but I had a bit more
    inspiration on what I'd do.

    Knowing that

    1: git is fast, but not magic, hence asking it for revision history all the
    way back to the first commit is still going to be slow
    2: things are actually correct currently
    3: tests pass when releases are tagged

    I think start by working on the assumption that we only need to check for
    changes since the last release. This is what Porting/cmpVERSION.pl does.
    To cater for special cases such as someone changing the editor blocks
    [again :-)] or other stuff we don't really think deserves a copyright year
    bump, I think it would work to have the option of using a more recent
    commit hash as "start here, known good".

    Porting/cmpVERSION.pl uses this to find the most recent tag:

    git describe --abbrev=0

    So, if for example the hard coded known good commit is
    5637ef5b34a3e8caf72080387a15ea8d81b61baf

    I'd use the output of (something like)

    git log blead ^v5.15.6 ^5637ef5b34a3e8caf72080387a15ea8d81b61baf *.c *.h

    to find changes, where (obviously) v5.15.6 came from the previous command.
    This stops at ^5637ef5b34a3e8caf72080387a15ea8d81b61baf, as it's more recent
    than ^v5.15.6. This approach still works without having to edit anything
    when BinGOs tags v5.15.7, as the output of

    git log blead ^v5.15.7 ^5637ef5b34a3e8caf72080387a15ea8d81b61baf *.c *.h

    will stop at v5.15.7 (as it will be more recent than that commit).

    And then for each file, find the list of changes, and extract the year(s),
    and then check that those years are listed in the copyright. Not worrying
    about whether the previous years listed are correct, because you've already
    done that manually.

    Nicholas Clark
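The last step of the plan above (for each file, take the years of its changes since the known-good point and check that they appear in the copyright header) could look roughly like the sketch below. It is an illustration only. It assumes the header has already been isolated from the file, and that the commit dates come from `git log` run with `--format=%ai` (author date, ISO-like, one per line); the function names are hypothetical.

```python
import re

# Matches a single year ("2011") or a range ("1987-2011").
YEAR_OR_RANGE = re.compile(r"\b((?:19|20)\d{2})(?:\s*-\s*((?:19|20)\d{2}))?\b")


def listed_years(header):
    """Return the set of years named in an already-isolated copyright header."""
    years = set()
    for match in YEAR_OR_RANGE.finditer(header):
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        years.update(range(min(first, last), max(first, last) + 1))
    return years


def commit_years(git_log_output):
    """Return the years in `git log --format=%ai` output, one date per line."""
    return {int(line.strip()[:4])
            for line in git_log_output.splitlines() if line.strip()}


def missing_years(header, git_log_output):
    """Years in which the file changed but which the header does not list."""
    return sorted(commit_years(git_log_output) - listed_years(header))
```

An empty result means the header already covers every year in which the file changed since the starting point; earlier years are deliberately not re-validated, matching the "things are actually correct currently" assumption.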
  • Steffen Schwigon at Jan 22, 2012 at 9:58 am

    Nicholas Clark writes:
    I don't know if you've started on a different plan, but I had a bit
    more inspiration on what I'd do.
    Just as a live signal: I already started and have a working Perl script
    which does copyright grepping and already some git digging. It automates
    a bit more than what I did manually so far and even covers some
    hand-polished funny but easy corner cases. I will try to incorporate
    other inspiration from your emails.

    My main approach currently is digging through everything that
    *already has* a “Copyright” in it somewhere. From this assumption
    it is no rocket science so far but careful manual validation.

    As usual I'm working in the “blanking interval” of my real life, but
    will post something working soon for feedback.

    Kind regards,
    Steffen
    --
    Steffen Schwigon <ss5@renormalist.net>
    Dresden Perl Mongers <http://dresden-pm.org/>
  • Steffen Schwigon at Jan 31, 2012 at 2:01 am

    Steffen Schwigon writes:
    Nicholas Clark <nick@ccl4.org> writes:
    I don't know if you've started on a different plan
    I already started and have a working Perl script which does copyright
    grepping and already some git digging.
    Nobody should ever do this again. It was an ecstatic lesson. :-)
    And yes, I saw others already did it, when I reviewed the history...

    In case you apply my patches, please carefully follow these single
    steps, to let my script know and ignore its own resulting copyright
    bump:

    1. apply the actual copyright changes patch first
    (0001-Copyright-overhaul.patch)

    2. optionally regenerate files and "--amend" them into the same commit

    3. remember the resulting commitid for step 5.

    4. now apply the utility script Porting/check_copyright.pl
    (0001-Script-to-check-Copyright-information-in-git.patch)

    5. replace the AAAAAAA in the check_copyright.pl script with the
    real commitid from 3. and "--amend" this change

    Result: two commits; one for the updated years, one for the script.

    If you followed the steps (especially 3. and 5.) above you should be
    able to run

    prove -v Porting/check_copyright.pl

    from the git repository and it should succeed.


    ***


    The rest of this email is just background trivia.

    The utility does:

    - find “Copyright” and according context lines in files
      * ignore files without existing Copyright lines
      * ignore cpan/
      * ignore ext/

    - from those files/lines extract greatest year

    - compare this to last git change

    - if commits happened after latest copyright year:
      * list them (max 5 per year) for review

    - skip exceptions, like special files and commits to ignore
      (e.g., other copyright bumps in the past)

    - it additionally runs dedicated check against perl.c and README to
      match the current year

    It's slightly more tricky in the details.

    Ideally that script finds future Copyright update needs and does all the
    necessary git digging. I suggest maintaining it for upcoming exceptions
    like I already did with the file and commit skip lists.


    Kind regards,
    Steffen
    --
    Steffen Schwigon <ss5@renormalist.net>
    Dresden Perl Mongers <http://dresden-pm.org/>
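The core of the utility described in the post above (take the greatest year near a “Copyright” line, then list later commits for review, at most five per year, minus a skip list) might be sketched as follows. This is not Steffen's script, which is not shown in the thread; the function names, the `context` parameter, and the `(commit_id, year)` input shape are all assumptions made for illustration.

```python
import re
from collections import defaultdict

YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def greatest_copyright_year(text, context=1):
    """Greatest year on any 'Copyright' line or the `context` lines after it."""
    lines = text.splitlines()
    years = []
    for i, line in enumerate(lines):
        if "copyright" in line.lower():
            for ctx in lines[i:i + 1 + context]:
                years += [int(y) for y in YEAR.findall(ctx)]
    return max(years) if years else None


def commits_to_review(commits, latest_year, skip=(), per_year=5):
    """Group commits newer than latest_year by year, capped at per_year each.

    commits: iterable of (commit_id, year) pairs (hypothetical input shape).
    skip: commit ids to ignore, e.g. earlier copyright bumps.
    """
    by_year = defaultdict(list)
    for commit_id, year in commits:
        if commit_id in skip or year <= latest_year:
            continue
        if len(by_year[year]) < per_year:
            by_year[year].append(commit_id)
    return dict(by_year)
```

A file for which `commits_to_review` returns an empty dict needs no bump; anything else is a candidate for the manual review the post recommends.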
  • Abhijit Menon-Sen at Jan 31, 2012 at 2:49 am
    (Sorry to jump into the middle of this thread, but…)
    At 2012-01-31 02:01:32 +0000, ss5@renormalist.net wrote:

    Nobody should ever do this again.
    I couldn't agree more. It's a complete waste of time. I've spoken to
    more than one lawyer about this, and there's no reason to keep adding
    years to copyright notices other than cargo culting.

    *Maybe* one could make a case for changing (not adding) the year for a
    major release, but doing it automatically at the beginning of every year
    is just meaningless.

    -- ams
  • Steffen Schwigon at Jan 31, 2012 at 3:07 pm

    Abhijit Menon-Sen writes:
    *Maybe* one could make a case for changing (not adding) the year for a
    major release, but doing it automatically at the beginning of every
    year is just meaningless.
    It's not an undiscriminating change at the beginning of a year on
    everything -- it's driven by when actual changes happened in the
    actually changed files. I would have expected that this would declare
    the copyright of those changes that happened. Don't you think so?

    Kind regards,
    Steffen
    --
    Steffen Schwigon <ss5@renormalist.net>
    Dresden Perl Mongers <http://dresden-pm.org/>
  • Abhijit Menon-Sen at Jan 31, 2012 at 3:25 pm

    At 2012-01-31 15:07:30 +0000, ss5@renormalist.net wrote:
    I would have expected that this would declare the copyright of those
    changes that happened. Don't you think so?
    No. It just doesn't mean anything to have multiple years in the notice.
    The copyright notice itself doesn't mean enough to warrant such careful
    maintenance. Yes, I know everyone does it, but it's a waste of time.

    What is the intended effect, anyway? To increase the period for which
    the copyright protection persists, from 70 years past Larry's lifetime
    to 90 or 100 years past it? By the time it really becomes a problem,
    Disney will probably have solved the problem more effectively.

    -- ams
  • Steffen Schwigon at Mar 1, 2012 at 7:41 pm
  • Steffen Schwigon at Jul 3, 2012 at 8:39 am
  • Steffen Schwigon at Nov 2, 2012 at 1:28 pm
    Once upon a time:

    Steffen Schwigon <ss5@renormalist.net> writes:
    Hello Perl 5 porters!

    Steffen Schwigon <ss5@renormalist.net> writes:
    Steffen Schwigon <ss5@renormalist.net> writes:
    Nicholas Clark <nick@ccl4.org> writes:
    I don't know if you've started on a different plan
    I already started and have a working Perl script which does copyright
    grepping and already some git digging.
    Shibboleet!
    Here is my latest version. I adapted the script and copyright changes to the latest
    blead as of a few minutes ago, which again gave me great confidence in the
    script's usefulness for the annoying Copyright checking task - the pain
    that can be seen in git history over the years really turns into a
    maintainable task now. Please apply.

    - The first patch is the copyright checking script itself, which outputs
    TAP so can be run like this:

    prove -v Porting/checkcopyright.pl

    - and the second patch is the actual Copyright fix overhaul, based on
    above script's diagnostics but carefully manually reviewed and
    hand-crafted.

    Generated files are not part of the overhaul patch but the checking
    script knows about those files and reports them as '# TODO'.


    You can also review it on github:

    https://github.com/renormalist/perl/tree/copyright
    https://github.com/renormalist/perl/compare/blead...renormalist:copyright
    I have some time over the next days and weeks and will use it to resume stalled tasks.

    Here I kindly ask:

    Is there still interest in a "final solution" to the copyright
    checking problem?

    If yes, I would happily rebase my submission, run the script again and
    apply more necessary copyright updates, again. But ideally not with the
    result ignored again, because it's a tedious task.

    I still think that at least the *script* "Porting/checkcopyright.pl"
    should go in to help the next guy who tackles that problem...

    Kind regards,
    Steffen
    --
    Steffen Schwigon <ss5@renormalist.net>
    Perl benchmarks <http://perlformance.net>
    Dresden Perl Mongers <http://dresden-pm.org/>
  • Nicholas Clark at Jan 19, 2012 at 1:51 pm

    On Tue, Jan 17, 2012 at 02:35:25PM +0000, Steffen Schwigon wrote:
    Nicholas Clark <nick@ccl4.org> writes:
    On Tue, Jan 17, 2012 at 10:33:58AM +0000, Steffen Schwigon wrote:
    Hi!

    Not sure whether Disney patches are my only competency :-), however,
    before 5.16 the copyright should be bumped to 2012, like in the patch
    below.
    It's also needed in the README, and probably some other places.
    Ack. I added README. Didn't find more such public places.
    This is in patch 1, attached.
    I've applied this as fdfb76e4ec2559ba. Thanks.
    Would you be able to have a dig through the source tree and find out
    if there are any other places where the copyright year needs updated?
    A separate patch 2, attached, targets copyrights in files.
    This causes 1 test to fail (t/porting/regen.t)

    perly.c etc no longer reflect the perly.y they were generated from, because
    perly.y has changed. There's a hash embedded to allow this to be tested
    without needing bison. A side effect of this is that one needs bison to
    fix this, and I was digging for where I have already got a copy of the
    same version of bison installed as last time, so that I don't generate
    spurious diffs. Found *that*, but there's something else relevant to fix
    first, so I'm working my way back up the stack of yaks before I fix this.

    Nicholas Clark

Discussion Overview
group: perl5-porters
categories: perl
posted: Jan 17, '12 at 10:34a
active: Nov 2, '12 at 1:28p
posts: 19
users: 4
website: perl.org
