User Information
| Display Name: | Sam Vilain |
|---|
| Partial Email Address: | s...@vilain.net |
| Posts: |
| 1) Sam Vilain Perl.git "So, Tomorrow?" Release |
It gives me great pleasure to release another, yet-more-improved version of the Perl history.
I triaged and dug to the bottom of each of the faults that Nicholas raised last month. Using his QA approach as a guideline, I was able to use the perforce exporter I wrote for this effort repeatedly until I ended up with output which is largely identical to the perforce source.
After a brief discussion on #p5p, it was decided that, rather than messing around with history patches that would need to be grafted on by those who wanted the cleaned history for bisecting etc., it would be easier to just force everyone to re-clone and have one good history.
The new version is available at:
git://utsl.gen.nz/perl-sotomorrow
It's now about a 75MB download (37k revisions), much reduced from previous downloads. Without investigating, I'd say that boils down to improvements to the git deltification and object candidate selection code, resulting in more, longer string matches. Or, maybe I just left a big chunk of important stuff out, who knows ;-).
I'll keep it a bit more up to date than the other one has been historically - assuming my upstream also updates beyond 13th November soon.
And for those interested, here is a blow-by-blow account of each of the issues raised, and what happened with them:
* Change 42 (maint-5.004/perl):
- Change 42 was the commit that introduced perl-5.004_01 to Perforce. Its parent was Change 32. However, there was a fine grained changelog for this, so it was expanded to approximately 47 commits using the "Timinator" script.
=> There was a tag in the 'soyesterday' release of the Perl history so that 'git log --all --grep=@42\\\>' will show up a revision with @42 on it.
* Change 82 (maint-5.004/perl):
- Change 82 was where the 5.004_03 revision was checked into Perforce. The Timinator extracted 29 commits from that change; it was recorded as a cross-merge from the surrogate Change 81 and the actual commit tagged with perl-5.004_03.
=> Actually the error comes from change 32, which because I was kind of manually doing those maint-5.004 revisions using the timinator, didn't get the attention required.
So, when I rewound the exporter to that change and re-exported it, it comes up clean:
    maia:~/src/pumpkin/perl$ git-p4raw unexport-commits 32
    git-p4raw: deleting commit records for 34714 changes
    maia:~/src/pumpkin/perl$ rm -r .git/refs/heads/p4/
    maia:~/src/pumpkin/perl$ git-p4raw show-branches 31 | \
        sed 's!//depot!refs/heads/p4!' | \
        while read ref x commit; do git-update-ref $ref $commit; done
    maia:~/src/pumpkin/perl$ git-p4raw export-commits -n 1
    git-p4raw: gathering export plan
    git-p4raw: exporting commits between 32 and 32
    100% [=================================================================================]D 0h00m00s
    git-p4raw: Now checkpointing.
    git-p4raw: waited 1s for p4raw.29232.marks to be created
    maia:~/src/pumpkin/perl$ git checkout p4/maint-5.004/perl
    Switched to branch "p4/maint-5.004/perl"
    maia:~/src/pumpkin/perl$ diff -rpu . ../p4perl/maint-5.004/perl
    Only in .: .git
    Only in .: p4raw.29232.marks
    maia:~/src/pumpkin/perl$
* Change 84 (blead):
- Change 84 was a cross-merge from 5.004_01 (Change 42) and blead (Change 78)
=> diffs seemed to come out in the wash.
* Change 157 (oneperl):
- See the note on change 162. This manual fixup was possibly botched previously; it's clean now, though.
* Change 192 (oneperl):
=> related to other oneperl issues
* Change 562 (blead):
- Changes 562 through 564 were quite special, a whole set of partial merges of the p4/perlext/Compiler tree, with re-organisation. In the history, p4/perlext/Compiler appears as a new root and is merged in at change 563.
Indeed the export of Change 562 was completely screwed.
=> rotated change several turns anti-clockwise, now completely un-screwed
* Change 973 (win32/perl):
- Another "nuke everything" commit. There was a lot of manual work around that area, and the old buggy importer probably didn't help matters.
=> all the non-'fixed' history has been checked, double checked and re-written.
Change 986 is the end of the "manual" history - I didn't bother running the timinator over later changes on the maint-5.004 branch. That history - particularly commit cb99a88 - has been left untouched, as the only problems are a few e-mail addresses and some of the 'p4raw-link' links will refer to unknown revisions - and fixing that up would be a bit of a headache right now.
* Change 2041 (perlext/jpl):
- Fixed symlink handling, this is the first change with a symlink. Affects many revisions.
* Change 4249 (vmsperl):
- Export of this revision only affected one file.
missing: configure.com vms/vms.c
also missing (may be related to Change 82): ext/Thread/join.t ext/Thread/specific.t lib/warning.pm op/delete.t op/flip.t op/push.t op/wantarray.t warning.h x2p/str.c
=> another one fixed in the wash.
* Change 8520:
- This included a file with keyword expansion. In theory this work directory corruption can be achieved via gitattributes; but, I didn't bother; those features never really work ;-).
=> WONTFIX
* Change 11243:
- Change 11243 contained files which had a file type of XXX - "mac". Sorry, I didn't do any .gitattributes conversion for these either. It would probably need more infrastructure - currently the 'smudge' and 'clean' filters are just that - stream filters - and there is no infrastructure for unpacking single repository files to multiple working copy files (or vice versa, a possibly useful feature for storing collections of, e.g., tar archives).
=> WONTFIX
* Change 16123:
- This is a pretty kooky change. It created a new branch, and pulled in a tree from another branch. It was represented in Git as a cross-merge between the two sources of changes.
However, this strange mish-mash of a situation represents a novel condition to the importer; it needed to spot that the path doesn't match the old path and issue a "deleteall" git-fast-import instruction to clear the branch.
=> implemented, and works.
* Change 29366:
- This change added a file with a literal "#" in the filename. This is indeed a deficiency in my perforce converter; that's the first file named like that I found. Fixed the conversion and now the short-lived file appears and disappears correctly named.
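For readers curious about the Change 16123 fix above, here is a minimal sketch of the idea, not the actual git-p4raw code. It renders one commit for a git-fast-import stream and emits "deleteall" before the file modifications when the depot path feeding the branch differs from the one used previously. The function name, its parameters and the depot paths in the usage note are hypothetical; only the stream commands (commit, committer, data, from, deleteall, M) come from git-fast-import itself.

```python
from typing import List, Optional, Tuple

# (mode, blob reference, path), e.g. ("100644", ":2", "README")
FileEntry = Tuple[str, str, str]


def render_commit(ref: str,
                  committer: str,
                  message: str,
                  parent: Optional[str],
                  old_source: Optional[str],
                  new_source: str,
                  files: List[FileEntry]) -> str:
    """Render one commit as git-fast-import stream text.

    old_source / new_source are hypothetical names for the depot path
    the branch was populated from before and in this change.  When they
    differ, the branch is cleared with 'deleteall' so that files from
    the old tree do not linger.  Paths are assumed to be plain (no
    quoting is attempted in this sketch).
    """
    size = len(message.encode("utf-8"))  # 'data' takes a byte count
    lines = [
        f"commit {ref}",
        f"committer {committer}",
        f"data {size}",
        message,
    ]
    if parent is not None:
        lines.append(f"from {parent}")
    if old_source is not None and old_source != new_source:
        # The tree now comes from somewhere else: start from empty.
        lines.append("deleteall")
    for mode, blob, path in files:
        lines.append(f"M {mode} {blob} {path}")
    return "\n".join(lines) + "\n"
```

Calling it with old_source="//depot/a" and new_source="//depot/b" yields a "deleteall" line between the "from" line and the first "M" line; with identical sources no "deleteall" is emitted.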
Other things not fixed and general errata:
------------------------------------------
- The maint-5.004 track has some dead 'p4raw-link' comments; I didn't change these as mentioned earlier. Complicated rewrites to fix them up could have made matters worse.
- The maint-5.004 branch and one other branch have some files which are execute enabled in Perforce but not in the git version; this was a result of the manual conversion work for the earlier history.
- One of the manual corrections I made around change 136 - 151 was missed, so the history for those changes on the affected branch (probably oneperl) may be a little screwy.
- The amend note tags present in the perl-soyesterday repository have not been migrated to the final repository; I'll leave them in perl-soyesterday.
Enjoy! Sam.
| 2) Sam Vilain Re: git conversion discrepency |
On Wed, 2008-10-08 at 09:10 +0200, Rafael Garcia-Suarez wrote:
> 2008/10/7 Nicholas Clark <nick@ccl4.org>:
> > Turns out to be exactly that one revision in perforce. The next commit was to
> > remove Module::Pluggable, and when it was re-added the file starting with #
> > wasn't present. (I think that it's generated by the Makefile.PL)
> >
> > I suspect that even if it is a limitation of git (and a strange one) we can
> > live with it.
>
> apparently git handles # in names quite well -- that might be a
> limitation of the p4 importer.
Being an emacs user I tend to have a whole bunch of exclude rules for filenames like that; perhaps git-fast-import honours those default excludes. I'll check it out.
Sam.
| 3) Sam Vilain Re: git conversion discrepency |
On Mon, 2008-10-06 at 17:25 +0100, Nicholas Clark wrote:
> It seems that all this is caused by exactly two errors. I hope that they are
> easy to identify and fix. I'm afraid that I have no idea how to help do this.
Blech. Good detective work. The earlier change, if I read the report correctly, may have been one of those where I took a single change which was more like a tarball release, and unrolled it into a series of patches.
The later one is almost certainly an error. I'll look into it and release a history patch which can be attached using grafts for those that need to work with it. For those working with history after that, it shouldn't matter.
Cheers, Sam.
| 4) Sam Vilain Re: This Week on perl5-porters - 9-15 March 2008 |
David Landgren wrote:
> Nicholas also wanted to know what support git provided to answer
> questions such as "which changes from this branch have been integrated
> into that branch". Rafael seemed to think it should be possible, but
> no people with strong git-fu responded.
Sorry about that. I'm quite backlogged. Detecting cherry-picks is documented on the git rev-list man page; it's something like this:
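    git rev-list --left-right --cherry-pick maint-5.10...blead
That would show changes on blead which are not on maint-5.10, marked with '>' (and, marked with '<', changes on maint-5.10 which are not on blead). Note the three-dot range: --cherry-pick needs the symmetric difference in order to compare the two sides. If the change was altered along the way, it is considered a new change and will not be omitted by this command.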
It's worth noting that Perforce had no concept of cherry-picked changes either, and instead people wrote the scripts to work with the facilities it does provide. I made sure during the conversion to copy across the relevant information into breadcrumbs in the revision history, to allow those particular tools to be ported.
If you want to do it based on some other token, well git has no built-in support for that, but I have implemented this before based on a simple comparison of the subject line, and in practice - assuming all commits are well titled, and that you merge without squashing commits - this solves the actual problem well.
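To make the subject-line approach concrete, here is a rough sketch, not the implementation mentioned above. It assumes each commit is available as an (id, subject) pair and treats two commits as the same change when their subjects match after collapsing whitespace; that normalisation, and the names used, are illustrative assumptions.

```python
from typing import List, Tuple

# (abbreviated commit id, subject line)
Commit = Tuple[str, str]


def normalise(subject: str) -> str:
    """Collapse runs of whitespace so trivial reformatting still matches."""
    return " ".join(subject.split())


def not_yet_integrated(source: List[Commit],
                       target: List[Commit]) -> List[Commit]:
    """Return commits from `source` whose subject does not appear in `target`.

    Order of `source` is preserved.  Commits with identical subjects are
    treated as the same change, so this only works if commits are well
    titled and merges are not squashed.
    """
    seen = {normalise(subject) for _, subject in target}
    return [(cid, subject) for cid, subject in source
            if normalise(subject) not in seen]
```

The two input lists could be produced by parsing the output of something like git log --format='%h %s' for each branch; the result lists what is on the first branch but has no same-titled commit on the second.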
> Elsewhere, there was some idle chatter of converting everything to
> UTF-8, but no resolution.
On the perl foundation wiki there is a page on this. Note that this applies to author names only, not to anything in the content of the repository. Since that conversation, I have had time to triage and plan the fixes, and I hope that I will find time to make a new release of the history with these issues fixed. I will announce when this is the case!
Thanks for your summaries, Sam.
| 5) Sam Vilain Re: Getting "SO Yesterday" blead via git |
Alexey Tourbin wrote:
> On Tue, Mar 04, 2008 at 11:47:30AM +1300, Sam Vilain wrote:
>> Full clone (112MB download):
>> git clone git://utsl.gen.nz/perl
>
> Did you repack that with something like
> git-repack -a -d -f --window=100 --depth=500
Pretty much, yes. Not quite such a large depth - more like 100. Sam.