git@vger.kernel.org mailing list mirror (one of many)
* RFC: grafts generalised
@ 2008-07-02 14:35 Stephen R. van den Berg
  2008-07-02 16:35 ` Jakub Narebski
                   ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-02 14:35 UTC
  To: git

I'm in the process of converting and stitching and patching vast amounts
of initially disjunct CVS and SVN repositories into larger complete
histories inside a single git repository.  Recreating history as
accurately as possible.

The problem I encounter is that any number of times I have to "edit"
history in a non-parameterable fashion, in any of the following ways:
- Change parents.
- Add merges.
- Change author, committer, commitdate, authordate.
- Change the tree (because of conversion errors in the automated
  conversion process) belonging to a single commit.
- Retrofit a patch which has to ripple through all of history until
  the present.

The only things which are easily done at the moment are:
Change parents and add merges.  This can be accomplished fairly easily
using the grafts file.
The other changes are messy at best and need to be parameterised into the
form of a shell script so that git filter-branch can have a go at it.
This parameterisation is doable for author/committer/dates in most cases
(but not pretty), but is rather (too) convoluted for ripple-through
patches.
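
For reference, a minimal sketch of the existing grafts mechanism mentioned
above, assuming a Git version that still honours .git/info/grafts (newer
releases deprecate it and print a hint).  Each line names a commit followed
by the parents it should appear to have.  The throwaway repository, branch
name and identities are made up for illustration.

```shell
#!/bin/sh
set -e
# Throwaway repository; names and identities below are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

# Two unrelated root commits standing in for two converted repositories.
git commit -q --allow-empty -m "old history"
old=$(git rev-parse HEAD)
git checkout -q --orphan new
git commit -q --allow-empty -m "new history"
new=$(git rev-parse HEAD)

# Graft line format: <commit> <parent> [<parent> ...]
mkdir -p .git/info
echo "$new $old" > .git/info/grafts

# Where grafts are honoured, walking from $new should now continue into $old.
git log --oneline "$new"
```

Adding a second ID after the first on the same line would turn the commit
into an apparent merge.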

You have to imagine that the whole tree has lots of interconnects
already (merges), and changing the tree at a point in history which has
to ripple through is a mess, because all references and interconnects
need to be rewritten as well.

I propose the following:
- Extend git fsck to do more sanity checks on the content of the grafts
  file (to make it more difficult to shoot yourself in the foot with
  that file; my feet will be grateful).
- Extend the grafts file format to support something like the following syntax:

commit eb03813cdb999f25628784bb4f07b3f4c8bfe3f6
Parent: 7bc72e647d54c2f713160b22e2e08c39d86c7c28
Merge: 3b3da24960a82a479b9ad64affab50226df02abe 13b8f53e8ccec3b08eeb6515e6a10a2a
Merge: ac719ed37270558f21d89676fce97eab4469b0f1
Tree: 32fc99814b97322174dbe97ec320cf32314959e2
Author: Foo Bar (FooBar) <foo@bar>
AuthorDate: Sat Jun 6 13:50:44 1998 +0000
Commit: Foo Bar (FooBar) <foo@bar>
CommitDate: Sat Jun 7 13:50:44 1998 +0000
Logmessage: First line of logmessage override
Logmessage: Second line of logmessage override
Logmessage: Etc.
 
- Fields that are not specified leave the corresponding parts of the
  commit unaltered.
  E.g.

commit eb03813cdb999f25628784bb4f07b3f4c8bfe3f6
Parent: 7bc72e647d54c2f713160b22e2e08c39d86c7c28

  Would alter just the parent, nothing else.
- Keep backward compatibility with the old format.

Obviously, the use case for this is to change the tree as needed, then
run git filter-branch to actually get things in permanently, after which
it becomes clonable.
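
The record layout proposed above could be read with something like the
following.  This is only a sketch: the format is a proposal, not something
git understands, and list_overrides is a hypothetical helper name.  It
prints, for each commit, which fields an override record touches.

```shell
#!/bin/sh
# Hypothetical reader for the proposed override format; nothing here is an
# existing git feature.
list_overrides() {
    awk '
        # A "commit <id>" line opens a new record.
        $1 == "commit" { emit(); id = $2; fields = ""; next }
        # "Key: value" lines name a field to override for the current commit.
        id != "" && /^[A-Za-z]+: / {
            key = substr($1, 1, length($1) - 1)
            # Record each key once, even if it spans several lines.
            if (index(" " fields " ", " " key " ") == 0)
                fields = (fields == "" ? key : fields " " key)
        }
        END { emit() }
        function emit() { if (id != "") print id ": " fields }
    ' "$@"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
commit eb03813cdb999f25628784bb4f07b3f4c8bfe3f6
Parent: 7bc72e647d54c2f713160b22e2e08c39d86c7c28
Logmessage: First line of logmessage override
Logmessage: Second line of logmessage override
EOF

result=$(list_overrides "$sample")
echo "$result"
rm -f "$sample"
```

Fields that never appear in a record are simply absent from the output,
which matches the "unspecified means unaltered" rule.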
-- 
Sincerely,
           Stephen R. van den Berg.

You are confused; but this is your normal state.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: RFC: grafts generalised
  2008-07-02 14:35 RFC: grafts generalised Stephen R. van den Berg
@ 2008-07-02 16:35 ` Jakub Narebski
  2008-07-02 16:43   ` Michael J Gruber
  2008-07-02 17:32   ` Stephen R. van den Berg
  2008-07-02 17:19 ` Dmitry Potapov
  2008-07-03  0:13 ` Petr Baudis
  2 siblings, 2 replies; 36+ messages in thread
From: Jakub Narebski @ 2008-07-02 16:35 UTC
  To: Stephen R. van den Berg; +Cc: git

"Stephen R. van den Berg" <srb@cuci.nl> writes:

> I'm in the process of converting and stitching and patching vast amounts
> of initially disjunct CVS and SVN repositories into larger complete
> histories inside a single git repository.  Recreating history as
> accurately as possible.
> 
> The problem I encounter is that any number of times I have to "edit"
> history in a non-parameterable fashion, in any of the following ways:
> - Change parents.
> - Add merges.
> - Change author, committer, commitdate, authordate.
> - Change the tree (because of conversion errors in the automated
>   conversion process) belonging to a single commit.
> - Retrofit a patch which has to ripple through all of history until
>   the present.
> 
> The only things which are easily done at the moment are:
> Change parents and add merges.  This can be accomplished fairly easily
> using the grafts file.
> The other changes are messy at best and need to be parameterised into the
> form of a shell script so that git filter-branch can have a go at it.
[...]

> I propose the following:
> - Extend git fsck to do more sanity checks on the content of the grafts
>   file (to make it more difficult to shoot yourself in the foot with
>   that file; my feet will be grateful).
> - Extend the grafts file format to support something like the following syntax:
> 
> commit eb03813cdb999f25628784bb4f07b3f4c8bfe3f6
> Parent: 7bc72e647d54c2f713160b22e2e08c39d86c7c28
> Merge: 3b3da24960a82a479b9ad64affab50226df02abe 13b8f53e8ccec3b08eeb6515e6a10a2a
> Merge: ac719ed37270558f21d89676fce97eab4469b0f1
> Tree: 32fc99814b97322174dbe97ec320cf32314959e2
> Author: Foo Bar (FooBar) <foo@bar>
> AuthorDate: Sat Jun 6 13:50:44 1998 +0000
> Commit: Foo Bar (FooBar) <foo@bar>
> CommitDate: Sat Jun 7 13:50:44 1998 +0000
> Logmessage: First line of logmessage override
> Logmessage: Second line of logmessage override
> Logmessage: Etc.
[...]

First, if I remember correctly (from KernelTrap and now defunct Kernel
Traffic and one issue of Git Traffic) the 'graft' mechanism was
created so it would be possible to "graft" (join) historical
conversion repository with the "current work" git repository (started
from zero when git was deemed good enough for Linux kernel
development).  The same mechanism is used for shallow clone, where one
goes in the opposite direction, shortening history instead of joining
two repositories (two histories).

The fact that git-filter-branch (and earlier cg-admin-rewrite-hist)
respects grafts, and rewrites history so that grafts are no-op and are
not needed further is a bit of side-effect.  So I think that it would
be better to provide generic git-filter-branch filter which can
understand this "generalized grafts" file format, or rather
'description of changes' file.  Put it in contrib/, and here you
go...

-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: RFC: grafts generalised
  2008-07-02 16:35 ` Jakub Narebski
@ 2008-07-02 16:43   ` Michael J Gruber
  2008-07-02 17:42     ` Stephen R. van den Berg
  2008-07-02 17:32   ` Stephen R. van den Berg
  1 sibling, 1 reply; 36+ messages in thread
From: Michael J Gruber @ 2008-07-02 16:43 UTC
  To: git

Jakub Narebski venit, vidit, dixit 02.07.2008 18:35:
> "Stephen R. van den Berg" <srb@cuci.nl> writes:
> 
>> I'm in the process of converting and stitching and patching vast amounts
>> of initially disjunct CVS and SVN repositories into larger complete
>> histories inside a single git repository.  Recreating history as
>> accurately as possible.
>>
>> The problem I encounter is that any number of times I have to "edit"
>> history in a non-parameterable fashion, in any of the following ways:
>> - Change parents.
>> - Add merges.
>> - Change author, committer, commitdate, authordate.
>> - Change the tree (because of conversion errors in the automated
>>   conversion process) belonging to a single commit.
>> - Retrofit a patch which has to ripple through all of history until
>>   the present.
>>
>> The only things which are easily done at the moment are:
>> Change parents and add merges.  This can be accomplished fairly easily
>> using the grafts file.
>> The other changes are messy at best and need to be parameterised into the
>> form of a shell script so that git filter-branch can have a go at it.
> [...]
> 
>> I propose the following:
>> - Extend git fsck to do more sanity checks on the content of the grafts
>>   file (to make it more difficult to shoot yourself in the foot with
>>   that file; my feet will be grateful).
>> - Extend the grafts file format to support something like the following syntax:
>>
>> commit eb03813cdb999f25628784bb4f07b3f4c8bfe3f6
>> Parent: 7bc72e647d54c2f713160b22e2e08c39d86c7c28
>> Merge: 3b3da24960a82a479b9ad64affab50226df02abe 13b8f53e8ccec3b08eeb6515e6a10a2a
>> Merge: ac719ed37270558f21d89676fce97eab4469b0f1
>> Tree: 32fc99814b97322174dbe97ec320cf32314959e2
>> Author: Foo Bar (FooBar) <foo@bar>
>> AuthorDate: Sat Jun 6 13:50:44 1998 +0000
>> Commit: Foo Bar (FooBar) <foo@bar>
>> CommitDate: Sat Jun 7 13:50:44 1998 +0000
>> Logmessage: First line of logmessage override
>> Logmessage: Second line of logmessage override
>> Logmessage: Etc.
> [...]
> 
> First, if I remember correctly (from KernelTrap and now defunct Kernel
> Traffic and one issue of Git Traffic) the 'graft' mechanism was
> created so it would be possible to "graft" (join) historical
> conversion repository with the "current work" git repository (started
> from zero when git was deemed good enough for Linux kernel
> development).  The same mechanism is used for shallow clone, where one
> goes in the opposite direction, shortening history instead of joining
> two repositories (two histories).
> 
> The fact that git-filter-branch (and earlier cg-admin-rewrite-hist)
> respects grafts, and rewrites history so that grafts are no-op and are
> not needed further is a bit of side-effect.  So I think that it would
> be better to provide generic git-filter-branch filter which can
> understand this "generalized grafts" file format, or rather
> 'description of changes' file.  Put it in contrib/, and here you
> go...
> 

Maybe the upcoming git-sequencer could be the appropriate place? It 
tries to achieve just that: edit history by specifying a list of 
commands. The currently planned set of commands would need to be 
amended, but the framework should be in place.

Michael


* Re: RFC: grafts generalised
  2008-07-02 14:35 RFC: grafts generalised Stephen R. van den Berg
  2008-07-02 16:35 ` Jakub Narebski
@ 2008-07-02 17:19 ` Dmitry Potapov
  2008-07-02 17:58   ` Dmitry Potapov
  2008-07-02 17:59   ` Stephen R. van den Berg
  2008-07-03  0:13 ` Petr Baudis
  2 siblings, 2 replies; 36+ messages in thread
From: Dmitry Potapov @ 2008-07-02 17:19 UTC
  To: Stephen R. van den Berg; +Cc: git

On Wed, Jul 02, 2008 at 04:35:19PM +0200, Stephen R. van den Berg wrote:
>
> - Extend the grafts file format to support something like the following syntax:
>
> commit eb03813cdb999f25628784bb4f07b3f4c8bfe3f6
> Parent: 7bc72e647d54c2f713160b22e2e08c39d86c7c28
> Merge: 3b3da24960a82a479b9ad64affab50226df02abe 13b8f53e8ccec3b08eeb6515e6a10a2a
> Merge: ac719ed37270558f21d89676fce97eab4469b0f1
> Tree: 32fc99814b97322174dbe97ec320cf32314959e2
> Author: Foo Bar (FooBar) <foo@bar>
> AuthorDate: Sat Jun 6 13:50:44 1998 +0000
> Commit: Foo Bar (FooBar) <foo@bar>
> CommitDate: Sat Jun 7 13:50:44 1998 +0000
> Logmessage: First line of logmessage override
> Logmessage: Second line of logmessage override
> Logmessage: Etc.

I don't think that the grafts file is the right place for this kind of
information. Perhaps, it would be better to have a separate file or
even a directory with files where commit-id identifies a text file with
a new commit object, which should be placed instead of an old one.  So,
it will be easy to tell git filter-branch to use this new information.

However, if you want more than just ability to edit commits in a text
file but also inspect changes using normal git commands and gitk (as it
is possible with grafts), it will require changes to the git core, which,
perhaps, are not difficult to implement using pretend_sha1_file(), but I am
not sure that everyone will welcome that...

Dmitry


* Re: RFC: grafts generalised
  2008-07-02 16:35 ` Jakub Narebski
  2008-07-02 16:43   ` Michael J Gruber
@ 2008-07-02 17:32   ` Stephen R. van den Berg
  2008-07-03  0:21     ` Petr Baudis
  2008-07-04  0:43     ` Jakub Narebski
  1 sibling, 2 replies; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-02 17:32 UTC
  To: Jakub Narebski; +Cc: git

Jakub Narebski wrote:
>First, if I remember correctly (from KernelTrap and now defunct Kernel
>Traffic and one issue of Git Traffic) the 'graft' mechanism was
>created so it would be possible to "graft" (join) historical
>conversion repository with the "current work" git repository (started
>from zero when git was deemed good enough for Linux kernel
>development).

Quite.  Which is exactly the spirit I'm extending here.
I need it to stitch together history, but it needs to be more perfect
than mere connecting parents.

Also, the graft mechanism specifically is intended as a temporary
solution until one uses filter-branch to "finalise" the result into a
proper repository which becomes cloneable.

>The fact that git-filter-branch (and earlier cg-admin-rewrite-hist)
>respects grafts, and rewrites history so that grafts are no-op and are
>not needed further is a bit of side-effect.

I beg to differ.  It's not a side effect, it's the proper way to get
rid of the grafts file.  Grafts are temporary and ugly.  In proper
repositories they are a sign of transition to a proper state.
The proper state is attained by using git filter-branch.

>  So I think that it would
>be better to provide generic git-filter-branch filter which can
>understand this "generalized grafts" file format, or rather
>'description of changes' file.  Put it in contrib/, and here you
>go...

The problem is that the process of fixing history is an iterative one,
which can take many months, and every time you make a change, the
correctness needs to be viewed using gitk.

For argument's sake, consider the repository at hand which I'm trying to
"fix": it has 33000 commits, distributed over eight branches with
roughly 3500 merges over a time period of 13 years.
The eight branches were eight separate CVS repositories which have
intersecting histories, and 3500 merges between CVS repositories (i.e.
branches).

If I need to backpatch a certain patch into history, it is likely that
in order to let the change ripple through, it will take 20000 commits to
be rewritten every time I make a slight change to history.
It's not really workable to ripple through 20000 commits every time I
make a historical change, yet I need to view the change in gitk.

Using git filter-branch, or git sequencer basically has the same
problem, I need to ripple through most of history to get to a state
which is viewable using gitk again.  That is too long a turnaround
cycle.

Using the proposed grafts format, I can make changes incrementally, and
immediately viewable (though not cloneable) on the local repository using gitk.
Then after making all the necessary changes, one git filter-branch run
will "burn" the changes into the repository proper in one go
(renumbering all tags, branches and merges along the way).
-- 
Sincerely,
           Stephen R. van den Berg.

You are confused; but this is your normal state.


* Re: RFC: grafts generalised
  2008-07-02 16:43   ` Michael J Gruber
@ 2008-07-02 17:42     ` Stephen R. van den Berg
  2008-07-02 18:25       ` Mike Hommey
  2008-07-07  6:28       ` Andreas Ericsson
  0 siblings, 2 replies; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-02 17:42 UTC
  To: Michael J Gruber; +Cc: git

Michael J Gruber wrote:
>Maybe the upcoming git-sequencer could be the appropriate place? It 
>tries to achieve just that: edit history by specifying a list of 
>commands. The currently planned set of commands would need to be 

That's the problem.  Like git filter-branch, git sequencer needs you to
parameterise the changes, which, in my case, is hardly possible, since
the changes are randomlike.
Also, having to run the sequencer to dig 20000 commits into the past,
then change something, then come back up and rewrite all following
history and relations (parents/tags/merges) will take a sizeable amount
of time.  I need something that can be changed at will, then viewed with
gitk a second later.

These edits are numerous and spread over many months, so the typical 
history fixup-sessions involve periods where you make 30 random
historical edits per hour (which need to be viewed and checked every time
immediately after making the change).  And say once every 4 months, you
run it through git filter-branch to cast everything into stone.  A
typical git filter-branch run takes 15 minutes on a repository this
size.
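
To put rough numbers on this, here is a back-of-the-envelope sketch using
only the two figures quoted above.  It assumes, hypothetically, that every
single edit would be followed by its own full filter-branch run.

```shell
#!/bin/sh
edits_per_hour=30      # figure quoted above
minutes_per_run=15     # figure quoted above

# Minutes spent rewriting if every edit needed its own filter-branch run.
rewrite_minutes=$((edits_per_hour * minutes_per_run))
echo "rewriting per hour of editing: ${rewrite_minutes} minutes"
echo "that is $((rewrite_minutes / 60)) hours and $((rewrite_minutes % 60)) minutes"
```

Under that assumption an hour of editing would cost several hours of
rewriting, which is why the edits need to be viewable without a rewrite.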
-- 
Sincerely,
           Stephen R. van den Berg.

You are confused; but this is your normal state.


* Re: RFC: grafts generalised
  2008-07-02 17:19 ` Dmitry Potapov
@ 2008-07-02 17:58   ` Dmitry Potapov
  2008-07-02 18:10     ` Stephen R. van den Berg
  2008-07-02 17:59   ` Stephen R. van den Berg
  1 sibling, 1 reply; 36+ messages in thread
From: Dmitry Potapov @ 2008-07-02 17:58 UTC
  To: Stephen R. van den Berg; +Cc: git

On Wed, Jul 2, 2008 at 9:19 PM, Dmitry Potapov <dpotapov@gmail.com> wrote:
>
> However, if you want more than just ability to edit commits in a text
> file but also inspect changes using normal git commands and gitk (as it
> is possible with grafts), it will require changes to the git core, which,
> perhaps, are not difficult to implement using pretend_sha1_file(), but I am
> not sure that everyone will welcome that...

On second thought, it may not be necessary. You can extract an old commit
object, edit it, put it into Git with a new SHA1, and then use the graft file to
replace all references from an old to a new one. And you will be able to see
changes immediately in gitk.
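
The extract/edit/store step can be sketched non-interactively as follows.
It is an illustration only: the throwaway repository, the identities and
the sed expression are made up, and pointing references at the new object
(via the graft file) is a separate step not shown here.

```shell
#!/bin/sh
set -e
# Throwaway repository; identities and the sed edit are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME="Wrong Name" GIT_AUTHOR_EMAIL=author@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=committer@example.invalid
git commit -q --allow-empty -m "a commit with a wrong author"
old=$(git rev-parse HEAD)

# Extract the raw commit object, edit it, and store it as a new object.
git cat-file commit "$old" > commit.txt
sed 's/^author Wrong Name /author Right Name /' commit.txt > commit.edited
new=$(git hash-object -w -t commit commit.edited)

echo "old: $old"
echo "new: $new"
git cat-file commit "$new" | grep '^author '
```

The old object is left untouched; only the text fed to hash-object differs,
so the edited commit gets a different SHA1.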

Dmitry


* Re: RFC: grafts generalised
  2008-07-02 17:19 ` Dmitry Potapov
  2008-07-02 17:58   ` Dmitry Potapov
@ 2008-07-02 17:59   ` Stephen R. van den Berg
  1 sibling, 0 replies; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-02 17:59 UTC
  To: Dmitry Potapov; +Cc: git

Dmitry Potapov wrote:
>I don't think that the grafts file is the right place for this kind of
>information.

Yet the grafts file is exactly the place where this type of
"overlay-information" is being placed now.  It seems to be the natural place.

> Perhaps, it would be better to have a separate file or
>even a directory with files where commit-id identifies a text file with
>a new commit object, which should be placed instead of an old one.  So,
>it will be easy to tell git filter-branch to use this new information.

Not quite sure why this makes it easier.  The point is that there
is not supposed to be a grafts file in a proper repository.  Thus,
having a lot of these files means a larger disruption to the core, and
I'd like the core to be as efficient and lean as possible given an empty
grafts file.  So I'd prefer to keep it to one file.

>However, if you want more than just ability to edit commits in a text
>file but also inspect changes using normal git commands and gitk (as it
>is possible with grafts), it will require changes to the git core, which,
>perhaps, are not difficult to implement using pretend_sha1_file(), but I am
>not sure that everyone will welcome that...

I'd want to avoid a plethora of files, and the changes that can be
specified are supposed to be partial overrides, not complete rewrites.
So using pretend_sha1_file() is a bit overkill and more than I was
aiming for.

The point is, that the changes in grafts (as they are now) are *not*
used when cloning.  I.e. the only thing you mess up is your *own*
repository, not someone else's.  I.e. you can't make someone remote
think that the repository has been altered.  That would require git
filter-branch, which immediately changes all the historical SHA1s, and
makes the changes in history blatantly visible.
-- 
Sincerely,
           Stephen R. van den Berg.

You are confused; but this is your normal state.


* Re: RFC: grafts generalised
  2008-07-02 17:58   ` Dmitry Potapov
@ 2008-07-02 18:10     ` Stephen R. van den Berg
  2008-07-02 18:33       ` Dmitry Potapov
                         ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-02 18:10 UTC
  To: Dmitry Potapov; +Cc: git

Dmitry Potapov wrote:
>On second thought, it may not be necessary. You can extract an old commit
>object, edit it, put it into Git with a new SHA1, and then use the graft file to
>replace all references from an old to a new one. And you will be able to see
>changes immediately in gitk.

Hmmmm, interesting thought.  That just might solve my problem.
In that case, I will stick to extending git fsck to check grafts more
rigorously and fix git clone to *refrain* from looking at grafts.
If anyone still wants the extended format, I'd be willing to implement
it, but my immediate itch for it is gone.
-- 
Sincerely,
           Stephen R. van den Berg.

You are confused; but this is your normal state.


* Re: RFC: grafts generalised
  2008-07-02 17:42     ` Stephen R. van den Berg
@ 2008-07-02 18:25       ` Mike Hommey
  2008-07-02 18:34         ` Michael J Gruber
  2008-07-02 18:37         ` Stephen R. van den Berg
  2008-07-07  6:28       ` Andreas Ericsson
  1 sibling, 2 replies; 36+ messages in thread
From: Mike Hommey @ 2008-07-02 18:25 UTC
  To: Stephen R. van den Berg; +Cc: Michael J Gruber, git

On Wed, Jul 02, 2008 at 07:42:55PM +0200, Stephen R. van den Berg wrote:
> Michael J Gruber wrote:
> >Maybe the upcoming git-sequencer could be the appropriate place? It 
> >tries to achieve just that: edit history by specifying a list of 
> >commands. The currently planned set of commands would need to be 
> 
> That's the problem.  Like git filter-branch, git sequencer needs you to
> parameterise the changes, which, in my case, is hardly possible, since
> the changes are randomlike.
> Also, having to run the sequencer to dig 20000 commits into the past,
> then change something, then come back up and rewrite all following
> history and relations (parents/tags/merges) will take a sizeable amount
> of time.  I need something that can be changed at will, then viewed with
> gitk a second later.
> 
> These edits are numerous and spread over many months, so the typical 
> history fixup-sessions involve periods where you make 30 random
> historical edits per hour (which need to be viewed and checked every time
> immediately after making the change).  And say once every 4 months, you
> run it through git filter-branch to cast everything into stone.  A
> typical git filter-branch run takes 15 minutes on a repository this
> size.

I think the point was more about making a tool to do exactly what you
want, based on the new git sequencer. Note that git filter-branch could
also be rewritten to use the sequencer.

Mike


* Re: RFC: grafts generalised
  2008-07-02 18:10     ` Stephen R. van den Berg
@ 2008-07-02 18:33       ` Dmitry Potapov
  2008-07-02 20:39       ` Dmitry Potapov
  2008-07-03  6:02       ` Johannes Sixt
  2 siblings, 0 replies; 36+ messages in thread
From: Dmitry Potapov @ 2008-07-02 18:33 UTC
  To: Stephen R. van den Berg; +Cc: git

On Wed, Jul 2, 2008 at 10:10 PM, Stephen R. van den Berg <srb@cuci.nl> wrote:
> Dmitry Potapov wrote:
>>On second thought, it may not be necessary. You can extract an old commit
>>object, edit it, put it into Git with a new SHA1, and then use the graft file to
>>replace all references from an old to a new one. And you will be able to see
>>changes immediately in gitk.
>
> Hmmmm, interesting thought.  That just might solve my problem.

This script is just a proof of concept. It seems to work for me, but
I haven't really tested it.

===========================================
#!/bin/bash

set -e

# creating some silly repo
git init
# creating some history
for ((i=0; $i<10; i++))
do
	echo foo$i > foo$i
	git add foo$i
	git commit -m "add foo$i"
done

# run gitk to see it
gitk --all &

# dump all graft info to text file
git rev-list --parents --all > .git/info/grafts.tmp
mv .git/info/grafts.tmp .git/info/grafts

# please choose what commit you want to edit
echo
while read -r -p 'Edit commit: ' C
do
	C=$(git rev-parse "$C") || continue
	# edit commit C
	git cat-file commit "$C" > .git/COMMIT_OBJ
	vim .git/COMMIT_OBJ

	# store the edited text as a new commit object
	C2=$(git hash-object -w -t commit .git/COMMIT_OBJ)

	# replace all references from C to C2
	sed -e 's/\<'"$C"'\>/'"$C2"'/g' < .git/info/grafts > .git/info/grafts.tmp
	mv .git/info/grafts.tmp .git/info/grafts
done
===========================================
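
The last step of the script, replacing every reference to the old ID, can
be looked at in isolation.  The sketch below uses awk instead of sed word
boundaries, which are a GNU extension; replace_id is a hypothetical helper
name and the four-letter IDs are placeholders for full commit IDs.

```shell
#!/bin/sh
# Replace every whole-field occurrence of one ID by another in grafts-style
# text. The short IDs are placeholders; real entries are full commit IDs.
replace_id() {
    # $1 = old ID, $2 = new ID; reads stdin, writes stdout
    awk -v old="$1" -v new="$2" '{
        for (i = 1; i <= NF; i++)
            # Appending "" forces a string comparison even if an ID
            # happens to look like a number.
            if (($i "") == (old "")) $i = new
        print
    }'
}

grafts='aaaa
bbbb aaaa
cccc bbbb aaaa
dddd aaaab'

result=$(printf '%s\n' "$grafts" | replace_id aaaa eeee)
printf '%s\n' "$result"
```

Only whole fields are replaced, so the longer ID on the last line is left
alone, just as the word boundaries in the sed expression intend.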

Dmitry


* Re: RFC: grafts generalised
  2008-07-02 18:25       ` Mike Hommey
@ 2008-07-02 18:34         ` Michael J Gruber
  2008-07-02 19:31           ` Stephan Beyer
  2008-07-02 18:37         ` Stephen R. van den Berg
  1 sibling, 1 reply; 36+ messages in thread
From: Michael J Gruber @ 2008-07-02 18:34 UTC
  To: git

Mike Hommey venit, vidit, dixit 02.07.2008 20:25:
> On Wed, Jul 02, 2008 at 07:42:55PM +0200, Stephen R. van den Berg wrote:
>> Michael J Gruber wrote:
>>> Maybe the upcoming git-sequencer could be the appropriate place? It 
>>> tries to achieve just that: edit history by specifying a list of 
>>> commands. The currently planned set of commands would need to be 
>> That's the problem.  Like git filter-branch, git sequencer needs you to
>> parameterise the changes, which, in my case, is hardly possible, since
>> the changes are randomlike.
>> Also, having to run the sequencer to dig 20000 commits into the past,
>> then change something, then come back up and rewrite all following
>> history and relations (parents/tags/merges) will take a sizeable amount
>> of time.  I need something that can be changed at will, then viewed with
>> gitk a second later.
>>
>> These edits are numerous and spread over many months, so the typical 
>> history fixup-sessions involve periods where you make 30 random
>> historical edits per hour (which need to be viewed and checked every time
>> immediately after making the change).  And say once every 4 months, you
>> run it through git filter-branch to cast everything into stone.  A
>> typical git filter-branch run takes 15 minutes on a repository this
>> size.
> 
> I think the point was more about making a tool to do exactly what you
> want, based on the new git sequencer. Note that git filter-branch could
> also be rewritten to use the sequencer.

Yes, that was at least my point. As I understand, git filter-branch -i 
is a candidate for that rewrite.

But I understand now that OP wants to do lots of history edits and see 
them immediately before doing the actual (time consuming) rewrite; and 
then do the rewrite occasionally. Rewriting is surprisingly slow even on
tmpfs.

Michael


* Re: RFC: grafts generalised
  2008-07-02 18:25       ` Mike Hommey
  2008-07-02 18:34         ` Michael J Gruber
@ 2008-07-02 18:37         ` Stephen R. van den Berg
  1 sibling, 0 replies; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-02 18:37 UTC
  To: Mike Hommey; +Cc: Michael J Gruber, git

Mike Hommey wrote:
>> These edits are numerous and spread over many months, so the typical 
>> history fixup-sessions involve periods where you make 30 random
>> historical edits per hour (which need to be viewed and checked every time
>> immediately after making the change).  And say once every 4 months, you
>> run it through git filter-branch to cast everything into stone.  A
>> typical git filter-branch run takes 15 minutes on a repository this
>> size.

>I think the point was more about making a tool to do exactly what you
>want, based on the new git sequencer. Note that git filter-branch could
>also be rewritten to use the sequencer.

As far as I understood it, the new git sequencer rewrites history
proper.  That is time-consuming by definition, and thus it is *not*
possible to make a tool based on the sequencer that supports the desired
iterative-history-rewrite workflow.
-- 
Sincerely,
           Stephen R. van den Berg.

You are confused; but this is your normal state.


* Re: RFC: grafts generalised
  2008-07-02 18:34         ` Michael J Gruber
@ 2008-07-02 19:31           ` Stephan Beyer
  2008-07-02 19:36             ` Stephan Beyer
  2008-07-02 20:42             ` Dmitry Potapov
  0 siblings, 2 replies; 36+ messages in thread
From: Stephan Beyer @ 2008-07-02 19:31 UTC
  To: Stephen R. van den Berg; +Cc: git, Mike Hommey, Michael J Gruber

Hi,

I'm somewhat confused about the desired workflow, but I'll try to
answer.

Stephen R. van den Berg wrote:
> As far as I understood it, the new git sequencer rewrites history
> proper.  That is time-consuming by definition, and thus it is *not*
> possible to make a tool based on the sequencer that supports the desired
> iterative-history-rewrite workflow.

If I got the problem right, it is possible.
But you have to rewrite and cannot just fake history, of course.
And, as Michael wrote:
> The currently planned set of commands would need to be amended, but the
> framework should be in place.

...for example, a "pick <commit>" that just picks the _tree_ of the
commit and not the _introduced changes_. (I've never used info/grafts,
but if I get the principle right, such tree-picks could realize a
linear list of info/grafts history fakes.)

Stephen wrote earlier:
> The problem I encounter is that any number of times I have to "edit"
> history in a non-parameterable fashion, in any of the following ways:

Hm, imho sequencer is well-suited for "non-parameterable" stuff.

> - Change parents.

The "pick" instruction (onto the new parent) is your friend.

> - Add merges.

"merge" instruction ;)

> - Change author, committer, commitdate, authordate.

sequencer doesn't allow changing committer data, but that would
be an easy change if you really need it.
The same goes for the author timestamp, which can only be reused from
an old commit by using the -C option on pick.

> - Change the tree (because of conversion errors in the automated
>   conversion process) belonging to a single commit.
> - Retrofit a patch which has to ripple through all of history until
>   the present.

"pause" instruction, and then do manual changes, then
	git sequencer --continue

Stephen has also written:
> Also, having to run the sequencer to dig 20000 commits into the past,
> then change something, then come back up and rewrite all following
> history and relations (parents/tags/merges) will take a sizeable amount
> of time.

I wonder if grafts can be used in combination with sequencer in such a
way that you rewrite foo~20000..foo~19950 and then fake the parents of
foo~19949 to be the rewritten once.

> I need something that can be changed at will, then viewed with
> gitk a second later.

You can run gitk whenever you did "pause" in the sequencer file.
[Btw, an integration of sequencer into gitk is also on the TODO list,
 but that's OT here.]

Regards,
  Stephan

-- 
Stephan Beyer <s-beyer@gmx.net>, PGP 0x6EDDD207FCC5040F


* Re: RFC: grafts generalised
  2008-07-02 19:31           ` Stephan Beyer
@ 2008-07-02 19:36             ` Stephan Beyer
  2008-07-02 20:42             ` Dmitry Potapov
  1 sibling, 0 replies; 36+ messages in thread
From: Stephan Beyer @ 2008-07-02 19:36 UTC
  To: Stephen R. van den Berg; +Cc: git, Mike Hommey, Michael J Gruber

A short typo fix:
> I wonder if grafts can be used in combination with sequencer in such a
> way that you rewrite foo~20000..foo~19950 and then fake the parents of
> foo~19949 to be the rewritten once.

s/once/ones/

To give it some sense. Sorry ;)

-- 
Stephan Beyer <s-beyer@gmx.net>, PGP 0x6EDDD207FCC5040F

* Re: RFC: grafts generalised
  2008-07-02 18:10     ` Stephen R. van den Berg
  2008-07-02 18:33       ` Dmitry Potapov
@ 2008-07-02 20:39       ` Dmitry Potapov
  2008-07-02 21:18         ` Stephen R. van den Berg
  2008-07-02 21:27         ` Junio C Hamano
  2008-07-03  6:02       ` Johannes Sixt
  2 siblings, 2 replies; 36+ messages in thread
From: Dmitry Potapov @ 2008-07-02 20:39 UTC
  To: Stephen R. van den Berg; +Cc: git

On Wed, Jul 2, 2008 at 10:10 PM, Stephen R. van den Berg <srb@cuci.nl> wrote:
>
> In that case, I will stick to extending git fsck to check grafts more
> rigorously and fix git clone to *refrain* from looking at grafts.

Linus suggested that "git-fsck and repacking should just consider
it[grafts] to be an  _additional_ source of parenthood rather than
a _replacement_ source."

http://article.gmane.org/gmane.comp.version-control.git/84686

Dmitry

* Re: RFC: grafts generalised
  2008-07-02 19:31           ` Stephan Beyer
  2008-07-02 19:36             ` Stephan Beyer
@ 2008-07-02 20:42             ` Dmitry Potapov
  2008-07-02 23:46               ` Stephan Beyer
  1 sibling, 1 reply; 36+ messages in thread
From: Dmitry Potapov @ 2008-07-02 20:42 UTC
  To: Stephan Beyer; +Cc: Stephen R. van den Berg, git, Mike Hommey, Michael J Gruber

On Wed, Jul 2, 2008 at 11:31 PM, Stephan Beyer <s-beyer@gmx.net> wrote:
>
> I'm somehow quite confused about the desired workflow but I try an
> answer.

I don't think we are talking about any normal workflow, but about importing
"initially disjunct CVS and SVN repositories into larger complete
histories inside a single git repository." This is one-time work, not
a regular workflow.

>
> Stephen R. van den Berg wrote:
>> As far as I understood it, the new git sequencer rewrites history
>> proper.  That is timeconsuming by definition, and thus it is *not*
>> possible to make a tool based on the sequencer that supports the desired
>> iterative-history-rewrite workflow.
>
> If I got the problem right, it is possible.
> But you have to rewrite and cannot just fake history, of course.

Using grafts allows you to fake history, which is very useful during
import, because it lets you edit history without running
filter-branch, which is very time-consuming. Of course, at the end
you have to run git filter-branch to get the "true" history; otherwise
anyone who clones from you will end up with a broken repo.
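
The graft-then-finalise cycle described above can be sketched roughly as
follows. This is an illustration only, not part of the original message: the
throwaway repository, file name and commit messages are made up, and newer
Git versions print a deprecation hint whenever they read .git/info/grafts.

```shell
#!/bin/sh
# Sketch: fake a parent with a graft, inspect it, then make it permanent.
set -e

# Fixed identity so the sketch does not depend on local configuration.
GIT_AUTHOR_NAME=Importer GIT_AUTHOR_EMAIL=importer@example.invalid
GIT_COMMITTER_NAME=Importer GIT_COMMITTER_EMAIL=importer@example.invalid
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d)
cd "$repo"
git init -q .

echo one >file && git add file && git commit -q -m "first converted commit"
echo two >file && git commit -q -a -m "second converted commit"
root=$(git rev-parse HEAD^)

# A parentless commit standing in for separately imported older history.
old=$(echo "older history" | git commit-tree "$(git rev-parse "$root^{tree}")")

# Grafts file format: <commit> <parent>... with one commit per line.
mkdir -p .git/info
echo "$root $old" >.git/info/grafts

# History walkers now report the fake parent ...
faked=$(git rev-list --parents -1 "$root")
# ... while the stored commit object still has no parent line at all.
stored=$(git cat-file commit "$root" | sed -n '/^parent /p')

# Finalise: rewrite the commits so the graft becomes real, then drop the file.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f HEAD >/dev/null
rm .git/info/grafts
final_parent=$(git rev-parse HEAD~2)
```

Until the last step only the grafts file changes, which is why the result can
be inspected in gitk right away; the slow rewrite happens once, at the end.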

The purpose of rebase (and I believe the sequencer too) is rather
different -- to allow you to keep your changes as patches to the
upstream.

> I wonder if grafts can be used in combination with sequencer in such a
> way that you rewrite foo~20000..foo~19950 and then fake the parents of
> foo~19949 to be the rewritten once.

I don't think it is a good idea. During the normal work you should never
use grafts. Well, you can use grafts to add old history, but using it for
anything else is really dangerous, because it *fakes* history. git rebase
(and AFAIK sequencer too) just rewrites the history of some branch. IOW, it
creates another branch from a different starting point using patches from
some existing branch and then reassigns the branch name to it.

Dmitry

* Re: RFC: grafts generalised
  2008-07-02 20:39       ` Dmitry Potapov
@ 2008-07-02 21:18         ` Stephen R. van den Berg
  2008-07-02 21:28           ` Avery Pennarun
  2008-07-02 21:27         ` Junio C Hamano
  1 sibling, 1 reply; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-02 21:18 UTC
  To: Dmitry Potapov; +Cc: git

Dmitry Potapov wrote:
>On Wed, Jul 2, 2008 at 10:10 PM, Stephen R. van den Berg <srb@cuci.nl> wrote:
>> In that case, I will stick to extending git fsck to check grafts more
>> rigorously and fix git clone to *refrain* from looking at grafts.

>Linus suggested that "git-fsck and repacking should just consider
>it[grafts] to be an  _additional_ source of parenthood rather than
>a _replacement_ source."

>http://article.gmane.org/gmane.comp.version-control.git/84686

Yes, I know that's what he suggested, the way it should be implemented
IMO though is by checking once without and once with regard to grafts.
And still it should be such that git clone disregards grafts completely.
I'll fix both, eventually, since I need this functionality to verify
correctness for the projects I'm working on at the moment.

As for repack, it should probably ignore grafts, except for reference.
I.e. repack/gc should consider all mentioned SHA1s in the grafts file
to be referenced and undeletable.
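
As a rough illustration of the simplest check of this kind, a script can at
least confirm that every SHA1 named in the grafts file is a commit that exists
in the object store. This is only a sketch under assumptions: check_grafts is
a hypothetical helper, not an existing git command, and it does none of the
deeper checks a real git fsck extension would need.

```shell
#!/bin/sh
# check_grafts FILE: hypothetical helper, not part of git.
# Prints one line per problem and returns non-zero if any were found.
check_grafts () {
	rc=0
	lineno=0
	# The "|| test -n" part keeps a final line without a trailing newline.
	while read -r commit parents || test -n "$commit"
	do
		lineno=$((lineno + 1))
		case "$commit" in
		''|'#'*) continue ;;	# skip blank lines and comments
		esac
		for id in $commit $parents
		do
			case "$id" in
			*[!0-9a-f]*)
				kind=malformed ;;
			*)
				kind=$(git cat-file -t "$id" 2>/dev/null) || kind=missing ;;
			esac
			# Every entry must name an existing commit object.
			if test "$kind" != commit
			then
				echo "line $lineno: $id is not a known commit ($kind)"
				rc=1
			fi
		done
	done <"$1"
	return $rc
}
```

Called as "check_grafts .git/info/grafts" from inside the repository, it
prints nothing and returns 0 when every entry checks out.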
-- 
Sincerely,
           Stephen R. van den Berg.

You are confused; but this is your normal state.

* Re: RFC: grafts generalised
  2008-07-02 20:39       ` Dmitry Potapov
  2008-07-02 21:18         ` Stephen R. van den Berg
@ 2008-07-02 21:27         ` Junio C Hamano
  2008-07-02 21:49           ` Junio C Hamano
  1 sibling, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2008-07-02 21:27 UTC
  To: Dmitry Potapov; +Cc: Stephen R. van den Berg, git

"Dmitry Potapov" <dpotapov@gmail.com> writes:

> On Wed, Jul 2, 2008 at 10:10 PM, Stephen R. van den Berg <srb@cuci.nl> wrote:
>>
>> In that case, I will stick to extending git fsck to check grafts more
>> rigorously and fix git clone to *refrain* from looking at grafts.
>
> Linus suggested that "git-fsck and repacking should just consider
> it[grafts] to be an  _additional_ source of parenthood rather than
> a _replacement_ source."
>
> http://article.gmane.org/gmane.comp.version-control.git/84686

Yeah, thanks for a reminder.

    http://thread.gmane.org/gmane.comp.version-control.git/37744/focus=37866

is still on my "things to look at" list.

* Re: RFC: grafts generalised
  2008-07-02 21:18         ` Stephen R. van den Berg
@ 2008-07-02 21:28           ` Avery Pennarun
  0 siblings, 0 replies; 36+ messages in thread
From: Avery Pennarun @ 2008-07-02 21:28 UTC
  To: Stephen R. van den Berg; +Cc: Dmitry Potapov, git

On 7/2/08, Stephen R. van den Berg <srb@cuci.nl> wrote:
> Dmitry Potapov wrote:
>  >On Wed, Jul 2, 2008 at 10:10 PM, Stephen R. van den Berg <srb@cuci.nl> wrote:
>  >> In that case, I will stick to extending git fsck to check grafts more
>  >> rigorously and fix git clone to *refrain* from looking at grafts.
>
>  >Linus suggested that "git-fsck and repacking should just consider
>  >it[grafts] to be an  _additional_ source of parenthood rather than
>  >a _replacement_ source."
>
>  >http://article.gmane.org/gmane.comp.version-control.git/84686
>
> Yes, I know that's what he suggested, the way it should be implemented
>  IMO though is by checking once without and once with regard to grafts.
>  And still it should be such that git clone disregards grafts completely.

I could see an argument that the only modes you really need are a) use
grafts as replacements, and b) use grafts as additions.  There is
perhaps no need for c) ignore grafts.

For example, say I wanted to give someone a copy of my repo that
includes grafts (ignoring the fact that this is probably bad to do in
general).  He could git-clone it and then install a copy of my grafts
file, as long as git-clone does (a) or (b) but not (c).  On the other
hand, if he just wants a copy of the "real" (graft-free) repo, then
git-clone needs to do (b) or (c) but not (a).  git-fsck needs (b), and
most normal git operations want (a) (since that was the original
purpose of grafts).

Based on that, (c) is redundant, unless you're really concerned about
not sending redundant objects to people who clone your repo that has
grafts installed.  But I think you probably shouldn't have people
cloning your grafted repository anyway unless you know what you're
doing, and if you know what you're doing, you probably want (b).  If
you see what I mean.

Have fun,

Avery

* Re: RFC: grafts generalised
  2008-07-02 21:27         ` Junio C Hamano
@ 2008-07-02 21:49           ` Junio C Hamano
  2008-07-03  0:03             ` Junio C Hamano
  0 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2008-07-02 21:49 UTC
  To: Dmitry Potapov; +Cc: Stephen R. van den Berg, Linus Torvalds, git

Junio C Hamano <gitster@pobox.com> writes:

> "Dmitry Potapov" <dpotapov@gmail.com> writes:
>
>> On Wed, Jul 2, 2008 at 10:10 PM, Stephen R. van den Berg <srb@cuci.nl> wrote:
>>>
>>> In that case, I will stick to extending git fsck to check grafts more
>>> rigorously and fix git clone to *refrain* from looking at grafts.
>>
>> Linus suggested that "git-fsck and repacking should just consider
>> it[grafts] to be an  _additional_ source of parenthood rather than
>> a _replacement_ source."
>>
>> http://article.gmane.org/gmane.comp.version-control.git/84686
>
> Yeah, thanks for a reminder.
>
>     http://thread.gmane.org/gmane.comp.version-control.git/37744/focus=37866
>
> is still on my "things to look at" list.

This shows how the "object transfer ignores grafts" side of the earlier
suggestion by Linus would look like to get people started.  Totally
untested.

I threw in for_each_commit_graft() in the patch so that updates to the
reachability walker can add otherwise hidden objects, but otherwise it is
not used yet.

 builtin-pack-objects.c |    5 +++++
 builtin-send-pack.c    |    3 ++-
 cache.h                |    1 +
 commit.c               |   10 ++++++++++
 commit.h               |    2 ++
 environment.c          |    1 +
 upload-pack.c          |    1 +
 7 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 28207d9..53b0b33 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -30,6 +30,7 @@ git-pack-objects [{ -q | --progress | --all-progress }] \n\
 	[--threads=N] [--non-empty] [--revs [--unpacked | --all]*] [--reflog] \n\
 	[--stdout | base-name] [--include-tag] \n\
 	[--keep-unreachable | --unpack-unreachable] \n\
+	[--ignore-graft] \n\
 	[<ref-list | <object-list]";
 
 struct object_entry {
@@ -2160,6 +2161,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 				die("bad %s", arg);
 			continue;
 		}
+		if (!strcmp(arg, "--ignore-graft")) {
+			honor_graft = 0;
+			continue;
+		}
 		usage(pack_usage);
 	}
 
diff --git a/builtin-send-pack.c b/builtin-send-pack.c
index d76260c..d932352 100644
--- a/builtin-send-pack.c
+++ b/builtin-send-pack.c
@@ -27,6 +27,7 @@ static int pack_objects(int fd, struct ref *refs)
 	 */
 	const char *argv[] = {
 		"pack-objects",
+		"--ignore-graft",
 		"--all-progress",
 		"--revs",
 		"--stdout",
@@ -36,7 +37,7 @@ static int pack_objects(int fd, struct ref *refs)
 	struct child_process po;
 
 	if (args.use_thin_pack)
-		argv[4] = "--thin";
+		argv[5] = "--thin";
 	memset(&po, 0, sizeof(po));
 	po.argv = argv;
 	po.in = -1;
diff --git a/cache.h b/cache.h
index 188428d..00858f9 100644
--- a/cache.h
+++ b/cache.h
@@ -435,6 +435,7 @@ extern size_t packed_git_limit;
 extern size_t delta_base_cache_limit;
 extern int auto_crlf;
 extern int fsync_object_files;
+extern int honor_graft;
 
 enum safe_crlf {
 	SAFE_CRLF_FALSE = 0,
diff --git a/commit.c b/commit.c
index e2d8624..62cf104 100644
--- a/commit.c
+++ b/commit.c
@@ -101,6 +101,13 @@ static int commit_graft_pos(const unsigned char *sha1)
 	return -lo - 1;
 }
 
+void for_each_commit_graft(void (*fn)(struct commit_graft *))
+{
+	int i;
+	for (i = 0; i < commit_graft_nr; i++)
+		fn(commit_graft[i]);
+}
+
 int register_commit_graft(struct commit_graft *graft, int ignore_dups)
 {
 	int pos = commit_graft_pos(graft->sha1);
@@ -196,7 +203,10 @@ static void prepare_commit_graft(void)
 struct commit_graft *lookup_commit_graft(const unsigned char *sha1)
 {
 	int pos;
+
 	prepare_commit_graft();
+	if (!honor_graft)
+		return NULL;
 	pos = commit_graft_pos(sha1);
 	if (pos < 0)
 		return NULL;
diff --git a/commit.h b/commit.h
index 2d94d41..8f76dd9 100644
--- a/commit.h
+++ b/commit.h
@@ -138,4 +138,6 @@ static inline int single_parent(struct commit *commit)
 	return commit->parents && !commit->parents->next;
 }
 
+void for_each_commit_graft(void (*fn)(struct commit_graft *));
+
 #endif /* COMMIT_H */
diff --git a/environment.c b/environment.c
index 4a88a17..eb8f36d 100644
--- a/environment.c
+++ b/environment.c
@@ -41,6 +41,7 @@ enum safe_crlf safe_crlf = SAFE_CRLF_WARN;
 unsigned whitespace_rule_cfg = WS_DEFAULT_RULE;
 enum branch_track git_branch_track = BRANCH_TRACK_REMOTE;
 enum rebase_setup_type autorebase = AUTOREBASE_NEVER;
+int honor_graft = 1;
 
 /* This is set by setup_git_dir_gently() and/or git_default_config() */
 char *git_work_tree_cfg;
diff --git a/upload-pack.c b/upload-pack.c
index b46dd36..d948d64 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -158,6 +158,7 @@ static void create_pack_file(void)
 		die("git-upload-pack: unable to fork git-rev-list");
 
 	argv[arg++] = "pack-objects";
+	argv[arg++] = "--ignore-graft";
 	argv[arg++] = "--stdout";
 	if (!no_progress)
 		argv[arg++] = "--progress";

* Re: RFC: grafts generalised
  2008-07-02 20:42             ` Dmitry Potapov
@ 2008-07-02 23:46               ` Stephan Beyer
  2008-07-03  6:05                 ` Stephen R. van den Berg
  0 siblings, 1 reply; 36+ messages in thread
From: Stephan Beyer @ 2008-07-02 23:46 UTC
  To: Dmitry Potapov
  Cc: Stephen R. van den Berg, git, Mike Hommey, Michael J Gruber

Hi,

On Thu, Jul 03, 2008 at 12:42:30AM +0400, Dmitry Potapov wrote:
> On Wed, Jul 2, 2008 at 11:31 PM, Stephan Beyer <s-beyer@gmx.net> wrote:
> > I wonder if grafts can be used in combination with sequencer in such a
> > way that you rewrite foo~20000..foo~19950 and then fake the parents of
> > foo~19949 to be the rewritten once.
> 
> I don't think it is a good idea. During the normal work you should never
> use grafts.

I have written this in the context that Stephen only changes some commits
from a long time ago (foo~20000) and then I showed a way how to avoid that
sequencer rewrites the rest which takes so long.
This is not related to "normal work", but to Stephen's use case (if I
got it right).

What I've meant, was:
Instead of faking a lot of parents, changes and even merges using an
extended grafts file, he could rewrite some patches - which can be fast -
and then use _only one_ graft to change the parent to the changed and
rewritten commit.
This can be done iteratively and seems to be a good agreement in speed
and reliability.

Regards,
  Stephan

-- 
Stephan Beyer <s-beyer@gmx.net>, PGP 0x6EDDD207FCC5040F

* Re: RFC: grafts generalised
  2008-07-02 21:49           ` Junio C Hamano
@ 2008-07-03  0:03             ` Junio C Hamano
  0 siblings, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2008-07-03  0:03 UTC
  To: Dmitry Potapov; +Cc: Stephen R. van den Berg, Linus Torvalds, git

Junio C Hamano <gitster@pobox.com> writes:

>> Yeah, thanks for a reminder.
>>
>>     http://thread.gmane.org/gmane.comp.version-control.git/37744/focus=37866
>>
>> is still on my "things to look at" list.
>
> This shows how the "object transfer ignores grafts" side of the earlier
> suggestion by Linus would look like to get people started.  Totally
> untested.
>
> I threw in for_each_commit_graft() in the patch so that updates to the
> reachability walker can add otherwise hidden objects, but otherwise it is
> not used yet.

This updates the earlier patch to teach the object transfer side to ignore
grafts, which makes things consistent between dumb commit walkers and
native transport.  It is not meant for application as I haven't thought
about[*1*] nor looked into how this may interact with the "shallow clone"
stuff (which is graft in disguise but implemented separately).

Footnote. *1* I also suspect Linus did not think about interactions with
"shallow" when he made the suggestion referenced above, as "shallow" was
still a relatively new curiosity back then.

I am not sure if the addition of --ignore-graft to revision.c should be
there when this becomes real.  I added it primarily for debugging
purposes, as it is something the end users should never trigger in the
normal workflow.

--
 builtin-pack-objects.c |    5 +++
 builtin-send-pack.c    |    3 +-
 cache.h                |    1 +
 commit.c               |   10 +++++++
 commit.h               |    2 +
 environment.c          |    1 +
 revision.c             |    4 +++
 t/t6500-graft.sh       |   70 ++++++++++++++++++++++++++++++++++++++++++++++++
 upload-pack.c          |    2 +
 9 files changed, 97 insertions(+), 1 deletions(-)
 create mode 100755 t/t6500-graft.sh

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 28207d9..53b0b33 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -30,6 +30,7 @@ git-pack-objects [{ -q | --progress | --all-progress }] \n\
 	[--threads=N] [--non-empty] [--revs [--unpacked | --all]*] [--reflog] \n\
 	[--stdout | base-name] [--include-tag] \n\
 	[--keep-unreachable | --unpack-unreachable] \n\
+	[--ignore-graft] \n\
 	[<ref-list | <object-list]";
 
 struct object_entry {
@@ -2160,6 +2161,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 				die("bad %s", arg);
 			continue;
 		}
+		if (!strcmp(arg, "--ignore-graft")) {
+			honor_graft = 0;
+			continue;
+		}
 		usage(pack_usage);
 	}
 
diff --git a/builtin-send-pack.c b/builtin-send-pack.c
index d76260c..d932352 100644
--- a/builtin-send-pack.c
+++ b/builtin-send-pack.c
@@ -27,6 +27,7 @@ static int pack_objects(int fd, struct ref *refs)
 	 */
 	const char *argv[] = {
 		"pack-objects",
+		"--ignore-graft",
 		"--all-progress",
 		"--revs",
 		"--stdout",
@@ -36,7 +37,7 @@ static int pack_objects(int fd, struct ref *refs)
 	struct child_process po;
 
 	if (args.use_thin_pack)
-		argv[4] = "--thin";
+		argv[5] = "--thin";
 	memset(&po, 0, sizeof(po));
 	po.argv = argv;
 	po.in = -1;
diff --git a/cache.h b/cache.h
index 188428d..00858f9 100644
--- a/cache.h
+++ b/cache.h
@@ -435,6 +435,7 @@ extern size_t packed_git_limit;
 extern size_t delta_base_cache_limit;
 extern int auto_crlf;
 extern int fsync_object_files;
+extern int honor_graft;
 
 enum safe_crlf {
 	SAFE_CRLF_FALSE = 0,
diff --git a/commit.c b/commit.c
index e2d8624..62cf104 100644
--- a/commit.c
+++ b/commit.c
@@ -101,6 +101,13 @@ static int commit_graft_pos(const unsigned char *sha1)
 	return -lo - 1;
 }
 
+void for_each_commit_graft(void (*fn)(struct commit_graft *))
+{
+	int i;
+	for (i = 0; i < commit_graft_nr; i++)
+		fn(commit_graft[i]);
+}
+
 int register_commit_graft(struct commit_graft *graft, int ignore_dups)
 {
 	int pos = commit_graft_pos(graft->sha1);
@@ -196,7 +203,10 @@ static void prepare_commit_graft(void)
 struct commit_graft *lookup_commit_graft(const unsigned char *sha1)
 {
 	int pos;
+
 	prepare_commit_graft();
+	if (!honor_graft)
+		return NULL;
 	pos = commit_graft_pos(sha1);
 	if (pos < 0)
 		return NULL;
diff --git a/commit.h b/commit.h
index 2d94d41..8f76dd9 100644
--- a/commit.h
+++ b/commit.h
@@ -138,4 +138,6 @@ static inline int single_parent(struct commit *commit)
 	return commit->parents && !commit->parents->next;
 }
 
+void for_each_commit_graft(void (*fn)(struct commit_graft *));
+
 #endif /* COMMIT_H */
diff --git a/environment.c b/environment.c
index 4a88a17..eb8f36d 100644
--- a/environment.c
+++ b/environment.c
@@ -41,6 +41,7 @@ enum safe_crlf safe_crlf = SAFE_CRLF_WARN;
 unsigned whitespace_rule_cfg = WS_DEFAULT_RULE;
 enum branch_track git_branch_track = BRANCH_TRACK_REMOTE;
 enum rebase_setup_type autorebase = AUTOREBASE_NEVER;
+int honor_graft = 1;
 
 /* This is set by setup_git_dir_gently() and/or git_default_config() */
 char *git_work_tree_cfg;
diff --git a/revision.c b/revision.c
index fc66755..25c96d0 100644
--- a/revision.c
+++ b/revision.c
@@ -991,6 +991,10 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, const ch
 		const char *arg = argv[i];
 		if (*arg == '-') {
 			int opts;
+			if (!strcmp(arg, "--ignore-graft")) {
+				honor_graft = 0;
+				continue;
+			}
 			if (!prefixcmp(arg, "--max-count=")) {
 				revs->max_count = atoi(arg + 12);
 				continue;
diff --git a/t/t6500-graft.sh b/t/t6500-graft.sh
new file mode 100755
index 0000000..122ed1c
--- /dev/null
+++ b/t/t6500-graft.sh
@@ -0,0 +1,70 @@
+#!/bin/sh
+
+test_description='graft and ancestry traversal'
+. test-lib.sh
+
+# Real history:
+#
+#      .----D
+#     /
+#    A--B--C--E
+#
+# Grafted history:
+#
+#      .----D
+#     /      \
+#    A-----C--E
+#
+
+advance_one () {
+	echo "$1" >file &&
+	git add file &&
+	test_tick &&
+	git commit -m "$1" &&
+	git rev-parse --verify HEAD >"$1"
+}
+
+test_expect_success setup '
+	advance_one A &&
+	advance_one D &&
+
+	git reset --hard $(cat A) &&
+	advance_one B &&
+	advance_one C &&
+	advance_one E &&
+
+	(
+		echo $(cat E) $(cat C) $(cat D)
+		echo $(cat C) $(cat A)
+	) >.git/info/grafts &&
+
+	git log --graph --pretty=oneline --abbrev-commit --parents &&
+	echo " - - - - - - - - - - " &&
+	git log --graph --pretty=oneline --abbrev-commit --parents --ignore-graft
+
+'
+
+test_expect_success 'clone should lose grafts' '
+	git clone --bare "file://$(pwd)/.git" cloned.git &&
+	(
+		GIT_DIR=cloned.git && export GIT_DIR &&
+		git log --graph --pretty=oneline --abbrev-commit --parents &&
+		git cat-file commit $(cat B) &&
+		test_must_fail git cat-file commit $(cat D) &&
+		git fsck --full
+	)
+'
+
+test_expect_success 'push should lose grafts' '
+	test_create_repo pushed.git &&
+	(
+		cd pushed.git &&
+		git fetch ../.git master:refs/heads/master &&
+		git log --graph --pretty=oneline --abbrev-commit --parents &&
+		git cat-file commit $(cat ../B) &&
+		test_must_fail git cat-file commit $(cat ../D) &&
+		git fsck --full
+	)
+'
+
+test_done
\ No newline at end of file
diff --git a/upload-pack.c b/upload-pack.c
index b46dd36..798caaa 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -158,6 +158,7 @@ static void create_pack_file(void)
 		die("git-upload-pack: unable to fork git-rev-list");
 
 	argv[arg++] = "pack-objects";
+	argv[arg++] = "--ignore-graft";
 	argv[arg++] = "--stdout";
 	if (!no_progress)
 		argv[arg++] = "--progress";
@@ -614,6 +615,7 @@ int main(int argc, char **argv)
 	int i;
 	int strict = 0;
 
+	honor_graft = 0;
 	for (i = 1; i < argc; i++) {
 		char *arg = argv[i];
 

* Re: RFC: grafts generalised
  2008-07-02 14:35 RFC: grafts generalised Stephen R. van den Berg
  2008-07-02 16:35 ` Jakub Narebski
  2008-07-02 17:19 ` Dmitry Potapov
@ 2008-07-03  0:13 ` Petr Baudis
  2008-07-03  0:16   ` Petr Baudis
  2 siblings, 1 reply; 36+ messages in thread
From: Petr Baudis @ 2008-07-03  0:13 UTC
  To: Stephen R. van den Berg; +Cc: git

On Wed, Jul 02, 2008 at 04:35:19PM +0200, Stephen R. van den Berg wrote:
> - Extend the grafts file format to support something like the following syntax:
> 
> commit eb03813cdb999f25628784bb4f07b3f4c8bfe3f6
> Parent: 7bc72e647d54c2f713160b22e2e08c39d86c7c28
> Merge: 3b3da24960a82a479b9ad64affab50226df02abe 13b8f53e8ccec3b08eeb6515e6a10a2a
> Merge: ac719ed37270558f21d89676fce97eab4469b0f1
> Tree: 32fc99814b97322174dbe97ec320cf32314959e2
> Author: Foo Bar (FooBar) <foo@bar>
> AuthorDate: Sat Jun 6 13:50:44 1998 +0000
> Commit: Foo Bar (FooBar) <foo@bar>
> CommitDate: Sat Jun 7 13:50:44 1998 +0000
> Logmessage: First line of logmessage override
> Logmessage: Second line of logmessage override
> Logmessage: Etc.

  Please, don't. It adds completely unnecessary complexity and it is
_not_ grafting anymore - look the word up in a dictionary. :-)

  Have a look at what you wrote above - now, Git already has a way to
store all this information, right? In the commit objects!

  So, the real solution is to take the commit objects you want to
modify, create new commit objects, then graft the new commit on all the
old commit children. It fits neatly in the Git philosophy, there is no
need at all to tweak the current infrastructure for this and it should
be trivial to automate, too.
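
  Spelled out with plumbing commands, the idea might look like the sketch
below. It is an illustration under assumptions, not a tested tool: the
throwaway repository and the corrected author name are invented, and the loop
that redirects the children is just one straightforward way to generate the
graft lines.

```shell
#!/bin/sh
# Sketch: replace one commit by an edited copy and graft its children onto it.
set -e

GIT_AUTHOR_NAME='Wrong Name' GIT_AUTHOR_EMAIL=someone@example.invalid
GIT_COMMITTER_NAME='Wrong Name' GIT_COMMITTER_EMAIL=someone@example.invalid
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo one >file && git add file && git commit -q -m one
echo two >file && git commit -q -a -m two
echo three >file && git commit -q -a -m three

old=$(git rev-parse HEAD~1)	# the commit whose metadata needs fixing

# Write a corrected copy of the commit object.  Only the author line in the
# header is edited, so tree, parents and log message stay as they were.
new=$(git cat-file commit "$old" |
	sed '1,/^$/s/^author [^<]*</author Right Name </' |
	git hash-object -t commit -w --stdin)

# Graft the copy onto every child of the old commit.  This has to run before
# the grafts file exists, because rev-list itself honours grafts.
mkdir -p .git/info
git rev-list --parents --all |
while read -r child parents
do
	case " $parents " in
	*" $old "*)
		echo "$child $parents" | sed "s/$old/$new/" ;;
	esac
done >.git/info/grafts

shown_author=$(git log -1 --format=%an HEAD~1)
```

  The old commit object is left in place; only the parent pointers seen through
the grafts file change, so dropping the file undoes everything.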

-- 
				Petr "Pasky" Baudis
The last good thing written in C++ was the Pachelbel Canon. -- J. Olson

* Re: RFC: grafts generalised
  2008-07-03  0:13 ` Petr Baudis
@ 2008-07-03  0:16   ` Petr Baudis
  2008-07-03  0:28     ` Junio C Hamano
  0 siblings, 1 reply; 36+ messages in thread
From: Petr Baudis @ 2008-07-03  0:16 UTC
  To: Stephen R. van den Berg; +Cc: git

On Thu, Jul 03, 2008 at 02:13:31AM +0200, Petr Baudis wrote:
>   So, the real solution is to take the commit objects you want to
> modify, create new commit objects, then graft the new commit on all the
> old commit children. It fits neatly in the Git philosophy, there is no
> need at all to tweak the current infrastructure for this and it should
> be trivial to automate, too.

  Oops, sorry; I stopped reading the branch of the thread I thought was
going off on a different tangent one post too early. :-)

-- 
				Petr "Pasky" Baudis
The last good thing written in C++ was the Pachelbel Canon. -- J. Olson

* Re: RFC: grafts generalised
  2008-07-02 17:32   ` Stephen R. van den Berg
@ 2008-07-03  0:21     ` Petr Baudis
  2008-07-03  7:11       ` Stephen R. van den Berg
  2008-07-04  0:43     ` Jakub Narebski
  1 sibling, 1 reply; 36+ messages in thread
From: Petr Baudis @ 2008-07-03  0:21 UTC
  To: Stephen R. van den Berg; +Cc: Jakub Narebski, git

On Wed, Jul 02, 2008 at 07:32:03PM +0200, Stephen R. van den Berg wrote:
> Also, the graft mechanism specifically is intended as a temporary
> solution until one uses filter-branch to "finalise" the result into a
> proper repository which becomes cloneable.

Grafts are _much_ older than filter-branch and I'm not sure where did
you get this idea; do we claim that in any documentation?

> >The fact that git-filter-branch (and earlier cg-admin-rewrite-hist)
> >respects grafts, and rewrites history so that grafts are no-op and are
> >not needed further is a bit of side-effect.
> 
> I beg to differ.  It's not a side effect, it's the proper way to get
> rid of the grafts file.  Grafts are temporary and ugly.  In proper
> repositories they are a sign of transition to a proper state.
> The proper state is attained by using git filter-branch.

There's nothing ugly or necessarily temporary about grafts. One example
of completely valid usage is adding previous history of a project to it
later.

First, you don't need to carry around all the archived baggage you are
probably rarely going to access anyway if you don't need to; changing a
VCS is ideal cutoff point.

Second, you don't need to worry about doing perfect conversion at the
moment of the switch.

Third, even if you think you have done it perfectly, it will turn out
later that something is wrong anyway.

Fourth, it may not be actually _clear_ what the canonical history should
be. Consider linux-kernel, you can graft the BitKeeper history (or one
of possible candidates for the ideal conversion, though one is AFAIK
clearly favoured), or you could also graft commit-per-tarball history
even from the times before BitKeeper; you certainly don't want either in
the current main history DAG.

-- 
				Petr "Pasky" Baudis
The last good thing written in C++ was the Pachelbel Canon. -- J. Olson

* Re: RFC: grafts generalised
  2008-07-03  0:16   ` Petr Baudis
@ 2008-07-03  0:28     ` Junio C Hamano
  0 siblings, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2008-07-03  0:28 UTC
  To: Petr Baudis; +Cc: Stephen R. van den Berg, git

Petr Baudis <pasky@suse.cz> writes:

> On Thu, Jul 03, 2008 at 02:13:31AM +0200, Petr Baudis wrote:
>>   So, the real solution is to take the commit objects you want to
>> modify, create new commit objects, then graft the new commit on all the
>> old commit children. It fits neatly in the Git philosophy, there is no
>> need at all to tweak the current infrastructure for this and it should
>> be trivial to automate, too.
>
>   Oops, sorry; I stopped reading the branch of the thread I thought was
> going off on a different tangent one post too early. :-)

What you wrote was a very good summary of what Dmitry suggested earlier
;-)

* Re: RFC: grafts generalised
  2008-07-02 18:10     ` Stephen R. van den Berg
  2008-07-02 18:33       ` Dmitry Potapov
  2008-07-02 20:39       ` Dmitry Potapov
@ 2008-07-03  6:02       ` Johannes Sixt
  2008-07-03  7:30         ` Stephen R. van den Berg
  2 siblings, 1 reply; 36+ messages in thread
From: Johannes Sixt @ 2008-07-03  6:02 UTC
  To: Stephen R. van den Berg; +Cc: Dmitry Potapov, git

Stephen R. van den Berg schrieb:
> Dmitry Potapov wrote:
>> On second thought, it may be not necessary. You can extract an old commit
>> object, edit it, put it into Git with a new SHA1, and then use the graft file to
>> replace all references from an old to a new one. And you will be able to see
>> changes immediately in gitk.
> 
> Hmmmm, interesting thought.  That just might solve my problem.

I don't think it would.

You want to apply a patch through a part of the history. To do that, it is
not sufficient to apply the patch to only one commit/tree and then fake
parenthood of its child commits. You still need to apply the patch to all
children.

-- Hannes

* Re: RFC: grafts generalised
  2008-07-02 23:46               ` Stephan Beyer
@ 2008-07-03  6:05                 ` Stephen R. van den Berg
  0 siblings, 0 replies; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-03  6:05 UTC
  To: Stephan Beyer; +Cc: Dmitry Potapov, git, Mike Hommey, Michael J Gruber

Stephan Beyer wrote:
>On Thu, Jul 03, 2008 at 12:42:30AM +0400, Dmitry Potapov wrote:
>> On Wed, Jul 2, 2008 at 11:31 PM, Stephan Beyer <s-beyer@gmx.net> wrote:
>> > I wonder if grafts can be used in combination with sequencer in such a
>> > way that you rewrite foo~20000..foo~19950 and then fake the parents of
>> > foo~19949 to be the rewritten once.

>> I don't think it is a good idea. During the normal work you should never
>> use grafts.

>I have written this in the context that Stephen only changes some commits
>from a long time ago (foo~20000) and then I showed a way how to avoid that
>sequencer rewrites the rest which takes so long.
>This is not related to "normal work", but to Stephen's use case (if I
>got it right).

You got it right.

>What I've meant, was:
>Instead of faking a lot of parents, changes and even merges using an
>extended grafts file, he could rewrite some patches - which can be fast -
>and then use _only one_ graft to change the parent to the changed and
>rewritten commit.
>This can be done iteratively and seems to be a good agreement in speed
>and reliability.

Indeed.
-- 
Sincerely,
           Stephen R. van den Berg.

This is a day for firm decisions!  Or is it?

* Re: RFC: grafts generalised
  2008-07-03  0:21     ` Petr Baudis
@ 2008-07-03  7:11       ` Stephen R. van den Berg
  0 siblings, 0 replies; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-03  7:11 UTC
  To: Petr Baudis; +Cc: Jakub Narebski, git

Petr Baudis wrote:
>On Wed, Jul 02, 2008 at 07:32:03PM +0200, Stephen R. van den Berg wrote:
>> Also, the graft mechanism specifically is intended as a temporary
>> solution until one uses filter-branch to "finalise" the result into a
>> proper repository which becomes cloneable.

>Grafts are _much_ older than filter-branch and I'm not sure where you
>got this idea; do we claim that in any documentation?

Not in direct documentation, but it is what comes across in posts on
the mailing list such as:

http://kerneltrap.org/mailarchive/git/2008/6/10/2085624

Jakub Narebski:
>Then if possible use git-filter-branch to make history recorded in
>grafts file permanent...

Petr Baudis wrote:
>There's nothing ugly or necessarily temporary about grafts. One example
>of completely valid usage is adding previous history of a project to it
>later.

>First, you don't need to carry around all the archived baggage you are
>probably rarely going to access anyway if you don't need to; changing a
>VCS is ideal cutoff point.

That depends on the project, of course, and is not a valid statement in
general.  Part of the charm of full history is that git-blame and
git-bisect work, at arbitrary points in the past.

>Second, you don't need to worry about doing perfect conversion at the
>moment of the switch.

Well, you do, if you intend to make it cloneable.

>Third, even if you think you have done it perfectly, it will turn out
>later that something is wrong anyway.

Not necessarily.  I have automated the checkout-verification-process which
basically checks out every revision from the respective old repository
and binary-compares it with the corresponding revision in the git
repository.  This ensures a full binary match across the board.
With respect to historical merges, I agree, those might not be
completely correctly grafted, but the level of correctness can be
determined at will, and once we achieve somewhere around 99% accuracy,
we consider it done (for this project).
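A check of this kind could look roughly like the following sketch. It assumes that both revisions have already been checked out into two plain directories; the function names and the list of skipped metadata directories are illustrative assumptions, not the actual verification script.

```python
import os
from typing import List

# Version-control metadata directories that are not part of the content.
SKIP_DIRS = {".git", ".svn", "CVS"}


def list_files(root: str) -> List[str]:
    """Return the relative paths of all files under root, sorted."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk does not enter them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)


def tree_differences(old_root: str, new_root: str) -> List[str]:
    """List paths that exist on one side only or differ byte for byte."""
    old_files = set(list_files(old_root))
    new_files = set(list_files(new_root))
    problems = list(old_files ^ new_files)  # present on one side only
    for rel in old_files & new_files:
        with open(os.path.join(old_root, rel), "rb") as a, \
             open(os.path.join(new_root, rel), "rb") as b:
            if a.read() != b.read():
                problems.append(rel)
    return sorted(problems)
```

An empty result for every revision pair corresponds to the full binary match described above.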

>Fourth, it may not be actually _clear_ what the canonical history should
>be.

That depends on the project.  In my project it *is* clear, so this point
doesn't make any difference.

-- 
Sincerely,
           Stephen R. van den Berg.

This is a day for firm decisions!  Or is it?

* Re: RFC: grafts generalised
  2008-07-03  6:02       ` Johannes Sixt
@ 2008-07-03  7:30         ` Stephen R. van den Berg
  2008-07-03  7:42           ` Johannes Sixt
  0 siblings, 1 reply; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-03  7:30 UTC
  To: Johannes Sixt; +Cc: Dmitry Potapov, git

Johannes Sixt wrote:
>Stephen R. van den Berg schrieb:
>> Dmitry Potapov wrote:
>>> On second thought, it may be not necessary. You can extract an old commit
>>> object, edit it, put it into Git with a new SHA1, and then use the graft file to
>>> replace all references from an old to a new one. And you will be able to see
>>> changes immediately in gitk.

>> Hmmmm, interesting thought.  That just might solve my problem.

>I don't think it would.

>You want to apply a patch through a part of the history. To do that, it is
>not sufficient to apply the patch to only one commit/tree and then fake
>parenthood of its child commits. You still need to apply the patch to all
>children.

I am aware of that.
There are actually two common cases:
- Historical changes which are confined and don't ripple through.  The
  above solution works just fine for that.
- Ripple-through changes.  They indeed need to be applied to every tree
  in the first-parent chain.  Even though this is going to take a
  considerable amount of time, there still are certain advantages to
  doing this using the method described above:
  + You can apply the patch to every commit/tree "interactively" if you want.
    (Yes, I know, git-sequencer supports this one as well, but not the
    next point).
  + You can view the change at any point in time (including in relation to the
    tree that follows it), right after making the amendments (without letting
    it ripple through to the end).
  + The ripple-through does not need to be performed in topological order,
    i.e. eventually you'll have to touch everything, but you can do it
    in the order you see fit (whatever is most efficient to work on).
  + If, at some point during the ripple-through process, you find out
    that you forgot some change(s), you can abort or restart the
    ripple-through without having spent all that time waiting for a
    full-ripple-through.
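The "edit a commit object and graft it in" method can be sketched as follows. This is a model under stated assumptions rather than a finished tool: it assumes a SHA-1 repository, the author lines and the placeholder child id are made up, and git_object_id and graft_line are hypothetical helper names. In a real repository the object would be read with "git cat-file commit <id>" and stored with "git hash-object -t commit -w <file>".

```python
import hashlib
from typing import List


def git_object_id(obj_type: str, body: bytes) -> str:
    """Compute the SHA-1 id git assigns to an object: header, NUL, then body."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def graft_line(child: str, child_parents: List[str], old: str, new: str) -> str:
    """Build a grafts-file line for child with parent old replaced by new."""
    replaced = [new if p == old else p for p in child_parents]
    return " ".join([child] + replaced)


# The id of the empty tree, used here only to keep the example small.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

old_body = (
    f"tree {EMPTY_TREE}\n"
    "author wrongname <wrong@example.com> 1215000000 +0200\n"
    "committer wrongname <wrong@example.com> 1215000000 +0200\n"
    "\n"
    "Initial import\n"
).encode("utf-8")

# "Edit" the commit: correct the author line, leave everything else alone.
new_body = old_body.replace(
    b"author wrongname <wrong@example.com>",
    b"author A U Thor <author@example.com>",
)

old_id = git_object_id("commit", old_body)
new_id = git_object_id("commit", new_body)

# Every child of the old commit needs a graft entry naming the new one.
child_id = "c" * 40  # placeholder id of a child commit
line = graft_line(child_id, [old_id], old_id, new_id)
```

Because only the direct children get a graft entry, the change is visible at once without rewriting anything further down the history.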

Actually, ripple-through changes are rare.  In the current project it
seems I need exactly one, but it's buried deep in the past (sadly).
The reason why I need it, is to make sure that git-bisect will work for
any revision in the past (i.e. the tree contained/contains some
too-clever-for-their-own-good $Revision$-expansion dependencies)
-- 
Sincerely,
           Stephen R. van den Berg.

This is a day for firm decisions!  Or is it?

* Re: RFC: grafts generalised
  2008-07-03  7:30         ` Stephen R. van den Berg
@ 2008-07-03  7:42           ` Johannes Sixt
  2008-07-03  9:37             ` Stephen R. van den Berg
  0 siblings, 1 reply; 36+ messages in thread
From: Johannes Sixt @ 2008-07-03  7:42 UTC
  To: Stephen R. van den Berg; +Cc: Dmitry Potapov, git

Stephen R. van den Berg schrieb:
> Actually, ripple-through changes are rare.  In the current project it
> seems I need exactly one, but it's buried deep in the past (sadly).
> The reason why I need it, is to make sure that git-bisect will work for
> any revision in the past (i.e. the tree contained/contains some
> too-clever-for-their-own-good $Revision$-expansion dependencies)

But you do know that you don't need to apply the change *now*; you can
apply it at bisect-time? Unless you expect you or your mere mortal
coworkers are going to do dozens of bisects into that part of the history,
I wouldn't change history *like*this*. But of course, I don't understand
the circumstances enough, so... just my 2 cents.
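Applying the change at bisect-time could be sketched like this, as a helper for "git bisect run". It relies on the exit codes that git bisect run understands (0 for good, 125 for "cannot test", other small non-zero codes for bad); the patch file and the test command are assumptions supplied by the caller, and the function names are hypothetical.

```python
import subprocess
from typing import Sequence

SKIP = 125  # git bisect run treats 125 as "this revision cannot be tested"


def bisect_verdict(patch_applied: bool, test_passed: bool) -> int:
    """Map the outcome of one bisect step to an exit code for git bisect run."""
    if not patch_applied:
        return SKIP
    return 0 if test_passed else 1


def check_revision(patch_path: str, test_cmd: Sequence[str]) -> int:
    """Apply the fix-up patch, run the test, then restore the checkout."""
    applied = subprocess.run(["git", "apply", patch_path]).returncode == 0
    passed = False
    try:
        if applied:
            passed = subprocess.run(list(test_cmd)).returncode == 0
    finally:
        # Undo the temporary change to tracked files so bisect can move on.
        subprocess.run(["git", "reset", "--hard", "--quiet"])
    return bisect_verdict(applied, passed)
```

A small wrapper script would pass the result of check_revision to sys.exit and be started through git bisect run. The history itself stays untouched, at the price of applying the patch again on every bisect step.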

-- Hannes

* Re: RFC: grafts generalised
  2008-07-03  7:42           ` Johannes Sixt
@ 2008-07-03  9:37             ` Stephen R. van den Berg
  0 siblings, 0 replies; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-03  9:37 UTC
  To: Johannes Sixt; +Cc: Dmitry Potapov, git

Johannes Sixt wrote:
>Stephen R. van den Berg schrieb:
>> Actually, ripple-through changes are rare.  In the current project it
>> seems I need exactly one, but it's buried deep in the past (sadly).
>> The reason why I need it, is to make sure that git-bisect will work for
>> any revision in the past (i.e. the tree contained/contains some
>> too-clever-for-their-own-good $Revision$-expansion dependencies)

>But you do know that you don't need to apply the change *now*; you can
>apply it at bisect-time? Unless you expect you or your mere mortal
>coworkers are going to do dozens of bisects into that part of the history,
>I wouldn't change history *like*this*. But of course, I don't understand
>the circumstances enough, so... just my 2 cents.

That is exactly the case, I do expect dozens of bisects.
-- 
Sincerely,
           Stephen R. van den Berg.

This is a day for firm decisions!  Or is it?

* Re: RFC: grafts generalised
  2008-07-02 17:32   ` Stephen R. van den Berg
  2008-07-03  0:21     ` Petr Baudis
@ 2008-07-04  0:43     ` Jakub Narebski
  1 sibling, 0 replies; 36+ messages in thread
From: Jakub Narebski @ 2008-07-04  0:43 UTC
  To: Stephen R. van den Berg; +Cc: git

On Wed, 2 July 2008, Stephen R. van den Berg wrote:
> Jakub Narebski wrote:
>>
>> [...] So I think that it would
>> be better to provide generic git-filter-branch filter which can
>> understand this "generalized grafts" file format, or rather
>> 'description of changes' file.  Put it in contrib/, and here you
>> go...
> 
> The problem is that the process of fixing history is an iterative one,
> which can take many months, and every time you make a change, the
> correctness needs to be viewed using gitk.
[...]

I wanted to propose that a generic git-filter-branch filter based on the
"generalized grafts" file should be accompanied by extending gitk so that
it understands this format too...

...but after reading the wonderful suggestion to create new commits with
corrected contents, and insert them (replacing the older versions)
using grafts, thought up and brought forward independently by Dmitry
Potapov and Petr Baudis, I think that you would be best served by
extending gitk to support this approach instead.

You would have to extend gitk to maintain reverse revision mapping
(from revision to its children), and then you would be able to edit
history interactively from within gitk, with gitk correcting its
internal structures to redisplay changed commits, and creating commits
and doing grafting behind the scenes for later git-filter-branch
run.
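The reverse mapping mentioned here can be sketched in a few lines. This is a toy model with short names in place of commit ids, and children_map is a hypothetical helper; it says nothing about how gitk stores its data internally.

```python
from collections import defaultdict
from typing import Dict, List


def children_map(parents: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert a commit -> parents mapping into a commit -> children mapping."""
    children: Dict[str, List[str]] = defaultdict(list)
    for commit, commit_parents in parents.items():
        for parent in commit_parents:
            children[parent].append(commit)
    return {parent: sorted(kids) for parent, kids in children.items()}


# Toy history: a is the root, b and c branch off it, m merges b and c.
parents = {"a": [], "b": ["a"], "c": ["a"], "m": ["b", "c"]}
children = children_map(parents)
```

Looking up a commit in this mapping gives exactly the commits whose graft entries have to change when that commit is replaced by an edited copy.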

-- 
Jakub Narebski
Poland

* Re: RFC: grafts generalised
  2008-07-02 17:42     ` Stephen R. van den Berg
  2008-07-02 18:25       ` Mike Hommey
@ 2008-07-07  6:28       ` Andreas Ericsson
  2008-07-07  6:59         ` Stephen R. van den Berg
  1 sibling, 1 reply; 36+ messages in thread
From: Andreas Ericsson @ 2008-07-07  6:28 UTC
  To: Stephen R. van den Berg; +Cc: Michael J Gruber, git

Stephen R. van den Berg wrote:
> Michael J Gruber wrote:
>> Maybe the upcoming git-sequencer could be the appropriate place? It 
>> tries to achieve just that: edit history by specifying a list of 
>> commands. The currently planned set of commands would need to be 
> 
> That's the problem.  Like git filter-branch, git sequencer needs you to
> parameterise the changes, which, in my case, is hardly possible, since
> the changes are randomlike.
> Also, having to run the sequencer to dig 20000 commits into the past,
> then change something, then come back up and rewrite all following
> history and relations (parents/tags/merges) will take a sizeable amount
> of time.  I need something that can be changed at will, then viewed with
> gitk a second later.
> 

A second later might be too much, but for the case where you need to
add a patch in the middle (which I suspect is the most timeconsuming
and tricky part at the moment), you might want to use a temporary
branch checked out where you need to apply the patch, apply the patch
and then rebase the rest of the history onto that new commit. Rebase
is fairly quick (although not a one-second thing for 20k commits), so
you'll get the time down quite a bit, I imagine.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: RFC: grafts generalised
  2008-07-07  6:28       ` Andreas Ericsson
@ 2008-07-07  6:59         ` Stephen R. van den Berg
  0 siblings, 0 replies; 36+ messages in thread
From: Stephen R. van den Berg @ 2008-07-07  6:59 UTC
  To: Andreas Ericsson; +Cc: Michael J Gruber, git

Andreas Ericsson wrote:
>Stephen R. van den Berg wrote:
>>of time.  I need something that can be changed at will, then viewed with
>>gitk a second later.

>A second later might be too much, but for the case where you need to
>add a patch in the middle (which I suspect is the most timeconsuming
>and tricky part at the moment), you might want to use a temporary
>branch checked out where you need to apply the patch, apply the patch
>and then rebase the rest of the history onto that new commit. Rebase
>is fairly quick (although not a one-second thing for 20k commits), so
>you'll get the time down quite a bit, I imagine.

Not really.
Rebase does two things:
a. Apply every patch/commit again, which takes too long for 20k commits.
b. Mess up carefully grafted parent/merge relationships.

Rebase is only suitable for short linear strands of commits.
The history I'm dealing with is neither short, nor linear.
-- 
Sincerely,
           Stephen R. van den Berg.

A truly wise man never plays leapfrog with a unicorn.

Thread overview: 36+ messages
2008-07-02 14:35 RFC: grafts generalised Stephen R. van den Berg
2008-07-02 16:35 ` Jakub Narebski
2008-07-02 16:43   ` Michael J Gruber
2008-07-02 17:42     ` Stephen R. van den Berg
2008-07-02 18:25       ` Mike Hommey
2008-07-02 18:34         ` Michael J Gruber
2008-07-02 19:31           ` Stephan Beyer
2008-07-02 19:36             ` Stephan Beyer
2008-07-02 20:42             ` Dmitry Potapov
2008-07-02 23:46               ` Stephan Beyer
2008-07-03  6:05                 ` Stephen R. van den Berg
2008-07-02 18:37         ` Stephen R. van den Berg
2008-07-07  6:28       ` Andreas Ericsson
2008-07-07  6:59         ` Stephen R. van den Berg
2008-07-02 17:32   ` Stephen R. van den Berg
2008-07-03  0:21     ` Petr Baudis
2008-07-03  7:11       ` Stephen R. van den Berg
2008-07-04  0:43     ` Jakub Narebski
2008-07-02 17:19 ` Dmitry Potapov
2008-07-02 17:58   ` Dmitry Potapov
2008-07-02 18:10     ` Stephen R. van den Berg
2008-07-02 18:33       ` Dmitry Potapov
2008-07-02 20:39       ` Dmitry Potapov
2008-07-02 21:18         ` Stephen R. van den Berg
2008-07-02 21:28           ` Avery Pennarun
2008-07-02 21:27         ` Junio C Hamano
2008-07-02 21:49           ` Junio C Hamano
2008-07-03  0:03             ` Junio C Hamano
2008-07-03  6:02       ` Johannes Sixt
2008-07-03  7:30         ` Stephen R. van den Berg
2008-07-03  7:42           ` Johannes Sixt
2008-07-03  9:37             ` Stephen R. van den Berg
2008-07-02 17:59   ` Stephen R. van den Berg
2008-07-03  0:13 ` Petr Baudis
2008-07-03  0:16   ` Petr Baudis
2008-07-03  0:28     ` Junio C Hamano