git@vger.kernel.org mailing list mirror (one of many)
* using git-blame with patches as input
@ 2008-06-16 21:35 Don Zickus
  2008-06-16 21:45 ` Junio C Hamano
  2008-06-16 21:54 ` Junio C Hamano
  0 siblings, 2 replies; 6+ messages in thread
From: Don Zickus @ 2008-06-16 21:35 UTC
  To: git

I deal with a lot of backported patches that are a combination of multiple
commits.  I was looking to develop a tool that would help me determine
which chunks of the patch are upstream (not necessarily currently in HEAD
but at some point in the file's history).

For example, if I took the top three commits from HEAD and appended them
into one patch file and then ran this tool with the patch as input, I
would hope that it gave as output the three original commits.

git-blame seems to handle a lot of the pieces I would need, but my little
brain can't follow all the logic behind some of the mechanisms.

Seeing that git-blame can take patch chunks and traverse through commit
history to see if a particular chunk can be blamed on a parent, I feel
like I am most of the way there.  Unfortunately, I don't quite understand
some of the algorithms git-blame does when it splits the patch chunks into
smaller pieces to determine which pieces are blame-able on the parents.

Is there anyone who can help explain some of the low level logic to me?

What I would like to do is take a patch as input, split it into chunks and
traverse through the commit history looking for a match (or something of
high similarity) and output that commit id for each patch chunk.
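
The "split it into chunks" step could be sketched roughly as below.  This
is only an illustration: the function name list_hunks and the sample diff
are made up, and it assumes paths without spaces and plain "+++ b/path"
headers.  It prints one line per hunk so that each hunk can later be
matched against history on its own.

```shell
# Hypothetical helper: read unified diffs on stdin and print one line per
# hunk, in the form "<path>: <hunk header>".
list_hunks() {
	awk '
		# "+++ b/path" names the file the following hunks apply to.
		/^\+\+\+ / { path = $2; sub(/^b\//, "", path); next }
		# Every "@@ ... @@" line starts a new hunk of that file.
		/^@@ /     { print path ": " $0 }
	'
}

# Made-up two-hunk patch, for illustration only.
list_hunks <<'EOF'
--- a/hello.c
+++ b/hello.c
@@ -1,2 +1,2 @@
 keep
-old
+new
@@ -10,2 +10,3 @@
 more
+added
 tail
EOF
```

An added line whose own text begins with "++ " would be mistaken for a
file header here, so a real tool would need to track hunk line counts.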

git-cherry does something close but patches have to be exact whereas my
situation has a combination of patches.  I also understand there are
plenty of normal scenarios where my approach falls flat on its face (but I
have ideas for those).  I just wanted to get a simple common case going
first.

Thanks in advance.

Cheers,
Don

* Re: using git-blame with patches as input
  2008-06-16 21:35 using git-blame with patches as input Don Zickus
@ 2008-06-16 21:45 ` Junio C Hamano
  2008-06-17 14:15   ` Don Zickus
  2008-06-16 21:54 ` Junio C Hamano
  1 sibling, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2008-06-16 21:45 UTC
  To: Don Zickus; +Cc: git

Don Zickus <dzickus@redhat.com> writes:

> I deal with a lot of backported patches that are a combination of multiple
> commits.  I was looking to develop a tool that would help me determine
> which chunks of the patch are upstream (not necessarily currently in HEAD
> but at some point in the file's history).
>
> For example, if I took the top three commits from HEAD and appended them
> into one patch file and then ran this tool with the patch as input, I
> would hope that it gave as output the three original commits.

A quick and dirty hack would be to:

	rm .git/index
	sed -ne 's/^[+ ]//p' -e '/^@@/p' patches... >file
	git add file
	git commit -m 'only "a file" remains'
	git blame -C -C -w file

which would try blaming all the postimage concatenated together ;-)
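
To make the sed step concrete, here is a minimal sketch run on a made-up
patch.  The function name postimage and the sample diff are hypothetical.
Unlike the one-liner, it also drops the "+++ " file header lines, which
would otherwise come through as "++ b/path"; the cost is that an added
line whose own text starts with "++ " is dropped as well.

```shell
# Hypothetical helper: print the postimage side of unified diffs on stdin.
# Hunk headers are kept, "+" and " " prefixes are stripped, removed lines
# and file headers are not printed.
postimage() {
	sed -n -e '/^+++ /d' -e '/^@@/p' -e 's/^[+ ]//p'
}

# Made-up one-hunk patch, for illustration only.
postimage <<'EOF'
--- a/hello.c
+++ b/hello.c
@@ -1,2 +1,2 @@
 #include <stdio.h>
-int main() { return 0; }
+int main(void) { return 0; }
EOF
```

For this sample the output is the hunk header, the context line and the
added line, each without its leading marker character.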

* Re: using git-blame with patches as input
  2008-06-16 21:35 using git-blame with patches as input Don Zickus
  2008-06-16 21:45 ` Junio C Hamano
@ 2008-06-16 21:54 ` Junio C Hamano
  2008-06-16 22:08   ` Junio C Hamano
  2008-06-17 14:17   ` Don Zickus
  1 sibling, 2 replies; 6+ messages in thread
From: Junio C Hamano @ 2008-06-16 21:54 UTC
  To: Don Zickus; +Cc: git

Don Zickus <dzickus@redhat.com> writes:

> For example, if I took the top three commits from HEAD and appended them
> into one patch file and then ran this tool with the patch as input, I
> would hope that it gave as output the three original commits.

Unfortunately blame does not work in such an inefficient way.  The patch
text from your second commit (that is, the diff that shows what used to be
in the first commit and what is in the second commit) may be further
rewritten in the third commit, so if you start blaming such a text from
HEAD, the blame stops at the HEAD commit saying "the text you have is even
newer".
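
The situation can be reproduced in a throwaway repository.  The sketch
below is only an illustration: the repository, the file f.c and its
contents are all made up.  It first shows that blame on HEAD knows only
about the third commit, and then uses the pickaxe search of git-log (-S),
which is a different technique from blame, to find the commits that
touched a string HEAD no longer contains.

```shell
#!/bin/sh
# Throwaway repository; every name and file content below is made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Three commits that each rewrite the same line.
echo 'int x = 1;' >f.c; git add f.c; git commit -q -m first
echo 'int x = 2;' >f.c; git commit -q -a -m second
echo 'int x = 3;' >f.c; git commit -q -a -m third

# Blame on HEAD only sees the line as the third commit left it.
blamed=$(git log -1 --format=%s "$(git blame -l -s f.c | cut -d' ' -f1)")
echo "blame says: $blamed"

# The pickaxe (-S) lists commits that changed how many times the string
# occurs: the one that added it and the one that rewrote it.
touched=$(git log --format=%s -S'int x = 2;' -- f.c)
echo "pickaxe says:"
echo "$touched"
```

The pickaxe matches exact strings only, so it does not help with text
that was modified during a backport.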

> ...  Unfortunately, I don't quite understand
> some of the algorithms git-blame does when it splits the patch chunks into
> smaller pieces to determine which pieces are blame-able on the parents.

http://thread.gmane.org/gmane.comp.version-control.git/28826

* Re: using git-blame with patches as input
  2008-06-16 21:54 ` Junio C Hamano
@ 2008-06-16 22:08   ` Junio C Hamano
  2008-06-17 14:17   ` Don Zickus
  1 sibling, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2008-06-16 22:08 UTC
  To: Don Zickus; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Don Zickus <dzickus@redhat.com> writes:
> ...
>> ...  Unfortunately, I don't quite understand
>> some of the algorithms git-blame does when it splits the patch chunks into
>> smaller pieces to determine which pieces are blame-able on the parents.
>
> http://thread.gmane.org/gmane.comp.version-control.git/28826

In the article quoted, "blame" refers to a very old "git-blame" code that
no longer exists in our codebase.  It talks about "git-pickaxe", which
later took over the "git-blame" name; that happened in acca687
(git-pickaxe: retire pickaxe, 2006-11-08).

It talks about "NEEDSWORK" to hint that the implementation was incomplete,
referring to the version that eventually led to cee7f24 (git-pickaxe:
blame rewritten., 2006-10-19).

* Re: using git-blame with patches as input
  2008-06-16 21:45 ` Junio C Hamano
@ 2008-06-17 14:15   ` Don Zickus
  0 siblings, 0 replies; 6+ messages in thread
From: Don Zickus @ 2008-06-17 14:15 UTC
  To: Junio C Hamano; +Cc: git

On Mon, Jun 16, 2008 at 02:45:59PM -0700, Junio C Hamano wrote:
> Don Zickus <dzickus@redhat.com> writes:
> 
> > I deal with a lot of backported patches that are a combination of multiple
> > commits.  I was looking to develop a tool that would help me determine
> > which chunks of the patch are upstream (not necessarily currently in HEAD
> > but at some point in the file's history).
> >
> > For example, if I took the top three commits from HEAD and appended them
> > into one patch file and then ran this tool with the patch as input, I
> > would hope that it gave as output the three original commits.
> 
> A quick and dirty hack would be to:
> 
> 	rm .git/index
> 	sed -ne 's/^[+ ]//p' -e '/^@@/p' patches... >file
> 	git add file
> 	git commit -m 'only "a file" remains'
> 	git blame -C -C -w file
> 
> which would try blaming all the postimage concatenated together ;-)

Heh.  Interesting.  I'll try that today.  Thanks.

Cheers,
Don

* Re: using git-blame with patches as input
  2008-06-16 21:54 ` Junio C Hamano
  2008-06-16 22:08   ` Junio C Hamano
@ 2008-06-17 14:17   ` Don Zickus
  1 sibling, 0 replies; 6+ messages in thread
From: Don Zickus @ 2008-06-17 14:17 UTC
  To: Junio C Hamano; +Cc: git

On Mon, Jun 16, 2008 at 02:54:54PM -0700, Junio C Hamano wrote:
> Don Zickus <dzickus@redhat.com> writes:
> 
> > For example, if I took the top three commits from HEAD and appended them
> > into one patch file and then ran this tool with the patch as input, I
> > would hope that it gave as output the three original commits.
> 
> Unfortunately blame does not work in such an inefficient way.  The patch
> text from your second commit (that is, the diff that shows what used to be
> in the first commit and what is in the second commit) may be further
> rewritten in the third commit, so if you start blaming such a text from
> HEAD, the blame stops at the HEAD commit saying "the text you have is even
> newer".

I know, but I am trying to crawl before I run.  So I am attacking the
simple cases first to help me understand how the whole git internal
mechanisms work (I am still trying to figure out the correct way to walk
the revision list for a particular file using git-blame as a guide).  Once
my code works for the simple cases, then I can attack the more 'normal'
cases like you described above.

Cheers,
Don
