git@vger.kernel.org mailing list mirror (one of many)
* [RFC/PATCH 1/2] commit: add parse_commit_repl() to replace commits at parsing time
@ 2010-08-17  1:58 Christian Couder
  2010-08-17 21:18 ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Christian Couder @ 2010-08-17  1:58 UTC
  To: Junio C Hamano; +Cc: git, Nguyen Thai Ngoc Duy

The function parse_commit() is not safe regarding replaced commits
because it uses the buffer of the replacement commit but the object
part of the commit struct stays the same. Especially the sha1 is not
changed so it doesn't match the content of the commit.

To fix that, this patch adds a new function that takes a
"struct commit **" instead of a "struct commit *" so we can
change the commit pointer that is passed to us.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 commit.c |   43 +++++++++++++++++++++++++++++++++++++++++++
 commit.h |    6 ++++++
 2 files changed, 49 insertions(+), 0 deletions(-)

diff --git a/commit.c b/commit.c
index 652c1ba..f170179 100644
--- a/commit.c
+++ b/commit.c
@@ -316,6 +316,49 @@ int parse_commit(struct commit *item)
 	return ret;
 }
 
+int parse_commit_repl(struct commit **commit)
+{
+	enum object_type type;
+	void *buffer;
+	unsigned long size;
+	int ret;
+	const unsigned char *repl;
+	struct commit *item = *commit;
+
+	if (!item)
+		return -1;
+	if (item->object.parsed)
+		return 0;
+	buffer = read_sha1_file_repl(item->object.sha1, &type, &size, &repl);
+	if (!buffer)
+		return error("Could not read %s",
+			     sha1_to_hex(item->object.sha1));
+
+	if (item->object.sha1 != repl) {
+		struct commit *repl_item = lookup_commit(repl);
+		if (!repl_item) {
+			free(buffer);
+			return error("Bad replacement %s for commit %s",
+				     sha1_to_hex(repl),
+				     sha1_to_hex(item->object.sha1));
+		}
+		repl_item->object.flags = item->object.flags;
+		*commit = item = repl_item;
+	} else if (type != OBJ_COMMIT) {
+		free(buffer);
+		return error("Object %s not a commit",
+			     sha1_to_hex(item->object.sha1));
+	}
+
+	ret = parse_commit_buffer(item, buffer, size);
+	if (save_commit_buffer && !ret) {
+		item->buffer = buffer;
+		return 0;
+	}
+	free(buffer);
+	return ret;
+}
+
 int find_commit_subject(const char *commit_buffer, const char **subject)
 {
 	const char *eol;
diff --git a/commit.h b/commit.h
index a3618f8..d3dfebb 100644
--- a/commit.h
+++ b/commit.h
@@ -39,6 +39,12 @@ struct commit *lookup_commit_reference_gently(const unsigned char *sha1,
 
 int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size);
 
+int parse_commit_repl(struct commit **item);
+
+/*
+ * parse_commit() is deprecated, because it's buggy regarding replacements.
+ * Use parse_commit_repl() instead.
+ */
 int parse_commit(struct commit *item);
 
 /* Find beginning and length of commit subject. */
-- 
1.7.2.1.351.g275bf
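
The reason for taking a "struct commit **" can be shown with a minimal
sketch. The toy_* names and the struct below are hypothetical stand-ins,
not git's real types; the replacement field is an assumption made only so
the example is self-contained.

```c
#include <stddef.h>

/* Toy stand-in for git's struct commit; only the fields needed here. */
struct toy_commit {
	const char *id;                 /* stands in for object.sha1 */
	struct toy_commit *replacement; /* NULL when nothing replaces it */
};

/* Single pointer: the callee cannot redirect the caller's variable. */
static int toy_parse(struct toy_commit *item)
{
	if (!item)
		return -1;
	if (item->replacement)
		item = item->replacement; /* changes only the local copy */
	return item->id ? 0 : -1;
}

/* Double pointer: the callee can hand back the replacement commit. */
static int toy_parse_repl(struct toy_commit **commit)
{
	struct toy_commit *item = *commit;

	if (!item)
		return -1;
	if (item->replacement)
		*commit = item->replacement; /* caller now sees the replacement */
	return 0;
}
```

After toy_parse(p) the caller's p still points at the original commit;
after toy_parse_repl(&p) it points at the replacement.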


* Re: [RFC/PATCH 1/2] commit: add parse_commit_repl() to replace commits at parsing time
  2010-08-17  1:58 [RFC/PATCH 1/2] commit: add parse_commit_repl() to replace commits at parsing time Christian Couder
@ 2010-08-17 21:18 ` Junio C Hamano
  2010-08-18  3:17   ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2010-08-17 21:18 UTC
  To: Christian Couder; +Cc: git, Nguyen Thai Ngoc Duy

Christian Couder <chriscool@tuxfamily.org> writes:

> The function parse_commit() is not safe regarding replaced commits
> because it uses the buffer of the replacement commit but the object
> part of the commit struct stays the same. Especially the sha1 is not
> changed so it doesn't match the content of the commit.

This all sounds backwards to me, if I am reading the discussion correctly.

If a replace record says commit 0123 is replaced by commit 4567 (iow, 0123
was a mistake, and pretend that its content is what is recorded in 4567),
and when we are honoring the replace records (iow, we are not fsck),
shouldn't read_sha1("0123") give us a piece of memory that stores what is
recorded in 4567, parse_object("0123") return a struct commit whose buffer
points at a block of memory that has what is recorded in 4567 _while_ its
object.sha1[] says "0123"?

What problem are you trying to solve?


* Re: [RFC/PATCH 1/2] commit: add parse_commit_repl() to replace commits at parsing time
  2010-08-17 21:18 ` Junio C Hamano
@ 2010-08-18  3:17   ` Nguyen Thai Ngoc Duy
  2010-08-18  4:07     ` Christian Couder
  0 siblings, 1 reply; 8+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-08-18  3:17 UTC
  To: Junio C Hamano; +Cc: Christian Couder, git

On Wed, Aug 18, 2010 at 7:18 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <chriscool@tuxfamily.org> writes:
>
>> The function parse_commit() is not safe regarding replaced commits
>> because it uses the buffer of the replacement commit but the object
>> part of the commit struct stays the same. Especially the sha1 is not
>> changed so it doesn't match the content of the commit.
>
> This all sounds backwards to me, if I am reading the discussion correctly.
>
> If a replace record says commit 0123 is replaced by commit 4567 (iow, 0123
> was a mistake, and pretend that its content is what is recorded in 4567),
> and when we are honoring the replace records (iow, we are not fsck),
> shouldn't read_sha1("0123") give us a piece of memory that stores what is
> recorded in 4567, parse_object("0123") return a struct commit whose buffer
> points at a block of memory that has what is recorded in 4567 _while_ its
> object.sha1[] says "0123"?

1. parse_object() as it is now would return object.sha1[] = "4567".
2. lookup_commit(), then parse_commit() would return object.sha1[] = "0123".

> What problem are you trying to solve?

Inconsistency in replacing objects. I have no comments whether #1 or
#2 is expected behavior. But at least it should stick to one behavior
only.
-- 
Duy
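
The two behaviors listed above can be modeled with a small sketch. This is
a toy object store, not git's code: every toy_* name, the two-entry store,
and the single hard-coded replace record are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Toy object store: id -> content. All names here are hypothetical. */
struct toy_stored {
	const char *id;
	const char *content;
};

static const struct toy_stored toy_store[] = {
	{ "0123", "original commit" },
	{ "4567", "replacement commit" },
};

/* One replace record: 0123 is replaced by 4567. */
static const char *toy_replace_from = "0123";
static const char *toy_replace_to = "4567";

struct toy_parsed {
	const char *id;     /* stands in for object.sha1 */
	const char *buffer; /* stands in for the commit buffer */
};

/* Map an id through the replace record. */
static const char *toy_resolve(const char *id)
{
	return strcmp(id, toy_replace_from) ? id : toy_replace_to;
}

/* Read content while honoring the replace record. */
static const char *toy_read(const char *id)
{
	const char *real = toy_resolve(id);
	size_t i;

	for (i = 0; i < sizeof(toy_store) / sizeof(toy_store[0]); i++)
		if (!strcmp(toy_store[i].id, real))
			return toy_store[i].content;
	return NULL;
}

/* Behavior #1: the struct is named after the replacement. */
static struct toy_parsed toy_parse_object(const char *id)
{
	struct toy_parsed c = { toy_resolve(id), toy_read(id) };
	return c;
}

/* Behavior #2: the struct keeps the requested name, with replaced content. */
static struct toy_parsed toy_lookup_then_parse(const char *id)
{
	struct toy_parsed c = { id, toy_read(id) };
	return c;
}
```

Both paths return the replacement's content for "0123"; they differ only in
which id the resulting struct carries.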


* Re: [RFC/PATCH 1/2] commit: add parse_commit_repl() to replace commits at parsing time
  2010-08-18  3:17   ` Nguyen Thai Ngoc Duy
@ 2010-08-18  4:07     ` Christian Couder
  2010-08-18  4:24       ` Jonathan Nieder
  2010-08-18 14:50       ` Junio C Hamano
  0 siblings, 2 replies; 8+ messages in thread
From: Christian Couder @ 2010-08-18  4:07 UTC
  To: Junio C Hamano; +Cc: Nguyen Thai Ngoc Duy, git

On Wednesday 18 August 2010 05:17:52 Nguyen Thai Ngoc Duy wrote:
> On Wed, Aug 18, 2010 at 7:18 AM, Junio C Hamano <gitster@pobox.com> wrote:
> > Christian Couder <chriscool@tuxfamily.org> writes:
> >> The function parse_commit() is not safe regarding replaced commits
> >> because it uses the buffer of the replacement commit but the object
> >> part of the commit struct stays the same. Especially the sha1 is not
> >> changed so it doesn't match the content of the commit.
> > 
> > This all sounds backwards to me, if I am reading the discussion
> > correctly.
> > 
> > If a replace record says commit 0123 is replaced by commit 4567 (iow,
> > 0123 was a mistake, and pretend that its content is what is recorded in
> > 4567), and when we are honoring the replace records (iow, we are not
> > fsck), shouldn't read_sha1("0123") give us a piece of memory that stores
> > what is recorded in 4567, parse_object("0123") return a struct commit
> > whose buffer points at a block of memory that has what is recorded in
> > 4567 _while_ its object.sha1[] says "0123"?
> 
> 1. parse_object() as it is now would return object.sha1[] = "4567".
> 2. lookup_commit(), then parse_commit() would return object.sha1[] =
> "0123".
> 
> > What problem are you trying to solve?
> 
> Inconsistency in replacing objects. I have no comments whether #1 or
> #2 is expected behavior. But at least it should stick to one behavior
> only.

We discussed this inconsistency in this thread:

http://thread.gmane.org/gmane.comp.version-control.git/152321/ 

So we can resolve the inconsistency with Duy's patch to make parse_object() 
return object.sha1[] = "0123".

It's simpler and probably safer. The downside is that the sha1 will not be 
consistent with the content anymore and that it will be more difficult to 
realize that an object has been replaced as there will be no sha1 change to be 
seen.

Best regards,
Christian.


* Re: [RFC/PATCH 1/2] commit: add parse_commit_repl() to replace commits at parsing time
  2010-08-18  4:07     ` Christian Couder
@ 2010-08-18  4:24       ` Jonathan Nieder
  2010-08-18  4:37         ` Nguyen Thai Ngoc Duy
  2010-08-18 14:50       ` Junio C Hamano
  1 sibling, 1 reply; 8+ messages in thread
From: Jonathan Nieder @ 2010-08-18  4:24 UTC
  To: Christian Couder; +Cc: Junio C Hamano, Nguyen Thai Ngoc Duy, git

Christian Couder wrote:

> The downside is that the sha1 will not be 
> consistent with the content anymore and that it will be more difficult to 
> realize that an object has been replaced as there will be no sha1 change to be 
> seen.

Maybe in the long run it would make sense to keep a "replaced" flag
and use it to mark the replace objects specially in some user-facing
commands (like log --format=medium).


* Re: [RFC/PATCH 1/2] commit: add parse_commit_repl() to replace commits at parsing time
  2010-08-18  4:24       ` Jonathan Nieder
@ 2010-08-18  4:37         ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 8+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-08-18  4:37 UTC
  To: Jonathan Nieder; +Cc: Christian Couder, Junio C Hamano, git

On Wed, Aug 18, 2010 at 2:24 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Christian Couder wrote:
>
>> The downside is that the sha1 will not be
>> consistent with the content anymore and that it will be more difficult to
>> realize that an object has been replaced as there will be no sha1 change to be
>> seen.
>
> Maybe in the long run it would make sense to keep a "replaced" flag
> and use it to mark the replace objects specially in some user-facing
> commands (like log --format=medium).

Sounds good (if users really need to know that). It can be used for
commit grafts too.

You would need to find an available bit flag first though. Currently
object.flags is used for different purposes and its bit definitions
are not centralized. So it's hard to find a "good" bit that no one
uses yet, to become the "replaced" bit. OK I'm getting off-topic now.
-- 
Duy
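
A "replaced" bit of the kind discussed above might look like the following
sketch. The bit positions and all toy_*/TOY_* names are hypothetical; git's
object.flags bits are assigned by individual commands, which is exactly why
picking a free one is hard.

```c
/* Hypothetical bits; neither value is taken from git's sources. */
#define TOY_SEEN     (1u << 0)
#define TOY_REPLACED (1u << 7)

struct toy_flagged {
	unsigned flags; /* stands in for object.flags */
};

/* Record that this object's content came from a replacement. */
static void toy_mark_replaced(struct toy_flagged *obj)
{
	obj->flags |= TOY_REPLACED;
}

/* Non-zero when the object was marked as replaced. */
static int toy_is_replaced(const struct toy_flagged *obj)
{
	return (obj->flags & TOY_REPLACED) != 0;
}
```

The scheme only works if no other user of the flags word claims the same
bit, so a real implementation would first need the bit definitions to be
audited or centralized.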


* Re: [RFC/PATCH 1/2] commit: add parse_commit_repl() to replace commits at parsing time
  2010-08-18  4:07     ` Christian Couder
  2010-08-18  4:24       ` Jonathan Nieder
@ 2010-08-18 14:50       ` Junio C Hamano
  2010-08-20  4:04         ` Christian Couder
  1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2010-08-18 14:50 UTC
  To: Christian Couder; +Cc: Junio C Hamano, Nguyen Thai Ngoc Duy, git

Christian Couder <chriscool@tuxfamily.org> writes:

> On Wednesday 18 August 2010 05:17:52 Nguyen Thai Ngoc Duy wrote:
>> On Wed, Aug 18, 2010 at 7:18 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> > Christian Couder <chriscool@tuxfamily.org> writes:
>> >> The function parse_commit() is not safe regarding replaced commits
>> >> because it uses the buffer of the replacement commit but the object
>> >> part of the commit struct stays the same. Especially the sha1 is not
>> >> changed so it doesn't match the content of the commit.
>> > 
>> > This all sounds backwards to me, if I am reading the discussion
>> > correctly.
>> > 
>> > If a replace record says commit 0123 is replaced by commit 4567 (iow,
>> > 0123 was a mistake, and pretend that its content is what is recorded in
>> > 4567), and when we are honoring the replace records (iow, we are not
>> > fsck), shouldn't read_sha1("0123") give us a piece of memory that stores
>> > what is recorded in 4567, parse_object("0123") return a struct commit
>> > whose buffer points at a block of memory that has what is recorded in
>> > 4567 _while_ its object.sha1[] says "0123"?
>> 
>> 1. parse_object() as it is now would return object.sha1[] = "4567".
>> 2. lookup_commit(), then parse_commit() would return object.sha1[] =
>> "0123".
>> 
>> > What problem are you trying to solve?
>> 
>> Inconsistency in replacing objects. I have no comments whether #1 or
>> #2 is expected behavior. But at least it should stick to one behavior
>> only.
>
> We discussed this inconsistency in this thread:
>
> http://thread.gmane.org/gmane.comp.version-control.git/152321/ 
>
> So we can resolve the inconsistency with Duy's patch to make parse_object() 
> return object.sha1[] = "0123".
>
> It's simpler and probably safer. The downside is that the sha1 will not be 
> consistent with the content anymore and that it will be more difficult to 
> realize that an object has been replaced as there will be no sha1 change to be 
> seen.

I do not see it as a downside at all.

If the user wants to take replaced objects, they should be shown just like
ordinary objects at the machinery level.

Of course, the user is free to add comments on the commit log to note the
fact that a new commit is replacing some other commit and for what
purpose.  Also if somebody really wants to, cat-file piped to hash-object
can be used to see the difference.
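
The idea of re-hashing an object's content and comparing the result with
its name can be sketched in C. This is only an illustration: FNV-1a is used
as a small stand-in for SHA-1, git hashes an object header together with
the content, and all toy_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, used here only as a small stand-in for SHA-1. */
static uint32_t toy_hash(const char *data)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; data[i]; i++) {
		h ^= (unsigned char)data[i];
		h *= 16777619u;
	}
	return h;
}

struct toy_named {
	uint32_t name;       /* what the object is called */
	const char *content; /* what reading it returns */
};

/* Returns 1 when the content no longer hashes to the object's name. */
static int toy_looks_replaced(const struct toy_named *obj)
{
	return toy_hash(obj->content) != obj->name;
}
```

An object whose name was computed from its own content passes the check,
while one that keeps the old name but returns replacement content does not.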


* Re: [RFC/PATCH 1/2] commit: add parse_commit_repl() to replace commits at parsing time
  2010-08-18 14:50       ` Junio C Hamano
@ 2010-08-20  4:04         ` Christian Couder
  0 siblings, 0 replies; 8+ messages in thread
From: Christian Couder @ 2010-08-20  4:04 UTC
  To: Junio C Hamano; +Cc: Nguyen Thai Ngoc Duy, git

On Wednesday 18 August 2010 16:50:24 Junio C Hamano wrote:
> Christian Couder <chriscool@tuxfamily.org> writes:
> > On Wednesday 18 August 2010 05:17:52 Nguyen Thai Ngoc Duy wrote:
> >> On Wed, Aug 18, 2010 at 7:18 AM, Junio C Hamano <gitster@pobox.com> wrote:
> >> > Christian Couder <chriscool@tuxfamily.org> writes:
> >> >> The function parse_commit() is not safe regarding replaced commits
> >> >> because it uses the buffer of the replacement commit but the object
> >> >> part of the commit struct stays the same. Especially the sha1 is not
> >> >> changed so it doesn't match the content of the commit.
> >> > 
> >> > This all sounds backwards to me, if I am reading the discussion
> >> > correctly.
> >> > 
> >> > If a replace record says commit 0123 is replaced by commit 4567 (iow,
> >> > 0123 was a mistake, and pretend that its content is what is recorded
> >> > in 4567), and when we are honoring the replace records (iow, we are
> >> > not fsck), shouldn't read_sha1("0123") give us a piece of memory that
> >> > stores what is recorded in 4567, parse_object("0123") return a struct
> >> > commit whose buffer points at a block of memory that has what is
> >> > recorded in 4567 _while_ its object.sha1[] says "0123"?
> >> 
> >> 1. parse_object() as it is now would return object.sha1[] = "4567".
> >> 2. lookup_commit(), then parse_commit() would return object.sha1[] =
> >> "0123".
> >> 
> >> > What problem are you trying to solve?
> >> 
> >> Inconsistency in replacing objects. I have no comments whether #1 or
> >> #2 is expected behavior. But at least it should stick to one behavior
> >> only.
> > 
> > We discussed this inconsistency in this thread:
> > 
> > http://thread.gmane.org/gmane.comp.version-control.git/152321/
> > 
> > So we can resolve the inconsistency with Duy's patch to make
> > parse_object() return object.sha1[] = "0123".
> > 
> > It's simpler and probably safer. The downside is that the sha1 will not
> > be consistent with the content anymore and that it will be more
> > difficult to realize that an object has been replaced as there will be
> > no sha1 change to be seen.
> 
> I do not see it as a downside at all.
> 
> If the user wants to take replaced objects, they should be shown just like
> ordinary objects at the machinery level.
> 
> Of course, the user is free to add comments on the commit log to note the
> fact that a new commit is replacing some other commit and for what
> purpose.  Also if somebody really wants to, cat-file piped to hash-object
> can be used to see the difference.

Ok so please apply Duy's patch perhaps with an improved commit message.

Thanks,
Christian.


