git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* mktree: multiple same-named objects
@ 2014-08-27  4:41 David Turner
  2014-08-27  5:01 ` Duy Nguyen
  2014-08-27  5:13 ` Jeff King
  0 siblings, 2 replies; 6+ messages in thread
From: David Turner @ 2014-08-27  4:41 UTC (permalink / raw
  To: git mailing list

git mktree seems to allow the creation of a tree object with multiple
objects of the same name but different SHAs.  This leads to weird
behavior later, unsurprisingly.  For instance, if there are two tree
objects with the same name but different SHAs, the checked out tree will
be the union of them (reasonably), but if you do git add $name, some or
all unmodified files under $name will show up in git status as modified
-- since they differ from one of the parent trees, presumably.

And if different git implementations treat this case differently, then
it might be possible to make a repo that appears to contain one thing
when viewed with one implementation, but contains a different thing for
a different implementation.

Summary: git mktree ought to forbid this, and possibly there ought to be
other checks (for instance, when unpacking) to prevent this.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: mktree: multiple same-named objects
  2014-08-27  4:41 mktree: multiple same-named objects David Turner
@ 2014-08-27  5:01 ` Duy Nguyen
  2014-08-27 21:48   ` David Turner
  2014-08-27  5:13 ` Jeff King
  1 sibling, 1 reply; 6+ messages in thread
From: Duy Nguyen @ 2014-08-27  5:01 UTC (permalink / raw
  To: David Turner; +Cc: git mailing list

On Wed, Aug 27, 2014 at 11:41 AM, David Turner <dturner@twopensource.com> wrote:
> Summary: git mktree ought to forbid this, and possibly there ought to be
> other checks (for instance, when unpacking) to prevent this.

Does fsck detect this (because we ought to fix fsck first if it does not)?
-- 
Duy

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: mktree: multiple same-named objects
  2014-08-27  4:41 mktree: multiple same-named objects David Turner
  2014-08-27  5:01 ` Duy Nguyen
@ 2014-08-27  5:13 ` Jeff King
  2014-08-27 15:17   ` Junio C Hamano
  1 sibling, 1 reply; 6+ messages in thread
From: Jeff King @ 2014-08-27  5:13 UTC (permalink / raw
  To: David Turner; +Cc: Johannes Schindelin, git mailing list

On Wed, Aug 27, 2014 at 12:41:57AM -0400, David Turner wrote:

> git mktree seems to allow the creation of a tree object with multiple
> objects of the same name but different SHAs.

Yeah, I don't think we do many quality checks there. Ditto for "git
hash-object".

The latter goes through index_mem, which at least checks that the
resulting tree is parseable. It does not look like mktree even checks
that.

> Summary: git mktree ought to forbid this, and possibly there ought to be
> other checks (for instance, when unpacking) to prevent this.

The checks in git-fsck will notice your problem (and many others). I
think we should be running them anytime we create an object based on
arbitrary data (including mktree and hash-object). Code paths like "git
write-tree" and "git commit-tree" are probably OK, as their code should
follow the standard (it would not hurt to double-check their output,
though there may be a performance implication).

Dscho (cc'd) has been looking into this approach; I don't know how far
he has gotten.

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: mktree: multiple same-named objects
  2014-08-27  5:13 ` Jeff King
@ 2014-08-27 15:17   ` Junio C Hamano
  2014-08-27 16:24     ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2014-08-27 15:17 UTC (permalink / raw
  To: Jeff King; +Cc: David Turner, Johannes Schindelin, git mailing list

Jeff King <peff@peff.net> writes:

> On Wed, Aug 27, 2014 at 12:41:57AM -0400, David Turner wrote:
>
>> git mktree seems to allow the creation of a tree object with multiple
>> objects of the same name but different SHAs.
>
> Yeah, I don't think we do many quality checks there. Ditto for "git
> hash-object".

I am somewhat against outright removing the capability to write out
invalid objects deliberately from these low level tools, because we
would need a way to easily reproduce bugs in end-user facing tools
by other people who claim to produce Git objects, but I would agree
that by default that should be forbidden.

In other words, two things must happen; improve checks when these
low level debugging aid tools are creating objects, and allow
bypassing these additional checks with "--experiment" option or
something.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: mktree: multiple same-named objects
  2014-08-27 15:17   ` Junio C Hamano
@ 2014-08-27 16:24     ` Jeff King
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2014-08-27 16:24 UTC (permalink / raw
  To: Junio C Hamano; +Cc: David Turner, Johannes Schindelin, git mailing list

On Wed, Aug 27, 2014 at 08:17:15AM -0700, Junio C Hamano wrote:

> I am somewhat against outright removing the capability to write out
> invalid objects deliberately from these low level tools, because we
> would need a way to easily reproduce bugs in end-user facing tools
> by other people who claim to produce Git objects, but I would agree
> that by default that should be forbidden.
> 
> In other words, two things must happen; improve checks when these
> low level debugging aid tools are creating objects, and allow
> bypassing these additional checks with "--experiment" option or
> something.

Yeah, definitely. I had imagined it as "--strict" and "--no-strict",
with flipping the default to "--strict" at some point (I do not see a
reason anybody would not want it in normal use, but if we are worried,
we can even go slowly on flipping the default).

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: mktree: multiple same-named objects
  2014-08-27  5:01 ` Duy Nguyen
@ 2014-08-27 21:48   ` David Turner
  0 siblings, 0 replies; 6+ messages in thread
From: David Turner @ 2014-08-27 21:48 UTC (permalink / raw
  To: Duy Nguyen; +Cc: git mailing list

On Wed, 2014-08-27 at 12:01 +0700, Duy Nguyen wrote:
> On Wed, Aug 27, 2014 at 11:41 AM, David Turner <dturner@twopensource.com> wrote:
> > Summary: git mktree ought to forbid this, and possibly there ought to be
> > other checks (for instance, when unpacking) to prevent this.
> 
> Does fsck detect this (because we ought to fix fsck first if it does not)?

Yes, it does.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-08-27 21:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-08-27  4:41 mktree: multiple same-named objects David Turner
2014-08-27  5:01 ` Duy Nguyen
2014-08-27 21:48   ` David Turner
2014-08-27  5:13 ` Jeff King
2014-08-27 15:17   ` Junio C Hamano
2014-08-27 16:24     ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).