git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: "Randall S. Becker" <rsbecker@nexbridge.com>
Cc: "'Johannes Sixt'" <j6t@kdbg.org>, <git@vger.kernel.org>
Subject: Re: [Question] Signature calculation ignoring parts of binary files
Date: Thu, 13 Sep 2018 10:51:40 -0700
Message-ID: <xmqqva79ffpv.fsf@gitster-ct.c.googlers.com>
In-Reply-To: <xmqqefdxign2.fsf@gitster-ct.c.googlers.com> (Junio C. Hamano's message of "Thu, 13 Sep 2018 08:03:29 -0700")

Junio C Hamano <gitster@pobox.com> writes:

> "Randall S. Becker" <rsbecker@nexbridge.com> writes:
>
>> The scenario is slightly different.
>> 1. Person A gives me a new binary file-1 with fingerprint A1. This goes into
>> git unchanged.
>> 2. Person B gives me binary file-2 with fingerprint B2. This does not go
>> into git yet.
>> 3. We attempt a git diff between the committed file-1 and uncommitted file-2
>> using a textconv implementation that strips what we don't need to compare.
>> 4. If file-1 and file-2 have no difference when textconv is used, file-2 is
>> not added and not committed. It is discarded with impunity, never to be seen
>> again, although we might whine a lot at the user for attempting to put
>> file-2 in - but that's not git's issue.
>
> You are forgetting that Git is a distributed version control system,
> aren't you?  Person A and B can introduce their "moral equivalent
> but bytewise different" copies to their repository under the same
> object name, and you can pull from them--what happens?
>
> It is fundamental that one object name given to Git identifies one
> specific byte sequence contained in an object uniquely.  Once you
> broke that, you no longer have Git.

Having said all that, if you want to keep the originals with their
frills but still treat these bytewise different things that reduce
to the same essence (e.g. when passed through a filter like
textconv) as equivalent, I suspect a better approach might be to
store both the "original" and the result of passing the "original"
through the filter in the object database.  In the above example,
you'll get two "original" objects from person A and person B, plus
one "canonical" object that is bytewise different from either of
these two originals, but is what they both reduce to when you use
the filter on them.  Then you record the fact that the "essence"
object can be derived by reducing either person A's or person B's
"original" through the filter, perhaps by using "git notes" attached
to the "essence" object, recording the object names of these
originals.

The reason to use notes in this direction is that you can
mechanically determine which "essence" object any given "original"
object reduces to; it is just a matter of passing it through the
filter.  But there can be more than one "original" that reduces to
the same "essence", so that reverse mapping is the one that has to
be recorded.
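
To make the bookkeeping above concrete, here is a rough sketch in
Python rather than in Git itself.  It only computes the object names
Git would assign to blobs in a SHA-1 repository and builds the
"essence" to "originals" mapping that the notes would carry.  The
filter (reduce_to_essence), the sample file contents, and the helper
names are all made up for illustration; a real setup would more
likely write the blobs with "git hash-object -w" and attach the list
with "git notes add".

```python
import hashlib
from collections import defaultdict


def blob_name(data: bytes) -> str:
    """Object name Git gives a blob with exactly these bytes (SHA-1 repos)."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def reduce_to_essence(original: bytes) -> bytes:
    """Hypothetical filter: drop the first line, treated here as a
    volatile header that should not take part in the comparison."""
    _, _, payload = original.partition(b"\n")
    return payload


def record_originals(originals):
    """Map each essence object name to the object names of the
    originals that reduce to it (the content the note would hold)."""
    notes = defaultdict(list)
    for data in originals:
        essence = reduce_to_essence(data)
        notes[blob_name(essence)].append(blob_name(data))
    return dict(notes)


# Assumed sample inputs: same payload, different per-person header.
file_1 = b"built-by: A 2018-09-12\nPAYLOAD"
file_2 = b"built-by: B 2018-09-13\nPAYLOAD"

notes = record_originals([file_1, file_2])
for essence_name, original_names in notes.items():
    print("essence ", essence_name)
    for name in original_names:
        print("  original", name)
```

Both originals keep their own object names, so one name still means
one byte sequence, while the single essence entry lists the two
originals that reduce to it.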


