From: "Randall S. Becker" <rsbecker@nexbridge.com>
To: "'Junio C Hamano'" <gitster@pobox.com>
Cc: "'Johannes Sixt'" <j6t@kdbg.org>, <git@vger.kernel.org>
Subject: RE: [Question] Signature calculation ignoring parts of binary files
Date: Thu, 13 Sep 2018 13:55:13 -0400
Message-ID: <005e01d44b8a$ef1221e0$cd3665a0$@nexbridge.com>
In-Reply-To: <xmqqva79ffpv.fsf@gitster-ct.c.googlers.com>
On September 13, 2018 1:52 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> > "Randall S. Becker" <rsbecker@nexbridge.com> writes:
> >
> >> The scenario is slightly different.
> >> 1. Person A gives me a new binary file-1 with fingerprint A1. This
> >> goes into git unchanged.
> >> 2. Person B gives me binary file-2 with fingerprint B2. This does not
> >> go into git yet.
> >> 3. We attempt a git diff between the committed file-1 and uncommitted
> >> file-2 using a textconv implementation that strips what we don't need
> >> to compare.
> >> 4. If file-1 and file-2 have no difference when textconv is used,
> >> file-2 is not added and not committed. It is discarded with impunity,
> >> never to be seen again, although we might whine a lot at the user for
> >> attempting to put file-2 in - but that's not git's issue.
> >
> > You are forgetting that Git is a distributed version control system,
> > aren't you? Person A and B can introduce their "moral equivalent but
> > bytewise different" copies to their repository under the same object
> > name, and you can pull from them--what happens?
> >
> > It is fundamental that one object name given to Git identifies one
> > specific byte sequence contained in an object uniquely. Once you
> > break that, you no longer have Git.
>
> Having said all that, if you want to keep the original with frills but
> somehow relate these bytewise different things that reduce to the same
> essence (e.g. when passed thru a filter like textconv), I suspect a
> better approach might be to store both the "original" and the result of
> passing the "original" through the filter in the object database. In
> the above example, you'll get two "original" objects from person A and
> person B, plus one "canonical" object that is bytewise different from
> either of these two originals, but is what they reduce to when you use
> the filter on them. Then you record the fact that to derive the
> "essence" object, you can reduce either person A's or person B's
> "original" through the filter, perhaps by using "git notes" attached to
> the "essence" object, recording the object names of these originals
> (the reason for using notes in this direction is that you can
> mechanically determine which "essence" object any given "original"
> object reduces to; it is just a matter of passing it through the
> filter. But there can be more than one "original" that reduces to the
> same "essence").
I like that idea. It turns the reduced object into a contract. Thanks.
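
A minimal sketch of the bookkeeping described above, written in Python
rather than with actual git plumbing. It only illustrates the direction
of the mapping (one "essence" object name pointing at several "original"
object names). The filter strip_frills() and the sample file contents are
hypothetical stand-ins for a real textconv filter and real binaries, and
the object names assume a SHA-1 repository. In a real repository the
objects would presumably be written with "git hash-object -w" and the
mapping recorded with "git notes add" on the essence object.

```python
import hashlib
from collections import defaultdict


def git_blob_id(data: bytes) -> str:
    """Object name a SHA-1 Git repository assigns to a blob with this content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def strip_frills(data: bytes) -> bytes:
    """Hypothetical textconv stand-in: drop a 4-byte header we do not compare."""
    return data[4:]


def index_originals(originals):
    """Map each essence object name to the names of the originals that reduce to it."""
    notes = defaultdict(list)
    for data in originals:
        essence = strip_frills(data)
        notes[git_blob_id(essence)].append(git_blob_id(data))
    return dict(notes)


# Two bytewise different originals (hypothetical contents) with the same essence.
file_1 = b"A1A1payload"
file_2 = b"B2B2payload"
notes = index_originals([file_1, file_2])

for essence_id, original_ids in notes.items():
    print(essence_id, "<-", ", ".join(original_ids))
```

Each original keeps its own object name, so the rule that one name
identifies one byte sequence still holds. The equivalence lives only in
the note attached to the essence object.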