git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Coiner, John" <John.Coiner@amd.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: git, monorepos, and access control
Date: Thu, 6 Dec 2018 02:20:03 -0500
Message-ID: <20181206072002.GA29787@sigill.intra.peff.net>
In-Reply-To: <xmqqwoona2c6.fsf@gitster-ct.c.googlers.com>

On Thu, Dec 06, 2018 at 10:08:57AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > In my opinion this feature is so contrary to Git's general assumptions
> > that it's likely to create a ton of information leaks of the supposedly
> > protected data.
> > ...
> 
> Yup, with s/implemented/designed/, I agree with all you said here
> (snipped).

Heh, yeah, I actually scratched my head over what word to use. I think
Git _could_ be written in a way that is both compatible with existing
repositories (i.e., is still recognizably Git) and is careful about
object access control. But either way, what we have now is not close to
that.

> > Sorry I don't have a more positive response. What you want to do is
> > perfectly reasonable, but I just think it's a mismatch with how Git
> > works (and because of the security impact, one missed corner case
> > renders the whole thing useless).
> 
> Yup, again.
> 
> Storing source files encrypted and decrypting with smudge filter
> upon checkout (and those without the access won't get keys and will
> likely use sparse checkout to exclude these privileged sources)
> is probably the only workaround that does not involve submodules.
> Viewing "diff" and "log -p" would still be a challenge, which
> probably could use the same filter as smudge for textconv.

I suspect there are going to be some funny corner cases there. I use:

  [diff "gpg"]
  textconv = gpg -qd --no-tty

which works pretty well, but it's for files which are _never_ decrypted
by Git. So they're encrypted in the working tree too, and I don't use
clean/smudge filters.
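
For reference, a diff driver like that only kicks in for paths that name
it in .gitattributes. A minimal sketch, assuming the encrypted files end
in .gpg (the pattern is an assumption, not something stated above):

```
# .gitattributes
# "*.gpg" is a hypothetical pattern; "gpg" must match the [diff "gpg"] section name.
*.gpg diff=gpg
```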

If the files are already decrypted in the working tree, then running
them through gpg again would be the wrong thing. I guess for a diff
against the working tree, we would always do a "clean" operation to
produce the encrypted text, and then decrypt the result using textconv.
Which would work, but is rather slow.
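
To make those two steps concrete, here is a rough by-hand sketch of the
round trip. It is not a claim about what Git does internally. rot13 via
tr stands in for gpg so that it runs without keys, and the clean and
textconv function names are made up for illustration:

```shell
# rot13 is a stand-in cipher (NOT real encryption) so the sketch needs no keys.
clean()    { tr 'A-Za-z' 'N-ZA-Mn-za-m'; }        # hypothetical "encrypt" filter: stdin -> stdout
textconv() { tr 'A-Za-z' 'N-ZA-Mn-za-m' <"$1"; }  # hypothetical "decrypt" textconv: takes a file name

workdir=$(mktemp -d)
printf 'hello world\n' >"$workdir/secret.txt"     # already-decrypted working-tree file

# Step 1: "clean" the working-tree content into its in-repo (encrypted) form.
clean <"$workdir/secret.txt" >"$workdir/secret.enc"
encrypted=$(cat "$workdir/secret.enc")

# Step 2: run textconv on the encrypted form to get diffable text back.
roundtrip=$(textconv "$workdir/secret.enc")

printf 'in-repo form: %s\n' "$encrypted"
printf 'diffable text: %s\n' "$roundtrip"
```

The second step just undoes the first, which is where the extra cost
comes from when the real filter is something as heavy as gpg.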

> I wonder (and this is the primary reason why I am responding to you)
> if it is a common enough wish to use the same filter for smudge and
> textconv?  So far, our stance (which can be judged from the way the
> clean/smudge filters are named) has been that the in-repo
> representation is the canonical, and the representation used in the
> checkout is ephemeral, and that is why we run "diff", "grep",
> etc. over the in-repo representation, but the "encrypted in repo,
> decrypted in checkout" abuse would be helped by an option to do the
> reverse---find changes and look for substrings in the representation
> used in the checkout.  I am not sure if there are other use cases
> that are helped by such an option.

Hmm. Yeah, I agree with your line of reasoning here. I'm not sure how
common it is. This is the first time I can recall it coming up. And
personally, I have never really used clean/smudge filters myself,
beyond some toy experiments.

The other major user of that feature I can think of is LFS. There Git
ends up diffing the LFS pointers, not the big files. Which arguably is
the wrong thing (you'd prefer to see the actual file contents diffed),
but I think nobody cares in practice because large files generally don't
have readable diffs anyway.
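
As an illustration of what ends up being diffed, here is a sketch that
builds two pointer files in the Git LFS pointer layout (version, oid,
size) and diffs them. The file names and contents are made up, and it
assumes sha256sum is available:

```shell
# Write a pointer in the Git LFS pointer-file layout for the file given as $1.
make_pointer() {
    oid=$(sha256sum "$1" | cut -d' ' -f1)
    size=$(wc -c <"$1" | tr -d ' ')
    printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size"
}

tmp=$(mktemp -d)
# Hypothetical "large" files; tiny here so the sketch runs quickly.
printf 'first version of a large asset\n'  >"$tmp/old.bin"
printf 'second version of a large asset\n' >"$tmp/new.bin"
make_pointer "$tmp/old.bin" >"$tmp/old.ptr"
make_pointer "$tmp/new.bin" >"$tmp/new.ptr"

# Diffing the pointers shows changed oid and size lines, not the file contents.
pointer_diff=$(diff "$tmp/old.ptr" "$tmp/new.ptr" || true)
printf '%s\n' "$pointer_diff"
```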

-Peff
