From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Git <git@vger.kernel.org>
Subject: Re: [PATCH 6/6] upload-pack: provide a hook for running pack-objects
Date: Thu, 19 May 2016 08:08:49 -0400
Message-ID: <20160519120848.GC3050@sigill.intra.peff.net>
In-Reply-To: <CACBZZX5SVJ2CSB0AS3Lj1A8_S+ejGOPUDn6Sc3whotkyFwxEiA@mail.gmail.com>

On Thu, May 19, 2016 at 12:12:43PM +0200, Ævar Arnfjörð Bjarmason wrote:

> On Thu, May 19, 2016 at 12:45 AM, Jeff King <peff@peff.net> wrote:
> >   3. You may want to insert a caching layer around
> >      pack-objects; it is the most CPU- and memory-intensive
> >      part of serving a fetch, and its output is a pure
> >      function[1] of its input, making it an ideal place to
> >      consolidate identical requests.
> 
> Cool to see this on the list after we talked briefly about this at Git
> Merge. Being able to cache this so simply is a great optimization.
> 
> As I recall you guys at GitHub ended up writing your own utility to
> cache output depending on stdin/argv because none existed already.

Yeah, we do have such a tool internally. It's possible we may one day
open-source that, but there aren't plans to do so right now.

I don't know whether this kind of caching would be useful to most sites
or not. It's good if you have lots of clients asking you for the same
thing at roughly the same time (say, somebody using "git pull" as a
deploy mechanism from their AWS cluster), but otherwise not.
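
To make the caching idea concrete, here is a minimal sketch of a cache keyed
on argv and stdin. It is not the internal tool mentioned above; the names
cache_key and run_cached, the on-disk layout, and the assumption that each
repository gets its own cache_dir are all hypothetical. It buffers the whole
output in memory and does not coalesce requests that are in flight at the
same time, both of which a real deployment would need to handle.

```python
import hashlib
import os
import subprocess
from pathlib import Path


def cache_key(argv, stdin_bytes):
    """Hash the command line and its input into a hex digest."""
    h = hashlib.sha256()
    for arg in argv:
        h.update(os.fsencode(arg))
        h.update(b"\0")  # argv entries cannot contain NUL, so this is unambiguous
    h.update(b"\0")  # empty entry separates argv from stdin
    h.update(stdin_bytes)
    return h.hexdigest()


def run_cached(argv, stdin_bytes, cache_dir):
    """Return the command's stdout, reusing a stored copy when one exists.

    Assumes the command's output depends only on argv and stdin, and that
    cache_dir is private to one repository (hypothetical layout).
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / cache_key(argv, stdin_bytes)
    if entry.exists():
        return entry.read_bytes()

    result = subprocess.run(
        argv, input=stdin_bytes, stdout=subprocess.PIPE, check=True
    )

    # Write to a per-process temporary name, then rename into place so
    # that readers never see a partially written entry.
    tmp = entry.with_name(f"{entry.name}.{os.getpid()}.tmp")
    tmp.write_bytes(result.stdout)
    tmp.replace(entry)
    return result.stdout
```

A hook built on this would pass the pack-objects command line it was given
as argv, feed its own stdin through, and write the returned bytes to stdout.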

> So do I understand correctly that you're trying to guard against the
> case where you e.g.:
> 
>     rsync untrusted.example.com:/tmp/poison.git /tmp/
>     git clone /tmp/poison.git /tmp/safe.git
> 
> That is, not hosing your system if poison.git/config has an
> uploadpack.packObjectsHook that's "sudo rm -rf /"?

I'm not that worried about this case, as it's just not that common.  I
think we're more concerned with two cases:

  1. multi-user servers where you ssh as yourself, but then access
     repositories owned by somebody else. This is basically the ssh case
     you described later.

  2. hosting sites that run git-daemon as the "daemon" user, but serve
     repositories owned by random untrusted users (where you would not
     want those users to run arbitrary code as "daemon").

> We've already accepted that "push" hooks like the pre-receive or
> update hook can do something malicious like this, so on one hand maybe
> we should say if you scp raw *.git repositories with hooks this sort
> of thing might happen, or if you ssh to a remote box and run their
> per-repo hooks it's really their problem to make sure their users
> don't run malicious hooks on your behalf.

Yeah, we make no promises for repositories that you push to. It's _only_
for the fetching side. It's kind of a funny distinction, but it's one we
have maintained since the beginning of git, and I do think there are
real sites that depend on it (see, e.g., the history of the
post-upload-pack hook added in the v1.6.x time frame).

Rsyncing a repository is generally of questionable safety. It's OK to
fetch from the result, but certainly not to run "git log" (which can run
arbitrary commands via external diff, etc).

> But as you point out this makes the hook interface a bit unusual.
> Wouldn't this give us the same security and normalize the hook
> interface:
> 
>  * Don't do the uploadpack.packObjectsHook variable, just have a
>    normal "pack-objects" hook that works like any other git hook
>  * By default we don't run this hook unless core.runDangerousHooks (or
>    whatever we call it) is true.
>  * The core.runDangerousHooks variable cannot be set on a per-repo
>    basis using your new config facility.
>  * If there's a pack-objects hook and core.runDangerousHooks isn't
>    true we warn "not executing potentially unsafe hook $path_to_hook"
>    and carry on

This is the "could we just set a bool" option I discussed in the commit
message. The problem is that it doesn't let the admin say "I don't trust
these repositories, but I _do_ want to run just this one hook when
serving them, and not any other hooks".

> This would allow use-cases that are a bit inconvenient with your patch
> (again, if I'm understanding it correctly):
> 
>  * I can set core.runDangerousHooks=true in /etc/gitconfig on my git
>    server because I also control all the repos, and I want to
>    experiment with trying this on a per-repo basis for users that are
>    cloning from me.
>  * I can similarly play with this locally knowing I'm only cloning
>    repos I trust by setting core.runDangerousHooks=true in ~/.gitconfig

Yes, those use cases are not well served by the git config alone. But
you can do them (and much more) once your trusted hook is running (by
checking $GIT_DIR, or looking in a database, or whatever you want).
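
As a rough illustration of the $GIT_DIR check, a trusted hook could decide
per repository whether to wrap pack-objects or run it unchanged. This is
only a sketch: it assumes the hook receives the pack-objects command line as
its arguments, and the names repo_is_opted_in, choose_command,
opted_in_roots, and wrapper_argv are hypothetical.

```python
from pathlib import Path


def repo_is_opted_in(git_dir, opted_in_roots):
    """True if git_dir is one of the opted-in roots or lives beneath one."""
    repo = Path(git_dir).resolve()
    for root in opted_in_roots:
        root = Path(root).resolve()
        if repo == root or root in repo.parents:
            return True
    return False


def choose_command(pack_objects_argv, git_dir, opted_in_roots, wrapper_argv):
    """Pick the command to run for this request.

    Opted-in repositories get the wrapper (say, a caching layer) placed in
    front of the original command; everything else runs pack-objects as-is.
    """
    if git_dir and repo_is_opted_in(git_dir, opted_in_roots):
        return list(wrapper_argv) + list(pack_objects_argv)
    return list(pack_objects_argv)
```

The hook itself would read git_dir from os.environ.get("GIT_DIR"), take the
opted-in roots from wherever the admin keeps them (a file, a database), and
then exec the chosen command with stdin and stdout passed through.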

-Peff
