git@vger.kernel.org mailing list mirror (one of many)
From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: Is git clone followed by git verify-tag meaningful?
Date: Thu, 29 Aug 2019 09:34:57 -0400
Message-ID: <20190829133457.GA26173@chatter.i7.local>
In-Reply-To: <20190828234706.GB25355@sigill.intra.peff.net>

On Wed, Aug 28, 2019 at 07:47:06PM -0400, Jeff King wrote:
>On Wed, Aug 28, 2019 at 04:32:24PM -0400, Konstantin Ryabitsev wrote:
>
>> If I know that a project uses tag signing, would "git clone" followed by
>> "git verify-tag" be meaningful without a "git fsck" in-between? I.e. if an
>> attacker has control over the remote server, can they sneak in any badness
>> into any of the resulting files and still have the clone, checkout, and
>> verify-tag return success unless the repository is fsck'd before verify-tag?
>
>It depends on your definition of badness. :)

As you know, for the Linux kernel we provide both tag signatures and
detached PGP signatures on tarballs (and the same is true for git). An
argument I hear frequently is that providing detached tarball signatures
is redundant[*] when tags are already PGP-signed, so I wanted to
double-check that all checksums are computed and matched on the client
in the process of "git checkout", and that we're not just verifying a
signature over an unverified checksum.

In other words, I needed to double-check that what we get in the end is 
assurance that "all files in this repository are exactly the same as on 
the developer's system at the time when they ran 'git tag -s'."

>Generally, Git clients do not trust the server much at all (not only to
>be non-malicious, but also not to accidentally introduce bit errors).
>
>Even without the fsck, we will compute the sha1 of each object (we must,
>because the other side doesn't send it at all), and check that we have
>all objects reachable from the refs. So verifying the tag at that point
>demonstrates a signature on the tag object, which probably refers to
>some commit via sha1, which refers to actual trees and blobs by a chain
>of sha1s. If you believe in the integrity of sha1, then it has
>effectively signed all of that content.

So, the client will actually calculate those checksums during the 
checkout stage to make sure that all content in the repository matches 
the hash of the commit being checked out, correct? 
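
To make the chain of sha1s described in the quoted text concrete, here is a
minimal sketch in Python. It is an illustration, not Git's own code. It
assumes a flat tree of regular files with ASCII names, and the helper names,
file contents, author identity and commit message are all made up. Each
object ID is the SHA-1 of a "<type> <size>\0" header followed by the object
body, so a client can only obtain the ID by hashing the content it received.

```python
import hashlib


def git_object_id(obj_type, body):
    """Return the SHA-1 Git assigns to an object: hash of '<type> <size>\\0' + body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize {filename: blob_id} as tree entries (mode 100644, no subdirectories)."""
    return b"".join(
        b"100644 " + name.encode() + b"\x00" + bytes.fromhex(blob_id)
        for name, blob_id in sorted(entries.items(), key=lambda kv: kv[0].encode())
    )


def commit_id_for(files):
    """Hash blobs, then the tree naming them, then a commit naming the tree."""
    blobs = {name: git_object_id("blob", data) for name, data in files.items()}
    tree = git_object_id("tree", tree_body(blobs))
    who = "A U Thor <author@example.com> 0 +0000"  # hypothetical identity
    commit = f"tree {tree}\nauthor {who}\ncommitter {who}\n\nrelease\n".encode()
    return git_object_id("commit", commit)


good = commit_id_for({"Makefile": b"all:\n", "main.c": b"int main(void) { return 0; }\n"})
evil = commit_id_for({"Makefile": b"all:\n", "main.c": b"int main(void) { return 1; }\n"})

# A one-byte change in main.c propagates up to a different commit ID.
print(good != evil)
```

A signed tag names a commit ID in the same way, so a signature over the tag
object covers every tree and blob reachable from it, to the extent that
SHA-1 is trusted.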

>If you want to analyze each object for such malformed bits before the
>checkout, you can do so with "git fsck". But consider instead setting
>transfer.fsckObjects to check the objects while they're being indexed by
>the initial clone (i.e., having their sha1's computed). It's effectively
>free to do it at that point, whereas a later fsck has to access each
>object again (this takes on the order of minutes of CPU for the kernel).
>
>I don't think there's any real safety in doing so for the case you've
>described (there's no bad pattern that fsck knows about that the actual
>checkout code does not).  But it does give you an early warning, and is
>especially helpful if you're not planning to check things out yourself, but
>want to avoid hosting malicious repos.

Right, but it's not something end-users are going to do if they just
want to check out a repository and access code from it. The "git clone
&& git verify-tag" workflow is now used by some distros that are
packaging GitHub releases, and I'm pretty sure they aren't setting
transfer.fsckObjects before "git clone" starts.
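
For a packaging script that does want the early check, one possible shape
is sketched below in Python. The function name, its parameters and the
injectable "run" argument are hypothetical. The sketch assumes that passing
transfer.fsckObjects via "git -c" for the duration of the clone is an
acceptable substitute for setting it in a config file, and that the
signer's public key is already in the local keyring, since verify-tag
cannot succeed without it.

```python
import subprocess


def clone_and_verify(url, tag, dest, run=subprocess.run):
    """Clone with objects checked as they are received, then verify a signed tag.

    `run` is injectable so the command lines can be inspected without
    touching the network; by default the commands are really executed.
    """
    commands = [
        # Enable fsck-style checks on incoming objects for this clone only.
        ["git", "-c", "transfer.fsckObjects=true", "clone", url, dest],
        # Check the PGP signature on the tag inside the new clone.
        ["git", "-C", dest, "verify-tag", tag],
    ]
    for argv in commands:
        run(argv, check=True)  # raises CalledProcessError if a step fails
    return commands
```

Calling clone_and_verify() with a repository URL, a tag name and a
destination directory runs both steps and stops at the first failure, so a
clone that trips the object checks never reaches the signature check.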

Thanks for your help!

-K

[*] Tarball signatures may be redundant in a cryptographic sense, but for
repositories like linux.git, which are now around 1.2 GB in size, it
makes a significant difference whether someone downloads the full git
tree or just a highly compressed tarball that is only 100 MB. I know that
it's possible to clone with --depth 1 to reduce the amount of downloaded
history, but that's hard on the servers and not something I really want
to widely advertise as a mechanism for getting the kernel. :) In
addition to that, distributing static content like tarballs is much
easier logistically than distributing git repositories, and it's much
harder to introduce accidental corruption into a bunch of static files.
Disk is cheap, but CPU and admin time aren't.
