Is git clone followed by git verify-tag meaningful?

git@vger.kernel.org mailing list mirror (one of many)

* Is git clone followed by git verify-tag meaningful?
From: Konstantin Ryabitsev @ 2019-08-28 20:32 UTC
  To: git

Hi, all:

If I know that a project uses tag signing, would "git clone" followed by 
"git verify-tag" be meaningful without a "git fsck" in-between? I.e. if 
an attacker has control over the remote server, can they sneak in any 
badness into any of the resulting files and still have the clone, 
checkout, and verify-tag return success unless the repository is fsck'd 
before verify-tag?

I assume that it would break during the checkout stage, but I wanted to 
verify my assumptions.

Thanks,
-K

* Re: Is git clone followed by git verify-tag meaningful?
From: brian m. carlson @ 2019-08-28 23:27 UTC
  To: git

On 2019-08-28 at 20:32:24, Konstantin Ryabitsev wrote:
> Hi, all:
> 
> If I know that a project uses tag signing, would "git clone" followed by
> "git verify-tag" be meaningful without a "git fsck" in-between? I.e. if an
> attacker has control over the remote server, can they sneak in any badness
> into any of the resulting files and still have the clone, checkout, and
> verify-tag return success unless the repository is fsck'd before verify-tag?
> 
> I assume that it would break during the checkout stage, but I wanted to
> verify my assumptions.

We pass the entire tag buffer to GnuPG, which means that we verify
exactly what is in the tag: no more, no less.  Whether that represents a
valid, usable tag with meaningful data is not verified, although of
course it can't be changed once written.

If you trust the signer to produce valid data, then you can verify the
tag and know that the data is correct.  If not, then you probably need
git fsck to verify that the data is usable and meets Git's standards.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

* Re: Is git clone followed by git verify-tag meaningful?
From: Jeff King @ 2019-08-28 23:47 UTC
  To: Konstantin Ryabitsev; +Cc: git

On Wed, Aug 28, 2019 at 04:32:24PM -0400, Konstantin Ryabitsev wrote:

> If I know that a project uses tag signing, would "git clone" followed by
> "git verify-tag" be meaningful without a "git fsck" in-between? I.e. if an
> attacker has control over the remote server, can they sneak in any badness
> into any of the resulting files and still have the clone, checkout, and
> verify-tag return success unless the repository is fsck'd before verify-tag?

It depends on your definition of badness. :)

Generally, Git clients do not trust the server much at all (not only not
to be malicious, but also not to accidentally introduce bit errors).

Even without the fsck, we will compute the sha1 of each object (we must,
because the other side doesn't send it at all), and check that we have
all objects reachable from the refs. So verifying the tag at that point
demonstrates a signature on the tag object, which refers (probably) to
some commit via sha1, which refers to actual trees and blobs by a chain
of sha1s. If you believe in the integrity of sha1, then it has
effectively signed all of that content.
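
To make the chain concrete, here is a minimal sketch in Python of how
object names are derived from content alone. It assumes SHA-1 object
names and a flat tree of regular files; the function names and the
author line are made up for illustration, and a tag object would name
the commit in the same way the commit names the tree.

```python
import hashlib


def object_id(kind, body):
    """Name an object as Git does with SHA-1: hash "<kind> <size>" + NUL + body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(blob_ids):
    """Serialize regular-file entries: "100644 <name>" + NUL + raw 20-byte blob id."""
    return b"".join(
        b"100644 " + name.encode() + b"\0" + bytes.fromhex(blob_id)
        for name, blob_id in sorted(blob_ids.items())
    )


def commit_id_for(files):
    """Hash each blob, then the tree naming the blobs, then a commit naming the tree."""
    blob_ids = {name: object_id("blob", data) for name, data in files.items()}
    tree = object_id("tree", tree_body(blob_ids))
    commit = (
        f"tree {tree}\n"
        "author A U Thor <author@example.com> 0 +0000\n"
        "committer A U Thor <author@example.com> 0 +0000\n"
        "\n"
        "example commit\n"
    )
    return object_id("commit", commit.encode())


good = commit_id_for({"hello.txt": b"hello\n"})
tampered = commit_id_for({"hello.txt": b"hello!\n"})
print(good != tampered)  # one changed byte in a file changes the commit name
```

Since the receiver derives every name itself, a server that alters a
blob cannot keep the commit name (and so the signed tag) pointing at it
without finding a sha1 collision.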

Likewise, Git does not necessarily trust what is in the objects. A
malicious repository could claim to store an entry for ".git/config" or
"/etc/passwd". Without any further action from you, we'd detect and
reject those during a checkout.
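
As an illustration of the kind of check meant here, a simplified sketch
in Python. This is not Git's actual code: the real checks cover more
cases (for example filesystem-specific spellings of ".git"), and both
function names are hypothetical.

```python
def is_safe_component(name):
    """Simplified check on a single path component from a tree or index entry."""
    if name in ("", ".", ".."):
        return False  # empty, current-directory and parent-directory components
    if "/" in name or "\0" in name:
        return False  # a component must not itself contain separators
    if name.lower() == ".git":
        return False  # would let content land in the repository's own metadata
    return True


def is_safe_path(path):
    """Reject absolute paths and any path with an unsafe component."""
    if path.startswith("/"):
        return False
    return all(is_safe_component(part) for part in path.split("/"))


for candidate in (".git/config", "/etc/passwd", "src/main.c"):
    print(candidate, is_safe_path(candidate))
```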

If you want to analyze each object for such malformed bits before the
checkout, you can do so with "git fsck". But consider instead setting
transfer.fsckObjects to check the objects while they're being indexed by
the initial clone (i.e., having their sha1's computed). It's effectively
free to do it at that point, whereas a later fsck has to access each
object again (this takes on the order of minutes of CPU for the kernel).
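
One way to set this for a single clone, sketched in Python on the
assumption that git is on PATH. The helper names and the example URL
are placeholders; the setting is passed with git's "-c" option so no
configuration file has to be edited beforehand.

```python
import subprocess


def fsck_clone_command(url, dest):
    """Build the argv for a clone that checks objects as they are indexed."""
    return ["git", "-c", "transfer.fsckObjects=true", "clone", url, dest]


def clone_with_fsck(url, dest):
    """Run the clone; raises CalledProcessError if git exits non-zero."""
    subprocess.run(fsck_clone_command(url, dest), check=True)


# Hypothetical usage (placeholder URL, not run here):
# clone_with_fsck("https://example.org/project.git", "project")
```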

I don't think there's any real safety in doing so for the case you've
described (there's no bad pattern that fsck knows about that the actual
checkout code does not).  But it does give you an early warning, and is
especially helpful if you're not planning to check things out yourself, but
want to avoid hosting malicious repos.

-Peff

* Re: Is git clone followed by git verify-tag meaningful?
From: Junio C Hamano @ 2019-08-29  3:41 UTC
  To: Konstantin Ryabitsev; +Cc: git

Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes:

> If I know that a project uses tag signing, would "git clone" followed
> by "git verify-tag" be meaningful without a "git fsck" in-between?
> I.e. if an attacker has control over the remote server, can they sneak
> in any badness into any of the resulting files and still have the
> clone, checkout, and verify-tag return success unless the repository
> is fsck'd before verify-tag?
>
> I assume that it would break during the checkout stage, but I wanted
> to verify my assumptions.

What are you trusting and what are you trying to protect?

I am assuming that you are cloning and a commit that has a signed
tag is at the tip of the default branch, which gets checked out and
you want to make sure that what you see in the working tree after
checkout is healthy.  I also assume that you trust your local
machine, its Git and GPG binary included, and also you trust that
the underlying hash function Git uses has not been exploited for
this particular repository.

verify-tag would tell you which GPG key in your keyring signed the
tag you specify.  Since the tag records the commit object name, you
can check it against HEAD.  As long
as you trust the underlying hash function and your local Git,
the trust flows from the HEAD's commit object name down to each
and every file checked out to the working tree.  As long as you
did not get any error from checkout, no fsck is needed here.
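
A minimal sketch of that check in Python, assuming git and GnuPG are
installed and the signer's key is already in the keyring. The function
names are hypothetical, and the tag name in the usage line is only an
example.

```python
import subprocess


def git(repo, *args):
    """Run git in `repo` and return its stdout; raises if git exits non-zero."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def verified_tag_is_checked_out(repo, tag, run_git=git):
    """Verify the tag signature, then compare the tagged commit with HEAD."""
    run_git(repo, "verify-tag", tag)  # raises if the signature does not verify
    tagged = run_git(repo, "rev-parse", f"{tag}^{{commit}}")
    head = run_git(repo, "rev-parse", "HEAD^{commit}")
    return tagged == head


# Hypothetical usage inside a fresh clone:
# print(verified_tag_is_checked_out(".", "v1.0"))
```

A successful verify-tag only says that some key in the keyring made the
signature, so a caller would still want to confirm that it is the key
they expect.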

If your project is a high-value target like the Linux kernel, it is
probably a good idea to enable fetch.fsckObjects so that the
incoming objects are automatically checked while receiving over the
wire.

* Re: Is git clone followed by git verify-tag meaningful?
From: Konstantin Ryabitsev @ 2019-08-29 13:34 UTC
  To: Jeff King; +Cc: git

On Wed, Aug 28, 2019 at 07:47:06PM -0400, Jeff King wrote:
>On Wed, Aug 28, 2019 at 04:32:24PM -0400, Konstantin Ryabitsev wrote:
>
>> If I know that a project uses tag signing, would "git clone" followed by
>> "git verify-tag" be meaningful without a "git fsck" in-between? I.e. if an
>> attacker has control over the remote server, can they sneak in any badness
>> into any of the resulting files and still have the clone, checkout, and
>> verify-tag return success unless the repository is fsck'd before verify-tag?
>
>It depends on your definition of badness. :)

As you know, for the Linux kernel we provide both tag signatures and 
detached PGP signatures on tarballs (and the same is true for git). The 
argument I hear frequently is that providing detached tarball signatures 
is redundant[*] when tags are already PGP-signed, so I wanted to 
double-check that all checksums are computed and matched on the client 
in the process of "git checkout" and we're not just verifying a 
signature of a non-verified checksum.

In other words, I needed to double-check that what we get in the end is 
assurance that "all files in this repository are exactly the same as on 
the developer's system at the time when they ran 'git tag -s'."

>Generally, Git clients do not trust the server much at all (not only to
>be not malicious, but also not to accidentally introduce bit errors).
>
>Even without the fsck, we will compute the sha1 of each object (we must,
>because the other side doesn't send it at all), and that we have all
>objects reachable from the refs. So verifying the tag at that point
>demonstrates a signature on the tag object, which refers to probably
>some commit via sha1, which refers to actual trees and blobs by a chain
>of sha1s. If you believe in the integrity of sha1, then it has
>effectively signed all of that content.

So, the client will actually calculate those checksums during the 
checkout stage to make sure that all content in the repository matches 
the hash of the commit being checked out, correct? 

>If you want to analyze each object for such malformed bits before the
>checkout, you can do so with "git fsck". But consider instead setting
>transfer.fsckObjects to check the objects while they're being indexed by
>the initial clone (i.e., having their sha1's computed). It's effectively
>free to do it at that point, whereas a later fsck has to access each
>object again (this takes on the order of minutes of CPU for the kernel).
>
>I don't think there's any real safety in doing so for the case you've
>described (there's no bad pattern that fsck knows about that the actual
>checkout code does not).  But it does give you an early warning, and is
>especially helpful if you're not planning to check things out yourself, but
>want to avoid hosting malicious repos.

Right, but it's not something end-users are going to do if they just 
want to check out a repository and access code from it. The "git clone 
&& git verify-tag" workflow is now used by some distros that are 
packaging Github releases, and they aren't setting transfer.fsckObjects 
before "git clone" starts, pretty sure.

Thanks for your help!

-K

[*] Tarball signatures may be redundant in a cryptographic sense, but for 
repositories like linux.git, which are now around 1.2 GB in size, it 
makes a significant difference whether someone downloads the full git tree 
or just a highly compressed tarball that is only 100MB. I know that it's 
possible to clone with --depth 1 to reduce the amount of downloaded 
history, but that's hard on the servers and not something I really want 
to widely advertise as a mechanism for getting the kernel. :) In 
addition to that, distributing static content like tarballs is much 
easier logistically than git repositories, and it's much harder to 
introduce accidental corruption to a bunch of static files. Disk is 
cheap, but CPU and admin time aren't.

* Re: Is git clone followed by git verify-tag meaningful?
From: Jeff King @ 2019-08-29 14:10 UTC
  To: Konstantin Ryabitsev; +Cc: git

On Thu, Aug 29, 2019 at 09:34:57AM -0400, Konstantin Ryabitsev wrote:

> As you know, for the Linux kernel we provide both tag signatures and
> detached PGP signatures on tarballs (and the same is true for git). The
> argument I hear frequently is that providing detached tarball signatures is
> redundant[*] when tags are already PGP-signed, so I wanted to double-check
> that all checksums are computed and matched on the client in the process of
> "git checkout" and we're not just verifying a signature of a non-verified
> checksum.
> 
> In other words, I needed to double-check that what we get in the end is
> assurance that "all files in this repository are exactly the same as on the
> developer's system at the time when they ran 'git tag -s'."

Then yes, there is no need to fsck. When the objects were received on
the server side (by push) and then again when you got them from the
server (by clone), their sha1s were recomputed from scratch, not
trusting the sender at all in either case.

(Again, assuming you trust sha1; I think you should, especially since we
use the collision-detecting sha1 by default, but I wanted to make that
part clear).

> > Even without the fsck, we will compute the sha1 of each object (we must,
> > because the other side doesn't send it at all), and that we have all
> > objects reachable from the refs. So verifying the tag at that point
> > demonstrates a signature on the tag object, which refers to probably
> > some commit via sha1, which refers to actual trees and blobs by a chain
> > of sha1s. If you believe in the integrity of sha1, then it has
> > effectively signed all of that content.
> 
> So, the client will actually calculate those checksums during the checkout
> stage to make sure that all content in the repository matches the hash of
> the commit being checked out, correct?

It's not during the checkout itself, but rather during the transfer of
objects into the receiving repository. I.e., there is no need to even
have a checkout. E.g., you could verify the tag and then use "git
archive".
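
A rough sketch of that sequence in Python, assuming git and GnuPG are
installed. The function name, paths, and tag in the usage line are
placeholders; the `run` parameter exists only so the command sequence
can be exercised without a real repository.

```python
import subprocess


def verify_then_archive(repo, tag, out_file, run=subprocess.run):
    """Verify the tag signature first; only then export the tagged tree.

    With check=True a failed verify-tag raises, so the archive step is
    never reached for a tag whose signature does not verify.
    """
    run(["git", "-C", repo, "verify-tag", tag], check=True)
    run(
        ["git", "-C", repo, "archive", "--format=tar.gz",
         f"--output={out_file}", tag],
        check=True,
    )


# Hypothetical usage; an absolute output path avoids surprises, because
# "-C" changes the directory that relative paths are resolved against:
# verify_then_archive("/src/project", "v1.0", "/tmp/project-v1.0.tar.gz")
```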

Do note that both archive and checkout can modify files from their
in-repository state using gitattributes (e.g., to do line-ending
conversion, or using export-subst to add things like the commit ID into
the generated tarball). So it's possible that a tarball (either
generated from git-archive or from checked out contents) may not be
byte-for-byte identical.

Depending on your use case, that can range from an annoyance to ignore
(if a developer is using those features, tell them not to do that) to a
security issue (if you are somehow certifying the tarball contents based
on the tag signature, there is room for a malicious signer to tweak the
tarball contents).

But I think your question is mostly just "if I clone the repo and verify
the tag, is it what the original person signed?". And that answer is
yes.

-Peff
