From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
To: Christoph Anton Mitterer <calestyo@scientia.net>
Cc: Jonathan Nieder <jrnieder@gmail.com>, git@vger.kernel.org
Subject: Re: how to (integrity) verify a whole git repo
Date: Tue, 21 Apr 2020 12:19:56 -0400
Message-ID: <20200421161956.45slynbgkkom3qc3@chatter.i7.local>
In-Reply-To: <be69ed1bade98cb7e414c2713fe0d6b5cadd7172.camel@scientia.net>
On Tue, Apr 21, 2020 at 04:42:16PM +0200, Christoph Anton Mitterer wrote:
> Taking again the kernel as an example:
> If I clone the repo (or fsck it later), then all I know is that there
> was no corruption, if all the tips are correct, since they start
> the chain of hash sums to all other objects.
Notably, there is normally only one branch in torvalds/linux.git, and
that's "master". So, there's only one tip.
> But an attacker could have just forged these tips.
> So for checking authenticity, I need to verify some signatures on them
>
> Now if I check e.g. Linus's signature on tag v5.6, I should know that
> everything earlier (in the tree, not chronologically) than that tag is
> authentic.
Yes, verifying a signature on a tag tells you that all commits reachable
from that tag are bit-for-bit exactly the same as on Linus's workstation
where he created the signature.
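A minimal sketch of that check, assuming git is on your PATH and the signer's public key is already in your GnuPG keyring. It wraps two existing git commands, `git verify-tag` and `git merge-base --is-ancestor`; the function names and the repository path in the usage note are hypothetical. An exit status of zero from `git verify-tag` only means the signature checked out against some key you hold, so deciding that the key really belongs to Linus is still up to you.

```python
import subprocess


def _git_ok(repo, *args):
    """Run `git -C <repo> <args...>` and report whether it exited with status 0."""
    try:
        result = subprocess.run(
            ["git", "-C", repo, *args],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # git is not installed; treat this as "could not verify".
        return False
    return result.returncode == 0


def tag_signature_ok(repo, tag):
    """True if `git verify-tag` accepts the signature on `tag`."""
    return _git_ok(repo, "verify-tag", tag)


def covered_by_tag(repo, commit, tag):
    """True if `commit` is the tagged commit or one of its ancestors."""
    return _git_ok(repo, "merge-base", "--is-ancestor", commit, tag + "^{commit}")


def commit_is_vouched_for(repo, commit, tag):
    """True only if the tag signature verifies AND the tag reaches `commit`."""
    return tag_signature_ok(repo, tag) and covered_by_tag(repo, commit, tag)
```

For example, `commit_is_vouched_for("linux", "HEAD", "v5.6")` would answer whether the current checkout in a clone at the (assumed) path `linux` sits at or below the signed v5.6 tag. Any failure, including a missing tag or a directory that is not a repository, is reported as False.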
> But not e.g. any commits on top of v.5.6 (which aren't either signed
> themselves or protected by another tag "above" them).
This is mostly true, yes.
> Neither any commits never reached from v.5.6, e.g. later stable patches
> like anything from above v.5.5 (which is again below v.5.6) up to
> v.5.5.13, which is not.
Stable commits would be in the stable tree, and those tags are signed by
Greg Kroah-Hartman.
> So from my understanding, to use only commits that are authentic by the
> kernel upstream developers, I'd need verify all these tips.. and throw
> away everything which is not reachable by one of them.
>
> Is that somehow possible?
You probably don't care about commits that arrive between releases, so
effectively you are already doing that? Even if you have loose objects
that aren't reachable from your current tip (e.g. you only care about
objects in the stable branch linux-5.6.y), it's not like they are going
to "poison" your tree, so removing them is just a garbage collection
operation at best.
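To make the "reachable from a verified tip" idea concrete, here is a toy sketch over a made-up history. The commit names and the parent map are hypothetical; in a real clone, something like `git rev-list --all --not <verified tags>` should list the same "not covered" set.

```python
def reachable(parents, tips):
    """Return every commit reachable from `tips` by following parent links."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


# Hypothetical history: a mainline with one signed tag, a commit on top of
# the tag, and a side commit that no verified tip reaches.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],  # tagged, signature verified
    "D": ["C"],  # on top of the tag, not covered by it
    "X": ["B"],  # side commit, unreachable from the tag
}
verified_tips = ["C"]

covered = reachable(parents, verified_tips)
uncovered = set(parents) - covered
print(sorted(covered))    # ['A', 'B', 'C']
print(sorted(uncovered))  # ['D', 'X']
```

Nothing in the uncovered set changes the IDs of the covered commits, which is why leaving such objects in the repository does not affect the tree you verified.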
## Minor attestation rant
I would argue that your premise of "authenticity" is wrong. The best
that we are currently able to offer is a guarantee that, at the point
where the tag was signed, the tree is bit-for-bit identical to the tree
as it exists on Linus Torvalds' (or Greg KH's) workstation.
However, both Linus and Greg merge code from tens of thousands of other
contributors and it's important to keep in mind that their tag
signatures do not offer any kind of attestation proof of the code's
actual authorship or origin. Looking for such proof would be
near-impossible -- even if we had a universally accepted mechanism to do
cryptographic attestation of all patches and commits, normal maintainer
operations would necessarily break this chain:
- maintainers insert their own trailers into commit messages
  (Signed-off-by, Tested-by, Acked-by, etc).
- maintainers reorder and edit patches that they receive from individual
  contributors -- for typos, minor stylistic cleanups, extra comments,
  etc.
- maintainers routinely rebase patches they receive before they can
  submit them to be merged into mainline.
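A small sketch of why these operations break the chain, assuming a repository that uses SHA-1 object IDs. A commit ID is a hash over the commit's tree, parents, author, committer and message, so adding a single trailer yields a different commit, and every descendant gets a new ID as well. The names, addresses and timestamp below are made up, and the committer is kept unchanged only to isolate the effect of the trailer.

```python
import hashlib


def git_object_id(kind, body):
    """SHA-1 object ID as git computes it: hash of '<kind> <size>\\0' + body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree, parents, author, committer, message):
    """Build a commit object body in git's text format and hash it."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return git_object_id("commit", "\n".join(lines).encode())


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's well-known empty tree
who = "J. Doe <jdoe@example.com> 1587485996 -0400"  # hypothetical identity

# The commit as the contributor sent it, and as a maintainer applied it.
as_sent = commit_id(EMPTY_TREE, [], who, who, "fix typo in README\n")
as_applied = commit_id(
    EMPTY_TREE, [], who, who,
    "fix typo in README\n\nSigned-off-by: M. Maintainer <maint@example.com>\n",
)

# The same follow-up change on top of each variant.
child_of_sent = commit_id(EMPTY_TREE, [as_sent], who, who, "next change\n")
child_of_applied = commit_id(EMPTY_TREE, [as_applied], who, who, "next change\n")

print(as_sent != as_applied)              # True
print(child_of_sent != child_of_applied)  # True
```

A signature the contributor made over `as_sent` says nothing about `as_applied`, which is the commit that actually lands in the maintainer's tree.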
Full code attestation is possible in projects where all commits are
forks and merges -- for example, many Git**b/Gerrit projects could be
set up to require full cryptographic attestation of commits, if all
operations are forks, pull requests, and merges. However, it would be
impossible to force this development paradigm onto the Linux kernel --
it would be extremely disruptive and require massive individual effort
to overhaul every maintainer's workflow. Furthermore, many maintainers
would reject this approach because they would disagree about the main
premise behind the effort -- that cryptographically signing every commit
offers enough tangible benefit to be worth it.
Let me expound on the last point. There are some 15,000 personas who
have committed code to the Linux kernel (a persona could be the same
person committing code from different commercial entities --
jdoe@google.com vs jdoe@redhat.com). Even if we assume that each commit
is signed, we then must have a way to perform some kind of meaningful
verification, right?
- Where do we get all the public keys required for such a task?
- How do we handle cases where a key has expired or, worse, has been
  revoked by the developer? This can't invalidate their past commits,
  because it's impossible to re-sign those.
- How do we bootstrap distributed trust without relying on someone being
  a Fundamentally Non-corruptible Person? It's certainly not me -- I
  have close relatives living under, shall we say, regimes with loose
  standards when it comes to personal freedoms.
- How much trust should we be putting into cryptographic signatures?
  Linux developers aren't necessarily that much better about keeping
  their workstations protected against malicious attacks, so they are
  just as vulnerable to having their private keys stolen as anyone else.
For this reason, Linux maintainers use either a zero-trust approach, or
a last-leg trust approach:
- Submaintainers don't put much trust into *who* wrote the code and
  review all submissions they receive as potentially containing security
  bugs (intentional or not); their job is to review the code and pass it
  up the chain to maintainers.
- If maintainers receive pull requests from submaintainers, then they
  *may* check cryptographic signatures on the trees they pull. I am
  trying to encourage all maintainers to do this, and I've been working
  to introduce patch attestation so that maintainers preferring to work
  with patch series as opposed to pull requests can have similar
  functionality.
- Linus checks all signatures on trees he pulls from non-kernel.org
  locations. Unfortunately, I've not been able to convince him that he
  should check them on stuff he pulls from kernel.org as well (and he
  has his own reasons for that).
So, all of this is to say that as the person cloning linux.git you are
merely the last link in the chain of "trusting the maintainer before
you." In your case that maintainer is Linus (or Greg KH), and you have
to agree that, in the end, "having a tree that is bit-for-bit identical
with what Linus has" is a pretty good assurance that it's as "authentic
Linux" as it gets.
-K