git@vger.kernel.org list mirror (unofficial, one of many)
* Re: Preserving the ability to have both SHA1 and SHA256 signatures
@ 2021-05-16 20:57 Personal Sam Smith
  2021-05-17  3:23 ` Felipe Contreras
  0 siblings, 1 reply; 3+ messages in thread
From: Personal Sam Smith @ 2021-05-16 20:57 UTC
  To: dwh; +Cc: git

dwh invited me to contribute to this discussion and I hope my comments are helpful. He referenced my work on the DIF KERI WG standard. This emerging standard has been adopted by the Global Legal Entity Identifier Foundation (GLEIF) as the basis for its new verifiable LEIs. These are required by many regulatory bodies for participating legal entities.
https://keri.one  
https://identity.foundation/working-groups/keri.html 
https://www.gleif.org/en/lei-solutions/gleifs-digital-strategy-for-the-lei/introducing-the-verifiable-lei-vlei

This is part of a much larger effort to fix the security of internet distributed systems in general. The approach is based on the principles of what I like to call zero-trust-computing (ZTC), which is a generalization of the more commonly known zero-trust-networking (ZTN). Zero trust means "never trust, always verify", where verify is meant in the cryptographic sense of verifying cryptographic operations such as signatures or digests. ZTN is becoming increasingly popular for access control of networked applications. In contrast, ZTC merges ZTN principles with trusted computing principles and applies them to the architecture of any distributed software application.
https://trustedcomputinggroup.org
https://github.com/WebOfTrustInfo/rwot7-toronto/blob/master/final-documents/A_DID_for_everything.pdf
https://github.com/WebOfTrustInfo/rwot10-buenosaires/blob/master/final-documents/quantum-secure-dids.pdf

The core idea of zero-trust is end-to-end verifiability of all operations in the system. The type of operation is application dependent. The verifiability is cryptographic. One of the most important (and most relevant to git) types of end-to-end verifiability is authenticity via non-repudiable signatures. A signature is typically computed over a hash (digest) of the data, so it secures both the integrity of that data and its attribution to the source.

In trusted computing one starts with secure roots-of-trust that one may then build the rest of the system upon. In distributed trusted computing the root-of-trust is a verifiable data structure.
https://www.continusec.com/static/VerifiableDataStructures.pdf
https://transparency.dev/verifiable-data-structures/
https://www.bbva.com/en/on-building-a-verifiable-log-part-1-core-ideas/

The point is that a verifiable data structure provides an end-verifiable proof of some state. It becomes a verifiable state machine, which means any software application may be made verifiable using verifiable data structures. The verifiable data structure provides a secure root-of-trust that satisfies the end-verifiability principle of zero-trust computing needed for distributed systems. An open end-verifiable system may exhibit ambient verifiability, that is, any copy is verifiable by anyone, anywhere, at any time.

One of the simplest forms of a verifiable data structure is a hash-chained, signed, append-only log such as a provenance log (as proposed above by dwh). A variant would be a hash-chained signed DAG. The degree of security or cryptographic strength of the log is a function of the cryptographic strength of both the digest and signature operations. Unlike what is popularly portrayed in movies, a crypto system with at least 128 bits of cryptographic strength is practically infeasible to attack by brute force. Instead the attack must be some sort of side-channel attack, usually against one of three targets: key creation and storage infrastructure, data signing infrastructure, or signature verification infrastructure.
https://github.com/SmithSamuelM/Papers/blob/master/whitepapers/IdentifierTheory_web.pdf
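
As a rough illustration of the hash-chained append-only log described above, here is a minimal Python sketch. It is a toy under stated assumptions, not dwh's proposal or the KERI format: the entry layout, the `GENESIS` placeholder, and the function names are all hypothetical, and the signature over each entry is left out because Python's standard library has no public-key signing API.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical "previous digest" used for the first entry

def entry_digest(prev: str, payload: str) -> str:
    """Digest over the previous entry's digest plus this entry's payload."""
    data = json.dumps({"prev": prev, "payload": payload}, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

def append(log: list, payload: str) -> None:
    """Append an entry that commits to the digest of the entry before it."""
    prev = log[-1]["digest"] if log else GENESIS
    log.append({"prev": prev, "payload": payload,
                "digest": entry_digest(prev, payload)})

def verify(log: list) -> bool:
    """Recompute every link; editing any earlier entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["digest"] != entry_digest(prev, entry["payload"]):
            return False
        prev = entry["digest"]
    return True

log = []
for text in ["create key", "rotate key", "sign release"]:
    append(log, text)

intact = verify(log)            # the chain as written
log[0]["payload"] = "forged"    # tamper with the oldest entry
tampered_ok = verify(log)       # the recomputed digest no longer matches
```

In a real log each entry would also carry a signature over its digest, so that an attacker could not simply recompute the whole chain after tampering.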

For the first two (key creation/storage and data signing) there are many well-known techniques, such as secure enclaves, TPMs, HSMs, and TEEs, as well as threshold structures like multi-sig, that may provide arbitrarily high levels of security. Protecting the third target, signature verification, usually depends on using secure code libraries. But the last two, namely data signing and signature verification infrastructure, require secure delivery of the code as integrated into the application that consumes it. The result is that when designing zero-trust computing systems based on verifiable data structures, the weakest link is a side-channel attack, the weakest link for side-channel attacks is often the secure code delivery mechanism, and the weakest link for secure code delivery is often git.

What dwh is proposing is converting git from a software application with what the security community would consider antiquated security into a best-of-breed security system based on zero-trust-computing principles. This conversion does not come from imbuing git with its own security system for end-verifiable authenticity but instead from layering git on top of a secure end-verifiable authenticity layer outside of git. This layering is enabled by using self-describing cryptographic primitives inside a self-describing verifiable data structure. Self-describing verifiable data structures are to the security world what JSON is to the API world. By using self-describing primitives (such as a self-describing hash) in git's data structures, those structures become end-verifiable themselves. A signature on a secure digest is a convenient way of making secure attribution to the associated data without signing the data itself. But this requires that the digest be at least as secure as the signature. A secure digest also has the property of post-quantum protection. So secure digests such as BLAKE2b, SHA-3, and BLAKE3 can be used to protect signature schemes that are not post-quantum proof from surprise quantum attack.
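
To make the idea of a self-describing hash concrete, here is a small Python sketch in which a digest value carries the name of its own algorithm. The `<algo>:<hex>` text format, the `ALLOWED` set, and the function names are assumptions made for illustration; this is neither git's object format nor the encoding dwh proposed.

```python
import hashlib

# Hypothetical allowlist of digest algorithms a verifier is willing to accept.
ALLOWED = frozenset({"sha1", "sha256", "sha3_256", "blake2b"})

def describe(data: bytes, algo: str = "sha3_256") -> str:
    """Return '<algo>:<hex digest>' so the value names its own algorithm."""
    if algo not in ALLOWED:
        raise ValueError(f"unsupported digest algorithm: {algo}")
    return f"{algo}:{hashlib.new(algo, data).hexdigest()}"

def check(data: bytes, tagged: str) -> bool:
    """Verify data against a tagged digest, dispatching on its prefix."""
    algo, _, hexdigest = tagged.partition(":")
    if algo not in ALLOWED:
        return False  # unknown or disallowed algorithm: refuse to verify
    return hashlib.new(algo, data).hexdigest() == hexdigest

tag = describe(b"hello")                      # tagged with the default algorithm
legacy = describe(b"", "sha1")                # an older, weaker algorithm
matches = check(b"hello", tag)                # same data, same algorithm
mismatch = check(b"hellp", tag)               # different data
```

Because the verifier reads the algorithm from the value itself, a new algorithm can be adopted by extending the allowlist, without changing how existing digests are stored or checked.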

One of the essential properties of any good cryptographic system is what is called cryptographic algorithm agility. Without it the system cannot easily adapt to new attacks and newly discovered weaknesses in cryptographic algorithms. Self-describing cryptographic primitives are the most convenient enabler for cryptographic agility. One advantage of signed hash-chained provenance logs is that the whole log must be compromised, not merely one part of it. Such a log that exhibits agility, especially through self-describing primitives, is self-healing in the sense that new appendages to the log may use stronger crypto primitives which protect earlier entries in the log that use weaker primitives. This makes the log (or any such agile self-describing verifiable data structure) future proof. It is the best practice for designing distributed (over the internet) zero trust computing applications.
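
The self-healing property can be sketched as follows. This is one possible reading, under loud assumptions: each entry's digest is taken over all earlier entries plus the new payload, digests use a hypothetical `<algo>:<hex>` tag, the `TRUSTED` set stands for whatever algorithms are still considered strong, and signatures are omitted. A verifier that no longer trusts SHA-1 skips the SHA-1 entries and relies on a later entry made with a stronger digest.

```python
import hashlib
import json

# Assumed set of algorithms the verifier still considers strong.
TRUSTED = frozenset({"sha256", "sha3_256", "blake2b"})

def tagged_digest(algo: str, data: bytes) -> str:
    """Self-describing digest in the hypothetical '<algo>:<hex>' format."""
    return f"{algo}:{hashlib.new(algo, data).hexdigest()}"

def body(earlier: list, payload: str) -> bytes:
    """Canonical bytes covering every earlier entry plus the new payload."""
    doc = {"earlier": earlier, "payload": payload}
    return json.dumps(doc, sort_keys=True).encode("utf-8")

def append(log: list, payload: str, algo: str) -> None:
    """Append an entry whose digest commits to the whole log so far."""
    log.append({"payload": payload,
                "digest": tagged_digest(algo, body(log, payload))})

def verify(log: list, trusted=TRUSTED) -> bool:
    """Check entries made with a trusted algorithm; skip the weaker ones."""
    checked = 0
    for i, entry in enumerate(log):
        algo = entry["digest"].split(":", 1)[0]
        if algo not in trusted:
            continue  # weak digest: rely on a later, stronger entry instead
        expected = tagged_digest(algo, body(log[:i], entry["payload"]))
        if entry["digest"] != expected:
            return False
        checked += 1
    return checked > 0  # nothing trusted means nothing was verified

log = []
append(log, "initial import", "sha1")
append(log, "fix typo", "sha1")
append(log, "switch digest", "sha3_256")   # stronger entry covers the two above

ok_before = verify(log)
log[0]["payload"] = "forged import"        # tamper with a SHA-1 era entry
ok_after = verify(log)                     # caught by the SHA3-256 entry
```

A log made only of SHA-1 entries gives this verifier nothing to check, which matches the point that the stronger entry has to be appended before the weaker algorithm can be exploited.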

It is my prediction that over the next few years there will be a rapid switchover to zero-trust computing architectures based on self-describing verifiable data structures for distributed internet applications. It is the most elegant, most decentralized solution to the security problems of distributed internet applications. Because of git's important role in code creation and delivery, it should IMHO be leading in this space, and dwh's proposal does just that. Not fixing git in this way will eventually force workarounds for anyone seriously implementing zero-trust architectures. This will result in non-standard, usually proprietary implementations of access control mechanisms in an attempt to fix up the relatively antiquated security of git tooling. This will be bad for everyone, as it will balkanize git tooling along proprietary access control mechanisms (which is already happening). An open, interoperable, future-proofed, zero-trust secure git requires that git be secured by a verifiable substrate such as the one dwh is proposing, not by some antiquated mechanism as is the case today.


* Re: Preserving the ability to have both SHA1 and SHA256 signatures
  2021-05-16 20:57 Preserving the ability to have both SHA1 and SHA256 signatures Personal Sam Smith
@ 2021-05-17  3:23 ` Felipe Contreras
  2021-05-17  6:49   ` Cryptographic hash agnostic git (was: Re: Preserving the ability to have both SHA1 and SHA256 signatures) Bagas Sanjaya
  0 siblings, 1 reply; 3+ messages in thread
From: Felipe Contreras @ 2021-05-17  3:23 UTC
  To: Personal Sam Smith, dwh; +Cc: git

Personal Sam Smith wrote:
> One of the essential properties of any good cryptographic system is
> what is called cryptographic algorithm agility. Without it the system
> cannot easily adapt to new attacks and newly discovered weaknesses in
> cryptographic algorithms. Self-describing cryptographic primitives are
> the most convenient enabler for cryptographic agility. One advantage
> of signed hash chained provenance logs is that the whole log must be
> compromised not merely one part of it. Such a log that exhibits
> agility especially through self-describing primitives is self-healing
> in the sense that new appendages to the log may use stronger crypto
> primitives which protect earlier entries in the log that use weaker
> primitives. This makes the log (or any such agile self-describing
> verifiable data structure) future proof. It is the best practice for
> designing distributed (over the internet) zero trust computing
> applications. 

This is way above my pay grade, but let me try to interpret the above.

If we have a repository with two digest algorithms:

 2. BLAKE2b (considered non-compromised)
 1. SHA-1 (broken)

We may not be confident in the SHA-1 history (1), but as long as we have
BLAKE2b history (2), we can be confident in that.

The window between when SHA-1 was broken and when the switch to BLAKE2b
happened is when the repository could potentially be compromised.

So, it's in the best interest of the repository owners to switch to the
non-compromised version as soon as possible. In fact, it would be better
if the switch happened *BEFORE* SHA-1 was broken.

This is why algorithm agility is important.


But this is not sufficient, because BLAKE2b could get
compromised in the future. The repository owners need to be thinking
ahead to the time when they'll need to make yet another algorithm
switch.

When such a time comes, they need their infrastructure to be able to
perform the switch as fast as possible, ideally right after they've
finalized their decision.


So, if I can summarize your and dwh's proposal: git should be
cryptographic-digest-algorithm-agnostic.


So far this makes sense to me.

The only problem comes when you consider day-to-day operations, which to
be honest have been totally uninterrupted by 15 years of using SHA-1.

At this point it's worth noting that if the git project has a maxim, it
would be a single word: "performance". Nothing else matters.

So, if you suggest switching from SHA-1 to SHA-256, that's fine, as long
as you can guarantee that *performance* is not affected. This is the
work brian m. carlson seems to have been doing.

On the other hand, what dwh seemed to suggest is to support every digest
algorithm on the horizon--without regard to how that would affect
performance--and as expected that didn't land very smoothly.


But I don't think the two approaches are incompatible.

All we have to do is reconcile two facts:

  1. The ability for users to switch to a new digest is important
  2. We don't want users to be switching algorithms every other commit

If git can switch the digest algorithm on a per-repository basis, I
don't think anybody would have a problem with that.

Git could support SHA-1, SHA-256, and BLAKE2b as of today. The
repository owners can decide which algorithm to choose today, and their
past history would not be affected.
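
A per-repository digest choice can be sketched in Python by computing a blob's object ID under whichever algorithm the repository is configured for. The `blob <size>\0` framing is how git hashes blobs; everything else is an assumption for illustration: the `repo_config` dictionary and its `object_format` key are hypothetical, and BLAKE2b is not an object format git implements.

```python
import hashlib

def blob_id(content: bytes, algo: str = "sha1") -> str:
    """Hash a blob the way git frames it: 'blob <size>' + NUL + content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.new(algo, header + content).hexdigest()

# Hypothetical per-repository setting; each repository picks one algorithm.
repo_config = {"object_format": "sha256"}

empty_sha1 = blob_id(b"", "sha1")                          # SHA-1 repository
empty_repo = blob_id(b"", repo_config["object_format"])    # configured repository
empty_blake = blob_id(b"", "blake2b")                      # illustrative only
```

The same content gets a different ID under each algorithm, so the choice has to be fixed per repository rather than varied from commit to commit, which is the reconciliation suggested above.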

This is future-proof, and would let repository owners make that
decision, not git.

If at some point in the future people want to start getting ready for
SHA-4, that could be introduced into the git core *before* people want to
make such a transition, and *after* the project has made sure such a
change does not impact performance.

Or am I missing something?

Cheers.

-- 
Felipe Contreras


* Cryptographic hash agnostic git (was: Re: Preserving the ability to have both SHA1 and SHA256 signatures)
  2021-05-17  3:23 ` Felipe Contreras
@ 2021-05-17  6:49   ` Bagas Sanjaya
  0 siblings, 0 replies; 3+ messages in thread
From: Bagas Sanjaya @ 2021-05-17  6:49 UTC
  To: Felipe Contreras, Personal Sam Smith, dwh; +Cc: git



On 17/05/21 10.23, Felipe Contreras wrote:
> Personal Sam Smith wrote:
>> One of the essential properties of any good cryptographic system is
>> what is called cryptographic algorithm agility. Without it the system
>> cannot easily adapt to new attacks and newly discovered weaknesses in
>> cryptographic algorithms. Self-describing cryptographic primitives are
>> the most convenient enabler for cryptographic agility. One advantage
>> of signed hash chained provenance logs is that the whole log must be
>> compromised not merely one part of it. Such a log that exhibits
>> agility especially through self-describing primitives is self-healing
>> in the sense that new appendages to the log may use stronger crypto
>> primitives which protect earlier entries in the log that use weaker
>> primitives. This makes the log (or any such agile self-describing
>> verifiable data structure) future proof. It is the best practice for
>> designing distributed (over the internet) zero trust computing
>> applications.
> 
> This is way above my pay grade, but let me try to interpret the above.
> 
> If we have a repository with two digest algorithms:
> 
>   2. BLAKE2b (considered non-compromised)
>   1. SHA-1 (broken)
> 
> We may not be confident in the SHA-1 history (1), but as long as we have
> BLAKE2b history (2), we can be confident in that.
> 
> The window between when SHA-1 was broken and when the switch to BLAKE2b
> happened is when the repository could potentially be compromised.
> 
> So, it's in the best interest of the repository owners to switch to the
> non-compromised version as soon as possible. In fact, it would be better
> if the switch happened *BEFORE* SHA-1 was broken.
> 
> This is why algorithm agility is important.
> 
> 
> But this is not sufficient, because BLAKE2b could get
> compromised in the future. The repository owners need to be thinking
> ahead to the time when they'll need to make yet another algorithm
> switch.
> 
> When such a time comes, they need their infrastructure to be able to
> perform the switch as fast as possible, ideally right after they've
> finalized their decision.
> 
> 
> So, if I can summarize your and dwh's proposal: git should be
> cryptographic-digest-algorithm-agnostic.
>

  
But SHA-256 support in Git is still in progress, unfortunately. What if
someday SHA-1 is broken completely, and we still haven't switched to
stronger hashes?

Anyway, besides SHA-256 and BLAKE2b, there is also the SHA-3 family, which
supports digest lengths from 224 to 512 bits. If Git wants to support
SHA-3 hashed objects, which bit length should we use? I prefer 256 bits,
because it's a nice trade-off between performance (speed) and security
(resistance).
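
For reference, the SHA-3 digest lengths mentioned above can be listed with Python's `hashlib`, next to SHA-1 and SHA-256 for comparison. This is only a sketch of the sizes involved; the `bits` and `hex_chars` names are mine, and it says nothing about relative speed.

```python
import hashlib

NAMES = ("sha1", "sha256", "sha3_224", "sha3_256", "sha3_384", "sha3_512")

# Digest length in bits for each algorithm.
bits = {name: hashlib.new(name).digest_size * 8 for name in NAMES}

# Length of the hex object ID a user would see for each algorithm.
hex_chars = {name: hashlib.new(name).digest_size * 2 for name in NAMES}

for name in NAMES:
    print(f"{name:9s} {bits[name]:4d} bits  {hex_chars[name]:4d} hex chars")
```

SHA3-256 yields the same 64-character hex IDs as SHA-256, whereas SHA3-512 would double the ID length.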

> 
> So far this makes sense to me.
> 
> The only problem comes when you consider day-to-day operations, which to
> be honest have been totally uninterrupted by 15 years of using SHA-1.
> 
> At this point it's worth noting that if the git project has a maxim, it
> would be a single word: "performance". Nothing else matters.
> 
> So, if you suggest switching from SHA-1 to SHA-256, that's fine, as long
> as you can guarantee that *performance* is not affected. This is the
> work brian m. carlson seems to have been doing.
> 
> On the other hand, what dwh seemed to suggest is to support every digest
> algorithm on the horizon--without regard to how that would affect
> performance--and as expected that didn't land very smoothly.
> 
> 
> But I don't think the two approaches are incompatible.
> 
> All we have to do is reconcile two facts:
> 
>    1. The ability for users to switch to a new digest is important
>    2. We don't want users to be switching algorithms every other commit
> 
> If git can switch the digest algorithm on a per-repository basis, I
> don't think anybody would have a problem with that.
> 
> Git could support SHA-1, SHA-256, and BLAKE2b as of today. The
> repository owners can decide which algorithm to choose today, and their
> past history would not be affected.
> 

In reality, many users just use the Git packaged by their distribution,
and depending on the distro's release version, it can be quite outdated.
So we also need to consider that.

> This is future-proof, and would let repository owners make that
> decision, not git.
> 
> If at some point in the future people want to start getting ready for
> SHA-4, that could be introduced into the git core *before* people want to
> make such a transition, and *after* the project has made sure such a
> change does not impact performance.
> 
> Or am I missing something?
> 
> Cheers.
> 

Another remark: currently we roll our own hash algorithm implementations,
but industry best practice says that we should instead use third-party
libraries to do the job (OpenSSL or similar).

The problem with offloading hash algorithm implementations to third-party
libraries is that some (or most) distributions stay on a specific version
of a library for several years, adding only (backported) bugfix updates.
This makes algorithm agility harder to achieve, because we must wait
until ALL distributions support our target algorithms in their libraries.

-- 
An old man doll... just what I always wanted! - Clara
