git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Re: [RFC PATCH 0/2] Add named reference to latest push cert
       [not found] <20170906093913.21485-1-root@shikherverma.com>
@ 2017-09-06 21:31 ` Stefan Beller
  2017-09-07  0:55   ` Junio C Hamano
  2017-09-07  9:11   ` Shikher Verma
  2017-09-07  7:08 ` [RFC PATCH 0/2] Add named reference to latest push cert Shikher Verma
  1 sibling, 2 replies; 16+ messages in thread
From: Stefan Beller @ 2017-09-06 21:31 UTC (permalink / raw)
  To: Shikher Verma; +Cc: git@vger.kernel.org, Santiago Torres

On Wed, Sep 6, 2017 at 2:39 AM, Shikher Verma <root@shikherverma.com> wrote:
> Currently, git only stores push certificates if there is a receive hook
> present. This may violate the principle of least surprise (e.g., I
> pushed with --signed, and I don't see anything in upstream).
> Additionally, push certificates could be more versatile if they are not
> tightly bound to git hooks. Finally, it would be useful to verify the
> signed pushes at later points of time with ease.
>
> A named ref is added for ease of access/tooling around push
> certificates. If the last push was signed, ref/PUSH_CERT stores the
> ref of the latest push cert otherwise it is empty.
>
> Sending patches as RFC since the documentation would have to be
> updated and git gc might have to be patched to not garbage collect
> the latest push certificate.
>
> This patch applies on master (3ec7d702a)

What are performance implications for busy repositories at busy hosts?
(think kernel.org / github) They may want to disable this new feature
for performance reasons or because they don't want to clutter the
object store. So at least a config option to turn it off would be useful.

On the ref to store the push certs:
(a) Currently the ref points at the blob, I wonder if we'd rather want to
    point at a commit? (Then we can build up a history of
    push certs, instead of relying on the reflog to show all
    push certs. Also the gc issue might be easier to solve using this)
(b) When going with (a), we might want to change the name. Most
    refs are 3 directories deep:
      refs/heads/<branch name>
      refs/pr/<pull request nr> # at github IIUC
      refs/changes/<id> # Gerrit
      refs/meta/config # Gerrit to e.g. configure ACLs of the repo
    "refs" indicates it is a ref, whereas the second part can be seen
    as a "namespace". Currently Git only uses the "heads" and "tags"
    namespace, "meta" is used by more than just Gerrit, so maybe it is
    not wise to use "refs/meta/push_cert", but go with refs/gitmeta/pushcert
    instead?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-06 21:31 ` [RFC PATCH 0/2] Add named reference to latest push cert Stefan Beller
@ 2017-09-07  0:55   ` Junio C Hamano
  2017-09-07  8:55     ` Shikher Verma
  2017-09-07  9:11   ` Shikher Verma
  1 sibling, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2017-09-07  0:55 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Shikher Verma, git@vger.kernel.org, Santiago Torres

Stefan Beller <sbeller@google.com> writes:

> On the ref to store the push certs:
> (a) Currently the ref points at the blob, I wonder if we'd rather want to
>     point at a commit? (Then we can build up a history of
>     push certs, instead of relying on the reflog to show all
>     push certs. Also the gc issue might be easier to solve using this)
> (b) When going with (a), we might want to change the name. Most
>     refs are 3 directories deep:
>       refs/heads/<branch name>
>       refs/pr/<pull request nr> # at github IIUC
>       refs/changes/<id> # Gerrit
>       refs/meta/config # Gerrit to e.g. configure ACLs of the repo
>     "refs" indicates it is a ref, whereas the second part can be seen
>     as a "namespace". Currently Git only uses the "heads" and "tags"
>     namespace, "meta" is used by more than just Gerrit, so maybe it is
>     not wise to use "refs/meta/push_cert", but go with refs/gitmeta/pushcert
>     instead?

You also need to worry about concurrent pushes.  The resulting
"history" may not have to be sequencial, but two pushes that affect
the same ref must be serialized in the resulting push-cert store.

The original design deliberately punts all the complexity to hook
exactly because we do not want to have a half-baked "built-in"
implementation that would only get in the way of those who wants to
do high-performance servers.  It is very likely that they want to
have a long-running daemon that listens to a port or a named pipe,
where the only thing the hook would do is to write the value of
GIT_PUSH_CERT to that daemon process, which acts as a serialization
point, can read from the object store that is used as a short-term
temporary area, and write the push cert to a more permanent store.

Having said all that, I am sympathetic to a wish to have an
easy-to-enable-without-thinking example that is not so involved
(e.g. no portability concern, does not have to perform very well but
must be correct).  If Shikher wants to add one, I think the right
approach to do so would be to add and ship a sample hook.

Thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
       [not found] <20170906093913.21485-1-root@shikherverma.com>
  2017-09-06 21:31 ` [RFC PATCH 0/2] Add named reference to latest push cert Stefan Beller
@ 2017-09-07  7:08 ` Shikher Verma
  2017-09-07 17:21   ` Stefan Beller
  1 sibling, 1 reply; 16+ messages in thread
From: Shikher Verma @ 2017-09-07  7:08 UTC (permalink / raw)
  To: git

Hey everyone,

I felt like I should introduce myself since this is my first patch on
the git mailing list (or any mailing list actually) :D

I am Shikher[1], currently in my 4th year undergrad at IIT Kanpur.
This summer I was lucky enough to intern at NYU Secure Systems Lab[2]
mentored by Santiago. We looked into how signed pushes work and how 
we can use them to increase the security of git. We encountered a
strange error in tests which resulted in a patch[3] and I wrote a
python script to verify push certificates[4]. I was pretty surprised 
to not find any push certificate on the remote repo after I did a 
signed push, hence this RFC.

Anyway this is my first time trying to contribute to a large OSS so 
forgive me if I make any noob mistakes.

Thanks
Shikher Verma

[1]http://shikherverma.com/
[2]https://ssl.engineering.nyu.edu/
[3]https://public-inbox.org/git/20170707220159.12752-1-santiago@nyu.edu/
[4]https://gist.github.com/ShikherVerma/9204060b545c00597e7ad9b694cfeb9c

On Wed, Sep 06, 2017 at 03:09:11PM +0530, Shikher Verma wrote:
> Currently, git only stores push certificates if there is a receive hook 
> present. This may violate the principle of least surprise (e.g., I 
> pushed with --signed, and I don't see anything in upstream). 
> Additionally, push certificates could be more versatile if they are not 
> tightly bound to git hooks. Finally, it would be useful to verify the 
> signed pushes at later points of time with ease.
> 
> A named ref is added for ease of access/tooling around push 
> certificates. If the last push was signed, ref/PUSH_CERT stores the 
> ref of the latest push cert otherwise it is empty.
>  
> Sending patches as RFC since the documentation would have to be 
> updated and git gc might have to be patched to not garbage collect 
> the latest push certificate.
> 
> This patch applies on master (3ec7d702a) 
> 
> Shikher Verma (2):
>   Always write push cert to disk
>   Store latest push cert ref in PUSH_CERT
> 
>  builtin/receive-pack.c | 25 ++++++++++++++++++++-----
>  path.c                 |  1 +
>  path.h                 |  1 +
>  3 files changed, 22 insertions(+), 5 deletions(-)
> 
> -- 
> 2.14.1
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-07  0:55   ` Junio C Hamano
@ 2017-09-07  8:55     ` Shikher Verma
  0 siblings, 0 replies; 16+ messages in thread
From: Shikher Verma @ 2017-09-07  8:55 UTC (permalink / raw)
  To: Junio C Hamano, git

On Thu, Sep 07, 2017 at 09:55:25AM +0900, Junio C Hamano wrote:
> Stefan Beller <sbeller@google.com> writes:
> 
> > On the ref to store the push certs:
> > (a) Currently the ref points at the blob, I wonder if we'd rather want to
> >     point at a commit? (Then we can build up a history of
> >     push certs, instead of relying on the reflog to show all
> >     push certs. Also the gc issue might be easier to solve using this)
> > (b) When going with (a), we might want to change the name. Most
> >     refs are 3 directories deep:
> >       refs/heads/<branch name>
> >       refs/pr/<pull request nr> # at github IIUC
> >       refs/changes/<id> # Gerrit
> >       refs/meta/config # Gerrit to e.g. configure ACLs of the repo
> >     "refs" indicates it is a ref, whereas the second part can be seen
> >     as a "namespace". Currently Git only uses the "heads" and "tags"
> >     namespace, "meta" is used by more than just Gerrit, so maybe it is
> >     not wise to use "refs/meta/push_cert", but go with refs/gitmeta/pushcert
> >     instead?
> 
> You also need to worry about concurrent pushes.  The resulting
> "history" may not have to be sequencial, but two pushes that affect
> the same ref must be serialized in the resulting push-cert store.

Oh I see. I guess concurrency would be an issue. How does recieve-pack
handle concurrent pushes? Is there a lock that I could use to decide 
if named push cert ref has to be updated or not?

> The original design deliberately punts all the complexity to hook
> exactly because we do not want to have a half-baked "built-in"
> implementation that would only get in the way of those who wants to
> do high-performance servers.  It is very likely that they want to
> have a long-running daemon that listens to a port or a named pipe,
> where the only thing the hook would do is to write the value of
> GIT_PUSH_CERT to that daemon process, which acts as a serialization
> point, can read from the object store that is used as a short-term
> temporary area, and write the push cert to a more permanent store.
> 
> Having said all that, I am sympathetic to a wish to have an
> easy-to-enable-without-thinking example that is not so involved
> (e.g. no portability concern, does not have to perform very well but
> must be correct).  If Shikher wants to add one, I think the right
> approach to do so would be to add and ship a sample hook.
> 

This patch would only add one more object to write per push so I 
think the performance penalty is not that high. We can have a config
to turn it off so that it does not get in the way of those who want 
to do high-performance servers.

People can use the recieve hook for advance use cases but I think git
should provide a builtin solution for the simple case. The reason I
favour a named ref over a sample hook is because decouping push
certificate from hook opens up more possibilities like we could store
a map of refs to the latest push cert which updated the ref. And 
serve the corresponding push cert whenever someone does 
`git pull --signed important-ref`. Effectively removing trust from 
the server by preventing tampering with refs. This could really help 
the Github generation developers like me. What do you think?

> Thanks.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-06 21:31 ` [RFC PATCH 0/2] Add named reference to latest push cert Stefan Beller
  2017-09-07  0:55   ` Junio C Hamano
@ 2017-09-07  9:11   ` Shikher Verma
  2017-09-07 17:43     ` Stefan Beller
  1 sibling, 1 reply; 16+ messages in thread
From: Shikher Verma @ 2017-09-07  9:11 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org, Santiago Torres

On Wed, Sep 06, 2017 at 02:31:49PM -0700, Stefan Beller wrote: 
> On Wed, Sep 6, 2017 at 2:39 AM, Shikher Verma <root@shikherverma.com> wrote: 
> > Currently, git only stores push certificates if there is a receive hook 
> > present. This may violate the principle of least surprise (e.g., I 
> > pushed with --signed, and I don't see anything in upstream). 
> > Additionally, push certificates could be more versatile if they are not 
> > tightly bound to git hooks. Finally, it would be useful to verify the 
> > signed pushes at later points of time with ease. 
> > 
> > A named ref is added for ease of access/tooling around push 
> > certificates. If the last push was signed, ref/PUSH_CERT stores the 
> > ref of the latest push cert otherwise it is empty. 
> > 
> > Sending patches as RFC since the documentation would have to be 
> > updated and git gc might have to be patched to not garbage collect 
> > the latest push certificate. 
> > 
> > This patch applies on master (3ec7d702a) 
> 
> What are performance implications for busy repositories at busy hosts? 
> (think kernel.org / github) They may want to disable this new feature 
> for performance reasons or because they don't want to clutter the 
> object store. So at least a config option to turn it off would be useful. 

Any typical git push would write several objects to disk, this patch 
would only add one more object per push so I think the performance 
penalty is not that high. But I agree that we can have a config to turn 
it off. 
> On the ref to store the push certs: 
> (a) Currently the ref points at the blob, I wonder if we'd rather want to 
> point at a commit? (Then we can build up a history of 
> push certs, instead of relying on the reflog to show all 
> push certs. Also the gc issue might be easier to solve using this) 

I am not sure how that would work. The ref points at the blob of push 
certificate. Since each push can update multiple refs, each push 
certificate can point to mutiple commits (tip of the updated refs). 
Also if the named ref points at the commit then how will we get the 
corresponding push certificate? 

I did think about keeping a history of push certificates but the problem 
is new pushes can delete refs and commits which were pointed to by 
previous push certificates. This makes it really difficult to decide 
which push certificates to keep and which to gc. Also this history would 
be different for different clones of the same repo. Since push 
certificate are only meta data of the git workflow I think its best to 
just keep the latest push certificate and gc the old ones. People can 
use the recieve hook if they want to do advance things like logging a 
history of push certificates. I think git should provide a builtin 
solution for the simple case. 

Another motivation to decouple push certificates from hooks was that 
later we could store a map of refs to the latest push cert which 
updated the ref. And serve the corresponding push cert whenever someone 
does `git pull --signed important-ref`. Effectively removing trust from 
the server by preventing tampering with refs. This could really help 
the Github generation developers like me.
> (b) When going with (a), we might want to change the name. Most 
> refs are 3 directories deep: 
> refs/heads/<branch name> 
> refs/pr/<pull request nr> # at github IIUC 
> refs/changes/<id> # Gerrit 
> refs/meta/config # Gerrit to e.g. configure ACLs of the repo 
> "refs" indicates it is a ref, whereas the second part can be seen 
> as a "namespace". Currently Git only uses the "heads" and "tags" 
> namespace, "meta" is used by more than just Gerrit, so maybe it is 
> not wise to use "refs/meta/push_cert", but go with refs/gitmeta/pushcert 
> instead? 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-07  7:08 ` [RFC PATCH 0/2] Add named reference to latest push cert Shikher Verma
@ 2017-09-07 17:21   ` Stefan Beller
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Beller @ 2017-09-07 17:21 UTC (permalink / raw)
  To: Shikher Verma; +Cc: git@vger.kernel.org

On Thu, Sep 7, 2017 at 12:08 AM, Shikher Verma <root@shikherverma.com> wrote:
> Hey everyone,
>
> I felt like I should introduce myself since this is my first patch on
> the git mailing list (or any mailing list actually) :D

Welcome to the list. :)

> I am Shikher[1], currently in my 4th year undergrad at IIT Kanpur.
> This summer I was lucky enough to intern at NYU Secure Systems Lab[2]
> mentored by Santiago. We looked into how signed pushes work and how
> we can use them to increase the security of git. We encountered a
> strange error in tests which resulted in a patch[3] and I wrote a
> python script to verify push certificates[4]. I was pretty surprised
> to not find any push certificate on the remote repo after I did a
> signed push, hence this RFC.
>
> Anyway this is my first time trying to contribute to a large OSS so
> forgive me if I make any noob mistakes.

Patches are very welcome. Everyone was a noob once, but that
changes quickly.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-07  9:11   ` Shikher Verma
@ 2017-09-07 17:43     ` Stefan Beller
  2017-09-16  7:21       ` Shikher Verma
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Beller @ 2017-09-07 17:43 UTC (permalink / raw)
  To: Shikher Verma; +Cc: git@vger.kernel.org, Santiago Torres

On Thu, Sep 7, 2017 at 2:11 AM, Shikher Verma <root@shikherverma.com> wrote:
> On Wed, Sep 06, 2017 at 02:31:49PM -0700, Stefan Beller wrote:
>> On Wed, Sep 6, 2017 at 2:39 AM, Shikher Verma <root@shikherverma.com> wrote:
>> > Currently, git only stores push certificates if there is a receive hook
>> > present. This may violate the principle of least surprise (e.g., I
>> > pushed with --signed, and I don't see anything in upstream).
>> > Additionally, push certificates could be more versatile if they are not
>> > tightly bound to git hooks. Finally, it would be useful to verify the
>> > signed pushes at later points of time with ease.
>> >
>> > A named ref is added for ease of access/tooling around push
>> > certificates. If the last push was signed, ref/PUSH_CERT stores the
>> > ref of the latest push cert otherwise it is empty.
>> >
>> > Sending patches as RFC since the documentation would have to be
>> > updated and git gc might have to be patched to not garbage collect
>> > the latest push certificate.
>> >
>> > This patch applies on master (3ec7d702a)
>>
>> What are performance implications for busy repositories at busy hosts?
>> (think kernel.org / github) They may want to disable this new feature
>> for performance reasons or because they don't want to clutter the
>> object store. So at least a config option to turn it off would be useful.
>
> Any typical git push would write several objects to disk,

(or just one pack file, [which may be renamed eventually, see 722ff7f8])

> this patch
> would only add one more object per push so I think the performance
> penalty is not that high. But I agree that we can have a config to turn
> it off.

I personally do not run a high performance server, so I do not terribly mind,
but thought it would be nice for them to have at least an option ready made
instead of a potential performance regression.

>> On the ref to store the push certs:
>> (a) Currently the ref points at the blob, I wonder if we'd rather want to
>> point at a commit? (Then we can build up a history of
>> push certs, instead of relying on the reflog to show all
>> push certs. Also the gc issue might be easier to solve using this)
>
> I am not sure how that would work. The ref points at the blob of push
> certificate. Since each push can update multiple refs, each push
> certificate can point to mutiple commits (tip of the updated refs).
> Also if the named ref points at the commit then how will we get the
> corresponding push certificate?
>
> I did think about keeping a history of push certificates but the problem
> is new pushes can delete refs and commits which were pointed to by
> previous push certificates. This makes it really difficult to decide
> which push certificates to keep and which to gc. Also this history would
> be different for different clones of the same repo. Since push
> certificate are only meta data of the git workflow I think its best to
> just keep the latest push certificate and gc the old ones. People can
> use the recieve hook if they want to do advance things like logging a
> history of push certificates. I think git should provide a builtin
> solution for the simple case.

What I had in mind was what would be achieved with a
hook like this (untested):

    #!/bin/sh
    if test -z GIT_PUSH_CERT ; then
        exit 0
    fi

    # add a new worktree 'tmp', checking out the magic ref:
    git worktree add tmp refs/PUSH_CERT

    cp $GIT_PUSH_CERT tmp/cert
    git -C tmp add cert
    git -C tmp commit -m "new push cert"
    # maybe include GIT_PUSH_CERT_[NONCE_]STATUS
    # in commit message?

    # clean up, command doesn't exist yet:
    git worktree delete tmp

This would not deal with concurrency as it re-uses the
same worktree, but illustrates what I had in mind
for the git history of that special ref.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-07 17:43     ` Stefan Beller
@ 2017-09-16  7:21       ` Shikher Verma
  2017-09-17  1:40         ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Shikher Verma @ 2017-09-16  7:21 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org, Santiago Torres, gitster

[-- Attachment #1: Type: text/plain, Size: 5116 bytes --]

On Thu, Sep 07, 2017 at 05:43:19PM +0000, Stefan Beller wrote:
> On Thu, Sep 7, 2017 at 2:11 AM, Shikher Verma <root@shikherverma.com> wrote:
> > On Wed, Sep 06, 2017 at 02:31:49PM -0700, Stefan Beller wrote:
> >> On Wed, Sep 6, 2017 at 2:39 AM, Shikher Verma <root@shikherverma.com> wrote:
> >> > Currently, git only stores push certificates if there is a receive hook
> >> > present. This may violate the principle of least surprise (e.g., I
> >> > pushed with --signed, and I don't see anything in upstream).
> >> > Additionally, push certificates could be more versatile if they are not
> >> > tightly bound to git hooks. Finally, it would be useful to verify the
> >> > signed pushes at later points of time with ease.
> >> >
> >> > A named ref is added for ease of access/tooling around push
> >> > certificates. If the last push was signed, ref/PUSH_CERT stores the
> >> > ref of the latest push cert otherwise it is empty.
> >> >
> >> > Sending patches as RFC since the documentation would have to be
> >> > updated and git gc might have to be patched to not garbage collect
> >> > the latest push certificate.
> >> >
> >> > This patch applies on master (3ec7d702a)
> >>
> >> What are performance implications for busy repositories at busy hosts?
> >> (think kernel.org / github) They may want to disable this new feature
> >> for performance reasons or because they don't want to clutter the
> >> object store. So at least a config option to turn it off would be useful.
> >
> > Any typical git push would write several objects to disk,
> 
> (or just one pack file, [which may be renamed eventually, see 722ff7f8])
> 
> > this patch
> > would only add one more object per push so I think the performance
> > penalty is not that high. But I agree that we can have a config to turn
> > it off.
> 
> I personally do not run a high performance server, so I do not terribly mind,
> but thought it would be nice for them to have at least an option ready made
> instead of a potential performance regression.
> 
> >> On the ref to store the push certs:
> >> (a) Currently the ref points at the blob, I wonder if we'd rather want to
> >> point at a commit? (Then we can build up a history of
> >> push certs, instead of relying on the reflog to show all
> >> push certs. Also the gc issue might be easier to solve using this)
> >
> > I am not sure how that would work. The ref points at the blob of push
> > certificate. Since each push can update multiple refs, each push
> > certificate can point to mutiple commits (tip of the updated refs).
> > Also if the named ref points at the commit then how will we get the
> > corresponding push certificate?
> >
> > I did think about keeping a history of push certificates but the problem
> > is new pushes can delete refs and commits which were pointed to by
> > previous push certificates. This makes it really difficult to decide
> > which push certificates to keep and which to gc. Also this history would
> > be different for different clones of the same repo. Since push
> > certificate are only meta data of the git workflow I think its best to
> > just keep the latest push certificate and gc the old ones. People can
> > use the recieve hook if they want to do advance things like logging a
> > history of push certificates. I think git should provide a builtin
> > solution for the simple case.
> 
> What I had in mind was what would be achieved with a
> hook like this (untested):
> 
>     #!/bin/sh
>     if test -z GIT_PUSH_CERT ; then
>         exit 0
>     fi
> 
>     # add a new worktree 'tmp', checking out the magic ref:
>     git worktree add tmp refs/PUSH_CERT
> 
>     cp $GIT_PUSH_CERT tmp/cert
>     git -C tmp add cert
>     git -C tmp commit -m "new push cert"
>     # maybe include GIT_PUSH_CERT_[NONCE_]STATUS
>     # in commit message?
> 
>     # clean up, command doesn't exist yet:
>     git worktree delete tmp
>

This might be a good starting point for a sample hook if we choose to go
that way. As Junio suggested.
> This would not deal with concurrency as it re-uses the
> same worktree, but illustrates what I had in mind
> for the git history of that special ref.

I personally feel that we should decouple push certificates from hooks
and serve the push certificate whenever someone does
`git pull --signed important-ref`. That way we remove trust from
services like Github, Gitlab, Bitbucket. This could be done if git
stores a map of refs to last push certificate that updated that ref.

Push certificates are preventing MiTM attack between pusher and server.
If we start serving push certificates on pull, it would prevent MiTM
attack between pusher and puller! Compromise of server wouldn't mean
vulnerabilities in your project.

A sample hook solves the immediate problem of making push certs more
accessible. But going with `git pull --signed` make push certs
tremendously more accessible and useful. What are your thoughts on
having git signed pull?

What do you guys think I should do in regards to this patch series?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-16  7:21       ` Shikher Verma
@ 2017-09-17  1:40         ` Junio C Hamano
  2017-09-18 14:22           ` Santiago Torres
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2017-09-17  1:40 UTC (permalink / raw)
  To: Shikher Verma; +Cc: Stefan Beller, git@vger.kernel.org, Santiago Torres

Shikher Verma <root@shikherverma.com> writes:

> This might be a good starting point for a sample hook if we choose to go
> that way. As Junio suggested.
>> This would not deal with concurrency as it re-uses the
>> same worktree, but illustrates what I had in mind
>> for the git history of that special ref.

That's Stefan; I wouldn't have suggested any approach that uses the
blob whose sole purpose is to serve as a temporary storage area to
pass the information to the hook as part of the permanent record.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-17  1:40         ` Junio C Hamano
@ 2017-09-18 14:22           ` Santiago Torres
  2017-09-18 17:43             ` Stefan Beller
  2017-09-19  1:04             ` Junio C Hamano
  0 siblings, 2 replies; 16+ messages in thread
From: Santiago Torres @ 2017-09-18 14:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shikher Verma, Stefan Beller, git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 2617 bytes --]

Hello, everyone.

Sorry for being late in this thread, I was making sure I didn't say
anything outrageously wrong. 

> That's Stefan; I wouldn't have suggested any approach that uses the
> blob whose sole purpose is to serve as a temporary storage area to
> pass the information to the hook as part of the permanent record.
> 

Hmm, as far as I understand *this* is the status quo. We get an
ephemeral sha1/oid only if you have a hook attached. Otherwise, you will
never see the object at all, even if you have --signed set.

I suggested preparing this RFC/Patch because of the following reasons
(please let me know if my understanding of any of this is wrong...):

    - I find it weird that the cli allows for a --signed push and
      nowhere in the receive-pack's feedback you're told if the push
      certificate is compute/stored/handled at all. I think that, at the
      latest, receive pack should let the user know whether the signed
      push succeeded, or if there is no hook attached to handle it.

    - *if there is a hook* the blob is computed, but it is up to the
      hook itself to store it *somewhere*. This makes me feel like it's
      somewhat of a useless waste of computation if the hook is not
      meant to handle it anyway(which is just a post-receive hook). I
      find it rather weird that --signed is a builtin flag, and is
      handled on the server side only partially (just my two cents).

    - To me, the way push certificates are handled now feels like having
      git commit -S producing a detached signature that you have to
      embed somehow in the resulting commit object. (As a matter of
      fact, many points on [1] seem to back this notion, and even recall
      some drawbacks on push certificates the way they are handled
      today)

I understand the concurrency concerns, so I agree with stefan's
solution, although I don't know how big of a deal it would be, if git
already supports --atomic pushes (admittedly, I haven't checked if there
are any guards in place for someone who pushes millions of refs
atomically). It'd be completely understandable to have a setting to
disable hadnling of --signed pushes and, ideally, a way to warn the user
of this feature being disabled if --signed is sent.

Maybe I'm missing the bigger picture, but to me it feels that a named
ref to the latest push certificate would make it easier to
handle/tool/sync around the push certificate solution?

Thanks,
-Santiago.

[1] https://public-inbox.org/git/CAJo=hJvWbjEM9E5AjPHgmQ=eY8xf=Q=xtukeu2Ur7auUqeabDg@mail.gmail.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-18 14:22           ` Santiago Torres
@ 2017-09-18 17:43             ` Stefan Beller
  2017-09-19  1:04             ` Junio C Hamano
  1 sibling, 0 replies; 16+ messages in thread
From: Stefan Beller @ 2017-09-18 17:43 UTC (permalink / raw)
  To: Santiago Torres; +Cc: Junio C Hamano, Shikher Verma, git@vger.kernel.org

On Mon, Sep 18, 2017 at 7:22 AM, Santiago Torres <santiago@nyu.edu> wrote:
> Hello, everyone.
>
> Sorry for being late in this thread, I was making sure I didn't say
> anything outrageously wrong.
>
>> That's Stefan; I wouldn't have suggested any approach that uses the
>> blob whose sole purpose is to serve as a temporary storage area to
>> pass the information to the hook as part of the permanent record.
>>

I put out a vague design that seemed to have more advantages
in my mind at the time of writing. :)

>
> Hmm, as far as I understand *this* is the status quo. We get an
> ephemeral sha1/oid only if you have a hook attached. Otherwise, you will
> never see the object at all, even if you have --signed set.
>
> I suggested preparing this RFC/Patch because of the following reasons
> (please let me know if my understanding of any of this is wrong...):
>
>     - I find it weird that the cli allows for a --signed push and
>       nowhere in the receive-pack's feedback you're told if the push
>       certificate is compute/stored/handled at all. I think that, at the
>       latest, receive pack should let the user know whether the signed
>       push succeeded, or if there is no hook attached to handle it.

How would a user benefit from it?
(Are there cases where the organisation wants the user to not know
deliberately what happened to their push cert? Do we care about these
cases?)

>     - *if there is a hook* the blob is computed, but it is up to the
>       hook itself to store it *somewhere*. This makes me feel like it's
>       somewhat of a useless waste of computation if the hook is not
>       meant to handle it anyway(which is just a post-receive hook). I
>       find it rather weird that --signed is a builtin flag, and is
>       handled on the server side only partially (just my two cents).

I agree, but many features in Git start out small and only partially.

>     - To me, the way push certificates are handled now feels like having
>       git commit -S producing a detached signature that you have to
>       embed somehow in the resulting commit object. (As a matter of
>       fact, many points on [1] seem to back this notion, and even recall
>       some drawbacks on push certificates the way they are handled
>       today)
>
> I understand the concurrency concerns, so I agree with stefan's
> solution, although I don't know how big of a deal it would be, if git
> already supports --atomic pushes (admittedly, I haven't checked if there
> are any guards in place for someone who pushes millions of refs
> atomically). It'd be completely understandable to have a setting to
> disable hadnling of --signed pushes and, ideally, a way to warn the user
> of this feature being disabled if --signed is sent.

That makes sense.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-18 14:22           ` Santiago Torres
  2017-09-18 17:43             ` Stefan Beller
@ 2017-09-19  1:04             ` Junio C Hamano
  2017-09-19  3:11               ` Junio C Hamano
  1 sibling, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2017-09-19  1:04 UTC (permalink / raw)
  To: Santiago Torres; +Cc: Shikher Verma, Stefan Beller, git@vger.kernel.org

Santiago Torres <santiago@nyu.edu> writes:

>     - *if there is a hook* the blob is computed, but it is up to the
>       hook itself to store it *somewhere*. This makes me feel like it's
>       somewhat of a useless waste of computation if the hook is not
>       meant to handle it anyway(which is just a post-receive hook). I
>       find it rather weird that --signed is a builtin flag, and is
>       handled on the server side only partially (just my two cents).

The way it was envisioned to be used is that the repository meant to
be protected by collected push certs may not be trusted as the
permanent store for push certs by all hosting sites.  The hook may
be told the name of a blob to read its contents and is expected to
store it away to somewhere else.

The only reason why we use blob is because creating a blob in
respose to pushes being executed in parallel will result in
different blobs unless there is hash collision.  Instead of us
having to come up with and use a different mechanism to create a
unique temporary filename and feed that to hook, reusing blob as
such was the simplest.

> I understand the concurrency concerns, so I agree with stefan's
> solution, although I don't know how big of a deal it would be, if git
> already supports --atomic pushes (admittedly, I haven't checked if there
> are any guards in place for someone who pushes millions of refs
> atomically). It'd be completely understandable to have a setting to
> disable hadnling of --signed pushes and, ideally, a way to warn the user
> of this feature being disabled if --signed is sent.

I do not think atomic helps at all, when one atomic push updates
branch A while another atomic push updates branch B.  They can still
go in parallel, and their certificates must both be stored.  You can
somehow serialize them and create a single strand of pearls to
advance a single ref, or you can let both to fork two histories to
store the push certs from these two pushes and have somebody create
a merge commit to join the history.  

But the point is that we do not want such an overhead in core, as
all of that is a useless waste of the cycle for a site that wants to
store the push certificate away outside of the repository itself.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/2] Add named reference to latest push cert
  2017-09-19  1:04             ` Junio C Hamano
@ 2017-09-19  3:11               ` Junio C Hamano
  2017-12-02  9:12                 ` [PATCH] Add a sample hook which saves push certs as notes Shikher Verma
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2017-09-19  3:11 UTC (permalink / raw)
  To: Santiago Torres; +Cc: Shikher Verma, Stefan Beller, git@vger.kernel.org

Junio C Hamano <gitster@pobox.com> writes:

> But the point is that we do not want such an overhead in core, as
> all of that is a useless waste of the cycle for a site that wants to
> store the push certificate away outside of the repository itself.

By this I do not mean that cert blobs shouldn't become part of the
in-repository record in _all_ installations that receive signed push
certificates.  An easy to reuse example shipped together with our
source would be a good thing, and an easy to enable sample hook may
even be better.  It's just that I do not want to have any "solution"
in the core part that everybody that wants to accept push certs must
run, if the "solution" is not something we can endorse as the best
current practice.  And I do not yet see how anything along the lines
of the patch that started this thread, or its extention by wrapping
them with commit chain, would become that "best current practice".

A sample hook, on the other hand, is simpler for people to experiment
and we might even come up with an unversally useful best current
prctice out of such an experiment, though.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] Add a sample hook which saves push certs as notes
  2017-09-19  3:11               ` Junio C Hamano
@ 2017-12-02  9:12                 ` Shikher Verma
  2017-12-03  0:45                   ` Todd Zullinger
  0 siblings, 1 reply; 16+ messages in thread
From: Shikher Verma @ 2017-12-02  9:12 UTC (permalink / raw)
  To: gitster; +Cc: git, root, santiago, sbeller

hooks--post-receive.sample: If push cert is present, add it as a git
note to the top most commit of the updated ref.

Signed-off-by: Shikher Verma <root@shikherverma.com>
---
 templates/hooks--post-receive.sample | 38 ++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100755 templates/hooks--post-receive.sample

diff --git a/templates/hooks--post-receive.sample b/templates/hooks--post-receive.sample
new file mode 100755
index 000000000..b4366e43f
--- /dev/null
+++ b/templates/hooks--post-receive.sample
@@ -0,0 +1,38 @@
+#!/bin/sh
+#
+# An example hook script to store push certificates as notes.
+#
+# To enable this hook, rename this file to "post-receive".
+#
+# The stdin of the hook will be one line for each updated ref:
+# <old-id> <new-id> <refname>
+#
+# For each updated ref this script will :
+# 1. Verify that the ref update matches that in push certificate.
+# 2. add the push cert as note (namespace pushcerts) to <new-id>.
+#
+# If this hook is enabled on the server then clients can prevent
+# git metadata tampering, by using signed pushes and 
+# doing the following while fetching :
+# 1. fetch the git notes (of namespace pushcerts) from server.
+#     $ git fetch origin refs/notes/pushcerts:refs/notes/pushcerts
+# 2. Check that the fetched ref's top most commit has a note
+#     containing a push certificate.
+# 3. Verify the validity of the push certificate in the note and 
+#     check that the ref update matches that in push certificate.
+#
+
+if test -z GIT_PUSH_CERT ; then
+    exit 0
+fi
+
+push_cert=$(git cat-file -p  $GIT_PUSH_CERT)
+
+while read oval nval ref
+do
+	# Verify that the ref update matches that in push certificate.
+	if [[ $push_cert == *$oval" "$nval" "$ref* ]]; then
+		# add the push cert as note (namespaced pushcerts) to nval.
+		git notes --ref=pushcerts add -m "$push_cert" $nval -f
+	fi
+done
-- 
2.15.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Add a sample hook which saves push certs as notes
  2017-12-02  9:12                 ` [PATCH] Add a sample hook which saves push certs as notes Shikher Verma
@ 2017-12-03  0:45                   ` Todd Zullinger
  2017-12-03  6:05                     ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Todd Zullinger @ 2017-12-03  0:45 UTC (permalink / raw)
  To: Shikher Verma; +Cc: gitster, git, santiago, sbeller

Hi Shikher,

I'm not familiar with push certs, but I did notice some general issues
in the sample hook.  I hope they're helpful.

Shikher Verma wrote:
> index 000000000..b4366e43f
> --- /dev/null
> +++ b/templates/hooks--post-receive.sample
> +#!/bin/sh
...
> +if test -z GIT_PUSH_CERT ; then
> +    exit 0
> +fi

The $ is missing from GIT_PUSH_CERT.  test -z GIT_PUSH_CERT will
always be false. :)

The variable should also be quoted.  Not all sh implementations accept
a missing argument to test -z, as bash does.

More minor, Documentation/CodingGuidelines suggests placing 'then' on
a new line:

    if test -z "$GIT_PUSH_CERT"
    then
        exit 0
    fi

(There is plenty of code that doesn't follow that, so I don't know how
strong that preference is.)

This could also be written as:

    test -z "$GIT_PUSH_CERT" && exit 0

I don't know if there's any general preference to shorten it in git's
code or not.

> +push_cert=$(git cat-file -p  $GIT_PUSH_CERT)

Very minor: there's an extra space before the variable here.

(I also noticed the tests which use $GIT_PUSH_CERT, like t5534, use
'cat-file blob ...' rather than 'cat-file -p ...'.  I don't know if
that's much safer/better than letting cat-file guess the object type
in the hook.  I have no idea if there's a chance that "$GIT_PUSH_CERT"
has some unexpected, non-blob object type.)

> +while read oval nval ref
> +do
> +	# Verify that the ref update matches that in push certificate.
> +	if [[ $push_cert == *$oval" "$nval" "$ref* ]]; then

[[ isn't portable across all the sh implementations git strives to
support, as far as I know.

The minor point about 'then' on new line is applicable here too.  It
would also better match the outer 'while' loop.

> +		# add the push cert as note (namespaced pushcerts) to nval.
> +		git notes --ref=pushcerts add -m "$push_cert" $nval -f
> +	fi
> +done

-- 
Todd
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Learn from the mistakes of others--you can never live long enough to
make them all yourself.
    -- John Luther


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Add a sample hook which saves push certs as notes
  2017-12-03  0:45                   ` Todd Zullinger
@ 2017-12-03  6:05                     ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2017-12-03  6:05 UTC (permalink / raw)
  To: Todd Zullinger; +Cc: Shikher Verma, git, santiago, sbeller

Todd Zullinger <tmz@pobox.com> writes:

> (I also noticed the tests which use $GIT_PUSH_CERT, like t5534, use
> 'cat-file blob ...' rather than 'cat-file -p ...'.  I don't know if
> that's much safer/better than letting cat-file guess the object type
> in the hook.

The '-p' option is meant for human consumption and we promise that
the output from it _will_ change if it makes sense at the UI level.

In a script like this, you do care about the exact byte sequence.
So that is a more important reason why you should say "blob" not
"-p".

>> +	# Verify that the ref update matches that in push certificate.
>> +	if [[ $push_cert == *$oval" "$nval" "$ref* ]]; then

I am not sure what this expression is trying to do in the first
place.  The contents of the push certificate blob may contain these
three values, but has a lot more than that.

A post-receive is run after all the receive processing is done, so
its failing cannot abort the transfer.  I wonder how an almost
simultaneous push to a same ref, that would not fail normally
without this new hook script, would behave.  One receive updates the
tip from A to B and then starts running this script, while the other
receive updates the tip from B to C and then starts running another
copy of the script.  They both wants to update the notes database
but there can be only one winner in the race for its tip.  

What happens then?  Don't we need to be running a script like this
from a hook mechanism that runs under a lock or something?

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-12-03  6:05 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20170906093913.21485-1-root@shikherverma.com>
2017-09-06 21:31 ` [RFC PATCH 0/2] Add named reference to latest push cert Stefan Beller
2017-09-07  0:55   ` Junio C Hamano
2017-09-07  8:55     ` Shikher Verma
2017-09-07  9:11   ` Shikher Verma
2017-09-07 17:43     ` Stefan Beller
2017-09-16  7:21       ` Shikher Verma
2017-09-17  1:40         ` Junio C Hamano
2017-09-18 14:22           ` Santiago Torres
2017-09-18 17:43             ` Stefan Beller
2017-09-19  1:04             ` Junio C Hamano
2017-09-19  3:11               ` Junio C Hamano
2017-12-02  9:12                 ` [PATCH] Add a sample hook which saves push certs as notes Shikher Verma
2017-12-03  0:45                   ` Todd Zullinger
2017-12-03  6:05                     ` Junio C Hamano
2017-09-07  7:08 ` [RFC PATCH 0/2] Add named reference to latest push cert Shikher Verma
2017-09-07 17:21   ` Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).