* [PATCH] submodule: use cheaper check for submodule pushes @ 2017-07-12 23:45 Stefan Beller 2017-07-13 0:01 ` Jonathan Nieder 0 siblings, 1 reply; 20+ messages in thread From: Stefan Beller @ 2017-07-12 23:45 UTC (permalink / raw) To: git; +Cc: Stefan Beller In the function push_submodule[1] we use add_submodule_odb[2] to determine if a submodule has been populated. However the function does not work with the submodules objects that are added, instead a new child process is used to perform the actual push in the submodule. Use is_submodule_populated[3] that is cheaper to guard from unpopulated submodules. [1] 'push_submodule' was added in eb21c732d6 (push: teach --recurse-submodules the on-demand option, 2012-03-29) [2] 'add_submodule_odb' was introduced in 752c0c2492 (Add the --submodule option to the diff option family, 2009-10-19) [3] 'is_submodule_populated' was added in 5688c28d81 (submodules: add helper to determine if a submodule is populated, 2016-12-16) Signed-off-by: Stefan Beller <sbeller@google.com> --- submodule.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/submodule.c b/submodule.c index da2b484879..55afad3e8c 100644 --- a/submodule.c +++ b/submodule.c @@ -976,7 +976,9 @@ static int push_submodule(const char *path, const struct string_list *push_options, int dry_run) { - if (add_submodule_odb(path)) + int code; + + if (!is_submodule_populated_gently(path, &code)) return 1; if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) { -- 2.13.2.695.g117ddefdb4 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] submodule: use cheaper check for submodule pushes 2017-07-12 23:45 [PATCH] submodule: use cheaper check for submodule pushes Stefan Beller @ 2017-07-13 0:01 ` Jonathan Nieder 2017-07-13 0:09 ` Stefan Beller 2017-07-13 0:53 ` Junio C Hamano 0 siblings, 2 replies; 20+ messages in thread From: Jonathan Nieder @ 2017-07-13 0:01 UTC (permalink / raw) To: Stefan Beller; +Cc: git Hi, Stefan Beller wrote: > In the function push_submodule[1] we use add_submodule_odb[2] to determine > if a submodule has been populated. However the function does not work with > the submodules objects that are added, instead a new child process is used > to perform the actual push in the submodule. > > Use is_submodule_populated[3] that is cheaper to guard from unpopulated > submodules. > > [1] 'push_submodule' was added in eb21c732d6 (push: teach > --recurse-submodules the on-demand option, 2012-03-29) > [2] 'add_submodule_odb' was introduced in 752c0c2492 (Add the > --submodule option to the diff option family, 2009-10-19) > [3] 'is_submodule_populated' was added in 5688c28d81 (submodules: > add helper to determine if a submodule is populated, 2016-12-16) These footnotes don't answer the question that I really have: why did this use add_submodule_odb in the first place? E.g. did the ref iteration code require access to the object store previously and stop requiring it later? > Signed-off-by: Stefan Beller <sbeller@google.com> > --- > submodule.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/submodule.c b/submodule.c > index da2b484879..55afad3e8c 100644 > --- a/submodule.c > +++ b/submodule.c > @@ -976,7 +976,9 @@ static int push_submodule(const char *path, > const struct string_list *push_options, > int dry_run) > { > - if (add_submodule_odb(path)) > + int code; > + > + if (!is_submodule_populated_gently(path, &code)) Should this examine the code to distinguish between hard errors (e.g. "Error reading .git") and a missing repository? Thanks, Jonathan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] submodule: use cheaper check for submodule pushes 2017-07-13 0:01 ` Jonathan Nieder @ 2017-07-13 0:09 ` Stefan Beller 2017-07-13 0:53 ` Junio C Hamano 1 sibling, 0 replies; 20+ messages in thread From: Stefan Beller @ 2017-07-13 0:09 UTC (permalink / raw) To: Jonathan Nieder; +Cc: git@vger.kernel.org On Wed, Jul 12, 2017 at 5:01 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > These footnotes don't answer the question that I really have: why did > this use add_submodule_odb in the first place? oh, I forgot to put that down: presumably add_submodule_odb was used because it was available back then? Note the dates! > E.g. did the ref iteration code require access to the object store > previously and stop requiring it later? No. See [1], it was overkill since the beginning of time. >> - if (add_submodule_odb(path)) >> + int code; >> + >> + if (!is_submodule_populated_gently(path, &code)) > > Should this examine the code to distinguish between hard errors > (e.g. "Error reading .git") and a missing repository? add_submodule_odb does neither, so I think this is best kept without additional checks. Thanks! ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] submodule: use cheaper check for submodule pushes 2017-07-13 0:01 ` Jonathan Nieder 2017-07-13 0:09 ` Stefan Beller @ 2017-07-13 0:53 ` Junio C Hamano 2017-07-13 5:14 ` Stefan Beller 1 sibling, 1 reply; 20+ messages in thread From: Junio C Hamano @ 2017-07-13 0:53 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Stefan Beller, git Jonathan Nieder <jrnieder@gmail.com> writes: >> In the function push_submodule[1] we use add_submodule_odb[2] to determine >> if a submodule has been populated. However the function does not work with >> the submodules objects that are added, instead a new child process is used >> to perform the actual push in the submodule. >> >> Use is_submodule_populated[3] that is cheaper to guard from unpopulated >> submodules. >> >> [1] 'push_submodule' was added in eb21c732d6 (push: teach >> --recurse-submodules the on-demand option, 2012-03-29) >> [2] 'add_submodule_odb' was introduced in 752c0c2492 (Add the >> --submodule option to the diff option family, 2009-10-19) >> [3] 'is_submodule_populated' was added in 5688c28d81 (submodules: >> add helper to determine if a submodule is populated, 2016-12-16) > > These footnotes don't answer the question that I really have: why did > this use add_submodule_odb in the first place? > > E.g. did the ref iteration code require access to the object store > previously and stop requiring it later? Yes, the most important question is if it is safe to lose the access to the object store of the submodule. It is an endgame we should aim for to get rid of add_submodule_odb(), but does the rest of this codepath not require objects in the submodule at all or do we still need to change something to make it so? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] submodule: use cheaper check for submodule pushes 2017-07-13 0:53 ` Junio C Hamano @ 2017-07-13 5:14 ` Stefan Beller 2017-07-13 18:37 ` Junio C Hamano 0 siblings, 1 reply; 20+ messages in thread From: Stefan Beller @ 2017-07-13 5:14 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jonathan Nieder, git@vger.kernel.org On Wed, Jul 12, 2017 at 5:53 PM, Junio C Hamano <gitster@pobox.com> wrote: > Jonathan Nieder <jrnieder@gmail.com> writes: > >>> In the function push_submodule[1] we use add_submodule_odb[2] to determine >>> if a submodule has been populated. However the function does not work with >>> the submodules objects that are added, instead a new child process is used >>> to perform the actual push in the submodule. >>> >>> Use is_submodule_populated[3] that is cheaper to guard from unpopulated >>> submodules. >>> >>> [1] 'push_submodule' was added in eb21c732d6 (push: teach >>> --recurse-submodules the on-demand option, 2012-03-29) >>> [2] 'add_submodule_odb' was introduced in 752c0c2492 (Add the >>> --submodule option to the diff option family, 2009-10-19) >>> [3] 'is_submodule_populated' was added in 5688c28d81 (submodules: >>> add helper to determine if a submodule is populated, 2016-12-16) >> >> These footnotes don't answer the question that I really have: why did >> this use add_submodule_odb in the first place? >> >> E.g. did the ref iteration code require access to the object store >> previously and stop requiring it later? > > Yes, the most important question is if it is safe to lose the access > to the object store of the submodule. It is an endgame we should > aim for to get rid of add_submodule_odb(), but does the rest of this > codepath not require objects in the submodule at all or do we still > need to change something to make it so? Yes, as the code in the current form as well as in its first occurrence used the result of add_submodule_odb to determine if to spawn a child process. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] submodule: use cheaper check for submodule pushes 2017-07-13 5:14 ` Stefan Beller @ 2017-07-13 18:37 ` Junio C Hamano 2017-07-13 19:39 ` Stefan Beller 0 siblings, 1 reply; 20+ messages in thread From: Junio C Hamano @ 2017-07-13 18:37 UTC (permalink / raw) To: Stefan Beller; +Cc: Jonathan Nieder, git@vger.kernel.org Stefan Beller <sbeller@google.com> writes: > On Wed, Jul 12, 2017 at 5:53 PM, Junio C Hamano <gitster@pobox.com> wrote: >> Jonathan Nieder <jrnieder@gmail.com> writes: >> >>>> In the function push_submodule[1] we use add_submodule_odb[2] to determine >>>> if a submodule has been populated. However the function does not work with >>>> the submodules objects that are added, instead a new child process is used >>>> to perform the actual push in the submodule. >>>> >>>> Use is_submodule_populated[3] that is cheaper to guard from unpopulated >>>> submodules. >>>> >>>> [1] 'push_submodule' was added in eb21c732d6 (push: teach >>>> --recurse-submodules the on-demand option, 2012-03-29) >>>> [2] 'add_submodule_odb' was introduced in 752c0c2492 (Add the >>>> --submodule option to the diff option family, 2009-10-19) >>>> [3] 'is_submodule_populated' was added in 5688c28d81 (submodules: >>>> add helper to determine if a submodule is populated, 2016-12-16) >>> >>> These footnotes don't answer the question that I really have: why did >>> this use add_submodule_odb in the first place? >>> >>> E.g. did the ref iteration code require access to the object store >>> previously and stop requiring it later? >> >> Yes, the most important question is if it is safe to lose the access >> to the object store of the submodule. It is an endgame we should >> aim for to get rid of add_submodule_odb(), but does the rest of this >> codepath not require objects in the submodule at all or do we still >> need to change something to make it so? > > Yes, as the code in the current form as well as in its first occurrence > used the result of add_submodule_odb to determine if to spawn a child process. The original added so that the return value of the call can be used for that, and the current code still uses the return value for that purpose. That much is already known. I think Jonathan's question (which I concurred) is if we also ended up relying on the side effect of calling that function (i.e. being able to now find objects that are not in our repository but in the submodule's object store). By looking at the eb21c732d6, we can tell that the original didn't mean to and didn't add any code that relies on the ability to be able to read from the submodule object store. I am not sure if that is still true after 5 years (i.e. is there any new code added in the meantime that made us depend on the ability to read from submodule object store?). My hunch (and hope) is that we are probably safe, but that is a lot weaker than "yes this is a good change we want to apply". ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] submodule: use cheaper check for submodule pushes 2017-07-13 18:37 ` Junio C Hamano @ 2017-07-13 19:39 ` Stefan Beller 2017-07-13 20:48 ` Jonathan Nieder 0 siblings, 1 reply; 20+ messages in thread From: Stefan Beller @ 2017-07-13 19:39 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jonathan Nieder, git@vger.kernel.org On Thu, Jul 13, 2017 at 11:37 AM, Junio C Hamano <gitster@pobox.com> wrote: > > I think Jonathan's question (which I concurred) is if we also ended > up relying on the side effect of calling that function (i.e. being > able to now find objects that are not in our repository but in the > submodule's object store). By looking at the eb21c732d6, we can > tell that the original didn't mean to and didn't add any code that > relies on the ability to be able to read from the submodule object > store. I am not sure if that is still true after 5 years (i.e. is > there any new code added in the meantime that made us depend on the > ability to read from submodule object store?). Yes we are safe, because the function itself only spawns a child process (not using any of the objects). It's only caller push_unpushed_submodules also doesn't rely on objects loaded after calling push_submodule. The caller of push_unpushed_submodules (transport.c, transport_push) also doesn't need submodule objects loaded. > My hunch (and hope) is that we are probably safe, but that is a lot > weaker than "yes this is a good change we want to apply". Given the above (I went through the code), all I can do is repeating "yes this is a good change we want to apply". Thanks, Stefan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] submodule: use cheaper check for submodule pushes 2017-07-13 19:39 ` Stefan Beller @ 2017-07-13 20:48 ` Jonathan Nieder 2017-07-13 20:54 ` Stefan Beller 0 siblings, 1 reply; 20+ messages in thread From: Jonathan Nieder @ 2017-07-13 20:48 UTC (permalink / raw) To: Stefan Beller; +Cc: Junio C Hamano, git@vger.kernel.org Hi, Stefan Beller wrote: > Yes we are safe, because the function itself only spawns a child process > (not using any of the objects). > > It's only caller push_unpushed_submodules also doesn't rely on objects > loaded after calling push_submodule. > > The caller of push_unpushed_submodules (transport.c, transport_push) > also doesn't need submodule objects loaded. Thanks for looking into it. This is what the commit message should say to help reviewers or people trying to understand it later. The footnotes don't help and are distracting, except that it makes sense to point out the original GSoC patch to say the alternate submodule odb wasn't needed even then. E.g.: Subject: push: do not add submodule odb as an alternate when recursing on demand "git push --recurse-submodules=on-demand" adds each submodule as an alternate with add_submodule_odb before checking whether the submodule has anything to push and pushing it if so. However, it never accesses any objects from the submodule. In the parent process it uses the submodule's ref database to see if there is anything to push. The actual push (which does rely on objects) occurs in a child process. The same was try when this call was originally added in v1.7.11-rc0~111^2 (push: teach --recurse-submodules the on-demand option, 2012-03-29). Most likely it was added by analogy with fetch --recurse-submodules=on-demand, which did use the submodule's object database. Use is_submodule_populated_gently instead, which is simpler and cheaper. [...] > On Thu, Jul 13, 2017 at 11:37 AM, Junio C Hamano <gitster@pobox.com> wrote: >> My hunch (and hope) is that we are probably safe, but that is a lot >> weaker than "yes this is a good change we want to apply". > > Given the above (I went through the code), all I can do is repeating > "yes this is a good change we want to apply". With such a commit message change, this seems like a reasonable change in principle (though I haven't looked carefully to verify it). My one doubt is the is_submodule_populated_gently. Why are we using that instead of simpler is_submodule_populated? The names and API comments don't explain. Thanks for your patient explanations, Jonathan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] submodule: use cheaper check for submodule pushes 2017-07-13 20:48 ` Jonathan Nieder @ 2017-07-13 20:54 ` Stefan Beller 2017-08-15 22:43 ` [PATCH] push: do not add submodule odb as an alternate when recursing on demand Stefan Beller 0 siblings, 1 reply; 20+ messages in thread From: Stefan Beller @ 2017-07-13 20:54 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Junio C Hamano, git@vger.kernel.org On Thu, Jul 13, 2017 at 1:48 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > Hi, > > Stefan Beller wrote: > >> Yes we are safe, because the function itself only spawns a child process >> (not using any of the objects). >> >> It's only caller push_unpushed_submodules also doesn't rely on objects >> loaded after calling push_submodule. >> >> The caller of push_unpushed_submodules (transport.c, transport_push) >> also doesn't need submodule objects loaded. > > Thanks for looking into it. This is what the commit message should > say to help reviewers or people trying to understand it later. The > footnotes don't help and are distracting, except that it makes sense > to point out the original GSoC patch to say the alternate submodule > odb wasn't needed even then. > > E.g.: > > Subject: push: do not add submodule odb as an alternate when recursing on demand > > "git push --recurse-submodules=on-demand" adds each submodule as an > alternate with add_submodule_odb before checking whether the > submodule has anything to push and pushing it if so. > > However, it never accesses any objects from the submodule. In the > parent process it uses the submodule's ref database to see if there > is anything to push. The actual push (which does rely on objects) > occurs in a child process. > > The same was try when this call was originally added in > v1.7.11-rc0~111^2 (push: teach --recurse-submodules the on-demand > option, 2012-03-29). Most likely it was added by analogy with > fetch --recurse-submodules=on-demand, which did use the submodule's > object database. > > Use is_submodule_populated_gently instead, which is simpler and > cheaper. Thanks for giving a good example of commit message that I could use in a reroll. > With such a commit message change, this seems like a reasonable change > in principle (though I haven't looked carefully to verify it). > > My one doubt is the is_submodule_populated_gently. Why are we using > that instead of simpler is_submodule_populated? The names and API > comments don't explain. One could posit this is laziness of thinking. See 15cdc64776 (make is_submodule_populated gently, 2017-03-14), and discover there is no non-gentle version of is_submodule_populated. And for each new use, it may be cheaper to just use the gentle version instead of adding a non-gentle version. Thanks, Stefan ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-07-13 20:54 ` Stefan Beller @ 2017-08-15 22:43 ` Stefan Beller 2017-08-15 23:10 ` Jonathan Nieder ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Stefan Beller @ 2017-08-15 22:43 UTC (permalink / raw) To: sbeller; +Cc: git, gitster, jrnieder "git push --recurse-submodules=on-demand" adds each submodule as an alternate with add_submodule_odb before checking whether the submodule has anything to push and pushing it if so. However, it never accesses any objects from the submodule. In the parent process it uses the submodule's ref database to see if there is anything to push. The actual push (which does rely on objects) occurs in a child process. The same was true when this call was originally added in v1.7.11-rc0~111^2 (push: teach --recurse-submodules the on-demand option, 2012-03-29). Most likely it was added by analogy with fetch --recurse-submodules=on-demand, which did use the submodule's object database. Use is_submodule_populated_gently instead, which is simpler and cheaper. Signed-off-by: Stefan Beller <sbeller@google.com> --- Originally I intended to send this out as part of a larger series, but the series is getting too large series, sending all things in smaller units! Thanks, Stefan submodule.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/submodule.c b/submodule.c index 111a3007fc..e20216bc0f 100644 --- a/submodule.c +++ b/submodule.c @@ -966,7 +966,9 @@ static int push_submodule(const char *path, const struct string_list *push_options, int dry_run) { - if (add_submodule_odb(path)) + int code; + + if (!is_submodule_populated_gently(path, &code)) return 1; if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) { -- 2.14.0.rc0.3.g6c2e499285 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-08-15 22:43 ` [PATCH] push: do not add submodule odb as an alternate when recursing on demand Stefan Beller @ 2017-08-15 23:10 ` Jonathan Nieder 2017-08-15 23:14 ` Jonathan Nieder 2017-08-15 23:23 ` Junio C Hamano 2 siblings, 0 replies; 20+ messages in thread From: Jonathan Nieder @ 2017-08-15 23:10 UTC (permalink / raw) To: Stefan Beller; +Cc: git, gitster Stefan Beller wrote: > "git push --recurse-submodules=on-demand" adds each submodule as an > alternate with add_submodule_odb before checking whether the > submodule has anything to push and pushing it if so. > > However, it never accesses any objects from the submodule. In the > parent process it uses the submodule's ref database to see if there > is anything to push. The actual push (which does rely on objects) > occurs in a child process. > > The same was true when this call was originally added in > v1.7.11-rc0~111^2 (push: teach --recurse-submodules the on-demand > option, 2012-03-29). Most likely it was added by analogy with > fetch --recurse-submodules=on-demand, which did use the submodule's > object database. > > Use is_submodule_populated_gently instead, which is simpler and > cheaper. > > Signed-off-by: Stefan Beller <sbeller@google.com> > --- > submodule.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Thanks for picking up this loose end. > diff --git a/submodule.c b/submodule.c > index 111a3007fc..e20216bc0f 100644 > --- a/submodule.c > +++ b/submodule.c > @@ -966,7 +966,9 @@ static int push_submodule(const char *path, > const struct string_list *push_options, > int dry_run) > { > - if (add_submodule_odb(path)) > + int code; > + > + if (!is_submodule_populated_gently(path, &code)) > return 1; > > if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) { ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-08-15 22:43 ` [PATCH] push: do not add submodule odb as an alternate when recursing on demand Stefan Beller 2017-08-15 23:10 ` Jonathan Nieder @ 2017-08-15 23:14 ` Jonathan Nieder 2017-08-15 23:27 ` Stefan Beller 2017-08-15 23:23 ` Junio C Hamano 2 siblings, 1 reply; 20+ messages in thread From: Jonathan Nieder @ 2017-08-15 23:14 UTC (permalink / raw) To: Stefan Beller; +Cc: git, gitster Stefan Beller wrote: > Use is_submodule_populated_gently instead, which is simpler and > cheaper. [...] > --- a/submodule.c > +++ b/submodule.c > @@ -966,7 +966,9 @@ static int push_submodule(const char *path, > const struct string_list *push_options, > int dry_run) > { > - if (add_submodule_odb(path)) > + int code; > + > + if (!is_submodule_populated_gently(path, &code)) > return 1; Ah, I forgot about this detail. I don't think it should block this patch (so my Reviewed-by still stands), but I wonder why this needs to be gentle. add_submodule_odb is gentle so that is the conservative thing to do, but that doesn't mean it is the *right* thing to do. If this passed NULL instead of &code as the second argument, would anything break? Could there be a comment explaining what kind of error we are expecting and why it is okay to continue when that error is encountered without any error handling? Thanks, Jonathan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-08-15 23:14 ` Jonathan Nieder @ 2017-08-15 23:27 ` Stefan Beller 0 siblings, 0 replies; 20+ messages in thread From: Stefan Beller @ 2017-08-15 23:27 UTC (permalink / raw) To: Jonathan Nieder; +Cc: git@vger.kernel.org, Junio C Hamano On Tue, Aug 15, 2017 at 4:14 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > Stefan Beller wrote: > >> Use is_submodule_populated_gently instead, which is simpler and >> cheaper. > [...] >> --- a/submodule.c >> +++ b/submodule.c >> @@ -966,7 +966,9 @@ static int push_submodule(const char *path, >> const struct string_list *push_options, >> int dry_run) >> { >> - if (add_submodule_odb(path)) >> + int code; >> + >> + if (!is_submodule_populated_gently(path, &code)) >> return 1; > > Ah, I forgot about this detail. I don't think it should block this > patch (so my Reviewed-by still stands), but I wonder why this needs to > be gentle. add_submodule_odb is gentle so that is the conservative > thing to do, but that doesn't mean it is the *right* thing to do. > > If this passed NULL instead of &code as the second argument, would > anything break? push_submodule, which is called by push_unpushed_submodules just issues warnings on submodule push error, (which happen all before the main push,) such that any submodule error does not prevent the main push. We could tighten that, but I would suggest another patch for that. > Could there be a comment explaining what kind of error we are > expecting and why it is okay to continue when that error is > encountered without any error handling? The same as in add_submodule_odb, digging through the logs. I up to now we did not care about the submodule succeeding as much, it was just an aid for the main repo push. > > Thanks, > Jonathan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-08-15 22:43 ` [PATCH] push: do not add submodule odb as an alternate when recursing on demand Stefan Beller 2017-08-15 23:10 ` Jonathan Nieder 2017-08-15 23:14 ` Jonathan Nieder @ 2017-08-15 23:23 ` Junio C Hamano 2017-08-15 23:31 ` Stefan Beller 2 siblings, 1 reply; 20+ messages in thread From: Junio C Hamano @ 2017-08-15 23:23 UTC (permalink / raw) To: Stefan Beller; +Cc: git, jrnieder Stefan Beller <sbeller@google.com> writes: > "git push --recurse-submodules=on-demand" adds each submodule as an > alternate with add_submodule_odb before checking whether the > submodule has anything to push and pushing it if so. > > However, it never accesses any objects from the submodule. > ... > Use is_submodule_populated_gently instead, which is simpler and > cheaper. > > Signed-off-by: Stefan Beller <sbeller@google.com> > --- > > Originally I intended to send this out as part of a larger series, > but the series is getting too large series, sending all things in smaller > units! This vaguely reminds me that you sent something imilar perhaps for a different codepath. Is "is it populated" a good thing to check here, though? IIRC, add-submodule-odb allows you to add the object database of an inactivated submodule, so this seems to change the behaviour. I do not know if the behaviour change is a good thing (i.e. bugfix) or not (i.e. regression) offhand, though. Thanks. > diff --git a/submodule.c b/submodule.c > index 111a3007fc..e20216bc0f 100644 > --- a/submodule.c > +++ b/submodule.c > @@ -966,7 +966,9 @@ static int push_submodule(const char *path, > const struct string_list *push_options, > int dry_run) > { > - if (add_submodule_odb(path)) > + int code; > + > + if (!is_submodule_populated_gently(path, &code)) > return 1; > > if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) { ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-08-15 23:23 ` Junio C Hamano @ 2017-08-15 23:31 ` Stefan Beller 2017-08-16 0:11 ` Junio C Hamano 0 siblings, 1 reply; 20+ messages in thread From: Stefan Beller @ 2017-08-15 23:31 UTC (permalink / raw) To: Junio C Hamano; +Cc: git@vger.kernel.org, Jonathan Nieder On Tue, Aug 15, 2017 at 4:23 PM, Junio C Hamano <gitster@pobox.com> wrote: > Stefan Beller <sbeller@google.com> writes: > >> "git push --recurse-submodules=on-demand" adds each submodule as an >> alternate with add_submodule_odb before checking whether the >> submodule has anything to push and pushing it if so. >> >> However, it never accesses any objects from the submodule. >> ... >> Use is_submodule_populated_gently instead, which is simpler and >> cheaper. >> >> Signed-off-by: Stefan Beller <sbeller@google.com> >> --- >> >> Originally I intended to send this out as part of a larger series, >> but the series is getting too large series, sending all things in smaller >> units! > > This vaguely reminds me that you sent something imilar perhaps for a > different codepath. https://public-inbox.org/git/xmqqh8xzq6td.fsf@gitster.mtv.corp.google.com/ > Is "is it populated" a good thing to check here, though? IIRC, > add-submodule-odb allows you to add the object database of an > inactivated submodule, so this seems to change the behaviour. I do > not know if the behaviour change is a good thing (i.e. bugfix) or > not (i.e. regression) offhand, though. Good point, we should be able to push non-populated, even inactive(?) submodules. For that we strictly need add_submodule_odb here (or the repo object of the submodule, eventually). So let's retract this patch for now. Thanks. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-08-15 23:31 ` Stefan Beller @ 2017-08-16 0:11 ` Junio C Hamano 2017-08-16 1:05 ` Stefan Beller 0 siblings, 1 reply; 20+ messages in thread From: Junio C Hamano @ 2017-08-16 0:11 UTC (permalink / raw) To: Stefan Beller; +Cc: git@vger.kernel.org, Jonathan Nieder Stefan Beller <sbeller@google.com> writes: >> Is "is it populated" a good thing to check here, though? IIRC, >> add-submodule-odb allows you to add the object database of an >> inactivated submodule, so this seems to change the behaviour. I do >> not know if the behaviour change is a good thing (i.e. bugfix) or >> not (i.e. regression) offhand, though. > > Good point, we should be able to push non-populated, even inactive(?) > submodules. For that we strictly need add_submodule_odb here > (or the repo object of the submodule, eventually). > > So let's retract this patch for now. Not so fast. I am not convinced "push --recursive" should touch a submodule that was once cloned from the upstream and then deactivated, so using add-submodule-odb to decide if the push should go through may be a bug that we may want to fix, in which case the diff of the patch in question may be good as-is. We need to sell it as a bugfix to the users, who may complain about behaviour change (if there is one). On the other hand, even if it were desirable for such a deactivated submodule to be pushed, as your log message explained, there is no reason to contaminate the in-core object hash by calling the add-submodule-odb helper, when the only thing we care about is "do we have the refs and object store for this submodule? we do not care if it is activated or not". Perhaps there is a more appropriate helper in submodule.c that answers that question that we should be using instead of add-submodule-odb, and if there is not yet such a helper, perhaps this indicates that we need to add such a helper, which essentially is the early half of what add-submodule-odb does, i.e. ask git_path_submodule() for the object store and check if that directory exists. Thanks. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-08-16 0:11 ` Junio C Hamano @ 2017-08-16 1:05 ` Stefan Beller 2017-08-16 2:08 ` Jonathan Nieder 2017-08-16 16:35 ` Heiko Voigt 0 siblings, 2 replies; 20+ messages in thread From: Stefan Beller @ 2017-08-16 1:05 UTC (permalink / raw) To: Junio C Hamano, Jonathan Tan; +Cc: git@vger.kernel.org, Jonathan Nieder On Tue, Aug 15, 2017 at 5:11 PM, Junio C Hamano <gitster@pobox.com> wrote: > Stefan Beller <sbeller@google.com> writes: > >>> Is "is it populated" a good thing to check here, though? IIRC, >>> add-submodule-odb allows you to add the object database of an >>> inactivated submodule, so this seems to change the behaviour. I do >>> not know if the behaviour change is a good thing (i.e. bugfix) or >>> not (i.e. regression) offhand, though. >> >> Good point, we should be able to push non-populated, even inactive(?) >> submodules. For that we strictly need add_submodule_odb here >> (or the repo object of the submodule, eventually). >> >> So let's retract this patch for now. > > Not so fast. Ok, I took another look at the code. While we may desire that un-populated submodules can be pushed (due to checking out another revision where the submodule doesn't exist, before pushing), this is not supported currently, because the call to run the push in the submodule assumes there is a "<path>/.git" on which the child process can operate. So for now we HAVE to have the submodule populated. In the future we may have the a lighter version just checking the object store of the submodule. Maybe this use case in the submodule can be covered by the .remote/.imported mechanism as well, such that we'd know if we have any local objects. > I am not convinced "push --recursive" should touch a submodule that > was once cloned from the upstream and then deactivated, so using > add-submodule-odb to decide if the push should go through may be a > bug that we may want to fix, in which case the diff of the patch in > question may be good as-is. We need to sell it as a bugfix to the > users, who may complain about behaviour change (if there is one). correct. Coupled with the observation above, we want really need both "is active and populated" (your paragraph suggest not-active submodules don't need pushing, the populated bit comes from the observation above) > On the other hand, even if it were desirable for such a deactivated > submodule to be pushed, as your log message explained, there is no > reason to contaminate the in-core object hash by calling the > add-submodule-odb helper, when the only thing we care about is "do > we have the refs and object store for this submodule? we do not care > if it is activated or not". Perhaps there is a more appropriate > helper in submodule.c that answers that question that we should be > using instead of add-submodule-odb, and if there is not yet such a > helper, perhaps this indicates that we need to add such a helper, > which essentially is the early half of what add-submodule-odb does, > i.e. ask git_path_submodule() for the object store and check if that > directory exists. I think due to the constraint of also needing a worktree the is_populated helper is the correct choice here. > > Thanks. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-08-16 1:05 ` Stefan Beller @ 2017-08-16 2:08 ` Jonathan Nieder 2017-08-16 5:52 ` Stefan Beller 2017-08-16 16:35 ` Heiko Voigt 1 sibling, 1 reply; 20+ messages in thread From: Jonathan Nieder @ 2017-08-16 2:08 UTC (permalink / raw) To: Stefan Beller; +Cc: Junio C Hamano, Jonathan Tan, git@vger.kernel.org Hi, Stefan Beller wrote: > On Tue, Aug 15, 2017 at 5:11 PM, Junio C Hamano <gitster@pobox.com> wrote: >> Stefan Beller <sbeller@google.com> writes: >>> Junio C Hamano wrote: >>>> Is "is it populated" a good thing to check here, though? IIRC, >>>> add-submodule-odb allows you to add the object database of an >>>> inactivated submodule, so this seems to change the behaviour. I do >>>> not know if the behaviour change is a good thing (i.e. bugfix) or >>>> not (i.e. regression) offhand, though. >>> >>> Good point, we should be able to push non-populated, even inactive(?) >>> submodules. For that we strictly need add_submodule_odb here >>> (or the repo object of the submodule, eventually). >>> >>> So let's retract this patch for now. >> >> Not so fast. > > Ok, I took another look at the code. > > While we may desire that un-populated submodules can be pushed > (due to checking out another revision where the submodule > doesn't exist, before pushing), this is not supported currently, because > the call to run the push in the submodule assumes there is a > "<path>/.git" on which the child process can operate. > So for now we HAVE to have the submodule populated. It was not immediately obvious to me that this is just "for now". I would be really confused if I had deactivated a submodule and "git push --recurse-submodules" pushed from it anyway. If the submodule is active but not populated, then the question becomes "Why wasn't it populated?" If this is a bare repository, then nothing is populated, and pushing from an active-but-unpopulated submodule sounds like a plausible wish. But in a non-bare repository, I'm having trouble imagining the use case that brings this situation about. And where people have been needing this so far has been non-bare repositories. In that context, the check "is active and populated" does not seem unusual or provisional. Are you hinting that replacing the check with "is active" would make it work well in bare repositories? I think I agree, though you'd have to be careful about the case where the submodule is active but hasn't been fetched to $GIT_DIR/modules yet. Thanks, Jonathan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-08-16 2:08 ` Jonathan Nieder @ 2017-08-16 5:52 ` Stefan Beller 0 siblings, 0 replies; 20+ messages in thread From: Stefan Beller @ 2017-08-16 5:52 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Junio C Hamano, Jonathan Tan, git@vger.kernel.org On Tue, Aug 15, 2017 at 7:08 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > Hi, > > Stefan Beller wrote: >> On Tue, Aug 15, 2017 at 5:11 PM, Junio C Hamano <gitster@pobox.com> wrote: >>> Stefan Beller <sbeller@google.com> writes: >>>> Junio C Hamano wrote: > >>>>> Is "is it populated" a good thing to check here, though? IIRC, >>>>> add-submodule-odb allows you to add the object database of an >>>>> inactivated submodule, so this seems to change the behaviour. I do >>>>> not know if the behaviour change is a good thing (i.e. bugfix) or >>>>> not (i.e. regression) offhand, though. >>>> >>>> Good point, we should be able to push non-populated, even inactive(?) >>>> submodules. For that we strictly need add_submodule_odb here >>>> (or the repo object of the submodule, eventually). >>>> >>>> So let's retract this patch for now. >>> >>> Not so fast. >> >> Ok, I took another look at the code. >> >> While we may desire that un-populated submodules can be pushed >> (due to checking out another revision where the submodule >> doesn't exist, before pushing), this is not supported currently, because >> the call to run the push in the submodule assumes there is a >> "<path>/.git" on which the child process can operate. >> So for now we HAVE to have the submodule populated. > > It was not immediately obvious to me that this is just "for now". > > I would be really confused if I had deactivated a submodule and > "git push --recurse-submodules" pushed from it anyway. agreed. > If the > submodule is active but not populated, then the question becomes "Why > wasn't it populated?" because you checked out a different revision where the sub is not part of the working tree, for example. As long as you have locally interesting "stuff" you want to transfer it. (IMHO even when the sub is not active) > If this is a bare repository, then nothing is populated, Currently bare repos do not support having submodules AT ALL, i.e. there is no modules/<name> directory where the submodules would have any valuable information. instead we *only* have the .gitmodules files at e.g. HEAD. > and pushing > from an active-but-unpopulated submodule sounds like a plausible wish. it seems we're on the same page here. > But in a non-bare repository, I'm having trouble imagining the use > case that brings this situation about. git checkout revision-with-sub # hack.. hack.. hack.. "git commit --recurse-submodules -m bugfix" # ^ not yet implemented git checkout revision-without-sub # Oh! I forgot to push the bugfix git push --recurse-submodule <bugfix> > And where people have been needing this so far has been non-bare > repositories. In that context, the check "is active and populated" > does not seem unusual or provisional. Are you hinting that replacing > the check with "is active" would make it work well in bare > repositories? No I was saying: * currently the code only works when "is populated" * currently "is active" seems not to be considered * in the future the code shall work even when not "is populated", (but submodule git dir is there at the very least) * we may want to discuss if we want to care about "is active" additionally > I think I agree, though you'd have to be careful about > the case where the submodule is active but hasn't been fetched to > $GIT_DIR/modules yet. When not fetched yet, there is no modules/<name> git dir and we are sure there is no locally interesting information, so we want to omit the push. (see above "has git dir" is the minimum requirement, as opposed to current "is populated". > Thanks, > Jonathan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] push: do not add submodule odb as an alternate when recursing on demand 2017-08-16 1:05 ` Stefan Beller 2017-08-16 2:08 ` Jonathan Nieder @ 2017-08-16 16:35 ` Heiko Voigt 1 sibling, 0 replies; 20+ messages in thread From: Heiko Voigt @ 2017-08-16 16:35 UTC (permalink / raw) To: Stefan Beller Cc: Junio C Hamano, Jonathan Tan, git@vger.kernel.org, Jonathan Nieder Hi, was about to write that we are maybe overly cautious here. Because the current way a submodule ends up in the list to be pushed is through: find_unpushed_submodules() that itself collects all changed submodules when submodule_needs_pushing() is true. In there we have this: if (!submodule_has_commits(path, commits)) /* * NOTE: We do consider it safe to return "no" here. The * correct answer would be "We do not know" instead of * "No push needed", but it is quite hard to change * the submodule pointer without having the submodule * around. If a user did however change the submodules * without having the submodule around, this indicates * an expert who knows what they are doing or a * maintainer integrating work from other people. In * both cases it should be safe to skip this check. */ return 0; So if the check, whether a submodule has commits, fails for any reason it will not end up in the list to be pushed. As a side note: inside submodule_has_commits() there is an add_submodule_odb() followed by a process to really make sure that the commits are in the submodule. So IMO at this point we can be sure that the *database* exists and this extra check could be dropped if we said that a caller to push_submodule() should make sure that the submodule exists. The current ones are doing it already (if I did not miss anything). On Tue, Aug 15, 2017 at 06:05:25PM -0700, Stefan Beller wrote: > On Tue, Aug 15, 2017 at 5:11 PM, Junio C Hamano <gitster@pobox.com> wrote: > > Stefan Beller <sbeller@google.com> writes: > > > >>> Is "is it populated" a good thing to check here, though? IIRC, > >>> add-submodule-odb allows you to add the object database of an > >>> inactivated submodule, so this seems to change the behaviour. I do > >>> not know if the behaviour change is a good thing (i.e. bugfix) or > >>> not (i.e. regression) offhand, though. > >> > >> Good point, we should be able to push non-populated, even inactive(?) > >> submodules. For that we strictly need add_submodule_odb here > >> (or the repo object of the submodule, eventually). > >> > >> So let's retract this patch for now. > > > > Not so fast. > > Ok, I took another look at the code. > > While we may desire that un-populated submodules can be pushed > (due to checking out another revision where the submodule > doesn't exist, before pushing), this is not supported currently, because > the call to run the push in the submodule assumes there is a > "<path>/.git" on which the child process can operate. > So for now we HAVE to have the submodule populated. That is a good point though. In the current form of push_submodule() we need to have a populated submodule. So IMO to check whether the submodule is actually *populated* instead of adding the odb is correct and a possible bug fix. Cheers Heiko ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2017-08-16 16:36 UTC | newest] Thread overview: 20+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-07-12 23:45 [PATCH] submodule: use cheaper check for submodule pushes Stefan Beller 2017-07-13 0:01 ` Jonathan Nieder 2017-07-13 0:09 ` Stefan Beller 2017-07-13 0:53 ` Junio C Hamano 2017-07-13 5:14 ` Stefan Beller 2017-07-13 18:37 ` Junio C Hamano 2017-07-13 19:39 ` Stefan Beller 2017-07-13 20:48 ` Jonathan Nieder 2017-07-13 20:54 ` Stefan Beller 2017-08-15 22:43 ` [PATCH] push: do not add submodule odb as an alternate when recursing on demand Stefan Beller 2017-08-15 23:10 ` Jonathan Nieder 2017-08-15 23:14 ` Jonathan Nieder 2017-08-15 23:27 ` Stefan Beller 2017-08-15 23:23 ` Junio C Hamano 2017-08-15 23:31 ` Stefan Beller 2017-08-16 0:11 ` Junio C Hamano 2017-08-16 1:05 ` Stefan Beller 2017-08-16 2:08 ` Jonathan Nieder 2017-08-16 5:52 ` Stefan Beller 2017-08-16 16:35 ` Heiko Voigt
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).