* Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs @ 2022-05-02 14:42 Huang Zou 2022-05-05 23:57 ` Philippe Blain 2022-05-09 23:32 ` Glen Choo 0 siblings, 2 replies; 7+ messages in thread From: Huang Zou @ 2022-05-02 14:42 UTC (permalink / raw) To: git Thank you for filling out a Git bug report! Please answer the following questions to help us understand your issue. What did you do before the bug happened? (Steps to reproduce your issue) 1) Set the following configs: git config submodule.recurse true git config fetch.recurseSubmodules false 2) On a repo with submodules, run: git pull What did you expect to happen? (Expected behavior) git pull doesn't recursively fetch submodules What happened instead? (Actual behavior) Submodules are fetched recursively What's different between what you expected and what actually happened? Submodules are fetched recursively Anything else you want to add: git fetch works as intended. The documentation for fetch.recurseSubmodules states that "This option controls whether git fetch (and the underlying fetch in git pull)" so I would naturally expect git pull to behave the same as git fetch [System Info] git version: git version 2.36.0 cpu: x86_64 no commit associated with this build sizeof-long: 8 sizeof-size_t: 8 shell-path: /bin/sh feature: fsmonitor--daemon uname: Darwin 19.6.0 Darwin Kernel Version 19.6.0: Tue Aug 24 20:28:00 PDT 2021; root:xnu-6153.141.40~1/RELEASE_X86_64 x86_64 compiler info: clang: 12.0.0 (clang-1200.0.32.29) libc info: no libc information available $SHELL (typically, interactive shell): /usr/local/bin/bash [Enabled Hooks] pre-commit Thanks, Huang ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs 2022-05-02 14:42 Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs Huang Zou @ 2022-05-05 23:57 ` Philippe Blain 2022-05-06 21:50 ` Glen Choo 2022-05-09 23:32 ` Glen Choo 1 sibling, 1 reply; 7+ messages in thread From: Philippe Blain @ 2022-05-05 23:57 UTC (permalink / raw) To: Huang Zou, git; +Cc: Glen Choo Hi Huang, Le 2022-05-02 à 10:42, Huang Zou a écrit : > Thank you for filling out a Git bug report! > Please answer the following questions to help us understand your issue. > > What did you do before the bug happened? (Steps to reproduce your issue) > 1) Set the following configs: > git config submodule.recurse true > git config fetch.recurseSubmodules false > 2) On a repo with submodules, run: > git pull > > What did you expect to happen? (Expected behavior) > git pull doesn't recursively fetch submodules > > What happened instead? (Actual behavior) > Submodules are fetched recursively > > What's different between what you expected and what actually happened? > Submodules are fetched recursively > > Anything else you want to add: > git fetch works as intended. The documentation for fetch.recurseSubmodules > states that "This option controls whether git fetch (and the underlying > fetch in git pull)" so I would naturally expect git pull to behave the same > as git fetch I did not try to reproduce, but I took a look at the code and I think I understand what happens. When 'git pull' invokes 'git fetch', it does so by specifically using the '--recurse-submodules' flag, see [1]. It sends either 'yes', 'no' or 'on-demand' as value, depending on the value of the local variable 'recurse_submodules'. This variable is initialized to the config value of 'submodule.recurse' in 'git_pull_config' [2], called at [3], and then overwritten by the value given explicitely on the command line [4], parsed at [5]. So when 'git fetch' runs when called by 'git pull', it always receive the '--recurse-submodules' flag, and thus any config for fetch.recurseSubmodules is ignored (since explicit command line flags always have precedence over config values). So one way to fix this would be to also parse 'fetch.recurseSubmodules' in 'git_pull_config', and send the correct value to the 'git fetch' invocation... Or simpler, call 'git fetch' with '--recurse-submodules-default' [9] instead... [sidenote] I'm thought for a while that it was maybe not a good idea to change the behaviour in your specific situation. If you have 'submodule.recurse' set to true and 'fetch.recurseSubmodules' set to false, and if the code is changed so that indeed 'git pull' does not fetch recursively, then the code will still try to update the submodule working trees after the end of the operation (merge or rebase), see the end of 'cmd_pull' [6], [7]. This is OK, because if there are new submodule commits referenced by the superproject and they were not fetched because the fetch was not recursive, then the call to 'git submodule update' in update_submodules/rebase_submodule should fetch them, so no problem there. [/sidenote] The bug also exists in other configurations, i.e. you could have 'submodule.recurse' set to 'false' or 'true' and 'fetch.recurseSubmodules' set to 'true' or 'on-demand' and then again the 'git fetch' invocation would ignore the config value. So maybe a good way forward would be to change the flag to '--recurse-submodules-default' when invoking 'git fetch', so that the following configurations work as expected: - submodule.recurse=false / fetch.recurseSubmodules=true (should fetch) - submodule.recurse=false / fetch.recurseSubmodules=on-demand (should fetch if needed) - submodule.recurse=true / fetch.recurseSubmodules=on-demand (should fetch if needed) - submodule.recurse=true / fetch.recurseSubmodules=false (should not fetch) I hope that helps. Cheers, Philippe. [1] https://github.com/git/git/blob/f5aaf72f1b5aeb3b77bccabce014ea2590e823e2/builtin/pull.c#L539-L551 [2] https://github.com/git/git/blob/f5aaf72f1b5aeb3b77bccabce014ea2590e823e2/builtin/pull.c#L366-L369 [3] https://github.com/git/git/blob/f5aaf72f1b5aeb3b77bccabce014ea2590e823e2/builtin/pull.c#L996 [4] https://github.com/git/git/blob/f5aaf72f1b5aeb3b77bccabce014ea2590e823e2/builtin/pull.c#L122-L125 [5] https://github.com/git/git/blob/f5aaf72f1b5aeb3b77bccabce014ea2590e823e2/builtin/pull.c#L1002 [6] https://github.com/git/git/blob/f5aaf72f1b5aeb3b77bccabce014ea2590e823e2/builtin/pull.c#L1144-L1155 [7] https://github.com/git/git/blob/f5aaf72f1b5aeb3b77bccabce014ea2590e823e2/builtin/pull.c#L624-L648 [8] https://git-scm.com/docs/git-config#Documentation/git-config.txt-fetchrecurseSubmodules [9] https://git-scm.com/docs/git-fetch#Documentation/git-fetch.txt---recurse-submodules-defaultyeson-demand ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs 2022-05-05 23:57 ` Philippe Blain @ 2022-05-06 21:50 ` Glen Choo 2022-05-06 22:24 ` Philippe Blain 0 siblings, 1 reply; 7+ messages in thread From: Glen Choo @ 2022-05-06 21:50 UTC (permalink / raw) To: Philippe Blain, Huang Zou, git Philippe Blain <levraiphilippeblain@gmail.com> writes: > Hi Huang, > > Le 2022-05-02 à 10:42, Huang Zou a écrit : >> Thank you for filling out a Git bug report! >> Please answer the following questions to help us understand your issue. >> >> What did you do before the bug happened? (Steps to reproduce your issue) >> 1) Set the following configs: >> git config submodule.recurse true >> git config fetch.recurseSubmodules false >> 2) On a repo with submodules, run: >> git pull >> >> What did you expect to happen? (Expected behavior) >> git pull doesn't recursively fetch submodules >> >> What happened instead? (Actual behavior) >> Submodules are fetched recursively >> >> What's different between what you expected and what actually happened? >> Submodules are fetched recursively >> >> Anything else you want to add: >> git fetch works as intended. The documentation for fetch.recurseSubmodules >> states that "This option controls whether git fetch (and the underlying >> fetch in git pull)" so I would naturally expect git pull to behave the same >> as git fetch > > I did not try to reproduce, but I took a look at the code and I think I understand > what happens. > > When 'git pull' invokes 'git fetch', it does so by specifically using the '--recurse-submodules' > flag, see [1]. It sends either 'yes', 'no' or 'on-demand' as value, depending on the value > of the local variable 'recurse_submodules'. This variable is initialized to the config value > of 'submodule.recurse' in 'git_pull_config' [2], called at [3], and then overwritten by the value given > explicitely on the command line [4], parsed at [5]. > > So when 'git fetch' runs when called by 'git pull', it always receive the > '--recurse-submodules' flag, and thus any config for fetch.recurseSubmodules is ignored > (since explicit command line flags always have precedence over config values). Thanks for looking into this! This seems to agree with my reading of the code. I haven't tried to reproduce it either, but the code looks obviously incorrect. > So one way to fix this would be to also parse 'fetch.recurseSubmodules' in 'git_pull_config', > and send the correct value to the 'git fetch' invocation... Or simpler, call 'git fetch' with > '--recurse-submodules-default' [9] instead... Despite having touched this code fairly recently, I had to do quite a rereading to refresh myself on how this works. If I _am_ reading this correctly, then I think we actually want to set `--recurse-submodules` and not `--recurse-submodules-default`. The short story is that the two are not equivalent - when figuring out _which_ submodules to fetch (we determine on a submodule-by-submodule basis; we don't just say "should we fetch all submodules?"), `--recurse-submodules-default` gets overrided by config values, but `--recurse-submodules` does not. The longer story (which I think is quite difficult to explain, I am also a little confused) is that in a recursive fetch, `--recurse-submodules-default` is the value of the parent's (we'll call it P) `--recurse-submodules`. This only matters when a process, C1, has to pass a value for `--recurse-submodules` to its child, C2. The precedence order is: - C1's --recurse-submodules | fetch.recurseSubmodules | submodule.recurse - C2's submodule entry in C1's .git/config - C2's entry in C1's .gitmodules - C1's --recurse-submodules-default (aka P's --recurse-submodules) Specifically, in code: static int get_fetch_recurse_config(const struct submodule *submodule, struct submodule_parallel_fetch *spf) { // Glen: passed in from builtin/fetch, which parses // --recurse-submodules, fetch.recurseSubmodules, submodule.recurse if (spf->command_line_option != RECURSE_SUBMODULES_DEFAULT) return spf->command_line_option; if (submodule) { // ... // Glen: fetch configuration from .gitmodules int fetch_recurse = submodule->fetch_recurse; key = xstrfmt("submodule.%s.fetchRecurseSubmodules", submodule->name); if (!repo_config_get_string_tmp(spf->r, key, &value)) { // Glen: fetch configuration from .git/config fetch_recurse = parse_fetch_recurse_submodules_arg(key, value); } // ... } // Glen: --recurse-submodules-default return spf->default_option; } So `--recurse-submodules-default` really wasn't meant for anything other than "fetch" invoking itself in a superproject-submodule setting. Of course, I could be entirely wrong and I should just write up a test case :). I hope to send one soon. > [sidenote] > I'm thought for a while that it was maybe not a good idea to change the behaviour > in your specific situation. If you have 'submodule.recurse' > set to true and 'fetch.recurseSubmodules' set to false, and if the code is changed so that indeed > 'git pull' does not fetch recursively, then the code will still try to update the submodule working > trees after the end of the operation (merge or rebase), see the end of 'cmd_pull' [6], [7]. This is > OK, because if there are new submodule commits referenced by the superproject and they were not fetched because the > fetch was not recursive, then the call to 'git submodule update' in update_submodules/rebase_submodule > should fetch them, so no problem there. > [/sidenote] I think the bigger question to ask is "what is the intended effect of 'submodule.recurse = true' and 'fetch.recurseSubmodules = false'?". I think it is not surprising to think that recursive worktree operations might fail if the submodule commits weren't fetched. Perhaps this is just a performance optimization? i.e. after fetching in the superproject, the user wants to skip the rev walk that discovers new submodule commits. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs 2022-05-06 21:50 ` Glen Choo @ 2022-05-06 22:24 ` Philippe Blain 2022-05-09 16:24 ` Huang Zou 0 siblings, 1 reply; 7+ messages in thread From: Philippe Blain @ 2022-05-06 22:24 UTC (permalink / raw) To: Glen Choo, Huang Zou, git Hi Glen! Le 2022-05-06 à 17:50, Glen Choo a écrit : > Philippe Blain <levraiphilippeblain@gmail.com> writes: > >> Hi Huang, >> >> Le 2022-05-02 à 10:42, Huang Zou a écrit : >>> Thank you for filling out a Git bug report! >>> Please answer the following questions to help us understand your issue. >>> >>> What did you do before the bug happened? (Steps to reproduce your issue) >>> 1) Set the following configs: >>> git config submodule.recurse true >>> git config fetch.recurseSubmodules false >>> 2) On a repo with submodules, run: >>> git pull >>> >>> What did you expect to happen? (Expected behavior) >>> git pull doesn't recursively fetch submodules >>> >>> What happened instead? (Actual behavior) >>> Submodules are fetched recursively >>> >>> What's different between what you expected and what actually happened? >>> Submodules are fetched recursively >>> >>> Anything else you want to add: >>> git fetch works as intended. The documentation for fetch.recurseSubmodules >>> states that "This option controls whether git fetch (and the underlying >>> fetch in git pull)" so I would naturally expect git pull to behave the same >>> as git fetch >> >> I did not try to reproduce, but I took a look at the code and I think I understand >> what happens. >> >> When 'git pull' invokes 'git fetch', it does so by specifically using the '--recurse-submodules' >> flag, see [1]. It sends either 'yes', 'no' or 'on-demand' as value, depending on the value >> of the local variable 'recurse_submodules'. This variable is initialized to the config value >> of 'submodule.recurse' in 'git_pull_config' [2], called at [3], and then overwritten by the value given >> explicitely on the command line [4], parsed at [5]. >> >> So when 'git fetch' runs when called by 'git pull', it always receive the >> '--recurse-submodules' flag, and thus any config for fetch.recurseSubmodules is ignored >> (since explicit command line flags always have precedence over config values). > > Thanks for looking into this! This seems to agree with my reading of the > code. I haven't tried to reproduce it either, but the code looks > obviously incorrect. > >> So one way to fix this would be to also parse 'fetch.recurseSubmodules' in 'git_pull_config', >> and send the correct value to the 'git fetch' invocation... Or simpler, call 'git fetch' with >> '--recurse-submodules-default' [9] instead... > > Despite having touched this code fairly recently, I had to do quite a > rereading to refresh myself on how this works. If I _am_ reading this > correctly, then I think we actually want to set `--recurse-submodules` > and not `--recurse-submodules-default`. > > The short story is that the two are not equivalent - when figuring out > _which_ submodules to fetch (we determine on a submodule-by-submodule > basis; we don't just say "should we fetch all submodules?"), > `--recurse-submodules-default` gets overrided by config values, but > `--recurse-submodules` does not. > > The longer story (which I think is quite difficult to explain, I am also > a little confused) is that in a recursive fetch, > `--recurse-submodules-default` is the value of the parent's (we'll call > it P) `--recurse-submodules`. This only matters when a process, C1, has > to pass a value for `--recurse-submodules` to its child, C2. The > precedence order is: > > - C1's --recurse-submodules | fetch.recurseSubmodules | > submodule.recurse > - C2's submodule entry in C1's .git/config > - C2's entry in C1's .gitmodules > - C1's --recurse-submodules-default (aka P's --recurse-submodules) > > Specifically, in code: > > static int get_fetch_recurse_config(const struct submodule *submodule, > struct submodule_parallel_fetch *spf) > { > // Glen: passed in from builtin/fetch, which parses > // --recurse-submodules, fetch.recurseSubmodules, submodule.recurse > if (spf->command_line_option != RECURSE_SUBMODULES_DEFAULT) > return spf->command_line_option; > > if (submodule) { > // ... > // Glen: fetch configuration from .gitmodules > int fetch_recurse = submodule->fetch_recurse; > > key = xstrfmt("submodule.%s.fetchRecurseSubmodules", submodule->name); > if (!repo_config_get_string_tmp(spf->r, key, &value)) { > // Glen: fetch configuration from .git/config > fetch_recurse = parse_fetch_recurse_submodules_arg(key, value); > } > // ... > } > > // Glen: --recurse-submodules-default > return spf->default_option; > } > > So `--recurse-submodules-default` really wasn't meant for anything other > than "fetch" invoking itself in a superproject-submodule setting. > > Of course, I could be entirely wrong and I should just write up a test > case :). I hope to send one soon. OK so it's more complicated than I thought (I'm not surprised :P). Thanks for the details. The other thing I thought about, is that even if '--recurse-submodules-default' worked as I thought it did when I wrote this, it would still not be the right thing to change builtin/pull to invoke 'git fetch' with it instead, since then a user invoking e.g. 'git pull --recurse-submodules=false', who has 'fetch.recurseSubmodules=true', would not get a recursive fetch, since the internal fetch would prioritise 'fetch.recurseSubmodules=true', but that's contrary to user expectation that command-line flags override config, always... So whatever is done to fix this, we might need to keep track of where 'recurse_submodules' was set from, either the command-line or the config. > >> [sidenote] >> I'm thought for a while that it was maybe not a good idea to change the behaviour >> in your specific situation. If you have 'submodule.recurse' >> set to true and 'fetch.recurseSubmodules' set to false, and if the code is changed so that indeed >> 'git pull' does not fetch recursively, then the code will still try to update the submodule working >> trees after the end of the operation (merge or rebase), see the end of 'cmd_pull' [6], [7]. This is >> OK, because if there are new submodule commits referenced by the superproject and they were not fetched because the >> fetch was not recursive, then the call to 'git submodule update' in update_submodules/rebase_submodule >> should fetch them, so no problem there. >> [/sidenote] > > I think the bigger question to ask is "what is the intended effect of > 'submodule.recurse = true' and 'fetch.recurseSubmodules = false'?". Yes, I agree that it would be nice if Huang clarified that as I'm not sure either of the use case. > I > think it is not surprising to think that recursive worktree operations > might fail if the submodule commits weren't fetched. Yes, if ever 'merge' and 'rebase' are taught to obey 'submodule.recurse' (if only I had time to work on that!), then all hell would break loose if submodule commits were not fetched when these operations start. > > Perhaps this is just a performance optimization? i.e. after fetching in > the superproject, the user wants to skip the rev walk that discovers new > submodule commits. > Cheers, Philippe. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs 2022-05-06 22:24 ` Philippe Blain @ 2022-05-09 16:24 ` Huang Zou 0 siblings, 0 replies; 7+ messages in thread From: Huang Zou @ 2022-05-09 16:24 UTC (permalink / raw) To: Philippe Blain; +Cc: Glen Choo, git Hi Glen & Philippe, Thanks for taking the time to look into this! > > I think the bigger question to ask is "what is the intended effect of > > 'submodule.recurse = true' and 'fetch.recurseSubmodules = false'?". > > Yes, I agree that it would be nice if Huang clarified that as I'm not sure > either of the use case. > > Perhaps this is just a performance optimization? i.e. after fetching in > > the superproject, the user wants to skip the rev walk that discovers new > > submodule commits. > > So the use case here is just performance optimization. My team has over 10 submodules and I do not deal with most of them. I want to be able to pull the latest changes quickly (fetching submodules adds ~13 seconds when there are no new commits to a pull that would otherwise take ~1 second). I want my working tree to be clean after pulls/checkouts. So checkouts and other commands that update my commit HEAD should still recursively update submodules (hence submodule.recurse is true). Although, I may be naively assuming that fetches can be avoided if I only care about the commit referenced in the submodule link. If that isn't the case, then the more practical use-case would be 'submodule.recurse = true' and 'fetch.recurseSubmodules = on-demand', so at least submodules are only fetched when there are changes. The other use-case here, which I should have probably included in my original report, is `git pull` currently also recurse into *inactive submodules*. i.e. submodules that are not set in "submodule.active". If my submodules are not active, and not initialized, then git should not fetch commits in those submodules regardless of the settings in fetch.recurseSubmodules I hope this helps clear up a few things! Thanks, Huang ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs 2022-05-02 14:42 Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs Huang Zou 2022-05-05 23:57 ` Philippe Blain @ 2022-05-09 23:32 ` Glen Choo 2022-05-10 0:57 ` Philippe Blain 1 sibling, 1 reply; 7+ messages in thread From: Glen Choo @ 2022-05-09 23:32 UTC (permalink / raw) To: Huang Zou, git, Philippe Blain Huang Zou <huang.zou@schrodinger.com> writes: > Thank you for filling out a Git bug report! > Please answer the following questions to help us understand your issue. > > What did you do before the bug happened? (Steps to reproduce your issue) > 1) Set the following configs: > git config submodule.recurse true > git config fetch.recurseSubmodules false > 2) On a repo with submodules, run: > git pull > > What did you expect to happen? (Expected behavior) > git pull doesn't recursively fetch submodules > > What happened instead? (Actual behavior) > Submodules are fetched recursively > > What's different between what you expected and what actually happened? > Submodules are fetched recursively > > Anything else you want to add: > git fetch works as intended. The documentation for fetch.recurseSubmodules > states that "This option controls whether git fetch (and the underlying > fetch in git pull)" so I would naturally expect git pull to behave the same > as git fetch > > > > [System Info] > git version: > git version 2.36.0 > cpu: x86_64 > no commit associated with this build > sizeof-long: 8 > sizeof-size_t: 8 > shell-path: /bin/sh > feature: fsmonitor--daemon > uname: Darwin 19.6.0 Darwin Kernel Version 19.6.0: Tue Aug 24 20:28:00 PDT > 2021; root:xnu-6153.141.40~1/RELEASE_X86_64 x86_64 > compiler info: clang: 12.0.0 (clang-1200.0.32.29) > libc info: no libc information available > $SHELL (typically, interactive shell): /usr/local/bin/bash > > > [Enabled Hooks] > pre-commit > > Thanks, > Huang I've sent a fix at [1] :) I intended for the patch to be sent in reply to this thread, but I clearly don't know how to use GitGitGadget correctly. Whoops [1] https://lore.kernel.org/git/pull.1262.git.git.1652138854255.gitgitgadget@gmail.com/ ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs 2022-05-09 23:32 ` Glen Choo @ 2022-05-10 0:57 ` Philippe Blain 0 siblings, 0 replies; 7+ messages in thread From: Philippe Blain @ 2022-05-10 0:57 UTC (permalink / raw) To: Glen Choo, Huang Zou, git Hi Glen, Le 2022-05-09 à 19:32, Glen Choo a écrit : > Huang Zou <huang.zou@schrodinger.com> writes: > > I've sent a fix at [1] :) I intended for the patch to be sent in reply > to this thread, but I clearly don't know how to use GitGitGadget > correctly. Whoops Yeah, this is not supported by GitGitGadget (yet?). So it's not you :P Philippe. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2022-05-10 0:58 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-05-02 14:42 Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs Huang Zou 2022-05-05 23:57 ` Philippe Blain 2022-05-06 21:50 ` Glen Choo 2022-05-06 22:24 ` Philippe Blain 2022-05-09 16:24 ` Huang Zou 2022-05-09 23:32 ` Glen Choo 2022-05-10 0:57 ` Philippe Blain
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).