* [GSoC] [PATCH] submodule--helper: run update procedures from C @ 2021-07-22 13:40 Atharva Raykar 2021-07-23 9:37 ` Ævar Arnfjörð Bjarmason 2021-08-02 13:06 ` [GSoC] [PATCH v2] " Atharva Raykar 0 siblings, 2 replies; 13+ messages in thread From: Atharva Raykar @ 2021-07-22 13:40 UTC (permalink / raw) To: git Cc: Atharva Raykar, Emily Shaffer, Jonathan Nieder, Junio C Hamano, Christian Couder, Shourya Shukla, Kaartic Sivaraam, Prathamesh Chavan Add a new submodule--helper subcommand `run-update-procedure` that runs the update procedure if the SHA1 of the submodule does not match what the superproject expects. This is an intermediate change that works towards total conversion of `submodule update` from shell to C. Specific error codes are returned so that the shell script calling the subcommand can take a decision on the control flow, and preserve the error messages across subsequent recursive calls of `cmd_update`. This patch could have been approached differently, by first changing the `is_tip_reachable` and `fetch_in_submodule` shell functions to be `submodule--helper` subcommands, and then following up with a patch that introduces the `run-update-procedure` subcommand. We have not done it like that because those functions are trivial enough to convert directly along with these other changes. This lets us avoid the boilerplate and the cleanup patches that will need to be introduced in following that approach. Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Shourya Shukla <periperidip@gmail.com> --- This patch depends on changes introduced in 559e49fe5c (submodule: prefix die messages with 'fatal', 2021-07-08), which belongs to the ar/submodule-add (2021-07-12) series[1]. Other than that commit, it is independent of my other 'submodule add' conversion series. Opinions on the following would be appreciated: * Currently there is a lot of special meaning for the exit code, of this subcommand, which was needed to handle the various failures of running the update mode in the shell code that follows (note the extra handling of exit code 128, because a die() in C returns that value). I felt this was okay to do because in a later series that converts whatever is left, the handling of exit codes will be simplified. * Is there a way to check if a sha1 is unreachable from all the refs? Currently 'is_tip_reachable()' spawns a subprocess with an incantation of: 'git rev-list -n 1 $sha1 --not --all' I suppose I could do this with the revision-walk API [2], but it felt like a lot of boilerplate for something that was really succinct in the original shell implementation. Maybe worth looking into it for a later patch? * I added a 'NOTE' comment for `case SM_UPDATE_COMMAND` in submodule--helper.c:run_update_command(). I wonder if that comment is unnecessary noise or worth mentioning. Is there an edge case where the !command in the 'submodule.update' configuration can be improperly handled by strvec_split()? [1] https://lore.kernel.org/git/20210710074801.19917-1-raykar.ath@gmail.com/ Fetch-it-via: git fetch https://github.com/tfidfwastaken/git.git submodule-run-update-proc-list-1 builtin/submodule--helper.c | 247 ++++++++++++++++++++++++++++++++++++ git-submodule.sh | 95 ++++---------- 2 files changed, 269 insertions(+), 73 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index d55f6262e9..4e16561bf1 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -2029,6 +2029,20 @@ struct submodule_update_clone { .max_jobs = 1, \ } +struct update_data { + const char *recursive_prefix; + const char *sm_path; + const char *displaypath; + struct object_id sha1; + struct object_id subsha1; + struct submodule_update_strategy update_strategy; + int depth; + unsigned int force: 1; + unsigned int quiet: 1; + unsigned int nofetch: 1; + unsigned int just_cloned: 1; +}; +#define UPDATE_DATA_INIT { .update_strategy = SUBMODULE_UPDATE_STRATEGY_INIT } static void next_submodule_warn_missing(struct submodule_update_clone *suc, struct strbuf *out, const char *displaypath) @@ -2282,6 +2296,165 @@ static int git_update_clone_config(const char *var, const char *value, return 0; } +/* NEEDSWORK: try to do this without creating a new process */ +static int is_tip_reachable(const char *path, struct object_id *sha1) +{ + struct child_process cp = CHILD_PROCESS_INIT; + struct strbuf rev = STRBUF_INIT; + char *sha1_hex = oid_to_hex(sha1); + + cp.git_cmd = 1; + cp.dir = xstrdup(path); + cp.no_stderr = 1; + strvec_pushl(&cp.args, "rev-list", "-n", "1", sha1_hex, "--not", "--all", NULL); + + prepare_submodule_repo_env(&cp.env_array); + + if (capture_command(&cp, &rev, GIT_MAX_HEXSZ + 1) || rev.len) + return 0; + + return 1; +} + +static int fetch_in_submodule(const char *module_path, int depth, int quiet, struct object_id *sha1) +{ + struct child_process cp = CHILD_PROCESS_INIT; + + prepare_submodule_repo_env(&cp.env_array); + cp.git_cmd = 1; + cp.dir = xstrdup(module_path); + + strvec_push(&cp.args, "fetch"); + if (quiet) + strvec_push(&cp.args, "--quiet"); + if (depth) + strvec_pushf(&cp.args, "--depth=%d", depth); + if (sha1) { + char *sha1_hex = oid_to_hex(sha1); + char *remote = get_default_remote(); + strvec_pushl(&cp.args, remote, sha1_hex, NULL); + } + + return run_command(&cp); +} + +static int run_update_command(struct update_data *ud, int subforce) +{ + struct child_process cp = CHILD_PROCESS_INIT; + struct strbuf die_msg = STRBUF_INIT; + struct strbuf say_msg = STRBUF_INIT; + char *sha1 = oid_to_hex(&ud->sha1); + int retval, must_die_on_failure = 0; + + cp.dir = xstrdup(ud->sm_path); + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: + cp.git_cmd = 1; + strvec_pushl(&cp.args, "checkout", "-q", NULL); + if (subforce) + strvec_push(&cp.args, "-f"); + strbuf_addf(&die_msg, "fatal: Unable to checkout '%s' in submodule path '%s'\n", + sha1, ud->displaypath); + strbuf_addf(&say_msg, "Submodule path '%s': checked out '%s'\n", + ud->displaypath, sha1); + break; + case SM_UPDATE_REBASE: + cp.git_cmd = 1; + strvec_push(&cp.args, "rebase"); + if (ud->quiet) + strvec_push(&cp.args, "--quiet"); + strbuf_addf(&die_msg, "fatal: Unable to rebase '%s' in submodule path '%s'\n", + sha1, ud->displaypath); + strbuf_addf(&say_msg, "Submodule path '%s': rebased into '%s'\n", + ud->displaypath, sha1); + must_die_on_failure = 1; + break; + case SM_UPDATE_MERGE: + cp.git_cmd = 1; + strvec_push(&cp.args, "merge"); + if (ud->quiet) + strvec_push(&cp.args, "--quiet"); + strbuf_addf(&die_msg, "fatal: Unable to merge '%s' in submodule path '%s'\n", + sha1, ud->displaypath); + strbuf_addf(&say_msg, "Submodule path '%s': merged in '%s'\n", + ud->displaypath, sha1); + must_die_on_failure = 1; + break; + case SM_UPDATE_COMMAND: + /* NOTE: this does not handle quoted arguments */ + strvec_split(&cp.args, ud->update_strategy.command); + strbuf_addf(&die_msg, "fatal: Execution of '%s %s' failed in submodule path '%s'\n", + ud->update_strategy.command, sha1, ud->displaypath); + strbuf_addf(&say_msg, "Submodule path '%s': '%s %s'\n", + ud->displaypath, ud->update_strategy.command, sha1); + must_die_on_failure = 1; + break; + case SM_UPDATE_UNSPECIFIED: + case SM_UPDATE_NONE: + BUG("update strategy should have been specified"); + } + + strvec_push(&cp.args, sha1); + + prepare_submodule_repo_env(&cp.env_array); + + if (run_command(&cp)) { + if (must_die_on_failure) { + retval = 2; + fputs(_(die_msg.buf), stderr); + goto cleanup; + } + /* + * This signifies to the caller in shell that + * the command failed without dying + */ + retval = 1; + goto cleanup; + } + retval = 0; + puts(_(say_msg.buf)); + +cleanup: + strbuf_release(&die_msg); + strbuf_release(&say_msg); + return retval; +} + +static int do_run_update_procedure(struct update_data *ud) +{ + if ((!is_null_oid(&ud->sha1) && !is_null_oid(&ud->subsha1) && !oideq(&ud->sha1, &ud->subsha1)) || + is_null_oid(&ud->subsha1) || ud->force) { + int subforce = is_null_oid(&ud->subsha1) || ud->force; + + if (!ud->nofetch) { + /* + * Run fetch only if `sha1` isn't present or it + * is not reachable from a ref. + */ + if (!is_tip_reachable(ud->sm_path, &ud->sha1)) + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && + !ud->quiet) + fprintf_ln(stderr, + _("Unable to fetch in submodule path '%s'; " + "trying to directly fetch %s:"), + ud->displaypath, oid_to_hex(&ud->sha1)); + /* + * Now we tried the usual fetch, but `sha1` may + * not be reachable from any of the refs. + */ + if (!is_tip_reachable(ud->sm_path, &ud->sha1)) + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, &ud->sha1)) + die(_("Fetched in submodule path '%s', but it did not " + "contain %s. Direct fetching of that commit failed."), + ud->displaypath, oid_to_hex(&ud->sha1)); + } + + return run_update_command(ud, subforce); + } + + return 3; +} + static void update_submodule(struct update_clone_data *ucd) { fprintf(stdout, "dummy %s %d\t%s\n", @@ -2379,6 +2552,79 @@ static int update_clone(int argc, const char **argv, const char *prefix) return update_submodules(&suc); } +static int run_update_procedure(int argc, const char **argv, const char *prefix) +{ + int force = 0, quiet = 0, nofetch = 0, just_cloned = 0; + char *prefixed_path, *update = NULL; + char *sha1 = NULL, *subsha1 = NULL; + struct update_data update_data = UPDATE_DATA_INIT; + + struct option options[] = { + OPT__QUIET(&quiet, N_("suppress output for update by rebase or merge")), + OPT__FORCE(&force, N_("force checkout updates"), 0), + OPT_BOOL('N', "no-fetch", &nofetch, + N_("don't fetch new objects from the remote site")), + OPT_BOOL(0, "just-cloned", &just_cloned, + N_("overrides update mode in case the repository is a fresh clone")), + OPT_INTEGER(0, "depth", &update_data.depth, N_("depth for shallow fetch")), + OPT_STRING(0, "prefix", &prefix, + N_("path"), + N_("path into the working tree")), + OPT_STRING(0, "update", &update, + N_("string"), + N_("rebase, merge, checkout or none")), + OPT_STRING(0, "recursive-prefix", &update_data.recursive_prefix, N_("path"), + N_("path into the working tree, across nested " + "submodule boundaries")), + OPT_STRING(0, "sha1", &sha1, N_("string"), + N_("SHA1 expected by superproject")), + OPT_STRING(0, "subsha1", &subsha1, N_("string"), + N_("SHA1 of submodule's HEAD")), + OPT_END() + }; + + const char *const usage[] = { + N_("git submodule--helper run-update-procedure [<options>] <path>"), + NULL + }; + + argc = parse_options(argc, argv, prefix, options, usage, 0); + + if (argc != 1) + usage_with_options(usage, options); + + update_data.force = !!force; + update_data.quiet = !!quiet; + update_data.nofetch = !!nofetch; + update_data.just_cloned = !!just_cloned; + update_data.sm_path = argv[0]; + + if (sha1) + get_oid_hex(sha1, &update_data.sha1); + else + oidcpy(&update_data.sha1, null_oid()); + + if (subsha1) + get_oid_hex(subsha1, &update_data.subsha1); + else + oidcpy(&update_data.subsha1, null_oid()); + + if (update_data.recursive_prefix) + prefixed_path = xstrfmt("%s%s", update_data.recursive_prefix, update_data.sm_path); + else + prefixed_path = xstrdup(update_data.sm_path); + + update_data.displaypath = get_submodule_displaypath(prefixed_path, prefix); + + determine_submodule_update_strategy(the_repository, update_data.just_cloned, + update_data.sm_path, update, + &update_data.update_strategy); + + free(prefixed_path); + + return do_run_update_procedure(&update_data); +} + static int resolve_relative_path(int argc, const char **argv, const char *prefix) { struct strbuf sb = STRBUF_INIT; @@ -2759,6 +3005,7 @@ static struct cmd_struct commands[] = { {"clone", module_clone, 0}, {"update-module-mode", module_update_module_mode, 0}, {"update-clone", update_clone, 0}, + {"run-update-procedure", run_update_procedure, 0}, {"ensure-core-worktree", ensure_core_worktree, 0}, {"relative-path", resolve_relative_path, 0}, {"resolve-relative-url", resolve_relative_url, 0}, diff --git a/git-submodule.sh b/git-submodule.sh index 1187c21260..4d5437f5c2 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -405,13 +405,6 @@ cmd_deinit() git ${wt_prefix:+-C "$wt_prefix"} submodule--helper deinit ${GIT_QUIET:+--quiet} ${prefix:+--prefix "$prefix"} ${force:+--force} ${deinit_all:+--all} -- "$@" } -is_tip_reachable () ( - sanitize_submodule_env && - cd "$1" && - rev=$(git rev-list -n 1 "$2" --not --all 2>/dev/null) && - test -z "$rev" -) - # usage: fetch_in_submodule <module_path> [<depth>] [<sha1>] # Because arguments are positional, use an empty string to omit <depth> # but include <sha1>. @@ -555,14 +548,13 @@ cmd_update() git submodule--helper ensure-core-worktree "$sm_path" || exit 1 - update_module=$(git submodule--helper update-module-mode $just_cloned "$sm_path" $update) - displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix") if test $just_cloned -eq 1 then subsha1= else + just_cloned= subsha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD) || die "$(eval_gettext "fatal: Unable to find current revision in submodule path '\$displaypath'")" @@ -583,70 +575,27 @@ cmd_update() die "$(eval_gettext "fatal: Unable to find current \${remote_name}/\${branch} revision in submodule path '\$sm_path'")" fi - if test "$subsha1" != "$sha1" || test -n "$force" - then - subforce=$force - # If we don't already have a -f flag and the submodule has never been checked out - if test -z "$subsha1" && test -z "$force" - then - subforce="-f" - fi + out=$(git submodule--helper run-update-procedure ${wt_prefix:+--prefix "$wt_prefix"} ${GIT_QUIET:+--quiet} ${force:+--force} ${just_cloned:+--just-cloned} ${nofetch:+--no-fetch} ${depth:+"$depth"} ${update:+--update "$update"} ${prefix:+--recursive-prefix "$prefix"} ${sha1:+--sha1 "$sha1"} ${subsha1:+--subsha1 "$subsha1"} "$sm_path") - if test -z "$nofetch" - then - # Run fetch only if $sha1 isn't present or it - # is not reachable from a ref. - is_tip_reachable "$sm_path" "$sha1" || - fetch_in_submodule "$sm_path" $depth || - say "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'; trying to directly fetch \$sha1:")" - - # Now we tried the usual fetch, but $sha1 may - # not be reachable from any of the refs - is_tip_reachable "$sm_path" "$sha1" || - fetch_in_submodule "$sm_path" "$depth" "$sha1" || - die "$(eval_gettext "Fetched in submodule path '\$displaypath', but it did not contain \$sha1. Direct fetching of that commit failed.")" - fi - - must_die_on_failure= - case "$update_module" in - checkout) - command="git checkout $subforce -q" - die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")" - ;; - rebase) - command="git rebase ${GIT_QUIET:+--quiet}" - die_msg="$(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")" - must_die_on_failure=yes - ;; - merge) - command="git merge ${GIT_QUIET:+--quiet}" - die_msg="$(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")" - must_die_on_failure=yes - ;; - !*) - command="${update_module#!}" - die_msg="$(eval_gettext "Execution of '\$command \$sha1' failed in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': '\$command \$sha1'")" - must_die_on_failure=yes - ;; - *) - die "$(eval_gettext "Invalid update mode '$update_module' for submodule path '$path'")" - esac - - if (sanitize_submodule_env; cd "$sm_path" && $command "$sha1") - then - say "$say_msg" - elif test -n "$must_die_on_failure" - then - die_with_status 2 "$die_msg" - else - err="${err};$die_msg" - continue - fi - fi + # exit codes for run-update-procedure: + # 0: update was successful, say command output + # 128: subcommand died during execution + # 1: update procedure failed and must die + # 2: update procedure failed, but should not die + # 3: no update procedure was run + res="$?" + case $res in + 0) + say "$out" + ;; + 2|128) + exit $res + ;; + 1) + err="${err};$out" + continue + ;; + esac if test -n "$recursive" then @@ -661,7 +610,7 @@ cmd_update() if test $res -gt 0 then die_msg="$(eval_gettext "fatal: Failed to recurse into submodule path '\$displaypath'")" - if test $res -ne 2 + if test $res -ne 2 && test $res -ne 128 then err="${err};$die_msg" continue -- 2.32.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [GSoC] [PATCH] submodule--helper: run update procedures from C 2021-07-22 13:40 [GSoC] [PATCH] submodule--helper: run update procedures from C Atharva Raykar @ 2021-07-23 9:37 ` Ævar Arnfjörð Bjarmason 2021-07-23 16:59 ` Atharva Raykar 2021-08-02 13:06 ` [GSoC] [PATCH v2] " Atharva Raykar 1 sibling, 1 reply; 13+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-07-23 9:37 UTC (permalink / raw) To: Atharva Raykar Cc: git, Emily Shaffer, Jonathan Nieder, Junio C Hamano, Christian Couder, Shourya Shukla, Kaartic Sivaraam, Prathamesh Chavan On Thu, Jul 22 2021, Atharva Raykar wrote: > +/* NEEDSWORK: try to do this without creating a new process */ > +static int is_tip_reachable(const char *path, struct object_id *sha1) > +{ > + struct child_process cp = CHILD_PROCESS_INIT; > + struct strbuf rev = STRBUF_INIT; > + char *sha1_hex = oid_to_hex(sha1); > + > + cp.git_cmd = 1; > + cp.dir = xstrdup(path); > + cp.no_stderr = 1; > + strvec_pushl(&cp.args, "rev-list", "-n", "1", sha1_hex, "--not", "--all", NULL); > + > + prepare_submodule_repo_env(&cp.env_array); > + > + if (capture_command(&cp, &rev, GIT_MAX_HEXSZ + 1) || rev.len) > + return 0; > + > + return 1; > +} I think it's fine to do this & leave out the NEEDSWORK commit, just briefly noting in the commit-message that we're not bothering with trying to reduce sub-command invocations. It can be done later if anyone cares. > [...] > + strbuf_addf(&die_msg, "fatal: Unable to checkout '%s' in submodule path '%s'\n", > + sha1, ud->displaypath); > + strbuf_addf(&say_msg, "Submodule path '%s': checked out '%s'\n", > + ud->displaypath, sha1); For all of these you're removing the translation from a message like: die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")" Which is easy enough to fix, just use _(), i.e.: strbuf_addf(&die_msg, _("Unable to checkout '%s' in submodule path '%s'\n"), [...] I removed the "fatal: " per a comment below... > + break; > + case SM_UPDATE_REBASE: > + cp.git_cmd = 1; > + strvec_push(&cp.args, "rebase"); > + if (ud->quiet) > + strvec_push(&cp.args, "--quiet"); > + strbuf_addf(&die_msg, "fatal: Unable to rebase '%s' in submodule path '%s'\n", > + sha1, ud->displaypath); > + strbuf_addf(&say_msg, "Submodule path '%s': rebased into '%s'\n", > + ud->displaypath, sha1); > + must_die_on_failure = 1; > + break; > + case SM_UPDATE_MERGE: > + cp.git_cmd = 1; > + strvec_push(&cp.args, "merge"); > + if (ud->quiet) > + strvec_push(&cp.args, "--quiet"); > + strbuf_addf(&die_msg, "fatal: Unable to merge '%s' in submodule path '%s'\n", > + sha1, ud->displaypath); > + strbuf_addf(&say_msg, "Submodule path '%s': merged in '%s'\n", > + ud->displaypath, sha1); > + must_die_on_failure = 1; > + break; > + case SM_UPDATE_COMMAND: > + /* NOTE: this does not handle quoted arguments */ > + strvec_split(&cp.args, ud->update_strategy.command); > + strbuf_addf(&die_msg, "fatal: Execution of '%s %s' failed in submodule path '%s'\n", > + ud->update_strategy.command, sha1, ud->displaypath); > + strbuf_addf(&say_msg, "Submodule path '%s': '%s %s'\n", > + ud->displaypath, ud->update_strategy.command, sha1); > + must_die_on_failure = 1; > + break; > + case SM_UPDATE_UNSPECIFIED: > + case SM_UPDATE_NONE: > + BUG("update strategy should have been specified"); > + } > + > + strvec_push(&cp.args, sha1); > + > + prepare_submodule_repo_env(&cp.env_array); > + > + if (run_command(&cp)) { > + if (must_die_on_failure) { > + retval = 2; > + fputs(_(die_msg.buf), stderr); > + goto cleanup; FWIW I'd find this clearer if we just kept track of what operation we ran above, and just in this run_command() && must_die_on_failure case started populating these die messages. But even if not the reason I dropped the "fatal: " is shouldn't we just call die() here directly? Why clean up when we're dying anyway? Also since I see you used _() here that won't work, i.e. with gettet if you happen to need to declare things earlier, you need to use N_() to mark the message for translation. The _() here won't find any message translated (unless the string happened to exactly match a thing in the *.po file for other reasons, not the case here). But in this case we can just die(msg) here and have used the _() above, or just call die() directly here not having made a die_msg we usually won't use... > +static int do_run_update_procedure(struct update_data *ud) > +{ > + if ((!is_null_oid(&ud->sha1) && !is_null_oid(&ud->subsha1) && !oideq(&ud->sha1, &ud->subsha1)) || > + is_null_oid(&ud->subsha1) || ud->force) { > + int subforce = is_null_oid(&ud->subsha1) || ud->force; > + > + if (!ud->nofetch) { > + /* > + * Run fetch only if `sha1` isn't present or it > + * is not reachable from a ref. > + */ > + if (!is_tip_reachable(ud->sm_path, &ud->sha1)) > + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && > + !ud->quiet) > + fprintf_ln(stderr, > + _("Unable to fetch in submodule path '%s'; " > + "trying to directly fetch %s:"), > + ud->displaypath, oid_to_hex(&ud->sha1)); > + /* > + * Now we tried the usual fetch, but `sha1` may > + * not be reachable from any of the refs. > + */ > + if (!is_tip_reachable(ud->sm_path, &ud->sha1)) > + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, &ud->sha1)) > + die(_("Fetched in submodule path '%s', but it did not " > + "contain %s. Direct fetching of that commit failed."), > + ud->displaypath, oid_to_hex(&ud->sha1)); > + } > + > + return run_update_command(ud, subforce); > + } > + > + return 3; > +} Since this has excatly one caller I think it's better for readability (less indentation) and flow to just remove that "return 3" condition and do the big "if" you have at the end, i.e. have this function start with "int subforce =" and... > static void update_submodule(struct update_clone_data *ucd) > { > fprintf(stdout, "dummy %s %d\t%s\n", > @@ -2379,6 +2552,79 @@ static int update_clone(int argc, const char **argv, const char *prefix) > return update_submodules(&suc); > } > > +static int run_update_procedure(int argc, const char **argv, const char *prefix) > +{ > + int force = 0, quiet = 0, nofetch = 0, just_cloned = 0; > + char *prefixed_path, *update = NULL; > + char *sha1 = NULL, *subsha1 = NULL; > + struct update_data update_data = UPDATE_DATA_INIT; > + > + struct option options[] = { > + OPT__QUIET(&quiet, N_("suppress output for update by rebase or merge")), > + OPT__FORCE(&force, N_("force checkout updates"), 0), > + OPT_BOOL('N', "no-fetch", &nofetch, > + N_("don't fetch new objects from the remote site")), > + OPT_BOOL(0, "just-cloned", &just_cloned, > + N_("overrides update mode in case the repository is a fresh clone")), > + OPT_INTEGER(0, "depth", &update_data.depth, N_("depth for shallow fetch")), > + OPT_STRING(0, "prefix", &prefix, > + N_("path"), > + N_("path into the working tree")), > + OPT_STRING(0, "update", &update, > + N_("string"), > + N_("rebase, merge, checkout or none")), > + OPT_STRING(0, "recursive-prefix", &update_data.recursive_prefix, N_("path"), > + N_("path into the working tree, across nested " > + "submodule boundaries")), > + OPT_STRING(0, "sha1", &sha1, N_("string"), > + N_("SHA1 expected by superproject")), > + OPT_STRING(0, "subsha1", &subsha1, N_("string"), > + N_("SHA1 of submodule's HEAD")), > + OPT_END() > + }; > + > + const char *const usage[] = { > + N_("git submodule--helper run-update-procedure [<options>] <path>"), > + NULL > + }; > + > + argc = parse_options(argc, argv, prefix, options, usage, 0); > + > + if (argc != 1) > + usage_with_options(usage, options); > + update_data.force = !!force; > + update_data.quiet = !!quiet; > + update_data.nofetch = !!nofetch; > + update_data.just_cloned = !!just_cloned; For all of these just pass the reference to the update_data variable directly in the OPT_*(). No need to set an "int force", only to copy it over to update_data.force. Let's just use the latter only. > + > + if (sha1) > + get_oid_hex(sha1, &update_data.sha1); > + else > + oidcpy(&update_data.sha1, null_oid()); Nit: Even if a historical option forces us to support --sha1, let's use "oid" for the variable etc. But in this case the --sha1 is new, no? Let's use --object-id or --oid (whatever is more common, I didn't check)> > + > + if (subsha1) > + get_oid_hex(subsha1, &update_data.subsha1); > + else > + oidcpy(&update_data.subsha1, null_oid()); Ditto. Also I think for both of these you can re-use parse_opt_object_id. See "squash-onto" and "upstream" in builtin/rebase.c. Then you just supply an oid variable directly and let that helper do all the get_oid etc. > + if (update_data.recursive_prefix) > + prefixed_path = xstrfmt("%s%s", update_data.recursive_prefix, update_data.sm_path); > + else > + prefixed_path = xstrdup(update_data.sm_path); > + > + update_data.displaypath = get_submodule_displaypath(prefixed_path, prefix); > + > + determine_submodule_update_strategy(the_repository, update_data.just_cloned, > + update_data.sm_path, update, > + &update_data.update_strategy); > + > + free(prefixed_path); > + > + return do_run_update_procedure(&update_data); ....(continued from above) ...here just do: if (that big if condition) return do_run_update_procedure(&update_data); else return 3; ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GSoC] [PATCH] submodule--helper: run update procedures from C 2021-07-23 9:37 ` Ævar Arnfjörð Bjarmason @ 2021-07-23 16:59 ` Atharva Raykar 2021-08-04 8:35 ` Atharva Raykar 0 siblings, 1 reply; 13+ messages in thread From: Atharva Raykar @ 2021-07-23 16:59 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Git List, Emily Shaffer, Jonathan Nieder, Junio C Hamano, Christian Couder, Shourya Shukla, Kaartic Sivaraam, Prathamesh Chavan On 23-Jul-2021, at 15:07, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > > On Thu, Jul 22 2021, Atharva Raykar wrote: > >> +/* NEEDSWORK: try to do this without creating a new process */ >> +static int is_tip_reachable(const char *path, struct object_id *sha1) >> +{ >> + struct child_process cp = CHILD_PROCESS_INIT; >> + struct strbuf rev = STRBUF_INIT; >> + char *sha1_hex = oid_to_hex(sha1); >> + >> + cp.git_cmd = 1; >> + cp.dir = xstrdup(path); >> + cp.no_stderr = 1; >> + strvec_pushl(&cp.args, "rev-list", "-n", "1", sha1_hex, "--not", "--all", NULL); >> + >> + prepare_submodule_repo_env(&cp.env_array); >> + >> + if (capture_command(&cp, &rev, GIT_MAX_HEXSZ + 1) || rev.len) >> + return 0; >> + >> + return 1; >> +} > > I think it's fine to do this & leave out the NEEDSWORK commit, just > briefly noting in the commit-message that we're not bothering with > trying to reduce sub-command invocations. It can be done later if anyone > cares. Okay, fair enough. I realise I have not been consistent about leaving such comments in all my conversion efforts. >> [...] >> + strbuf_addf(&die_msg, "fatal: Unable to checkout '%s' in submodule path '%s'\n", >> + sha1, ud->displaypath); >> + strbuf_addf(&say_msg, "Submodule path '%s': checked out '%s'\n", >> + ud->displaypath, sha1); > > For all of these you're removing the translation from a message like: > > die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")" > > Which is easy enough to fix, just use _(), i.e.: > > strbuf_addf(&die_msg, _("Unable to checkout '%s' in submodule path '%s'\n"), [...] Thanks for catching this. > I removed the "fatal: " per a comment below... > >> + break; >> + case SM_UPDATE_REBASE: >> + cp.git_cmd = 1; >> + strvec_push(&cp.args, "rebase"); >> + if (ud->quiet) >> + strvec_push(&cp.args, "--quiet"); >> + strbuf_addf(&die_msg, "fatal: Unable to rebase '%s' in submodule path '%s'\n", >> + sha1, ud->displaypath); >> + strbuf_addf(&say_msg, "Submodule path '%s': rebased into '%s'\n", >> + ud->displaypath, sha1); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_MERGE: >> + cp.git_cmd = 1; >> + strvec_push(&cp.args, "merge"); >> + if (ud->quiet) >> + strvec_push(&cp.args, "--quiet"); >> + strbuf_addf(&die_msg, "fatal: Unable to merge '%s' in submodule path '%s'\n", >> + sha1, ud->displaypath); >> + strbuf_addf(&say_msg, "Submodule path '%s': merged in '%s'\n", >> + ud->displaypath, sha1); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_COMMAND: >> + /* NOTE: this does not handle quoted arguments */ >> + strvec_split(&cp.args, ud->update_strategy.command); >> + strbuf_addf(&die_msg, "fatal: Execution of '%s %s' failed in submodule path '%s'\n", >> + ud->update_strategy.command, sha1, ud->displaypath); >> + strbuf_addf(&say_msg, "Submodule path '%s': '%s %s'\n", >> + ud->displaypath, ud->update_strategy.command, sha1); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_UNSPECIFIED: >> + case SM_UPDATE_NONE: >> + BUG("update strategy should have been specified"); >> + } >> + >> + strvec_push(&cp.args, sha1); >> + >> + prepare_submodule_repo_env(&cp.env_array); >> + >> + if (run_command(&cp)) { >> + if (must_die_on_failure) { >> + retval = 2; >> + fputs(_(die_msg.buf), stderr); >> + goto cleanup; > > FWIW I'd find this clearer if we just kept track of what operation we > ran above, and just in this run_command() && must_die_on_failure case > started populating these die messages. Could you clarify what you meant here? I can't think of a way to do this at the moment without introducing one or more redundant 'switch ()' statements. This is what I interpreted your statement as: ... /* we added the arguments to 'cp' already in the switch () above... */ if (run_command(&cp)) { if (must_die_on_failure) { switch (ud->update_strategy.type) { case SM_UPDATE_CHECKOUT: die(_("Execution of...")); break; case ... case ... } /* * This signifies to the caller in shell that * the command failed without dying */ retval = 1; goto cleanup; } /* ...another switch to figure out which say_msg() to use? */ ... Or did you mean that instead of storing the die_msg in entirety at the first switch, I instead store just the action, and in the conditional finally have something like...? die(_("Unable to %s '%s' in submodule path '%s'"), action, sha1, ud->displaypath); ...where 'action' is a value like "merge", "checkout" etc, set in the first 'switch'. I am guessing the motivation for this is to make more clear which error message will be shown for each case? > But even if not the reason I dropped the "fatal: " is shouldn't we just > call die() here directly? Why clean up when we're dying anyway? The reason I did not call die() directly is because the original shell version originally specifically exited with 2 in the 'must_die_on_error' case while die() in C exits with 128. I think it would be more semantically correct for me to have done something like an 'exit(2)' instead of cleaning up and returning. That said, it occured to me that the receiving end does not do anything special with the different exit code, other than just pass it on to another exit, ie, this bit: + 2|128) + exit $res + ;; This code currently sits in a weird middle position where it is neither fully matching the exit codes as before the conversion (where a failure unrelated to the command execution should have exited with 1, not 128), nor is it having a complete disregard for their exact value, which would somewhat simplify the failure handling code. I wonder if any script in a machine somewhere cares about the exact exit value of 'submodule add'. I suspect it would be relatively harmless if I just follow your suggestion and just die() on command execution failure... Thoughts? > Also since I see you used _() here that won't work, i.e. with gettet if > you happen to need to declare things earlier, you need to use N_() to > mark the message for translation. > > The _() here won't find any message translated (unless the string > happened to exactly match a thing in the *.po file for other reasons, > not the case here). > > But in this case we can just die(msg) here and have used the _() above, > or just call die() directly here not having made a die_msg we usually > won't use... Okay, I'll ensure that the translations are marked properly when I reroll. >> +static int do_run_update_procedure(struct update_data *ud) >> +{ >> + if ((!is_null_oid(&ud->sha1) && !is_null_oid(&ud->subsha1) && !oideq(&ud->sha1, &ud->subsha1)) || >> + is_null_oid(&ud->subsha1) || ud->force) { >> + int subforce = is_null_oid(&ud->subsha1) || ud->force; >> + >> + if (!ud->nofetch) { >> + /* >> + * Run fetch only if `sha1` isn't present or it >> + * is not reachable from a ref. >> + */ >> + if (!is_tip_reachable(ud->sm_path, &ud->sha1)) >> + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && >> + !ud->quiet) >> + fprintf_ln(stderr, >> + _("Unable to fetch in submodule path '%s'; " >> + "trying to directly fetch %s:"), >> + ud->displaypath, oid_to_hex(&ud->sha1)); >> + /* >> + * Now we tried the usual fetch, but `sha1` may >> + * not be reachable from any of the refs. >> + */ >> + if (!is_tip_reachable(ud->sm_path, &ud->sha1)) >> + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, &ud->sha1)) >> + die(_("Fetched in submodule path '%s', but it did not " >> + "contain %s. Direct fetching of that commit failed."), >> + ud->displaypath, oid_to_hex(&ud->sha1)); >> + } >> + >> + return run_update_command(ud, subforce); >> + } >> + >> + return 3; >> +} > > Since this has excatly one caller I think it's better for readability > (less indentation) and flow to just remove that "return 3" condition and > do the big "if" you have at the end, i.e. have this function start with > "int subforce =" and... Yeah, that would be better. I'll change that. >> static void update_submodule(struct update_clone_data *ucd) >> { >> fprintf(stdout, "dummy %s %d\t%s\n", >> @@ -2379,6 +2552,79 @@ static int update_clone(int argc, const char **argv, const char *prefix) >> return update_submodules(&suc); >> } >> >> +static int run_update_procedure(int argc, const char **argv, const char *prefix) >> +{ >> + int force = 0, quiet = 0, nofetch = 0, just_cloned = 0; >> + char *prefixed_path, *update = NULL; >> + char *sha1 = NULL, *subsha1 = NULL; >> + struct update_data update_data = UPDATE_DATA_INIT; >> + >> + struct option options[] = { >> + OPT__QUIET(&quiet, N_("suppress output for update by rebase or merge")), >> + OPT__FORCE(&force, N_("force checkout updates"), 0), >> + OPT_BOOL('N', "no-fetch", &nofetch, >> + N_("don't fetch new objects from the remote site")), >> + OPT_BOOL(0, "just-cloned", &just_cloned, >> + N_("overrides update mode in case the repository is a fresh clone")), >> + OPT_INTEGER(0, "depth", &update_data.depth, N_("depth for shallow fetch")), >> + OPT_STRING(0, "prefix", &prefix, >> + N_("path"), >> + N_("path into the working tree")), >> + OPT_STRING(0, "update", &update, >> + N_("string"), >> + N_("rebase, merge, checkout or none")), >> + OPT_STRING(0, "recursive-prefix", &update_data.recursive_prefix, N_("path"), >> + N_("path into the working tree, across nested " >> + "submodule boundaries")), >> + OPT_STRING(0, "sha1", &sha1, N_("string"), >> + N_("SHA1 expected by superproject")), >> + OPT_STRING(0, "subsha1", &subsha1, N_("string"), >> + N_("SHA1 of submodule's HEAD")), >> + OPT_END() >> + }; >> + >> + const char *const usage[] = { >> + N_("git submodule--helper run-update-procedure [<options>] <path>"), >> + NULL >> + }; >> + >> + argc = parse_options(argc, argv, prefix, options, usage, 0); >> + >> + if (argc != 1) >> + usage_with_options(usage, options); >> + update_data.force = !!force; >> + update_data.quiet = !!quiet; >> + update_data.nofetch = !!nofetch; >> + update_data.just_cloned = !!just_cloned; > > For all of these just pass the reference to the update_data variable > directly in the OPT_*(). No need to set an "int force", only to copy it > over to update_data.force. Let's just use the latter only. Hmm, I'm trying to remember why the single bit values are treated this way in this whole file... ...there seems to be no good reason for it. The API docs for parse options state that OPT_BOOL() is guaranteed to return either zero or one, so that double negation does look unnecessary. >> + >> + if (sha1) >> + get_oid_hex(sha1, &update_data.sha1); >> + else >> + oidcpy(&update_data.sha1, null_oid()); > > Nit: Even if a historical option forces us to support --sha1, let's use > "oid" for the variable etc. But in this case the --sha1 is new, no? > Let's use --object-id or --oid (whatever is more common, I didn't > check)> Okay. I can see the confusion this may cause. >> + >> + if (subsha1) >> + get_oid_hex(subsha1, &update_data.subsha1); >> + else >> + oidcpy(&update_data.subsha1, null_oid()); > > Ditto. Also I think for both of these you can re-use > parse_opt_object_id. See "squash-onto" and "upstream" in > builtin/rebase.c. > > Then you just supply an oid variable directly and let that helper do all > the get_oid etc. Thanks for pointing me to this! >> + if (update_data.recursive_prefix) >> + prefixed_path = xstrfmt("%s%s", update_data.recursive_prefix, update_data.sm_path); >> + else >> + prefixed_path = xstrdup(update_data.sm_path); >> + >> + update_data.displaypath = get_submodule_displaypath(prefixed_path, prefix); >> + >> + determine_submodule_update_strategy(the_repository, update_data.just_cloned, >> + update_data.sm_path, update, >> + &update_data.update_strategy); >> + >> + free(prefixed_path); >> + >> + return do_run_update_procedure(&update_data); > > ....(continued from above) ...here just do: > > if (that big if condition) > return do_run_update_procedure(&update_data); > else > return 3; Okay. Thanks for the review! ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GSoC] [PATCH] submodule--helper: run update procedures from C 2021-07-23 16:59 ` Atharva Raykar @ 2021-08-04 8:35 ` Atharva Raykar 0 siblings, 0 replies; 13+ messages in thread From: Atharva Raykar @ 2021-08-04 8:35 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Git List, Emily Shaffer, Jonathan Nieder, Junio C Hamano, Christian Couder, Shourya Shukla, Kaartic Sivaraam, Prathamesh Chavan Atharva Raykar <raykar.ath@gmail.com> writes: >> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: >>> static void update_submodule(struct update_clone_data *ucd) >>> { >>> fprintf(stdout, "dummy %s %d\t%s\n", >>> @@ -2379,6 +2552,79 @@ static int update_clone(int argc, const char **argv, const char *prefix) >>> return update_submodules(&suc); >>> } >>> >>> +static int run_update_procedure(int argc, const char **argv, const char *prefix) >>> +{ >>> + int force = 0, quiet = 0, nofetch = 0, just_cloned = 0; >>> + char *prefixed_path, *update = NULL; >>> + char *sha1 = NULL, *subsha1 = NULL; >>> + struct update_data update_data = UPDATE_DATA_INIT; >>> + >>> + struct option options[] = { >>> + OPT__QUIET(&quiet, N_("suppress output for update by rebase or merge")), >>> + OPT__FORCE(&force, N_("force checkout updates"), 0), >>> + OPT_BOOL('N', "no-fetch", &nofetch, >>> + N_("don't fetch new objects from the remote site")), >>> + OPT_BOOL(0, "just-cloned", &just_cloned, >>> + N_("overrides update mode in case the repository is a fresh clone")), >>> + OPT_INTEGER(0, "depth", &update_data.depth, N_("depth for shallow fetch")), >>> + OPT_STRING(0, "prefix", &prefix, >>> + N_("path"), >>> + N_("path into the working tree")), >>> + OPT_STRING(0, "update", &update, >>> + N_("string"), >>> + N_("rebase, merge, checkout or none")), >>> + OPT_STRING(0, "recursive-prefix", &update_data.recursive_prefix, N_("path"), >>> + N_("path into the working tree, across nested " >>> + "submodule boundaries")), >>> + OPT_STRING(0, "sha1", &sha1, N_("string"), >>> + N_("SHA1 expected by superproject")), >>> + OPT_STRING(0, "subsha1", &subsha1, N_("string"), >>> + N_("SHA1 of submodule's HEAD")), >>> + OPT_END() >>> + }; >>> + >>> + const char *const usage[] = { >>> + N_("git submodule--helper run-update-procedure [<options>] <path>"), >>> + NULL >>> + }; >>> + >>> + argc = parse_options(argc, argv, prefix, options, usage, 0); >>> + >>> + if (argc != 1) >>> + usage_with_options(usage, options); >>> + update_data.force = !!force; >>> + update_data.quiet = !!quiet; >>> + update_data.nofetch = !!nofetch; >>> + update_data.just_cloned = !!just_cloned; >> >> For all of these just pass the reference to the update_data variable >> directly in the OPT_*(). No need to set an "int force", only to copy it >> over to update_data.force. Let's just use the latter only. > > Hmm, I'm trying to remember why the single bit values are treated this way > in this whole file... > > ...there seems to be no good reason for it. The API docs for parse options > state that OPT_BOOL() is guaranteed to return either zero or one, so that > double negation does look unnecessary. I forgot to mention why I did not address this change in my v3 patch. The reason why we are handling boolean values this way is because they are declared as bitfields in the 'update_data' struct. Since we cannot take the address of bitfields, we have to use a different variable to store when using 'parse_options()'. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [GSoC] [PATCH v2] submodule--helper: run update procedures from C 2021-07-22 13:40 [GSoC] [PATCH] submodule--helper: run update procedures from C Atharva Raykar 2021-07-23 9:37 ` Ævar Arnfjörð Bjarmason @ 2021-08-02 13:06 ` Atharva Raykar 2021-08-02 18:50 ` Shourya Shukla 2021-08-13 7:56 ` [GSoC] [PATCH v3] " Atharva Raykar 1 sibling, 2 replies; 13+ messages in thread From: Atharva Raykar @ 2021-08-02 13:06 UTC (permalink / raw) To: git Cc: Atharva Raykar, Emily Shaffer, Jonathan Nieder, Junio C Hamano, Christian Couder, Shourya Shukla, Kaartic Sivaraam, Prathamesh Chavan Add a new submodule--helper subcommand `run-update-procedure` that runs the update procedure if the SHA1 of the submodule does not match what the superproject expects. This is an intermediate change that works towards total conversion of `submodule update` from shell to C. Specific error codes are returned so that the shell script calling the subcommand can take a decision on the control flow, and preserve the error messages across subsequent recursive calls of `cmd_update`. This patch could have been approached differently, by first changing the `is_tip_reachable` and `fetch_in_submodule` shell functions to be `submodule--helper` subcommands, and then following up with a patch that introduces the `run-update-procedure` subcommand. We have not done it like that because those functions are trivial enough to convert directly along with these other changes. This lets us avoid the boilerplate and the cleanup patches that will need to be introduced in following that approach. This change is more focused on doing a faithful conversion, so for now we are not too concerned with trying to reduce subprocess spawns. Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Shourya Shukla <periperidip@gmail.com> --- Notable changes since v1: * Modified the code structure in submodule--helper.c:run_update_command(), while fixing problems with the translation marks. * Renamed '--sha1' and '--subsha1' options to '--oid' and '--suboid' to since the argument is parsed into an object_id struct, not plain sha1 data. * Used option callbacks to parse the SHA1 arguments directly. * Moved the conditional out of 'do_run_update_procedure()'. Feedback required: Ævar felt that it would be clearer to populate the 'fatal' messages after the run_command() operation in 'run_update_command()', to make it more readable [1]. I have attempted something like that here, and it has led to a lot more duplicated 'switch' statements, which feels suboptimal. I'd appreciate suggestions to make it more legible. [1] https://lore.kernel.org/git/87r1fps63r.fsf@evledraar.gmail.com/ Fetch-it-Via: git fetch https://github.com/tfidfwastaken/git submodule-run-update-proc-list-2 builtin/submodule--helper.c | 253 ++++++++++++++++++++++++++++++++++++ git-submodule.sh | 106 +++++---------- 2 files changed, 286 insertions(+), 73 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index d55f6262e9..b9c40324d0 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -2029,6 +2029,20 @@ struct submodule_update_clone { .max_jobs = 1, \ } +struct update_data { + const char *recursive_prefix; + const char *sm_path; + const char *displaypath; + struct object_id oid; + struct object_id suboid; + struct submodule_update_strategy update_strategy; + int depth; + unsigned int force: 1; + unsigned int quiet: 1; + unsigned int nofetch: 1; + unsigned int just_cloned: 1; +}; +#define UPDATE_DATA_INIT { .update_strategy = SUBMODULE_UPDATE_STRATEGY_INIT } static void next_submodule_warn_missing(struct submodule_update_clone *suc, struct strbuf *out, const char *displaypath) @@ -2282,6 +2296,175 @@ static int git_update_clone_config(const char *var, const char *value, return 0; } +static int is_tip_reachable(const char *path, struct object_id *oid) +{ + struct child_process cp = CHILD_PROCESS_INIT; + struct strbuf rev = STRBUF_INIT; + char *hex = oid_to_hex(oid); + + cp.git_cmd = 1; + cp.dir = xstrdup(path); + cp.no_stderr = 1; + strvec_pushl(&cp.args, "rev-list", "-n", "1", hex, "--not", "--all", NULL); + + prepare_submodule_repo_env(&cp.env_array); + + if (capture_command(&cp, &rev, GIT_MAX_HEXSZ + 1) || rev.len) + return 0; + + return 1; +} + +static int fetch_in_submodule(const char *module_path, int depth, int quiet, struct object_id *oid) +{ + struct child_process cp = CHILD_PROCESS_INIT; + + prepare_submodule_repo_env(&cp.env_array); + cp.git_cmd = 1; + cp.dir = xstrdup(module_path); + + strvec_push(&cp.args, "fetch"); + if (quiet) + strvec_push(&cp.args, "--quiet"); + if (depth) + strvec_pushf(&cp.args, "--depth=%d", depth); + if (oid) { + char *hex = oid_to_hex(oid); + char *remote = get_default_remote(); + strvec_pushl(&cp.args, remote, hex, NULL); + } + + return run_command(&cp); +} + +static int run_update_command(struct update_data *ud, int subforce) +{ + struct child_process cp = CHILD_PROCESS_INIT; + char *oid = oid_to_hex(&ud->oid); + int must_die_on_failure = 0; + + cp.dir = xstrdup(ud->sm_path); + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: + cp.git_cmd = 1; + strvec_pushl(&cp.args, "checkout", "-q", NULL); + if (subforce) + strvec_push(&cp.args, "-f"); + break; + case SM_UPDATE_REBASE: + cp.git_cmd = 1; + strvec_push(&cp.args, "rebase"); + if (ud->quiet) + strvec_push(&cp.args, "--quiet"); + must_die_on_failure = 1; + break; + case SM_UPDATE_MERGE: + cp.git_cmd = 1; + strvec_push(&cp.args, "merge"); + if (ud->quiet) + strvec_push(&cp.args, "--quiet"); + must_die_on_failure = 1; + break; + case SM_UPDATE_COMMAND: + /* NOTE: this does not handle quoted arguments */ + strvec_split(&cp.args, ud->update_strategy.command); + must_die_on_failure = 1; + break; + case SM_UPDATE_UNSPECIFIED: + case SM_UPDATE_NONE: + BUG("update strategy should have been specified"); + } + + strvec_push(&cp.args, oid); + + prepare_submodule_repo_env(&cp.env_array); + + if (run_command(&cp)) { + if (must_die_on_failure) { + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: + die(_("Unable to checkout '%s' in submodule path '%s'"), + oid, ud->displaypath); + break; + case SM_UPDATE_REBASE: + die(_("Unable to rebase '%s' in submodule path '%s'"), + oid, ud->displaypath); + break; + case SM_UPDATE_MERGE: + die(_("Unable to merge '%s' in submodule path '%s'"), + oid, ud->displaypath); + break; + case SM_UPDATE_COMMAND: + die(_("Execution of '%s %s' failed in submodule path '%s'"), + ud->update_strategy.command, oid, ud->displaypath); + break; + case SM_UPDATE_UNSPECIFIED: + case SM_UPDATE_NONE: + BUG("update strategy should have been specified"); + } + } + /* + * This signifies to the caller in shell that + * the command failed without dying + */ + return 1; + } + + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: + printf(_("Submodule path '%s': checked out '%s'\n"), + ud->displaypath, oid); + break; + case SM_UPDATE_REBASE: + printf(_("Submodule path '%s': rebased into '%s'\n"), + ud->displaypath, oid); + break; + case SM_UPDATE_MERGE: + printf(_("Submodule path '%s': merged in '%s'\n"), + ud->displaypath, oid); + break; + case SM_UPDATE_COMMAND: + printf(_("Submodule path '%s': '%s %s'\n"), + ud->displaypath, ud->update_strategy.command, oid); + break; + case SM_UPDATE_UNSPECIFIED: + case SM_UPDATE_NONE: + BUG("update strategy should have been specified"); + } + + return 0; +} + +static int do_run_update_procedure(struct update_data *ud) +{ + int subforce = is_null_oid(&ud->suboid) || ud->force; + + if (!ud->nofetch) { + /* + * Run fetch only if `oid` isn't present or it + * is not reachable from a ref. + */ + if (!is_tip_reachable(ud->sm_path, &ud->oid)) + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && + !ud->quiet) + fprintf_ln(stderr, + _("Unable to fetch in submodule path '%s'; " + "trying to directly fetch %s:"), + ud->displaypath, oid_to_hex(&ud->oid)); + /* + * Now we tried the usual fetch, but `oid` may + * not be reachable from any of the refs. + */ + if (!is_tip_reachable(ud->sm_path, &ud->oid)) + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, &ud->oid)) + die(_("Fetched in submodule path '%s', but it did not " + "contain %s. Direct fetching of that commit failed."), + ud->displaypath, oid_to_hex(&ud->oid)); + } + + return run_update_command(ud, subforce); +} + static void update_submodule(struct update_clone_data *ucd) { fprintf(stdout, "dummy %s %d\t%s\n", @@ -2379,6 +2562,75 @@ static int update_clone(int argc, const char **argv, const char *prefix) return update_submodules(&suc); } +static int run_update_procedure(int argc, const char **argv, const char *prefix) +{ + int force = 0, quiet = 0, nofetch = 0, just_cloned = 0; + char *prefixed_path, *update = NULL; + struct update_data update_data = UPDATE_DATA_INIT; + + struct option options[] = { + OPT__QUIET(&quiet, N_("suppress output for update by rebase or merge")), + OPT__FORCE(&force, N_("force checkout updates"), 0), + OPT_BOOL('N', "no-fetch", &nofetch, + N_("don't fetch new objects from the remote site")), + OPT_BOOL(0, "just-cloned", &just_cloned, + N_("overrides update mode in case the repository is a fresh clone")), + OPT_INTEGER(0, "depth", &update_data.depth, N_("depth for shallow fetch")), + OPT_STRING(0, "prefix", &prefix, + N_("path"), + N_("path into the working tree")), + OPT_STRING(0, "update", &update, + N_("string"), + N_("rebase, merge, checkout or none")), + OPT_STRING(0, "recursive-prefix", &update_data.recursive_prefix, N_("path"), + N_("path into the working tree, across nested " + "submodule boundaries")), + OPT_CALLBACK_F(0, "oid", &update_data.oid, N_("sha1"), + N_("SHA1 expected by superproject"), PARSE_OPT_NONEG, + parse_opt_object_id), + OPT_CALLBACK_F(0, "suboid", &update_data.suboid, N_("subsha1"), + N_("SHA1 of submodule's HEAD"), PARSE_OPT_NONEG, + parse_opt_object_id), + OPT_END() + }; + + const char *const usage[] = { + N_("git submodule--helper run-update-procedure [<options>] <path>"), + NULL + }; + + argc = parse_options(argc, argv, prefix, options, usage, 0); + + if (argc != 1) + usage_with_options(usage, options); + + update_data.force = !!force; + update_data.quiet = !!quiet; + update_data.nofetch = !!nofetch; + update_data.just_cloned = !!just_cloned; + update_data.sm_path = argv[0]; + + if (update_data.recursive_prefix) + prefixed_path = xstrfmt("%s%s", update_data.recursive_prefix, update_data.sm_path); + else + prefixed_path = xstrdup(update_data.sm_path); + + update_data.displaypath = get_submodule_displaypath(prefixed_path, prefix); + + determine_submodule_update_strategy(the_repository, update_data.just_cloned, + update_data.sm_path, update, + &update_data.update_strategy); + + free(prefixed_path); + + if ((!is_null_oid(&update_data.oid) && !is_null_oid(&update_data.suboid) && + !oideq(&update_data.oid, &update_data.suboid)) || + is_null_oid(&update_data.suboid) || update_data.force) + return do_run_update_procedure(&update_data); + + return 3; +} + static int resolve_relative_path(int argc, const char **argv, const char *prefix) { struct strbuf sb = STRBUF_INIT; @@ -2759,6 +3011,7 @@ static struct cmd_struct commands[] = { {"clone", module_clone, 0}, {"update-module-mode", module_update_module_mode, 0}, {"update-clone", update_clone, 0}, + {"run-update-procedure", run_update_procedure, 0}, {"ensure-core-worktree", ensure_core_worktree, 0}, {"relative-path", resolve_relative_path, 0}, {"resolve-relative-url", resolve_relative_url, 0}, diff --git a/git-submodule.sh b/git-submodule.sh index 1187c21260..0bb4514859 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -405,13 +405,6 @@ cmd_deinit() git ${wt_prefix:+-C "$wt_prefix"} submodule--helper deinit ${GIT_QUIET:+--quiet} ${prefix:+--prefix "$prefix"} ${force:+--force} ${deinit_all:+--all} -- "$@" } -is_tip_reachable () ( - sanitize_submodule_env && - cd "$1" && - rev=$(git rev-list -n 1 "$2" --not --all 2>/dev/null) && - test -z "$rev" -) - # usage: fetch_in_submodule <module_path> [<depth>] [<sha1>] # Because arguments are positional, use an empty string to omit <depth> # but include <sha1>. @@ -555,14 +548,13 @@ cmd_update() git submodule--helper ensure-core-worktree "$sm_path" || exit 1 - update_module=$(git submodule--helper update-module-mode $just_cloned "$sm_path" $update) - displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix") if test $just_cloned -eq 1 then subsha1= else + just_cloned= subsha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD) || die "$(eval_gettext "fatal: Unable to find current revision in submodule path '\$displaypath'")" @@ -583,70 +575,38 @@ cmd_update() die "$(eval_gettext "fatal: Unable to find current \${remote_name}/\${branch} revision in submodule path '\$sm_path'")" fi - if test "$subsha1" != "$sha1" || test -n "$force" - then - subforce=$force - # If we don't already have a -f flag and the submodule has never been checked out - if test -z "$subsha1" && test -z "$force" - then - subforce="-f" - fi + out=$(git submodule--helper run-update-procedure \ + ${wt_prefix:+--prefix "$wt_prefix"} \ + ${GIT_QUIET:+--quiet} \ + ${force:+--force} \ + ${just_cloned:+--just-cloned} \ + ${nofetch:+--no-fetch} \ + ${depth:+"$depth"} \ + ${update:+--update "$update"} \ + ${prefix:+--recursive-prefix "$prefix"} \ + ${sha1:+--oid "$sha1"} \ + ${subsha1:+--suboid "$subsha1"} \ + "--" \ + "$sm_path") - if test -z "$nofetch" - then - # Run fetch only if $sha1 isn't present or it - # is not reachable from a ref. - is_tip_reachable "$sm_path" "$sha1" || - fetch_in_submodule "$sm_path" $depth || - say "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'; trying to directly fetch \$sha1:")" - - # Now we tried the usual fetch, but $sha1 may - # not be reachable from any of the refs - is_tip_reachable "$sm_path" "$sha1" || - fetch_in_submodule "$sm_path" "$depth" "$sha1" || - die "$(eval_gettext "Fetched in submodule path '\$displaypath', but it did not contain \$sha1. Direct fetching of that commit failed.")" - fi - - must_die_on_failure= - case "$update_module" in - checkout) - command="git checkout $subforce -q" - die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")" - ;; - rebase) - command="git rebase ${GIT_QUIET:+--quiet}" - die_msg="$(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")" - must_die_on_failure=yes - ;; - merge) - command="git merge ${GIT_QUIET:+--quiet}" - die_msg="$(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")" - must_die_on_failure=yes - ;; - !*) - command="${update_module#!}" - die_msg="$(eval_gettext "Execution of '\$command \$sha1' failed in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': '\$command \$sha1'")" - must_die_on_failure=yes - ;; - *) - die "$(eval_gettext "Invalid update mode '$update_module' for submodule path '$path'")" - esac - - if (sanitize_submodule_env; cd "$sm_path" && $command "$sha1") - then - say "$say_msg" - elif test -n "$must_die_on_failure" - then - die_with_status 2 "$die_msg" - else - err="${err};$die_msg" - continue - fi - fi + # exit codes for run-update-procedure: + # 0: update was successful, say command output + # 128: subcommand died during execution + # 1: update procedure failed, but should not die + # 3: no update procedure was run + res="$?" + case $res in + 0) + say "$out" + ;; + 128) + exit $res + ;; + 1) + err="${err};$out" + continue + ;; + esac if test -n "$recursive" then @@ -661,7 +621,7 @@ cmd_update() if test $res -gt 0 then die_msg="$(eval_gettext "fatal: Failed to recurse into submodule path '\$displaypath'")" - if test $res -ne 2 + if test $res -ne 2 && test $res -ne 128 then err="${err};$die_msg" continue -- 2.32.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [GSoC] [PATCH v2] submodule--helper: run update procedures from C 2021-08-02 13:06 ` [GSoC] [PATCH v2] " Atharva Raykar @ 2021-08-02 18:50 ` Shourya Shukla 2021-08-03 8:46 ` Atharva Raykar 2021-08-03 10:07 ` Atharva Raykar 2021-08-13 7:56 ` [GSoC] [PATCH v3] " Atharva Raykar 1 sibling, 2 replies; 13+ messages in thread From: Shourya Shukla @ 2021-08-02 18:50 UTC (permalink / raw) To: Atharva Raykar Cc: Git List, Emily Shaffer, Jonathan Nieder, Junio C Hamano, Christian Couder, Kaartic Sivaraam, Prathamesh Chavan Le lun. 2 août 2021 à 18:36, Atharva Raykar <raykar.ath@gmail.com> a écrit : > > Add a new submodule--helper subcommand `run-update-procedure` that runs > the update procedure if the SHA1 of the submodule does not match what > the superproject expects. > > This is an intermediate change that works towards total conversion of > `submodule update` from shell to C. > > Specific error codes are returned so that the shell script calling the > subcommand can take a decision on the control flow, and preserve the > error messages across subsequent recursive calls of `cmd_update`. > > This patch could have been approached differently, by first changing the > `is_tip_reachable` and `fetch_in_submodule` shell functions to be > `submodule--helper` subcommands, and then following up with a patch that > introduces the `run-update-procedure` subcommand. We have not done it > like that because those functions are trivial enough to convert directly > along with these other changes. This lets us avoid the boilerplate and > the cleanup patches that will need to be introduced in following that > approach. I feel that this part is more suitable for a cover letter rather than the commit message itself. It is a useful piece of info though. > This change is more focused on doing a faithful conversion, so for now we > are not too concerned with trying to reduce subprocess spawns. > > Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> > Mentored-by: Christian Couder <christian.couder@gmail.com> > Mentored-by: Shourya Shukla <periperidip@gmail.com> > --- > > Notable changes since v1: > > * Modified the code structure in > submodule--helper.c:run_update_command(), while fixing problems with > the translation marks. > > * Renamed '--sha1' and '--subsha1' options to '--oid' and '--suboid' to > since the argument is parsed into an object_id struct, not plain sha1 > data. > > * Used option callbacks to parse the SHA1 arguments directly. > > * Moved the conditional out of 'do_run_update_procedure()'. > > Feedback required: > > Ævar felt that it would be clearer to populate the 'fatal' messages > after the run_command() operation in 'run_update_command()', to make it > more readable [1]. I have attempted something like that here, and it has led > to a lot more duplicated 'switch' statements, which feels suboptimal. > I'd appreciate suggestions to make it more legible. > > [1] https://lore.kernel.org/git/87r1fps63r.fsf@evledraar.gmail.com/ > > Fetch-it-Via: > git fetch https://github.com/tfidfwastaken/git submodule-run-update-proc-list-2 > > builtin/submodule--helper.c | 253 ++++++++++++++++++++++++++++++++++++ > git-submodule.sh | 106 +++++---------- > 2 files changed, 286 insertions(+), 73 deletions(-) > > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c > index d55f6262e9..b9c40324d0 100644 > --- a/builtin/submodule--helper.c > +++ b/builtin/submodule--helper.c > @@ -2029,6 +2029,20 @@ struct submodule_update_clone { > .max_jobs = 1, \ > } > > +struct update_data { > + const char *recursive_prefix; > + const char *sm_path; > + const char *displaypath; > + struct object_id oid; > + struct object_id suboid; > + struct submodule_update_strategy update_strategy; > + int depth; > + unsigned int force: 1; > + unsigned int quiet: 1; > + unsigned int nofetch: 1; > + unsigned int just_cloned: 1; > +}; > +#define UPDATE_DATA_INIT { .update_strategy = SUBMODULE_UPDATE_STRATEGY_INIT } > > static void next_submodule_warn_missing(struct submodule_update_clone *suc, > struct strbuf *out, const char *displaypath) > @@ -2282,6 +2296,175 @@ static int git_update_clone_config(const char *var, const char *value, > return 0; > } > + > +static int run_update_command(struct update_data *ud, int subforce) > +{ > + struct child_process cp = CHILD_PROCESS_INIT; > + char *oid = oid_to_hex(&ud->oid); > + int must_die_on_failure = 0; > + > + cp.dir = xstrdup(ud->sm_path); > + switch (ud->update_strategy.type) { > + case SM_UPDATE_CHECKOUT: > + cp.git_cmd = 1; > + strvec_pushl(&cp.args, "checkout", "-q", NULL); Would it be possible to add the 'if' statement above just before the 'switch' (or after, whichever seems okay) since this is common amongst (almost) all the cases? > + if (subforce) > + strvec_push(&cp.args, "-f"); > + break; > + case SM_UPDATE_REBASE: > + cp.git_cmd = 1; > + strvec_push(&cp.args, "rebase"); > + if (ud->quiet) > + strvec_push(&cp.args, "--quiet"); > + must_die_on_failure = 1; > + break; > + case SM_UPDATE_MERGE: > + cp.git_cmd = 1; > + strvec_push(&cp.args, "merge"); > + if (ud->quiet) > + strvec_push(&cp.args, "--quiet"); > + must_die_on_failure = 1; > + break; > + case SM_UPDATE_COMMAND: > + /* NOTE: this does not handle quoted arguments */ > + strvec_split(&cp.args, ud->update_strategy.command); > + must_die_on_failure = 1; > + break; > + case SM_UPDATE_UNSPECIFIED: > + case SM_UPDATE_NONE: > + BUG("update strategy should have been specified"); > + } If the original did not bug out, do we need to? The documentation does not mention this as well: https://git-scm.com/docs/git-submodule#Documentation/git-submodule.txt-none > + > + strvec_push(&cp.args, oid); > + > + prepare_submodule_repo_env(&cp.env_array); > + > + if (run_command(&cp)) { > + if (must_die_on_failure) { > + switch (ud->update_strategy.type) { > + case SM_UPDATE_CHECKOUT: > + die(_("Unable to checkout '%s' in submodule path '%s'"), > + oid, ud->displaypath); > + break; > + case SM_UPDATE_REBASE: > + die(_("Unable to rebase '%s' in submodule path '%s'"), > + oid, ud->displaypath); > + break; > + case SM_UPDATE_MERGE: > + die(_("Unable to merge '%s' in submodule path '%s'"), > + oid, ud->displaypath); > + break; > + case SM_UPDATE_COMMAND: > + die(_("Execution of '%s %s' failed in submodule path '%s'"), > + ud->update_strategy.command, oid, ud->displaypath); > + break; > + case SM_UPDATE_UNSPECIFIED: > + case SM_UPDATE_NONE: > + BUG("update strategy should have been specified"); > + } > + } > + /* > + * This signifies to the caller in shell that > + * the command failed without dying > + */ > + return 1; > + } > + > + switch (ud->update_strategy.type) { > + case SM_UPDATE_CHECKOUT: > + printf(_("Submodule path '%s': checked out '%s'\n"), > + ud->displaypath, oid); > + break; > + case SM_UPDATE_REBASE: > + printf(_("Submodule path '%s': rebased into '%s'\n"), > + ud->displaypath, oid); > + break; > + case SM_UPDATE_MERGE: > + printf(_("Submodule path '%s': merged in '%s'\n"), > + ud->displaypath, oid); > + break; > + case SM_UPDATE_COMMAND: > + printf(_("Submodule path '%s': '%s %s'\n"), > + ud->displaypath, ud->update_strategy.command, oid); > + break; > + case SM_UPDATE_UNSPECIFIED: > + case SM_UPDATE_NONE: > + BUG("update strategy should have been specified"); > + } > + > + return 0; > +} > + > +static int do_run_update_procedure(struct update_data *ud) > +{ > + int subforce = is_null_oid(&ud->suboid) || ud->force; > + > + if (!ud->nofetch) { > + /* > + * Run fetch only if `oid` isn't present or it > + * is not reachable from a ref. > + */ > + if (!is_tip_reachable(ud->sm_path, &ud->oid)) > + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && > + !ud->quiet) > + fprintf_ln(stderr, > + _("Unable to fetch in submodule path '%s'; " > + "trying to directly fetch %s:"), > + ud->displaypath, oid_to_hex(&ud->oid)); I was wondering if an OID is invalid, will it be counted as unreachable and vice-versa? If that is the case then that would simplify the work. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GSoC] [PATCH v2] submodule--helper: run update procedures from C 2021-08-02 18:50 ` Shourya Shukla @ 2021-08-03 8:46 ` Atharva Raykar 2021-08-03 10:07 ` Atharva Raykar 1 sibling, 0 replies; 13+ messages in thread From: Atharva Raykar @ 2021-08-03 8:46 UTC (permalink / raw) To: Shourya Shukla Cc: Git List, Emily Shaffer, Jonathan Nieder, Junio C Hamano, Christian Couder, Kaartic Sivaraam, Prathamesh Chavan Shourya Shukla <periperidip@gmail.com> writes: > Le lun. 2 août 2021 à 18:36, Atharva Raykar > <raykar.ath@gmail.com> a écrit : >> >> Add a new submodule--helper subcommand `run-update-procedure` >> that runs >> the update procedure if the SHA1 of the submodule does not >> match what >> the superproject expects. >> >> This is an intermediate change that works towards total >> conversion of >> `submodule update` from shell to C. >> >> Specific error codes are returned so that the shell script >> calling the >> subcommand can take a decision on the control flow, and >> preserve the >> error messages across subsequent recursive calls of >> `cmd_update`. >> >> This patch could have been approached differently, by first >> changing the >> `is_tip_reachable` and `fetch_in_submodule` shell functions to >> be >> `submodule--helper` subcommands, and then following up with a >> patch that >> introduces the `run-update-procedure` subcommand. We have not >> done it >> like that because those functions are trivial enough to convert >> directly >> along with these other changes. This lets us avoid the >> boilerplate and >> the cleanup patches that will need to be introduced in >> following that >> approach. > > I feel that this part is more suitable for a cover letter rather > than the commit > message itself. It is a useful piece of info though. Okay, that seems right, the message does seem a bit too context-sensitive. >> This change is more focused on doing a faithful conversion, so >> for now we >> are not too concerned with trying to reduce subprocess spawns. >> >> Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> >> Mentored-by: Christian Couder <christian.couder@gmail.com> >> Mentored-by: Shourya Shukla <periperidip@gmail.com> >> --- >> >> Notable changes since v1: >> >> * Modified the code structure in >> submodule--helper.c:run_update_command(), while fixing >> problems with >> the translation marks. >> >> * Renamed '--sha1' and '--subsha1' options to '--oid' and >> '--suboid' to >> since the argument is parsed into an object_id struct, not >> plain sha1 >> data. >> >> * Used option callbacks to parse the SHA1 arguments directly. >> >> * Moved the conditional out of 'do_run_update_procedure()'. >> >> Feedback required: >> >> Ævar felt that it would be clearer to populate the 'fatal' >> messages >> after the run_command() operation in 'run_update_command()', to >> make it >> more readable [1]. I have attempted something like that here, >> and it has led >> to a lot more duplicated 'switch' statements, which feels >> suboptimal. >> I'd appreciate suggestions to make it more legible. >> >> [1] >> https://lore.kernel.org/git/87r1fps63r.fsf@evledraar.gmail.com/ >> >> Fetch-it-Via: >> git fetch https://github.com/tfidfwastaken/git >> submodule-run-update-proc-list-2 >> >> builtin/submodule--helper.c | 253 >> ++++++++++++++++++++++++++++++++++++ >> git-submodule.sh | 106 +++++---------- >> 2 files changed, 286 insertions(+), 73 deletions(-) >> >> diff --git a/builtin/submodule--helper.c >> b/builtin/submodule--helper.c >> index d55f6262e9..b9c40324d0 100644 >> --- a/builtin/submodule--helper.c >> +++ b/builtin/submodule--helper.c >> @@ -2029,6 +2029,20 @@ struct submodule_update_clone { >> .max_jobs = 1, \ >> } >> >> +struct update_data { >> + const char *recursive_prefix; >> + const char *sm_path; >> + const char *displaypath; >> + struct object_id oid; >> + struct object_id suboid; >> + struct submodule_update_strategy update_strategy; >> + int depth; >> + unsigned int force: 1; >> + unsigned int quiet: 1; >> + unsigned int nofetch: 1; >> + unsigned int just_cloned: 1; >> +}; >> +#define UPDATE_DATA_INIT { .update_strategy = >> SUBMODULE_UPDATE_STRATEGY_INIT } >> >> static void next_submodule_warn_missing(struct >> submodule_update_clone *suc, >> struct strbuf *out, const char *displaypath) >> @@ -2282,6 +2296,175 @@ static int >> git_update_clone_config(const char *var, const char *value, >> return 0; >> } >> + >> +static int run_update_command(struct update_data *ud, int >> subforce) >> +{ >> + struct child_process cp = CHILD_PROCESS_INIT; >> + char *oid = oid_to_hex(&ud->oid); >> + int must_die_on_failure = 0; >> + >> + cp.dir = xstrdup(ud->sm_path); >> + switch (ud->update_strategy.type) { >> + case SM_UPDATE_CHECKOUT: >> + cp.git_cmd = 1; >> + strvec_pushl(&cp.args, "checkout", "-q", NULL); > > Would it be possible to add the 'if' statement above just before > the > 'switch' (or after, > whichever seems okay) since this is common amongst (almost) all > the cases? I'll try it on once, if it makes the code more readable, I'll include it in the reroll. >> + if (subforce) >> + strvec_push(&cp.args, "-f"); >> + break; >> + case SM_UPDATE_REBASE: >> + cp.git_cmd = 1; >> + strvec_push(&cp.args, "rebase"); >> + if (ud->quiet) >> + strvec_push(&cp.args, "--quiet"); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_MERGE: >> + cp.git_cmd = 1; >> + strvec_push(&cp.args, "merge"); >> + if (ud->quiet) >> + strvec_push(&cp.args, "--quiet"); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_COMMAND: >> + /* NOTE: this does not handle quoted arguments >> */ >> + strvec_split(&cp.args, >> ud->update_strategy.command); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_UNSPECIFIED: >> + case SM_UPDATE_NONE: >> + BUG("update strategy should have been >> specified"); >> + } > > If the original did not bug out, do we need to? The > documentation does > not mention > this as well: > https://git-scm.com/docs/git-submodule#Documentation/git-submodule.txt-none This was how the original shell porcelain did it: case "$update_module" in checkout) command="git checkout $subforce -q" die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")" say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")" ;; rebase) command="git rebase ${GIT_QUIET:+--quiet}" die_msg="$(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")" say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")" must_die_on_failure=yes ;; merge) command="git merge ${GIT_QUIET:+--quiet}" die_msg="$(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")" say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")" must_die_on_failure=yes ;; !*) command="${update_module#!}" die_msg="$(eval_gettext "Execution of '\$command \$sha1' failed in submodule path '\$displaypath'")" say_msg="$(eval_gettext "Submodule path '\$displaypath': '\$command \$sha1'")" must_die_on_failure=yes ;; *) die "$(eval_gettext "Invalid update mode '$update_module' for submodule path '$path'")" esac The fallthrough case used to die, but I noticed that this branch will never get activated. This is because the 'update-clone' helper will not output any entry that has the update mode set to 'none', and thus the `while` loop that contains this code would never run. Which is why I decided to BUG out on that case, because that state should never be reached. But I see the source of confusion, and maybe I should have different BUG messages for SM_UPDATE_UNSPECIFIED and SM_UPDATE_NONE. The latter should probably say "should have been handled by update-clone". >> + >> + strvec_push(&cp.args, oid); >> + >> + prepare_submodule_repo_env(&cp.env_array); >> + >> + if (run_command(&cp)) { >> + if (must_die_on_failure) { >> + switch (ud->update_strategy.type) { >> + case SM_UPDATE_CHECKOUT: >> + die(_("Unable to checkout '%s' >> in submodule path '%s'"), >> + oid, ud->displaypath); >> + break; >> + case SM_UPDATE_REBASE: >> + die(_("Unable to rebase '%s' in >> submodule path '%s'"), >> + oid, ud->displaypath); >> + break; >> + case SM_UPDATE_MERGE: >> + die(_("Unable to merge '%s' in >> submodule path '%s'"), >> + oid, ud->displaypath); >> + break; >> + case SM_UPDATE_COMMAND: >> + die(_("Execution of '%s %s' >> failed in submodule path '%s'"), >> + >> ud->update_strategy.command, oid, ud->displaypath); >> + break; >> + case SM_UPDATE_UNSPECIFIED: >> + case SM_UPDATE_NONE: >> + BUG("update strategy should >> have been specified"); >> + } >> + } >> + /* >> + * This signifies to the caller in shell that >> + * the command failed without dying >> + */ >> + return 1; >> + } >> + >> + switch (ud->update_strategy.type) { >> + case SM_UPDATE_CHECKOUT: >> + printf(_("Submodule path '%s': checked out >> '%s'\n"), >> + ud->displaypath, oid); >> + break; >> + case SM_UPDATE_REBASE: >> + printf(_("Submodule path '%s': rebased into >> '%s'\n"), >> + ud->displaypath, oid); >> + break; >> + case SM_UPDATE_MERGE: >> + printf(_("Submodule path '%s': merged in >> '%s'\n"), >> + ud->displaypath, oid); >> + break; >> + case SM_UPDATE_COMMAND: >> + printf(_("Submodule path '%s': '%s %s'\n"), >> + ud->displaypath, >> ud->update_strategy.command, oid); >> + break; >> + case SM_UPDATE_UNSPECIFIED: >> + case SM_UPDATE_NONE: >> + BUG("update strategy should have been >> specified"); >> + } >> + >> + return 0; >> +} >> + >> +static int do_run_update_procedure(struct update_data *ud) >> +{ >> + int subforce = is_null_oid(&ud->suboid) || ud->force; >> + >> + if (!ud->nofetch) { >> + /* >> + * Run fetch only if `oid` isn't present or it >> + * is not reachable from a ref. >> + */ >> + if (!is_tip_reachable(ud->sm_path, &ud->oid)) >> + if (fetch_in_submodule(ud->sm_path, >> ud->depth, ud->quiet, NULL) && >> + !ud->quiet) >> + fprintf_ln(stderr, >> + _("Unable to fetch >> in submodule path '%s'; " >> + "trying to >> directly fetch %s:"), >> + ud->displaypath, >> oid_to_hex(&ud->oid)); > > I was wondering if an OID is invalid, will it be counted as > unreachable and vice-versa? > If that is the case then that would simplify the work. Could you elaborate? I'm not sure what you mean by 'invalid' in this context. I don't think this code will receive any kind of malformed oid--they come from 'update-clone' which handles it correctly. As far as I can tell, the only way to check if a particular OID is unreachable is when we check if all the refs cannot find it. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GSoC] [PATCH v2] submodule--helper: run update procedures from C 2021-08-02 18:50 ` Shourya Shukla 2021-08-03 8:46 ` Atharva Raykar @ 2021-08-03 10:07 ` Atharva Raykar 1 sibling, 0 replies; 13+ messages in thread From: Atharva Raykar @ 2021-08-03 10:07 UTC (permalink / raw) To: Shourya Shukla Cc: Git List, Emily Shaffer, Jonathan Nieder, Junio C Hamano, Christian Couder, Kaartic Sivaraam, Prathamesh Chavan (I am resending this email, because my client mangled the whitespaces due to a misconfiguration. Please ignore the my previous message.) Shourya Shukla <periperidip@gmail.com> writes: > Le lun. 2 août 2021 à 18:36, Atharva Raykar <raykar.ath@gmail.com> a écrit : >> >> Add a new submodule--helper subcommand `run-update-procedure` that runs >> the update procedure if the SHA1 of the submodule does not match what >> the superproject expects. >> >> This is an intermediate change that works towards total conversion of >> `submodule update` from shell to C. >> >> Specific error codes are returned so that the shell script calling the >> subcommand can take a decision on the control flow, and preserve the >> error messages across subsequent recursive calls of `cmd_update`. >> >> This patch could have been approached differently, by first changing the >> `is_tip_reachable` and `fetch_in_submodule` shell functions to be >> `submodule--helper` subcommands, and then following up with a patch that >> introduces the `run-update-procedure` subcommand. We have not done it >> like that because those functions are trivial enough to convert directly >> along with these other changes. This lets us avoid the boilerplate and >> the cleanup patches that will need to be introduced in following that >> approach. > > I feel that this part is more suitable for a cover letter rather than the commit > message itself. It is a useful piece of info though. Okay, that seems right, the message does seem a bit too context-sensitive. >> This change is more focused on doing a faithful conversion, so for now we >> are not too concerned with trying to reduce subprocess spawns. >> >> Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> >> Mentored-by: Christian Couder <christian.couder@gmail.com> >> Mentored-by: Shourya Shukla <periperidip@gmail.com> >> --- >> >> Notable changes since v1: >> >> * Modified the code structure in >> submodule--helper.c:run_update_command(), while fixing problems with >> the translation marks. >> >> * Renamed '--sha1' and '--subsha1' options to '--oid' and '--suboid' to >> since the argument is parsed into an object_id struct, not plain sha1 >> data. >> >> * Used option callbacks to parse the SHA1 arguments directly. >> >> * Moved the conditional out of 'do_run_update_procedure()'. >> >> Feedback required: >> >> Ævar felt that it would be clearer to populate the 'fatal' messages >> after the run_command() operation in 'run_update_command()', to make it >> more readable [1]. I have attempted something like that here, and it has led >> to a lot more duplicated 'switch' statements, which feels suboptimal. >> I'd appreciate suggestions to make it more legible. >> >> [1] https://lore.kernel.org/git/87r1fps63r.fsf@evledraar.gmail.com/ >> >> Fetch-it-Via: >> git fetch https://github.com/tfidfwastaken/git submodule-run-update-proc-list-2 >> >> builtin/submodule--helper.c | 253 ++++++++++++++++++++++++++++++++++++ >> git-submodule.sh | 106 +++++---------- >> 2 files changed, 286 insertions(+), 73 deletions(-) >> >> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c >> index d55f6262e9..b9c40324d0 100644 >> --- a/builtin/submodule--helper.c >> +++ b/builtin/submodule--helper.c >> @@ -2029,6 +2029,20 @@ struct submodule_update_clone { >> .max_jobs = 1, \ >> } >> >> +struct update_data { >> + const char *recursive_prefix; >> + const char *sm_path; >> + const char *displaypath; >> + struct object_id oid; >> + struct object_id suboid; >> + struct submodule_update_strategy update_strategy; >> + int depth; >> + unsigned int force: 1; >> + unsigned int quiet: 1; >> + unsigned int nofetch: 1; >> + unsigned int just_cloned: 1; >> +}; >> +#define UPDATE_DATA_INIT { .update_strategy = SUBMODULE_UPDATE_STRATEGY_INIT } >> >> static void next_submodule_warn_missing(struct submodule_update_clone *suc, >> struct strbuf *out, const char *displaypath) >> @@ -2282,6 +2296,175 @@ static int git_update_clone_config(const char *var, const char *value, >> return 0; >> } >> + >> +static int run_update_command(struct update_data *ud, int subforce) >> +{ >> + struct child_process cp = CHILD_PROCESS_INIT; >> + char *oid = oid_to_hex(&ud->oid); >> + int must_die_on_failure = 0; >> + >> + cp.dir = xstrdup(ud->sm_path); >> + switch (ud->update_strategy.type) { >> + case SM_UPDATE_CHECKOUT: >> + cp.git_cmd = 1; >> + strvec_pushl(&cp.args, "checkout", "-q", NULL); > > Would it be possible to add the 'if' statement above just before the > 'switch' (or after, > whichever seems okay) since this is common amongst (almost) all the cases? I'll try it on once, if it makes the code more readable, I'll include it in the reroll. >> + if (subforce) >> + strvec_push(&cp.args, "-f"); >> + break; >> + case SM_UPDATE_REBASE: >> + cp.git_cmd = 1; >> + strvec_push(&cp.args, "rebase"); >> + if (ud->quiet) >> + strvec_push(&cp.args, "--quiet"); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_MERGE: >> + cp.git_cmd = 1; >> + strvec_push(&cp.args, "merge"); >> + if (ud->quiet) >> + strvec_push(&cp.args, "--quiet"); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_COMMAND: >> + /* NOTE: this does not handle quoted arguments */ >> + strvec_split(&cp.args, ud->update_strategy.command); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_UNSPECIFIED: >> + case SM_UPDATE_NONE: >> + BUG("update strategy should have been specified"); >> + } > > If the original did not bug out, do we need to? The documentation does > not mention > this as well: > https://git-scm.com/docs/git-submodule#Documentation/git-submodule.txt-none This was how the original shell porcelain did it: case "$update_module" in checkout) command="git checkout $subforce -q" die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")" say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")" ;; rebase) command="git rebase ${GIT_QUIET:+--quiet}" die_msg="$(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")" say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")" must_die_on_failure=yes ;; merge) command="git merge ${GIT_QUIET:+--quiet}" die_msg="$(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")" say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")" must_die_on_failure=yes ;; !*) command="${update_module#!}" die_msg="$(eval_gettext "Execution of '\$command \$sha1' failed in submodule path '\$displaypath'")" say_msg="$(eval_gettext "Submodule path '\$displaypath': '\$command \$sha1'")" must_die_on_failure=yes ;; *) die "$(eval_gettext "Invalid update mode '$update_module' for submodule path '$path'")" esac The fallthrough case used to die, but I noticed that this branch will never get activated. This is because the 'update-clone' helper will not output any entry that has the update mode set to 'none', and thus the `while` loop that contains this code would never run. Which is why I decided to BUG out on that case, because that state should never be reached. But I see the source of confusion, and maybe I should have different BUG messages for SM_UPDATE_UNSPECIFIED and SM_UPDATE_NONE. The latter should probably say "should have been handled by update-clone". >> + >> + strvec_push(&cp.args, oid); >> + >> + prepare_submodule_repo_env(&cp.env_array); >> + >> + if (run_command(&cp)) { >> + if (must_die_on_failure) { >> + switch (ud->update_strategy.type) { >> + case SM_UPDATE_CHECKOUT: >> + die(_("Unable to checkout '%s' in submodule path '%s'"), >> + oid, ud->displaypath); >> + break; >> + case SM_UPDATE_REBASE: >> + die(_("Unable to rebase '%s' in submodule path '%s'"), >> + oid, ud->displaypath); >> + break; >> + case SM_UPDATE_MERGE: >> + die(_("Unable to merge '%s' in submodule path '%s'"), >> + oid, ud->displaypath); >> + break; >> + case SM_UPDATE_COMMAND: >> + die(_("Execution of '%s %s' failed in submodule path '%s'"), >> + ud->update_strategy.command, oid, ud->displaypath); >> + break; >> + case SM_UPDATE_UNSPECIFIED: >> + case SM_UPDATE_NONE: >> + BUG("update strategy should have been specified"); >> + } >> + } >> + /* >> + * This signifies to the caller in shell that >> + * the command failed without dying >> + */ >> + return 1; >> + } >> + >> + switch (ud->update_strategy.type) { >> + case SM_UPDATE_CHECKOUT: >> + printf(_("Submodule path '%s': checked out '%s'\n"), >> + ud->displaypath, oid); >> + break; >> + case SM_UPDATE_REBASE: >> + printf(_("Submodule path '%s': rebased into '%s'\n"), >> + ud->displaypath, oid); >> + break; >> + case SM_UPDATE_MERGE: >> + printf(_("Submodule path '%s': merged in '%s'\n"), >> + ud->displaypath, oid); >> + break; >> + case SM_UPDATE_COMMAND: >> + printf(_("Submodule path '%s': '%s %s'\n"), >> + ud->displaypath, ud->update_strategy.command, oid); >> + break; >> + case SM_UPDATE_UNSPECIFIED: >> + case SM_UPDATE_NONE: >> + BUG("update strategy should have been specified"); >> + } >> + >> + return 0; >> +} >> + >> +static int do_run_update_procedure(struct update_data *ud) >> +{ >> + int subforce = is_null_oid(&ud->suboid) || ud->force; >> + >> + if (!ud->nofetch) { >> + /* >> + * Run fetch only if `oid` isn't present or it >> + * is not reachable from a ref. >> + */ >> + if (!is_tip_reachable(ud->sm_path, &ud->oid)) >> + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && >> + !ud->quiet) >> + fprintf_ln(stderr, >> + _("Unable to fetch in submodule path '%s'; " >> + "trying to directly fetch %s:"), >> + ud->displaypath, oid_to_hex(&ud->oid)); > > I was wondering if an OID is invalid, will it be counted as > unreachable and vice-versa? > If that is the case then that would simplify the work. Could you elaborate? I'm not sure what you mean by 'invalid' in this context. I don't think this code will receive any kind of malformed oid--they come from 'update-clone' which handles it correctly. As far as I can tell, the only way to check if a particular OID is unreachable is when we check if all the refs cannot find it. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [GSoC] [PATCH v3] submodule--helper: run update procedures from C 2021-08-02 13:06 ` [GSoC] [PATCH v2] " Atharva Raykar 2021-08-02 18:50 ` Shourya Shukla @ 2021-08-13 7:56 ` Atharva Raykar 2021-08-13 18:32 ` Junio C Hamano 2021-08-24 14:06 ` [PATCH v4] " Atharva Raykar 1 sibling, 2 replies; 13+ messages in thread From: Atharva Raykar @ 2021-08-13 7:56 UTC (permalink / raw) To: git Cc: Atharva Raykar, Emily Shaffer, Jonathan Nieder, Junio C Hamano, Christian Couder, Shourya Shukla, Kaartic Sivaraam, Prathamesh Chavan Add a new submodule--helper subcommand `run-update-procedure` that runs the update procedure if the SHA1 of the submodule does not match what the superproject expects. This is an intermediate change that works towards total conversion of `submodule update` from shell to C. Specific error codes are returned so that the shell script calling the subcommand can take a decision on the control flow, and preserve the error messages across subsequent recursive calls of `cmd_update`. This change is more focused on doing a faithful conversion, so for now we are not too concerned with trying to reduce subprocess spawns. Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Shourya Shukla <periperidip@gmail.com> --- This patch could have been approached differently, by first changing the `is_tip_reachable` and `fetch_in_submodule` shell functions to be `submodule--helper` subcommands, and then following up with a patch that introduces the `run-update-procedure` subcommand. We have not done it like that because those functions are trivial enough to convert directly along with these other changes. This lets us avoid the boilerplate and the cleanup patches that will need to be introduced in following that approach. Since v2: * Different BUG messages in run_update_command() for the "Unspecified" and "None" update modes. * Move the information about how the patch was approached out of the commit message. * Rebase this patch on top of master (the previous one was based on a stale, unmerged topic branch). This patch no longer depends on a topic branch. Fetch-it-Via: git fetch https://github.com/tfidfwastaken/git submodule-run-update-proc-list-4 builtin/submodule--helper.c | 259 ++++++++++++++++++++++++++++++++++++ git-submodule.sh | 104 +++++---------- 2 files changed, 291 insertions(+), 72 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index ef2776a9e4..9b34b29ce2 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -2045,6 +2045,20 @@ struct submodule_update_clone { .max_jobs = 1, \ } +struct update_data { + const char *recursive_prefix; + const char *sm_path; + const char *displaypath; + struct object_id oid; + struct object_id suboid; + struct submodule_update_strategy update_strategy; + int depth; + unsigned int force: 1; + unsigned int quiet: 1; + unsigned int nofetch: 1; + unsigned int just_cloned: 1; +}; +#define UPDATE_DATA_INIT { .update_strategy = SUBMODULE_UPDATE_STRATEGY_INIT } static void next_submodule_warn_missing(struct submodule_update_clone *suc, struct strbuf *out, const char *displaypath) @@ -2298,6 +2312,181 @@ static int git_update_clone_config(const char *var, const char *value, return 0; } +static int is_tip_reachable(const char *path, struct object_id *oid) +{ + struct child_process cp = CHILD_PROCESS_INIT; + struct strbuf rev = STRBUF_INIT; + char *hex = oid_to_hex(oid); + + cp.git_cmd = 1; + cp.dir = xstrdup(path); + cp.no_stderr = 1; + strvec_pushl(&cp.args, "rev-list", "-n", "1", hex, "--not", "--all", NULL); + + prepare_submodule_repo_env(&cp.env_array); + + if (capture_command(&cp, &rev, GIT_MAX_HEXSZ + 1) || rev.len) + return 0; + + return 1; +} + +static int fetch_in_submodule(const char *module_path, int depth, int quiet, struct object_id *oid) +{ + struct child_process cp = CHILD_PROCESS_INIT; + + prepare_submodule_repo_env(&cp.env_array); + cp.git_cmd = 1; + cp.dir = xstrdup(module_path); + + strvec_push(&cp.args, "fetch"); + if (quiet) + strvec_push(&cp.args, "--quiet"); + if (depth) + strvec_pushf(&cp.args, "--depth=%d", depth); + if (oid) { + char *hex = oid_to_hex(oid); + char *remote = get_default_remote(); + strvec_pushl(&cp.args, remote, hex, NULL); + } + + return run_command(&cp); +} + +static int run_update_command(struct update_data *ud, int subforce) +{ + struct child_process cp = CHILD_PROCESS_INIT; + char *oid = oid_to_hex(&ud->oid); + int must_die_on_failure = 0; + + cp.dir = xstrdup(ud->sm_path); + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: + cp.git_cmd = 1; + strvec_pushl(&cp.args, "checkout", "-q", NULL); + if (subforce) + strvec_push(&cp.args, "-f"); + break; + case SM_UPDATE_REBASE: + cp.git_cmd = 1; + strvec_push(&cp.args, "rebase"); + if (ud->quiet) + strvec_push(&cp.args, "--quiet"); + must_die_on_failure = 1; + break; + case SM_UPDATE_MERGE: + cp.git_cmd = 1; + strvec_push(&cp.args, "merge"); + if (ud->quiet) + strvec_push(&cp.args, "--quiet"); + must_die_on_failure = 1; + break; + case SM_UPDATE_COMMAND: + /* NOTE: this does not handle quoted arguments */ + strvec_split(&cp.args, ud->update_strategy.command); + must_die_on_failure = 1; + break; + case SM_UPDATE_NONE: + BUG("this should have been handled before. How did we reach here?"); + break; + case SM_UPDATE_UNSPECIFIED: + BUG("update strategy should have been specified"); + } + + strvec_push(&cp.args, oid); + + prepare_submodule_repo_env(&cp.env_array); + + if (run_command(&cp)) { + if (must_die_on_failure) { + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: + die(_("Unable to checkout '%s' in submodule path '%s'"), + oid, ud->displaypath); + break; + case SM_UPDATE_REBASE: + die(_("Unable to rebase '%s' in submodule path '%s'"), + oid, ud->displaypath); + break; + case SM_UPDATE_MERGE: + die(_("Unable to merge '%s' in submodule path '%s'"), + oid, ud->displaypath); + break; + case SM_UPDATE_COMMAND: + die(_("Execution of '%s %s' failed in submodule path '%s'"), + ud->update_strategy.command, oid, ud->displaypath); + break; + case SM_UPDATE_NONE: + BUG("this should have been handled before. How did we reach here?"); + break; + case SM_UPDATE_UNSPECIFIED: + BUG("update strategy should have been specified"); + } + } + /* + * This signifies to the caller in shell that + * the command failed without dying + */ + return 1; + } + + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: + printf(_("Submodule path '%s': checked out '%s'\n"), + ud->displaypath, oid); + break; + case SM_UPDATE_REBASE: + printf(_("Submodule path '%s': rebased into '%s'\n"), + ud->displaypath, oid); + break; + case SM_UPDATE_MERGE: + printf(_("Submodule path '%s': merged in '%s'\n"), + ud->displaypath, oid); + break; + case SM_UPDATE_COMMAND: + printf(_("Submodule path '%s': '%s %s'\n"), + ud->displaypath, ud->update_strategy.command, oid); + break; + case SM_UPDATE_NONE: + BUG("this should have been handled before. How did we reach here?"); + break; + case SM_UPDATE_UNSPECIFIED: + BUG("update strategy should have been specified"); + } + + return 0; +} + +static int do_run_update_procedure(struct update_data *ud) +{ + int subforce = is_null_oid(&ud->suboid) || ud->force; + + if (!ud->nofetch) { + /* + * Run fetch only if `oid` isn't present or it + * is not reachable from a ref. + */ + if (!is_tip_reachable(ud->sm_path, &ud->oid)) + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && + !ud->quiet) + fprintf_ln(stderr, + _("Unable to fetch in submodule path '%s'; " + "trying to directly fetch %s:"), + ud->displaypath, oid_to_hex(&ud->oid)); + /* + * Now we tried the usual fetch, but `oid` may + * not be reachable from any of the refs. + */ + if (!is_tip_reachable(ud->sm_path, &ud->oid)) + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, &ud->oid)) + die(_("Fetched in submodule path '%s', but it did not " + "contain %s. Direct fetching of that commit failed."), + ud->displaypath, oid_to_hex(&ud->oid)); + } + + return run_update_command(ud, subforce); +} + static void update_submodule(struct update_clone_data *ucd) { fprintf(stdout, "dummy %s %d\t%s\n", @@ -2395,6 +2584,75 @@ static int update_clone(int argc, const char **argv, const char *prefix) return update_submodules(&suc); } +static int run_update_procedure(int argc, const char **argv, const char *prefix) +{ + int force = 0, quiet = 0, nofetch = 0, just_cloned = 0; + char *prefixed_path, *update = NULL; + struct update_data update_data = UPDATE_DATA_INIT; + + struct option options[] = { + OPT__QUIET(&quiet, N_("suppress output for update by rebase or merge")), + OPT__FORCE(&force, N_("force checkout updates"), 0), + OPT_BOOL('N', "no-fetch", &nofetch, + N_("don't fetch new objects from the remote site")), + OPT_BOOL(0, "just-cloned", &just_cloned, + N_("overrides update mode in case the repository is a fresh clone")), + OPT_INTEGER(0, "depth", &update_data.depth, N_("depth for shallow fetch")), + OPT_STRING(0, "prefix", &prefix, + N_("path"), + N_("path into the working tree")), + OPT_STRING(0, "update", &update, + N_("string"), + N_("rebase, merge, checkout or none")), + OPT_STRING(0, "recursive-prefix", &update_data.recursive_prefix, N_("path"), + N_("path into the working tree, across nested " + "submodule boundaries")), + OPT_CALLBACK_F(0, "oid", &update_data.oid, N_("sha1"), + N_("SHA1 expected by superproject"), PARSE_OPT_NONEG, + parse_opt_object_id), + OPT_CALLBACK_F(0, "suboid", &update_data.suboid, N_("subsha1"), + N_("SHA1 of submodule's HEAD"), PARSE_OPT_NONEG, + parse_opt_object_id), + OPT_END() + }; + + const char *const usage[] = { + N_("git submodule--helper run-update-procedure [<options>] <path>"), + NULL + }; + + argc = parse_options(argc, argv, prefix, options, usage, 0); + + if (argc != 1) + usage_with_options(usage, options); + + update_data.force = !!force; + update_data.quiet = !!quiet; + update_data.nofetch = !!nofetch; + update_data.just_cloned = !!just_cloned; + update_data.sm_path = argv[0]; + + if (update_data.recursive_prefix) + prefixed_path = xstrfmt("%s%s", update_data.recursive_prefix, update_data.sm_path); + else + prefixed_path = xstrdup(update_data.sm_path); + + update_data.displaypath = get_submodule_displaypath(prefixed_path, prefix); + + determine_submodule_update_strategy(the_repository, update_data.just_cloned, + update_data.sm_path, update, + &update_data.update_strategy); + + free(prefixed_path); + + if ((!is_null_oid(&update_data.oid) && !is_null_oid(&update_data.suboid) && + !oideq(&update_data.oid, &update_data.suboid)) || + is_null_oid(&update_data.suboid) || update_data.force) + return do_run_update_procedure(&update_data); + + return 3; +} + static int resolve_relative_path(int argc, const char **argv, const char *prefix) { struct strbuf sb = STRBUF_INIT; @@ -2951,6 +3209,7 @@ static struct cmd_struct commands[] = { {"add-clone", add_clone, 0}, {"update-module-mode", module_update_module_mode, 0}, {"update-clone", update_clone, 0}, + {"run-update-procedure", run_update_procedure, 0}, {"ensure-core-worktree", ensure_core_worktree, 0}, {"relative-path", resolve_relative_path, 0}, {"resolve-relative-url", resolve_relative_url, 0}, diff --git a/git-submodule.sh b/git-submodule.sh index dbd2ec2050..d8e30d1afa 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -369,13 +369,6 @@ cmd_deinit() git ${wt_prefix:+-C "$wt_prefix"} submodule--helper deinit ${GIT_QUIET:+--quiet} ${force:+--force} ${deinit_all:+--all} -- "$@" } -is_tip_reachable () ( - sanitize_submodule_env && - cd "$1" && - rev=$(git rev-list -n 1 "$2" --not --all 2>/dev/null) && - test -z "$rev" -) - # usage: fetch_in_submodule <module_path> [<depth>] [<sha1>] # Because arguments are positional, use an empty string to omit <depth> # but include <sha1>. @@ -519,14 +512,13 @@ cmd_update() git submodule--helper ensure-core-worktree "$sm_path" || exit 1 - update_module=$(git submodule--helper update-module-mode $just_cloned "$sm_path" $update) - displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix") if test $just_cloned -eq 1 then subsha1= else + just_cloned= subsha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD) || die "fatal: $(eval_gettext "Unable to find current revision in submodule path '\$displaypath'")" @@ -547,70 +539,38 @@ cmd_update() die "fatal: $(eval_gettext "Unable to find current \${remote_name}/\${branch} revision in submodule path '\$sm_path'")" fi - if test "$subsha1" != "$sha1" || test -n "$force" - then - subforce=$force - # If we don't already have a -f flag and the submodule has never been checked out - if test -z "$subsha1" && test -z "$force" - then - subforce="-f" - fi + out=$(git submodule--helper run-update-procedure \ + ${wt_prefix:+--prefix "$wt_prefix"} \ + ${GIT_QUIET:+--quiet} \ + ${force:+--force} \ + ${just_cloned:+--just-cloned} \ + ${nofetch:+--no-fetch} \ + ${depth:+"$depth"} \ + ${update:+--update "$update"} \ + ${prefix:+--recursive-prefix "$prefix"} \ + ${sha1:+--oid "$sha1"} \ + ${subsha1:+--suboid "$subsha1"} \ + "--" \ + "$sm_path") - if test -z "$nofetch" - then - # Run fetch only if $sha1 isn't present or it - # is not reachable from a ref. - is_tip_reachable "$sm_path" "$sha1" || - fetch_in_submodule "$sm_path" $depth || - say "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'; trying to directly fetch \$sha1:")" - - # Now we tried the usual fetch, but $sha1 may - # not be reachable from any of the refs - is_tip_reachable "$sm_path" "$sha1" || - fetch_in_submodule "$sm_path" "$depth" "$sha1" || - die "fatal: $(eval_gettext "Fetched in submodule path '\$displaypath', but it did not contain \$sha1. Direct fetching of that commit failed.")" - fi - - must_die_on_failure= - case "$update_module" in - checkout) - command="git checkout $subforce -q" - die_msg="fatal: $(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")" - ;; - rebase) - command="git rebase ${GIT_QUIET:+--quiet}" - die_msg="fatal: $(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")" - must_die_on_failure=yes - ;; - merge) - command="git merge ${GIT_QUIET:+--quiet}" - die_msg="fatal: $(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")" - must_die_on_failure=yes - ;; - !*) - command="${update_module#!}" - die_msg="fatal: $(eval_gettext "Execution of '\$command \$sha1' failed in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': '\$command \$sha1'")" - must_die_on_failure=yes - ;; - *) - die "fatal: $(eval_gettext "Invalid update mode '$update_module' for submodule path '$path'")" - esac - - if (sanitize_submodule_env; cd "$sm_path" && $command "$sha1") - then - say "$say_msg" - elif test -n "$must_die_on_failure" - then - die_with_status 2 "$die_msg" - else - err="${err};$die_msg" - continue - fi - fi + # exit codes for run-update-procedure: + # 0: update was successful, say command output + # 128: subcommand died during execution + # 1: update procedure failed, but should not die + # 3: no update procedure was run + res="$?" + case $res in + 0) + say "$out" + ;; + 128) + exit $res + ;; + 1) + err="${err};$out" + continue + ;; + esac if test -n "$recursive" then -- 2.32.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [GSoC] [PATCH v3] submodule--helper: run update procedures from C 2021-08-13 7:56 ` [GSoC] [PATCH v3] " Atharva Raykar @ 2021-08-13 18:32 ` Junio C Hamano 2021-08-24 8:58 ` Atharva Raykar 2021-08-24 14:06 ` [PATCH v4] " Atharva Raykar 1 sibling, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2021-08-13 18:32 UTC (permalink / raw) To: Atharva Raykar Cc: git, Emily Shaffer, Jonathan Nieder, Christian Couder, Shourya Shukla, Kaartic Sivaraam, Prathamesh Chavan Atharva Raykar <raykar.ath@gmail.com> writes: > Add a new submodule--helper subcommand `run-update-procedure` that runs > the update procedure if the SHA1 of the submodule does not match what > the superproject expects. > > This is an intermediate change that works towards total conversion of > `submodule update` from shell to C. OK. Various things can happen depending on the setting during the update procedure, and it is complex enough to split it out to a single step, I guess. > Specific error codes are returned so that the shell script calling the > subcommand can take a decision on the control flow, and preserve the > error messages across subsequent recursive calls of `cmd_update`. > > This change is more focused on doing a faithful conversion, so for now we > are not too concerned with trying to reduce subprocess spawns. > > Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> > Mentored-by: Christian Couder <christian.couder@gmail.com> > Mentored-by: Shourya Shukla <periperidip@gmail.com> Keep trailer lines in chronological order. The mentors mentored, the patch was written and finally you signed it off. > +static int run_update_command(struct update_data *ud, int subforce) > +{ > + struct child_process cp = CHILD_PROCESS_INIT; > + char *oid = oid_to_hex(&ud->oid); > + int must_die_on_failure = 0; > + > + cp.dir = xstrdup(ud->sm_path); > + switch (ud->update_strategy.type) { > + case SM_UPDATE_CHECKOUT: > + cp.git_cmd = 1; > + strvec_pushl(&cp.args, "checkout", "-q", NULL); > + if (subforce) > + strvec_push(&cp.args, "-f"); > + break; > + case SM_UPDATE_REBASE: > + cp.git_cmd = 1; > + strvec_push(&cp.args, "rebase"); > + if (ud->quiet) > + strvec_push(&cp.args, "--quiet"); > + must_die_on_failure = 1; > + break; > + case SM_UPDATE_MERGE: > + cp.git_cmd = 1; > + strvec_push(&cp.args, "merge"); > + if (ud->quiet) > + strvec_push(&cp.args, "--quiet"); > + must_die_on_failure = 1; > + break; > + case SM_UPDATE_COMMAND: > + /* NOTE: this does not handle quoted arguments */ > + strvec_split(&cp.args, ud->update_strategy.command); Indeed this doesn't. I think cp.use_shell would be the right way to run this. Study what happens before run_command_v_opt_cd_env() with RUN_USING_SHELL calls run_command() and make something similar happen here, instead of doing a manual command line splitting. Side note: run_command_v_opt_cd_env() with RUN_USING_SHELL is used in places like diff.c::run_external_diff() to invoke an external diff command, ll-merge.c::ll_ext_merge() to invoke a user-defined low level merge driver, sequencer.c::do_exec() to invoke 'x cmd' you write in the todo list during an "rebase -i" session. > + must_die_on_failure = 1; > + break; > + case SM_UPDATE_NONE: > + BUG("this should have been handled before. How did we reach here?"); > + break; > + case SM_UPDATE_UNSPECIFIED: > + BUG("update strategy should have been specified"); These two case arms are not a faithful conversion from the original, but because you do not carry around a random string from the caller and instead have parsed enums, it cannot be ;-) But it makes me wonder why we want these two cases separate. Isn't it a BUG() if anything other than what we handled (i.e. prepared cp.args for) already is in ud->update_strategy.type? IOW, wouldn't it be more forward looking to do default: BUG("unexpected ud->update_strategy.type (%d)", ud->update_strategy.type (%d)"); or something? That way, if we ever come up with a new update strategy and forget to update this part, we will catch such a bug fairly quickly. > + } > + > + strvec_push(&cp.args, oid); > + > + prepare_submodule_repo_env(&cp.env_array); > + > + if (run_command(&cp)) { > + if (must_die_on_failure) { > + switch (ud->update_strategy.type) { > + case SM_UPDATE_CHECKOUT: > + die(_("Unable to checkout '%s' in submodule path '%s'"), > + oid, ud->displaypath); > + break; > + case SM_UPDATE_REBASE: > + die(_("Unable to rebase '%s' in submodule path '%s'"), > + oid, ud->displaypath); > + break; > + case SM_UPDATE_MERGE: > + die(_("Unable to merge '%s' in submodule path '%s'"), > + oid, ud->displaypath); > + break; > + case SM_UPDATE_COMMAND: > + die(_("Execution of '%s %s' failed in submodule path '%s'"), > + ud->update_strategy.command, oid, ud->displaypath); > + break; The messages here correspond to what is assigned to $die_message in the original. Note that they are emitted to the standard error stream. I suspect that these should be "printf()" followed by a call to exit() with some non-zero value (see below). > + case SM_UPDATE_NONE: > + BUG("this should have been handled before. How did we reach here?"); > + break; > + case SM_UPDATE_UNSPECIFIED: > + BUG("update strategy should have been specified"); > + } The same comment applies to the last two case arms of this switch statement and the next one, too. I think we just should catch "everything else" with a simple "default:" label. Also, don't omit the "break;" from the last case arm in a switch statement. It harms the long-term help of the code---the last case arm may not forever stay to be the last one. > + } > + /* > + * This signifies to the caller in shell that > + * the command failed without dying > + */ > + return 1; > + } > + > + switch (ud->update_strategy.type) { > + case SM_UPDATE_CHECKOUT: > + printf(_("Submodule path '%s': checked out '%s'\n"), > + ud->displaypath, oid); > + break; > + case SM_UPDATE_REBASE: > + printf(_("Submodule path '%s': rebased into '%s'\n"), > + ud->displaypath, oid); > + break; > + case SM_UPDATE_MERGE: > + printf(_("Submodule path '%s': merged in '%s'\n"), > + ud->displaypath, oid); > + break; > + case SM_UPDATE_COMMAND: > + printf(_("Submodule path '%s': '%s %s'\n"), > + ud->displaypath, ud->update_strategy.command, oid); > + break; > + case SM_UPDATE_NONE: > + BUG("this should have been handled before. How did we reach here?"); > + break; > + case SM_UPDATE_UNSPECIFIED: > + BUG("update strategy should have been specified"); Likewise here. > + } > + > + return 0; > +} > + > +static int do_run_update_procedure(struct update_data *ud) > +{ > + int subforce = is_null_oid(&ud->suboid) || ud->force; > + > + if (!ud->nofetch) { > + /* > + * Run fetch only if `oid` isn't present or it > + * is not reachable from a ref. > + */ > + if (!is_tip_reachable(ud->sm_path, &ud->oid)) > + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && > + !ud->quiet) > + fprintf_ln(stderr, OK. Combining these into a single statement like if (!is_tip_reachable(...) && fetch_in_submodule(...) && !ud->quiet) fprintf_ln(... would reduce the indentation level, but the way the conditional is structured may convey the flow of the thought better, i.e. if we need to fetch, try to fetch and if that fails, report failure. On the other hand, if we take that line of thought to the extreme, the check for !ud->quiet should belong to another level of if statement so perhaps the more concise version I showed above might be an overall win. I dunno. > + _("Unable to fetch in submodule path '%s'; " > + "trying to directly fetch %s:"), > + ud->displaypath, oid_to_hex(&ud->oid)); > + /* > + * Now we tried the usual fetch, but `oid` may > + * not be reachable from any of the refs. > + */ > + if (!is_tip_reachable(ud->sm_path, &ud->oid)) > + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, &ud->oid)) > + die(_("Fetched in submodule path '%s', but it did not " > + "contain %s. Direct fetching of that commit failed."), > + ud->displaypath, oid_to_hex(&ud->oid)); Likewise. > + } > + > + return run_update_command(ud, subforce); > +} > + > static void update_submodule(struct update_clone_data *ucd) > { > fprintf(stdout, "dummy %s %d\t%s\n", > @@ -2395,6 +2584,75 @@ static int update_clone(int argc, const char **argv, const char *prefix) > return update_submodules(&suc); > } > > +static int run_update_procedure(int argc, const char **argv, const char *prefix) > +{ > + int force = 0, quiet = 0, nofetch = 0, just_cloned = 0; > + char *prefixed_path, *update = NULL; > + struct update_data update_data = UPDATE_DATA_INIT; > + > + struct option options[] = { > ... > + OPT_END() > + }; > ... > + update_data.displaypath = get_submodule_displaypath(prefixed_path, prefix); > + > + determine_submodule_update_strategy(the_repository, update_data.just_cloned, > + update_data.sm_path, update, > + &update_data.update_strategy); > + > + free(prefixed_path); > + > + if ((!is_null_oid(&update_data.oid) && !is_null_oid(&update_data.suboid) && > + !oideq(&update_data.oid, &update_data.suboid)) || > + is_null_oid(&update_data.suboid) || update_data.force) > + return do_run_update_procedure(&update_data); The original does the update procedure if $sha1 and $subsha1 are different or if $force option is given. The rewritten seems to skip the update when .oid is NULL and .suboid is not NULL; intended? I understand that the division of labour between this function and do_run_update_procedure() is for the former to only exist to interface with the script side by populating the update_data structure, and the latter implements the logic to run update procedure. I was a bit surprised that this conditional is here, not at the very beginning of the callee. > + return 3; > +} > + > static int resolve_relative_path(int argc, const char **argv, const char *prefix) > { > struct strbuf sb = STRBUF_INIT; > @@ -2951,6 +3209,7 @@ static struct cmd_struct commands[] = { > {"add-clone", add_clone, 0}, > {"update-module-mode", module_update_module_mode, 0}, > {"update-clone", update_clone, 0}, > + {"run-update-procedure", run_update_procedure, 0}, > {"ensure-core-worktree", ensure_core_worktree, 0}, > {"relative-path", resolve_relative_path, 0}, > {"resolve-relative-url", resolve_relative_url, 0}, > diff --git a/git-submodule.sh b/git-submodule.sh > index dbd2ec2050..d8e30d1afa 100755 > --- a/git-submodule.sh > +++ b/git-submodule.sh > @@ -519,14 +512,13 @@ cmd_update() > > git submodule--helper ensure-core-worktree "$sm_path" || exit 1 > > - update_module=$(git submodule--helper update-module-mode $just_cloned "$sm_path" $update) > - > displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix") On the other side of the API boundary, update_data.displaypath is populated by value computed in C. It is a bit unfortunate that we still need to compute it here and risk the two to drift apart. > if test $just_cloned -eq 1 > then > subsha1= > else > + just_cloned= > subsha1=$(sanitize_submodule_env; cd "$sm_path" && > git rev-parse --verify HEAD) || > die "fatal: $(eval_gettext "Unable to find current revision in submodule path '\$displaypath'")" > @@ -547,70 +539,38 @@ cmd_update() > die "fatal: $(eval_gettext "Unable to find current \${remote_name}/\${branch} revision in submodule path '\$sm_path'")" > fi > > - if test "$subsha1" != "$sha1" || test -n "$force" > - then > - subforce=$force > - # If we don't already have a -f flag and the submodule has never been checked out > - if test -z "$subsha1" && test -z "$force" > - then > - subforce="-f" > - fi > + out=$(git submodule--helper run-update-procedure \ > + ${wt_prefix:+--prefix "$wt_prefix"} \ > + ${GIT_QUIET:+--quiet} \ > + ${force:+--force} \ > + ${just_cloned:+--just-cloned} \ > + ${nofetch:+--no-fetch} \ > + ${depth:+"$depth"} \ > + ${update:+--update "$update"} \ > + ${prefix:+--recursive-prefix "$prefix"} \ > + ${sha1:+--oid "$sha1"} \ > + ${subsha1:+--suboid "$subsha1"} \ > + "--" \ > + "$sm_path") We'd just show errors directly to the standard error stream from submodule--helper, but what comes from the printf in the switch statement at the end of run_update_command() is captured in $out variable. Notably, the messages from die()s in the second switch statement in run_update_command() are not captured in $out here. > - if test -z "$nofetch" > - then > - # Run fetch only if $sha1 isn't present or it > - # is not reachable from a ref. > - is_tip_reachable "$sm_path" "$sha1" || > - fetch_in_submodule "$sm_path" $depth || > - say "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'; trying to directly fetch \$sha1:")" > - > - # Now we tried the usual fetch, but $sha1 may > - # not be reachable from any of the refs > - is_tip_reachable "$sm_path" "$sha1" || > - fetch_in_submodule "$sm_path" "$depth" "$sha1" || > - die "fatal: $(eval_gettext "Fetched in submodule path '\$displaypath', but it did not contain \$sha1. Direct fetching of that commit failed.")" > - fi > - > - must_die_on_failure= > - case "$update_module" in > - checkout) > - command="git checkout $subforce -q" > - die_msg="fatal: $(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")" > - say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")" > - ;; > - rebase) > - command="git rebase ${GIT_QUIET:+--quiet}" > - die_msg="fatal: $(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")" > - say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")" > - must_die_on_failure=yes > - ;; > - merge) > - command="git merge ${GIT_QUIET:+--quiet}" > - die_msg="fatal: $(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")" > - say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")" > - must_die_on_failure=yes > - ;; > - !*) > - command="${update_module#!}" > - die_msg="fatal: $(eval_gettext "Execution of '\$command \$sha1' failed in submodule path '\$displaypath'")" > - say_msg="$(eval_gettext "Submodule path '\$displaypath': '\$command \$sha1'")" > - must_die_on_failure=yes > - ;; > - *) > - die "fatal: $(eval_gettext "Invalid update mode '$update_module' for submodule path '$path'")" > - esac > - > - if (sanitize_submodule_env; cd "$sm_path" && $command "$sha1") > - then > - say "$say_msg" > - elif test -n "$must_die_on_failure" > - then > - die_with_status 2 "$die_msg" > - else > - err="${err};$die_msg" > - continue > - fi > - fi > + # exit codes for run-update-procedure: > + # 0: update was successful, say command output > + # 128: subcommand died during execution > + # 1: update procedure failed, but should not die > + # 3: no update procedure was run > + res="$?" > + case $res in > + 0) > + say "$out" > + ;; And the case where there is no error is quite straight-forward. We just emit what we saw in the standard output stream of the helper. > + 128) > + exit $res > + ;; > + 1) > + err="${err};$out" This part is dubious. In the original, $err accumulates what is in $die_msg, which are things like "fatal: Unable to rebase ...", but with this patch, what used to be the contents of $die_msg are given to die() after we see run_command() fail, and would have sent to the standard error stream, not captured in $out here, no? > + continue > + ;; > + esac > > if test -n "$recursive" > then Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GSoC] [PATCH v3] submodule--helper: run update procedures from C 2021-08-13 18:32 ` Junio C Hamano @ 2021-08-24 8:58 ` Atharva Raykar 0 siblings, 0 replies; 13+ messages in thread From: Atharva Raykar @ 2021-08-24 8:58 UTC (permalink / raw) To: Junio C Hamano Cc: git, Emily Shaffer, Jonathan Nieder, Christian Couder, Shourya Shukla, Kaartic Sivaraam, Prathamesh Chavan Pardon my late response, I had been occupied with other things. Junio C Hamano <gitster@pobox.com> writes: > Atharva Raykar <raykar.ath@gmail.com> writes: > [...] >> Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> >> Mentored-by: Christian Couder <christian.couder@gmail.com> >> Mentored-by: Shourya Shukla <periperidip@gmail.com> > > Keep trailer lines in chronological order. The mentors mentored, > the patch was written and finally you signed it off. Okay. >> +static int run_update_command(struct update_data *ud, int subforce) >> +{ >> + struct child_process cp = CHILD_PROCESS_INIT; >> + char *oid = oid_to_hex(&ud->oid); >> + int must_die_on_failure = 0; >> + >> + cp.dir = xstrdup(ud->sm_path); >> + switch (ud->update_strategy.type) { >> + case SM_UPDATE_CHECKOUT: >> + cp.git_cmd = 1; >> + strvec_pushl(&cp.args, "checkout", "-q", NULL); >> + if (subforce) >> + strvec_push(&cp.args, "-f"); >> + break; >> + case SM_UPDATE_REBASE: >> + cp.git_cmd = 1; >> + strvec_push(&cp.args, "rebase"); >> + if (ud->quiet) >> + strvec_push(&cp.args, "--quiet"); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_MERGE: >> + cp.git_cmd = 1; >> + strvec_push(&cp.args, "merge"); >> + if (ud->quiet) >> + strvec_push(&cp.args, "--quiet"); >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_COMMAND: >> + /* NOTE: this does not handle quoted arguments */ >> + strvec_split(&cp.args, ud->update_strategy.command); > > Indeed this doesn't. I think cp.use_shell would be the right way to > run this. > > Study what happens before run_command_v_opt_cd_env() with > RUN_USING_SHELL calls run_command() and make something similar > happen here, instead of doing a manual command line splitting. > > Side note: run_command_v_opt_cd_env() with RUN_USING_SHELL > is used in places like diff.c::run_external_diff() to invoke > an external diff command, ll-merge.c::ll_ext_merge() to > invoke a user-defined low level merge driver, > sequencer.c::do_exec() to invoke 'x cmd' you write in the > todo list during an "rebase -i" session. Thanks for the pointers, the details helped. I'll handle this more correctly in the next version. >> + must_die_on_failure = 1; >> + break; >> + case SM_UPDATE_NONE: >> + BUG("this should have been handled before. How did we reach here?"); >> + break; >> + case SM_UPDATE_UNSPECIFIED: >> + BUG("update strategy should have been specified"); > > These two case arms are not a faithful conversion from the original, > but because you do not carry around a random string from the caller > and instead have parsed enums, it cannot be ;-) But it makes me > wonder why we want these two cases separate. Isn't it a BUG() if > anything other than what we handled (i.e. prepared cp.args for) > already is in ud->update_strategy.type? IOW, wouldn't it be more > forward looking to do > > default: > BUG("unexpected ud->update_strategy.type (%d)", > ud->update_strategy.type (%d)"); > > or something? That way, if we ever come up with a new update > strategy and forget to update this part, we will catch such a bug > fairly quickly. The original intention for separating the cases was to differentiate the cause for the invalid state, but your proposed suggestion is a lot better. I'll address this. >> + } >> + >> + strvec_push(&cp.args, oid); >> + >> + prepare_submodule_repo_env(&cp.env_array); >> + >> + if (run_command(&cp)) { >> + if (must_die_on_failure) { >> + switch (ud->update_strategy.type) { >> + case SM_UPDATE_CHECKOUT: >> + die(_("Unable to checkout '%s' in submodule path '%s'"), >> + oid, ud->displaypath); >> + break; >> + case SM_UPDATE_REBASE: >> + die(_("Unable to rebase '%s' in submodule path '%s'"), >> + oid, ud->displaypath); >> + break; >> + case SM_UPDATE_MERGE: >> + die(_("Unable to merge '%s' in submodule path '%s'"), >> + oid, ud->displaypath); >> + break; >> + case SM_UPDATE_COMMAND: >> + die(_("Execution of '%s %s' failed in submodule path '%s'"), >> + ud->update_strategy.command, oid, ud->displaypath); >> + break; > > The messages here correspond to what is assigned to $die_message in > the original. Note that they are emitted to the standard error > stream. > > I suspect that these should be "printf()" followed by a call to > exit() with some non-zero value (see below). I also notice another major lapse in conversion. In the shell porcelain, the "checkout" mode should not die out at all, instead it should print out the error message. My code tries to die() on the checkout mode (in a case arm that will never be activated), and does not ever print the checkout failure message at all. I will fix this in the re-roll. (Will address more of this below...) >> + case SM_UPDATE_NONE: >> + BUG("this should have been handled before. How did we reach here?"); >> + break; >> + case SM_UPDATE_UNSPECIFIED: >> + BUG("update strategy should have been specified"); >> + } > > The same comment applies to the last two case arms of this switch > statement and the next one, too. I think we just should catch > "everything else" with a simple "default:" label. > > Also, don't omit the "break;" from the last case arm in a switch > statement. It harms the long-term help of the code---the last case > arm may not forever stay to be the last one. > >> + } >> + /* >> + * This signifies to the caller in shell that >> + * the command failed without dying >> + */ >> + return 1; >> + } >> + >> + switch (ud->update_strategy.type) { >> + case SM_UPDATE_CHECKOUT: >> + printf(_("Submodule path '%s': checked out '%s'\n"), >> + ud->displaypath, oid); >> + break; >> + case SM_UPDATE_REBASE: >> + printf(_("Submodule path '%s': rebased into '%s'\n"), >> + ud->displaypath, oid); >> + break; >> + case SM_UPDATE_MERGE: >> + printf(_("Submodule path '%s': merged in '%s'\n"), >> + ud->displaypath, oid); >> + break; >> + case SM_UPDATE_COMMAND: >> + printf(_("Submodule path '%s': '%s %s'\n"), >> + ud->displaypath, ud->update_strategy.command, oid); >> + break; >> + case SM_UPDATE_NONE: >> + BUG("this should have been handled before. How did we reach here?"); >> + break; >> + case SM_UPDATE_UNSPECIFIED: >> + BUG("update strategy should have been specified"); > > Likewise here. Okay. >> + } >> + >> + return 0; >> +} >> + >> +static int do_run_update_procedure(struct update_data *ud) >> +{ >> + int subforce = is_null_oid(&ud->suboid) || ud->force; >> + >> + if (!ud->nofetch) { >> + /* >> + * Run fetch only if `oid` isn't present or it >> + * is not reachable from a ref. >> + */ >> + if (!is_tip_reachable(ud->sm_path, &ud->oid)) >> + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && >> + !ud->quiet) >> + fprintf_ln(stderr, > > OK. Combining these into a single statement like > > if (!is_tip_reachable(...) && > fetch_in_submodule(...) && > !ud->quiet) > fprintf_ln(... > > would reduce the indentation level, but the way the conditional is > structured may convey the flow of the thought better, i.e. > > if we need to fetch, > try to fetch and if that fails, > report failure. > > On the other hand, if we take that line of thought to the extreme, > the check for !ud->quiet should belong to another level of if > statement so perhaps the more concise version I showed above might > be an overall win. I dunno. Right. I agree with you, mainly because there are many other predicates in my previous conversions that were done with short-circuited &&'s, so might as well stick to what has been my convention. >> + _("Unable to fetch in submodule path '%s'; " >> + "trying to directly fetch %s:"), >> + ud->displaypath, oid_to_hex(&ud->oid)); >> + /* >> + * Now we tried the usual fetch, but `oid` may >> + * not be reachable from any of the refs. >> + */ >> + if (!is_tip_reachable(ud->sm_path, &ud->oid)) >> + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, &ud->oid)) >> + die(_("Fetched in submodule path '%s', but it did not " >> + "contain %s. Direct fetching of that commit failed."), >> + ud->displaypath, oid_to_hex(&ud->oid)); > > Likewise. > >> + } >> [...] >> + >> + if ((!is_null_oid(&update_data.oid) && !is_null_oid(&update_data.suboid) && >> + !oideq(&update_data.oid, &update_data.suboid)) || >> + is_null_oid(&update_data.suboid) || update_data.force) >> + return do_run_update_procedure(&update_data); > > The original does the update procedure if $sha1 and $subsha1 are > different or if $force option is given. The rewritten seems to skip > the update when .oid is NULL and .suboid is not NULL; intended? Unintended. I initially implemented this with raw chars until I discovered the object_id API. So this was the result of me indiscriminately substituting NULL checks with the OID equivalents. I realise this is not needed anymore, and we can simplify that to: if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force) > I understand that the division of labour between this function and > do_run_update_procedure() is for the former to only exist to > interface with the script side by populating the update_data > structure, and the latter implements the logic to run update > procedure. I was a bit surprised that this conditional is > here, not at the very beginning of the callee. Ævar pointed out that since this function just had one caller, I could move the whole 'if' outside it for now, which would save me one level of indentation within that function, and make it easier to parse. With the conditional being simplified to what I showed above, I think it can still be justified? Even in the series that will follow this, we would still have only one caller for this function. >> + return 3; >> +} >> + >> static int resolve_relative_path(int argc, const char **argv, const char *prefix) >> { >> struct strbuf sb = STRBUF_INIT; >> @@ -2951,6 +3209,7 @@ static struct cmd_struct commands[] = { >> {"add-clone", add_clone, 0}, >> {"update-module-mode", module_update_module_mode, 0}, >> {"update-clone", update_clone, 0}, >> + {"run-update-procedure", run_update_procedure, 0}, >> {"ensure-core-worktree", ensure_core_worktree, 0}, >> {"relative-path", resolve_relative_path, 0}, >> {"resolve-relative-url", resolve_relative_url, 0}, >> diff --git a/git-submodule.sh b/git-submodule.sh >> index dbd2ec2050..d8e30d1afa 100755 >> --- a/git-submodule.sh >> +++ b/git-submodule.sh >> @@ -519,14 +512,13 @@ cmd_update() >> >> git submodule--helper ensure-core-worktree "$sm_path" || exit 1 >> >> - update_module=$(git submodule--helper update-module-mode $just_cloned "$sm_path" $update) >> - >> displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix") > > On the other side of the API boundary, update_data.displaypath is > populated by value computed in C. It is a bit unfortunate that we > still need to compute it here and risk the two to drift apart. Yes, and I had some trouble figuring out a clean separation boundary, and this compromise felt like the best one. The best I can do is assure you that the patch series following this change solves this issue entirely, as it moves all of the shell code you see here into the C helper, and thus we only compute this value once. So there should be no worry about drift, as I will not give any chance to introduce it at all :) >> if test $just_cloned -eq 1 >> then >> subsha1= >> else >> + just_cloned= >> subsha1=$(sanitize_submodule_env; cd "$sm_path" && >> git rev-parse --verify HEAD) || >> die "fatal: $(eval_gettext "Unable to find current revision in submodule path '\$displaypath'")" >> @@ -547,70 +539,38 @@ cmd_update() >> die "fatal: $(eval_gettext "Unable to find current \${remote_name}/\${branch} revision in submodule path '\$sm_path'")" >> fi >> >> - if test "$subsha1" != "$sha1" || test -n "$force" >> - then >> - subforce=$force >> - # If we don't already have a -f flag and the submodule has never been checked out >> - if test -z "$subsha1" && test -z "$force" >> - then >> - subforce="-f" >> - fi >> + out=$(git submodule--helper run-update-procedure \ >> + ${wt_prefix:+--prefix "$wt_prefix"} \ >> + ${GIT_QUIET:+--quiet} \ >> + ${force:+--force} \ >> + ${just_cloned:+--just-cloned} \ >> + ${nofetch:+--no-fetch} \ >> + ${depth:+"$depth"} \ >> + ${update:+--update "$update"} \ >> + ${prefix:+--recursive-prefix "$prefix"} \ >> + ${sha1:+--oid "$sha1"} \ >> + ${subsha1:+--suboid "$subsha1"} \ >> + "--" \ >> + "$sm_path") > > We'd just show errors directly to the standard error stream from > submodule--helper, but what comes from the printf in the switch > statement at the end of run_update_command() is captured in $out > variable. Notably, the messages from die()s in the second switch > statement in run_update_command() are not captured in $out here. > >> - if test -z "$nofetch" >> - then >> - # Run fetch only if $sha1 isn't present or it >> - # is not reachable from a ref. >> - is_tip_reachable "$sm_path" "$sha1" || >> - fetch_in_submodule "$sm_path" $depth || >> - say "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'; trying to directly fetch \$sha1:")" >> - >> - # Now we tried the usual fetch, but $sha1 may >> - # not be reachable from any of the refs >> - is_tip_reachable "$sm_path" "$sha1" || >> - fetch_in_submodule "$sm_path" "$depth" "$sha1" || >> - die "fatal: $(eval_gettext "Fetched in submodule path '\$displaypath', but it did not contain \$sha1. Direct fetching of that commit failed.")" >> - fi >> - >> - must_die_on_failure= >> - case "$update_module" in >> - checkout) >> - command="git checkout $subforce -q" >> - die_msg="fatal: $(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")" >> - say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")" >> - ;; >> - rebase) >> - command="git rebase ${GIT_QUIET:+--quiet}" >> - die_msg="fatal: $(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")" >> - say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")" >> - must_die_on_failure=yes >> - ;; >> - merge) >> - command="git merge ${GIT_QUIET:+--quiet}" >> - die_msg="fatal: $(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")" >> - say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")" >> - must_die_on_failure=yes >> - ;; >> - !*) >> - command="${update_module#!}" >> - die_msg="fatal: $(eval_gettext "Execution of '\$command \$sha1' failed in submodule path '\$displaypath'")" >> - say_msg="$(eval_gettext "Submodule path '\$displaypath': '\$command \$sha1'")" >> - must_die_on_failure=yes >> - ;; >> - *) >> - die "fatal: $(eval_gettext "Invalid update mode '$update_module' for submodule path '$path'")" >> - esac >> - >> - if (sanitize_submodule_env; cd "$sm_path" && $command "$sha1") >> - then >> - say "$say_msg" >> - elif test -n "$must_die_on_failure" >> - then >> - die_with_status 2 "$die_msg" >> - else >> - err="${err};$die_msg" >> - continue >> - fi >> - fi >> + # exit codes for run-update-procedure: >> + # 0: update was successful, say command output >> + # 128: subcommand died during execution >> + # 1: update procedure failed, but should not die >> + # 3: no update procedure was run >> + res="$?" >> + case $res in >> + 0) >> + say "$out" >> + ;; > > And the case where there is no error is quite straight-forward. We > just emit what we saw in the standard output stream of the helper. > >> + 128) >> + exit $res >> + ;; >> + 1) >> + err="${err};$out" > > This part is dubious. In the original, $err accumulates what is in > $die_msg, which are things like "fatal: Unable to rebase ...", but > with this patch, what used to be the contents of $die_msg are given > to die() after we see run_command() fail, and would have sent to the > standard error stream, not captured in $out here, no? Yes, this is bad. This error slipped past the test suite that I was too reliant on. Even if it did work, it would not have printed the error messages for the "checkout" mode. I will fix all of these issues. Printing to stdout with error return ought to fix it, as you suggested. >> + continue >> + ;; >> + esac >> >> if test -n "$recursive" >> then > > > Thanks. Thanks for the thorough review. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v4] submodule--helper: run update procedures from C 2021-08-13 7:56 ` [GSoC] [PATCH v3] " Atharva Raykar 2021-08-13 18:32 ` Junio C Hamano @ 2021-08-24 14:06 ` Atharva Raykar 2021-09-08 0:14 ` Junio C Hamano 1 sibling, 1 reply; 13+ messages in thread From: Atharva Raykar @ 2021-08-24 14:06 UTC (permalink / raw) To: git Cc: Atharva Raykar, Emily Shaffer, Jonathan Nieder, Junio C Hamano, Christian Couder, Shourya Shukla, Kaartic Sivaraam, Prathamesh Chavan Add a new submodule--helper subcommand `run-update-procedure` that runs the update procedure if the SHA1 of the submodule does not match what the superproject expects. This is an intermediate change that works towards total conversion of `submodule update` from shell to C. Specific error codes are returned so that the shell script calling the subcommand can take a decision on the control flow, and preserve the error messages across subsequent recursive calls of `cmd_update`. This change is more focused on doing a faithful conversion, so for now we are not too concerned with trying to reduce subprocess spawns. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Shourya Shukla <periperidip@gmail.com> Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> --- This patch addresses the concerns raised by Junio. The important changes are: * Fix error message handling of the output of 'run-update-procedure'. While at it, ensure the "checkout" mode error message is stored and printed appropriately. * In 'run_update_command()' switch from 'run_command()' to 'run_command_v_opt_cd_env()' to ensure quoted command update modes are handled correctly. * Code style and hygiene changes. * Introduce a NEEDSWORK comment, because the printf() and error return is correct only because the shell caller in the other end redirects it to the correct output stream. Once we switch this completely to C (ie, in the follow-up series), I need to remember to die() instead (or print to stderr) to reproduce the original behaviour. Range-diff against v3: 1: 2ff48f8790 ! 1: 2729834e43 submodule--helper: run update procedures from C @@ Commit message This change is more focused on doing a faithful conversion, so for now we are not too concerned with trying to reduce subprocess spawns. - Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Shourya Shukla <periperidip@gmail.com> + Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> ## builtin/submodule--helper.c ## @@ builtin/submodule--helper.c: struct submodule_update_clone { @@ builtin/submodule--helper.c: static int git_update_clone_config(const char *var, + +static int run_update_command(struct update_data *ud, int subforce) +{ -+ struct child_process cp = CHILD_PROCESS_INIT; ++ struct strvec args = STRVEC_INIT; ++ struct strvec child_env = STRVEC_INIT; + char *oid = oid_to_hex(&ud->oid); + int must_die_on_failure = 0; ++ int git_cmd; + -+ cp.dir = xstrdup(ud->sm_path); + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: -+ cp.git_cmd = 1; -+ strvec_pushl(&cp.args, "checkout", "-q", NULL); ++ git_cmd = 1; ++ strvec_pushl(&args, "checkout", "-q", NULL); + if (subforce) -+ strvec_push(&cp.args, "-f"); ++ strvec_push(&args, "-f"); + break; + case SM_UPDATE_REBASE: -+ cp.git_cmd = 1; -+ strvec_push(&cp.args, "rebase"); ++ git_cmd = 1; ++ strvec_push(&args, "rebase"); + if (ud->quiet) -+ strvec_push(&cp.args, "--quiet"); ++ strvec_push(&args, "--quiet"); + must_die_on_failure = 1; + break; + case SM_UPDATE_MERGE: -+ cp.git_cmd = 1; -+ strvec_push(&cp.args, "merge"); ++ git_cmd = 1; ++ strvec_push(&args, "merge"); + if (ud->quiet) -+ strvec_push(&cp.args, "--quiet"); ++ strvec_push(&args, "--quiet"); + must_die_on_failure = 1; + break; + case SM_UPDATE_COMMAND: -+ /* NOTE: this does not handle quoted arguments */ -+ strvec_split(&cp.args, ud->update_strategy.command); ++ git_cmd = 0; ++ strvec_push(&args, ud->update_strategy.command); + must_die_on_failure = 1; + break; -+ case SM_UPDATE_NONE: -+ BUG("this should have been handled before. How did we reach here?"); -+ break; -+ case SM_UPDATE_UNSPECIFIED: -+ BUG("update strategy should have been specified"); ++ default: ++ BUG("unexpected update strategy type: %s", ++ submodule_strategy_to_string(&ud->update_strategy)); + } -+ -+ strvec_push(&cp.args, oid); -+ -+ prepare_submodule_repo_env(&cp.env_array); -+ -+ if (run_command(&cp)) { -+ if (must_die_on_failure) { -+ switch (ud->update_strategy.type) { -+ case SM_UPDATE_CHECKOUT: -+ die(_("Unable to checkout '%s' in submodule path '%s'"), -+ oid, ud->displaypath); -+ break; -+ case SM_UPDATE_REBASE: -+ die(_("Unable to rebase '%s' in submodule path '%s'"), -+ oid, ud->displaypath); -+ break; -+ case SM_UPDATE_MERGE: -+ die(_("Unable to merge '%s' in submodule path '%s'"), -+ oid, ud->displaypath); -+ break; -+ case SM_UPDATE_COMMAND: -+ die(_("Execution of '%s %s' failed in submodule path '%s'"), -+ ud->update_strategy.command, oid, ud->displaypath); -+ break; -+ case SM_UPDATE_NONE: -+ BUG("this should have been handled before. How did we reach here?"); -+ break; -+ case SM_UPDATE_UNSPECIFIED: -+ BUG("update strategy should have been specified"); -+ } ++ strvec_push(&args, oid); ++ ++ prepare_submodule_repo_env(&child_env); ++ if (run_command_v_opt_cd_env(args.v, git_cmd ? RUN_GIT_CMD : RUN_USING_SHELL, ++ ud->sm_path, child_env.v)) { ++ switch (ud->update_strategy.type) { ++ case SM_UPDATE_CHECKOUT: ++ printf(_("Unable to checkout '%s' in submodule path '%s'"), ++ oid, ud->displaypath); ++ break; ++ case SM_UPDATE_REBASE: ++ printf(_("Unable to rebase '%s' in submodule path '%s'"), ++ oid, ud->displaypath); ++ break; ++ case SM_UPDATE_MERGE: ++ printf(_("Unable to merge '%s' in submodule path '%s'"), ++ oid, ud->displaypath); ++ break; ++ case SM_UPDATE_COMMAND: ++ printf(_("Execution of '%s %s' failed in submodule path '%s'"), ++ ud->update_strategy.command, oid, ud->displaypath); ++ break; ++ default: ++ BUG("unexpected update strategy type: %s", ++ submodule_strategy_to_string(&ud->update_strategy)); + } + /* -+ * This signifies to the caller in shell that -+ * the command failed without dying ++ * NEEDSWORK: We are currently printing to stdout with error ++ * return so that the shell caller handles the error output ++ * properly. Once we start handling the error messages within ++ * C, we should use die() instead. ++ */ ++ if (must_die_on_failure) ++ return 2; ++ /* ++ * This signifies to the caller in shell that the command ++ * failed without dying + */ + return 1; + } @@ builtin/submodule--helper.c: static int git_update_clone_config(const char *var, + printf(_("Submodule path '%s': '%s %s'\n"), + ud->displaypath, ud->update_strategy.command, oid); + break; -+ case SM_UPDATE_NONE: -+ BUG("this should have been handled before. How did we reach here?"); -+ break; -+ case SM_UPDATE_UNSPECIFIED: -+ BUG("update strategy should have been specified"); ++ default: ++ BUG("unexpected update strategy type: %s", ++ submodule_strategy_to_string(&ud->update_strategy)); + } + + return 0; @@ builtin/submodule--helper.c: static int git_update_clone_config(const char *var, + * Run fetch only if `oid` isn't present or it + * is not reachable from a ref. + */ -+ if (!is_tip_reachable(ud->sm_path, &ud->oid)) -+ if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && -+ !ud->quiet) -+ fprintf_ln(stderr, -+ _("Unable to fetch in submodule path '%s'; " -+ "trying to directly fetch %s:"), -+ ud->displaypath, oid_to_hex(&ud->oid)); ++ if (!is_tip_reachable(ud->sm_path, &ud->oid) && ++ fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && ++ !ud->quiet) ++ fprintf_ln(stderr, ++ _("Unable to fetch in submodule path '%s'; " ++ "trying to directly fetch %s:"), ++ ud->displaypath, oid_to_hex(&ud->oid)); + /* + * Now we tried the usual fetch, but `oid` may + * not be reachable from any of the refs. + */ -+ if (!is_tip_reachable(ud->sm_path, &ud->oid)) -+ if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, &ud->oid)) -+ die(_("Fetched in submodule path '%s', but it did not " -+ "contain %s. Direct fetching of that commit failed."), -+ ud->displaypath, oid_to_hex(&ud->oid)); ++ if (!is_tip_reachable(ud->sm_path, &ud->oid) && ++ fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, &ud->oid)) ++ die(_("Fetched in submodule path '%s', but it did not " ++ "contain %s. Direct fetching of that commit failed."), ++ ud->displaypath, oid_to_hex(&ud->oid)); + } + + return run_update_command(ud, subforce); @@ builtin/submodule--helper.c: static int update_clone(int argc, const char **argv + + free(prefixed_path); + -+ if ((!is_null_oid(&update_data.oid) && !is_null_oid(&update_data.suboid) && -+ !oideq(&update_data.oid, &update_data.suboid)) || -+ is_null_oid(&update_data.suboid) || update_data.force) ++ if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force) + return do_run_update_procedure(&update_data); + + return 3; @@ git-submodule.sh: cmd_update() + + # exit codes for run-update-procedure: + # 0: update was successful, say command output -+ # 128: subcommand died during execution + # 1: update procedure failed, but should not die ++ # 2 or 128: subcommand died during execution + # 3: no update procedure was run + res="$?" + case $res in + 0) + say "$out" + ;; -+ 128) -+ exit $res -+ ;; + 1) -+ err="${err};$out" ++ err="${err};fatal: $out" + continue + ;; ++ 2|128) ++ die_with_status $res "fatal: $out" ++ ;; + esac if test -n "$recursive" builtin/submodule--helper.c | 257 ++++++++++++++++++++++++++++++++++++ git-submodule.sh | 104 +++++---------- 2 files changed, 289 insertions(+), 72 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index ef2776a9e4..80619361fc 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -2045,6 +2045,20 @@ struct submodule_update_clone { .max_jobs = 1, \ } +struct update_data { + const char *recursive_prefix; + const char *sm_path; + const char *displaypath; + struct object_id oid; + struct object_id suboid; + struct submodule_update_strategy update_strategy; + int depth; + unsigned int force: 1; + unsigned int quiet: 1; + unsigned int nofetch: 1; + unsigned int just_cloned: 1; +}; +#define UPDATE_DATA_INIT { .update_strategy = SUBMODULE_UPDATE_STRATEGY_INIT } static void next_submodule_warn_missing(struct submodule_update_clone *suc, struct strbuf *out, const char *displaypath) @@ -2298,6 +2312,181 @@ static int git_update_clone_config(const char *var, const char *value, return 0; } +static int is_tip_reachable(const char *path, struct object_id *oid) +{ + struct child_process cp = CHILD_PROCESS_INIT; + struct strbuf rev = STRBUF_INIT; + char *hex = oid_to_hex(oid); + + cp.git_cmd = 1; + cp.dir = xstrdup(path); + cp.no_stderr = 1; + strvec_pushl(&cp.args, "rev-list", "-n", "1", hex, "--not", "--all", NULL); + + prepare_submodule_repo_env(&cp.env_array); + + if (capture_command(&cp, &rev, GIT_MAX_HEXSZ + 1) || rev.len) + return 0; + + return 1; +} + +static int fetch_in_submodule(const char *module_path, int depth, int quiet, struct object_id *oid) +{ + struct child_process cp = CHILD_PROCESS_INIT; + + prepare_submodule_repo_env(&cp.env_array); + cp.git_cmd = 1; + cp.dir = xstrdup(module_path); + + strvec_push(&cp.args, "fetch"); + if (quiet) + strvec_push(&cp.args, "--quiet"); + if (depth) + strvec_pushf(&cp.args, "--depth=%d", depth); + if (oid) { + char *hex = oid_to_hex(oid); + char *remote = get_default_remote(); + strvec_pushl(&cp.args, remote, hex, NULL); + } + + return run_command(&cp); +} + +static int run_update_command(struct update_data *ud, int subforce) +{ + struct strvec args = STRVEC_INIT; + struct strvec child_env = STRVEC_INIT; + char *oid = oid_to_hex(&ud->oid); + int must_die_on_failure = 0; + int git_cmd; + + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: + git_cmd = 1; + strvec_pushl(&args, "checkout", "-q", NULL); + if (subforce) + strvec_push(&args, "-f"); + break; + case SM_UPDATE_REBASE: + git_cmd = 1; + strvec_push(&args, "rebase"); + if (ud->quiet) + strvec_push(&args, "--quiet"); + must_die_on_failure = 1; + break; + case SM_UPDATE_MERGE: + git_cmd = 1; + strvec_push(&args, "merge"); + if (ud->quiet) + strvec_push(&args, "--quiet"); + must_die_on_failure = 1; + break; + case SM_UPDATE_COMMAND: + git_cmd = 0; + strvec_push(&args, ud->update_strategy.command); + must_die_on_failure = 1; + break; + default: + BUG("unexpected update strategy type: %s", + submodule_strategy_to_string(&ud->update_strategy)); + } + strvec_push(&args, oid); + + prepare_submodule_repo_env(&child_env); + if (run_command_v_opt_cd_env(args.v, git_cmd ? RUN_GIT_CMD : RUN_USING_SHELL, + ud->sm_path, child_env.v)) { + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: + printf(_("Unable to checkout '%s' in submodule path '%s'"), + oid, ud->displaypath); + break; + case SM_UPDATE_REBASE: + printf(_("Unable to rebase '%s' in submodule path '%s'"), + oid, ud->displaypath); + break; + case SM_UPDATE_MERGE: + printf(_("Unable to merge '%s' in submodule path '%s'"), + oid, ud->displaypath); + break; + case SM_UPDATE_COMMAND: + printf(_("Execution of '%s %s' failed in submodule path '%s'"), + ud->update_strategy.command, oid, ud->displaypath); + break; + default: + BUG("unexpected update strategy type: %s", + submodule_strategy_to_string(&ud->update_strategy)); + } + /* + * NEEDSWORK: We are currently printing to stdout with error + * return so that the shell caller handles the error output + * properly. Once we start handling the error messages within + * C, we should use die() instead. + */ + if (must_die_on_failure) + return 2; + /* + * This signifies to the caller in shell that the command + * failed without dying + */ + return 1; + } + + switch (ud->update_strategy.type) { + case SM_UPDATE_CHECKOUT: + printf(_("Submodule path '%s': checked out '%s'\n"), + ud->displaypath, oid); + break; + case SM_UPDATE_REBASE: + printf(_("Submodule path '%s': rebased into '%s'\n"), + ud->displaypath, oid); + break; + case SM_UPDATE_MERGE: + printf(_("Submodule path '%s': merged in '%s'\n"), + ud->displaypath, oid); + break; + case SM_UPDATE_COMMAND: + printf(_("Submodule path '%s': '%s %s'\n"), + ud->displaypath, ud->update_strategy.command, oid); + break; + default: + BUG("unexpected update strategy type: %s", + submodule_strategy_to_string(&ud->update_strategy)); + } + + return 0; +} + +static int do_run_update_procedure(struct update_data *ud) +{ + int subforce = is_null_oid(&ud->suboid) || ud->force; + + if (!ud->nofetch) { + /* + * Run fetch only if `oid` isn't present or it + * is not reachable from a ref. + */ + if (!is_tip_reachable(ud->sm_path, &ud->oid) && + fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) && + !ud->quiet) + fprintf_ln(stderr, + _("Unable to fetch in submodule path '%s'; " + "trying to directly fetch %s:"), + ud->displaypath, oid_to_hex(&ud->oid)); + /* + * Now we tried the usual fetch, but `oid` may + * not be reachable from any of the refs. + */ + if (!is_tip_reachable(ud->sm_path, &ud->oid) && + fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, &ud->oid)) + die(_("Fetched in submodule path '%s', but it did not " + "contain %s. Direct fetching of that commit failed."), + ud->displaypath, oid_to_hex(&ud->oid)); + } + + return run_update_command(ud, subforce); +} + static void update_submodule(struct update_clone_data *ucd) { fprintf(stdout, "dummy %s %d\t%s\n", @@ -2395,6 +2584,73 @@ static int update_clone(int argc, const char **argv, const char *prefix) return update_submodules(&suc); } +static int run_update_procedure(int argc, const char **argv, const char *prefix) +{ + int force = 0, quiet = 0, nofetch = 0, just_cloned = 0; + char *prefixed_path, *update = NULL; + struct update_data update_data = UPDATE_DATA_INIT; + + struct option options[] = { + OPT__QUIET(&quiet, N_("suppress output for update by rebase or merge")), + OPT__FORCE(&force, N_("force checkout updates"), 0), + OPT_BOOL('N', "no-fetch", &nofetch, + N_("don't fetch new objects from the remote site")), + OPT_BOOL(0, "just-cloned", &just_cloned, + N_("overrides update mode in case the repository is a fresh clone")), + OPT_INTEGER(0, "depth", &update_data.depth, N_("depth for shallow fetch")), + OPT_STRING(0, "prefix", &prefix, + N_("path"), + N_("path into the working tree")), + OPT_STRING(0, "update", &update, + N_("string"), + N_("rebase, merge, checkout or none")), + OPT_STRING(0, "recursive-prefix", &update_data.recursive_prefix, N_("path"), + N_("path into the working tree, across nested " + "submodule boundaries")), + OPT_CALLBACK_F(0, "oid", &update_data.oid, N_("sha1"), + N_("SHA1 expected by superproject"), PARSE_OPT_NONEG, + parse_opt_object_id), + OPT_CALLBACK_F(0, "suboid", &update_data.suboid, N_("subsha1"), + N_("SHA1 of submodule's HEAD"), PARSE_OPT_NONEG, + parse_opt_object_id), + OPT_END() + }; + + const char *const usage[] = { + N_("git submodule--helper run-update-procedure [<options>] <path>"), + NULL + }; + + argc = parse_options(argc, argv, prefix, options, usage, 0); + + if (argc != 1) + usage_with_options(usage, options); + + update_data.force = !!force; + update_data.quiet = !!quiet; + update_data.nofetch = !!nofetch; + update_data.just_cloned = !!just_cloned; + update_data.sm_path = argv[0]; + + if (update_data.recursive_prefix) + prefixed_path = xstrfmt("%s%s", update_data.recursive_prefix, update_data.sm_path); + else + prefixed_path = xstrdup(update_data.sm_path); + + update_data.displaypath = get_submodule_displaypath(prefixed_path, prefix); + + determine_submodule_update_strategy(the_repository, update_data.just_cloned, + update_data.sm_path, update, + &update_data.update_strategy); + + free(prefixed_path); + + if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force) + return do_run_update_procedure(&update_data); + + return 3; +} + static int resolve_relative_path(int argc, const char **argv, const char *prefix) { struct strbuf sb = STRBUF_INIT; @@ -2951,6 +3207,7 @@ static struct cmd_struct commands[] = { {"add-clone", add_clone, 0}, {"update-module-mode", module_update_module_mode, 0}, {"update-clone", update_clone, 0}, + {"run-update-procedure", run_update_procedure, 0}, {"ensure-core-worktree", ensure_core_worktree, 0}, {"relative-path", resolve_relative_path, 0}, {"resolve-relative-url", resolve_relative_url, 0}, diff --git a/git-submodule.sh b/git-submodule.sh index dbd2ec2050..f703cddce8 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -369,13 +369,6 @@ cmd_deinit() git ${wt_prefix:+-C "$wt_prefix"} submodule--helper deinit ${GIT_QUIET:+--quiet} ${force:+--force} ${deinit_all:+--all} -- "$@" } -is_tip_reachable () ( - sanitize_submodule_env && - cd "$1" && - rev=$(git rev-list -n 1 "$2" --not --all 2>/dev/null) && - test -z "$rev" -) - # usage: fetch_in_submodule <module_path> [<depth>] [<sha1>] # Because arguments are positional, use an empty string to omit <depth> # but include <sha1>. @@ -519,14 +512,13 @@ cmd_update() git submodule--helper ensure-core-worktree "$sm_path" || exit 1 - update_module=$(git submodule--helper update-module-mode $just_cloned "$sm_path" $update) - displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix") if test $just_cloned -eq 1 then subsha1= else + just_cloned= subsha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD) || die "fatal: $(eval_gettext "Unable to find current revision in submodule path '\$displaypath'")" @@ -547,70 +539,38 @@ cmd_update() die "fatal: $(eval_gettext "Unable to find current \${remote_name}/\${branch} revision in submodule path '\$sm_path'")" fi - if test "$subsha1" != "$sha1" || test -n "$force" - then - subforce=$force - # If we don't already have a -f flag and the submodule has never been checked out - if test -z "$subsha1" && test -z "$force" - then - subforce="-f" - fi + out=$(git submodule--helper run-update-procedure \ + ${wt_prefix:+--prefix "$wt_prefix"} \ + ${GIT_QUIET:+--quiet} \ + ${force:+--force} \ + ${just_cloned:+--just-cloned} \ + ${nofetch:+--no-fetch} \ + ${depth:+"$depth"} \ + ${update:+--update "$update"} \ + ${prefix:+--recursive-prefix "$prefix"} \ + ${sha1:+--oid "$sha1"} \ + ${subsha1:+--suboid "$subsha1"} \ + "--" \ + "$sm_path") - if test -z "$nofetch" - then - # Run fetch only if $sha1 isn't present or it - # is not reachable from a ref. - is_tip_reachable "$sm_path" "$sha1" || - fetch_in_submodule "$sm_path" $depth || - say "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'; trying to directly fetch \$sha1:")" - - # Now we tried the usual fetch, but $sha1 may - # not be reachable from any of the refs - is_tip_reachable "$sm_path" "$sha1" || - fetch_in_submodule "$sm_path" "$depth" "$sha1" || - die "fatal: $(eval_gettext "Fetched in submodule path '\$displaypath', but it did not contain \$sha1. Direct fetching of that commit failed.")" - fi - - must_die_on_failure= - case "$update_module" in - checkout) - command="git checkout $subforce -q" - die_msg="fatal: $(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")" - ;; - rebase) - command="git rebase ${GIT_QUIET:+--quiet}" - die_msg="fatal: $(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")" - must_die_on_failure=yes - ;; - merge) - command="git merge ${GIT_QUIET:+--quiet}" - die_msg="fatal: $(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")" - must_die_on_failure=yes - ;; - !*) - command="${update_module#!}" - die_msg="fatal: $(eval_gettext "Execution of '\$command \$sha1' failed in submodule path '\$displaypath'")" - say_msg="$(eval_gettext "Submodule path '\$displaypath': '\$command \$sha1'")" - must_die_on_failure=yes - ;; - *) - die "fatal: $(eval_gettext "Invalid update mode '$update_module' for submodule path '$path'")" - esac - - if (sanitize_submodule_env; cd "$sm_path" && $command "$sha1") - then - say "$say_msg" - elif test -n "$must_die_on_failure" - then - die_with_status 2 "$die_msg" - else - err="${err};$die_msg" - continue - fi - fi + # exit codes for run-update-procedure: + # 0: update was successful, say command output + # 1: update procedure failed, but should not die + # 2 or 128: subcommand died during execution + # 3: no update procedure was run + res="$?" + case $res in + 0) + say "$out" + ;; + 1) + err="${err};fatal: $out" + continue + ;; + 2|128) + die_with_status $res "fatal: $out" + ;; + esac if test -n "$recursive" then -- 2.32.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v4] submodule--helper: run update procedures from C 2021-08-24 14:06 ` [PATCH v4] " Atharva Raykar @ 2021-09-08 0:14 ` Junio C Hamano 0 siblings, 0 replies; 13+ messages in thread From: Junio C Hamano @ 2021-09-08 0:14 UTC (permalink / raw) To: git Cc: Atharva Raykar, Emily Shaffer, Jonathan Nieder, Christian Couder, Shourya Shukla, Kaartic Sivaraam, Prathamesh Chavan Atharva Raykar <raykar.ath@gmail.com> writes: > * Fix error message handling of the output of 'run-update-procedure'. While at > it, ensure the "checkout" mode error message is stored and printed > appropriately. > > * In 'run_update_command()' switch from 'run_command()' to > 'run_command_v_opt_cd_env()' to ensure quoted command update modes are handled > correctly. > > * Code style and hygiene changes. > > * Introduce a NEEDSWORK comment, because the printf() and error return is > correct only because the shell caller in the other end redirects it to the > correct output stream. Once we switch this completely to C (ie, in the > follow-up series), I need to remember to die() instead (or print to stderr) to > reproduce the original behaviour. I didn't see anybody comment on this round (and do not think I saw anything glaringly wrong). Is everybody happy with this version? I am about to mark it for 'next' in the next issue of "What's cooking" report, so please holler if I should wait. Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2021-09-08 0:14 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-07-22 13:40 [GSoC] [PATCH] submodule--helper: run update procedures from C Atharva Raykar 2021-07-23 9:37 ` Ævar Arnfjörð Bjarmason 2021-07-23 16:59 ` Atharva Raykar 2021-08-04 8:35 ` Atharva Raykar 2021-08-02 13:06 ` [GSoC] [PATCH v2] " Atharva Raykar 2021-08-02 18:50 ` Shourya Shukla 2021-08-03 8:46 ` Atharva Raykar 2021-08-03 10:07 ` Atharva Raykar 2021-08-13 7:56 ` [GSoC] [PATCH v3] " Atharva Raykar 2021-08-13 18:32 ` Junio C Hamano 2021-08-24 8:58 ` Atharva Raykar 2021-08-24 14:06 ` [PATCH v4] " Atharva Raykar 2021-09-08 0:14 ` Junio C Hamano
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).