From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: jrnieder@gmail.com, jonathantanmy@google.com, sluongng@gmail.com, congdanhqx@gmail.com, "SZEDER Gábor" <szeder.dev@gmail.com>, "Derrick Stolee" <stolee@gmail.com>, "Đoàn Trần Công Danh" <congdanhqx@gmail.com>, "Derrick Stolee" <derrickstolee@github.com>
Subject: [PATCH v3 0/7] Maintenance III: Background maintenance
Date: Mon, 05 Oct 2020 12:57:07 +0000
Message-ID: <pull.724.v3.git.1601902635.gitgitgadget@gmail.com>
In-Reply-To: <pull.724.v2.git.1599846560.gitgitgadget@gmail.com>

This is based on ds/maintenance-part-2 and replaces the RFC from [1].

[1] https://lore.kernel.org/git/pull.680.v3.git.1598629517.gitgitgadget@gmail.com/

This series introduces background maintenance to Git, through an
integration with cron and crontab.

Some preliminary work is done to allow a new --schedule option that tells
the command which tasks to run based on a maintenance.<task>.schedule
config option. The timing is not enforced by Git, but instead is expected
to be provided as a hint from a cron schedule. The options are "hourly",
"daily", and "weekly".

A new for-each-repo builtin runs Git commands on every repo in a given
list. Currently, the list is stored as a config setting, allowing a new
maintenance.repo config list to store the repositories registered for
background maintenance. Others may want to add a --file=<file> option for
their own workflows, but I focused on making this as simple as possible
for now.

The updates to the git maintenance builtin include new register/unregister
subcommands and start/stop subcommands. The register subcommand
initializes the config while the start subcommand does everything register
does and also updates the cron table. The unregister and stop commands
reverse this process.

A troubleshooting guide is added to Documentation/git-maintenance.txt to
advise expert users who choose to create custom cron schedules.
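
As a rough illustration of how --schedule could select tasks, here is a
minimal sketch in C. It is not the code in builtin/gc.c: the function names
parse_schedule() and task_is_due() are made up for this example, and the
ordering of the enum values is an assumption chosen to match the
"--schedule inheritance weekly -> daily -> hourly" test title, where a
weekly run also covers the daily and hourly tasks.

```c
#include <assert.h>
#include <string.h>

/*
 * Assumed ordering: a rarer run has a smaller value, so it "inherits"
 * every task whose configured schedule is more frequent.
 */
enum schedule_priority {
	SCHEDULE_NONE = 0,
	SCHEDULE_WEEKLY = 1,
	SCHEDULE_DAILY = 2,
	SCHEDULE_HOURLY = 3
};

/*
 * Hypothetical helper: map a config or command-line value to a priority.
 * Unknown or missing values map to SCHEDULE_NONE. This sketch compares
 * case-sensitively.
 */
enum schedule_priority parse_schedule(const char *value)
{
	if (!value)
		return SCHEDULE_NONE;
	if (!strcmp(value, "hourly"))
		return SCHEDULE_HOURLY;
	if (!strcmp(value, "daily"))
		return SCHEDULE_DAILY;
	if (!strcmp(value, "weekly"))
		return SCHEDULE_WEEKLY;
	return SCHEDULE_NONE;
}

/*
 * Hypothetical helper: decide whether a task with the given configured
 * schedule should run when the command was started with `requested`.
 * Without --schedule (SCHEDULE_NONE) the schedule filters nothing.
 */
int task_is_due(enum schedule_priority task_schedule,
		enum schedule_priority requested)
{
	assert(requested <= SCHEDULE_HOURLY);

	if (requested == SCHEDULE_NONE)
		return 1;
	return task_schedule >= requested;
}
```

Under these assumptions, a task configured as "hourly" is due in a run
started with --schedule=weekly, while a task configured as "weekly" is
skipped in a run started with --schedule=hourly. A task with no schedule
at all is skipped by every scheduled run.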

The very last patch is entirely optional. It sets a recommended schedule
based on my own experience with very large repositories. I'm open to other
suggestions, but these are ones that I think work well and don't cause a
"rewrite the world" scenario like running nightly 'gc' would do.

I've been testing this scenario on my macOS laptop and Linux desktop. I
have modified my cron task to provide logging via trace2 so I can see
what's happening. A future direction here would be to add some maintenance
logs to the repository so we can track what is happening and diagnose
whether the maintenance strategy is working on real repos.

Note: git maintenance (start|stop) only works on machines with cron by
design. The proper thing to do on Windows will come later. Perhaps this
command should be marked as unavailable on Windows somehow, or at least a
better error than "cron may not be available on your system". I did find
that that message is helpful sometimes: macOS worker agents for CI builds
typically do not have cron available.

Updates in v3:

 * Instead of writing config upon "register" or "start", simply create an
   in-memory default schedule when no .schedule or .enabled configs are
   present. Thanks, Martin! This causes patch 6 to look so different that
   the range-diff considers it a dropped-and-added patch instead of
   showing a diff.

 * There are some context lines that changed because this is rebased onto
   a recent version of ds/maintenance-part-2.

Updates in v2:

 * Fixed the char/int issue in test-tool crontab, and a typo.

 * Updated commit message and patch noise in PATCH 2

 * This should fix the test failures, allowing this to be picked up in
   'seen'.
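
The in-memory default schedule described in the v3 notes can be sketched
roughly as follows. This is a simplified model under stated assumptions,
not the patch itself: has_schedule_config(), set_recommended_schedule() and
the TASK_* names mirror identifiers that appear in the range-diff, but the
struct layout, the `configured` flag (standing in for a config lookup) and
init_tasks() are hypothetical. The starting state, with only 'gc' enabled,
is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum schedule_priority {
	SCHEDULE_NONE = 0,
	SCHEDULE_WEEKLY = 1,
	SCHEDULE_DAILY = 2,
	SCHEDULE_HOURLY = 3
};

enum task_id {
	TASK_GC,
	TASK_COMMIT_GRAPH,
	TASK_PREFETCH,
	TASK_LOOSE_OBJECTS,
	TASK_INCREMENTAL_REPACK,
	TASK__COUNT
};

struct task {
	const char *name;
	int enabled;
	enum schedule_priority schedule;
	/* Hypothetical: nonzero if the user set .schedule or .enabled. */
	int configured;
};

/* Hypothetical setup: only 'gc' starts enabled, nothing is scheduled. */
void init_tasks(struct task *tasks)
{
	static const char *names[TASK__COUNT] = {
		"gc", "commit-graph", "prefetch",
		"loose-objects", "incremental-repack"
	};
	int i;

	assert(tasks != NULL);
	for (i = 0; i < TASK__COUNT; i++) {
		tasks[i].name = names[i];
		tasks[i].enabled = (i == TASK_GC);
		tasks[i].schedule = SCHEDULE_NONE;
		tasks[i].configured = 0;
	}
}

/* Returns 1 if any task carries user-provided schedule/enabled config. */
int has_schedule_config(const struct task *tasks)
{
	int i;

	for (i = 0; i < TASK__COUNT; i++)
		if (tasks[i].configured)
			return 1;
	return 0;
}

/*
 * Apply the default schedule in memory only. Nothing is written to any
 * config file, and any user config leaves the tasks untouched.
 */
void set_recommended_schedule(struct task *tasks)
{
	if (has_schedule_config(tasks))
		return;

	tasks[TASK_GC].enabled = 0;

	tasks[TASK_PREFETCH].enabled = 1;
	tasks[TASK_PREFETCH].schedule = SCHEDULE_HOURLY;

	tasks[TASK_COMMIT_GRAPH].enabled = 1;
	tasks[TASK_COMMIT_GRAPH].schedule = SCHEDULE_HOURLY;

	tasks[TASK_LOOSE_OBJECTS].enabled = 1;
	tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY;

	tasks[TASK_INCREMENTAL_REPACK].enabled = 1;
	tasks[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY;
}
```

Because the defaults live only in the task table of the running process, a
later version could change them without leaving stale values behind in
anyone's config, which is the point made in the v3 notes.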

Derrick Stolee (7):
  maintenance: optionally skip --auto process
  maintenance: add --schedule option and config
  for-each-repo: run subcommands on configured repos
  maintenance: add [un]register subcommands
  maintenance: add start/stop subcommands
  maintenance: use default schedule if not configured
  maintenance: add troubleshooting guide to docs

 .gitignore                           |   1 +
 Documentation/config/maintenance.txt |  10 +
 Documentation/git-for-each-repo.txt  |  59 ++++++
 Documentation/git-maintenance.txt    |  97 ++++++++-
 Makefile                             |   2 +
 builtin.h                            |   1 +
 builtin/for-each-repo.c              |  58 ++++++
 builtin/gc.c                         | 301 ++++++++++++++++++++++++++-
 command-list.txt                     |   1 +
 git.c                                |   1 +
 run-command.c                        |   6 +
 t/helper/test-crontab.c              |  35 ++++
 t/helper/test-tool.c                 |   1 +
 t/helper/test-tool.h                 |   1 +
 t/t0068-for-each-repo.sh             |  30 +++
 t/t7900-maintenance.sh               | 101 ++++++++-
 t/test-lib.sh                        |   6 +
 17 files changed, 705 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100644 t/helper/test-crontab.c
 create mode 100755 t/t0068-for-each-repo.sh

base-commit: e841a79a131d8ce491cf04d0ca3e24f139a10b82
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-724%2Fderrickstolee%2Fmaintenance%2Fscheduled-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-724/derrickstolee/maintenance/scheduled-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/724

Range-diff vs v2:

 1:  b21cd68c90 = 1:  02e7286dba maintenance: optionally skip --auto process
 2:  e2d14d66d4 ! 2:  dae8c04bb5 maintenance: add --schedule option and config
    @@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char

      ## t/t7900-maintenance.sh ##
    @@ t/t7900-maintenance.sh: test_expect_success 'maintenance.incremental-repack.auto' '
    - done
    + test_subcommand git multi-pack-index write --no-progress <trace-B
      '

    +test_expect_success '--auto and --schedule incompatible' '
 3:  41a346dfbb = 3:  dd92379273 for-each-repo: run subcommands on configured repos
 4:  1f49cda18e ! 4:  922b984c8a maintenance: add [un]register subcommands
    @@ t/t7900-maintenance.sh: GIT_TEST_MULTI_PACK_INDEX=0
    - test_i18ngrep "usage: git maintenance run" err &&
    + test_i18ngrep "usage: git maintenance <subcommand>" err &&
      test_expect_code 128 git maintenance barf 2>err &&
    - test_i18ngrep "invalid subcommand: barf" err
    - '
    + test_i18ngrep "invalid subcommand: barf" err &&
    + test_expect_code 129 git maintenance 2>err &&
    @@ t/t7900-maintenance.sh: test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
      test_subcommand git multi-pack-index write --no-progress <weekly.txt
      '
 5:  e9b2a39c1d ! 5:  5194f6b1fa maintenance: add start/stop subcommands
    @@ Makefile: TEST_BUILTINS_OBJS += test-advise.o

      ## builtin/gc.c ##
    @@
    + #include "refs.h"
      #include "remote.h"
    - #include "midx.h"
      #include "object-store.h"
    +#include "exec-cmd.h"
 6:  f609c1bde2 ! 6:  d833fffe89 maintenance: recommended schedule in register/start
    @@ Metadata
      Author: Derrick Stolee <dstolee@microsoft.com>

      ## Commit message ##
    - maintenance: recommended schedule in register/start
    + maintenance: use default schedule if not configured

      The 'git maintenance (register|start)' subcommands add the current
      repository to the global Git config so maintenance will operate on that
    @@ Commit message
      If a user sets any 'maintenance.<task>.schedule' config value, then
      they have chosen a specific schedule for themselves and Git should
    - respect that.
    + respect that when running 'git maintenance run --schedule=<frequency>'.

    - However, in an effort to recommend a good schedule for repositories of
    - all sizes, set new config values for recommended tasks that are safe to
    - run in the background while users run foreground Git commands. These
    - commands are generally everything but the 'gc' task.
    + To make this process extremely simple for users, assume a default
    + schedule when no 'maintenance.<task>.schedule' or '...enabled' config
    + settings are concretely set. This is only an in-process assumption, so
    + future versions of Git could adjust this expected schedule.

    + Helped-by: Martin Ågren <martin.agren@gmail.com>
      Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

      ## Documentation/git-maintenance.txt ##
    @@ Documentation/git-maintenance.txt: register::
      for running in the background without disrupting foreground
      processes.
    ++
    -+If your repository has no 'maintenance.<task>.schedule' configuration
    -+values set, then Git will set configuration values to some recommended
    -+settings. These settings disable foreground maintenance while performing
    -+maintenance tasks in the background that will not interrupt foreground Git
    -+operations.
    ++If your repository has no `maintenance.<task>.schedule` configuration
    ++values set, then Git will use a recommended default schedule that performs
    ++background maintenance that will not interrupt foreground commands. The
    ++default schedule is as follows:
    +++
    ++* `gc`: disabled.
    ++* `commit-graph`: hourly.
    ++* `prefetch`: hourly.
    ++* `loose-objects`: daily.
    ++* `incremental-repack`: daily.
    +++
    ++`git maintenance register` will also disable foreground maintenance by
    ++setting `maintenance.auto = false` in the current repository. This config
    ++setting will remain after a `git maintenance unregister` command.

      run::
      Run one or more maintenance tasks. If one or more `--task` options

      ## builtin/gc.c ##
    -@@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char *prefix)
    - return maintenance_run_tasks(&opts);
    +@@ builtin/gc.c: static int compare_tasks_by_selection(const void *a_, const void *b_)
    + return b->selected_order - a->selected_order;
      }

    +static int has_schedule_config(void)
    @@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char
    + found = 1;
    + FREE_AND_NULL(value);
    + }
    ++
    ++ strbuf_setlen(&config_name, prefix);
    ++ strbuf_addf(&config_name, "%s.enabled", tasks[i].name);
    ++
    ++ if (!git_config_get_string(config_name.buf, &value)) {
    ++ found = 1;
    ++ FREE_AND_NULL(value);
    ++ }
    + }
    +
    + strbuf_release(&config_name);
    @@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char
    +
    +static void set_recommended_schedule(void)
    +{
    -+ git_config_set("maintenance.auto", "false");
    -+ git_config_set("maintenance.gc.enabled", "false");
    ++ if (has_schedule_config())
    ++ return;
    ++
    ++ tasks[TASK_GC].enabled = 0;
    +
    -+ git_config_set("maintenance.prefetch.enabled", "true");
    -+ git_config_set("maintenance.prefetch.schedule", "hourly");
    ++ tasks[TASK_PREFETCH].enabled = 1;
    ++ tasks[TASK_PREFETCH].schedule = SCHEDULE_HOURLY;
    +
    -+ git_config_set("maintenance.commit-graph.enabled", "true");
    -+ git_config_set("maintenance.commit-graph.schedule", "hourly");
    ++ tasks[TASK_COMMIT_GRAPH].enabled = 1;
    ++ tasks[TASK_COMMIT_GRAPH].schedule = SCHEDULE_HOURLY;
    +
    -+ git_config_set("maintenance.loose-objects.enabled", "true");
    -+ git_config_set("maintenance.loose-objects.schedule", "daily");
    ++ tasks[TASK_LOOSE_OBJECTS].enabled = 1;
    ++ tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY;
    +
    -+ git_config_set("maintenance.incremental-repack.enabled", "true");
    -+ git_config_set("maintenance.incremental-repack.schedule", "daily");
    ++ tasks[TASK_INCREMENTAL_REPACK].enabled = 1;
    ++ tasks[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY;
    +}
    +
    - static int maintenance_register(void)
    + static int maintenance_run_tasks(struct maintenance_run_opts *opts)
      {
    - struct child_process config_set = CHILD_PROCESS_INIT;
    + int i, found_selected = 0;
    +@@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
    +
    + if (found_selected)
    + QSORT(tasks, TASK__COUNT, compare_tasks_by_selection);
    ++ else if (opts->schedule != SCHEDULE_NONE)
    ++ set_recommended_schedule();
    +
    + for (i = 0; i < TASK__COUNT; i++) {
    + if (found_selected && tasks[i].selected_order < 0)
    @@ builtin/gc.c: static int maintenance_register(void)
      if (!the_repository || !the_repository->gitdir)
      return 0;

    -+ if (!has_schedule_config())
    -+ set_recommended_schedule();
    ++ /* Disable foreground maintenance */
    ++ git_config_set("maintenance.auto", "false");
    +
      config_get.git_cmd = 1;
      strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
    @@ t/t7900-maintenance.sh: test_expect_success 'register and unregister' '
      git config --global --add maintenance.repo /existing2 &&
      git config --global --get-all maintenance.repo >before &&
    +
    -+ # We still have maintenance.<task>.schedule config set,
    -+ # so this does not update the local schedule
    -+ git maintenance register &&
    -+ test_must_fail git config maintenance.auto &&
    -+
    -+ # Clear previous maintenance.<task>.schedule values
    -+ for task in loose-objects commit-graph incremental-repack
    -+ do
    -+ git config --unset maintenance.$task.schedule || return 1
    -+ done &&
      git maintenance register &&
    +- git config --global --get-all maintenance.repo >actual &&
    +- cp before after &&
    +- pwd >>after &&
    +- test_cmp after actual &&
    + test_cmp_config false maintenance.auto &&
    -+ test_cmp_config false maintenance.gc.enabled &&
    -+ test_cmp_config true maintenance.prefetch.enabled &&
    -+ test_cmp_config hourly maintenance.commit-graph.schedule &&
    -+ test_cmp_config daily maintenance.incremental-repack.schedule &&
    ++ git config --global --get-all maintenance.repo >between &&
    ++ cp before expect &&
    ++ pwd >>expect &&
    ++ test_cmp expect between &&
    ++
    + git maintenance unregister &&
      git config --global --get-all maintenance.repo >actual &&
    - cp before after &&
    - pwd >>after &&
    + test_cmp before actual
 7:  2344eff4ba = 7:  8e42ff44ce maintenance: add troubleshooting guide to docs

-- 
gitgitgadget