* [PATCH 0/2] Maintenance: add pack-refs task @ 2021-02-08 14:52 Derrick Stolee via GitGitGadget 2021-02-08 14:52 ` [PATCH 1/2] maintenance: " Derrick Stolee via GitGitGadget ` (3 more replies) 0 siblings, 4 replies; 12+ messages in thread From: Derrick Stolee via GitGitGadget @ 2021-02-08 14:52 UTC (permalink / raw) To: git; +Cc: gitster, sluongng, martin.agren, sunshine, Derrick Stolee This patch series adds a new pack-refs task to the maintenance builtin. This operation already happens within git gc (and hence the gc task) but it is easy to extract. Packing refs does not delete any data, only collects loose objects into a combined file. This makes things faster in subtle ways, especially when a command needs to iterate through refs (especially tags). Credit for inspiring this goes to Suolong, who asked for this to be added to Scalar [1]. I've been waiting instead to add it directly to Git and its background maintenance. Now is the time! [1] https://github.com/microsoft/scalar/issues/382 I chose to add it to the incremental maintenance strategy at a weekly cadence. I'm not sure there is significant value to the difference between weekly and daily. It just seems to me that weekly is often enough. Feel free to correct me if you have a different opinion. My hope is that this patch series could be used as an example for further extracting tasks out of the gc task and making them be full maintenance tasks. Doing more of these extractions could be a good project for a new contributor. One thing that is not implemented in this series is a notion of the behavior for the pack-refs task during git maintenance run --auto. This could be added in the future, but I wanted to focus on getting this behavior into the incremental maintenance schedule. Thanks, -Stolee Derrick Stolee (2): maintenance: add pack-refs task maintenance: incremental strategy runs pack-refs weekly Documentation/config/maintenance.txt | 5 +++-- Documentation/git-maintenance.txt | 6 ++++++ builtin/gc.c | 23 +++++++++++++++++++---- t/t7900-maintenance.sh | 24 ++++++++++++++++++++++++ 4 files changed, 52 insertions(+), 6 deletions(-) base-commit: fb7fa4a1fd273f22efcafdd13c7f897814fd1eb9 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-871%2Fderrickstolee%2Fmaintenance%2Fpack-refs-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-871/derrickstolee/maintenance/pack-refs-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/871 -- gitgitgadget ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] maintenance: add pack-refs task 2021-02-08 14:52 [PATCH 0/2] Maintenance: add pack-refs task Derrick Stolee via GitGitGadget @ 2021-02-08 14:52 ` Derrick Stolee via GitGitGadget 2021-02-08 22:53 ` Taylor Blau 2021-02-08 23:06 ` Eric Sunshine 2021-02-08 14:52 ` [PATCH 2/2] maintenance: incremental strategy runs pack-refs weekly Derrick Stolee via GitGitGadget ` (2 subsequent siblings) 3 siblings, 2 replies; 12+ messages in thread From: Derrick Stolee via GitGitGadget @ 2021-02-08 14:52 UTC (permalink / raw) To: git Cc: gitster, sluongng, martin.agren, sunshine, Derrick Stolee, Derrick Stolee From: Derrick Stolee <dstolee@microsoft.com> It is valuable to collect loose refs into a more compressed form. This is typically the packed-refs file, although this could be the reftable in the future. Having packed refs can be extremely valuable in repos with many tags or remote branches that are not modified by the local user, but still are necessary for other queries. For instance, with many exploded refs, commands such as git describe --tags --exact-match HEAD can be very slow (multiple seconds). This command in particular is used by terminal prompts to show when a detatched HEAD is pointing to an existing tag, so having it be slow causes significant delays for users. Add a new 'pack-refs' maintenance task. This is already a sub-step of the 'gc' task, but users could run this at other intervals if they are interested. Also, if users opt-in to the default background maintenance schedule, then the 'gc' task is disabled. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> --- Documentation/git-maintenance.txt | 6 ++++++ builtin/gc.c | 21 +++++++++++++++++---- t/t7900-maintenance.sh | 10 ++++++++++ 3 files changed, 33 insertions(+), 4 deletions(-) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 3b432171d608..8cc0225b3304 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -145,6 +145,12 @@ incremental-repack:: which is a special case that attempts to repack all pack-files into a single pack-file. +pack-refs:: + The `pack-refs` task collects the loose reference files and + collects them into a single file. This speeds up operations that + need to iterate across many refereences. See linkgit:git-pack-refs[1] + for more information. + OPTIONS ------- --auto:: diff --git a/builtin/gc.c b/builtin/gc.c index 4c40594d660e..8f13c3bb8607 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -54,7 +54,6 @@ static const char *prune_worktrees_expire = "3.months.ago"; static unsigned long big_pack_threshold; static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE; -static struct strvec pack_refs_cmd = STRVEC_INIT; static struct strvec reflog = STRVEC_INIT; static struct strvec repack = STRVEC_INIT; static struct strvec prune = STRVEC_INIT; @@ -163,6 +162,15 @@ static void gc_config(void) git_config(git_default_config, NULL); } +struct maintenance_run_opts; +static int maintenance_task_pack_refs(struct maintenance_run_opts *opts) +{ + struct strvec pack_refs_cmd = STRVEC_INIT; + strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL); + + return run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD); +} + static int too_many_loose_objects(void) { /* @@ -518,8 +526,8 @@ static void gc_before_repack(void) if (done++) return; - if (pack_refs && run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD)) - die(FAILED_RUN, pack_refs_cmd.v[0]); + if (pack_refs && maintenance_task_pack_refs(NULL)) + die(FAILED_RUN, "pack-refs"); if (prune_reflogs && run_command_v_opt(reflog.v, RUN_GIT_CMD)) die(FAILED_RUN, reflog.v[0]); @@ -556,7 +564,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix) if (argc == 2 && !strcmp(argv[1], "-h")) usage_with_options(builtin_gc_usage, builtin_gc_options); - strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL); strvec_pushl(&reflog, "reflog", "expire", "--all", NULL); strvec_pushl(&repack, "repack", "-d", "-l", NULL); strvec_pushl(&prune, "prune", "--expire", NULL); @@ -1224,6 +1231,7 @@ enum maintenance_task_label { TASK_INCREMENTAL_REPACK, TASK_GC, TASK_COMMIT_GRAPH, + TASK_PACK_REFS, /* Leave as final value */ TASK__COUNT @@ -1255,6 +1263,11 @@ static struct maintenance_task tasks[] = { maintenance_task_commit_graph, should_write_commit_graph, }, + [TASK_PACK_REFS] = { + "pack-refs", + maintenance_task_pack_refs, + NULL, + }, }; static int compare_tasks_by_selection(const void *a_, const void *b_) diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 78ccf4b33f87..ef4527823d29 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -343,6 +343,16 @@ test_expect_success 'maintenance.incremental-repack.auto' ' test_subcommand git multi-pack-index write --no-progress <trace-B ' +test_expect_success 'pack-refs task' ' + for n in $(test_seq 1 5) + do + git branch -f to-pack/$n HEAD || return 1 + done && + git maintenance run --task=pack-refs && + ls .git/refs/heads/ >after && + test_must_be_empty after +' + test_expect_success '--auto and --schedule incompatible' ' test_must_fail git maintenance run --auto --schedule=daily 2>err && test_i18ngrep "at most one" err -- gitgitgadget ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] maintenance: add pack-refs task 2021-02-08 14:52 ` [PATCH 1/2] maintenance: " Derrick Stolee via GitGitGadget @ 2021-02-08 22:53 ` Taylor Blau 2021-02-09 12:42 ` Derrick Stolee 2021-02-08 23:06 ` Eric Sunshine 1 sibling, 1 reply; 12+ messages in thread From: Taylor Blau @ 2021-02-08 22:53 UTC (permalink / raw) To: Derrick Stolee via GitGitGadget Cc: git, gitster, sluongng, martin.agren, sunshine, Derrick Stolee, Derrick Stolee On Mon, Feb 08, 2021 at 02:52:22PM +0000, Derrick Stolee via GitGitGadget wrote: > +pack-refs:: > + The `pack-refs` task collects the loose reference files and > + collects them into a single file. This speeds up operations that > + need to iterate across many refereences. See linkgit:git-pack-refs[1] > + for more information. > + Do you think it's worth documenting here or in git-gc(1) that running this has the effect of disabling the same step during gc? > OPTIONS > ------- > --auto:: > diff --git a/builtin/gc.c b/builtin/gc.c > index 4c40594d660e..8f13c3bb8607 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -54,7 +54,6 @@ static const char *prune_worktrees_expire = "3.months.ago"; > static unsigned long big_pack_threshold; > static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE; > > -static struct strvec pack_refs_cmd = STRVEC_INIT; > static struct strvec reflog = STRVEC_INIT; > static struct strvec repack = STRVEC_INIT; > static struct strvec prune = STRVEC_INIT; > @@ -163,6 +162,15 @@ static void gc_config(void) > git_config(git_default_config, NULL); > } > > +struct maintenance_run_opts; > +static int maintenance_task_pack_refs(struct maintenance_run_opts *opts) It may be worth calling this "unused", since you explicitly pass NULL below. > +{ > + struct strvec pack_refs_cmd = STRVEC_INIT; > + strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL); > + > + return run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD); > +} > + > static int too_many_loose_objects(void) > { > /* > @@ -518,8 +526,8 @@ static void gc_before_repack(void) > if (done++) > return; > > - if (pack_refs && run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD)) > - die(FAILED_RUN, pack_refs_cmd.v[0]); > + if (pack_refs && maintenance_task_pack_refs(NULL)) > + die(FAILED_RUN, "pack-refs"); Hmm. Am I missing where opting into the maintenance step disables this in gc? You suggest that it does in the commit message, but I can't seem to see it here. > +test_expect_success 'pack-refs task' ' > + for n in $(test_seq 1 5) > + do > + git branch -f to-pack/$n HEAD || return 1 > + done && > + git maintenance run --task=pack-refs && > + ls .git/refs/heads/ >after && > + test_must_be_empty after Worth testing that $GIT_DIR/packed-refs exists, too? Thanks, Taylor ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] maintenance: add pack-refs task 2021-02-08 22:53 ` Taylor Blau @ 2021-02-09 12:42 ` Derrick Stolee 0 siblings, 0 replies; 12+ messages in thread From: Derrick Stolee @ 2021-02-09 12:42 UTC (permalink / raw) To: Taylor Blau, Derrick Stolee via GitGitGadget Cc: git, gitster, sluongng, martin.agren, sunshine, Derrick Stolee, Derrick Stolee On 2/8/2021 5:53 PM, Taylor Blau wrote: > On Mon, Feb 08, 2021 at 02:52:22PM +0000, Derrick Stolee via GitGitGadget wrote: >> +pack-refs:: >> + The `pack-refs` task collects the loose reference files and >> + collects them into a single file. This speeds up operations that >> + need to iterate across many refereences. See linkgit:git-pack-refs[1] >> + for more information. >> + > > Do you think it's worth documenting here or in git-gc(1) that running > this has the effect of disabling the same step during gc? It doesn't disable the step during gc. Perhaps I should use a better term than "extract". The 'gc' task shouldn't change as we make some of its steps also available as independent tasks. Instead, users can select a subset of its steps by enabling them directly. Other such tasks could include: * prune-reflog * full-repack (as opposed to the existing incremental-repack task) >> +struct maintenance_run_opts; >> +static int maintenance_task_pack_refs(struct maintenance_run_opts *opts) > > It may be worth calling this "unused", since you explicitly pass NULL > below. Good idea. >> + if (pack_refs && maintenance_task_pack_refs(NULL)) >> + die(FAILED_RUN, "pack-refs"); > > Hmm. Am I missing where opting into the maintenance step disables this > in gc? You suggest that it does in the commit message, but I can't seem > to see it here. My commit message suggests wrong. >> +test_expect_success 'pack-refs task' ' >> + for n in $(test_seq 1 5) >> + do >> + git branch -f to-pack/$n HEAD || return 1 >> + done && >> + git maintenance run --task=pack-refs && >> + ls .git/refs/heads/ >after && >> + test_must_be_empty after > > Worth testing that $GIT_DIR/packed-refs exists, too? That would start breaking if we change the backing store, right? I want to be sure this doesn't create test breakages with reftable. I _could_ add a 'test_subcommand' check here, though. Thanks, -Stolee ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] maintenance: add pack-refs task 2021-02-08 14:52 ` [PATCH 1/2] maintenance: " Derrick Stolee via GitGitGadget 2021-02-08 22:53 ` Taylor Blau @ 2021-02-08 23:06 ` Eric Sunshine 2021-02-09 12:42 ` Derrick Stolee 1 sibling, 1 reply; 12+ messages in thread From: Eric Sunshine @ 2021-02-08 23:06 UTC (permalink / raw) To: Derrick Stolee via GitGitGadget Cc: Git List, Junio C Hamano, Son Luong Ngoc, Martin Ågren, Derrick Stolee, Derrick Stolee On Mon, Feb 8, 2021 at 9:52 AM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > +pack-refs:: > + The `pack-refs` task collects the loose reference files and > + collects them into a single file. This speeds up operations that > + need to iterate across many refereences. See linkgit:git-pack-refs[1] > + for more information. s/refereences/references/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] maintenance: add pack-refs task 2021-02-08 23:06 ` Eric Sunshine @ 2021-02-09 12:42 ` Derrick Stolee 0 siblings, 0 replies; 12+ messages in thread From: Derrick Stolee @ 2021-02-09 12:42 UTC (permalink / raw) To: Eric Sunshine, Derrick Stolee via GitGitGadget Cc: Git List, Junio C Hamano, Son Luong Ngoc, Martin Ågren, Derrick Stolee, Derrick Stolee On 2/8/2021 6:06 PM, Eric Sunshine wrote: > On Mon, Feb 8, 2021 at 9:52 AM Derrick Stolee via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> +pack-refs:: >> + The `pack-refs` task collects the loose reference files and >> + collects them into a single file. This speeds up operations that >> + need to iterate across many refereences. See linkgit:git-pack-refs[1] >> + for more information. > > s/refereences/references/ Thanks! -Stolee ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/2] maintenance: incremental strategy runs pack-refs weekly 2021-02-08 14:52 [PATCH 0/2] Maintenance: add pack-refs task Derrick Stolee via GitGitGadget 2021-02-08 14:52 ` [PATCH 1/2] maintenance: " Derrick Stolee via GitGitGadget @ 2021-02-08 14:52 ` Derrick Stolee via GitGitGadget 2021-02-08 22:46 ` [PATCH 0/2] Maintenance: add pack-refs task Taylor Blau 2021-02-09 13:42 ` [PATCH v2 " Derrick Stolee via GitGitGadget 3 siblings, 0 replies; 12+ messages in thread From: Derrick Stolee via GitGitGadget @ 2021-02-08 14:52 UTC (permalink / raw) To: git Cc: gitster, sluongng, martin.agren, sunshine, Derrick Stolee, Derrick Stolee From: Derrick Stolee <dstolee@microsoft.com> When the 'maintenance.strategy' config option is set to 'incremental', a default maintenance schedule is enabled. Add the 'pack-refs' task to that strategy at the weekly cadence. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> --- Documentation/config/maintenance.txt | 5 +++-- builtin/gc.c | 2 ++ t/t7900-maintenance.sh | 14 ++++++++++++++ 3 files changed, 19 insertions(+), 2 deletions(-) diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt index a5ead09e4bc2..18f056213145 100644 --- a/Documentation/config/maintenance.txt +++ b/Documentation/config/maintenance.txt @@ -15,8 +15,9 @@ maintenance.strategy:: * `none`: This default setting implies no task are run at any schedule. * `incremental`: This setting optimizes for performing small maintenance activities that do not delete any data. This does not schedule the `gc` - task, but runs the `prefetch` and `commit-graph` tasks hourly and the - `loose-objects` and `incremental-repack` tasks daily. + task, but runs the `prefetch` and `commit-graph` tasks hourly, the + `loose-objects` and `incremental-repack` tasks daily, and the `pack-refs` + task weekly. maintenance.<task>.enabled:: This boolean config option controls whether the maintenance task diff --git a/builtin/gc.c b/builtin/gc.c index 8f13c3bb8607..8d365a59d027 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1352,6 +1352,8 @@ static void initialize_maintenance_strategy(void) tasks[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY; tasks[TASK_LOOSE_OBJECTS].enabled = 1; tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY; + tasks[TASK_PACK_REFS].enabled = 1; + tasks[TASK_PACK_REFS].schedule = SCHEDULE_WEEKLY; } } diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index ef4527823d29..fa92e01fb2c5 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -406,18 +406,32 @@ test_expect_success 'maintenance.strategy inheritance' ' git maintenance run --schedule=hourly --quiet && GIT_TRACE2_EVENT="$(pwd)/incremental-daily.txt" \ git maintenance run --schedule=daily --quiet && + GIT_TRACE2_EVENT="$(pwd)/incremental-weekly.txt" \ + git maintenance run --schedule=weekly --quiet && test_subcommand git commit-graph write --split --reachable \ --no-progress <incremental-hourly.txt && test_subcommand ! git prune-packed --quiet <incremental-hourly.txt && test_subcommand ! git multi-pack-index write --no-progress \ <incremental-hourly.txt && + test_subcommand ! git pack-refs --all --prune \ + <incremental-hourly.txt && test_subcommand git commit-graph write --split --reachable \ --no-progress <incremental-daily.txt && test_subcommand git prune-packed --quiet <incremental-daily.txt && test_subcommand git multi-pack-index write --no-progress \ <incremental-daily.txt && + test_subcommand ! git pack-refs --all --prune \ + <incremental-daily.txt && + + test_subcommand git commit-graph write --split --reachable \ + --no-progress <incremental-weekly.txt && + test_subcommand git prune-packed --quiet <incremental-weekly.txt && + test_subcommand git multi-pack-index write --no-progress \ + <incremental-weekly.txt && + test_subcommand git pack-refs --all --prune \ + <incremental-weekly.txt && # Modify defaults git config maintenance.commit-graph.schedule daily && -- gitgitgadget ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Maintenance: add pack-refs task 2021-02-08 14:52 [PATCH 0/2] Maintenance: add pack-refs task Derrick Stolee via GitGitGadget 2021-02-08 14:52 ` [PATCH 1/2] maintenance: " Derrick Stolee via GitGitGadget 2021-02-08 14:52 ` [PATCH 2/2] maintenance: incremental strategy runs pack-refs weekly Derrick Stolee via GitGitGadget @ 2021-02-08 22:46 ` Taylor Blau 2021-02-09 13:42 ` [PATCH v2 " Derrick Stolee via GitGitGadget 3 siblings, 0 replies; 12+ messages in thread From: Taylor Blau @ 2021-02-08 22:46 UTC (permalink / raw) To: Derrick Stolee via GitGitGadget Cc: git, gitster, sluongng, martin.agren, sunshine, Derrick Stolee On Mon, Feb 08, 2021 at 02:52:21PM +0000, Derrick Stolee via GitGitGadget wrote: > I chose to add it to the incremental maintenance strategy at a weekly > cadence. I'm not sure there is significant value to the difference between > weekly and daily. It just seems to me that weekly is often enough. Feel free > to correct me if you have a different opinion. I don't have any strong opinion about this. At GitHub we run pack-refs as part of a maintenance-like job, which varies in frequency between different repositories based on a number of factors. In our busiest repository networks, this is almost certainly at least a few times a day, if not "as often as possible." I'd suspect that for any laptop-sized repository, anywhere between daily and weekly is just fine. I suppose weekly is a good starting point, since we can always change the default to run more often if enough users complain. > My hope is that this patch series could be used as an example for further > extracting tasks out of the gc task and making them be full maintenance > tasks. Doing more of these extractions could be a good project for a new > contributor. That's great :-). Thanks, Taylor ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 0/2] Maintenance: add pack-refs task 2021-02-08 14:52 [PATCH 0/2] Maintenance: add pack-refs task Derrick Stolee via GitGitGadget ` (2 preceding siblings ...) 2021-02-08 22:46 ` [PATCH 0/2] Maintenance: add pack-refs task Taylor Blau @ 2021-02-09 13:42 ` Derrick Stolee via GitGitGadget 2021-02-09 13:42 ` [PATCH v2 1/2] maintenance: " Derrick Stolee via GitGitGadget ` (2 more replies) 3 siblings, 3 replies; 12+ messages in thread From: Derrick Stolee via GitGitGadget @ 2021-02-09 13:42 UTC (permalink / raw) To: git; +Cc: Taylor Blau, Eric Sunshine, Derrick Stolee, Derrick Stolee This patch series adds a new pack-refs task to the maintenance builtin. This operation already happens within git gc (and hence the gc task) but it is easy to extract. Packing refs does not delete any data, only collects loose objects into a combined file. This makes things faster in subtle ways, especially when a command needs to iterate through refs (especially tags). Credit for inspiring this goes to Suolong, who asked for this to be added to Scalar [1]. I've been waiting instead to add it directly to Git and its background maintenance. Now is the time! [1] https://github.com/microsoft/scalar/issues/382 I chose to add it to the incremental maintenance strategy at a weekly cadence. I'm not sure there is significant value to the difference between weekly and daily. It just seems to me that weekly is often enough. Feel free to correct me if you have a different opinion. My hope is that this patch series could be used as an example for further extracting tasks out of the gc task and making them be full maintenance tasks. Doing more of these extractions could be a good project for a new contributor. One thing that is not implemented in this series is a notion of the behavior for the pack-refs task during git maintenance run --auto. This could be added in the future, but I wanted to focus on getting this behavior into the incremental maintenance schedule. Updates in V2 ============= * Fixed doc typo. Thanks, Eric! * Updated commit messages to make it clear that the 'pack-refs' step will still happen within the 'gc' task. * Updated the test to check that we run the correct subcommand. * maintenance_task_pack_refs() uses MAYBE_UNUSED on its parameter. Thanks, -Stolee Cc: gitster@pobox.com Cc: sluongng@gmail.com Cc: martin.agren@gmail.com Cc: sunshine@sunshineco.com Derrick Stolee (2): maintenance: add pack-refs task maintenance: incremental strategy runs pack-refs weekly Documentation/config/maintenance.txt | 5 +++-- Documentation/git-maintenance.txt | 6 ++++++ builtin/gc.c | 23 +++++++++++++++++++---- t/t7900-maintenance.sh | 26 ++++++++++++++++++++++++++ 4 files changed, 54 insertions(+), 6 deletions(-) base-commit: fb7fa4a1fd273f22efcafdd13c7f897814fd1eb9 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-871%2Fderrickstolee%2Fmaintenance%2Fpack-refs-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-871/derrickstolee/maintenance/pack-refs-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/871 Range-diff vs v1: 1: 33b7a74af4eb ! 1: bedaeb548b06 maintenance: add pack-refs task @@ Commit message by terminal prompts to show when a detatched HEAD is pointing to an existing tag, so having it be slow causes significant delays for users. - Add a new 'pack-refs' maintenance task. This is already a sub-step of - the 'gc' task, but users could run this at other intervals if they are - interested. Also, if users opt-in to the default background maintenance - schedule, then the 'gc' task is disabled. + Add a new 'pack-refs' maintenance task. It runs 'git pack-refs --all + --prune' to move loose refs into a packed form. For now, that is the + packed-refs file, but could adjust to other file formats in the future. + + This is the first of several sub-tasks of the 'gc' task that could be + extracted to their own tasks. In this process, we should not change the + behavior of the 'gc' task since that remains the default way to keep + repositories maintained. Creating a new task for one of these sub-tasks + only provides more customization options for those choosing to not use + the 'gc' task. It is certainly possible to have both the 'gc' and + 'pack-refs' tasks enabled and run regularly. While they may repeat + effort, they do not conflict in a destructive way. + + The 'auto_condition' function pointer is left NULL for now. We could + extend this in the future to have a condition check if pack-refs should + be run during 'git maintenance run --auto'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> @@ Documentation/git-maintenance.txt: incremental-repack:: +pack-refs:: + The `pack-refs` task collects the loose reference files and + collects them into a single file. This speeds up operations that -+ need to iterate across many refereences. See linkgit:git-pack-refs[1] ++ need to iterate across many references. See linkgit:git-pack-refs[1] + for more information. + OPTIONS @@ builtin/gc.c: static void gc_config(void) } +struct maintenance_run_opts; -+static int maintenance_task_pack_refs(struct maintenance_run_opts *opts) ++static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts) +{ + struct strvec pack_refs_cmd = STRVEC_INIT; + strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL); @@ t/t7900-maintenance.sh: test_expect_success 'maintenance.incremental-repack.auto + do + git branch -f to-pack/$n HEAD || return 1 + done && -+ git maintenance run --task=pack-refs && ++ GIT_TRACE2_EVENT="$(pwd)/pack-refs.txt" \ ++ git maintenance run --task=pack-refs && + ls .git/refs/heads/ >after && -+ test_must_be_empty after ++ test_must_be_empty after && ++ test_subcommand git pack-refs --all --prune <pack-refs.txt +' + test_expect_success '--auto and --schedule incompatible' ' 2: 8012d2dc1420 = 2: c38fc9a4170e maintenance: incremental strategy runs pack-refs weekly -- gitgitgadget ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/2] maintenance: add pack-refs task 2021-02-09 13:42 ` [PATCH v2 " Derrick Stolee via GitGitGadget @ 2021-02-09 13:42 ` Derrick Stolee via GitGitGadget 2021-02-09 13:42 ` [PATCH v2 2/2] maintenance: incremental strategy runs pack-refs weekly Derrick Stolee via GitGitGadget 2021-02-10 2:41 ` [PATCH v2 0/2] Maintenance: add pack-refs task Taylor Blau 2 siblings, 0 replies; 12+ messages in thread From: Derrick Stolee via GitGitGadget @ 2021-02-09 13:42 UTC (permalink / raw) To: git Cc: Taylor Blau, Eric Sunshine, Derrick Stolee, Derrick Stolee, Derrick Stolee From: Derrick Stolee <dstolee@microsoft.com> It is valuable to collect loose refs into a more compressed form. This is typically the packed-refs file, although this could be the reftable in the future. Having packed refs can be extremely valuable in repos with many tags or remote branches that are not modified by the local user, but still are necessary for other queries. For instance, with many exploded refs, commands such as git describe --tags --exact-match HEAD can be very slow (multiple seconds). This command in particular is used by terminal prompts to show when a detatched HEAD is pointing to an existing tag, so having it be slow causes significant delays for users. Add a new 'pack-refs' maintenance task. It runs 'git pack-refs --all --prune' to move loose refs into a packed form. For now, that is the packed-refs file, but could adjust to other file formats in the future. This is the first of several sub-tasks of the 'gc' task that could be extracted to their own tasks. In this process, we should not change the behavior of the 'gc' task since that remains the default way to keep repositories maintained. Creating a new task for one of these sub-tasks only provides more customization options for those choosing to not use the 'gc' task. It is certainly possible to have both the 'gc' and 'pack-refs' tasks enabled and run regularly. While they may repeat effort, they do not conflict in a destructive way. The 'auto_condition' function pointer is left NULL for now. We could extend this in the future to have a condition check if pack-refs should be run during 'git maintenance run --auto'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> --- Documentation/git-maintenance.txt | 6 ++++++ builtin/gc.c | 21 +++++++++++++++++---- t/t7900-maintenance.sh | 12 ++++++++++++ 3 files changed, 35 insertions(+), 4 deletions(-) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 3b432171d608..80ddd33ceba0 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -145,6 +145,12 @@ incremental-repack:: which is a special case that attempts to repack all pack-files into a single pack-file. +pack-refs:: + The `pack-refs` task collects the loose reference files and + collects them into a single file. This speeds up operations that + need to iterate across many references. See linkgit:git-pack-refs[1] + for more information. + OPTIONS ------- --auto:: diff --git a/builtin/gc.c b/builtin/gc.c index 4c40594d660e..41bec4f177b3 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -54,7 +54,6 @@ static const char *prune_worktrees_expire = "3.months.ago"; static unsigned long big_pack_threshold; static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE; -static struct strvec pack_refs_cmd = STRVEC_INIT; static struct strvec reflog = STRVEC_INIT; static struct strvec repack = STRVEC_INIT; static struct strvec prune = STRVEC_INIT; @@ -163,6 +162,15 @@ static void gc_config(void) git_config(git_default_config, NULL); } +struct maintenance_run_opts; +static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts) +{ + struct strvec pack_refs_cmd = STRVEC_INIT; + strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL); + + return run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD); +} + static int too_many_loose_objects(void) { /* @@ -518,8 +526,8 @@ static void gc_before_repack(void) if (done++) return; - if (pack_refs && run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD)) - die(FAILED_RUN, pack_refs_cmd.v[0]); + if (pack_refs && maintenance_task_pack_refs(NULL)) + die(FAILED_RUN, "pack-refs"); if (prune_reflogs && run_command_v_opt(reflog.v, RUN_GIT_CMD)) die(FAILED_RUN, reflog.v[0]); @@ -556,7 +564,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix) if (argc == 2 && !strcmp(argv[1], "-h")) usage_with_options(builtin_gc_usage, builtin_gc_options); - strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL); strvec_pushl(&reflog, "reflog", "expire", "--all", NULL); strvec_pushl(&repack, "repack", "-d", "-l", NULL); strvec_pushl(&prune, "prune", "--expire", NULL); @@ -1224,6 +1231,7 @@ enum maintenance_task_label { TASK_INCREMENTAL_REPACK, TASK_GC, TASK_COMMIT_GRAPH, + TASK_PACK_REFS, /* Leave as final value */ TASK__COUNT @@ -1255,6 +1263,11 @@ static struct maintenance_task tasks[] = { maintenance_task_commit_graph, should_write_commit_graph, }, + [TASK_PACK_REFS] = { + "pack-refs", + maintenance_task_pack_refs, + NULL, + }, }; static int compare_tasks_by_selection(const void *a_, const void *b_) diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 78ccf4b33f87..4a8a78769bd6 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -343,6 +343,18 @@ test_expect_success 'maintenance.incremental-repack.auto' ' test_subcommand git multi-pack-index write --no-progress <trace-B ' +test_expect_success 'pack-refs task' ' + for n in $(test_seq 1 5) + do + git branch -f to-pack/$n HEAD || return 1 + done && + GIT_TRACE2_EVENT="$(pwd)/pack-refs.txt" \ + git maintenance run --task=pack-refs && + ls .git/refs/heads/ >after && + test_must_be_empty after && + test_subcommand git pack-refs --all --prune <pack-refs.txt +' + test_expect_success '--auto and --schedule incompatible' ' test_must_fail git maintenance run --auto --schedule=daily 2>err && test_i18ngrep "at most one" err -- gitgitgadget ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 2/2] maintenance: incremental strategy runs pack-refs weekly 2021-02-09 13:42 ` [PATCH v2 " Derrick Stolee via GitGitGadget 2021-02-09 13:42 ` [PATCH v2 1/2] maintenance: " Derrick Stolee via GitGitGadget @ 2021-02-09 13:42 ` Derrick Stolee via GitGitGadget 2021-02-10 2:41 ` [PATCH v2 0/2] Maintenance: add pack-refs task Taylor Blau 2 siblings, 0 replies; 12+ messages in thread From: Derrick Stolee via GitGitGadget @ 2021-02-09 13:42 UTC (permalink / raw) To: git Cc: Taylor Blau, Eric Sunshine, Derrick Stolee, Derrick Stolee, Derrick Stolee From: Derrick Stolee <dstolee@microsoft.com> When the 'maintenance.strategy' config option is set to 'incremental', a default maintenance schedule is enabled. Add the 'pack-refs' task to that strategy at the weekly cadence. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> --- Documentation/config/maintenance.txt | 5 +++-- builtin/gc.c | 2 ++ t/t7900-maintenance.sh | 14 ++++++++++++++ 3 files changed, 19 insertions(+), 2 deletions(-) diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt index a5ead09e4bc2..18f056213145 100644 --- a/Documentation/config/maintenance.txt +++ b/Documentation/config/maintenance.txt @@ -15,8 +15,9 @@ maintenance.strategy:: * `none`: This default setting implies no task are run at any schedule. * `incremental`: This setting optimizes for performing small maintenance activities that do not delete any data. This does not schedule the `gc` - task, but runs the `prefetch` and `commit-graph` tasks hourly and the - `loose-objects` and `incremental-repack` tasks daily. + task, but runs the `prefetch` and `commit-graph` tasks hourly, the + `loose-objects` and `incremental-repack` tasks daily, and the `pack-refs` + task weekly. maintenance.<task>.enabled:: This boolean config option controls whether the maintenance task diff --git a/builtin/gc.c b/builtin/gc.c index 41bec4f177b3..6db9cb39e679 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1352,6 +1352,8 @@ static void initialize_maintenance_strategy(void) tasks[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY; tasks[TASK_LOOSE_OBJECTS].enabled = 1; tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY; + tasks[TASK_PACK_REFS].enabled = 1; + tasks[TASK_PACK_REFS].schedule = SCHEDULE_WEEKLY; } } diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 4a8a78769bd6..286b18db3cc2 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -408,18 +408,32 @@ test_expect_success 'maintenance.strategy inheritance' ' git maintenance run --schedule=hourly --quiet && GIT_TRACE2_EVENT="$(pwd)/incremental-daily.txt" \ git maintenance run --schedule=daily --quiet && + GIT_TRACE2_EVENT="$(pwd)/incremental-weekly.txt" \ + git maintenance run --schedule=weekly --quiet && test_subcommand git commit-graph write --split --reachable \ --no-progress <incremental-hourly.txt && test_subcommand ! git prune-packed --quiet <incremental-hourly.txt && test_subcommand ! git multi-pack-index write --no-progress \ <incremental-hourly.txt && + test_subcommand ! git pack-refs --all --prune \ + <incremental-hourly.txt && test_subcommand git commit-graph write --split --reachable \ --no-progress <incremental-daily.txt && test_subcommand git prune-packed --quiet <incremental-daily.txt && test_subcommand git multi-pack-index write --no-progress \ <incremental-daily.txt && + test_subcommand ! git pack-refs --all --prune \ + <incremental-daily.txt && + + test_subcommand git commit-graph write --split --reachable \ + --no-progress <incremental-weekly.txt && + test_subcommand git prune-packed --quiet <incremental-weekly.txt && + test_subcommand git multi-pack-index write --no-progress \ + <incremental-weekly.txt && + test_subcommand git pack-refs --all --prune \ + <incremental-weekly.txt && # Modify defaults git config maintenance.commit-graph.schedule daily && -- gitgitgadget ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/2] Maintenance: add pack-refs task 2021-02-09 13:42 ` [PATCH v2 " Derrick Stolee via GitGitGadget 2021-02-09 13:42 ` [PATCH v2 1/2] maintenance: " Derrick Stolee via GitGitGadget 2021-02-09 13:42 ` [PATCH v2 2/2] maintenance: incremental strategy runs pack-refs weekly Derrick Stolee via GitGitGadget @ 2021-02-10 2:41 ` Taylor Blau 2 siblings, 0 replies; 12+ messages in thread From: Taylor Blau @ 2021-02-10 2:41 UTC (permalink / raw) To: Derrick Stolee via GitGitGadget Cc: git, Taylor Blau, Eric Sunshine, Derrick Stolee, Derrick Stolee On Tue, Feb 09, 2021 at 01:42:27PM +0000, Derrick Stolee via GitGitGadget wrote: > Updates in V2 > ============= > > * Fixed doc typo. Thanks, Eric! > * Updated commit messages to make it clear that the 'pack-refs' step will > still happen within the 'gc' task. > * Updated the test to check that we run the correct subcommand. > * maintenance_task_pack_refs() uses MAYBE_UNUSED on its parameter. Thanks, the range-diff looked sane to me. Reviewed-by: Taylor Blau <me@ttaylorr.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2021-02-10 2:46 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-02-08 14:52 [PATCH 0/2] Maintenance: add pack-refs task Derrick Stolee via GitGitGadget 2021-02-08 14:52 ` [PATCH 1/2] maintenance: " Derrick Stolee via GitGitGadget 2021-02-08 22:53 ` Taylor Blau 2021-02-09 12:42 ` Derrick Stolee 2021-02-08 23:06 ` Eric Sunshine 2021-02-09 12:42 ` Derrick Stolee 2021-02-08 14:52 ` [PATCH 2/2] maintenance: incremental strategy runs pack-refs weekly Derrick Stolee via GitGitGadget 2021-02-08 22:46 ` [PATCH 0/2] Maintenance: add pack-refs task Taylor Blau 2021-02-09 13:42 ` [PATCH v2 " Derrick Stolee via GitGitGadget 2021-02-09 13:42 ` [PATCH v2 1/2] maintenance: " Derrick Stolee via GitGitGadget 2021-02-09 13:42 ` [PATCH v2 2/2] maintenance: incremental strategy runs pack-refs weekly Derrick Stolee via GitGitGadget 2021-02-10 2:41 ` [PATCH v2 0/2] Maintenance: add pack-refs task Taylor Blau
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).