From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Taylor Blau <me@ttaylorr.com>,
	Eric Sunshine <sunshine@sunshineco.com>,
	Derrick Stolee <stolee@gmail.com>,
	Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v2 0/2] Maintenance: add pack-refs task
Date: Tue, 09 Feb 2021 13:42:27 +0000
Message-ID: <pull.871.v2.git.1612878149.gitgitgadget@gmail.com>
In-Reply-To: <pull.871.git.1612795943.gitgitgadget@gmail.com>

This patch series adds a new pack-refs task to the maintenance builtin. This
operation already happens within git gc (and hence the gc task), but it is
easy to extract. Packing refs does not delete any data; it only collects loose
refs into a single packed-refs file. This speeds up commands that need to
iterate through many refs, especially tags.
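
As a rough illustration of the effect (a sketch, not part of the series), the
script below creates a throwaway repository, adds a few loose branch refs, and
runs the same command the new task runs. The temporary directory, the demo
identity, and the to-pack/* branch names are made up for the demo, and it
assumes the default "files" ref backend.

```shell
#!/bin/sh
# Demo only: runs in a throwaway repository under a temp directory.
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"

# Create a few loose refs (hypothetical branch names).
for n in 1 2 3
do
    git branch -f "to-pack/$n" HEAD
done

# Count loose ref files before packing.
loose_before=$(find .git/refs -type f | wc -l | tr -d ' ')

# The command the pack-refs task runs.
git pack-refs --all --prune

# Count loose ref files and packed branch entries after packing.
loose_after=$(find .git/refs -type f | wc -l | tr -d ' ')
packed=$(grep -c "refs/heads/" .git/packed-refs)

echo "loose before: $loose_before, loose after: $loose_after, packed: $packed"
```

The objects themselves are untouched; only the ref storage changes from one
file per ref to entries in .git/packed-refs.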

Credit for inspiring this goes to Suolong, who asked for this to be added to
Scalar [1]. I've been waiting instead to add it directly to Git and its
background maintenance. Now is the time!

[1] https://github.com/microsoft/scalar/issues/382

I chose to add it to the incremental maintenance strategy at a weekly
cadence. I'm not sure the difference between weekly and daily matters much;
weekly just seems often enough to me. Feel free to correct me if you have a
different opinion.
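
For anyone who does prefer a different cadence, the per-task schedule can be
overridden in config. This is a minimal sketch in a throwaway repository (the
temp directory is only for the demo), using the maintenance.strategy and
maintenance.<task>.schedule config keys; "daily" here is an example choice,
not a recommendation.

```shell
#!/bin/sh
# Demo only: configure a throwaway repository.
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Use the incremental strategy for scheduled background maintenance.
git config maintenance.strategy incremental

# Override the cadence for the pack-refs task only (example value).
git config maintenance.pack-refs.schedule daily

# Read the value back to confirm what scheduled runs will see.
schedule=$(git config --get maintenance.pack-refs.schedule)
echo "pack-refs schedule: $schedule"
```

Scheduled runs are the ones started with git maintenance run
--schedule=<frequency>; an explicit maintenance.<task>.schedule value takes
precedence over the strategy's default for that task.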

My hope is that this patch series can serve as an example for extracting
further tasks out of the gc task and making them full maintenance tasks.
Doing more of these extractions could be a good project for a new
contributor.

One thing this series does not implement is any behavior for the pack-refs
task during git maintenance run --auto. This could be added in the future,
but I wanted to focus on getting this behavior into the incremental
maintenance schedule.
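
To make the open question concrete, one possible --auto condition would be a
loose-ref count threshold. The helper below is purely hypothetical: the
function name, the default threshold of 50, and the idea of counting files
under refs/ are illustrative assumptions, not something this series
implements, and it assumes the "files" ref backend.

```shell
#!/bin/sh
# Hypothetical heuristic: succeed when the repository has at least
# THRESHOLD loose ref files. The name and default are made up.
should_pack_refs() {
    threshold=${1:-50}
    git_dir=$(git rev-parse --git-dir) || return 1
    # Each loose ref is one regular file under $GIT_DIR/refs.
    count=$(find "$git_dir/refs" -type f | wc -l | tr -d ' ')
    test "$count" -ge "$threshold"
}
```

A caller could then write something like: should_pack_refs 100 && git
pack-refs --all --prune.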


Updates in V2
=============

 * Fixed doc typo. Thanks, Eric!
 * Updated commit messages to make it clear that the 'pack-refs' step will
   still happen within the 'gc' task.
 * Updated the test to check that we run the correct subcommand.
 * maintenance_task_pack_refs() uses MAYBE_UNUSED on its parameter.

Thanks, -Stolee

Cc: gitster@pobox.com
Cc: sluongng@gmail.com
Cc: martin.agren@gmail.com
Cc: sunshine@sunshineco.com

Derrick Stolee (2):
  maintenance: add pack-refs task
  maintenance: incremental strategy runs pack-refs weekly

 Documentation/config/maintenance.txt |  5 +++--
 Documentation/git-maintenance.txt    |  6 ++++++
 builtin/gc.c                         | 23 +++++++++++++++++++----
 t/t7900-maintenance.sh               | 26 ++++++++++++++++++++++++++
 4 files changed, 54 insertions(+), 6 deletions(-)


base-commit: fb7fa4a1fd273f22efcafdd13c7f897814fd1eb9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-871%2Fderrickstolee%2Fmaintenance%2Fpack-refs-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-871/derrickstolee/maintenance/pack-refs-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/871

Range-diff vs v1:

 1:  33b7a74af4eb ! 1:  bedaeb548b06 maintenance: add pack-refs task
     @@ Commit message
          by terminal prompts to show when a detatched HEAD is pointing to an
          existing tag, so having it be slow causes significant delays for users.
      
     -    Add a new 'pack-refs' maintenance task. This is already a sub-step of
     -    the 'gc' task, but users could run this at other intervals if they are
     -    interested. Also, if users opt-in to the default background maintenance
     -    schedule, then the 'gc' task is disabled.
     +    Add a new 'pack-refs' maintenance task. It runs 'git pack-refs --all
     +    --prune' to move loose refs into a packed form. For now, that is the
     +    packed-refs file, but could adjust to other file formats in the future.
     +
     +    This is the first of several sub-tasks of the 'gc' task that could be
     +    extracted to their own tasks. In this process, we should not change the
     +    behavior of the 'gc' task since that remains the default way to keep
     +    repositories maintained. Creating a new task for one of these sub-tasks
     +    only provides more customization options for those choosing to not use
     +    the 'gc' task. It is certainly possible to have both the 'gc' and
     +    'pack-refs' tasks enabled and run regularly. While they may repeat
     +    effort, they do not conflict in a destructive way.
     +
     +    The 'auto_condition' function pointer is left NULL for now. We could
     +    extend this in the future to have a condition check if pack-refs should
     +    be run during 'git maintenance run --auto'.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
     @@ Documentation/git-maintenance.txt: incremental-repack::
      +pack-refs::
      +	The `pack-refs` task collects the loose reference files and
      +	collects them into a single file. This speeds up operations that
     -+	need to iterate across many refereences. See linkgit:git-pack-refs[1]
     ++	need to iterate across many references. See linkgit:git-pack-refs[1]
      +	for more information.
      +
       OPTIONS
     @@ builtin/gc.c: static void gc_config(void)
       }
       
      +struct maintenance_run_opts;
     -+static int maintenance_task_pack_refs(struct maintenance_run_opts *opts)
     ++static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts)
      +{
      +	struct strvec pack_refs_cmd = STRVEC_INIT;
      +	strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL);
     @@ t/t7900-maintenance.sh: test_expect_success 'maintenance.incremental-repack.auto
      +	do
      +		git branch -f to-pack/$n HEAD || return 1
      +	done &&
     -+	git maintenance run --task=pack-refs &&
     ++	GIT_TRACE2_EVENT="$(pwd)/pack-refs.txt" \
     ++		git maintenance run --task=pack-refs &&
      +	ls .git/refs/heads/ >after &&
     -+	test_must_be_empty after
     ++	test_must_be_empty after &&
     ++	test_subcommand git pack-refs --all --prune <pack-refs.txt
      +'
      +
       test_expect_success '--auto and --schedule incompatible' '
 2:  8012d2dc1420 = 2:  c38fc9a4170e maintenance: incremental strategy runs pack-refs weekly

-- 
gitgitgadget

