git@vger.kernel.org mailing list mirror (one of many)
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: peff@peff.net, jrnieder@google.com, stolee@gmail.com,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH 15/15] run-job: customize the loose-objects batch size
Date: Fri, 03 Apr 2020 20:48:14 +0000
Message-ID: <84cab34e8f26cca7eedd07e58b99bd2152d90a7d.1585946894.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.597.git.1585946894.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

Allow a user to override the default number of loose objects (50,000)
placed into a new pack-file by the loose-objects job. The override can
come from the job.loose-objects.batchSize config option or from the
--batch-size=<count> option of 'git run-job loose-objects'; the
command-line option takes precedence. The config value is read on
each run of 'git run-job loose-objects', so an instance started by
'git job-runner' picks up new values without restarting the
'git job-runner' process.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
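
To illustrate the precedence described in the commit message, here is a
minimal standalone sketch in C. It is not the code from this patch:
`resolve_batch_size()` and `parse_count()` are hypothetical helpers, plain
strings stand in for the config and command-line values, and rejecting
malformed or non-positive counts is an assumption that the patch itself
does not implement.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Default taken from the patch: 50,000 loose objects per pack-file. */
#define DEFAULT_LOOSE_BATCH_SIZE 50000

/*
 * Hypothetical helper: parse a positive decimal count.
 * Returns 0 and stores the value in *out on success; returns -1 and
 * leaves *out untouched if the text is missing, malformed, or out of range.
 */
static int parse_count(const char *text, int *out)
{
	char *end;
	long value;

	if (!text || !*text)
		return -1;

	errno = 0;
	value = strtol(text, &end, 10);
	if (errno == ERANGE || *end != '\0' || value <= 0 || value > INT_MAX)
		return -1;

	*out = (int)value;
	return 0;
}

/*
 * Hypothetical helper: resolve the effective batch size.
 * config_value stands in for job.loose-objects.batchSize and cli_value
 * for --batch-size=<count>; either may be NULL when unset.
 */
static int resolve_batch_size(const char *config_value, const char *cli_value)
{
	int batch_size = DEFAULT_LOOSE_BATCH_SIZE;

	/* A valid config value overrides the built-in default. */
	parse_count(config_value, &batch_size);

	/* A valid command-line value overrides the config value. */
	parse_count(cli_value, &batch_size);

	assert(batch_size > 0);
	return batch_size;
}
```

With these helpers, `resolve_batch_size("20000", "100")` yields 100, and
passing NULL for both arguments yields the default of 50000.
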
 Documentation/config/job.txt  |  6 ++++++
 Documentation/git-run-job.txt |  8 +++++---
 builtin/run-job.c             | 31 ++++++++++++++++++++++++++-----
 3 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/Documentation/config/job.txt b/Documentation/config/job.txt
index 6c22a40dd36..baa5b927e14 100644
--- a/Documentation/config/job.txt
+++ b/Documentation/config/job.txt
@@ -16,6 +16,12 @@ job.<job-name>.lastRun::
 	can manually update this to a later time to delay a specific
 	job on this repository.
 
+job.loose-objects.batchSize::
+	This integer value `<count>` limits the number of loose objects
+	collected into a single pack-file during the `loose-objects`
+	job. The default batch size is 50,000. See linkgit:git-run-job[1]
+	for more details.
+
 job.pack-files.batchSize::
 	This string value `<size>` will be passed to the
 	`git multi-pack-index repack --batch-size=<size>` command as
diff --git a/Documentation/git-run-job.txt b/Documentation/git-run-job.txt
index c6d5674d699..73210791533 100644
--- a/Documentation/git-run-job.txt
+++ b/Documentation/git-run-job.txt
@@ -67,9 +67,11 @@ commands, it follows a two-step process. First, it deletes any loose
 objects that already exist in a pack-file; concurrent Git processes will
 examine the pack-file for the object data instead of the loose object.
 Second, it creates a new pack-file (starting with "loose-") containing
-a batch of loose objects. The batch size is limited to 50 thousand
-objects to prevent the job from taking too long on a repository with
-many loose objects.
+a batch of loose objects.
++
+By default, the batch size is limited to 50 thousand objects to prevent
+the job from taking too long on a repository with many loose objects.
+Override it with `--batch-size=<count>` or `job.loose-objects.batchSize`.
 
 'pack-files'::
 
diff --git a/builtin/run-job.c b/builtin/run-job.c
index 76765535e09..b7c5a74cdbb 100644
--- a/builtin/run-job.c
+++ b/builtin/run-job.c
@@ -13,6 +13,11 @@ static char const * const builtin_run_job_usage[] = {
 	NULL
 };
 
+static char const * const builtin_run_job_loose_objects_usage[] = {
+	N_("git run-job loose-objects [--batch-size=<count>]"),
+	NULL
+};
+
 static char const * const builtin_run_job_pack_file_usage[] = {
 	N_("git run-job pack-files [--batch-size=<size>]"),
 	NULL
@@ -183,7 +188,7 @@ static int write_loose_object_to_stdin(const struct object_id *oid,
 	return ++(d->count) > d->batch_size;
 }
 
-static int pack_loose(void)
+static int pack_loose(int batch_size)
 {
 	int result = 0;
 	struct write_loose_object_data data;
@@ -219,7 +224,7 @@ static int pack_loose(void)
 
 	data.in = xfdopen(pack_proc->in, "w");
 	data.count = 0;
-	data.batch_size = 50000;
+	data.batch_size = batch_size;
 
 	for_each_loose_file_in_objdir(the_repository->objects->odb->path,
 				      write_loose_object_to_stdin,
@@ -240,9 +245,25 @@ static int pack_loose(void)
 	return result;
 }
 
-static int run_loose_objects_job(void)
+static int run_loose_objects_job(int argc, const char **argv)
 {
-	return prune_packed() || pack_loose();
+	static int batch_size;
+	static struct option builtin_run_job_loose_objects_options[] = {
+		OPT_INTEGER(0, "batch-size", &batch_size,
+			    N_("specify the maximum number of loose objects to store in a pack-file")),
+		OPT_END(),
+	};
+
+	if (repo_config_get_int(the_repository,
+				"job.loose-objects.batchsize",
+				&batch_size))
+		batch_size = 50000;
+
+	argc = parse_options(argc, argv, NULL,
+			     builtin_run_job_loose_objects_options,
+			     builtin_run_job_loose_objects_usage, 0);
+
+	return prune_packed() || pack_loose(batch_size);
 }
 
 static int multi_pack_index_write(void)
@@ -427,7 +448,7 @@ int cmd_run_job(int argc, const char **argv, const char *prefix)
 		if (!strcmp(argv[0], "fetch"))
 			return run_fetch_job();
 		if (!strcmp(argv[0], "loose-objects"))
-			return run_loose_objects_job();
+			return run_loose_objects_job(argc, argv);
 		if (!strcmp(argv[0], "pack-files"))
 			return run_pack_files_job(argc, argv);
 	}
-- 
gitgitgadget
