git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/7] Maintenance III: Background maintenance
@ 2020-09-04 15:41 Derrick Stolee via GitGitGadget
  2020-09-04 15:42 ` [PATCH 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
                   ` (7 more replies)
  0 siblings, 8 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-04 15:41 UTC (permalink / raw)
  To: git; +Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, Derrick Stolee

This is based on ds/maintenance-part-2 and replaces the RFC from [1].

[1] 
https://lore.kernel.org/git/pull.680.v3.git.1598629517.gitgitgadget@gmail.com/

This series introduces background maintenance to Git, through an integration
with cron and crontab.

Some preliminary work is done to allow a new --schedule option that tells
the command which tasks to run based on a maintenance.<task>.schedule config
option. The timing is not enforced by Git, but instead is expected to be
provided as a hint from a cron schedule. The options are "hourly", "daily",
and "weekly".

A new for-each-repo builtin runs Git commands on every repo in a given list.
Currently, the list is stored as a config setting, allowing a new 
maintenance.repos config list to store the repositories registered for
background maintenance. Others may want to add a --file=<file> option for
their own workflows, but I focused on making this as simple as possible for
now.

The updates to the git maintenance builtin include new register/unregister 
subcommands and start/stop subcommands. The register subcommand initializes
the config while the start subcommand does everything register does plus 
update the cron table. The unregister and stop commands reverse this
process.

A troubleshooting guide is added to Documentation/git-maintenance.txt to
advise expert users who choose to create custom cron schedules.

The very last patch is entirely optional. It sets a recommended schedule
based on my own experience with very large repositories. I'm open to other
suggestions, but these are ones that I think work well and don't cause a
"rewrite the world" scenario like running nightly 'gc' would do.

I've been testing this scenario on my macOS laptop and Linux desktop. I have
modified my cron task to provide logging via trace2 so I can see what's
happening. A future direction here would be to add some maintenance logs to
the repository so we can track what is happening and diagnose whether the
maintenance strategy is working on real repos.

Note: git maintenance (start|stop) only works on machines with cron by
design. The proper thing to do on Windows will come later. Perhaps this
command should be marked as unavailable on Windows somehow, or at least a
better error than "cron may not be available on your system". I did find
that that message is helpful sometimes: macOS worker agents for CI builds
typically do not have cron available.

Derrick Stolee (7):
  maintenance: optionally skip --auto process
  maintenance: add --schedule option and config
  for-each-repo: run subcommands on configured repos
  maintenance: add [un]register subcommands
  maintenance: add start/stop subcommands
  maintenance: recommended schedule in register/start
  maintenance: add troubleshooting guide to docs

 .gitignore                           |   1 +
 Documentation/config/maintenance.txt |  10 +
 Documentation/git-for-each-repo.txt  |  59 ++++++
 Documentation/git-maintenance.txt    |  88 +++++++-
 Makefile                             |   2 +
 builtin.h                            |   1 +
 builtin/for-each-repo.c              |  58 ++++++
 builtin/gc.c                         | 292 ++++++++++++++++++++++++++-
 command-list.txt                     |   1 +
 git.c                                |   1 +
 run-command.c                        |   6 +
 t/helper/test-crontab.c              |  35 ++++
 t/helper/test-tool.c                 |   1 +
 t/helper/test-tool.h                 |   1 +
 t/t0068-for-each-repo.sh             |  30 +++
 t/t7900-maintenance.sh               | 114 ++++++++++-
 t/test-lib.sh                        |   6 +
 17 files changed, 698 insertions(+), 8 deletions(-)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100644 t/helper/test-crontab.c
 create mode 100755 t/t0068-for-each-repo.sh


base-commit: e576ac2c7c7f6c7aa5ac08a516baeb61bf723596
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-724%2Fderrickstolee%2Fmaintenance%2Fscheduled-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-724/derrickstolee/maintenance/scheduled-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/724
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 1/7] maintenance: optionally skip --auto process
  2020-09-04 15:41 [PATCH 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
@ 2020-09-04 15:42 ` Derrick Stolee via GitGitGadget
  2020-09-04 15:42 ` [PATCH 2/7] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-04 15:42 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Some commands run 'git maintenance run --auto --[no-]quiet' after doing
their normal work, as a way to keep repositories clean as they are used.
Currently, users who do not want this maintenance to occur would set the
'gc.auto' config option to 0 to avoid the 'gc' task from running.
However, this does not stop the extra process invocation. On Windows,
this extra process invocation can be more expensive than necessary.

Allow users to drop this extra process by setting 'maintenance.auto' to
'false'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/maintenance.txt |  5 +++++
 run-command.c                        |  6 ++++++
 t/t7900-maintenance.sh               | 13 +++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index a0706d8f09..06db758172 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -1,3 +1,8 @@
+maintenance.auto::
+	This boolean config option controls whether some commands run
+	`git maintenance run --auto` after doing their normal work. Defaults
+	to true.
+
 maintenance.<task>.enabled::
 	This boolean config option controls whether the maintenance task
 	with name `<task>` is run when no `--task` option is specified to
diff --git a/run-command.c b/run-command.c
index 2ee59acdc8..ea4d0fb4b1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -7,6 +7,7 @@
 #include "strbuf.h"
 #include "string-list.h"
 #include "quote.h"
+#include "config.h"
 
 void child_process_init(struct child_process *child)
 {
@@ -1868,8 +1869,13 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 
 int run_auto_maintenance(int quiet)
 {
+	int enabled;
 	struct child_process maint = CHILD_PROCESS_INIT;
 
+	if (!git_config_get_bool("maintenance.auto", &enabled) &&
+	    !enabled)
+		return 0;
+
 	maint.git_cmd = 1;
 	strvec_pushl(&maint.args, "maintenance", "run", "--auto", NULL);
 	strvec_push(&maint.args, quiet ? "--quiet" : "--no-quiet");
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 6f878b0141..e0ba19e1ff 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -26,6 +26,19 @@ test_expect_success 'run [--auto|--quiet]' '
 	test_subcommand git gc --no-quiet <run-no-quiet.txt
 '
 
+test_expect_success 'maintenance.auto config option' '
+	GIT_TRACE2_EVENT="$(pwd)/default" git commit --quiet --allow-empty -m 1 &&
+	test_subcommand git maintenance run --auto --quiet <default &&
+	GIT_TRACE2_EVENT="$(pwd)/true" \
+		git -c maintenance.auto=true \
+		commit --quiet --allow-empty -m 2 &&
+	test_subcommand git maintenance run --auto --quiet  <true &&
+	GIT_TRACE2_EVENT="$(pwd)/false" \
+		git -c maintenance.auto=false \
+		commit --quiet --allow-empty -m 3 &&
+	test_subcommand ! git maintenance run --auto --quiet  <false
+'
+
 test_expect_success 'maintenance.<task>.enabled' '
 	git config maintenance.gc.enabled false &&
 	git config maintenance.commit-graph.enabled true &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 2/7] maintenance: add --schedule option and config
  2020-09-04 15:41 [PATCH 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  2020-09-04 15:42 ` [PATCH 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
@ 2020-09-04 15:42 ` Derrick Stolee via GitGitGadget
  2020-09-08 13:07   ` Đoàn Trần Công Danh
  2020-09-04 15:42 ` [PATCH 3/7] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-04 15:42 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

A user may want to run certain maintenance tasks based on frequency, not
conditions given in the repository. For example, the user may want to
perform a 'prefetch' task every hour, or 'gc' task every day. To assist,
update the 'git maintenance run' command to include a
'--schedule=<frequency>' option. The allowed frequencies are 'hourly',
'daily', and 'weekly'. These values are also allowed in a new config
value 'maintenance.<task>.schedule'.

The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'
config value for each enabled task to see if the configured frequency is
at least as frequent as the frequency from the '--schedule' argument. We
use the following order, for full clarity:

	'hourly' > 'daily' > 'weekly'

Use new 'enum schedule_priority' to track these values numerically.

The following cron table would run the scheduled tasks with the correct
frequencies:

  0 1-23 * * *    git -C <repo> maintenance run --scheduled=hourly
  0 0    * * 1-6  git -C <repo> maintenance run --scheduled=daily
  0 0    * * 0    git -C <repo> maintenance run --scheduled=weekly

This cron schedule will run --scheduled=hourly every hour except at
midnight. This avoids a concurrent run with the --scheduled=daily that
runs at midnight every day except the first day of the week. This avoids
a concurrent run with the --scheduled=weekly that runs at midnight on
the first day of the week. Since --scheduled=daily also runs the
'hourly' tasks and --scheduled=weekly runs the 'hourly' and 'daily'
tasks, we will still see all tasks run with the proper frequencies.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/maintenance.txt |  5 +++
 Documentation/git-maintenance.txt    | 13 +++++-
 builtin/gc.c                         | 67 +++++++++++++++++++++++++---
 t/t7900-maintenance.sh               | 40 +++++++++++++++++
 4 files changed, 119 insertions(+), 6 deletions(-)

diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index 06db758172..70585564fa 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -10,6 +10,11 @@ maintenance.<task>.enabled::
 	`--task` option exists. By default, only `maintenance.gc.enabled`
 	is true.
 
+maintenance.<task>.schedule::
+	This config option controls whether or not the given `<task>` runs
+	during a `git maintenance run --schedule=<frequency>` command. The
+	value must be one of "hourly", "daily", or "weekly".
+
 maintenance.commit-graph.auto::
 	This integer config option controls how often the `commit-graph` task
 	should be run as part of `git maintenance run --auto`. If zero, then
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index b44efb05a3..3af5907b01 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -107,7 +107,18 @@ OPTIONS
 	only if certain thresholds are met. For example, the `gc` task
 	runs when the number of loose objects exceeds the number stored
 	in the `gc.auto` config setting, or when the number of pack-files
-	exceeds the `gc.autoPackLimit` config setting.
+	exceeds the `gc.autoPackLimit` config setting. Not compatible with
+	the `--schedule` option.
+
+--schedule::
+	When combined with the `run` subcommand, run maintenance tasks
+	only if certain time conditions are met, as specified by the
+	`maintenance.<task>.schedule` config value for each `<task>`.
+	This config value specifies a number of seconds since the last
+	time that task ran, according to the `maintenance.<task>.lastRun`
+	config value. The tasks that are tested are those provided by
+	the `--task=<task>` option(s) or those with
+	`maintenance.<task>.enabled` set to true.
 
 --quiet::
 	Do not report progress or other information over `stderr`.
diff --git a/builtin/gc.c b/builtin/gc.c
index f8459df04c..85a3370692 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -704,14 +704,51 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
-static const char * const builtin_maintenance_run_usage[] = {
-	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>]"),
+static const char *const builtin_maintenance_run_usage[] = {
+	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>] [--schedule]"),
 	NULL
 };
 
+enum schedule_priority {
+	SCHEDULE_NONE = 0,
+	SCHEDULE_WEEKLY = 1,
+	SCHEDULE_DAILY = 2,
+	SCHEDULE_HOURLY = 3,
+};
+
+static enum schedule_priority parse_schedule(const char *value)
+{
+	if (!value)
+		return SCHEDULE_NONE;
+	if (!strcasecmp(value, "hourly"))
+		return SCHEDULE_HOURLY;
+	if (!strcasecmp(value, "daily"))
+		return SCHEDULE_DAILY;
+	if (!strcasecmp(value, "weekly"))
+		return SCHEDULE_WEEKLY;
+	return SCHEDULE_NONE;
+}
+
+static int maintenance_opt_schedule(const struct option *opt, const char *arg,
+				    int unset)
+{
+	enum schedule_priority *priority = opt->value;
+
+	if (unset)
+		die(_("--no-schedule is not allowed"));
+
+	*priority = parse_schedule(arg);
+
+	if (!*priority)
+		die(_("unrecognized --schedule argument '%s'"), arg);
+
+	return 0;
+}
+
 struct maintenance_run_opts {
 	int auto_flag;
 	int quiet;
+	enum schedule_priority schedule;
 };
 
 /* Remember to update object flag allocation in object.h */
@@ -1159,6 +1196,8 @@ struct maintenance_task {
 	maintenance_auto_fn *auto_condition;
 	unsigned enabled:1;
 
+	enum schedule_priority schedule;
+
 	/* -1 if not selected. */
 	int selected_order;
 };
@@ -1250,8 +1289,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 			continue;
 
 		if (opts->auto_flag &&
-		    (!tasks[i].auto_condition ||
-		     !tasks[i].auto_condition()))
+		    (!tasks[i].auto_condition || !tasks[i].auto_condition()))
+			continue;
+
+		if (opts->schedule && tasks[i].schedule < opts->schedule)
 			continue;
 
 		trace2_region_enter("maintenance", tasks[i].name, r);
@@ -1274,13 +1315,23 @@ static void initialize_task_config(void)
 
 	for (i = 0; i < TASK__COUNT; i++) {
 		int config_value;
+		char *config_str;
 
-		strbuf_setlen(&config_name, 0);
+		strbuf_reset(&config_name);
 		strbuf_addf(&config_name, "maintenance.%s.enabled",
 			    tasks[i].name);
 
 		if (!git_config_get_bool(config_name.buf, &config_value))
 			tasks[i].enabled = config_value;
+
+		strbuf_reset(&config_name);
+		strbuf_addf(&config_name, "maintenance.%s.schedule",
+			    tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &config_str)) {
+			tasks[i].schedule = parse_schedule(config_str);
+			free(config_str);
+		}
 	}
 
 	strbuf_release(&config_name);
@@ -1324,6 +1375,9 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
+		OPT_CALLBACK(0, "schedule", &opts.schedule, N_("frequency"),
+			     N_("run tasks based on frequency"),
+			     maintenance_opt_schedule),
 		OPT_BOOL(0, "quiet", &opts.quiet,
 			 N_("do not report progress or other information over stderr")),
 		OPT_CALLBACK_F(0, "task", NULL, N_("task"),
@@ -1344,6 +1398,9 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 			     builtin_maintenance_run_usage,
 			     PARSE_OPT_STOP_AT_NON_OPTION);
 
+	if (opts.auto_flag && opts.schedule)
+		die(_("use at most one of --auto and --schedule=<frequency>"));
+
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index e0ba19e1ff..328bbaa830 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -264,4 +264,44 @@ test_expect_success 'maintenance.incremental-repack.auto' '
 	done
 '
 
+test_expect_success '--auto and --schedule incompatible' '
+	test_must_fail git maintenance run --auto --schedule=daily 2>err &&
+	test_i18ngrep "at most one" err
+'
+
+test_expect_success 'invalid --schedule value' '
+	test_must_fail git maintenance run --schedule=annually 2>err &&
+	test_i18ngrep "unrecognized --schedule" err
+'
+
+test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
+	git config maintenance.loose-objects.enabled true &&
+	git config maintenance.loose-objects.schedule hourly &&
+	git config maintenance.commit-graph.enabled true &&
+	git config maintenance.commit-graph.schedule daily &&
+	git config maintenance.incremental-repack.enabled true &&
+	git config maintenance.incremental-repack.schedule weekly &&
+
+	GIT_TRACE2_EVENT="$(pwd)/hourly.txt" \
+		git maintenance run --schedule=hourly 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <hourly.txt &&
+	test_subcommand ! git commit-graph write --split --reachable \
+		--no-progress <hourly.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress <hourly.txt &&
+
+	GIT_TRACE2_EVENT="$(pwd)/daily.txt" \
+		git maintenance run --schedule=daily 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <daily.txt &&
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <daily.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress <daily.txt &&
+
+	GIT_TRACE2_EVENT="$(pwd)/weekly.txt" \
+		git maintenance run --schedule=weekly 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <weekly.txt &&
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <weekly.txt &&
+	test_subcommand git multi-pack-index write --no-progress <weekly.txt
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 3/7] for-each-repo: run subcommands on configured repos
  2020-09-04 15:41 [PATCH 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  2020-09-04 15:42 ` [PATCH 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
  2020-09-04 15:42 ` [PATCH 2/7] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
@ 2020-09-04 15:42 ` Derrick Stolee via GitGitGadget
  2020-09-04 15:42 ` [PATCH 4/7] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-04 15:42 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

It can be helpful to store a list of repositories in global or system
config and then iterate Git commands on that list. Create a new builtin
that makes this process simple for experts. We will use this builtin to
run scheduled maintenance on all configured repositories in a future
change.

The test is very simple, but does highlight that the "--" argument is
optional.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 .gitignore                          |  1 +
 Documentation/git-for-each-repo.txt | 59 +++++++++++++++++++++++++++++
 Makefile                            |  1 +
 builtin.h                           |  1 +
 builtin/for-each-repo.c             | 58 ++++++++++++++++++++++++++++
 command-list.txt                    |  1 +
 git.c                               |  1 +
 t/t0068-for-each-repo.sh            | 30 +++++++++++++++
 8 files changed, 152 insertions(+)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100755 t/t0068-for-each-repo.sh

diff --git a/.gitignore b/.gitignore
index a5808fa30d..5eb2a2be71 100644
--- a/.gitignore
+++ b/.gitignore
@@ -67,6 +67,7 @@
 /git-filter-branch
 /git-fmt-merge-msg
 /git-for-each-ref
+/git-for-each-repo
 /git-format-patch
 /git-fsck
 /git-fsck-objects
diff --git a/Documentation/git-for-each-repo.txt b/Documentation/git-for-each-repo.txt
new file mode 100644
index 0000000000..94bd19da26
--- /dev/null
+++ b/Documentation/git-for-each-repo.txt
@@ -0,0 +1,59 @@
+git-for-each-repo(1)
+====================
+
+NAME
+----
+git-for-each-repo - Run a Git command on a list of repositories
+
+
+SYNOPSIS
+--------
+[verse]
+'git for-each-repo' --config=<config> [--] <arguments>
+
+
+DESCRIPTION
+-----------
+Run a Git command on a list of repositories. The arguments after the
+known options or `--` indicator are used as the arguments for the Git
+subprocess.
+
+THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
+
+For example, we could run maintenance on each of a list of repositories
+stored in a `maintenance.repo` config variable using
+
+-------------
+git for-each-repo --config=maintenance.repo maintenance run
+-------------
+
+This will run `git -C <repo> maintenance run` for each value `<repo>`
+in the multi-valued config variable `maintenance.repo`.
+
+
+OPTIONS
+-------
+--config=<config>::
+	Use the given config variable as a multi-valued list storing
+	absolute path names. Iterate on that list of paths to run
+	the given arguments.
++
+These config values are loaded from system, global, and local Git config,
+as available. If `git for-each-repo` is run in a directory that is not a
+Git repository, then only the system and global config is used.
+
+
+SUBPROCESS BEHAVIOR
+-------------------
+
+If any `git -C <repo> <arguments>` subprocess returns a non-zero exit code,
+then the `git for-each-repo` process returns that exit code without running
+more subprocesses.
+
+Each `git -C <repo> <arguments>` subprocess inherits the standard file
+descriptors `stdin`, `stdout`, and `stderr`.
+
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 65f8cfb236..7c588ff036 100644
--- a/Makefile
+++ b/Makefile
@@ -1071,6 +1071,7 @@ BUILTIN_OBJS += builtin/fetch-pack.o
 BUILTIN_OBJS += builtin/fetch.o
 BUILTIN_OBJS += builtin/fmt-merge-msg.o
 BUILTIN_OBJS += builtin/for-each-ref.o
+BUILTIN_OBJS += builtin/for-each-repo.o
 BUILTIN_OBJS += builtin/fsck.o
 BUILTIN_OBJS += builtin/gc.o
 BUILTIN_OBJS += builtin/get-tar-commit-id.o
diff --git a/builtin.h b/builtin.h
index 17c1c0ce49..ff7c6e5aa9 100644
--- a/builtin.h
+++ b/builtin.h
@@ -150,6 +150,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix);
 int cmd_fetch_pack(int argc, const char **argv, const char *prefix);
 int cmd_fmt_merge_msg(int argc, const char **argv, const char *prefix);
 int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
+int cmd_for_each_repo(int argc, const char **argv, const char *prefix);
 int cmd_format_patch(int argc, const char **argv, const char *prefix);
 int cmd_fsck(int argc, const char **argv, const char *prefix);
 int cmd_gc(int argc, const char **argv, const char *prefix);
diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
new file mode 100644
index 0000000000..5bba623ff1
--- /dev/null
+++ b/builtin/for-each-repo.c
@@ -0,0 +1,58 @@
+#include "cache.h"
+#include "config.h"
+#include "builtin.h"
+#include "parse-options.h"
+#include "run-command.h"
+#include "string-list.h"
+
+static const char * const for_each_repo_usage[] = {
+	N_("git for-each-repo --config=<config> <command-args>"),
+	NULL
+};
+
+static int run_command_on_repo(const char *path,
+			       void *cbdata)
+{
+	int i;
+	struct child_process child = CHILD_PROCESS_INIT;
+	struct strvec *args = (struct strvec *)cbdata;
+
+	child.git_cmd = 1;
+	strvec_pushl(&child.args, "-C", path, NULL);
+
+	for (i = 0; i < args->nr; i++)
+		strvec_push(&child.args, args->v[i]);
+
+	return run_command(&child);
+}
+
+int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
+{
+	static const char *config_key = NULL;
+	int i, result = 0;
+	const struct string_list *values;
+	struct strvec args = STRVEC_INIT;
+
+	const struct option options[] = {
+		OPT_STRING(0, "config", &config_key, N_("config"),
+			   N_("config key storing a list of repository paths")),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, prefix, options, for_each_repo_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+
+	if (!config_key)
+		die(_("missing --config=<config>"));
+
+	for (i = 0; i < argc; i++)
+		strvec_push(&args, argv[i]);
+
+	values = repo_config_get_value_multi(the_repository,
+					     config_key);
+
+	for (i = 0; !result && i < values->nr; i++)
+		result = run_command_on_repo(values->items[i].string, &args);
+
+	return result;
+}
diff --git a/command-list.txt b/command-list.txt
index 0e3204e7d1..581499be82 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -94,6 +94,7 @@ git-fetch-pack                          synchingrepositories
 git-filter-branch                       ancillarymanipulators
 git-fmt-merge-msg                       purehelpers
 git-for-each-ref                        plumbinginterrogators
+git-for-each-repo                       plumbinginterrogators
 git-format-patch                        mainporcelain
 git-fsck                                ancillaryinterrogators          complete
 git-gc                                  mainporcelain
diff --git a/git.c b/git.c
index 24f250d29a..1cab64b5d1 100644
--- a/git.c
+++ b/git.c
@@ -511,6 +511,7 @@ static struct cmd_struct commands[] = {
 	{ "fetch-pack", cmd_fetch_pack, RUN_SETUP | NO_PARSEOPT },
 	{ "fmt-merge-msg", cmd_fmt_merge_msg, RUN_SETUP },
 	{ "for-each-ref", cmd_for_each_ref, RUN_SETUP },
+	{ "for-each-repo", cmd_for_each_repo, RUN_SETUP_GENTLY },
 	{ "format-patch", cmd_format_patch, RUN_SETUP },
 	{ "fsck", cmd_fsck, RUN_SETUP },
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
diff --git a/t/t0068-for-each-repo.sh b/t/t0068-for-each-repo.sh
new file mode 100755
index 0000000000..136b4ec839
--- /dev/null
+++ b/t/t0068-for-each-repo.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+
+test_description='git for-each-repo builtin'
+
+. ./test-lib.sh
+
+test_expect_success 'run based on configured value' '
+	git init one &&
+	git init two &&
+	git init three &&
+	git -C two commit --allow-empty -m "DID NOT RUN" &&
+	git config run.key "$TRASH_DIRECTORY/one" &&
+	git config --add run.key "$TRASH_DIRECTORY/three" &&
+	git for-each-repo --config=run.key commit --allow-empty -m "ran" &&
+	git -C one log -1 --pretty=format:%s >message &&
+	grep ran message &&
+	git -C two log -1 --pretty=format:%s >message &&
+	! grep ran message &&
+	git -C three log -1 --pretty=format:%s >message &&
+	grep ran message &&
+	git for-each-repo --config=run.key -- commit --allow-empty -m "ran again" &&
+	git -C one log -1 --pretty=format:%s >message &&
+	grep again message &&
+	git -C two log -1 --pretty=format:%s >message &&
+	! grep again message &&
+	git -C three log -1 --pretty=format:%s >message &&
+	grep again message
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 4/7] maintenance: add [un]register subcommands
  2020-09-04 15:41 [PATCH 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                   ` (2 preceding siblings ...)
  2020-09-04 15:42 ` [PATCH 3/7] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
@ 2020-09-04 15:42 ` Derrick Stolee via GitGitGadget
  2020-09-04 15:42 ` [PATCH 5/7] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-04 15:42 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In preparation for launching background maintenance from the 'git
maintenance' builtin, create register/unregister subcommands. These
commands update the new 'maintenance.repos' config option in the global
config so the background maintenance job knows which repositories to
maintain.

These commands allow users to add a repository to the background
maintenance list without disrupting the actual maintenance mechanism.

For example, a user can run 'git maintenance register' when no
background maintenance is running and it will not start the background
maintenance. A later update to start running background maintenance will
then pick up this repository automatically.

The opposite example is that a user can run 'git maintenance unregister'
to remove the current repository from background maintenance without
halting maintenance for other repositories.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 14 ++++++++
 builtin/gc.c                      | 55 ++++++++++++++++++++++++++++++-
 t/t7900-maintenance.sh            | 17 +++++++++-
 3 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 3af5907b01..78d0d8df91 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -29,6 +29,15 @@ Git repository.
 SUBCOMMANDS
 -----------
 
+register::
+	Initialize Git config values so any scheduled maintenance will
+	start running on this repository. This adds the repository to the
+	`maintenance.repo` config variable in the current user's global
+	config and enables some recommended configuration values for
+	`maintenance.<task>.schedule`. The tasks that are enabled are safe
+	for running in the background without disrupting foreground
+	processes.
+
 run::
 	Run one or more maintenance tasks. If one or more `--task` options
 	are specified, then those tasks are run in that order. Otherwise,
@@ -36,6 +45,11 @@ run::
 	config options are true. By default, only `maintenance.gc.enabled`
 	is true.
 
+unregister::
+	Remove the current repository from background maintenance. This
+	only removes the repository from the configured list. It does not
+	stop the background maintenance processes from running.
+
 TASKS
 -----
 
diff --git a/builtin/gc.c b/builtin/gc.c
index 85a3370692..ec77e8d2fa 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1407,7 +1407,56 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	return maintenance_run_tasks(&opts);
 }
 
-static const char builtin_maintenance_usage[] = N_("git maintenance run [<options>]");
+static int maintenance_register(void)
+{
+	struct child_process config_set = CHILD_PROCESS_INIT;
+	struct child_process config_get = CHILD_PROCESS_INIT;
+
+	/* There is no current repository, so skip registering it */
+	if (!the_repository || !the_repository->gitdir)
+		return 0;
+
+	config_get.git_cmd = 1;
+	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+			 NULL);
+	config_get.out = -1;
+
+	if (start_command(&config_get))
+		return error(_("failed to run 'git config'"));
+
+	/* We already have this value in our config! */
+	if (!finish_command(&config_get))
+		return 0;
+
+	config_set.git_cmd = 1;
+	strvec_pushl(&config_set.args, "config", "--add", "--global", "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+		     NULL);
+
+	return run_command(&config_set);
+}
+
+static int maintenance_unregister(void)
+{
+	struct child_process config_unset = CHILD_PROCESS_INIT;
+
+	if (!the_repository || !the_repository->gitdir)
+		return error(_("no current repository to unregister"));
+
+	config_unset.git_cmd = 1;
+	strvec_pushl(&config_unset.args, "config", "--global", "--unset",
+		     "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+		     NULL);
+
+	return run_command(&config_unset);
+}
+
+static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
 
 int cmd_maintenance(int argc, const char **argv, const char *prefix)
 {
@@ -1416,6 +1465,10 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "run"))
 		return maintenance_run(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "register"))
+		return maintenance_register();
+	if (!strcmp(argv[1], "unregister"))
+		return maintenance_unregister();
 
 	die(_("invalid subcommand: %s"), argv[1]);
 }
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 328bbaa830..272d1605d2 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -9,7 +9,7 @@ GIT_TEST_MULTI_PACK_INDEX=0
 
 test_expect_success 'help text' '
 	test_expect_code 129 git maintenance -h 2>err &&
-	test_i18ngrep "usage: git maintenance run" err &&
+	test_i18ngrep "usage: git maintenance <subcommand>" err &&
 	test_expect_code 128 git maintenance barf 2>err &&
 	test_i18ngrep "invalid subcommand: barf" err
 '
@@ -304,4 +304,19 @@ test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
 	test_subcommand git multi-pack-index write --no-progress <weekly.txt
 '
 
+test_expect_success 'register and unregister' '
+	test_when_finished git config --global --unset-all maintenance.repo &&
+	git config --global --add maintenance.repo /existing1 &&
+	git config --global --add maintenance.repo /existing2 &&
+	git config --global --get-all maintenance.repo >before &&
+	git maintenance register &&
+	git config --global --get-all maintenance.repo >actual &&
+	cp before after &&
+	pwd >>after &&
+	test_cmp after actual &&
+	git maintenance unregister &&
+	git config --global --get-all maintenance.repo >actual &&
+	test_cmp before actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 5/7] maintenance: add start/stop subcommands
  2020-09-04 15:41 [PATCH 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                   ` (3 preceding siblings ...)
  2020-09-04 15:42 ` [PATCH 4/7] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
@ 2020-09-04 15:42 ` Derrick Stolee via GitGitGadget
  2020-09-08  6:29   ` SZEDER Gábor
  2020-09-04 15:42 ` [PATCH 6/7] maintenance: recommended schedule in register/start Derrick Stolee via GitGitGadget
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-04 15:42 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add new subcommands to 'git maintenance' that start or stop background
maintenance using 'cron', when available. This integration is as simple
as I could make it, barring some implementation complications.

The schedule is laid out as follows:

  0 1-23 * * *   $cmd maintenance run --schedule=hourly
  0 0    * * 1-6 $cmd maintenance run --schedule=daily
  0 0    * * 0   $cmd maintenance run --schedule=weekly

where $cmd is a properly-qualified 'git for-each-repo' execution:

$cmd=$path/git --exec-path=$path for-each-repo --config=maintenance.repo

where $path points to the location of the Git executable running 'git
maintenance start'. This is critical for systems with multiple versions
of Git. Specifically, macOS has a system version at '/usr/bin/git' while
the version that users can install resides at '/usr/local/bin/git'
(symlinked to '/usr/local/libexec/git-core/git'). This will also use
your locally-built version if you build and run this in your development
environment without installing first.

This conditional schedule avoids having cron launch multiple 'git
for-each-repo' commands in parallel. Such parallel commands would likely
lead to the 'hourly' and 'daily' tasks competing over the object
database lock. This could lead to to some tasks never being run! Since
the --schedule=<frequency> argument will run all tasks with _at least_
the given frequency, the daily runs will also run the hourly tasks.
Similarly, the weekly runs will also run the daily and hourly tasks.

The GIT_TEST_CRONTAB environment variable is not intended for users to
edit, but instead as a way to mock the 'crontab [-l]' command. This
variable is set in test-lib.sh to avoid a future test from accidentally
running anything with the cron integration from modifying the user's
schedule. We use GIT_TEST_CRONTAB='test-tool crontab <file>' in our
tests to check how the schedule is modified in 'git maintenance
(start|stop)' commands.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  11 +++
 Makefile                          |   1 +
 builtin/gc.c                      | 124 ++++++++++++++++++++++++++++++
 t/helper/test-crontab.c           |  35 +++++++++
 t/helper/test-tool.c              |   1 +
 t/helper/test-tool.h              |   1 +
 t/t7900-maintenance.sh            |  28 +++++++
 t/test-lib.sh                     |   6 ++
 8 files changed, 207 insertions(+)
 create mode 100644 t/helper/test-crontab.c

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 78d0d8df91..7f8c279fe8 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -45,6 +45,17 @@ run::
 	config options are true. By default, only `maintenance.gc.enabled`
 	is true.
 
+start::
+	Start running maintenance on the current repository. This performs
+	the same config updates as the `register` subcommand, then updates
+	the background scheduler to run `git maintenance run --scheduled`
+	on an hourly basis.
+
+stop::
+	Halt the background maintenance schedule. The current repository
+	is not removed from the list of maintained repositories, in case
+	the background maintenance is restarted later.
+
 unregister::
 	Remove the current repository from background maintenance. This
 	only removes the repository from the configured list. It does not
diff --git a/Makefile b/Makefile
index 7c588ff036..c39b39bd7d 100644
--- a/Makefile
+++ b/Makefile
@@ -690,6 +690,7 @@ TEST_BUILTINS_OBJS += test-advise.o
 TEST_BUILTINS_OBJS += test-bloom.o
 TEST_BUILTINS_OBJS += test-chmtime.o
 TEST_BUILTINS_OBJS += test-config.o
+TEST_BUILTINS_OBJS += test-crontab.o
 TEST_BUILTINS_OBJS += test-ctype.o
 TEST_BUILTINS_OBJS += test-date.o
 TEST_BUILTINS_OBJS += test-delta.o
diff --git a/builtin/gc.c b/builtin/gc.c
index ec77e8d2fa..9914417e25 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -32,6 +32,7 @@
 #include "remote.h"
 #include "midx.h"
 #include "object-store.h"
+#include "exec-cmd.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -1456,6 +1457,125 @@ static int maintenance_unregister(void)
 	return run_command(&config_unset);
 }
 
+#define BEGIN_LINE "# BEGIN GIT MAINTENANCE SCHEDULE"
+#define END_LINE "# END GIT MAINTENANCE SCHEDULE"
+
+static int update_background_schedule(int run_maintenance)
+{
+	int result = 0;
+	int in_old_region = 0;
+	struct child_process crontab_list = CHILD_PROCESS_INIT;
+	struct child_process crontab_edit = CHILD_PROCESS_INIT;
+	FILE *cron_list, *cron_in;
+	const char *crontab_name;
+	struct strbuf line = STRBUF_INIT;
+	struct lock_file lk;
+	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);
+
+	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
+		return error(_("another process is scheduling background maintenance"));
+
+	crontab_name = getenv("GIT_TEST_CRONTAB");
+	if (!crontab_name)
+		crontab_name = "crontab";
+
+	strvec_split(&crontab_list.args, crontab_name);
+	strvec_push(&crontab_list.args, "-l");
+	crontab_list.in = -1;
+	crontab_list.out = dup(lk.tempfile->fd);
+	crontab_list.git_cmd = 0;
+
+	if (start_command(&crontab_list)) {
+		result = error(_("failed to run 'crontab -l'; your system might not support 'cron'"));
+		goto cleanup;
+	}
+
+	/* Ignore exit code, as an empty crontab will return error. */
+	finish_command(&crontab_list);
+
+	/*
+	 * Read from the .lock file, filtering out the old
+	 * schedule while appending the new schedule.
+	 */
+	cron_list = fdopen(lk.tempfile->fd, "r");
+	rewind(cron_list);
+
+	strvec_split(&crontab_edit.args, crontab_name);
+	crontab_edit.in = -1;
+	crontab_edit.git_cmd = 0;
+
+	if (start_command(&crontab_edit)) {
+		result = error(_("failed to run 'crontab'; your system might not support 'cron'"));
+		goto cleanup;
+	}
+
+	cron_in = fdopen(crontab_edit.in, "w");
+	if (!cron_in) {
+		result = error(_("failed to open stdin of 'crontab'"));
+		goto done_editing;
+	}
+
+	while (!strbuf_getline_lf(&line, cron_list)) {
+		if (!in_old_region && !strcmp(line.buf, BEGIN_LINE))
+			in_old_region = 1;
+		if (in_old_region)
+			continue;
+		fprintf(cron_in, "%s\n", line.buf);
+		if (in_old_region && !strcmp(line.buf, END_LINE))
+			in_old_region = 0;
+	}
+
+	if (run_maintenance) {
+		struct strbuf line_format = STRBUF_INIT;
+		const char *exec_path = git_exec_path();
+
+		fprintf(cron_in, "%s\n", BEGIN_LINE);
+		fprintf(cron_in,
+			"# The following schedule was created by Git\n");
+		fprintf(cron_in, "# Any edits made in this region might be\n");
+		fprintf(cron_in,
+			"# replaced in the future by a Git command.\n\n");
+
+		strbuf_addf(&line_format,
+			    "%%s %%s * * %%s \"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%s\n",
+			    exec_path, exec_path);
+		fprintf(cron_in, line_format.buf, "0", "1-23", "*", "hourly");
+		fprintf(cron_in, line_format.buf, "0", "0", "1-6", "daily");
+		fprintf(cron_in, line_format.buf, "0", "0", "0", "weekly");
+		strbuf_release(&line_format);
+
+		fprintf(cron_in, "\n%s\n", END_LINE);
+	}
+
+	fflush(cron_in);
+	fclose(cron_in);
+	close(crontab_edit.in);
+
+done_editing:
+	if (finish_command(&crontab_edit)) {
+		result = error(_("'crontab' died"));
+		goto cleanup;
+	}
+	fclose(cron_list);
+
+cleanup:
+	rollback_lock_file(&lk);
+	return result;
+}
+
+static int maintenance_start(void)
+{
+	if (maintenance_register())
+		warning(_("failed to add repo to global config"));
+
+	return update_background_schedule(1);
+}
+
+static int maintenance_stop(void)
+{
+	return update_background_schedule(0);
+}
+
 static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
 
 int cmd_maintenance(int argc, const char **argv, const char *prefix)
@@ -1465,6 +1585,10 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "run"))
 		return maintenance_run(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "start"))
+		return maintenance_start();
+	if (!strcmp(argv[1], "stop"))
+		return maintenance_stop();
 	if (!strcmp(argv[1], "register"))
 		return maintenance_register();
 	if (!strcmp(argv[1], "unregister"))
diff --git a/t/helper/test-crontab.c b/t/helper/test-crontab.c
new file mode 100644
index 0000000000..f5db6319c6
--- /dev/null
+++ b/t/helper/test-crontab.c
@@ -0,0 +1,35 @@
+#include "test-tool.h"
+#include "cache.h"
+
+/*
+ * Usage: test-tool cron <file> [-l]
+ *
+ * If -l is specified, then write the contents of <file> to stdou.
+ * Otherwise, write from stdin into <file>.
+ */
+int cmd__crontab(int argc, const char **argv)
+{
+	char a;
+	FILE *from, *to;
+
+	if (argc == 3 && !strcmp(argv[2], "-l")) {
+		from = fopen(argv[1], "r");
+		if (!from)
+			return 0;
+		to = stdout;
+	} else if (argc == 2) {
+		from = stdin;
+		to = fopen(argv[1], "w");
+	} else
+		return error("unknown arguments");
+
+	while ((a = fgetc(from)) != EOF)
+		fputc(a, to);
+
+	if (argc == 3)
+		fclose(from);
+	else
+		fclose(to);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 590b2efca7..432b49d948 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -18,6 +18,7 @@ static struct test_cmd cmds[] = {
 	{ "bloom", cmd__bloom },
 	{ "chmtime", cmd__chmtime },
 	{ "config", cmd__config },
+	{ "crontab", cmd__crontab },
 	{ "ctype", cmd__ctype },
 	{ "date", cmd__date },
 	{ "delta", cmd__delta },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index ddc8e990e9..7c3281e071 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -8,6 +8,7 @@ int cmd__advise_if_enabled(int argc, const char **argv);
 int cmd__bloom(int argc, const char **argv);
 int cmd__chmtime(int argc, const char **argv);
 int cmd__config(int argc, const char **argv);
+int cmd__crontab(int argc, const char **argv);
 int cmd__ctype(int argc, const char **argv);
 int cmd__date(int argc, const char **argv);
 int cmd__delta(int argc, const char **argv);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 272d1605d2..8803fcf621 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -319,4 +319,32 @@ test_expect_success 'register and unregister' '
 	test_cmp before actual
 '
 
+test_expect_success 'start from empty cron table' '
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
+
+	# start registers the repo
+	git config --get --global maintenance.repo "$(pwd)" &&
+
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=daily" cron.txt &&
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=hourly" cron.txt &&
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=weekly" cron.txt
+'
+
+test_expect_success 'stop from existing schedule' '
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
+
+	# stop does not unregister the repo
+	git config --get --global maintenance.repo "$(pwd)" &&
+
+	# Operation is idempotent
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
+	test_must_be_empty cron.txt
+'
+
+test_expect_success 'start preserves existing schedule' '
+	echo "Important information!" >cron.txt &&
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
+	grep "Important information!" cron.txt
+'
+
 test_done
diff --git a/t/test-lib.sh b/t/test-lib.sh
index ef31f40037..4a60d1ed76 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1702,3 +1702,9 @@ test_lazy_prereq SHA1 '
 test_lazy_prereq REBASE_P '
 	test -z "$GIT_TEST_SKIP_REBASE_P"
 '
+
+# Ensure that no test accidentally triggers a Git command
+# that runs 'crontab', affecting a user's cron schedule.
+# Tests that verify the cron integration must set this locally
+# to avoid errors.
+GIT_TEST_CRONTAB="exit 1"
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 6/7] maintenance: recommended schedule in register/start
  2020-09-04 15:41 [PATCH 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                   ` (4 preceding siblings ...)
  2020-09-04 15:42 ` [PATCH 5/7] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
@ 2020-09-04 15:42 ` Derrick Stolee via GitGitGadget
  2020-09-04 15:42 ` [PATCH 7/7] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
  2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-04 15:42 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git maintenance (register|start)' subcommands add the current
repository to the global Git config so maintenance will operate on that
repository. It does not specify what maintenance should occur or how
often.

If a user sets any 'maintenance.<task>.schedule' config value, then
they have chosen a specific schedule for themselves and Git should
respect that.

However, in an effort to recommend a good schedule for repositories of
all sizes, set new config values for recommended tasks that are safe to
run in the background while users run foreground Git commands. These
commands are generally everything but the 'gc' task.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  6 ++++
 builtin/gc.c                      | 46 +++++++++++++++++++++++++++++++
 t/t7900-maintenance.sh            | 16 +++++++++++
 3 files changed, 68 insertions(+)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 7f8c279fe8..364b3e32bf 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -37,6 +37,12 @@ register::
 	`maintenance.<task>.schedule`. The tasks that are enabled are safe
 	for running in the background without disrupting foreground
 	processes.
++
+If your repository has no 'maintenance.<task>.schedule' configuration
+values set, then Git will set configuration values to some recommended
+settings. These settings disable foreground maintenance while performing
+maintenance tasks in the background that will not interrupt foreground Git
+operations.
 
 run::
 	Run one or more maintenance tasks. If one or more `--task` options
diff --git a/builtin/gc.c b/builtin/gc.c
index 9914417e25..5f253d3458 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1408,6 +1408,49 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	return maintenance_run_tasks(&opts);
 }
 
+static int has_schedule_config(void)
+{
+	int i, found = 0;
+	struct strbuf config_name = STRBUF_INIT;
+	size_t prefix;
+
+	strbuf_addstr(&config_name, "maintenance.");
+	prefix = config_name.len;
+
+	for (i = 0; !found && i < TASK__COUNT; i++) {
+		char *value;
+
+		strbuf_setlen(&config_name, prefix);
+		strbuf_addf(&config_name, "%s.schedule", tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &value)) {
+			found = 1;
+			FREE_AND_NULL(value);
+		}
+	}
+
+	strbuf_release(&config_name);
+	return found;
+}
+
+static void set_recommended_schedule(void)
+{
+	git_config_set("maintenance.auto", "false");
+	git_config_set("maintenance.gc.enabled", "false");
+
+	git_config_set("maintenance.prefetch.enabled", "true");
+	git_config_set("maintenance.prefetch.schedule", "hourly");
+
+	git_config_set("maintenance.commit-graph.enabled", "true");
+	git_config_set("maintenance.commit-graph.schedule", "hourly");
+
+	git_config_set("maintenance.loose-objects.enabled", "true");
+	git_config_set("maintenance.loose-objects.schedule", "daily");
+
+	git_config_set("maintenance.incremental-repack.enabled", "true");
+	git_config_set("maintenance.incremental-repack.schedule", "daily");
+}
+
 static int maintenance_register(void)
 {
 	struct child_process config_set = CHILD_PROCESS_INIT;
@@ -1417,6 +1460,9 @@ static int maintenance_register(void)
 	if (!the_repository || !the_repository->gitdir)
 		return 0;
 
+	if (!has_schedule_config())
+		set_recommended_schedule();
+
 	config_get.git_cmd = 1;
 	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
 		     the_repository->worktree ? the_repository->worktree
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 8803fcf621..5a31f3925b 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -309,7 +309,23 @@ test_expect_success 'register and unregister' '
 	git config --global --add maintenance.repo /existing1 &&
 	git config --global --add maintenance.repo /existing2 &&
 	git config --global --get-all maintenance.repo >before &&
+
+	# We still have maintenance.<task>.schedule config set,
+	# so this does not update the local schedule
+	git maintenance register &&
+	test_must_fail git config maintenance.auto &&
+
+	# Clear previous maintenance.<task>.schedule values
+	for task in loose-objects commit-graph incremental-repack
+	do
+		git config --unset maintenance.$task.schedule || return 1
+	done &&
 	git maintenance register &&
+	test_cmp_config false maintenance.auto &&
+	test_cmp_config false maintenance.gc.enabled &&
+	test_cmp_config true maintenance.prefetch.enabled &&
+	test_cmp_config hourly maintenance.commit-graph.schedule &&
+	test_cmp_config daily maintenance.incremental-repack.schedule &&
 	git config --global --get-all maintenance.repo >actual &&
 	cp before after &&
 	pwd >>after &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 7/7] maintenance: add troubleshooting guide to docs
  2020-09-04 15:41 [PATCH 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                   ` (5 preceding siblings ...)
  2020-09-04 15:42 ` [PATCH 6/7] maintenance: recommended schedule in register/start Derrick Stolee via GitGitGadget
@ 2020-09-04 15:42 ` Derrick Stolee via GitGitGadget
  2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-04 15:42 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git maintenance run' subcommand takes a lock on the object database
to prevent concurrent processes from competing for resources. This is an
important safety measure to prevent possible repository corruption and
data loss.

This feature can lead to confusing behavior if a user is not aware of
it. Add a TROUBLESHOOTING section to the 'git maintenance' builtin
documentation that discusses these tradeoffs. The short version of this
section is that Git will not corrupt your repository, but if the list of
scheduled tasks takes longer than an hour then some scheduled tasks may
be dropped due to this object database collision. For example, a
long-running "daily" task at midnight might prevent an "hourly" task
from running at 1AM.

The opposite is also possible, but less likely as long as the "hourly"
tasks are much faster than the "daily" and "weekly" tasks.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 44 +++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 364b3e32bf..f58dd60e40 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -161,6 +161,50 @@ OPTIONS
 	`maintenance.<task>.enabled` configured as `true` are considered.
 	See the 'TASKS' section for the list of accepted `<task>` values.
 
+
+TROUBLESHOOTING
+---------------
+The `git maintenance` command is designed to simplify the repository
+maintenance patterns while minimizing user wait time during Git commands.
+A variety of configuration options are available to allow customizing this
+process. The default maintenance options focus on operations that complete
+quickly, even on large repositories.
+
+Users may find some cases where scheduled maintenance tasks do not run as
+frequently as intended. Each `git maintenance run` command takes a lock on
+the repository's object database, and this prevents other concurrent
+`git maintenance run` commands from running on the same repository. Without
+this safeguard, competing processes could leave the repository in an
+unpredictable state.
+
+The background maintenance schedule runs `git maintenance run` processes
+on an hourly basis. Each run executes the "hourly" tasks. At midnight,
+that process also executes the "daily" tasks. At midnight on the first day
+of the week, that process also executes the "weekly" tasks. A single
+process iterates over each registered repository, performing the scheduled
+tasks for that frequency. Depending on the number of registered
+repositories and their sizes, this process may take longer than an hour.
+In this case, multiple `git maintenance run` commands may run on the same
+repository at the same time, colliding on the object database lock. This
+results in one of the two tasks not running.
+
+If you find that some maintenance windows are taking longer than one hour
+to complete, then consider reducing the complexity of your maintenance
+tasks. For example, the `gc` task is much slower than the
+`incremental-repack` task. However, this comes at a cost of a slightly
+larger object database. Consider moving more expensive tasks to be run
+less frequently.
+
+Expert users may consider scheduling their own maintenance tasks using a
+different schedule than is available through `git maintenance start` and
+Git configuration options. These users should be aware of the object
+database lock and how concurrent `git maintenance run` commands behave.
+Further, the `git gc` command should not be combined with
+`git maintenance run` commands. `git gc` modifies the object database
+but does not take the lock in the same way as `git maintenance run`. If
+possible, use `git maintenance run --task=gc` instead of `git gc`.
+
+
 GIT
 ---
 Part of the linkgit:git[1] suite
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/7] maintenance: add start/stop subcommands
  2020-09-04 15:42 ` [PATCH 5/7] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
@ 2020-09-08  6:29   ` SZEDER Gábor
  2020-09-08 12:43     ` Derrick Stolee
  2020-09-08 19:31     ` Junio C Hamano
  0 siblings, 2 replies; 62+ messages in thread
From: SZEDER Gábor @ 2020-09-08  6:29 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, jrnieder, jonathantanmy, sluongng, congdanhqx,
	Derrick Stolee, Derrick Stolee, John Paul Adrian Glaubitz,
	Todd Zullinger

On Fri, Sep 04, 2020 at 03:42:04PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> Add new subcommands to 'git maintenance' that start or stop background
> maintenance using 'cron', when available. This integration is as simple
> as I could make it, barring some implementation complications.
> 
> The schedule is laid out as follows:
> 
>   0 1-23 * * *   $cmd maintenance run --schedule=hourly
>   0 0    * * 1-6 $cmd maintenance run --schedule=daily
>   0 0    * * 0   $cmd maintenance run --schedule=weekly
> 
> where $cmd is a properly-qualified 'git for-each-repo' execution:
> 
> $cmd=$path/git --exec-path=$path for-each-repo --config=maintenance.repo
> 
> where $path points to the location of the Git executable running 'git
> maintenance start'. This is critical for systems with multiple versions
> of Git. Specifically, macOS has a system version at '/usr/bin/git' while
> the version that users can install resides at '/usr/local/bin/git'
> (symlinked to '/usr/local/libexec/git-core/git'). This will also use
> your locally-built version if you build and run this in your development
> environment without installing first.
> 
> This conditional schedule avoids having cron launch multiple 'git
> for-each-repo' commands in parallel. Such parallel commands would likely
> lead to the 'hourly' and 'daily' tasks competing over the object
> database lock. This could lead to to some tasks never being run! Since
> the --schedule=<frequency> argument will run all tasks with _at least_
> the given frequency, the daily runs will also run the hourly tasks.
> Similarly, the weekly runs will also run the daily and hourly tasks.
> 
> The GIT_TEST_CRONTAB environment variable is not intended for users to
> edit, but instead as a way to mock the 'crontab [-l]' command. This
> variable is set in test-lib.sh to avoid a future test from accidentally
> running anything with the cron integration from modifying the user's
> schedule. We use GIT_TEST_CRONTAB='test-tool crontab <file>' in our
> tests to check how the schedule is modified in 'git maintenance
> (start|stop)' commands.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---


> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 272d1605d2..8803fcf621 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -319,4 +319,32 @@ test_expect_success 'register and unregister' '
>  	test_cmp before actual
>  '
>  
> +test_expect_success 'start from empty cron table' '
> +	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&

This command hangs when run on Travis CI's s390x arch.  Now, Travis
CI's multi-arch support is labelled as an alpha feature and isn't
exactly bug free, so Cc-ing Adrian and Todd, who reported and tested
big-endian issues and fixes in the past, in the hope that they can
confirm.

> +
> +	# start registers the repo
> +	git config --get --global maintenance.repo "$(pwd)" &&
> +
> +	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=daily" cron.txt &&
> +	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=hourly" cron.txt &&
> +	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=weekly" cron.txt
> +'
> +
> +test_expect_success 'stop from existing schedule' '
> +	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
> +
> +	# stop does not unregister the repo
> +	git config --get --global maintenance.repo "$(pwd)" &&
> +
> +	# Operation is idempotent
> +	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
> +	test_must_be_empty cron.txt
> +'
> +
> +test_expect_success 'start preserves existing schedule' '
> +	echo "Important information!" >cron.txt &&
> +	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
> +	grep "Important information!" cron.txt
> +'
> +
>  test_done


> diff --git a/t/helper/test-crontab.c b/t/helper/test-crontab.c
> new file mode 100644
> index 0000000000..f5db6319c6
> --- /dev/null
> +++ b/t/helper/test-crontab.c
> @@ -0,0 +1,35 @@
> +#include "test-tool.h"
> +#include "cache.h"
> +
> +/*
> + * Usage: test-tool cron <file> [-l]
> + *
> + * If -l is specified, then write the contents of <file> to stdou.

s/stdou/stdout/

> + * Otherwise, write from stdin into <file>.
> + */
> +int cmd__crontab(int argc, const char **argv)
> +{
> +	char a;

So 'a' is a char...

> +	FILE *from, *to;
> +
> +	if (argc == 3 && !strcmp(argv[2], "-l")) {
> +		from = fopen(argv[1], "r");
> +		if (!from)
> +			return 0;
> +		to = stdout;
> +	} else if (argc == 2) {
> +		from = stdin;
> +		to = fopen(argv[1], "w");
> +	} else
> +		return error("unknown arguments");
> +
> +	while ((a = fgetc(from)) != EOF)

fgetc() returns an int, which is assigned to a char, which is then
compared to whatever EOF might be on the platform.  Apparently this
casting and comparison doesn't work as expected on s390x (I haven't
even tried to think it through...), and instead of detecting EOF and
exiting we end up in an endless loop writing 0xff bytes to 'cron.txt',
while 'git maintenance start' in vain waits for 'test-crontab' to
exit.

Changing the type of 'a' to int fixes this issue, and all these tests
pass.

> +		fputc(a, to);
> +
> +	if (argc == 3)
> +		fclose(from);
> +	else
> +		fclose(to);
> +
> +	return 0;
> +}

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/7] maintenance: add start/stop subcommands
  2020-09-08  6:29   ` SZEDER Gábor
@ 2020-09-08 12:43     ` Derrick Stolee
  2020-09-08 19:31     ` Junio C Hamano
  1 sibling, 0 replies; 62+ messages in thread
From: Derrick Stolee @ 2020-09-08 12:43 UTC (permalink / raw)
  To: SZEDER Gábor, Derrick Stolee via GitGitGadget
  Cc: git, jrnieder, jonathantanmy, sluongng, congdanhqx,
	Derrick Stolee, Derrick Stolee, John Paul Adrian Glaubitz,
	Todd Zullinger

On 9/8/2020 2:29 AM, SZEDER Gábor wrote:
> On Fri, Sep 04, 2020 at 03:42:04PM +0000, Derrick Stolee via GitGitGadget wrote:
>> +test_expect_success 'start from empty cron table' '
>> +	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
> 
> This command hangs when run on Travis CI's s390x arch.  Now, Travis
> CI's multi-arch support is labelled as an alpha feature and isn't
> exactly bug free, so Cc-ing Adrian and Todd, who reported and tested
> big-endian issues and fixes in the past, in the hope that they can
> confirm.

Sounds like you have found the issue below.

>> diff --git a/t/helper/test-crontab.c b/t/helper/test-crontab.c
>> new file mode 100644
>> index 0000000000..f5db6319c6
>> --- /dev/null
>> +++ b/t/helper/test-crontab.c
>> @@ -0,0 +1,35 @@
>> +#include "test-tool.h"
>> +#include "cache.h"
>> +
>> +/*
>> + * Usage: test-tool cron <file> [-l]
>> + *
>> + * If -l is specified, then write the contents of <file> to stdou.
> 
> s/stdou/stdout/

Thanks.

>> + * Otherwise, write from stdin into <file>.
>> + */
>> +int cmd__crontab(int argc, const char **argv)
>> +{
>> +	char a;
> 
> So 'a' is a char...
> 
>> +	FILE *from, *to;
>> +
>> +	if (argc == 3 && !strcmp(argv[2], "-l")) {
>> +		from = fopen(argv[1], "r");
>> +		if (!from)
>> +			return 0;
>> +		to = stdout;
>> +	} else if (argc == 2) {
>> +		from = stdin;
>> +		to = fopen(argv[1], "w");
>> +	} else
>> +		return error("unknown arguments");
>> +
>> +	while ((a = fgetc(from)) != EOF)
> 
> fgetc() returns an int, which is assigned to a char, which is then
> compared to whatever EOF might be on the platform.  Apparently this
> casting and comparison doesn't work as expected on s390x (I haven't
> even tried to think it through...), and instead of detecting EOF and
> exiting we end up in an endless loop writing 0xff bytes to 'cron.txt',
> while 'git maintenance start' in vain waits for 'test-crontab' to
> exit.
> 
> Changing the type of 'a' to int fixes this issue, and all these tests
> pass.

Thanks for the help here. I'll fix this in the next version.

-Stolee

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/7] maintenance: add --schedule option and config
  2020-09-04 15:42 ` [PATCH 2/7] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
@ 2020-09-08 13:07   ` Đoàn Trần Công Danh
  2020-09-09 12:14     ` Derrick Stolee
  0 siblings, 1 reply; 62+ messages in thread
From: Đoàn Trần Công Danh @ 2020-09-08 13:07 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, jrnieder, jonathantanmy, sluongng, Derrick Stolee,
	Derrick Stolee

On 2020-09-04 15:42:01+0000, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> A user may want to run certain maintenance tasks based on frequency, not
> conditions given in the repository. For example, the user may want to

Hm, sorry but I couldn't decipher "not conditions" here. :|

> perform a 'prefetch' task every hour, or 'gc' task every day. To assist,

I think it's better to say: "To assist those users", at least it's
easier to read for non-native English like me.

> update the 'git maintenance run' command to include a
> '--schedule=<frequency>' option. The allowed frequencies are 'hourly',

So, we have "--schedule=" here, ...

> 'daily', and 'weekly'. These values are also allowed in a new config
> value 'maintenance.<task>.schedule'.
> 
> The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'

and here, ...

> config value for each enabled task to see if the configured frequency is
> at least as frequent as the frequency from the '--schedule' argument. We
> use the following order, for full clarity:
> 
> 	'hourly' > 'daily' > 'weekly'
> 
> Use new 'enum schedule_priority' to track these values numerically.
> 
> The following cron table would run the scheduled tasks with the correct
> frequencies:
> 
>   0 1-23 * * *    git -C <repo> maintenance run --scheduled=hourly
>   0 0    * * 1-6  git -C <repo> maintenance run --scheduled=daily
>   0 0    * * 0    git -C <repo> maintenance run --scheduled=weekly

but it's spelt with "--scheduled=", here and below, mispell, I guess.

Reading the patch, it looks like "--scheduled=" is mispelt.

> This cron schedule will run --scheduled=hourly every hour except at
> midnight. This avoids a concurrent run with the --scheduled=daily that
> runs at midnight every day except the first day of the week. This avoids
> a concurrent run with the --scheduled=weekly that runs at midnight on
> the first day of the week. Since --scheduled=daily also runs the
> 'hourly' tasks and --scheduled=weekly runs the 'hourly' and 'daily'
> tasks, we will still see all tasks run with the proper frequencies.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/config/maintenance.txt |  5 +++
>  Documentation/git-maintenance.txt    | 13 +++++-
>  builtin/gc.c                         | 67 +++++++++++++++++++++++++---
>  t/t7900-maintenance.sh               | 40 +++++++++++++++++
>  4 files changed, 119 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
> index 06db758172..70585564fa 100644
> --- a/Documentation/config/maintenance.txt
> +++ b/Documentation/config/maintenance.txt
> @@ -10,6 +10,11 @@ maintenance.<task>.enabled::
>  	`--task` option exists. By default, only `maintenance.gc.enabled`
>  	is true.
>  
> +maintenance.<task>.schedule::
> +	This config option controls whether or not the given `<task>` runs
> +	during a `git maintenance run --schedule=<frequency>` command. The
> +	value must be one of "hourly", "daily", or "weekly".
> +
>  maintenance.commit-graph.auto::
>  	This integer config option controls how often the `commit-graph` task
>  	should be run as part of `git maintenance run --auto`. If zero, then
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index b44efb05a3..3af5907b01 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -107,7 +107,18 @@ OPTIONS
>  	only if certain thresholds are met. For example, the `gc` task
>  	runs when the number of loose objects exceeds the number stored
>  	in the `gc.auto` config setting, or when the number of pack-files
> -	exceeds the `gc.autoPackLimit` config setting.
> +	exceeds the `gc.autoPackLimit` config setting. Not compatible with
> +	the `--schedule` option.
> +
> +--schedule::
> +	When combined with the `run` subcommand, run maintenance tasks
> +	only if certain time conditions are met, as specified by the
> +	`maintenance.<task>.schedule` config value for each `<task>`.
> +	This config value specifies a number of seconds since the last
> +	time that task ran, according to the `maintenance.<task>.lastRun`
> +	config value. The tasks that are tested are those provided by
> +	the `--task=<task>` option(s) or those with
> +	`maintenance.<task>.enabled` set to true.
>  
>  --quiet::
>  	Do not report progress or other information over `stderr`.
> diff --git a/builtin/gc.c b/builtin/gc.c
> index f8459df04c..85a3370692 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -704,14 +704,51 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  	return 0;
>  }
>  
> -static const char * const builtin_maintenance_run_usage[] = {
> -	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>]"),
> +static const char *const builtin_maintenance_run_usage[] = {
> +	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>] [--schedule]"),
>  	NULL
>  };
>  
> +enum schedule_priority {
> +	SCHEDULE_NONE = 0,
> +	SCHEDULE_WEEKLY = 1,
> +	SCHEDULE_DAILY = 2,
> +	SCHEDULE_HOURLY = 3,
> +};
> +
> +static enum schedule_priority parse_schedule(const char *value)
> +{
> +	if (!value)
> +		return SCHEDULE_NONE;
> +	if (!strcasecmp(value, "hourly"))
> +		return SCHEDULE_HOURLY;
> +	if (!strcasecmp(value, "daily"))
> +		return SCHEDULE_DAILY;
> +	if (!strcasecmp(value, "weekly"))
> +		return SCHEDULE_WEEKLY;
> +	return SCHEDULE_NONE;
> +}
> +
> +static int maintenance_opt_schedule(const struct option *opt, const char *arg,
> +				    int unset)
> +{
> +	enum schedule_priority *priority = opt->value;
> +
> +	if (unset)
> +		die(_("--no-schedule is not allowed"));
> +
> +	*priority = parse_schedule(arg);
> +
> +	if (!*priority)
> +		die(_("unrecognized --schedule argument '%s'"), arg);
> +
> +	return 0;
> +}
> +
>  struct maintenance_run_opts {
>  	int auto_flag;
>  	int quiet;
> +	enum schedule_priority schedule;
>  };
>  
>  /* Remember to update object flag allocation in object.h */
> @@ -1159,6 +1196,8 @@ struct maintenance_task {
>  	maintenance_auto_fn *auto_condition;
>  	unsigned enabled:1;
>  
> +	enum schedule_priority schedule;
> +
>  	/* -1 if not selected. */
>  	int selected_order;
>  };
> @@ -1250,8 +1289,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
>  			continue;
>  
>  		if (opts->auto_flag &&
> -		    (!tasks[i].auto_condition ||
> -		     !tasks[i].auto_condition()))
> +		    (!tasks[i].auto_condition || !tasks[i].auto_condition()))
> +			continue;

This line only add unnecessary noise to this patch.

-- 
Danh
> +
> +		if (opts->schedule && tasks[i].schedule < opts->schedule)
>  			continue;
>  
>  		trace2_region_enter("maintenance", tasks[i].name, r);
> @@ -1274,13 +1315,23 @@ static void initialize_task_config(void)
>  
>  	for (i = 0; i < TASK__COUNT; i++) {
>  		int config_value;
> +		char *config_str;
>  
> -		strbuf_setlen(&config_name, 0);
> +		strbuf_reset(&config_name);
>  		strbuf_addf(&config_name, "maintenance.%s.enabled",
>  			    tasks[i].name);
>  
>  		if (!git_config_get_bool(config_name.buf, &config_value))
>  			tasks[i].enabled = config_value;
> +
> +		strbuf_reset(&config_name);
> +		strbuf_addf(&config_name, "maintenance.%s.schedule",
> +			    tasks[i].name);
> +
> +		if (!git_config_get_string(config_name.buf, &config_str)) {
> +			tasks[i].schedule = parse_schedule(config_str);
> +			free(config_str);
> +		}
>  	}
>  
>  	strbuf_release(&config_name);
> @@ -1324,6 +1375,9 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
>  	struct option builtin_maintenance_run_options[] = {
>  		OPT_BOOL(0, "auto", &opts.auto_flag,
>  			 N_("run tasks based on the state of the repository")),
> +		OPT_CALLBACK(0, "schedule", &opts.schedule, N_("frequency"),
> +			     N_("run tasks based on frequency"),
> +			     maintenance_opt_schedule),
>  		OPT_BOOL(0, "quiet", &opts.quiet,
>  			 N_("do not report progress or other information over stderr")),
>  		OPT_CALLBACK_F(0, "task", NULL, N_("task"),
> @@ -1344,6 +1398,9 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
>  			     builtin_maintenance_run_usage,
>  			     PARSE_OPT_STOP_AT_NON_OPTION);
>  
> +	if (opts.auto_flag && opts.schedule)
> +		die(_("use at most one of --auto and --schedule=<frequency>"));
> +
>  	if (argc != 0)
>  		usage_with_options(builtin_maintenance_run_usage,
>  				   builtin_maintenance_run_options);
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index e0ba19e1ff..328bbaa830 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -264,4 +264,44 @@ test_expect_success 'maintenance.incremental-repack.auto' '
>  	done
>  '
>  
> +test_expect_success '--auto and --schedule incompatible' '
> +	test_must_fail git maintenance run --auto --schedule=daily 2>err &&
> +	test_i18ngrep "at most one" err
> +'
> +
> +test_expect_success 'invalid --schedule value' '
> +	test_must_fail git maintenance run --schedule=annually 2>err &&
> +	test_i18ngrep "unrecognized --schedule" err
> +'
> +
> +test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
> +	git config maintenance.loose-objects.enabled true &&
> +	git config maintenance.loose-objects.schedule hourly &&
> +	git config maintenance.commit-graph.enabled true &&
> +	git config maintenance.commit-graph.schedule daily &&
> +	git config maintenance.incremental-repack.enabled true &&
> +	git config maintenance.incremental-repack.schedule weekly &&
> +
> +	GIT_TRACE2_EVENT="$(pwd)/hourly.txt" \
> +		git maintenance run --schedule=hourly 2>/dev/null &&
> +	test_subcommand git prune-packed --quiet <hourly.txt &&
> +	test_subcommand ! git commit-graph write --split --reachable \
> +		--no-progress <hourly.txt &&
> +	test_subcommand ! git multi-pack-index write --no-progress <hourly.txt &&
> +
> +	GIT_TRACE2_EVENT="$(pwd)/daily.txt" \
> +		git maintenance run --schedule=daily 2>/dev/null &&
> +	test_subcommand git prune-packed --quiet <daily.txt &&
> +	test_subcommand git commit-graph write --split --reachable \
> +		--no-progress <daily.txt &&
> +	test_subcommand ! git multi-pack-index write --no-progress <daily.txt &&
> +
> +	GIT_TRACE2_EVENT="$(pwd)/weekly.txt" \
> +		git maintenance run --schedule=weekly 2>/dev/null &&
> +	test_subcommand git prune-packed --quiet <weekly.txt &&
> +	test_subcommand git commit-graph write --split --reachable \
> +		--no-progress <weekly.txt &&
> +	test_subcommand git multi-pack-index write --no-progress <weekly.txt
> +'
> +
>  test_done
> -- 
> gitgitgadget
> 

-- 
Danh

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/7] maintenance: add start/stop subcommands
  2020-09-08  6:29   ` SZEDER Gábor
  2020-09-08 12:43     ` Derrick Stolee
@ 2020-09-08 19:31     ` Junio C Hamano
  1 sibling, 0 replies; 62+ messages in thread
From: Junio C Hamano @ 2020-09-08 19:31 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Derrick Stolee via GitGitGadget, git, jrnieder, jonathantanmy,
	sluongng, congdanhqx, Derrick Stolee, Derrick Stolee,
	John Paul Adrian Glaubitz, Todd Zullinger

SZEDER Gábor <szeder.dev@gmail.com> writes:

>> +int cmd__crontab(int argc, const char **argv)
>> +{
>> +	char a;
>
> So 'a' is a char...
>
>> +	FILE *from, *to;
>> +
>> +	if (argc == 3 && !strcmp(argv[2], "-l")) {
>> +		from = fopen(argv[1], "r");
>> +		if (!from)
>> +			return 0;
>> +		to = stdout;
>> +	} else if (argc == 2) {
>> +		from = stdin;
>> +		to = fopen(argv[1], "w");
>> +	} else
>> +		return error("unknown arguments");
>> +
>> +	while ((a = fgetc(from)) != EOF)
>
> fgetc() returns an int, which is assigned to a char, which is then
> compared to whatever EOF might be on the platform.  Apparently this
> casting and comparison doesn't work as expected on s390x (I haven't
> even tried to think it through...), and instead of detecting EOF and
> exiting we end up in an endless loop writing 0xff bytes to 'cron.txt',
> while 'git maintenance start' in vain waits for 'test-crontab' to
> exit.

Ah, is this fun with unsigned char never comparing equal to -1?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/7] maintenance: add --schedule option and config
  2020-09-08 13:07   ` Đoàn Trần Công Danh
@ 2020-09-09 12:14     ` Derrick Stolee
  0 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee @ 2020-09-09 12:14 UTC (permalink / raw)
  To: Đoàn Trần Công Danh,
	Derrick Stolee via GitGitGadget
  Cc: git, jrnieder, jonathantanmy, sluongng, Derrick Stolee,
	Derrick Stolee

On 9/8/2020 9:07 AM, Đoàn Trần Công Danh wrote:
> On 2020-09-04 15:42:01+0000, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> A user may want to run certain maintenance tasks based on frequency, not
>> conditions given in the repository. For example, the user may want to
> 
> Hm, sorry but I couldn't decipher "not conditions" here. :|

Awkward, yes. I intended to contrast frequency-based maintenance with
threshold-based maintenance (git gc --auto).

>> perform a 'prefetch' task every hour, or 'gc' task every day. To assist,
> 
> I think it's better to say: "To assist those users", at least it's
> easier to read for non-native English like me.

Thanks.

>> update the 'git maintenance run' command to include a
>> '--schedule=<frequency>' option. The allowed frequencies are 'hourly',
> 
> So, we have "--schedule=" here, ...
> 
>> 'daily', and 'weekly'. These values are also allowed in a new config
>> value 'maintenance.<task>.schedule'.
>>
>> The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'
> 
> and here, ...
> 
>> config value for each enabled task to see if the configured frequency is
>> at least as frequent as the frequency from the '--schedule' argument. We
>> use the following order, for full clarity:
>>
>> 	'hourly' > 'daily' > 'weekly'
>>
>> Use new 'enum schedule_priority' to track these values numerically.
>>
>> The following cron table would run the scheduled tasks with the correct
>> frequencies:
>>
>>   0 1-23 * * *    git -C <repo> maintenance run --scheduled=hourly
>>   0 0    * * 1-6  git -C <repo> maintenance run --scheduled=daily
>>   0 0    * * 0    git -C <repo> maintenance run --scheduled=weekly
> 
> but it's spelt with "--scheduled=", here and below, mispell, I guess.> 
> Reading the patch, it looks like "--scheduled=" is mispelt.

Yes, a previous version used "scheduled" and I didn't fix it here.

>> @@ -1250,8 +1289,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
>>  			continue;
>>  
>>  		if (opts->auto_flag &&
>> -		    (!tasks[i].auto_condition ||
>> -		     !tasks[i].auto_condition()))
>> +		    (!tasks[i].auto_condition || !tasks[i].auto_condition()))
>> +			continue;
> 
> This line only add unnecessary noise to this patch.

Thanks,
-Stolee
 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2 0/7] Maintenance III: Background maintenance
  2020-09-04 15:41 [PATCH 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                   ` (6 preceding siblings ...)
  2020-09-04 15:42 ` [PATCH 7/7] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
@ 2020-09-11 17:49 ` Derrick Stolee via GitGitGadget
  2020-09-11 17:49   ` [PATCH v2 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
                     ` (7 more replies)
  7 siblings, 8 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-11 17:49 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee

This is based on ds/maintenance-part-2 and replaces the RFC from [1].

[1] 
https://lore.kernel.org/git/pull.680.v3.git.1598629517.gitgitgadget@gmail.com/

This series introduces background maintenance to Git, through an integration
with cron and crontab.

Some preliminary work is done to allow a new --schedule option that tells
the command which tasks to run based on a maintenance.<task>.schedule config
option. The timing is not enforced by Git, but instead is expected to be
provided as a hint from a cron schedule. The options are "hourly", "daily",
and "weekly".

A new for-each-repo builtin runs Git commands on every repo in a given list.
Currently, the list is stored as a config setting, allowing a new 
maintenance.repos config list to store the repositories registered for
background maintenance. Others may want to add a --file=<file> option for
their own workflows, but I focused on making this as simple as possible for
now.

The updates to the git maintenance builtin include new register/unregister 
subcommands and start/stop subcommands. The register subcommand initializes
the config while the start subcommand does everything register does plus 
update the cron table. The unregister and stop commands reverse this
process.

A troubleshooting guide is added to Documentation/git-maintenance.txt to
advise expert users who choose to create custom cron schedules.

The very last patch is entirely optional. It sets a recommended schedule
based on my own experience with very large repositories. I'm open to other
suggestions, but these are ones that I think work well and don't cause a
"rewrite the world" scenario like running nightly 'gc' would do.

I've been testing this scenario on my macOS laptop and Linux desktop. I have
modified my cron task to provide logging via trace2 so I can see what's
happening. A future direction here would be to add some maintenance logs to
the repository so we can track what is happening and diagnose whether the
maintenance strategy is working on real repos.

Note: git maintenance (start|stop) only works on machines with cron by
design. The proper thing to do on Windows will come later. Perhaps this
command should be marked as unavailable on Windows somehow, or at least a
better error than "cron may not be available on your system". I did find
that that message is helpful sometimes: macOS worker agents for CI builds
typically do not have cron available.

Updates in v2:

 * Fixed the char/int issue in test-tool crontab, and a typo.
 * Updated commit message and patch noise in PATCH 2
 * This should fix the test failures, allowing this to be picked up in
   'seen'.

Derrick Stolee (7):
  maintenance: optionally skip --auto process
  maintenance: add --schedule option and config
  for-each-repo: run subcommands on configured repos
  maintenance: add [un]register subcommands
  maintenance: add start/stop subcommands
  maintenance: recommended schedule in register/start
  maintenance: add troubleshooting guide to docs

 .gitignore                           |   1 +
 Documentation/config/maintenance.txt |  10 +
 Documentation/git-for-each-repo.txt  |  59 ++++++
 Documentation/git-maintenance.txt    |  88 +++++++-
 Makefile                             |   2 +
 builtin.h                            |   1 +
 builtin/for-each-repo.c              |  58 ++++++
 builtin/gc.c                         | 289 ++++++++++++++++++++++++++-
 command-list.txt                     |   1 +
 git.c                                |   1 +
 run-command.c                        |   6 +
 t/helper/test-crontab.c              |  35 ++++
 t/helper/test-tool.c                 |   1 +
 t/helper/test-tool.h                 |   1 +
 t/t0068-for-each-repo.sh             |  30 +++
 t/t7900-maintenance.sh               | 114 ++++++++++-
 t/test-lib.sh                        |   6 +
 17 files changed, 697 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100644 t/helper/test-crontab.c
 create mode 100755 t/t0068-for-each-repo.sh


base-commit: 6f11fba53777584b94dd9ed32976c2079d645fa2
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-724%2Fderrickstolee%2Fmaintenance%2Fscheduled-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-724/derrickstolee/maintenance/scheduled-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/724

Range-diff vs v1:

 1:  bd95729009 = 1:  b21cd68c90 maintenance: optionally skip --auto process
 2:  1783e80b8d ! 2:  e2d14d66d4 maintenance: add --schedule option and config
     @@ Metadata
       ## Commit message ##
          maintenance: add --schedule option and config
      
     -    A user may want to run certain maintenance tasks based on frequency, not
     -    conditions given in the repository. For example, the user may want to
     -    perform a 'prefetch' task every hour, or 'gc' task every day. To assist,
     -    update the 'git maintenance run' command to include a
     -    '--schedule=<frequency>' option. The allowed frequencies are 'hourly',
     -    'daily', and 'weekly'. These values are also allowed in a new config
     -    value 'maintenance.<task>.schedule'.
     +    Maintenance currently triggers when certain data-size thresholds are
     +    met, such as number of pack-files or loose objects. Users may want to
     +    run certain maintenance tasks based on frequency instead. For example,
     +    a user may want to perform a 'prefetch' task every hour, or 'gc' task
     +    every day. To help these users, update the 'git maintenance run' command
     +    to include a '--schedule=<frequency>' option. The allowed frequencies
     +    are 'hourly', 'daily', and 'weekly'. These values are also allowed in a
     +    new config value 'maintenance.<task>.schedule'.
      
          The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'
          config value for each enabled task to see if the configured frequency is
     @@ Commit message
          The following cron table would run the scheduled tasks with the correct
          frequencies:
      
     -      0 1-23 * * *    git -C <repo> maintenance run --scheduled=hourly
     -      0 0    * * 1-6  git -C <repo> maintenance run --scheduled=daily
     -      0 0    * * 0    git -C <repo> maintenance run --scheduled=weekly
     +      0 1-23 * * *    git -C <repo> maintenance run --schedule=hourly
     +      0 0    * * 1-6  git -C <repo> maintenance run --schedule=daily
     +      0 0    * * 0    git -C <repo> maintenance run --schedule=weekly
      
     -    This cron schedule will run --scheduled=hourly every hour except at
     -    midnight. This avoids a concurrent run with the --scheduled=daily that
     +    This cron schedule will run --schedule=hourly every hour except at
     +    midnight. This avoids a concurrent run with the --schedule=daily that
          runs at midnight every day except the first day of the week. This avoids
     -    a concurrent run with the --scheduled=weekly that runs at midnight on
     -    the first day of the week. Since --scheduled=daily also runs the
     -    'hourly' tasks and --scheduled=weekly runs the 'hourly' and 'daily'
     +    a concurrent run with the --schedule=weekly that runs at midnight on
     +    the first day of the week. Since --schedule=daily also runs the
     +    'hourly' tasks and --schedule=weekly runs the 'hourly' and 'daily'
          tasks, we will still see all tasks run with the proper frequencies.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
     @@ builtin/gc.c: struct maintenance_task {
       	int selected_order;
       };
      @@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
     + 		     !tasks[i].auto_condition()))
       			continue;
       
     - 		if (opts->auto_flag &&
     --		    (!tasks[i].auto_condition ||
     --		     !tasks[i].auto_condition()))
     -+		    (!tasks[i].auto_condition || !tasks[i].auto_condition()))
     ++		if (opts->schedule && tasks[i].schedule < opts->schedule)
      +			continue;
      +
     -+		if (opts->schedule && tasks[i].schedule < opts->schedule)
     - 			continue;
     - 
       		trace2_region_enter("maintenance", tasks[i].name, r);
     + 		if (tasks[i].fn(opts)) {
     + 			error(_("task '%s' failed"), tasks[i].name);
      @@ builtin/gc.c: static void initialize_task_config(void)
       
       	for (i = 0; i < TASK__COUNT; i++) {
 3:  6082d939eb = 3:  41a346dfbb for-each-repo: run subcommands on configured repos
 4:  b7775b3aaf = 4:  1f49cda18e maintenance: add [un]register subcommands
 5:  e02641881d ! 5:  e9b2a39c1d maintenance: add start/stop subcommands
     @@ t/helper/test-crontab.c (new)
      +/*
      + * Usage: test-tool cron <file> [-l]
      + *
     -+ * If -l is specified, then write the contents of <file> to stdou.
     ++ * If -l is specified, then write the contents of <file> to stdout.
      + * Otherwise, write from stdin into <file>.
      + */
      +int cmd__crontab(int argc, const char **argv)
      +{
     -+	char a;
     ++	int a;
      +	FILE *from, *to;
      +
      +	if (argc == 3 && !strcmp(argv[2], "-l")) {
 6:  8a285e00e6 = 6:  f609c1bde2 maintenance: recommended schedule in register/start
 7:  c00de53906 = 7:  2344eff4ba maintenance: add troubleshooting guide to docs

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2 1/7] maintenance: optionally skip --auto process
  2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
@ 2020-09-11 17:49   ` Derrick Stolee via GitGitGadget
  2020-09-11 17:49   ` [PATCH v2 2/7] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-11 17:49 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Some commands run 'git maintenance run --auto --[no-]quiet' after doing
their normal work, as a way to keep repositories clean as they are used.
Currently, users who do not want this maintenance to occur would set the
'gc.auto' config option to 0 to avoid the 'gc' task from running.
However, this does not stop the extra process invocation. On Windows,
this extra process invocation can be more expensive than necessary.

Allow users to drop this extra process by setting 'maintenance.auto' to
'false'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/maintenance.txt |  5 +++++
 run-command.c                        |  6 ++++++
 t/t7900-maintenance.sh               | 13 +++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index a0706d8f09..06db758172 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -1,3 +1,8 @@
+maintenance.auto::
+	This boolean config option controls whether some commands run
+	`git maintenance run --auto` after doing their normal work. Defaults
+	to true.
+
 maintenance.<task>.enabled::
 	This boolean config option controls whether the maintenance task
 	with name `<task>` is run when no `--task` option is specified to
diff --git a/run-command.c b/run-command.c
index 2ee59acdc8..ea4d0fb4b1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -7,6 +7,7 @@
 #include "strbuf.h"
 #include "string-list.h"
 #include "quote.h"
+#include "config.h"
 
 void child_process_init(struct child_process *child)
 {
@@ -1868,8 +1869,13 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 
 int run_auto_maintenance(int quiet)
 {
+	int enabled;
 	struct child_process maint = CHILD_PROCESS_INIT;
 
+	if (!git_config_get_bool("maintenance.auto", &enabled) &&
+	    !enabled)
+		return 0;
+
 	maint.git_cmd = 1;
 	strvec_pushl(&maint.args, "maintenance", "run", "--auto", NULL);
 	strvec_push(&maint.args, quiet ? "--quiet" : "--no-quiet");
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 6f878b0141..e0ba19e1ff 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -26,6 +26,19 @@ test_expect_success 'run [--auto|--quiet]' '
 	test_subcommand git gc --no-quiet <run-no-quiet.txt
 '
 
+test_expect_success 'maintenance.auto config option' '
+	GIT_TRACE2_EVENT="$(pwd)/default" git commit --quiet --allow-empty -m 1 &&
+	test_subcommand git maintenance run --auto --quiet <default &&
+	GIT_TRACE2_EVENT="$(pwd)/true" \
+		git -c maintenance.auto=true \
+		commit --quiet --allow-empty -m 2 &&
+	test_subcommand git maintenance run --auto --quiet  <true &&
+	GIT_TRACE2_EVENT="$(pwd)/false" \
+		git -c maintenance.auto=false \
+		commit --quiet --allow-empty -m 3 &&
+	test_subcommand ! git maintenance run --auto --quiet  <false
+'
+
 test_expect_success 'maintenance.<task>.enabled' '
 	git config maintenance.gc.enabled false &&
 	git config maintenance.commit-graph.enabled true &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v2 2/7] maintenance: add --schedule option and config
  2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  2020-09-11 17:49   ` [PATCH v2 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
@ 2020-09-11 17:49   ` Derrick Stolee via GitGitGadget
  2020-09-11 17:49   ` [PATCH v2 3/7] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-11 17:49 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Maintenance currently triggers when certain data-size thresholds are
met, such as number of pack-files or loose objects. Users may want to
run certain maintenance tasks based on frequency instead. For example,
a user may want to perform a 'prefetch' task every hour, or 'gc' task
every day. To help these users, update the 'git maintenance run' command
to include a '--schedule=<frequency>' option. The allowed frequencies
are 'hourly', 'daily', and 'weekly'. These values are also allowed in a
new config value 'maintenance.<task>.schedule'.

The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'
config value for each enabled task to see if the configured frequency is
at least as frequent as the frequency from the '--schedule' argument. We
use the following order, for full clarity:

	'hourly' > 'daily' > 'weekly'

Use new 'enum schedule_priority' to track these values numerically.

The following cron table would run the scheduled tasks with the correct
frequencies:

  0 1-23 * * *    git -C <repo> maintenance run --schedule=hourly
  0 0    * * 1-6  git -C <repo> maintenance run --schedule=daily
  0 0    * * 0    git -C <repo> maintenance run --schedule=weekly

This cron schedule will run --schedule=hourly every hour except at
midnight. This avoids a concurrent run with the --schedule=daily that
runs at midnight every day except the first day of the week. This avoids
a concurrent run with the --schedule=weekly that runs at midnight on
the first day of the week. Since --schedule=daily also runs the
'hourly' tasks and --schedule=weekly runs the 'hourly' and 'daily'
tasks, we will still see all tasks run with the proper frequencies.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/maintenance.txt |  5 +++
 Documentation/git-maintenance.txt    | 13 +++++-
 builtin/gc.c                         | 64 ++++++++++++++++++++++++++--
 t/t7900-maintenance.sh               | 40 +++++++++++++++++
 4 files changed, 118 insertions(+), 4 deletions(-)

diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index 06db758172..70585564fa 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -10,6 +10,11 @@ maintenance.<task>.enabled::
 	`--task` option exists. By default, only `maintenance.gc.enabled`
 	is true.
 
+maintenance.<task>.schedule::
+	This config option controls whether or not the given `<task>` runs
+	during a `git maintenance run --schedule=<frequency>` command. The
+	value must be one of "hourly", "daily", or "weekly".
+
 maintenance.commit-graph.auto::
 	This integer config option controls how often the `commit-graph` task
 	should be run as part of `git maintenance run --auto`. If zero, then
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index b44efb05a3..3af5907b01 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -107,7 +107,18 @@ OPTIONS
 	only if certain thresholds are met. For example, the `gc` task
 	runs when the number of loose objects exceeds the number stored
 	in the `gc.auto` config setting, or when the number of pack-files
-	exceeds the `gc.autoPackLimit` config setting.
+	exceeds the `gc.autoPackLimit` config setting. Not compatible with
+	the `--schedule` option.
+
+--schedule::
+	When combined with the `run` subcommand, run maintenance tasks
+	only if certain time conditions are met, as specified by the
+	`maintenance.<task>.schedule` config value for each `<task>`.
+	This config value specifies a number of seconds since the last
+	time that task ran, according to the `maintenance.<task>.lastRun`
+	config value. The tasks that are tested are those provided by
+	the `--task=<task>` option(s) or those with
+	`maintenance.<task>.enabled` set to true.
 
 --quiet::
 	Do not report progress or other information over `stderr`.
diff --git a/builtin/gc.c b/builtin/gc.c
index f8459df04c..e28561b6c5 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -704,14 +704,51 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
-static const char * const builtin_maintenance_run_usage[] = {
-	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>]"),
+static const char *const builtin_maintenance_run_usage[] = {
+	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>] [--schedule]"),
 	NULL
 };
 
+enum schedule_priority {
+	SCHEDULE_NONE = 0,
+	SCHEDULE_WEEKLY = 1,
+	SCHEDULE_DAILY = 2,
+	SCHEDULE_HOURLY = 3,
+};
+
+static enum schedule_priority parse_schedule(const char *value)
+{
+	if (!value)
+		return SCHEDULE_NONE;
+	if (!strcasecmp(value, "hourly"))
+		return SCHEDULE_HOURLY;
+	if (!strcasecmp(value, "daily"))
+		return SCHEDULE_DAILY;
+	if (!strcasecmp(value, "weekly"))
+		return SCHEDULE_WEEKLY;
+	return SCHEDULE_NONE;
+}
+
+static int maintenance_opt_schedule(const struct option *opt, const char *arg,
+				    int unset)
+{
+	enum schedule_priority *priority = opt->value;
+
+	if (unset)
+		die(_("--no-schedule is not allowed"));
+
+	*priority = parse_schedule(arg);
+
+	if (!*priority)
+		die(_("unrecognized --schedule argument '%s'"), arg);
+
+	return 0;
+}
+
 struct maintenance_run_opts {
 	int auto_flag;
 	int quiet;
+	enum schedule_priority schedule;
 };
 
 /* Remember to update object flag allocation in object.h */
@@ -1159,6 +1196,8 @@ struct maintenance_task {
 	maintenance_auto_fn *auto_condition;
 	unsigned enabled:1;
 
+	enum schedule_priority schedule;
+
 	/* -1 if not selected. */
 	int selected_order;
 };
@@ -1254,6 +1293,9 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 		     !tasks[i].auto_condition()))
 			continue;
 
+		if (opts->schedule && tasks[i].schedule < opts->schedule)
+			continue;
+
 		trace2_region_enter("maintenance", tasks[i].name, r);
 		if (tasks[i].fn(opts)) {
 			error(_("task '%s' failed"), tasks[i].name);
@@ -1274,13 +1316,23 @@ static void initialize_task_config(void)
 
 	for (i = 0; i < TASK__COUNT; i++) {
 		int config_value;
+		char *config_str;
 
-		strbuf_setlen(&config_name, 0);
+		strbuf_reset(&config_name);
 		strbuf_addf(&config_name, "maintenance.%s.enabled",
 			    tasks[i].name);
 
 		if (!git_config_get_bool(config_name.buf, &config_value))
 			tasks[i].enabled = config_value;
+
+		strbuf_reset(&config_name);
+		strbuf_addf(&config_name, "maintenance.%s.schedule",
+			    tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &config_str)) {
+			tasks[i].schedule = parse_schedule(config_str);
+			free(config_str);
+		}
 	}
 
 	strbuf_release(&config_name);
@@ -1324,6 +1376,9 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
+		OPT_CALLBACK(0, "schedule", &opts.schedule, N_("frequency"),
+			     N_("run tasks based on frequency"),
+			     maintenance_opt_schedule),
 		OPT_BOOL(0, "quiet", &opts.quiet,
 			 N_("do not report progress or other information over stderr")),
 		OPT_CALLBACK_F(0, "task", NULL, N_("task"),
@@ -1344,6 +1399,9 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 			     builtin_maintenance_run_usage,
 			     PARSE_OPT_STOP_AT_NON_OPTION);
 
+	if (opts.auto_flag && opts.schedule)
+		die(_("use at most one of --auto and --schedule=<frequency>"));
+
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index e0ba19e1ff..328bbaa830 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -264,4 +264,44 @@ test_expect_success 'maintenance.incremental-repack.auto' '
 	done
 '
 
+test_expect_success '--auto and --schedule incompatible' '
+	test_must_fail git maintenance run --auto --schedule=daily 2>err &&
+	test_i18ngrep "at most one" err
+'
+
+test_expect_success 'invalid --schedule value' '
+	test_must_fail git maintenance run --schedule=annually 2>err &&
+	test_i18ngrep "unrecognized --schedule" err
+'
+
+test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
+	git config maintenance.loose-objects.enabled true &&
+	git config maintenance.loose-objects.schedule hourly &&
+	git config maintenance.commit-graph.enabled true &&
+	git config maintenance.commit-graph.schedule daily &&
+	git config maintenance.incremental-repack.enabled true &&
+	git config maintenance.incremental-repack.schedule weekly &&
+
+	GIT_TRACE2_EVENT="$(pwd)/hourly.txt" \
+		git maintenance run --schedule=hourly 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <hourly.txt &&
+	test_subcommand ! git commit-graph write --split --reachable \
+		--no-progress <hourly.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress <hourly.txt &&
+
+	GIT_TRACE2_EVENT="$(pwd)/daily.txt" \
+		git maintenance run --schedule=daily 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <daily.txt &&
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <daily.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress <daily.txt &&
+
+	GIT_TRACE2_EVENT="$(pwd)/weekly.txt" \
+		git maintenance run --schedule=weekly 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <weekly.txt &&
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <weekly.txt &&
+	test_subcommand git multi-pack-index write --no-progress <weekly.txt
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v2 3/7] for-each-repo: run subcommands on configured repos
  2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  2020-09-11 17:49   ` [PATCH v2 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
  2020-09-11 17:49   ` [PATCH v2 2/7] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
@ 2020-09-11 17:49   ` Derrick Stolee via GitGitGadget
  2020-09-11 17:49   ` [PATCH v2 4/7] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-11 17:49 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

It can be helpful to store a list of repositories in global or system
config and then iterate Git commands on that list. Create a new builtin
that makes this process simple for experts. We will use this builtin to
run scheduled maintenance on all configured repositories in a future
change.

The test is very simple, but does highlight that the "--" argument is
optional.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 .gitignore                          |  1 +
 Documentation/git-for-each-repo.txt | 59 +++++++++++++++++++++++++++++
 Makefile                            |  1 +
 builtin.h                           |  1 +
 builtin/for-each-repo.c             | 58 ++++++++++++++++++++++++++++
 command-list.txt                    |  1 +
 git.c                               |  1 +
 t/t0068-for-each-repo.sh            | 30 +++++++++++++++
 8 files changed, 152 insertions(+)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100755 t/t0068-for-each-repo.sh

diff --git a/.gitignore b/.gitignore
index a5808fa30d..5eb2a2be71 100644
--- a/.gitignore
+++ b/.gitignore
@@ -67,6 +67,7 @@
 /git-filter-branch
 /git-fmt-merge-msg
 /git-for-each-ref
+/git-for-each-repo
 /git-format-patch
 /git-fsck
 /git-fsck-objects
diff --git a/Documentation/git-for-each-repo.txt b/Documentation/git-for-each-repo.txt
new file mode 100644
index 0000000000..94bd19da26
--- /dev/null
+++ b/Documentation/git-for-each-repo.txt
@@ -0,0 +1,59 @@
+git-for-each-repo(1)
+====================
+
+NAME
+----
+git-for-each-repo - Run a Git command on a list of repositories
+
+
+SYNOPSIS
+--------
+[verse]
+'git for-each-repo' --config=<config> [--] <arguments>
+
+
+DESCRIPTION
+-----------
+Run a Git command on a list of repositories. The arguments after the
+known options or `--` indicator are used as the arguments for the Git
+subprocess.
+
+THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
+
+For example, we could run maintenance on each of a list of repositories
+stored in a `maintenance.repo` config variable using
+
+-------------
+git for-each-repo --config=maintenance.repo maintenance run
+-------------
+
+This will run `git -C <repo> maintenance run` for each value `<repo>`
+in the multi-valued config variable `maintenance.repo`.
+
+
+OPTIONS
+-------
+--config=<config>::
+	Use the given config variable as a multi-valued list storing
+	absolute path names. Iterate on that list of paths to run
+	the given arguments.
++
+These config values are loaded from system, global, and local Git config,
+as available. If `git for-each-repo` is run in a directory that is not a
+Git repository, then only the system and global config is used.
+
+
+SUBPROCESS BEHAVIOR
+-------------------
+
+If any `git -C <repo> <arguments>` subprocess returns a non-zero exit code,
+then the `git for-each-repo` process returns that exit code without running
+more subprocesses.
+
+Each `git -C <repo> <arguments>` subprocess inherits the standard file
+descriptors `stdin`, `stdout`, and `stderr`.
+
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 65f8cfb236..7c588ff036 100644
--- a/Makefile
+++ b/Makefile
@@ -1071,6 +1071,7 @@ BUILTIN_OBJS += builtin/fetch-pack.o
 BUILTIN_OBJS += builtin/fetch.o
 BUILTIN_OBJS += builtin/fmt-merge-msg.o
 BUILTIN_OBJS += builtin/for-each-ref.o
+BUILTIN_OBJS += builtin/for-each-repo.o
 BUILTIN_OBJS += builtin/fsck.o
 BUILTIN_OBJS += builtin/gc.o
 BUILTIN_OBJS += builtin/get-tar-commit-id.o
diff --git a/builtin.h b/builtin.h
index 17c1c0ce49..ff7c6e5aa9 100644
--- a/builtin.h
+++ b/builtin.h
@@ -150,6 +150,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix);
 int cmd_fetch_pack(int argc, const char **argv, const char *prefix);
 int cmd_fmt_merge_msg(int argc, const char **argv, const char *prefix);
 int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
+int cmd_for_each_repo(int argc, const char **argv, const char *prefix);
 int cmd_format_patch(int argc, const char **argv, const char *prefix);
 int cmd_fsck(int argc, const char **argv, const char *prefix);
 int cmd_gc(int argc, const char **argv, const char *prefix);
diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
new file mode 100644
index 0000000000..5bba623ff1
--- /dev/null
+++ b/builtin/for-each-repo.c
@@ -0,0 +1,58 @@
+#include "cache.h"
+#include "config.h"
+#include "builtin.h"
+#include "parse-options.h"
+#include "run-command.h"
+#include "string-list.h"
+
+static const char * const for_each_repo_usage[] = {
+	N_("git for-each-repo --config=<config> <command-args>"),
+	NULL
+};
+
+static int run_command_on_repo(const char *path,
+			       void *cbdata)
+{
+	int i;
+	struct child_process child = CHILD_PROCESS_INIT;
+	struct strvec *args = (struct strvec *)cbdata;
+
+	child.git_cmd = 1;
+	strvec_pushl(&child.args, "-C", path, NULL);
+
+	for (i = 0; i < args->nr; i++)
+		strvec_push(&child.args, args->v[i]);
+
+	return run_command(&child);
+}
+
+int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
+{
+	static const char *config_key = NULL;
+	int i, result = 0;
+	const struct string_list *values;
+	struct strvec args = STRVEC_INIT;
+
+	const struct option options[] = {
+		OPT_STRING(0, "config", &config_key, N_("config"),
+			   N_("config key storing a list of repository paths")),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, prefix, options, for_each_repo_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+
+	if (!config_key)
+		die(_("missing --config=<config>"));
+
+	for (i = 0; i < argc; i++)
+		strvec_push(&args, argv[i]);
+
+	values = repo_config_get_value_multi(the_repository,
+					     config_key);
+
+	for (i = 0; !result && i < values->nr; i++)
+		result = run_command_on_repo(values->items[i].string, &args);
+
+	return result;
+}
diff --git a/command-list.txt b/command-list.txt
index 0e3204e7d1..581499be82 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -94,6 +94,7 @@ git-fetch-pack                          synchingrepositories
 git-filter-branch                       ancillarymanipulators
 git-fmt-merge-msg                       purehelpers
 git-for-each-ref                        plumbinginterrogators
+git-for-each-repo                       plumbinginterrogators
 git-format-patch                        mainporcelain
 git-fsck                                ancillaryinterrogators          complete
 git-gc                                  mainporcelain
diff --git a/git.c b/git.c
index 24f250d29a..1cab64b5d1 100644
--- a/git.c
+++ b/git.c
@@ -511,6 +511,7 @@ static struct cmd_struct commands[] = {
 	{ "fetch-pack", cmd_fetch_pack, RUN_SETUP | NO_PARSEOPT },
 	{ "fmt-merge-msg", cmd_fmt_merge_msg, RUN_SETUP },
 	{ "for-each-ref", cmd_for_each_ref, RUN_SETUP },
+	{ "for-each-repo", cmd_for_each_repo, RUN_SETUP_GENTLY },
 	{ "format-patch", cmd_format_patch, RUN_SETUP },
 	{ "fsck", cmd_fsck, RUN_SETUP },
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
diff --git a/t/t0068-for-each-repo.sh b/t/t0068-for-each-repo.sh
new file mode 100755
index 0000000000..136b4ec839
--- /dev/null
+++ b/t/t0068-for-each-repo.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+
+test_description='git for-each-repo builtin'
+
+. ./test-lib.sh
+
+test_expect_success 'run based on configured value' '
+	git init one &&
+	git init two &&
+	git init three &&
+	git -C two commit --allow-empty -m "DID NOT RUN" &&
+	git config run.key "$TRASH_DIRECTORY/one" &&
+	git config --add run.key "$TRASH_DIRECTORY/three" &&
+	git for-each-repo --config=run.key commit --allow-empty -m "ran" &&
+	git -C one log -1 --pretty=format:%s >message &&
+	grep ran message &&
+	git -C two log -1 --pretty=format:%s >message &&
+	! grep ran message &&
+	git -C three log -1 --pretty=format:%s >message &&
+	grep ran message &&
+	git for-each-repo --config=run.key -- commit --allow-empty -m "ran again" &&
+	git -C one log -1 --pretty=format:%s >message &&
+	grep again message &&
+	git -C two log -1 --pretty=format:%s >message &&
+	! grep again message &&
+	git -C three log -1 --pretty=format:%s >message &&
+	grep again message
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v2 4/7] maintenance: add [un]register subcommands
  2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                     ` (2 preceding siblings ...)
  2020-09-11 17:49   ` [PATCH v2 3/7] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
@ 2020-09-11 17:49   ` Derrick Stolee via GitGitGadget
  2020-09-17 14:05     ` Đoàn Trần Công Danh
  2020-09-11 17:49   ` [PATCH v2 5/7] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
                     ` (3 subsequent siblings)
  7 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-11 17:49 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In preparation for launching background maintenance from the 'git
maintenance' builtin, create register/unregister subcommands. These
commands update the new 'maintenance.repos' config option in the global
config so the background maintenance job knows which repositories to
maintain.

These commands allow users to add a repository to the background
maintenance list without disrupting the actual maintenance mechanism.

For example, a user can run 'git maintenance register' when no
background maintenance is running and it will not start the background
maintenance. A later update to start running background maintenance will
then pick up this repository automatically.

The opposite example is that a user can run 'git maintenance unregister'
to remove the current repository from background maintenance without
halting maintenance for other repositories.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 14 ++++++++
 builtin/gc.c                      | 55 ++++++++++++++++++++++++++++++-
 t/t7900-maintenance.sh            | 17 +++++++++-
 3 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 3af5907b01..78d0d8df91 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -29,6 +29,15 @@ Git repository.
 SUBCOMMANDS
 -----------
 
+register::
+	Initialize Git config values so any scheduled maintenance will
+	start running on this repository. This adds the repository to the
+	`maintenance.repo` config variable in the current user's global
+	config and enables some recommended configuration values for
+	`maintenance.<task>.schedule`. The tasks that are enabled are safe
+	for running in the background without disrupting foreground
+	processes.
+
 run::
 	Run one or more maintenance tasks. If one or more `--task` options
 	are specified, then those tasks are run in that order. Otherwise,
@@ -36,6 +45,11 @@ run::
 	config options are true. By default, only `maintenance.gc.enabled`
 	is true.
 
+unregister::
+	Remove the current repository from background maintenance. This
+	only removes the repository from the configured list. It does not
+	stop the background maintenance processes from running.
+
 TASKS
 -----
 
diff --git a/builtin/gc.c b/builtin/gc.c
index e28561b6c5..0290b249c9 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1408,7 +1408,56 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	return maintenance_run_tasks(&opts);
 }
 
-static const char builtin_maintenance_usage[] = N_("git maintenance run [<options>]");
+static int maintenance_register(void)
+{
+	struct child_process config_set = CHILD_PROCESS_INIT;
+	struct child_process config_get = CHILD_PROCESS_INIT;
+
+	/* There is no current repository, so skip registering it */
+	if (!the_repository || !the_repository->gitdir)
+		return 0;
+
+	config_get.git_cmd = 1;
+	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+			 NULL);
+	config_get.out = -1;
+
+	if (start_command(&config_get))
+		return error(_("failed to run 'git config'"));
+
+	/* We already have this value in our config! */
+	if (!finish_command(&config_get))
+		return 0;
+
+	config_set.git_cmd = 1;
+	strvec_pushl(&config_set.args, "config", "--add", "--global", "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+		     NULL);
+
+	return run_command(&config_set);
+}
+
+static int maintenance_unregister(void)
+{
+	struct child_process config_unset = CHILD_PROCESS_INIT;
+
+	if (!the_repository || !the_repository->gitdir)
+		return error(_("no current repository to unregister"));
+
+	config_unset.git_cmd = 1;
+	strvec_pushl(&config_unset.args, "config", "--global", "--unset",
+		     "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+		     NULL);
+
+	return run_command(&config_unset);
+}
+
+static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
 
 int cmd_maintenance(int argc, const char **argv, const char *prefix)
 {
@@ -1417,6 +1466,10 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "run"))
 		return maintenance_run(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "register"))
+		return maintenance_register();
+	if (!strcmp(argv[1], "unregister"))
+		return maintenance_unregister();
 
 	die(_("invalid subcommand: %s"), argv[1]);
 }
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 328bbaa830..272d1605d2 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -9,7 +9,7 @@ GIT_TEST_MULTI_PACK_INDEX=0
 
 test_expect_success 'help text' '
 	test_expect_code 129 git maintenance -h 2>err &&
-	test_i18ngrep "usage: git maintenance run" err &&
+	test_i18ngrep "usage: git maintenance <subcommand>" err &&
 	test_expect_code 128 git maintenance barf 2>err &&
 	test_i18ngrep "invalid subcommand: barf" err
 '
@@ -304,4 +304,19 @@ test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
 	test_subcommand git multi-pack-index write --no-progress <weekly.txt
 '
 
+test_expect_success 'register and unregister' '
+	test_when_finished git config --global --unset-all maintenance.repo &&
+	git config --global --add maintenance.repo /existing1 &&
+	git config --global --add maintenance.repo /existing2 &&
+	git config --global --get-all maintenance.repo >before &&
+	git maintenance register &&
+	git config --global --get-all maintenance.repo >actual &&
+	cp before after &&
+	pwd >>after &&
+	test_cmp after actual &&
+	git maintenance unregister &&
+	git config --global --get-all maintenance.repo >actual &&
+	test_cmp before actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v2 5/7] maintenance: add start/stop subcommands
  2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                     ` (3 preceding siblings ...)
  2020-09-11 17:49   ` [PATCH v2 4/7] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
@ 2020-09-11 17:49   ` Derrick Stolee via GitGitGadget
  2020-09-11 17:49   ` [PATCH v2 6/7] maintenance: recommended schedule in register/start Derrick Stolee via GitGitGadget
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-11 17:49 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add new subcommands to 'git maintenance' that start or stop background
maintenance using 'cron', when available. This integration is as simple
as I could make it, barring some implementation complications.

The schedule is laid out as follows:

  0 1-23 * * *   $cmd maintenance run --schedule=hourly
  0 0    * * 1-6 $cmd maintenance run --schedule=daily
  0 0    * * 0   $cmd maintenance run --schedule=weekly

where $cmd is a properly-qualified 'git for-each-repo' execution:

$cmd=$path/git --exec-path=$path for-each-repo --config=maintenance.repo

where $path points to the location of the Git executable running 'git
maintenance start'. This is critical for systems with multiple versions
of Git. Specifically, macOS has a system version at '/usr/bin/git' while
the version that users can install resides at '/usr/local/bin/git'
(symlinked to '/usr/local/libexec/git-core/git'). This will also use
your locally-built version if you build and run this in your development
environment without installing first.

This conditional schedule avoids having cron launch multiple 'git
for-each-repo' commands in parallel. Such parallel commands would likely
lead to the 'hourly' and 'daily' tasks competing over the object
database lock. This could lead to to some tasks never being run! Since
the --schedule=<frequency> argument will run all tasks with _at least_
the given frequency, the daily runs will also run the hourly tasks.
Similarly, the weekly runs will also run the daily and hourly tasks.

The GIT_TEST_CRONTAB environment variable is not intended for users to
edit, but instead as a way to mock the 'crontab [-l]' command. This
variable is set in test-lib.sh to avoid a future test from accidentally
running anything with the cron integration from modifying the user's
schedule. We use GIT_TEST_CRONTAB='test-tool crontab <file>' in our
tests to check how the schedule is modified in 'git maintenance
(start|stop)' commands.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  11 +++
 Makefile                          |   1 +
 builtin/gc.c                      | 124 ++++++++++++++++++++++++++++++
 t/helper/test-crontab.c           |  35 +++++++++
 t/helper/test-tool.c              |   1 +
 t/helper/test-tool.h              |   1 +
 t/t7900-maintenance.sh            |  28 +++++++
 t/test-lib.sh                     |   6 ++
 8 files changed, 207 insertions(+)
 create mode 100644 t/helper/test-crontab.c

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 78d0d8df91..7f8c279fe8 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -45,6 +45,17 @@ run::
 	config options are true. By default, only `maintenance.gc.enabled`
 	is true.
 
+start::
+	Start running maintenance on the current repository. This performs
+	the same config updates as the `register` subcommand, then updates
+	the background scheduler to run `git maintenance run --scheduled`
+	on an hourly basis.
+
+stop::
+	Halt the background maintenance schedule. The current repository
+	is not removed from the list of maintained repositories, in case
+	the background maintenance is restarted later.
+
 unregister::
 	Remove the current repository from background maintenance. This
 	only removes the repository from the configured list. It does not
diff --git a/Makefile b/Makefile
index 7c588ff036..c39b39bd7d 100644
--- a/Makefile
+++ b/Makefile
@@ -690,6 +690,7 @@ TEST_BUILTINS_OBJS += test-advise.o
 TEST_BUILTINS_OBJS += test-bloom.o
 TEST_BUILTINS_OBJS += test-chmtime.o
 TEST_BUILTINS_OBJS += test-config.o
+TEST_BUILTINS_OBJS += test-crontab.o
 TEST_BUILTINS_OBJS += test-ctype.o
 TEST_BUILTINS_OBJS += test-date.o
 TEST_BUILTINS_OBJS += test-delta.o
diff --git a/builtin/gc.c b/builtin/gc.c
index 0290b249c9..4f2d17c0c0 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -32,6 +32,7 @@
 #include "remote.h"
 #include "midx.h"
 #include "object-store.h"
+#include "exec-cmd.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -1457,6 +1458,125 @@ static int maintenance_unregister(void)
 	return run_command(&config_unset);
 }
 
+#define BEGIN_LINE "# BEGIN GIT MAINTENANCE SCHEDULE"
+#define END_LINE "# END GIT MAINTENANCE SCHEDULE"
+
+static int update_background_schedule(int run_maintenance)
+{
+	int result = 0;
+	int in_old_region = 0;
+	struct child_process crontab_list = CHILD_PROCESS_INIT;
+	struct child_process crontab_edit = CHILD_PROCESS_INIT;
+	FILE *cron_list, *cron_in;
+	const char *crontab_name;
+	struct strbuf line = STRBUF_INIT;
+	struct lock_file lk;
+	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);
+
+	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
+		return error(_("another process is scheduling background maintenance"));
+
+	crontab_name = getenv("GIT_TEST_CRONTAB");
+	if (!crontab_name)
+		crontab_name = "crontab";
+
+	strvec_split(&crontab_list.args, crontab_name);
+	strvec_push(&crontab_list.args, "-l");
+	crontab_list.in = -1;
+	crontab_list.out = dup(lk.tempfile->fd);
+	crontab_list.git_cmd = 0;
+
+	if (start_command(&crontab_list)) {
+		result = error(_("failed to run 'crontab -l'; your system might not support 'cron'"));
+		goto cleanup;
+	}
+
+	/* Ignore exit code, as an empty crontab will return error. */
+	finish_command(&crontab_list);
+
+	/*
+	 * Read from the .lock file, filtering out the old
+	 * schedule while appending the new schedule.
+	 */
+	cron_list = fdopen(lk.tempfile->fd, "r");
+	rewind(cron_list);
+
+	strvec_split(&crontab_edit.args, crontab_name);
+	crontab_edit.in = -1;
+	crontab_edit.git_cmd = 0;
+
+	if (start_command(&crontab_edit)) {
+		result = error(_("failed to run 'crontab'; your system might not support 'cron'"));
+		goto cleanup;
+	}
+
+	cron_in = fdopen(crontab_edit.in, "w");
+	if (!cron_in) {
+		result = error(_("failed to open stdin of 'crontab'"));
+		goto done_editing;
+	}
+
+	while (!strbuf_getline_lf(&line, cron_list)) {
+		if (!in_old_region && !strcmp(line.buf, BEGIN_LINE))
+			in_old_region = 1;
+		if (in_old_region)
+			continue;
+		fprintf(cron_in, "%s\n", line.buf);
+		if (in_old_region && !strcmp(line.buf, END_LINE))
+			in_old_region = 0;
+	}
+
+	if (run_maintenance) {
+		struct strbuf line_format = STRBUF_INIT;
+		const char *exec_path = git_exec_path();
+
+		fprintf(cron_in, "%s\n", BEGIN_LINE);
+		fprintf(cron_in,
+			"# The following schedule was created by Git\n");
+		fprintf(cron_in, "# Any edits made in this region might be\n");
+		fprintf(cron_in,
+			"# replaced in the future by a Git command.\n\n");
+
+		strbuf_addf(&line_format,
+			    "%%s %%s * * %%s \"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%s\n",
+			    exec_path, exec_path);
+		fprintf(cron_in, line_format.buf, "0", "1-23", "*", "hourly");
+		fprintf(cron_in, line_format.buf, "0", "0", "1-6", "daily");
+		fprintf(cron_in, line_format.buf, "0", "0", "0", "weekly");
+		strbuf_release(&line_format);
+
+		fprintf(cron_in, "\n%s\n", END_LINE);
+	}
+
+	fflush(cron_in);
+	fclose(cron_in);
+	close(crontab_edit.in);
+
+done_editing:
+	if (finish_command(&crontab_edit)) {
+		result = error(_("'crontab' died"));
+		goto cleanup;
+	}
+	fclose(cron_list);
+
+cleanup:
+	rollback_lock_file(&lk);
+	return result;
+}
+
+static int maintenance_start(void)
+{
+	if (maintenance_register())
+		warning(_("failed to add repo to global config"));
+
+	return update_background_schedule(1);
+}
+
+static int maintenance_stop(void)
+{
+	return update_background_schedule(0);
+}
+
 static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
 
 int cmd_maintenance(int argc, const char **argv, const char *prefix)
@@ -1466,6 +1586,10 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "run"))
 		return maintenance_run(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "start"))
+		return maintenance_start();
+	if (!strcmp(argv[1], "stop"))
+		return maintenance_stop();
 	if (!strcmp(argv[1], "register"))
 		return maintenance_register();
 	if (!strcmp(argv[1], "unregister"))
diff --git a/t/helper/test-crontab.c b/t/helper/test-crontab.c
new file mode 100644
index 0000000000..e7c0137a47
--- /dev/null
+++ b/t/helper/test-crontab.c
@@ -0,0 +1,35 @@
+#include "test-tool.h"
+#include "cache.h"
+
+/*
+ * Usage: test-tool cron <file> [-l]
+ *
+ * If -l is specified, then write the contents of <file> to stdout.
+ * Otherwise, write from stdin into <file>.
+ */
+int cmd__crontab(int argc, const char **argv)
+{
+	int a;
+	FILE *from, *to;
+
+	if (argc == 3 && !strcmp(argv[2], "-l")) {
+		from = fopen(argv[1], "r");
+		if (!from)
+			return 0;
+		to = stdout;
+	} else if (argc == 2) {
+		from = stdin;
+		to = fopen(argv[1], "w");
+	} else
+		return error("unknown arguments");
+
+	while ((a = fgetc(from)) != EOF)
+		fputc(a, to);
+
+	if (argc == 3)
+		fclose(from);
+	else
+		fclose(to);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 590b2efca7..432b49d948 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -18,6 +18,7 @@ static struct test_cmd cmds[] = {
 	{ "bloom", cmd__bloom },
 	{ "chmtime", cmd__chmtime },
 	{ "config", cmd__config },
+	{ "crontab", cmd__crontab },
 	{ "ctype", cmd__ctype },
 	{ "date", cmd__date },
 	{ "delta", cmd__delta },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index ddc8e990e9..7c3281e071 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -8,6 +8,7 @@ int cmd__advise_if_enabled(int argc, const char **argv);
 int cmd__bloom(int argc, const char **argv);
 int cmd__chmtime(int argc, const char **argv);
 int cmd__config(int argc, const char **argv);
+int cmd__crontab(int argc, const char **argv);
 int cmd__ctype(int argc, const char **argv);
 int cmd__date(int argc, const char **argv);
 int cmd__delta(int argc, const char **argv);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 272d1605d2..8803fcf621 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -319,4 +319,32 @@ test_expect_success 'register and unregister' '
 	test_cmp before actual
 '
 
+test_expect_success 'start from empty cron table' '
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
+
+	# start registers the repo
+	git config --get --global maintenance.repo "$(pwd)" &&
+
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=daily" cron.txt &&
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=hourly" cron.txt &&
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=weekly" cron.txt
+'
+
+test_expect_success 'stop from existing schedule' '
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
+
+	# stop does not unregister the repo
+	git config --get --global maintenance.repo "$(pwd)" &&
+
+	# Operation is idempotent
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
+	test_must_be_empty cron.txt
+'
+
+test_expect_success 'start preserves existing schedule' '
+	echo "Important information!" >cron.txt &&
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
+	grep "Important information!" cron.txt
+'
+
 test_done
diff --git a/t/test-lib.sh b/t/test-lib.sh
index ef31f40037..4a60d1ed76 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1702,3 +1702,9 @@ test_lazy_prereq SHA1 '
 test_lazy_prereq REBASE_P '
 	test -z "$GIT_TEST_SKIP_REBASE_P"
 '
+
+# Ensure that no test accidentally triggers a Git command
+# that runs 'crontab', affecting a user's cron schedule.
+# Tests that verify the cron integration must set this locally
+# to avoid errors.
+GIT_TEST_CRONTAB="exit 1"
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v2 6/7] maintenance: recommended schedule in register/start
  2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                     ` (4 preceding siblings ...)
  2020-09-11 17:49   ` [PATCH v2 5/7] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
@ 2020-09-11 17:49   ` Derrick Stolee via GitGitGadget
  2020-09-29 19:48     ` Martin Ågren
  2020-09-11 17:49   ` [PATCH v2 7/7] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
  2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  7 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-11 17:49 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git maintenance (register|start)' subcommands add the current
repository to the global Git config so maintenance will operate on that
repository. It does not specify what maintenance should occur or how
often.

If a user sets any 'maintenance.<task>.schedule' config value, then
they have chosen a specific schedule for themselves and Git should
respect that.

However, in an effort to recommend a good schedule for repositories of
all sizes, set new config values for recommended tasks that are safe to
run in the background while users run foreground Git commands. These
commands are generally everything but the 'gc' task.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  6 ++++
 builtin/gc.c                      | 46 +++++++++++++++++++++++++++++++
 t/t7900-maintenance.sh            | 16 +++++++++++
 3 files changed, 68 insertions(+)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 7f8c279fe8..364b3e32bf 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -37,6 +37,12 @@ register::
 	`maintenance.<task>.schedule`. The tasks that are enabled are safe
 	for running in the background without disrupting foreground
 	processes.
++
+If your repository has no 'maintenance.<task>.schedule' configuration
+values set, then Git will set configuration values to some recommended
+settings. These settings disable foreground maintenance while performing
+maintenance tasks in the background that will not interrupt foreground Git
+operations.
 
 run::
 	Run one or more maintenance tasks. If one or more `--task` options
diff --git a/builtin/gc.c b/builtin/gc.c
index 4f2d17c0c0..2ef4c0960c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1409,6 +1409,49 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	return maintenance_run_tasks(&opts);
 }
 
+static int has_schedule_config(void)
+{
+	int i, found = 0;
+	struct strbuf config_name = STRBUF_INIT;
+	size_t prefix;
+
+	strbuf_addstr(&config_name, "maintenance.");
+	prefix = config_name.len;
+
+	for (i = 0; !found && i < TASK__COUNT; i++) {
+		char *value;
+
+		strbuf_setlen(&config_name, prefix);
+		strbuf_addf(&config_name, "%s.schedule", tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &value)) {
+			found = 1;
+			FREE_AND_NULL(value);
+		}
+	}
+
+	strbuf_release(&config_name);
+	return found;
+}
+
+static void set_recommended_schedule(void)
+{
+	git_config_set("maintenance.auto", "false");
+	git_config_set("maintenance.gc.enabled", "false");
+
+	git_config_set("maintenance.prefetch.enabled", "true");
+	git_config_set("maintenance.prefetch.schedule", "hourly");
+
+	git_config_set("maintenance.commit-graph.enabled", "true");
+	git_config_set("maintenance.commit-graph.schedule", "hourly");
+
+	git_config_set("maintenance.loose-objects.enabled", "true");
+	git_config_set("maintenance.loose-objects.schedule", "daily");
+
+	git_config_set("maintenance.incremental-repack.enabled", "true");
+	git_config_set("maintenance.incremental-repack.schedule", "daily");
+}
+
 static int maintenance_register(void)
 {
 	struct child_process config_set = CHILD_PROCESS_INIT;
@@ -1418,6 +1461,9 @@ static int maintenance_register(void)
 	if (!the_repository || !the_repository->gitdir)
 		return 0;
 
+	if (!has_schedule_config())
+		set_recommended_schedule();
+
 	config_get.git_cmd = 1;
 	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
 		     the_repository->worktree ? the_repository->worktree
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 8803fcf621..5a31f3925b 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -309,7 +309,23 @@ test_expect_success 'register and unregister' '
 	git config --global --add maintenance.repo /existing1 &&
 	git config --global --add maintenance.repo /existing2 &&
 	git config --global --get-all maintenance.repo >before &&
+
+	# We still have maintenance.<task>.schedule config set,
+	# so this does not update the local schedule
+	git maintenance register &&
+	test_must_fail git config maintenance.auto &&
+
+	# Clear previous maintenance.<task>.schedule values
+	for task in loose-objects commit-graph incremental-repack
+	do
+		git config --unset maintenance.$task.schedule || return 1
+	done &&
 	git maintenance register &&
+	test_cmp_config false maintenance.auto &&
+	test_cmp_config false maintenance.gc.enabled &&
+	test_cmp_config true maintenance.prefetch.enabled &&
+	test_cmp_config hourly maintenance.commit-graph.schedule &&
+	test_cmp_config daily maintenance.incremental-repack.schedule &&
 	git config --global --get-all maintenance.repo >actual &&
 	cp before after &&
 	pwd >>after &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v2 7/7] maintenance: add troubleshooting guide to docs
  2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                     ` (5 preceding siblings ...)
  2020-09-11 17:49   ` [PATCH v2 6/7] maintenance: recommended schedule in register/start Derrick Stolee via GitGitGadget
@ 2020-09-11 17:49   ` Derrick Stolee via GitGitGadget
  2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-09-11 17:49 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git maintenance run' subcommand takes a lock on the object database
to prevent concurrent processes from competing for resources. This is an
important safety measure to prevent possible repository corruption and
data loss.

This feature can lead to confusing behavior if a user is not aware of
it. Add a TROUBLESHOOTING section to the 'git maintenance' builtin
documentation that discusses these tradeoffs. The short version of this
section is that Git will not corrupt your repository, but if the list of
scheduled tasks takes longer than an hour then some scheduled tasks may
be dropped due to this object database collision. For example, a
long-running "daily" task at midnight might prevent an "hourly" task
from running at 1AM.

The opposite is also possible, but less likely as long as the "hourly"
tasks are much faster than the "daily" and "weekly" tasks.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 44 +++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 364b3e32bf..f58dd60e40 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -161,6 +161,50 @@ OPTIONS
 	`maintenance.<task>.enabled` configured as `true` are considered.
 	See the 'TASKS' section for the list of accepted `<task>` values.
 
+
+TROUBLESHOOTING
+---------------
+The `git maintenance` command is designed to simplify the repository
+maintenance patterns while minimizing user wait time during Git commands.
+A variety of configuration options are available to allow customizing this
+process. The default maintenance options focus on operations that complete
+quickly, even on large repositories.
+
+Users may find some cases where scheduled maintenance tasks do not run as
+frequently as intended. Each `git maintenance run` command takes a lock on
+the repository's object database, and this prevents other concurrent
+`git maintenance run` commands from running on the same repository. Without
+this safeguard, competing processes could leave the repository in an
+unpredictable state.
+
+The background maintenance schedule runs `git maintenance run` processes
+on an hourly basis. Each run executes the "hourly" tasks. At midnight,
+that process also executes the "daily" tasks. At midnight on the first day
+of the week, that process also executes the "weekly" tasks. A single
+process iterates over each registered repository, performing the scheduled
+tasks for that frequency. Depending on the number of registered
+repositories and their sizes, this process may take longer than an hour.
+In this case, multiple `git maintenance run` commands may run on the same
+repository at the same time, colliding on the object database lock. This
+results in one of the two tasks not running.
+
+If you find that some maintenance windows are taking longer than one hour
+to complete, then consider reducing the complexity of your maintenance
+tasks. For example, the `gc` task is much slower than the
+`incremental-repack` task. However, this comes at a cost of a slightly
+larger object database. Consider moving more expensive tasks to be run
+less frequently.
+
+Expert users may consider scheduling their own maintenance tasks using a
+different schedule than is available through `git maintenance start` and
+Git configuration options. These users should be aware of the object
+database lock and how concurrent `git maintenance run` commands behave.
+Further, the `git gc` command should not be combined with
+`git maintenance run` commands. `git gc` modifies the object database
+but does not take the lock in the same way as `git maintenance run`. If
+possible, use `git maintenance run --task=gc` instead of `git gc`.
+
+
 GIT
 ---
 Part of the linkgit:git[1] suite
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 4/7] maintenance: add [un]register subcommands
  2020-09-11 17:49   ` [PATCH v2 4/7] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
@ 2020-09-17 14:05     ` Đoàn Trần Công Danh
  0 siblings, 0 replies; 62+ messages in thread
From: Đoàn Trần Công Danh @ 2020-09-17 14:05 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, jrnieder, jonathantanmy, sluongng, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee, Derrick Stolee


Hi Stolee,

Sorry for reply this late.

On 2020-09-11 17:49:17+0000, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> In preparation for launching background maintenance from the 'git
> maintenance' builtin, create register/unregister subcommands. These
> commands update the new 'maintenance.repos' config option in the global

And also not spot this earlier.

I think you meant 'maintenance.repo' (without s) here
since it's the one that was mentioned in the patch itself.

Other than that, this series looks sane to me.

Thanks
Danh

> config so the background maintenance job knows which repositories to
> maintain.
> 
> These commands allow users to add a repository to the background
> maintenance list without disrupting the actual maintenance mechanism.
> 
> For example, a user can run 'git maintenance register' when no
> background maintenance is running and it will not start the background
> maintenance. A later update to start running background maintenance will
> then pick up this repository automatically.
> 
> The opposite example is that a user can run 'git maintenance unregister'
> to remove the current repository from background maintenance without
> halting maintenance for other repositories.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/git-maintenance.txt | 14 ++++++++
>  builtin/gc.c                      | 55 ++++++++++++++++++++++++++++++-
>  t/t7900-maintenance.sh            | 17 +++++++++-
>  3 files changed, 84 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 3af5907b01..78d0d8df91 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -29,6 +29,15 @@ Git repository.
>  SUBCOMMANDS
>  -----------
>  
> +register::
> +	Initialize Git config values so any scheduled maintenance will
> +	start running on this repository. This adds the repository to the
> +	`maintenance.repo` config variable in the current user's global
> +	config and enables some recommended configuration values for
> +	`maintenance.<task>.schedule`. The tasks that are enabled are safe
> +	for running in the background without disrupting foreground
> +	processes.
> +
>  run::
>  	Run one or more maintenance tasks. If one or more `--task` options
>  	are specified, then those tasks are run in that order. Otherwise,
> @@ -36,6 +45,11 @@ run::
>  	config options are true. By default, only `maintenance.gc.enabled`
>  	is true.
>  
> +unregister::
> +	Remove the current repository from background maintenance. This
> +	only removes the repository from the configured list. It does not
> +	stop the background maintenance processes from running.
> +
>  TASKS
>  -----
>  
> diff --git a/builtin/gc.c b/builtin/gc.c
> index e28561b6c5..0290b249c9 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -1408,7 +1408,56 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
>  	return maintenance_run_tasks(&opts);
>  }
>  
> -static const char builtin_maintenance_usage[] = N_("git maintenance run [<options>]");
> +static int maintenance_register(void)
> +{
> +	struct child_process config_set = CHILD_PROCESS_INIT;
> +	struct child_process config_get = CHILD_PROCESS_INIT;
> +
> +	/* There is no current repository, so skip registering it */
> +	if (!the_repository || !the_repository->gitdir)
> +		return 0;
> +
> +	config_get.git_cmd = 1;
> +	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
> +		     the_repository->worktree ? the_repository->worktree
> +					      : the_repository->gitdir,
> +			 NULL);
> +	config_get.out = -1;
> +
> +	if (start_command(&config_get))
> +		return error(_("failed to run 'git config'"));
> +
> +	/* We already have this value in our config! */
> +	if (!finish_command(&config_get))
> +		return 0;
> +
> +	config_set.git_cmd = 1;
> +	strvec_pushl(&config_set.args, "config", "--add", "--global", "maintenance.repo",
> +		     the_repository->worktree ? the_repository->worktree
> +					      : the_repository->gitdir,
> +		     NULL);
> +
> +	return run_command(&config_set);
> +}
> +
> +static int maintenance_unregister(void)
> +{
> +	struct child_process config_unset = CHILD_PROCESS_INIT;
> +
> +	if (!the_repository || !the_repository->gitdir)
> +		return error(_("no current repository to unregister"));
> +
> +	config_unset.git_cmd = 1;
> +	strvec_pushl(&config_unset.args, "config", "--global", "--unset",
> +		     "maintenance.repo",
> +		     the_repository->worktree ? the_repository->worktree
> +					      : the_repository->gitdir,
> +		     NULL);
> +
> +	return run_command(&config_unset);
> +}
> +
> +static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
>  
>  int cmd_maintenance(int argc, const char **argv, const char *prefix)
>  {
> @@ -1417,6 +1466,10 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
>  
>  	if (!strcmp(argv[1], "run"))
>  		return maintenance_run(argc - 1, argv + 1, prefix);
> +	if (!strcmp(argv[1], "register"))
> +		return maintenance_register();
> +	if (!strcmp(argv[1], "unregister"))
> +		return maintenance_unregister();
>  
>  	die(_("invalid subcommand: %s"), argv[1]);
>  }
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 328bbaa830..272d1605d2 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -9,7 +9,7 @@ GIT_TEST_MULTI_PACK_INDEX=0
>  
>  test_expect_success 'help text' '
>  	test_expect_code 129 git maintenance -h 2>err &&
> -	test_i18ngrep "usage: git maintenance run" err &&
> +	test_i18ngrep "usage: git maintenance <subcommand>" err &&
>  	test_expect_code 128 git maintenance barf 2>err &&
>  	test_i18ngrep "invalid subcommand: barf" err
>  '
> @@ -304,4 +304,19 @@ test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
>  	test_subcommand git multi-pack-index write --no-progress <weekly.txt
>  '
>  
> +test_expect_success 'register and unregister' '
> +	test_when_finished git config --global --unset-all maintenance.repo &&
> +	git config --global --add maintenance.repo /existing1 &&
> +	git config --global --add maintenance.repo /existing2 &&
> +	git config --global --get-all maintenance.repo >before &&
> +	git maintenance register &&
> +	git config --global --get-all maintenance.repo >actual &&
> +	cp before after &&
> +	pwd >>after &&
> +	test_cmp after actual &&
> +	git maintenance unregister &&
> +	git config --global --get-all maintenance.repo >actual &&
> +	test_cmp before actual
> +'
> +
>  test_done
> -- 
> gitgitgadget
> 

-- 
Danh

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 6/7] maintenance: recommended schedule in register/start
  2020-09-11 17:49   ` [PATCH v2 6/7] maintenance: recommended schedule in register/start Derrick Stolee via GitGitGadget
@ 2020-09-29 19:48     ` Martin Ågren
  2020-09-30 20:11       ` Derrick Stolee
  0 siblings, 1 reply; 62+ messages in thread
From: Martin Ågren @ 2020-09-29 19:48 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Jonathan Nieder, Jonathan Tan, sluongng,
	Đoàn Trần Công Danh, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee, Derrick Stolee

Hi Stolee,

On Fri, 11 Sep 2020 at 19:53, Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> If a user sets any 'maintenance.<task>.schedule' config value, then
> they have chosen a specific schedule for themselves and Git should
> respect that.
>
> However, in an effort to recommend a good schedule for repositories of
> all sizes, set new config values for recommended tasks that are safe to
> run in the background while users run foreground Git commands. These
> commands are generally everything but the 'gc' task.

If there aren't any "schedule" configurations, we'll go ahead and
sprinkle in quite a few of them. I suppose that another approach would
be that later, much later, when we go look for these configuration
items, we could go "there is not a single one set, let's act as if
*these* were configured".

The advantage there would be that we can tweak those defaults over time.
Whereas with the approach of this patch, v2.29.0 will give the user a
snapshot of 2020's best practices. If they want to catch up, they will
need to drop all their "schedule" config and re-"register", or use a
future `git maintenance reregister`. ;-)

Anyway, this is a convenience thing. There's a chance that "convenience"
interferes with "perfect" and "optimal". I guess that's to be expected.

> +If your repository has no 'maintenance.<task>.schedule' configuration

Thank you for going above and beyond with marking config items et cetera
for rendering in `monospace`. I just noticed that this is slightly
mis-marked-upped. If you end up rerolling this patch series for some
reason, you might want to switch from 'single quotes' to `backticks` in
this particular instance.

While I'm commenting anyway...

> +static int has_schedule_config(void)
> +{
> +       int i, found = 0;
> +       struct strbuf config_name = STRBUF_INIT;
> +       size_t prefix;
> +
> +       strbuf_addstr(&config_name, "maintenance.");
> +       prefix = config_name.len;
> +
> +       for (i = 0; !found && i < TASK__COUNT; i++) {
> +               char *value;
> +
> +               strbuf_setlen(&config_name, prefix);
> +               strbuf_addf(&config_name, "%s.schedule", tasks[i].name);
> +
> +               if (!git_config_get_string(config_name.buf, &value)) {
> +                       found = 1;
> +                       FREE_AND_NULL(value);
> +               }
> +       }
> +
> +       strbuf_release(&config_name);
> +       return found;
> +}

That `FREE_AND_NULL()` caught me off-guard. The pointer is on the stack.
I suppose it doesn't *hurt*, but being careful to set it to NULL made me
go "huh".

I suppose you could drop the `!found` check in favour of `break`-ing
precisely when you get a hit.

And I do wonder how much the reuse of the "maintenance." part of the
buffer helps performance.

In the end, you could use something like the following (not compiled):

  static int has_schedule_config(void)
  {
         int i, found = 0;
         struct strbuf config_name = STRBUF_INIT;

         for (i = 0; i < TASK__COUNT; i++) {
                 const char *value;

                 strbuf_reset(&config_name);
                 strbuf_addf(&config_name, "maintenance.%s.schedule",
tasks[i].name);

                 if (!git_config_get_value(config_name.buf, &value)) {
                         found = 1;
                         break;
                 }
         }

         strbuf_release(&config_name);
         return found;
  }

Anyway, that's just microniting, obviously, but maybe in the sum it has
some value.


Martin

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 6/7] maintenance: recommended schedule in register/start
  2020-09-29 19:48     ` Martin Ågren
@ 2020-09-30 20:11       ` Derrick Stolee
  2020-10-01 20:38         ` Derrick Stolee
  0 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee @ 2020-09-30 20:11 UTC (permalink / raw)
  To: Martin Ågren, Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Jonathan Nieder, Jonathan Tan, sluongng,
	Đoàn Trần Công Danh, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee

On 9/29/2020 3:48 PM, Martin Ågren wrote:
> Hi Stolee,
> 
> On Fri, 11 Sep 2020 at 19:53, Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> If a user sets any 'maintenance.<task>.schedule' config value, then
>> they have chosen a specific schedule for themselves and Git should
>> respect that.
>>
>> However, in an effort to recommend a good schedule for repositories of
>> all sizes, set new config values for recommended tasks that are safe to
>> run in the background while users run foreground Git commands. These
>> commands are generally everything but the 'gc' task.
> 
> If there aren't any "schedule" configurations, we'll go ahead and
> sprinkle in quite a few of them. I suppose that another approach would
> be that later, much later, when we go look for these configuration
> items, we could go "there is not a single one set, let's act as if
> *these* were configured".

I do like this alternative.

> The advantage there would be that we can tweak those defaults over time.
> Whereas with the approach of this patch, v2.29.0 will give the user a
> snapshot of 2020's best practices. If they want to catch up, they will
> need to drop all their "schedule" config and re-"register", or use a
> future `git maintenance reregister`. ;-)

This is a significant advantage! Great idea.

It might be a bit difficult to slide this in, but I bet it would work
out OK if we have a "initialize_schedule()" option that is only run
when the "--schedule=<...>" option is given? The trickiest part is
actually setting the ".enabled" configs to "true" as well. The condition
for using the "default" schedule might get a bit complicated. I do think
it is worth some effort to do, as adjusting defaults in code is certainly
easier than modifying config values.

> Anyway, this is a convenience thing. There's a chance that "convenience"
> interferes with "perfect" and "optimal". I guess that's to be expected.
> 
>> +If your repository has no 'maintenance.<task>.schedule' configuration
> 
> Thank you for going above and beyond with marking config items et cetera
> for rendering in `monospace`. I just noticed that this is slightly
> mis-marked-upped. If you end up rerolling this patch series for some
> reason, you might want to switch from 'single quotes' to `backticks` in
> this particular instance.

Thanks! Yeah that was a mis-type.

> While I'm commenting anyway...
> 
>> +static int has_schedule_config(void)
>> +{
>> +       int i, found = 0;
>> +       struct strbuf config_name = STRBUF_INIT;
>> +       size_t prefix;
>> +
>> +       strbuf_addstr(&config_name, "maintenance.");
>> +       prefix = config_name.len;
>> +
>> +       for (i = 0; !found && i < TASK__COUNT; i++) {
>> +               char *value;
>> +
>> +               strbuf_setlen(&config_name, prefix);
>> +               strbuf_addf(&config_name, "%s.schedule", tasks[i].name);
>> +
>> +               if (!git_config_get_string(config_name.buf, &value)) {
>> +                       found = 1;
>> +                       FREE_AND_NULL(value);
>> +               }
>> +       }
>> +
>> +       strbuf_release(&config_name);
>> +       return found;
>> +}
> 
> That `FREE_AND_NULL()` caught me off-guard. The pointer is on the stack.
> I suppose it doesn't *hurt*, but being careful to set it to NULL made me
> go "huh".
> 
> I suppose you could drop the `!found` check in favour of `break`-ing
> precisely when you get a hit.
> 
> And I do wonder how much the reuse of the "maintenance." part of the
> buffer helps performance.

All valid points.

> In the end, you could use something like the following (not compiled):
> 
>   static int has_schedule_config(void)
>   {
>          int i, found = 0;
>          struct strbuf config_name = STRBUF_INIT;
> 
>          for (i = 0; i < TASK__COUNT; i++) {
>                  const char *value;
> 
>                  strbuf_reset(&config_name);
>                  strbuf_addf(&config_name, "maintenance.%s.schedule",
> tasks[i].name);
> 
>                  if (!git_config_get_value(config_name.buf, &value)) {
>                          found = 1;
>                          break;
>                  }
>          }
> 
>          strbuf_release(&config_name);
>          return found;
>   }
> 
> Anyway, that's just microniting, obviously, but maybe in the sum it has
> some value.

Sounds good to me. I'll work on a new version that makes your
recommendations.

Thanks,
-Stolee




^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 6/7] maintenance: recommended schedule in register/start
  2020-09-30 20:11       ` Derrick Stolee
@ 2020-10-01 20:38         ` Derrick Stolee
  2020-10-02  0:38           ` Đoàn Trần Công Danh
  0 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee @ 2020-10-01 20:38 UTC (permalink / raw)
  To: Martin Ågren, Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Jonathan Nieder, Jonathan Tan, sluongng,
	Đoàn Trần Công Danh, SZEDER Gábor,
	Derrick Stolee, Derrick Stolee

On 9/30/2020 4:11 PM, Derrick Stolee wrote:
> On 9/29/2020 3:48 PM, Martin Ågren wrote:
>> If there aren't any "schedule" configurations, we'll go ahead and
>> sprinkle in quite a few of them. I suppose that another approach would
>> be that later, much later, when we go look for these configuration
>> items, we could go "there is not a single one set, let's act as if
>> *these* were configured".
> 
> I do like this alternative.
> 
>> The advantage there would be that we can tweak those defaults over time.
>> Whereas with the approach of this patch, v2.29.0 will give the user a
>> snapshot of 2020's best practices. If they want to catch up, they will
>> need to drop all their "schedule" config and re-"register", or use a
>> future `git maintenance reregister`. ;-)
> 
> This is a significant advantage! Great idea.

Thank you for giving me the idea to pursue this direction. The
replacement patch isn't too bad, and I think this is a much better
approach to satisfy the situation.

What do you think?

Thanks,
-Stolee

-- >8 --

From 7c7698da17327d17485ba1b23a16a0a8d54efaad Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Tue, 18 Aug 2020 15:15:02 -0400
Subject: [PATCH] maintenance: recommended schedule in register/start
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The 'git maintenance (register|start)' subcommands add the current
repository to the global Git config so maintenance will operate on that
repository. It does not specify what maintenance should occur or how
often.

If a user sets any 'maintenance.<task>.schedule' config value, then
they have chosen a specific schedule for themselves and Git should
respect that.

To make this process extremely simple for users, assume a default
schedule when no 'maintenance.<task>.schedule' or '...enabled' config
settings are concretely set. This is only an in-process assumption, so
future versions of Git could adjust this expected schedule.

Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 15 ++++++++
 builtin/gc.c                      | 58 +++++++++++++++++++++++++++++++
 t/t7900-maintenance.sh            | 11 +++---
 3 files changed, 80 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 7628a6d157..52fff86844 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -37,6 +37,21 @@ register::
 	`maintenance.<task>.schedule`. The tasks that are enabled are safe
 	for running in the background without disrupting foreground
 	processes.
++
+If your repository has no `maintenance.<task>.schedule` configuration
+values set, then Git will use a recommended default schedule that performs
+background maintenance that will not interrupt foreground commands. The
+default schedule is as follows:
++
+* `gc`: disabled.
+* `commit-graph`: hourly.
+* `prefetch`: hourly.
+* `loose-objects`: daily.
+* `incremental-repack`: daily.
++
+`git maintenance register` will also disable foreground maintenance by
+setting `maintenance.auto = false` in the current repository. This config
+setting will remain after a `git maintenance unregister` command.
 
 run::
 	Run one or more maintenance tasks. If one or more `--task` options
diff --git a/builtin/gc.c b/builtin/gc.c
index a387f46585..965690704b 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1251,6 +1251,59 @@ static int compare_tasks_by_selection(const void *a_, const void *b_)
 	return b->selected_order - a->selected_order;
 }
 
+static int has_schedule_config(void)
+{
+	int i, found = 0;
+	struct strbuf config_name = STRBUF_INIT;
+	size_t prefix;
+
+	strbuf_addstr(&config_name, "maintenance.");
+	prefix = config_name.len;
+
+	for (i = 0; !found && i < TASK__COUNT; i++) {
+		char *value;
+
+		strbuf_setlen(&config_name, prefix);
+		strbuf_addf(&config_name, "%s.schedule", tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &value)) {
+			found = 1;
+			FREE_AND_NULL(value);
+		}
+
+		strbuf_setlen(&config_name, prefix);
+		strbuf_addf(&config_name, "%s.enabled", tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &value)) {
+			found = 1;
+			FREE_AND_NULL(value);
+		}
+	}
+
+	strbuf_release(&config_name);
+	return found;
+}
+
+static void set_recommended_schedule(void)
+{
+	if (has_schedule_config())
+		return;
+
+	tasks[TASK_GC].enabled = 0;
+
+	tasks[TASK_PREFETCH].enabled = 1;
+	tasks[TASK_PREFETCH].schedule = SCHEDULE_HOURLY;
+
+	tasks[TASK_COMMIT_GRAPH].enabled = 1;
+	tasks[TASK_COMMIT_GRAPH].schedule = SCHEDULE_HOURLY;
+
+	tasks[TASK_LOOSE_OBJECTS].enabled = 1;
+	tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY;
+
+	tasks[TASK_INCREMENTAL_REPACK].enabled = 1;
+	tasks[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY;
+}
+
 static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 {
 	int i, found_selected = 0;
@@ -1280,6 +1333,8 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 
 	if (found_selected)
 		QSORT(tasks, TASK__COUNT, compare_tasks_by_selection);
+	else if (opts->schedule != SCHEDULE_NONE)
+		set_recommended_schedule();
 
 	for (i = 0; i < TASK__COUNT; i++) {
 		if (found_selected && tasks[i].selected_order < 0)
@@ -1417,6 +1472,9 @@ static int maintenance_register(void)
 	if (!the_repository || !the_repository->gitdir)
 		return 0;
 
+	/* Disable foreground maintenance */
+	git_config_set("maintenance.auto", "false");
+
 	config_get.git_cmd = 1;
 	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
 		     the_repository->worktree ? the_repository->worktree
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index c8d7e65d3d..fae2ef81bd 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -305,11 +305,14 @@ test_expect_success 'register and unregister' '
 	git config --global --add maintenance.repo /existing1 &&
 	git config --global --add maintenance.repo /existing2 &&
 	git config --global --get-all maintenance.repo >before &&
+
 	git maintenance register &&
-	git config --global --get-all maintenance.repo >actual &&
-	cp before after &&
-	pwd >>after &&
-	test_cmp after actual &&
+	test_cmp_config false maintenance.auto &&
+	git config --global --get-all maintenance.repo >between &&
+	cp before expect &&
+	pwd >>expect &&
+	test_cmp expect between &&
+
 	git maintenance unregister &&
 	git config --global --get-all maintenance.repo >actual &&
 	test_cmp before actual
-- 
2.28.0.284.gc2951c3dd58.dirty




^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 6/7] maintenance: recommended schedule in register/start
  2020-10-01 20:38         ` Derrick Stolee
@ 2020-10-02  0:38           ` Đoàn Trần Công Danh
  2020-10-02  1:55             ` Derrick Stolee
  0 siblings, 1 reply; 62+ messages in thread
From: Đoàn Trần Công Danh @ 2020-10-02  0:38 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Martin Ågren, Derrick Stolee via GitGitGadget,
	Git Mailing List, Jonathan Nieder, Jonathan Tan, sluongng,
	SZEDER Gábor, Derrick Stolee, Derrick Stolee

On 2020-10-01 16:38:48-0400, Derrick Stolee <stolee@gmail.com> wrote:
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 7628a6d157..52fff86844 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -37,6 +37,21 @@ register::
>  	`maintenance.<task>.schedule`. The tasks that are enabled are safe
>  	for running in the background without disrupting foreground
>  	processes.
> ++
> +If your repository has no `maintenance.<task>.schedule` configuration
> +values set, then Git will use a recommended default schedule that performs
> +background maintenance that will not interrupt foreground commands. The
> +default schedule is as follows:

I don't mind about using a default schedule (but someone else might).
I think some distributions will be paranoia with this change and shiped
with disable by default in system config.

> ++
> +* `gc`: disabled.
> +* `commit-graph`: hourly.
> +* `prefetch`: hourly.

However, no `prefetch` in default schedule, please.
IIUC, this is a network operation, if someone is on the go and paying
their internet based on their traffic, this will be a disaster.


> +* `loose-objects`: daily.
> +* `incremental-repack`: daily.

And I would say no incremental-repack, too.
Users don't want to a large operation of IO on some random time of the day,
be it when they open their PC in the morning, or when they want to close
their laptop to go home.

----------(Windows rant ahead)
I still remember those days that Windows 8 was introduced,
Back in that days, my computer still uses the old 7200rpm HDD.
I was super-angry that whenever Windows is started, it starts some IO
disk-caching, indexing that hung my computer for a good 10 minutes.
While that same computer can run Windows 7 and other OS fine.
I don't particularly care how much my computer is faster after that.
I want my computer usable at that time, instead of wasting a good 10
minutes on nothing.
---------(Windows rant end)

Either the users know what are they doing, or we don't do anything at
all. Let's them do it on their free time.

IOW, Please let users opt in instead of opt out of this features.

-- 

Danh

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 6/7] maintenance: recommended schedule in register/start
  2020-10-02  0:38           ` Đoàn Trần Công Danh
@ 2020-10-02  1:55             ` Derrick Stolee
  2020-10-05 13:16               ` Đoàn Trần Công Danh
  0 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee @ 2020-10-02  1:55 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: Martin Ågren, Derrick Stolee via GitGitGadget,
	Git Mailing List, Jonathan Nieder, Jonathan Tan, sluongng,
	SZEDER Gábor, Derrick Stolee, Derrick Stolee

On 10/1/2020 8:38 PM, Đoàn Trần Công Danh wrote:
> On 2020-10-01 16:38:48-0400, Derrick Stolee <stolee@gmail.com> wrote:
>> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
>> index 7628a6d157..52fff86844 100644
>> --- a/Documentation/git-maintenance.txt
>> +++ b/Documentation/git-maintenance.txt
>> @@ -37,6 +37,21 @@ register::
>>  	`maintenance.<task>.schedule`. The tasks that are enabled are safe
>>  	for running in the background without disrupting foreground
>>  	processes.
>> ++
>> +If your repository has no `maintenance.<task>.schedule` configuration
>> +values set, then Git will use a recommended default schedule that performs
>> +background maintenance that will not interrupt foreground commands. The
>> +default schedule is as follows:
> 
> I don't mind about using a default schedule (but someone else might).
> I think some distributions will be paranoia with this change and shiped
> with disable by default in system config.

If a user wants to prevent this schedule, then they can simply change
any one of the `.schedule` or `.enabled` configs in their --global config
and these defaults will not be used.

Of course, perhaps you are missing the fact that "git maintenance run
--schedule=<frequency>" is only run as a cron job if a user chose to
start background maintenance using "git maintenance start" (or "git
maintenance register" after running the 'start' subcommand in another
repo). So this is _not_ starting by default without some amount of
choosing to opt in.

>> ++
>> +* `gc`: disabled.
>> +* `commit-graph`: hourly.
>> +* `prefetch`: hourly.
> 
> However, no `prefetch` in default schedule, please.
> IIUC, this is a network operation, if someone is on the go and paying
> their internet based on their traffic, this will be a disaster.

It _is_ a network operation. You're right that we should make it clear
that network operations are being run in the background if the defaults
are being used.

Of course, this is why "git maintenance stop" exists. Background
maintenance can be halted while not being in a mode where this maintenance
is acceptable.

And further: these defaults are optimized for desktop machines that are
expected to always be on and connected to a non-metered network. Laptops
are not always on, not always connected, and sometimes are metered. Perhaps
a user should decide that they don't want to have background maintenance,
and then they can choose to not opt-in.

If this scenario is common enough, then we could extend the "prefetch"
task to somehow detect (on some platforms) that the network connection is
metered, and then not do any fetches.

>> +* `loose-objects`: daily.
>> +* `incremental-repack`: daily.
> 
> And I would say no incremental-repack, too.
> Users don't want to a large operation of IO on some random time of the day,
> be it when they open their PC in the morning, or when they want to close
> their laptop to go home.

But that's exactly why incremental-repack is an improvement over
a "random" instance of 'git gc --auto' going over an invisible threshold.
The "incremental" nature is intended to only do a reasonable amount of work 
instead of rewriting everything.
 
> ----------(Windows rant ahead)
> I still remember those days that Windows 8 was introduced,
> Back in that days, my computer still uses the old 7200rpm HDD.
> I was super-angry that whenever Windows is started, it starts some IO
> disk-caching, indexing that hung my computer for a good 10 minutes.
> While that same computer can run Windows 7 and other OS fine.
> I don't particularly care how much my computer is faster after that.
> I want my computer usable at that time, instead of wasting a good 10
> minutes on nothing.
> ---------(Windows rant end)
> 
> Either the users know what are they doing, or we don't do anything at
> all. Let's them do it on their free time.
> 
> IOW, Please let users opt in instead of opt out of this features.

But users _do_ opt-in to this feature. They need to start the maintenance
at all to have this run. We just need to give them something that actually
"maintains" their repository without being incredibly expensive or
possibly leading to data loss. That is exactly why "incremental-repack" is
chosen over the "gc" task. If there isn't enough work to be done, then
this task is very cheap to do.

Perhaps all of your concerns are satisfied with this reassurance that
background maintenance is completely opt-in and will not be set up
without a user explicitly enabling it.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 0/7] Maintenance III: Background maintenance
  2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                     ` (6 preceding siblings ...)
  2020-09-11 17:49   ` [PATCH v2 7/7] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
@ 2020-10-05 12:57   ` Derrick Stolee via GitGitGadget
  2020-10-05 12:57     ` [PATCH v3 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
                       ` (7 more replies)
  7 siblings, 8 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-05 12:57 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Derrick Stolee

This is based on ds/maintenance-part-2 and replaces the RFC from [1].

[1] 
https://lore.kernel.org/git/pull.680.v3.git.1598629517.gitgitgadget@gmail.com/

This series introduces background maintenance to Git, through an integration
with cron and crontab.

Some preliminary work is done to allow a new --schedule option that tells
the command which tasks to run based on a maintenance.<task>.schedule config
option. The timing is not enforced by Git, but instead is expected to be
provided as a hint from a cron schedule. The options are "hourly", "daily",
and "weekly".

A new for-each-repo builtin runs Git commands on every repo in a given list.
Currently, the list is stored as a config setting, allowing a new 
maintenance.repos config list to store the repositories registered for
background maintenance. Others may want to add a --file=<file> option for
their own workflows, but I focused on making this as simple as possible for
now.

The updates to the git maintenance builtin include new register/unregister 
subcommands and start/stop subcommands. The register subcommand initializes
the config while the start subcommand does everything register does plus 
update the cron table. The unregister and stop commands reverse this
process.

A troubleshooting guide is added to Documentation/git-maintenance.txt to
advise expert users who choose to create custom cron schedules.

The very last patch is entirely optional. It sets a recommended schedule
based on my own experience with very large repositories. I'm open to other
suggestions, but these are ones that I think work well and don't cause a
"rewrite the world" scenario like running nightly 'gc' would do.

I've been testing this scenario on my macOS laptop and Linux desktop. I have
modified my cron task to provide logging via trace2 so I can see what's
happening. A future direction here would be to add some maintenance logs to
the repository so we can track what is happening and diagnose whether the
maintenance strategy is working on real repos.

Note: git maintenance (start|stop) only works on machines with cron by
design. The proper thing to do on Windows will come later. Perhaps this
command should be marked as unavailable on Windows somehow, or at least a
better error than "cron may not be available on your system". I did find
that that message is helpful sometimes: macOS worker agents for CI builds
typically do not have cron available.

Updates in v3:

 * Instead of writing config upon "register" or "start", simply create an
   in-memory default schedule when no .schedule or .enabled configs are
   present. Thanks, Martin! This causes patch 6 to look so different that
   the range-diff considers it a dropped-and-added patch instead of showing
   a diff.
 * There are some context lines that changed because this is rebased onto a
   recent version of ds/maintenance-part-2.

Updates in v2:

 * Fixed the char/int issue in test-tool crontab, and a typo.
 * Updated commit message and patch noise in PATCH 2
 * This should fix the test failures, allowing this to be picked up in
   'seen'.

Derrick Stolee (7):
  maintenance: optionally skip --auto process
  maintenance: add --schedule option and config
  for-each-repo: run subcommands on configured repos
  maintenance: add [un]register subcommands
  maintenance: add start/stop subcommands
  maintenance: use default schedule if not configured
  maintenance: add troubleshooting guide to docs

 .gitignore                           |   1 +
 Documentation/config/maintenance.txt |  10 +
 Documentation/git-for-each-repo.txt  |  59 ++++++
 Documentation/git-maintenance.txt    |  97 ++++++++-
 Makefile                             |   2 +
 builtin.h                            |   1 +
 builtin/for-each-repo.c              |  58 ++++++
 builtin/gc.c                         | 301 ++++++++++++++++++++++++++-
 command-list.txt                     |   1 +
 git.c                                |   1 +
 run-command.c                        |   6 +
 t/helper/test-crontab.c              |  35 ++++
 t/helper/test-tool.c                 |   1 +
 t/helper/test-tool.h                 |   1 +
 t/t0068-for-each-repo.sh             |  30 +++
 t/t7900-maintenance.sh               | 101 ++++++++-
 t/test-lib.sh                        |   6 +
 17 files changed, 705 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100644 t/helper/test-crontab.c
 create mode 100755 t/t0068-for-each-repo.sh


base-commit: e841a79a131d8ce491cf04d0ca3e24f139a10b82
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-724%2Fderrickstolee%2Fmaintenance%2Fscheduled-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-724/derrickstolee/maintenance/scheduled-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/724

Range-diff vs v2:

 1:  b21cd68c90 = 1:  02e7286dba maintenance: optionally skip --auto process
 2:  e2d14d66d4 ! 2:  dae8c04bb5 maintenance: add --schedule option and config
     @@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char
      
       ## t/t7900-maintenance.sh ##
      @@ t/t7900-maintenance.sh: test_expect_success 'maintenance.incremental-repack.auto' '
     - 	done
     + 	test_subcommand git multi-pack-index write --no-progress <trace-B
       '
       
      +test_expect_success '--auto and --schedule incompatible' '
 3:  41a346dfbb = 3:  dd92379273 for-each-repo: run subcommands on configured repos
 4:  1f49cda18e ! 4:  922b984c8a maintenance: add [un]register subcommands
     @@ t/t7900-maintenance.sh: GIT_TEST_MULTI_PACK_INDEX=0
      -	test_i18ngrep "usage: git maintenance run" err &&
      +	test_i18ngrep "usage: git maintenance <subcommand>" err &&
       	test_expect_code 128 git maintenance barf 2>err &&
     - 	test_i18ngrep "invalid subcommand: barf" err
     - '
     + 	test_i18ngrep "invalid subcommand: barf" err &&
     + 	test_expect_code 129 git maintenance 2>err &&
      @@ t/t7900-maintenance.sh: test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
       	test_subcommand git multi-pack-index write --no-progress <weekly.txt
       '
 5:  e9b2a39c1d ! 5:  5194f6b1fa maintenance: add start/stop subcommands
     @@ Makefile: TEST_BUILTINS_OBJS += test-advise.o
      
       ## builtin/gc.c ##
      @@
     + #include "refs.h"
       #include "remote.h"
     - #include "midx.h"
       #include "object-store.h"
      +#include "exec-cmd.h"
       
 6:  f609c1bde2 ! 6:  d833fffe89 maintenance: recommended schedule in register/start
     @@ Metadata
      Author: Derrick Stolee <dstolee@microsoft.com>
      
       ## Commit message ##
     -    maintenance: recommended schedule in register/start
     +    maintenance: use default schedule if not configured
      
          The 'git maintenance (register|start)' subcommands add the current
          repository to the global Git config so maintenance will operate on that
     @@ Commit message
      
          If a user sets any 'maintenance.<task>.schedule' config value, then
          they have chosen a specific schedule for themselves and Git should
     -    respect that.
     +    respect that when running 'git maintenance run --schedule=<frequency>'.
      
     -    However, in an effort to recommend a good schedule for repositories of
     -    all sizes, set new config values for recommended tasks that are safe to
     -    run in the background while users run foreground Git commands. These
     -    commands are generally everything but the 'gc' task.
     +    To make this process extremely simple for users, assume a default
     +    schedule when no 'maintenance.<task>.schedule' or '...enabled' config
     +    settings are concretely set. This is only an in-process assumption, so
     +    future versions of Git could adjust this expected schedule.
      
     +    Helped-by: Martin Ågren <martin.agren@gmail.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## Documentation/git-maintenance.txt ##
     @@ Documentation/git-maintenance.txt: register::
       	for running in the background without disrupting foreground
       	processes.
      ++
     -+If your repository has no 'maintenance.<task>.schedule' configuration
     -+values set, then Git will set configuration values to some recommended
     -+settings. These settings disable foreground maintenance while performing
     -+maintenance tasks in the background that will not interrupt foreground Git
     -+operations.
     ++If your repository has no `maintenance.<task>.schedule` configuration
     ++values set, then Git will use a recommended default schedule that performs
     ++background maintenance that will not interrupt foreground commands. The
     ++default schedule is as follows:
     +++
     ++* `gc`: disabled.
     ++* `commit-graph`: hourly.
     ++* `prefetch`: hourly.
     ++* `loose-objects`: daily.
     ++* `incremental-repack`: daily.
     +++
     ++`git maintenance register` will also disable foreground maintenance by
     ++setting `maintenance.auto = false` in the current repository. This config
     ++setting will remain after a `git maintenance unregister` command.
       
       run::
       	Run one or more maintenance tasks. If one or more `--task` options
      
       ## builtin/gc.c ##
     -@@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char *prefix)
     - 	return maintenance_run_tasks(&opts);
     +@@ builtin/gc.c: static int compare_tasks_by_selection(const void *a_, const void *b_)
     + 	return b->selected_order - a->selected_order;
       }
       
      +static int has_schedule_config(void)
     @@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char
      +			found = 1;
      +			FREE_AND_NULL(value);
      +		}
     ++
     ++		strbuf_setlen(&config_name, prefix);
     ++		strbuf_addf(&config_name, "%s.enabled", tasks[i].name);
     ++
     ++		if (!git_config_get_string(config_name.buf, &value)) {
     ++			found = 1;
     ++			FREE_AND_NULL(value);
     ++		}
      +	}
      +
      +	strbuf_release(&config_name);
     @@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char
      +
      +static void set_recommended_schedule(void)
      +{
     -+	git_config_set("maintenance.auto", "false");
     -+	git_config_set("maintenance.gc.enabled", "false");
     ++	if (has_schedule_config())
     ++		return;
     ++
     ++	tasks[TASK_GC].enabled = 0;
      +
     -+	git_config_set("maintenance.prefetch.enabled", "true");
     -+	git_config_set("maintenance.prefetch.schedule", "hourly");
     ++	tasks[TASK_PREFETCH].enabled = 1;
     ++	tasks[TASK_PREFETCH].schedule = SCHEDULE_HOURLY;
      +
     -+	git_config_set("maintenance.commit-graph.enabled", "true");
     -+	git_config_set("maintenance.commit-graph.schedule", "hourly");
     ++	tasks[TASK_COMMIT_GRAPH].enabled = 1;
     ++	tasks[TASK_COMMIT_GRAPH].schedule = SCHEDULE_HOURLY;
      +
     -+	git_config_set("maintenance.loose-objects.enabled", "true");
     -+	git_config_set("maintenance.loose-objects.schedule", "daily");
     ++	tasks[TASK_LOOSE_OBJECTS].enabled = 1;
     ++	tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY;
      +
     -+	git_config_set("maintenance.incremental-repack.enabled", "true");
     -+	git_config_set("maintenance.incremental-repack.schedule", "daily");
     ++	tasks[TASK_INCREMENTAL_REPACK].enabled = 1;
     ++	tasks[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY;
      +}
      +
     - static int maintenance_register(void)
     + static int maintenance_run_tasks(struct maintenance_run_opts *opts)
       {
     - 	struct child_process config_set = CHILD_PROCESS_INIT;
     + 	int i, found_selected = 0;
     +@@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
     + 
     + 	if (found_selected)
     + 		QSORT(tasks, TASK__COUNT, compare_tasks_by_selection);
     ++	else if (opts->schedule != SCHEDULE_NONE)
     ++		set_recommended_schedule();
     + 
     + 	for (i = 0; i < TASK__COUNT; i++) {
     + 		if (found_selected && tasks[i].selected_order < 0)
      @@ builtin/gc.c: static int maintenance_register(void)
       	if (!the_repository || !the_repository->gitdir)
       		return 0;
       
     -+	if (!has_schedule_config())
     -+		set_recommended_schedule();
     ++	/* Disable foreground maintenance */
     ++	git_config_set("maintenance.auto", "false");
      +
       	config_get.git_cmd = 1;
       	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
     @@ t/t7900-maintenance.sh: test_expect_success 'register and unregister' '
       	git config --global --add maintenance.repo /existing2 &&
       	git config --global --get-all maintenance.repo >before &&
      +
     -+	# We still have maintenance.<task>.schedule config set,
     -+	# so this does not update the local schedule
     -+	git maintenance register &&
     -+	test_must_fail git config maintenance.auto &&
     -+
     -+	# Clear previous maintenance.<task>.schedule values
     -+	for task in loose-objects commit-graph incremental-repack
     -+	do
     -+		git config --unset maintenance.$task.schedule || return 1
     -+	done &&
       	git maintenance register &&
     +-	git config --global --get-all maintenance.repo >actual &&
     +-	cp before after &&
     +-	pwd >>after &&
     +-	test_cmp after actual &&
      +	test_cmp_config false maintenance.auto &&
     -+	test_cmp_config false maintenance.gc.enabled &&
     -+	test_cmp_config true maintenance.prefetch.enabled &&
     -+	test_cmp_config hourly maintenance.commit-graph.schedule &&
     -+	test_cmp_config daily maintenance.incremental-repack.schedule &&
     ++	git config --global --get-all maintenance.repo >between &&
     ++	cp before expect &&
     ++	pwd >>expect &&
     ++	test_cmp expect between &&
     ++
     + 	git maintenance unregister &&
       	git config --global --get-all maintenance.repo >actual &&
     - 	cp before after &&
     - 	pwd >>after &&
     + 	test_cmp before actual
 7:  2344eff4ba = 7:  8e42ff44ce maintenance: add troubleshooting guide to docs

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 1/7] maintenance: optionally skip --auto process
  2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
@ 2020-10-05 12:57     ` Derrick Stolee via GitGitGadget
  2020-10-05 12:57     ` [PATCH v3 2/7] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
                       ` (6 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-05 12:57 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Some commands run 'git maintenance run --auto --[no-]quiet' after doing
their normal work, as a way to keep repositories clean as they are used.
Currently, users who do not want this maintenance to occur would set the
'gc.auto' config option to 0 to avoid the 'gc' task from running.
However, this does not stop the extra process invocation. On Windows,
this extra process invocation can be more expensive than necessary.

Allow users to drop this extra process by setting 'maintenance.auto' to
'false'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/maintenance.txt |  5 +++++
 run-command.c                        |  6 ++++++
 t/t7900-maintenance.sh               | 13 +++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index a0706d8f09..06db758172 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -1,3 +1,8 @@
+maintenance.auto::
+	This boolean config option controls whether some commands run
+	`git maintenance run --auto` after doing their normal work. Defaults
+	to true.
+
 maintenance.<task>.enabled::
 	This boolean config option controls whether the maintenance task
 	with name `<task>` is run when no `--task` option is specified to
diff --git a/run-command.c b/run-command.c
index 2ee59acdc8..ea4d0fb4b1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -7,6 +7,7 @@
 #include "strbuf.h"
 #include "string-list.h"
 #include "quote.h"
+#include "config.h"
 
 void child_process_init(struct child_process *child)
 {
@@ -1868,8 +1869,13 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 
 int run_auto_maintenance(int quiet)
 {
+	int enabled;
 	struct child_process maint = CHILD_PROCESS_INIT;
 
+	if (!git_config_get_bool("maintenance.auto", &enabled) &&
+	    !enabled)
+		return 0;
+
 	maint.git_cmd = 1;
 	strvec_pushl(&maint.args, "maintenance", "run", "--auto", NULL);
 	strvec_push(&maint.args, quiet ? "--quiet" : "--no-quiet");
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 55116c2f04..c7caaa7a55 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -28,6 +28,19 @@ test_expect_success 'run [--auto|--quiet]' '
 	test_subcommand git gc --no-quiet <run-no-quiet.txt
 '
 
+test_expect_success 'maintenance.auto config option' '
+	GIT_TRACE2_EVENT="$(pwd)/default" git commit --quiet --allow-empty -m 1 &&
+	test_subcommand git maintenance run --auto --quiet <default &&
+	GIT_TRACE2_EVENT="$(pwd)/true" \
+		git -c maintenance.auto=true \
+		commit --quiet --allow-empty -m 2 &&
+	test_subcommand git maintenance run --auto --quiet  <true &&
+	GIT_TRACE2_EVENT="$(pwd)/false" \
+		git -c maintenance.auto=false \
+		commit --quiet --allow-empty -m 3 &&
+	test_subcommand ! git maintenance run --auto --quiet  <false
+'
+
 test_expect_success 'maintenance.<task>.enabled' '
 	git config maintenance.gc.enabled false &&
 	git config maintenance.commit-graph.enabled true &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v3 2/7] maintenance: add --schedule option and config
  2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  2020-10-05 12:57     ` [PATCH v3 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
@ 2020-10-05 12:57     ` Derrick Stolee via GitGitGadget
  2020-10-05 12:57     ` [PATCH v3 3/7] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
                       ` (5 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-05 12:57 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Maintenance currently triggers when certain data-size thresholds are
met, such as number of pack-files or loose objects. Users may want to
run certain maintenance tasks based on frequency instead. For example,
a user may want to perform a 'prefetch' task every hour, or 'gc' task
every day. To help these users, update the 'git maintenance run' command
to include a '--schedule=<frequency>' option. The allowed frequencies
are 'hourly', 'daily', and 'weekly'. These values are also allowed in a
new config value 'maintenance.<task>.schedule'.

The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'
config value for each enabled task to see if the configured frequency is
at least as frequent as the frequency from the '--schedule' argument. We
use the following order, for full clarity:

	'hourly' > 'daily' > 'weekly'

Use new 'enum schedule_priority' to track these values numerically.

The following cron table would run the scheduled tasks with the correct
frequencies:

  0 1-23 * * *    git -C <repo> maintenance run --schedule=hourly
  0 0    * * 1-6  git -C <repo> maintenance run --schedule=daily
  0 0    * * 0    git -C <repo> maintenance run --schedule=weekly

This cron schedule will run --schedule=hourly every hour except at
midnight. This avoids a concurrent run with the --schedule=daily that
runs at midnight every day except the first day of the week. This avoids
a concurrent run with the --schedule=weekly that runs at midnight on
the first day of the week. Since --schedule=daily also runs the
'hourly' tasks and --schedule=weekly runs the 'hourly' and 'daily'
tasks, we will still see all tasks run with the proper frequencies.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/maintenance.txt |  5 +++
 Documentation/git-maintenance.txt    | 13 +++++-
 builtin/gc.c                         | 64 ++++++++++++++++++++++++++--
 t/t7900-maintenance.sh               | 40 +++++++++++++++++
 4 files changed, 118 insertions(+), 4 deletions(-)

diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index 06db758172..70585564fa 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -10,6 +10,11 @@ maintenance.<task>.enabled::
 	`--task` option exists. By default, only `maintenance.gc.enabled`
 	is true.
 
+maintenance.<task>.schedule::
+	This config option controls whether or not the given `<task>` runs
+	during a `git maintenance run --schedule=<frequency>` command. The
+	value must be one of "hourly", "daily", or "weekly".
+
 maintenance.commit-graph.auto::
 	This integer config option controls how often the `commit-graph` task
 	should be run as part of `git maintenance run --auto`. If zero, then
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 3f5d8946b4..ed94f66e36 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -110,7 +110,18 @@ OPTIONS
 	only if certain thresholds are met. For example, the `gc` task
 	runs when the number of loose objects exceeds the number stored
 	in the `gc.auto` config setting, or when the number of pack-files
-	exceeds the `gc.autoPackLimit` config setting.
+	exceeds the `gc.autoPackLimit` config setting. Not compatible with
+	the `--schedule` option.
+
+--schedule::
+	When combined with the `run` subcommand, run maintenance tasks
+	only if certain time conditions are met, as specified by the
+	`maintenance.<task>.schedule` config value for each `<task>`.
+	This config value specifies a number of seconds since the last
+	time that task ran, according to the `maintenance.<task>.lastRun`
+	config value. The tasks that are tested are those provided by
+	the `--task=<task>` option(s) or those with
+	`maintenance.<task>.enabled` set to true.
 
 --quiet::
 	Do not report progress or other information over `stderr`.
diff --git a/builtin/gc.c b/builtin/gc.c
index 2b99596ec8..03b24ea0db 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -703,14 +703,51 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
-static const char * const builtin_maintenance_run_usage[] = {
-	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>]"),
+static const char *const builtin_maintenance_run_usage[] = {
+	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>] [--schedule]"),
 	NULL
 };
 
+enum schedule_priority {
+	SCHEDULE_NONE = 0,
+	SCHEDULE_WEEKLY = 1,
+	SCHEDULE_DAILY = 2,
+	SCHEDULE_HOURLY = 3,
+};
+
+static enum schedule_priority parse_schedule(const char *value)
+{
+	if (!value)
+		return SCHEDULE_NONE;
+	if (!strcasecmp(value, "hourly"))
+		return SCHEDULE_HOURLY;
+	if (!strcasecmp(value, "daily"))
+		return SCHEDULE_DAILY;
+	if (!strcasecmp(value, "weekly"))
+		return SCHEDULE_WEEKLY;
+	return SCHEDULE_NONE;
+}
+
+static int maintenance_opt_schedule(const struct option *opt, const char *arg,
+				    int unset)
+{
+	enum schedule_priority *priority = opt->value;
+
+	if (unset)
+		die(_("--no-schedule is not allowed"));
+
+	*priority = parse_schedule(arg);
+
+	if (!*priority)
+		die(_("unrecognized --schedule argument '%s'"), arg);
+
+	return 0;
+}
+
 struct maintenance_run_opts {
 	int auto_flag;
 	int quiet;
+	enum schedule_priority schedule;
 };
 
 /* Remember to update object flag allocation in object.h */
@@ -1158,6 +1195,8 @@ struct maintenance_task {
 	maintenance_auto_fn *auto_condition;
 	unsigned enabled:1;
 
+	enum schedule_priority schedule;
+
 	/* -1 if not selected. */
 	int selected_order;
 };
@@ -1253,6 +1292,9 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 		     !tasks[i].auto_condition()))
 			continue;
 
+		if (opts->schedule && tasks[i].schedule < opts->schedule)
+			continue;
+
 		trace2_region_enter("maintenance", tasks[i].name, r);
 		if (tasks[i].fn(opts)) {
 			error(_("task '%s' failed"), tasks[i].name);
@@ -1273,13 +1315,23 @@ static void initialize_task_config(void)
 
 	for (i = 0; i < TASK__COUNT; i++) {
 		int config_value;
+		char *config_str;
 
-		strbuf_setlen(&config_name, 0);
+		strbuf_reset(&config_name);
 		strbuf_addf(&config_name, "maintenance.%s.enabled",
 			    tasks[i].name);
 
 		if (!git_config_get_bool(config_name.buf, &config_value))
 			tasks[i].enabled = config_value;
+
+		strbuf_reset(&config_name);
+		strbuf_addf(&config_name, "maintenance.%s.schedule",
+			    tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &config_str)) {
+			tasks[i].schedule = parse_schedule(config_str);
+			free(config_str);
+		}
 	}
 
 	strbuf_release(&config_name);
@@ -1323,6 +1375,9 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
+		OPT_CALLBACK(0, "schedule", &opts.schedule, N_("frequency"),
+			     N_("run tasks based on frequency"),
+			     maintenance_opt_schedule),
 		OPT_BOOL(0, "quiet", &opts.quiet,
 			 N_("do not report progress or other information over stderr")),
 		OPT_CALLBACK_F(0, "task", NULL, N_("task"),
@@ -1343,6 +1398,9 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 			     builtin_maintenance_run_usage,
 			     PARSE_OPT_STOP_AT_NON_OPTION);
 
+	if (opts.auto_flag && opts.schedule)
+		die(_("use at most one of --auto and --schedule=<frequency>"));
+
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index c7caaa7a55..33d73cd01c 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -260,4 +260,44 @@ test_expect_success 'maintenance.incremental-repack.auto' '
 	test_subcommand git multi-pack-index write --no-progress <trace-B
 '
 
+test_expect_success '--auto and --schedule incompatible' '
+	test_must_fail git maintenance run --auto --schedule=daily 2>err &&
+	test_i18ngrep "at most one" err
+'
+
+test_expect_success 'invalid --schedule value' '
+	test_must_fail git maintenance run --schedule=annually 2>err &&
+	test_i18ngrep "unrecognized --schedule" err
+'
+
+test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
+	git config maintenance.loose-objects.enabled true &&
+	git config maintenance.loose-objects.schedule hourly &&
+	git config maintenance.commit-graph.enabled true &&
+	git config maintenance.commit-graph.schedule daily &&
+	git config maintenance.incremental-repack.enabled true &&
+	git config maintenance.incremental-repack.schedule weekly &&
+
+	GIT_TRACE2_EVENT="$(pwd)/hourly.txt" \
+		git maintenance run --schedule=hourly 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <hourly.txt &&
+	test_subcommand ! git commit-graph write --split --reachable \
+		--no-progress <hourly.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress <hourly.txt &&
+
+	GIT_TRACE2_EVENT="$(pwd)/daily.txt" \
+		git maintenance run --schedule=daily 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <daily.txt &&
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <daily.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress <daily.txt &&
+
+	GIT_TRACE2_EVENT="$(pwd)/weekly.txt" \
+		git maintenance run --schedule=weekly 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <weekly.txt &&
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <weekly.txt &&
+	test_subcommand git multi-pack-index write --no-progress <weekly.txt
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v3 3/7] for-each-repo: run subcommands on configured repos
  2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  2020-10-05 12:57     ` [PATCH v3 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
  2020-10-05 12:57     ` [PATCH v3 2/7] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
@ 2020-10-05 12:57     ` Derrick Stolee via GitGitGadget
  2020-10-05 12:57     ` [PATCH v3 4/7] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
                       ` (4 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-05 12:57 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

It can be helpful to store a list of repositories in global or system
config and then iterate Git commands on that list. Create a new builtin
that makes this process simple for experts. We will use this builtin to
run scheduled maintenance on all configured repositories in a future
change.

The test is very simple, but does highlight that the "--" argument is
optional.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 .gitignore                          |  1 +
 Documentation/git-for-each-repo.txt | 59 +++++++++++++++++++++++++++++
 Makefile                            |  1 +
 builtin.h                           |  1 +
 builtin/for-each-repo.c             | 58 ++++++++++++++++++++++++++++
 command-list.txt                    |  1 +
 git.c                               |  1 +
 t/t0068-for-each-repo.sh            | 30 +++++++++++++++
 8 files changed, 152 insertions(+)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100755 t/t0068-for-each-repo.sh

diff --git a/.gitignore b/.gitignore
index a5808fa30d..5eb2a2be71 100644
--- a/.gitignore
+++ b/.gitignore
@@ -67,6 +67,7 @@
 /git-filter-branch
 /git-fmt-merge-msg
 /git-for-each-ref
+/git-for-each-repo
 /git-format-patch
 /git-fsck
 /git-fsck-objects
diff --git a/Documentation/git-for-each-repo.txt b/Documentation/git-for-each-repo.txt
new file mode 100644
index 0000000000..94bd19da26
--- /dev/null
+++ b/Documentation/git-for-each-repo.txt
@@ -0,0 +1,59 @@
+git-for-each-repo(1)
+====================
+
+NAME
+----
+git-for-each-repo - Run a Git command on a list of repositories
+
+
+SYNOPSIS
+--------
+[verse]
+'git for-each-repo' --config=<config> [--] <arguments>
+
+
+DESCRIPTION
+-----------
+Run a Git command on a list of repositories. The arguments after the
+known options or `--` indicator are used as the arguments for the Git
+subprocess.
+
+THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
+
+For example, we could run maintenance on each of a list of repositories
+stored in a `maintenance.repo` config variable using
+
+-------------
+git for-each-repo --config=maintenance.repo maintenance run
+-------------
+
+This will run `git -C <repo> maintenance run` for each value `<repo>`
+in the multi-valued config variable `maintenance.repo`.
+
+
+OPTIONS
+-------
+--config=<config>::
+	Use the given config variable as a multi-valued list storing
+	absolute path names. Iterate on that list of paths to run
+	the given arguments.
++
+These config values are loaded from system, global, and local Git config,
+as available. If `git for-each-repo` is run in a directory that is not a
+Git repository, then only the system and global config is used.
+
+
+SUBPROCESS BEHAVIOR
+-------------------
+
+If any `git -C <repo> <arguments>` subprocess returns a non-zero exit code,
+then the `git for-each-repo` process returns that exit code without running
+more subprocesses.
+
+Each `git -C <repo> <arguments>` subprocess inherits the standard file
+descriptors `stdin`, `stdout`, and `stderr`.
+
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 65f8cfb236..7c588ff036 100644
--- a/Makefile
+++ b/Makefile
@@ -1071,6 +1071,7 @@ BUILTIN_OBJS += builtin/fetch-pack.o
 BUILTIN_OBJS += builtin/fetch.o
 BUILTIN_OBJS += builtin/fmt-merge-msg.o
 BUILTIN_OBJS += builtin/for-each-ref.o
+BUILTIN_OBJS += builtin/for-each-repo.o
 BUILTIN_OBJS += builtin/fsck.o
 BUILTIN_OBJS += builtin/gc.o
 BUILTIN_OBJS += builtin/get-tar-commit-id.o
diff --git a/builtin.h b/builtin.h
index 17c1c0ce49..ff7c6e5aa9 100644
--- a/builtin.h
+++ b/builtin.h
@@ -150,6 +150,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix);
 int cmd_fetch_pack(int argc, const char **argv, const char *prefix);
 int cmd_fmt_merge_msg(int argc, const char **argv, const char *prefix);
 int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
+int cmd_for_each_repo(int argc, const char **argv, const char *prefix);
 int cmd_format_patch(int argc, const char **argv, const char *prefix);
 int cmd_fsck(int argc, const char **argv, const char *prefix);
 int cmd_gc(int argc, const char **argv, const char *prefix);
diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
new file mode 100644
index 0000000000..5bba623ff1
--- /dev/null
+++ b/builtin/for-each-repo.c
@@ -0,0 +1,58 @@
+#include "cache.h"
+#include "config.h"
+#include "builtin.h"
+#include "parse-options.h"
+#include "run-command.h"
+#include "string-list.h"
+
+static const char * const for_each_repo_usage[] = {
+	N_("git for-each-repo --config=<config> <command-args>"),
+	NULL
+};
+
+static int run_command_on_repo(const char *path,
+			       void *cbdata)
+{
+	int i;
+	struct child_process child = CHILD_PROCESS_INIT;
+	struct strvec *args = (struct strvec *)cbdata;
+
+	child.git_cmd = 1;
+	strvec_pushl(&child.args, "-C", path, NULL);
+
+	for (i = 0; i < args->nr; i++)
+		strvec_push(&child.args, args->v[i]);
+
+	return run_command(&child);
+}
+
+int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
+{
+	static const char *config_key = NULL;
+	int i, result = 0;
+	const struct string_list *values;
+	struct strvec args = STRVEC_INIT;
+
+	const struct option options[] = {
+		OPT_STRING(0, "config", &config_key, N_("config"),
+			   N_("config key storing a list of repository paths")),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, prefix, options, for_each_repo_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+
+	if (!config_key)
+		die(_("missing --config=<config>"));
+
+	for (i = 0; i < argc; i++)
+		strvec_push(&args, argv[i]);
+
+	values = repo_config_get_value_multi(the_repository,
+					     config_key);
+
+	for (i = 0; !result && i < values->nr; i++)
+		result = run_command_on_repo(values->items[i].string, &args);
+
+	return result;
+}
diff --git a/command-list.txt b/command-list.txt
index 0e3204e7d1..581499be82 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -94,6 +94,7 @@ git-fetch-pack                          synchingrepositories
 git-filter-branch                       ancillarymanipulators
 git-fmt-merge-msg                       purehelpers
 git-for-each-ref                        plumbinginterrogators
+git-for-each-repo                       plumbinginterrogators
 git-format-patch                        mainporcelain
 git-fsck                                ancillaryinterrogators          complete
 git-gc                                  mainporcelain
diff --git a/git.c b/git.c
index 24f250d29a..1cab64b5d1 100644
--- a/git.c
+++ b/git.c
@@ -511,6 +511,7 @@ static struct cmd_struct commands[] = {
 	{ "fetch-pack", cmd_fetch_pack, RUN_SETUP | NO_PARSEOPT },
 	{ "fmt-merge-msg", cmd_fmt_merge_msg, RUN_SETUP },
 	{ "for-each-ref", cmd_for_each_ref, RUN_SETUP },
+	{ "for-each-repo", cmd_for_each_repo, RUN_SETUP_GENTLY },
 	{ "format-patch", cmd_format_patch, RUN_SETUP },
 	{ "fsck", cmd_fsck, RUN_SETUP },
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
diff --git a/t/t0068-for-each-repo.sh b/t/t0068-for-each-repo.sh
new file mode 100755
index 0000000000..136b4ec839
--- /dev/null
+++ b/t/t0068-for-each-repo.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+
+test_description='git for-each-repo builtin'
+
+. ./test-lib.sh
+
+test_expect_success 'run based on configured value' '
+	git init one &&
+	git init two &&
+	git init three &&
+	git -C two commit --allow-empty -m "DID NOT RUN" &&
+	git config run.key "$TRASH_DIRECTORY/one" &&
+	git config --add run.key "$TRASH_DIRECTORY/three" &&
+	git for-each-repo --config=run.key commit --allow-empty -m "ran" &&
+	git -C one log -1 --pretty=format:%s >message &&
+	grep ran message &&
+	git -C two log -1 --pretty=format:%s >message &&
+	! grep ran message &&
+	git -C three log -1 --pretty=format:%s >message &&
+	grep ran message &&
+	git for-each-repo --config=run.key -- commit --allow-empty -m "ran again" &&
+	git -C one log -1 --pretty=format:%s >message &&
+	grep again message &&
+	git -C two log -1 --pretty=format:%s >message &&
+	! grep again message &&
+	git -C three log -1 --pretty=format:%s >message &&
+	grep again message
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v3 4/7] maintenance: add [un]register subcommands
  2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                       ` (2 preceding siblings ...)
  2020-10-05 12:57     ` [PATCH v3 3/7] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
@ 2020-10-05 12:57     ` Derrick Stolee via GitGitGadget
  2020-10-05 12:57     ` [PATCH v3 5/7] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
                       ` (3 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-05 12:57 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In preparation for launching background maintenance from the 'git
maintenance' builtin, create register/unregister subcommands. These
commands update the new 'maintenance.repos' config option in the global
config so the background maintenance job knows which repositories to
maintain.

These commands allow users to add a repository to the background
maintenance list without disrupting the actual maintenance mechanism.

For example, a user can run 'git maintenance register' when no
background maintenance is running and it will not start the background
maintenance. A later update to start running background maintenance will
then pick up this repository automatically.

The opposite example is that a user can run 'git maintenance unregister'
to remove the current repository from background maintenance without
halting maintenance for other repositories.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 14 ++++++++
 builtin/gc.c                      | 55 ++++++++++++++++++++++++++++++-
 t/t7900-maintenance.sh            | 17 +++++++++-
 3 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index ed94f66e36..1c59fd0cb5 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -29,6 +29,15 @@ Git repository.
 SUBCOMMANDS
 -----------
 
+register::
+	Initialize Git config values so any scheduled maintenance will
+	start running on this repository. This adds the repository to the
+	`maintenance.repo` config variable in the current user's global
+	config and enables some recommended configuration values for
+	`maintenance.<task>.schedule`. The tasks that are enabled are safe
+	for running in the background without disrupting foreground
+	processes.
+
 run::
 	Run one or more maintenance tasks. If one or more `--task` options
 	are specified, then those tasks are run in that order. Otherwise,
@@ -36,6 +45,11 @@ run::
 	config options are true. By default, only `maintenance.gc.enabled`
 	is true.
 
+unregister::
+	Remove the current repository from background maintenance. This
+	only removes the repository from the configured list. It does not
+	stop the background maintenance processes from running.
+
 TASKS
 -----
 
diff --git a/builtin/gc.c b/builtin/gc.c
index 03b24ea0db..edf1d35ce5 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1407,7 +1407,56 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	return maintenance_run_tasks(&opts);
 }
 
-static const char builtin_maintenance_usage[] = N_("git maintenance run [<options>]");
+static int maintenance_register(void)
+{
+	struct child_process config_set = CHILD_PROCESS_INIT;
+	struct child_process config_get = CHILD_PROCESS_INIT;
+
+	/* There is no current repository, so skip registering it */
+	if (!the_repository || !the_repository->gitdir)
+		return 0;
+
+	config_get.git_cmd = 1;
+	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+			 NULL);
+	config_get.out = -1;
+
+	if (start_command(&config_get))
+		return error(_("failed to run 'git config'"));
+
+	/* We already have this value in our config! */
+	if (!finish_command(&config_get))
+		return 0;
+
+	config_set.git_cmd = 1;
+	strvec_pushl(&config_set.args, "config", "--add", "--global", "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+		     NULL);
+
+	return run_command(&config_set);
+}
+
+static int maintenance_unregister(void)
+{
+	struct child_process config_unset = CHILD_PROCESS_INIT;
+
+	if (!the_repository || !the_repository->gitdir)
+		return error(_("no current repository to unregister"));
+
+	config_unset.git_cmd = 1;
+	strvec_pushl(&config_unset.args, "config", "--global", "--unset",
+		     "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+		     NULL);
+
+	return run_command(&config_unset);
+}
+
+static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
 
 int cmd_maintenance(int argc, const char **argv, const char *prefix)
 {
@@ -1417,6 +1466,10 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "run"))
 		return maintenance_run(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "register"))
+		return maintenance_register();
+	if (!strcmp(argv[1], "unregister"))
+		return maintenance_unregister();
 
 	die(_("invalid subcommand: %s"), argv[1]);
 }
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 33d73cd01c..8f383d01d9 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -9,7 +9,7 @@ GIT_TEST_MULTI_PACK_INDEX=0
 
 test_expect_success 'help text' '
 	test_expect_code 129 git maintenance -h 2>err &&
-	test_i18ngrep "usage: git maintenance run" err &&
+	test_i18ngrep "usage: git maintenance <subcommand>" err &&
 	test_expect_code 128 git maintenance barf 2>err &&
 	test_i18ngrep "invalid subcommand: barf" err &&
 	test_expect_code 129 git maintenance 2>err &&
@@ -300,4 +300,19 @@ test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
 	test_subcommand git multi-pack-index write --no-progress <weekly.txt
 '
 
+test_expect_success 'register and unregister' '
+	test_when_finished git config --global --unset-all maintenance.repo &&
+	git config --global --add maintenance.repo /existing1 &&
+	git config --global --add maintenance.repo /existing2 &&
+	git config --global --get-all maintenance.repo >before &&
+	git maintenance register &&
+	git config --global --get-all maintenance.repo >actual &&
+	cp before after &&
+	pwd >>after &&
+	test_cmp after actual &&
+	git maintenance unregister &&
+	git config --global --get-all maintenance.repo >actual &&
+	test_cmp before actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v3 5/7] maintenance: add start/stop subcommands
  2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                       ` (3 preceding siblings ...)
  2020-10-05 12:57     ` [PATCH v3 4/7] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
@ 2020-10-05 12:57     ` Derrick Stolee via GitGitGadget
  2020-10-05 12:57     ` [PATCH v3 6/7] maintenance: use default schedule if not configured Derrick Stolee via GitGitGadget
                       ` (2 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-05 12:57 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add new subcommands to 'git maintenance' that start or stop background
maintenance using 'cron', when available. This integration is as simple
as I could make it, barring some implementation complications.

The schedule is laid out as follows:

  0 1-23 * * *   $cmd maintenance run --schedule=hourly
  0 0    * * 1-6 $cmd maintenance run --schedule=daily
  0 0    * * 0   $cmd maintenance run --schedule=weekly

where $cmd is a properly-qualified 'git for-each-repo' execution:

$cmd=$path/git --exec-path=$path for-each-repo --config=maintenance.repo

where $path points to the location of the Git executable running 'git
maintenance start'. This is critical for systems with multiple versions
of Git. Specifically, macOS has a system version at '/usr/bin/git' while
the version that users can install resides at '/usr/local/bin/git'
(symlinked to '/usr/local/libexec/git-core/git'). This will also use
your locally-built version if you build and run this in your development
environment without installing first.

This conditional schedule avoids having cron launch multiple 'git
for-each-repo' commands in parallel. Such parallel commands would likely
lead to the 'hourly' and 'daily' tasks competing over the object
database lock. This could lead to to some tasks never being run! Since
the --schedule=<frequency> argument will run all tasks with _at least_
the given frequency, the daily runs will also run the hourly tasks.
Similarly, the weekly runs will also run the daily and hourly tasks.

The GIT_TEST_CRONTAB environment variable is not intended for users to
edit, but instead as a way to mock the 'crontab [-l]' command. This
variable is set in test-lib.sh to avoid a future test from accidentally
running anything with the cron integration from modifying the user's
schedule. We use GIT_TEST_CRONTAB='test-tool crontab <file>' in our
tests to check how the schedule is modified in 'git maintenance
(start|stop)' commands.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  11 +++
 Makefile                          |   1 +
 builtin/gc.c                      | 124 ++++++++++++++++++++++++++++++
 t/helper/test-crontab.c           |  35 +++++++++
 t/helper/test-tool.c              |   1 +
 t/helper/test-tool.h              |   1 +
 t/t7900-maintenance.sh            |  28 +++++++
 t/test-lib.sh                     |   6 ++
 8 files changed, 207 insertions(+)
 create mode 100644 t/helper/test-crontab.c

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 1c59fd0cb5..7628a6d157 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -45,6 +45,17 @@ run::
 	config options are true. By default, only `maintenance.gc.enabled`
 	is true.
 
+start::
+	Start running maintenance on the current repository. This performs
+	the same config updates as the `register` subcommand, then updates
+	the background scheduler to run `git maintenance run --scheduled`
+	on an hourly basis.
+
+stop::
+	Halt the background maintenance schedule. The current repository
+	is not removed from the list of maintained repositories, in case
+	the background maintenance is restarted later.
+
 unregister::
 	Remove the current repository from background maintenance. This
 	only removes the repository from the configured list. It does not
diff --git a/Makefile b/Makefile
index 7c588ff036..c39b39bd7d 100644
--- a/Makefile
+++ b/Makefile
@@ -690,6 +690,7 @@ TEST_BUILTINS_OBJS += test-advise.o
 TEST_BUILTINS_OBJS += test-bloom.o
 TEST_BUILTINS_OBJS += test-chmtime.o
 TEST_BUILTINS_OBJS += test-config.o
+TEST_BUILTINS_OBJS += test-crontab.o
 TEST_BUILTINS_OBJS += test-ctype.o
 TEST_BUILTINS_OBJS += test-date.o
 TEST_BUILTINS_OBJS += test-delta.o
diff --git a/builtin/gc.c b/builtin/gc.c
index edf1d35ce5..a387f46585 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -31,6 +31,7 @@
 #include "refs.h"
 #include "remote.h"
 #include "object-store.h"
+#include "exec-cmd.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -1456,6 +1457,125 @@ static int maintenance_unregister(void)
 	return run_command(&config_unset);
 }
 
+#define BEGIN_LINE "# BEGIN GIT MAINTENANCE SCHEDULE"
+#define END_LINE "# END GIT MAINTENANCE SCHEDULE"
+
+static int update_background_schedule(int run_maintenance)
+{
+	int result = 0;
+	int in_old_region = 0;
+	struct child_process crontab_list = CHILD_PROCESS_INIT;
+	struct child_process crontab_edit = CHILD_PROCESS_INIT;
+	FILE *cron_list, *cron_in;
+	const char *crontab_name;
+	struct strbuf line = STRBUF_INIT;
+	struct lock_file lk;
+	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);
+
+	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
+		return error(_("another process is scheduling background maintenance"));
+
+	crontab_name = getenv("GIT_TEST_CRONTAB");
+	if (!crontab_name)
+		crontab_name = "crontab";
+
+	strvec_split(&crontab_list.args, crontab_name);
+	strvec_push(&crontab_list.args, "-l");
+	crontab_list.in = -1;
+	crontab_list.out = dup(lk.tempfile->fd);
+	crontab_list.git_cmd = 0;
+
+	if (start_command(&crontab_list)) {
+		result = error(_("failed to run 'crontab -l'; your system might not support 'cron'"));
+		goto cleanup;
+	}
+
+	/* Ignore exit code, as an empty crontab will return error. */
+	finish_command(&crontab_list);
+
+	/*
+	 * Read from the .lock file, filtering out the old
+	 * schedule while appending the new schedule.
+	 */
+	cron_list = fdopen(lk.tempfile->fd, "r");
+	rewind(cron_list);
+
+	strvec_split(&crontab_edit.args, crontab_name);
+	crontab_edit.in = -1;
+	crontab_edit.git_cmd = 0;
+
+	if (start_command(&crontab_edit)) {
+		result = error(_("failed to run 'crontab'; your system might not support 'cron'"));
+		goto cleanup;
+	}
+
+	cron_in = fdopen(crontab_edit.in, "w");
+	if (!cron_in) {
+		result = error(_("failed to open stdin of 'crontab'"));
+		goto done_editing;
+	}
+
+	while (!strbuf_getline_lf(&line, cron_list)) {
+		if (!in_old_region && !strcmp(line.buf, BEGIN_LINE))
+			in_old_region = 1;
+		if (in_old_region)
+			continue;
+		fprintf(cron_in, "%s\n", line.buf);
+		if (in_old_region && !strcmp(line.buf, END_LINE))
+			in_old_region = 0;
+	}
+
+	if (run_maintenance) {
+		struct strbuf line_format = STRBUF_INIT;
+		const char *exec_path = git_exec_path();
+
+		fprintf(cron_in, "%s\n", BEGIN_LINE);
+		fprintf(cron_in,
+			"# The following schedule was created by Git\n");
+		fprintf(cron_in, "# Any edits made in this region might be\n");
+		fprintf(cron_in,
+			"# replaced in the future by a Git command.\n\n");
+
+		strbuf_addf(&line_format,
+			    "%%s %%s * * %%s \"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%s\n",
+			    exec_path, exec_path);
+		fprintf(cron_in, line_format.buf, "0", "1-23", "*", "hourly");
+		fprintf(cron_in, line_format.buf, "0", "0", "1-6", "daily");
+		fprintf(cron_in, line_format.buf, "0", "0", "0", "weekly");
+		strbuf_release(&line_format);
+
+		fprintf(cron_in, "\n%s\n", END_LINE);
+	}
+
+	fflush(cron_in);
+	fclose(cron_in);
+	close(crontab_edit.in);
+
+done_editing:
+	if (finish_command(&crontab_edit)) {
+		result = error(_("'crontab' died"));
+		goto cleanup;
+	}
+	fclose(cron_list);
+
+cleanup:
+	rollback_lock_file(&lk);
+	return result;
+}
+
+static int maintenance_start(void)
+{
+	if (maintenance_register())
+		warning(_("failed to add repo to global config"));
+
+	return update_background_schedule(1);
+}
+
+static int maintenance_stop(void)
+{
+	return update_background_schedule(0);
+}
+
 static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
 
 int cmd_maintenance(int argc, const char **argv, const char *prefix)
@@ -1466,6 +1586,10 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "run"))
 		return maintenance_run(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "start"))
+		return maintenance_start();
+	if (!strcmp(argv[1], "stop"))
+		return maintenance_stop();
 	if (!strcmp(argv[1], "register"))
 		return maintenance_register();
 	if (!strcmp(argv[1], "unregister"))
diff --git a/t/helper/test-crontab.c b/t/helper/test-crontab.c
new file mode 100644
index 0000000000..e7c0137a47
--- /dev/null
+++ b/t/helper/test-crontab.c
@@ -0,0 +1,35 @@
+#include "test-tool.h"
+#include "cache.h"
+
+/*
+ * Usage: test-tool cron <file> [-l]
+ *
+ * If -l is specified, then write the contents of <file> to stdout.
+ * Otherwise, write from stdin into <file>.
+ */
+int cmd__crontab(int argc, const char **argv)
+{
+	int a;
+	FILE *from, *to;
+
+	if (argc == 3 && !strcmp(argv[2], "-l")) {
+		from = fopen(argv[1], "r");
+		if (!from)
+			return 0;
+		to = stdout;
+	} else if (argc == 2) {
+		from = stdin;
+		to = fopen(argv[1], "w");
+	} else
+		return error("unknown arguments");
+
+	while ((a = fgetc(from)) != EOF)
+		fputc(a, to);
+
+	if (argc == 3)
+		fclose(from);
+	else
+		fclose(to);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 590b2efca7..432b49d948 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -18,6 +18,7 @@ static struct test_cmd cmds[] = {
 	{ "bloom", cmd__bloom },
 	{ "chmtime", cmd__chmtime },
 	{ "config", cmd__config },
+	{ "crontab", cmd__crontab },
 	{ "ctype", cmd__ctype },
 	{ "date", cmd__date },
 	{ "delta", cmd__delta },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index ddc8e990e9..7c3281e071 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -8,6 +8,7 @@ int cmd__advise_if_enabled(int argc, const char **argv);
 int cmd__bloom(int argc, const char **argv);
 int cmd__chmtime(int argc, const char **argv);
 int cmd__config(int argc, const char **argv);
+int cmd__crontab(int argc, const char **argv);
 int cmd__ctype(int argc, const char **argv);
 int cmd__date(int argc, const char **argv);
 int cmd__delta(int argc, const char **argv);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 8f383d01d9..7715e40391 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -315,4 +315,32 @@ test_expect_success 'register and unregister' '
 	test_cmp before actual
 '
 
+test_expect_success 'start from empty cron table' '
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
+
+	# start registers the repo
+	git config --get --global maintenance.repo "$(pwd)" &&
+
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=daily" cron.txt &&
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=hourly" cron.txt &&
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=weekly" cron.txt
+'
+
+test_expect_success 'stop from existing schedule' '
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
+
+	# stop does not unregister the repo
+	git config --get --global maintenance.repo "$(pwd)" &&
+
+	# Operation is idempotent
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
+	test_must_be_empty cron.txt
+'
+
+test_expect_success 'start preserves existing schedule' '
+	echo "Important information!" >cron.txt &&
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
+	grep "Important information!" cron.txt
+'
+
 test_done
diff --git a/t/test-lib.sh b/t/test-lib.sh
index ef31f40037..4a60d1ed76 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1702,3 +1702,9 @@ test_lazy_prereq SHA1 '
 test_lazy_prereq REBASE_P '
 	test -z "$GIT_TEST_SKIP_REBASE_P"
 '
+
+# Ensure that no test accidentally triggers a Git command
+# that runs 'crontab', affecting a user's cron schedule.
+# Tests that verify the cron integration must set this locally
+# to avoid errors.
+GIT_TEST_CRONTAB="exit 1"
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v3 6/7] maintenance: use default schedule if not configured
  2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                       ` (4 preceding siblings ...)
  2020-10-05 12:57     ` [PATCH v3 5/7] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
@ 2020-10-05 12:57     ` Derrick Stolee via GitGitGadget
  2020-10-05 19:57       ` Martin Ågren
  2020-10-05 12:57     ` [PATCH v3 7/7] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
  2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  7 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-05 12:57 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git maintenance (register|start)' subcommands add the current
repository to the global Git config so maintenance will operate on that
repository. It does not specify what maintenance should occur or how
often.

If a user sets any 'maintenance.<task>.schedule' config value, then
they have chosen a specific schedule for themselves and Git should
respect that when running 'git maintenance run --schedule=<frequency>'.

To make this process extremely simple for users, assume a default
schedule when no 'maintenance.<task>.schedule' or '...enabled' config
settings are concretely set. This is only an in-process assumption, so
future versions of Git could adjust this expected schedule.

Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 15 ++++++++
 builtin/gc.c                      | 58 +++++++++++++++++++++++++++++++
 t/t7900-maintenance.sh            | 11 +++---
 3 files changed, 80 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 7628a6d157..52fff86844 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -37,6 +37,21 @@ register::
 	`maintenance.<task>.schedule`. The tasks that are enabled are safe
 	for running in the background without disrupting foreground
 	processes.
++
+If your repository has no `maintenance.<task>.schedule` configuration
+values set, then Git will use a recommended default schedule that performs
+background maintenance that will not interrupt foreground commands. The
+default schedule is as follows:
++
+* `gc`: disabled.
+* `commit-graph`: hourly.
+* `prefetch`: hourly.
+* `loose-objects`: daily.
+* `incremental-repack`: daily.
++
+`git maintenance register` will also disable foreground maintenance by
+setting `maintenance.auto = false` in the current repository. This config
+setting will remain after a `git maintenance unregister` command.
 
 run::
 	Run one or more maintenance tasks. If one or more `--task` options
diff --git a/builtin/gc.c b/builtin/gc.c
index a387f46585..965690704b 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1251,6 +1251,59 @@ static int compare_tasks_by_selection(const void *a_, const void *b_)
 	return b->selected_order - a->selected_order;
 }
 
+static int has_schedule_config(void)
+{
+	int i, found = 0;
+	struct strbuf config_name = STRBUF_INIT;
+	size_t prefix;
+
+	strbuf_addstr(&config_name, "maintenance.");
+	prefix = config_name.len;
+
+	for (i = 0; !found && i < TASK__COUNT; i++) {
+		char *value;
+
+		strbuf_setlen(&config_name, prefix);
+		strbuf_addf(&config_name, "%s.schedule", tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &value)) {
+			found = 1;
+			FREE_AND_NULL(value);
+		}
+
+		strbuf_setlen(&config_name, prefix);
+		strbuf_addf(&config_name, "%s.enabled", tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &value)) {
+			found = 1;
+			FREE_AND_NULL(value);
+		}
+	}
+
+	strbuf_release(&config_name);
+	return found;
+}
+
+static void set_recommended_schedule(void)
+{
+	if (has_schedule_config())
+		return;
+
+	tasks[TASK_GC].enabled = 0;
+
+	tasks[TASK_PREFETCH].enabled = 1;
+	tasks[TASK_PREFETCH].schedule = SCHEDULE_HOURLY;
+
+	tasks[TASK_COMMIT_GRAPH].enabled = 1;
+	tasks[TASK_COMMIT_GRAPH].schedule = SCHEDULE_HOURLY;
+
+	tasks[TASK_LOOSE_OBJECTS].enabled = 1;
+	tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY;
+
+	tasks[TASK_INCREMENTAL_REPACK].enabled = 1;
+	tasks[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY;
+}
+
 static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 {
 	int i, found_selected = 0;
@@ -1280,6 +1333,8 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 
 	if (found_selected)
 		QSORT(tasks, TASK__COUNT, compare_tasks_by_selection);
+	else if (opts->schedule != SCHEDULE_NONE)
+		set_recommended_schedule();
 
 	for (i = 0; i < TASK__COUNT; i++) {
 		if (found_selected && tasks[i].selected_order < 0)
@@ -1417,6 +1472,9 @@ static int maintenance_register(void)
 	if (!the_repository || !the_repository->gitdir)
 		return 0;
 
+	/* Disable foreground maintenance */
+	git_config_set("maintenance.auto", "false");
+
 	config_get.git_cmd = 1;
 	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
 		     the_repository->worktree ? the_repository->worktree
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 7715e40391..7154987fd2 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -305,11 +305,14 @@ test_expect_success 'register and unregister' '
 	git config --global --add maintenance.repo /existing1 &&
 	git config --global --add maintenance.repo /existing2 &&
 	git config --global --get-all maintenance.repo >before &&
+
 	git maintenance register &&
-	git config --global --get-all maintenance.repo >actual &&
-	cp before after &&
-	pwd >>after &&
-	test_cmp after actual &&
+	test_cmp_config false maintenance.auto &&
+	git config --global --get-all maintenance.repo >between &&
+	cp before expect &&
+	pwd >>expect &&
+	test_cmp expect between &&
+
 	git maintenance unregister &&
 	git config --global --get-all maintenance.repo >actual &&
 	test_cmp before actual
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v3 7/7] maintenance: add troubleshooting guide to docs
  2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                       ` (5 preceding siblings ...)
  2020-10-05 12:57     ` [PATCH v3 6/7] maintenance: use default schedule if not configured Derrick Stolee via GitGitGadget
@ 2020-10-05 12:57     ` Derrick Stolee via GitGitGadget
  2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-05 12:57 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git maintenance run' subcommand takes a lock on the object database
to prevent concurrent processes from competing for resources. This is an
important safety measure to prevent possible repository corruption and
data loss.

This feature can lead to confusing behavior if a user is not aware of
it. Add a TROUBLESHOOTING section to the 'git maintenance' builtin
documentation that discusses these tradeoffs. The short version of this
section is that Git will not corrupt your repository, but if the list of
scheduled tasks takes longer than an hour then some scheduled tasks may
be dropped due to this object database collision. For example, a
long-running "daily" task at midnight might prevent an "hourly" task
from running at 1AM.

The opposite is also possible, but less likely as long as the "hourly"
tasks are much faster than the "daily" and "weekly" tasks.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 44 +++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 52fff86844..738a4c7ebd 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -173,6 +173,50 @@ OPTIONS
 	`maintenance.<task>.enabled` configured as `true` are considered.
 	See the 'TASKS' section for the list of accepted `<task>` values.
 
+
+TROUBLESHOOTING
+---------------
+The `git maintenance` command is designed to simplify the repository
+maintenance patterns while minimizing user wait time during Git commands.
+A variety of configuration options are available to allow customizing this
+process. The default maintenance options focus on operations that complete
+quickly, even on large repositories.
+
+Users may find some cases where scheduled maintenance tasks do not run as
+frequently as intended. Each `git maintenance run` command takes a lock on
+the repository's object database, and this prevents other concurrent
+`git maintenance run` commands from running on the same repository. Without
+this safeguard, competing processes could leave the repository in an
+unpredictable state.
+
+The background maintenance schedule runs `git maintenance run` processes
+on an hourly basis. Each run executes the "hourly" tasks. At midnight,
+that process also executes the "daily" tasks. At midnight on the first day
+of the week, that process also executes the "weekly" tasks. A single
+process iterates over each registered repository, performing the scheduled
+tasks for that frequency. Depending on the number of registered
+repositories and their sizes, this process may take longer than an hour.
+In this case, multiple `git maintenance run` commands may run on the same
+repository at the same time, colliding on the object database lock. This
+results in one of the two tasks not running.
+
+If you find that some maintenance windows are taking longer than one hour
+to complete, then consider reducing the complexity of your maintenance
+tasks. For example, the `gc` task is much slower than the
+`incremental-repack` task. However, this comes at a cost of a slightly
+larger object database. Consider moving more expensive tasks to be run
+less frequently.
+
+Expert users may consider scheduling their own maintenance tasks using a
+different schedule than is available through `git maintenance start` and
+Git configuration options. These users should be aware of the object
+database lock and how concurrent `git maintenance run` commands behave.
+Further, the `git gc` command should not be combined with
+`git maintenance run` commands. `git gc` modifies the object database
+but does not take the lock in the same way as `git maintenance run`. If
+possible, use `git maintenance run --task=gc` instead of `git gc`.
+
+
 GIT
 ---
 Part of the linkgit:git[1] suite
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 6/7] maintenance: recommended schedule in register/start
  2020-10-02  1:55             ` Derrick Stolee
@ 2020-10-05 13:16               ` Đoàn Trần Công Danh
  2020-10-05 18:17                 ` Derrick Stolee
  0 siblings, 1 reply; 62+ messages in thread
From: Đoàn Trần Công Danh @ 2020-10-05 13:16 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Martin Ågren, Derrick Stolee via GitGitGadget,
	Git Mailing List, Jonathan Nieder, Jonathan Tan, sluongng,
	SZEDER Gábor, Derrick Stolee, Derrick Stolee

On 2020-10-01 21:55:40-0400, Derrick Stolee <stolee@gmail.com> wrote:
> On 10/1/2020 8:38 PM, Đoàn Trần Công Danh wrote:
> > On 2020-10-01 16:38:48-0400, Derrick Stolee <stolee@gmail.com> wrote:
> >> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> >> index 7628a6d157..52fff86844 100644
> >> --- a/Documentation/git-maintenance.txt
> >> +++ b/Documentation/git-maintenance.txt
> >> @@ -37,6 +37,21 @@ register::
> >>  	`maintenance.<task>.schedule`. The tasks that are enabled are safe
> >>  	for running in the background without disrupting foreground
> >>  	processes.
> >> ++
> >> +If your repository has no `maintenance.<task>.schedule` configuration
> >> +values set, then Git will use a recommended default schedule that performs
> >> +background maintenance that will not interrupt foreground commands. The
> >> +default schedule is as follows:
> > 
> > I don't mind about using a default schedule (but someone else might).
> > I think some distributions will be paranoia with this change and shiped
> > with disable by default in system config.
> 
> If a user wants to prevent this schedule, then they can simply change
> any one of the `.schedule` or `.enabled` configs in their --global config
> and these defaults will not be used.
> 
> Of course, perhaps you are missing the fact that "git maintenance run
> --schedule=<frequency>" is only run as a cron job if a user chose to
> start background maintenance using "git maintenance start" (or "git
> maintenance register" after running the 'start' subcommand in another
> repo). So this is _not_ starting by default without some amount of
> choosing to opt in.

Yes, I missed that fact. Sorry for the noise I generated.

Thanks,
-- Danh

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 6/7] maintenance: recommended schedule in register/start
  2020-10-05 13:16               ` Đoàn Trần Công Danh
@ 2020-10-05 18:17                 ` Derrick Stolee
  0 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee @ 2020-10-05 18:17 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: Martin Ågren, Derrick Stolee via GitGitGadget,
	Git Mailing List, Jonathan Nieder, Jonathan Tan, sluongng,
	SZEDER Gábor, Derrick Stolee, Derrick Stolee

On 10/5/2020 9:16 AM, Đoàn Trần Công Danh wrote:
> On 2020-10-01 21:55:40-0400, Derrick Stolee <stolee@gmail.com> wrote:
>> On 10/1/2020 8:38 PM, Đoàn Trần Công Danh wrote:
>>> I don't mind about using a default schedule (but someone else might).
>>> I think some distributions will be paranoia with this change and shiped
>>> with disable by default in system config.
>>
>> If a user wants to prevent this schedule, then they can simply change
>> any one of the `.schedule` or `.enabled` configs in their --global config
>> and these defaults will not be used.
>>
>> Of course, perhaps you are missing the fact that "git maintenance run
>> --schedule=<frequency>" is only run as a cron job if a user chose to
>> start background maintenance using "git maintenance start" (or "git
>> maintenance register" after running the 'start' subcommand in another
>> repo). So this is _not_ starting by default without some amount of
>> choosing to opt in.
> 
> Yes, I missed that fact. Sorry for the noise I generated.

Thanks for confirming. And thank you for the interest in the
feature! It's my fault for not including the context properly.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v3 6/7] maintenance: use default schedule if not configured
  2020-10-05 12:57     ` [PATCH v3 6/7] maintenance: use default schedule if not configured Derrick Stolee via GitGitGadget
@ 2020-10-05 19:57       ` Martin Ågren
  2020-10-08 13:32         ` Derrick Stolee
  0 siblings, 1 reply; 62+ messages in thread
From: Martin Ågren @ 2020-10-05 19:57 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, jrnieder, jonathantanmy, sluongng, congdanhqx,
	SZEDER Gábor, Derrick Stolee, Derrick Stolee, Derrick Stolee

Hi Stolee,

On Mon, 5 Oct 2020 at 15:07, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
> The 'git maintenance (register|start)' subcommands add the current
> repository to the global Git config so maintenance will operate on that
> repository. It does not specify what maintenance should occur or how
> often.

I see that you posted a "how about this?" some days ago. I was offline
for the weekend with some margin on both sides, so I didn't see it until
now. Good that you just went ahead and posted the whole series anyway.

> If a user sets any 'maintenance.<task>.schedule' config value, then
> they have chosen a specific schedule for themselves and Git should
> respect that when running 'git maintenance run --schedule=<frequency>'.
>
> To make this process extremely simple for users, assume a default
> schedule when no 'maintenance.<task>.schedule' or '...enabled' config
> settings are concretely set. This is only an in-process assumption, so
> future versions of Git could adjust this expected schedule.

This obviously makes sense to me. ;-) One thing it does mean though is
that something like this:

  $ git maintenance register
  # Time goes by...
  # Someone said to try this:
  $ git config --add maintenance.commit-graph.schedule hourly
  $ git config --add maintenance.commit-graph.enable true
  # That could have been a no-op, since we were already on
  # such an hourly schedule, but it will effectively turn off
  # all other scheduled tasks. So some time later:
  # -- Why are my fetches so slow all of a sudden? :-(

That could be different if `git maintenance register` would turn on,
say, `maintenance.baseSchedule = standard` where setting those
`maintenance.commit-graph.*` would tweak that "standard" "base
schedule" (in a no-op way as it happens).

> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -37,6 +37,21 @@ register::

Adding some more context manually:

register::
	Initialize Git config values so any scheduled maintenance will
	start running on this repository. This adds the repository to the
	`maintenance.repo` config variable in the current user's global
	config and enables some recommended configuration values for
> 	`maintenance.<task>.schedule`. The tasks that are enabled are safe
> 	for running in the background without disrupting foreground
> 	processes.

The part about "and enables some recommended configuration values"
should probably be in this patch, not an earlier one, and maybe it
shouldn't even be here. With the new approach of this version, this
doesn't really enable some recommended configuration values. Or maybe
it does, I can't make up my mind, nor can I come up with an alternative
formulation.

> ++
> +If your repository has no `maintenance.<task>.schedule` configuration
> +values set, then Git will use a recommended default schedule that performs
> +background maintenance that will not interrupt foreground commands. The
> +default schedule is as follows:
> ++

If you add a line of "--" here...

> +* `gc`: disabled.
> +* `commit-graph`: hourly.
> +* `prefetch`: hourly.
> +* `loose-objects`: daily.
> +* `incremental-repack`: daily.

... and one here, you'll drop some indentation at this point so that
the next paragraph doesn't align with the list above. (See patch below.)

> ++
> +`git maintenance register` will also disable foreground maintenance by
> +setting `maintenance.auto = false` in the current repository. This config
> +setting will remain after a `git maintenance unregister` command.

That last paragraph does belong here. The part about the different
tasks ... maybe. With this new approach of not actually setting any
`schedule`/`enabled` configuration variables, that list doesn't
obviously have its natural home here. Maybe under `--schedule`, which is
where the detection actually happens and the default defaults are
imposed? Or maybe in a separate "CONFIGURATION" section. It could
include config/maintenance.txt, then go on to define the whole fallback
mechanism without having to worry about breaking the reader's flow. (The
way it is now, this `register` command is fairly heavy compared to the
surrounding parts.)

> +static int has_schedule_config(void)
> +{
> +       int i, found = 0;
> +       struct strbuf config_name = STRBUF_INIT;
> +       size_t prefix;
> +
> +       strbuf_addstr(&config_name, "maintenance.");
> +       prefix = config_name.len;
> +
> +       for (i = 0; !found && i < TASK__COUNT; i++) {
> +               char *value;
> +
> +               strbuf_setlen(&config_name, prefix);
> +               strbuf_addf(&config_name, "%s.schedule", tasks[i].name);
> +
> +               if (!git_config_get_string(config_name.buf, &value)) {
> +                       found = 1;
> +                       FREE_AND_NULL(value);
> +               }
> +
> +               strbuf_setlen(&config_name, prefix);
> +               strbuf_addf(&config_name, "%s.enabled", tasks[i].name);
> +
> +               if (!git_config_get_string(config_name.buf, &value)) {
> +                       found = 1;
> +                       FREE_AND_NULL(value);
> +               }
> +       }
> +
> +       strbuf_release(&config_name);
> +       return found;
> +}

I had the same reaction to `FREE_AND_NULL()` as on my previous reading.
If you have $reasons for doing it this way, not a big deal. I offer a
suggestion in patch form below anyway. Feel free to squash, adapt or
ignore as you see fit.

> +
> +static void set_recommended_schedule(void)
> +{
> +       if (has_schedule_config())
> +               return;
> +
> +       tasks[TASK_GC].enabled = 0;
> +
> +       tasks[TASK_PREFETCH].enabled = 1;
> +       tasks[TASK_PREFETCH].schedule = SCHEDULE_HOURLY;
> +
> +       tasks[TASK_COMMIT_GRAPH].enabled = 1;
> +       tasks[TASK_COMMIT_GRAPH].schedule = SCHEDULE_HOURLY;
> +
> +       tasks[TASK_LOOSE_OBJECTS].enabled = 1;
> +       tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY;
> +
> +       tasks[TASK_INCREMENTAL_REPACK].enabled = 1;
> +       tasks[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY;
> +}
> +

One thing I can't make up my mind about is how these `enabled` are used
for two purposes: Deciding what to do on `git maintenance run` without
any `--task`, and deciding what to do on `git maintenance run
--scheduled`.

>  static int maintenance_run_tasks(struct maintenance_run_opts *opts)
>  {
>         int i, found_selected = 0;
> @@ -1280,6 +1333,8 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
>
>         if (found_selected)
>                 QSORT(tasks, TASK__COUNT, compare_tasks_by_selection);
> +       else if (opts->schedule != SCHEDULE_NONE)
> +               set_recommended_schedule();

... And especially how we only impose the magic
`maintenance.<task>.enabled` values when we are running with
`--schedule`. So the answer to "what is the default value of
`maintenance.commit-graph.enabled`?" is "it depends on several factors".

Sort of related: The presence of a `maintenance.<task>.schedule` is not
sufficient to schedule the task. This looks like something that one
could easily trip on. Maybe you have already considered letting a zero
value for `maintenance.<task>.schedule` mean "disabled" and ignoring the
`enabled` config item for the scheduled runs, but rejected that for good
reasons?

> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 7715e40391..7154987fd2 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -305,11 +305,14 @@ test_expect_success 'register and unregister' '
>         git config --global --add maintenance.repo /existing1 &&
>         git config --global --add maintenance.repo /existing2 &&
>         git config --global --get-all maintenance.repo >before &&
> +
>         git maintenance register &&
> -       git config --global --get-all maintenance.repo >actual &&
> -       cp before after &&
> -       pwd >>after &&
> -       test_cmp after actual &&
> +       test_cmp_config false maintenance.auto &&
> +       git config --global --get-all maintenance.repo >between &&
> +       cp before expect &&
> +       pwd >>expect &&
> +       test_cmp expect between &&
> +
>         git maintenance unregister &&
>         git config --global --get-all maintenance.repo >actual &&
>         test_cmp before actual

This tests the one-time config tweaking. But we don't test any of the
"detect no config and impose a default" logic. Neither that it kicks in
at all, nor that it doesn't when it shouldn't.

As mentioned above, I end with some minor suggestions.

Martin

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 738a4c7ebd..2085b53dc5 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -43,11 +43,13 @@ values set, then Git will use a recommended default schedule that performs
 background maintenance that will not interrupt foreground commands. The
 default schedule is as follows:
 +
+--
 * `gc`: disabled.
 * `commit-graph`: hourly.
 * `prefetch`: hourly.
 * `loose-objects`: daily.
 * `incremental-repack`: daily.
+--
 +
 `git maintenance register` will also disable foreground maintenance by
 setting `maintenance.auto = false` in the current repository. This config
diff --git a/builtin/gc.c b/builtin/gc.c
index 965690704b..63f4c102b1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1253,35 +1253,31 @@ static int compare_tasks_by_selection(const void *a_, const void *b_)
 
 static int has_schedule_config(void)
 {
-	int i, found = 0;
+	int i;
 	struct strbuf config_name = STRBUF_INIT;
 	size_t prefix;
+	char *value = NULL;
 
 	strbuf_addstr(&config_name, "maintenance.");
 	prefix = config_name.len;
 
-	for (i = 0; !found && i < TASK__COUNT; i++) {
-		char *value;
-
+	for (i = 0; i < TASK__COUNT; i++) {
 		strbuf_setlen(&config_name, prefix);
 		strbuf_addf(&config_name, "%s.schedule", tasks[i].name);
 
-		if (!git_config_get_string(config_name.buf, &value)) {
-			found = 1;
-			FREE_AND_NULL(value);
-		}
+		if (!git_config_get_string(config_name.buf, &value))
+			break;
 
 		strbuf_setlen(&config_name, prefix);
 		strbuf_addf(&config_name, "%s.enabled", tasks[i].name);
 
-		if (!git_config_get_string(config_name.buf, &value)) {
-			found = 1;
-			FREE_AND_NULL(value);
-		}
+		if (!git_config_get_string(config_name.buf, &value))
+			break;
 	}
 
 	strbuf_release(&config_name);
-	return found;
+	free(value);
+	return i < TASK__COUNT;
 }
 
 static void set_recommended_schedule(void)
-- 
2.28.0.297.g1956fa8f8d


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v3 6/7] maintenance: use default schedule if not configured
  2020-10-05 19:57       ` Martin Ågren
@ 2020-10-08 13:32         ` Derrick Stolee
  0 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee @ 2020-10-08 13:32 UTC (permalink / raw)
  To: Martin Ågren, Derrick Stolee via GitGitGadget
  Cc: git, jrnieder, jonathantanmy, sluongng, congdanhqx,
	SZEDER Gábor, Derrick Stolee, Derrick Stolee

On 10/5/2020 3:57 PM, Martin Ågren wrote:
> On Mon, 5 Oct 2020 at 15:07, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
>> To make this process extremely simple for users, assume a default
>> schedule when no 'maintenance.<task>.schedule' or '...enabled' config
>> settings are concretely set. This is only an in-process assumption, so
>> future versions of Git could adjust this expected schedule.
> 
> This obviously makes sense to me. ;-) One thing it does mean though is
> that something like this:
> 
>   $ git maintenance register
>   # Time goes by...
>   # Someone said to try this:
>   $ git config --add maintenance.commit-graph.schedule hourly
>   $ git config --add maintenance.commit-graph.enable true
>   # That could have been a no-op, since we were already on
>   # such an hourly schedule, but it will effectively turn off
>   # all other scheduled tasks. So some time later:
>   # -- Why are my fetches so slow all of a sudden? :-(
> 
> That could be different if `git maintenance register` would turn on,
> say, `maintenance.baseSchedule = standard` where setting those
> `maintenance.commit-graph.*` would tweak that "standard" "base
> schedule" (in a no-op way as it happens).

Thank you so much for your detailed feedback! This is an excellent
point and I will be sure to account for it when I have time to
carefully examine the options and these kind of workflows.

Right now, I'm a bit underwater getting ready for the v2.29.0
release (in microsoft/git, Scalar, and VFS for Git) but I will
revisit this as my main focus after this release cycle. I have
not forgotten about this topic!!!

> As mentioned above, I end with some minor suggestions.

I really appreciate the effort you put in to this fixup.
Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v4 0/8] Maintenance III: Background maintenance
  2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                       ` (6 preceding siblings ...)
  2020-10-05 12:57     ` [PATCH v3 7/7] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
@ 2020-10-15 17:21     ` Derrick Stolee via GitGitGadget
  2020-10-15 17:21       ` [PATCH v4 1/8] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
                         ` (7 more replies)
  7 siblings, 8 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-15 17:21 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Martin Ågren, Derrick Stolee

This is based on ds/maintenance-part-2 and replaces the RFC from [1].

[1] 
https://lore.kernel.org/git/pull.680.v3.git.1598629517.gitgitgadget@gmail.com/

This series introduces background maintenance to Git, through an integration
with cron and crontab.

Some preliminary work is done to allow a new --schedule option that tells
the command which tasks to run based on a maintenance.<task>.schedule config
option. The timing is not enforced by Git, but instead is expected to be
provided as a hint from a cron schedule. The options are "hourly", "daily",
and "weekly".

A new for-each-repo builtin runs Git commands on every repo in a given list.
Currently, the list is stored as a config setting, allowing a new 
maintenance.repos config list to store the repositories registered for
background maintenance. Others may want to add a --file=<file> option for
their own workflows, but I focused on making this as simple as possible for
now.

The updates to the git maintenance builtin include new register/unregister 
subcommands and start/stop subcommands. The register subcommand initializes
the config while the start subcommand does everything register does plus 
update the cron table. The unregister and stop commands reverse this
process.

A troubleshooting guide is added to Documentation/git-maintenance.txt to
advise expert users who choose to create custom cron schedules.

A new config option "maintenance.strategy" allows users to pick from one of
a potentially-expanding number of recommended schedules. Currently, the only
one baked-in is "incremental" although the documentation specifies that
"none" will prevent any tasks from running by default. Using that value
prevents the config from being overridden in "git maintenance
(start|register)". Users can augment the incremental strategy by assigning
specific "maintenance..schedule" config options.

I've been testing this scenario on my macOS laptop and Linux desktop. I have
modified my cron task to provide logging via trace2 so I can see what's
happening. A future direction here would be to add some maintenance logs to
the repository so we can track what is happening and diagnose whether the
maintenance strategy is working on real repos.

Note: git maintenance (start|stop) only works on machines with cron by
design. The proper thing to do on Windows will come later. Perhaps this
command should be marked as unavailable on Windows somehow, or at least a
better error than "cron may not be available on your system". I did find
that that message is helpful sometimes: macOS worker agents for CI builds
typically do not have cron available.

Updates in v4:

 * Thanks, Martin for pointing out a usability concern with how I was
   previously assigning a default schedule.
 * The new logic is to create a new "maintenance.strategy" config option,
   which can be explicitly disabled or augmented.
 * This approach should be a suitable blend of the two previous options, and
   gives us flexibility to adjust the schedule or add more strategies in the
   future!

Updates in v3:

 * Instead of writing config upon "register" or "start", simply create an
   in-memory default schedule when no .schedule or .enabled configs are
   present. Thanks, Martin! This causes patch 6 to look so different that
   the range-diff considers it a dropped-and-added patch instead of showing
   a diff.
 * There are some context lines that changed because this is rebased onto a
   recent version of ds/maintenance-part-2.

Updates in v2:

 * Fixed the char/int issue in test-tool crontab, and a typo.
 * Updated commit message and patch noise in PATCH 2
 * This should fix the test failures, allowing this to be picked up in
   'seen'.

Derrick Stolee (8):
  maintenance: optionally skip --auto process
  maintenance: add --schedule option and config
  for-each-repo: run subcommands on configured repos
  maintenance: add [un]register subcommands
  maintenance: add start/stop subcommands
  maintenance: create maintenance.strategy config
  maintenance: use 'incremental' strategy by default
  maintenance: add troubleshooting guide to docs

 .gitignore                           |   1 +
 Documentation/config/maintenance.txt |  25 +++
 Documentation/git-for-each-repo.txt  |  59 ++++++
 Documentation/git-maintenance.txt    |  99 +++++++++-
 Makefile                             |   2 +
 builtin.h                            |   1 +
 builtin/for-each-repo.c              |  58 ++++++
 builtin/gc.c                         | 281 ++++++++++++++++++++++++++-
 command-list.txt                     |   1 +
 git.c                                |   1 +
 run-command.c                        |   6 +
 t/helper/test-crontab.c              |  35 ++++
 t/helper/test-tool.c                 |   1 +
 t/helper/test-tool.h                 |   1 +
 t/t0068-for-each-repo.sh             |  30 +++
 t/t7900-maintenance.sh               | 159 ++++++++++++++-
 t/test-lib.sh                        |   6 +
 17 files changed, 758 insertions(+), 8 deletions(-)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100644 t/helper/test-crontab.c
 create mode 100755 t/t0068-for-each-repo.sh


base-commit: e841a79a131d8ce491cf04d0ca3e24f139a10b82
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-724%2Fderrickstolee%2Fmaintenance%2Fscheduled-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-724/derrickstolee/maintenance/scheduled-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/724

Range-diff vs v3:

 1:  02e7286dba = 1:  02e7286dba maintenance: optionally skip --auto process
 2:  dae8c04bb5 = 2:  dae8c04bb5 maintenance: add --schedule option and config
 3:  dd92379273 = 3:  dd92379273 for-each-repo: run subcommands on configured repos
 4:  922b984c8a = 4:  922b984c8a maintenance: add [un]register subcommands
 5:  5194f6b1fa = 5:  5194f6b1fa maintenance: add start/stop subcommands
 -:  ---------- > 6:  d696848b37 maintenance: create maintenance.strategy config
 6:  d833fffe89 ! 7:  145c63ed8c maintenance: use default schedule if not configured
     @@ Metadata
      Author: Derrick Stolee <dstolee@microsoft.com>
      
       ## Commit message ##
     -    maintenance: use default schedule if not configured
     +    maintenance: use 'incremental' strategy by default
      
          The 'git maintenance (register|start)' subcommands add the current
          repository to the global Git config so maintenance will operate on that
          repository. It does not specify what maintenance should occur or how
          often.
      
     -    If a user sets any 'maintenance.<task>.schedule' config value, then
     -    they have chosen a specific schedule for themselves and Git should
     -    respect that when running 'git maintenance run --schedule=<frequency>'.
     -
     -    To make this process extremely simple for users, assume a default
     -    schedule when no 'maintenance.<task>.schedule' or '...enabled' config
     -    settings are concretely set. This is only an in-process assumption, so
     -    future versions of Git could adjust this expected schedule.
     +    To make it simple for users to start background maintenance with a
     +    recommended schedlue, update the 'maintenance.strategy' config option in
     +    both the 'register' and 'start' subcommands. This allows users to
     +    customize beyond the defaults using individual
     +    'maintenance.<task>.schedule' options, but also the user can opt-out of
     +    this strategy using 'maintenance.strategy=none'.
      
          Helped-by: Martin Ågren <martin.agren@gmail.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
     @@ Documentation/git-maintenance.txt: register::
       	for running in the background without disrupting foreground
       	processes.
      ++
     -+If your repository has no `maintenance.<task>.schedule` configuration
     -+values set, then Git will use a recommended default schedule that performs
     -+background maintenance that will not interrupt foreground commands. The
     -+default schedule is as follows:
     ++The `register` subcomand will also set the `maintenance.strategy` config
     ++value to `incremental`, if this value is not previously set. The
     ++`incremental` strategy uses the following schedule for each maintenance
     ++task:
      ++
     ++--
      +* `gc`: disabled.
      +* `commit-graph`: hourly.
      +* `prefetch`: hourly.
      +* `loose-objects`: daily.
      +* `incremental-repack`: daily.
     ++--
      ++
      +`git maintenance register` will also disable foreground maintenance by
      +setting `maintenance.auto = false` in the current repository. This config
     @@ Documentation/git-maintenance.txt: register::
       	Run one or more maintenance tasks. If one or more `--task` options
      
       ## builtin/gc.c ##
     -@@ builtin/gc.c: static int compare_tasks_by_selection(const void *a_, const void *b_)
     - 	return b->selected_order - a->selected_order;
     - }
     +@@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char *prefix)
       
     -+static int has_schedule_config(void)
     -+{
     -+	int i, found = 0;
     -+	struct strbuf config_name = STRBUF_INIT;
     -+	size_t prefix;
     -+
     -+	strbuf_addstr(&config_name, "maintenance.");
     -+	prefix = config_name.len;
     -+
     -+	for (i = 0; !found && i < TASK__COUNT; i++) {
     -+		char *value;
     -+
     -+		strbuf_setlen(&config_name, prefix);
     -+		strbuf_addf(&config_name, "%s.schedule", tasks[i].name);
     -+
     -+		if (!git_config_get_string(config_name.buf, &value)) {
     -+			found = 1;
     -+			FREE_AND_NULL(value);
     -+		}
     -+
     -+		strbuf_setlen(&config_name, prefix);
     -+		strbuf_addf(&config_name, "%s.enabled", tasks[i].name);
     -+
     -+		if (!git_config_get_string(config_name.buf, &value)) {
     -+			found = 1;
     -+			FREE_AND_NULL(value);
     -+		}
     -+	}
     -+
     -+	strbuf_release(&config_name);
     -+	return found;
     -+}
     -+
     -+static void set_recommended_schedule(void)
     -+{
     -+	if (has_schedule_config())
     -+		return;
     -+
     -+	tasks[TASK_GC].enabled = 0;
     -+
     -+	tasks[TASK_PREFETCH].enabled = 1;
     -+	tasks[TASK_PREFETCH].schedule = SCHEDULE_HOURLY;
     -+
     -+	tasks[TASK_COMMIT_GRAPH].enabled = 1;
     -+	tasks[TASK_COMMIT_GRAPH].schedule = SCHEDULE_HOURLY;
     -+
     -+	tasks[TASK_LOOSE_OBJECTS].enabled = 1;
     -+	tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY;
     -+
     -+	tasks[TASK_INCREMENTAL_REPACK].enabled = 1;
     -+	tasks[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY;
     -+}
     -+
     - static int maintenance_run_tasks(struct maintenance_run_opts *opts)
     + static int maintenance_register(void)
       {
     - 	int i, found_selected = 0;
     -@@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
     - 
     - 	if (found_selected)
     - 		QSORT(tasks, TASK__COUNT, compare_tasks_by_selection);
     -+	else if (opts->schedule != SCHEDULE_NONE)
     -+		set_recommended_schedule();
     ++	char *config_value;
     + 	struct child_process config_set = CHILD_PROCESS_INIT;
     + 	struct child_process config_get = CHILD_PROCESS_INIT;
       
     - 	for (i = 0; i < TASK__COUNT; i++) {
     - 		if (found_selected && tasks[i].selected_order < 0)
      @@ builtin/gc.c: static int maintenance_register(void)
       	if (!the_repository || !the_repository->gitdir)
       		return 0;
       
      +	/* Disable foreground maintenance */
      +	git_config_set("maintenance.auto", "false");
     ++
     ++	/* Set maintenance strategy, if unset */
     ++	if (!git_config_get_string("maintenance.strategy", &config_value))
     ++		free(config_value);
     ++	else
     ++		git_config_set("maintenance.strategy", "incremental");
      +
       	config_get.git_cmd = 1;
       	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
     @@ t/t7900-maintenance.sh: test_expect_success 'register and unregister' '
       	git maintenance unregister &&
       	git config --global --get-all maintenance.repo >actual &&
       	test_cmp before actual
     +@@ t/t7900-maintenance.sh: test_expect_success 'start preserves existing schedule' '
     + 	grep "Important information!" cron.txt
     + '
     + 
     ++test_expect_success 'register preserves existing strategy' '
     ++	git config maintenance.strategy none &&
     ++	git maintenance register &&
     ++	test_config maintenance.strategy none &&
     ++	git config --unset maintenance.strategy &&
     ++	git maintenance register &&
     ++	test_config maintenance.strategy incremental
     ++'
     ++
     + test_done
 7:  8e42ff44ce = 8:  ce0ced705f maintenance: add troubleshooting guide to docs

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v4 1/8] maintenance: optionally skip --auto process
  2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
@ 2020-10-15 17:21       ` Derrick Stolee via GitGitGadget
  2020-10-15 17:21       ` [PATCH v4 2/8] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
                         ` (6 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-15 17:21 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Martin Ågren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Some commands run 'git maintenance run --auto --[no-]quiet' after doing
their normal work, as a way to keep repositories clean as they are used.
Currently, users who do not want this maintenance to occur would set the
'gc.auto' config option to 0 to avoid the 'gc' task from running.
However, this does not stop the extra process invocation. On Windows,
this extra process invocation can be more expensive than necessary.

Allow users to drop this extra process by setting 'maintenance.auto' to
'false'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/maintenance.txt |  5 +++++
 run-command.c                        |  6 ++++++
 t/t7900-maintenance.sh               | 13 +++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index a0706d8f09..06db758172 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -1,3 +1,8 @@
+maintenance.auto::
+	This boolean config option controls whether some commands run
+	`git maintenance run --auto` after doing their normal work. Defaults
+	to true.
+
 maintenance.<task>.enabled::
 	This boolean config option controls whether the maintenance task
 	with name `<task>` is run when no `--task` option is specified to
diff --git a/run-command.c b/run-command.c
index 2ee59acdc8..ea4d0fb4b1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -7,6 +7,7 @@
 #include "strbuf.h"
 #include "string-list.h"
 #include "quote.h"
+#include "config.h"
 
 void child_process_init(struct child_process *child)
 {
@@ -1868,8 +1869,13 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 
 int run_auto_maintenance(int quiet)
 {
+	int enabled;
 	struct child_process maint = CHILD_PROCESS_INIT;
 
+	if (!git_config_get_bool("maintenance.auto", &enabled) &&
+	    !enabled)
+		return 0;
+
 	maint.git_cmd = 1;
 	strvec_pushl(&maint.args, "maintenance", "run", "--auto", NULL);
 	strvec_push(&maint.args, quiet ? "--quiet" : "--no-quiet");
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 55116c2f04..c7caaa7a55 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -28,6 +28,19 @@ test_expect_success 'run [--auto|--quiet]' '
 	test_subcommand git gc --no-quiet <run-no-quiet.txt
 '
 
+test_expect_success 'maintenance.auto config option' '
+	GIT_TRACE2_EVENT="$(pwd)/default" git commit --quiet --allow-empty -m 1 &&
+	test_subcommand git maintenance run --auto --quiet <default &&
+	GIT_TRACE2_EVENT="$(pwd)/true" \
+		git -c maintenance.auto=true \
+		commit --quiet --allow-empty -m 2 &&
+	test_subcommand git maintenance run --auto --quiet  <true &&
+	GIT_TRACE2_EVENT="$(pwd)/false" \
+		git -c maintenance.auto=false \
+		commit --quiet --allow-empty -m 3 &&
+	test_subcommand ! git maintenance run --auto --quiet  <false
+'
+
 test_expect_success 'maintenance.<task>.enabled' '
 	git config maintenance.gc.enabled false &&
 	git config maintenance.commit-graph.enabled true &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 2/8] maintenance: add --schedule option and config
  2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  2020-10-15 17:21       ` [PATCH v4 1/8] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
@ 2020-10-15 17:21       ` Derrick Stolee via GitGitGadget
  2021-02-09 14:06         ` Ævar Arnfjörð Bjarmason
  2020-10-15 17:21       ` [PATCH v4 3/8] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
                         ` (5 subsequent siblings)
  7 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-15 17:21 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Martin Ågren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Maintenance currently triggers when certain data-size thresholds are
met, such as number of pack-files or loose objects. Users may want to
run certain maintenance tasks based on frequency instead. For example,
a user may want to perform a 'prefetch' task every hour, or 'gc' task
every day. To help these users, update the 'git maintenance run' command
to include a '--schedule=<frequency>' option. The allowed frequencies
are 'hourly', 'daily', and 'weekly'. These values are also allowed in a
new config value 'maintenance.<task>.schedule'.

The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'
config value for each enabled task to see if the configured frequency is
at least as frequent as the frequency from the '--schedule' argument. We
use the following order, for full clarity:

	'hourly' > 'daily' > 'weekly'

Use new 'enum schedule_priority' to track these values numerically.

The following cron table would run the scheduled tasks with the correct
frequencies:

  0 1-23 * * *    git -C <repo> maintenance run --schedule=hourly
  0 0    * * 1-6  git -C <repo> maintenance run --schedule=daily
  0 0    * * 0    git -C <repo> maintenance run --schedule=weekly

This cron schedule will run --schedule=hourly every hour except at
midnight. This avoids a concurrent run with the --schedule=daily that
runs at midnight every day except the first day of the week. This avoids
a concurrent run with the --schedule=weekly that runs at midnight on
the first day of the week. Since --schedule=daily also runs the
'hourly' tasks and --schedule=weekly runs the 'hourly' and 'daily'
tasks, we will still see all tasks run with the proper frequencies.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/maintenance.txt |  5 +++
 Documentation/git-maintenance.txt    | 13 +++++-
 builtin/gc.c                         | 64 ++++++++++++++++++++++++++--
 t/t7900-maintenance.sh               | 40 +++++++++++++++++
 4 files changed, 118 insertions(+), 4 deletions(-)

diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index 06db758172..70585564fa 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -10,6 +10,11 @@ maintenance.<task>.enabled::
 	`--task` option exists. By default, only `maintenance.gc.enabled`
 	is true.
 
+maintenance.<task>.schedule::
+	This config option controls whether or not the given `<task>` runs
+	during a `git maintenance run --schedule=<frequency>` command. The
+	value must be one of "hourly", "daily", or "weekly".
+
 maintenance.commit-graph.auto::
 	This integer config option controls how often the `commit-graph` task
 	should be run as part of `git maintenance run --auto`. If zero, then
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 3f5d8946b4..ed94f66e36 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -110,7 +110,18 @@ OPTIONS
 	only if certain thresholds are met. For example, the `gc` task
 	runs when the number of loose objects exceeds the number stored
 	in the `gc.auto` config setting, or when the number of pack-files
-	exceeds the `gc.autoPackLimit` config setting.
+	exceeds the `gc.autoPackLimit` config setting. Not compatible with
+	the `--schedule` option.
+
+--schedule::
+	When combined with the `run` subcommand, run maintenance tasks
+	only if certain time conditions are met, as specified by the
+	`maintenance.<task>.schedule` config value for each `<task>`.
+	This config value specifies a number of seconds since the last
+	time that task ran, according to the `maintenance.<task>.lastRun`
+	config value. The tasks that are tested are those provided by
+	the `--task=<task>` option(s) or those with
+	`maintenance.<task>.enabled` set to true.
 
 --quiet::
 	Do not report progress or other information over `stderr`.
diff --git a/builtin/gc.c b/builtin/gc.c
index 2b99596ec8..03b24ea0db 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -703,14 +703,51 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
-static const char * const builtin_maintenance_run_usage[] = {
-	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>]"),
+static const char *const builtin_maintenance_run_usage[] = {
+	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>] [--schedule]"),
 	NULL
 };
 
+enum schedule_priority {
+	SCHEDULE_NONE = 0,
+	SCHEDULE_WEEKLY = 1,
+	SCHEDULE_DAILY = 2,
+	SCHEDULE_HOURLY = 3,
+};
+
+static enum schedule_priority parse_schedule(const char *value)
+{
+	if (!value)
+		return SCHEDULE_NONE;
+	if (!strcasecmp(value, "hourly"))
+		return SCHEDULE_HOURLY;
+	if (!strcasecmp(value, "daily"))
+		return SCHEDULE_DAILY;
+	if (!strcasecmp(value, "weekly"))
+		return SCHEDULE_WEEKLY;
+	return SCHEDULE_NONE;
+}
+
+static int maintenance_opt_schedule(const struct option *opt, const char *arg,
+				    int unset)
+{
+	enum schedule_priority *priority = opt->value;
+
+	if (unset)
+		die(_("--no-schedule is not allowed"));
+
+	*priority = parse_schedule(arg);
+
+	if (!*priority)
+		die(_("unrecognized --schedule argument '%s'"), arg);
+
+	return 0;
+}
+
 struct maintenance_run_opts {
 	int auto_flag;
 	int quiet;
+	enum schedule_priority schedule;
 };
 
 /* Remember to update object flag allocation in object.h */
@@ -1158,6 +1195,8 @@ struct maintenance_task {
 	maintenance_auto_fn *auto_condition;
 	unsigned enabled:1;
 
+	enum schedule_priority schedule;
+
 	/* -1 if not selected. */
 	int selected_order;
 };
@@ -1253,6 +1292,9 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 		     !tasks[i].auto_condition()))
 			continue;
 
+		if (opts->schedule && tasks[i].schedule < opts->schedule)
+			continue;
+
 		trace2_region_enter("maintenance", tasks[i].name, r);
 		if (tasks[i].fn(opts)) {
 			error(_("task '%s' failed"), tasks[i].name);
@@ -1273,13 +1315,23 @@ static void initialize_task_config(void)
 
 	for (i = 0; i < TASK__COUNT; i++) {
 		int config_value;
+		char *config_str;
 
-		strbuf_setlen(&config_name, 0);
+		strbuf_reset(&config_name);
 		strbuf_addf(&config_name, "maintenance.%s.enabled",
 			    tasks[i].name);
 
 		if (!git_config_get_bool(config_name.buf, &config_value))
 			tasks[i].enabled = config_value;
+
+		strbuf_reset(&config_name);
+		strbuf_addf(&config_name, "maintenance.%s.schedule",
+			    tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &config_str)) {
+			tasks[i].schedule = parse_schedule(config_str);
+			free(config_str);
+		}
 	}
 
 	strbuf_release(&config_name);
@@ -1323,6 +1375,9 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
+		OPT_CALLBACK(0, "schedule", &opts.schedule, N_("frequency"),
+			     N_("run tasks based on frequency"),
+			     maintenance_opt_schedule),
 		OPT_BOOL(0, "quiet", &opts.quiet,
 			 N_("do not report progress or other information over stderr")),
 		OPT_CALLBACK_F(0, "task", NULL, N_("task"),
@@ -1343,6 +1398,9 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 			     builtin_maintenance_run_usage,
 			     PARSE_OPT_STOP_AT_NON_OPTION);
 
+	if (opts.auto_flag && opts.schedule)
+		die(_("use at most one of --auto and --schedule=<frequency>"));
+
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index c7caaa7a55..33d73cd01c 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -260,4 +260,44 @@ test_expect_success 'maintenance.incremental-repack.auto' '
 	test_subcommand git multi-pack-index write --no-progress <trace-B
 '
 
+test_expect_success '--auto and --schedule incompatible' '
+	test_must_fail git maintenance run --auto --schedule=daily 2>err &&
+	test_i18ngrep "at most one" err
+'
+
+test_expect_success 'invalid --schedule value' '
+	test_must_fail git maintenance run --schedule=annually 2>err &&
+	test_i18ngrep "unrecognized --schedule" err
+'
+
+test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
+	git config maintenance.loose-objects.enabled true &&
+	git config maintenance.loose-objects.schedule hourly &&
+	git config maintenance.commit-graph.enabled true &&
+	git config maintenance.commit-graph.schedule daily &&
+	git config maintenance.incremental-repack.enabled true &&
+	git config maintenance.incremental-repack.schedule weekly &&
+
+	GIT_TRACE2_EVENT="$(pwd)/hourly.txt" \
+		git maintenance run --schedule=hourly 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <hourly.txt &&
+	test_subcommand ! git commit-graph write --split --reachable \
+		--no-progress <hourly.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress <hourly.txt &&
+
+	GIT_TRACE2_EVENT="$(pwd)/daily.txt" \
+		git maintenance run --schedule=daily 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <daily.txt &&
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <daily.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress <daily.txt &&
+
+	GIT_TRACE2_EVENT="$(pwd)/weekly.txt" \
+		git maintenance run --schedule=weekly 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <weekly.txt &&
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <weekly.txt &&
+	test_subcommand git multi-pack-index write --no-progress <weekly.txt
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 3/8] for-each-repo: run subcommands on configured repos
  2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
  2020-10-15 17:21       ` [PATCH v4 1/8] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
  2020-10-15 17:21       ` [PATCH v4 2/8] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
@ 2020-10-15 17:21       ` Derrick Stolee via GitGitGadget
  2021-05-03 16:10         ` Andrzej Hunt
  2020-10-15 17:22       ` [PATCH v4 4/8] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
                         ` (4 subsequent siblings)
  7 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-15 17:21 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Martin Ågren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

It can be helpful to store a list of repositories in global or system
config and then iterate Git commands on that list. Create a new builtin
that makes this process simple for experts. We will use this builtin to
run scheduled maintenance on all configured repositories in a future
change.

The test is very simple, but does highlight that the "--" argument is
optional.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 .gitignore                          |  1 +
 Documentation/git-for-each-repo.txt | 59 +++++++++++++++++++++++++++++
 Makefile                            |  1 +
 builtin.h                           |  1 +
 builtin/for-each-repo.c             | 58 ++++++++++++++++++++++++++++
 command-list.txt                    |  1 +
 git.c                               |  1 +
 t/t0068-for-each-repo.sh            | 30 +++++++++++++++
 8 files changed, 152 insertions(+)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100755 t/t0068-for-each-repo.sh

diff --git a/.gitignore b/.gitignore
index a5808fa30d..5eb2a2be71 100644
--- a/.gitignore
+++ b/.gitignore
@@ -67,6 +67,7 @@
 /git-filter-branch
 /git-fmt-merge-msg
 /git-for-each-ref
+/git-for-each-repo
 /git-format-patch
 /git-fsck
 /git-fsck-objects
diff --git a/Documentation/git-for-each-repo.txt b/Documentation/git-for-each-repo.txt
new file mode 100644
index 0000000000..94bd19da26
--- /dev/null
+++ b/Documentation/git-for-each-repo.txt
@@ -0,0 +1,59 @@
+git-for-each-repo(1)
+====================
+
+NAME
+----
+git-for-each-repo - Run a Git command on a list of repositories
+
+
+SYNOPSIS
+--------
+[verse]
+'git for-each-repo' --config=<config> [--] <arguments>
+
+
+DESCRIPTION
+-----------
+Run a Git command on a list of repositories. The arguments after the
+known options or `--` indicator are used as the arguments for the Git
+subprocess.
+
+THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
+
+For example, we could run maintenance on each of a list of repositories
+stored in a `maintenance.repo` config variable using
+
+-------------
+git for-each-repo --config=maintenance.repo maintenance run
+-------------
+
+This will run `git -C <repo> maintenance run` for each value `<repo>`
+in the multi-valued config variable `maintenance.repo`.
+
+
+OPTIONS
+-------
+--config=<config>::
+	Use the given config variable as a multi-valued list storing
+	absolute path names. Iterate on that list of paths to run
+	the given arguments.
++
+These config values are loaded from system, global, and local Git config,
+as available. If `git for-each-repo` is run in a directory that is not a
+Git repository, then only the system and global config is used.
+
+
+SUBPROCESS BEHAVIOR
+-------------------
+
+If any `git -C <repo> <arguments>` subprocess returns a non-zero exit code,
+then the `git for-each-repo` process returns that exit code without running
+more subprocesses.
+
+Each `git -C <repo> <arguments>` subprocess inherits the standard file
+descriptors `stdin`, `stdout`, and `stderr`.
+
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 65f8cfb236..7c588ff036 100644
--- a/Makefile
+++ b/Makefile
@@ -1071,6 +1071,7 @@ BUILTIN_OBJS += builtin/fetch-pack.o
 BUILTIN_OBJS += builtin/fetch.o
 BUILTIN_OBJS += builtin/fmt-merge-msg.o
 BUILTIN_OBJS += builtin/for-each-ref.o
+BUILTIN_OBJS += builtin/for-each-repo.o
 BUILTIN_OBJS += builtin/fsck.o
 BUILTIN_OBJS += builtin/gc.o
 BUILTIN_OBJS += builtin/get-tar-commit-id.o
diff --git a/builtin.h b/builtin.h
index 17c1c0ce49..ff7c6e5aa9 100644
--- a/builtin.h
+++ b/builtin.h
@@ -150,6 +150,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix);
 int cmd_fetch_pack(int argc, const char **argv, const char *prefix);
 int cmd_fmt_merge_msg(int argc, const char **argv, const char *prefix);
 int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
+int cmd_for_each_repo(int argc, const char **argv, const char *prefix);
 int cmd_format_patch(int argc, const char **argv, const char *prefix);
 int cmd_fsck(int argc, const char **argv, const char *prefix);
 int cmd_gc(int argc, const char **argv, const char *prefix);
diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
new file mode 100644
index 0000000000..5bba623ff1
--- /dev/null
+++ b/builtin/for-each-repo.c
@@ -0,0 +1,58 @@
+#include "cache.h"
+#include "config.h"
+#include "builtin.h"
+#include "parse-options.h"
+#include "run-command.h"
+#include "string-list.h"
+
+static const char * const for_each_repo_usage[] = {
+	N_("git for-each-repo --config=<config> <command-args>"),
+	NULL
+};
+
+static int run_command_on_repo(const char *path,
+			       void *cbdata)
+{
+	int i;
+	struct child_process child = CHILD_PROCESS_INIT;
+	struct strvec *args = (struct strvec *)cbdata;
+
+	child.git_cmd = 1;
+	strvec_pushl(&child.args, "-C", path, NULL);
+
+	for (i = 0; i < args->nr; i++)
+		strvec_push(&child.args, args->v[i]);
+
+	return run_command(&child);
+}
+
+int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
+{
+	static const char *config_key = NULL;
+	int i, result = 0;
+	const struct string_list *values;
+	struct strvec args = STRVEC_INIT;
+
+	const struct option options[] = {
+		OPT_STRING(0, "config", &config_key, N_("config"),
+			   N_("config key storing a list of repository paths")),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, prefix, options, for_each_repo_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+
+	if (!config_key)
+		die(_("missing --config=<config>"));
+
+	for (i = 0; i < argc; i++)
+		strvec_push(&args, argv[i]);
+
+	values = repo_config_get_value_multi(the_repository,
+					     config_key);
+
+	for (i = 0; !result && i < values->nr; i++)
+		result = run_command_on_repo(values->items[i].string, &args);
+
+	return result;
+}
diff --git a/command-list.txt b/command-list.txt
index 0e3204e7d1..581499be82 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -94,6 +94,7 @@ git-fetch-pack                          synchingrepositories
 git-filter-branch                       ancillarymanipulators
 git-fmt-merge-msg                       purehelpers
 git-for-each-ref                        plumbinginterrogators
+git-for-each-repo                       plumbinginterrogators
 git-format-patch                        mainporcelain
 git-fsck                                ancillaryinterrogators          complete
 git-gc                                  mainporcelain
diff --git a/git.c b/git.c
index 24f250d29a..1cab64b5d1 100644
--- a/git.c
+++ b/git.c
@@ -511,6 +511,7 @@ static struct cmd_struct commands[] = {
 	{ "fetch-pack", cmd_fetch_pack, RUN_SETUP | NO_PARSEOPT },
 	{ "fmt-merge-msg", cmd_fmt_merge_msg, RUN_SETUP },
 	{ "for-each-ref", cmd_for_each_ref, RUN_SETUP },
+	{ "for-each-repo", cmd_for_each_repo, RUN_SETUP_GENTLY },
 	{ "format-patch", cmd_format_patch, RUN_SETUP },
 	{ "fsck", cmd_fsck, RUN_SETUP },
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
diff --git a/t/t0068-for-each-repo.sh b/t/t0068-for-each-repo.sh
new file mode 100755
index 0000000000..136b4ec839
--- /dev/null
+++ b/t/t0068-for-each-repo.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+
+test_description='git for-each-repo builtin'
+
+. ./test-lib.sh
+
+test_expect_success 'run based on configured value' '
+	git init one &&
+	git init two &&
+	git init three &&
+	git -C two commit --allow-empty -m "DID NOT RUN" &&
+	git config run.key "$TRASH_DIRECTORY/one" &&
+	git config --add run.key "$TRASH_DIRECTORY/three" &&
+	git for-each-repo --config=run.key commit --allow-empty -m "ran" &&
+	git -C one log -1 --pretty=format:%s >message &&
+	grep ran message &&
+	git -C two log -1 --pretty=format:%s >message &&
+	! grep ran message &&
+	git -C three log -1 --pretty=format:%s >message &&
+	grep ran message &&
+	git for-each-repo --config=run.key -- commit --allow-empty -m "ran again" &&
+	git -C one log -1 --pretty=format:%s >message &&
+	grep again message &&
+	git -C two log -1 --pretty=format:%s >message &&
+	! grep again message &&
+	git -C three log -1 --pretty=format:%s >message &&
+	grep again message
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 4/8] maintenance: add [un]register subcommands
  2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                         ` (2 preceding siblings ...)
  2020-10-15 17:21       ` [PATCH v4 3/8] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
@ 2020-10-15 17:22       ` Derrick Stolee via GitGitGadget
  2020-10-15 17:22       ` [PATCH v4 5/8] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
                         ` (3 subsequent siblings)
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-15 17:22 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Martin Ågren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In preparation for launching background maintenance from the 'git
maintenance' builtin, create register/unregister subcommands. These
commands update the new 'maintenance.repos' config option in the global
config so the background maintenance job knows which repositories to
maintain.

These commands allow users to add a repository to the background
maintenance list without disrupting the actual maintenance mechanism.

For example, a user can run 'git maintenance register' when no
background maintenance is running and it will not start the background
maintenance. A later update to start running background maintenance will
then pick up this repository automatically.

The opposite example is that a user can run 'git maintenance unregister'
to remove the current repository from background maintenance without
halting maintenance for other repositories.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 14 ++++++++
 builtin/gc.c                      | 55 ++++++++++++++++++++++++++++++-
 t/t7900-maintenance.sh            | 17 +++++++++-
 3 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index ed94f66e36..1c59fd0cb5 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -29,6 +29,15 @@ Git repository.
 SUBCOMMANDS
 -----------
 
+register::
+	Initialize Git config values so any scheduled maintenance will
+	start running on this repository. This adds the repository to the
+	`maintenance.repo` config variable in the current user's global
+	config and enables some recommended configuration values for
+	`maintenance.<task>.schedule`. The tasks that are enabled are safe
+	for running in the background without disrupting foreground
+	processes.
+
 run::
 	Run one or more maintenance tasks. If one or more `--task` options
 	are specified, then those tasks are run in that order. Otherwise,
@@ -36,6 +45,11 @@ run::
 	config options are true. By default, only `maintenance.gc.enabled`
 	is true.
 
+unregister::
+	Remove the current repository from background maintenance. This
+	only removes the repository from the configured list. It does not
+	stop the background maintenance processes from running.
+
 TASKS
 -----
 
diff --git a/builtin/gc.c b/builtin/gc.c
index 03b24ea0db..edf1d35ce5 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1407,7 +1407,56 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	return maintenance_run_tasks(&opts);
 }
 
-static const char builtin_maintenance_usage[] = N_("git maintenance run [<options>]");
+static int maintenance_register(void)
+{
+	struct child_process config_set = CHILD_PROCESS_INIT;
+	struct child_process config_get = CHILD_PROCESS_INIT;
+
+	/* There is no current repository, so skip registering it */
+	if (!the_repository || !the_repository->gitdir)
+		return 0;
+
+	config_get.git_cmd = 1;
+	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+			 NULL);
+	config_get.out = -1;
+
+	if (start_command(&config_get))
+		return error(_("failed to run 'git config'"));
+
+	/* We already have this value in our config! */
+	if (!finish_command(&config_get))
+		return 0;
+
+	config_set.git_cmd = 1;
+	strvec_pushl(&config_set.args, "config", "--add", "--global", "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+		     NULL);
+
+	return run_command(&config_set);
+}
+
+static int maintenance_unregister(void)
+{
+	struct child_process config_unset = CHILD_PROCESS_INIT;
+
+	if (!the_repository || !the_repository->gitdir)
+		return error(_("no current repository to unregister"));
+
+	config_unset.git_cmd = 1;
+	strvec_pushl(&config_unset.args, "config", "--global", "--unset",
+		     "maintenance.repo",
+		     the_repository->worktree ? the_repository->worktree
+					      : the_repository->gitdir,
+		     NULL);
+
+	return run_command(&config_unset);
+}
+
+static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
 
 int cmd_maintenance(int argc, const char **argv, const char *prefix)
 {
@@ -1417,6 +1466,10 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "run"))
 		return maintenance_run(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "register"))
+		return maintenance_register();
+	if (!strcmp(argv[1], "unregister"))
+		return maintenance_unregister();
 
 	die(_("invalid subcommand: %s"), argv[1]);
 }
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 33d73cd01c..8f383d01d9 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -9,7 +9,7 @@ GIT_TEST_MULTI_PACK_INDEX=0
 
 test_expect_success 'help text' '
 	test_expect_code 129 git maintenance -h 2>err &&
-	test_i18ngrep "usage: git maintenance run" err &&
+	test_i18ngrep "usage: git maintenance <subcommand>" err &&
 	test_expect_code 128 git maintenance barf 2>err &&
 	test_i18ngrep "invalid subcommand: barf" err &&
 	test_expect_code 129 git maintenance 2>err &&
@@ -300,4 +300,19 @@ test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
 	test_subcommand git multi-pack-index write --no-progress <weekly.txt
 '
 
+test_expect_success 'register and unregister' '
+	test_when_finished git config --global --unset-all maintenance.repo &&
+	git config --global --add maintenance.repo /existing1 &&
+	git config --global --add maintenance.repo /existing2 &&
+	git config --global --get-all maintenance.repo >before &&
+	git maintenance register &&
+	git config --global --get-all maintenance.repo >actual &&
+	cp before after &&
+	pwd >>after &&
+	test_cmp after actual &&
+	git maintenance unregister &&
+	git config --global --get-all maintenance.repo >actual &&
+	test_cmp before actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 5/8] maintenance: add start/stop subcommands
  2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                         ` (3 preceding siblings ...)
  2020-10-15 17:22       ` [PATCH v4 4/8] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
@ 2020-10-15 17:22       ` Derrick Stolee via GitGitGadget
  2020-12-09 18:51         ` Josh Steadmon
  2020-10-15 17:22       ` [PATCH v4 6/8] maintenance: create maintenance.strategy config Derrick Stolee via GitGitGadget
                         ` (2 subsequent siblings)
  7 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-15 17:22 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Martin Ågren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add new subcommands to 'git maintenance' that start or stop background
maintenance using 'cron', when available. This integration is as simple
as I could make it, barring some implementation complications.

The schedule is laid out as follows:

  0 1-23 * * *   $cmd maintenance run --schedule=hourly
  0 0    * * 1-6 $cmd maintenance run --schedule=daily
  0 0    * * 0   $cmd maintenance run --schedule=weekly

where $cmd is a properly-qualified 'git for-each-repo' execution:

$cmd=$path/git --exec-path=$path for-each-repo --config=maintenance.repo

where $path points to the location of the Git executable running 'git
maintenance start'. This is critical for systems with multiple versions
of Git. Specifically, macOS has a system version at '/usr/bin/git' while
the version that users can install resides at '/usr/local/bin/git'
(symlinked to '/usr/local/libexec/git-core/git'). This will also use
your locally-built version if you build and run this in your development
environment without installing first.

This conditional schedule avoids having cron launch multiple 'git
for-each-repo' commands in parallel. Such parallel commands would likely
lead to the 'hourly' and 'daily' tasks competing over the object
database lock. This could lead to to some tasks never being run! Since
the --schedule=<frequency> argument will run all tasks with _at least_
the given frequency, the daily runs will also run the hourly tasks.
Similarly, the weekly runs will also run the daily and hourly tasks.

The GIT_TEST_CRONTAB environment variable is not intended for users to
edit, but instead as a way to mock the 'crontab [-l]' command. This
variable is set in test-lib.sh to avoid a future test from accidentally
running anything with the cron integration from modifying the user's
schedule. We use GIT_TEST_CRONTAB='test-tool crontab <file>' in our
tests to check how the schedule is modified in 'git maintenance
(start|stop)' commands.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  11 +++
 Makefile                          |   1 +
 builtin/gc.c                      | 124 ++++++++++++++++++++++++++++++
 t/helper/test-crontab.c           |  35 +++++++++
 t/helper/test-tool.c              |   1 +
 t/helper/test-tool.h              |   1 +
 t/t7900-maintenance.sh            |  28 +++++++
 t/test-lib.sh                     |   6 ++
 8 files changed, 207 insertions(+)
 create mode 100644 t/helper/test-crontab.c

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 1c59fd0cb5..7628a6d157 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -45,6 +45,17 @@ run::
 	config options are true. By default, only `maintenance.gc.enabled`
 	is true.
 
+start::
+	Start running maintenance on the current repository. This performs
+	the same config updates as the `register` subcommand, then updates
+	the background scheduler to run `git maintenance run --scheduled`
+	on an hourly basis.
+
+stop::
+	Halt the background maintenance schedule. The current repository
+	is not removed from the list of maintained repositories, in case
+	the background maintenance is restarted later.
+
 unregister::
 	Remove the current repository from background maintenance. This
 	only removes the repository from the configured list. It does not
diff --git a/Makefile b/Makefile
index 7c588ff036..c39b39bd7d 100644
--- a/Makefile
+++ b/Makefile
@@ -690,6 +690,7 @@ TEST_BUILTINS_OBJS += test-advise.o
 TEST_BUILTINS_OBJS += test-bloom.o
 TEST_BUILTINS_OBJS += test-chmtime.o
 TEST_BUILTINS_OBJS += test-config.o
+TEST_BUILTINS_OBJS += test-crontab.o
 TEST_BUILTINS_OBJS += test-ctype.o
 TEST_BUILTINS_OBJS += test-date.o
 TEST_BUILTINS_OBJS += test-delta.o
diff --git a/builtin/gc.c b/builtin/gc.c
index edf1d35ce5..a387f46585 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -31,6 +31,7 @@
 #include "refs.h"
 #include "remote.h"
 #include "object-store.h"
+#include "exec-cmd.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -1456,6 +1457,125 @@ static int maintenance_unregister(void)
 	return run_command(&config_unset);
 }
 
+#define BEGIN_LINE "# BEGIN GIT MAINTENANCE SCHEDULE"
+#define END_LINE "# END GIT MAINTENANCE SCHEDULE"
+
+static int update_background_schedule(int run_maintenance)
+{
+	int result = 0;
+	int in_old_region = 0;
+	struct child_process crontab_list = CHILD_PROCESS_INIT;
+	struct child_process crontab_edit = CHILD_PROCESS_INIT;
+	FILE *cron_list, *cron_in;
+	const char *crontab_name;
+	struct strbuf line = STRBUF_INIT;
+	struct lock_file lk;
+	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);
+
+	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
+		return error(_("another process is scheduling background maintenance"));
+
+	crontab_name = getenv("GIT_TEST_CRONTAB");
+	if (!crontab_name)
+		crontab_name = "crontab";
+
+	strvec_split(&crontab_list.args, crontab_name);
+	strvec_push(&crontab_list.args, "-l");
+	crontab_list.in = -1;
+	crontab_list.out = dup(lk.tempfile->fd);
+	crontab_list.git_cmd = 0;
+
+	if (start_command(&crontab_list)) {
+		result = error(_("failed to run 'crontab -l'; your system might not support 'cron'"));
+		goto cleanup;
+	}
+
+	/* Ignore exit code, as an empty crontab will return error. */
+	finish_command(&crontab_list);
+
+	/*
+	 * Read from the .lock file, filtering out the old
+	 * schedule while appending the new schedule.
+	 */
+	cron_list = fdopen(lk.tempfile->fd, "r");
+	rewind(cron_list);
+
+	strvec_split(&crontab_edit.args, crontab_name);
+	crontab_edit.in = -1;
+	crontab_edit.git_cmd = 0;
+
+	if (start_command(&crontab_edit)) {
+		result = error(_("failed to run 'crontab'; your system might not support 'cron'"));
+		goto cleanup;
+	}
+
+	cron_in = fdopen(crontab_edit.in, "w");
+	if (!cron_in) {
+		result = error(_("failed to open stdin of 'crontab'"));
+		goto done_editing;
+	}
+
+	while (!strbuf_getline_lf(&line, cron_list)) {
+		if (!in_old_region && !strcmp(line.buf, BEGIN_LINE))
+			in_old_region = 1;
+		if (in_old_region)
+			continue;
+		fprintf(cron_in, "%s\n", line.buf);
+		if (in_old_region && !strcmp(line.buf, END_LINE))
+			in_old_region = 0;
+	}
+
+	if (run_maintenance) {
+		struct strbuf line_format = STRBUF_INIT;
+		const char *exec_path = git_exec_path();
+
+		fprintf(cron_in, "%s\n", BEGIN_LINE);
+		fprintf(cron_in,
+			"# The following schedule was created by Git\n");
+		fprintf(cron_in, "# Any edits made in this region might be\n");
+		fprintf(cron_in,
+			"# replaced in the future by a Git command.\n\n");
+
+		strbuf_addf(&line_format,
+			    "%%s %%s * * %%s \"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%s\n",
+			    exec_path, exec_path);
+		fprintf(cron_in, line_format.buf, "0", "1-23", "*", "hourly");
+		fprintf(cron_in, line_format.buf, "0", "0", "1-6", "daily");
+		fprintf(cron_in, line_format.buf, "0", "0", "0", "weekly");
+		strbuf_release(&line_format);
+
+		fprintf(cron_in, "\n%s\n", END_LINE);
+	}
+
+	fflush(cron_in);
+	fclose(cron_in);
+	close(crontab_edit.in);
+
+done_editing:
+	if (finish_command(&crontab_edit)) {
+		result = error(_("'crontab' died"));
+		goto cleanup;
+	}
+	fclose(cron_list);
+
+cleanup:
+	rollback_lock_file(&lk);
+	return result;
+}
+
+static int maintenance_start(void)
+{
+	if (maintenance_register())
+		warning(_("failed to add repo to global config"));
+
+	return update_background_schedule(1);
+}
+
+static int maintenance_stop(void)
+{
+	return update_background_schedule(0);
+}
+
 static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
 
 int cmd_maintenance(int argc, const char **argv, const char *prefix)
@@ -1466,6 +1586,10 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "run"))
 		return maintenance_run(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "start"))
+		return maintenance_start();
+	if (!strcmp(argv[1], "stop"))
+		return maintenance_stop();
 	if (!strcmp(argv[1], "register"))
 		return maintenance_register();
 	if (!strcmp(argv[1], "unregister"))
diff --git a/t/helper/test-crontab.c b/t/helper/test-crontab.c
new file mode 100644
index 0000000000..e7c0137a47
--- /dev/null
+++ b/t/helper/test-crontab.c
@@ -0,0 +1,35 @@
+#include "test-tool.h"
+#include "cache.h"
+
+/*
+ * Usage: test-tool cron <file> [-l]
+ *
+ * If -l is specified, then write the contents of <file> to stdout.
+ * Otherwise, write from stdin into <file>.
+ */
+int cmd__crontab(int argc, const char **argv)
+{
+	int a;
+	FILE *from, *to;
+
+	if (argc == 3 && !strcmp(argv[2], "-l")) {
+		from = fopen(argv[1], "r");
+		if (!from)
+			return 0;
+		to = stdout;
+	} else if (argc == 2) {
+		from = stdin;
+		to = fopen(argv[1], "w");
+	} else
+		return error("unknown arguments");
+
+	while ((a = fgetc(from)) != EOF)
+		fputc(a, to);
+
+	if (argc == 3)
+		fclose(from);
+	else
+		fclose(to);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 590b2efca7..432b49d948 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -18,6 +18,7 @@ static struct test_cmd cmds[] = {
 	{ "bloom", cmd__bloom },
 	{ "chmtime", cmd__chmtime },
 	{ "config", cmd__config },
+	{ "crontab", cmd__crontab },
 	{ "ctype", cmd__ctype },
 	{ "date", cmd__date },
 	{ "delta", cmd__delta },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index ddc8e990e9..7c3281e071 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -8,6 +8,7 @@ int cmd__advise_if_enabled(int argc, const char **argv);
 int cmd__bloom(int argc, const char **argv);
 int cmd__chmtime(int argc, const char **argv);
 int cmd__config(int argc, const char **argv);
+int cmd__crontab(int argc, const char **argv);
 int cmd__ctype(int argc, const char **argv);
 int cmd__date(int argc, const char **argv);
 int cmd__delta(int argc, const char **argv);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 8f383d01d9..7715e40391 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -315,4 +315,32 @@ test_expect_success 'register and unregister' '
 	test_cmp before actual
 '
 
+test_expect_success 'start from empty cron table' '
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
+
+	# start registers the repo
+	git config --get --global maintenance.repo "$(pwd)" &&
+
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=daily" cron.txt &&
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=hourly" cron.txt &&
+	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=weekly" cron.txt
+'
+
+test_expect_success 'stop from existing schedule' '
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
+
+	# stop does not unregister the repo
+	git config --get --global maintenance.repo "$(pwd)" &&
+
+	# Operation is idempotent
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
+	test_must_be_empty cron.txt
+'
+
+test_expect_success 'start preserves existing schedule' '
+	echo "Important information!" >cron.txt &&
+	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
+	grep "Important information!" cron.txt
+'
+
 test_done
diff --git a/t/test-lib.sh b/t/test-lib.sh
index ef31f40037..4a60d1ed76 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1702,3 +1702,9 @@ test_lazy_prereq SHA1 '
 test_lazy_prereq REBASE_P '
 	test -z "$GIT_TEST_SKIP_REBASE_P"
 '
+
+# Ensure that no test accidentally triggers a Git command
+# that runs 'crontab', affecting a user's cron schedule.
+# Tests that verify the cron integration must set this locally
+# to avoid errors.
+GIT_TEST_CRONTAB="exit 1"
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 6/8] maintenance: create maintenance.strategy config
  2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                         ` (4 preceding siblings ...)
  2020-10-15 17:22       ` [PATCH v4 5/8] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
@ 2020-10-15 17:22       ` Derrick Stolee via GitGitGadget
  2020-10-15 17:22       ` [PATCH v4 7/8] maintenance: use 'incremental' strategy by default Derrick Stolee via GitGitGadget
  2020-10-15 17:22       ` [PATCH v4 8/8] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-15 17:22 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Martin Ågren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

To provide an on-ramp for users to use background maintenance without
several 'git config' commands, create a 'maintenance.strategy' config
option. Currently, the only important value is 'incremental' which
assigns the following schedule:

* gc: never
* prefetch: hourly
* commit-graph: hourly
* loose-objects: daily
* incremental-repack: daily

These tasks are chosen to minimize disruptions to foreground Git
commands and use few compute resources.

The 'maintenance.strategy' is intended as a baseline that can be
customzied further by manually assigning 'maintenance.<task>.enabled'
and 'maintenance.<task>.schedule' config options, which will override
any recommendation from 'maintenance.strategy'. This operates similarly
to config options like 'feature.experimental' which operate as "meta"
config options that change default config values.

This presents a way forward for updating the 'incremental' strategy in
the future or adding new strategies. For example, a potential strategy
could be to include a 'full' strategy that runs the 'gc' task weekly
and no other tasks by default.

Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/maintenance.txt | 15 +++++++++
 builtin/gc.c                         | 28 ++++++++++++++--
 t/t7900-maintenance.sh               | 49 ++++++++++++++++++++++++++++
 3 files changed, 90 insertions(+), 2 deletions(-)

diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index 70585564fa..a5ead09e4b 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -3,6 +3,21 @@ maintenance.auto::
 	`git maintenance run --auto` after doing their normal work. Defaults
 	to true.
 
+maintenance.strategy::
+	This string config option provides a way to specify one of a few
+	recommended schedules for background maintenance. This only affects
+	which tasks are run during `git maintenance run --schedule=X`
+	commands, provided no `--task=<task>` arguments are provided.
+	Further, if a `maintenance.<task>.schedule` config value is set,
+	then that value is used instead of the one provided by
+	`maintenance.strategy`. The possible strategy strings are:
++
+* `none`: This default setting implies no task are run at any schedule.
+* `incremental`: This setting optimizes for performing small maintenance
+  activities that do not delete any data. This does not schedule the `gc`
+  task, but runs the `prefetch` and `commit-graph` tasks hourly and the
+  `loose-objects` and `incremental-repack` tasks daily.
+
 maintenance.<task>.enabled::
 	This boolean config option controls whether the maintenance task
 	with name `<task>` is run when no `--task` option is specified to
diff --git a/builtin/gc.c b/builtin/gc.c
index a387f46585..a8248e7a45 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1308,12 +1308,35 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 	return result;
 }
 
-static void initialize_task_config(void)
+static void initialize_maintenance_strategy(void)
+{
+	char *config_str;
+
+	if (git_config_get_string("maintenance.strategy", &config_str))
+		return;
+
+	if (!strcasecmp(config_str, "incremental")) {
+		tasks[TASK_GC].schedule = SCHEDULE_NONE;
+		tasks[TASK_COMMIT_GRAPH].enabled = 1;
+		tasks[TASK_COMMIT_GRAPH].schedule = SCHEDULE_HOURLY;
+		tasks[TASK_PREFETCH].enabled = 1;
+		tasks[TASK_PREFETCH].schedule = SCHEDULE_HOURLY;
+		tasks[TASK_INCREMENTAL_REPACK].enabled = 1;
+		tasks[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY;
+		tasks[TASK_LOOSE_OBJECTS].enabled = 1;
+		tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY;
+	}
+}
+
+static void initialize_task_config(int schedule)
 {
 	int i;
 	struct strbuf config_name = STRBUF_INIT;
 	gc_config();
 
+	if (schedule)
+		initialize_maintenance_strategy();
+
 	for (i = 0; i < TASK__COUNT; i++) {
 		int config_value;
 		char *config_str;
@@ -1389,7 +1412,6 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	memset(&opts, 0, sizeof(opts));
 
 	opts.quiet = !isatty(2);
-	initialize_task_config();
 
 	for (i = 0; i < TASK__COUNT; i++)
 		tasks[i].selected_order = -1;
@@ -1402,6 +1424,8 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	if (opts.auto_flag && opts.schedule)
 		die(_("use at most one of --auto and --schedule=<frequency>"));
 
+	initialize_task_config(opts.schedule);
+
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 7715e40391..7440a0ea19 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -300,6 +300,55 @@ test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
 	test_subcommand git multi-pack-index write --no-progress <weekly.txt
 '
 
+test_expect_success 'maintenance.strategy inheritance' '
+	for task in commit-graph loose-objects incremental-repack
+	do
+		git config --unset maintenance.$task.schedule || return 1
+	done &&
+
+	test_when_finished git config --unset maintenance.strategy &&
+	git config maintenance.strategy incremental &&
+
+	GIT_TRACE2_EVENT="$(pwd)/incremental-hourly.txt" \
+		git maintenance run --schedule=hourly --quiet &&
+	GIT_TRACE2_EVENT="$(pwd)/incremental-daily.txt" \
+		git maintenance run --schedule=daily --quiet &&
+
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <incremental-hourly.txt &&
+	test_subcommand ! git prune-packed --quiet <incremental-hourly.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress \
+		<incremental-hourly.txt &&
+
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <incremental-daily.txt &&
+	test_subcommand git prune-packed --quiet <incremental-daily.txt &&
+	test_subcommand git multi-pack-index write --no-progress \
+		<incremental-daily.txt &&
+
+	# Modify defaults
+	git config maintenance.commit-graph.schedule daily &&
+	git config maintenance.loose-objects.schedule hourly &&
+	git config maintenance.incremental-repack.enabled false &&
+
+	GIT_TRACE2_EVENT="$(pwd)/modified-hourly.txt" \
+		git maintenance run --schedule=hourly --quiet &&
+	GIT_TRACE2_EVENT="$(pwd)/modified-daily.txt" \
+		git maintenance run --schedule=daily --quiet &&
+
+	test_subcommand ! git commit-graph write --split --reachable \
+		--no-progress <modified-hourly.txt &&
+	test_subcommand git prune-packed --quiet <modified-hourly.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress \
+		<modified-hourly.txt &&
+
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <modified-daily.txt &&
+	test_subcommand git prune-packed --quiet <modified-daily.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress \
+		<modified-daily.txt
+'
+
 test_expect_success 'register and unregister' '
 	test_when_finished git config --global --unset-all maintenance.repo &&
 	git config --global --add maintenance.repo /existing1 &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 7/8] maintenance: use 'incremental' strategy by default
  2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                         ` (5 preceding siblings ...)
  2020-10-15 17:22       ` [PATCH v4 6/8] maintenance: create maintenance.strategy config Derrick Stolee via GitGitGadget
@ 2020-10-15 17:22       ` Derrick Stolee via GitGitGadget
  2020-10-15 17:22       ` [PATCH v4 8/8] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-15 17:22 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Martin Ågren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git maintenance (register|start)' subcommands add the current
repository to the global Git config so maintenance will operate on that
repository. It does not specify what maintenance should occur or how
often.

To make it simple for users to start background maintenance with a
recommended schedlue, update the 'maintenance.strategy' config option in
both the 'register' and 'start' subcommands. This allows users to
customize beyond the defaults using individual
'maintenance.<task>.schedule' options, but also the user can opt-out of
this strategy using 'maintenance.strategy=none'.

Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 17 +++++++++++++++++
 builtin/gc.c                      | 10 ++++++++++
 t/t7900-maintenance.sh            | 20 ++++++++++++++++----
 3 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 7628a6d157..b5944b4c51 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -37,6 +37,23 @@ register::
 	`maintenance.<task>.schedule`. The tasks that are enabled are safe
 	for running in the background without disrupting foreground
 	processes.
++
+The `register` subcomand will also set the `maintenance.strategy` config
+value to `incremental`, if this value is not previously set. The
+`incremental` strategy uses the following schedule for each maintenance
+task:
++
+--
+* `gc`: disabled.
+* `commit-graph`: hourly.
+* `prefetch`: hourly.
+* `loose-objects`: daily.
+* `incremental-repack`: daily.
+--
++
+`git maintenance register` will also disable foreground maintenance by
+setting `maintenance.auto = false` in the current repository. This config
+setting will remain after a `git maintenance unregister` command.
 
 run::
 	Run one or more maintenance tasks. If one or more `--task` options
diff --git a/builtin/gc.c b/builtin/gc.c
index a8248e7a45..e3098ef6a1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1434,6 +1434,7 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 
 static int maintenance_register(void)
 {
+	char *config_value;
 	struct child_process config_set = CHILD_PROCESS_INIT;
 	struct child_process config_get = CHILD_PROCESS_INIT;
 
@@ -1441,6 +1442,15 @@ static int maintenance_register(void)
 	if (!the_repository || !the_repository->gitdir)
 		return 0;
 
+	/* Disable foreground maintenance */
+	git_config_set("maintenance.auto", "false");
+
+	/* Set maintenance strategy, if unset */
+	if (!git_config_get_string("maintenance.strategy", &config_value))
+		free(config_value);
+	else
+		git_config_set("maintenance.strategy", "incremental");
+
 	config_get.git_cmd = 1;
 	strvec_pushl(&config_get.args, "config", "--global", "--get", "maintenance.repo",
 		     the_repository->worktree ? the_repository->worktree
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 7440a0ea19..20184e96e1 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -354,11 +354,14 @@ test_expect_success 'register and unregister' '
 	git config --global --add maintenance.repo /existing1 &&
 	git config --global --add maintenance.repo /existing2 &&
 	git config --global --get-all maintenance.repo >before &&
+
 	git maintenance register &&
-	git config --global --get-all maintenance.repo >actual &&
-	cp before after &&
-	pwd >>after &&
-	test_cmp after actual &&
+	test_cmp_config false maintenance.auto &&
+	git config --global --get-all maintenance.repo >between &&
+	cp before expect &&
+	pwd >>expect &&
+	test_cmp expect between &&
+
 	git maintenance unregister &&
 	git config --global --get-all maintenance.repo >actual &&
 	test_cmp before actual
@@ -392,4 +395,13 @@ test_expect_success 'start preserves existing schedule' '
 	grep "Important information!" cron.txt
 '
 
+test_expect_success 'register preserves existing strategy' '
+	git config maintenance.strategy none &&
+	git maintenance register &&
+	test_config maintenance.strategy none &&
+	git config --unset maintenance.strategy &&
+	git maintenance register &&
+	test_config maintenance.strategy incremental
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 8/8] maintenance: add troubleshooting guide to docs
  2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
                         ` (6 preceding siblings ...)
  2020-10-15 17:22       ` [PATCH v4 7/8] maintenance: use 'incremental' strategy by default Derrick Stolee via GitGitGadget
@ 2020-10-15 17:22       ` Derrick Stolee via GitGitGadget
  7 siblings, 0 replies; 62+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-10-15 17:22 UTC (permalink / raw)
  To: git
  Cc: jrnieder, jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Martin Ågren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git maintenance run' subcommand takes a lock on the object database
to prevent concurrent processes from competing for resources. This is an
important safety measure to prevent possible repository corruption and
data loss.

This feature can lead to confusing behavior if a user is not aware of
it. Add a TROUBLESHOOTING section to the 'git maintenance' builtin
documentation that discusses these tradeoffs. The short version of this
section is that Git will not corrupt your repository, but if the list of
scheduled tasks takes longer than an hour then some scheduled tasks may
be dropped due to this object database collision. For example, a
long-running "daily" task at midnight might prevent an "hourly" task
from running at 1AM.

The opposite is also possible, but less likely as long as the "hourly"
tasks are much faster than the "daily" and "weekly" tasks.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 44 +++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index b5944b4c51..6fec1eb8dc 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -175,6 +175,50 @@ OPTIONS
 	`maintenance.<task>.enabled` configured as `true` are considered.
 	See the 'TASKS' section for the list of accepted `<task>` values.
 
+
+TROUBLESHOOTING
+---------------
+The `git maintenance` command is designed to simplify the repository
+maintenance patterns while minimizing user wait time during Git commands.
+A variety of configuration options are available to allow customizing this
+process. The default maintenance options focus on operations that complete
+quickly, even on large repositories.
+
+Users may find some cases where scheduled maintenance tasks do not run as
+frequently as intended. Each `git maintenance run` command takes a lock on
+the repository's object database, and this prevents other concurrent
+`git maintenance run` commands from running on the same repository. Without
+this safeguard, competing processes could leave the repository in an
+unpredictable state.
+
+The background maintenance schedule runs `git maintenance run` processes
+on an hourly basis. Each run executes the "hourly" tasks. At midnight,
+that process also executes the "daily" tasks. At midnight on the first day
+of the week, that process also executes the "weekly" tasks. A single
+process iterates over each registered repository, performing the scheduled
+tasks for that frequency. Depending on the number of registered
+repositories and their sizes, this process may take longer than an hour.
+In this case, multiple `git maintenance run` commands may run on the same
+repository at the same time, colliding on the object database lock. This
+results in one of the two tasks not running.
+
+If you find that some maintenance windows are taking longer than one hour
+to complete, then consider reducing the complexity of your maintenance
+tasks. For example, the `gc` task is much slower than the
+`incremental-repack` task. However, this comes at a cost of a slightly
+larger object database. Consider moving more expensive tasks to be run
+less frequently.
+
+Expert users may consider scheduling their own maintenance tasks using a
+different schedule than is available through `git maintenance start` and
+Git configuration options. These users should be aware of the object
+database lock and how concurrent `git maintenance run` commands behave.
+Further, the `git gc` command should not be combined with
+`git maintenance run` commands. `git gc` modifies the object database
+but does not take the lock in the same way as `git maintenance run`. If
+possible, use `git maintenance run --task=gc` instead of `git gc`.
+
+
 GIT
 ---
 Part of the linkgit:git[1] suite
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 5/8] maintenance: add start/stop subcommands
  2020-10-15 17:22       ` [PATCH v4 5/8] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
@ 2020-12-09 18:51         ` Josh Steadmon
  2020-12-09 19:16           ` Josh Steadmon
  0 siblings, 1 reply; 62+ messages in thread
From: Josh Steadmon @ 2020-12-09 18:51 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, jrnieder, jonathantanmy, sluongng, congdanhqx,
	SZEDER Gábor, Derrick Stolee, Martin Ågren,
	Derrick Stolee, Derrick Stolee

On 2020.10.15 17:22, Derrick Stolee via GitGitGadget wrote:
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 8f383d01d9..7715e40391 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -315,4 +315,32 @@ test_expect_success 'register and unregister' '
>  	test_cmp before actual
>  '
>  
> +test_expect_success 'start from empty cron table' '
> +	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
> +
> +	# start registers the repo
> +	git config --get --global maintenance.repo "$(pwd)" &&
> +
> +	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=daily" cron.txt &&
> +	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=hourly" cron.txt &&
> +	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=weekly" cron.txt
> +'
> +
> +test_expect_success 'stop from existing schedule' '
> +	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
> +
> +	# stop does not unregister the repo
> +	git config --get --global maintenance.repo "$(pwd)" &&
> +
> +	# Operation is idempotent
> +	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
> +	test_must_be_empty cron.txt
> +'
> +
> +test_expect_success 'start preserves existing schedule' '
> +	echo "Important information!" >cron.txt &&
> +	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
> +	grep "Important information!" cron.txt
> +'
> +
>  test_done

These two test cases fail when the paths passed to git-config contain
ERE metacharacters [similar to the issue addressed in 483a6d9b5da
(maintenance: use 'git config --fixed-value', 2020-11-25)]. Since these
are already in next, I'm providing a patch to add '--fixed-value' to the
git-config calls here as well.

-- >8 --
Subject: [PATCH] t7900: use --fixed-value in git-maintenance tests

Use --fixed-value in git-config calls in the git-maintenance tests, so
that the tests will continue to work even if the repo path contains
shell metacharacters.

Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 t/t7900-maintenance.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index fab0e01c39..41bf523953 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -422,7 +422,7 @@ test_expect_success 'start from empty cron table' '
 	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
 
 	# start registers the repo
-	git config --get --global maintenance.repo "$(pwd)" &&
+	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
 
 	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=daily" cron.txt &&
 	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=hourly" cron.txt &&
@@ -433,7 +433,7 @@ test_expect_success 'stop from existing schedule' '
 	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
 
 	# stop does not unregister the repo
-	git config --get --global maintenance.repo "$(pwd)" &&
+	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
 
 	# Operation is idempotent
 	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
-- 
2.29.2.576.ga3fc446d84-goog


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 5/8] maintenance: add start/stop subcommands
  2020-12-09 18:51         ` Josh Steadmon
@ 2020-12-09 19:16           ` Josh Steadmon
  2020-12-09 21:59             ` Derrick Stolee
  2020-12-10  0:13             ` Junio C Hamano
  0 siblings, 2 replies; 62+ messages in thread
From: Josh Steadmon @ 2020-12-09 19:16 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git, jrnieder, jonathantanmy,
	sluongng, congdanhqx, SZEDER Gábor, Derrick Stolee,
	Martin Ågren, Derrick Stolee, Derrick Stolee

Whoops, had a small think-o while writing the patch message. Fixed
below.

-- >8 --
Subject: [PATCH] t7900: use --fixed-value in git-maintenance tests

Use --fixed-value in git-config calls in the git-maintenance tests, so
that the tests will continue to work even if the repo path contains
regexp metacharacters.

Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 t/t7900-maintenance.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index fab0e01c39..41bf523953 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -422,7 +422,7 @@ test_expect_success 'start from empty cron table' '
 	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
 
 	# start registers the repo
-	git config --get --global maintenance.repo "$(pwd)" &&
+	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
 
 	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=daily" cron.txt &&
 	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=hourly" cron.txt &&
@@ -433,7 +433,7 @@ test_expect_success 'stop from existing schedule' '
 	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
 
 	# stop does not unregister the repo
-	git config --get --global maintenance.repo "$(pwd)" &&
+	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
 
 	# Operation is idempotent
 	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
-- 
2.29.2.576.ga3fc446d84-goog


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 5/8] maintenance: add start/stop subcommands
  2020-12-09 19:16           ` Josh Steadmon
@ 2020-12-09 21:59             ` Derrick Stolee
  2020-12-10  0:13             ` Junio C Hamano
  1 sibling, 0 replies; 62+ messages in thread
From: Derrick Stolee @ 2020-12-09 21:59 UTC (permalink / raw)
  To: Josh Steadmon, Derrick Stolee via GitGitGadget, git, jrnieder,
	jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Martin Ågren, Derrick Stolee, Derrick Stolee

On 12/9/2020 2:16 PM, Josh Steadmon wrote:
> Whoops, had a small think-o while writing the patch message. Fixed
> below.
> 
> -- >8 --
> Subject: [PATCH] t7900: use --fixed-value in git-maintenance tests
> 
> Use --fixed-value in git-config calls in the git-maintenance tests, so
> that the tests will continue to work even if the repo path contains
> regexp metacharacters.
> 
> Signed-off-by: Josh Steadmon <steadmon@google.com>
> ---
>  t/t7900-maintenance.sh | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index fab0e01c39..41bf523953 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -422,7 +422,7 @@ test_expect_success 'start from empty cron table' '
>  	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance start &&
>  
>  	# start registers the repo
> -	git config --get --global maintenance.repo "$(pwd)" &&
> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
>  
>  	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=daily" cron.txt &&
>  	grep "for-each-repo --config=maintenance.repo maintenance run --schedule=hourly" cron.txt &&
> @@ -433,7 +433,7 @@ test_expect_success 'stop from existing schedule' '
>  	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&
>  
>  	# stop does not unregister the repo
> -	git config --get --global maintenance.repo "$(pwd)" &&
> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
>  
>  	# Operation is idempotent
>  	GIT_TEST_CRONTAB="test-tool crontab cron.txt" git maintenance stop &&

Thank you for this. While I went to make sure the maintenance builtin worked
properly, I forgot to check the rest of the test script worked as well. This
is a good way to fix that.

Thanks,
-Stolee
 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 5/8] maintenance: add start/stop subcommands
  2020-12-09 19:16           ` Josh Steadmon
  2020-12-09 21:59             ` Derrick Stolee
@ 2020-12-10  0:13             ` Junio C Hamano
  2020-12-10  1:52               ` Derrick Stolee
  1 sibling, 1 reply; 62+ messages in thread
From: Junio C Hamano @ 2020-12-10  0:13 UTC (permalink / raw)
  To: Josh Steadmon
  Cc: Derrick Stolee via GitGitGadget, git, jrnieder, jonathantanmy,
	sluongng, congdanhqx, SZEDER Gábor, Derrick Stolee,
	Martin Ågren, Derrick Stolee, Derrick Stolee

Josh Steadmon <steadmon@google.com> writes:

>  	# start registers the repo
> -	git config --get --global maintenance.repo "$(pwd)" &&
> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&

The rewrite makes it better than the original, but I wonder why the
original did not do a more obvious

	git config --get maintenance.repo >actual &&
	pwd >expect &&
	test_cmp expect actual

>  	# stop does not unregister the repo
> -	git config --get --global maintenance.repo "$(pwd)" &&
> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&

Ditto.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 5/8] maintenance: add start/stop subcommands
  2020-12-10  0:13             ` Junio C Hamano
@ 2020-12-10  1:52               ` Derrick Stolee
  2020-12-10  6:54                 ` Junio C Hamano
  0 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee @ 2020-12-10  1:52 UTC (permalink / raw)
  To: Junio C Hamano, Josh Steadmon
  Cc: Derrick Stolee via GitGitGadget, git, jrnieder, jonathantanmy,
	sluongng, congdanhqx, SZEDER Gábor, Martin Ågren,
	Derrick Stolee, Derrick Stolee

On 12/9/2020 7:13 PM, Junio C Hamano wrote:
> Josh Steadmon <steadmon@google.com> writes:
> 
>>  	# start registers the repo
>> -	git config --get --global maintenance.repo "$(pwd)" &&
>> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
> 
> The rewrite makes it better than the original, but I wonder why the
> original did not do a more obvious

maintenance.repo is a multi-valued config setting, so it is possible
that there are multiple existing values. Hence the reason for needing
the value filter.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 5/8] maintenance: add start/stop subcommands
  2020-12-10  1:52               ` Derrick Stolee
@ 2020-12-10  6:54                 ` Junio C Hamano
  0 siblings, 0 replies; 62+ messages in thread
From: Junio C Hamano @ 2020-12-10  6:54 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Josh Steadmon, Derrick Stolee via GitGitGadget, git, jrnieder,
	jonathantanmy, sluongng, congdanhqx, SZEDER Gábor,
	Martin Ågren, Derrick Stolee, Derrick Stolee

Derrick Stolee <stolee@gmail.com> writes:

> On 12/9/2020 7:13 PM, Junio C Hamano wrote:
>> Josh Steadmon <steadmon@google.com> writes:
>> 
>>>  	# start registers the repo
>>> -	git config --get --global maintenance.repo "$(pwd)" &&
>>> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
>> 
>> The rewrite makes it better than the original, but I wonder why the
>> original did not do a more obvious
>
> maintenance.repo is a multi-valued config setting, so it is possible
> that there are multiple existing values. Hence the reason for needing
> the value filter.

I do not quite get it.  You mean as long as $(pwd) appears, you do
not care what other value appear on the variable?  Aren't we control
of what repositories have been registered to the system at this point
in the test sequence?

It's not wrong per-se to use "does this value exist for the key?",
especially with the --fixed-value option.  It somehow just felt a
bit unusual to me.

In any case, thanks for the metacharacter fix.  That is now on
'master' so the previous breakages are all gone with Josh's fix.

Thanks, both.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 2/8] maintenance: add --schedule option and config
  2020-10-15 17:21       ` [PATCH v4 2/8] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
@ 2021-02-09 14:06         ` Ævar Arnfjörð Bjarmason
  2021-02-09 16:54           ` Derrick Stolee
  0 siblings, 1 reply; 62+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-09 14:06 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, jrnieder, jonathantanmy, sluongng, SZEDER Gábor,
	Derrick Stolee, Đoàn Trần Công Danh,
	Martin Ågren, Derrick Stolee, Derrick Stolee


On Thu, Oct 15 2020, Derrick Stolee via GitGitGadget wrote:

> +--schedule::
> +	When combined with the `run` subcommand, run maintenance tasks
> +	only if certain time conditions are met, as specified by the
> +	`maintenance.<task>.schedule` config value for each `<task>`.
> +	This config value specifies a number of seconds since the last
> +	time that task ran, according to the `maintenance.<task>.lastRun`
> +	config value. The tasks that are tested are those provided by
> +	the `--task=<task>` option(s) or those with
> +	`maintenance.<task>.enabled` set to true.

I see from searching on list and from spying on your repo that patches
for this maintenance.<task>.lastRun feature exist, but there's no code
for it in git.git.

So we've got a 2.30.0 release with a mention of that, and it can't work,
because it's only in the doc due to b08ff1fee00 (maintenance: add
--schedule option and config, 2020-09-11).


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 2/8] maintenance: add --schedule option and config
  2021-02-09 14:06         ` Ævar Arnfjörð Bjarmason
@ 2021-02-09 16:54           ` Derrick Stolee
  2021-05-10 12:16             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 62+ messages in thread
From: Derrick Stolee @ 2021-02-09 16:54 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason,
	Derrick Stolee via GitGitGadget
  Cc: git, jrnieder, jonathantanmy, sluongng, SZEDER Gábor,
	Đoàn Trần Công Danh, Martin Ågren,
	Derrick Stolee, Derrick Stolee

On 2/9/2021 9:06 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Oct 15 2020, Derrick Stolee via GitGitGadget wrote:
> 
>> +--schedule::
>> +	When combined with the `run` subcommand, run maintenance tasks
>> +	only if certain time conditions are met, as specified by the
>> +	`maintenance.<task>.schedule` config value for each `<task>`.
>> +	This config value specifies a number of seconds since the last
>> +	time that task ran, according to the `maintenance.<task>.lastRun`
>> +	config value. The tasks that are tested are those provided by
>> +	the `--task=<task>` option(s) or those with
>> +	`maintenance.<task>.enabled` set to true.
> 
> I see from searching on list and from spying on your repo that patches
> for this maintenance.<task>.lastRun feature exist, but there's no code
> for it in git.git.
> 
> So we've got a 2.30.0 release with a mention of that, and it can't work,
> because it's only in the doc due to b08ff1fee00 (maintenance: add
> --schedule option and config, 2020-09-11).

Thank you for pointing out this docbug. This is based on an early
version of the patch series and should have been changed.

Please see this patch which attempts to do a better job. I can
create a new thread with this submission if we need more edits.

Thanks,
-Stolee

--- >8 ---

From 46436b06caf65ee824e781603a8108413bb87705 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Tue, 9 Feb 2021 11:51:32 -0500
Subject: [PATCH] maintenance: properly document --schedule
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The documentation for the '--schedule' option is incorrect and based on
an early version of the background maintenance feature. Update the
documentation to describe the actual use of the option.

The most important thing is that Git takes this option as a hint for
which tasks it should run. Users should not run this command arbitrarily
and expect that Git will enforce some timing restrictions.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 6fec1eb8dc2..d4b5aea6760 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -155,15 +155,15 @@ OPTIONS
 	exceeds the `gc.autoPackLimit` config setting. Not compatible with
 	the `--schedule` option.
 
---schedule::
+--schedule=<frequency>::
 	When combined with the `run` subcommand, run maintenance tasks
-	only if certain time conditions are met, as specified by the
-	`maintenance.<task>.schedule` config value for each `<task>`.
-	This config value specifies a number of seconds since the last
-	time that task ran, according to the `maintenance.<task>.lastRun`
-	config value. The tasks that are tested are those provided by
-	the `--task=<task>` option(s) or those with
-	`maintenance.<task>.enabled` set to true.
+	whose `maintenance.<task>.schedule` config value is equal to
+	`<frequency>`. There is no timing restriction imposed by this
+	option, but instead is used to inform the Git process which
+	frequency to use. The command scheduler created by
+	`git maintenance start` runs this command with `<frequency>`
+	equal to `hourly`, `daily`, and `weekly` at the appropriate
+	intervals.
 
 --quiet::
 	Do not report progress or other information over `stderr`.
-- 
2.30.0.vfs.0.0.exp




^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 3/8] for-each-repo: run subcommands on configured repos
  2020-10-15 17:21       ` [PATCH v4 3/8] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
@ 2021-05-03 16:10         ` Andrzej Hunt
  2021-05-03 17:01           ` Eric Sunshine
  2021-05-03 19:43           ` Derrick Stolee
  0 siblings, 2 replies; 62+ messages in thread
From: Andrzej Hunt @ 2021-05-03 16:10 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git; +Cc: Derrick Stolee, Derrick Stolee


On 15/10/2020 19:21, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
[... snip ...]
> diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
> new file mode 100644
> index 0000000000..5bba623ff1
> --- /dev/null
> +++ b/builtin/for-each-repo.c
> @@ -0,0 +1,58 @@
> +#include "cache.h"
> +#include "config.h"
> +#include "builtin.h"
> +#include "parse-options.h"
> +#include "run-command.h"
> +#include "string-list.h"
> +
> +static const char * const for_each_repo_usage[] = {
> +	N_("git for-each-repo --config=<config> <command-args>"),
> +	NULL
> +};
> +
> +static int run_command_on_repo(const char *path,
> +			       void *cbdata)
> +{
> +	int i;
> +	struct child_process child = CHILD_PROCESS_INIT;
> +	struct strvec *args = (struct strvec *)cbdata;

I was curious there's a strong reason for declaring args as void * 
followed by this cast? The most obvious answer seems to be that this 
probably evolved from a callback - and we could simplify it now?
> +
> +	child.git_cmd = 1;
> +	strvec_pushl(&child.args, "-C", path, NULL);
> +
> +	for (i = 0; i < args->nr; i++)
> +		strvec_push(&child.args, args->v[i]);
So here we're copying all of args - and I don't see any way of avoiding 
it since we're adding it to child's arg list.

> +
> +	return run_command(&child);
> +}
> +
> +int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
> +{
> +	static const char *config_key = NULL;
> +	int i, result = 0;
> +	const struct string_list *values;
> +	struct strvec args = STRVEC_INIT;
> +
> +	const struct option options[] = {
> +		OPT_STRING(0, "config", &config_key, N_("config"),
> +			   N_("config key storing a list of repository paths")),
> +		OPT_END()
> +	};
> +
> +	argc = parse_options(argc, argv, prefix, options, for_each_repo_usage,
> +			     PARSE_OPT_STOP_AT_NON_OPTION);
> +
> +	if (!config_key)
> +		die(_("missing --config=<config>"));
> +
> +	for (i = 0; i < argc; i++)
> +		strvec_push(&args, argv[i]);

But why do we have to copy all of argv 1:1 into args here, only to later 
pass it to run_command_on_repo() which, as described above, copies the 
entire input again? I suspect this was done to comply with 
run_command_on_repo()'s API (which takes strvec) - does that seem 
plausible, or did I miss something?

Which brings me to the real reason for my questions: I noticed we "leak" 
args (this leak is of no significance since it happens in cmd_*, but 
LSAN still complains, and I'm trying to get tests running leak-free). My 
initial inclination was to strvec_clear() or UNLEAK() args - but if we 
can avoid creating args in the first place we also wouldn't need to 
clear it later.

My current proposal is therefore to completely remove args and pass 
argc+argv into run_command_on_repo() - but I wanted to be sure that I 
didn't miss some important reason to stick with the current approach.

> +
> +	values = repo_config_get_value_multi(the_repository,
> +					     config_key);
> +
> +	for (i = 0; !result && i < values->nr; i++)
> +		result = run_command_on_repo(values->items[i].string, &args);
> +
> +	return result;
> +}
[... snip ...]

(I hope this doesn't come across as useless necroposting - I figured it 
would be easier to clarify these questions on the original thread as 
opposed to potentially discussing it as part of my next leak-fixing 
series :) .)

ATB,

   Andrzej

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 3/8] for-each-repo: run subcommands on configured repos
  2021-05-03 16:10         ` Andrzej Hunt
@ 2021-05-03 17:01           ` Eric Sunshine
  2021-05-03 19:26             ` Eric Sunshine
  2021-05-03 19:43           ` Derrick Stolee
  1 sibling, 1 reply; 62+ messages in thread
From: Eric Sunshine @ 2021-05-03 17:01 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: Derrick Stolee via GitGitGadget, Git List, Derrick Stolee,
	Derrick Stolee

I'm not Stolee and hadn't even looked at this code before, but I'll
jump in with a couple comments...

On Mon, May 3, 2021 at 12:11 PM Andrzej Hunt <andrzej@ahunt.org> wrote:
> On 15/10/2020 19:21, Derrick Stolee via GitGitGadget wrote:
> > +static int run_command_on_repo(const char *path,
> > +                            void *cbdata)
> > +{
> > +     int i;
> > +     struct child_process child = CHILD_PROCESS_INIT;
> > +     struct strvec *args = (struct strvec *)cbdata;
>
> I was curious there's a strong reason for declaring args as void *
> followed by this cast? The most obvious answer seems to be that this
> probably evolved from a callback - and we could simplify it now?

Agreed, the `void*` cbdata doesn't make sense here.

> > +     strvec_pushl(&child.args, "-C", path, NULL);
> > +
> > +     for (i = 0; i < args->nr; i++)
> > +             strvec_push(&child.args, args->v[i]);
>
> So here we're copying all of args - and I don't see any way of avoiding
> it since we're adding it to child's arg list.

... (dot dot dot)

> > +int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
> > +{
> > +     struct strvec args = STRVEC_INIT;
> > +     for (i = 0; i < argc; i++)
> > +             strvec_push(&args, argv[i]);
>
> But why do we have to copy all of argv 1:1 into args here, only to later
> pass it to run_command_on_repo() which, as described above, copies the
> entire input again? I suspect this was done to comply with
> run_command_on_repo()'s API (which takes strvec) - does that seem
> plausible, or did I miss something?
>
> Which brings me to the real reason for my questions: I noticed we "leak"
> args (this leak is of no significance since it happens in cmd_*, but
> LSAN still complains, and I'm trying to get tests running leak-free). My
> initial inclination was to strvec_clear() or UNLEAK() args - but if we
> can avoid creating args in the first place we also wouldn't need to
> clear it later.
>
> My current proposal is therefore to completely remove args and pass
> argc+argv into run_command_on_repo() - but I wanted to be sure that I
> didn't miss some important reason to stick with the current approach.

An alternative to all this copying would be to take advantage of
child_process.argv which is owned by the caller, thus does not get
freed automatically by run_command(). This would allow you to re-use
the same argument vector for all calls. And you don't need
run_command_on_repo() at all. Something like this in
cmd_for_each_repo(), untested and typed directly in email:

    struct child_process child = CHILD_PROCESS_INIT;

    for (i = 0; !result && i < values->nr; i++) {
        const char *d = chdir(values->items[i].string);
        if (chdir(d))
            die_errno(_("cannot chdir to '%s'"), d);
        child.git_cmd = 1;
        child.argv = argv;
        result = run_command(&child);
    }

This assumes that argv[] is correctly NULL-terminated after
parse_options() -- I didn't check, but expect it to be so. If not,
it's easy enough to copy argv[] into `args` once and then
strvec_clear(&args) at the end of the function.

The one downside is that trace output wouldn't be as helpful (I think)
because you wouldn't see an explicit "-C <dir>", but I suppose the
tracing machinery can be invoked manually to address this.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 3/8] for-each-repo: run subcommands on configured repos
  2021-05-03 17:01           ` Eric Sunshine
@ 2021-05-03 19:26             ` Eric Sunshine
  0 siblings, 0 replies; 62+ messages in thread
From: Eric Sunshine @ 2021-05-03 19:26 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: Derrick Stolee via GitGitGadget, Git List, Derrick Stolee,
	Derrick Stolee

On Mon, May 3, 2021 at 1:01 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>     for (i = 0; !result && i < values->nr; i++) {
>         const char *d = chdir(values->items[i].string);
>         if (chdir(d))
>             die_errno(_("cannot chdir to '%s'"), d);
>         child.git_cmd = 1;
>         child.argv = argv;
>         result = run_command(&child);
>     }

Without the copy/paste error, the declaration of `d` would, of course, be:

    const char *d = values->items[i].string;

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 3/8] for-each-repo: run subcommands on configured repos
  2021-05-03 16:10         ` Andrzej Hunt
  2021-05-03 17:01           ` Eric Sunshine
@ 2021-05-03 19:43           ` Derrick Stolee
  1 sibling, 0 replies; 62+ messages in thread
From: Derrick Stolee @ 2021-05-03 19:43 UTC (permalink / raw)
  To: Andrzej Hunt, Derrick Stolee via GitGitGadget, git
  Cc: Derrick Stolee, Derrick Stolee

On 5/3/2021 12:10 PM, Andrzej Hunt wrote:
> 
> On 15/10/2020 19:21, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <dstolee@microsoft.com>
>> +static int run_command_on_repo(const char *path,
>> +                   void *cbdata)
>> +{
>> +    int i;
>> +    struct child_process child = CHILD_PROCESS_INIT;
>> +    struct strvec *args = (struct strvec *)cbdata;
> 
> I was curious there's a strong reason for declaring args as void * followed by this cast? The most obvious answer seems to be that this probably evolved from a callback - and we could simplify it now?

You are absolutely right that this evolved from a
callback. I look forward to reviewing your patch that
updates this. ;)

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 2/8] maintenance: add --schedule option and config
  2021-02-09 16:54           ` Derrick Stolee
@ 2021-05-10 12:16             ` Ævar Arnfjörð Bjarmason
  2021-05-10 18:42               ` Junio C Hamano
  0 siblings, 1 reply; 62+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-05-10 12:16 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, jrnieder, jonathantanmy,
	sluongng, SZEDER Gábor,
	Đoàn Trần Công Danh, Martin Ågren,
	Derrick Stolee, Derrick Stolee, Junio C Hamano


On Tue, Feb 09 2021, Derrick Stolee wrote:

> On 2/9/2021 9:06 AM, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Thu, Oct 15 2020, Derrick Stolee via GitGitGadget wrote:
>> 
>>> +--schedule::
>>> +	When combined with the `run` subcommand, run maintenance tasks
>>> +	only if certain time conditions are met, as specified by the
>>> +	`maintenance.<task>.schedule` config value for each `<task>`.
>>> +	This config value specifies a number of seconds since the last
>>> +	time that task ran, according to the `maintenance.<task>.lastRun`
>>> +	config value. The tasks that are tested are those provided by
>>> +	the `--task=<task>` option(s) or those with
>>> +	`maintenance.<task>.enabled` set to true.
>> 
>> I see from searching on list and from spying on your repo that patches
>> for this maintenance.<task>.lastRun feature exist, but there's no code
>> for it in git.git.
>> 
>> So we've got a 2.30.0 release with a mention of that, and it can't work,
>> because it's only in the doc due to b08ff1fee00 (maintenance: add
>> --schedule option and config, 2020-09-11).
>
> Thank you for pointing out this docbug. This is based on an early
> version of the patch series and should have been changed.
>
> Please see this patch which attempts to do a better job. I can
> create a new thread with this submission if we need more edits.
>
> Thanks,
> -Stolee
>
> --- >8 ---
>
> From 46436b06caf65ee824e781603a8108413bb87705 Mon Sep 17 00:00:00 2001
> From: Derrick Stolee <dstolee@microsoft.com>
> Date: Tue, 9 Feb 2021 11:51:32 -0500
> Subject: [PATCH] maintenance: properly document --schedule
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> The documentation for the '--schedule' option is incorrect and based on
> an early version of the background maintenance feature. Update the
> documentation to describe the actual use of the option.
>
> The most important thing is that Git takes this option as a hint for
> which tasks it should run. Users should not run this command arbitrarily
> and expect that Git will enforce some timing restrictions.
>
> Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/git-maintenance.txt | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 6fec1eb8dc2..d4b5aea6760 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -155,15 +155,15 @@ OPTIONS
>  	exceeds the `gc.autoPackLimit` config setting. Not compatible with
>  	the `--schedule` option.
>  
> ---schedule::
> +--schedule=<frequency>::
>  	When combined with the `run` subcommand, run maintenance tasks
> -	only if certain time conditions are met, as specified by the
> -	`maintenance.<task>.schedule` config value for each `<task>`.
> -	This config value specifies a number of seconds since the last
> -	time that task ran, according to the `maintenance.<task>.lastRun`
> -	config value. The tasks that are tested are those provided by
> -	the `--task=<task>` option(s) or those with
> -	`maintenance.<task>.enabled` set to true.
> +	whose `maintenance.<task>.schedule` config value is equal to
> +	`<frequency>`. There is no timing restriction imposed by this
> +	option, but instead is used to inform the Git process which
> +	frequency to use. The command scheduler created by
> +	`git maintenance start` runs this command with `<frequency>`
> +	equal to `hourly`, `daily`, and `weekly` at the appropriate
> +	intervals.
>  
>  --quiet::
>  	Do not report progress or other information over `stderr`.

+ CC Junio

Late reply, I was reminded of this again. This patch looks good to me,
but I see it never got picked up, and our docs still mention an
unsupported "lastRun". Junio: I think it makes sense to just pick this
up...

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 2/8] maintenance: add --schedule option and config
  2021-05-10 12:16             ` Ævar Arnfjörð Bjarmason
@ 2021-05-10 18:42               ` Junio C Hamano
  0 siblings, 0 replies; 62+ messages in thread
From: Junio C Hamano @ 2021-05-10 18:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee, Derrick Stolee via GitGitGadget, git, jrnieder,
	jonathantanmy, sluongng, SZEDER Gábor,
	Đoàn Trần Công Danh, Martin Ågren,
	Derrick Stolee, Derrick Stolee

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Tue, Feb 09 2021, Derrick Stolee wrote:
>
>> On 2/9/2021 9:06 AM, Ævar Arnfjörð Bjarmason wrote:
>>> 
>>> On Thu, Oct 15 2020, Derrick Stolee via GitGitGadget wrote:
>>> 
>>>> +--schedule::
>>>> +	When combined with the `run` subcommand, run maintenance tasks
>>>> +	only if certain time conditions are met, as specified by the
>>>> +	`maintenance.<task>.schedule` config value for each `<task>`.
>>>> +	This config value specifies a number of seconds since the last
>>>> +	time that task ran, according to the `maintenance.<task>.lastRun`
>>>> +	config value. The tasks that are tested are those provided by
>>>> +	the `--task=<task>` option(s) or those with
>>>> +	`maintenance.<task>.enabled` set to true.
>>> 
>>> I see from searching on list and from spying on your repo that patches
>>> for this maintenance.<task>.lastRun feature exist, but there's no code
>>> for it in git.git.
>>> 
>>> So we've got a 2.30.0 release with a mention of that, and it can't work,
>>> because it's only in the doc due to b08ff1fee00 (maintenance: add
>>> --schedule option and config, 2020-09-11).
>>
>> Thank you for pointing out this docbug. This is based on an early
>> version of the patch series and should have been changed.
>>
>> Please see this patch which attempts to do a better job. I can
>> create a new thread with this submission if we need more edits.

Or resend it in a new thread for better visibility, with or without
change, with a mention of the original under the three-dash line.
Nobody can tell, with the above comment, if you abandoned the patch
or if you thought it is good enough.

Thanks.

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2021-05-10 18:42 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-04 15:41 [PATCH 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
2020-09-04 15:42 ` [PATCH 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
2020-09-04 15:42 ` [PATCH 2/7] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
2020-09-08 13:07   ` Đoàn Trần Công Danh
2020-09-09 12:14     ` Derrick Stolee
2020-09-04 15:42 ` [PATCH 3/7] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
2020-09-04 15:42 ` [PATCH 4/7] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
2020-09-04 15:42 ` [PATCH 5/7] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
2020-09-08  6:29   ` SZEDER Gábor
2020-09-08 12:43     ` Derrick Stolee
2020-09-08 19:31     ` Junio C Hamano
2020-09-04 15:42 ` [PATCH 6/7] maintenance: recommended schedule in register/start Derrick Stolee via GitGitGadget
2020-09-04 15:42 ` [PATCH 7/7] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
2020-09-11 17:49 ` [PATCH v2 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
2020-09-11 17:49   ` [PATCH v2 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
2020-09-11 17:49   ` [PATCH v2 2/7] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
2020-09-11 17:49   ` [PATCH v2 3/7] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
2020-09-11 17:49   ` [PATCH v2 4/7] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
2020-09-17 14:05     ` Đoàn Trần Công Danh
2020-09-11 17:49   ` [PATCH v2 5/7] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
2020-09-11 17:49   ` [PATCH v2 6/7] maintenance: recommended schedule in register/start Derrick Stolee via GitGitGadget
2020-09-29 19:48     ` Martin Ågren
2020-09-30 20:11       ` Derrick Stolee
2020-10-01 20:38         ` Derrick Stolee
2020-10-02  0:38           ` Đoàn Trần Công Danh
2020-10-02  1:55             ` Derrick Stolee
2020-10-05 13:16               ` Đoàn Trần Công Danh
2020-10-05 18:17                 ` Derrick Stolee
2020-09-11 17:49   ` [PATCH v2 7/7] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
2020-10-05 12:57   ` [PATCH v3 0/7] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
2020-10-05 12:57     ` [PATCH v3 1/7] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
2020-10-05 12:57     ` [PATCH v3 2/7] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
2020-10-05 12:57     ` [PATCH v3 3/7] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
2020-10-05 12:57     ` [PATCH v3 4/7] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
2020-10-05 12:57     ` [PATCH v3 5/7] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
2020-10-05 12:57     ` [PATCH v3 6/7] maintenance: use default schedule if not configured Derrick Stolee via GitGitGadget
2020-10-05 19:57       ` Martin Ågren
2020-10-08 13:32         ` Derrick Stolee
2020-10-05 12:57     ` [PATCH v3 7/7] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget
2020-10-15 17:21     ` [PATCH v4 0/8] Maintenance III: Background maintenance Derrick Stolee via GitGitGadget
2020-10-15 17:21       ` [PATCH v4 1/8] maintenance: optionally skip --auto process Derrick Stolee via GitGitGadget
2020-10-15 17:21       ` [PATCH v4 2/8] maintenance: add --schedule option and config Derrick Stolee via GitGitGadget
2021-02-09 14:06         ` Ævar Arnfjörð Bjarmason
2021-02-09 16:54           ` Derrick Stolee
2021-05-10 12:16             ` Ævar Arnfjörð Bjarmason
2021-05-10 18:42               ` Junio C Hamano
2020-10-15 17:21       ` [PATCH v4 3/8] for-each-repo: run subcommands on configured repos Derrick Stolee via GitGitGadget
2021-05-03 16:10         ` Andrzej Hunt
2021-05-03 17:01           ` Eric Sunshine
2021-05-03 19:26             ` Eric Sunshine
2021-05-03 19:43           ` Derrick Stolee
2020-10-15 17:22       ` [PATCH v4 4/8] maintenance: add [un]register subcommands Derrick Stolee via GitGitGadget
2020-10-15 17:22       ` [PATCH v4 5/8] maintenance: add start/stop subcommands Derrick Stolee via GitGitGadget
2020-12-09 18:51         ` Josh Steadmon
2020-12-09 19:16           ` Josh Steadmon
2020-12-09 21:59             ` Derrick Stolee
2020-12-10  0:13             ` Junio C Hamano
2020-12-10  1:52               ` Derrick Stolee
2020-12-10  6:54                 ` Junio C Hamano
2020-10-15 17:22       ` [PATCH v4 6/8] maintenance: create maintenance.strategy config Derrick Stolee via GitGitGadget
2020-10-15 17:22       ` [PATCH v4 7/8] maintenance: use 'incremental' strategy by default Derrick Stolee via GitGitGadget
2020-10-15 17:22       ` [PATCH v4 8/8] maintenance: add troubleshooting guide to docs Derrick Stolee via GitGitGadget

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).