git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH] maintenance: use systemd timers on Linux
@ 2021-05-01 14:52 Lénaïc Huard
  2021-05-01 20:02 ` brian m. carlson
                   ` (5 more replies)
  0 siblings, 6 replies; 30+ messages in thread
From: Lénaïc Huard @ 2021-05-01 14:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Derrick Stolee, Eric Sunshine, Lénaïc Huard

The existing mechanism for scheduling background maintenance is done
through cron. On Linux systems managed by systemd, systemd provides an
alternative to schedule recurring tasks: systemd timers.

The main motivations to implement systemd timers in addition to cron
are:
* cron is optional and Linux systems running systemd might not have it
  installed.
* The execution of `crontab -l` can tell us if cron is installed but not
  if the daemon is actually running.
* With systemd, each service is run in its own cgroup and its logs are
  tagged by the service inside journald. With cron, all scheduled tasks
  are running in the cron daemon cgroup and all the logs of the
  user-scheduled tasks are pretended to belong to the system cron
  service.
  Concretely, a user that doesn’t have access to the system logs won’t
  have access to the log of its own tasks scheduled by cron whereas he
  will have access to the log of its own tasks scheduled by systemd
  timer.

In order to schedule git maintenance, we need two unit template files:
* ~/.config/systemd/user/git-maintenance@.service
  to define the command to be started by systemd and
* ~/.config/systemd/user/git-maintenance@.timer
  to define the schedule at which the command should be run.

Those units are templates that are parametrized by the frequency.

Based on those templates, 3 timers are started:
* git-maintenance@hourly.timer
* git-maintenance@daily.timer
* git-maintenance@weekly.timer

The command launched by those three timers are the same than with the
other scheduling methods:

git for-each-repo --config=maintenance.repo maintenance run
--schedule=%i

with the full path for git to ensure that the version of git launched
for the scheduled maintenance is the same as the one used to run
`maintenance start`.

The timer unit contains `Persistent=true` so that, if the computer is
powered down when a maintenance task should run, the task will be run
when the computer is back powered on.

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
---
 Documentation/git-maintenance.txt |  49 ++++++++
 builtin/gc.c                      | 188 +++++++++++++++++++++++++++++-
 t/t7900-maintenance.sh            |  51 ++++++++
 3 files changed, 287 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 80ddd33ceb..30443b417a 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -279,6 +279,55 @@ schedule to ensure you are executing the correct binaries in your
 schedule.
 
 
+BACKGROUND MAINTENANCE ON LINUX SYSTEMD SYSTEMD
+-----------------------------------------------
+
+While Linux supports `cron`, depending on the distribution, `cron` may
+be an optional package not necessarily installed. On modern Linux
+distributions, systemd timers are superseding it.
+
+If user systemd timers are available, they will be used as a replacement
+of `cron`.
+
+In this case, `git maintenance start` will create user systemd timer units
+and start the timers. The current list of user-scheduled tasks can be found
+by running `systemctl --user list-timers`. The timers written by `git
+maintenance start` are similar to this:
+
+-----------------------------------------------------------------------
+$ systemctl --user list-timers
+NEXT                         LEFT          LAST                         PASSED     UNIT                         ACTIVATES
+Thu 2021-04-29 19:00:00 CEST 42min left    Thu 2021-04-29 18:00:11 CEST 17min ago  git-maintenance@hourly.timer git-maintenance@hourly.service
+Fri 2021-04-30 00:00:00 CEST 5h 42min left Thu 2021-04-29 00:00:11 CEST 18h ago    git-maintenance@daily.timer  git-maintenance@daily.service
+Mon 2021-05-03 00:00:00 CEST 3 days left   Mon 2021-04-26 00:00:11 CEST 3 days ago git-maintenance@weekly.timer git-maintenance@weekly.service
+
+3 timers listed.
+Pass --all to see loaded but inactive timers, too.
+-----------------------------------------------------------------------
+
+One timer is registered for each `--schedule=<frequency>` option.
+
+The definition of the systemd units can be inspected in the following files:
+
+-----------------------------------------------------------------------
+~/.config/systemd/user/git-maintenance@.timer
+~/.config/systemd/user/git-maintenance@.service
+~/.config/systemd/user/timers.target.wants/git-maintenance@hourly.timer
+~/.config/systemd/user/timers.target.wants/git-maintenance@daily.timer
+~/.config/systemd/user/timers.target.wants/git-maintenance@weekly.timer
+-----------------------------------------------------------------------
+
+`git maintenance start` will overwrite these files and start the timer
+again with `systemctl --user`, so any customization should be done by
+creating a drop-in file
+`~/.config/systemd/user/git-maintenance@.service.d/*.conf`.
+
+`git maintenance stop` will stop the user systemd timers and delete
+the above mentioned files.
+
+For more details, see systemd.timer(5)
+
+
 BACKGROUND MAINTENANCE ON MACOS SYSTEMS
 ---------------------------------------
 
diff --git a/builtin/gc.c b/builtin/gc.c
index ef7226d7bc..913fcfc882 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1872,6 +1872,25 @@ static int schtasks_update_schedule(int run_maintenance, int fd, const char *cmd
 		return schtasks_remove_tasks(cmd);
 }
 
+static int is_crontab_available(const char *cmd)
+{
+	struct child_process child = CHILD_PROCESS_INIT;
+
+	strvec_split(&child.args, cmd);
+	strvec_push(&child.args, "-l");
+	child.no_stdin = 1;
+	child.no_stdout = 1;
+	child.no_stderr = 1;
+	child.silent_exec_failure = 1;
+
+	if (start_command(&child))
+		return 0;
+	/* Ignore exit code, as an empty crontab will return error. */
+	finish_command(&child);
+
+	return 1;
+}
+
 #define BEGIN_LINE "# BEGIN GIT MAINTENANCE SCHEDULE"
 #define END_LINE "# END GIT MAINTENANCE SCHEDULE"
 
@@ -1959,10 +1978,164 @@ static int crontab_update_schedule(int run_maintenance, int fd, const char *cmd)
 	return result;
 }
 
+static int is_systemd_timer_available(const char *cmd)
+{
+	struct child_process child = CHILD_PROCESS_INIT;
+
+	strvec_split(&child.args, cmd);
+	strvec_pushl(&child.args, "--user", "list-timers", NULL);
+	child.no_stdin = 1;
+	child.no_stdout = 1;
+	child.no_stderr = 1;
+	child.silent_exec_failure = 1;
+
+	if (start_command(&child))
+		return 0;
+	if (finish_command(&child))
+		return 0;
+
+	return 1;
+}
+
+static char *systemd_timer_timer_filename()
+{
+	const char *filename = "~/.config/systemd/user/git-maintenance@.timer";
+	char *expanded = expand_user_path(filename, 0);
+	if (!expanded)
+		die(_("failed to expand path '%s'"), filename);
+
+	return expanded;
+}
+
+static char *systemd_timer_service_filename()
+{
+	const char *filename =
+		"~/.config/systemd/user/git-maintenance@.service";
+	char *expanded = expand_user_path(filename, 0);
+	if (!expanded)
+		die(_("failed to expand path '%s'"), filename);
+
+	return expanded;
+}
+
+static int systemd_timer_enable_unit(int enable,
+				     enum schedule_priority schedule,
+				     const char *cmd)
+{
+	struct child_process child = CHILD_PROCESS_INIT;
+	const char *frequency = get_frequency(schedule);
+
+	strvec_split(&child.args, cmd);
+	strvec_push(&child.args, "--user");
+	if (enable)
+		strvec_push(&child.args, "enable");
+	else
+		strvec_push(&child.args, "disable");
+	strvec_push(&child.args, "--now");
+	strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
+
+	if (start_command(&child))
+		die(_("failed to run systemctl"));
+	return finish_command(&child);
+}
+
+static int systemd_timer_delete_unit_templates()
+{
+	char *filename = systemd_timer_timer_filename();
+	unlink(filename);
+	free(filename);
+
+	filename = systemd_timer_service_filename();
+	unlink(filename);
+	free(filename);
+
+	return 0;
+}
+
+static int systemd_timer_delete_units(const char *cmd)
+{
+	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY, cmd) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_DAILY, cmd) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY, cmd) ||
+	       systemd_timer_delete_unit_templates();
+}
+
+static int systemd_timer_write_unit_templates(const char *exec_path)
+{
+	char *filename;
+	FILE *file;
+	const char *unit;
+
+	filename = systemd_timer_timer_filename();
+	if (safe_create_leading_directories(filename))
+		die(_("failed to create directories for '%s'"), filename);
+	file = xfopen(filename, "w");
+	free(filename);
+
+	unit = "[Unit]\n"
+	       "Description=Optimize Git repositories data\n"
+	       "\n"
+	       "[Timer]\n"
+	       "OnCalendar=%i\n"
+	       "Persistent=true\n"
+	       "\n"
+	       "[Install]\n"
+	       "WantedBy=timers.target\n";
+	fputs(unit, file);
+	fclose(file);
+
+	filename = systemd_timer_service_filename();
+	if (safe_create_leading_directories(filename))
+		die(_("failed to create directories for '%s'"), filename);
+	file = xfopen(filename, "w");
+	free(filename);
+
+	unit = "[Unit]\n"
+	       "Description=Optimize Git repositories data\n"
+	       "\n"
+	       "[Service]\n"
+	       "Type=oneshot\n"
+	       "ExecStart=\"%1$s/git\" --exec-path=\"%1$s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"
+	       "LockPersonality=yes\n"
+	       "MemoryDenyWriteExecute=yes\n"
+	       "NoNewPrivileges=yes\n"
+	       "RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6\n"
+	       "RestrictNamespaces=yes\n"
+	       "RestrictRealtime=yes\n"
+	       "RestrictSUIDSGID=yes\n"
+	       "SystemCallArchitectures=native\n"
+	       "SystemCallFilter=@system-service\n";
+	fprintf(file, unit, exec_path);
+	fclose(file);
+
+	return 0;
+}
+
+static int systemd_timer_setup_units(const char *cmd)
+{
+	const char *exec_path = git_exec_path();
+
+	return systemd_timer_write_unit_templates(exec_path) ||
+	       systemd_timer_enable_unit(1, SCHEDULE_HOURLY, cmd) ||
+	       systemd_timer_enable_unit(1, SCHEDULE_DAILY, cmd) ||
+	       systemd_timer_enable_unit(1, SCHEDULE_WEEKLY, cmd);
+}
+
+static int systemd_timer_update_schedule(int run_maintenance, int fd,
+					 const char *cmd)
+{
+	if (run_maintenance)
+		return systemd_timer_setup_units(cmd);
+	else
+		return systemd_timer_delete_units(cmd);
+}
+
 #if defined(__APPLE__)
 static const char platform_scheduler[] = "launchctl";
 #elif defined(GIT_WINDOWS_NATIVE)
 static const char platform_scheduler[] = "schtasks";
+#elif defined(__linux__)
+static const char platform_scheduler[] = "crontab_or_systemctl";
 #else
 static const char platform_scheduler[] = "crontab";
 #endif
@@ -1986,6 +2159,15 @@ static int update_background_schedule(int enable)
 		cmd = sep + 1;
 	}
 
+	if (!strcmp(scheduler, "crontab_or_systemctl")) {
+		if (is_systemd_timer_available("systemctl"))
+			scheduler = cmd = "systemctl";
+		else if (is_crontab_available("crontab"))
+			scheduler = cmd = "crontab";
+		else
+			die(_("Neither systemd timers nor crontab are available"));
+	}
+
 	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
 		return error(_("another process is scheduling background maintenance"));
 
@@ -1995,10 +2177,14 @@ static int update_background_schedule(int enable)
 		result = schtasks_update_schedule(enable, get_lock_file_fd(&lk), cmd);
 	else if (!strcmp(scheduler, "crontab"))
 		result = crontab_update_schedule(enable, get_lock_file_fd(&lk), cmd);
+	else if (!strcmp(scheduler, "systemctl"))
+		result = systemd_timer_update_schedule(
+			enable, get_lock_file_fd(&lk), cmd);
 	else
-		die("unknown background scheduler: %s", scheduler);
+		die(_("unknown background scheduler: %s"), scheduler);
 
 	rollback_lock_file(&lk);
+	free(lock_path);
 	free(testing);
 	return result;
 }
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 2412d8c5c0..dd281789f4 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -20,6 +20,20 @@ test_xmllint () {
 	fi
 }
 
+test_lazy_prereq SYSTEMD_ANALYZE '
+	systemd-analyze --help >out &&
+	grep -w verify out
+'
+
+test_systemd_analyze_verify () {
+	if test_have_prereq SYSTEMD_ANALYZE
+	then
+		systemd-analyze verify "$@"
+	else
+		true
+	fi
+}
+
 test_expect_success 'help text' '
 	test_expect_code 129 git maintenance -h 2>err &&
 	test_i18ngrep "usage: git maintenance <subcommand>" err &&
@@ -615,6 +629,43 @@ test_expect_success 'start and stop Windows maintenance' '
 	test_cmp expect args
 '
 
+test_expect_success 'start and stop Linux/systemd maintenance' '
+	write_script print-args <<-\EOF &&
+	echo $* >>args
+	EOF
+
+	rm -f args &&
+	GIT_TEST_MAINT_SCHEDULER="systemctl:./print-args" git maintenance start &&
+
+	# start registers the repo
+	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
+
+	test_systemd_analyze_verify "$HOME/.config/systemd/user/git-maintenance@.service" &&
+
+	rm -f expect &&
+	for frequency in hourly daily weekly
+	do
+		echo "--user enable --now git-maintenance@${frequency}.timer" >>expect || return 1
+	done &&
+	test_cmp expect args &&
+
+	rm -f args &&
+	GIT_TEST_MAINT_SCHEDULER="systemctl:./print-args" git maintenance stop &&
+
+	# stop does not unregister the repo
+	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
+
+	test_path_is_missing "$HOME/.config/systemd/user/git-maintenance@.timer" &&
+	test_path_is_missing "$HOME/.config/systemd/user/git-maintenance@.service" &&
+
+	rm -f expect &&
+	for frequency in hourly daily weekly
+	do
+		echo "--user disable --now git-maintenance@${frequency}.timer" >>expect || return 1
+	done &&
+	test_cmp expect args
+'
+
 test_expect_success 'register preserves existing strategy' '
 	git config maintenance.strategy none &&
 	git maintenance register &&
-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-01 14:52 [PATCH] maintenance: use systemd timers on Linux Lénaïc Huard
@ 2021-05-01 20:02 ` brian m. carlson
  2021-05-02  5:28 ` Bagas Sanjaya
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2021-05-01 20:02 UTC (permalink / raw)
  To: Lénaïc Huard; +Cc: git, Junio C Hamano, Derrick Stolee, Eric Sunshine

[-- Attachment #1: Type: text/plain, Size: 1568 bytes --]

On 2021-05-01 at 14:52:20, Lénaïc Huard wrote:
> The existing mechanism for scheduling background maintenance is done
> through cron. On Linux systems managed by systemd, systemd provides an
> alternative to schedule recurring tasks: systemd timers.
> 
> The main motivations to implement systemd timers in addition to cron
> are:
> * cron is optional and Linux systems running systemd might not have it
>   installed.
> * The execution of `crontab -l` can tell us if cron is installed but not
>   if the daemon is actually running.
> * With systemd, each service is run in its own cgroup and its logs are
>   tagged by the service inside journald. With cron, all scheduled tasks
>   are running in the cron daemon cgroup and all the logs of the
>   user-scheduled tasks are pretended to belong to the system cron
>   service.
>   Concretely, a user that doesn’t have access to the system logs won’t
>   have access to the log of its own tasks scheduled by cron whereas he
>   will have access to the log of its own tasks scheduled by systemd
>   timer.

I would prefer to see this as a configurable option.  I have systemd
installed (because it's not really optional to have a functional desktop
on Linux) but I want to restrict it to starting and stopping services,
not performing the tasks of cron.  cron is portable across a wide
variety of systems, including Linux variants (and WSL) that don't use
systemd, and I prefer to use more standard tooling when possible.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-01 14:52 [PATCH] maintenance: use systemd timers on Linux Lénaïc Huard
  2021-05-01 20:02 ` brian m. carlson
@ 2021-05-02  5:28 ` Bagas Sanjaya
  2021-05-02  6:49   ` Eric Sunshine
  2021-05-02  6:45 ` Eric Sunshine
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 30+ messages in thread
From: Bagas Sanjaya @ 2021-05-02  5:28 UTC (permalink / raw)
  To: Lénaïc Huard, git; +Cc: Junio C Hamano, Derrick Stolee, Eric Sunshine

On 01/05/21 21.52, Lénaïc Huard wrote:
> The existing mechanism for scheduling background maintenance is done
> through cron. On Linux systems managed by systemd, systemd provides an
> alternative to schedule recurring tasks: systemd timers.
> 
> The main motivations to implement systemd timers in addition to cron
> are:
> * cron is optional and Linux systems running systemd might not have it
>    installed.

Supposed that I have Linux box with systemd and classical cron. Should
systemd timers be preferred over cron?

> * The execution of `crontab -l` can tell us if cron is installed but not
>    if the daemon is actually running.
> * With systemd, each service is run in its own cgroup and its logs are
>    tagged by the service inside journald. With cron, all scheduled tasks
>    are running in the cron daemon cgroup and all the logs of the
>    user-scheduled tasks are pretended to belong to the system cron
>    service.
>    Concretely, a user that doesn’t have access to the system logs won’t
>    have access to the log of its own tasks scheduled by cron whereas he
>    will have access to the log of its own tasks scheduled by systemd
>    timer.
> 
> In order to schedule git maintenance, we need two unit template files:
> * ~/.config/systemd/user/git-maintenance@.service
>    to define the command to be started by systemd and
> * ~/.config/systemd/user/git-maintenance@.timer
>    to define the schedule at which the command should be run.
> 
> Those units are templates that are parametrized by the frequency.
> 
> Based on those templates, 3 timers are started:
> * git-maintenance@hourly.timer
> * git-maintenance@daily.timer
> * git-maintenance@weekly.timer
> 
> The command launched by those three timers are the same than with the
> other scheduling methods:
> 
> git for-each-repo --config=maintenance.repo maintenance run
> --schedule=%i
> 
> with the full path for git to ensure that the version of git launched
> for the scheduled maintenance is the same as the one used to run
> `maintenance start`.
> 
> The timer unit contains `Persistent=true` so that, if the computer is
> powered down when a maintenance task should run, the task will be run
> when the computer is back powered on.
> 
> Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Nevertheless, because we are dealing with external dependency (systemd), it
should makes sense to enforce this dependency requirement when user choose to use
systemd timers so that users on non-systemd boxes (such as Gentoo with OpenRC)
don't see errors that forcing them to use systemd.

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-01 14:52 [PATCH] maintenance: use systemd timers on Linux Lénaïc Huard
  2021-05-01 20:02 ` brian m. carlson
  2021-05-02  5:28 ` Bagas Sanjaya
@ 2021-05-02  6:45 ` Eric Sunshine
  2021-05-02 14:10   ` Phillip Wood
  2021-05-05 12:01   ` Ævar Arnfjörð Bjarmason
  2021-05-02 11:12 ` Bagas Sanjaya
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 30+ messages in thread
From: Eric Sunshine @ 2021-05-02  6:45 UTC (permalink / raw)
  To: Lénaïc Huard
  Cc: Git List, Junio C Hamano, Derrick Stolee, brian m. carlson

On Sat, May 1, 2021 at 10:59 AM Lénaïc Huard <lenaic@lhuard.fr> wrote:
> The existing mechanism for scheduling background maintenance is done
> through cron. On Linux systems managed by systemd, systemd provides an
> alternative to schedule recurring tasks: systemd timers.

Thanks for working on this. While `cron` has been the go-to standard
for decades, `systemd` is certainly widespread enough that it makes
sense to support it, as well.

> The main motivations to implement systemd timers in addition to cron
> are:
> * cron is optional and Linux systems running systemd might not have it
>   installed.
> * The execution of `crontab -l` can tell us if cron is installed but not
>   if the daemon is actually running.
> * With systemd, each service is run in its own cgroup and its logs are
>   tagged by the service inside journald. With cron, all scheduled tasks
>   are running in the cron daemon cgroup and all the logs of the
>   user-scheduled tasks are pretended to belong to the system cron
>   service.
>   Concretely, a user that doesn’t have access to the system logs won’t
>   have access to the log of its own tasks scheduled by cron whereas he
>   will have access to the log of its own tasks scheduled by systemd
>   timer.

The last point is somewhat compelling. A potential counterargument is
that `cron` does send email to the user by default if any output is
generated by the cron job. However, it seems quite likely these days
that many systems either won't have local mail service enabled or the
user won't bother checking the local mailbox. It's a minor point, but
if you re-roll it might make sense for the commit message to expand
the last point by saying that although `cron` attempts to send email,
that email may go unseen by the user.

> In order to schedule git maintenance, we need two unit template files:
> * ~/.config/systemd/user/git-maintenance@.service
>   to define the command to be started by systemd and
> * ~/.config/systemd/user/git-maintenance@.timer
>   to define the schedule at which the command should be run.
> [...]
> The timer unit contains `Persistent=true` so that, if the computer is
> powered down when a maintenance task should run, the task will be run
> when the computer is back powered on.

It would be nice for the commit message to also give some high-level
information about how git-maintenance chooses between `cron` and
`systemd` and whether the user can influence that decision. (I know
the answer because I read the patch, but this is the sort of
information which is good to have in the commit message; readers want
to know why certain choices were made.)

Although I avoid Linux distros with `systemd`, my knee-jerk reaction,
like brian's upthread, is that there should be some escape hatch or
direct mechanism to allow the user to choose between `systemd` and
`cron`.

The patch itself is straightforward enough and nicely follows the
pattern established for already-implemented schedulers, so I don't
have a lot to say about it. I did leave a few comments below, most of
which are subjective nits and minor observations, though there are two
or three actionable items.

> Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
> ---
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> @@ -279,6 +279,55 @@ schedule to ensure you are executing the correct binaries in your
> +BACKGROUND MAINTENANCE ON LINUX SYSTEMD SYSTEMD
> +-----------------------------------------------

Is there a reason for the duplicated "SYSTEMD" that I'm missing? I
suppose you probably mean "SYSTEMD SYSTEMS".

> +In this case, `git maintenance start` will create user systemd timer units
> +and start the timers. The current list of user-scheduled tasks can be found
> +by running `systemctl --user list-timers`. The timers written by `git
> +maintenance start` are similar to this:
> +
> +-----------------------------------------------------------------------
> +$ systemctl --user list-timers
> +NEXT                         LEFT          LAST                         PASSED     UNIT                         ACTIVATES
> +Thu 2021-04-29 19:00:00 CEST 42min left    Thu 2021-04-29 18:00:11 CEST 17min ago  git-maintenance@hourly.timer git-maintenance@hourly.service
> +Fri 2021-04-30 00:00:00 CEST 5h 42min left Thu 2021-04-29 00:00:11 CEST 18h ago    git-maintenance@daily.timer  git-maintenance@daily.service
> +Mon 2021-05-03 00:00:00 CEST 3 days left   Mon 2021-04-26 00:00:11 CEST 3 days ago git-maintenance@weekly.timer git-maintenance@weekly.service
> +
> +3 timers listed.
> +Pass --all to see loaded but inactive timers, too.
> +-----------------------------------------------------------------------

I suspect that the "3 timers listed" and "Pass --all" lines don't add
value and can be dropped without hurting the example.

> +`git maintenance start` will overwrite these files and start the timer
> +again with `systemctl --user`, so any customization should be done by
> +creating a drop-in file
> +`~/.config/systemd/user/git-maintenance@.service.d/*.conf`.

Will `systemd` users generally understand what filename to create in
the "...@.service.d/" directory, and will they know what to populate
the file with? (Genuine question; I've never dealt with that.)

> diff --git a/builtin/gc.c b/builtin/gc.c
> @@ -1872,6 +1872,25 @@ static int schtasks_update_schedule(int run_maintenance, int fd, const char *cmd
> +static int is_crontab_available(const char *cmd)
> +{
> +       struct child_process child = CHILD_PROCESS_INIT;
> +
> +       strvec_split(&child.args, cmd);
> +       strvec_push(&child.args, "-l");
> +       child.no_stdin = 1;
> +       child.no_stdout = 1;
> +       child.no_stderr = 1;
> +       child.silent_exec_failure = 1;
> +
> +       if (start_command(&child))
> +               return 0;
> +       /* Ignore exit code, as an empty crontab will return error. */
> +       finish_command(&child);
> +
> +       return 1;
> +}

Ignoring the error from `crontab -l` is an already-established idiom
in this file. Okay.

Nit: There doesn't seem to be a need for the blank line before `return
1`, and other maintenance-related functions don't have such a blank
line. The same comment about blank lines before `return` applies to
other newly-added functions, as well. But it's subjective, and not
necessarily worth changing.

> +static char *systemd_timer_timer_filename()
> +{
> +       const char *filename = "~/.config/systemd/user/git-maintenance@.timer";
> +       char *expanded = expand_user_path(filename, 0);
> +       if (!expanded)
> +               die(_("failed to expand path '%s'"), filename);
> +
> +       return expanded;
> +}

I was curious whether this would fail if `.config/systemd/user/`
didn't already exist, but looking at the implementation of
expand_user_path() , I see that it doesn't require the path to already
exist if you pass 0 for the second argument as you do here. Okay.

> +static char *systemd_timer_service_filename()
> +{
> +       const char *filename =
> +               "~/.config/systemd/user/git-maintenance@.service";
> +       char *expanded = expand_user_path(filename, 0);
> +       if (!expanded)
> +               die(_("failed to expand path '%s'"), filename);
> +
> +       return expanded;
> +}

The duplication of code between systemd_timer_timer_filename() and
systemd_timer_service_filename() is probably too minor to worry about.
Okay.

> +static int systemd_timer_enable_unit(int enable,
> +                                    enum schedule_priority schedule,
> +                                    const char *cmd)
> +{
> +       struct child_process child = CHILD_PROCESS_INIT;
> +       const char *frequency = get_frequency(schedule);
> +
> +       strvec_split(&child.args, cmd);
> +       strvec_push(&child.args, "--user");
> +       if (enable)
> +               strvec_push(&child.args, "enable");
> +       else
> +               strvec_push(&child.args, "disable");

It's subjective, but this might be more nicely expressed as:

    strvec_push(&child.args, enable ? "enable" : "disable");

> +       strvec_push(&child.args, "--now");
> +       strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
> +
> +       if (start_command(&child))
> +               die(_("failed to run systemctl"));
> +       return finish_command(&child);
> +}
> +static int systemd_timer_write_unit_templates(const char *exec_path)
> +{
> +       unit = "[Unit]\n"
> +              "Description=Optimize Git repositories data\n"
> +              "\n"
> +              "[Service]\n"
> +              "Type=oneshot\n"
> +              "ExecStart=\"%1$s/git\" --exec-path=\"%1$s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"

I see that it's in POSIX, but do we use this `%n$s` directive
elsewhere in the Git source code? If not, I'd be cautious of
introducing it here. Maybe it's better to just use plain `%s` twice...

> +              "LockPersonality=yes\n"
> +              "MemoryDenyWriteExecute=yes\n"
> +              "NoNewPrivileges=yes\n"
> +              "RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6\n"
> +              "RestrictNamespaces=yes\n"
> +              "RestrictRealtime=yes\n"
> +              "RestrictSUIDSGID=yes\n"
> +              "SystemCallArchitectures=native\n"
> +              "SystemCallFilter=@system-service\n";
> +       fprintf(file, unit, exec_path);

... and then:

    fprintf(file, unit, exec_path, exec_path);

> +       fclose(file);
> +
> +       return 0;
> +}
> @@ -1986,6 +2159,15 @@ static int update_background_schedule(int enable)
> +       if (!strcmp(scheduler, "crontab_or_systemctl")) {
> +               if (is_systemd_timer_available("systemctl"))
> +                       scheduler = cmd = "systemctl";
> +               else if (is_crontab_available("crontab"))
> +                       scheduler = cmd = "crontab";
> +               else
> +                       die(_("Neither systemd timers nor crontab are available"));
> +       }

Other messages emitted by git-maintenance are entirely lowercase, so
downcasing "Neither" would be appropriate.

> @@ -1995,10 +2177,14 @@ static int update_background_schedule(int enable)
> -               die("unknown background scheduler: %s", scheduler);
> +               die(_("unknown background scheduler: %s"), scheduler);

This change is unrelated to the rest of the patch. Normally, such a
"fix" would be made as a separate patch. This one is somewhat minor,
so perhaps it doesn't matter whether it's in this patch...

>         rollback_lock_file(&lk);
> +       free(lock_path);
>         free(testing);
>         return result;

... however, this leak fix probably deserves its own patch. Or, at the
very least, mention these two fixes in this commit message.

> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> @@ -20,6 +20,20 @@ test_xmllint () {
> +test_lazy_prereq SYSTEMD_ANALYZE '
> +       systemd-analyze --help >out &&
> +       grep -w verify out
> +'

Unportable use of `grep -w`. It's neither in POSIX nor understood by
BSD-lineage `grep` (including macOS `grep`).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-02  5:28 ` Bagas Sanjaya
@ 2021-05-02  6:49   ` Eric Sunshine
  0 siblings, 0 replies; 30+ messages in thread
From: Eric Sunshine @ 2021-05-02  6:49 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Lénaïc Huard, Git List, Junio C Hamano, Derrick Stolee

On Sun, May 2, 2021 at 1:28 AM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> On 01/05/21 21.52, Lénaïc Huard wrote:
> > The existing mechanism for scheduling background maintenance is done
> > through cron. On Linux systems managed by systemd, systemd provides an
> > alternative to schedule recurring tasks: systemd timers.
> >
> > The main motivations to implement systemd timers in addition to cron
> > are:
> > * cron is optional and Linux systems running systemd might not have it
> >    installed.
>
> Supposed that I have Linux box with systemd and classical cron. Should
> systemd timers be preferred over cron?

The implementation in this patch unconditionally prefers `systemd`
over `cron`. Whether that's a good idea is subject to question (as
both brian and I mentioned in our reviews).

> Nevertheless, because we are dealing with external dependency (systemd), it
> should makes sense to enforce this dependency requirement when user choose to use
> systemd timers so that users on non-systemd boxes (such as Gentoo with OpenRC)
> don't see errors that forcing them to use systemd.

If you scan through the patch itself, you will find that it is careful
to choose the appropriate scheduler and not to spit out errors when
one or the other scheduler is unavailable.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-01 14:52 [PATCH] maintenance: use systemd timers on Linux Lénaïc Huard
                   ` (2 preceding siblings ...)
  2021-05-02  6:45 ` Eric Sunshine
@ 2021-05-02 11:12 ` Bagas Sanjaya
  2021-05-03 12:04 ` Derrick Stolee
  2021-05-09 21:32 ` [PATCH v2 0/1] " Lénaïc Huard
  5 siblings, 0 replies; 30+ messages in thread
From: Bagas Sanjaya @ 2021-05-02 11:12 UTC (permalink / raw)
  To: Lénaïc Huard, git; +Cc: Junio C Hamano, Derrick Stolee, Eric Sunshine

On 01/05/21 21.52, Lénaïc Huard wrote:
> +BACKGROUND MAINTENANCE ON LINUX SYSTEMD SYSTEMD
> +-----------------------------------------------
> +

systemd was repeated twice above. It should be
`BACKGROUND MAINTENANCE WITH SYSTEMD TIMERS`

>   test_expect_success 'help text' '
>   	test_expect_code 129 git maintenance -h 2>err &&

Why exit code 129 there?

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-02  6:45 ` Eric Sunshine
@ 2021-05-02 14:10   ` Phillip Wood
  2021-05-05 12:19     ` Đoàn Trần Công Danh
  2021-05-05 12:01   ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 30+ messages in thread
From: Phillip Wood @ 2021-05-02 14:10 UTC (permalink / raw)
  To: Eric Sunshine, Lénaïc Huard
  Cc: Git List, Junio C Hamano, Derrick Stolee, brian m. carlson

On 02/05/2021 07:45, Eric Sunshine wrote:
> On Sat, May 1, 2021 at 10:59 AM Lénaïc Huard <lenaic@lhuard.fr> wrote:
>> The existing mechanism for scheduling background maintenance is done
>> through cron. On Linux systems managed by systemd, systemd provides an
>> alternative to schedule recurring tasks: systemd timers.
> 
> Thanks for working on this. While `cron` has been the go-to standard
> for decades, `systemd` is certainly widespread enough that it makes
> sense to support it, as well.

Yes, thank you for working on this, it will be very useful to users like 
me who use a linux distribution that does not install a cron daemon by 
default but relies on systemd instead.

>> The main motivations to implement systemd timers in addition to cron
>> are:
>> * cron is optional and Linux systems running systemd might not have it
>>    installed.
>> * The execution of `crontab -l` can tell us if cron is installed but not
>>    if the daemon is actually running.

Can we use systemctl to see if it is running (and enabled so we know it 
will be restarted after a reboot)?

>> * With systemd, each service is run in its own cgroup and its logs are
>>    tagged by the service inside journald. With cron, all scheduled tasks
>>    are running in the cron daemon cgroup and all the logs of the
>>    user-scheduled tasks are pretended to belong to the system cron
>>    service.
>>    Concretely, a user that doesn’t have access to the system logs won’t
>>    have access to the log of its own tasks scheduled by cron whereas he
>>    will have access to the log of its own tasks scheduled by systemd
>>    timer.
> 
> The last point is somewhat compelling. A potential counterargument is
> that `cron` does send email to the user by default if any output is
> generated by the cron job. However, it seems quite likely these days
> that many systems either won't have local mail service enabled or the
> user won't bother checking the local mailbox. It's a minor point, but
> if you re-roll it might make sense for the commit message to expand
> the last point by saying that although `cron` attempts to send email,
> that email may go unseen by the user.
> 
>> In order to schedule git maintenance, we need two unit template files:
>> * ~/.config/systemd/user/git-maintenance@.service
>>    to define the command to be started by systemd and
>> * ~/.config/systemd/user/git-maintenance@.timer
>>    to define the schedule at which the command should be run.
>> [...]
>> The timer unit contains `Persistent=true` so that, if the computer is
>> powered down when a maintenance task should run, the task will be run
>> when the computer is back powered on.
> 
> It would be nice for the commit message to also give some high-level
> information about how git-maintenance chooses between `cron` and
> `systemd` and whether the user can influence that decision. (I know
> the answer because I read the patch, but this is the sort of
> information which is good to have in the commit message; readers want
> to know why certain choices were made.)
> 
> Although I avoid Linux distros with `systemd`, my knee-jerk reaction,
> like brian's upthread, is that there should be some escape hatch or
> direct mechanism to allow the user to choose between `systemd` and
> `cron`.

I agree that if both are present the user should be able to choose one. 
I'm not sure what the default should be in that case - before I read the 
commit message and Eric's comments I was inclined to say that if the 
user has cron installed we should take that as a sign they preferred 
cron over systemd timers and use that. However, given the arguments 
above about not knowing if cron is running and the user not necessarily 
getting emails from cron or being able to read the logs than maybe 
defaulting to systemd timers makes sense.

> The patch itself is straightforward enough and nicely follows the
> pattern established for already-implemented schedulers, so I don't
> have a lot to say about it. I did leave a few comments below, most of
> which are subjective nits and minor observations, though there are two
> or three actionable items.
> 
>> Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
>> ---
>> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
>> @@ -279,6 +279,55 @@ schedule to ensure you are executing the correct binaries in your
>> +BACKGROUND MAINTENANCE ON LINUX SYSTEMD SYSTEMD
>> +-----------------------------------------------
> 
> Is there a reason for the duplicated "SYSTEMD" that I'm missing? I
> suppose you probably mean "SYSTEMD SYSTEMS".
> 
>> +In this case, `git maintenance start` will create user systemd timer units
>> +and start the timers. The current list of user-scheduled tasks can be found
>> +by running `systemctl --user list-timers`. The timers written by `git
>> +maintenance start` are similar to this:
>> +
>> +-----------------------------------------------------------------------
>> +$ systemctl --user list-timers
>> +NEXT                         LEFT          LAST                         PASSED     UNIT                         ACTIVATES
>> +Thu 2021-04-29 19:00:00 CEST 42min left    Thu 2021-04-29 18:00:11 CEST 17min ago  git-maintenance@hourly.timer git-maintenance@hourly.service
>> +Fri 2021-04-30 00:00:00 CEST 5h 42min left Thu 2021-04-29 00:00:11 CEST 18h ago    git-maintenance@daily.timer  git-maintenance@daily.service
>> +Mon 2021-05-03 00:00:00 CEST 3 days left   Mon 2021-04-26 00:00:11 CEST 3 days ago git-maintenance@weekly.timer git-maintenance@weekly.service
>> +
>> +3 timers listed.
>> +Pass --all to see loaded but inactive timers, too.
>> +-----------------------------------------------------------------------
> 
> I suspect that the "3 timers listed" and "Pass --all" lines don't add
> value and can be dropped without hurting the example.

If the idea is to show the output of `systemctl --user list-timers` then 
I don't think we should be editing it. I also think having the column 
headers helps as it shows what the fields are.

>> +`git maintenance start` will overwrite these files and start the timer
>> +again with `systemctl --user`, so any customization should be done by
>> +creating a drop-in file
>> +`~/.config/systemd/user/git-maintenance@.service.d/*.conf`.
> 
> Will `systemd` users generally understand what filename to create in
> the "...@.service.d/" directory, and will they know what to populate
> the file with? (Genuine question; I've never dealt with that.)

I think it would be helpful to explicitly mention the file names (I 
don't think I could tell you what they are without reading the relevant 
systemd man page)

>> diff --git a/builtin/gc.c b/builtin/gc.c
>> @@ -1872,6 +1872,25 @@ static int schtasks_update_schedule(int run_maintenance, int fd, const char *cmd
>> +static int is_crontab_available(const char *cmd)
>> +{
>> +       struct child_process child = CHILD_PROCESS_INIT;
>> +
>> +       strvec_split(&child.args, cmd);
>> +       strvec_push(&child.args, "-l");
>> +       child.no_stdin = 1;
>> +       child.no_stdout = 1;
>> +       child.no_stderr = 1;
>> +       child.silent_exec_failure = 1;
>> +
>> +       if (start_command(&child))
>> +               return 0;
>> +       /* Ignore exit code, as an empty crontab will return error. */
>> +       finish_command(&child);
>> +
>> +       return 1;
>> +}
> 
> Ignoring the error from `crontab -l` is an already-established idiom
> in this file. Okay.
> 
> Nit: There doesn't seem to be a need for the blank line before `return
> 1`, and other maintenance-related functions don't have such a blank
> line. The same comment about blank lines before `return` applies to
> other newly-added functions, as well. But it's subjective, and not
> necessarily worth changing.
> 
>> +static char *systemd_timer_timer_filename()
>> +{
>> +       const char *filename = "~/.config/systemd/user/git-maintenance@.timer";
>> +       char *expanded = expand_user_path(filename, 0);
>> +       if (!expanded)
>> +               die(_("failed to expand path '%s'"), filename);
>> +
>> +       return expanded;
>> +}
> 
> I was curious whether this would fail if `.config/systemd/user/`
> didn't already exist, but looking at the implementation of
> expand_user_path() , I see that it doesn't require the path to already
> exist if you pass 0 for the second argument as you do here. Okay.

Do we need to worry about $XDG_CONFIG_HOME rather than hard coding 
"~/.config/". There is a function xdg_config_home() that takes care of this.

Thanks again for working on this

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-01 14:52 [PATCH] maintenance: use systemd timers on Linux Lénaïc Huard
                   ` (3 preceding siblings ...)
  2021-05-02 11:12 ` Bagas Sanjaya
@ 2021-05-03 12:04 ` Derrick Stolee
  2021-05-09 21:32 ` [PATCH v2 0/1] " Lénaïc Huard
  5 siblings, 0 replies; 30+ messages in thread
From: Derrick Stolee @ 2021-05-03 12:04 UTC (permalink / raw)
  To: Lénaïc Huard, git; +Cc: Junio C Hamano, Derrick Stolee, Eric Sunshine

On 5/1/2021 10:52 AM, Lénaïc Huard wrote:
> The existing mechanism for scheduling background maintenance is done
> through cron. On Linux systems managed by systemd, systemd provides an
> alternative to schedule recurring tasks: systemd timers.
> 
> The main motivations to implement systemd timers in addition to cron
> are:
...

Thank you for working on this. Users have questioned "why cron?"
since the release of background maintenance, so I appreciate you
taking the time to port this feature to systemd.

I won't do a deep code review here since that seems to already be
covered, and a v2 seems required. Ensuring that users can choose
which of the two backends is a good idea. We might even want to
start with 'cron' as the default and 'systemd' as an opt-in.

The other concern I wanted to discuss was the upgrade scenario.
If users have already enabled background maintenance with the
cron backend, how can we help users disable the cron backend
before they upgrade to the systemd version? I imagine that we
should disable cron when enabling systemd, using
crontab_update_schedule() with run_maintenance given as 0.
We might want to enable quiet errors in that method for the
case that cron does not exist.

It is important to make it clear that we only accept one
scheduler at a time, since they would be competing to run
'git for-each-repo' over the same list of repos. A single user
cannot schedule some repositories in 'cron' and another set in
'systemd'. This seems like an acceptable technical limitation.

Thanks, and I look forward to your v2!
-Stolee

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-02  6:45 ` Eric Sunshine
  2021-05-02 14:10   ` Phillip Wood
@ 2021-05-05 12:01   ` Ævar Arnfjörð Bjarmason
  2021-05-09 22:34     ` Lénaïc Huard
  1 sibling, 1 reply; 30+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-05-05 12:01 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Lénaïc Huard, Git List, Junio C Hamano, Derrick Stolee,
	brian m. carlson


On Sun, May 02 2021, Eric Sunshine wrote:

> On Sat, May 1, 2021 at 10:59 AM Lénaïc Huard <lenaic@lhuard.fr> wrote:
>> +       strvec_push(&child.args, "--now");
>> +       strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
>> +
>> +       if (start_command(&child))
>> +               die(_("failed to run systemctl"));
>> +       return finish_command(&child);
>> +}
>> +static int systemd_timer_write_unit_templates(const char *exec_path)
>> +{
>> +       unit = "[Unit]\n"
>> +              "Description=Optimize Git repositories data\n"
>> +              "\n"
>> +              "[Service]\n"
>> +              "Type=oneshot\n"
>> +              "ExecStart=\"%1$s/git\" --exec-path=\"%1$s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"
>
> I see that it's in POSIX, but do we use this `%n$s` directive
> elsewhere in the Git source code? If not, I'd be cautious of
> introducing it here. Maybe it's better to just use plain `%s` twice...

We use it in po/, so for sprintf() on systems that don't have
NO_GETTEXT=Y we already test it in the wild.

But no, I don't think anything in the main source uses it, FWIW I have a
WIP series in my own fork that I've cooked for a while that uses this, I
haven't run into any issues with it in either GitHub's CI
(e.g. Windows), or on the systems I myself test on.

I think it would be a useful canary to just take a change like this, we
can always change it to the form you suggest if it doesn't work out.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-02 14:10   ` Phillip Wood
@ 2021-05-05 12:19     ` Đoàn Trần Công Danh
  2021-05-05 14:57       ` Phillip Wood
  0 siblings, 1 reply; 30+ messages in thread
From: Đoàn Trần Công Danh @ 2021-05-05 12:19 UTC (permalink / raw)
  To: phillip.wood
  Cc: Eric Sunshine, Lénaïc Huard, Git List, Junio C Hamano,
	Derrick Stolee, brian m. carlson

On 2021-05-02 15:10:05+0100, Phillip Wood <phillip.wood123@gmail.com> wrote:
> On 02/05/2021 07:45, Eric Sunshine wrote:
> > On Sat, May 1, 2021 at 10:59 AM Lénaïc Huard <lenaic@lhuard.fr> wrote:
> > > The existing mechanism for scheduling background maintenance is done
> > > through cron. On Linux systems managed by systemd, systemd provides an
> > > alternative to schedule recurring tasks: systemd timers.
> > 
> > Thanks for working on this. While `cron` has been the go-to standard
> > for decades, `systemd` is certainly widespread enough that it makes
> > sense to support it, as well.
> 
> Yes, thank you for working on this, it will be very useful to users like me
> who use a linux distribution that does not install a cron daemon by default
> but relies on systemd instead.
> 
> > > The main motivations to implement systemd timers in addition to cron
> > > are:
> > > * cron is optional and Linux systems running systemd might not have it
> > >    installed.
> > > * The execution of `crontab -l` can tell us if cron is installed but not
> > >    if the daemon is actually running.
> 
> Can we use systemctl to see if it is running (and enabled so we know it will
> be restarted after a reboot)?

Not sure if I understand this suggestion.
However, non-systemd systems doesn't have systemctl command to begin
with.

> > > * With systemd, each service is run in its own cgroup and its logs are
> > >    tagged by the service inside journald. With cron, all scheduled tasks
> > >    are running in the cron daemon cgroup and all the logs of the
> > >    user-scheduled tasks are pretended to belong to the system cron
> > >    service.
> > >    Concretely, a user that doesn’t have access to the system logs won’t
> > >    have access to the log of its own tasks scheduled by cron whereas he
> > >    will have access to the log of its own tasks scheduled by systemd
> > >    timer.
> > 
> > The last point is somewhat compelling. A potential counterargument is
> > that `cron` does send email to the user by default if any output is
> > generated by the cron job. However, it seems quite likely these days
> > that many systems either won't have local mail service enabled or the
> > user won't bother checking the local mailbox. It's a minor point, but
> > if you re-roll it might make sense for the commit message to expand
> > the last point by saying that although `cron` attempts to send email,
> > that email may go unseen by the user.
> > 
> > > In order to schedule git maintenance, we need two unit template files:
> > > * ~/.config/systemd/user/git-maintenance@.service
> > >    to define the command to be started by systemd and
> > > * ~/.config/systemd/user/git-maintenance@.timer
> > >    to define the schedule at which the command should be run.

I think it would be better to change ~/.config here to
$XDG_CONFIG_HOME, as others also points out in another comments.

[..snip..]

> > > +`git maintenance start` will overwrite these files and start the timer
> > > +again with `systemctl --user`, so any customization should be done by
> > > +creating a drop-in file
> > > +`~/.config/systemd/user/git-maintenance@.service.d/*.conf`.

Ditto.

> > Will `systemd` users generally understand what filename to create in
> > the "...@.service.d/" directory, and will they know what to populate
> > the file with? (Genuine question; I've never dealt with that.)
> 
> I think it would be helpful to explicitly mention the file names (I don't
> think I could tell you what they are without reading the relevant systemd
> man page)

[..snip..]

> > > +static char *systemd_timer_timer_filename()
> > > +{
> > > +       const char *filename = "~/.config/systemd/user/git-maintenance@.timer";
> > > +       char *expanded = expand_user_path(filename, 0);
> > > +       if (!expanded)
> > > +               die(_("failed to expand path '%s'"), filename);
> > > +
> > > +       return expanded;
> > > +}
> > 
> > I was curious whether this would fail if `.config/systemd/user/`
> > didn't already exist, but looking at the implementation of
> > expand_user_path() , I see that it doesn't require the path to already
> > exist if you pass 0 for the second argument as you do here. Okay.
> 
> Do we need to worry about $XDG_CONFIG_HOME rather than hard coding
> "~/.config/". There is a function xdg_config_home() that takes care of this.

-- 
Danh

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-05 12:19     ` Đoàn Trần Công Danh
@ 2021-05-05 14:57       ` Phillip Wood
  0 siblings, 0 replies; 30+ messages in thread
From: Phillip Wood @ 2021-05-05 14:57 UTC (permalink / raw)
  To: Đoàn Trần Công Danh, phillip.wood
  Cc: Eric Sunshine, Lénaïc Huard, Git List, Junio C Hamano,
	Derrick Stolee, brian m. carlson

Hi Đoàn

On 05/05/2021 13:19, Đoàn Trần Công Danh wrote:
> On 2021-05-02 15:10:05+0100, Phillip Wood <phillip.wood123@gmail.com> wrote:
>> On 02/05/2021 07:45, Eric Sunshine wrote:
>>> On Sat, May 1, 2021 at 10:59 AM Lénaïc Huard <lenaic@lhuard.fr> wrote:
>>>> The existing mechanism for scheduling background maintenance is done
>>>> through cron. On Linux systems managed by systemd, systemd provides an
>>>> alternative to schedule recurring tasks: systemd timers.
>>>
>>> Thanks for working on this. While `cron` has been the go-to standard
>>> for decades, `systemd` is certainly widespread enough that it makes
>>> sense to support it, as well.
>>
>> Yes, thank you for working on this, it will be very useful to users like me
>> who use a linux distribution that does not install a cron daemon by default
>> but relies on systemd instead.
>>
>>>> The main motivations to implement systemd timers in addition to cron
>>>> are:
>>>> * cron is optional and Linux systems running systemd might not have it
>>>>     installed.
>>>> * The execution of `crontab -l` can tell us if cron is installed but not
>>>>     if the daemon is actually running.
>>
>> Can we use systemctl to see if it is running (and enabled so we know it will
>> be restarted after a reboot)?
> 
> Not sure if I understand this suggestion.
> However, non-systemd systems doesn't have systemctl command to begin
> with.

I was wondering if on systems with both cron and systemd installed we 
could use systemctl to determine if crond is actually running as Lénaïc 
pointed out that being able to run `crontab -l` does not tell us if 
crond is running.

Best Wishes

Phillip

>>>> * With systemd, each service is run in its own cgroup and its logs are
>>>>     tagged by the service inside journald. With cron, all scheduled tasks
>>>>     are running in the cron daemon cgroup and all the logs of the
>>>>     user-scheduled tasks are pretended to belong to the system cron
>>>>     service.
>>>>     Concretely, a user that doesn’t have access to the system logs won’t
>>>>     have access to the log of its own tasks scheduled by cron whereas he
>>>>     will have access to the log of its own tasks scheduled by systemd
>>>>     timer.
>>>
>>> The last point is somewhat compelling. A potential counterargument is
>>> that `cron` does send email to the user by default if any output is
>>> generated by the cron job. However, it seems quite likely these days
>>> that many systems either won't have local mail service enabled or the
>>> user won't bother checking the local mailbox. It's a minor point, but
>>> if you re-roll it might make sense for the commit message to expand
>>> the last point by saying that although `cron` attempts to send email,
>>> that email may go unseen by the user.
>>>
>>>> In order to schedule git maintenance, we need two unit template files:
>>>> * ~/.config/systemd/user/git-maintenance@.service
>>>>     to define the command to be started by systemd and
>>>> * ~/.config/systemd/user/git-maintenance@.timer
>>>>     to define the schedule at which the command should be run.
> 
> I think it would be better to change ~/.config here to
> $XDG_CONFIG_HOME, as others also points out in another comments.
> 
> [..snip..]
> 
>>>> +`git maintenance start` will overwrite these files and start the timer
>>>> +again with `systemctl --user`, so any customization should be done by
>>>> +creating a drop-in file
>>>> +`~/.config/systemd/user/git-maintenance@.service.d/*.conf`.
> 
> Ditto.
> 
>>> Will `systemd` users generally understand what filename to create in
>>> the "...@.service.d/" directory, and will they know what to populate
>>> the file with? (Genuine question; I've never dealt with that.)
>>
>> I think it would be helpful to explicitly mention the file names (I don't
>> think I could tell you what they are without reading the relevant systemd
>> man page)
> 
> [..snip..]
> 
>>>> +static char *systemd_timer_timer_filename()
>>>> +{
>>>> +       const char *filename = "~/.config/systemd/user/git-maintenance@.timer";
>>>> +       char *expanded = expand_user_path(filename, 0);
>>>> +       if (!expanded)
>>>> +               die(_("failed to expand path '%s'"), filename);
>>>> +
>>>> +       return expanded;
>>>> +}
>>>
>>> I was curious whether this would fail if `.config/systemd/user/`
>>> didn't already exist, but looking at the implementation of
>>> expand_user_path() , I see that it doesn't require the path to already
>>> exist if you pass 0 for the second argument as you do here. Okay.
>>
>> Do we need to worry about $XDG_CONFIG_HOME rather than hard coding
>> "~/.config/". There is a function xdg_config_home() that takes care of this.
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 0/1] maintenance: use systemd timers on Linux
  2021-05-01 14:52 [PATCH] maintenance: use systemd timers on Linux Lénaïc Huard
                   ` (4 preceding siblings ...)
  2021-05-03 12:04 ` Derrick Stolee
@ 2021-05-09 21:32 ` Lénaïc Huard
  2021-05-09 21:32   ` [PATCH v2 1/1] " Lénaïc Huard
  5 siblings, 1 reply; 30+ messages in thread
From: Lénaïc Huard @ 2021-05-09 21:32 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Eric Sunshine, brian m . carlson,
	Bagas Sanjaya, Phillip Wood,
	Đoàn Trần Công Danh,
	Ævar Arnfjörð Bjarmason, Lénaïc Huard

Hello,

Thank you all for your valuable feedback!
I tried to address all the discussed points in this new version of the
patch.
Do not hesitate to let me know if I forgot anything.

The main new thing in this version is the `--scheduler=<scheduler>`
parameter that has been added to `git maintenance start` command. It
allows the end user to choose between `cron` or user systemd timers
for scheduling git maintenance tasks.

I also addressed the migration problematic during an upgrade.
If a user invokes `git maintenance start --scheduler=systemd-timer`
first, and then invokes `git maintenance start --scheduler=cron`
without invoking `git maintenance stop` in between, the git
maintenance tasks will be removed from `systemd-timer` to be sure that
the same tasks won’t be scheduled concurrently twice by both
`systemd-timer` and `cron`. And the same in the other way round.

On its side, `git maintenance stop` don’t have any
`--scheduler=<scheduler>` parameter as it will try to remove the git
maintenance tasks from all the schedulers available on the system.

The default scheduler when `--scheduler=<scheduler>` isn’t specified
is `auto` which means “choose an appropriate scheduler”.

On Windows and MacOS, it always chooses the specific scheduler on
those platforms, `schtasks` and `launchctl`.

On Linux, it chooses user systemd timers if they are available and
`cron` otherwise.
This order has been a subject of discussion so, let me explain why I
chose this order.

On my system, I uses systemd and all the packages of my distribution
that are needing regular scheduled tasks are defining systemd timers
instead of cron task.
So, I don’t use crontab anymore. However, a cron package is installed
because it is an indirect dependency of a package I installed through
my Linux distribution package manager. But the cron daemon isn’t
started.

`systemctl --user list-timers` is the CLI command that actually talks
to the daemon in charge of scheduling timers. So, if `systemctl --user
list-timers` succeed, we can be sure that user systemd timers are
functional.
Concretely, the `is_systemd_timer_available` function of this patch is
reliable.

`crontab`, on the other hand, only reads and writes to
`/var/spool/cron/$USER` but this command can work also if the `cron`
daemon isn’t running. In this case, the scheduled tasks will never
run.
Concretely, the `is_crontab_available` function of this patch is less
reliable.

We would need to check if the `cron` daemon is really running.
Relying on `systemctl` to check if the service is enabled and running
isn’t ideal since we want to support systems that don’t have systemd.
Parsing directly `/proc` is challenging since its content is OS
specific. `/proc` content on a Solaris system is different than its
content on a Linux system.
We could rely on `ps` but we must keep in mind its interface is very
different from one system to another. It doesn’t implement both the
standard syntax and the BSD syntax for its arguments on all platforms
for example.

Moreover, Linux distributions are proposing several different
implementations of `cron`: `cronie`, `fcron`, `dcron`, `vixie-cron`,
`scron`, `bcron`.

See:
* https://wiki.archlinux.org/title/Cron#Installation
* https://wiki.gentoo.org/wiki/Cron#Which_cron_is_right_for_the_job.3F

Depending on the `cron` implementation, the name of the `cron` daemon
process might differ.
So, reliably detecting if a `cron` daemon is running may require to
review each `cron` implementation.

With this new version of the patch, advanced users that care about
systemd timers versus `cron` can explicitly choose which one they want to
use.
For less advanced users that don’t care, I prefer to choose the method
which has the higher probability of working.

But since this order is a subject of debate, what I can propose is a
`./configure` compile time option to select which `cron` or systemd
timers should be chosen in priority if both are available.



In addition to this big change, this new version of the patch also honors
the `XDG_CONFIG_HOME` environment variable and removes the code
duplication between `systemd_timer_timer_filename()` and
`systemd_timer_service_filename()`.

It also fixes other points raised in the code review.

Lénaïc Huard (1):
  maintenance: use systemd timers on Linux

 Documentation/git-maintenance.txt |  60 +++++
 builtin/gc.c                      | 375 ++++++++++++++++++++++++++++--
 t/t7900-maintenance.sh            |  51 ++++
 3 files changed, 462 insertions(+), 24 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-09 21:32 ` [PATCH v2 0/1] " Lénaïc Huard
@ 2021-05-09 21:32   ` Lénaïc Huard
  2021-05-10  1:20     ` Đoàn Trần Công Danh
                       ` (4 more replies)
  0 siblings, 5 replies; 30+ messages in thread
From: Lénaïc Huard @ 2021-05-09 21:32 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Eric Sunshine, brian m . carlson,
	Bagas Sanjaya, Phillip Wood,
	Đoàn Trần Công Danh,
	Ævar Arnfjörð Bjarmason, Lénaïc Huard

The existing mechanism for scheduling background maintenance is done
through cron. On Linux systems managed by systemd, systemd provides an
alternative to schedule recurring tasks: systemd timers.

The main motivations to implement systemd timers in addition to cron
are:
* cron is optional and Linux systems running systemd might not have it
  installed.
* The execution of `crontab -l` can tell us if cron is installed but not
  if the daemon is actually running.
* With systemd, each service is run in its own cgroup and its logs are
  tagged by the service inside journald. With cron, all scheduled tasks
  are running in the cron daemon cgroup and all the logs of the
  user-scheduled tasks are pretended to belong to the system cron
  service.
  Concretely, a user that doesn’t have access to the system logs won’t
  have access to the log of its own tasks scheduled by cron whereas he
  will have access to the log of its own tasks scheduled by systemd
  timer.
  Although `cron` attempts to send email, that email may go unseen by
  the user because these days, local mailboxes are not heavily used
  anymore.

In order to choose which scheduler to use between `cron` and user
systemd timers, a new option
`--scheduler=auto|cron|systemd|launchctl|schtasks` has been added to
`git maintenance start`.
When `git maintenance start --scheduler=XXX` is run, it not only
registers `git maintenance run` tasks in the scheduler XXX, it also
removes the `git maintenance run` tasks from all the other schedulers to
ensure we cannot have two schedulers launching concurrent identical
tasks.

The default value is `auto` which chooses a suitable scheduler for the
system.
On Linux, if user systemd timers are available, they will be used as git
maintenance scheduler. If not, `cron` will be used if it is available.
If none is available, it will fail.

`git maintenance stop` doesn't have any `--scheduler` parameter because
this command will try to remove the `git maintenance run` tasks from all
the available schedulers.

In order to schedule git maintenance, we need two unit template files:
* ~/.config/systemd/user/git-maintenance@.service
  to define the command to be started by systemd and
* ~/.config/systemd/user/git-maintenance@.timer
  to define the schedule at which the command should be run.

Those units are templates that are parametrized by the frequency.

Based on those templates, 3 timers are started:
* git-maintenance@hourly.timer
* git-maintenance@daily.timer
* git-maintenance@weekly.timer

The command launched by those three timers are the same than with the
other scheduling methods:

git for-each-repo --config=maintenance.repo maintenance run
--schedule=%i

with the full path for git to ensure that the version of git launched
for the scheduled maintenance is the same as the one used to run
`maintenance start`.

The timer unit contains `Persistent=true` so that, if the computer is
powered down when a maintenance task should run, the task will be run
when the computer is back powered on.

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
---
 Documentation/git-maintenance.txt |  60 +++++
 builtin/gc.c                      | 375 ++++++++++++++++++++++++++++--
 t/t7900-maintenance.sh            |  51 ++++
 3 files changed, 462 insertions(+), 24 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 80ddd33ceb..f012923333 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -181,6 +181,20 @@ OPTIONS
 	`maintenance.<task>.enabled` configured as `true` are considered.
 	See the 'TASKS' section for the list of accepted `<task>` values.
 
+--scheduler=auto|crontab|systemd-timer|launchctl|schtasks::
+	When combined with the `start` subcommand, specify the scheduler
+	to use to run the hourly, daily and weekly executions of
+	`git maintenance run`.
+	The possible values for `<scheduler>` depend on the system: `crontab`
+	is available on POSIX systems, `systemd-timer` is available on Linux
+	systems, `launchctl` is available on MacOS and `schtasks` is available
+	on Windows.
+	By default or when `auto` is specified, the most appropriate scheduler
+	for the system is used. On MacOS, `launchctl` is used. On Windows,
+	`schtasks` is used. On Linux, `systemd-timers` is used if user systemd
+	timers are available, otherwise, `crontab` is used. On all other systems,
+	`crontab` is used.
+
 
 TROUBLESHOOTING
 ---------------
@@ -279,6 +293,52 @@ schedule to ensure you are executing the correct binaries in your
 schedule.
 
 
+BACKGROUND MAINTENANCE ON LINUX SYSTEMD SYSTEMS
+-----------------------------------------------
+
+While Linux supports `cron`, depending on the distribution, `cron` may
+be an optional package not necessarily installed. On modern Linux
+distributions, systemd timers are superseding it.
+
+If user systemd timers are available, they will be used as a replacement
+of `cron`.
+
+In this case, `git maintenance start` will create user systemd timer units
+and start the timers. The current list of user-scheduled tasks can be found
+by running `systemctl --user list-timers`. The timers written by `git
+maintenance start` are similar to this:
+
+-----------------------------------------------------------------------
+$ systemctl --user list-timers
+NEXT                         LEFT          LAST                         PASSED     UNIT                         ACTIVATES
+Thu 2021-04-29 19:00:00 CEST 42min left    Thu 2021-04-29 18:00:11 CEST 17min ago  git-maintenance@hourly.timer git-maintenance@hourly.service
+Fri 2021-04-30 00:00:00 CEST 5h 42min left Thu 2021-04-29 00:00:11 CEST 18h ago    git-maintenance@daily.timer  git-maintenance@daily.service
+Mon 2021-05-03 00:00:00 CEST 3 days left   Mon 2021-04-26 00:00:11 CEST 3 days ago git-maintenance@weekly.timer git-maintenance@weekly.service
+-----------------------------------------------------------------------
+
+One timer is registered for each `--schedule=<frequency>` option.
+
+The definition of the systemd units can be inspected in the following files:
+
+-----------------------------------------------------------------------
+~/.config/systemd/user/git-maintenance@.timer
+~/.config/systemd/user/git-maintenance@.service
+~/.config/systemd/user/timers.target.wants/git-maintenance@hourly.timer
+~/.config/systemd/user/timers.target.wants/git-maintenance@daily.timer
+~/.config/systemd/user/timers.target.wants/git-maintenance@weekly.timer
+-----------------------------------------------------------------------
+
+`git maintenance start` will overwrite these files and start the timer
+again with `systemctl --user`, so any customization should be done by
+creating a drop-in file, i.e. a `.conf` suffixed file in the
+`~/.config/systemd/user/git-maintenance@.service.d` directory.
+
+`git maintenance stop` will stop the user systemd timers and delete
+the above mentioned files.
+
+For more details, see systemd.timer(5)
+
+
 BACKGROUND MAINTENANCE ON MACOS SYSTEMS
 ---------------------------------------
 
diff --git a/builtin/gc.c b/builtin/gc.c
index ef7226d7bc..7c72aa3b99 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1544,6 +1544,15 @@ static const char *get_frequency(enum schedule_priority schedule)
 	}
 }
 
+static int is_launchctl_available(const char *cmd)
+{
+#ifdef __APPLE__
+	return 1;
+#else
+	return 0;
+#endif
+}
+
 static char *launchctl_service_name(const char *frequency)
 {
 	struct strbuf label = STRBUF_INIT;
@@ -1710,6 +1719,15 @@ static int launchctl_update_schedule(int run_maintenance, int fd, const char *cm
 		return launchctl_remove_plists(cmd);
 }
 
+static int is_schtasks_available(const char *cmd)
+{
+#ifdef GIT_WINDOWS_NATIVE
+	return 1;
+#else
+	return 0;
+#endif
+}
+
 static char *schtasks_task_name(const char *frequency)
 {
 	struct strbuf label = STRBUF_INIT;
@@ -1872,6 +1890,28 @@ static int schtasks_update_schedule(int run_maintenance, int fd, const char *cmd
 		return schtasks_remove_tasks(cmd);
 }
 
+static int is_crontab_available(const char *cmd)
+{
+	static int cached_result = -1;
+	struct child_process child = CHILD_PROCESS_INIT;
+
+	if (cached_result != -1)
+		return cached_result;
+
+	strvec_split(&child.args, cmd);
+	strvec_push(&child.args, "-l");
+	child.no_stdin = 1;
+	child.no_stdout = 1;
+	child.no_stderr = 1;
+	child.silent_exec_failure = 1;
+
+	if (start_command(&child))
+		return cached_result = 0;
+	/* Ignore exit code, as an empty crontab will return error. */
+	finish_command(&child);
+	return cached_result = 1;
+}
+
 #define BEGIN_LINE "# BEGIN GIT MAINTENANCE SCHEDULE"
 #define END_LINE "# END GIT MAINTENANCE SCHEDULE"
 
@@ -1959,61 +1999,348 @@ static int crontab_update_schedule(int run_maintenance, int fd, const char *cmd)
 	return result;
 }
 
+static int is_systemd_timer_available(const char *cmd)
+{
+#ifdef __linux__
+	static int cached_result = -1;
+	struct child_process child = CHILD_PROCESS_INIT;
+
+	if (cached_result != -1)
+		return cached_result;
+
+	strvec_split(&child.args, cmd);
+	strvec_pushl(&child.args, "--user", "list-timers", NULL);
+	child.no_stdin = 1;
+	child.no_stdout = 1;
+	child.no_stderr = 1;
+	child.silent_exec_failure = 1;
+
+	if (start_command(&child))
+		return cached_result = 0;
+	if (finish_command(&child))
+		return cached_result = 0;
+	return cached_result = 1;
+#else
+	return 0;
+#endif
+}
+
+static char *xdg_config_home_systemd(const char *filename)
+{
+	const char *home, *config_home;
+
+	assert(filename);
+	config_home = getenv("XDG_CONFIG_HOME");
+	if (config_home && *config_home)
+		return mkpathdup("%s/systemd/user/%s", config_home, filename);
+
+	home = getenv("HOME");
+	if (home)
+		return mkpathdup("%s/.config/systemd/user/%s", home, filename);
+
+	die(_("failed to get $XDG_CONFIG_HOME and $HOME"));
+}
+
+static int systemd_timer_enable_unit(int enable,
+				     enum schedule_priority schedule,
+				     const char *cmd)
+{
+	struct child_process child = CHILD_PROCESS_INIT;
+	const char *frequency = get_frequency(schedule);
+
+	strvec_split(&child.args, cmd);
+	strvec_pushl(&child.args, "--user", enable ? "enable" : "disable",
+		     "--now", NULL);
+	strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
+
+	if (start_command(&child))
+		die(_("failed to run systemctl"));
+	return finish_command(&child);
+}
+
+static int systemd_timer_delete_unit_templates(void)
+{
+	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
+	unlink(filename);
+	free(filename);
+
+	filename = xdg_config_home_systemd("git-maintenance@.service");
+	unlink(filename);
+	free(filename);
+
+	return 0;
+}
+
+static int systemd_timer_delete_units(const char *cmd)
+{
+	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY, cmd) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_DAILY, cmd) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY, cmd) ||
+	       systemd_timer_delete_unit_templates();
+}
+
+static int systemd_timer_write_unit_templates(const char *exec_path)
+{
+	char *filename;
+	FILE *file;
+	const char *unit;
+
+	filename = xdg_config_home_systemd("git-maintenance@.timer");
+	if (safe_create_leading_directories(filename))
+		die(_("failed to create directories for '%s'"), filename);
+	file = xfopen(filename, "w");
+	free(filename);
+
+	unit = "# This file was created and is maintained by Git.\n"
+	       "# Any edits made in this file might be replaced in the future\n"
+	       "# by a Git command.\n"
+	       "\n"
+	       "[Unit]\n"
+	       "Description=Optimize Git repositories data\n"
+	       "\n"
+	       "[Timer]\n"
+	       "OnCalendar=%i\n"
+	       "Persistent=true\n"
+	       "\n"
+	       "[Install]\n"
+	       "WantedBy=timers.target\n";
+	fputs(unit, file);
+	fclose(file);
+
+	filename = xdg_config_home_systemd("git-maintenance@.service");
+	if (safe_create_leading_directories(filename))
+		die(_("failed to create directories for '%s'"), filename);
+	file = xfopen(filename, "w");
+	free(filename);
+
+	unit = "# This file was created and is maintained by Git.\n"
+	       "# Any edits made in this file might be replaced in the future\n"
+	       "# by a Git command.\n"
+	       "\n"
+	       "[Unit]\n"
+	       "Description=Optimize Git repositories data\n"
+	       "\n"
+	       "[Service]\n"
+	       "Type=oneshot\n"
+	       "ExecStart=\"%1$s/git\" --exec-path=\"%1$s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"
+	       "LockPersonality=yes\n"
+	       "MemoryDenyWriteExecute=yes\n"
+	       "NoNewPrivileges=yes\n"
+	       "RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6\n"
+	       "RestrictNamespaces=yes\n"
+	       "RestrictRealtime=yes\n"
+	       "RestrictSUIDSGID=yes\n"
+	       "SystemCallArchitectures=native\n"
+	       "SystemCallFilter=@system-service\n";
+	fprintf(file, unit, exec_path);
+	fclose(file);
+
+	return 0;
+}
+
+static int systemd_timer_setup_units(const char *cmd)
+{
+	const char *exec_path = git_exec_path();
+
+	return systemd_timer_write_unit_templates(exec_path) ||
+	       systemd_timer_enable_unit(1, SCHEDULE_HOURLY, cmd) ||
+	       systemd_timer_enable_unit(1, SCHEDULE_DAILY, cmd) ||
+	       systemd_timer_enable_unit(1, SCHEDULE_WEEKLY, cmd);
+}
+
+static int systemd_timer_update_schedule(int run_maintenance, int fd,
+					 const char *cmd)
+{
+	if (run_maintenance)
+		return systemd_timer_setup_units(cmd);
+	else
+		return systemd_timer_delete_units(cmd);
+}
+
+enum scheduler {
+	SCHEDULER_INVALID = -1,
+	SCHEDULER_AUTO = 0,
+	SCHEDULER_CRON = 1,
+	SCHEDULER_SYSTEMD = 2,
+	SCHEDULER_LAUNCHCTL = 3,
+	SCHEDULER_SCHTASKS = 4,
+};
+
+static const struct {
+	int (*is_available)(const char *cmd);
+	int (*update_schedule)(int run_maintenance, int fd, const char *cmd);
+	const char *cmd;
+} scheduler_fn[] = {
+	[SCHEDULER_CRON] = { is_crontab_available, crontab_update_schedule,
+			     "crontab" },
+	[SCHEDULER_SYSTEMD] = { is_systemd_timer_available,
+				systemd_timer_update_schedule, "systemctl" },
+	[SCHEDULER_LAUNCHCTL] = { is_launchctl_available,
+				  launchctl_update_schedule, "launchctl" },
+	[SCHEDULER_SCHTASKS] = { is_schtasks_available,
+				 schtasks_update_schedule, "schtasks" },
+};
+
+static enum scheduler parse_scheduler(const char *value)
+{
+	if (!value)
+		return SCHEDULER_INVALID;
+	else if (!strcasecmp(value, "auto"))
+		return SCHEDULER_AUTO;
+	else if (!strcasecmp(value, "cron") || !strcasecmp(value, "crontab"))
+		return SCHEDULER_CRON;
+	else if (!strcasecmp(value, "systemd") ||
+		 !strcasecmp(value, "systemd-timer"))
+		return SCHEDULER_SYSTEMD;
+	else if (!strcasecmp(value, "launchctl"))
+		return SCHEDULER_LAUNCHCTL;
+	else if (!strcasecmp(value, "schtasks"))
+		return SCHEDULER_SCHTASKS;
+	else
+		return SCHEDULER_INVALID;
+}
+
+static int maintenance_opt_scheduler(const struct option *opt, const char *arg,
+				     int unset)
+{
+	enum scheduler *scheduler = opt->value;
+
+	if (unset)
+		die(_("--no-scheduler is not allowed"));
+
+	*scheduler = parse_scheduler(arg);
+
+	if (*scheduler == SCHEDULER_INVALID)
+		die(_("unrecognized --scheduler argument '%s'"), arg);
+
+	return 0;
+}
+
+struct maintenance_start_opts {
+	enum scheduler scheduler;
+};
+
+static void resolve_auto_scheduler(enum scheduler *scheduler)
+{
+	if (*scheduler != SCHEDULER_AUTO)
+		return;
+
 #if defined(__APPLE__)
-static const char platform_scheduler[] = "launchctl";
+	*scheduler = SCHEDULER_LAUNCHCTL;
+	return;
+
 #elif defined(GIT_WINDOWS_NATIVE)
-static const char platform_scheduler[] = "schtasks";
+	*scheduler = SCHEDULER_SCHTASKS;
+	return;
+
+#elif defined(__linux__)
+	if (is_systemd_timer_available("systemctl"))
+		*scheduler = SCHEDULER_SYSTEMD;
+	else if (is_crontab_available("crontab"))
+		*scheduler = SCHEDULER_CRON;
+	else
+		die(_("neither systemd timers nor crontab are available"));
+	return;
+
 #else
-static const char platform_scheduler[] = "crontab";
+	*scheduler = SCHEDULER_CRON;
+	return;
 #endif
+}
 
-static int update_background_schedule(int enable)
+static void validate_scheduler(enum scheduler scheduler)
 {
-	int result;
-	const char *scheduler = platform_scheduler;
-	const char *cmd = scheduler;
+	const char *cmd;
+
+	if (scheduler == SCHEDULER_INVALID)
+		BUG("invalid scheduler");
+	if (scheduler == SCHEDULER_AUTO)
+		BUG("resolve_auto_scheduler should have been called before");
+
+	cmd = scheduler_fn[scheduler].cmd;
+	if (!scheduler_fn[scheduler].is_available(cmd))
+		die(_("%s scheduler is not available"), cmd);
+}
+
+static int update_background_schedule(const struct maintenance_start_opts *opts,
+				      int enable)
+{
+	unsigned int i;
+	int res, result = 0;
+	enum scheduler scheduler;
+	const char *cmd = NULL;
 	char *testing;
 	struct lock_file lk;
 	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);
 
+	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
+		return error(_("another process is scheduling background maintenance"));
+
 	testing = xstrdup_or_null(getenv("GIT_TEST_MAINT_SCHEDULER"));
 	if (testing) {
 		char *sep = strchr(testing, ':');
 		if (!sep)
 			die("GIT_TEST_MAINT_SCHEDULER unparseable: %s", testing);
 		*sep = '\0';
-		scheduler = testing;
+		scheduler = parse_scheduler(testing);
 		cmd = sep + 1;
+		result = scheduler_fn[scheduler].update_schedule(
+			enable, get_lock_file_fd(&lk), cmd);
+		goto done;
 	}
 
-	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
-		return error(_("another process is scheduling background maintenance"));
-
-	if (!strcmp(scheduler, "launchctl"))
-		result = launchctl_update_schedule(enable, get_lock_file_fd(&lk), cmd);
-	else if (!strcmp(scheduler, "schtasks"))
-		result = schtasks_update_schedule(enable, get_lock_file_fd(&lk), cmd);
-	else if (!strcmp(scheduler, "crontab"))
-		result = crontab_update_schedule(enable, get_lock_file_fd(&lk), cmd);
-	else
-		die("unknown background scheduler: %s", scheduler);
+	for (i = 1; i < ARRAY_SIZE(scheduler_fn); i++) {
+		int enable_scheduler = enable && (opts->scheduler == i);
+		cmd = scheduler_fn[i].cmd;
+		if (!scheduler_fn[i].is_available(cmd))
+			continue;
+		res = scheduler_fn[i].update_schedule(
+			enable_scheduler, get_lock_file_fd(&lk), cmd);
+		if (enable_scheduler)
+			result = res;
+	}
 
+done:
 	rollback_lock_file(&lk);
 	free(testing);
 	return result;
 }
 
-static int maintenance_start(void)
+static const char *const builtin_maintenance_start_usage[] = {
+	N_("git maintenance start [--scheduler=<scheduler>]"), NULL
+};
+
+static int maintenance_start(int argc, const char **argv, const char *prefix)
 {
+	struct maintenance_start_opts opts;
+	struct option builtin_maintenance_start_options[] = {
+		OPT_CALLBACK(
+			0, "scheduler", &opts.scheduler, N_("scheduler"),
+			N_("scheduler to use to trigger git maintenance run"),
+			maintenance_opt_scheduler),
+		OPT_END()
+	};
+	memset(&opts, 0, sizeof(opts));
+
+	argc = parse_options(argc, argv, prefix,
+			     builtin_maintenance_start_options,
+			     builtin_maintenance_start_usage, 0);
+
+	resolve_auto_scheduler(&opts.scheduler);
+	validate_scheduler(opts.scheduler);
+
+	if (argc > 0)
+		usage_with_options(builtin_maintenance_start_usage,
+				   builtin_maintenance_start_options);
+
 	if (maintenance_register())
 		warning(_("failed to add repo to global config"));
-
-	return update_background_schedule(1);
+	return update_background_schedule(&opts, 1);
 }
 
 static int maintenance_stop(void)
 {
-	return update_background_schedule(0);
+	return update_background_schedule(NULL, 0);
 }
 
 static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
@@ -2027,7 +2354,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 	if (!strcmp(argv[1], "run"))
 		return maintenance_run(argc - 1, argv + 1, prefix);
 	if (!strcmp(argv[1], "start"))
-		return maintenance_start();
+		return maintenance_start(argc - 1, argv + 1, prefix);
 	if (!strcmp(argv[1], "stop"))
 		return maintenance_stop();
 	if (!strcmp(argv[1], "register"))
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 2412d8c5c0..6e6316cd90 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -20,6 +20,20 @@ test_xmllint () {
 	fi
 }
 
+test_lazy_prereq SYSTEMD_ANALYZE '
+	systemd-analyze --help >out &&
+	grep verify out
+'
+
+test_systemd_analyze_verify () {
+	if test_have_prereq SYSTEMD_ANALYZE
+	then
+		systemd-analyze verify "$@"
+	else
+		true
+	fi
+}
+
 test_expect_success 'help text' '
 	test_expect_code 129 git maintenance -h 2>err &&
 	test_i18ngrep "usage: git maintenance <subcommand>" err &&
@@ -615,6 +629,43 @@ test_expect_success 'start and stop Windows maintenance' '
 	test_cmp expect args
 '
 
+test_expect_success 'start and stop Linux/systemd maintenance' '
+	write_script print-args <<-\EOF &&
+	echo $* >>args
+	EOF
+
+	rm -f args &&
+	GIT_TEST_MAINT_SCHEDULER="systemd:./print-args" git maintenance start &&
+
+	# start registers the repo
+	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
+
+	test_systemd_analyze_verify "$HOME/.config/systemd/user/git-maintenance@.service" &&
+
+	rm -f expect &&
+	for frequency in hourly daily weekly
+	do
+		echo "--user enable --now git-maintenance@${frequency}.timer" >>expect || return 1
+	done &&
+	test_cmp expect args &&
+
+	rm -f args &&
+	GIT_TEST_MAINT_SCHEDULER="systemd:./print-args" git maintenance stop &&
+
+	# stop does not unregister the repo
+	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
+
+	test_path_is_missing "$HOME/.config/systemd/user/git-maintenance@.timer" &&
+	test_path_is_missing "$HOME/.config/systemd/user/git-maintenance@.service" &&
+
+	rm -f expect &&
+	for frequency in hourly daily weekly
+	do
+		echo "--user disable --now git-maintenance@${frequency}.timer" >>expect || return 1
+	done &&
+	test_cmp expect args
+'
+
 test_expect_success 'register preserves existing strategy' '
 	git config maintenance.strategy none &&
 	git maintenance register &&
-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-05 12:01   ` Ævar Arnfjörð Bjarmason
@ 2021-05-09 22:34     ` Lénaïc Huard
  2021-05-10 13:03       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 30+ messages in thread
From: Lénaïc Huard @ 2021-05-09 22:34 UTC (permalink / raw)
  To: Eric Sunshine, Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Derrick Stolee, brian m. carlson

Le mercredi 5 mai 2021, 14:01:25 CEST Ævar Arnfjörð Bjarmason a écrit :
> On Sun, May 02 2021, Eric Sunshine wrote:
> > On Sat, May 1, 2021 at 10:59 AM Lénaïc Huard <lenaic@lhuard.fr> wrote:
> >> +       strvec_push(&child.args, "--now");
> >> +       strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
> >> +
> >> +       if (start_command(&child))
> >> +               die(_("failed to run systemctl"));
> >> +       return finish_command(&child);
> >> +}
> >> +static int systemd_timer_write_unit_templates(const char *exec_path)
> >> +{
> >> +       unit = "[Unit]\n"
> >> +              "Description=Optimize Git repositories data\n"
> >> +              "\n"
> >> +              "[Service]\n"
> >> +              "Type=oneshot\n"
> >> +              "ExecStart=\"%1$s/git\" --exec-path=\"%1$s\" for-each-repo
> >> --config=maintenance.repo maintenance run --schedule=%%i\n"> 
> > I see that it's in POSIX, but do we use this `%n$s` directive
> > elsewhere in the Git source code? If not, I'd be cautious of
> > introducing it here. Maybe it's better to just use plain `%s` twice...
> 
> We use it in po/, so for sprintf() on systems that don't have
> NO_GETTEXT=Y we already test it in the wild.
> 
> But no, I don't think anything in the main source uses it, FWIW I have a
> WIP series in my own fork that I've cooked for a while that uses this, I
> haven't run into any issues with it in either GitHub's CI
> (e.g. Windows), or on the systems I myself test on.
> 
> I think it would be a useful canary to just take a change like this, we
> can always change it to the form you suggest if it doesn't work out.

Based on this latest comment, I left the `%n$s` directive in the v2 of the 
patch.

Let me know if that’s still OK. Otherwise, I’d be happy to implement Eric’s 
suggestion.

Note however that this would be a “poor” canary to check if that directive is 
supported on all the platforms on which git has been ported.
Indeed, this code is executed only on systemd platforms, which means quite 
recent Linux systems.
Should this directive not be supported, I suppose it would be on more exotic 
systems.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-09 21:32   ` [PATCH v2 1/1] " Lénaïc Huard
@ 2021-05-10  1:20     ` Đoàn Trần Công Danh
  2021-05-10  2:48       ` Eric Sunshine
  2021-05-10 18:03     ` Phillip Wood
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 30+ messages in thread
From: Đoàn Trần Công Danh @ 2021-05-10  1:20 UTC (permalink / raw)
  To: Lénaïc Huard
  Cc: git, Junio C Hamano, Derrick Stolee, Eric Sunshine,
	brian m . carlson, Bagas Sanjaya, Phillip Wood,
	Ævar Arnfjörð Bjarmason

On 2021-05-09 23:32:17+0200, Lénaïc Huard <lenaic@lhuard.fr> wrote:
> The existing mechanism for scheduling background maintenance is done
> through cron. On Linux systems managed by systemd, systemd provides an
> alternative to schedule recurring tasks: systemd timers.
> 
> +static int systemd_timer_enable_unit(int enable,
> +				     enum schedule_priority schedule,
> +				     const char *cmd)
> +{
> +	struct child_process child = CHILD_PROCESS_INIT;
> +	const char *frequency = get_frequency(schedule);
> +
> +	strvec_split(&child.args, cmd);
> +	strvec_pushl(&child.args, "--user", enable ? "enable" : "disable",
> +		     "--now", NULL);
> +	strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
> +
> +	if (start_command(&child))
> +		die(_("failed to run systemctl"));
> +	return finish_command(&child);
> +}
> +
> +static int systemd_timer_delete_unit_templates(void)
> +{
> +	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
> +	unlink(filename);
> +	free(filename);
> +
> +	filename = xdg_config_home_systemd("git-maintenance@.service");
> +	unlink(filename);
> +	free(filename);
> +
> +	return 0;
> +}
> +
> +static int systemd_timer_delete_units(const char *cmd)
> +{
> +	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY, cmd) ||
> +	       systemd_timer_enable_unit(0, SCHEDULE_DAILY, cmd) ||
> +	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY, cmd) ||
> +	       systemd_timer_delete_unit_templates();
> +}

I'm not using any systemd-based distros. However, isn't this try to
enable all systemd's {hourly,daily,weekly} user's timer, then delete
the templates?^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W^W

Argh, we're disabling those systemd timer units first, by passing 0 as
first argument of systemd_timer_delete_units.

The fact that I read that twice, and still wrote down above reply
makes me think that above code is not self-explanatory enough.
May we switch to something else? Let's say using enum?


> +static int systemd_timer_write_unit_templates(const char *exec_path)
> +{
> +	char *filename;
> +	FILE *file;
> +	const char *unit;
> +
> +	filename = xdg_config_home_systemd("git-maintenance@.timer");
> +	if (safe_create_leading_directories(filename))
> +		die(_("failed to create directories for '%s'"), filename);

This message is used by other codes, less works for translator, nice!

> +	file = xfopen(filename, "w");
> +	free(filename);

I'm sure if we should use FREE_AND_NULL(filename) instead?
Since, filename will be reused later.

> +
> +	unit = "# This file was created and is maintained by Git.\n"
> +	       "# Any edits made in this file might be replaced in the future\n"
> +	       "# by a Git command.\n"
> +	       "\n"
> +	       "[Unit]\n"
> +	       "Description=Optimize Git repositories data\n"
> +	       "\n"
> +	       "[Timer]\n"
> +	       "OnCalendar=%i\n"
> +	       "Persistent=true\n"
> +	       "\n"
> +	       "[Install]\n"
> +	       "WantedBy=timers.target\n";
> +	fputs(unit, file);
> +	fclose(file);
> +
> +	filename = xdg_config_home_systemd("git-maintenance@.service");
> +	if (safe_create_leading_directories(filename))
> +		die(_("failed to create directories for '%s'"), filename);
> +	file = xfopen(filename, "w");
> +	free(filename);
> +
> +	unit = "# This file was created and is maintained by Git.\n"
> +	       "# Any edits made in this file might be replaced in the future\n"
> +	       "# by a Git command.\n"
> +	       "\n"
> +	       "[Unit]\n"
> +	       "Description=Optimize Git repositories data\n"
> +	       "\n"
> +	       "[Service]\n"
> +	       "Type=oneshot\n"
> +	       "ExecStart=\"%1$s/git\" --exec-path=\"%1$s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"
> +	       "LockPersonality=yes\n"
> +	       "MemoryDenyWriteExecute=yes\n"
> +	       "NoNewPrivileges=yes\n"
> +	       "RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6\n"
> +	       "RestrictNamespaces=yes\n"
> +	       "RestrictRealtime=yes\n"
> +	       "RestrictSUIDSGID=yes\n"
> +	       "SystemCallArchitectures=native\n"
> +	       "SystemCallFilter=@system-service\n";
> +	fprintf(file, unit, exec_path);

I think others have strong opinion on not using "%1$s",
and prefer simple "%s" and using "exec_path" twice instead.

> +	fclose(file);
> +
> +	return 0;
> +}
> +
> +static int systemd_timer_setup_units(const char *cmd)
> +{
> +	const char *exec_path = git_exec_path();
> +
> +	return systemd_timer_write_unit_templates(exec_path) ||
> +	       systemd_timer_enable_unit(1, SCHEDULE_HOURLY, cmd) ||
> +	       systemd_timer_enable_unit(1, SCHEDULE_DAILY, cmd) ||
> +	       systemd_timer_enable_unit(1, SCHEDULE_WEEKLY, cmd);
> +}
> +
> +static int systemd_timer_update_schedule(int run_maintenance, int fd,
> +					 const char *cmd)
> +{
> +	if (run_maintenance)
> +		return systemd_timer_setup_units(cmd);
> +	else
> +		return systemd_timer_delete_units(cmd);
> +}
> +
> +enum scheduler {
> +	SCHEDULER_INVALID = -1,
> +	SCHEDULER_AUTO = 0,
> +	SCHEDULER_CRON = 1,
> +	SCHEDULER_SYSTEMD = 2,
> +	SCHEDULER_LAUNCHCTL = 3,
> +	SCHEDULER_SCHTASKS = 4,

I think explicitly writing down values doesn't make things clearer,
-1 would be nice, not a strong opinion, though.

Anyway, would it be better to move those type declaration to top of
file?

> +};
> +
> +static const struct {
> +	int (*is_available)(const char *cmd);
> +	int (*update_schedule)(int run_maintenance, int fd, const char *cmd);
> +	const char *cmd;
> +} scheduler_fn[] = {
> +	[SCHEDULER_CRON] = { is_crontab_available, crontab_update_schedule,
> +			     "crontab" },
> +	[SCHEDULER_SYSTEMD] = { is_systemd_timer_available,
> +				systemd_timer_update_schedule, "systemctl" },
> +	[SCHEDULER_LAUNCHCTL] = { is_launchctl_available,
> +				  launchctl_update_schedule, "launchctl" },
> +	[SCHEDULER_SCHTASKS] = { is_schtasks_available,
> +				 schtasks_update_schedule, "schtasks" },
> +};
> +
> +static enum scheduler parse_scheduler(const char *value)
> +{
> +	if (!value)
> +		return SCHEDULER_INVALID;
> +	else if (!strcasecmp(value, "auto"))
> +		return SCHEDULER_AUTO;
> +	else if (!strcasecmp(value, "cron") || !strcasecmp(value, "crontab"))
> +		return SCHEDULER_CRON;
> +	else if (!strcasecmp(value, "systemd") ||
> +		 !strcasecmp(value, "systemd-timer"))
> +		return SCHEDULER_SYSTEMD;
> +	else if (!strcasecmp(value, "launchctl"))
> +		return SCHEDULER_LAUNCHCTL;
> +	else if (!strcasecmp(value, "schtasks"))
> +		return SCHEDULER_SCHTASKS;
> +	else
> +		return SCHEDULER_INVALID;
> +}
> +
> +static int maintenance_opt_scheduler(const struct option *opt, const char *arg,
> +				     int unset)
> +{
> +	enum scheduler *scheduler = opt->value;
> +
> +	if (unset)
> +		die(_("--no-scheduler is not allowed"));

I think it's better to use OPT_CALLBACK_F in the options list and
we will write below code instead:

	BUG_ON_OPT_NEG(unset)

> +
> +	*scheduler = parse_scheduler(arg);
> +
> +	if (*scheduler == SCHEDULER_INVALID)
> +		die(_("unrecognized --scheduler argument '%s'"), arg);

Most of other callbacks do this instead:

	return error(_("messsage.... '%s'"), arg);

> +
> +	return 0;
> +}
> +
> +struct maintenance_start_opts {
> +	enum scheduler scheduler;
> +};
> +
> +static void resolve_auto_scheduler(enum scheduler *scheduler)
> +{
> +	if (*scheduler != SCHEDULER_AUTO)
> +		return;
> +
>  #if defined(__APPLE__)
> -static const char platform_scheduler[] = "launchctl";
> +	*scheduler = SCHEDULER_LAUNCHCTL;
> +	return;
> +
>  #elif defined(GIT_WINDOWS_NATIVE)
> -static const char platform_scheduler[] = "schtasks";
> +	*scheduler = SCHEDULER_SCHTASKS;
> +	return;
> +
> +#elif defined(__linux__)
> +	if (is_systemd_timer_available("systemctl"))
> +		*scheduler = SCHEDULER_SYSTEMD;
> +	else if (is_crontab_available("crontab"))
> +		*scheduler = SCHEDULER_CRON;
> +	else
> +		die(_("neither systemd timers nor crontab are available"));
> +	return;
> +
>  #else
> -static const char platform_scheduler[] = "crontab";
> +	*scheduler = SCHEDULER_CRON;
> +	return;
>  #endif
> +}
>  
> -static int update_background_schedule(int enable)
> +static void validate_scheduler(enum scheduler scheduler)
>  {
> -	int result;
> -	const char *scheduler = platform_scheduler;
> -	const char *cmd = scheduler;
> +	const char *cmd;
> +
> +	if (scheduler == SCHEDULER_INVALID)
> +		BUG("invalid scheduler");
> +	if (scheduler == SCHEDULER_AUTO)
> +		BUG("resolve_auto_scheduler should have been called before");
> +
> +	cmd = scheduler_fn[scheduler].cmd;
> +	if (!scheduler_fn[scheduler].is_available(cmd))
> +		die(_("%s scheduler is not available"), cmd);
> +}
> +
> +static int update_background_schedule(const struct maintenance_start_opts *opts,
> +				      int enable)
> +{
> +	unsigned int i;
> +	int res, result = 0;
> +	enum scheduler scheduler;
> +	const char *cmd = NULL;
>  	char *testing;
>  	struct lock_file lk;
>  	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);
>  
> +	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
> +		return error(_("another process is scheduling background maintenance"));
> +
>  	testing = xstrdup_or_null(getenv("GIT_TEST_MAINT_SCHEDULER"));
>  	if (testing) {
>  		char *sep = strchr(testing, ':');
>  		if (!sep)
>  			die("GIT_TEST_MAINT_SCHEDULER unparseable: %s", testing);
>  		*sep = '\0';
> -		scheduler = testing;
> +		scheduler = parse_scheduler(testing);
>  		cmd = sep + 1;
> +		result = scheduler_fn[scheduler].update_schedule(
> +			enable, get_lock_file_fd(&lk), cmd);
> +		goto done;
>  	}
>  
> -	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
> -		return error(_("another process is scheduling background maintenance"));
> -
> -	if (!strcmp(scheduler, "launchctl"))
> -		result = launchctl_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else if (!strcmp(scheduler, "schtasks"))
> -		result = schtasks_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else if (!strcmp(scheduler, "crontab"))
> -		result = crontab_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else
> -		die("unknown background scheduler: %s", scheduler);
> +	for (i = 1; i < ARRAY_SIZE(scheduler_fn); i++) {
> +		int enable_scheduler = enable && (opts->scheduler == i);
> +		cmd = scheduler_fn[i].cmd;
> +		if (!scheduler_fn[i].is_available(cmd))
> +			continue;
> +		res = scheduler_fn[i].update_schedule(
> +			enable_scheduler, get_lock_file_fd(&lk), cmd);
> +		if (enable_scheduler)
> +			result = res;
> +	}
>  
> +done:
>  	rollback_lock_file(&lk);
>  	free(testing);
>  	return result;
>  }
>  
> -static int maintenance_start(void)
> +static const char *const builtin_maintenance_start_usage[] = {
> +	N_("git maintenance start [--scheduler=<scheduler>]"), NULL
> +};
> +
> +static int maintenance_start(int argc, const char **argv, const char *prefix)
>  {
> +	struct maintenance_start_opts opts;
> +	struct option builtin_maintenance_start_options[] = {
> +		OPT_CALLBACK(
> +			0, "scheduler", &opts.scheduler, N_("scheduler"),
> +			N_("scheduler to use to trigger git maintenance run"),
> +			maintenance_opt_scheduler),

Following up my comment above, we're better to use:

		OPT_CALLBACK_F(0, "scheduler", &opts.scheduler, N_("scheduler"),
			N_("............"),
			PARSE_OPT_NONEG, maintenance_opt_scheduler),

> +		OPT_END()
> +	};
> +	memset(&opts, 0, sizeof(opts));
> +
> +	argc = parse_options(argc, argv, prefix,
> +			     builtin_maintenance_start_options,
> +			     builtin_maintenance_start_usage, 0);
> +
> +	resolve_auto_scheduler(&opts.scheduler);
> +	validate_scheduler(opts.scheduler);
> +
> +	if (argc > 0)
> +		usage_with_options(builtin_maintenance_start_usage,
> +				   builtin_maintenance_start_options);
> +
>  	if (maintenance_register())
>  		warning(_("failed to add repo to global config"));
> -
> -	return update_background_schedule(1);
> +	return update_background_schedule(&opts, 1);
>  }
>  
>  static int maintenance_stop(void)
>  {
> -	return update_background_schedule(0);
> +	return update_background_schedule(NULL, 0);
>  }
>  
>  static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
> @@ -2027,7 +2354,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
>  	if (!strcmp(argv[1], "run"))
>  		return maintenance_run(argc - 1, argv + 1, prefix);
>  	if (!strcmp(argv[1], "start"))
> -		return maintenance_start();
> +		return maintenance_start(argc - 1, argv + 1, prefix);
>  	if (!strcmp(argv[1], "stop"))
>  		return maintenance_stop();
>  	if (!strcmp(argv[1], "register"))
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 2412d8c5c0..6e6316cd90 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -20,6 +20,20 @@ test_xmllint () {
>  	fi
>  }
>  
> +test_lazy_prereq SYSTEMD_ANALYZE '
> +	systemd-analyze --help >out &&
> +	grep verify out
> +'
> +
> +test_systemd_analyze_verify () {
> +	if test_have_prereq SYSTEMD_ANALYZE
> +	then
> +		systemd-analyze verify "$@"
> +	else
> +		true

The "else" leg is not necessary.

> +	fi
> +}
> +
>  test_expect_success 'help text' '
>  	test_expect_code 129 git maintenance -h 2>err &&
>  	test_i18ngrep "usage: git maintenance <subcommand>" err &&
> @@ -615,6 +629,43 @@ test_expect_success 'start and stop Windows maintenance' '
>  	test_cmp expect args
>  '
>  
> +test_expect_success 'start and stop Linux/systemd maintenance' '
> +	write_script print-args <<-\EOF &&
> +	echo $* >>args

To avoid any possible incompatibility with zillion echo implementation
out there. printf should be prefered over echo. Not a in this test
case, however, it costs us nothing anyway.

	printf "%s\n" "$*"

> +	EOF
> +
> +	rm -f args &&
> +	GIT_TEST_MAINT_SCHEDULER="systemd:./print-args" git maintenance start &&
> +
> +	# start registers the repo
> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
> +
> +	test_systemd_analyze_verify "$HOME/.config/systemd/user/git-maintenance@.service" &&
> +
> +	rm -f expect &&
> +	for frequency in hourly daily weekly
> +	do
> +		echo "--user enable --now git-maintenance@${frequency}.timer" >>expect || return 1
> +	done &&

And here, we can have a nicer syntax with printf:

	printf "--user enable --now git-maintenance@%s.timer\n" hourly daily weekly >expect &&

With printf, we don't even need "rm -f expect" above.

> +	test_cmp expect args &&
> +
> +	rm -f args &&
> +	GIT_TEST_MAINT_SCHEDULER="systemd:./print-args" git maintenance stop &&
> +
> +	# stop does not unregister the repo
> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
> +
> +	test_path_is_missing "$HOME/.config/systemd/user/git-maintenance@.timer" &&
> +	test_path_is_missing "$HOME/.config/systemd/user/git-maintenance@.service" &&
> +
> +	rm -f expect &&
> +	for frequency in hourly daily weekly
> +	do
> +		echo "--user disable --now git-maintenance@${frequency}.timer" >>expect || return 1
> +	done &&

Ditto.

All of this was written without testing, because I don't have any
systemd based system near my hand, right now.

So, please take it with a grain of salt.


-- 
Danh

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-10  1:20     ` Đoàn Trần Công Danh
@ 2021-05-10  2:48       ` Eric Sunshine
  2021-05-10  6:25         ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Eric Sunshine @ 2021-05-10  2:48 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: Lénaïc Huard, Git List, Junio C Hamano, Derrick Stolee,
	brian m . carlson, Bagas Sanjaya, Phillip Wood,
	Ævar Arnfjörð Bjarmason

On Sun, May 9, 2021 at 9:20 PM Đoàn Trần Công Danh <congdanhqx@gmail.com> wrote:
> On 2021-05-09 23:32:17+0200, Lénaïc Huard <lenaic@lhuard.fr> wrote:
> > +static int systemd_timer_delete_units(const char *cmd)
> > +{
> > +     return systemd_timer_enable_unit(0, SCHEDULE_HOURLY, cmd) ||
> > +            systemd_timer_enable_unit(0, SCHEDULE_DAILY, cmd) ||
> > +            systemd_timer_enable_unit(0, SCHEDULE_WEEKLY, cmd) ||
> > +            systemd_timer_delete_unit_templates();
> > +}
>
> Argh, we're disabling those systemd timer units first, by passing 0 as
> first argument of systemd_timer_delete_units.
>
> The fact that I read that twice, and still wrote down above reply
> makes me think that above code is not self-explanatory enough.
> May we switch to something else? Let's say using enum?

This is modeled after existing scheduler functions in this file, in
which the `enable` argument is a simple 0 or 1, so changing this to an
enum just for this function would be inconsistent. Changing all the
functions to `enum` in a preparatory patch could indeed improve
readability, however, that's tangential cleanup which may be outside
the scope of this submission.

> > +     file = xfopen(filename, "w");
> > +     free(filename);
>
> I'm sure if we should use FREE_AND_NULL(filename) instead?
> Since, filename will be reused later.

Indeed, probably a good idea, as it would make catching mistakes easier.

> > +            "ExecStart=\"%1$s/git\" --exec-path=\"%1$s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"
>
> I think others have strong opinion on not using "%1$s",
> and prefer simple "%s" and using "exec_path" twice instead.

I brought it up only because I hadn't seen it in Git sources, and
wasn't sure if we'd want to start using it. Aside from Ævar, who
seemed reasonably in favor of it, nobody else chimed in, so it could
go either way, I suppose.

> > +test_systemd_analyze_verify () {
> > +     if test_have_prereq SYSTEMD_ANALYZE
> > +     then
> > +             systemd-analyze verify "$@"
> > +     else
> > +             true
>
> The "else" leg is not necessary.

This was patterned after the existing test_xmllint() function in this
file which has the `else true` leg. I wrote test_xmllint(), so I'll
take blame. But you're right, the `if` will return success if the
prerequisite is not set, so the `else` leg is indeed not needed.
(Cleaning up the test_xmllint() function is outside the scope of this
patch.)

> > +test_expect_success 'start and stop Linux/systemd maintenance' '
> > +     write_script print-args <<-\EOF &&
> > +     echo $* >>args
>
> To avoid any possible incompatibility with zillion echo implementation
> out there. printf should be prefered over echo. Not a in this test
> case, however, it costs us nothing anyway.
>
>         printf "%s\n" "$*"

This, too, is patterned after existing auxiliary scripts created by
these test functions, and you're correct that it's potentially
dangerous. A manual inspection of all the existing instances shows
that `echo $*` happens to be safe for those cases but that doesn't
excuse being sloppy about it, so the existing cases probably ought to
be cleaned up. But, again, that is outside the scope of this series.
For this particular case, though...

> > +     for frequency in hourly daily weekly
> > +     do
> > +             echo "--user enable --now git-maintenance@${frequency}.timer" >>expect || return 1
> > +     done &&
>
> And here, we can have a nicer syntax with printf:
>
>         printf "--user enable --now git-maintenance@%s.timer\n" hourly daily weekly >expect &&
>
> With printf, we don't even need "rm -f expect" above.

... you're quite correct that `printf` is the way to go both here and
in the generated `print-args` script since these arguments start with
hyphen.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-10  2:48       ` Eric Sunshine
@ 2021-05-10  6:25         ` Junio C Hamano
  2021-05-12  0:29           ` Đoàn Trần Công Danh
  0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2021-05-10  6:25 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Đoàn Trần Công Danh, Lénaïc Huard,
	Git List, Derrick Stolee, brian m . carlson, Bagas Sanjaya,
	Phillip Wood, Ævar Arnfjörð Bjarmason

Eric Sunshine <sunshine@sunshineco.com> writes:

>> I think others have strong opinion on not using "%1$s",
>> and prefer simple "%s" and using "exec_path" twice instead.
>
> I brought it up only because I hadn't seen it in Git sources, and
> wasn't sure if we'd want to start using it. Aside from Ævar, who
> seemed reasonably in favor of it, nobody else chimed in, so it could
> go either way, I suppose.

If this were a piece of code that _everybody_ would use on _all_ the
supported platforms, I would suggest declaring that this is a
weather-balloon to see if some platforms have trouble using it.  But
unfortunately this is not such a piece of code.  Dependence on
systemd should strictly be opt-in.

So my preference is

 - here, just do it in the dumb and simple way

 - somewhere else, find code that is compiled and run for everybody
   on all platforms that feeds two same arguments to printf format,
   and update it to use "%1$x" twice, mark it clearly as a weather
   balloon, and document it (see how what 512f41cf did is documented
   in Documentation/CodingGuidelines and mimick, but tone it down as
   we haven't declared it safe to use (yet).

It is likely that we need rearrangement of argument order for po/
files anyway, but a misimplementation might not handle using the
same placeholder twice, and that is why I'd like to be a bit extra
careful.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] maintenance: use systemd timers on Linux
  2021-05-09 22:34     ` Lénaïc Huard
@ 2021-05-10 13:03       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 30+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-05-10 13:03 UTC (permalink / raw)
  To: Lénaïc Huard
  Cc: Eric Sunshine, Git List, Junio C Hamano, Derrick Stolee,
	brian m. carlson


On Mon, May 10 2021, Lénaïc Huard wrote:

> Le mercredi 5 mai 2021, 14:01:25 CEST Ævar Arnfjörð Bjarmason a écrit :
>> On Sun, May 02 2021, Eric Sunshine wrote:
>> > On Sat, May 1, 2021 at 10:59 AM Lénaïc Huard <lenaic@lhuard.fr> wrote:
>> >> +       strvec_push(&child.args, "--now");
>> >> +       strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
>> >> +
>> >> +       if (start_command(&child))
>> >> +               die(_("failed to run systemctl"));
>> >> +       return finish_command(&child);
>> >> +}
>> >> +static int systemd_timer_write_unit_templates(const char *exec_path)
>> >> +{
>> >> +       unit = "[Unit]\n"
>> >> +              "Description=Optimize Git repositories data\n"
>> >> +              "\n"
>> >> +              "[Service]\n"
>> >> +              "Type=oneshot\n"
>> >> +              "ExecStart=\"%1$s/git\" --exec-path=\"%1$s\" for-each-repo
>> >> --config=maintenance.repo maintenance run --schedule=%%i\n"> 
>> > I see that it's in POSIX, but do we use this `%n$s` directive
>> > elsewhere in the Git source code? If not, I'd be cautious of
>> > introducing it here. Maybe it's better to just use plain `%s` twice...
>> 
>> We use it in po/, so for sprintf() on systems that don't have
>> NO_GETTEXT=Y we already test it in the wild.
>> 
>> But no, I don't think anything in the main source uses it, FWIW I have a
>> WIP series in my own fork that I've cooked for a while that uses this, I
>> haven't run into any issues with it in either GitHub's CI
>> (e.g. Windows), or on the systems I myself test on.
>> 
>> I think it would be a useful canary to just take a change like this, we
>> can always change it to the form you suggest if it doesn't work out.
>
> Based on this latest comment, I left the `%n$s` directive in the v2 of the 
> patch.
>
> Let me know if that’s still OK. Otherwise, I’d be happy to implement Eric’s 
> suggestion.
>
> Note however that this would be a “poor” canary to check if that directive is 
> supported on all the platforms on which git has been ported.
> Indeed, this code is executed only on systemd platforms, which means quite 
> recent Linux systems.
> Should this directive not be supported, I suppose it would be on more exotic 
> systems.

Indeed, although we compile it on non-Linux platforms, so we'd expect to
get complaints from smarter non-Linux compilers if fprintf() is given an
unknown formatting directive.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-09 21:32   ` [PATCH v2 1/1] " Lénaïc Huard
  2021-05-10  1:20     ` Đoàn Trần Công Danh
@ 2021-05-10 18:03     ` Phillip Wood
  2021-05-10 18:25       ` Eric Sunshine
  2021-05-10 19:15     ` Martin Ågren
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 30+ messages in thread
From: Phillip Wood @ 2021-05-10 18:03 UTC (permalink / raw)
  To: Lénaïc Huard, git
  Cc: Junio C Hamano, Derrick Stolee, Eric Sunshine, brian m . carlson,
	Bagas Sanjaya, Đoàn Trần Công Danh,
	Ævar Arnfjörð Bjarmason

Hi Lénaïc

Thanks for updating the patch, I've left some comments below. Aside from 
one possible bug, they mostly revolve around not passing the name of 
scheduler command around when that name is fixed and functions which do 
not check for errors but return a success/failure value (either they 
should check the return values of the system calls they make or be 
declared as void if it is safe to ignore the failures)

On 09/05/2021 22:32, Lénaïc Huard wrote:
> The existing mechanism for scheduling background maintenance is done
> through cron. On Linux systems managed by systemd, systemd provides an
> alternative to schedule recurring tasks: systemd timers.
> 
> The main motivations to implement systemd timers in addition to cron
> are:
> * cron is optional and Linux systems running systemd might not have it
>    installed.
> * The execution of `crontab -l` can tell us if cron is installed but not
>    if the daemon is actually running.
> * With systemd, each service is run in its own cgroup and its logs are
>    tagged by the service inside journald. With cron, all scheduled tasks
>    are running in the cron daemon cgroup and all the logs of the
>    user-scheduled tasks are pretended to belong to the system cron
>    service.
>    Concretely, a user that doesn’t have access to the system logs won’t
>    have access to the log of its own tasks scheduled by cron whereas he
>    will have access to the log of its own tasks scheduled by systemd
>    timer.
>    Although `cron` attempts to send email, that email may go unseen by
>    the user because these days, local mailboxes are not heavily used
>    anymore.
> 
> In order to choose which scheduler to use between `cron` and user
> systemd timers, a new option
> `--scheduler=auto|cron|systemd|launchctl|schtasks` has been added to
> `git maintenance start`.
> When `git maintenance start --scheduler=XXX` is run, it not only
> registers `git maintenance run` tasks in the scheduler XXX, it also
> removes the `git maintenance run` tasks from all the other schedulers

I'm not sure it is actually doing that at the moment - see my comment in 
update_background_schedule()

> to
> ensure we cannot have two schedulers launching concurrent identical
> tasks.
> 
> The default value is `auto` which chooses a suitable scheduler for the
> system.
> On Linux, if user systemd timers are available, they will be used as git
> maintenance scheduler. If not, `cron` will be used if it is available.
> If none is available, it will fail.

I think defaulting to systemd timers when both systemd and cron are 
installed is sensible given the problems associated with testing whether 
crond is actually running in that scenario.

> `git maintenance stop` doesn't have any `--scheduler` parameter because
> this command will try to remove the `git maintenance run` tasks from all
> the available schedulers.
> 
> In order to schedule git maintenance, we need two unit template files:
> * ~/.config/systemd/user/git-maintenance@.service
>    to define the command to be started by systemd and
> * ~/.config/systemd/user/git-maintenance@.timer
>    to define the schedule at which the command should be run.
> 
> Those units are templates that are parametrized by the frequency.
> 
> Based on those templates, 3 timers are started:
> * git-maintenance@hourly.timer
> * git-maintenance@daily.timer
> * git-maintenance@weekly.timer
> 
> The command launched by those three timers are the same than with the
> other scheduling methods:
> 
> git for-each-repo --config=maintenance.repo maintenance run
> --schedule=%i
> 
> with the full path for git to ensure that the version of git launched
> for the scheduled maintenance is the same as the one used to run
> `maintenance start`.
> 
> The timer unit contains `Persistent=true` so that, if the computer is
> powered down when a maintenance task should run, the task will be run
> when the computer is back powered on.
> 
> Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
> ---
>   Documentation/git-maintenance.txt |  60 +++++
>   builtin/gc.c                      | 375 ++++++++++++++++++++++++++++--
>   t/t7900-maintenance.sh            |  51 ++++
>   3 files changed, 462 insertions(+), 24 deletions(-)
> 
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 80ddd33ceb..f012923333 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -181,6 +181,20 @@ OPTIONS
>   	`maintenance.<task>.enabled` configured as `true` are considered.
>   	See the 'TASKS' section for the list of accepted `<task>` values.
>   
> +--scheduler=auto|crontab|systemd-timer|launchctl|schtasks::
> +	When combined with the `start` subcommand, specify the scheduler
> +	to use to run the hourly, daily and weekly executions of
> +	`git maintenance run`.
> +	The possible values for `<scheduler>` depend on the system: `crontab`
> +	is available on POSIX systems, `systemd-timer` is available on Linux
> +	systems, `launchctl` is available on MacOS and `schtasks` is available
> +	on Windows.
> +	By default or when `auto` is specified, the most appropriate scheduler
> +	for the system is used. On MacOS, `launchctl` is used. On Windows,
> +	`schtasks` is used. On Linux, `systemd-timers` is used if user systemd
> +	timers are available, otherwise, `crontab` is used. On all other systems,
> +	`crontab` is used.
> +
>   
>   TROUBLESHOOTING
>   ---------------
> @@ -279,6 +293,52 @@ schedule to ensure you are executing the correct binaries in your
>   schedule.
>   
>   
> +BACKGROUND MAINTENANCE ON LINUX SYSTEMD SYSTEMS
> +-----------------------------------------------
> +
> +While Linux supports `cron`, depending on the distribution, `cron` may
> +be an optional package not necessarily installed. On modern Linux
> +distributions, systemd timers are superseding it.
> +
> +If user systemd timers are available, they will be used as a replacement
> +of `cron`.
> +
> +In this case, `git maintenance start` will create user systemd timer units
> +and start the timers. The current list of user-scheduled tasks can be found
> +by running `systemctl --user list-timers`. The timers written by `git
> +maintenance start` are similar to this:
> +
> +-----------------------------------------------------------------------
> +$ systemctl --user list-timers
> +NEXT                         LEFT          LAST                         PASSED     UNIT                         ACTIVATES
> +Thu 2021-04-29 19:00:00 CEST 42min left    Thu 2021-04-29 18:00:11 CEST 17min ago  git-maintenance@hourly.timer git-maintenance@hourly.service
> +Fri 2021-04-30 00:00:00 CEST 5h 42min left Thu 2021-04-29 00:00:11 CEST 18h ago    git-maintenance@daily.timer  git-maintenance@daily.service
> +Mon 2021-05-03 00:00:00 CEST 3 days left   Mon 2021-04-26 00:00:11 CEST 3 days ago git-maintenance@weekly.timer git-maintenance@weekly.service
> +-----------------------------------------------------------------------
> +
> +One timer is registered for each `--schedule=<frequency>` option.
> +
> +The definition of the systemd units can be inspected in the following files:
> +
> +-----------------------------------------------------------------------
> +~/.config/systemd/user/git-maintenance@.timer
> +~/.config/systemd/user/git-maintenance@.service
> +~/.config/systemd/user/timers.target.wants/git-maintenance@hourly.timer
> +~/.config/systemd/user/timers.target.wants/git-maintenance@daily.timer
> +~/.config/systemd/user/timers.target.wants/git-maintenance@weekly.timer
> +-----------------------------------------------------------------------
> +
> +`git maintenance start` will overwrite these files and start the timer
> +again with `systemctl --user`, so any customization should be done by
> +creating a drop-in file, i.e. a `.conf` suffixed file in the
> +`~/.config/systemd/user/git-maintenance@.service.d` directory.
> +
> +`git maintenance stop` will stop the user systemd timers and delete
> +the above mentioned files.
> +
> +For more details, see systemd.timer(5)
> +
> +
>   BACKGROUND MAINTENANCE ON MACOS SYSTEMS
>   ---------------------------------------
>   
> diff --git a/builtin/gc.c b/builtin/gc.c
> index ef7226d7bc..7c72aa3b99 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -1544,6 +1544,15 @@ static const char *get_frequency(enum schedule_priority schedule)
>   	}
>   }
>   
> +static int is_launchctl_available(const char *cmd)

None of these is_..._available() function needs the cmd parameter. It 
matches the existing pattern of the existing ..._update_schedule() 
functions but they don't really need the cmd argument either so I would 
drop the argument for the new functions.

> +{
> +#ifdef __APPLE__
> +	return 1;
> +#else
> +	return 0;
> +#endif
> +}
> +
>   static char *launchctl_service_name(const char *frequency)
>   {
>   	struct strbuf label = STRBUF_INIT;
> @@ -1710,6 +1719,15 @@ static int launchctl_update_schedule(int run_maintenance, int fd, const char *cm
>   		return launchctl_remove_plists(cmd);
>   }
>   
> +static int is_schtasks_available(const char *cmd)
> +{
> +#ifdef GIT_WINDOWS_NATIVE
> +	return 1;
> +#else
> +	return 0;
> +#endif
> +}
> +
>   static char *schtasks_task_name(const char *frequency)
>   {
>   	struct strbuf label = STRBUF_INIT;
> @@ -1872,6 +1890,28 @@ static int schtasks_update_schedule(int run_maintenance, int fd, const char *cmd
>   		return schtasks_remove_tasks(cmd);
>   }
>   
> +static int is_crontab_available(const char *cmd)
> +{
> +	static int cached_result = -1;
> +	struct child_process child = CHILD_PROCESS_INIT;
> +
> +	if (cached_result != -1)
> +		return cached_result;
> +
> +	strvec_split(&child.args, cmd);
> +	strvec_push(&child.args, "-l");
> +	child.no_stdin = 1;
> +	child.no_stdout = 1;
> +	child.no_stderr = 1;
> +	child.silent_exec_failure = 1;
> +
> +	if (start_command(&child))
> +		return cached_result = 0;
> +	/* Ignore exit code, as an empty crontab will return error. */
> +	finish_command(&child);
> +	return cached_result = 1;
> +}
> +
>   #define BEGIN_LINE "# BEGIN GIT MAINTENANCE SCHEDULE"
>   #define END_LINE "# END GIT MAINTENANCE SCHEDULE"
>   
> @@ -1959,61 +1999,348 @@ static int crontab_update_schedule(int run_maintenance, int fd, const char *cmd)
>   	return result;
>   }
>   
> +static int is_systemd_timer_available(const char *cmd)
> +{
> +#ifdef __linux__
> +	static int cached_result = -1;
> +	struct child_process child = CHILD_PROCESS_INIT;
> +
> +	if (cached_result != -1)
> +		return cached_result;
> +
> +	strvec_split(&child.args, cmd);
> +	strvec_pushl(&child.args, "--user", "list-timers", NULL);
> +	child.no_stdin = 1;
> +	child.no_stdout = 1;
> +	child.no_stderr = 1;
> +	child.silent_exec_failure = 1;
> +
> +	if (start_command(&child))
> +		return cached_result = 0;

This is maybe a bit too clever. It would be clearer to separate the 
assignment of the cached result and the return statement.

> +	if (finish_command(&child))
> +		return cached_result = 0;
> +	return cached_result = 1;
> +#else
> +	return 0;
> +#endif
> +}
> +
> +static char *xdg_config_home_systemd(const char *filename)
> +{
> +	const char *home, *config_home;
> +
> +	assert(filename);
> +	config_home = getenv("XDG_CONFIG_HOME");
> +	if (config_home && *config_home)
> +		return mkpathdup("%s/systemd/user/%s", config_home, filename);
> +
> +	home = getenv("HOME");
> +	if (home)
> +		return mkpathdup("%s/.config/systemd/user/%s", home, filename);
> +
> +	die(_("failed to get $XDG_CONFIG_HOME and $HOME"));
> +}

This is largely a copy of xdg_config_home(), I would prefer to see this 
function calling that rather than duplicating it.

static char *xdg_config_home_systemd(const char *filename)
{
     struct strbuf buf = STRBUF_INIT;
     char *path;

     strbuf_addf(&buf, "systemd/user/%s", filename);
     path = xdg_config_home(buf.buf);
     strbuf_release(&buf);
     return path;
}

Even better would be to modify xdg_config_home() to take a format string.

> +static int systemd_timer_enable_unit(int enable,
> +				     enum schedule_priority schedule,
> +				     const char *cmd)

The cmd argument is pointless, it will always be "systemctl" and you 
have even hard coded that value into the error message below.

> +{
> +	struct child_process child = CHILD_PROCESS_INIT;
> +	const char *frequency = get_frequency(schedule);
> +
> +	strvec_split(&child.args, cmd);
> +	strvec_pushl(&child.args, "--user", enable ? "enable" : "disable",
> +		     "--now", NULL);
> +	strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
> +
> +	if (start_command(&child))
> +		die(_("failed to run systemctl"));
> +	return finish_command(&child);

There is a strange mix of dying and returning an error here. Also should 
this be printing a message if finish_command() returns a non-zero value? 
If systemctl fails then there is not much we can do to recover so dying 
on any error might be ok unless we need to perform some cleanup if it 
fails like removing the timer unit files.

What is the exit code of systemctl if a unit is already enabled and we 
try to enbale it again (and the same for disabling a disabled unit)?

> +}
> +
> +static int systemd_timer_delete_unit_templates(void)
> +{
> +	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
> +	unlink(filename);

This is missing error handling (and below). If there is a good reason to 
ignore the errors then there is no point in returning a value from this 
function

> +	free(filename);
> +
> +	filename = xdg_config_home_systemd("git-maintenance@.service");
> +	unlink(filename);
> +	free(filename);
> +
> +	return 0;
> +}
> +
> +static int systemd_timer_delete_units(const char *cmd)
> +{
> +	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY, cmd) ||
> +	       systemd_timer_enable_unit(0, SCHEDULE_DAILY, cmd) ||
> +	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY, cmd) ||
> +	       systemd_timer_delete_unit_templates();
> +}
> +
> +static int systemd_timer_write_unit_templates(const char *exec_path)
> +{
> +	char *filename;
> +	FILE *file;
> +	const char *unit;
> +
> +	filename = xdg_config_home_systemd("git-maintenance@.timer");
> +	if (safe_create_leading_directories(filename))
> +		die(_("failed to create directories for '%s'"), filename);
> +	file = xfopen(filename, "w");
> +	free(filename);
> +
> +	unit = "# This file was created and is maintained by Git.\n"
> +	       "# Any edits made in this file might be replaced in the future\n"
> +	       "# by a Git command.\n"
> +	       "\n"
> +	       "[Unit]\n"
> +	       "Description=Optimize Git repositories data\n"
> +	       "\n"
> +	       "[Timer]\n"
> +	       "OnCalendar=%i\n"
> +	       "Persistent=true\n"
> +	       "\n"
> +	       "[Install]\n"
> +	       "WantedBy=timers.target\n";
> +	fputs(unit, file);
> +	fclose(file);
> +
> +	filename = xdg_config_home_systemd("git-maintenance@.service");
> +	if (safe_create_leading_directories(filename))

Haven't we already created this path if it was missing above for the 
first file?

> +		die(_("failed to create directories for '%s'"), filename);
> +	file = xfopen(filename, "w");

Do we want to try and remove the first file if we cannot write the 
second file to leave the file system in a consitent state? If so you'll 
need to use something like fopen_or_warn() and check the return value.

> +	free(filename);
> +
> +	unit = "# This file was created and is maintained by Git.\n"
> +	       "# Any edits made in this file might be replaced in the future\n"
> +	       "# by a Git command.\n"
> +	       "\n"
> +	       "[Unit]\n"
> +	       "Description=Optimize Git repositories data\n"
> +	       "\n"
> +	       "[Service]\n"
> +	       "Type=oneshot\n"
> +	       "ExecStart=\"%1$s/git\" --exec-path=\"%1$s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"
> +	       "LockPersonality=yes\n"
> +	       "MemoryDenyWriteExecute=yes\n"
> +	       "NoNewPrivileges=yes\n"
> +	       "RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6\n"
> +	       "RestrictNamespaces=yes\n"
> +	       "RestrictRealtime=yes\n"
> +	       "RestrictSUIDSGID=yes\n"

After a quick read of the systemd.exec man page it is unclear to me if 
these Restrict... lines are needed as we already have 
NoNewPrivileges=yes - maybe they have some effect if `git maintence` is 
run as root?

> +	       "SystemCallArchitectures=native\n"
> +	       "SystemCallFilter=@system-service\n";
> +	fprintf(file, unit, exec_path);
> +	fclose(file);
> +
> +	return 0;

As we don't return any errors but die instead there is no need to return 
a value. On the other hand maybe we should be checking the return values 
of fclose() and possibly fputs() / fprint().

> +}
> +
> +static int systemd_timer_setup_units(const char *cmd)
> +{
> +	const char *exec_path = git_exec_path();
> +
> +	return systemd_timer_write_unit_templates(exec_path) ||
> +	       systemd_timer_enable_unit(1, SCHEDULE_HOURLY, cmd) ||
> +	       systemd_timer_enable_unit(1, SCHEDULE_DAILY, cmd) ||
> +	       systemd_timer_enable_unit(1, SCHEDULE_WEEKLY, cmd);

If any step fails do we need to try and reverse the preceding steps to 
avoid leaving so units enabled but not others. In practice this is 
probably pretty unlikely so maybe we don't need to worry.

> +}
> +
> +static int systemd_timer_update_schedule(int run_maintenance, int fd,
> +					 const char *cmd)

No need for the cmd argument

> +{
> +	if (run_maintenance)
> +		return systemd_timer_setup_units(cmd);
> +	else
> +		return systemd_timer_delete_units(cmd);
> +}
> +
> +enum scheduler {
> +	SCHEDULER_INVALID = -1,
> +	SCHEDULER_AUTO = 0,
> +	SCHEDULER_CRON = 1,
> +	SCHEDULER_SYSTEMD = 2,
> +	SCHEDULER_LAUNCHCTL = 3,
> +	SCHEDULER_SCHTASKS = 4,
> +};
> +
> +static const struct {
> +	int (*is_available)(const char *cmd);
> +	int (*update_schedule)(int run_maintenance, int fd, const char *cmd);
> +	const char *cmd;
> +} scheduler_fn[] = {
> +	[SCHEDULER_CRON] = { is_crontab_available, crontab_update_schedule,
> +			     "crontab" },
> +	[SCHEDULER_SYSTEMD] = { is_systemd_timer_available,
> +				systemd_timer_update_schedule, "systemctl" },
> +	[SCHEDULER_LAUNCHCTL] = { is_launchctl_available,
> +				  launchctl_update_schedule, "launchctl" },
> +	[SCHEDULER_SCHTASKS] = { is_schtasks_available,
> +				 schtasks_update_schedule, "schtasks" },
> +};
> +
> +static enum scheduler parse_scheduler(const char *value)
> +{
> +	if (!value)
> +		return SCHEDULER_INVALID;
> +	else if (!strcasecmp(value, "auto"))
> +		return SCHEDULER_AUTO;
> +	else if (!strcasecmp(value, "cron") || !strcasecmp(value, "crontab"))
> +		return SCHEDULER_CRON;
> +	else if (!strcasecmp(value, "systemd") ||
> +		 !strcasecmp(value, "systemd-timer"))
> +		return SCHEDULER_SYSTEMD;
> +	else if (!strcasecmp(value, "launchctl"))
> +		return SCHEDULER_LAUNCHCTL;
> +	else if (!strcasecmp(value, "schtasks"))
> +		return SCHEDULER_SCHTASKS;
> +	else
> +		return SCHEDULER_INVALID;
> +}
> +
> +static int maintenance_opt_scheduler(const struct option *opt, const char *arg,
> +				     int unset)
> +{
> +	enum scheduler *scheduler = opt->value;
> +
> +	if (unset)
> +		die(_("--no-scheduler is not allowed"));

As others have said it would be better to BUG() here and pass 
PARSE_OPT_NONEG when defining the option below.

> +
> +	*scheduler = parse_scheduler(arg);
> +
> +	if (*scheduler == SCHEDULER_INVALID)
> +		die(_("unrecognized --scheduler argument '%s'"), arg);
> +
> +	return 0;
> +}
> +
> +struct maintenance_start_opts {
> +	enum scheduler scheduler;
> +};
> +
> +static void resolve_auto_scheduler(enum scheduler *scheduler)
> +{
> +	if (*scheduler != SCHEDULER_AUTO)
> +		return;
> +
>   #if defined(__APPLE__)
> -static const char platform_scheduler[] = "launchctl";
> +	*scheduler = SCHEDULER_LAUNCHCTL;
> +	return;
> +
>   #elif defined(GIT_WINDOWS_NATIVE)
> -static const char platform_scheduler[] = "schtasks";
> +	*scheduler = SCHEDULER_SCHTASKS;
> +	return;
> +
> +#elif defined(__linux__)
> +	if (is_systemd_timer_available("systemctl"))
> +		*scheduler = SCHEDULER_SYSTEMD;
> +	else if (is_crontab_available("crontab"))
> +		*scheduler = SCHEDULER_CRON;
> +	else
> +		die(_("neither systemd timers nor crontab are available"));
> +	return;
> +
>   #else
> -static const char platform_scheduler[] = "crontab";
> +	*scheduler = SCHEDULER_CRON;
> +	return;
>   #endif
> +}
>   
> -static int update_background_schedule(int enable)
> +static void validate_scheduler(enum scheduler scheduler)
>   {
> -	int result;
> -	const char *scheduler = platform_scheduler;
> -	const char *cmd = scheduler;
> +	const char *cmd;
> +
> +	if (scheduler == SCHEDULER_INVALID)
> +		BUG("invalid scheduler");
> +	if (scheduler == SCHEDULER_AUTO)
> +		BUG("resolve_auto_scheduler should have been called before");
> +
> +	cmd = scheduler_fn[scheduler].cmd;
> +	if (!scheduler_fn[scheduler].is_available(cmd))
> +		die(_("%s scheduler is not available"), cmd);
> +}
> +
> +static int update_background_schedule(const struct maintenance_start_opts *opts,
> +				      int enable)
> +{
> +	unsigned int i;
> +	int res, result = 0;
> +	enum scheduler scheduler;
> +	const char *cmd = NULL;
>   	char *testing;
>   	struct lock_file lk;
>   	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);
>   
> +	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
> +		return error(_("another process is scheduling background maintenance"));
> +
>   	testing = xstrdup_or_null(getenv("GIT_TEST_MAINT_SCHEDULER"));
>   	if (testing) {
>   		char *sep = strchr(testing, ':');
>   		if (!sep)
>   			die("GIT_TEST_MAINT_SCHEDULER unparseable: %s", testing);
>   		*sep = '\0';
> -		scheduler = testing;
> +		scheduler = parse_scheduler(testing);
>   		cmd = sep + 1;
> +		result = scheduler_fn[scheduler].update_schedule(
> +			enable, get_lock_file_fd(&lk), cmd);
> +		goto done;
>   	}
>   
> -	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
> -		return error(_("another process is scheduling background maintenance"));
> -
> -	if (!strcmp(scheduler, "launchctl"))
> -		result = launchctl_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else if (!strcmp(scheduler, "schtasks"))
> -		result = schtasks_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else if (!strcmp(scheduler, "crontab"))
> -		result = crontab_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else
> -		die("unknown background scheduler: %s", scheduler);
> +	for (i = 1; i < ARRAY_SIZE(scheduler_fn); i++) {
> +		int enable_scheduler = enable && (opts->scheduler == i);
> +		cmd = scheduler_fn[i].cmd;
> +		if (!scheduler_fn[i].is_available(cmd))
> +			continue;
> +		res = scheduler_fn[i].update_schedule(
> +			enable_scheduler, get_lock_file_fd(&lk), cmd);
> +		if (enable_scheduler)
> +			result = res;

If I have understood the code correctly then if systemd timers are 
currently enabled and the user runs `git maintenance --scheduler=cron` 
then cron will be enabled and the loop will quit before we get the the 
entry to disable systemd timers leaving both running. I think it would 
be cleaner and clearer to loop over the schedulers disabling them first 
and then enable the one the user has selected.

> +	}
>   
> +done:
>   	rollback_lock_file(&lk);
>   	free(testing);
>   	return result;
>   }
>   
> -static int maintenance_start(void)
> +static const char *const builtin_maintenance_start_usage[] = {
> +	N_("git maintenance start [--scheduler=<scheduler>]"), NULL
> +};
> +
> +static int maintenance_start(int argc, const char **argv, const char *prefix)
>   {
> +	struct maintenance_start_opts opts;

struct maintenance_start_opts opts = { 0 };

would avoid the need for the memset() call below

Thanks again for working on this. Best Wishes

Phillip

> +	struct option builtin_maintenance_start_options[] = {
> +		OPT_CALLBACK(
> +			0, "scheduler", &opts.scheduler, N_("scheduler"),
> +			N_("scheduler to use to trigger git maintenance run"),
> +			maintenance_opt_scheduler),
> +		OPT_END()
> +	};
> +	memset(&opts, 0, sizeof(opts));
> +
> +	argc = parse_options(argc, argv, prefix,
> +			     builtin_maintenance_start_options,
> +			     builtin_maintenance_start_usage, 0);
> +
> +	resolve_auto_scheduler(&opts.scheduler);
> +	validate_scheduler(opts.scheduler);
> +
> +	if (argc > 0)
> +		usage_with_options(builtin_maintenance_start_usage,
> +				   builtin_maintenance_start_options);
> +
>   	if (maintenance_register())
>   		warning(_("failed to add repo to global config"));
> -
> -	return update_background_schedule(1);
> +	return update_background_schedule(&opts, 1);
>   }
>   
>   static int maintenance_stop(void)
>   {
> -	return update_background_schedule(0);
> +	return update_background_schedule(NULL, 0);
>   }
>   
>   static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
> @@ -2027,7 +2354,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
>   	if (!strcmp(argv[1], "run"))
>   		return maintenance_run(argc - 1, argv + 1, prefix);
>   	if (!strcmp(argv[1], "start"))
> -		return maintenance_start();
> +		return maintenance_start(argc - 1, argv + 1, prefix);
>   	if (!strcmp(argv[1], "stop"))
>   		return maintenance_stop();
>   	if (!strcmp(argv[1], "register"))
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 2412d8c5c0..6e6316cd90 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -20,6 +20,20 @@ test_xmllint () {
>   	fi
>   }
>   
> +test_lazy_prereq SYSTEMD_ANALYZE '
> +	systemd-analyze --help >out &&
> +	grep verify out
> +'
> +
> +test_systemd_analyze_verify () {
> +	if test_have_prereq SYSTEMD_ANALYZE
> +	then
> +		systemd-analyze verify "$@"
> +	else
> +		true
> +	fi
> +}
> +
>   test_expect_success 'help text' '
>   	test_expect_code 129 git maintenance -h 2>err &&
>   	test_i18ngrep "usage: git maintenance <subcommand>" err &&
> @@ -615,6 +629,43 @@ test_expect_success 'start and stop Windows maintenance' '
>   	test_cmp expect args
>   '
>   
> +test_expect_success 'start and stop Linux/systemd maintenance' '
> +	write_script print-args <<-\EOF &&
> +	echo $* >>args
> +	EOF
> +
> +	rm -f args &&
> +	GIT_TEST_MAINT_SCHEDULER="systemd:./print-args" git maintenance start &&
> +
> +	# start registers the repo
> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
> +
> +	test_systemd_analyze_verify "$HOME/.config/systemd/user/git-maintenance@.service" &&
> +
> +	rm -f expect &&
> +	for frequency in hourly daily weekly
> +	do
> +		echo "--user enable --now git-maintenance@${frequency}.timer" >>expect || return 1
> +	done &&
> +	test_cmp expect args &&
> +
> +	rm -f args &&
> +	GIT_TEST_MAINT_SCHEDULER="systemd:./print-args" git maintenance stop &&
> +
> +	# stop does not unregister the repo
> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
> +
> +	test_path_is_missing "$HOME/.config/systemd/user/git-maintenance@.timer" &&
> +	test_path_is_missing "$HOME/.config/systemd/user/git-maintenance@.service" &&
> +
> +	rm -f expect &&
> +	for frequency in hourly daily weekly
> +	do
> +		echo "--user disable --now git-maintenance@${frequency}.timer" >>expect || return 1
> +	done &&
> +	test_cmp expect args
> +'
> +
>   test_expect_success 'register preserves existing strategy' '
>   	git config maintenance.strategy none &&
>   	git maintenance register &&
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-10 18:03     ` Phillip Wood
@ 2021-05-10 18:25       ` Eric Sunshine
  2021-05-10 20:09         ` Phillip Wood
  0 siblings, 1 reply; 30+ messages in thread
From: Eric Sunshine @ 2021-05-10 18:25 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Lénaïc Huard, Git List, Junio C Hamano, Derrick Stolee,
	brian m . carlson, Bagas Sanjaya,
	Đoàn Trần Công Danh,
	Ævar Arnfjörð Bjarmason

On Mon, May 10, 2021 at 2:04 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
> On 09/05/2021 22:32, Lénaïc Huard wrote:
> > +static int systemd_timer_enable_unit(int enable,
> > +                                  enum schedule_priority schedule,
> > +                                  const char *cmd)
>
> The cmd argument is pointless, it will always be "systemctl" and you
> have even hard coded that value into the error message below.

The reason that `cmd` is passed around everywhere is that the actual
command can be overridden by GIT_TEST_MAINT_SCHEDULER which allows the
test script to mock up a scheduler command rather than running the
real scheduler command. I haven't read the new version of the patch
closely yet, but after a quick scan, I'm pretty confident that this is
still the case (despite the aggressive changes the patch makes to the
areas around GIT_TEST_MAINT_SCHEDULER).

As for hardcoding the command name in the error message, that seems
perfectly fine since, under normal circumstances, it _will_ be that
command (it's only different when testing).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-09 21:32   ` [PATCH v2 1/1] " Lénaïc Huard
  2021-05-10  1:20     ` Đoàn Trần Công Danh
  2021-05-10 18:03     ` Phillip Wood
@ 2021-05-10 19:15     ` Martin Ågren
  2021-05-11 14:50     ` Phillip Wood
  2021-05-11 17:31     ` Derrick Stolee
  4 siblings, 0 replies; 30+ messages in thread
From: Martin Ågren @ 2021-05-10 19:15 UTC (permalink / raw)
  To: Lénaïc Huard
  Cc: Git Mailing List, Junio C Hamano, Derrick Stolee, Eric Sunshine,
	brian m . carlson, Bagas Sanjaya, Phillip Wood,
	Đoàn Trần Công Danh,
	Ævar Arnfjörð Bjarmason

Hi Lénaïc,

On Sun, 9 May 2021 at 23:37, Lénaïc Huard <lenaic@lhuard.fr> wrote:
> The default value is `auto` which chooses a suitable scheduler for the
> system.
> On Linux, if user systemd timers are available, they will be used as git
> maintenance scheduler. If not, `cron` will be used if it is available.
> If none is available, it will fail.

I understand your reasoning for going with systemd-timer over cron,
especially the part about knowing that the thing is actually running.

> +--scheduler=auto|crontab|systemd-timer|launchctl|schtasks::

This says "systemd-timer"...

> +       By default or when `auto` is specified, the most appropriate scheduler
> +       for the system is used. On MacOS, `launchctl` is used. On Windows,
> +       `schtasks` is used. On Linux, `systemd-timers` is used if user systemd

... this says "systemd-timers". Should those two be the same? (Which?)

> +       timers are available, otherwise, `crontab` is used. On all other systems,
> +       `crontab` is used.

So to be clear, I don't have a horse in this race. A few years ago I
would have foreseen all kinds of reactions to the implication that
systemd-timers would be "the most appropriate scheduler [...] on Linux".
Maybe those times are behind us now. In the commit message, you say "a
suitable", which reads a little bit less opinionated (to me).

That's just a minor point; feel free to disregard.

> +For more details, see systemd.timer(5)

Missing trailing ".".

A cursory grepping of our docs suggests this should be monospace
(`systemd.timer(5)`). There aren't that many places where we refer to
non-git manpages, thanks for doing so.

That's the only nit I found to make about the markup in the
documentation. Thanks for your attention to details. :-)

Martin

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-10 18:25       ` Eric Sunshine
@ 2021-05-10 20:09         ` Phillip Wood
  2021-05-10 20:52           ` Eric Sunshine
  0 siblings, 1 reply; 30+ messages in thread
From: Phillip Wood @ 2021-05-10 20:09 UTC (permalink / raw)
  To: Eric Sunshine, Phillip Wood
  Cc: Lénaïc Huard, Git List, Junio C Hamano, Derrick Stolee,
	brian m . carlson, Bagas Sanjaya,
	Đoàn Trần Công Danh,
	Ævar Arnfjörð Bjarmason

Hi Eric

On 10/05/2021 19:25, Eric Sunshine wrote:
> On Mon, May 10, 2021 at 2:04 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>> On 09/05/2021 22:32, Lénaïc Huard wrote:
>>> +static int systemd_timer_enable_unit(int enable,
>>> +                                  enum schedule_priority schedule,
>>> +                                  const char *cmd)
>>
>> The cmd argument is pointless, it will always be "systemctl" and you
>> have even hard coded that value into the error message below.
> 
> The reason that `cmd` is passed around everywhere is that the actual
> command can be overridden by GIT_TEST_MAINT_SCHEDULER which allows the
> test script to mock up a scheduler command rather than running the
> real scheduler command. I haven't read the new version of the patch
> closely yet, but after a quick scan, I'm pretty confident that this is
> still the case (despite the aggressive changes the patch makes to the
> areas around GIT_TEST_MAINT_SCHEDULER).

Thanks for pointing that out I'd somehow glossed over the 
GIT_TEST_MAINT_SCHEDULER code, I agree it looks like the patch takes 
care to keep it working.

It is outside the scope of this patch but a possibly nicer pattern would 
be to have a function get_command_name(const char *default) that checks 
GIT_TEST_MAINT_SCHEDULER and returns the command name from that or the 
default if it is not set. We would then call that function to get the 
command name when we want to run a command. That way all the extra 
complexity is localized around the command call (and consists of a 
single function call), the usual command name is visible in the function 
calling the command and we'd avoid littering all the function signatures 
with a argument that is only relevant for testing.

> As for hardcoding the command name in the error message, that seems
> perfectly fine since, under normal circumstances, it _will_ be that
> command (it's only different when testing).

I agree, thanks

Phillip

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-10 20:09         ` Phillip Wood
@ 2021-05-10 20:52           ` Eric Sunshine
  0 siblings, 0 replies; 30+ messages in thread
From: Eric Sunshine @ 2021-05-10 20:52 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Lénaïc Huard, Git List, Junio C Hamano, Derrick Stolee,
	brian m . carlson, Bagas Sanjaya,
	Đoàn Trần Công Danh,
	Ævar Arnfjörð Bjarmason

On Mon, May 10, 2021 at 4:09 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
> It is outside the scope of this patch but a possibly nicer pattern would
> be to have a function get_command_name(const char *default) that checks
> GIT_TEST_MAINT_SCHEDULER and returns the command name from that or the
> default if it is not set. We would then call that function to get the
> command name when we want to run a command. That way all the extra
> complexity is localized around the command call (and consists of a
> single function call), the usual command name is visible in the function
> calling the command and we'd avoid littering all the function signatures
> with a argument that is only relevant for testing.

Yup, that would be a nice eventual cleanup. I agree that it is outside
the scope of this submission.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-09 21:32   ` [PATCH v2 1/1] " Lénaïc Huard
                       ` (2 preceding siblings ...)
  2021-05-10 19:15     ` Martin Ågren
@ 2021-05-11 14:50     ` Phillip Wood
  2021-05-11 17:31     ` Derrick Stolee
  4 siblings, 0 replies; 30+ messages in thread
From: Phillip Wood @ 2021-05-11 14:50 UTC (permalink / raw)
  To: Lénaïc Huard, git
  Cc: Junio C Hamano, Derrick Stolee, Eric Sunshine, brian m . carlson,
	Bagas Sanjaya, Đoàn Trần Công Danh,
	Ævar Arnfjörð Bjarmason

Hi Lénaïc

I've added some comments about testing

On 09/05/2021 22:32, Lénaïc Huard wrote:
>[...]
> +struct maintenance_start_opts {
> +	enum scheduler scheduler;
> +};
> +
> +static void resolve_auto_scheduler(enum scheduler *scheduler)
> +{
> +	if (*scheduler != SCHEDULER_AUTO)
> +		return;
> +
>   #if defined(__APPLE__)
> -static const char platform_scheduler[] = "launchctl";
> +	*scheduler = SCHEDULER_LAUNCHCTL;
> +	return;
> +
>   #elif defined(GIT_WINDOWS_NATIVE)
> -static const char platform_scheduler[] = "schtasks";
> +	*scheduler = SCHEDULER_SCHTASKS;
> +	return;
> +
> +#elif defined(__linux__)
> +	if (is_systemd_timer_available("systemctl"))
> +		*scheduler = SCHEDULER_SYSTEMD;
> +	else if (is_crontab_available("crontab"))
> +		*scheduler = SCHEDULER_CRON;
> +	else
> +		die(_("neither systemd timers nor crontab are available"));
> +	return;
> +
>   #else
> -static const char platform_scheduler[] = "crontab";
> +	*scheduler = SCHEDULER_CRON;
> +	return;
>   #endif
> +}

As it stands this function is untested and there is no way to test it 
with the current setup. There are two difficulties with testing it (i) 
it uses conditional compilation and (ii) there is no way to fake crontab 
and systemctl with the current test setup. I think we can address the 
latter (see below) and handle the condition compilation issues using 
test prerequisites if necessary.

> -static int update_background_schedule(int enable)
> +static void validate_scheduler(enum scheduler scheduler)
>   {
> -	int result;
> -	const char *scheduler = platform_scheduler;
> -	const char *cmd = scheduler;
> +	const char *cmd;
> +
> +	if (scheduler == SCHEDULER_INVALID)
> +		BUG("invalid scheduler");
> +	if (scheduler == SCHEDULER_AUTO)
> +		BUG("resolve_auto_scheduler should have been called before");
> +
> +	cmd = scheduler_fn[scheduler].cmd;
> +	if (!scheduler_fn[scheduler].is_available(cmd))
> +		die(_("%s scheduler is not available"), cmd);
> +}
> +
> +static int update_background_schedule(const struct maintenance_start_opts *opts,
> +				      int enable)
> +{
> +	unsigned int i;
> +	int res, result = 0;
> +	enum scheduler scheduler;
> +	const char *cmd = NULL;
>   	char *testing;
>   	struct lock_file lk;
>   	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);
>   
> +	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
> +		return error(_("another process is scheduling background maintenance"));
> +
>   	testing = xstrdup_or_null(getenv("GIT_TEST_MAINT_SCHEDULER"));
>   	if (testing) {
>   		char *sep = strchr(testing, ':');
>   		if (!sep)
>   			die("GIT_TEST_MAINT_SCHEDULER unparseable: %s", testing);
>   		*sep = '\0';
> -		scheduler = testing;
> +		scheduler = parse_scheduler(testing);
>   		cmd = sep + 1;
> +		result = scheduler_fn[scheduler].update_schedule(
> +			enable, get_lock_file_fd(&lk), cmd);
> +		goto done;
>   	}
>   
> -	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
> -		return error(_("another process is scheduling background maintenance"));
> -
> -	if (!strcmp(scheduler, "launchctl"))
> -		result = launchctl_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else if (!strcmp(scheduler, "schtasks"))
> -		result = schtasks_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else if (!strcmp(scheduler, "crontab"))
> -		result = crontab_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else
> -		die("unknown background scheduler: %s", scheduler);
> +	for (i = 1; i < ARRAY_SIZE(scheduler_fn); i++) {
> +		int enable_scheduler = enable && (opts->scheduler == i);
> +		cmd = scheduler_fn[i].cmd;
> +		if (!scheduler_fn[i].is_available(cmd))
> +			continue;
> +		res = scheduler_fn[i].update_schedule(
> +			enable_scheduler, get_lock_file_fd(&lk), cmd);
> +		if (enable_scheduler)
> +			result = res;
> +	}

This loop which is responsible for disabling the existing scheduler and 
enabling a new one is completely untested. I think that before this 
patch special casing the testing above still ran the important code, 
however now we fail to test an important aspect of the business logic. 
As we can only fake one command name the current test setup is 
insufficient to fix this. I think the best solution would be to mock 
systemctl, crontab, and the other commands in the test environment.

	mkdir bin &&
	PATH="$(pwd)/bin:$PATH" &&
	test_write_script systemctl <<-\EOF &&
	# Mock systemctl, set FAKE_SYSTEMD=0 in the
	# environment to fake systemctl missing
	case "$*" in
		"--user list-timers") test ${FAKE_SYSTEMD:-1} = 1;
				      exit $?;;
		*) printf "%s\n" "$*";;
	esac	
	EOF
	# and so on for crontab etc

Then it would be possible to test the various values 
--scheduler=<scheduler> including 'auto' (by setting FAKE_SYSTEMD=0 or 
FAKE_CRONTAB=0) and that we disable cron when enabling systemd and vice 
versa. Mocking the commands would also allow us to cleanup the code in 
gc.c as it would no longer need to pass command names around.

> +done:
>   	rollback_lock_file(&lk);
>   	free(testing);
>   	return result;
>   }
>   
> -static int maintenance_start(void)
> +static const char *const builtin_maintenance_start_usage[] = {
> +	N_("git maintenance start [--scheduler=<scheduler>]"), NULL
> +};
> +
> +static int maintenance_start(int argc, const char **argv, const char *prefix)
>   {
> +	struct maintenance_start_opts opts;
> +	struct option builtin_maintenance_start_options[] = {
> +		OPT_CALLBACK(
> +			0, "scheduler", &opts.scheduler, N_("scheduler"),
> +			N_("scheduler to use to trigger git maintenance run"),
> +			maintenance_opt_scheduler),
> +		OPT_END()
> +	};
> +	memset(&opts, 0, sizeof(opts));
> +
> +	argc = parse_options(argc, argv, prefix,
> +			     builtin_maintenance_start_options,
> +			     builtin_maintenance_start_usage, 0);

This new command line option is completely untested

Best Wishes

Phillip

> +
> +	resolve_auto_scheduler(&opts.scheduler);
> +	validate_scheduler(opts.scheduler);
> +
> +	if (argc > 0)
> +		usage_with_options(builtin_maintenance_start_usage,
> +				   builtin_maintenance_start_options);
> +
>   	if (maintenance_register())
>   		warning(_("failed to add repo to global config"));
> -
> -	return update_background_schedule(1);
> +	return update_background_schedule(&opts, 1);
>   }
>   
>   static int maintenance_stop(void)
>   {
> -	return update_background_schedule(0);
> +	return update_background_schedule(NULL, 0);
>   }
>   
>   static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
> @@ -2027,7 +2354,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
>   	if (!strcmp(argv[1], "run"))
>   		return maintenance_run(argc - 1, argv + 1, prefix);
>   	if (!strcmp(argv[1], "start"))
> -		return maintenance_start();
> +		return maintenance_start(argc - 1, argv + 1, prefix);
>   	if (!strcmp(argv[1], "stop"))
>   		return maintenance_stop();
>   	if (!strcmp(argv[1], "register"))
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 2412d8c5c0..6e6316cd90 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -20,6 +20,20 @@ test_xmllint () {
>   	fi
>   }
>   
> +test_lazy_prereq SYSTEMD_ANALYZE '
> +	systemd-analyze --help >out &&
> +	grep verify out
> +'
> +
> +test_systemd_analyze_verify () {
> +	if test_have_prereq SYSTEMD_ANALYZE
> +	then
> +		systemd-analyze verify "$@"
> +	else
> +		true
> +	fi
> +}
> +
>   test_expect_success 'help text' '
>   	test_expect_code 129 git maintenance -h 2>err &&
>   	test_i18ngrep "usage: git maintenance <subcommand>" err &&
> @@ -615,6 +629,43 @@ test_expect_success 'start and stop Windows maintenance' '
>   	test_cmp expect args
>   '
>   
> +test_expect_success 'start and stop Linux/systemd maintenance' '
> +	write_script print-args <<-\EOF &&
> +	echo $* >>args
> +	EOF
> +
> +	rm -f args &&
> +	GIT_TEST_MAINT_SCHEDULER="systemd:./print-args" git maintenance start &&
> +
> +	# start registers the repo
> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
> +
> +	test_systemd_analyze_verify "$HOME/.config/systemd/user/git-maintenance@.service" &&
> +
> +	rm -f expect &&
> +	for frequency in hourly daily weekly
> +	do
> +		echo "--user enable --now git-maintenance@${frequency}.timer" >>expect || return 1
> +	done &&
> +	test_cmp expect args &&
> +
> +	rm -f args &&
> +	GIT_TEST_MAINT_SCHEDULER="systemd:./print-args" git maintenance stop &&
> +
> +	# stop does not unregister the repo
> +	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
> +
> +	test_path_is_missing "$HOME/.config/systemd/user/git-maintenance@.timer" &&
> +	test_path_is_missing "$HOME/.config/systemd/user/git-maintenance@.service" &&
> +
> +	rm -f expect &&
> +	for frequency in hourly daily weekly
> +	do
> +		echo "--user disable --now git-maintenance@${frequency}.timer" >>expect || return 1
> +	done &&
> +	test_cmp expect args
> +'
> +
>   test_expect_success 'register preserves existing strategy' '
>   	git config maintenance.strategy none &&
>   	git maintenance register &&
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-09 21:32   ` [PATCH v2 1/1] " Lénaïc Huard
                       ` (3 preceding siblings ...)
  2021-05-11 14:50     ` Phillip Wood
@ 2021-05-11 17:31     ` Derrick Stolee
  4 siblings, 0 replies; 30+ messages in thread
From: Derrick Stolee @ 2021-05-11 17:31 UTC (permalink / raw)
  To: Lénaïc Huard, git
  Cc: Junio C Hamano, Derrick Stolee, Eric Sunshine, brian m . carlson,
	Bagas Sanjaya, Phillip Wood,
	Đoàn Trần Công Danh,
	Ævar Arnfjörð Bjarmason

On 5/9/2021 5:32 PM, Lénaïc Huard wrote:
> The existing mechanism for scheduling background maintenance is done
> through cron. On Linux systems managed by systemd, systemd provides an
> alternative to schedule recurring tasks: systemd timers.

Thank you for working so hard on this systemd integration. I see you
have already received significant feedback on that portion, so I
wanted to focus my review on the other piece that exists in this patch.

> In order to choose which scheduler to use between `cron` and user
> systemd timers, a new option
> `--scheduler=auto|cron|systemd|launchctl|schtasks` has been added to
> `git maintenance start`.
> When `git maintenance start --scheduler=XXX` is run, it not only
> registers `git maintenance run` tasks in the scheduler XXX, it also
> removes the `git maintenance run` tasks from all the other schedulers to
> ensure we cannot have two schedulers launching concurrent identical
> tasks.
> 
> The default value is `auto` which chooses a suitable scheduler for the
> system.

This addition of the --scheduler option should be split into a patch
on its own. It requires significant refactoring of the existing code
in a way that distracts from your systemd work.

I'll highlight the portions of the diff that you could include in
a preliminary patch and save the systemd stuff for an addition on top
of that.

> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 80ddd33ceb..f012923333 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -181,6 +181,20 @@ OPTIONS
>  	`maintenance.<task>.enabled` configured as `true` are considered.
>  	See the 'TASKS' section for the list of accepted `<task>` values.
>  
> +--scheduler=auto|crontab|systemd-timer|launchctl|schtasks::
> +	When combined with the `start` subcommand, specify the scheduler
> +	to use to run the hourly, daily and weekly executions of
> +	`git maintenance run`.
> +	The possible values for `<scheduler>` depend on the system: `crontab`
> +	is available on POSIX systems, `systemd-timer` is available on Linux
> +	systems, `launchctl` is available on MacOS and `schtasks` is available
> +	on Windows.
> +	By default or when `auto` is specified, the most appropriate scheduler
> +	for the system is used. On MacOS, `launchctl` is used. On Windows,
> +	`schtasks` is used. On Linux, `systemd-timers` is used if user systemd
> +	timers are available, otherwise, `crontab` is used. On all other systems,
> +	`crontab` is used.
> +

This portion of the docs can be updated on its own (minus the systemd bits).
> diff --git a/builtin/gc.c b/builtin/gc.c
> index ef7226d7bc..7c72aa3b99 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -1544,6 +1544,15 @@ static const char *get_frequency(enum schedule_priority schedule)
>  	}
>  }
>  
> +static int is_launchctl_available(const char *cmd)
> +{
> +#ifdef __APPLE__
> +	return 1;
> +#else
> +	return 0;
> +#endif
> +}
> +
>  static char *launchctl_service_name(const char *frequency)
>  {
>  	struct strbuf label = STRBUF_INIT;
> @@ -1710,6 +1719,15 @@ static int launchctl_update_schedule(int run_maintenance, int fd, const char *cm
>  		return launchctl_remove_plists(cmd);
>  }
>  
> +static int is_schtasks_available(const char *cmd)
> +{
> +#ifdef GIT_WINDOWS_NATIVE
> +	return 1;
> +#else
> +	return 0;
> +#endif
> +}
> +
>  static char *schtasks_task_name(const char *frequency)
>  {
>  	struct strbuf label = STRBUF_INIT;
> @@ -1872,6 +1890,28 @@ static int schtasks_update_schedule(int run_maintenance, int fd, const char *cmd
>  		return schtasks_remove_tasks(cmd);
>  }
>  
> +static int is_crontab_available(const char *cmd)
> +{
> +	static int cached_result = -1;
> +	struct child_process child = CHILD_PROCESS_INIT;
> +
> +	if (cached_result != -1)
> +		return cached_result;
> +
> +	strvec_split(&child.args, cmd);
> +	strvec_push(&child.args, "-l");
> +	child.no_stdin = 1;
> +	child.no_stdout = 1;
> +	child.no_stderr = 1;
> +	child.silent_exec_failure = 1;
> +
> +	if (start_command(&child))
> +		return cached_result = 0;
> +	/* Ignore exit code, as an empty crontab will return error. */
> +	finish_command(&child);
> +	return cached_result = 1;
> +}
> +

These is_X_available() methods are valuable helpers. Adding
is_systemd_timer_available() in the second patch will be
simpler with this framework in place.

> +enum scheduler {
> +	SCHEDULER_INVALID = -1,
> +	SCHEDULER_AUTO = 0,
> +	SCHEDULER_CRON = 1,
> +	SCHEDULER_SYSTEMD = 2,
> +	SCHEDULER_LAUNCHCTL = 3,
> +	SCHEDULER_SCHTASKS = 4,
> +};
> +
> +static const struct {
> +	int (*is_available)(const char *cmd);
> +	int (*update_schedule)(int run_maintenance, int fd, const char *cmd);
> +	const char *cmd;
> +} scheduler_fn[] = {
> +	[SCHEDULER_CRON] = { is_crontab_available, crontab_update_schedule,
> +			     "crontab" },
> +	[SCHEDULER_SYSTEMD] = { is_systemd_timer_available,
> +				systemd_timer_update_schedule, "systemctl" },
> +	[SCHEDULER_LAUNCHCTL] = { is_launchctl_available,
> +				  launchctl_update_schedule, "launchctl" },
> +	[SCHEDULER_SCHTASKS] = { is_schtasks_available,
> +				 schtasks_update_schedule, "schtasks" },
> +};

This is also good to include, minus the systemd lines.

I would also like to see this declaration reformatted.
Something like this would be good:

static const struct {
	int (*is_available)(const char *cmd);
	int (*update_schedule)(int run_maintenance, int fd, const char *cmd);
	const char *cmd;
} scheduler_fn[] = {
	[SCHEDULER_CRON] = {
		.is_available = is_crontab_available,
		.update_schedule = crontab_update_schedule,
		.cmd = "crontab",
	},
	[SCHEDULER_LAUNCHCTL] = {
		.is_available = is_launchctl_available,
		.update_schedule = launchctl_update_schedule,
		.cmd = "launchctl",
	},
	[SCHEDULER_SCHTASKS] = {
		.is_available = is_schtasks_available,
		.update_schedule = schtasks_update_schedule,
		.cmd = "schtasks",
	},
};

The use of member names can help if we need to augment this
struct later, and the use of commas after the final terms of
each block helps the future diff if we add items to the end.

> +
> +static enum scheduler parse_scheduler(const char *value)
> +{
> +	if (!value)
> +		return SCHEDULER_INVALID;
> +	else if (!strcasecmp(value, "auto"))
> +		return SCHEDULER_AUTO;
> +	else if (!strcasecmp(value, "cron") || !strcasecmp(value, "crontab"))
> +		return SCHEDULER_CRON;
> +	else if (!strcasecmp(value, "systemd") ||
> +		 !strcasecmp(value, "systemd-timer"))
> +		return SCHEDULER_SYSTEMD;
> +	else if (!strcasecmp(value, "launchctl"))
> +		return SCHEDULER_LAUNCHCTL;
> +	else if (!strcasecmp(value, "schtasks"))
> +		return SCHEDULER_SCHTASKS;
> +	else
> +		return SCHEDULER_INVALID;
> +}

Good. The systemd stuff can be added in the second patch,
making a clear integration point.

> +static int maintenance_opt_scheduler(const struct option *opt, const char *arg,
> +				     int unset)
> +{
> +	enum scheduler *scheduler = opt->value;
> +
> +	if (unset)
> +		die(_("--no-scheduler is not allowed"));
> +
> +	*scheduler = parse_scheduler(arg);
> +
> +	if (*scheduler == SCHEDULER_INVALID)
> +		die(_("unrecognized --scheduler argument '%s'"), arg);
> +
> +	return 0;
> +}
> +
> +struct maintenance_start_opts {
> +	enum scheduler scheduler;
> +};

This struct that contains only the enum seems confusing to me.
Maybe it will make sense later.

> +static void resolve_auto_scheduler(enum scheduler *scheduler)
> +{
> +	if (*scheduler != SCHEDULER_AUTO)
> +		return;
> +
>  #if defined(__APPLE__)
> -static const char platform_scheduler[] = "launchctl";
> +	*scheduler = SCHEDULER_LAUNCHCTL;
> +	return;
> +
>  #elif defined(GIT_WINDOWS_NATIVE)
> -static const char platform_scheduler[] = "schtasks";
> +	*scheduler = SCHEDULER_SCHTASKS;
> +	return;
> +
> +#elif defined(__linux__)
> +	if (is_systemd_timer_available("systemctl"))
> +		*scheduler = SCHEDULER_SYSTEMD;
> +	else if (is_crontab_available("crontab"))
> +		*scheduler = SCHEDULER_CRON;
> +	else
> +		die(_("neither systemd timers nor crontab are available"));
> +	return;
> +
>  #else
> -static const char platform_scheduler[] = "crontab";
> +	*scheduler = SCHEDULER_CRON;
> +	return;
>  #endif
> +}

This diff looks pretty rough. I see that you are making
systemctl the default for Linux. Ok. This also seems like
it will not be testable in the test suite.

>  
> -static int update_background_schedule(int enable)
> +static void validate_scheduler(enum scheduler scheduler)
>  {
> -	int result;
> -	const char *scheduler = platform_scheduler;
> -	const char *cmd = scheduler;
> +	const char *cmd;
> +
> +	if (scheduler == SCHEDULER_INVALID)
> +		BUG("invalid scheduler");
> +	if (scheduler == SCHEDULER_AUTO)
> +		BUG("resolve_auto_scheduler should have been called before");
> +
> +	cmd = scheduler_fn[scheduler].cmd;
> +	if (!scheduler_fn[scheduler].is_available(cmd))
> +		die(_("%s scheduler is not available"), cmd);
> +}
> +
> +static int update_background_schedule(const struct maintenance_start_opts *opts,
> +				      int enable)
> +{
> +	unsigned int i;
> +	int res, result = 0;
> +	enum scheduler scheduler;
> +	const char *cmd = NULL;
>  	char *testing;
>  	struct lock_file lk;
>  	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);
>  
> +	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
> +		return error(_("another process is scheduling background maintenance"));
> +
>  	testing = xstrdup_or_null(getenv("GIT_TEST_MAINT_SCHEDULER"));
>  	if (testing) {
>  		char *sep = strchr(testing, ':');
>  		if (!sep)
>  			die("GIT_TEST_MAINT_SCHEDULER unparseable: %s", testing);
>  		*sep = '\0';
> -		scheduler = testing;
> +		scheduler = parse_scheduler(testing);
>  		cmd = sep + 1;
> +		result = scheduler_fn[scheduler].update_schedule(
> +			enable, get_lock_file_fd(&lk), cmd);
> +		goto done;

I see this 'goto done' is the reason we need to take the lock earlier. The
other option would be to put the 'goto done' after the rollback_lock_file(),
but this is fine, too.

>  	}
>  
> -	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0)
> -		return error(_("another process is scheduling background maintenance"));
> -
> -	if (!strcmp(scheduler, "launchctl"))
> -		result = launchctl_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else if (!strcmp(scheduler, "schtasks"))
> -		result = schtasks_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else if (!strcmp(scheduler, "crontab"))
> -		result = crontab_update_schedule(enable, get_lock_file_fd(&lk), cmd);
> -	else
> -		die("unknown background scheduler: %s", scheduler);
> +	for (i = 1; i < ARRAY_SIZE(scheduler_fn); i++) {
> +		int enable_scheduler = enable && (opts->scheduler == i);
> +		cmd = scheduler_fn[i].cmd;
> +		if (!scheduler_fn[i].is_available(cmd))
> +			continue;
> +		res = scheduler_fn[i].update_schedule(
> +			enable_scheduler, get_lock_file_fd(&lk), cmd);
> +		if (enable_scheduler)
> +			result = res;
> +	}

This loop is cleaner than the list of else-ifs.

>  
> +done:
>  	rollback_lock_file(&lk);
>  	free(testing);
>  	return result;
>  }
>  
> -static int maintenance_start(void)
> +static const char *const builtin_maintenance_start_usage[] = {
> +	N_("git maintenance start [--scheduler=<scheduler>]"), NULL
> +};
> +
> +static int maintenance_start(int argc, const char **argv, const char *prefix)
>  {
> +	struct maintenance_start_opts opts;

I see you using the struct now and in the helper methods above.
While the creation of the struct looks a little strange now, this
use matches other patterns in this file and is more flexible to
additional options in the future. Thanks.

> +	struct option builtin_maintenance_start_options[] = {
> +		OPT_CALLBACK(
> +			0, "scheduler", &opts.scheduler, N_("scheduler"),
> +			N_("scheduler to use to trigger git maintenance run"),
> +			maintenance_opt_scheduler),

nit: these would typically be aligned with the end of the
OPT_CALLBACK( opener, with the first set of parameters being on
the same line:

		OPT_CALLBACK(0, "scheduler", &opts.scheduler, N_("scheduler"),
			     N_("scheduler to use to trigger 'git maintenance run'"),
			     maintenance_opt_scheduler),

These options frequently run a little long on the line width,
which might have been your motivation in adding an extra line.

> +		OPT_END()
> +	};
> +	memset(&opts, 0, sizeof(opts));
> +
> +	argc = parse_options(argc, argv, prefix,
> +			     builtin_maintenance_start_options,
> +			     builtin_maintenance_start_usage, 0);
> +
> +	resolve_auto_scheduler(&opts.scheduler);
> +	validate_scheduler(opts.scheduler);
> +
> +	if (argc > 0)
> +		usage_with_options(builtin_maintenance_start_usage,
> +				   builtin_maintenance_start_options);

nit: "if (argc)" is the more typical pattern in the Git codebase.

Also, this check should come right after parse_options().

>  	if (maintenance_register())
>  		warning(_("failed to add repo to global config"));
> -
> -	return update_background_schedule(1);
> +	return update_background_schedule(&opts, 1);
>  }
>  
>  static int maintenance_stop(void)
>  {
> -	return update_background_schedule(0);
> +	return update_background_schedule(NULL, 0);
>  }
>  
>  static const char builtin_maintenance_usage[] =	N_("git maintenance <subcommand> [<options>]");
> @@ -2027,7 +2354,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
>  	if (!strcmp(argv[1], "run"))
>  		return maintenance_run(argc - 1, argv + 1, prefix);
>  	if (!strcmp(argv[1], "start"))
> -		return maintenance_start();
> +		return maintenance_start(argc - 1, argv + 1, prefix);
>  	if (!strcmp(argv[1], "stop"))
>  		return maintenance_stop();
>  	if (!strcmp(argv[1], "register"))
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 2412d8c5c0..6e6316cd90 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh

I'm not sure how we could achieve it, but it might be good to demonstrate
a use of the --scheduler option here in the test script.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-10  6:25         ` Junio C Hamano
@ 2021-05-12  0:29           ` Đoàn Trần Công Danh
  2021-05-12  6:59             ` Felipe Contreras
  2021-05-12 13:38             ` Phillip Wood
  0 siblings, 2 replies; 30+ messages in thread
From: Đoàn Trần Công Danh @ 2021-05-12  0:29 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Eric Sunshine, Lénaïc Huard, Git List, Derrick Stolee,
	brian m . carlson, Bagas Sanjaya, Phillip Wood,
	Ævar Arnfjörð Bjarmason

On 2021-05-10 15:25:07+0900, Junio C Hamano <gitster@pobox.com> wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
> 
> >> I think others have strong opinion on not using "%1$s",
> >> and prefer simple "%s" and using "exec_path" twice instead.
> >
> > I brought it up only because I hadn't seen it in Git sources, and
> > wasn't sure if we'd want to start using it. Aside from Ævar, who
> > seemed reasonably in favor of it, nobody else chimed in, so it could
> > go either way, I suppose.
> 
> If this were a piece of code that _everybody_ would use on _all_ the
> supported platforms, I would suggest declaring that this is a
> weather-balloon to see if some platforms have trouble using it.  But
> unfortunately this is not such a piece of code.  Dependence on
> systemd should strictly be opt-in.

Yes, dependence on systemd should be strictly opt-in.
Although, I don't use systemd-based distro, so it is irrelevant to me.
I think it's none of Git (the project) business to decide which
scheduler should be given higher priority. It's crontab when
maintenance was introduced, it should be crontab, now.

Another point for eternal bikeshedding: why do we limit ourselves in
crontab and systemd, how about other homebrew schedulers? What should
we do if another scheduler raise to be the big star in the scheduler
world?

I guess we should take some templates for running on {,un}register
instead? However, I think such design may open another can of worms.
So, I don't know.


-- 
Danh

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-12  0:29           ` Đoàn Trần Công Danh
@ 2021-05-12  6:59             ` Felipe Contreras
  2021-05-12 13:26               ` Phillip Wood
  2021-05-12 13:38             ` Phillip Wood
  1 sibling, 1 reply; 30+ messages in thread
From: Felipe Contreras @ 2021-05-12  6:59 UTC (permalink / raw)
  To: Đoàn Trần Công Danh, Junio C Hamano
  Cc: Eric Sunshine, Lénaïc Huard, Git List, Derrick Stolee,
	brian m . carlson, Bagas Sanjaya, Phillip Wood,
	Ævar Arnfjörð Bjarmason

Đoàn Trần Công Danh wrote:
> Yes, dependence on systemd should be strictly opt-in.
> Although, I don't use systemd-based distro, so it is irrelevant to me.
> I think it's none of Git (the project) business to decide which
> scheduler should be given higher priority. It's crontab when
> maintenance was introduced, it should be crontab, now.

I do use a systemd-based distro, and I like the option to use systemd
units, but let's be honest...

100% of systems with systemd have cron... So...

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-12  6:59             ` Felipe Contreras
@ 2021-05-12 13:26               ` Phillip Wood
  0 siblings, 0 replies; 30+ messages in thread
From: Phillip Wood @ 2021-05-12 13:26 UTC (permalink / raw)
  To: Felipe Contreras, Đoàn Trần Công Danh,
	Junio C Hamano
  Cc: Eric Sunshine, Lénaïc Huard, Git List, Derrick Stolee,
	brian m . carlson, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

On 12/05/2021 07:59, Felipe Contreras wrote:
> Đoàn Trần Công Danh wrote:
>> Yes, dependence on systemd should be strictly opt-in.
>> Although, I don't use systemd-based distro, so it is irrelevant to me.
>> I think it's none of Git (the project) business to decide which
>> scheduler should be given higher priority. It's crontab when
>> maintenance was introduced, it should be crontab, now.
> 
> I do use a systemd-based distro, and I like the option to use systemd
> units, but let's be honest...
> 
> 100% of systems with systemd have cron...

This is untrue, as the commit message points out cron is optional on 
systems running systemd and there are distributions such as Arch Linux 
that do not install a cron daemon without explicit user intervention.

  So...
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-12  0:29           ` Đoàn Trần Công Danh
  2021-05-12  6:59             ` Felipe Contreras
@ 2021-05-12 13:38             ` Phillip Wood
  2021-05-12 15:41               ` Đoàn Trần Công Danh
  1 sibling, 1 reply; 30+ messages in thread
From: Phillip Wood @ 2021-05-12 13:38 UTC (permalink / raw)
  To: Đoàn Trần Công Danh, Junio C Hamano
  Cc: Eric Sunshine, Lénaïc Huard, Git List, Derrick Stolee,
	brian m . carlson, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Hi Đoàn

On 12/05/2021 01:29, Đoàn Trần Công Danh wrote:
> On 2021-05-10 15:25:07+0900, Junio C Hamano <gitster@pobox.com> wrote:
>> Eric Sunshine <sunshine@sunshineco.com> writes:
>>
>>>> I think others have strong opinion on not using "%1$s",
>>>> and prefer simple "%s" and using "exec_path" twice instead.
>>>
>>> I brought it up only because I hadn't seen it in Git sources, and
>>> wasn't sure if we'd want to start using it. Aside from Ævar, who
>>> seemed reasonably in favor of it, nobody else chimed in, so it could
>>> go either way, I suppose.
>>
>> If this were a piece of code that _everybody_ would use on _all_ the
>> supported platforms, I would suggest declaring that this is a
>> weather-balloon to see if some platforms have trouble using it.  But
>> unfortunately this is not such a piece of code.  Dependence on
>> systemd should strictly be opt-in.
> 
> Yes, dependence on systemd should be strictly opt-in.
> Although, I don't use systemd-based distro, so it is irrelevant to me.
> I think it's none of Git (the project) business to decide which
> scheduler should be given higher priority. It's crontab when
> maintenance was introduced, it should be crontab, now.

You seem to be simultaneously arguing that git should be neutral on the 
choice of scheduler while saying it should prioritize crontab. The 
commit message and cover letter list a number of difficulties with the 
strategy of prioritizing crontab over systemd when both are installed. I 
think we should aim for the solution that has the most chance of working 
without user intervention.

> Another point for eternal bikeshedding: why do we limit ourselves in
> crontab and systemd, how about other homebrew schedulers? What should
> we do if another scheduler raise to be the big star in the scheduler
> world?

We should support the default scheduler on each platform - that was the 
rod we made for our own back when we decided to use the platform's 
scheduler rather than having a cross platform git maintenance daemon. It 
just happens that there are two possible default schedulers on linux so 
we need to support both of them.

Best Wishes

Phillip

> I guess we should take some templates for running on {,un}register
> instead? However, I think such design may open another can of worms.
> So, I don't know.
> 
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/1] maintenance: use systemd timers on Linux
  2021-05-12 13:38             ` Phillip Wood
@ 2021-05-12 15:41               ` Đoàn Trần Công Danh
  0 siblings, 0 replies; 30+ messages in thread
From: Đoàn Trần Công Danh @ 2021-05-12 15:41 UTC (permalink / raw)
  To: phillip.wood
  Cc: Junio C Hamano, Eric Sunshine, Lénaïc Huard, Git List,
	Derrick Stolee, brian m . carlson, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

On 2021-05-12 14:38:26+0100, Phillip Wood <phillip.wood123@gmail.com> wrote:
> > Yes, dependence on systemd should be strictly opt-in.
> > Although, I don't use systemd-based distro, so it is irrelevant to me.
> > I think it's none of Git (the project) business to decide which
> > scheduler should be given higher priority. It's crontab when
> > maintenance was introduced, it should be crontab, now.
> 
> You seem to be simultaneously arguing that git should be neutral on the
> choice of scheduler while saying it should prioritize crontab.

Yes, I'm arguing for git should be neutral on the choice of scheduler.

No, I'm not arguing for git should be prioritize crontab, I'm arguing
for "princible of least surprise" for no known break-through advantage.

FWIW, whatever default scheduler chosen won't affect me, since I don't
have systemd-timers to begin with. So ...

In addition, I was one of those people pointed out that beside
crontab, Linux users nowaday employed different schedulers [1],
and the consensus, some how, settled on crontab.

I think  we shouldn't switch away from crontab if we don't have any
compelling reasons.

> The commit
> message and cover letter list a number of difficulties with the strategy of
> prioritizing crontab over systemd when both are installed. I think we should
> aim for the solution that has the most chance of working without user
> intervention.

The solution that has the most chance of working without user
intervention is the solution that is the status quo. Promoting
systemd-timers to higher priority is a solution requires either
user intervention or our supports (which will be carried over our
lifetime).

> > Another point for eternal bikeshedding: why do we limit ourselves in
> > crontab and systemd, how about other homebrew schedulers? What should
> > we do if another scheduler raise to be the big star in the scheduler
> > world?
> 
> We should support the default scheduler on each platform - that was the rod
> we made for our own back when we decided to use the platform's scheduler
> rather than having a cross platform git maintenance daemon. It just happens
> that there are two possible default schedulers on linux so we need to
> support both of them.

As noted in [1], some home-brew solutions are very popular solutions
among those some community.
I'm not arguing that crontab or systemd-timers aren't popular.
In fact, I think they're *very* popular, I listed systemd-timers as
*first* alternative in the linked email.
I'm not against supporting both of them, I was arguing about a generic
solution.

1: https://lore.kernel.org/git/20200407005828.GC2568@danh.dev/

-- 
Danh

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2021-05-12 17:07 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-01 14:52 [PATCH] maintenance: use systemd timers on Linux Lénaïc Huard
2021-05-01 20:02 ` brian m. carlson
2021-05-02  5:28 ` Bagas Sanjaya
2021-05-02  6:49   ` Eric Sunshine
2021-05-02  6:45 ` Eric Sunshine
2021-05-02 14:10   ` Phillip Wood
2021-05-05 12:19     ` Đoàn Trần Công Danh
2021-05-05 14:57       ` Phillip Wood
2021-05-05 12:01   ` Ævar Arnfjörð Bjarmason
2021-05-09 22:34     ` Lénaïc Huard
2021-05-10 13:03       ` Ævar Arnfjörð Bjarmason
2021-05-02 11:12 ` Bagas Sanjaya
2021-05-03 12:04 ` Derrick Stolee
2021-05-09 21:32 ` [PATCH v2 0/1] " Lénaïc Huard
2021-05-09 21:32   ` [PATCH v2 1/1] " Lénaïc Huard
2021-05-10  1:20     ` Đoàn Trần Công Danh
2021-05-10  2:48       ` Eric Sunshine
2021-05-10  6:25         ` Junio C Hamano
2021-05-12  0:29           ` Đoàn Trần Công Danh
2021-05-12  6:59             ` Felipe Contreras
2021-05-12 13:26               ` Phillip Wood
2021-05-12 13:38             ` Phillip Wood
2021-05-12 15:41               ` Đoàn Trần Công Danh
2021-05-10 18:03     ` Phillip Wood
2021-05-10 18:25       ` Eric Sunshine
2021-05-10 20:09         ` Phillip Wood
2021-05-10 20:52           ` Eric Sunshine
2021-05-10 19:15     ` Martin Ågren
2021-05-11 14:50     ` Phillip Wood
2021-05-11 17:31     ` Derrick Stolee

git@vger.kernel.org list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ https://public-inbox.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/inbox.comp.version-control.git
	nntp://ie5yzdi7fg72h7s4sdcztq5evakq23rdt33mfyfcddc5u3ndnw24ogqd.onion/inbox.comp.version-control.git
	nntp://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git