git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Since gc.autodetach=1 you can end up with auto-gc on every command with no user notification
@ 2015-07-08 12:28 Ævar Arnfjörð Bjarmason
  2015-07-08 12:47 ` Duy Nguyen
  0 siblings, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2015-07-08 12:28 UTC (permalink / raw)
  To: Git; +Cc: Nguyễn Thái Ngọc Duy, Junio C Hamano

Someone at work came to me with the problem that they were getting the
"Auto packing the repository in background for optimum performance"
notice on every Git command that they ran.

This problem is a combination of two things:

 * Since Nguyễn's v1.9-rc0-2-g9f673f9 where we started running "git
gc" in the background the user hasn't seen the "There are too many
unreachable loose objects" message added back in v1.5.3.1-27-ga087cc9

 * The checkout has a lot of loose objects. So even after "git prune
--expire=2.week.ago" the .git/objects/17 directory has 317 objects.
More than 27 in that directory trigger "git gc --auto".

So it's partly a UI issue. Since the repacking is happening in the
background the user never sees the message suggesting that they run
"git prune".

But perhaps the heuristic of "are there more than 27 objects in
.git/objects/17" could be improved, but I don't know with what
exactly.

But having something fork a gc to the background on every fetch (and
similar object-modifying operations) is quite sub-optimal.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Since gc.autodetach=1 you can end up with auto-gc on every command with no user notification
  2015-07-08 12:28 Since gc.autodetach=1 you can end up with auto-gc on every command with no user notification Ævar Arnfjörð Bjarmason
@ 2015-07-08 12:47 ` Duy Nguyen
  2015-08-22  2:12   ` [PATCH v3] gc: save log from daemonized gc --auto and print it next time Nguyễn Thái Ngọc Duy
  0 siblings, 1 reply; 17+ messages in thread
From: Duy Nguyen @ 2015-07-08 12:47 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git, Junio C Hamano

On Wed, Jul 8, 2015 at 7:28 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> Someone at work came to me with the problem that they were getting the
> "Auto packing the repository in background for optimum performance"
> notice on every Git command that they ran.
>
> This problem is a combination of two things:
>
>  * Since Nguyễn's v1.9-rc0-2-g9f673f9 where we started running "git
> gc" in the background the user hasn't seen the "There are too many
> unreachable loose objects" message added back in v1.5.3.1-27-ga087cc9
>
>  * The checkout has a lot of loose objects. So even after "git prune
> --expire=2.week.ago" the .git/objects/17 directory has 317 objects.
> More than 27 in that directory trigger "git gc --auto".
>
> So it's partly a UI issue. Since the repacking is happening in the
> background the user never sees the message suggesting that they run
> "git prune".
>
> But perhaps the heuristic of "are there more than 27 objects in
> .git/objects/17" could be improved, but I don't know with what
> exactly.
>
> But having something fork a gc to the background on every fetch (and
> similar object-modifying operations) is quite sub-optimal.

Thanks for reminding me. Two related threads: [1] is about that "17"
heuristics and [2] the UI thing. I need to reread them before making
any more comments.

[1] http://thread.gmane.org/gmane.comp.version-control.git/265734/focus=265756
[2] http://thread.gmane.org/gmane.comp.version-control.git/266182
-- 
Duy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3] gc: save log from daemonized gc --auto and print it next time
  2015-07-08 12:47 ` Duy Nguyen
@ 2015-08-22  2:12   ` Nguyễn Thái Ngọc Duy
  2015-08-25 17:49     ` Junio C Hamano
  2015-09-13  1:36     ` [PATCH v4] " Nguyễn Thái Ngọc Duy
  0 siblings, 2 replies; 17+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-08-22  2:12 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Nguyễn Thái Ngọc Duy

While commit 9f673f9 (gc: config option for running --auto in
background - 2014-02-08) helps reduce some complaints about 'gc
--auto' hogging the terminal, it creates another set of problems.

The latest in this set is, as the result of daemonizing, stderr is
closed and all warnings are lost. This warning at the end of cmd_gc()
is particularly important because it tells the user how to avoid "gc
--auto" running repeatedly. Because stderr is closed, the user does
not know, naturally they complain about 'gc --auto' wasting CPU.

Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc runs
will not be daemonized and gc.log printed out until the user removes
gc.log.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Let's try again. Compared to v2 [1], this version does not delete
 gc.log automatically any more. The user needs to take action, then
 delete gc.log to bring it background again.

 [1] http://thread.gmane.org/gmane.comp.version-control.git/266182/focus=266320

 builtin/gc.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index bcc75d9..00a83e1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT;
 static struct argv_array rerere = ARGV_ARRAY_INIT;
 
 static char *pidfile;
+static struct strbuf log_filename = STRBUF_INIT;
+static int daemonized;
 
 static void remove_pidfile(void)
 {
+	if (daemonized && log_filename.len) {
+		struct stat st;
+
+		close(2);
+		if (stat(log_filename.buf, &st) ||
+		    !st.st_size ||
+		    rename(log_filename.buf, git_path("gc.log")))
+			unlink(log_filename.buf);
+	}
 	if (pidfile)
 		unlink(pidfile);
 }
@@ -330,13 +341,24 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
 		if (detach_auto) {
+			struct strbuf sb = STRBUF_INIT;
+			if (strbuf_read_file(&sb, git_path("gc.log"), 0) > 0) {
+				warning(_("last gc run reported:\n"
+					  "%s\n"
+					  "running in foreground until %s is removed"),
+					sb.buf, git_path("gc.log"));
+				detach_auto = 0;
+			}
+			strbuf_release(&sb);
+		}
+		if (detach_auto) {
 			if (gc_before_repack())
 				return -1;
 			/*
 			 * failure to daemonize is ok, we'll continue
 			 * in foreground
 			 */
-			daemonize();
+			daemonized = !daemonize();
 		}
 	} else
 		add_repack_all_option();
@@ -349,6 +371,18 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		    name, (uintmax_t)pid);
 	}
 
+	if (daemonized) {
+		int fd;
+
+		strbuf_addstr(&log_filename, git_path("gc.log_XXXXXX"));
+		fd = xmkstemp(log_filename.buf);
+		if (fd >= 0) {
+			dup2(fd, 2);
+			close(fd);
+		} else
+			strbuf_release(&log_filename);
+	}
+
 	if (gc_before_repack())
 		return -1;
 
-- 
2.3.0.rc1.137.g477eb31

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3] gc: save log from daemonized gc --auto and print it next time
  2015-08-22  2:12   ` [PATCH v3] gc: save log from daemonized gc --auto and print it next time Nguyễn Thái Ngọc Duy
@ 2015-08-25 17:49     ` Junio C Hamano
  2015-08-31 10:17       ` Duy Nguyen
  2015-09-13  1:36     ` [PATCH v4] " Nguyễn Thái Ngọc Duy
  1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-08-25 17:49 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Ævar Arnfjörð Bjarmason, Eric Sunshine

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> While commit 9f673f9 (gc: config option for running --auto in
> background - 2014-02-08) helps reduce some complaints about 'gc
> --auto' hogging the terminal, it creates another set of problems.
>
> The latest in this set is, as the result of daemonizing, stderr is
> closed and all warnings are lost. This warning at the end of cmd_gc()
> is particularly important because it tells the user how to avoid "gc
> --auto" running repeatedly. Because stderr is closed, the user does
> not know, naturally they complain about 'gc --auto' wasting CPU.
>
> Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc runs
> will not be daemonized and gc.log printed out until the user removes
> gc.log.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  Let's try again. Compared to v2 [1], this version does not delete
>  gc.log automatically any more. The user needs to take action, then
>  delete gc.log to bring it background again.

Sounds a bit more sensible, but I wonder if it is a good idea to
keep going in the first place.  If the last gc run reported an
issue, would it make it likely that we would hit the same issue?

An alternative design I have in mind is to exit "gc --auto" with
success without doing anything, after giving the new warning
message.  What would be the pros-and-cons between this patch and
that alternative design?



>  [1] http://thread.gmane.org/gmane.comp.version-control.git/266182/focus=266320
>
>  builtin/gc.c | 36 +++++++++++++++++++++++++++++++++++-
>  1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/gc.c b/builtin/gc.c
> index bcc75d9..00a83e1 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT;
>  static struct argv_array rerere = ARGV_ARRAY_INIT;
>  
>  static char *pidfile;
> +static struct strbuf log_filename = STRBUF_INIT;
> +static int daemonized;
>  
>  static void remove_pidfile(void)
>  {
> +	if (daemonized && log_filename.len) {
> +		struct stat st;
> +
> +		close(2);
> +		if (stat(log_filename.buf, &st) ||
> +		    !st.st_size ||
> +		    rename(log_filename.buf, git_path("gc.log")))
> +			unlink(log_filename.buf);
> +	}
>  	if (pidfile)
>  		unlink(pidfile);
>  }
> @@ -330,13 +341,24 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
>  		}
>  		if (detach_auto) {
> +			struct strbuf sb = STRBUF_INIT;
> +			if (strbuf_read_file(&sb, git_path("gc.log"), 0) > 0) {
> +				warning(_("last gc run reported:\n"
> +					  "%s\n"
> +					  "running in foreground until %s is removed"),
> +					sb.buf, git_path("gc.log"));
> +				detach_auto = 0;
> +			}
> +			strbuf_release(&sb);
> +		}
> +		if (detach_auto) {
>  			if (gc_before_repack())
>  				return -1;
>  			/*
>  			 * failure to daemonize is ok, we'll continue
>  			 * in foreground
>  			 */
> -			daemonize();
> +			daemonized = !daemonize();
>  		}
>  	} else
>  		add_repack_all_option();
> @@ -349,6 +371,18 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  		    name, (uintmax_t)pid);
>  	}
>  
> +	if (daemonized) {
> +		int fd;
> +
> +		strbuf_addstr(&log_filename, git_path("gc.log_XXXXXX"));
> +		fd = xmkstemp(log_filename.buf);
> +		if (fd >= 0) {
> +			dup2(fd, 2);
> +			close(fd);
> +		} else
> +			strbuf_release(&log_filename);
> +	}
> +
>  	if (gc_before_repack())
>  		return -1;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3] gc: save log from daemonized gc --auto and print it next time
  2015-08-25 17:49     ` Junio C Hamano
@ 2015-08-31 10:17       ` Duy Nguyen
  0 siblings, 0 replies; 17+ messages in thread
From: Duy Nguyen @ 2015-08-31 10:17 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Ævar Arnfjörð, Eric Sunshine

On Wed, Aug 26, 2015 at 12:49 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
>> While commit 9f673f9 (gc: config option for running --auto in
>> background - 2014-02-08) helps reduce some complaints about 'gc
>> --auto' hogging the terminal, it creates another set of problems.
>>
>> The latest in this set is, as the result of daemonizing, stderr is
>> closed and all warnings are lost. This warning at the end of cmd_gc()
>> is particularly important because it tells the user how to avoid "gc
>> --auto" running repeatedly. Because stderr is closed, the user does
>> not know, naturally they complain about 'gc --auto' wasting CPU.
>>
>> Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc runs
>> will not be daemonized and gc.log printed out until the user removes
>> gc.log.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>>  Let's try again. Compared to v2 [1], this version does not delete
>>  gc.log automatically any more. The user needs to take action, then
>>  delete gc.log to bring it background again.
>
> Sounds a bit more sensible, but I wonder if it is a good idea to
> keep going in the first place.  If the last gc run reported an
> issue, would it make it likely that we would hit the same issue?
>
> An alternative design I have in mind is to exit "gc --auto" with
> success without doing anything, after giving the new warning
> message.  What would be the pros-and-cons between this patch and
> that alternative design?

I think the alt. design is better. If anything, keep runing may
produce more output and probably distract the user from the real
warning. We only want to keep doing something again if it can gain us
something. If we succeed at steps A and B, then fail at C. Doing A and
B again next time might be worth something in general, but given the
context of git-gc I don't think it is.
-- 
Duy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v4] gc: save log from daemonized gc --auto and print it next time
  2015-08-22  2:12   ` [PATCH v3] gc: save log from daemonized gc --auto and print it next time Nguyễn Thái Ngọc Duy
  2015-08-25 17:49     ` Junio C Hamano
@ 2015-09-13  1:36     ` Nguyễn Thái Ngọc Duy
  2015-09-14 17:24       ` Junio C Hamano
                         ` (2 more replies)
  1 sibling, 3 replies; 17+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-09-13  1:36 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Nguyễn Thái Ngọc Duy

While commit 9f673f9 (gc: config option for running --auto in
background - 2014-02-08) helps reduce some complaints about 'gc
--auto' hogging the terminal, it creates another set of problems.

The latest in this set is, as the result of daemonizing, stderr is
closed and all warnings are lost. This warning at the end of cmd_gc()
is particularly important because it tells the user how to avoid "gc
--auto" running repeatedly. Because stderr is closed, the user does
not know, naturally they complain about 'gc --auto' wasting CPU.

Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc --auto
will not run and gc.log printed out until the user removes gc.log.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 When gc.log exists, gc --auto now simply exits

 builtin/gc.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index bcc75d9..2c3aaeb 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT;
 static struct argv_array rerere = ARGV_ARRAY_INIT;
 
 static char *pidfile;
+static struct strbuf log_filename = STRBUF_INIT;
+static int daemonized;
 
 static void remove_pidfile(void)
 {
+	if (daemonized && log_filename.len) {
+		struct stat st;
+
+		close(2);
+		if (stat(log_filename.buf, &st) ||
+		    !st.st_size ||
+		    rename(log_filename.buf, git_path("gc.log")))
+			unlink(log_filename.buf);
+	}
 	if (pidfile)
 		unlink(pidfile);
 }
@@ -330,13 +341,21 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
 		if (detach_auto) {
+			struct strbuf sb = STRBUF_INIT;
+			if (strbuf_read_file(&sb, git_path("gc.log"), 0) > 0)
+				return error(_("last gc run reported:\n"
+					       "%s\n"
+					       "not running until %s is removed"),
+					     sb.buf, git_path("gc.log"));
+			strbuf_release(&sb);
+
 			if (gc_before_repack())
 				return -1;
 			/*
 			 * failure to daemonize is ok, we'll continue
 			 * in foreground
 			 */
-			daemonize();
+			daemonized = !daemonize();
 		}
 	} else
 		add_repack_all_option();
@@ -349,6 +368,18 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		    name, (uintmax_t)pid);
 	}
 
+	if (daemonized) {
+		int fd;
+
+		strbuf_addstr(&log_filename, git_path("gc.log_XXXXXX"));
+		fd = xmkstemp(log_filename.buf);
+		if (fd >= 0) {
+			dup2(fd, 2);
+			close(fd);
+		} else
+			strbuf_release(&log_filename);
+	}
+
 	if (gc_before_repack())
 		return -1;
 
-- 
2.3.0.rc1.137.g477eb31

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
  2015-09-13  1:36     ` [PATCH v4] " Nguyễn Thái Ngọc Duy
@ 2015-09-14 17:24       ` Junio C Hamano
  2015-09-14 17:37         ` Junio C Hamano
  2015-09-19  5:13       ` [PATCH v5] " Nguyễn Thái Ngọc Duy
  2015-09-19  5:14       ` [Alt. PATCH " Nguyễn Thái Ngọc Duy
  2 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-09-14 17:24 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Ævar Arnfjörð Bjarmason, Eric Sunshine

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> While commit 9f673f9 (gc: config option for running --auto in
> background - 2014-02-08) helps reduce some complaints about 'gc
> --auto' hogging the terminal, it creates another set of problems.
>
> The latest in this set is, as the result of daemonizing, stderr is
> closed and all warnings are lost. This warning at the end of cmd_gc()
> is particularly important because it tells the user how to avoid "gc
> --auto" running repeatedly. Because stderr is closed, the user does
> not know, naturally they complain about 'gc --auto' wasting CPU.
>
> Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc --auto
> will not run and gc.log printed out until the user removes gc.log.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  When gc.log exists, gc --auto now simply exits
>
>  builtin/gc.c | 33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)

Thanks, will queue.

This is sufficient for non-novice users, but I wonder if the error
message is strong enough.

A knee-jerk reaction to "Last run reported an error so we won't run
until the log is removed" may be "Then why don't you remove the log
for me automatically?", which is a total brain-less raction [*1*],
and "then I'll remove the log" is not all that better.  The message
we want to convey is "correct the root cause so that gc can run
without having to give an error message."

I guess it all depends on what is in gc.log; after all, this new
codepath is not in a good position to read the diagnosis left in
there and offer good pieces of advice to resolve them.


[Footnote]

*1* ... which is what a knee-jerk reaction is by definition.



>
> diff --git a/builtin/gc.c b/builtin/gc.c
> index bcc75d9..2c3aaeb 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT;
>  static struct argv_array rerere = ARGV_ARRAY_INIT;
>  
>  static char *pidfile;
> +static struct strbuf log_filename = STRBUF_INIT;
> +static int daemonized;
>  
>  static void remove_pidfile(void)
>  {
> +	if (daemonized && log_filename.len) {
> +		struct stat st;
> +
> +		close(2);
> +		if (stat(log_filename.buf, &st) ||
> +		    !st.st_size ||
> +		    rename(log_filename.buf, git_path("gc.log")))
> +			unlink(log_filename.buf);
> +	}
>  	if (pidfile)
>  		unlink(pidfile);
>  }
> @@ -330,13 +341,21 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
>  		}
>  		if (detach_auto) {
> +			struct strbuf sb = STRBUF_INIT;
> +			if (strbuf_read_file(&sb, git_path("gc.log"), 0) > 0)
> +				return error(_("last gc run reported:\n"
> +					       "%s\n"
> +					       "not running until %s is removed"),
> +					     sb.buf, git_path("gc.log"));
> +			strbuf_release(&sb);
> +
>  			if (gc_before_repack())
>  				return -1;
>  			/*
>  			 * failure to daemonize is ok, we'll continue
>  			 * in foreground
>  			 */
> -			daemonize();
> +			daemonized = !daemonize();
>  		}
>  	} else
>  		add_repack_all_option();
> @@ -349,6 +368,18 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  		    name, (uintmax_t)pid);
>  	}
>  
> +	if (daemonized) {
> +		int fd;
> +
> +		strbuf_addstr(&log_filename, git_path("gc.log_XXXXXX"));
> +		fd = xmkstemp(log_filename.buf);
> +		if (fd >= 0) {
> +			dup2(fd, 2);
> +			close(fd);
> +		} else
> +			strbuf_release(&log_filename);
> +	}
> +
>  	if (gc_before_repack())
>  		return -1;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
  2015-09-14 17:24       ` Junio C Hamano
@ 2015-09-14 17:37         ` Junio C Hamano
  2015-09-16  9:28           ` Michael Haggerty
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-09-14 17:37 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Michael Haggerty

Junio C Hamano <gitster@pobox.com> writes:

> Thanks, will queue.

Ehh, I spoke a bit too early.

>> diff --git a/builtin/gc.c b/builtin/gc.c
>> index bcc75d9..2c3aaeb 100644
>> --- a/builtin/gc.c
>> +++ b/builtin/gc.c
>> @@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT;
>>  static struct argv_array rerere = ARGV_ARRAY_INIT;
>>  
>>  static char *pidfile;
>> +static struct strbuf log_filename = STRBUF_INIT;
>> +static int daemonized;
>>  
>>  static void remove_pidfile(void)
>>  {
>> +	if (daemonized && log_filename.len) {
>> +		struct stat st;
>> +
>> +		close(2);
>> +		if (stat(log_filename.buf, &st) ||
>> +		    !st.st_size ||
>> +		    rename(log_filename.buf, git_path("gc.log")))
>> +			unlink(log_filename.buf);
>> +	}

Unfortuantely we cannot queue this as-is, as we let the tempfile API
to automatically manage the pidfile since ebebeaea (gc: use tempfile
module to handle gc.pid file, 2015-08-10), and you cannot piggy-back
the log file finalization to this function that no longer exists.

Besides, it is obviously wrong to remove this log file in a function
whose name is remove_pidfile() ;-)

Adding a new function to tempfile API that puts the file to a final
place if it is non-empty and otherwise remove it, and using that to
create this "gc.log" file, would be the cleanest from the point of
view of _this_ codepath.  I however do not know if that is too
specific for the need of this codepath or "leave it if non-empty,
but otherwise remove as it is uninteresting" is fairly common thing
we would want and it is a good addition to the API set.

Michael, what do you think?

>> @@ -330,13 +341,21 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>>  			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
>>  		}
>>  		if (detach_auto) {
>> +			struct strbuf sb = STRBUF_INIT;
>> +			if (strbuf_read_file(&sb, git_path("gc.log"), 0) > 0)
>> +				return error(_("last gc run reported:\n"
>> +					       "%s\n"
>> +					       "not running until %s is removed"),
>> +					     sb.buf, git_path("gc.log"));
>> +			strbuf_release(&sb);
>> +
>>  			if (gc_before_repack())
>>  				return -1;
>>  			/*
>>  			 * failure to daemonize is ok, we'll continue
>>  			 * in foreground
>>  			 */
>> -			daemonize();
>> +			daemonized = !daemonize();
>>  		}
>>  	} else
>>  		add_repack_all_option();
>> @@ -349,6 +368,18 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>>  		    name, (uintmax_t)pid);
>>  	}
>>  
>> +	if (daemonized) {
>> +		int fd;
>> +
>> +		strbuf_addstr(&log_filename, git_path("gc.log_XXXXXX"));
>> +		fd = xmkstemp(log_filename.buf);
>> +		if (fd >= 0) {
>> +			dup2(fd, 2);
>> +			close(fd);
>> +		} else
>> +			strbuf_release(&log_filename);
>> +	}
>> +
>>  	if (gc_before_repack())
>>  		return -1;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
  2015-09-14 17:37         ` Junio C Hamano
@ 2015-09-16  9:28           ` Michael Haggerty
  2015-09-16 16:00             ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Haggerty @ 2015-09-16  9:28 UTC (permalink / raw)
  To: Junio C Hamano, Nguyễn Thái Ngọc Duy
  Cc: git, Ævar Arnfjörð Bjarmason, Eric Sunshine

On 09/14/2015 07:37 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> Thanks, will queue.
> 
> Ehh, I spoke a bit too early.
> 
>>> diff --git a/builtin/gc.c b/builtin/gc.c
>>> index bcc75d9..2c3aaeb 100644
>>> --- a/builtin/gc.c
>>> +++ b/builtin/gc.c
>>> @@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT;
>>>  static struct argv_array rerere = ARGV_ARRAY_INIT;
>>>  
>>>  static char *pidfile;
>>> +static struct strbuf log_filename = STRBUF_INIT;
>>> +static int daemonized;
>>>  
>>>  static void remove_pidfile(void)
>>>  {
>>> +	if (daemonized && log_filename.len) {
>>> +		struct stat st;
>>> +
>>> +		close(2);
>>> +		if (stat(log_filename.buf, &st) ||
>>> +		    !st.st_size ||
>>> +		    rename(log_filename.buf, git_path("gc.log")))
>>> +			unlink(log_filename.buf);
>>> +	}
> 
> Unfortuantely we cannot queue this as-is, as we let the tempfile API
> to automatically manage the pidfile since ebebeaea (gc: use tempfile
> module to handle gc.pid file, 2015-08-10), and you cannot piggy-back
> the log file finalization to this function that no longer exists.
> 
> Besides, it is obviously wrong to remove this log file in a function
> whose name is remove_pidfile() ;-)
> 
> Adding a new function to tempfile API that puts the file to a final
> place if it is non-empty and otherwise remove it, and using that to
> create this "gc.log" file, would be the cleanest from the point of
> view of _this_ codepath.  I however do not know if that is too
> specific for the need of this codepath or "leave it if non-empty,
> but otherwise remove as it is uninteresting" is fairly common thing
> we would want and it is a good addition to the API set.
> 
> Michael, what do you think?

I'm not sure what behavior you want. At one point you say "puts the file
to a final place if it is non-empty" but later you say "leave it if
non-empty". Should the file be written directly, or should it be written
to a lockfile and renamed into place only when complete?

Technically I don't see a problem implementing either behavior. POSIX
allows [1] calls to stat() and rename() from a signal handler. There is
a minor technical difficulty that commit_lock_file() allocates memory
via get_locked_file_path(), but this would be surmountable by
pre-allocating the memory for the locked file path and storing it in the
lock_file object.

This doesn't seem like a common thing to want (as in, this might be the
only caller), but it probably makes sense to build it into the
tempfile/lockfile API nevertheless, because implementing it externally
would require a lot of other code to be duplicated.

Another possibility that might work (maybe without requiring changes to
tempfile/lockfile): don't worry about deleting the log file if it is
empty, but make observers treat an empty log file the same as an absent one.

Michael

[1]
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
  2015-09-16  9:28           ` Michael Haggerty
@ 2015-09-16 16:00             ` Junio C Hamano
  2015-09-17  9:40               ` Michael Haggerty
  2015-09-17 13:08               ` Duy Nguyen
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-09-16 16:00 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Nguyễn Thái Ngọc Duy, git,
	Ævar Arnfjörð Bjarmason, Eric Sunshine

Michael Haggerty <mhagger@alum.mit.edu> writes:

> I'm not sure what behavior you want. At one point you say "puts the file
> to a final place if it is non-empty" but later you say "leave it if
> non-empty". Should the file be written directly, or should it be written
> to a lockfile and renamed into place only when complete?

I do not think we care that deeply either way, as we do not want to
run multiple auto-gc's at the same time in the first place.  So either
one of the following is perfectly fine:

 * We open the final destination directly, but with O_CREAT|O_EXCL,
   which as a side effect also detects the simultanous execution
   [*1*].  We do not do any "finalizing rename" if we go this route.
   When we are done, close it and remove it if we did not write
   anything, or leave it there if we did write something.

 * We open a lockfile the usual way.  When we are done, close it and
   and remove it if we did not write anything, or finalize it by
   renaming it if we did write something.

I think Duy's code wants to do the latter.

> This doesn't seem like a common thing to want (as in, this might be the
> only caller), but it probably makes sense to build it into the
> tempfile/lockfile API nevertheless, because implementing it externally
> would require a lot of other code to be duplicated.
>
> Another possibility that might work (maybe without requiring changes to
> tempfile/lockfile): don't worry about deleting the log file if it is
> empty, but make observers treat an empty log file the same as an absent one.

Probably your "don't remove and check for emptiness" approach would
be the simpler of the two, but I think we can go either way.

Thanks.

[Footnote]

*1* But one "gc" could be foreground gc that does not write to a log
    file and the other "gc" could be background that does write to a
    log file, so this cannot be the primary way to avoid double
    execution.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
  2015-09-16 16:00             ` Junio C Hamano
@ 2015-09-17  9:40               ` Michael Haggerty
  2015-09-17 13:08               ` Duy Nguyen
  1 sibling, 0 replies; 17+ messages in thread
From: Michael Haggerty @ 2015-09-17  9:40 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, git,
	Ævar Arnfjörð Bjarmason, Eric Sunshine

On 09/16/2015 06:00 PM, Junio C Hamano wrote:
> *1* But one "gc" could be foreground gc that does not write to a log
>     file and the other "gc" could be background that does write to a
>     log file, so this cannot be the primary way to avoid double
>     execution.

^^^ This was the point that was confusing me. If this is not one of the
roles of the log file, then things are easier.

If you decide to go the route of tempfile/lockfile modifications, feel
free to CC me for feedback.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
  2015-09-16 16:00             ` Junio C Hamano
  2015-09-17  9:40               ` Michael Haggerty
@ 2015-09-17 13:08               ` Duy Nguyen
  2015-09-17 14:48                 ` Junio C Hamano
  1 sibling, 1 reply; 17+ messages in thread
From: Duy Nguyen @ 2015-09-17 13:08 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Michael Haggerty, Git Mailing List, Ævar Arnfjörð,
	Eric Sunshine

On Wed, Sep 16, 2015 at 11:00 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> I'm not sure what behavior you want. At one point you say "puts the file
>> to a final place if it is non-empty" but later you say "leave it if
>> non-empty". Should the file be written directly, or should it be written
>> to a lockfile and renamed into place only when complete?
>
> I do not think we care that deeply either way, as we do not want to
> run multiple auto-gc's at the same time in the first place.  So either
> one of the following is perfectly fine:
>
>  * We open the final destination directly, but with O_CREAT|O_EXCL,
>    which as a side effect also detects the simultanous execution
>    [*1*].  We do not do any "finalizing rename" if we go this route.
>    When we are done, close it and remove it if we did not write
>    anything, or leave it there if we did write something.
>
>  * We open a lockfile the usual way.  When we are done, close it and
>    and remove it if we did not write anything, or finalize it by
>    renaming it if we did write something.
>
> I think Duy's code wants to do the latter.

We do keep another lock before coming to opening this log file. So
once we get here we already know nobody else will be opening the log
file. We can simply open it the normal way, then make sure we clean it
up at atexit().

>> This doesn't seem like a common thing to want (as in, this might be the
>> only caller), but it probably makes sense to build it into the
>> tempfile/lockfile API nevertheless, because implementing it externally
>> would require a lot of other code to be duplicated.
>>
>> Another possibility that might work (maybe without requiring changes to
>> tempfile/lockfile): don't worry about deleting the log file if it is
>> empty, but make observers treat an empty log file the same as an absent one.
>
> Probably your "don't remove and check for emptiness" approach would
> be the simpler of the two, but I think we can go either way.

People have complained to me about stray files in $GIT_DIR, so it's
probably best that we delete empty/useless files. Although we could
delete empty files at the beginning of the next gc instead of at
atexit(). Let me try it out and see which is simplest.
-- 
Duy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
  2015-09-17 13:08               ` Duy Nguyen
@ 2015-09-17 14:48                 ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-09-17 14:48 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Michael Haggerty, Git Mailing List, Ævar Arnfjörð,
	Eric Sunshine

Duy Nguyen <pclouds@gmail.com> writes:

> We do keep another lock before coming to opening this log file. So
> once we get here we already know nobody else will be opening the log
> file. We can simply open it the normal way, then make sure we clean it
> up at atexit().
>
>>> This doesn't seem like a common thing to want (as in, this might be the
>>> only caller), but it probably makes sense to build it into the
>>> tempfile/lockfile API nevertheless, because implementing it externally
>>> would require a lot of other code to be duplicated.
>>>
>>> Another possibility that might work (maybe without requiring changes to
>>> tempfile/lockfile): don't worry about deleting the log file if it is
>>> empty, but make observers treat an empty log file the same as an absent one.
>>
>> Probably your "don't remove and check for emptiness" approach would
>> be the simpler of the two, but I think we can go either way.
>
> People have complained to me about stray files in $GIT_DIR, so it's
> probably best that we delete empty/useless files. Although we could
> delete empty files at the beginning of the next gc instead of at
> atexit(). Let me try it out and see which is simplest.

It would be nicer if we did not have to do an extra and custom
atexit() handler.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v5] gc: save log from daemonized gc --auto and print it next time
  2015-09-13  1:36     ` [PATCH v4] " Nguyễn Thái Ngọc Duy
  2015-09-14 17:24       ` Junio C Hamano
@ 2015-09-19  5:13       ` Nguyễn Thái Ngọc Duy
  2015-09-21 16:43         ` Junio C Hamano
  2015-09-19  5:14       ` [Alt. PATCH " Nguyễn Thái Ngọc Duy
  2 siblings, 1 reply; 17+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-09-19  5:13 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Nguyễn Thái Ngọc Duy

While commit 9f673f9 (gc: config option for running --auto in
background - 2014-02-08) helps reduce some complaints about 'gc
--auto' hogging the terminal, it creates another set of problems.

The latest in this set is, as the result of daemonizing, stderr is
closed and all warnings are lost. This warning at the end of cmd_gc()
is particularly important because it tells the user how to avoid "gc
--auto" running repeatedly. Because stderr is closed, the user does
not know, naturally they complain about 'gc --auto' wasting CPU.

Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc --auto
will not run and gc.log printed out until the user removes gc.log.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 This is the lock-based version. I'm sending an alternative that does
 not need atexit() with comparison between the two.

 builtin/gc.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index bcc75d9..4e738fa 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -43,6 +43,7 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT;
 static struct argv_array rerere = ARGV_ARRAY_INIT;
 
 static char *pidfile;
+static struct lock_file log_lock;
 
 static void remove_pidfile(void)
 {
@@ -57,6 +58,28 @@ static void remove_pidfile_on_signal(int signo)
 	raise(signo);
 }
 
+static void process_log_file(void)
+{
+	struct stat st;
+	if (!fstat(log_lock.fd, &st) && st.st_size)
+		commit_lock_file(&log_lock);
+	else
+		rollback_lock_file(&log_lock);
+}
+
+static void process_log_file_at_exit(void)
+{
+	fflush(stderr);
+	process_log_file();
+}
+
+static void process_log_file_on_signal(int signo)
+{
+	process_log_file();
+	sigchain_pop(signo);
+	raise(signo);
+}
+
 static void git_config_date_string(const char *key, const char **output)
 {
 	if (git_config_get_string_const(key, output))
@@ -253,6 +276,24 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid)
 	return NULL;
 }
 
+static int report_last_gc_error(void)
+{
+	struct strbuf sb = STRBUF_INIT;
+	int ret;
+
+	ret = strbuf_read_file(&sb, git_path("gc.log"), 0);
+	if (ret > 0)
+		return error(_("The last gc run reported the following. "
+			       "Please correct the root cause\n"
+			       "and remove %s.\n"
+			       "Automatic cleanup will not be performed "
+			       "until the file is removed.\n\n"
+			       "%s"),
+			     git_path("gc.log"), sb.buf);
+	strbuf_release(&sb);
+	return 0;
+}
+
 static int gc_before_repack(void)
 {
 	if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD))
@@ -274,6 +315,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	int force = 0;
 	const char *name;
 	pid_t pid;
+	int daemonized = 0;
 
 	struct option builtin_gc_options[] = {
 		OPT__QUIET(&quiet, N_("suppress progress reporting")),
@@ -330,13 +372,16 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
 		if (detach_auto) {
+			if (report_last_gc_error())
+				return -1;
+
 			if (gc_before_repack())
 				return -1;
 			/*
 			 * failure to daemonize is ok, we'll continue
 			 * in foreground
 			 */
-			daemonize();
+			daemonized = !daemonize();
 		}
 	} else
 		add_repack_all_option();
@@ -349,6 +394,15 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		    name, (uintmax_t)pid);
 	}
 
+	if (daemonized) {
+		hold_lock_file_for_update(&log_lock,
+					  git_path("gc.log"),
+					  LOCK_DIE_ON_ERROR);
+		dup2(log_lock.fd, 2);
+		sigchain_push_common(process_log_file_on_signal);
+		atexit(process_log_file_at_exit);
+	}
+
 	if (gc_before_repack())
 		return -1;
 
-- 
2.3.0.rc1.137.g477eb31

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Alt. PATCH v5] gc: save log from daemonized gc --auto and print it next time
  2015-09-13  1:36     ` [PATCH v4] " Nguyễn Thái Ngọc Duy
  2015-09-14 17:24       ` Junio C Hamano
  2015-09-19  5:13       ` [PATCH v5] " Nguyễn Thái Ngọc Duy
@ 2015-09-19  5:14       ` Nguyễn Thái Ngọc Duy
  2015-09-21 16:19         ` Junio C Hamano
  2 siblings, 1 reply; 17+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-09-19  5:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Nguyễn Thái Ngọc Duy

While commit 9f673f9 (gc: config option for running --auto in
background - 2014-02-08) helps reduce some complaints about 'gc
--auto' hogging the terminal, it creates another set of problems.

The latest in this set is, as the result of daemonizing, stderr is
closed and all warnings are lost. This warning at the end of cmd_gc()
is particularly important because it tells the user how to avoid "gc
--auto" running repeatedly. Because stderr is closed, the user does
not know, naturally they complain about 'gc --auto' wasting CPU.

Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc --auto
will not run and gc.log printed out until the user removes gc.log.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 The lock-based version has an advantage that the following gc runs
 will never see partial gc.log. But it requires some more hook at
 atexit() and maybe signal handler.
 
 This version avoids that, and gc.log can be kept even if gc is
 SIGKILL'd (unlikely because gc itself does not do anything that can
 upset the kernel), but then it's racy.

 I think I perfer the lock-based version.

 builtin/gc.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index bcc75d9..3d42ef7 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -43,6 +43,8 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT;
 static struct argv_array rerere = ARGV_ARRAY_INIT;
 
 static char *pidfile;
+static struct strbuf log_filename = STRBUF_INIT;
+static int daemonized;
 
 static void remove_pidfile(void)
 {
@@ -253,6 +255,41 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid)
 	return NULL;
 }
 
+static int report_last_gc_error(void)
+{
+	struct strbuf sb = STRBUF_INIT;
+	int ret;
+
+	if (!access(git_path("gc.pid"), R_OK))
+		/*
+		 * Either background gc is still running, or we have a stale
+		 * gc.pid.
+		 *
+		 * If the former, it's not the right time to look at gc.log
+		 * yet. We'll exit shortly after lock_repo_for_gc().
+		 *
+		 * If the latter, we'll run gc --auto one more time. But
+		 * because gc.log is appended, the old log (if exists) won't
+		 * be lost. One extra gc run is not that big a deal to avoid.
+		 */
+		return 0;
+
+	ret = strbuf_read_file(&sb, git_path("gc.log"), 0);
+	if (ret > 0)
+		return error(_("The last gc run reported the following. "
+			       "Please correct the root cause\n"
+			       "and remove %s. Automatic cleanup will not "
+			       "be performed\n"
+			       "until the file is removed.\n\n"
+			       "%s"),
+			     git_path("gc.log"), sb.buf);
+	else if (!ret)
+		/* racy, but in the worst case we waste one more gc run */
+		unlink(git_path("gc.log"));
+	strbuf_release(&sb);
+	return 0;
+}
+
 static int gc_before_repack(void)
 {
 	if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD))
@@ -330,13 +367,16 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
 		if (detach_auto) {
+			if (report_last_gc_error())
+				return -1;
+
 			if (gc_before_repack())
 				return -1;
 			/*
 			 * failure to daemonize is ok, we'll continue
 			 * in foreground
 			 */
-			daemonize();
+			daemonized = !daemonize();
 		}
 	} else
 		add_repack_all_option();
@@ -349,6 +389,18 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		    name, (uintmax_t)pid);
 	}
 
+	if (daemonized) {
+		int fd;
+
+		strbuf_git_path(&log_filename, "gc.log");
+		fd = open(log_filename.buf, O_WRONLY | O_CREAT | O_APPEND, 0644);
+		if (fd >= 0) {
+			dup2(fd, 2);
+			close(fd);
+		} else
+			strbuf_release(&log_filename);
+	}
+
 	if (gc_before_repack())
 		return -1;
 
@@ -376,6 +428,17 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		warning(_("There are too many unreachable loose objects; "
 			"run 'git prune' to remove them."));
 
+	/*
+	 * Opportunistic cleanup in the good case (which most likely
+	 * produce empty gc.log), otherwise empty gc.log will be
+	 * deleted at the next auto gc run.
+	 */
+	if (daemonized && log_filename.len) {
+		struct stat st;
+
+		fflush(stderr);
+		if (!stat(log_filename.buf, &st) && !st.st_size)
+			unlink(log_filename.buf);
+	}
 	return 0;
 }
-- 
2.3.0.rc1.137.g477eb31

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Alt. PATCH v5] gc: save log from daemonized gc --auto and print it next time
  2015-09-19  5:14       ` [Alt. PATCH " Nguyễn Thái Ngọc Duy
@ 2015-09-21 16:19         ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-09-21 16:19 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: Michael Haggerty, git, Ævar Arnfjörð Bjarmason,
	Eric Sunshine

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

>  The lock-based version has an advantage that the following gc runs
>  will never see partial gc.log. But it requires some more hook at
>  atexit() and maybe signal handler.
>  
>  This version avoids that, and gc.log can be kept even if gc is
>  SIGKILL'd (unlikely because gc itself does not do anything that can
>  upset the kernel), but then it's racy.

Given that you are only interested in an non-empty output, and also
you want to avoid running an auto-gc when another one is already
running, I wonder if that "never see partial" property really
matters.

In any case, "an alternative could be to avoid lockfile and not
bothering to remove an empty one" was idea from Michael, so let's CC
him back in the loop.

>  I think I perfer the lock-based version.

It certainly does look more familiar to me, too ;-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5] gc: save log from daemonized gc --auto and print it next time
  2015-09-19  5:13       ` [PATCH v5] " Nguyễn Thái Ngọc Duy
@ 2015-09-21 16:43         ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-09-21 16:43 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: Michael Haggerty, git, Ævar Arnfjörð Bjarmason,
	Eric Sunshine

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> While commit 9f673f9 (gc: config option for running --auto in
> background - 2014-02-08) helps reduce some complaints about 'gc
> --auto' hogging the terminal, it creates another set of problems.
>
> The latest in this set is, as the result of daemonizing, stderr is
> closed and all warnings are lost. This warning at the end of cmd_gc()
> is particularly important because it tells the user how to avoid "gc
> --auto" running repeatedly. Because stderr is closed, the user does
> not know, naturally they complain about 'gc --auto' wasting CPU.
>
> Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc --auto
> will not run and gc.log printed out until the user removes gc.log.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  This is the lock-based version. I'm sending an alternative that does
>  not need atexit() with comparison between the two.

I think Michael deserves to be CC'ed.

I am wondering if we need an audit of what our signal handlers do in
the same light as what led to 507d7804 (pager: don't use unsafe
functions in signal handlers, 2015-09-04).  lockfile API already
does removal in remove_lock_files_on_signal(), which may or may not
need further fixing.  And that is why I really hate to see us adding
custom signal handler if we can avoid it.

What you are gaining by having a custom signal handler here is
twofold, I think:

 * You have a chance to remove an empty log message upon signal, but
   if you follow Michael's idea of equating an empty and missing log
   file, that would not be a huge plus either way.

 * You can keep a non-empty log message that may be incomplete when
   you are killed yourself.  If you reused the lockfile API and let
   its signal handler clean it up, you obviously will lose that, but
   you do not even know if that incomplete non-empty log message is
   something you want to base the "stop auto-gc until an existing
   problem is resolved" decision on.  Your subprocess may have left
   that message only because you are getting killed, no?

Neither looks like a compelling enough reason to have it to me.

Wouldn't the logic be vastly simplified and be still correct, if you
let the lockfile API do its usual "unless you explicitly commit,
lockfiles are removed upon signal or exit"?

>  builtin/gc.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 55 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/gc.c b/builtin/gc.c
> index bcc75d9..4e738fa 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -43,6 +43,7 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT;
>  static struct argv_array rerere = ARGV_ARRAY_INIT;
>  
>  static char *pidfile;
> +static struct lock_file log_lock;
>  
>  static void remove_pidfile(void)
>  {
> @@ -57,6 +58,28 @@ static void remove_pidfile_on_signal(int signo)
>  	raise(signo);
>  }
>  
> +static void process_log_file(void)
> +{
> +	struct stat st;
> +	if (!fstat(log_lock.fd, &st) && st.st_size)
> +		commit_lock_file(&log_lock);
> +	else
> +		rollback_lock_file(&log_lock);
> +}
> +
> +static void process_log_file_at_exit(void)
> +{
> +	fflush(stderr);
> +	process_log_file();
> +}
> +
> +static void process_log_file_on_signal(int signo)
> +{
> +	process_log_file();
> +	sigchain_pop(signo);
> +	raise(signo);
> +}
> +
>  static void git_config_date_string(const char *key, const char **output)
>  {
>  	if (git_config_get_string_const(key, output))
> @@ -253,6 +276,24 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid)
>  	return NULL;
>  }
>  
> +static int report_last_gc_error(void)
> +{
> +	struct strbuf sb = STRBUF_INIT;
> +	int ret;
> +
> +	ret = strbuf_read_file(&sb, git_path("gc.log"), 0);
> +	if (ret > 0)
> +		return error(_("The last gc run reported the following. "
> +			       "Please correct the root cause\n"
> +			       "and remove %s.\n"
> +			       "Automatic cleanup will not be performed "
> +			       "until the file is removed.\n\n"
> +			       "%s"),
> +			     git_path("gc.log"), sb.buf);
> +	strbuf_release(&sb);
> +	return 0;
> +}
> +
>  static int gc_before_repack(void)
>  {
>  	if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD))
> @@ -274,6 +315,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  	int force = 0;
>  	const char *name;
>  	pid_t pid;
> +	int daemonized = 0;
>  
>  	struct option builtin_gc_options[] = {
>  		OPT__QUIET(&quiet, N_("suppress progress reporting")),
> @@ -330,13 +372,16 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
>  		}
>  		if (detach_auto) {
> +			if (report_last_gc_error())
> +				return -1;
> +
>  			if (gc_before_repack())
>  				return -1;
>  			/*
>  			 * failure to daemonize is ok, we'll continue
>  			 * in foreground
>  			 */
> -			daemonize();
> +			daemonized = !daemonize();
>  		}
>  	} else
>  		add_repack_all_option();
> @@ -349,6 +394,15 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  		    name, (uintmax_t)pid);
>  	}
>  
> +	if (daemonized) {
> +		hold_lock_file_for_update(&log_lock,
> +					  git_path("gc.log"),
> +					  LOCK_DIE_ON_ERROR);
> +		dup2(log_lock.fd, 2);
> +		sigchain_push_common(process_log_file_on_signal);
> +		atexit(process_log_file_at_exit);
> +	}
> +
>  	if (gc_before_repack())
>  		return -1;

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2015-09-21 16:43 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-08 12:28 Since gc.autodetach=1 you can end up with auto-gc on every command with no user notification Ævar Arnfjörð Bjarmason
2015-07-08 12:47 ` Duy Nguyen
2015-08-22  2:12   ` [PATCH v3] gc: save log from daemonized gc --auto and print it next time Nguyễn Thái Ngọc Duy
2015-08-25 17:49     ` Junio C Hamano
2015-08-31 10:17       ` Duy Nguyen
2015-09-13  1:36     ` [PATCH v4] " Nguyễn Thái Ngọc Duy
2015-09-14 17:24       ` Junio C Hamano
2015-09-14 17:37         ` Junio C Hamano
2015-09-16  9:28           ` Michael Haggerty
2015-09-16 16:00             ` Junio C Hamano
2015-09-17  9:40               ` Michael Haggerty
2015-09-17 13:08               ` Duy Nguyen
2015-09-17 14:48                 ` Junio C Hamano
2015-09-19  5:13       ` [PATCH v5] " Nguyễn Thái Ngọc Duy
2015-09-21 16:43         ` Junio C Hamano
2015-09-19  5:14       ` [Alt. PATCH " Nguyễn Thái Ngọc Duy
2015-09-21 16:19         ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).