git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Duy Nguyen <pclouds@gmail.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Jeff King <peff@peff.net>,
	Jonathan Tan <jonathantanmy@google.com>,
	Git Mailing List <git@vger.kernel.org>,
	adehtiarov@google.com
Subject: Re: [PATCH] gc: do not warn about too many loose objects
Date: Tue, 17 Jul 2018 17:59:57 +0200	[thread overview]
Message-ID: <CACsJy8C=+0n_5Uz3ub1t-1fJtu4HJQ3_ro1R+aEmpUz+1Ge6RA@mail.gmail.com> (raw)
In-Reply-To: <20180717015339.GL11513@aiede.svl.corp.google.com>

On Tue, Jul 17, 2018 at 3:53 AM Jonathan Nieder <jrnieder@gmail.com> wrote:
> Subject: gc: do not return error for prior errors in daemonized mode
>
> Some build machines started failing to fetch updated source using
> "repo sync", with error
>
>   error: The last gc run reported the following. Please correct the root cause
>   and remove /build/.repo/projects/tools/git.git/gc.log.
>   Automatic cleanup will not be performed until the file is removed.
>
>   warning: There are too many unreachable loose objects; run 'git prune' to remove them.
>
> The cause takes some time to describe.
>
> In v2.0.0-rc0~145^2 (gc: config option for running --auto in
> background, 2014-02-08), "git gc --auto" learned to run in the
> background instead of blocking the invoking command.  In this mode, it
> closed stderr to avoid interleaving output with any subsequent
> commands, causing warnings like the above to be swallowed; v2.6.3~24^2
> (gc: save log from daemonized gc --auto and print it next time,
> 2015-09-19) addressed this by storing any diagnostic output in
> .git/gc.log and allowing the next "git gc --auto" run to print it.
>
> To avoid wasteful repeated fruitless gcs, when gc.log is present, the
> subsequent "gc --auto" would die after print its contents.  Most git
> commands, such as "git fetch", ignore the exit status from "git gc
> --auto" so all is well at this point: the user gets to see the error
> message, and the fetch succeeds, without a wasteful additional attempt
> at an automatic gc.
>
> External tools like repo[1], though, do care about the exit status
> from "git gc --auto".  In non-daemonized mode, the exit status is
> straightforward: if there is an error, it is nonzero, but after a
> warning like the above, the status is zero.  The daemonized mode, as a
> side effect of the other properties provided, offers a very strange
> exit code convention:
>
>  - if no housekeeping was required, the exit status is 0
>
>  - the first real run, after forking into the background, returns exit
>    status 0 unconditionally.  The parent process has no way to know
>    whether gc will succeed.
>
>  - if there is any diagnostic output in gc.log, subsequent runs return
>    a nonzero exit status to indicate that gc was not triggered.
>
> There's nothing for the calling program to act on on the basis of that
> error.  Use status 0 consistently instead, to indicate that we decided
> not to run a gc (just like if no housekeeping was required).  This
> way, repo and similar tools can get the benefit of the same behavior
> as tools like "git fetch" that ignore the exit status from gc --auto.

This background gc thing is pretty much designed for interactive use.
When it's scripted, you probably should avoid it. Then you can fork a
new process by yourself and have much better control if you still want
"background" gc. So an alternative here is to turn on or off
gc.autodetach based on interactiveness (i.e. having tty). We can add a
new (and default) value "auto" to gc.autodetach for this purpose.

In automated scripts, it will always run in non-damonized mode. You
will get non-zero exit code when real errors occur. You don't worry
about hanging processes. Rate limit is also thrown out in this mode if
I'm not mistaken, but it could be added back and more tailored for
server needs.

> Once the period of time described by gc.pruneExpire elapses, the
> unreachable loose objects will be removed by "git gc --auto"
> automatically.
>
> [1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/
>
> Reported-by: Andrii Dehtiarov <adehtiarov@google.com>
> Helped-by: Jeff King <peff@peff.net>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> ---
>  Documentation/config.txt |  3 ++-
>  builtin/gc.c             | 16 +++++++++++-----
>  t/t6500-gc.sh            |  6 +++---
>  3 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 1cc18a828c..5eaa4aaa7d 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should go below
>  gc.autoPackLimit and gc.bigPackThreshold should be respected again.
>
>  gc.logExpiry::
> -       If the file gc.log exists, then `git gc --auto` won't run
> +       If the file gc.log exists, then `git gc --auto` will print
> +       its content and exit with status zero instead of running
>         unless that file is more than 'gc.logExpiry' old.  Default is
>         "1.day".  See `gc.pruneExpire` for more ways to specify its
>         value.
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 2bebc52bda..484ab21b8c 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -438,7 +438,7 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid)
>         return NULL;
>  }
>
> -static void report_last_gc_error(void)
> +static int report_last_gc_error(void)
>  {
>         struct strbuf sb = STRBUF_INIT;
>         ssize_t ret;
> @@ -449,7 +449,7 @@ static void report_last_gc_error(void)
>                 if (errno == ENOENT)
>                         goto done;
>
> -               die_errno(_("cannot stat '%s'"), gc_log_path);
> +               return error_errno(_("cannot stat '%s'"), gc_log_path);
>         }
>
>         if (st.st_mtime < gc_log_expire_time)
> @@ -457,9 +457,9 @@ static void report_last_gc_error(void)
>
>         ret = strbuf_read_file(&sb, gc_log_path, 0);
>         if (ret < 0)
> -               die_errno(_("cannot read '%s'"), gc_log_path);
> +               return error_errno(_("cannot read '%s'"), gc_log_path);
>         if (ret > 0)
> -               die(_("The last gc run reported the following. "
> +               return error(_("The last gc run reported the following. "
>                                "Please correct the root cause\n"
>                                "and remove %s.\n"
>                                "Automatic cleanup will not be performed "
> @@ -469,6 +469,7 @@ static void report_last_gc_error(void)
>         strbuf_release(&sb);
>  done:
>         free(gc_log_path);
> +       return 0;
>  }
>
>  static void gc_before_repack(void)
> @@ -561,7 +562,12 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>                         fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
>                 }
>                 if (detach_auto) {
> -                       report_last_gc_error(); /* dies on error */
> +                       if (report_last_gc_error())
> +                               /*
> +                                * A previous gc failed. We've reported the
> +                                * error, so there's nothing left to do.
> +                                */
> +                               return 0;
>
>                         if (lock_repo_for_gc(force, &pid))
>                                 return 0;
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index c474a94a9f..3e62df616c 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -116,11 +116,11 @@ test_expect_success 'background auto gc does not run if gc.log is present and re
>         test_config gc.autopacklimit 1 &&
>         test_config gc.autodetach true &&
>         echo fleem >.git/gc.log &&
> -       test_must_fail git gc --auto 2>err &&
> -       test_i18ngrep "^fatal:" err &&
> +       git gc --auto 2>err &&
> +       test_i18ngrep "^error:" err &&
>         test_config gc.logexpiry 5.days &&
>         test-tool chmtime =-345600 .git/gc.log &&
> -       test_must_fail git gc --auto &&
> +       git gc --auto &&
>         test_config gc.logexpiry 2.days &&
>         run_and_wait_for_auto_gc &&
>         ls .git/objects/pack/pack-*.pack >packs &&
> --
> 2.18.0.233.g985f88cf7e
>


-- 
Duy

  parent reply	other threads:[~2018-07-17 16:00 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-16 17:27 [PATCH] gc: do not warn about too many loose objects Jonathan Tan
2018-07-16 17:51 ` Jeff King
2018-07-16 18:22   ` Jonathan Nieder
2018-07-16 18:52     ` Jeff King
2018-07-16 19:09       ` Jonathan Nieder
2018-07-16 19:41         ` Jeff King
2018-07-16 19:54           ` Jonathan Nieder
2018-07-16 20:29             ` Jeff King
2018-07-16 20:37               ` Jonathan Nieder
2018-07-16 21:09                 ` Jeff King
2018-07-16 21:40                   ` Jonathan Nieder
2018-07-16 21:45                     ` Jeff King
2018-07-16 22:03                       ` Jonathan Nieder
2018-07-16 22:43                         ` Jeff King
2018-07-16 22:56                           ` Jonathan Nieder
2018-07-16 23:26                             ` Jeff King
2018-07-17  1:53                               ` Jonathan Nieder
2018-07-17  8:59                                 ` Ævar Arnfjörð Bjarmason
2018-07-17 14:03                                   ` Jonathan Nieder
2018-07-17 15:24                                     ` Ævar Arnfjörð Bjarmason
2018-07-17 20:27                                   ` Jeff King
2018-07-18 13:11                                     ` Ævar Arnfjörð Bjarmason
2018-07-18 17:29                                       ` Jeff King
2018-07-17 15:59                                 ` Duy Nguyen [this message]
2018-07-17 18:09                                 ` Junio C Hamano
2018-07-16 19:15 ` Elijah Newren
2018-07-16 19:19   ` Jonathan Nieder
2018-07-16 20:21     ` Elijah Newren
2018-07-16 20:35       ` Jeff King
2018-07-16 20:56         ` Jonathan Nieder
2018-07-16 21:12           ` Jeff King
2018-07-16 19:52   ` Jeff King
2018-07-16 20:16     ` Elijah Newren
2018-07-16 20:38       ` Jeff King
2018-07-16 21:09         ` Elijah Newren
2018-07-16 21:21           ` Jeff King
2018-07-16 22:07             ` Elijah Newren
2018-07-16 22:55               ` Jeff King
2018-07-16 23:06                 ` Elijah Newren
2018-07-16 21:31           ` Jonathan Nieder
2018-07-17  6:51 ` [PATCH v2 0/3] gc --auto: do not return error for prior errors in daemonized mode Jonathan Nieder
2018-07-17  6:53   ` [PATCH 1/3] gc: improve handling of errors reading gc.log Jonathan Nieder
2018-07-17 18:19     ` Junio C Hamano
2018-07-17 19:58     ` Jeff King
2018-07-17  6:54   ` [PATCH 2/3] gc: exit with status 128 on failure Jonathan Nieder
2018-07-17 18:22     ` Junio C Hamano
2018-07-17 19:59     ` Jeff King
2018-09-17 18:33       ` Jeff King
2018-09-17 18:40         ` Jonathan Nieder
2018-09-18 17:30           ` Jeff King
2018-07-17  6:57   ` [PATCH 3/3] gc: do not return error for prior errors in daemonized mode Jonathan Nieder
2018-07-17 20:13     ` Jeff King
2018-07-18 16:21       ` Junio C Hamano
2018-07-18 17:22         ` Jeff King
2018-07-18 18:19           ` Junio C Hamano
2018-07-18 19:06             ` Jeff King
2018-07-18 19:55               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACsJy8C=+0n_5Uz3ub1t-1fJtu4HJQ3_ro1R+aEmpUz+1Ge6RA@mail.gmail.com' \
    --to=pclouds@gmail.com \
    --cc=adehtiarov@google.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).