git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] commit: run git gc --auto just before the pre-commit hook
@ 2018-02-28 22:13 Ævar Arnfjörð Bjarmason
  2018-02-28 22:29 ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-02-28 22:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason

Change the behavior of git-commit back to what it was back in
d4bb43ee27 ("Invoke "git gc --auto" from commit, merge, am and
rebase.", 2007-09-05) when it was git-commit.sh.

Shortly afterwards in f5bbc3225c ("Port git commit to C.", 2007-11-08)
when it was ported to C the "git gc --auto" invocation went away.

Before this git gc --auto only ran for
git-{am,merge,fetch,receive-pack}, therefore it was possible to write
a script that would "git commit" a lot of data locally, and gc would
never run.

One such repository that was locally committing generated zone file
changes had grown to a size of ~60GB before a daily cronjob was added
to "git gc", bringing it down to less than 1GB. This will make such
cases work transparently.

I think fixing such pathological cases where the repository will grow
forever is a worthwhile trade-off for spending a couple of
milliseconds calling "git gc --auto" (in the common cases where it
doesn't do anything).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/commit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/commit.c b/builtin/commit.c
index e8e8d13be4..b671367840 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1402,6 +1402,7 @@ int run_commit_hook(int editor_is_used, const char *index_file, const char *name
 
 int cmd_commit(int argc, const char **argv, const char *prefix)
 {
+	const char *argv_gc_auto[] = {"gc", "--auto", NULL};
 	static struct wt_status s;
 	static struct option builtin_commit_options[] = {
 		OPT__QUIET(&quiet, N_("suppress summary after successful commit")),
@@ -1608,6 +1609,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 
 	rerere(0);
 	run_commit_hook(use_editor, get_index_file(), "post-commit", NULL);
+	run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
 	if (amend && !no_post_rewrite) {
 		commit_post_rewrite(current_head, &oid);
 	}
-- 
2.15.1.424.g9478a66081


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] commit: run git gc --auto just before the pre-commit hook
  2018-02-28 22:13 [PATCH] commit: run git gc --auto just before the pre-commit hook Ævar Arnfjörð Bjarmason
@ 2018-02-28 22:29 ` Junio C Hamano
  2018-02-28 23:04   ` [PATCH v2] " Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2018-02-28 22:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Change the behavior of git-commit back to what it was back in
> d4bb43ee27 ("Invoke "git gc --auto" from commit, merge, am and
> rebase.", 2007-09-05) when it was git-commit.sh.

... which was to run it just before post-commit.  Do I retitle this
patch before queuing?

The code seems to run it "after" post-commit.  We need to explain
why we chose to run it differently from the old practice, when we
claim we resurrect the old behaviour with this change.

> Shortly afterwards in f5bbc3225c ("Port git commit to C.", 2007-11-08)
> when it was ported to C the "git gc --auto" invocation went away.

So this is a decade late regression fix?  As they say, better late
than never, probably.

> @@ -1608,6 +1609,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
>  
>  	rerere(0);
>  	run_commit_hook(use_editor, get_index_file(), "post-commit", NULL);
> +	run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
>  	if (amend && !no_post_rewrite) {
>  		commit_post_rewrite(current_head, &oid);
>  	}

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2] commit: run git gc --auto just before the pre-commit hook
  2018-02-28 22:29 ` Junio C Hamano
@ 2018-02-28 23:04   ` Ævar Arnfjörð Bjarmason
  2018-02-28 23:39     ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-02-28 23:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason

Change the behavior of git-commit back to what it was back in
d4bb43ee27 ("Invoke "git gc --auto" from commit, merge, am and
rebase.", 2007-09-05) when it was git-commit.sh.

Shortly afterwards in f5bbc3225c ("Port git commit to C.", 2007-11-08)
when it was ported to C, the "git gc --auto" invocation went away.

Before this, git gc --auto only ran for
git-{am,merge,fetch,receive-pack}. Therefore it was possible to write
a script that would "git commit" a lot of data locally, and gc would
never run.

One such repository that was locally committing generated zone file
changes had grown to a size of ~60GB before a daily cronjob was added
to "git gc", bringing it down to less than 1GB. This will make such
cases work without intervention.

I think fixing such pathological cases where the repository will grow
forever is a worthwhile trade-off for spending a couple of
milliseconds calling "git gc --auto" (in the common cases where it
doesn't do anything).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

On Wed, Feb 28 2018, Junio C. Hamano jotted:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> Change the behavior of git-commit back to what it was back in
>> d4bb43ee27 ("Invoke "git gc --auto" from commit, merge, am and
>> rebase.", 2007-09-05) when it was git-commit.sh.
>
> ... which was to run it just before post-commit.  Do I retitle this
> patch before queuing?

Do'h. Of course I screw up something simple like that, sorry. This v2
fixes it, and I also rephrased the commit message a bit (more commas &
full-stops).

> So this is a decade late regression fix?  As they say, better late
> than never, probably.

Yup.

I wonder if it would also be a good idea to run git gc --auto on "git
push". It itself won't create any objects, but it would be a nice
proxy in many cases for picking up anything else we missed due to
various object writing commands that won't run --auto.

 builtin/commit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/commit.c b/builtin/commit.c
index e8e8d13be4..a16a056f6a 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1402,6 +1402,7 @@ int run_commit_hook(int editor_is_used, const char *index_file, const char *name
 
 int cmd_commit(int argc, const char **argv, const char *prefix)
 {
+	const char *argv_gc_auto[] = {"gc", "--auto", NULL};
 	static struct wt_status s;
 	static struct option builtin_commit_options[] = {
 		OPT__QUIET(&quiet, N_("suppress summary after successful commit")),
@@ -1607,6 +1608,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 		     "not exceeded, and then \"git reset HEAD\" to recover."));
 
 	rerere(0);
+	run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
 	run_commit_hook(use_editor, get_index_file(), "post-commit", NULL);
 	if (amend && !no_post_rewrite) {
 		commit_post_rewrite(current_head, &oid);
-- 
2.15.1.424.g9478a66081


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] commit: run git gc --auto just before the pre-commit hook
  2018-02-28 23:04   ` [PATCH v2] " Ævar Arnfjörð Bjarmason
@ 2018-02-28 23:39     ` Junio C Hamano
  0 siblings, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2018-02-28 23:39 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

>> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>>
>>> Change the behavior of git-commit back to what it was back in
>>> d4bb43ee27 ("Invoke "git gc --auto" from commit, merge, am and
>>> rebase.", 2007-09-05) when it was git-commit.sh.
>>
>> ... which was to run it just before post-commit.  Do I retitle this
>> patch before queuing?
>
> Do'h. Of course I screw up something simple like that, sorry. This v2
> fixes it, and I also rephrased the commit message a bit (more commas &
> full-stops).

I guess I still need to retitle it ;-) But that can happen tomorrow
(I have the previous one with local fixes that pretty much matches
v2 modulo the body of the log message on 'pu', ready to be pushed
out).

> I wonder if it would also be a good idea to run git gc --auto on "git
> push". It itself won't create any objects, but it would be a nice
> proxy in many cases for picking up anything else we missed due to
> various object writing commands that won't run --auto.

Before "push" starts producing a pack might be a good optimization,
as reading from a packed repository is often more performant than a
repository full of loose objects.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-02-28 23:39 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-28 22:13 [PATCH] commit: run git gc --auto just before the pre-commit hook Ævar Arnfjörð Bjarmason
2018-02-28 22:29 ` Junio C Hamano
2018-02-28 23:04   ` [PATCH v2] " Ævar Arnfjörð Bjarmason
2018-02-28 23:39     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).