git@vger.kernel.org mailing list mirror (one of many)
* Question: How to execute git-gc correctly on the git server
@ 2022-12-07 15:58 ZheNing Hu
  2022-12-07 23:57 ` Ævar Arnfjörð Bjarmason
From: ZheNing Hu @ 2022-12-07 15:58 UTC (permalink / raw)
  To: Git List
  Cc: Ævar Arnfjörð Bjarmason, Junio C Hamano,
	Christian Couder, johncai86, Taylor Blau

Hi,

I would like to run git gc on my git server periodically, which should help
reduce storage space and optimize the read performance of the repository.
I know GitHub and GitLab both run such a process.

But the concurrency between git gc and other git commands is holding
me back a bit.

The git-gc docs [1] say:

    On the other hand, when git gc runs concurrently with another process,
    there is a risk of it deleting an object that the other process is using but
    hasn’t created a reference to. This may just cause the other process to
    fail or may corrupt the repository if the other process later adds a
    reference to the deleted object.

So git gc seems to be a dangerous operation: running it concurrently with
other git commands may corrupt the repository.

Then I read GitHub's blog post [2]. git gc --cruft seems to be used to keep
expiring unreachable objects in a cruft pack, but the blog says GitHub uses a
special "limbo" repository to keep the cruft pack for git data recovery.
Well, a lot of the details here are pretty hard for me to understand :(

However, my git server is still at v2.35, and --cruft was introduced in
v2.37, so I'm actually more curious about how servers executed git gc
correctly in the past. Do we need a repository-level "big lock" that blocks
most/all other git operations? What should users' git clone/push do while
git gc is running? Report an error that the git server is performing
git gc? Or just wait for git gc to complete?
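
To make the two behaviours in the question concrete, here is a minimal Python
sketch of a repository-level "big lock" built on an advisory file lock
(fcntl.flock, POSIX only). It is an illustration under assumptions, not how
git or any hosting service does it: the helper name run_with_repo_lock and the
lock-file path are hypothetical, and git itself knows nothing about this lock,
so it only protects against processes that take the same lock.

```python
import fcntl
import subprocess


def run_with_repo_lock(lock_path, cmd, wait=True):
    """Run cmd while holding an exclusive advisory lock on lock_path.

    wait=True  blocks until the lock is free ("just wait for git gc").
    wait=False returns None at once if the lock is held ("report an error").
    Otherwise the exit code of cmd is returned.
    """
    flags = fcntl.LOCK_EX if wait else fcntl.LOCK_EX | fcntl.LOCK_NB
    # Append mode creates the lock file if needed without truncating it.
    with open(lock_path, "a") as lock_file:
        try:
            fcntl.flock(lock_file, flags)
        except BlockingIOError:
            # Someone else (e.g. a running gc) holds the lock.
            return None
        try:
            return subprocess.run(cmd, check=False).returncode
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A periodic job could then call something like
run_with_repo_lock("/path/to/repo.git/gc.lock", ["git", "-C", "/path/to/repo.git", "gc"]),
where both paths are placeholders. Whatever serves clone/push would have to
go through the same lock for it to mean anything, which is exactly the cost
the question is asking about.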

Thanks for any comments and help!

[1]: https://git-scm.com/docs/git-gc
[2]: https://github.blog/2022-09-13-scaling-gits-garbage-collection/

