* [PATCH] gc: do not warn about too many loose objects @ 2018-07-16 17:27 Jonathan Tan 2018-07-16 17:51 ` Jeff King ` (2 more replies) 0 siblings, 3 replies; 57+ messages in thread From: Jonathan Tan @ 2018-07-16 17:27 UTC (permalink / raw) To: git; +Cc: Jonathan Tan In a087cc9819 ("git-gc --auto: protect ourselves from accumulated cruft", 2007-09-17), the user was warned if there were too many unreachable loose objects. This made sense at the time, because gc couldn't prune them safely. But subsequently, git prune learned the ability to not prune recently created loose objects, making pruning able to be done more safely, and gc was made to automatically prune old unreachable loose objects in 25ee9731c1 ("gc: call "prune --expire 2.weeks.ago" by default", 2008-03-12). This makes the warning unactionable by the user, as any loose objects left are not deleted yet because of safety, and "git prune" is not a command that the user is recommended to run directly anyway. Therefore, remove this warning. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- This was noticed when a daemonized gc run wrote this warning to the log file, and returned 0; but a subsequent run merely read the log file, saw that it is non-empty and returned -1 (which is inconsistent in that such a run should return 0, as it did the first time). --- builtin/gc.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/builtin/gc.c b/builtin/gc.c index ccfb1ceaeb..fc3b553651 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -641,10 +641,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix) if (pack_garbage.nr > 0) clean_pack_garbage(); - if (auto_gc && too_many_loose_objects()) - warning(_("There are too many unreachable loose objects; " - "run 'git prune' to remove them.")); - if (!daemonized) unlink(git_path("gc.log")); -- 2.18.0.203.gfac676dfb9-goog ^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 17:27 [PATCH] gc: do not warn about too many loose objects Jonathan Tan @ 2018-07-16 17:51 ` Jeff King 2018-07-16 18:22 ` Jonathan Nieder 2018-07-16 19:15 ` Elijah Newren 2018-07-17 6:51 ` [PATCH v2 0/3] gc --auto: do not return error for prior errors in daemonized mode Jonathan Nieder 2 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 17:51 UTC (permalink / raw) To: Jonathan Tan; +Cc: git On Mon, Jul 16, 2018 at 10:27:17AM -0700, Jonathan Tan wrote: > In a087cc9819 ("git-gc --auto: protect ourselves from accumulated > cruft", 2007-09-17), the user was warned if there were too many > unreachable loose objects. This made sense at the time, because gc > couldn't prune them safely. But subsequently, git prune learned the > ability to not prune recently created loose objects, making pruning able > to be done more safely, and gc was made to automatically prune old > unreachable loose objects in 25ee9731c1 ("gc: call "prune --expire > 2.weeks.ago" by default", 2008-03-12). > > This makes the warning unactionable by the user, as any loose objects > left are not deleted yet because of safety, and "git prune" is not a > command that the user is recommended to run directly anyway. I'm not sure if this tells the whole story. You may still run into a case where auto-gc runs over and over during the grace period, because you have a bunch of objects which are not reachable and not yet ready to be expired. So a gc cannot make forward progress until the 2-week expiration, and the way to break the cycle is to run an immediate "prune". So while I completely agree that it's not a great thing to encourage users to blindly run "git prune", I think it _is_ actionable. But... > This was noticed when a daemonized gc run wrote this warning to the log > file, and returned 0; but a subsequent run merely read the log file, saw > that it is non-empty and returned -1 (which is inconsistent in that such > a run should return 0, as it did the first time). Yes, this got much worse with daemonized gc. The warning blocks further runs. And then even after the 2-week period, you still cannot make further progress until the user steps in! I think we dealt with that in a831c06a2b (gc: ignore old gc.log files, 2017-02-10). So now we won't run gc for a day, but eventually we may make further progress. So the warning _is_ still doing something useful there (it's blocking immediate auto-gc and kicking in the 1-day threshold). I agree that the "-1" return is a little funny. Perhaps on the reading side we should detect that the log contains only a "warning" line and adjust our exit code accordingly. Ultimately, I think Git should avoid keeping unreachable objects as loose in the first place, which would make this whole problem go away. There's some discussion in this thread: https://public-inbox.org/git/87inc89j38.fsf@evledraar.gmail.com/ -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 17:51 ` Jeff King @ 2018-07-16 18:22 ` Jonathan Nieder 2018-07-16 18:52 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-16 18:22 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Tan, git Hi, Jeff King wrote: > On Mon, Jul 16, 2018 at 10:27:17AM -0700, Jonathan Tan wrote: >> In a087cc9819 ("git-gc --auto: protect ourselves from accumulated >> cruft", 2007-09-17), the user was warned if there were too many >> unreachable loose objects. This made sense at the time, because gc >> couldn't prune them safely. But subsequently, git prune learned the >> ability to not prune recently created loose objects, making pruning able >> to be done more safely, and gc was made to automatically prune old >> unreachable loose objects in 25ee9731c1 ("gc: call "prune --expire >> 2.weeks.ago" by default", 2008-03-12). >> >> This makes the warning unactionable by the user, as any loose objects >> left are not deleted yet because of safety, and "git prune" is not a >> command that the user is recommended to run directly anyway. > > I'm not sure if this tells the whole story. You may still run into a > case where auto-gc runs over and over during the grace period, because > you have a bunch of objects which are not reachable and not yet ready to > be expired. So a gc cannot make forward progress until the 2-week > expiration, and the way to break the cycle is to run an immediate > "prune". > > So while I completely agree that it's not a great thing to encourage > users to blindly run "git prune", I think it _is_ actionable. To flesh this out a little more: what user action do you suggest? Could we carry out that action automatically? I mentally make a distinction between this warning from "git gc --auto" and the warning from commands like "git gui". The former was saying "Please run 'git prune', because I'm not going to do it", and as Jonathan noticed, that's not true any more. The latter says This repository currently has approximately %i loose objects. To maintain optimal performance it is strongly recommended that you compress the database. Compress the database now? which is relevant to the current operation (slowly reading the repository) and easy to act on (choose "yes"). [...] > I agree that the "-1" return is a little funny. Perhaps on the reading > side we should detect that the log contains only a "warning" line and > adjust our exit code accordingly. Yes, this is a real problem, and it would affect any other warning (or even GIT_TRACE=1 output) produced by gc --auto as well. I think we should consider it a serious bug, separate from the symptom Jonathan is fixing. Unfortunately I don't have a great idea about how to fix it. Should we avoid writing warnings to gc.log in daemon mode? That would prevent the user from ever seeing the warnings, but because of the nature of a warning, that might be reasonable. Should we put warnings in a separate log file with different semantics? Should we change the format of gc.log to allow more structured information (include 'gc' exit code) and perform a two-stage migration to the new format (first learn to read the new format, then switch to writing it)? Thanks, Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 18:22 ` Jonathan Nieder @ 2018-07-16 18:52 ` Jeff King 2018-07-16 19:09 ` Jonathan Nieder 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 18:52 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Jonathan Tan, git On Mon, Jul 16, 2018 at 11:22:07AM -0700, Jonathan Nieder wrote: > > I'm not sure if this tells the whole story. You may still run into a > > case where auto-gc runs over and over during the grace period, because > > you have a bunch of objects which are not reachable and not yet ready to > > be expired. So a gc cannot make forward progress until the 2-week > > expiration, and the way to break the cycle is to run an immediate > > "prune". > > > > So while I completely agree that it's not a great thing to encourage > > users to blindly run "git prune", I think it _is_ actionable. > > To flesh this out a little more: what user action do you suggest? Could > we carry out that action automatically? Er, the action is to run "git prune", like the warning says. :) And no, I do not think we should run it automatically. It deletes objects without respect to the grace period, which may not be what the user wants. (And yes, the existing warning is terrible because it does not at all make clear that following its advice may be dangerous). > I mentally make a distinction between this warning from "git gc > --auto" and the warning from commands like "git gui". The former was > saying "Please run 'git prune', because I'm not going to do it", and > as Jonathan noticed, that's not true any more. The latter says > > This repository currently has approximately %i loose objects. > > To maintain optimal performance it is strongly recommended > that you compress the database. > > Compress the database now? > > which is relevant to the current operation (slowly reading the > repository) and easy to act on (choose "yes"). I don't think those warnings are the same. The warning from "git gui" is: you may benefit from running gc. The warning that is deleted by this patch is: you _just_ ran gc, and hey we even did it automatically for you, but we're still in a funky state afterwards. You might need to clean up this state. (As an aside, I think the git-gui warning pre-dates "gc --auto", and probably should just run that rather than implementing its own heuristic). > > I agree that the "-1" return is a little funny. Perhaps on the reading > > side we should detect that the log contains only a "warning" line and > > adjust our exit code accordingly. > > Yes, this is a real problem, and it would affect any other warning > (or even GIT_TRACE=1 output) produced by gc --auto as well. I think we > should consider it a serious bug, separate from the symptom Jonathan is > fixing. > > Unfortunately I don't have a great idea about how to fix it. Should > we avoid writing warnings to gc.log in daemon mode? That would > prevent the user from ever seeing the warnings, but because of the > nature of a warning, that might be reasonable. If you do that without anything further, then it will break the protection against repeatedly running auto-gc, as I described in the previous email. > Should we put warnings > in a separate log file with different semantics? Should we change the > format of gc.log to allow more structured information (include 'gc' > exit code) and perform a two-stage migration to the new format (first > learn to read the new format, then switch to writing it)? Any of those would work similarly to the "just detect warnings" I suggested earlier, with respect to keeping the "1 day" expiration intact, so I'd be OK with them. In theory they'd be more robust than scraping the "warning:" prefix. But in practice, I think you have to resort to scraping anyway, if you are interested in treating warnings from sub-processes the same way. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 18:52 ` Jeff King @ 2018-07-16 19:09 ` Jonathan Nieder 2018-07-16 19:41 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-16 19:09 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Tan, git Hi, Jeff King wrote: > On Mon, Jul 16, 2018 at 11:22:07AM -0700, Jonathan Nieder wrote: >> Jeff King wrote: >>> So while I completely agree that it's not a great thing to encourage >>> users to blindly run "git prune", I think it _is_ actionable. >> >> To flesh this out a little more: what user action do you suggest? Could >> we carry out that action automatically? > > Er, the action is to run "git prune", like the warning says. :) I don't think we want to recommend that, especially when "git gc --auto" does the right thing automatically. Can you convince me I'm wrong? [...] >> I mentally make a distinction between this warning from "git gc >> --auto" and the warning from commands like "git gui". [...] > I don't think those warnings are the same. The warning from "git gui" > is: you may benefit from running gc. > > The warning that is deleted by this patch is: you _just_ ran gc, and hey > we even did it automatically for you, but we're still in a funky state > afterwards. You might need to clean up this state. This sounds awful. It sounds to me like you're saying "git gc --auto" is saying "I just did the wrong thing, and here is how you can clean up after me". That's not how I want a program to behave. But there's a simpler explanation for what "git gc --auto" was trying to say, which Jonathan described. [...] >> Yes, this is a real problem, and it would affect any other warning >> (or even GIT_TRACE=1 output) produced by gc --auto as well. I think we >> should consider it a serious bug, separate from the symptom Jonathan is >> fixing. >> >> Unfortunately I don't have a great idea about how to fix it. Should >> we avoid writing warnings to gc.log in daemon mode? That would >> prevent the user from ever seeing the warnings, but because of the >> nature of a warning, that might be reasonable. > > If you do that without anything further, then it will break the > protection against repeatedly running auto-gc, as I described in the > previous email. Do you mean ratelimiting for the message, or do you actually mean repeatedly running auto-gc itself? If we suppress warnings, there would still be a gc.log while "git gc --auto" is running, just as though there had been no warnings at all. I believe this is close to the intended behavior; it's the same as what you'd get without daemon mode, except that you lose the warning. [...] >> Should we put warnings >> in a separate log file with different semantics? Should we change the >> format of gc.log to allow more structured information (include 'gc' >> exit code) and perform a two-stage migration to the new format (first >> learn to read the new format, then switch to writing it)? > > Any of those would work similarly to the "just detect warnings" I > suggested earlier, with respect to keeping the "1 day" expiration > intact, so I'd be OK with them. In theory they'd be more robust than > scraping the "warning:" prefix. But in practice, I think you have to > resort to scraping anyway, if you are interested in treating warnings > from sub-processes the same way. Can you say more about this for me? I am not understanding what you're saying necessitates scraping the output. I would strongly prefer to avoid scraping the output. Thanks, Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 19:09 ` Jonathan Nieder @ 2018-07-16 19:41 ` Jeff King 2018-07-16 19:54 ` Jonathan Nieder 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 19:41 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Jonathan Tan, git On Mon, Jul 16, 2018 at 12:09:49PM -0700, Jonathan Nieder wrote: > >>> So while I completely agree that it's not a great thing to encourage > >>> users to blindly run "git prune", I think it _is_ actionable. > >> > >> To flesh this out a little more: what user action do you suggest? Could > >> we carry out that action automatically? > > > > Er, the action is to run "git prune", like the warning says. :) > > I don't think we want to recommend that, especially when "git gc --auto" > does the right thing automatically. But that's the point. This warning is written literally after running "git gc --auto" _didn't_ do the right thing. Yes, it would be nicer if it could do the right thing. But it doesn't yet know how to. See the thread I linked earlier on putting unreachable objects into a pack, which I think is the real solution. > > The warning that is deleted by this patch is: you _just_ ran gc, and hey > > we even did it automatically for you, but we're still in a funky state > > afterwards. You might need to clean up this state. > > This sounds awful. It sounds to me like you're saying "git gc --auto" > is saying "I just did the wrong thing, and here is how you can clean > up after me". That's not how I want a program to behave. Sure, it would be nice if it did the right thing. Nobody has written that yet. Until they do, we have to deal with the fallout. > > If you do that without anything further, then it will break the > > protection against repeatedly running auto-gc, as I described in the > > previous email. > > Do you mean ratelimiting for the message, or do you actually mean > repeatedly running auto-gc itself? > > If we suppress warnings, there would still be a gc.log while "git gc > --auto" is running, just as though there had been no warnings at all. > I believe this is close to the intended behavior; it's the same as > what you'd get without daemon mode, except that you lose the warning. I mean that if you do not write a persistent log, then "gc --auto" will do an unproductive gc every time it is invoked. I.e., it will see "oh, there are too many loose objects", and then waste a bunch of CPU every time you commit. > > Any of those would work similarly to the "just detect warnings" I > > suggested earlier, with respect to keeping the "1 day" expiration > > intact, so I'd be OK with them. In theory they'd be more robust than > > scraping the "warning:" prefix. But in practice, I think you have to > > resort to scraping anyway, if you are interested in treating warnings > > from sub-processes the same way. > > Can you say more about this for me? I am not understanding what > you're saying necessitates scraping the output. I would strongly > prefer to avoid scraping the output. A daemonized git-gc runs a bunch of sub-commands (e.g., "git prune") with their stderr redirected into the logfile. If you want to have warnings go somewhere else, then either: - you need some way to tell those sub-commands to send the warnings elsewhere (i.e., _not_ stderr) or - you have to post-process the output they send to separate warnings from other errors. Right now that means scraping. If you are proposing a system of machine-readable output, it would need to work not just for git-gc, but for every sub-command it runs. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 19:41 ` Jeff King @ 2018-07-16 19:54 ` Jonathan Nieder 2018-07-16 20:29 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-16 19:54 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Tan, git Jeff King wrote: > On Mon, Jul 16, 2018 at 12:09:49PM -0700, Jonathan Nieder wrote: >> Jeff King wrote: >>> Er, the action is to run "git prune", like the warning says. :) >> >> I don't think we want to recommend that, especially when "git gc --auto" >> does the right thing automatically. > > But that's the point. This warning is written literally after running > "git gc --auto" _didn't_ do the right thing. Yes, it would be nicer if > it could do the right thing. But it doesn't yet know how to. I think we fundamentally disagree, and I would like your help getting past this impasse. Even restricting attention to the warning, not the exit code, I think you are saying the current behavior is acceptable. I do not believe it is. I think you are saying Jonathan's patch makes the behavior worse, and I'm not seeing it. Can you describe an example user interaction where the current behavior would be better than the behavior after Jonathan's patch? That might help make this discussion more concrete. [...] > See the thread I linked earlier on putting unreachable objects into a > pack, which I think is the real solution. Have you looked over the discussion in "Loose objects and unreachable objects" in Documentation/technical/hash-function-transition.txt? Do you have thoughts on it (preferrably in a separate thread)? [...] >> This sounds awful. It sounds to me like you're saying "git gc --auto" >> is saying "I just did the wrong thing, and here is how you can clean >> up after me". That's not how I want a program to behave. > > Sure, it would be nice if it did the right thing. Nobody has written > that yet. Until they do, we have to deal with the fallout. "git gc --auto" is already doing the right thing. [...] > I mean that if you do not write a persistent log, then "gc --auto" will > do an unproductive gc every time it is invoked. I.e., it will see "oh, > there are too many loose objects", and then waste a bunch of CPU every > time you commit. If so, then this would already be the behavior in non daemon mode. Are you saying this accidental effect of daemon mode is in fact intentional? [...] >>> But in practice, I think you have to >>> resort to scraping anyway, if you are interested in treating warnings >>> from sub-processes the same way. >> >> Can you say more about this for me? I am not understanding what >> you're saying necessitates scraping the output. I would strongly >> prefer to avoid scraping the output. > > A daemonized git-gc runs a bunch of sub-commands (e.g., "git prune") > with their stderr redirected into the logfile. If you want to have > warnings go somewhere else, then either: > > - you need some way to tell those sub-commands to send the warnings > elsewhere (i.e., _not_ stderr) > > or > > - you have to post-process the output they send to separate warnings > from other errors. Right now that means scraping. If you are > proposing a system of machine-readable output, it would need to work > not just for git-gc, but for every sub-command it runs. or - you can simply record and trust the exit code A simple way to do that without changing formats is to truncate the file when exiting with status 0. Thanks, Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 19:54 ` Jonathan Nieder @ 2018-07-16 20:29 ` Jeff King 2018-07-16 20:37 ` Jonathan Nieder 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 20:29 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Jonathan Tan, git On Mon, Jul 16, 2018 at 12:54:31PM -0700, Jonathan Nieder wrote: > Even restricting attention to the warning, not the exit code, I think > you are saying the current behavior is acceptable. I do not believe > it is. I didn't say that at all. The current situation sucks, and I think the right solution is to pack the unreachable objects instead of exploding them. > I think you are saying Jonathan's patch makes the behavior > worse, and I'm not seeing it. Can you describe an example user > interaction where the current behavior would be better than the > behavior after Jonathan's patch? That might help make this discussion > more concrete. It makes it worse because there is nothing to throttle a "gc --auto" that is making no progress. Try this: -- >8 -- #!/bin/sh rm -rf repo # new mostly-empty repo git init repo cd repo git commit --allow-empty -m base # make a bunch of unreachable blobs for i in $(seq 7000); do echo "blob" echo "data <<EOD" echo "cruft $i" echo "EOD" done | git fast-import # This gc will explode them (doing a "gc --auto" isn't sufficient here, because # we don't have enough _other_ material in the repo to trigger a gc. But you # can imagine that a normal repo would eventually "gc --auto". git gc # Now simulate some real work in the repo. for i in $(seq 50); do git commit --allow-empty -m "commit $i" sleep 30 done -- 8< -- With the current code, that produces a bunch of annoying warnings for every commit ("I couldn't gc because the last gc reported a warning"). But after Jonathan's patch, every single commit will do a full gc of the repository. In this tiny repository that's relatively quick, but in a real repo it's a ton of CPU and I/O, all for nothing. > [...] > > See the thread I linked earlier on putting unreachable objects into a > > pack, which I think is the real solution. > > Have you looked over the discussion in "Loose objects and unreachable > objects" in Documentation/technical/hash-function-transition.txt? Do > you have thoughts on it (preferrably in a separate thread)? It seems to propose putting the unreachable objects into a pack. Which yes, I absolutely agree with (as I thought I'd been saying in every single email in this thread). > > I mean that if you do not write a persistent log, then "gc --auto" will > > do an unproductive gc every time it is invoked. I.e., it will see "oh, > > there are too many loose objects", and then waste a bunch of CPU every > > time you commit. > > If so, then this would already be the behavior in non daemon mode. > Are you saying this accidental effect of daemon mode is in fact > intentional? I'm not sure if I'd call it intentional, since I don't recall ever seeing this side effect discussed. But since daemonizing is the default, I think that's the case people have focused on when they hit annoying cases. E.g., a831c06a2b (gc: ignore old gc.log files, 2017-02-10). I'd consider the fact that "gc --auto" with gc.autoDetach=false will repeatedly do useless work to be a minor bug. But I think Jonathan's patch makes it even worse because we do not even tell people that useless work is being done (while making them wait for each gc to finish!). Even if we were going to remove this message to help the daemonized case, I think we'd want to retain it for the non-daemon case. > > A daemonized git-gc runs a bunch of sub-commands (e.g., "git prune") > > with their stderr redirected into the logfile. If you want to have > > warnings go somewhere else, then either: > > > > - you need some way to tell those sub-commands to send the warnings > > elsewhere (i.e., _not_ stderr) > > > > or > > > > - you have to post-process the output they send to separate warnings > > from other errors. Right now that means scraping. If you are > > proposing a system of machine-readable output, it would need to work > > not just for git-gc, but for every sub-command it runs. > > or > > - you can simply record and trust the exit code > > A simple way to do that without changing formats is to truncate the > file when exiting with status 0. That's a different behavior than what we do now (and what was suggested earlier), in that it assumes that anything written to stderr is OK for gc to hide from the user if the process exits with code zero. That's probably OK in most cases, though I wonder if there are corner cases. For example, if you have a corrupt ref, we used to say "error: refs/heads/foo does not point to a valid object!" but otherwise ignore it. These days git-gc sets REF_PARANOIA=1, so we'll actually barf on a corrupt ref. But I wonder if there are other cases lurking. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 20:29 ` Jeff King @ 2018-07-16 20:37 ` Jonathan Nieder 2018-07-16 21:09 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-16 20:37 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Tan, git Jeff King wrote: > On Mon, Jul 16, 2018 at 12:54:31PM -0700, Jonathan Nieder wrote: >> Even restricting attention to the warning, not the exit code, I think >> you are saying the current behavior is acceptable. I do not believe >> it is. > > I didn't say that at all. The current situation sucks, Thanks for clarifying. That helps. > and I think the > right solution is to pack the unreachable objects instead of exploding > them. That seems like a huge widening in scope relative to what this original patch tackled. I'm not aware of a way to do this without breaking compatibility with the current (broken) race prevention. If you're saying that breaking compatibility in that way is okay with you, then great! [...] >> I think you are saying Jonathan's patch makes the behavior >> worse, and I'm not seeing it. Can you describe an example user >> interaction where the current behavior would be better than the >> behavior after Jonathan's patch? That might help make this discussion >> more concrete. > > It makes it worse because there is nothing to throttle a "gc --auto" > that is making no progress. [...] > With the current code, that produces a bunch of annoying warnings for > every commit ("I couldn't gc because the last gc reported a warning"). > But after Jonathan's patch, every single commit will do a full gc of the > repository. In this tiny repository that's relatively quick, but in a > real repo it's a ton of CPU and I/O, all for nothing. I see. Do I understand correctly that if we find a way to print the warning but not error out, that would be good enough for you? [...] >> Have you looked over the discussion in "Loose objects and unreachable >> objects" in Documentation/technical/hash-function-transition.txt? Do >> you have thoughts on it (preferrably in a separate thread)? > > It seems to propose putting the unreachable objects into a pack. Which > yes, I absolutely agree with (as I thought I'd been saying in every > single email in this thread). I figured you were proposing something like https://public-inbox.org/git/20180113100734.GA30698@sigill.intra.peff.net/, which is still racy (because it does not handle the freshening in a safe way). [...] > Even if we were going to remove this message to help the > daemonized case, I think we'd want to retain it for the non-daemon case. Interesting. That should be doable, e.g. following the approach described below. [...] >> A simple way to do that without changing formats is to truncate the >> file when exiting with status 0. > > That's a different behavior than what we do now (and what was suggested > earlier), in that it assumes that anything written to stderr is OK for > gc to hide from the user if the process exits with code zero. > > That's probably OK in most cases, though I wonder if there are corner > cases. For example, if you have a corrupt ref, we used to say "error: > refs/heads/foo does not point to a valid object!" but otherwise ignore > it. These days git-gc sets REF_PARANOIA=1, so we'll actually barf on a > corrupt ref. But I wonder if there are other cases lurking. What decides it for me is that the user did not invoke "git gc --auto" explicitly, so anything "git gc --auto" prints is tangential to what the user was trying to do. If the gc failed, that is worth telling them, but if e.g. it encountered a disk I/O error reading and succeeded on retry (to make up a fake example), then that's likely worth logging to syslog but it's not something the user asked to be directly informed about. Thanks, Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 20:37 ` Jonathan Nieder @ 2018-07-16 21:09 ` Jeff King 2018-07-16 21:40 ` Jonathan Nieder 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 21:09 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Jonathan Tan, git On Mon, Jul 16, 2018 at 01:37:53PM -0700, Jonathan Nieder wrote: > > and I think the > > right solution is to pack the unreachable objects instead of exploding > > them. > > That seems like a huge widening in scope relative to what this > original patch tackled. I'm not at all saying that Jonathan needs to work on that. It's a big topic that we've been putting off for years. I _am_ saying that I don't think his patch as-is should be merged, as it makes the daemonized behavior here worse. > I'm not aware of a way to do this without > breaking compatibility with the current (broken) race prevention. If > you're saying that breaking compatibility in that way is okay with > you, then great! It depends what you're trying to fix. If you're trying to fix auto-gc outputting an annoying message (or repeatedly doing useless invocations of pack-objects), you don't have to break compatibility at all. You just have to put the objects into a pack instead of exploding them loose. Worst case if you use an old implementation to run "git gc", it may accidentally freshen a cruft pack's timestamp. If you want to improve the raciness, then yes, I think you need stronger rules around marking packs as cruft/garbage, and making operations that want to use those objects copy them out to a non-cruft location (like you describe in the hash transition document). There you risk corruption if an old implementation fails to make the non-cruft copy. But that's really no worse than we are today. So I don't even really see it as a backwards-compatibility break. But even if it were, I'd probably still be fine with it. This is a local-repo thing, and in the absolute worst case we could bump core.repositoryformatversion (though again, I don't even think that would be necessary since it would degrade to behaving just as the current code does). > > With the current code, that produces a bunch of annoying warnings for > > every commit ("I couldn't gc because the last gc reported a warning"). > > But after Jonathan's patch, every single commit will do a full gc of the > > repository. In this tiny repository that's relatively quick, but in a > > real repo it's a ton of CPU and I/O, all for nothing. > > I see. Do I understand correctly that if we find a way to print the > warning but not error out, that would be good enough for you? Yes. I thought that's what I proposed originally. ;) The key thing is that the presence of the warning/log still suppress the actual gc for the gc.logExpiry period. > >> Have you looked over the discussion in "Loose objects and unreachable > >> objects" in Documentation/technical/hash-function-transition.txt? Do > >> you have thoughts on it (preferrably in a separate thread)? > > > > It seems to propose putting the unreachable objects into a pack. Which > > yes, I absolutely agree with (as I thought I'd been saying in every > > single email in this thread). > > I figured you were proposing something like > https://public-inbox.org/git/20180113100734.GA30698@sigill.intra.peff.net/, > which is still racy (because it does not handle the freshening in a safe > way). That message is just me telling somebody that their idea _won't_ work. ;) But I assume you mean the general idea of putting things in a cruft pack[1]. Yes, I was only suggesting going that far as a solution to the auto-gc woes. Solving races is another issue entirely (though obviously it may make sense to build on the single-pack work, if it exists). [1] The best message discussion on that is I think: https://public-inbox.org/git/20170610080626.sjujpmgkli4muh7h@sigill.intra.peff.net/ which I think I linked earlier, so possibly that is even what you meant. :) > What decides it for me is that the user did not invoke "git gc --auto" > explicitly, so anything "git gc --auto" prints is tangential to what > the user was trying to do. If the gc failed, that is worth telling > them, but if e.g. it encountered a disk I/O error reading and > succeeded on retry (to make up a fake example), then that's likely > worth logging to syslog but it's not something the user asked to be > directly informed about. I'm not sure I agree. If a repack discovered that you had a bit corruption on disk, but you happened to have another copy of the object available that made the repack succeed, I'd like to know that I'm getting bit corruptions, and earlier is better. I think we need to surface that information to the user eventually, and I don't think we can count on syslog being universally available. By definition we're daemonized here, so we can't count on even having access to the user's terminal. But it would make more sense to me if the logic were: - at the end of a "gc --auto" run, gc should write a machine-readable indication of its exit code (either as a final line in the log file, or to a gc.exitcode file). The warnings/errors remain intermingled in the logfile. - on the _next_ run, if: - the exit code was non-zero, dump the log output to stderr and exit with the same code. This is the same as the current behavior (with the exception that we propagate the code instead of returning -1, which is a minor bonus). - the exit code was zero, report to the user synchronously that the last run found some oddities, and dump the log. Proceed with the rest of auto-gc (either exiting 0 if nothing to be done, or doing useful work). Optionally, delete the log after showing it so the user is not spammed with the warning on every commit. Though for the warning under discussion, this does mean that if they don't take any action, they'll end up running a useless gc every _other_ commit. So I'd probably keep the log and let it expire naturally after the 1-day gc.logExpiry period. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 21:09 ` Jeff King @ 2018-07-16 21:40 ` Jonathan Nieder 2018-07-16 21:45 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-16 21:40 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Tan, git Jeff King wrote: > On Mon, Jul 16, 2018 at 01:37:53PM -0700, Jonathan Nieder wrote: >> Jeff King wrote: >>> With the current code, that produces a bunch of annoying warnings for >>> every commit ("I couldn't gc because the last gc reported a warning"). >>> But after Jonathan's patch, every single commit will do a full gc of the >>> repository. In this tiny repository that's relatively quick, but in a >>> real repo it's a ton of CPU and I/O, all for nothing. >> >> I see. Do I understand correctly that if we find a way to print the >> warning but not error out, that would be good enough for you? > > Yes. I thought that's what I proposed originally. ;) > > The key thing is that the presence of the warning/log still suppress > the actual gc for the gc.logExpiry period. Ah, now I think I see the source of the misunderstanding. When I initially read the docs, I also assumed that If the file gc.log exists, then git gc --auto won’t run unless that file is more than gc.logExpiry old. meant "git gc --auto" would be skipped (and thus silently succeed) when the file is less than gc.logExpiry old. But that assumption was wrong: it errors out! This makes scripting difficult, since arbitrary commands that incidentally run "git gc --auto" will fail in this state, until gc.log expires (but at that point they'll fail again anyway). For comparison, in non-daemon mode, there is nothing enforcing the kind of ratelimiting you are talking about. It is an accident of history. If we want this kind of ratelimiting, I'd rather we build it deliberately instead of relying on this accident. Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 21:40 ` Jonathan Nieder @ 2018-07-16 21:45 ` Jeff King 2018-07-16 22:03 ` Jonathan Nieder 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 21:45 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Jonathan Tan, git On Mon, Jul 16, 2018 at 02:40:03PM -0700, Jonathan Nieder wrote: > > The key thing is that the presence of the warning/log still suppress > > the actual gc for the gc.logExpiry period. > > Ah, now I think I see the source of the misunderstanding. > > When I initially read the docs, I also assumed that > > If the file gc.log exists, then git gc --auto won’t run unless > that file is more than gc.logExpiry old. > > meant "git gc --auto" would be skipped (and thus silently succeed) when > the file is less than gc.logExpiry old. But that assumption was wrong: > it errors out! Right. That's the mysterious "-1" error code, and I agree that part is confusing. > This makes scripting difficult, since arbitrary commands that > incidentally run "git gc --auto" will fail in this state, until gc.log > expires (but at that point they'll fail again anyway). I don't think any command should report failure of its _own_ operation if "gc --auto" failed. And grepping around the source code shows that we typically ignore it. > For comparison, in non-daemon mode, there is nothing enforcing the > kind of ratelimiting you are talking about. It is an accident of > history. If we want this kind of ratelimiting, I'd rather we build it > deliberately instead of relying on this accident. What I was trying to say earlier is that we _did_ build this rate-limiting, and I think it is a bug that the non-daemon case does not rate-limit (but nobody noticed, because the default is daemonizing). So the fix is not "rip out the rate-limiting in daemon mode", but rather "extend it to the non-daemon case". -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 21:45 ` Jeff King @ 2018-07-16 22:03 ` Jonathan Nieder 2018-07-16 22:43 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-16 22:03 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Tan, git Jeff King wrote: > I don't think any command should report failure of its _own_ operation > if "gc --auto" failed. And grepping around the source code shows that we > typically ignore it. Oh, good point. In non-daemon mode, we don't let "gc --auto" failure cause the invoking command to fail, but in daemon mode we do. That should be a straightforward fix; patch coming in a moment. > On Mon, Jul 16, 2018 at 02:40:03PM -0700, Jonathan Nieder wrote: >> For comparison, in non-daemon mode, there is nothing enforcing the >> kind of ratelimiting you are talking about. It is an accident of >> history. If we want this kind of ratelimiting, I'd rather we build it >> deliberately instead of relying on this accident. > > What I was trying to say earlier is that we _did_ build this > rate-limiting, and I think it is a bug that the non-daemon case does not > rate-limit (but nobody noticed, because the default is daemonizing). > > So the fix is not "rip out the rate-limiting in daemon mode", but rather > "extend it to the non-daemon case". Can you point me to some discussion about building that rate-limiting? The commit message for v2.12.2~17^2 (gc: ignore old gc.log files, 2017-02-10) definitely doesn't describe that as its intent. This is the kind of review that Dscho often complains about, where someone tries to fix something small but significant to users and gets told to build something larger that was not their itch instead. The comments about the "Why is 'git commit' so slow?" experience and how having the warning helps with that are well taken. I think we should be able to find a way to keep the warning in a v2 of this patch. But the rest about rate-limiting and putting unreachable objects in packs etc as a blocker for this are demoralizing, since they gives the feeling that even if I handle the cases that are handled today well, it will never be enough for the project unless I solve the larger problems that were already there. Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 22:03 ` Jonathan Nieder @ 2018-07-16 22:43 ` Jeff King 2018-07-16 22:56 ` Jonathan Nieder 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 22:43 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Jonathan Tan, git On Mon, Jul 16, 2018 at 03:03:06PM -0700, Jonathan Nieder wrote: > Jeff King wrote: > > > I don't think any command should report failure of its _own_ operation > > if "gc --auto" failed. And grepping around the source code shows that we > > typically ignore it. > > Oh, good point. In non-daemon mode, we don't let "gc --auto" failure > cause the invoking command to fail, but in daemon mode we do. That > should be a straightforward fix; patch coming in a moment. OK, that definitely sounds like a bug. I'm still confused how that could happen, though, since from the caller's perspective they ignore git-gc's exit code either way. I guess I'll see in your patch. :) > > What I was trying to say earlier is that we _did_ build this > > rate-limiting, and I think it is a bug that the non-daemon case does not > > rate-limit (but nobody noticed, because the default is daemonizing). > > > > So the fix is not "rip out the rate-limiting in daemon mode", but rather > > "extend it to the non-daemon case". > > Can you point me to some discussion about building that rate-limiting? > The commit message for v2.12.2~17^2 (gc: ignore old gc.log files, > 2017-02-10) definitely doesn't describe that as its intent. I think that commit is a loosening of the rate-limiting (because we'd refuse to progress for something that was actually time-based). But the original stopping comes from this discussion, I think: https://public-inbox.org/git/xmqqlhijznpm.fsf@gitster.dls.corp.google.com/ (I didn't read the whole thread, but that was what I hit by blaming the log code and then tracing that back to the list). > This is the kind of review that Dscho often complains about, where > someone tries to fix something small but significant to users and gets > told to build something larger that was not their itch instead. I don't know how to say more emphatically that I am not asking anyone to build something larger (like cruft packfiles). I'm just trying to bring up an impact that the author didn't consider (and that IMHO would be a regression). Isn't that what reviews are for? I only mention packfiles because as the discussion turns to "well, all of these solutions are mediocre hacks" (because they absolutely are), it's important to realize that there _is_ a right solution, and we even already know about it. Even if we don't work on it now, knowing that it's there makes it easier to decide about the various hacks. > The comments about the "Why is 'git commit' so slow?" experience and > how having the warning helps with that are well taken. I think we > should be able to find a way to keep the warning in a v2 of this > patch. But the rest about rate-limiting and putting unreachable > objects in packs etc as a blocker for this are demoralizing, since > they gives the feeling that even if I handle the cases that are > handled today well, it will never be enough for the project unless I > solve the larger problems that were already there. I really don't know why we are having such trouble communicating. I've tried to make it clear several times that the pack thing is not something I expect your or Jonathan Tan to work on, but obviously I failed. I'd be _delighted_ if you wanted to work on it, but AFAICT this patch is purely motivated by: 1. there's a funny exit code thing going on (0 on the first run, -1 on the second) 2. the warning is not actionable by users I disagree with the second, and I think we've discussed easy solutions for the first. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 22:43 ` Jeff King @ 2018-07-16 22:56 ` Jonathan Nieder 2018-07-16 23:26 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-16 22:56 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Tan, git Jeff King wrote: > On Mon, Jul 16, 2018 at 03:03:06PM -0700, Jonathan Nieder wrote: >> Oh, good point. In non-daemon mode, we don't let "gc --auto" failure >> cause the invoking command to fail, but in daemon mode we do. That >> should be a straightforward fix; patch coming in a moment. > > OK, that definitely sounds like a bug. I'm still confused how that could > happen, though, since from the caller's perspective they ignore git-gc's > exit code either way. I guess I'll see in your patch. :) Alas, I just misremembered. What I was remembering is that gc --auto does if (auto_gc && too_many_loose_objects()) warning(_("There are too many unreachable loose objects; " "run 'git prune' to remove them.")); return 0; which means that too_many_loose_objects is not an error in undaemonized mode, while it is in daemonized mode. But we've already discussed that. The calling command in the motivating example is Android's "repo" tool: bare_git.gc('--auto') from https://gerrit-review.googlesource.com/c/git-repo/+/10598/. I think it's reasonable that it expects a status code of 0 in the normal case. So life is less simple than I hoped. [...] >> Can you point me to some discussion about building that rate-limiting? >> The commit message for v2.12.2~17^2 (gc: ignore old gc.log files, >> 2017-02-10) definitely doesn't describe that as its intent. > > I think that commit is a loosening of the rate-limiting (because we'd > refuse to progress for something that was actually time-based). But the > original stopping comes from this discussion, I think: > > https://public-inbox.org/git/xmqqlhijznpm.fsf@gitster.dls.corp.google.com/ Interesting! It looks like that thread anticipated the problems we've seen here. Three years without having to have fixed it is a good run, I suppose. The discussion of stopping there appears to be primarily about stopping in the error case, not rate-limiting in the success or warning case. Here's a patch for the 'return -1' thing. -- >8 -- Subject: gc: exit with status 128 on failure A value of -1 returned from cmd_gc gets propagated to exit(), resulting in an exit status of 255. Use die instead for a clearer error message and a controlled exit. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- builtin/gc.c | 36 ++++++++++++++++-------------------- t/t6500-gc.sh | 2 +- 2 files changed, 17 insertions(+), 21 deletions(-) diff --git a/builtin/gc.c b/builtin/gc.c index ccfb1ceaeb..2bebc52bda 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -438,10 +438,10 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) return NULL; } -static int report_last_gc_error(void) +static void report_last_gc_error(void) { struct strbuf sb = STRBUF_INIT; - int ret = 0; + ssize_t ret; struct stat st; char *gc_log_path = git_pathdup("gc.log"); @@ -449,16 +449,17 @@ static int report_last_gc_error(void) if (errno == ENOENT) goto done; - ret = error_errno(_("Can't stat %s"), gc_log_path); - goto done; + die_errno(_("cannot stat '%s'"), gc_log_path); } if (st.st_mtime < gc_log_expire_time) goto done; ret = strbuf_read_file(&sb, gc_log_path, 0); + if (ret < 0) + die_errno(_("cannot read '%s'"), gc_log_path); if (ret > 0) - ret = error(_("The last gc run reported the following. " + die(_("The last gc run reported the following. " "Please correct the root cause\n" "and remove %s.\n" "Automatic cleanup will not be performed " @@ -468,20 +469,18 @@ static int report_last_gc_error(void) strbuf_release(&sb); done: free(gc_log_path); - return ret; } -static int gc_before_repack(void) +static void gc_before_repack(void) { if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, pack_refs_cmd.argv[0]); + die(FAILED_RUN, pack_refs_cmd.argv[0]); if (prune_reflogs && run_command_v_opt(reflog.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, reflog.argv[0]); + die(FAILED_RUN, reflog.argv[0]); pack_refs = 0; prune_reflogs = 0; - return 0; } int cmd_gc(int argc, const char **argv, const char *prefix) @@ -562,13 +561,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix) fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); } if (detach_auto) { - if (report_last_gc_error()) - return -1; + report_last_gc_error(); /* dies on error */ if (lock_repo_for_gc(force, &pid)) return 0; - if (gc_before_repack()) - return -1; + gc_before_repack(); /* dies on failure */ delete_tempfile(&pidfile); /* @@ -608,12 +605,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix) atexit(process_log_file_at_exit); } - if (gc_before_repack()) - return -1; + gc_before_repack(); if (!repository_format_precious_objects) { if (run_command_v_opt(repack.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, repack.argv[0]); + die(FAILED_RUN, repack.argv[0]); if (prune_expire) { argv_array_push(&prune, prune_expire); @@ -623,18 +619,18 @@ int cmd_gc(int argc, const char **argv, const char *prefix) argv_array_push(&prune, "--exclude-promisor-objects"); if (run_command_v_opt(prune.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, prune.argv[0]); + die(FAILED_RUN, prune.argv[0]); } } if (prune_worktrees_expire) { argv_array_push(&prune_worktrees, prune_worktrees_expire); if (run_command_v_opt(prune_worktrees.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, prune_worktrees.argv[0]); + die(FAILED_RUN, prune_worktrees.argv[0]); } if (run_command_v_opt(rerere.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, rerere.argv[0]); + die(FAILED_RUN, rerere.argv[0]); report_garbage = report_pack_garbage; reprepare_packed_git(the_repository); diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index 818435f04e..c474a94a9f 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -117,7 +117,7 @@ test_expect_success 'background auto gc does not run if gc.log is present and re test_config gc.autodetach true && echo fleem >.git/gc.log && test_must_fail git gc --auto 2>err && - test_i18ngrep "^error:" err && + test_i18ngrep "^fatal:" err && test_config gc.logexpiry 5.days && test-tool chmtime =-345600 .git/gc.log && test_must_fail git gc --auto && -- 2.18.0.233.g985f88cf7e ^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 22:56 ` Jonathan Nieder @ 2018-07-16 23:26 ` Jeff King 2018-07-17 1:53 ` Jonathan Nieder 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 23:26 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Jonathan Tan, git On Mon, Jul 16, 2018 at 03:56:39PM -0700, Jonathan Nieder wrote: > The calling command in the motivating example is Android's "repo" tool: > > bare_git.gc('--auto') > > from https://gerrit-review.googlesource.com/c/git-repo/+/10598/. I > think it's reasonable that it expects a status code of 0 in the normal > case. So life is less simple than I hoped. IMHO it should ignore the return code, since that's what Git does itself. And at any rate, you'd miss errors from daemonized gc's (at least until the _next_ one runs and propagates the error code). > Interesting! It looks like that thread anticipated the problems we've > seen here. Three years without having to have fixed it is a good run, > I suppose. > > The discussion of stopping there appears to be primarily about > stopping in the error case, not rate-limiting in the success or > warning case. I think the two are essentially the same. The point is that we cannot make further progress by re-running the gc again, so we should not. Amusingly, the warning we're talking about is the exact reason that the logging in that thread was added. 329e6e8794 (gc: save log from daemonized gc --auto and print it next time, 2015-09-19) says: The latest in this set is, as the result of daemonizing, stderr is closed and all warnings are lost. This warning at the end of cmd_gc() is particularly important because it tells the user how to avoid "gc --auto" running repeatedly. Because stderr is closed, the user does not know, naturally they complain about 'gc --auto' wasting CPU. > -- >8 -- > Subject: gc: exit with status 128 on failure > > A value of -1 returned from cmd_gc gets propagated to exit(), > resulting in an exit status of 255. Use die instead for a clearer > error message and a controlled exit. I agree it's better to not pass -1 to exit(). But I thought the original motivation was the fact that we were returning non-zero in this case at all? (And I thought you and I both agreed with that motivation). So is this meant to be a preparatory fix until somebody is interested in fixing that? > -static int gc_before_repack(void) > +static void gc_before_repack(void) > { > if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, pack_refs_cmd.argv[0]); > + die(FAILED_RUN, pack_refs_cmd.argv[0]); > > if (prune_reflogs && run_command_v_opt(reflog.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, reflog.argv[0]); > + die(FAILED_RUN, reflog.argv[0]); Dscho is going to yell at you about replacing error returns with die(). ;) I wonder if it would be simpler to just reinterpret the "-1" into "128" in cmd_gc(). I thought we had talked about having run_builtin() do that at one point, but I don't think we ever did. I dunno. I'm fine with it either way. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 23:26 ` Jeff King @ 2018-07-17 1:53 ` Jonathan Nieder 2018-07-17 8:59 ` Ævar Arnfjörð Bjarmason ` (2 more replies) 0 siblings, 3 replies; 57+ messages in thread From: Jonathan Nieder @ 2018-07-17 1:53 UTC (permalink / raw) To: Jeff King Cc: Jonathan Tan, git, Andrii Dehtiarov, Nguyễn Thái Ngọc Duy Hi, Jeff King wrote: > On Mon, Jul 16, 2018 at 03:56:39PM -0700, Jonathan Nieder wrote: >> The calling command in the motivating example is Android's "repo" tool: >> >> bare_git.gc('--auto') >> >> from https://gerrit-review.googlesource.com/c/git-repo/+/10598/. I >> think it's reasonable that it expects a status code of 0 in the normal >> case. So life is less simple than I hoped. > > IMHO it should ignore the return code, since that's what Git does > itself. And at any rate, you'd miss errors from daemonized gc's (at > least until the _next_ one runs and propagates the error code). That suggests a possible improvement. If all callers should be ignoring the exit code, can we make the exit code in daemonize mode unconditionally zero in this kind of case? That doesn't really solve the problem: 1. "gc --auto" would produce noise *every time it runs* until gc.log is removed, for example via expiry 2. "gc --auto" would not do any garbage collection until gc.log is removed, for example by expiry 3. "gc --auto" would still not ratelimit itself, for example when there are a large number of loose unreachable objects that is not large enough to hit the loose object threshold. but maybe it's better than the present state. To solve (1) and (2), we could introduce a gc.warnings file that behaves like gc.log except that it only gets printed once and then self-destructs, allowing gc --auto to proceed. To solve (3), we could introduce a gc.lastrun file that is touched whenever "gc --auto" runs successfully and make "gc --auto" a no-op for a while after each run. -- >8 -- Subject: gc: do not return error for prior errors in daemonized mode Some build machines started failing to fetch updated source using "repo sync", with error error: The last gc run reported the following. Please correct the root cause and remove /build/.repo/projects/tools/git.git/gc.log. Automatic cleanup will not be performed until the file is removed. warning: There are too many unreachable loose objects; run 'git prune' to remove them. The cause takes some time to describe. In v2.0.0-rc0~145^2 (gc: config option for running --auto in background, 2014-02-08), "git gc --auto" learned to run in the background instead of blocking the invoking command. In this mode, it closed stderr to avoid interleaving output with any subsequent commands, causing warnings like the above to be swallowed; v2.6.3~24^2 (gc: save log from daemonized gc --auto and print it next time, 2015-09-19) addressed this by storing any diagnostic output in .git/gc.log and allowing the next "git gc --auto" run to print it. To avoid wasteful repeated fruitless gcs, when gc.log is present, the subsequent "gc --auto" would die after print its contents. Most git commands, such as "git fetch", ignore the exit status from "git gc --auto" so all is well at this point: the user gets to see the error message, and the fetch succeeds, without a wasteful additional attempt at an automatic gc. External tools like repo[1], though, do care about the exit status from "git gc --auto". In non-daemonized mode, the exit status is straightforward: if there is an error, it is nonzero, but after a warning like the above, the status is zero. The daemonized mode, as a side effect of the other properties provided, offers a very strange exit code convention: - if no housekeeping was required, the exit status is 0 - the first real run, after forking into the background, returns exit status 0 unconditionally. The parent process has no way to know whether gc will succeed. - if there is any diagnostic output in gc.log, subsequent runs return a nonzero exit status to indicate that gc was not triggered. There's nothing for the calling program to act on on the basis of that error. Use status 0 consistently instead, to indicate that we decided not to run a gc (just like if no housekeeping was required). This way, repo and similar tools can get the benefit of the same behavior as tools like "git fetch" that ignore the exit status from gc --auto. Once the period of time described by gc.pruneExpire elapses, the unreachable loose objects will be removed by "git gc --auto" automatically. [1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/ Reported-by: Andrii Dehtiarov <adehtiarov@google.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Documentation/config.txt | 3 ++- builtin/gc.c | 16 +++++++++++----- t/t6500-gc.sh | 6 +++--- 3 files changed, 16 insertions(+), 9 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1cc18a828c..5eaa4aaa7d 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should go below gc.autoPackLimit and gc.bigPackThreshold should be respected again. gc.logExpiry:: - If the file gc.log exists, then `git gc --auto` won't run + If the file gc.log exists, then `git gc --auto` will print + its content and exit with status zero instead of running unless that file is more than 'gc.logExpiry' old. Default is "1.day". See `gc.pruneExpire` for more ways to specify its value. diff --git a/builtin/gc.c b/builtin/gc.c index 2bebc52bda..484ab21b8c 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -438,7 +438,7 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) return NULL; } -static void report_last_gc_error(void) +static int report_last_gc_error(void) { struct strbuf sb = STRBUF_INIT; ssize_t ret; @@ -449,7 +449,7 @@ static void report_last_gc_error(void) if (errno == ENOENT) goto done; - die_errno(_("cannot stat '%s'"), gc_log_path); + return error_errno(_("cannot stat '%s'"), gc_log_path); } if (st.st_mtime < gc_log_expire_time) @@ -457,9 +457,9 @@ static void report_last_gc_error(void) ret = strbuf_read_file(&sb, gc_log_path, 0); if (ret < 0) - die_errno(_("cannot read '%s'"), gc_log_path); + return error_errno(_("cannot read '%s'"), gc_log_path); if (ret > 0) - die(_("The last gc run reported the following. " + return error(_("The last gc run reported the following. " "Please correct the root cause\n" "and remove %s.\n" "Automatic cleanup will not be performed " @@ -469,6 +469,7 @@ static void report_last_gc_error(void) strbuf_release(&sb); done: free(gc_log_path); + return 0; } static void gc_before_repack(void) @@ -561,7 +562,12 @@ int cmd_gc(int argc, const char **argv, const char *prefix) fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); } if (detach_auto) { - report_last_gc_error(); /* dies on error */ + if (report_last_gc_error()) + /* + * A previous gc failed. We've reported the + * error, so there's nothing left to do. + */ + return 0; if (lock_repo_for_gc(force, &pid)) return 0; diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index c474a94a9f..3e62df616c 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -116,11 +116,11 @@ test_expect_success 'background auto gc does not run if gc.log is present and re test_config gc.autopacklimit 1 && test_config gc.autodetach true && echo fleem >.git/gc.log && - test_must_fail git gc --auto 2>err && - test_i18ngrep "^fatal:" err && + git gc --auto 2>err && + test_i18ngrep "^error:" err && test_config gc.logexpiry 5.days && test-tool chmtime =-345600 .git/gc.log && - test_must_fail git gc --auto && + git gc --auto && test_config gc.logexpiry 2.days && run_and_wait_for_auto_gc && ls .git/objects/pack/pack-*.pack >packs && -- 2.18.0.233.g985f88cf7e ^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-17 1:53 ` Jonathan Nieder @ 2018-07-17 8:59 ` Ævar Arnfjörð Bjarmason 2018-07-17 14:03 ` Jonathan Nieder 2018-07-17 20:27 ` Jeff King 2018-07-17 15:59 ` Duy Nguyen 2018-07-17 18:09 ` Junio C Hamano 2 siblings, 2 replies; 57+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2018-07-17 8:59 UTC (permalink / raw) To: Jonathan Nieder Cc: Jeff King, Jonathan Tan, git, Andrii Dehtiarov, Nguyễn Thái Ngọc Duy On Tue, Jul 17 2018, Jonathan Nieder wrote: > Hi, > > Jeff King wrote: >> On Mon, Jul 16, 2018 at 03:56:39PM -0700, Jonathan Nieder wrote: > >>> The calling command in the motivating example is Android's "repo" tool: >>> >>> bare_git.gc('--auto') >>> >>> from https://gerrit-review.googlesource.com/c/git-repo/+/10598/. I >>> think it's reasonable that it expects a status code of 0 in the normal >>> case. So life is less simple than I hoped. >> >> IMHO it should ignore the return code, since that's what Git does >> itself. And at any rate, you'd miss errors from daemonized gc's (at >> least until the _next_ one runs and propagates the error code). I've read this whole thread, and as Peff noted I've barked up this particular tree before[1] without coming up with a solution myself. So please don't take the following as critique of any way of moving forward, I'm just trying to poke holes in what you're doing to make sure we don't have regressions to the currently (sucky) logic. > That suggests a possible improvement. If all callers should be > ignoring the exit code, can we make the exit code in daemonize mode > unconditionally zero in this kind of case? That doesn't make sense to me. Just because git itself is happily ignoring the exit code I don't think that should mean there shouldn't be a meaningful exit code. Why don't you just ignore the exit code in the repo tool as well? Now e.g. automation I have to see if git-gc ---auto is having issues can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet of servers, but will need to start caring if stderr was emitted to. I don't care if we e.g. have a 'git gc --auto --exit-code' similar to what git-diff does, but it doesn't make sense to me that we *know* we can't background the gc due to a previous error and then always return 0. Having to parse STDERR to see if it *really* failed is un-unixy, let's use exit codes. That's what they're for. > That doesn't really solve the problem: > > 1. "gc --auto" would produce noise *every time it runs* until gc.log > is removed, for example via expiry > > 2. "gc --auto" would not do any garbage collection until gc.log is > removed, for example by expiry > > 3. "gc --auto" would still not ratelimit itself, for example when > there are a large number of loose unreachable objects that is not > large enough to hit the loose object threshold. > > but maybe it's better than the present state. > > To solve (1) and (2), we could introduce a gc.warnings file that > behaves like gc.log except that it only gets printed once and then > self-destructs, allowing gc --auto to proceed. I think you're conflating two things here in a way that (if I'm reading this correctly) produces a pretty bad regression for users. a) If we have something in the gc.log we keep yelling at users until we reach the gc.logExpiry, this was the subject of my thread back in January. This sucks, and should be fixed somehow. Maybe we should only emit the warning once, e.g. creating a .git/gc.log.wasemitted marker and as long as its ctime is later than the mtime on .git/gc.log we don't emit it again). But that also creates the UX problem that it's easy to miss this message in the middle of some huge "fetch" output. Perhaps we should just start emitting this as part of "git status" or something (or solve the root cause, as Peff notes...). b) We *also* use this presence of a gc.log as a marker for "we failed too recently, let's not try again until after a day". The semantics of b) are very useful, and just because they're tied up with a) right now, let's not throw out b) just because we're trying to solve a). We have dev machines with limited I/O & CPU/memory that occasionally break the gc.auto limit, I really don't want those churning a background "git gc" on every fetch/commit etc. until we're finally below the gc.auto limit (possibly days later), which would be a side-effect of printing the .git/gc.log once and then removing it. > To solve (3), we could > introduce a gc.lastrun file that is touched whenever "gc --auto" runs > successfully and make "gc --auto" a no-op for a while after each run. This would work around my concern with b) above in most cases by introducing a purely time-based rate limit, but I'm uneasy about this change in git-gc semantics. Right now one thing we do right is we always try to look at the actual state of the repo with too_many_packs() and too_many_loose_objects(). We don't assume the state of your repo hasn't drastically changed recently. That means that we do the right thing and gc when the repo needs it, not just because we GC'd recently enough. With these proposed semantics we'd skip a needed GC (potentially for days, depending on the default) just because we recently ran one. In many environments, such as on busy servers, we could have tens of thousands of packs by this point, since this facility would (presumably) bypass both gc.autoPackLimit and gc.auto in favor of only running gc at a maximum of every N minutes, similarly in a local checkout I could have a crapload of loose objects because I ran a big rebase or a filter-branch on one of my branches. 1. https://public-inbox.org/git/87inc89j38.fsf@evledraar.gmail.com/ > -- >8 -- > Subject: gc: do not return error for prior errors in daemonized mode > > Some build machines started failing to fetch updated source using > "repo sync", with error > > error: The last gc run reported the following. Please correct the root cause > and remove /build/.repo/projects/tools/git.git/gc.log. > Automatic cleanup will not be performed until the file is removed. > > warning: There are too many unreachable loose objects; run 'git prune' to remove them. > > The cause takes some time to describe. > > In v2.0.0-rc0~145^2 (gc: config option for running --auto in > background, 2014-02-08), "git gc --auto" learned to run in the > background instead of blocking the invoking command. In this mode, it > closed stderr to avoid interleaving output with any subsequent > commands, causing warnings like the above to be swallowed; v2.6.3~24^2 > (gc: save log from daemonized gc --auto and print it next time, > 2015-09-19) addressed this by storing any diagnostic output in > .git/gc.log and allowing the next "git gc --auto" run to print it. > > To avoid wasteful repeated fruitless gcs, when gc.log is present, the > subsequent "gc --auto" would die after print its contents. Most git > commands, such as "git fetch", ignore the exit status from "git gc > --auto" so all is well at this point: the user gets to see the error > message, and the fetch succeeds, without a wasteful additional attempt > at an automatic gc. > > External tools like repo[1], though, do care about the exit status > from "git gc --auto". In non-daemonized mode, the exit status is > straightforward: if there is an error, it is nonzero, but after a > warning like the above, the status is zero. The daemonized mode, as a > side effect of the other properties provided, offers a very strange > exit code convention: > > - if no housekeeping was required, the exit status is 0 > > - the first real run, after forking into the background, returns exit > status 0 unconditionally. The parent process has no way to know > whether gc will succeed. > > - if there is any diagnostic output in gc.log, subsequent runs return > a nonzero exit status to indicate that gc was not triggered. > > There's nothing for the calling program to act on on the basis of that > error. Use status 0 consistently instead, to indicate that we decided > not to run a gc (just like if no housekeeping was required). This > way, repo and similar tools can get the benefit of the same behavior > as tools like "git fetch" that ignore the exit status from gc --auto. > > Once the period of time described by gc.pruneExpire elapses, the > unreachable loose objects will be removed by "git gc --auto" > automatically. > > [1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/ > > Reported-by: Andrii Dehtiarov <adehtiarov@google.com> > Helped-by: Jeff King <peff@peff.net> > Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> > --- > Documentation/config.txt | 3 ++- > builtin/gc.c | 16 +++++++++++----- > t/t6500-gc.sh | 6 +++--- > 3 files changed, 16 insertions(+), 9 deletions(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1cc18a828c..5eaa4aaa7d 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should go below > gc.autoPackLimit and gc.bigPackThreshold should be respected again. > > gc.logExpiry:: > - If the file gc.log exists, then `git gc --auto` won't run > + If the file gc.log exists, then `git gc --auto` will print > + its content and exit with status zero instead of running > unless that file is more than 'gc.logExpiry' old. Default is > "1.day". See `gc.pruneExpire` for more ways to specify its > value. > diff --git a/builtin/gc.c b/builtin/gc.c > index 2bebc52bda..484ab21b8c 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -438,7 +438,7 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) > return NULL; > } > > -static void report_last_gc_error(void) > +static int report_last_gc_error(void) > { > struct strbuf sb = STRBUF_INIT; > ssize_t ret; > @@ -449,7 +449,7 @@ static void report_last_gc_error(void) > if (errno == ENOENT) > goto done; > > - die_errno(_("cannot stat '%s'"), gc_log_path); > + return error_errno(_("cannot stat '%s'"), gc_log_path); > } > > if (st.st_mtime < gc_log_expire_time) > @@ -457,9 +457,9 @@ static void report_last_gc_error(void) > > ret = strbuf_read_file(&sb, gc_log_path, 0); > if (ret < 0) > - die_errno(_("cannot read '%s'"), gc_log_path); > + return error_errno(_("cannot read '%s'"), gc_log_path); > if (ret > 0) > - die(_("The last gc run reported the following. " > + return error(_("The last gc run reported the following. " > "Please correct the root cause\n" > "and remove %s.\n" > "Automatic cleanup will not be performed " > @@ -469,6 +469,7 @@ static void report_last_gc_error(void) > strbuf_release(&sb); > done: > free(gc_log_path); > + return 0; > } > > static void gc_before_repack(void) > @@ -561,7 +562,12 @@ int cmd_gc(int argc, const char **argv, const char *prefix) > fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); > } > if (detach_auto) { > - report_last_gc_error(); /* dies on error */ > + if (report_last_gc_error()) > + /* > + * A previous gc failed. We've reported the > + * error, so there's nothing left to do. > + */ > + return 0; > > if (lock_repo_for_gc(force, &pid)) > return 0; > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh > index c474a94a9f..3e62df616c 100755 > --- a/t/t6500-gc.sh > +++ b/t/t6500-gc.sh > @@ -116,11 +116,11 @@ test_expect_success 'background auto gc does not run if gc.log is present and re > test_config gc.autopacklimit 1 && > test_config gc.autodetach true && > echo fleem >.git/gc.log && > - test_must_fail git gc --auto 2>err && > - test_i18ngrep "^fatal:" err && > + git gc --auto 2>err && > + test_i18ngrep "^error:" err && > test_config gc.logexpiry 5.days && > test-tool chmtime =-345600 .git/gc.log && > - test_must_fail git gc --auto && > + git gc --auto && > test_config gc.logexpiry 2.days && > run_and_wait_for_auto_gc && > ls .git/objects/pack/pack-*.pack >packs && ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-17 8:59 ` Ævar Arnfjörð Bjarmason @ 2018-07-17 14:03 ` Jonathan Nieder 2018-07-17 15:24 ` Ævar Arnfjörð Bjarmason 2018-07-17 20:27 ` Jeff King 1 sibling, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-17 14:03 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Jeff King, Jonathan Tan, git, Andrii Dehtiarov, Nguyễn Thái Ngọc Duy Hi Ævar, Ævar Arnfjörð Bjarmason wrote: > On Tue, Jul 17 2018, Jonathan Nieder wrote: >> That suggests a possible improvement. If all callers should be >> ignoring the exit code, can we make the exit code in daemonize mode >> unconditionally zero in this kind of case? > > That doesn't make sense to me. Just because git itself is happily > ignoring the exit code I don't think that should mean there shouldn't be > a meaningful exit code. Why don't you just ignore the exit code in the > repo tool as well? I don't maintain the repo tool. I am just trying to improve git to make some users' lives better. repo worked fine for years, until gc's daemonized mode broke it. I don't think it makes sense for us to declare that it has been holding git gc wrong for all that time before then. > Now e.g. automation I have to see if git-gc ---auto is having issues > can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet > of servers, but will need to start caring if stderr was emitted to. Thanks for bringing this up. The thing is, the current exit code is not useful for that purpose. It doesn't report a failure until the *next* `git gc --auto` run, when it is too late to be useful. See the commit message on the proposed patch https://public-inbox.org/git/20180717065740.GD177907@aiede.svl.corp.google.com/ for more on that subject. > I don't care if we e.g. have a 'git gc --auto --exit-code' similar to > what git-diff does, but it doesn't make sense to me that we *know* we > can't background the gc due to a previous error and then always return > 0. Having to parse STDERR to see if it *really* failed is un-unixy, > let's use exit codes. That's what they're for. Do you know of anyone trying to use the exit code from gc --auto in this way? It sounds to me like for what you're proposing, a separate git gc --auto --last-run-failed command that a tool could use to check the outcome from the previous gc --auto run without triggering a new one would be best. [...] > I think you're conflating two things here in a way that (if I'm reading > this correctly) produces a pretty bad regression for users. The patch doesn't touch the "ratelimiting" behavior at all, so I'm not sure what I'm doing to regress users. [...] >> To solve (3), we could >> introduce a gc.lastrun file that is touched whenever "gc --auto" runs >> successfully and make "gc --auto" a no-op for a while after each run. > > This would work around my concern with b) above in most cases by > introducing a purely time-based rate limit, but I'm uneasy about this > change in git-gc semantics. [..] > With these proposed semantics we'd skip a needed GC (potentially for > days, depending on the default) just because we recently ran one. > > In many environments, such as on busy servers, we could have tens of > thousands of packs by this point, since this facility would (presumably) > bypass both gc.autoPackLimit and gc.auto in favor of only running gc at > a maximum of every N minutes, similarly in a local checkout I could have > a crapload of loose objects because I ran a big rebase or a > filter-branch on one of my branches. I think we all agree that getting rid of the hack that 'explodes' unreachable objects into loose objects is the best way forward, long term. Even in that future, some ratelimiting may be useful, though. I also suspect that we're never going to achieve a perfect set of defaults that works well for both humans and servers, though we can try. Thanks and hope that helps, Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-17 14:03 ` Jonathan Nieder @ 2018-07-17 15:24 ` Ævar Arnfjörð Bjarmason 0 siblings, 0 replies; 57+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2018-07-17 15:24 UTC (permalink / raw) To: Jonathan Nieder Cc: Jeff King, Jonathan Tan, git, Andrii Dehtiarov, Nguyễn Thái Ngọc Duy On Tue, Jul 17 2018, Jonathan Nieder wrote: > Hi Ævar, > > Ævar Arnfjörð Bjarmason wrote: >> On Tue, Jul 17 2018, Jonathan Nieder wrote: > >>> That suggests a possible improvement. If all callers should be >>> ignoring the exit code, can we make the exit code in daemonize mode >>> unconditionally zero in this kind of case? >> >> That doesn't make sense to me. Just because git itself is happily >> ignoring the exit code I don't think that should mean there shouldn't be >> a meaningful exit code. Why don't you just ignore the exit code in the >> repo tool as well? > > I don't maintain the repo tool. I am just trying to improve git to > make some users' lives better. > > repo worked fine for years, until gc's daemonized mode broke it. I > don't think it makes sense for us to declare that it has been holding > git gc wrong for all that time before then. Before the sequence of commits noted at the bottom of my https://public-inbox.org/git/87inc89j38.fsf@evledraar.gmail.com/ plus a831c06a2b ("gc: ignore old gc.log files", 2017-02-10) we wouldn't do that, that's true. We'd just happily run GC again pointlessly even though it wasn't going to accomplish anything, in cases where you really did have ~>6700 loose objects that weren't going to be pruned. I don't think it makes sense to revert those patches and go back to the much more naïve (and user waiting there twiddling his thumbs while it runs) method, but I also don't think making no distinction between the following states: 1. gc --auto has nothing to do 2. gc --auto has something to do, will fork and try to do it 3. gc --auto has something to do, but notices that gc has been failing before and can't do anything now. I think #3 should exit with non-zero. Yes, before this whole backgrounding etc. we wouldn't have exited with non-zero either, we'd just print a thing to STDERR. But with backgrounding we implicitly introduced a new state, which is we won't do *any* maintenance at all, including consolidating packfiles (very important for servers), so I think it makes sense to exit with non-zero since gc can't run at all. >> Now e.g. automation I have to see if git-gc ---auto is having issues >> can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet >> of servers, but will need to start caring if stderr was emitted to. > > Thanks for bringing this up. The thing is, the current exit code is > not useful for that purpose. It doesn't report a failure until the > *next* `git gc --auto` run, when it is too late to be useful. > > See the commit message on the proposed patch > https://public-inbox.org/git/20180717065740.GD177907@aiede.svl.corp.google.com/ > for more on that subject. Right. I know. What I mean is now I can (and do) use it to run 'git gc --auto' across our server fleet and see whether I have any of #3, or whether it's all #1 or #2. If there's nothing to do in #1 that's fine, and it just so happens that I'll run gc due to #2 that's also fine, but I'd like to see if gc really is stuck. This of course relies on them having other users / scripts doing normal git commands which would trigger previous 'git gc --auto' runs. >> I don't care if we e.g. have a 'git gc --auto --exit-code' similar to >> what git-diff does, but it doesn't make sense to me that we *know* we >> can't background the gc due to a previous error and then always return >> 0. Having to parse STDERR to see if it *really* failed is un-unixy, >> let's use exit codes. That's what they're for. > > Do you know of anyone trying to use the exit code from gc --auto in > this way? > > It sounds to me like for what you're proposing, a separate > > git gc --auto --last-run-failed > > command that a tool could use to check the outcome from the previous > gc --auto run without triggering a new one would be best. Yeah this is admittedly a bit of a poweruser thing I'm doing, so I don't mind if it's hidden behind something like that in principle, but it seems wrong to exit with zero in this (#3) case: $ git gc --auto; echo $? Auto packing the repository in background for optimum performance. See "git help gc" for manual housekeeping. error: The last gc run reported the following. Please correct the root cause and remove .git/gc.log. Automatic cleanup will not be performed until the file is removed. warning: There are too many unreachable loose objects; run 'git prune' to remove them. 255 As a previous (good) patch in this series notes that shouldn't be 255, let's fix that, but I don't see how emitting an "error" and "warning" saying we're broken and can't gc at all here in conjunction with exiting with zero makes sense. > [...] >> I think you're conflating two things here in a way that (if I'm reading >> this correctly) produces a pretty bad regression for users. > > The patch doesn't touch the "ratelimiting" behavior at all, so I'm not > sure what I'm doing to regress users. Sorry about being unclear again, this is not a comment on this patch, but your paragraph beginning with "To solve[...]", i.e. "this patch doesn't do X, but in the future maybe we'll...". I'm pointing out potential caveats with that, I realize you have not posted an implementation of that yet. > [...] >>> To solve (3), we could >>> introduce a gc.lastrun file that is touched whenever "gc --auto" runs >>> successfully and make "gc --auto" a no-op for a while after each run. >> >> This would work around my concern with b) above in most cases by >> introducing a purely time-based rate limit, but I'm uneasy about this >> change in git-gc semantics. > [..] >> With these proposed semantics we'd skip a needed GC (potentially for >> days, depending on the default) just because we recently ran one. >> >> In many environments, such as on busy servers, we could have tens of >> thousands of packs by this point, since this facility would (presumably) >> bypass both gc.autoPackLimit and gc.auto in favor of only running gc at >> a maximum of every N minutes, similarly in a local checkout I could have >> a crapload of loose objects because I ran a big rebase or a >> filter-branch on one of my branches. > > I think we all agree that getting rid of the hack that 'explodes' > unreachable objects into loose objects is the best way forward, long > term. > > Even in that future, some ratelimiting may be useful, though. I also > suspect that we're never going to achieve a perfect set of defaults > that works well for both humans and servers, though we can try. Having read this whole thing over again, and with some time to reflect on it, I think the best way forward for now, in lieu of the bigger solution of consolidating loose objects into a pack, is to split up this special case of warning due to too many loose objects at the end, and any other errors we may print during GC. As noted above if our policy for gc.auto is legitimately exceeded, we deal with that badly by stopping all gc, includin gc that would just run due to too_many_packs(). Instead we should note that we had an error due to too many loose objects last time, but still try to run if the too_many_packs() condition is satisfied. Now we're throwing the baby out with the bathwater and not consolidating packs (for a default period of 2 weeks!) just because some operation produced a lot of loose objects, and not writing bitmaps, commit graph etc. It would also be nice to expose via an exit code "do we need to gc?" and "is gc failing here?" separately from the (currently working) "run gc or report last error + code" mode we have with "gc --auto". ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-17 8:59 ` Ævar Arnfjörð Bjarmason 2018-07-17 14:03 ` Jonathan Nieder @ 2018-07-17 20:27 ` Jeff King 2018-07-18 13:11 ` Ævar Arnfjörð Bjarmason 1 sibling, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-17 20:27 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Jonathan Nieder, Jonathan Tan, git, Andrii Dehtiarov, Nguyễn Thái Ngọc Duy On Tue, Jul 17, 2018 at 10:59:50AM +0200, Ævar Arnfjörð Bjarmason wrote: > That doesn't make sense to me. Just because git itself is happily > ignoring the exit code I don't think that should mean there shouldn't be > a meaningful exit code. Why don't you just ignore the exit code in the > repo tool as well? > > Now e.g. automation I have to see if git-gc ---auto is having issues > can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet > of servers, but will need to start caring if stderr was emitted to. If you're daemonizing gc, though, there are a number of cases where the exit code is not propagated. If you really care about the outcome, you're probably better off either: - doing synchronous gc's, which will still return a meaningful code after Jonathan's patches - inspecting the log yourself. I know that comes close to the un-unixy stderr thing. But in this case, the presence of a non-empty log is literally the on-disk bit for "the previous run failed". And doing so catches all of the daemonized cases, even the "first one" that you'd miss by paying attention to the immediate exit code. This will treat the zero-exit-code "warning" case as an error. I'm not against propagating the true original error code, if somebody wants to work on it. But I think Jonathan's patches are a strict improvement over the current situation, and a patch to propagate could come on top. > I think you're conflating two things here in a way that (if I'm reading > this correctly) produces a pretty bad regression for users. > > a) If we have something in the gc.log we keep yelling at users until we > reach the gc.logExpiry, this was the subject of my thread back in > January. This sucks, and should be fixed somehow. > > Maybe we should only emit the warning once, e.g. creating a > .git/gc.log.wasemitted marker and as long as its ctime is later than > the mtime on .git/gc.log we don't emit it again). > > But that also creates the UX problem that it's easy to miss this > message in the middle of some huge "fetch" output. Perhaps we should > just start emitting this as part of "git status" or something (or > solve the root cause, as Peff notes...). I kind of like that "git status" suggestion. Of course many users run "git status" more often than "git commit", so it may actually spam them more! > b) We *also* use this presence of a gc.log as a marker for "we failed > too recently, let's not try again until after a day". > > The semantics of b) are very useful, and just because they're tied up > with a) right now, let's not throw out b) just because we're trying to > solve a). Yeah. In general my concern was the handling of (b), which I think this last crop of patches is fine with. As far as showing the message repeatedly or not, I don't have a strong opinion (it could even be configurable, once it's split from the "marker" behavior). > Right now one thing we do right is we always try to look at the actual > state of the repo with too_many_packs() and too_many_loose_objects(). > > We don't assume the state of your repo hasn't drastically changed > recently. That means that we do the right thing and gc when the repo > needs it, not just because we GC'd recently enough. > > With these proposed semantics we'd skip a needed GC (potentially for > days, depending on the default) just because we recently ran one. Yeah, I agree that deferring repeated gc's based on time is going to run into pathological corner cases. OTOH, what we've found at GitHub is that "gc --auto" is quite insufficient for scheduling maintenance anyway (e.g., if a machine gets pushes to 100 separate repositories in quick succession, you probably want to queue and throttle any associated gc). But there are probably more mid-size sites that are big enough to have corner cases, but not so big that "gc --auto" is hopeless. ;) -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-17 20:27 ` Jeff King @ 2018-07-18 13:11 ` Ævar Arnfjörð Bjarmason 2018-07-18 17:29 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2018-07-18 13:11 UTC (permalink / raw) To: Jeff King Cc: Jonathan Nieder, Jonathan Tan, git, Andrii Dehtiarov, Nguyễn Thái Ngọc Duy On Tue, Jul 17 2018, Jeff King wrote: > On Tue, Jul 17, 2018 at 10:59:50AM +0200, Ævar Arnfjörð Bjarmason wrote: > >> That doesn't make sense to me. Just because git itself is happily >> ignoring the exit code I don't think that should mean there shouldn't be >> a meaningful exit code. Why don't you just ignore the exit code in the >> repo tool as well? >> >> Now e.g. automation I have to see if git-gc ---auto is having issues >> can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet >> of servers, but will need to start caring if stderr was emitted to. > > If you're daemonizing gc, though, there are a number of cases where the > exit code is not propagated. If you really care about the outcome, > you're probably better off either: In theory a lot of the other stuff can fail, but in practice I've only seen it get stuck due to running out of disk space (easily detected in other ways), and because of having too many loose objects (e.g. due to, but not limited to https://public-inbox.org/git/87fu6bmr0j.fsf@evledraar.gmail.com/). > - doing synchronous gc's, which will still return a meaningful code > after Jonathan's patches (Reply to this and "we've found at GitHub..." later in your mail) I'm targeting clients (dev machines, laptops, staging boxes) which have clones of repos, some put in place by automation, some manually cloned by users in ~. So it's much easier to rely on shipping a /etc/gitconfig than setting gc.auto=0 and having some system-wide job that goes and hunts for local git repos to manually gc. It's also a big advantage that it's driven by user activity, because it's an implicit guard of only_do_this_if_the_user_is_active_here() before "git-gc" on a fleet of machines, which when some of those get their disk space from shared NFS resources cuts down on overall I/O. > - inspecting the log yourself. I know that comes close to the un-unixy > stderr thing. But in this case, the presence of a non-empty log is > literally the on-disk bit for "the previous run failed". And doing > so catches all of the daemonized cases, even the "first one" that > you'd miss by paying attention to the immediate exit code. > > This will treat the zero-exit-code "warning" case as an error. I'm > not against propagating the true original error code, if somebody > wants to work on it. But I think Jonathan's patches are a strict > improvement over the current situation, and a patch to propagate > could come on top. Yeah, I can check gc.log, if others are of a different opinion about this #3 case at the end of the day I don't mind if it returns 0. It just doesn't make any conceptual sense to me. >> I think you're conflating two things here in a way that (if I'm reading >> this correctly) produces a pretty bad regression for users. >> >> a) If we have something in the gc.log we keep yelling at users until we >> reach the gc.logExpiry, this was the subject of my thread back in >> January. This sucks, and should be fixed somehow. >> >> Maybe we should only emit the warning once, e.g. creating a >> .git/gc.log.wasemitted marker and as long as its ctime is later than >> the mtime on .git/gc.log we don't emit it again). >> >> But that also creates the UX problem that it's easy to miss this >> message in the middle of some huge "fetch" output. Perhaps we should >> just start emitting this as part of "git status" or something (or >> solve the root cause, as Peff notes...). > > I kind of like that "git status" suggestion. Of course many users run > "git status" more often than "git commit", so it may actually spam them > more! Maybe piggy-packing on the advice facility ... >> b) We *also* use this presence of a gc.log as a marker for "we failed >> too recently, let's not try again until after a day". >> >> The semantics of b) are very useful, and just because they're tied up >> with a) right now, let's not throw out b) just because we're trying to >> solve a). > > Yeah. In general my concern was the handling of (b), which I think this > last crop of patches is fine with. As far as showing the message > repeatedly or not, I don't have a strong opinion (it could even be > configurable, once it's split from the "marker" behavior). > >> Right now one thing we do right is we always try to look at the actual >> state of the repo with too_many_packs() and too_many_loose_objects(). >> >> We don't assume the state of your repo hasn't drastically changed >> recently. That means that we do the right thing and gc when the repo >> needs it, not just because we GC'd recently enough. >> >> With these proposed semantics we'd skip a needed GC (potentially for >> days, depending on the default) just because we recently ran one. > > Yeah, I agree that deferring repeated gc's based on time is going to run > into pathological corner cases. OTOH, what we've found at GitHub is that > "gc --auto" is quite insufficient for scheduling maintenance anyway > (e.g., if a machine gets pushes to 100 separate repositories in quick > succession, you probably want to queue and throttle any associated gc). I'm sure you have better solutions for this at GitHub, but as an aside it might be interesting to add some sort of gc flock + retry setting for this use-case, i.e. even if you had 100 concurrent gc's due to too_many_*(), they'd wait + retry until they could get the flock on a given file. Then again this is probably too specific, and could be done with a pre-auto-gc hook too.. > But there are probably more mid-size sites that are big enough to have > corner cases, but not so big that "gc --auto" is hopeless. ;) FWIW I don't deal with gc on the server at all these days (that's our internal GitLab's instance problem to solve). I just mentioned the edge case of a growing number of pack files on the server upthread as something that we'd be shipping with git if we had time-based backoff semantics. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-18 13:11 ` Ævar Arnfjörð Bjarmason @ 2018-07-18 17:29 ` Jeff King 0 siblings, 0 replies; 57+ messages in thread From: Jeff King @ 2018-07-18 17:29 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Jonathan Nieder, Jonathan Tan, git, Andrii Dehtiarov, Nguyễn Thái Ngọc Duy On Wed, Jul 18, 2018 at 03:11:38PM +0200, Ævar Arnfjörð Bjarmason wrote: > > Yeah, I agree that deferring repeated gc's based on time is going to run > > into pathological corner cases. OTOH, what we've found at GitHub is that > > "gc --auto" is quite insufficient for scheduling maintenance anyway > > (e.g., if a machine gets pushes to 100 separate repositories in quick > > succession, you probably want to queue and throttle any associated gc). > > I'm sure you have better solutions for this at GitHub, but as an aside > it might be interesting to add some sort of gc flock + retry setting for > this use-case, i.e. even if you had 100 concurrent gc's due to > too_many_*(), they'd wait + retry until they could get the flock on a > given file. > > Then again this is probably too specific, and could be done with a > pre-auto-gc hook too.. Yeah, I think any multi-repo solution is getting way too specific for Git, and the best thing we can do is provide a hook. I agree you could probably do this today with a pre-auto-gc hook (if it skips gc it would just queue itself and return non-zero). Or even just make a mark in a database that says "there was some activity here". Since we have so much other infrastructure sitting between the user and Git anyway, we do that marking at a separate layer which is already talking to a database. ;) Anyway, I do agree with your general notion that this isn't the right approach for many situations, and auto-gc is a better solution for many cases. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-17 1:53 ` Jonathan Nieder 2018-07-17 8:59 ` Ævar Arnfjörð Bjarmason @ 2018-07-17 15:59 ` Duy Nguyen 2018-07-17 18:09 ` Junio C Hamano 2 siblings, 0 replies; 57+ messages in thread From: Duy Nguyen @ 2018-07-17 15:59 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Jeff King, Jonathan Tan, Git Mailing List, adehtiarov On Tue, Jul 17, 2018 at 3:53 AM Jonathan Nieder <jrnieder@gmail.com> wrote: > Subject: gc: do not return error for prior errors in daemonized mode > > Some build machines started failing to fetch updated source using > "repo sync", with error > > error: The last gc run reported the following. Please correct the root cause > and remove /build/.repo/projects/tools/git.git/gc.log. > Automatic cleanup will not be performed until the file is removed. > > warning: There are too many unreachable loose objects; run 'git prune' to remove them. > > The cause takes some time to describe. > > In v2.0.0-rc0~145^2 (gc: config option for running --auto in > background, 2014-02-08), "git gc --auto" learned to run in the > background instead of blocking the invoking command. In this mode, it > closed stderr to avoid interleaving output with any subsequent > commands, causing warnings like the above to be swallowed; v2.6.3~24^2 > (gc: save log from daemonized gc --auto and print it next time, > 2015-09-19) addressed this by storing any diagnostic output in > .git/gc.log and allowing the next "git gc --auto" run to print it. > > To avoid wasteful repeated fruitless gcs, when gc.log is present, the > subsequent "gc --auto" would die after print its contents. Most git > commands, such as "git fetch", ignore the exit status from "git gc > --auto" so all is well at this point: the user gets to see the error > message, and the fetch succeeds, without a wasteful additional attempt > at an automatic gc. > > External tools like repo[1], though, do care about the exit status > from "git gc --auto". In non-daemonized mode, the exit status is > straightforward: if there is an error, it is nonzero, but after a > warning like the above, the status is zero. The daemonized mode, as a > side effect of the other properties provided, offers a very strange > exit code convention: > > - if no housekeeping was required, the exit status is 0 > > - the first real run, after forking into the background, returns exit > status 0 unconditionally. The parent process has no way to know > whether gc will succeed. > > - if there is any diagnostic output in gc.log, subsequent runs return > a nonzero exit status to indicate that gc was not triggered. > > There's nothing for the calling program to act on on the basis of that > error. Use status 0 consistently instead, to indicate that we decided > not to run a gc (just like if no housekeeping was required). This > way, repo and similar tools can get the benefit of the same behavior > as tools like "git fetch" that ignore the exit status from gc --auto. This background gc thing is pretty much designed for interactive use. When it's scripted, you probably should avoid it. Then you can fork a new process by yourself and have much better control if you still want "background" gc. So an alternative here is to turn on or off gc.autodetach based on interactiveness (i.e. having tty). We can add a new (and default) value "auto" to gc.autodetach for this purpose. In automated scripts, it will always run in non-damonized mode. You will get non-zero exit code when real errors occur. You don't worry about hanging processes. Rate limit is also thrown out in this mode if I'm not mistaken, but it could be added back and more tailored for server needs. > Once the period of time described by gc.pruneExpire elapses, the > unreachable loose objects will be removed by "git gc --auto" > automatically. > > [1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/ > > Reported-by: Andrii Dehtiarov <adehtiarov@google.com> > Helped-by: Jeff King <peff@peff.net> > Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> > --- > Documentation/config.txt | 3 ++- > builtin/gc.c | 16 +++++++++++----- > t/t6500-gc.sh | 6 +++--- > 3 files changed, 16 insertions(+), 9 deletions(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1cc18a828c..5eaa4aaa7d 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should go below > gc.autoPackLimit and gc.bigPackThreshold should be respected again. > > gc.logExpiry:: > - If the file gc.log exists, then `git gc --auto` won't run > + If the file gc.log exists, then `git gc --auto` will print > + its content and exit with status zero instead of running > unless that file is more than 'gc.logExpiry' old. Default is > "1.day". See `gc.pruneExpire` for more ways to specify its > value. > diff --git a/builtin/gc.c b/builtin/gc.c > index 2bebc52bda..484ab21b8c 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -438,7 +438,7 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) > return NULL; > } > > -static void report_last_gc_error(void) > +static int report_last_gc_error(void) > { > struct strbuf sb = STRBUF_INIT; > ssize_t ret; > @@ -449,7 +449,7 @@ static void report_last_gc_error(void) > if (errno == ENOENT) > goto done; > > - die_errno(_("cannot stat '%s'"), gc_log_path); > + return error_errno(_("cannot stat '%s'"), gc_log_path); > } > > if (st.st_mtime < gc_log_expire_time) > @@ -457,9 +457,9 @@ static void report_last_gc_error(void) > > ret = strbuf_read_file(&sb, gc_log_path, 0); > if (ret < 0) > - die_errno(_("cannot read '%s'"), gc_log_path); > + return error_errno(_("cannot read '%s'"), gc_log_path); > if (ret > 0) > - die(_("The last gc run reported the following. " > + return error(_("The last gc run reported the following. " > "Please correct the root cause\n" > "and remove %s.\n" > "Automatic cleanup will not be performed " > @@ -469,6 +469,7 @@ static void report_last_gc_error(void) > strbuf_release(&sb); > done: > free(gc_log_path); > + return 0; > } > > static void gc_before_repack(void) > @@ -561,7 +562,12 @@ int cmd_gc(int argc, const char **argv, const char *prefix) > fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); > } > if (detach_auto) { > - report_last_gc_error(); /* dies on error */ > + if (report_last_gc_error()) > + /* > + * A previous gc failed. We've reported the > + * error, so there's nothing left to do. > + */ > + return 0; > > if (lock_repo_for_gc(force, &pid)) > return 0; > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh > index c474a94a9f..3e62df616c 100755 > --- a/t/t6500-gc.sh > +++ b/t/t6500-gc.sh > @@ -116,11 +116,11 @@ test_expect_success 'background auto gc does not run if gc.log is present and re > test_config gc.autopacklimit 1 && > test_config gc.autodetach true && > echo fleem >.git/gc.log && > - test_must_fail git gc --auto 2>err && > - test_i18ngrep "^fatal:" err && > + git gc --auto 2>err && > + test_i18ngrep "^error:" err && > test_config gc.logexpiry 5.days && > test-tool chmtime =-345600 .git/gc.log && > - test_must_fail git gc --auto && > + git gc --auto && > test_config gc.logexpiry 2.days && > run_and_wait_for_auto_gc && > ls .git/objects/pack/pack-*.pack >packs && > -- > 2.18.0.233.g985f88cf7e > -- Duy ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-17 1:53 ` Jonathan Nieder 2018-07-17 8:59 ` Ævar Arnfjörð Bjarmason 2018-07-17 15:59 ` Duy Nguyen @ 2018-07-17 18:09 ` Junio C Hamano 2 siblings, 0 replies; 57+ messages in thread From: Junio C Hamano @ 2018-07-17 18:09 UTC (permalink / raw) To: Jonathan Nieder Cc: Jeff King, Jonathan Tan, git, Andrii Dehtiarov, Nguyễn Thái Ngọc Duy Jonathan Nieder <jrnieder@gmail.com> writes: > That doesn't really solve the problem: > > 1. "gc --auto" would produce noise *every time it runs* until gc.log > is removed, for example via expiry > > 2. "gc --auto" would not do any garbage collection until gc.log is > removed, for example by expiry > > 3. "gc --auto" would still not ratelimit itself, for example when > there are a large number of loose unreachable objects that is not > large enough to hit the loose object threshold. > > but maybe it's better than the present state. > > To solve (1) and (2), we could introduce a gc.warnings file that > behaves like gc.log except that it only gets printed once and then > self-destructs, allowing gc --auto to proceed. That makes it not rate-limit, no? > To solve (3), we could > introduce a gc.lastrun file that is touched whenever "gc --auto" runs > successfully and make "gc --auto" a no-op for a while after each run. Does absolute time-interval make sense here? Some repositories may be busily churning new objects and for them 5-minute interval may be too infrequent, while other repositories may be relatively slow and once a day may be sufficient for them. I wonder if we can somehow auto-tune this. > To avoid wasteful repeated fruitless gcs, when gc.log is present, the > subsequent "gc --auto" would die after print its contents. Most git s/print/&ing/ > commands, such as "git fetch", ignore the exit status from "git gc > --auto" so all is well at this point: the user gets to see the error > message, and the fetch succeeds, without a wasteful additional attempt > at an automatic gc. The above, by saying "Most git commands", leaves readers want to know "then what are minority git commands that do not correctly do so?" If you are not going to answer that question, it probably is better not to even say "Most". > External tools like repo[1], though, do care about the exit status > from "git gc --auto". In non-daemonized mode, the exit status is > straightforward: if there is an error, it is nonzero, but after a > warning like the above, the status is zero. The daemonized mode, as a > side effect of the other properties provided, offers a very strange > exit code convention: > > - if no housekeeping was required, the exit status is 0 OK. > - the first real run, after forking into the background, returns exit > status 0 unconditionally. The parent process has no way to know > whether gc will succeed. Understandable; allowing the parent to exit before we know the outcome of the gc is the whole point of backgrounding. > - if there is any diagnostic output in gc.log, subsequent runs return > a nonzero exit status to indicate that gc was not triggered. This is unfortunate. > There's nothing for the calling program to act on on the basis of that > error. Use status 0 consistently instead, to indicate that we decided > not to run a gc (just like if no housekeeping was required). s/.$/ in the last case./ And I fully agree with the reasoning. > This > way, repo and similar tools can get the benefit of the same behavior > as tools like "git fetch" that ignore the exit status from gc --auto. > > Once the period of time described by gc.pruneExpire elapses, the > unreachable loose objects will be removed by "git gc --auto" > automatically. > > [1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/ > > Reported-by: Andrii Dehtiarov <adehtiarov@google.com> > Helped-by: Jeff King <peff@peff.net> > Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> > --- > Documentation/config.txt | 3 ++- > builtin/gc.c | 16 +++++++++++----- > t/t6500-gc.sh | 6 +++--- > 3 files changed, 16 insertions(+), 9 deletions(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1cc18a828c..5eaa4aaa7d 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should go below > gc.autoPackLimit and gc.bigPackThreshold should be respected again. > > gc.logExpiry:: > - If the file gc.log exists, then `git gc --auto` won't run > + If the file gc.log exists, then `git gc --auto` will print > + its content and exit with status zero instead of running > unless that file is more than 'gc.logExpiry' old. Default is > "1.day". See `gc.pruneExpire` for more ways to specify its > value. > diff --git a/builtin/gc.c b/builtin/gc.c > index 2bebc52bda..484ab21b8c 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -438,7 +438,7 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) > return NULL; > } > > -static void report_last_gc_error(void) > +static int report_last_gc_error(void) > { > struct strbuf sb = STRBUF_INIT; > ssize_t ret; > @@ -449,7 +449,7 @@ static void report_last_gc_error(void) > if (errno == ENOENT) > goto done; > > - die_errno(_("cannot stat '%s'"), gc_log_path); > + return error_errno(_("cannot stat '%s'"), gc_log_path); > } > > if (st.st_mtime < gc_log_expire_time) > @@ -457,9 +457,9 @@ static void report_last_gc_error(void) > > ret = strbuf_read_file(&sb, gc_log_path, 0); > if (ret < 0) > - die_errno(_("cannot read '%s'"), gc_log_path); > + return error_errno(_("cannot read '%s'"), gc_log_path); > if (ret > 0) > - die(_("The last gc run reported the following. " > + return error(_("The last gc run reported the following. " > "Please correct the root cause\n" > "and remove %s.\n" > "Automatic cleanup will not be performed " > @@ -469,6 +469,7 @@ static void report_last_gc_error(void) > strbuf_release(&sb); > done: > free(gc_log_path); > + return 0; > } > > static void gc_before_repack(void) > @@ -561,7 +562,12 @@ int cmd_gc(int argc, const char **argv, const char *prefix) > fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); > } > if (detach_auto) { > - report_last_gc_error(); /* dies on error */ > + if (report_last_gc_error()) > + /* > + * A previous gc failed. We've reported the > + * error, so there's nothing left to do. > + */ > + return 0; > > if (lock_repo_for_gc(force, &pid)) > return 0; > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh > index c474a94a9f..3e62df616c 100755 > --- a/t/t6500-gc.sh > +++ b/t/t6500-gc.sh > @@ -116,11 +116,11 @@ test_expect_success 'background auto gc does not run if gc.log is present and re > test_config gc.autopacklimit 1 && > test_config gc.autodetach true && > echo fleem >.git/gc.log && > - test_must_fail git gc --auto 2>err && > - test_i18ngrep "^fatal:" err && > + git gc --auto 2>err && > + test_i18ngrep "^error:" err && > test_config gc.logexpiry 5.days && > test-tool chmtime =-345600 .git/gc.log && > - test_must_fail git gc --auto && > + git gc --auto && > test_config gc.logexpiry 2.days && > run_and_wait_for_auto_gc && > ls .git/objects/pack/pack-*.pack >packs && ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 17:27 [PATCH] gc: do not warn about too many loose objects Jonathan Tan 2018-07-16 17:51 ` Jeff King @ 2018-07-16 19:15 ` Elijah Newren 2018-07-16 19:19 ` Jonathan Nieder 2018-07-16 19:52 ` Jeff King 2018-07-17 6:51 ` [PATCH v2 0/3] gc --auto: do not return error for prior errors in daemonized mode Jonathan Nieder 2 siblings, 2 replies; 57+ messages in thread From: Elijah Newren @ 2018-07-16 19:15 UTC (permalink / raw) To: git; +Cc: jonathantanmy, peff, jrnieder, Elijah Newren On Mon, Jul 16, 2018 at 10:27 AM, Jonathan Tan <jonathantanmy@google.com> wrote: > In a087cc9819 ("git-gc --auto: protect ourselves from accumulated > cruft", 2007-09-17), the user was warned if there were too many > unreachable loose objects. This made sense at the time, because gc > couldn't prune them safely. But subsequently, git prune learned the > ability to not prune recently created loose objects, making pruning able > to be done more safely, and gc was made to automatically prune old > unreachable loose objects in 25ee9731c1 ("gc: call "prune --expire > 2.weeks.ago" by default", 2008-03-12). ... > > --- > This was noticed when a daemonized gc run wrote this warning to the log > file, and returned 0; but a subsequent run merely read the log file, saw > that it is non-empty and returned -1 (which is inconsistent in that such > a run should return 0, as it did the first time). Yeah, we've hit this several times too. I even created a testcase and a workaround, though I never got it into proper submission form. The basic problem here, at least for us, is that gc has enough information to know it could expunge some objects, but because of how it is structured in terms of several substeps (reflog expiration, repack, prune), the information is lost between the steps and it instead writes them out as unreachable objects. If we could prune (or avoid exploding) loose objects that are only reachable from reflog entries that we are expiring, then the problem goes away for us. (I totally understand that other repos may have enough unreachable objects for other reasons that Peff's suggestion to just pack up unreachable objects is still a really good idea. But on its own, it seems like a waste since it's packing stuff that we know we could just expunge.) Anyway, my very rough testcase is below. The workaround is the git_actual_garbage_collect() function (minus the call to wait_for_background_gc_to_finish). Elijah --- wait_for_background_gc_to_finish() { while ( ps -ef | grep -v grep | grep --quiet git.gc.--auto ); do sleep 1; done } git_standard_garbage_collect() { # Current git gc sprays unreachable objects back in loose form; this is # fine in many cases, but is annoying when done with objects which # newly become unreachable because of something else git-gc does and # git-gc doesn't clean them up. git gc --auto wait_for_background_gc_to_finish } git_actual_garbage_collect() { GITDIR=$(git rev-parse --git-dir) # Record all revisions stored in reflog before and after gc git rev-list --no-walk --reflog >$GITDIR/gc.original-refs git gc --auto wait_for_background_gc_to_finish git rev-list --no-walk --reflog >$GITDIR/gc.final-refs # Find out which reflog entries were removed DELETED_REFS=$(comm -23 <(sort $GITDIR/gc.original-refs) <(sort $GITDIR/gc.final-refs)) # Get the list of objects which used to be reachable, but were made # unreachable due to gc's reflog expiration. To get these, I need # the intersection of things reachable from $DELETED_REFS and things # which are unreachable now. git rev-list --objects $DELETED_REFS --not --all --reflog | awk '{print $1}' >$GITDIR/gc.previously-reachable-objects git prune --expire=now --dry-run | awk '{print $1}' > $GITDIR/gc.unreachable-objects # Delete all the previously-reachable-objects made unreachable by the # reflog expiration done by git gc comm -12 <(sort $GITDIR/gc.unreachable-objects) <(sort $GITDIR/gc.previously-reachable-objects) | sed -e "s#^\(..\)#$GITDIR/objects/\1/#" | xargs rm } test -d testcase && { echo "testcase exists; exiting"; exit 1; } git init testcase/ cd testcase # Create a basic commit echo hello >world git add world git commit -q -m "Initial" # Create a commit with lots of files for i in {0000..9999}; do echo $i >$i; done git add [0-9]* git commit --quiet -m "Lots of numbers" # Pack it all up git gc --quiet # Stop referencing the garbage git reset --quiet --hard HEAD~1 # Pretend we did all the above stuff 30 days ago for rlog in $(find .git/logs -type f); do # Subtract 3E6 (just over 30 days) from every date (assuming dates have # exactly 10 digits, which just happens to be valid...right now at least) perl -i -ne '/(.*?)(\b[0-9]{10}\b)(.*)/ && print $1,$2-3000000,$3,"\n"' $rlog done # HOWEVER: note that the pack is new; if we make the pack old, the old objects # will get pruned for us. But it is quite common to have new packfiles with # old-and-soon-to-be-unreferenced-objects because frequent gc's mean moving # the objects to new packfiles often, and eventually the reflog is expired. # If you want to test them being part of an old packfile, uncomment this: # find .git/objects/pack -type f | xargs touch -t 200001010101 # Create 50 packfiles in the current repo so that 'git gc --auto' will # trigger `git repack -A -d -l` instead of just `git repack -d -l` for i in {01..50}; do git fast-export master | sed -e s/Initial/Initi$i/ | git -c fastimport.unpacklimit=0 fast-import --quiet |& grep -v "Not updating refs/heads/master" done echo "*** Before gc, reflog refers to garbage-collectible commits: ***" git rev-list --no-walk --all --reflog cat .git/logs/refs/heads/master echo "*** Before gc, everything is packed with no loose objects: ***" git count-objects -v git_standard_garbage_collect # Just `git gc --auto` #git_actual_garbage_collect # What I really want from `git gc --auto` echo -e "\n\n*** After gc, commit garbage collected and objects made loose: ***" git rev-list --no-walk --all --reflog cat .git/logs/refs/heads/master git count-objects -v echo -e "\n\n*** Now we can trigger the 'too many unreachable' error: ***" git gc --auto ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 19:15 ` Elijah Newren @ 2018-07-16 19:19 ` Jonathan Nieder 2018-07-16 20:21 ` Elijah Newren 2018-07-16 19:52 ` Jeff King 1 sibling, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-16 19:19 UTC (permalink / raw) To: Elijah Newren; +Cc: git, jonathantanmy, peff Hi, Elijah Newren wrote: > The basic problem here, at least for us, is that gc has enough > information to know it could expunge some objects, but because of how > it is structured in terms of several substeps (reflog expiration, > repack, prune), the information is lost between the steps and it > instead writes them out as unreachable objects. If we could prune (or > avoid exploding) loose objects that are only reachable from reflog > entries that we are expiring, then the problem goes away for us. My understanding is that exploding the objects is intentional behavior, to avoid a race where objects are newly referenced while they are being pruned. I am not a fan of that behavior. It's still racy. But when we've brought it up in the past, the consensus seems to have been that it's better than nothing. Documentation/technical/hash-function-transition.txt section "Loose objects and unreachable objects" describes a way to eliminate the race. Thanks and hope that helps, Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 19:19 ` Jonathan Nieder @ 2018-07-16 20:21 ` Elijah Newren 2018-07-16 20:35 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Elijah Newren @ 2018-07-16 20:21 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Git Mailing List, Jonathan Tan, Jeff King Hi, On Mon, Jul 16, 2018 at 12:19 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > Elijah Newren wrote: > >> The basic problem here, at least for us, is that gc has enough >> information to know it could expunge some objects, but because of how >> it is structured in terms of several substeps (reflog expiration, >> repack, prune), the information is lost between the steps and it >> instead writes them out as unreachable objects. If we could prune (or >> avoid exploding) loose objects that are only reachable from reflog >> entries that we are expiring, then the problem goes away for us. > > My understanding is that exploding the objects is intentional behavior, > to avoid a race where objects are newly referenced while they are being > pruned. > > I am not a fan of that behavior. It's still racy. But when we've > brought it up in the past, the consensus seems to have been that it's > better than nothing. Documentation/technical/hash-function-transition.txt > section "Loose objects and unreachable objects" describes a way to > eliminate the race. Ah, that's good to know and at least makes sense. It seems somewhat odd, though; loose objects that are two weeks old are just as susceptible to being referenced anew by new commits, so the default of running 'git prune --expire=2.weeks.ago' as gc currently does would also be unsafe, wouldn't it? Why is that any more or less unsafe than pruning objects only referenced by reflog entries that are more than 90 days old? Elijah ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 20:21 ` Elijah Newren @ 2018-07-16 20:35 ` Jeff King 2018-07-16 20:56 ` Jonathan Nieder 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 20:35 UTC (permalink / raw) To: Elijah Newren; +Cc: Jonathan Nieder, Git Mailing List, Jonathan Tan On Mon, Jul 16, 2018 at 01:21:40PM -0700, Elijah Newren wrote: > > My understanding is that exploding the objects is intentional behavior, > > to avoid a race where objects are newly referenced while they are being > > pruned. > > > > I am not a fan of that behavior. It's still racy. But when we've > > brought it up in the past, the consensus seems to have been that it's > > better than nothing. Documentation/technical/hash-function-transition.txt > > section "Loose objects and unreachable objects" describes a way to > > eliminate the race. > > Ah, that's good to know and at least makes sense. It seems somewhat > odd, though; loose objects that are two weeks old are just as > susceptible to being referenced anew by new commits, so the default of > running 'git prune --expire=2.weeks.ago' as gc currently does would > also be unsafe, wouldn't it? Why is that any more or less unsafe than > pruning objects only referenced by reflog entries that are more than > 90 days old? The 2-week safety isn't primarily about things which just became unreferenced. It's about things which are in the act of being referenced. Imagine a "git commit" racing with a "git prune". The commit has to create an object, and then it will update a ref to point to it. But between those two actions, prune may racily delete the object! The mtime grace period is what makes that work. Using 2 weeks is sort of ridiculous for that. But it also helps with manual recovery (e.g., imagine a blob added to the index but never committed; 3 days later you may want to try to recover your old work). And you're correct that a new git-commit may still reference an old object (e.g., a blob that's 5 seconds shy of being 2 weeks old that you're including in a new commit). That's why we retain non-fresh objects that are referenced from fresh ones (so as long as you made the new commit recently, it transitively infers freshness on the old blob), and why we fresh mtimes when we elide a write for an existing object. That's _still_ not race-proof, because none of these operations is atomic. git-prune can decide the blob is unfresh at the exact moment you're creating the commit object. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 20:35 ` Jeff King @ 2018-07-16 20:56 ` Jonathan Nieder 2018-07-16 21:12 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-16 20:56 UTC (permalink / raw) To: Jeff King; +Cc: Elijah Newren, Git Mailing List, Jonathan Tan Jeff King wrote: > On Mon, Jul 16, 2018 at 01:21:40PM -0700, Elijah Newren wrote: >> Jonathan Nieder wrote: >>> My understanding is that exploding the objects is intentional behavior, >>> to avoid a race where objects are newly referenced while they are being >>> pruned. [...] >> Ah, that's good to know and at least makes sense. It seems somewhat >> odd, though; loose objects that are two weeks old are just as >> susceptible to being referenced anew by new commits, so the default of >> running 'git prune --expire=2.weeks.ago' as gc currently does would >> also be unsafe, wouldn't it? Why is that any more or less unsafe than >> pruning objects only referenced by reflog entries that are more than >> 90 days old? Part of the answer is that this safety feature applies even when reflogs are not in use. Another part is that as you say, you can start referencing an object at the same time as the reflog entry pointing to it is expiring. [...] > That's why we retain non-fresh > objects that are referenced from fresh ones (so as long as you made the > new commit recently, it transitively infers freshness on the old blob), > and why we fresh mtimes when we elide a write for an existing object. > > That's _still_ not race-proof, because none of these operations is > atomic. Indeed. One of the advantages of using a packfile for unreachable objects is that metadata associated with that packfile can be updated atomically. I don't believe we are very good at transitively propagating freshness today, by the way. Thanks, Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 20:56 ` Jonathan Nieder @ 2018-07-16 21:12 ` Jeff King 0 siblings, 0 replies; 57+ messages in thread From: Jeff King @ 2018-07-16 21:12 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Elijah Newren, Git Mailing List, Jonathan Tan On Mon, Jul 16, 2018 at 01:56:46PM -0700, Jonathan Nieder wrote: > I don't believe we are very good at transitively propagating freshness > today, by the way. Perhaps I don't understand what you mean by transitive here. But we should be traversing from any fresh object and marking any non-fresh ones that are reachable from it to be saved. If you know of a case where we're not, there's a bug, and I'd be happy to look into it. The only problem I know about is the utter lack of atomicity. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 19:15 ` Elijah Newren 2018-07-16 19:19 ` Jonathan Nieder @ 2018-07-16 19:52 ` Jeff King 2018-07-16 20:16 ` Elijah Newren 1 sibling, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 19:52 UTC (permalink / raw) To: Elijah Newren; +Cc: git, jonathantanmy, jrnieder On Mon, Jul 16, 2018 at 12:15:05PM -0700, Elijah Newren wrote: > The basic problem here, at least for us, is that gc has enough > information to know it could expunge some objects, but because of how > it is structured in terms of several substeps (reflog expiration, > repack, prune), the information is lost between the steps and it > instead writes them out as unreachable objects. If we could prune (or > avoid exploding) loose objects that are only reachable from reflog > entries that we are expiring, then the problem goes away for us. (I > totally understand that other repos may have enough unreachable > objects for other reasons that Peff's suggestion to just pack up > unreachable objects is still a really good idea. But on its own, it > seems like a waste since it's packing stuff that we know we could just > expunge.) No, we should have expunged everything that could be during the "repack" and "prune" steps. We feed the expiration time to repack, so that it knows to drop objects entirely instead of exploding them loose. So the leftovers really are objects that cannot be deleted yet. You could literally just do: find .git/objects/?? -type f | perl -lne 's{../.{38}$} and print "$1$2"' | git pack-objects .git/objects/pack/cruft-pack But: - that will explode them out only to repack them, which is inefficient (if they're already packed, you can probably reuse deltas, not to mention the I/O savings) - there's the question of how to handle timestamps. Some of those objects may have been _about_ to expire, but now you've just put them in a brand-new pack that adds another 2 weeks to their life - the find above is sloppy, and will race with somebody adding new objects to the repo So probably you want to have pack-objects write out the list of objects it _would_ explode, rather than exploding them. And then before git-repack deletes the old packs, put those into a new cruft pack. That _just_ leaves the timestamp issue (which is discussed at length in the thread I linked earlier). > git_actual_garbage_collect() { > GITDIR=$(git rev-parse --git-dir) > > # Record all revisions stored in reflog before and after gc > git rev-list --no-walk --reflog >$GITDIR/gc.original-refs > git gc --auto > wait_for_background_gc_to_finish > git rev-list --no-walk --reflog >$GITDIR/gc.final-refs > > # Find out which reflog entries were removed > DELETED_REFS=$(comm -23 <(sort $GITDIR/gc.original-refs) <(sort $GITDIR/gc.final-refs)) This is too detailed, I think. There are other reasons to have unreachable objects than expired reflogs. I think you really just want to consider all unreachable objects (like the pack-objects thing I mentioned above). -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 19:52 ` Jeff King @ 2018-07-16 20:16 ` Elijah Newren 2018-07-16 20:38 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Elijah Newren @ 2018-07-16 20:16 UTC (permalink / raw) To: Jeff King; +Cc: Git Mailing List, Jonathan Tan, Jonathan Nieder On Mon, Jul 16, 2018 at 12:52 PM, Jeff King <peff@peff.net> wrote: > On Mon, Jul 16, 2018 at 12:15:05PM -0700, Elijah Newren wrote: > >> The basic problem here, at least for us, is that gc has enough >> information to know it could expunge some objects, but because of how >> it is structured in terms of several substeps (reflog expiration, >> repack, prune), the information is lost between the steps and it >> instead writes them out as unreachable objects. If we could prune (or >> avoid exploding) loose objects that are only reachable from reflog >> entries that we are expiring, then the problem goes away for us. (I >> totally understand that other repos may have enough unreachable >> objects for other reasons that Peff's suggestion to just pack up >> unreachable objects is still a really good idea. But on its own, it >> seems like a waste since it's packing stuff that we know we could just >> expunge.) > > No, we should have expunged everything that could be during the "repack" > and "prune" steps. We feed the expiration time to repack, so that it > knows to drop objects entirely instead of exploding them loose. Um, except it doesn't actually do that. The testcase I provided shows that it leaves around 10000 objects that are totally deletable and were only previously referenced by reflog entries -- entries that gc removed without removing the corresponding objects. I will note that my testcase was slightly out-of-date; with current git it needs a call to 'wait_for_background_gc_to_finish' right before the 'git gc --quiet' to avoid erroring out. > You > could literally just do: > > find .git/objects/?? -type f | > perl -lne 's{../.{38}$} and print "$1$2"' | > git pack-objects .git/objects/pack/cruft-pack > > But: > > - that will explode them out only to repack them, which is inefficient > (if they're already packed, you can probably reuse deltas, not to > mention the I/O savings) > > - there's the question of how to handle timestamps. Some of those > objects may have been _about_ to expire, but now you've just put > them in a brand-new pack that adds another 2 weeks to their life > > - the find above is sloppy, and will race with somebody adding new > objects to the repo > > So probably you want to have pack-objects write out the list of objects > it _would_ explode, rather than exploding them. And then before > git-repack deletes the old packs, put those into a new cruft pack. That > _just_ leaves the timestamp issue (which is discussed at length in the > thread I linked earlier). > >> git_actual_garbage_collect() { >> GITDIR=$(git rev-parse --git-dir) >> >> # Record all revisions stored in reflog before and after gc >> git rev-list --no-walk --reflog >$GITDIR/gc.original-refs >> git gc --auto >> wait_for_background_gc_to_finish >> git rev-list --no-walk --reflog >$GITDIR/gc.final-refs >> >> # Find out which reflog entries were removed >> DELETED_REFS=$(comm -23 <(sort $GITDIR/gc.original-refs) <(sort $GITDIR/gc.final-refs)) > > This is too detailed, I think. There are other reasons to have > unreachable objects than expired reflogs. I think you really just want > to consider all unreachable objects (like the pack-objects thing I > mentioned above). Yes, like I said, coarse workaround and I never had time to create a real fix. But I thought the testcase might be useful as a demonstration of how git gc leaves around loose objects that were previously reference by reflogs that gc itself pruned. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 20:16 ` Elijah Newren @ 2018-07-16 20:38 ` Jeff King 2018-07-16 21:09 ` Elijah Newren 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 20:38 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List, Jonathan Tan, Jonathan Nieder On Mon, Jul 16, 2018 at 01:16:45PM -0700, Elijah Newren wrote: > >> The basic problem here, at least for us, is that gc has enough > >> information to know it could expunge some objects, but because of how > >> it is structured in terms of several substeps (reflog expiration, > >> repack, prune), the information is lost between the steps and it > >> instead writes them out as unreachable objects. If we could prune (or > >> avoid exploding) loose objects that are only reachable from reflog > >> entries that we are expiring, then the problem goes away for us. (I > >> totally understand that other repos may have enough unreachable > >> objects for other reasons that Peff's suggestion to just pack up > >> unreachable objects is still a really good idea. But on its own, it > >> seems like a waste since it's packing stuff that we know we could just > >> expunge.) > > > > No, we should have expunged everything that could be during the "repack" > > and "prune" steps. We feed the expiration time to repack, so that it > > knows to drop objects entirely instead of exploding them loose. > > Um, except it doesn't actually do that. The testcase I provided shows > that it leaves around 10000 objects that are totally deletable and > were only previously referenced by reflog entries -- entries that gc > removed without removing the corresponding objects. What's your definition of "totally deletable"? If the pack they were in is less than 2 weeks old, then they are unreachable but "fresh", and are intentionally retained. If it was older than 2 weeks, they should have been deleted. If you don't like that policy, you can set gc.pruneExpire to something lower (including "now", but watch out, as that can race with other operations!). But I don't think that's an implementation short-coming. It's intentional to keep those objects. It's just that the format they're in is inefficient and confuses later gc runs. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 20:38 ` Jeff King @ 2018-07-16 21:09 ` Elijah Newren 2018-07-16 21:21 ` Jeff King 2018-07-16 21:31 ` Jonathan Nieder 0 siblings, 2 replies; 57+ messages in thread From: Elijah Newren @ 2018-07-16 21:09 UTC (permalink / raw) To: Jeff King; +Cc: Git Mailing List, Jonathan Tan, Jonathan Nieder On Mon, Jul 16, 2018 at 1:38 PM, Jeff King <peff@peff.net> wrote: > On Mon, Jul 16, 2018 at 01:16:45PM -0700, Elijah Newren wrote: > >> >> The basic problem here, at least for us, is that gc has enough >> >> information to know it could expunge some objects, but because of how >> >> it is structured in terms of several substeps (reflog expiration, >> >> repack, prune), the information is lost between the steps and it >> >> instead writes them out as unreachable objects. If we could prune (or >> >> avoid exploding) loose objects that are only reachable from reflog >> >> entries that we are expiring, then the problem goes away for us. (I >> >> totally understand that other repos may have enough unreachable >> >> objects for other reasons that Peff's suggestion to just pack up >> >> unreachable objects is still a really good idea. But on its own, it >> >> seems like a waste since it's packing stuff that we know we could just >> >> expunge.) >> > >> > No, we should have expunged everything that could be during the "repack" >> > and "prune" steps. We feed the expiration time to repack, so that it >> > knows to drop objects entirely instead of exploding them loose. >> >> Um, except it doesn't actually do that. The testcase I provided shows >> that it leaves around 10000 objects that are totally deletable and >> were only previously referenced by reflog entries -- entries that gc >> removed without removing the corresponding objects. > > What's your definition of "totally deletable"? The point of gc is to: expire reflogs, repack referenced objects, and delete loose objects that (1) are no longer referenced and (2) are "old enough". The "old enough" bit is the special part here. Before the gc, those objects were referenced (only by old reflog entries) and were found in a pack. git currently writes those objects out to disk and gives them the age of the packfile they are contained in, which I think is the source of the bug. We have a reference for those objects from the reflogs, know they aren't referenced anywhere else, so those objects to me are the age of those reflog entries: 90 days. As such, they are "old enough" and should be deleted. With big repos, it's easy to get into situations where there are well more than 10000 objects satisfying these conditions. In fact, it's not all that rare that the repo has far more loose objects after a git gc than before. I never got around to fixing it properly, though, because 'git prune' is such a handy workaround that for now. Having people nuke all their loose objects is a bit risky, but the only other workaround people had was to re-clone (waiting the requisite 20+ minutes for the repo to download) and throw away their old clone. (Though some people even went that route, IIRC.) > If the pack they were in is less than 2 weeks old, then they are > unreachable but "fresh", and are intentionally retained. If it was older > than 2 weeks, they should have been deleted. That's what's currently implemented, yes, but as above I think that logic is bad. > If you don't like that policy, you can set gc.pruneExpire to something > lower (including "now", but watch out, as that can race with other > operations!). But I don't think that's an implementation short-coming. > It's intentional to keep those objects. It's just that the format > they're in is inefficient and confuses later gc runs. I'm aware of pruneExpire, but I think it's the wrong way to handle this. I totally agree with your general plan to put unreferenced loose objects into a pack. However, I don't think these objects should be part of that pack; they should just be deleted instead. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 21:09 ` Elijah Newren @ 2018-07-16 21:21 ` Jeff King 2018-07-16 22:07 ` Elijah Newren 2018-07-16 21:31 ` Jonathan Nieder 1 sibling, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 21:21 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List, Jonathan Tan, Jonathan Nieder On Mon, Jul 16, 2018 at 02:09:43PM -0700, Elijah Newren wrote: > >> Um, except it doesn't actually do that. The testcase I provided shows > >> that it leaves around 10000 objects that are totally deletable and > >> were only previously referenced by reflog entries -- entries that gc > >> removed without removing the corresponding objects. > > > > What's your definition of "totally deletable"? > > The point of gc is to: expire reflogs, repack referenced objects, and > delete loose objects that (1) are no longer referenced and (2) are > "old enough". > > The "old enough" bit is the special part here. Before the gc, those > objects were referenced (only by old reflog entries) and were found in > a pack. git currently writes those objects out to disk and gives them > the age of the packfile they are contained in, which I think is the > source of the bug. We have a reference for those objects from the > reflogs, know they aren't referenced anywhere else, so those objects > to me are the age of those reflog entries: 90 days. As such, they are > "old enough" and should be deleted. OK, I see what you're saying, but... > I never got around to fixing it properly, though, because 'git prune' > is such a handy workaround that for now. Having people nuke all their > loose objects is a bit risky, but the only other workaround people had > was to re-clone (waiting the requisite 20+ minutes for the repo to > download) and throw away their old clone. (Though some people even > went that route, IIRC.) If we were to delete those objects, wouldn't it be exactly the same as running "git prune"? Or setting your gc.pruneExpire to "now" or even "5 minutes"? Or are you concerned with taking other objects along for the ride that weren't part of old reflogs? I think that's a valid concern, but it's also an issue for objects which were previously referenced in a reflog, but are part of another current operation. Also, what do you do if there weren't reflogs in the repo? Or the reflogs were deleted (e.g., because branch deletion drops the associated reflog entirely)? > With big repos, it's easy to get into situations where there are well > more than 10000 objects satisfying these conditions. In fact, it's > not all that rare that the repo has far more loose objects after a git > gc than before. Yes, this is definitely a wart and I think is worth addressing. > I totally agree with your general plan to put unreferenced loose > objects into a pack. However, I don't think these objects should be > part of that pack; they should just be deleted instead. I assume by "these objects" you mean ones which used to be reachable from a reflog, but that reflog entry just expired. I think you'd be sacrificing some race-safety in that case. If the objects went into a pack under a race-proof scheme, would that actually bother you? Is it the 10,000 objects that's a problem, or is it the horrible I/O from exploding them coupled with the annoying auto-gc behavior? -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 21:21 ` Jeff King @ 2018-07-16 22:07 ` Elijah Newren 2018-07-16 22:55 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Elijah Newren @ 2018-07-16 22:07 UTC (permalink / raw) To: Jeff King; +Cc: Git Mailing List, Jonathan Tan, Jonathan Nieder On Mon, Jul 16, 2018 at 2:21 PM, Jeff King <peff@peff.net> wrote: > On Mon, Jul 16, 2018 at 02:09:43PM -0700, Elijah Newren wrote: >> The point of gc is to: expire reflogs, repack referenced objects, and >> delete loose objects that (1) are no longer referenced and (2) are >> "old enough". >> >> The "old enough" bit is the special part here. Before the gc, those >> objects were referenced (only by old reflog entries) and were found in >> a pack. git currently writes those objects out to disk and gives them >> the age of the packfile they are contained in, which I think is the >> source of the bug. We have a reference for those objects from the >> reflogs, know they aren't referenced anywhere else, so those objects >> to me are the age of those reflog entries: 90 days. As such, they are >> "old enough" and should be deleted. > > OK, I see what you're saying, but... > >> I never got around to fixing it properly, though, because 'git prune' >> is such a handy workaround that for now. Having people nuke all their >> loose objects is a bit risky, but the only other workaround people had >> was to re-clone (waiting the requisite 20+ minutes for the repo to >> download) and throw away their old clone. (Though some people even >> went that route, IIRC.) > > If we were to delete those objects, wouldn't it be exactly the same as > running "git prune"? Or setting your gc.pruneExpire to "now" or even "5 > minutes"? Or are you concerned with taking other objects along for the > ride that weren't part of old reflogs? I think that's a valid concern, Yes, I was worried about taking other objects along for the ride that weren't part of old reflogs. > but it's also an issue for objects which were previously referenced in > a reflog, but are part of another current operation. I'm not certain what you're referring to here. > Also, what do you do if there weren't reflogs in the repo? Or the > reflogs were deleted (e.g., because branch deletion drops the associated > reflog entirely)? Yes, there are issues this rule won't help with, but in my experience it was a primary (if not sole) actual cause in practice. (I believe I even said elsewhere in this thread that I knew there were unreachable objects for other reasons and they might also become large in number). At $DAYJOB we've had multiple people including myself hit the "too many unreachable loose objects" nasty loop issue (some of us multiple different times), and as far as I can tell, most (perhaps even all) of them would have been avoided by just "properly" deleting garbage as per my object-age-is-reflog-age-if-not-otherwise-referenced rule. >> With big repos, it's easy to get into situations where there are well >> more than 10000 objects satisfying these conditions. In fact, it's >> not all that rare that the repo has far more loose objects after a git >> gc than before. > > Yes, this is definitely a wart and I think is worth addressing. > >> I totally agree with your general plan to put unreferenced loose >> objects into a pack. However, I don't think these objects should be >> part of that pack; they should just be deleted instead. > > I assume by "these objects" you mean ones which used to be reachable > from a reflog, but that reflog entry just expired. I think you'd be > sacrificing some race-safety in that case. Is that inherently any more race unsafe than 'git prune --expire=2.weeks.ago'? I thought it'd be racy in the same ways, and thus a tradeoff folks are already accepting (at least implicitly) when running git-gc. Since these objects are actually 90 days old rather than a mere two weeks, it actually seemed more safe to me. But maybe I'm overlooking something with the pack->loose transition that makes it more racy? > If the objects went into a pack under a race-proof scheme, would that > actually bother you? Is it the 10,000 objects that's a problem, or is it > the horrible I/O from exploding them coupled with the annoying auto-gc > behavior? Yeah, good point. It's mostly the annoying auto-gc behavior and the horrible I/O of future git operations from having lots of loose objects. They've already been paying the cost of storing those objects in packed form for 90 days; a few more won't hurt much. I'd be slightly annoyed knowing that we're storing garbage we don't need to be, but I don't think it's a real issue. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 22:07 ` Elijah Newren @ 2018-07-16 22:55 ` Jeff King 2018-07-16 23:06 ` Elijah Newren 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-16 22:55 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List, Jonathan Tan, Jonathan Nieder On Mon, Jul 16, 2018 at 03:07:34PM -0700, Elijah Newren wrote: > > If we were to delete those objects, wouldn't it be exactly the same as > > running "git prune"? Or setting your gc.pruneExpire to "now" or even "5 > > minutes"? Or are you concerned with taking other objects along for the > > ride that weren't part of old reflogs? I think that's a valid concern, > > Yes, I was worried about taking other objects along for the ride that > weren't part of old reflogs. > > > but it's also an issue for objects which were previously referenced in > > a reflog, but are part of another current operation. > > I'm not certain what you're referring to here. I mean that an ongoing operation could refer to a blob that just recently became unreachable via reflog pruning. And then we would delete it, leaving the repository corrupt. One of the protections we have against that is that if I ask to write blob XYZ and we find that we already have the object, Git will freshen the mtime of the loose object or pack that contains it, protecting it from pruning. But with your suggestion, we'd delete it immediately, regardless of the mtime of the containing pack. Another way to think of it is this: a reflog mentioning an object does not mean it is the exclusive user of that object. So when we expire it, that does not mean that it is OK to delete it immediately; there may be other users. > > Also, what do you do if there weren't reflogs in the repo? Or the > > reflogs were deleted (e.g., because branch deletion drops the associated > > reflog entirely)? > > Yes, there are issues this rule won't help with, but in my experience > it was a primary (if not sole) actual cause in practice. (I believe I > even said elsewhere in this thread that I knew there were unreachable > objects for other reasons and they might also become large in number). > At $DAYJOB we've had multiple people including myself hit the "too > many unreachable loose objects" nasty loop issue (some of us multiple > different times), and as far as I can tell, most (perhaps even all) of > them would have been avoided by just "properly" deleting garbage as > per my object-age-is-reflog-age-if-not-otherwise-referenced rule. I agree with you that this is a frequent cause, and probably even the primary one. But my concern is that your loosening increases the risk of corruption for other cases. > > I assume by "these objects" you mean ones which used to be reachable > > from a reflog, but that reflog entry just expired. I think you'd be > > sacrificing some race-safety in that case. > > Is that inherently any more race unsafe than 'git prune > --expire=2.weeks.ago'? I thought it'd be racy in the same ways, and > thus a tradeoff folks are already accepting (at least implicitly) when > running git-gc. Since these objects are actually 90 days old rather > than a mere two weeks, it actually seemed more safe to me. But maybe > I'm overlooking something with the pack->loose transition that makes > it more racy? I think it's worse in terms of race-safety because you're losing one of the signals that users of the objects can provide to git-prune to tell it the object is useful: updating the mtime. So yes, you think of the objects as "90 days old", but we don't know if there are other users. Has somebody else been accessing them in the meantime? To be honest, I'm not sure how meaningful that distinction is in practice. The current scheme is still racy, even if the windows are shorter in some circumstances. But it seems like cruft packfiles are a similar amount of work to your scheme, cover more cases, and are slightly safer. And possibly even give us a long-term route to true race-free pruning (via the "garbage pack" mark that Jonathan mentioned). Assuming you buy into the idea that objects in a cruft-pack are not hurting anything aside from a little wasted storage for up to 2 weeks (which it sounds like you're at least partially on board with ;) ). -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 22:55 ` Jeff King @ 2018-07-16 23:06 ` Elijah Newren 0 siblings, 0 replies; 57+ messages in thread From: Elijah Newren @ 2018-07-16 23:06 UTC (permalink / raw) To: Jeff King; +Cc: Git Mailing List, Jonathan Tan, Jonathan Nieder On Mon, Jul 16, 2018 at 3:55 PM, Jeff King <peff@peff.net> wrote: > On Mon, Jul 16, 2018 at 03:07:34PM -0700, Elijah Newren wrote: > >> > If we were to delete those objects, wouldn't it be exactly the same as >> > running "git prune"? Or setting your gc.pruneExpire to "now" or even "5 >> > minutes"? Or are you concerned with taking other objects along for the >> > ride that weren't part of old reflogs? I think that's a valid concern, >> >> Yes, I was worried about taking other objects along for the ride that >> weren't part of old reflogs. >> >> > but it's also an issue for objects which were previously referenced in >> > a reflog, but are part of another current operation. >> >> I'm not certain what you're referring to here. > > I mean that an ongoing operation could refer to a blob that just > recently became unreachable via reflog pruning. And then we would delete > it, leaving the repository corrupt. > > One of the protections we have against that is that if I ask to write > blob XYZ and we find that we already have the object, Git will freshen > the mtime of the loose object or pack that contains it, protecting it > from pruning. But with your suggestion, we'd delete it immediately, > regardless of the mtime of the containing pack. > > Another way to think of it is this: a reflog mentioning an object does > not mean it is the exclusive user of that object. So when we expire it, > that does not mean that it is OK to delete it immediately; there may be > other users. > >> > Also, what do you do if there weren't reflogs in the repo? Or the >> > reflogs were deleted (e.g., because branch deletion drops the associated >> > reflog entirely)? >> >> Yes, there are issues this rule won't help with, but in my experience >> it was a primary (if not sole) actual cause in practice. (I believe I >> even said elsewhere in this thread that I knew there were unreachable >> objects for other reasons and they might also become large in number). >> At $DAYJOB we've had multiple people including myself hit the "too >> many unreachable loose objects" nasty loop issue (some of us multiple >> different times), and as far as I can tell, most (perhaps even all) of >> them would have been avoided by just "properly" deleting garbage as >> per my object-age-is-reflog-age-if-not-otherwise-referenced rule. > > I agree with you that this is a frequent cause, and probably even the > primary one. But my concern is that your loosening increases the risk of > corruption for other cases. > >> > I assume by "these objects" you mean ones which used to be reachable >> > from a reflog, but that reflog entry just expired. I think you'd be >> > sacrificing some race-safety in that case. >> >> Is that inherently any more race unsafe than 'git prune >> --expire=2.weeks.ago'? I thought it'd be racy in the same ways, and >> thus a tradeoff folks are already accepting (at least implicitly) when >> running git-gc. Since these objects are actually 90 days old rather >> than a mere two weeks, it actually seemed more safe to me. But maybe >> I'm overlooking something with the pack->loose transition that makes >> it more racy? > > I think it's worse in terms of race-safety because you're losing one of > the signals that users of the objects can provide to git-prune to tell > it the object is useful: updating the mtime. So yes, you think of the > objects as "90 days old", but we don't know if there are other users. > Has somebody else been accessing them in the meantime? > > To be honest, I'm not sure how meaningful that distinction is in > practice. The current scheme is still racy, even if the windows are > shorter in some circumstances. But it seems like cruft packfiles are > a similar amount of work to your scheme, cover more cases, and are > slightly safer. And possibly even give us a long-term route to true > race-free pruning (via the "garbage pack" mark that Jonathan mentioned). > > Assuming you buy into the idea that objects in a cruft-pack are not > hurting anything aside from a little wasted storage for up to 2 weeks > (which it sounds like you're at least partially on board with ;) ). Ah, I see now. Thanks (to you and Jonathan) for the thorough explanations. I'm totally on board now. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] gc: do not warn about too many loose objects 2018-07-16 21:09 ` Elijah Newren 2018-07-16 21:21 ` Jeff King @ 2018-07-16 21:31 ` Jonathan Nieder 1 sibling, 0 replies; 57+ messages in thread From: Jonathan Nieder @ 2018-07-16 21:31 UTC (permalink / raw) To: Elijah Newren; +Cc: Jeff King, Git Mailing List, Jonathan Tan Elijah Newren wrote: > I totally agree with your general plan to put unreferenced loose > objects into a pack. However, I don't think these objects should be > part of that pack; they should just be deleted instead. This might be the wrong thread to discuss it, but did you follow the reference/prune race that Peff mentioned? The simplest cure I'm aware of to it does involve writing those objects to a pack. The idea is to enforce a straightforward contract: There are two kinds of packs: GC and UNREACHABLE_GARBAGE. Every object in a GC pack has a minimum lifetime of <ttl> (let's say "1 days") from the time they are read. If you start making use of an object from a GC pack (e.g. by creating a new object referencing it), you have three days to ensure it's referenced. Each UNREACHABLE_GARBAGE pack has a <ttl> (let's say "3 days") from the time it is created. Objects in an UNREACHABLE_GARBAGE have no minimum ttl from the time they are read. If you want to start making use of an object from an UNREACHABLE_GARBAGE pack (e.g. by creating a new object referencing it), then copy it and everything it references to a GC pack. To avoid a proliferation of UNREACHABLE_GARBAGE packs, there's a rule for coalescing them, but that's not relevant here. It is perfectly possible for an object in a GC pack to reference an object in an UNREACHABLE_GARBAGE pack via writes racing with gc, but that's fine --- on the next gc run, the unreachable garbage objects get copied to a GC pack. We've been using this on a JGit DfsRepository based server for > 2 years now and it's been working well. More details are in the "Loose objects and unreachable objects" section in Documentation/technical/ mentioned before. Thanks, Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* [PATCH v2 0/3] gc --auto: do not return error for prior errors in daemonized mode 2018-07-16 17:27 [PATCH] gc: do not warn about too many loose objects Jonathan Tan 2018-07-16 17:51 ` Jeff King 2018-07-16 19:15 ` Elijah Newren @ 2018-07-17 6:51 ` Jonathan Nieder 2018-07-17 6:53 ` [PATCH 1/3] gc: improve handling of errors reading gc.log Jonathan Nieder ` (2 more replies) 2 siblings, 3 replies; 57+ messages in thread From: Jonathan Nieder @ 2018-07-17 6:51 UTC (permalink / raw) To: Jonathan Tan Cc: git, Jeff King, Elijah Newren, Nguyễn Thái Ngọc Duy Hi, Jonathan Tan wrote: > In a087cc9819 ("git-gc --auto: protect ourselves from accumulated > cruft", 2007-09-17), the user was warned if there were too many > unreachable loose objects. This made sense at the time, because gc > couldn't prune them safely. But subsequently, git prune learned the > ability to not prune recently created loose objects, making pruning able > to be done more safely, and gc was made to automatically prune old > unreachable loose objects in 25ee9731c1 ("gc: call "prune --expire > 2.weeks.ago" by default", 2008-03-12). > > This makes the warning unactionable by the user, as any loose objects > left are not deleted yet because of safety, and "git prune" is not a > command that the user is recommended to run directly anyway. I agree that given the better alternatives we have now, "git prune" is not such a great option these days. E.g. should it say struct strbuf now = STRBUF_INIT; date_stamp(&now); ... "run 'git gc --prune=%s' to remove them", now.buf); ? > This was noticed when a daemonized gc run wrote this warning to the log > file, and returned 0; but a subsequent run merely read the log file, saw > that it is non-empty and returned -1 (which is inconsistent in that such > a run should return 0, as it did the first time). Here's a series to address that motivating case. Thanks for the careful analysis and to Peff for the patient explanations. Sincerely, Jonathan Nieder (3): gc: improve handling of errors reading gc.log gc: exit with status 128 on failure gc: do not return error for prior errors in daemonized mode Documentation/config.txt | 3 ++- builtin/gc.c | 53 ++++++++++++++++++++++++++-------------- t/t6500-gc.sh | 6 ++--- 3 files changed, 40 insertions(+), 22 deletions(-) ^ permalink raw reply [flat|nested] 57+ messages in thread
* [PATCH 1/3] gc: improve handling of errors reading gc.log 2018-07-17 6:51 ` [PATCH v2 0/3] gc --auto: do not return error for prior errors in daemonized mode Jonathan Nieder @ 2018-07-17 6:53 ` Jonathan Nieder 2018-07-17 18:19 ` Junio C Hamano 2018-07-17 19:58 ` Jeff King 2018-07-17 6:54 ` [PATCH 2/3] gc: exit with status 128 on failure Jonathan Nieder 2018-07-17 6:57 ` [PATCH 3/3] gc: do not return error for prior errors in daemonized mode Jonathan Nieder 2 siblings, 2 replies; 57+ messages in thread From: Jonathan Nieder @ 2018-07-17 6:53 UTC (permalink / raw) To: Jonathan Tan Cc: git, Jeff King, Elijah Newren, Nguyễn Thái Ngọc Duy A collection of minor error handling fixes: - use an error message in lower case, following the usual style - quote filenames in error messages to make them easier to read and to decrease translation load by matching other 'stat' error messages - check for and report errors from 'read', too - avoid being confused by a gc.log larger than INT_MAX bytes Noticed by code inspection. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Split out from https://public-inbox.org/git/20180716225639.GK11513@aiede.svl.corp.google.com/. builtin/gc.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/builtin/gc.c b/builtin/gc.c index ccfb1ceaeb..d69fc4c0b0 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -442,6 +442,7 @@ static int report_last_gc_error(void) { struct strbuf sb = STRBUF_INIT; int ret = 0; + ssize_t len; struct stat st; char *gc_log_path = git_pathdup("gc.log"); @@ -449,15 +450,17 @@ static int report_last_gc_error(void) if (errno == ENOENT) goto done; - ret = error_errno(_("Can't stat %s"), gc_log_path); + ret = error_errno(_("cannot stat '%s'"), gc_log_path); goto done; } if (st.st_mtime < gc_log_expire_time) goto done; - ret = strbuf_read_file(&sb, gc_log_path, 0); - if (ret > 0) + len = strbuf_read_file(&sb, gc_log_path, 0); + if (len < 0) + ret = error_errno(_("cannot read '%s'"), gc_log_path); + else if (len > 0) ret = error(_("The last gc run reported the following. " "Please correct the root cause\n" "and remove %s.\n" -- 2.18.0.233.g985f88cf7e ^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH 1/3] gc: improve handling of errors reading gc.log 2018-07-17 6:53 ` [PATCH 1/3] gc: improve handling of errors reading gc.log Jonathan Nieder @ 2018-07-17 18:19 ` Junio C Hamano 2018-07-17 19:58 ` Jeff King 1 sibling, 0 replies; 57+ messages in thread From: Junio C Hamano @ 2018-07-17 18:19 UTC (permalink / raw) To: Jonathan Nieder Cc: Jonathan Tan, git, Jeff King, Elijah Newren, Nguyễn Thái Ngọc Duy Jonathan Nieder <jrnieder@gmail.com> writes: > - avoid being confused by a gc.log larger than INT_MAX bytes ;-) > > Noticed by code inspection. > > Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> > --- Looks good. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH 1/3] gc: improve handling of errors reading gc.log 2018-07-17 6:53 ` [PATCH 1/3] gc: improve handling of errors reading gc.log Jonathan Nieder 2018-07-17 18:19 ` Junio C Hamano @ 2018-07-17 19:58 ` Jeff King 1 sibling, 0 replies; 57+ messages in thread From: Jeff King @ 2018-07-17 19:58 UTC (permalink / raw) To: Jonathan Nieder Cc: Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy On Mon, Jul 16, 2018 at 11:53:21PM -0700, Jonathan Nieder wrote: > A collection of minor error handling fixes: > > - use an error message in lower case, following the usual style > - quote filenames in error messages to make them easier to read and to > decrease translation load by matching other 'stat' error messages > - check for and report errors from 'read', too > - avoid being confused by a gc.log larger than INT_MAX bytes > > Noticed by code inspection. > > Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Thanks, this all looks obviously good. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* [PATCH 2/3] gc: exit with status 128 on failure 2018-07-17 6:51 ` [PATCH v2 0/3] gc --auto: do not return error for prior errors in daemonized mode Jonathan Nieder 2018-07-17 6:53 ` [PATCH 1/3] gc: improve handling of errors reading gc.log Jonathan Nieder @ 2018-07-17 6:54 ` Jonathan Nieder 2018-07-17 18:22 ` Junio C Hamano 2018-07-17 19:59 ` Jeff King 2018-07-17 6:57 ` [PATCH 3/3] gc: do not return error for prior errors in daemonized mode Jonathan Nieder 2 siblings, 2 replies; 57+ messages in thread From: Jonathan Nieder @ 2018-07-17 6:54 UTC (permalink / raw) To: Jonathan Tan Cc: git, Jeff King, Elijah Newren, Nguyễn Thái Ngọc Duy A value of -1 returned from cmd_gc gets propagated to exit(), resulting in an exit status of 255. Use die instead for a clearer error message and a controlled exit. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- As in https://public-inbox.org/git/20180716225639.GK11513@aiede.svl.corp.google.com/. The only change is splitting out patch 1/3. builtin/gc.c | 35 ++++++++++++++--------------------- t/t6500-gc.sh | 2 +- 2 files changed, 15 insertions(+), 22 deletions(-) diff --git a/builtin/gc.c b/builtin/gc.c index d69fc4c0b0..95c8afd07b 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -438,10 +438,9 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) return NULL; } -static int report_last_gc_error(void) +static void report_last_gc_error(void) { struct strbuf sb = STRBUF_INIT; - int ret = 0; ssize_t len; struct stat st; char *gc_log_path = git_pathdup("gc.log"); @@ -450,8 +449,7 @@ static int report_last_gc_error(void) if (errno == ENOENT) goto done; - ret = error_errno(_("cannot stat '%s'"), gc_log_path); - goto done; + die_errno(_("cannot stat '%s'"), gc_log_path); } if (st.st_mtime < gc_log_expire_time) @@ -459,9 +457,9 @@ static int report_last_gc_error(void) len = strbuf_read_file(&sb, gc_log_path, 0); if (len < 0) - ret = error_errno(_("cannot read '%s'"), gc_log_path); + die_errno(_("cannot read '%s'"), gc_log_path); else if (len > 0) - ret = error(_("The last gc run reported the following. " + die(_("The last gc run reported the following. " "Please correct the root cause\n" "and remove %s.\n" "Automatic cleanup will not be performed " @@ -471,20 +469,18 @@ static int report_last_gc_error(void) strbuf_release(&sb); done: free(gc_log_path); - return ret; } -static int gc_before_repack(void) +static void gc_before_repack(void) { if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, pack_refs_cmd.argv[0]); + die(FAILED_RUN, pack_refs_cmd.argv[0]); if (prune_reflogs && run_command_v_opt(reflog.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, reflog.argv[0]); + die(FAILED_RUN, reflog.argv[0]); pack_refs = 0; prune_reflogs = 0; - return 0; } int cmd_gc(int argc, const char **argv, const char *prefix) @@ -565,13 +561,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix) fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); } if (detach_auto) { - if (report_last_gc_error()) - return -1; + report_last_gc_error(); /* dies on error */ if (lock_repo_for_gc(force, &pid)) return 0; - if (gc_before_repack()) - return -1; + gc_before_repack(); /* dies on failure */ delete_tempfile(&pidfile); /* @@ -611,12 +605,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix) atexit(process_log_file_at_exit); } - if (gc_before_repack()) - return -1; + gc_before_repack(); if (!repository_format_precious_objects) { if (run_command_v_opt(repack.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, repack.argv[0]); + die(FAILED_RUN, repack.argv[0]); if (prune_expire) { argv_array_push(&prune, prune_expire); @@ -626,18 +619,18 @@ int cmd_gc(int argc, const char **argv, const char *prefix) argv_array_push(&prune, "--exclude-promisor-objects"); if (run_command_v_opt(prune.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, prune.argv[0]); + die(FAILED_RUN, prune.argv[0]); } } if (prune_worktrees_expire) { argv_array_push(&prune_worktrees, prune_worktrees_expire); if (run_command_v_opt(prune_worktrees.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, prune_worktrees.argv[0]); + die(FAILED_RUN, prune_worktrees.argv[0]); } if (run_command_v_opt(rerere.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, rerere.argv[0]); + die(FAILED_RUN, rerere.argv[0]); report_garbage = report_pack_garbage; reprepare_packed_git(the_repository); diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index 818435f04e..c474a94a9f 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -117,7 +117,7 @@ test_expect_success 'background auto gc does not run if gc.log is present and re test_config gc.autodetach true && echo fleem >.git/gc.log && test_must_fail git gc --auto 2>err && - test_i18ngrep "^error:" err && + test_i18ngrep "^fatal:" err && test_config gc.logexpiry 5.days && test-tool chmtime =-345600 .git/gc.log && test_must_fail git gc --auto && -- 2.18.0.233.g985f88cf7e ^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH 2/3] gc: exit with status 128 on failure 2018-07-17 6:54 ` [PATCH 2/3] gc: exit with status 128 on failure Jonathan Nieder @ 2018-07-17 18:22 ` Junio C Hamano 2018-07-17 19:59 ` Jeff King 1 sibling, 0 replies; 57+ messages in thread From: Junio C Hamano @ 2018-07-17 18:22 UTC (permalink / raw) To: Jonathan Nieder Cc: Jonathan Tan, git, Jeff King, Elijah Newren, Nguyễn Thái Ngọc Duy Jonathan Nieder <jrnieder@gmail.com> writes: > A value of -1 returned from cmd_gc gets propagated to exit(), > resulting in an exit status of 255. Use die instead for a clearer > error message and a controlled exit. > > Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> > --- > As in > https://public-inbox.org/git/20180716225639.GK11513@aiede.svl.corp.google.com/. > The only change is splitting out patch 1/3. > > builtin/gc.c | 35 ++++++++++++++--------------------- > t/t6500-gc.sh | 2 +- > 2 files changed, 15 insertions(+), 22 deletions(-) I double checked "return' in cmd_gc() and this patch catches them all. Good clean-up. Thanks. > diff --git a/builtin/gc.c b/builtin/gc.c > index d69fc4c0b0..95c8afd07b 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -438,10 +438,9 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) > return NULL; > } > > -static int report_last_gc_error(void) > +static void report_last_gc_error(void) > { > struct strbuf sb = STRBUF_INIT; > - int ret = 0; > ssize_t len; > struct stat st; > char *gc_log_path = git_pathdup("gc.log"); > @@ -450,8 +449,7 @@ static int report_last_gc_error(void) > if (errno == ENOENT) > goto done; > > - ret = error_errno(_("cannot stat '%s'"), gc_log_path); > - goto done; > + die_errno(_("cannot stat '%s'"), gc_log_path); > } > > if (st.st_mtime < gc_log_expire_time) > @@ -459,9 +457,9 @@ static int report_last_gc_error(void) > > len = strbuf_read_file(&sb, gc_log_path, 0); > if (len < 0) > - ret = error_errno(_("cannot read '%s'"), gc_log_path); > + die_errno(_("cannot read '%s'"), gc_log_path); > else if (len > 0) > - ret = error(_("The last gc run reported the following. " > + die(_("The last gc run reported the following. " > "Please correct the root cause\n" > "and remove %s.\n" > "Automatic cleanup will not be performed " > @@ -471,20 +469,18 @@ static int report_last_gc_error(void) > strbuf_release(&sb); > done: > free(gc_log_path); > - return ret; > } > > -static int gc_before_repack(void) > +static void gc_before_repack(void) > { > if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, pack_refs_cmd.argv[0]); > + die(FAILED_RUN, pack_refs_cmd.argv[0]); > > if (prune_reflogs && run_command_v_opt(reflog.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, reflog.argv[0]); > + die(FAILED_RUN, reflog.argv[0]); > > pack_refs = 0; > prune_reflogs = 0; > - return 0; > } > > int cmd_gc(int argc, const char **argv, const char *prefix) > @@ -565,13 +561,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix) > fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); > } > if (detach_auto) { > - if (report_last_gc_error()) > - return -1; > + report_last_gc_error(); /* dies on error */ > > if (lock_repo_for_gc(force, &pid)) > return 0; > - if (gc_before_repack()) > - return -1; > + gc_before_repack(); /* dies on failure */ > delete_tempfile(&pidfile); > > /* > @@ -611,12 +605,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix) > atexit(process_log_file_at_exit); > } > > - if (gc_before_repack()) > - return -1; > + gc_before_repack(); > > if (!repository_format_precious_objects) { > if (run_command_v_opt(repack.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, repack.argv[0]); > + die(FAILED_RUN, repack.argv[0]); > > if (prune_expire) { > argv_array_push(&prune, prune_expire); > @@ -626,18 +619,18 @@ int cmd_gc(int argc, const char **argv, const char *prefix) > argv_array_push(&prune, > "--exclude-promisor-objects"); > if (run_command_v_opt(prune.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, prune.argv[0]); > + die(FAILED_RUN, prune.argv[0]); > } > } > > if (prune_worktrees_expire) { > argv_array_push(&prune_worktrees, prune_worktrees_expire); > if (run_command_v_opt(prune_worktrees.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, prune_worktrees.argv[0]); > + die(FAILED_RUN, prune_worktrees.argv[0]); > } > > if (run_command_v_opt(rerere.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, rerere.argv[0]); > + die(FAILED_RUN, rerere.argv[0]); > > report_garbage = report_pack_garbage; > reprepare_packed_git(the_repository); > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh > index 818435f04e..c474a94a9f 100755 > --- a/t/t6500-gc.sh > +++ b/t/t6500-gc.sh > @@ -117,7 +117,7 @@ test_expect_success 'background auto gc does not run if gc.log is present and re > test_config gc.autodetach true && > echo fleem >.git/gc.log && > test_must_fail git gc --auto 2>err && > - test_i18ngrep "^error:" err && > + test_i18ngrep "^fatal:" err && > test_config gc.logexpiry 5.days && > test-tool chmtime =-345600 .git/gc.log && > test_must_fail git gc --auto && ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH 2/3] gc: exit with status 128 on failure 2018-07-17 6:54 ` [PATCH 2/3] gc: exit with status 128 on failure Jonathan Nieder 2018-07-17 18:22 ` Junio C Hamano @ 2018-07-17 19:59 ` Jeff King 2018-09-17 18:33 ` Jeff King 1 sibling, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-17 19:59 UTC (permalink / raw) To: Jonathan Nieder Cc: Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy On Mon, Jul 16, 2018 at 11:54:16PM -0700, Jonathan Nieder wrote: > A value of -1 returned from cmd_gc gets propagated to exit(), > resulting in an exit status of 255. Use die instead for a clearer > error message and a controlled exit. This feels a little funny because we know we're going to turn some of these back in the next patch. That said, I'm OK with it, since this version is already written. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH 2/3] gc: exit with status 128 on failure 2018-07-17 19:59 ` Jeff King @ 2018-09-17 18:33 ` Jeff King 2018-09-17 18:40 ` Jonathan Nieder 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-09-17 18:33 UTC (permalink / raw) To: Jonathan Nieder Cc: Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy On Tue, Jul 17, 2018 at 03:59:47PM -0400, Jeff King wrote: > On Mon, Jul 16, 2018 at 11:54:16PM -0700, Jonathan Nieder wrote: > > > A value of -1 returned from cmd_gc gets propagated to exit(), > > resulting in an exit status of 255. Use die instead for a clearer > > error message and a controlled exit. > > This feels a little funny because we know we're going to turn some of > these back in the next patch. That said, I'm OK with it, since this > version is already written. There's discussion elsewhere[1] of applying just up to patch 2. Do we still want to convert these cases to die() as their end-state? Another alternative would be to just convert a "-1" return to 128 or similar at the level of cmd_gc(), which avoids the 255 weirdness. Doing so also keeps the error messages the same as (as "error" and not "fatal"). I'm not sure which we prefer. It also makes the code more flexible and lib-ifiable (since the caller can decide how to handle the errors). That probably doesn't matter much since this is all static-local to builtin/gc.c, though I suppose in theory we could eventually do parts of "gc --auto" without spawning a separate process. -Peff [1] https://public-inbox.org/git/20180917182210.GB3894@sigill.intra.peff.net/ ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH 2/3] gc: exit with status 128 on failure 2018-09-17 18:33 ` Jeff King @ 2018-09-17 18:40 ` Jonathan Nieder 2018-09-18 17:30 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-09-17 18:40 UTC (permalink / raw) To: Jeff King Cc: Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy Hi, Jeff King wrote: > On Tue, Jul 17, 2018 at 03:59:47PM -0400, Jeff King wrote: >> On Mon, Jul 16, 2018 at 11:54:16PM -0700, Jonathan Nieder wrote: >>> A value of -1 returned from cmd_gc gets propagated to exit(), >>> resulting in an exit status of 255. Use die instead for a clearer >>> error message and a controlled exit. >> >> This feels a little funny because we know we're going to turn some of >> these back in the next patch. That said, I'm OK with it, since this >> version is already written. > > There's discussion elsewhere[1] of applying just up to patch 2. > > Do we still want to convert these cases to die() as their end-state? IMHO yes, we do. die() is the function that you can use to exit with a fatal error. If we want to get rid of die(), that would be a tree-wide effort, not something that should hold up this patch. [...] > It also makes the code more flexible and lib-ifiable (since the caller > can decide how to handle the errors). That probably doesn't matter much > since this is all static-local to builtin/gc.c, Exactly. I'm a strong believer in http://wiki.c2.com/?YouArentGonnaNeedIt. Thanks, Jonathan ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH 2/3] gc: exit with status 128 on failure 2018-09-17 18:40 ` Jonathan Nieder @ 2018-09-18 17:30 ` Jeff King 0 siblings, 0 replies; 57+ messages in thread From: Jeff King @ 2018-09-18 17:30 UTC (permalink / raw) To: Jonathan Nieder Cc: Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy On Mon, Sep 17, 2018 at 11:40:12AM -0700, Jonathan Nieder wrote: > > There's discussion elsewhere[1] of applying just up to patch 2. > > > > Do we still want to convert these cases to die() as their end-state? > > IMHO yes, we do. die() is the function that you can use to exit with > a fatal error. > > If we want to get rid of die(), that would be a tree-wide effort, not > something that should hold up this patch. But that was sort of my question. I think there are people who _do_ want to get rid of most die() calls (like Dscho), and there is a tree-wide effort that is happening slowly to lib-ify. Your patch goes in the opposite direction. That said, I think there are actually two cases in your patch. The calls to "return error()" or even just "return -1" in cmd_gc() seem like obvious candidates for die(). We're at the top of the stack, and anybody lib-ifying at that level is going to need to extract bits into reusable functions anyway. I more wondered about helpers like report_last_gc_error() and gc_before_repack(). > > It also makes the code more flexible and lib-ifiable (since the caller > > can decide how to handle the errors). That probably doesn't matter much > > since this is all static-local to builtin/gc.c, > > Exactly. I'm a strong believer in http://wiki.c2.com/?YouArentGonnaNeedIt. I only half-agree that this is YAGNI. If it were "let's punt on making this code friendlier to lib-ification", I'd agree more completely. But it's actually taking an active step in the opposite direction. I dunno. It's probably not worth spending too much more time discussing, and I'm OK either way. I mostly just wanted to raise the issue since dropping patch 3 changes the balance to me. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* [PATCH 3/3] gc: do not return error for prior errors in daemonized mode 2018-07-17 6:51 ` [PATCH v2 0/3] gc --auto: do not return error for prior errors in daemonized mode Jonathan Nieder 2018-07-17 6:53 ` [PATCH 1/3] gc: improve handling of errors reading gc.log Jonathan Nieder 2018-07-17 6:54 ` [PATCH 2/3] gc: exit with status 128 on failure Jonathan Nieder @ 2018-07-17 6:57 ` Jonathan Nieder 2018-07-17 20:13 ` Jeff King 2 siblings, 1 reply; 57+ messages in thread From: Jonathan Nieder @ 2018-07-17 6:57 UTC (permalink / raw) To: Jonathan Tan Cc: git, Jeff King, Elijah Newren, Nguyễn Thái Ngọc Duy, Andrii Dehtiarov Some build machines started consistently failing to fetch updated source using "repo sync", with error error: The last gc run reported the following. Please correct the root cause and remove /build/.repo/projects/tools/git.git/gc.log. Automatic cleanup will not be performed until the file is removed. warning: There are too many unreachable loose objects; run 'git prune' to remove them. The cause takes some time to describe. In v2.0.0-rc0~145^2 (gc: config option for running --auto in background, 2014-02-08), "git gc --auto" learned to run in the background instead of blocking the invoking command. In this mode, it closed stderr to avoid interleaving output with any subsequent commands, causing warnings like the above to be swallowed; v2.6.3~24^2 (gc: save log from daemonized gc --auto and print it next time, 2015-09-19) addressed that by storing any diagnostic output in .git/gc.log and allowing the next "git gc --auto" run to print it. To avoid wasteful repeated fruitless gcs, when gc.log is present, the subsequent "gc --auto" would die after printing its contents. Most git commands, such as "git fetch", ignore the exit status from "git gc --auto" so all is well at this point: the user gets to see the error message, and the fetch succeeds, without a wasteful additional attempt at an automatic gc. External tools like repo[1], though, do care about the exit status from "git gc --auto". In non-daemonized mode, the exit status is straightforward: if there is an error, it is nonzero, but after a warning like the above, the status is zero. The daemonized mode, as a side effect of the other properties provided, offers a very strange exit code convention: - if no housekeeping was required, the exit status is 0 - the first real run, after forking into the background, returns exit status 0 unconditionally. The parent process has no way to know whether gc will succeed. - if there is any diagnostic output in gc.log, subsequent runs return a nonzero exit status to indicate that gc was not triggered. There's nothing for the calling program to act on on the basis of that error. Use status 0 consistently instead, to indicate that we decided not to run a gc (just like if no housekeeping was required). This way, repo and similar tools can get the benefit of the same behavior as tools like "git fetch" that ignore the exit status from gc --auto. Once the period of time described by gc.pruneExpire elapses, the unreachable loose objects will be removed by "git gc --auto" automatically. [1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/ Reported-by: Andrii Dehtiarov <adehtiarov@google.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Changes from v1: - minor commit message tweaks - using warning() instead of error() - avoiding leaking gc_log_path in report_last_gc_error - only changing the exit status when reporting a prior error, not a new one Thoughts of all kinds welcome, as always. Sincerely, Jonathan Documentation/config.txt | 3 ++- builtin/gc.c | 33 +++++++++++++++++++++++++++------ t/t6500-gc.sh | 6 +++--- 3 files changed, 32 insertions(+), 10 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1cc18a828c..5eaa4aaa7d 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should go below gc.autoPackLimit and gc.bigPackThreshold should be respected again. gc.logExpiry:: - If the file gc.log exists, then `git gc --auto` won't run + If the file gc.log exists, then `git gc --auto` will print + its content and exit with status zero instead of running unless that file is more than 'gc.logExpiry' old. Default is "1.day". See `gc.pruneExpire` for more ways to specify its value. diff --git a/builtin/gc.c b/builtin/gc.c index 95c8afd07b..ce8a663a01 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -438,9 +438,15 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) return NULL; } -static void report_last_gc_error(void) +/* + * Returns 0 if there was no previous error and gc can proceed, 1 if + * gc should not proceed due to an error in the last run. Prints a + * message and returns -1 if an error occured while reading gc.log + */ +static int report_last_gc_error(void) { struct strbuf sb = STRBUF_INIT; + int ret = 0; ssize_t len; struct stat st; char *gc_log_path = git_pathdup("gc.log"); @@ -449,7 +455,8 @@ static void report_last_gc_error(void) if (errno == ENOENT) goto done; - die_errno(_("cannot stat '%s'"), gc_log_path); + ret = error_errno(_("cannot stat '%s'"), gc_log_path); + goto done; } if (st.st_mtime < gc_log_expire_time) @@ -457,18 +464,26 @@ static void report_last_gc_error(void) len = strbuf_read_file(&sb, gc_log_path, 0); if (len < 0) - die_errno(_("cannot read '%s'"), gc_log_path); - else if (len > 0) - die(_("The last gc run reported the following. " + ret = error_errno(_("cannot read '%s'"), gc_log_path); + else if (len > 0) { + /* + * A previous gc failed. Report the error, and don't + * bother with an automatic gc run since it is likely + * to fail in the same way. + */ + warning(_("The last gc run reported the following. " "Please correct the root cause\n" "and remove %s.\n" "Automatic cleanup will not be performed " "until the file is removed.\n\n" "%s"), gc_log_path, sb.buf); + ret = 1; + } strbuf_release(&sb); done: free(gc_log_path); + return ret; } static void gc_before_repack(void) @@ -561,7 +576,13 @@ int cmd_gc(int argc, const char **argv, const char *prefix) fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); } if (detach_auto) { - report_last_gc_error(); /* dies on error */ + int ret = report_last_gc_error(); + if (ret < 0) + /* an I/O error occured, already reported */ + exit(128); + if (ret == 1) + /* Last gc --auto failed. Skip this one. */ + return 0; if (lock_repo_for_gc(force, &pid)) return 0; diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index c474a94a9f..a222efdbe1 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -116,11 +116,11 @@ test_expect_success 'background auto gc does not run if gc.log is present and re test_config gc.autopacklimit 1 && test_config gc.autodetach true && echo fleem >.git/gc.log && - test_must_fail git gc --auto 2>err && - test_i18ngrep "^fatal:" err && + git gc --auto 2>err && + test_i18ngrep "^warning:" err && test_config gc.logexpiry 5.days && test-tool chmtime =-345600 .git/gc.log && - test_must_fail git gc --auto && + git gc --auto && test_config gc.logexpiry 2.days && run_and_wait_for_auto_gc && ls .git/objects/pack/pack-*.pack >packs && -- 2.18.0.233.g985f88cf7e ^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH 3/3] gc: do not return error for prior errors in daemonized mode 2018-07-17 6:57 ` [PATCH 3/3] gc: do not return error for prior errors in daemonized mode Jonathan Nieder @ 2018-07-17 20:13 ` Jeff King 2018-07-18 16:21 ` Junio C Hamano 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-17 20:13 UTC (permalink / raw) To: Jonathan Nieder Cc: Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy, Andrii Dehtiarov On Mon, Jul 16, 2018 at 11:57:40PM -0700, Jonathan Nieder wrote: > External tools like repo[1], though, do care about the exit status > from "git gc --auto". In non-daemonized mode, the exit status is > straightforward: if there is an error, it is nonzero, but after a > warning like the above, the status is zero. The daemonized mode, as a > side effect of the other properties provided, offers a very strange > exit code convention: > > - if no housekeeping was required, the exit status is 0 > > - the first real run, after forking into the background, returns exit > status 0 unconditionally. The parent process has no way to know > whether gc will succeed. > > - if there is any diagnostic output in gc.log, subsequent runs return > a nonzero exit status to indicate that gc was not triggered. > > There's nothing for the calling program to act on on the basis of that > error. Use status 0 consistently instead, to indicate that we decided > not to run a gc (just like if no housekeeping was required). This > way, repo and similar tools can get the benefit of the same behavior > as tools like "git fetch" that ignore the exit status from gc --auto. I think this is a good change. In theory it might be useful to propagate the original exit code (_not_ "did we see a warning or an error", but the true original exit code. But as you note, it's not deterministic anyway (we'd miss that exit code on the first run, or even any simultaneous runs that exit early due to lock contention). So it's clear that callers can't really do anything robust based on the exit code of a daemonized "gc --auto". I still think that "repo" should probably stop respecting the exit code. But that's no excuse for Git not to have a sensible exit code in the first place. The patch itself looks good overall. A few comments (none of which I think even requires a fix): > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1cc18a828c..5eaa4aaa7d 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should go below > gc.autoPackLimit and gc.bigPackThreshold should be respected again. > > gc.logExpiry:: > - If the file gc.log exists, then `git gc --auto` won't run > + If the file gc.log exists, then `git gc --auto` will print > + its content and exit with status zero instead of running > unless that file is more than 'gc.logExpiry' old. Default is > "1.day". See `gc.pruneExpire` for more ways to specify its > value. Yeah, this is definitely worth documenting. I was surprised there's no discussion of daemonization at all in git-gc(1). I don't think adding it is a blocker for this series, though. > static void gc_before_repack(void) > @@ -561,7 +576,13 @@ int cmd_gc(int argc, const char **argv, const char *prefix) > fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); > } > if (detach_auto) { > - report_last_gc_error(); /* dies on error */ > + int ret = report_last_gc_error(); > + if (ret < 0) > + /* an I/O error occured, already reported */ > + exit(128); We have a few exit(128)'s sprinkled throughout the code-base, and I always wonder if they will one day go stale if we change the code that die() uses. But it probably doesn't matter, and anyway I don't think there is a better way to do this currently. I would have written "return 128" since the other case arm also returns, but I really cannot think of a reason to prefer one over the other. > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh > index c474a94a9f..a222efdbe1 100755 > --- a/t/t6500-gc.sh > +++ b/t/t6500-gc.sh > @@ -116,11 +116,11 @@ test_expect_success 'background auto gc does not run if gc.log is present and re > test_config gc.autopacklimit 1 && > test_config gc.autodetach true && > echo fleem >.git/gc.log && > - test_must_fail git gc --auto 2>err && > - test_i18ngrep "^fatal:" err && > + git gc --auto 2>err && > + test_i18ngrep "^warning:" err && > test_config gc.logexpiry 5.days && > test-tool chmtime =-345600 .git/gc.log && > - test_must_fail git gc --auto && > + git gc --auto && Nice. At first I thought this was changing an existing test to cover the new case (which I usually frown on), but it is just that your patch is intentionally changing the case covered here. So this is the right thing to do. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH 3/3] gc: do not return error for prior errors in daemonized mode 2018-07-17 20:13 ` Jeff King @ 2018-07-18 16:21 ` Junio C Hamano 2018-07-18 17:22 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Junio C Hamano @ 2018-07-18 16:21 UTC (permalink / raw) To: Jeff King Cc: Jonathan Nieder, Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy, Andrii Dehtiarov Jeff King <peff@peff.net> writes: >> There's nothing for the calling program to act on on the basis of that >> error. Use status 0 consistently instead, to indicate that we decided >> not to run a gc (just like if no housekeeping was required). This >> way, repo and similar tools can get the benefit of the same behavior >> as tools like "git fetch" that ignore the exit status from gc --auto. > > I think this is a good change. > > In theory it might be useful to propagate the original exit code (_not_ > "did we see a warning or an error", but the true original exit code. But > as you note, it's not deterministic anyway (we'd miss that exit code on > the first run, or even any simultaneous runs that exit early due to lock > contention). So it's clear that callers can't really do anything robust > based on the exit code of a daemonized "gc --auto". > > I still think that "repo" should probably stop respecting the exit code. > But that's no excuse for Git not to have a sensible exit code in the > first place. I am not yet convinced that this last step to exit with 0 is a good change, even though I can understand that it would be more convenient, as there currently is no easy way for the calling script to tell two error cases apart. I think the "sensible exit code" you mention would be something like "1 for hard error, 2 for 'I am punting as I see there were previous errors---you may want to examine your repository'". If we did that from day one and documented that behaviour, nobody would have complained, but adopting that suddenly is of course a breaking change. Perhaps we should exit with 2 (not 0) in that "previous error" case by default, and then have a configuration knob to turn that 2 into 0 for those who cannot easily modify the calling script? That way, we by default will *not* break those who have been paying attention to zero-ness of the exit status, we allow those who want to treat this "prior error" case as if there were no error with just a knob, and then those who are willing to update their script can tell two cases by the exit status and act differently. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH 3/3] gc: do not return error for prior errors in daemonized mode 2018-07-18 16:21 ` Junio C Hamano @ 2018-07-18 17:22 ` Jeff King 2018-07-18 18:19 ` Junio C Hamano 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-18 17:22 UTC (permalink / raw) To: Junio C Hamano Cc: Jonathan Nieder, Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy, Andrii Dehtiarov On Wed, Jul 18, 2018 at 09:21:18AM -0700, Junio C Hamano wrote: > > I still think that "repo" should probably stop respecting the exit code. > > But that's no excuse for Git not to have a sensible exit code in the > > first place. > > I am not yet convinced that this last step to exit with 0 is a good > change, even though I can understand that it would be more > convenient, as there currently is no easy way for the calling script > to tell two error cases apart. > > I think the "sensible exit code" you mention would be something like > "1 for hard error, 2 for 'I am punting as I see there were previous > errors---you may want to examine your repository'". > > If we did that from day one and documented that behaviour, nobody > would have complained, but adopting that suddenly is of course a > breaking change. > > Perhaps we should exit with 2 (not 0) in that "previous error" case > by default, and then have a configuration knob to turn that 2 into 0 > for those who cannot easily modify the calling script? That way, we > by default will *not* break those who have been paying attention to > zero-ness of the exit status, we allow those who want to treat this > "prior error" case as if there were no error with just a knob, and > then those who are willing to update their script can tell two cases > by the exit status and act differently. I think we have been exiting non-zero with "previous errors" for some time with the daemonizing code. It was just spelled "-1" instead of "2". So just jumping right there does not mean any regression from the current state, I don't think (but it also does not fix existing scripts like "repo" that check the code). I agree the config you suggest would give people the tools to make that case work. But it somehow rubs me the wrong way. Can you imagine the perspective of a user who is told "oh, your script breaks? Just try setting this option to ignore error codes in this one particular situation". It feels like a weird hack, because it is. It's also still inconsistent in the daemonize case. The run that yields the error won't return a non-zero exit. But the next run will exit with "2". -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH 3/3] gc: do not return error for prior errors in daemonized mode 2018-07-18 17:22 ` Jeff King @ 2018-07-18 18:19 ` Junio C Hamano 2018-07-18 19:06 ` Jeff King 0 siblings, 1 reply; 57+ messages in thread From: Junio C Hamano @ 2018-07-18 18:19 UTC (permalink / raw) To: Jeff King Cc: Jonathan Nieder, Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy, Andrii Dehtiarov Jeff King <peff@peff.net> writes: >> Perhaps we should exit with 2 (not 0) in that "previous error" case >> by default, and then have a configuration knob to turn that 2 into 0 >> for those who cannot easily modify the calling script? That way, we >> by default will *not* break those who have been paying attention to >> zero-ness of the exit status, we allow those who want to treat this >> "prior error" case as if there were no error with just a knob, and >> then those who are willing to update their script can tell two cases >> by the exit status and act differently. > > I think we have been exiting non-zero with "previous errors" for some > time with the daemonizing code. It was just spelled "-1" instead of "2". > So just jumping right there does not mean any regression from the > current state, I don't think (but it also does not fix existing scripts > like "repo" that check the code). Correct. That is why I suggested that default setting. However. If we change it so that by default we exit with 0 but allow those who care to see 2 with a knob, that would be a regression to those who actually were acting on the non-zero exit, but they no longer have to scrape the log/warning output if they choose to set the knob, so the "convenience" factor might argue in favor of the "by default we return 0 but you can choose to get the 'sensible exit codes'" approach. I do not have a strong opinion either way wrt which one should be the default. > I agree the config you suggest would give people the tools to make that > case work. But it somehow rubs me the wrong way. Can you imagine the > perspective of a user who is told "oh, your script breaks? Just try > setting this option to ignore error codes in this one particular > situation". It feels like a weird hack, because it is. Well, the message is "your script is broken (and has always been when 'gc --auto' saw that the previous round failed), but now we have two ways to help you correct. An easier but broken way is to turn this knob to turn 2 to 0 to hide the issue under the rug; or you can notice the difference between $?==1 and $?==2 and act differently". "turn 2 to 0" is a weird hack, but they are also given the option of doing the right thing, so... > It's also still inconsistent in the daemonize case. The run that yields > the error won't return a non-zero exit. But the next run will exit with > "2". I do not see this particular "inconsistency" a problem. The run that first discovers the problem by definition cannot return with non-zero; not waiting until finding out the outcome is the whole point of running in the background and giving control back early. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH 3/3] gc: do not return error for prior errors in daemonized mode 2018-07-18 18:19 ` Junio C Hamano @ 2018-07-18 19:06 ` Jeff King 2018-07-18 19:55 ` Junio C Hamano 0 siblings, 1 reply; 57+ messages in thread From: Jeff King @ 2018-07-18 19:06 UTC (permalink / raw) To: Junio C Hamano Cc: Jonathan Nieder, Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy, Andrii Dehtiarov On Wed, Jul 18, 2018 at 11:19:01AM -0700, Junio C Hamano wrote: > > It's also still inconsistent in the daemonize case. The run that yields > > the error won't return a non-zero exit. But the next run will exit with > > "2". > > I do not see this particular "inconsistency" a problem. The run > that first discovers the problem by definition cannot return with > non-zero; not waiting until finding out the outcome is the whole > point of running in the background and giving control back early. I guess I just see it as encouraging a non-robust flow. I can see somebody thinking they should care about "gc --auto" errors, because they can turn up latent repository corruption (because most operations try to only look at the objects they need to, and gc inherently considers every object). But doing so would give a very delayed notification for a repository that is handled infrequently. So finding out about a corruption that we detected might takes weeks or months. I dunno. If your primary motivation is not finding latent problems, but loudly complaining when gc does not make forward progress, I suppose it makes more sense. -Peff ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH 3/3] gc: do not return error for prior errors in daemonized mode 2018-07-18 19:06 ` Jeff King @ 2018-07-18 19:55 ` Junio C Hamano 0 siblings, 0 replies; 57+ messages in thread From: Junio C Hamano @ 2018-07-18 19:55 UTC (permalink / raw) To: Jeff King Cc: Jonathan Nieder, Jonathan Tan, git, Elijah Newren, Nguyễn Thái Ngọc Duy, Andrii Dehtiarov Jeff King <peff@peff.net> writes: > On Wed, Jul 18, 2018 at 11:19:01AM -0700, Junio C Hamano wrote: > >> > It's also still inconsistent in the daemonize case. The run that yields >> > the error won't return a non-zero exit. But the next run will exit with >> > "2". >> >> I do not see this particular "inconsistency" a problem. The run >> that first discovers the problem by definition cannot return with >> non-zero; not waiting until finding out the outcome is the whole >> point of running in the background and giving control back early. > > I guess I just see it as encouraging a non-robust flow. I can see > somebody thinking they should care about "gc --auto" errors, because > they can turn up latent repository corruption (because most operations > try to only look at the objects they need to, and gc inherently > considers every object). But doing so would give a very delayed > notification for a repository that is handled infrequently. So finding > out about a corruption that we detected might takes weeks or months. > > I dunno. If your primary motivation is not finding latent problems, but > loudly complaining when gc does not make forward progress, I suppose it > makes more sense. I am not sure either. If your primary motivation is to protect yourself from ignoring gc failures that could be an indication of possible repository corruption, you wouldn't be running it backgrounded in the first place, I woudl guess. And if you are backgrounding, with "exit with 2, not hide it with 0" approach, you would have a better chance to notice, no? ^ permalink raw reply [flat|nested] 57+ messages in thread
end of thread, other threads:[~2018-09-18 17:30 UTC | newest] Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-07-16 17:27 [PATCH] gc: do not warn about too many loose objects Jonathan Tan 2018-07-16 17:51 ` Jeff King 2018-07-16 18:22 ` Jonathan Nieder 2018-07-16 18:52 ` Jeff King 2018-07-16 19:09 ` Jonathan Nieder 2018-07-16 19:41 ` Jeff King 2018-07-16 19:54 ` Jonathan Nieder 2018-07-16 20:29 ` Jeff King 2018-07-16 20:37 ` Jonathan Nieder 2018-07-16 21:09 ` Jeff King 2018-07-16 21:40 ` Jonathan Nieder 2018-07-16 21:45 ` Jeff King 2018-07-16 22:03 ` Jonathan Nieder 2018-07-16 22:43 ` Jeff King 2018-07-16 22:56 ` Jonathan Nieder 2018-07-16 23:26 ` Jeff King 2018-07-17 1:53 ` Jonathan Nieder 2018-07-17 8:59 ` Ævar Arnfjörð Bjarmason 2018-07-17 14:03 ` Jonathan Nieder 2018-07-17 15:24 ` Ævar Arnfjörð Bjarmason 2018-07-17 20:27 ` Jeff King 2018-07-18 13:11 ` Ævar Arnfjörð Bjarmason 2018-07-18 17:29 ` Jeff King 2018-07-17 15:59 ` Duy Nguyen 2018-07-17 18:09 ` Junio C Hamano 2018-07-16 19:15 ` Elijah Newren 2018-07-16 19:19 ` Jonathan Nieder 2018-07-16 20:21 ` Elijah Newren 2018-07-16 20:35 ` Jeff King 2018-07-16 20:56 ` Jonathan Nieder 2018-07-16 21:12 ` Jeff King 2018-07-16 19:52 ` Jeff King 2018-07-16 20:16 ` Elijah Newren 2018-07-16 20:38 ` Jeff King 2018-07-16 21:09 ` Elijah Newren 2018-07-16 21:21 ` Jeff King 2018-07-16 22:07 ` Elijah Newren 2018-07-16 22:55 ` Jeff King 2018-07-16 23:06 ` Elijah Newren 2018-07-16 21:31 ` Jonathan Nieder 2018-07-17 6:51 ` [PATCH v2 0/3] gc --auto: do not return error for prior errors in daemonized mode Jonathan Nieder 2018-07-17 6:53 ` [PATCH 1/3] gc: improve handling of errors reading gc.log Jonathan Nieder 2018-07-17 18:19 ` Junio C Hamano 2018-07-17 19:58 ` Jeff King 2018-07-17 6:54 ` [PATCH 2/3] gc: exit with status 128 on failure Jonathan Nieder 2018-07-17 18:22 ` Junio C Hamano 2018-07-17 19:59 ` Jeff King 2018-09-17 18:33 ` Jeff King 2018-09-17 18:40 ` Jonathan Nieder 2018-09-18 17:30 ` Jeff King 2018-07-17 6:57 ` [PATCH 3/3] gc: do not return error for prior errors in daemonized mode Jonathan Nieder 2018-07-17 20:13 ` Jeff King 2018-07-18 16:21 ` Junio C Hamano 2018-07-18 17:22 ` Jeff King 2018-07-18 18:19 ` Junio C Hamano 2018-07-18 19:06 ` Jeff King 2018-07-18 19:55 ` Junio C Hamano
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).