* [PATCH 0/4] Fix issues with rename detection limits @ 2017-11-10 17:39 Elijah Newren 2017-11-10 17:39 ` [PATCH 1/4] sequencer: Warn when internal merge may be suboptimal due to renameLimit Elijah Newren ` (3 more replies) 0 siblings, 4 replies; 15+ messages in thread From: Elijah Newren @ 2017-11-10 17:39 UTC (permalink / raw) To: git In a repo with around 60k files with deep directory hierarchies (due to being predominantly java code) and which had a few high level directories moved around (making it appear to git as dozens of thousands of individual file renames), attempts to cherry-pick commits across product versions resulted in non-sensical delete/modify conflicts (rather than rename/modify as expected). We had a few teams who were in the unenviable position of having to backport hundreds or thousands of commits across such product versions, and the result was months of pain, which could have been alleviated were it not for a few small git bugs: * When you try to cherry-pick changes, unlike when you merge two branches, git will not notify you when you need to set a higher merge.renameLimit. * If you know git internals well enough, you can try to trick git into telling you the correct renameLimit by doing a merge instead of a cherry-pick. If you do that, and have a renameLimit that is too small, git will let you know the value you need. You can then undo the merge and proceed with the cherry-pick. Except that... * If you are performing a merge and specify a large renameLimit that isn't large enough, and the necessary renameLimit you need is greater than 32767, then git tells you nothing, leading you to believe that the limit you specified is high enough, but only to watch it still mess up the merge badly. * If you happen to specify a merge.renameLimit that *is* high enough, but just happens to be greater than 32767, then git will silently pretend you specified 32767, determine that 32767 is not high enough, not tell you anything about it's silent clamping, and then proceed to mess up the merge or cherry-pick badly. Not only do you get no feedback, the clamping to 32767 isn't documented anywhere even if you tried to read every manual page. Folks did discover the merge.renameLimit and tried setting it to various values, spread over the spectrum from 1000 (the default) up to 999999999 (or maybe more, that's just the highest number I heard), completely unaware that most their attempts were ignored and wondering why git couldn't handle things and couldn't explain why either. Eventually folks wrote scripts that would run the output of format-patch through a bunch of sed commands to pretend patches were against the filename on the other side of history and then pipe back through git-am. Such scripts grew as more and more rename rules were added. I eventually learned of one of these scripts and said something close to, "You don't need these pile of rename rules; just set merge.renameLimit to something higher." When they responded that merge.renameLimit didn't work, I didn't believe them. This patch series, along with two others that I will be sending shortly, represent my attempt to continue to not believe them. :-) Elijah Newren (4): sequencer: Warn when internal merge may be suboptimal due to renameLimit Remove silent clamp of renameLimit progress: Fix progress meters when dealing with lots of work sequencer: Show rename progress during cherry picks diff.c | 2 +- diffcore-rename.c | 15 ++++++--------- progress.c | 22 +++++++++++----------- progress.h | 8 ++++---- sequencer.c | 2 ++ 5 files changed, 24 insertions(+), 25 deletions(-) -- 2.15.0.5.g9567be9905 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/4] sequencer: Warn when internal merge may be suboptimal due to renameLimit 2017-11-10 17:39 [PATCH 0/4] Fix issues with rename detection limits Elijah Newren @ 2017-11-10 17:39 ` Elijah Newren 2017-11-13 5:16 ` Junio C Hamano 2017-11-10 17:39 ` [PATCH 2/4] Remove silent clamp of renameLimit Elijah Newren ` (2 subsequent siblings) 3 siblings, 1 reply; 15+ messages in thread From: Elijah Newren @ 2017-11-10 17:39 UTC (permalink / raw) To: git Having renamed files silently treated as deleted/modified conflicts instead of correctly resolving the renamed/modified case has caused lots of pain for some users. Eventually, some of them figured out to set merge.renameLimit to something higher, but without feedback about what value it should have, they were just repeatedly guessing and retrying. Signed-off-by: Elijah Newren <newren@gmail.com> --- sequencer.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sequencer.c b/sequencer.c index f2a10cc4f2..2b4cecb617 100644 --- a/sequencer.c +++ b/sequencer.c @@ -462,6 +462,7 @@ static int do_recursive_merge(struct commit *base, struct commit *next, if (is_rebase_i(opts) && clean <= 0) fputs(o.obuf.buf, stdout); strbuf_release(&o.obuf); + diff_warn_rename_limit("merge.renamelimit", o.needed_rename_limit, 0); if (clean < 0) return clean; -- 2.15.0.5.g9567be9905 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] sequencer: Warn when internal merge may be suboptimal due to renameLimit 2017-11-10 17:39 ` [PATCH 1/4] sequencer: Warn when internal merge may be suboptimal due to renameLimit Elijah Newren @ 2017-11-13 5:16 ` Junio C Hamano 0 siblings, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2017-11-13 5:16 UTC (permalink / raw) To: Elijah Newren; +Cc: git Elijah Newren <newren@gmail.com> writes: > Subject: Re: [PATCH 1/4] sequencer: Warn when internal merge may be suboptimal due to renameLimit Style: please downcase s/Warn/warn/ to fit this better in "shortlog --no-merges" output. > Having renamed files silently treated as deleted/modified conflicts > instead of correctly resolving the renamed/modified case has caused lots > of pain for some users. Eventually, some of them figured out to set > merge.renameLimit to something higher, but without feedback about what > value it should have, they were just repeatedly guessing and retrying. It would have been _much_ easier to read if you refrained from stating the gripes more prominently than the source of the gripe that you are fixing. E.g. If one side of a merge have renamed more paths than merge.renamelimit since the sides diverged, the recursive merge strategy stops detecting renames and instead leaves these many paths as delete/modify conflicts, without warning what is going on and giving hints on how to tell Git that it is allowed to spend more cycles to detect renames. perhaps. The addition of a call to diff_warn_rename_limit looks quite sensible. Thanks. > Signed-off-by: Elijah Newren <newren@gmail.com> > --- > sequencer.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/sequencer.c b/sequencer.c > index f2a10cc4f2..2b4cecb617 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -462,6 +462,7 @@ static int do_recursive_merge(struct commit *base, struct commit *next, > if (is_rebase_i(opts) && clean <= 0) > fputs(o.obuf.buf, stdout); > strbuf_release(&o.obuf); > + diff_warn_rename_limit("merge.renamelimit", o.needed_rename_limit, 0); > if (clean < 0) > return clean; ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/4] Remove silent clamp of renameLimit 2017-11-10 17:39 [PATCH 0/4] Fix issues with rename detection limits Elijah Newren 2017-11-10 17:39 ` [PATCH 1/4] sequencer: Warn when internal merge may be suboptimal due to renameLimit Elijah Newren @ 2017-11-10 17:39 ` Elijah Newren 2017-11-10 18:26 ` Stefan Beller 2017-11-10 17:39 ` [PATCH 3/4] progress: Fix progress meters when dealing with lots of work Elijah Newren 2017-11-10 17:39 ` [PATCH 4/4] sequencer: Show rename progress during cherry picks Elijah Newren 3 siblings, 1 reply; 15+ messages in thread From: Elijah Newren @ 2017-11-10 17:39 UTC (permalink / raw) To: git In commit 0024a5492 (Fix the rename detection limit checking; 2007-09-14), the renameLimit was clamped to 32767. This appears to have been to simply avoid integer overflow in the following computation: num_create * num_src <= rename_limit * rename_limit although it also could be viewed as a hardcoded bound on the amount of CPU time we're willing to allow users to tell git to spend on handling renames. An upper bound may make sense, particularly as the computation is O(rename_limit^2), but only if the bound is documented and communicated to the user -- neither of which were true. In fact, the silent clamping of the renameLimit to a smaller value and lack of reporting of the needed renameLimit when it was too large made it appear to the user as though they had used a high enough value; however, git would proceed to mess up the merge or cherry-pick badly based on the lack of rename detection. Some hardy folks, despite the lack of feedback on the correct limit to choose, were desperate enough to repeatedly retry their cherry-picks with increasingly larger renameLimit values (going orders of magnitude beyond the built-in limit of 32767), but were consistently met with the same failure. Although large limits can make things slow, we have users who would be ecstatic to have a small five file change be correctly cherry picked even if they have to manually specify a large limit and it took git ten minutes to compute it. Signed-off-by: Elijah Newren <newren@gmail.com> --- diff.c | 2 +- diffcore-rename.c | 11 ++++------- 2 files changed, 5 insertions(+), 8 deletions(-) diff --git a/diff.c b/diff.c index 6fd288420b..c6597e3231 100644 --- a/diff.c +++ b/diff.c @@ -5524,7 +5524,7 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc) warning(_(rename_limit_warning)); else return; - if (0 < needed && needed < 32767) + if (0 < needed) warning(_(rename_limit_advice), varname, needed); } diff --git a/diffcore-rename.c b/diffcore-rename.c index 0d8c3d2ee4..7f9a463f5a 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -391,14 +391,10 @@ static int too_many_rename_candidates(int num_create, * growing larger than a "rename_limit" square matrix, ie: * * num_create * num_src > rename_limit * rename_limit - * - * but handles the potential overflow case specially (and we - * assume at least 32-bit integers) */ - if (rename_limit <= 0 || rename_limit > 32767) - rename_limit = 32767; if ((num_create <= rename_limit || num_src <= rename_limit) && - (num_create * num_src <= rename_limit * rename_limit)) + ((double)num_create * (double)num_src + <= (double)rename_limit * (double)rename_limit)) return 0; options->needed_rename_limit = @@ -415,7 +411,8 @@ static int too_many_rename_candidates(int num_create, num_src++; } if ((num_create <= rename_limit || num_src <= rename_limit) && - (num_create * num_src <= rename_limit * rename_limit)) + ((double)num_create * (double)num_src + <= (double)rename_limit * (double)rename_limit)) return 2; return 1; } -- 2.15.0.5.g9567be9905 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/4] Remove silent clamp of renameLimit 2017-11-10 17:39 ` [PATCH 2/4] Remove silent clamp of renameLimit Elijah Newren @ 2017-11-10 18:26 ` Stefan Beller 2017-11-10 18:36 ` Elijah Newren 0 siblings, 1 reply; 15+ messages in thread From: Stefan Beller @ 2017-11-10 18:26 UTC (permalink / raw) To: Elijah Newren; +Cc: git On Fri, Nov 10, 2017 at 9:39 AM, Elijah Newren <newren@gmail.com> wrote: > In commit 0024a5492 (Fix the rename detection limit checking; 2007-09-14), > the renameLimit was clamped to 32767. This appears to have been to simply > avoid integer overflow in the following computation: > > num_create * num_src <= rename_limit * rename_limit > > although it also could be viewed as a hardcoded bound on the amount of CPU > time we're willing to allow users to tell git to spend on handling > renames. An upper bound may make sense, particularly as the computation > is O(rename_limit^2), but only if the bound is documented and communicated > to the user -- neither of which were true. > > In fact, the silent clamping of the renameLimit to a smaller value and > lack of reporting of the needed renameLimit when it was too large made it > appear to the user as though they had used a high enough value; however, > git would proceed to mess up the merge or cherry-pick badly based on the > lack of rename detection. Some hardy folks, despite the lack of feedback > on the correct limit to choose, were desperate enough to repeatedly retry > their cherry-picks with increasingly larger renameLimit values (going > orders of magnitude beyond the built-in limit of 32767), but were > consistently met with the same failure. > > Although large limits can make things slow, we have users who would be > ecstatic to have a small five file change be correctly cherry picked even > if they have to manually specify a large limit and it took git ten minutes > to compute it. > > Signed-off-by: Elijah Newren <newren@gmail.com> > --- > diff.c | 2 +- > diffcore-rename.c | 11 ++++------- > 2 files changed, 5 insertions(+), 8 deletions(-) > > diff --git a/diff.c b/diff.c > index 6fd288420b..c6597e3231 100644 > --- a/diff.c > +++ b/diff.c > @@ -5524,7 +5524,7 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc) > warning(_(rename_limit_warning)); > else > return; > - if (0 < needed && needed < 32767) > + if (0 < needed) > warning(_(rename_limit_advice), varname, needed); > } > > diff --git a/diffcore-rename.c b/diffcore-rename.c > index 0d8c3d2ee4..7f9a463f5a 100644 > --- a/diffcore-rename.c > +++ b/diffcore-rename.c > @@ -391,14 +391,10 @@ static int too_many_rename_candidates(int num_create, > * growing larger than a "rename_limit" square matrix, ie: > * > * num_create * num_src > rename_limit * rename_limit > - * > - * but handles the potential overflow case specially (and we > - * assume at least 32-bit integers) > */ > - if (rename_limit <= 0 || rename_limit > 32767) > - rename_limit = 32767; > if ((num_create <= rename_limit || num_src <= rename_limit) && > - (num_create * num_src <= rename_limit * rename_limit)) > + ((double)num_create * (double)num_src > + <= (double)rename_limit * (double)rename_limit)) > return 0; From a technical perspective, I would think that if (num_create <= rename_limit || num_src <= rename_limit) holds true, that the double-cast condition would also be always true? Could we just remove that last check? Or phrased differently, if we can cast to double and extend the check here, do we have to adapt code at other places as well? > > options->needed_rename_limit = > @@ -415,7 +411,8 @@ static int too_many_rename_candidates(int num_create, > num_src++; > } > if ((num_create <= rename_limit || num_src <= rename_limit) && > - (num_create * num_src <= rename_limit * rename_limit)) > + ((double)num_create * (double)num_src > + <= (double)rename_limit * (double)rename_limit)) > return 2; > return 1; > } > -- > 2.15.0.5.g9567be9905 > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/4] Remove silent clamp of renameLimit 2017-11-10 18:26 ` Stefan Beller @ 2017-11-10 18:36 ` Elijah Newren 2017-11-10 23:42 ` brian m. carlson 0 siblings, 1 reply; 15+ messages in thread From: Elijah Newren @ 2017-11-10 18:36 UTC (permalink / raw) To: Stefan Beller; +Cc: git Thanks for taking a look! On Fri, Nov 10, 2017 at 10:26 AM, Stefan Beller <sbeller@google.com> wrote: <snip> >> - if (rename_limit <= 0 || rename_limit > 32767) >> - rename_limit = 32767; >> if ((num_create <= rename_limit || num_src <= rename_limit) && >> - (num_create * num_src <= rename_limit * rename_limit)) >> + ((double)num_create * (double)num_src >> + <= (double)rename_limit * (double)rename_limit)) >> return 0; > > From a technical perspective, I would think that if > (num_create <= rename_limit || num_src <= rename_limit) > holds true, that the double-cast condition would also be always true? > Could we just remove that last check? Not necessarily. For example, if num_create = rename_limit-1 and num_src = rename_limit+2, then the first condition will be satisfied but the second won't. If it was && rather than ||, then the second condition would be superfluous. > Or phrased differently, if we can cast to double and extend the check > here, do we have to adapt code at other places as well? Good point, and yes. Perhaps I should have re-ordered my patch series because I came back to it later and realized that the progress code was broken due to overflow/wraparound, and a patch 3 fixed that. Further, the later patch used uint64_t instead of double. While double works, perhaps I should change the double here to uint64_t for consistency? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/4] Remove silent clamp of renameLimit 2017-11-10 18:36 ` Elijah Newren @ 2017-11-10 23:42 ` brian m. carlson 2017-11-11 16:39 ` Elijah Newren 0 siblings, 1 reply; 15+ messages in thread From: brian m. carlson @ 2017-11-10 23:42 UTC (permalink / raw) To: Elijah Newren; +Cc: Stefan Beller, git [-- Attachment #1: Type: text/plain, Size: 1522 bytes --] On Fri, Nov 10, 2017 at 10:36:17AM -0800, Elijah Newren wrote: > Thanks for taking a look! > > On Fri, Nov 10, 2017 at 10:26 AM, Stefan Beller <sbeller@google.com> wrote: > > From a technical perspective, I would think that if > > (num_create <= rename_limit || num_src <= rename_limit) > > holds true, that the double-cast condition would also be always true? > > Could we just remove that last check? > > Not necessarily. For example, if num_create = rename_limit-1 and > num_src = rename_limit+2, then the first condition will be satisfied > but the second won't. If it was && rather than ||, then the second > condition would be superfluous. > > > Or phrased differently, if we can cast to double and extend the check > > here, do we have to adapt code at other places as well? > > Good point, and yes. Perhaps I should have re-ordered my patch series > because I came back to it later and realized that the progress code > was broken due to overflow/wraparound, and a patch 3 fixed that. > > Further, the later patch used uint64_t instead of double. While > double works, perhaps I should change the double here to uint64_t for > consistency? I'm wondering if maybe you want to use size_t. If you end up using an unsigned type, you might be able to leverage unsigned_mult_overflows to avoid having to write this by hand. -- brian m. carlson / brian with sandals: Houston, Texas, US https://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 867 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/4] Remove silent clamp of renameLimit 2017-11-10 23:42 ` brian m. carlson @ 2017-11-11 16:39 ` Elijah Newren 2017-11-11 17:32 ` brian m. carlson 0 siblings, 1 reply; 15+ messages in thread From: Elijah Newren @ 2017-11-11 16:39 UTC (permalink / raw) To: brian m. carlson, Elijah Newren, Stefan Beller, git Hi, On Fri, Nov 10, 2017 at 3:42 PM, brian m. carlson <sandals@crustytoothpaste.net> wrote: > On Fri, Nov 10, 2017 at 10:36:17AM -0800, Elijah Newren wrote: <snip> >> Further, the later patch used uint64_t instead of double. While >> double works, perhaps I should change the double here to uint64_t for >> consistency? > > I'm wondering if maybe you want to use size_t. If you end up using an > unsigned type, you might be able to leverage unsigned_mult_overflows to > avoid having to write this by hand. Thanks for pointing out unsigned_mult_overflows; I was unaware of it. I think I'd prefer to not use it though; the fact that I had a case that genuinely needed a value greater than 2^31 (at least before my performance patches) meant that a slightly bigger repo is going to eventually need a value over 2^32, so I think we should just cast to a type that can hold it. I'm curious why you suggest size_t, though. I have always associated that with an amount of memory that will be used, and there's no allocation based on this result. Was I wrong to make that association, or is there another good reason here? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/4] Remove silent clamp of renameLimit 2017-11-11 16:39 ` Elijah Newren @ 2017-11-11 17:32 ` brian m. carlson 0 siblings, 0 replies; 15+ messages in thread From: brian m. carlson @ 2017-11-11 17:32 UTC (permalink / raw) To: Elijah Newren; +Cc: Stefan Beller, git [-- Attachment #1: Type: text/plain, Size: 1353 bytes --] On Sat, Nov 11, 2017 at 08:39:11AM -0800, Elijah Newren wrote: > Thanks for pointing out unsigned_mult_overflows; I was unaware of it. > I think I'd prefer to not use it though; the fact that I had a case > that genuinely needed a value greater than 2^31 (at least before my > performance patches) meant that a slightly bigger repo is going to > eventually need a value over 2^32, so I think we should just cast to a > type that can hold it. That's fine. I had considered this in the context of 64-bit values, but I suppose that the likelihood of us hitting 2**64 iterations (and performing reasonably as well) is unlikely. > I'm curious why you suggest size_t, though. I have always associated > that with an amount of memory that will be used, and there's no > allocation based on this result. Was I wrong to make that > association, or is there another good reason here? I usually like size_t for values that are unsigned and don't need to be fixed size because it's usually the largest efficient unsigned type. However, if you want something to handle items larger than 2**32, then I agree that it's maybe not a great idea if we want it to work on 32-bit systems. -- brian m. carlson / brian with sandals: Houston, Texas, US https://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 867 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/4] progress: Fix progress meters when dealing with lots of work 2017-11-10 17:39 [PATCH 0/4] Fix issues with rename detection limits Elijah Newren 2017-11-10 17:39 ` [PATCH 1/4] sequencer: Warn when internal merge may be suboptimal due to renameLimit Elijah Newren 2017-11-10 17:39 ` [PATCH 2/4] Remove silent clamp of renameLimit Elijah Newren @ 2017-11-10 17:39 ` Elijah Newren 2017-11-13 5:24 ` Junio C Hamano 2017-11-10 17:39 ` [PATCH 4/4] sequencer: Show rename progress during cherry picks Elijah Newren 3 siblings, 1 reply; 15+ messages in thread From: Elijah Newren @ 2017-11-10 17:39 UTC (permalink / raw) To: git The possibility of setting merge.renameLimit beyond 2^16 raises the possibility that the values passed to progress can exceed 2^32. For my usecase of interest, I only needed to pass a value a little over 2^31. If I were only interested in fixing my usecase, I could have simply changed last_value from int to unsigned, and casted each of rename_dst_nr and rename_src_nr (in merge-recursive.c) from int to unsigned just before multiplying them. However, as long as we're making changes to allow larger progress meters, we may as well make a little more room in general. Use uint64_t, because it "ought to be enough for anybody". :-) Signed-off-by: Elijah Newren <newren@gmail.com> --- diffcore-rename.c | 4 ++-- progress.c | 22 +++++++++++----------- progress.h | 8 ++++---- 3 files changed, 17 insertions(+), 17 deletions(-) diff --git a/diffcore-rename.c b/diffcore-rename.c index 7f9a463f5a..6ba6157c61 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -531,7 +531,7 @@ void diffcore_rename(struct diff_options *options) if (options->show_rename_progress) { progress = start_delayed_progress( _("Performing inexact rename detection"), - rename_dst_nr * rename_src_nr); + (uint64_t)rename_dst_nr * (uint64_t)rename_src_nr); } mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_create), sizeof(*mx)); @@ -568,7 +568,7 @@ void diffcore_rename(struct diff_options *options) diff_free_filespec_blob(two); } dst_cnt++; - display_progress(progress, (i+1)*rename_src_nr); + display_progress(progress, (uint64_t)(i+1)*(uint64_t)rename_src_nr); } stop_progress(&progress); diff --git a/progress.c b/progress.c index 289678d43d..7e4a2f9532 100644 --- a/progress.c +++ b/progress.c @@ -30,8 +30,8 @@ struct throughput { struct progress { const char *title; - int last_value; - unsigned total; + uint64_t last_value; + uint64_t total; unsigned last_percent; unsigned delay; unsigned delayed_percent_threshold; @@ -79,7 +79,7 @@ static int is_foreground_fd(int fd) return tpgrp < 0 || tpgrp == getpgid(0); } -static int display(struct progress *progress, unsigned n, const char *done) +static int display(struct progress *progress, uint64_t n, const char *done) { const char *eol, *tp; @@ -106,7 +106,7 @@ static int display(struct progress *progress, unsigned n, const char *done) if (percent != progress->last_percent || progress_update) { progress->last_percent = percent; if (is_foreground_fd(fileno(stderr)) || done) { - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s", + fprintf(stderr, "%s: %3u%% (%lu/%lu)%s%s", progress->title, percent, n, progress->total, tp, eol); fflush(stderr); @@ -116,7 +116,7 @@ static int display(struct progress *progress, unsigned n, const char *done) } } else if (progress_update) { if (is_foreground_fd(fileno(stderr)) || done) { - fprintf(stderr, "%s: %u%s%s", + fprintf(stderr, "%s: %lu%s%s", progress->title, n, tp, eol); fflush(stderr); } @@ -127,7 +127,7 @@ static int display(struct progress *progress, unsigned n, const char *done) return 0; } -static void throughput_string(struct strbuf *buf, off_t total, +static void throughput_string(struct strbuf *buf, uint64_t total, unsigned int rate) { strbuf_reset(buf); @@ -138,7 +138,7 @@ static void throughput_string(struct strbuf *buf, off_t total, strbuf_addstr(buf, "/s"); } -void display_throughput(struct progress *progress, off_t total) +void display_throughput(struct progress *progress, uint64_t total) { struct throughput *tp; uint64_t now_ns; @@ -200,12 +200,12 @@ void display_throughput(struct progress *progress, off_t total) display(progress, progress->last_value, NULL); } -int display_progress(struct progress *progress, unsigned n) +int display_progress(struct progress *progress, uint64_t n) { return progress ? display(progress, n, NULL) : 0; } -static struct progress *start_progress_delay(const char *title, unsigned total, +static struct progress *start_progress_delay(const char *title, uint64_t total, unsigned percent_threshold, unsigned delay) { struct progress *progress = malloc(sizeof(*progress)); @@ -227,12 +227,12 @@ static struct progress *start_progress_delay(const char *title, unsigned total, return progress; } -struct progress *start_delayed_progress(const char *title, unsigned total) +struct progress *start_delayed_progress(const char *title, uint64_t total) { return start_progress_delay(title, total, 0, 2); } -struct progress *start_progress(const char *title, unsigned total) +struct progress *start_progress(const char *title, uint64_t total) { return start_progress_delay(title, total, 0, 0); } diff --git a/progress.h b/progress.h index 6392b63371..70a4d4a0d6 100644 --- a/progress.h +++ b/progress.h @@ -3,10 +3,10 @@ struct progress; -void display_throughput(struct progress *progress, off_t total); -int display_progress(struct progress *progress, unsigned n); -struct progress *start_progress(const char *title, unsigned total); -struct progress *start_delayed_progress(const char *title, unsigned total); +void display_throughput(struct progress *progress, uint64_t total); +int display_progress(struct progress *progress, uint64_t n); +struct progress *start_progress(const char *title, uint64_t total); +struct progress *start_delayed_progress(const char *title, uint64_t total); void stop_progress(struct progress **progress); void stop_progress_msg(struct progress **progress, const char *msg); -- 2.15.0.5.g9567be9905 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 3/4] progress: Fix progress meters when dealing with lots of work 2017-11-10 17:39 ` [PATCH 3/4] progress: Fix progress meters when dealing with lots of work Elijah Newren @ 2017-11-13 5:24 ` Junio C Hamano 2017-11-13 20:05 ` Elijah Newren 0 siblings, 1 reply; 15+ messages in thread From: Junio C Hamano @ 2017-11-13 5:24 UTC (permalink / raw) To: Elijah Newren; +Cc: git Elijah Newren <newren@gmail.com> writes: > Subject: Re: [PATCH 3/4] progress: Fix progress meters when dealing with lots of work Style: s/Fix/fix/; > The possibility of setting merge.renameLimit beyond 2^16 raises the > possibility that the values passed to progress can exceed 2^32. For my > usecase of interest, I only needed to pass a value a little over 2^31. If > I were only interested in fixing my usecase, I could have simply changed > last_value from int to unsigned, and casted each of rename_dst_nr and > rename_src_nr (in merge-recursive.c) from int to unsigned just before > multiplying them. However, as long as we're making changes to allow > larger progress meters, we may as well make a little more room in general. > Use uint64_t, because it "ought to be enough for anybody". :-) The middle part of the log message may waste more mental bandwidth of readers than it is worth. It might have gave you satisfaction to be able to vent, but don't (the place to do so is after the three dash lines). I am not sure if we want all codepaths to do 64-bit math for progress meter, but let's see what others would think. > -static int display(struct progress *progress, unsigned n, const char *done) > +static int display(struct progress *progress, uint64_t n, const char *done) > { > const char *eol, *tp; > > @@ -106,7 +106,7 @@ static int display(struct progress *progress, unsigned n, const char *done) > if (percent != progress->last_percent || progress_update) { > progress->last_percent = percent; > if (is_foreground_fd(fileno(stderr)) || done) { > - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s", > + fprintf(stderr, "%s: %3u%% (%lu/%lu)%s%s", Are these (and there are probably other instances in this patch) %lu correct? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/4] progress: Fix progress meters when dealing with lots of work 2017-11-13 5:24 ` Junio C Hamano @ 2017-11-13 20:05 ` Elijah Newren 2017-11-14 1:18 ` Junio C Hamano 0 siblings, 1 reply; 15+ messages in thread From: Elijah Newren @ 2017-11-13 20:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git Mailing List Thanks for the reviews and suggestions! On Sun, Nov 12, 2017 at 9:24 PM, Junio C Hamano <gitster@pobox.com> wrote: > Elijah Newren <newren@gmail.com> writes: > >> Subject: Re: [PATCH 3/4] progress: Fix progress meters when dealing with lots of work > > Style: s/Fix/fix/; I also messed this up in a lot of my patches in my other patch series. I've fixed them all up, but I'll wait to resubmit those other series until I get some other reviews. > The middle part of the log message may waste more mental bandwidth > of readers than it is worth. It might have gave you satisfaction to > be able to vent, but don't (the place to do so is after the three > dash lines). Cleaned it up, along with the other commit message you pointed out; I'll resubmit shortly. > I am not sure if we want all codepaths to do 64-bit math for > progress meter, but let's see what others would think. If others don't want to do 64-bit math for the progress meter, what would they like to see done instead? I can see a few options: 1) Have two separate progress codepaths, one for 32-bith math and one for 64-bit math. 2) Instead of counting pairs of source/dest files compared, just count number of dest paths completed. (Thus, we wouldn't need a value big enough to hold rename_dst_nr * rename_src_nr, just big enough to hold rename_dst_nr). 3) just let the progress meter overflow and show nonsensical values 4) don't show the progress meter if overflow would happen 5) something else I'm not thinking of. >> - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s", >> + fprintf(stderr, "%s: %3u%% (%lu/%lu)%s%s", > > Are these (and there are probably other instances in this patch) %lu > correct? Oops, no. I think %llu is right, though looking around the code it appears folks use PRIuMAX and avoid %llu due to possible issues with old windows compilers. Not sure if that's still relevant, but I'll try to remain consistent with what I see elsewhere and include that fix in my re-roll. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/4] progress: Fix progress meters when dealing with lots of work 2017-11-13 20:05 ` Elijah Newren @ 2017-11-14 1:18 ` Junio C Hamano 0 siblings, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2017-11-14 1:18 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List Elijah Newren <newren@gmail.com> writes: > 2) Instead of counting pairs of source/dest files compared, just > count number of dest paths completed. (Thus, we wouldn't need a value > big enough to hold rename_dst_nr * rename_src_nr, just big enough to > hold rename_dst_nr). This sounds like the most sensible thing to do even if we do switch to larger integer size, but that is orthogonal to the main point of this series. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 4/4] sequencer: Show rename progress during cherry picks 2017-11-10 17:39 [PATCH 0/4] Fix issues with rename detection limits Elijah Newren ` (2 preceding siblings ...) 2017-11-10 17:39 ` [PATCH 3/4] progress: Fix progress meters when dealing with lots of work Elijah Newren @ 2017-11-10 17:39 ` Elijah Newren 2017-11-13 5:25 ` Junio C Hamano 3 siblings, 1 reply; 15+ messages in thread From: Elijah Newren @ 2017-11-10 17:39 UTC (permalink / raw) To: git When trying to cherry-pick a change that has lots of renames, it is somewhat unsettling to wait a really long time without any feedback. Signed-off-by: Elijah Newren <newren@gmail.com> --- sequencer.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sequencer.c b/sequencer.c index 2b4cecb617..247d93f363 100644 --- a/sequencer.c +++ b/sequencer.c @@ -448,6 +448,7 @@ static int do_recursive_merge(struct commit *base, struct commit *next, o.branch2 = next ? next_label : "(empty tree)"; if (is_rebase_i(opts)) o.buffer_output = 2; + o.show_rename_progress = 1; head_tree = parse_tree_indirect(head); next_tree = next ? next->tree : empty_tree(); -- 2.15.0.5.g9567be9905 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] sequencer: Show rename progress during cherry picks 2017-11-10 17:39 ` [PATCH 4/4] sequencer: Show rename progress during cherry picks Elijah Newren @ 2017-11-13 5:25 ` Junio C Hamano 0 siblings, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2017-11-13 5:25 UTC (permalink / raw) To: Elijah Newren; +Cc: git Elijah Newren <newren@gmail.com> writes: > Subject: Re: [PATCH 4/4] sequencer: Show rename progress during cherry picks Style: s/Show/show/ > When trying to cherry-pick a change that has lots of renames, it is > somewhat unsettling to wait a really long time without any feedback. > > Signed-off-by: Elijah Newren <newren@gmail.com> > --- > sequencer.c | 1 + > 1 file changed, 1 insertion(+) Makes sense. > diff --git a/sequencer.c b/sequencer.c > index 2b4cecb617..247d93f363 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -448,6 +448,7 @@ static int do_recursive_merge(struct commit *base, struct commit *next, > o.branch2 = next ? next_label : "(empty tree)"; > if (is_rebase_i(opts)) > o.buffer_output = 2; > + o.show_rename_progress = 1; > > head_tree = parse_tree_indirect(head); > next_tree = next ? next->tree : empty_tree(); ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2017-11-14 1:18 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-11-10 17:39 [PATCH 0/4] Fix issues with rename detection limits Elijah Newren 2017-11-10 17:39 ` [PATCH 1/4] sequencer: Warn when internal merge may be suboptimal due to renameLimit Elijah Newren 2017-11-13 5:16 ` Junio C Hamano 2017-11-10 17:39 ` [PATCH 2/4] Remove silent clamp of renameLimit Elijah Newren 2017-11-10 18:26 ` Stefan Beller 2017-11-10 18:36 ` Elijah Newren 2017-11-10 23:42 ` brian m. carlson 2017-11-11 16:39 ` Elijah Newren 2017-11-11 17:32 ` brian m. carlson 2017-11-10 17:39 ` [PATCH 3/4] progress: Fix progress meters when dealing with lots of work Elijah Newren 2017-11-13 5:24 ` Junio C Hamano 2017-11-13 20:05 ` Elijah Newren 2017-11-14 1:18 ` Junio C Hamano 2017-11-10 17:39 ` [PATCH 4/4] sequencer: Show rename progress during cherry picks Elijah Newren 2017-11-13 5:25 ` Junio C Hamano
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).