* [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout
@ 2017-04-07 15:53 git
  2017-04-07 15:53 ` git
  0 siblings, 1 reply; 8+ messages in thread
From: git @ 2017-04-07 15:53 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Version 2 simplifies this and just copies the tree_descriptor data and
borrows the underlying buffer without mallocing.  It also handles the
n=3 cases, so merges should be helped too.

I've updated the p0004 perf times in the commit message.  The V2
changes took ~0.15 off the V1 times.  The total reduction is ~1 second.

================
Avoid duplicate ODB lookups for trees during traverse_trees_recursive().

Jeff Hostetler (1):
  unpack-trees: avoid duplicate ODB lookups during checkout

 unpack-trees.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

--
2.9.3

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout
  2017-04-07 15:53 [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout git
@ 2017-04-07 15:53 ` git
  2017-04-08 14:06   ` René Scharfe
  0 siblings, 1 reply; 8+ messages in thread
From: git @ 2017-04-07 15:53 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach traverse_trees_recursive() to not do redundant ODB
lookups when both directories refer to the same OID.

In operations such as read-tree, checkout, and merge when
the differences between the commits are relatively small,
there will likely be many directories that have the same
SHA-1.  In these cases we can avoid hitting the ODB multiple
times for the same SHA-1.

This patch handles n=2 and n=3 cases and simply copies the
data rather than repeating the fill_tree_descriptor().

================
On the Windows repo (500K trees, 3.1M files, 450MB index),
this reduced the overall time by 0.75 seconds when cycling
between 2 commits with a single file difference.

(avg) before: 22.699
(avg) after:  21.955
================

================
Using the p0004-read-tree test (posted earlier this week)
with 1M files on Linux:

before:
$ ./p0004-read-tree.sh
0004.5: switch work1 work2 (1003037)      11.99(8.12+3.32)
0004.6: switch commit aliases (1003037)   11.95(8.20+3.14)

after:
$ ./p0004-read-tree.sh
0004.5: switch work1 work2 (1003037)      11.02(7.71+2.78)
0004.6: switch commit aliases (1003037)   10.95(7.57+2.82)
================

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 unpack-trees.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3a8ee19..143c5d9 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -531,6 +531,11 @@ static int switch_cache_bottom(struct traverse_info *info)
 	return ret;
 }
 
+static inline int are_same_oid(struct name_entry *name_j, struct name_entry *name_k)
+{
+	return name_j->oid && name_k->oid && !oidcmp(name_j->oid, name_k->oid);
+}
+
 static int traverse_trees_recursive(int n, unsigned long dirmask,
 				    unsigned long df_conflicts,
 				    struct name_entry *names,
@@ -554,10 +559,20 @@ static int traverse_trees_recursive(int n, unsigned long dirmask,
 	newinfo.df_conflicts |= df_conflicts;
 
 	for (i = 0; i < n; i++, dirmask >>= 1) {
-		const unsigned char *sha1 = NULL;
-		if (dirmask & 1)
-			sha1 = names[i].oid->hash;
-		buf[i] = fill_tree_descriptor(t+i, sha1);
+		if (i > 0 && are_same_oid(&names[i], &names[i-1])) {
+			/* implicitly borrow buf[i-1] inside tree_desc[i] */
+			memcpy(&t[i], &t[i-1], sizeof(struct tree_desc));
+			buf[i] = NULL;
+		} else if (i > 1 && are_same_oid(&names[i], &names[i-2])) {
+			/* implicitly borrow buf[i-2] inside tree_desc[i] */
+			memcpy(&t[i], &t[i-2], sizeof(struct tree_desc));
+			buf[i] = NULL;
+		} else {
+			const unsigned char *sha1 = NULL;
+			if (dirmask & 1)
+				sha1 = names[i].oid->hash;
+			buf[i] = fill_tree_descriptor(t+i, sha1);
+		}
 	}
 
 	bottom = switch_cache_bottom(&newinfo);

--
2.9.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* Re: [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout
  2017-04-07 15:53 ` git
@ 2017-04-08 14:06   ` René Scharfe
  2017-04-10 20:55     ` Jeff King
  2017-04-10 21:26     ` Jeff Hostetler
  0 siblings, 2 replies; 8+ messages in thread
From: René Scharfe @ 2017-04-08 14:06 UTC (permalink / raw)
  To: git, git; +Cc: gitster, peff, Jeff Hostetler

Am 07.04.2017 um 17:53 schrieb git@jeffhostetler.com:
> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Teach traverse_trees_recursive() to not do redundant ODB
> lookups when both directories refer to the same OID.
>
> In operations such as read-tree, checkout, and merge when
> the differences between the commits are relatively small,
> there will likely be many directories that have the same
> SHA-1.  In these cases we can avoid hitting the ODB multiple
> times for the same SHA-1.
>
> This patch handles n=2 and n=3 cases and simply copies the
> data rather than repeating the fill_tree_descriptor().
>
> ================
> On the Windows repo (500K trees, 3.1M files, 450MB index),
> this reduced the overall time by 0.75 seconds when cycling
> between 2 commits with a single file difference.
>
> (avg) before: 22.699
> (avg) after:  21.955
> ================
>
> ================
> Using the p0004-read-tree test (posted earlier this week)
> with 1M files on Linux:
>
> before:
> $ ./p0004-read-tree.sh
> 0004.5: switch work1 work2 (1003037)      11.99(8.12+3.32)
> 0004.6: switch commit aliases (1003037)   11.95(8.20+3.14)
>
> after:
> $ ./p0004-read-tree.sh
> 0004.5: switch work1 work2 (1003037)      11.02(7.71+2.78)
> 0004.6: switch commit aliases (1003037)   10.95(7.57+2.82)
> ================
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  unpack-trees.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 3a8ee19..143c5d9 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -531,6 +531,11 @@ static int switch_cache_bottom(struct traverse_info *info)
>  	return ret;
>  }
>
> +static inline int are_same_oid(struct name_entry *name_j, struct name_entry *name_k)
> +{
> +	return name_j->oid && name_k->oid && !oidcmp(name_j->oid, name_k->oid);
> +}
> +
>  static int traverse_trees_recursive(int n, unsigned long dirmask,
>  				    unsigned long df_conflicts,
>  				    struct name_entry *names,
> @@ -554,10 +559,20 @@ static int traverse_trees_recursive(int n, unsigned long dirmask,
>  	newinfo.df_conflicts |= df_conflicts;
>
>  	for (i = 0; i < n; i++, dirmask >>= 1) {
> -		const unsigned char *sha1 = NULL;
> -		if (dirmask & 1)
> -			sha1 = names[i].oid->hash;
> -		buf[i] = fill_tree_descriptor(t+i, sha1);
> +		if (i > 0 && are_same_oid(&names[i], &names[i-1])) {
> +			/* implicitly borrow buf[i-1] inside tree_desc[i] */
> +			memcpy(&t[i], &t[i-1], sizeof(struct tree_desc));

An assignment would be simpler:

	t[i] = t[i - 1];

> +			buf[i] = NULL;
> +		} else if (i > 1 && are_same_oid(&names[i], &names[i-2])) {
> +			/* implicitly borrow buf[i-2] inside tree_desc[i] */
> +			memcpy(&t[i], &t[i-2], sizeof(struct tree_desc));

Similar case.

> +			buf[i] = NULL;

What makes the previous two entries special, or differently put: Why not
just check *all* previous entries?  MAX_UNPACK_TREES is 8; the number of
comparisons would just about double in the worst case and stay the same
for three trees or less.  The order of four trees or more wouldn't
matter anymore.

> +		} else {
> +			const unsigned char *sha1 = NULL;
> +			if (dirmask & 1)
> +				sha1 = names[i].oid->hash;
> +			buf[i] = fill_tree_descriptor(t+i, sha1);
> +		}
>  	}
>
>  	bottom = switch_cache_bottom(&newinfo);

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout
  2017-04-08 14:06 ` René Scharfe
@ 2017-04-10 20:55   ` Jeff King
  2017-04-10 21:28     ` Jeff Hostetler
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2017-04-10 20:55 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, git, gitster, Jeff Hostetler

On Sat, Apr 08, 2017 at 04:06:41PM +0200, René Scharfe wrote:

> > +		} else if (i > 1 && are_same_oid(&names[i], &names[i-2])) {
> > +			/* implicitly borrow buf[i-2] inside tree_desc[i] */
> > +			memcpy(&t[i], &t[i-2], sizeof(struct tree_desc));
>
> Similar case.
>
> > +			buf[i] = NULL;
>
> What makes the previous two entries special, or differently put: Why not
> just check *all* previous entries?  MAX_UNPACK_TREES is 8; the number of
> comparisons would just about double in the worst case and stay the same
> for three trees or less.  The order of four trees or more wouldn't
> matter anymore.

If I understand correctly, we've got N (up to 8) trees, and we want to
find sets of duplicates. Adjacency doesn't matter. Aren't our options
basically:

  - look through all previous entries naively, O(N^2)

  - sort the list, O(N log N); but we need the original order, so we'd
    have to unsort at the end, too

  - use a hash table, O(N) but with O(N) extra storage

I know N=8 isn't very big algorithmically speaking, but it would feel
ugly to introduce something quadratic. Especially the limit of 8 seems
rather arbitrary. In normal cases we see at most a 3-way merge. Beyond
that we're in octopus territory, and 8 is way too low there (I think the
octopus strategy just unpacks one at a time and barfs if there are any
conflicts).

I assume the rationale behind "check the last 2" is just "this
optimization will kick in reliably for all sane cases", weird octopus
unpack-trees be damned (though reading ca885a4fe, it sounds like 4-trees
can actually happen, but I'm not clear on how).

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout
  2017-04-10 20:55 ` Jeff King
@ 2017-04-10 21:28   ` Jeff Hostetler
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Hostetler @ 2017-04-10 21:28 UTC (permalink / raw)
  To: Jeff King, René Scharfe; +Cc: git, gitster, Jeff Hostetler

On 4/10/2017 4:55 PM, Jeff King wrote:
> On Sat, Apr 08, 2017 at 04:06:41PM +0200, René Scharfe wrote:
>
>>> +		} else if (i > 1 && are_same_oid(&names[i], &names[i-2])) {
>>> +			/* implicitly borrow buf[i-2] inside tree_desc[i] */
>>> +			memcpy(&t[i], &t[i-2], sizeof(struct tree_desc));
>>
>> Similar case.
>>
>>> +			buf[i] = NULL;
>>
>> What makes the previous two entries special, or differently put: Why not
>> just check *all* previous entries?  MAX_UNPACK_TREES is 8; the number of
>> comparisons would just about double in the worst case and stay the same
>> for three trees or less.  The order of four trees or more wouldn't
>> matter anymore.
>
> If I understand correctly, we've got N (up to 8) trees, and we want to
> find sets of duplicates. Adjacency doesn't matter. Aren't our options
> basically:
>
>   - look through all previous entries naively, O(N^2)
>
>   - sort the list, O(N log N); but we need the original order, so we'd
>     have to unsort at the end, too
>
>   - use a hash table, O(N) but with O(N) extra storage
>
> I know N=8 isn't very big algorithmically speaking, but it would feel
> ugly to introduce something quadratic. Especially the limit of 8 seems
> rather arbitrary. In normal cases we see at most a 3-way merge. Beyond
> that we're in octopus territory, and 8 is way too low there (I think the
> octopus strategy just unpacks one at a time and barfs if there are any
> conflicts).
>
> I assume the rationale behind "check the last 2" is just "this
> optimization will kick in reliably for all sane cases", weird octopus
> unpack-trees be damned (though reading ca885a4fe, it sounds like 4-trees
> can actually happen, but I'm not clear on how).

Yeah, I didn't want to pay for the obscure (n >= 4) cases with
any of the above.  Doing 2 or 3 gets us checkout and merge;
these are the common usages.

Jeff

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout
  2017-04-08 14:06 ` René Scharfe
  2017-04-10 20:55   ` Jeff King
@ 2017-04-10 21:26   ` Jeff Hostetler
  2017-04-10 23:09     ` René Scharfe
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff Hostetler @ 2017-04-10 21:26 UTC (permalink / raw)
  To: René Scharfe, git; +Cc: gitster, peff, Jeff Hostetler

On 4/8/2017 10:06 AM, René Scharfe wrote:
> Am 07.04.2017 um 17:53 schrieb git@jeffhostetler.com:
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Teach traverse_trees_recursive() to not do redundant ODB
>> lookups when both directories refer to the same OID.
>>
>> In operations such as read-tree, checkout, and merge when
>> the differences between the commits are relatively small,
>> there will likely be many directories that have the same
>> SHA-1.  In these cases we can avoid hitting the ODB multiple
>> times for the same SHA-1.
>>
>> This patch handles n=2 and n=3 cases and simply copies the
>> data rather than repeating the fill_tree_descriptor().
>>
>> ================
>> On the Windows repo (500K trees, 3.1M files, 450MB index),
>> this reduced the overall time by 0.75 seconds when cycling
>> between 2 commits with a single file difference.
>>
>> (avg) before: 22.699
>> (avg) after:  21.955
>> ================
>>
>> ================
>> Using the p0004-read-tree test (posted earlier this week)
>> with 1M files on Linux:
>>
>> before:
>> $ ./p0004-read-tree.sh
>> 0004.5: switch work1 work2 (1003037)      11.99(8.12+3.32)
>> 0004.6: switch commit aliases (1003037)   11.95(8.20+3.14)
>>
>> after:
>> $ ./p0004-read-tree.sh
>> 0004.5: switch work1 work2 (1003037)      11.02(7.71+2.78)
>> 0004.6: switch commit aliases (1003037)   10.95(7.57+2.82)
>> ================
>>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>  unpack-trees.c | 23 +++++++++++++++++++----
>>  1 file changed, 19 insertions(+), 4 deletions(-)
>>
>> diff --git a/unpack-trees.c b/unpack-trees.c
>> index 3a8ee19..143c5d9 100644
>> --- a/unpack-trees.c
>> +++ b/unpack-trees.c
>> @@ -531,6 +531,11 @@ static int switch_cache_bottom(struct traverse_info *info)
>>  	return ret;
>>  }
>>
>> +static inline int are_same_oid(struct name_entry *name_j, struct name_entry *name_k)
>> +{
>> +	return name_j->oid && name_k->oid && !oidcmp(name_j->oid, name_k->oid);
>> +}
>> +
>>  static int traverse_trees_recursive(int n, unsigned long dirmask,
>>  				    unsigned long df_conflicts,
>>  				    struct name_entry *names,
>> @@ -554,10 +559,20 @@ static int traverse_trees_recursive(int n, unsigned long dirmask,
>>  	newinfo.df_conflicts |= df_conflicts;
>>
>>  	for (i = 0; i < n; i++, dirmask >>= 1) {
>> -		const unsigned char *sha1 = NULL;
>> -		if (dirmask & 1)
>> -			sha1 = names[i].oid->hash;
>> -		buf[i] = fill_tree_descriptor(t+i, sha1);
>> +		if (i > 0 && are_same_oid(&names[i], &names[i-1])) {
>> +			/* implicitly borrow buf[i-1] inside tree_desc[i] */
>> +			memcpy(&t[i], &t[i-1], sizeof(struct tree_desc));
>
> An assignment would be simpler:
>
> 	t[i] = t[i - 1];

True, but this might be a coin toss.  Maybe we should
see what the generated assembly looks like for each ??

>
>> +			buf[i] = NULL;
>> +		} else if (i > 1 && are_same_oid(&names[i], &names[i-2])) {
>> +			/* implicitly borrow buf[i-2] inside tree_desc[i] */
>> +			memcpy(&t[i], &t[i-2], sizeof(struct tree_desc));
>
> Similar case.
>
>> +			buf[i] = NULL;
>
> What makes the previous two entries special, or differently put: Why not
> just check *all* previous entries?  MAX_UNPACK_TREES is 8; the number of
> comparisons would just about double in the worst case and stay the same
> for three trees or less.  The order of four trees or more wouldn't
> matter anymore.

I tried to hit the common cases.  This loop runs a lot
and I didn't want to put an O(n^2) thing in there to
look for any matching peer.  Most of the time we are
in a simple 2 or 3 way effort.  I didn't want to pay
for the looping/branching overhead for the obscure [4..8]
efforts.

>
>> +		} else {
>> +			const unsigned char *sha1 = NULL;
>> +			if (dirmask & 1)
>> +				sha1 = names[i].oid->hash;
>> +			buf[i] = fill_tree_descriptor(t+i, sha1);
>> +		}
>>  	}
>>
>>  	bottom = switch_cache_bottom(&newinfo);

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout
  2017-04-10 21:26 ` Jeff Hostetler
@ 2017-04-10 23:09   ` René Scharfe
  2017-04-11 20:42     ` Jeff Hostetler
  0 siblings, 1 reply; 8+ messages in thread
From: René Scharfe @ 2017-04-10 23:09 UTC (permalink / raw)
  To: Jeff Hostetler, git; +Cc: gitster, peff, Jeff Hostetler

Am 10.04.2017 um 23:26 schrieb Jeff Hostetler:
> On 4/8/2017 10:06 AM, René Scharfe wrote:
>> Am 07.04.2017 um 17:53 schrieb git@jeffhostetler.com:
>>> +			/* implicitly borrow buf[i-1] inside tree_desc[i] */
>>> +			memcpy(&t[i], &t[i-1], sizeof(struct tree_desc));
>>
>> An assignment would be simpler:
>>
>> 	t[i] = t[i - 1];
>
> True, but this might be a coin toss.  Maybe we should
> see what the generated assembly looks like for each ??

Clang, GCC and ICC inline that memcpy call; the assembly output is the
same for both variants: https://godbolt.org/g/1q0YwK.  I guess you worry
about compilers that are just bad at struct assignments (i.e. worse than
calling memcpy)?  Do you have examples (just curious)?

Assignments are easier on the eye of human readers in any case, and
there is no way to silently get the size wrong.

> I tried to hit the common cases.  This loop runs a lot
> and I didn't want to put an O(n^2) thing in there to
> look for any matching peer.  Most of the time we are
> in a simple 2 or 3 way effort.  I didn't want to pay
> for the looping/branching overhead for the obscure [4..8]
> efforts.

Makes sense, and it's a nice heuristic.  Perhaps it would be a good idea
to document these choices in a comment?

René

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout
  2017-04-10 23:09 ` René Scharfe
@ 2017-04-11 20:42   ` Jeff Hostetler
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Hostetler @ 2017-04-11 20:42 UTC (permalink / raw)
  To: René Scharfe, git; +Cc: gitster, peff, Jeff Hostetler

On 4/10/2017 7:09 PM, René Scharfe wrote:
> Am 10.04.2017 um 23:26 schrieb Jeff Hostetler:
>> On 4/8/2017 10:06 AM, René Scharfe wrote:
>>> Am 07.04.2017 um 17:53 schrieb git@jeffhostetler.com:
>>>> +			/* implicitly borrow buf[i-1] inside tree_desc[i] */
>>>> +			memcpy(&t[i], &t[i-1], sizeof(struct tree_desc));
>>>
>>> An assignment would be simpler:
>>>
>>> 	t[i] = t[i - 1];
>>
>> True, but this might be a coin toss.  Maybe we should
>> see what the generated assembly looks like for each ??
>
> Clang, GCC and ICC inline that memcpy call; the assembly output is the
> same for both variants: https://godbolt.org/g/1q0YwK.  I guess you worry
> about compilers that are just bad at struct assignments (i.e. worse than
> calling memcpy)?  Do you have examples (just curious)?

Nice website!  Really!

Yes, my concern was that structure copies would be done field by
field rather than as a block copy.  No, I don't have any examples
-- maybe just some very old brain cells.... :-)

And I just checked VS2015: the structure copy is a few
instructions shorter, but roughly the same.

>
> Assignments are easier on the eye of human readers in any case, and
> there is no way to silently get the size wrong.

Agreed.  Thanks.

>
>> I tried to hit the common cases.  This loop runs a lot
>> and I didn't want to put an O(n^2) thing in there to
>> look for any matching peer.  Most of the time we are
>> in a simple 2 or 3 way effort.  I didn't want to pay
>> for the looping/branching overhead for the obscure [4..8]
>> efforts.
>
> Makes sense, and it's a nice heuristic.  Perhaps it would be a good idea
> to document these choices in a comment?

Good point.  Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread
end of thread, other threads:[~2017-04-11 20:42 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-07 15:53 [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout git
2017-04-07 15:53 ` git
2017-04-08 14:06   ` René Scharfe
2017-04-10 20:55     ` Jeff King
2017-04-10 21:28       ` Jeff Hostetler
2017-04-10 21:26     ` Jeff Hostetler
2017-04-10 23:09       ` René Scharfe
2017-04-11 20:42         ` Jeff Hostetler