On Wed, Feb 23, 2022 at 09:18:05AM -0500, Derrick Stolee wrote:
> On 2/23/2022 7:35 AM, Patrick Steinhardt wrote:
> > When fetching from a remote repository we will by default write what has
> > been fetched into the special FETCH_HEAD reference. The order in which
> > references are written depends on whether the reference is for merge or
> > not, which, despite some other conditions, is also determined based on
> > whether the old object ID the reference is being updated from actually
> > exists in the repository.
> >
> > To write FETCH_HEAD we thus loop through all references thrice: once for
> > the references that are about to be merged, once for the references that
> > are not for merge, and finally for all references that are ignored. For
> > every iteration, we then look up the old object ID to determine whether
> > the referenced object exists so that we can label it as "not-for-merge"
> > if it doesn't exist. It goes without saying that this can be expensive
> > in case where we are fetching a lot of references.
> >
> > While this is hard to avoid in the case where we're writing FETCH_HEAD,
> > users can in fact ask us to skip this work via `--no-write-fetch-head`.
> > In that case, we do not care for the result of those lookups at all
> > because we don't have to order writes to FETCH_HEAD in the first place.
> >
> > Skip this busywork in case we're not writing to FETCH_HEAD. The
> > following benchmark performs a mirror-fetch in a repository with about
> > two million references:
> >
> >     Benchmark 1: git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD~)
> >       Time (mean ± σ):     75.388 s ±  1.942 s    [User: 71.103 s, System: 8.953 s]
> >       Range (min … max):   73.184 s … 76.845 s    3 runs
> >
> >     Benchmark 2: git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD)
> >       Time (mean ± σ):     69.486 s ±  1.016 s    [User: 65.941 s, System: 8.806 s]
> >       Range (min … max):   68.864 s … 70.659 s    3 runs
> >
> >     Summary
> >       'git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD)' ran
> >         1.08 ± 0.03 times faster than 'git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD~)'
>
> I have a super-small nitpick here.
>
> I see that you are using '-n' to name your experiments. These names
> are a bit long, especially since they are the same Git command but
> built at different commits. It would be enough to say the command
> you are testing before the stats and leave the names as "HEAD" and
> "HEAD~" (or, I typically use "new" and "old", respectively).

Fair enough, will change.

Patrick

> >  	/*
> > -	 * References in "refs/tags/" are often going to point
> > -	 * to annotated tags, which are not part of the
> > -	 * commit-graph. We thus only try to look up refs in
> > -	 * the graph which are not in that namespace to not
> > -	 * regress performance in repositories with many
> > -	 * annotated tags.
> > +	 * When writing FETCH_HEAD we need to determine whether
> > +	 * we already have the commit or not. If not, then the
> > +	 * reference is not for merge and needs to be written
> > +	 * to the reflog after other commits which we already
> > +	 * have. We're not interested in this property though
> > +	 * in case FETCH_HEAD is not to be updated, so we can
> > +	 * skip the classification in that case.
> >  	 */
> > -	if (!starts_with(rm->name, "refs/tags/"))
> > -		commit = lookup_commit_in_graph(the_repository, &rm->old_oid);
> > -	if (!commit) {
> > -		commit = lookup_commit_reference_gently(the_repository,
> > -							&rm->old_oid,
> > -							1);
> > -		if (!commit)
> > -			rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE;
> > +	if (fetch_head->fp) {
> > +		struct commit *commit = NULL;
> > +
> > +		/*
> > +		 * References in "refs/tags/" are often going to point
> > +		 * to annotated tags, which are not part of the
> > +		 * commit-graph. We thus only try to look up refs in
> > +		 * the graph which are not in that namespace to not
> > +		 * regress performance in repositories with many
> > +		 * annotated tags.
> > +		 */
> > +		if (!starts_with(rm->name, "refs/tags/"))
> > +			commit = lookup_commit_in_graph(the_repository, &rm->old_oid);
> > +		if (!commit) {
> > +			commit = lookup_commit_reference_gently(the_repository,
> > +								&rm->old_oid,
> > +								1);
> > +			if (!commit)
> > +				rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE;
> > +		}
> >  	}
> Looks good. Most of the diff is whitespace.
>
> Thanks,
> -Stolee
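
For readers skimming the thread, the change under discussion amounts to
guarding an expensive per-reference lookup behind a check for whether
FETCH_HEAD is going to be written at all. The following is only a minimal,
self-contained sketch of that pattern. The names `sketch_ref`,
`expensive_lookup`, `classify_refs` and `lookups_done` are hypothetical
stand-ins rather than Git's actual types or functions, and the object lookup
is simulated with a stored flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Git's real types. */
enum merge_status { FOR_MERGE, NOT_FOR_MERGE };

struct sketch_ref {
	const char *name;
	bool object_exists;       /* simulated result of an object lookup */
	enum merge_status status;
};

/* Counts how many simulated lookups were performed. */
size_t lookups_done;

/* Simulates the costly "do we already have this object?" check. */
bool expensive_lookup(const struct sketch_ref *ref)
{
	lookups_done++;
	return ref->object_exists;
}

/*
 * Label refs whose object is missing as not-for-merge, but only when
 * the result is needed, i.e. when FETCH_HEAD is going to be written.
 * With write_fetch_head == false no lookup happens at all.
 */
void classify_refs(struct sketch_ref *refs, size_t nr, bool write_fetch_head)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (write_fetch_head) {
			if (!expensive_lookup(&refs[i]))
				refs[i].status = NOT_FOR_MERGE;
		}
	}
}
```

Calling `classify_refs()` with `write_fetch_head` set to false leaves every
status untouched and performs zero lookups, which is the saving the quoted
benchmark is measuring in the `--no-write-fetch-head` case.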