git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: John Cai via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, John Cai <johncai86@gmail.com>
Subject: Re: [PATCH] promisor-remote.c: use oidset for deduplication
Date: Fri, 14 Jan 2022 15:12:37 -0800
Message-ID: <xmqqv8yl9aje.fsf@gitster.g>
In-Reply-To: <xmqqzgnyb03z.fsf@gitster.g> (Junio C. Hamano's message of "Fri, 14 Jan 2022 11:14:56 -0800")

Junio C Hamano <gitster@pobox.com> writes:

> Also, if oid_array is used to produce a de-duplicated list of object
> names in the current code, it is very likely that oid_array is
> sorted (perhaps the objects are fed in sorted order), and the
> callers depend on the order of the objects they find in the array.
> Throwing a sorted list of object names at oidset and then iterating
> over what is in the oidset would likely destroy the original
> ordering.  I do not offhand know if the callers are broken by such a
> change (either correctness-wise or performance-wise).

Since I had a bit of downtime waiting for CI, I took a look.

It seems that the list of objects collected in the oidset/oid_array
is fed to "git fetch --stdin" for lazy fetching, so I would be
surprised if the order of objects matters.  So I would stop worrying
about correctness or performance due to ordering change.

>> But most importantly is it worth it? What's the rationale for the
>> change? Less CPU/memory use? Getting e.g. "hyperfine" or "/usr/bin/time
>> -v" output for those (if so) would be valuable.

But I still agree with this.  How much duplication would a typical
request for a lazy fetch involve?  Would the duplicate "want" lines
make the protocol exchange noticeably larger?

Having said all that, the change makes the code smaller, and
possibly easier to follow.

The primary reason why promisor-remote.c::remove_fetched_oids()
shrinks so much is that we used to scan the array and copy the
surviving ones into a new array, while the new code just iterates
over the oidset and removes the ones that were fetched.
I am assuming that mucking with the oidset's contents while you
are iterating over it is safe, but if that is not the case, then
the advantage of the smaller code disappears.
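
To illustrate the contrast between "copy the survivors" and "remove in
place", here is a rough, self-contained sketch on a plain array.  All
names in it (toy_oid, toy_copy_missing, toy_drop_fetched and the toy
predicate) are made up for illustration; it does not use git's
oid_array or oidset API, and it says nothing about whether removing
from an oidset during iteration is safe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object name; not git's struct object_id. */
struct toy_oid {
	unsigned char hash[4];
};

/* Hypothetical predicate: nonzero if the object is now available locally. */
typedef int (*toy_has_object_fn)(const struct toy_oid *oid);

/*
 * "Copy the survivors": scan the input and copy every object that is
 * still missing into a second array.  Returns the number copied.
 * The caller must provide room for up to nr entries in out.
 */
static size_t toy_copy_missing(const struct toy_oid *in, size_t nr,
			       struct toy_oid *out,
			       toy_has_object_fn has_object)
{
	size_t i, kept = 0;

	for (i = 0; i < nr; i++)
		if (!has_object(&in[i]))
			out[kept++] = in[i];
	return kept;
}

/*
 * "Remove in place": drop the fetched objects from the same array by
 * sliding the still-missing ones toward the front.  Returns the new
 * number of entries.
 */
static size_t toy_drop_fetched(struct toy_oid *oids, size_t nr,
			       toy_has_object_fn has_object)
{
	size_t i, kept = 0;

	for (i = 0; i < nr; i++) {
		if (has_object(&oids[i]))
			continue;
		if (kept != i)
			oids[kept] = oids[i];
		kept++;
	}
	assert(kept <= nr);
	return kept;
}

/* Toy predicate for demonstration: an even first byte means "fetched". */
static int toy_first_byte_even(const struct toy_oid *oid)
{
	return (oid->hash[0] % 2) == 0;
}
```

The second variant needs no scratch array and no copy-back step, which
is roughly where the line-count saving would come from, assuming the
container tolerates removal while it is being walked.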


