git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
Date: Mon, 31 Oct 2022 15:21:34 +0100	[thread overview]
Message-ID: <Y1/Z7q+56Mu+rmAX@ncase> (raw)
In-Reply-To: <221028.867d0k7yny.gmgdl@evledraar.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4559 bytes --]

On Fri, Oct 28, 2022 at 05:01:58PM +0200, Ævar Arnfjörð Bjarmason wrote:
> On Fri, Oct 28 2022, Patrick Steinhardt wrote:
[sinp]
> > Unfortunately, this change comes with a performance hit when refs are
> > not hidden. Executed in the same repository:
> >
> >     Benchmark 1: main
> >       Time (mean ± σ):     45.780 s ±  0.507 s    [User: 46.908 s, System: 4.838 s]
> >       Range (min … max):   45.453 s … 46.364 s    3 runs
> >
> >     Benchmark 2: pks-connectivity-check-hide-refs
> >       Time (mean ± σ):     49.886 s ±  0.282 s    [User: 51.168 s, System: 5.015 s]
> >       Range (min … max):   49.589 s … 50.149 s    3 runs
> >
> >     Summary
> >       'main' ran
> >         1.09 ± 0.01 times faster than 'pks-connectivity-check-hide-refs'
> >
> > This is probably caused by the overhead of reachable tips being passed
> > in via git-rev-list(1)'s standard input, which seems to be slower than
> > reading the references from disk.
> >
> > It is debatable what to do about this. If this were only about improving
> > performance then it would be trivial to make the new logic depend on
> > whether or not `transfer.hideRefs` has been configured in the repo. But
> > as explained this is also about correctness, even though this can be
> > considered an edge case. Furthermore, this slowdown is really only
> > noticeable in outliers like the above repository with an unreasonable
> > amount of refs. The same benchmark in linux-stable.git with about
> > 4500 references shows no measurable difference:
> 
> Do we have a test that would start failing if we changed the behavior?
> Perhaps such a test is peeking too much behind the curtain, but if it's
> easy come up with one I think it would be most welcome to have it
> alongside this.  to have exposes

We have tests that verify that we indeed detect missing objects in
t5504. But what we're lacking is tests that verify that we stop walking
at the boundary of preexisting objects, and I honestly wouldn't quite
know how to do that as there is no functional difference, but really
only a performance issue if we overwalked.

> > -static void write_head_info(void)
> > +static void write_head_info(struct oidset *announced_objects)
> >  {
> > -	static struct oidset seen = OIDSET_INIT;
> > -
> > -	for_each_ref(show_ref_cb, &seen);
> > -	for_each_alternate_ref(show_one_alternate_ref, &seen);
> > -	oidset_clear(&seen);
> > +	for_each_ref(show_ref_cb, announced_objects);
> > +	for_each_alternate_ref(show_one_alternate_ref, announced_objects);
> >  	if (!sent_capabilities)
> >  		show_ref("capabilities^{}", null_oid());
> 
> Nit: The variable rename stands out slightly,
> i.e. s/&seen/announced_objects/ not s/&seen/seen/, especially as:
> 
> >  static void execute_commands(struct command *commands,
> >  			     const char *unpacker_error,
> >  			     struct shallow_info *si,
> > -			     const struct string_list *push_options)
> > +			     const struct string_list *push_options,
> > +			     struct oidset *announced_oids)
> 
> Here we have the same variable, but now it's *_oids, not *objects.

Hm. I think that `announced_oids` is easier to understand compared to
`seen`, so I'd prefer to keep the rename. But I'll definitely make this
consistent so we use `announced_oids` in both places.

[snip]
> > +static const struct object_id *iterate_announced_oids(void *cb_data)
> > +{
> > +	struct oidset_iter *iter = cb_data;
> > +	return oidset_iter_next(iter);
> > +}
> > +
> 
> Is just used as (from 1/2):
> 
> > +	if (opt->reachable_oids_fn) {
> > +		const struct object_id *reachable_oid;
> > +		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
> > +			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
> > +				break;
> > +	}
> 
> After doing above:
> 
> > +	if (oidset_size(announced_oids) != 0) {
> > +		oidset_iter_init(announced_oids, &announced_oids_iter);
> > +		opt.reachable_oids_fn = iterate_announced_oids;
> > +		opt.reachable_oids_data = &announced_oids_iter;
> > +	}
> 
> But I don't see the reason for the indirection, but maybe I'm missing
> something obvious.
> 
> Why not just pass the oidset itself and have connected.c iterate through
> it, rather than going thorugh this callback / data indirection?

This is done to stay consistent with the way new tips are passed in via
the `oid_iterate_fn`. I'm happy to change callers to just directly pass
a `struct oidset *` though.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2022-10-31 14:21 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
2022-10-28 14:42 ` [PATCH 1/2] connected: allow supplying different view of reachable objects Patrick Steinhardt
2022-10-28 14:54   ` Ævar Arnfjörð Bjarmason
2022-10-28 18:12   ` Junio C Hamano
2022-10-30 18:49     ` Taylor Blau
2022-10-31 13:10     ` Patrick Steinhardt
2022-11-01  1:16       ` Taylor Blau
2022-10-28 14:42 ` [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
2022-10-28 15:01   ` Ævar Arnfjörð Bjarmason
2022-10-31 14:21     ` Patrick Steinhardt [this message]
2022-10-31 15:36       ` Ævar Arnfjörð Bjarmason
2022-10-30 19:09   ` Taylor Blau
2022-10-31 14:45     ` Patrick Steinhardt
2022-11-01  1:28       ` Taylor Blau
2022-11-01  7:20         ` Patrick Steinhardt
2022-11-01 11:53           ` Patrick Steinhardt
2022-11-02  1:05             ` Taylor Blau
2022-11-01  8:28       ` Jeff King
2022-10-28 16:40 ` [PATCH 0/2] " Junio C Hamano
2022-11-01  1:30 ` Taylor Blau
2022-11-01  9:00 ` Jeff King
2022-11-01 11:49   ` Patrick Steinhardt
2022-11-03 14:37 ` [PATCH v2 0/3] receive-pack: only use visible refs for " Patrick Steinhardt
2022-11-03 14:37   ` [PATCH v2 1/3] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-03 14:37   ` [PATCH v2 2/3] revision: add new parameter to specify all visible refs Patrick Steinhardt
2022-11-05 12:46     ` Jeff King
2022-11-07  8:20       ` Patrick Steinhardt
2022-11-08 14:32         ` Jeff King
2022-11-05 12:55     ` Jeff King
2022-11-03 14:37   ` [PATCH v2 3/3] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-05  0:40   ` [PATCH v2 0/3] " Taylor Blau
2022-11-05 12:55     ` Jeff King
2022-11-05 12:52   ` Jeff King
2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 2/6] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-07 12:51     ` Ævar Arnfjörð Bjarmason
2022-11-08  9:11       ` Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-07 13:34     ` Ævar Arnfjörð Bjarmason
2022-11-07 17:07       ` Ævar Arnfjörð Bjarmason
2022-11-08  9:48         ` Patrick Steinhardt
2022-11-08  9:22       ` Patrick Steinhardt
2022-11-08  0:57     ` Taylor Blau
2022-11-08  8:16       ` Patrick Steinhardt
2022-11-08 14:42         ` Jeff King
2022-11-07 12:16   ` [PATCH v3 5/6] revparse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-08 14:44     ` Jeff King
2022-11-07 12:16   ` [PATCH v3 6/6] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-08  0:59   ` [PATCH v3 0/6] " Taylor Blau
2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-08 13:36     ` Ævar Arnfjörð Bjarmason
2022-11-08 14:49       ` Patrick Steinhardt
2022-11-08 14:51     ` Jeff King
2022-11-08 10:03   ` [PATCH v4 2/6] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-08 15:07     ` Jeff King
2022-11-08 21:13       ` Taylor Blau
2022-11-11  5:48       ` Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 5/6] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-08 10:04   ` [PATCH v4 6/6] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
2022-11-11  6:49   ` [PATCH v5 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
2022-11-11  6:49   ` [PATCH v5 2/7] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 3/7] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 4/7] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 5/7] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 6/7] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 7/7] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-11 22:18   ` [PATCH v5 0/7] " Taylor Blau
2022-11-15 17:26     ` Jeff King
2022-11-16 21:22       ` Taylor Blau
2022-11-16 22:04         ` Jeff King
2022-11-16 22:33           ` Taylor Blau
2022-11-17  5:45             ` Patrick Steinhardt
2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 2/7] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 3/7] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 4/7] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 5/7] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-17  5:47   ` [PATCH v6 6/7] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-17  5:47   ` [PATCH v6 7/7] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-17 15:03   ` [PATCH v6 0/7] " Jeff King
2022-11-17 21:24     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y1/Z7q+56Mu+rmAX@ncase \
    --to=ps@pks.im \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).