git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org, jrnieder@gmail.com, spearce@spearce.org
Subject: Re: [PATCH v3] fetch-pack: always allow fetching of literal SHA1s
Date: Thu, 11 May 2017 05:46:36 -0400	[thread overview]
Message-ID: <20170511094635.ljpwg3bxkqj2wmgc@sigill.intra.peff.net> (raw)
In-Reply-To: <20170510221157.8971-1-jonathantanmy@google.com>

On Wed, May 10, 2017 at 03:11:57PM -0700, Jonathan Tan wrote:

> After looking at ways to solve jrnieder's performance concerns, if we're
> going to need to manage one more item of state within the function, I
> might as well use my earlier idea of storing unmatched refs in its own
> list instead of immediately freeing them. This version of the patch
> should have much better performance characteristics.

Hrm. So the problem in your original was that the loop became quadratic
in the number of refs when fetching all of them (because the original
relies on the sorting to essentially do a list-merge). Are there any
quadratic bits left?

> @@ -649,6 +652,25 @@ static void filter_refs(struct fetch_pack_args *args,
>  
>  		if ((allow_unadvertised_object_request &
>  		    (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
> +			can_append = 1;
> +		} else {
> +			struct ref *u;
> +			/* Check all refs, including those already matched */
> +			for (u = unmatched; u; u = u->next) {
> +				if (!oidcmp(&ref->old_oid, &u->old_oid)) {
> +					can_append = 1;
> +					goto can_append;
> +				}
> +			}

This is inside the nr_sought loop. So if I were to do:

  git fetch origin $(git ls-remote origin | awk '{print $1}')

we're back to being quadratic. I realize that's probably a silly thing
to do, but in the general case, you're O(m*n), where "n" is number of
unmatched remote refs and "m" is the number of SHA-1s you're looking
for.

Doing better would require either sorting both lists, or storing the
oids in something that has better than linear-time lookup.  Perhaps a
sha1_array or an oidset? We don't actually need to know anything about
the unmatched refs after the first loop. We just need the list of oids
that let us do can_append.

AIUI, you could also avoid creating the unmatched list entirely when the
server advertises tip/reachable sha1s. That's a small optimization, but
I think it may actually make the logic clearer.

-Peff

  parent reply	other threads:[~2017-05-11  9:46 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-09 18:20 [PATCH] fetch-pack: always allow fetching of literal SHA1s Jonathan Tan
2017-05-09 22:16 ` Jeff King
2017-05-10  4:22   ` Shawn Pearce
2017-05-10  4:33     ` Jeff King
2017-05-10  4:46       ` Mike Hommey
2017-05-10 17:50         ` Ævar Arnfjörð Bjarmason
2017-05-10 18:20           ` Jonathan Nieder
2017-05-10 18:48             ` Martin Fick
2017-05-10 18:54               ` Jonathan Nieder
2017-05-10  4:57       ` Shawn Pearce
2017-05-10 17:00       ` Jonathan Nieder
2017-05-10 18:55         ` Sebastian Schuberth
2017-05-11  9:59         ` Jeff King
2017-05-11 19:03           ` Jonathan Nieder
2017-05-11 21:04             ` Jeff King
2017-05-10 16:44 ` [PATCH v2] " Jonathan Tan
2017-05-10 18:01   ` Jonathan Nieder
2017-05-10 22:11 ` [PATCH v3] " Jonathan Tan
2017-05-10 23:22   ` Jonathan Nieder
2017-05-11  9:46   ` Jeff King [this message]
2017-05-11 17:51     ` Jonathan Tan
2017-05-11 20:52       ` Jeff King
2017-05-11 10:05   ` Jeff King
2017-05-11 17:00     ` Brandon Williams
2017-05-13  9:29       ` Jeff King
2017-05-11 21:14 ` [PATCH v4] " Jonathan Tan
2017-05-11 21:35   ` Jonathan Nieder
2017-05-11 21:59     ` Jeff King
2017-05-11 22:30 ` [PATCH v5] " Jonathan Tan
2017-05-11 22:46   ` Jonathan Nieder
2017-05-12  2:59     ` Jeff King
2017-05-12  6:01     ` Junio C Hamano
2017-05-12  7:59       ` Jeff King
2017-05-12  8:14         ` Jeff King
2017-05-12 18:00           ` Jonathan Tan
2017-05-13  8:30             ` Jeff King
2017-05-12 18:09         ` Jonathan Tan
2017-05-12 19:06           ` Jonathan Nieder
2017-05-12  3:06   ` Jeff King
2017-05-12 20:45 ` Jonathan Tan
2017-05-12 20:46 ` [PATCH v6] " Jonathan Tan
2017-05-12 22:28   ` Jonathan Nieder
2017-05-13  8:36   ` Jeff King
2017-05-15  1:26     ` Junio C Hamano
2017-05-15 17:32 ` [PATCH v7] " Jonathan Tan
2017-05-15 17:46   ` Jonathan Nieder
2017-05-15 22:10   ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170511094635.ljpwg3bxkqj2wmgc@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).