From: Junio C Hamano <gitster@pobox.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] apply: do not fetch when checking object existence
Date: Mon, 27 Jul 2020 18:19:38 -0700
Message-ID: <xmqqwo2oe8r9.fsf@gitster.c.googlers.com>
In-Reply-To: <20200728010403.95142-1-jonathantanmy@google.com> (Jonathan Tan's message of "Mon, 27 Jul 2020 18:04:03 -0700")

Jonathan Tan <jonathantanmy@google.com> writes:

> There have been a few bugs wherein Git fetches missing objects whenever
> the existence of an object is checked, even though it does not need to
> perform such a fetch. To resolve these bugs, we could look at all the
> places that has_object_file() (or a similar function) is used. As a
> first step, introduce a new function has_object() that checks for the
> existence of an object, with a default behavior of not fetching if the
> object is missing and the repository is a partial clone. As we verify
> each has_object_file() (or similar) usage, we can replace it with
> has_object(), and we will know that we are done when we can delete
> has_object_file() (and the other similar functions).

I wonder if we want to name the two (i.e. one variant that refuses
to go to network because it is trying to see if a lazy fetch is
needed, and the other that goes to network behind caller's back for
ease of use in a lazy clone) a bit more distinctly, so that it is
clear which one could potentially go outside.

Depending on one's view of which one is the _normal_ access pattern,
giving an explicit adverb to one variant while leaving the other one
bland might be sufficient.  For example, I _think_ most of the places do
not want to handle the details of lazily fetching themselves, and I
suspect that the traditional has_object_file() semantics without "do
not trigger lazy fetch" option would be the normal access pattern.

In which case, renaming your new "has_object" to something like
"has_object_locally()" would give a good name to a special case
codepath that wants to care---if the object does not exist locally
and needs to be obtained lazily from elsewhere, the function would
say "no".

And all the other names like has_object_file() that by default give
callers a transparent access to lazily fetched objects can stay the
same.
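
To make the distinction between the two variants concrete, here is a
minimal sketch in C. It is a toy model under stated assumptions, not
Git's object store: object names are plain strings, "struct toy_store"
and its fields are hypothetical, and the "fetch" is only a counter.
The function names follow the naming suggested above, but the
signatures are invented for illustration and are not the real API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

/* Hypothetical toy model; Git's real object store looks nothing like this. */
struct toy_store {
	const char *local[TOY_MAX];    /* objects already present locally */
	size_t nr_local;
	const char *promised[TOY_MAX]; /* objects the remote could supply */
	size_t nr_promised;
	int fetches;                   /* simulated network round trips */
};

static bool in_list(const char *const *list, size_t nr, const char *oid)
{
	for (size_t i = 0; i < nr; i++)
		if (!strcmp(list[i], oid))
			return true;
	return false;
}

/*
 * The "adverb" variant: never goes to the network.  If the object
 * would have to be obtained lazily, the answer is "no".
 */
bool has_object_locally(const struct toy_store *s, const char *oid)
{
	return in_list(s->local, s->nr_local, oid);
}

/*
 * The "bland" variant with the traditional semantics: on a local
 * miss, it goes to the network behind the caller's back.
 */
bool has_object_file(struct toy_store *s, const char *oid)
{
	if (has_object_locally(s, oid))
		return true;

	s->fetches++; /* the attempt costs a round trip even if it fails */
	if (!in_list(s->promised, s->nr_promised, oid))
		return false;

	assert(s->nr_local < TOY_MAX);
	s->local[s->nr_local++] = oid; /* object is now available locally */
	return true;
}
```

In this model, a caller that only wants to decide whether a lazy fetch
is needed would use has_object_locally() and leave the "fetches"
counter untouched, while every local miss in has_object_file() costs
a round trip, including misses for objects nobody can supply.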

> I mentioned the idea for this change here:
> https://lore.kernel.org/git/20200721225020.1352772-1-jonathantanmy@google.com/

Yup, I think that is going in a good direction.  I suspect that
apply will not be the only remaining case we need to "fix", and
using the new helper function, codepaths that have already been
"fixed" by passing "do not lazily fetch" option to the traditional
API functions would become easier to read.  And if that is the case,
let's have the introduction of the helper function as a separate
patch, with each of [PATCH 2-N/N] being a fix for a separate codepath.

Thanks.
