git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Jonathan Tan <jonathantanmy@google.com>,
	git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH v2] sha1-file: remove OBJECT_INFO_SKIP_CACHED
Date: Tue, 7 Jan 2020 06:22:13 -0500	[thread overview]
Message-ID: <20200107112213.GD1073219@coredump.intra.peff.net> (raw)
In-Reply-To: <20200106234753.GB92456@google.com>

On Mon, Jan 06, 2020 at 03:47:53PM -0800, Jonathan Nieder wrote:

> >> + * Callers are responsible for calling write_object_file to record the
> >> + * object in persistent storage before writing any other new objects
> >> + * that reference it.
> >> + */
> >>  int pretend_object_file(void *, unsigned long, enum object_type,
> >>  			struct object_id *oid);
> >
> > I think this is an improvement over the status quo, but it's still a
> > potential trap for code which happens to run in the same process (see my
> > other email in the thread).
> >
> > Should the message perhaps be even more scary?
> 
> A pet peeve of mine is warning volume escalation: if it becomes common
> for us to say
> 
>  * Warning: callers are reponsible for [...]
> 
> then new warnings trying to stand out might say
> 
>  * WARNING: callers are responsible for [...]
> 
> and then after we are desensitized to that, we may switch to
> 
>  * WARNING WARNING WARNING, not the usual blah-blah: callers are
> 
> and so on.  The main way I have found to counteract that is to make
> the "dangerous curve" markers context-specific enough that people
> don't learn to ignore them.  After all, sometimes a concurrency
> warning is important to me, at other times warnings about clarity may
> be what attract my interest, and so on.

I meant less about the number of capital letters, and more that we
should be saying "this interface is dangerous; don't use it". Because
it's not just "callers are responsible". It's "this can cause subtle
and hard-to-debug issues because it's writing to global state".

My preferred solution would actually be to rip it out entirely, but we'd
need some solution for git-blame, the sole caller. Possibly it could
insert the value straight into the diff_filespec. But according to the
thread that I linked earlier, I poked at that last year but it didn't
look trivial.

> I don't have a good suggestion here.  Perhaps "Callers are responsible
> for" is too slow and something more terse would help?
> 
>  /*
>   * Adds an object to the in-memory object store, without writing it
>   * to disk.
>   *
>   * Use with caution!  Unless you call write_object_file to record the
>   * in-memory object to persistent storage, any other new objects that
>   * reference it will point to a missing (in memory only) object,
>   * resulting in a corrupt repository.
>   */

Yeah, that's more what I had in mind.

> It would be even better if we have some automated way to catch this
> kind of issue.  Should tests run "git fsck" after each test?  Should
> write_object_file have a paranoid mode that checks integrity?
> 
> I don't know an efficient way to do that.  Ultimately I am comfortable
> counting on reviewers to be aware of this kind of pitfall.  While
> nonlocal invariants are always hard to maintain, this pitfall is
> inherent in the semantics of the function, so I am not too worried
> that reviewers will overlook it.

Yeah, given the scope of the problem (we have a single caller, and this
mechanism is over a decade old) I'm fine with review as the enforcement
mechanism, too.

> A less error-prone interface would tie the result of
> pretend_object_file to a short-lived overlay on the_repository without
> affecting global state.  We could even enforce read-only access in
> that overlay.  I don't think the "struct repository" interface and
> callers are ready for that yet, though.

I agree that would be better, though it's still kind-of global (in that
the repository object is effectively a global for most processes).

-Peff

      reply	other threads:[~2020-01-07 11:22 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-30 21:10 [PATCH] sha1-file: remove OBJECT_INFO_SKIP_CACHED Jonathan Tan
2019-12-30 21:43 ` Junio C Hamano
2019-12-30 22:01 ` Jonathan Nieder
2019-12-31  0:39   ` Jonathan Tan
2019-12-31  1:03     ` Jonathan Nieder
2020-01-02 20:15 ` Jonathan Tan
2020-01-02 20:16 ` [PATCH v2] " Jonathan Tan
2020-01-02 21:41   ` Junio C Hamano
2020-01-06 21:14     ` Jeff King
2020-01-04  0:13   ` Jonathan Nieder
2020-01-06 21:17     ` Jeff King
2020-01-06 23:47       ` Jonathan Nieder
2020-01-07 11:22         ` Jeff King [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200107112213.GD1073219@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).