git@vger.kernel.org mailing list mirror (one of many)
From: Patrick Steinhardt <ps@pks.im>
To: Michael Heemskerk <mheemskerk@atlassian.com>
Cc: Jiang Xin <worldhello.net@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Git List <git@vger.kernel.org>,
	Jiang Xin <zhiyou.jx@alibaba-inc.com>
Subject: Re: [PATCH 9/9] refs: reimplement refs_delete_refs() and run hook once
Date: Tue, 9 Aug 2022 13:05:46 +0200
Message-ID: <YvI/ipxGZ3rPZ/OQ@ncase>
In-Reply-To: <CAJDSCnMHHdYGeyXKj=ztUKBv2vRTn5BEXUR_7fAfATJxn_uwww@mail.gmail.com>


On Tue, Aug 02, 2022 at 02:42:01PM +0200, Michael Heemskerk wrote:
> Let me re-share some questions/suggestions/objections I got on a patch I
> shared with similar changes:
> https://lore.kernel.org/git/pull.1228.git.1651676435634.gitgitgadget@gmail.com/
> 
> There's a lot to like about the change; it fixes the incorrect invocation of
> the reference-transaction hooks when (bulk) deleting refs, but there is a
> down-side that Patrick pointed out. We never got to a satisfactory solution,
> so let me reshare his feedback to pick up the discussion.
> 
> Patrick:
> > I really like these changes given that they simplify things, but I
> > wonder whether we can do them. In the preimage we're eagerly removing
> > loose refs: any error encountered when deleting a reference is recorded,
> > but we keep on trying to remove the other refs, as well. With the new
> > behaviour we now create a single transaction for all refs and try to
> > commit it. This also means that we'll abort the transaction when locking
> > any of the refs fails, which is a change in behaviour.
> >
> > The current behaviour is explicitly documented in `refs.h:refs_delete_refs()`:
> >
> >     /*
> >      * Delete the specified references. If there are any problems, emit
> >      * errors but attempt to keep going (i.e., the deletes are not done in
> >      * an all-or-nothing transaction). msg and flags are passed through to
> >      * ref_transaction_delete().
> >      */
> >     int refs_delete_refs(struct ref_store *refs, const char *msg,
> >                          struct string_list *refnames, unsigned int flags);
> >
> > There are multiple callsites of this function via `delete_refs()`. Now
> > honestly, most of these callsites look somewhat broken:
> >
> >     - `bisect.c` simply does its best to clean up bisect state. This
> >       usecase looks fine to me.
> >
> >     - `builtin/branch.c` reports the branches as deleted even if
> >       `delete_refs()` failed.
> >
> >     - `builtin/remote.c` also misreports the deleted branches for the
> >       `prune` verb. The `rm` verb looks alright: if deletion of any
> >       branch failed then it doesn't prune the remote's config in the end
> >       and reports an error.
> >
> >     - `builtin/fetch.c` also misreports deleted branches with `--prune`.
> >
> > So most of these commands incorrectly handle the case where only a
> > subset of branches has been deleted. This raises the question whether
> > the interface provided by `refs_delete_refs()` is actually sensible if
> > it's so easy to get wrong. It doesn't even report which branches could
> > be removed and which couldn't. Furthermore, the question is whether new
> > backends like the reftable backend which write all refs into a single
> > slice would actually even be in a position to efficiently retain
> > semantics of this function.
> >
> > I'm torn. There are valid usecases for eagerly deleting refs even if a
> > subset of deletions failed, making this change a tough sell, but most of
> > the callsites don't actually handle this correctly in the first place.

Thanks a lot for revoicing my concerns here. I also agree that overall
the changes are very much what I'd love to have as they simplify the
implementation and fix the issues at the same time.

> At the time, the only solution I could see was to switch to
> transaction-per-ref semantics, but this results in bad performance when
> deleting tens of thousands of refs.
> 
> One option might be to optimistically try to delete the refs in a single
> transaction. If that fails for whatever reason and multiple ref deletions are
> requested, we could fall back to a transaction-per-ref approach. That'd keep
> the common case fast, and still provide best effort deletes.
> 
> Thoughts?

The biggest downside I can think of with this approach is that it's now
nondeterministic whether we run the hooks once for all references, or
once for all references plus once for every single reference we're about
to delete, e.g. when there was a racy deletion. That makes it hard to use
the hook in setups where we vote on reference updates, as racy behaviour
can cause different nodes to see different hook invocations.
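
To make that concrete, here is a rough, self-contained sketch of the
"optimistic batch, then per-ref fallback" idea. It is only a toy model
under stated assumptions, not Git's code: `struct fake_ref`,
`delete_in_one_transaction()`, `delete_with_fallback()` and the
`hook_runs` counter are all hypothetical names, and a "locked" flag
stands in for a concurrent process holding a ref lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a ref; this is not Git's real data model. */
struct fake_ref {
	const char *name;
	int locked;  /* non-zero: someone else holds the lock on this ref */
	int deleted; /* set once the ref has been removed */
};

/* Counts how many transactions the (imaginary) hook gets to observe. */
static int hook_runs;

/*
 * All-or-nothing deletion: if any ref cannot be locked, the whole
 * transaction is aborted and no ref is deleted.
 */
static int delete_in_one_transaction(struct fake_ref *refs, size_t nr)
{
	size_t i;

	hook_runs++; /* the hook sees this transaction, committed or aborted */
	for (i = 0; i < nr; i++)
		if (refs[i].locked)
			return -1;
	for (i = 0; i < nr; i++)
		refs[i].deleted = 1;
	return 0;
}

/*
 * Try a single transaction first; if that fails and more than one ref
 * was requested, retry with one transaction per ref (best effort).
 */
static int delete_with_fallback(struct fake_ref *refs, size_t nr)
{
	size_t i;
	int result = 0;

	assert(refs || !nr);
	if (!nr)
		return 0;
	if (!delete_in_one_transaction(refs, nr))
		return 0;
	if (nr < 2)
		return -1;
	for (i = 0; i < nr; i++)
		if (delete_in_one_transaction(&refs[i], 1))
			result = -1;
	return result;
}
```

In this model, deleting three unlocked refs bumps `hook_runs` once,
whereas the same request with one ref locked bumps it four times (one
aborted batch plus three single-ref transactions). Two nodes handling
the same request could thus disagree on what the hook saw, depending on
whether one of them hit a race.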

I'm still torn. Ideally, I'd just bite the bullet and say that
`refs_delete_refs()` is atomic insofar as it will only ever delete all
references or none, and is not a best-effort implementation. But that is
a change I'm still scared to make given that it sounds like an easy way
to run into regressions.

Unfortunately I still don't have an easy answer for how to properly fix
this, sorry :/

Patrick

> Cheers,
> Michael Heemskerk
> 
> On Fri, Jul 29, 2022 at 12:13 PM Jiang Xin <worldhello.net@gmail.com> wrote:
> >
> > From: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> >
> > When deleting references using "git branch -d" or "git tag -d", there
> > will be duplicate calls of "reference-transaction committed" for the
> > same refs. This is because "refs_delete_refs()" is called twice, once
> > for files-backend and once for packed-backend, and we used to reinvent
> > the wheel in "files_delete_refs()" and "packed_delete_refs()". By
> > removing "packed_delete_refs()" and reimplementing
> > "files_delete_refs()", the "reference-transaction" hook will run only
> > once for deleted branches and tags.
> >
> > The behavior of the following git commands and the last two testcases
> > have been fixed in t1416:
> >
> >  * git branch -d <branch>
> >  * git tag -d <tag>
> >
> > A testcase in t5510 is broken because we used to call the function
> > "packed_refs_lock()", but it is not necessary if the deleted reference
> > is not in the "packed-refs" file.
> >
> > Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> > ---
> >  refs/files-backend.c             | 21 ++++++-------
> >  refs/packed-backend.c            | 51 +-------------------------------
> >  t/t1416-ref-transaction-hooks.sh |  4 +--
> >  t/t5510-fetch.sh                 | 17 +++++++++++
> >  4 files changed, 29 insertions(+), 64 deletions(-)
> >
> > diff --git a/refs/files-backend.c b/refs/files-backend.c
> > index 8baea66e58..21426efaae 100644
> > --- a/refs/files-backend.c
> > +++ b/refs/files-backend.c
> > @@ -1268,31 +1268,27 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
> >  static int files_delete_refs(struct ref_store *ref_store, const char *msg,
> >                              struct string_list *refnames, unsigned int flags)
> >  {
> > -       struct files_ref_store *refs =
> > -               files_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
> > +       struct ref_transaction *transaction;
> >         struct strbuf err = STRBUF_INIT;
> >         int i, result = 0;
> >
> >         if (!refnames->nr)
> >                 return 0;
> >
> > -       if (packed_refs_lock(refs->packed_ref_store, 0, &err))
> > -               goto error;
> > -
> > -       if (refs_delete_refs(refs->packed_ref_store, msg, refnames, flags)) {
> > -               packed_refs_unlock(refs->packed_ref_store);
> > +       transaction = ref_store_transaction_begin(ref_store, &err);
> > +       if (!transaction)
> >                 goto error;
> > -       }
> > -
> > -       packed_refs_unlock(refs->packed_ref_store);
> >
> >         for (i = 0; i < refnames->nr; i++) {
> >                 const char *refname = refnames->items[i].string;
> > -
> > -               if (refs_delete_ref(&refs->base, msg, refname, NULL, flags))
> > +               if (ref_transaction_delete(transaction, refname, NULL,
> > +                                          flags, msg, &err))
> >                         result |= error(_("could not remove reference %s"), refname);
> >         }
> > +       if (ref_transaction_commit(transaction, &err))
> > +               goto error;
> >
> > +       ref_transaction_free(transaction);
> >         strbuf_release(&err);
> >         return result;
> >
> > @@ -1309,6 +1305,7 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg,
> >         else
> >                 error(_("could not delete references: %s"), err.buf);
> >
> > +       ref_transaction_free(transaction);
> >         strbuf_release(&err);
> >         return -1;
> >  }
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index 97b6837767..fdb7a0a52c 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -1519,55 +1519,6 @@ static int packed_initial_transaction_commit(struct ref_store *ref_store,
> >         return ref_transaction_commit(transaction, err);
> >  }
> >
> > -static int packed_delete_refs(struct ref_store *ref_store, const char *msg,
> > -                            struct string_list *refnames, unsigned int flags)
> > -{
> > -       struct packed_ref_store *refs =
> > -               packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
> > -       struct strbuf err = STRBUF_INIT;
> > -       struct ref_transaction *transaction;
> > -       struct string_list_item *item;
> > -       int ret;
> > -
> > -       (void)refs; /* We need the check above, but don't use the variable */
> > -
> > -       if (!refnames->nr)
> > -               return 0;
> > -
> > -       /*
> > -        * Since we don't check the references' old_oids, the
> > -        * individual updates can't fail, so we can pack all of the
> > -        * updates into a single transaction.
> > -        */
> > -
> > -       transaction = ref_store_transaction_begin(ref_store, &err);
> > -       if (!transaction)
> > -               return -1;
> > -
> > -       for_each_string_list_item(item, refnames) {
> > -               if (ref_transaction_delete(transaction, item->string, NULL,
> > -                                          flags, msg, &err)) {
> > -                       warning(_("could not delete reference %s: %s"),
> > -                               item->string, err.buf);
> > -                       strbuf_reset(&err);
> > -               }
> > -       }
> > -
> > -       ret = ref_transaction_commit(transaction, &err);
> > -
> > -       if (ret) {
> > -               if (refnames->nr == 1)
> > -                       error(_("could not delete reference %s: %s"),
> > -                             refnames->items[0].string, err.buf);
> > -               else
> > -                       error(_("could not delete references: %s"), err.buf);
> > -       }
> > -
> > -       ref_transaction_free(transaction);
> > -       strbuf_release(&err);
> > -       return ret;
> > -}
> > -
> >  static int packed_pack_refs(struct ref_store *ref_store, unsigned int flags)
> >  {
> >         /*
> > @@ -1595,7 +1546,7 @@ struct ref_storage_be refs_be_packed = {
> >
> >         .pack_refs = packed_pack_refs,
> >         .create_symref = NULL,
> > -       .delete_refs = packed_delete_refs,
> > +       .delete_refs = NULL,
> >         .rename_ref = NULL,
> >         .copy_ref = NULL,
> >
> > diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> > index df75e5727c..f64166f9d7 100755
> > --- a/t/t1416-ref-transaction-hooks.sh
> > +++ b/t/t1416-ref-transaction-hooks.sh
> > @@ -744,7 +744,7 @@ test_expect_success "branch: rename branches" '
> >         test_cmp_heads_and_tags -C workdir expect
> >  '
> >
> > -test_expect_failure "branch: remove branches" '
> > +test_expect_success "branch: remove branches" '
> >         test_when_finished "rm -f $HOOK_OUTPUT" &&
> >
> >         cat >expect <<-EOF &&
> > @@ -873,7 +873,7 @@ test_expect_success "tag: update refs to create loose refs" '
> >         test_cmp_heads_and_tags -C workdir expect
> >  '
> >
> > -test_expect_failure "tag: remove tags with mixed ref_stores" '
> > +test_expect_success "tag: remove tags with mixed ref_stores" '
> >         test_when_finished "rm -f $HOOK_OUTPUT" &&
> >
> >         cat >expect <<-EOF &&
> > diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
> > index b45879a760..22de7ac9ec 100755
> > --- a/t/t5510-fetch.sh
> > +++ b/t/t5510-fetch.sh
> > @@ -168,6 +168,8 @@ test_expect_success REFFILES 'fetch --prune fails to delete branches' '
> >         cd "$D" &&
> >         git clone . prune-fail &&
> >         cd prune-fail &&
> > +       git update-ref refs/remotes/origin/extrabranch main~ &&
> > +       git pack-refs --all &&
> >         git update-ref refs/remotes/origin/extrabranch main &&
> >         : this will prevent --prune from locking packed-refs for deleting refs, but adding loose refs still succeeds  &&
> >         >.git/packed-refs.new &&
> > @@ -175,6 +177,21 @@ test_expect_success REFFILES 'fetch --prune fails to delete branches' '
> >         test_must_fail git fetch --prune origin
> >  '
> >
> > +test_expect_success REFFILES 'fetch --prune ok for loose refs not in locked packed-refs' '
> > +       test_when_finished "cd \"$D\"; rm -rf \"prune-ok-ref-not-packed\"" &&
> > +       cd "$D" &&
> > +       git clone . prune-ok-ref-not-packed &&
> > +       (
> > +               cd prune-ok-ref-not-packed &&
> > +               git update-ref refs/remotes/origin/extrabranch main &&
> > +               : for loose refs not in packed-refs, we can delete them even the packed-refs is locked &&
> > +               :>.git/packed-refs.new &&
> > +
> > +               git fetch --prune origin &&
> > +               test_must_fail git rev-parse refs/remotes/origin/extrabranch --
> > +       )
> > +'
> > +
> >  test_expect_success 'fetch --atomic works with a single branch' '
> >         test_when_finished "rm -rf \"$D\"/atomic" &&
> >
> > --
> > 2.36.1.25.gc87d5ad63a.dirty
> >


