From: Michael Heemskerk <mheemskerk@atlassian.com>
To: Jiang Xin <worldhello.net@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
Patrick Steinhardt <ps@pks.im>, Git List <git@vger.kernel.org>,
Jiang Xin <zhiyou.jx@alibaba-inc.com>
Subject: Re: [PATCH 9/9] refs: reimplement refs_delete_refs() and run hook once
Date: Tue, 2 Aug 2022 14:42:01 +0200
Message-ID: <CAJDSCnMHHdYGeyXKj=ztUKBv2vRTn5BEXUR_7fAfATJxn_uwww@mail.gmail.com>
In-Reply-To: <20220729101245.6469-10-worldhello.net@gmail.com>

Let me re-share some questions, suggestions and objections I received on a
patch I shared with similar changes:

https://lore.kernel.org/git/pull.1228.git.1651676435634.gitgitgadget@gmail.com/

There's a lot to like about the change: it fixes the incorrect invocation of
the reference-transaction hooks when (bulk) deleting refs. But there is a
downside that Patrick pointed out. We never reached a satisfactory solution,
so let me reshare his feedback to pick the discussion back up.

Patrick:
> I really like these changes given that they simplify things, but I
> wonder whether we can do them. In the preimage we're eagerly removing
> loose refs: any error encountered when deleting a reference is recorded,
> but we keep on trying to remove the other refs, as well. With the new
> behaviour we now create a single transaction for all refs and try to
> commit it. This also means that we'll abort the transaction when locking
> any of the refs fails, which is a change in behaviour.
>
> The current behaviour is explicitly documented in `refs.h:refs_delete_refs()`:
>
>     /*
>      * Delete the specified references. If there are any problems, emit
>      * errors but attempt to keep going (i.e., the deletes are not done in
>      * an all-or-nothing transaction). msg and flags are passed through to
>      * ref_transaction_delete().
>      */
>     int refs_delete_refs(struct ref_store *refs, const char *msg,
>                          struct string_list *refnames, unsigned int flags);
>
> There are multiple callsites of this function via `delete_refs()`. Now
> honestly, most of these callsites look somewhat broken:
>
>     - `bisect.c` simply does its best to clean up bisect state. This
>       usecase looks fine to me.
>
>     - `builtin/branch.c` reports the branches as deleted even if
>       `delete_refs()` failed.
>
>     - `builtin/remote.c` also misreports the deleted branches for the
>       `prune` verb. The `rm` verb looks alright: if deletion of any
>       branch failed then it doesn't prune the remote's config in the end
>       and reports an error.
>
>     - `builtin/fetch.c` also misreports deleted branches with `--prune`.
>
> So most of these commands incorrectly handle the case where only a
> subset of branches has been deleted. This raises the question whether
> the interface provided by `refs_delete_refs()` is actually sensible if
> it's so easy to get wrong. It doesn't even report which branches could
> be removed and which couldn't. Furthermore, the question is whether new
> backends like the reftable backend which write all refs into a single
> slice would actually even be in a position to efficiently retain
> semantics of this function.
>
> I'm torn. There are valid usecases for eagerly deleting refs even if a
> subset of deletions failed, making this change a tough sell, but most of
> the callsites don't actually handle this correctly in the first place.

At the time, the only solution I could see was to switch to
transaction-per-ref semantics, but that performs badly when deleting
tens of thousands of refs.

One option might be to optimistically try to delete all the refs in a single
transaction. If that fails for whatever reason and more than one ref deletion
was requested, we could fall back to a transaction-per-ref approach. That'd keep
the common case fast and still provide best-effort deletes.
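
A minimal sketch of that idea, using a toy in-memory ref store rather than
git's real refs API. Every name here (`toy_store`, `toy_delete_atomic`,
`toy_delete_refs`) is hypothetical, the `locked` flag merely stands in for a
ref whose lock cannot be taken, and the `commits` counter is an assumed
stand-in for how often a reference-transaction hook would see "committed":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these types or functions exist in git. */
#define TOY_MAX_REFS 16

struct toy_ref {
	const char *name;
	bool exists;
	bool locked;		/* simulates a ref that cannot be locked */
};

struct toy_store {
	struct toy_ref refs[TOY_MAX_REFS];
	size_t nr;
	unsigned commits;	/* number of committed transactions */
};

static struct toy_ref *toy_find(struct toy_store *s, const char *name)
{
	for (size_t i = 0; i < s->nr; i++)
		if (!strcmp(s->refs[i].name, name))
			return &s->refs[i];
	return NULL;
}

/* All-or-nothing: delete every named ref (return 0) or none (return -1). */
static int toy_delete_atomic(struct toy_store *s, const char **names, size_t n)
{
	/* "Prepare" phase: a single unlockable ref aborts the whole batch. */
	for (size_t i = 0; i < n; i++) {
		struct toy_ref *r = toy_find(s, names[i]);
		if (r && r->locked)
			return -1;
	}
	/* "Commit" phase: unknown refs are treated as a no-op. */
	for (size_t i = 0; i < n; i++) {
		struct toy_ref *r = toy_find(s, names[i]);
		if (r)
			r->exists = false;
	}
	s->commits++;
	return 0;
}

/*
 * Optimistic bulk delete with a per-ref fallback. On return, failed[i]
 * tells whether names[i] could not be deleted. Returns the failure count.
 */
static size_t toy_delete_refs(struct toy_store *s, const char **names,
			      size_t n, bool *failed)
{
	size_t nr_failed = 0;

	assert(s != NULL);
	if (!n)
		return 0;
	for (size_t i = 0; i < n; i++)
		failed[i] = false;

	/* Fast path: one transaction covering all refs. */
	if (!toy_delete_atomic(s, names, n))
		return 0;

	/* A single ref has nothing to fall back to. */
	if (n == 1) {
		failed[0] = true;
		return 1;
	}

	/* Slow path: best effort, one transaction per ref. */
	for (size_t i = 0; i < n; i++) {
		if (toy_delete_atomic(s, &names[i], 1)) {
			failed[i] = true;
			nr_failed++;
		}
	}
	return nr_failed;
}
```

In this model the fast path commits once, while the fallback commits once per
successfully deleted ref, so a hook would see several "committed" events in
the failure case. The `failed` array also shows one way a caller could learn
which deletions did not go through, which the current interface does not
report.
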
Thoughts?

Cheers,
Michael Heemskerk

On Fri, Jul 29, 2022 at 12:13 PM Jiang Xin <worldhello.net@gmail.com> wrote:
>
> From: Jiang Xin <zhiyou.jx@alibaba-inc.com>
>
> When deleting references using "git branch -d" or "git tag -d", there will
> be duplicate calls of "reference-transaction committed" for the same refs.
> This is because "refs_delete_refs()" is called twice, once for the
> files-backend and once for the packed-backend, and we used to reinvent the
> wheel in "files_delete_refs()" and "packed_delete_refs()". By removing
> "packed_delete_refs()" and reimplementing "files_delete_refs()", the
> "reference-transaction" hook will run only once for deleted branches and
> tags.
>
> The behavior of the following git commands and the last two testcases
> have been fixed in t1416:
>
> * git branch -d <branch>
> * git tag -d <tag>
>
> A testcase in t5510 is broken because we used to call the function
> "packed_refs_lock()", but it is not necessary if the deleted reference
> is not in the "packed-refs" file.
>
> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> ---
> refs/files-backend.c | 21 ++++++-------
> refs/packed-backend.c | 51 +-------------------------------
> t/t1416-ref-transaction-hooks.sh | 4 +--
> t/t5510-fetch.sh | 17 +++++++++++
> 4 files changed, 29 insertions(+), 64 deletions(-)
>
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 8baea66e58..21426efaae 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -1268,31 +1268,27 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
>  static int files_delete_refs(struct ref_store *ref_store, const char *msg,
>  			     struct string_list *refnames, unsigned int flags)
>  {
> -	struct files_ref_store *refs =
> -		files_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
> +	struct ref_transaction *transaction;
>  	struct strbuf err = STRBUF_INIT;
>  	int i, result = 0;
>
>  	if (!refnames->nr)
>  		return 0;
>
> -	if (packed_refs_lock(refs->packed_ref_store, 0, &err))
> -		goto error;
> -
> -	if (refs_delete_refs(refs->packed_ref_store, msg, refnames, flags)) {
> -		packed_refs_unlock(refs->packed_ref_store);
> +	transaction = ref_store_transaction_begin(ref_store, &err);
> +	if (!transaction)
>  		goto error;
> -	}
> -
> -	packed_refs_unlock(refs->packed_ref_store);
>
>  	for (i = 0; i < refnames->nr; i++) {
>  		const char *refname = refnames->items[i].string;
> -
> -		if (refs_delete_ref(&refs->base, msg, refname, NULL, flags))
> +		if (ref_transaction_delete(transaction, refname, NULL,
> +					   flags, msg, &err))
>  			result |= error(_("could not remove reference %s"), refname);
>  	}
> +	if (ref_transaction_commit(transaction, &err))
> +		goto error;
>
> +	ref_transaction_free(transaction);
>  	strbuf_release(&err);
>  	return result;
>
> @@ -1309,6 +1305,7 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg,
>  	else
>  		error(_("could not delete references: %s"), err.buf);
>
> +	ref_transaction_free(transaction);
>  	strbuf_release(&err);
>  	return -1;
>  }
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 97b6837767..fdb7a0a52c 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1519,55 +1519,6 @@ static int packed_initial_transaction_commit(struct ref_store *ref_store,
>  	return ref_transaction_commit(transaction, err);
>  }
>
> -static int packed_delete_refs(struct ref_store *ref_store, const char *msg,
> -			     struct string_list *refnames, unsigned int flags)
> -{
> -	struct packed_ref_store *refs =
> -		packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
> -	struct strbuf err = STRBUF_INIT;
> -	struct ref_transaction *transaction;
> -	struct string_list_item *item;
> -	int ret;
> -
> -	(void)refs; /* We need the check above, but don't use the variable */
> -
> -	if (!refnames->nr)
> -		return 0;
> -
> -	/*
> -	 * Since we don't check the references' old_oids, the
> -	 * individual updates can't fail, so we can pack all of the
> -	 * updates into a single transaction.
> -	 */
> -
> -	transaction = ref_store_transaction_begin(ref_store, &err);
> -	if (!transaction)
> -		return -1;
> -
> -	for_each_string_list_item(item, refnames) {
> -		if (ref_transaction_delete(transaction, item->string, NULL,
> -					   flags, msg, &err)) {
> -			warning(_("could not delete reference %s: %s"),
> -				item->string, err.buf);
> -			strbuf_reset(&err);
> -		}
> -	}
> -
> -	ret = ref_transaction_commit(transaction, &err);
> -
> -	if (ret) {
> -		if (refnames->nr == 1)
> -			error(_("could not delete reference %s: %s"),
> -			      refnames->items[0].string, err.buf);
> -		else
> -			error(_("could not delete references: %s"), err.buf);
> -	}
> -
> -	ref_transaction_free(transaction);
> -	strbuf_release(&err);
> -	return ret;
> -}
> -
>  static int packed_pack_refs(struct ref_store *ref_store, unsigned int flags)
>  {
>  	/*
> @@ -1595,7 +1546,7 @@ struct ref_storage_be refs_be_packed = {
>
>  	.pack_refs = packed_pack_refs,
>  	.create_symref = NULL,
> -	.delete_refs = packed_delete_refs,
> +	.delete_refs = NULL,
>  	.rename_ref = NULL,
>  	.copy_ref = NULL,
>
> diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> index df75e5727c..f64166f9d7 100755
> --- a/t/t1416-ref-transaction-hooks.sh
> +++ b/t/t1416-ref-transaction-hooks.sh
> @@ -744,7 +744,7 @@ test_expect_success "branch: rename branches" '
>  	test_cmp_heads_and_tags -C workdir expect
>  '
>
> -test_expect_failure "branch: remove branches" '
> +test_expect_success "branch: remove branches" '
>  	test_when_finished "rm -f $HOOK_OUTPUT" &&
>
>  	cat >expect <<-EOF &&
> @@ -873,7 +873,7 @@ test_expect_success "tag: update refs to create loose refs" '
>  	test_cmp_heads_and_tags -C workdir expect
>  '
>
> -test_expect_failure "tag: remove tags with mixed ref_stores" '
> +test_expect_success "tag: remove tags with mixed ref_stores" '
>  	test_when_finished "rm -f $HOOK_OUTPUT" &&
>
>  	cat >expect <<-EOF &&
> diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
> index b45879a760..22de7ac9ec 100755
> --- a/t/t5510-fetch.sh
> +++ b/t/t5510-fetch.sh
> @@ -168,6 +168,8 @@ test_expect_success REFFILES 'fetch --prune fails to delete branches' '
>  	cd "$D" &&
>  	git clone . prune-fail &&
>  	cd prune-fail &&
> +	git update-ref refs/remotes/origin/extrabranch main~ &&
> +	git pack-refs --all &&
>  	git update-ref refs/remotes/origin/extrabranch main &&
>  	: this will prevent --prune from locking packed-refs for deleting refs, but adding loose refs still succeeds &&
>  	>.git/packed-refs.new &&
> @@ -175,6 +177,21 @@ test_expect_success REFFILES 'fetch --prune fails to delete branches' '
>  	test_must_fail git fetch --prune origin
>  '
>
> +test_expect_success REFFILES 'fetch --prune ok for loose refs not in locked packed-refs' '
> +	test_when_finished "cd \"$D\"; rm -rf \"prune-ok-ref-not-packed\"" &&
> +	cd "$D" &&
> +	git clone . prune-ok-ref-not-packed &&
> +	(
> +		cd prune-ok-ref-not-packed &&
> +		git update-ref refs/remotes/origin/extrabranch main &&
> +		: for loose refs not in packed-refs, we can delete them even the packed-refs is locked &&
> +		:>.git/packed-refs.new &&
> +
> +		git fetch --prune origin &&
> +		test_must_fail git rev-parse refs/remotes/origin/extrabranch --
> +	)
> +'
> +
>  test_expect_success 'fetch --atomic works with a single branch' '
>  	test_when_finished "rm -rf \"$D\"/atomic" &&
>
> --
> 2.36.1.25.gc87d5ad63a.dirty
>