From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-Status: No, score=-3.7 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_PASS, SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by dcvr.yhbt.net (Postfix) with ESMTP id 09E521F4B4 for ; Sat, 16 Jan 2021 02:29:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725933AbhAPC2O (ORCPT ); Fri, 15 Jan 2021 21:28:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49584 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725833AbhAPC2N (ORCPT ); Fri, 15 Jan 2021 21:28:13 -0500 Received: from mail-yb1-xb29.google.com (mail-yb1-xb29.google.com [IPv6:2607:f8b0:4864:20::b29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B34D8C061757 for ; Fri, 15 Jan 2021 18:27:32 -0800 (PST) Received: by mail-yb1-xb29.google.com with SMTP id k4so6883014ybp.6 for ; Fri, 15 Jan 2021 18:27:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=pt4cK42lH3OJorWwidHN4g8DeAietXydpGu2LBglP/M=; b=cmlybYubTz05fzjTgSNKyu2dh7KG0pmy7RbQABBtFYPrwnwz/JvEHl1ogI6YgjXK1/ 7UHIzeawUmjFIzJX5nEHJ/3D67uAiRVJbm4Stn2UMLr0LqNCXmG1H2d+XGSEaordCget DPjia3TClsNXw3wUfCwF9ORGtI2MdwdcuU6Cddwilz9GvJzsBisDNf4pIOWhke8QC4cT PLRuujDHDu0o0Xs350zFKqoKTTQi0mkEena64Xvck9Tw4usq3RJ1coP7q0deV2GXJQ/z S2aDYjYLOD5Va666Sn3LXUI+HdJWbyBhQyTvFeII2IKiUblKmPKj2Rl9dDuInFyVTA23 7lTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=pt4cK42lH3OJorWwidHN4g8DeAietXydpGu2LBglP/M=; b=LdcEz7JwcPI+BBdXvJrStHL0g8BI9B3k7lxZt3fgGizhEZgampJ0lGKBqNw7cE1SZI QTJwvGyibJ1IFlwN/OD2rOhvsc591i47PMp4YhI9oehj0RTIg7uBhb8boUrm1vvPAijA vu+c8TBYy4ndbrOXXaWVlL/MzDn56jvvoWIPUQkn/GczPpt2xOlzpLq/WNxUilW68c6R gQC2UM+vg2N42xwdB6asIkeFB0iZ5RHMnLPjEKi4kFY4f+yv+R7QFnqHMvaXvp48OApS vjoJ44xzjGEx+tAt9eb1mKXFgRkctqaRF9oDmQVmm/PehVviDkHLFuol6ngQaztDlU4h DJ+g== X-Gm-Message-State: AOAM531oXAqvrmSgA5/Go5T6azDr7GEaCd4r7sSwVL3by9ZxNvGJ6c1e +SOqlnIgZXsSYHCNO47gR/yBgCvaYVc/56sNz6HPXQt0y3c66Q== X-Google-Smtp-Source: ABdhPJxgzD7jPtBRPHvaZUGUFWYkGMdLY4mU9rXfZmH9TPyCefTS64yL5JFfshYyH6WTwaKkjZYAL5389Fn3QdTRijw= X-Received: by 2002:a25:bccf:: with SMTP id l15mr21775004ybm.272.1610764051229; Fri, 15 Jan 2021 18:27:31 -0800 (PST) MIME-Version: 1.0 References: <20190814024927.13167-1-phil.hord@gmail.com> In-Reply-To: From: Phil Hord Date: Fri, 15 Jan 2021 18:27:19 -0800 Message-ID: Subject: Re: [PATCH v2] use delete_refs when deleting tags or branches To: Elijah Newren Cc: Git , =?UTF-8?Q?Martin_=C3=85gren?= , Junio C Hamano Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Fri, Jan 15, 2021 at 10:43 AM Elijah Newren wrote: > > Hi, > > On Thu, Jan 14, 2021 at 6:00 PM Phil Hord wrote: > > > > I noticed this is still only in my local branch. Can I get an ACK/NAK? > > Sorry for missing this when you posted in August. Thanks for sending > in the update from v1. > > For other reviewers: v1 is over here: > https://lore.kernel.org/git/20190808035935.30023-1-phil.hord@gmail.com/, > and has review comments from Martin, me, Peff, and Junio. > > > On Tue, Aug 13, 2019 at 7:49 PM Phil Hord wrote: > >> > >> From: Phil Hord > >> > >> 'git tag -d' and 'git branch -d' both accept one or more refs to > >> delete, but each deletion is done by calling `delete_ref` on each argv. > >> This is very slow when removing from packed refs as packed-refs is > >> locked and rewritten each time. Use delete_refs instead so all the > >> removals can be done inside a single transaction with a single update. > > Awesome, thanks for also fixing up git branch with v2. > > >> Since delete_refs performs all the packed-refs delete operations > >> inside a single transaction, if any of the deletes fail then all > >> them will be skipped. In practice, none of them should fail since > >> we verify the hash of each one before calling delete_refs, but some > >> network error or odd permissions problem could have different results > >> after this change. > >> > >> Also, since the file-backed deletions are not performed in the same > >> transaction, those could succeed even when the packed-refs transaction > >> fails. > >> > >> After deleting refs, report the deletion's success only if the ref was > >> actually deleted. For branch deletion, remove the branch config only > >> if the branch ref is actually removed. > >> > >> A manual test deleting 24,000 tags took about 30 minutes using > >> delete_ref. It takes about 5 seconds using delete_refs. > > As I said on v1, it's really nice to have this fixed. Thanks for doing it. > > >> > >> Signed-off-by: Phil Hord > >> --- > >> This reroll adds the same delete_refs change to 'git branch'. It checks > >> individual refs after the operation to report correctly on each whether > >> it was successfully deleted or not. Maybe this is an unnecessary step, > >> though. This handles the weird case where some file system error > >> prevented us from deleting refs, leaving us with an error from > >> delete_refs but without any idea which refs might have been affected. > >> > >> builtin/branch.c | 50 +++++++++++++++++++++++++++++------------------- > >> builtin/tag.c | 45 +++++++++++++++++++++++++++++++++---------- > >> 2 files changed, 65 insertions(+), 30 deletions(-) > >> > >> diff --git a/builtin/branch.c b/builtin/branch.c > >> index 2ef214632f..2273239f41 100644 > >> --- a/builtin/branch.c > >> +++ b/builtin/branch.c > >> @@ -202,6 +202,9 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, > >> int remote_branch = 0; > >> struct strbuf bname = STRBUF_INIT; > >> unsigned allowed_interpret; > >> + struct string_list refs_to_delete = STRING_LIST_INIT_DUP; > >> + struct string_list_item *item; > >> + int refname_pos = 0; > >> > >> switch (kinds) { > >> case FILTER_REFS_REMOTES: > >> @@ -209,12 +212,13 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, > >> /* For subsequent UI messages */ > >> remote_branch = 1; > >> allowed_interpret = INTERPRET_BRANCH_REMOTE; > >> - > >> + refname_pos = 13; > >> force = 1; > >> break; > >> case FILTER_REFS_BRANCHES: > >> fmt = "refs/heads/%s"; > >> allowed_interpret = INTERPRET_BRANCH_LOCAL; > >> + refname_pos = 11; > >> break; > >> default: > >> die(_("cannot use -a with -d")); > >> @@ -265,30 +269,36 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, > >> goto next; > >> } > >> > >> - if (delete_ref(NULL, name, is_null_oid(&oid) ? NULL : &oid, > >> - REF_NO_DEREF)) { > >> - error(remote_branch > >> - ? _("Error deleting remote-tracking branch '%s'") > >> - : _("Error deleting branch '%s'"), > >> - bname.buf); > >> - ret = 1; > >> - goto next; > > The code used to set the return code to 1 if it failed to delete a branch > > >> - } > >> - if (!quiet) { > >> - printf(remote_branch > >> - ? _("Deleted remote-tracking branch %s (was %s).\n") > >> - : _("Deleted branch %s (was %s).\n"), > >> - bname.buf, > >> - (flags & REF_ISBROKEN) ? "broken" > >> - : (flags & REF_ISSYMREF) ? target > >> - : find_unique_abbrev(&oid, DEFAULT_ABBREV)); > >> - } > >> - delete_branch_config(bname.buf); > >> + item = string_list_append(&refs_to_delete, name); > >> + item->util = xstrdup((flags & REF_ISBROKEN) ? "broken" > >> + : (flags & REF_ISSYMREF) ? target > >> + : find_unique_abbrev(&oid, DEFAULT_ABBREV)); > >> > >> next: > >> free(target); > >> } > >> > >> + delete_refs(NULL, &refs_to_delete, REF_NO_DEREF); > >> + > >> + for_each_string_list_item(item, &refs_to_delete) { > >> + char * describe_ref = item->util; > >> + char * name = item->string; > >> + if (ref_exists(name)) > >> + ret = 1; > > Now it sets the return code if the branch still exists after trying to > delete. I thought that was subtly different...but I tried doing a > branch deletion of a non-existent branch since I thought that would be > the only difference -- however, that errors out earlier in the > codepath before even getting to the stage of deleting refs. So I > think these are effectively the same. > > >> + else { > >> + char * refname = name + refname_pos; > >> + if (!quiet) > >> + printf(remote_branch > >> + ? _("Deleted remote-tracking branch %s (was %s).\n") > >> + : _("Deleted branch %s (was %s).\n"), > >> + name + refname_pos, describe_ref); > > Neither remote_branch nor refname_pos are changing throughout this > loop, which I at first thought was in error, but it looks like git > branch only allows you to delete one type or the other -- not a > mixture. So this is correct. > > >> + > >> + delete_branch_config(refname); > >> + } > >> + free(describe_ref); > >> + } > >> + string_list_clear(&refs_to_delete, 0); > >> + > >> free(name); > >> strbuf_release(&bname); > >> > >> diff --git a/builtin/tag.c b/builtin/tag.c > >> index e0a4c25382..0d11ffcd04 100644 > >> --- a/builtin/tag.c > >> +++ b/builtin/tag.c > >> @@ -72,10 +72,10 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, > >> } > >> > >> typedef int (*each_tag_name_fn)(const char *name, const char *ref, > >> - const struct object_id *oid, const void *cb_data); > >> + const struct object_id *oid, void *cb_data); > >> > >> static int for_each_tag_name(const char **argv, each_tag_name_fn fn, > >> - const void *cb_data) > >> + void *cb_data) > >> { > >> const char **p; > >> struct strbuf ref = STRBUF_INIT; > >> @@ -97,18 +97,43 @@ static int for_each_tag_name(const char **argv, each_tag_name_fn fn, > >> return had_error; > >> } > >> > >> -static int delete_tag(const char *name, const char *ref, > >> - const struct object_id *oid, const void *cb_data) > >> +static int collect_tags(const char *name, const char *ref, > >> + const struct object_id *oid, void *cb_data) > >> { > >> - if (delete_ref(NULL, ref, oid, 0)) > >> - return 1; > > This used to return 1 if it failed to delete a ref. > > >> - printf(_("Deleted tag '%s' (was %s)\n"), name, > >> - find_unique_abbrev(oid, DEFAULT_ABBREV)); > >> + struct string_list *ref_list = cb_data; > >> + > >> + string_list_append(ref_list, ref); > >> + ref_list->items[ref_list->nr - 1].util = oiddup(oid); > >> return 0; > > Now it unconditionally returns 0. > > >> } > >> > >> +static int delete_tags(const char **argv) > >> +{ > >> + int result; > >> + struct string_list refs_to_delete = STRING_LIST_INIT_DUP; > >> + struct string_list_item *item; > >> + > >> + result = for_each_tag_name(argv, collect_tags, (void *)&refs_to_delete); > >> + delete_refs(NULL, &refs_to_delete, REF_NO_DEREF); > > You now only look at the result of collecting the tags, and ignore the > result of trying to delete them... > > >> + > >> + for_each_string_list_item(item, &refs_to_delete) { > >> + const char * name = item->string; > >> + struct object_id * oid = item->util; > >> + if (ref_exists(name)) > >> + result = 1; > > ...except that you check if the refs still exist afterward and set the > return code based on it. Like with the branch case, I can't come up > with a case where the difference matters. I suspect there's a race > condition there somewhere, but once you start going down that road I > think the old code may have had a bunch of races too. It might be > nice to document with a comment that there's a small race condition > with someone else trying to forcibly re-create the ref at the same > time you are trying to delete, but I don't think it's a big deal. > > If you did use the result of delete_refs(), you might have to double > check that the callers (git.c:handle_builtin() -> git.c:run_builtin() > -> builtin/tag.c:cmd_tag() -> builtin/tag.c:delete_tags()) are all > okay with the return code; it looks like handle_builtin() would pass > the return code to exit() and the git-tag manpage doesn't document the > return status, so you've at least got some leeway in terms of what > values are acceptable. Or you could just normalize the return value > of delete_refs() down to 0 or 1. But you'd only need to worry about > that if the race condition is something we're worried enough to > tackle. Interesting. I was worried about imposing a requirement on delete_refs that any non-zero return must mean that something was not deleted which should have been. Maybe that's not such a worry, though, and it would be acceptable to return a 1 even if all the refs were deleted even though some error occurred further down the line. I tried normalizing the value and then also verifying each ref was removed, but that seemed wrong. Maybe it's ok to just normalize it and not react to still-existing refs. > >> + else > >> + printf(_("Deleted tag '%s' (was %s)\n"), > >> + item->string + 10, > >> + find_unique_abbrev(oid, DEFAULT_ABBREV)); > >> + > >> + free(oid); > >> + } > >> + string_list_clear(&refs_to_delete, 0); > >> + return result; > >> +} > >> + > >> static int verify_tag(const char *name, const char *ref, > >> - const struct object_id *oid, const void *cb_data) > >> + const struct object_id *oid, void *cb_data) > >> { > >> int flags; > >> const struct ref_format *format = cb_data; > >> @@ -511,7 +536,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix) > >> if (filter.merge_commit) > >> die(_("--merged and --no-merged options are only allowed in list mode")); > >> if (cmdmode == 'd') > >> - return for_each_tag_name(argv, delete_tag, NULL); > >> + return delete_tags(argv); > >> if (cmdmode == 'v') { > >> if (format.format && verify_ref_format(&format)) > >> usage_with_options(git_tag_usage, options); > >> -- > >> 2.23.0.rc1.174.g4cc1b04b4c > > Overall, I like the patch. Peff commented on v1 that the basic idea > (use the part of the refs API that batches operations) is the right > thing to do. I'm not that familiar with refs-touching code, but your > patch makes sense to me. I think I spotted a minor issue (you ignore > the return status of delete_refs(), then later check the existence of > the refs afterwards to determine success, which I believe is a minor > and unlikely race condition), but I'm not sure it's worth fixing; > perhaps just mark it with #leftoverbits and move on -- the faster > branch and tag deletion is a very nice improvement. > > I notice Martin said on v1 that there was a testcase that had problems > with your patch; I tested v2 and it looks like you fixed any such > issues. I think you also addressed the feedback from Junio, though > his comments about the return code and the minor race condition I > noticed around it might mean it'd be good to get his comments. > > Anyway, > Acked-by: Elijah Newren > > I would say Reviewed-by, but I'd like to get Junio's comments on the > return code and minor race. Thanks for the detailed review and thoughts.