* [PATCH v2] use delete_refs when deleting tags or branches @ 2019-08-14 2:49 Phil Hord [not found] ` <CABURp0rtkmo7MSSCVrdNXT0UzV9XqV_kXOGkC23C+_vMENNJUg@mail.gmail.com> 0 siblings, 1 reply; 5+ messages in thread From: Phil Hord @ 2019-08-14 2:49 UTC (permalink / raw) To: git; +Cc: Martin Ågren, Elijah Newren, Junio C Hamano, Phil Hord From: Phil Hord <phil.hord@gmail.com> 'git tag -d' and 'git branch -d' both accept one or more refs to delete, but each deletion is done by calling `delete_ref` on each argv. This is very slow when removing from packed refs as packed-refs is locked and rewritten each time. Use delete_refs instead so all the removals can be done inside a single transaction with a single update. Since delete_refs performs all the packed-refs delete operations inside a single transaction, if any of the deletes fail then all them will be skipped. In practice, none of them should fail since we verify the hash of each one before calling delete_refs, but some network error or odd permissions problem could have different results after this change. Also, since the file-backed deletions are not performed in the same transaction, those could succeed even when the packed-refs transaction fails. After deleting refs, report the deletion's success only if the ref was actually deleted. For branch deletion, remove the branch config only if the branch ref is actually removed. A manual test deleting 24,000 tags took about 30 minutes using delete_ref. It takes about 5 seconds using delete_refs. Signed-off-by: Phil Hord <phil.hord@gmail.com> --- This reroll adds the same delete_refs change to 'git branch'. It checks individual refs after the operation to report correctly on each whether it was successfully deleted or not. Maybe this is an unnecessary step, though. This handles the weird case where some file system error prevented us from deleting refs, leaving us with an error from delete_refs but without any idea which refs might have been affected. builtin/branch.c | 50 +++++++++++++++++++++++++++++------------------- builtin/tag.c | 45 +++++++++++++++++++++++++++++++++---------- 2 files changed, 65 insertions(+), 30 deletions(-) diff --git a/builtin/branch.c b/builtin/branch.c index 2ef214632f..2273239f41 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -202,6 +202,9 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, int remote_branch = 0; struct strbuf bname = STRBUF_INIT; unsigned allowed_interpret; + struct string_list refs_to_delete = STRING_LIST_INIT_DUP; + struct string_list_item *item; + int refname_pos = 0; switch (kinds) { case FILTER_REFS_REMOTES: @@ -209,12 +212,13 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, /* For subsequent UI messages */ remote_branch = 1; allowed_interpret = INTERPRET_BRANCH_REMOTE; - + refname_pos = 13; force = 1; break; case FILTER_REFS_BRANCHES: fmt = "refs/heads/%s"; allowed_interpret = INTERPRET_BRANCH_LOCAL; + refname_pos = 11; break; default: die(_("cannot use -a with -d")); @@ -265,30 +269,36 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, goto next; } - if (delete_ref(NULL, name, is_null_oid(&oid) ? NULL : &oid, - REF_NO_DEREF)) { - error(remote_branch - ? _("Error deleting remote-tracking branch '%s'") - : _("Error deleting branch '%s'"), - bname.buf); - ret = 1; - goto next; - } - if (!quiet) { - printf(remote_branch - ? _("Deleted remote-tracking branch %s (was %s).\n") - : _("Deleted branch %s (was %s).\n"), - bname.buf, - (flags & REF_ISBROKEN) ? "broken" - : (flags & REF_ISSYMREF) ? target - : find_unique_abbrev(&oid, DEFAULT_ABBREV)); - } - delete_branch_config(bname.buf); + item = string_list_append(&refs_to_delete, name); + item->util = xstrdup((flags & REF_ISBROKEN) ? "broken" + : (flags & REF_ISSYMREF) ? target + : find_unique_abbrev(&oid, DEFAULT_ABBREV)); next: free(target); } + delete_refs(NULL, &refs_to_delete, REF_NO_DEREF); + + for_each_string_list_item(item, &refs_to_delete) { + char * describe_ref = item->util; + char * name = item->string; + if (ref_exists(name)) + ret = 1; + else { + char * refname = name + refname_pos; + if (!quiet) + printf(remote_branch + ? _("Deleted remote-tracking branch %s (was %s).\n") + : _("Deleted branch %s (was %s).\n"), + name + refname_pos, describe_ref); + + delete_branch_config(refname); + } + free(describe_ref); + } + string_list_clear(&refs_to_delete, 0); + free(name); strbuf_release(&bname); diff --git a/builtin/tag.c b/builtin/tag.c index e0a4c25382..0d11ffcd04 100644 --- a/builtin/tag.c +++ b/builtin/tag.c @@ -72,10 +72,10 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, } typedef int (*each_tag_name_fn)(const char *name, const char *ref, - const struct object_id *oid, const void *cb_data); + const struct object_id *oid, void *cb_data); static int for_each_tag_name(const char **argv, each_tag_name_fn fn, - const void *cb_data) + void *cb_data) { const char **p; struct strbuf ref = STRBUF_INIT; @@ -97,18 +97,43 @@ static int for_each_tag_name(const char **argv, each_tag_name_fn fn, return had_error; } -static int delete_tag(const char *name, const char *ref, - const struct object_id *oid, const void *cb_data) +static int collect_tags(const char *name, const char *ref, + const struct object_id *oid, void *cb_data) { - if (delete_ref(NULL, ref, oid, 0)) - return 1; - printf(_("Deleted tag '%s' (was %s)\n"), name, - find_unique_abbrev(oid, DEFAULT_ABBREV)); + struct string_list *ref_list = cb_data; + + string_list_append(ref_list, ref); + ref_list->items[ref_list->nr - 1].util = oiddup(oid); return 0; } +static int delete_tags(const char **argv) +{ + int result; + struct string_list refs_to_delete = STRING_LIST_INIT_DUP; + struct string_list_item *item; + + result = for_each_tag_name(argv, collect_tags, (void *)&refs_to_delete); + delete_refs(NULL, &refs_to_delete, REF_NO_DEREF); + + for_each_string_list_item(item, &refs_to_delete) { + const char * name = item->string; + struct object_id * oid = item->util; + if (ref_exists(name)) + result = 1; + else + printf(_("Deleted tag '%s' (was %s)\n"), + item->string + 10, + find_unique_abbrev(oid, DEFAULT_ABBREV)); + + free(oid); + } + string_list_clear(&refs_to_delete, 0); + return result; +} + static int verify_tag(const char *name, const char *ref, - const struct object_id *oid, const void *cb_data) + const struct object_id *oid, void *cb_data) { int flags; const struct ref_format *format = cb_data; @@ -511,7 +536,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix) if (filter.merge_commit) die(_("--merged and --no-merged options are only allowed in list mode")); if (cmdmode == 'd') - return for_each_tag_name(argv, delete_tag, NULL); + return delete_tags(argv); if (cmdmode == 'v') { if (format.format && verify_ref_format(&format)) usage_with_options(git_tag_usage, options); -- 2.23.0.rc1.174.g4cc1b04b4c ^ permalink raw reply related [flat|nested] 5+ messages in thread
[parent not found: <CABURp0rtkmo7MSSCVrdNXT0UzV9XqV_kXOGkC23C+_vMENNJUg@mail.gmail.com>]
* Re: [PATCH v2] use delete_refs when deleting tags or branches [not found] ` <CABURp0rtkmo7MSSCVrdNXT0UzV9XqV_kXOGkC23C+_vMENNJUg@mail.gmail.com> @ 2021-01-15 18:43 ` Elijah Newren 2021-01-15 20:06 ` Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 5+ messages in thread From: Elijah Newren @ 2021-01-15 18:43 UTC (permalink / raw) To: Phil Hord; +Cc: Git, Martin Ågren, Junio C Hamano Hi, On Thu, Jan 14, 2021 at 6:00 PM Phil Hord <phil.hord@gmail.com> wrote: > > I noticed this is still only in my local branch. Can I get an ACK/NAK? Sorry for missing this when you posted in August. Thanks for sending in the update from v1. For other reviewers: v1 is over here: https://lore.kernel.org/git/20190808035935.30023-1-phil.hord@gmail.com/, and has review comments from Martin, me, Peff, and Junio. > On Tue, Aug 13, 2019 at 7:49 PM Phil Hord <phil.hord@gmail.com> wrote: >> >> From: Phil Hord <phil.hord@gmail.com> >> >> 'git tag -d' and 'git branch -d' both accept one or more refs to >> delete, but each deletion is done by calling `delete_ref` on each argv. >> This is very slow when removing from packed refs as packed-refs is >> locked and rewritten each time. Use delete_refs instead so all the >> removals can be done inside a single transaction with a single update. Awesome, thanks for also fixing up git branch with v2. >> Since delete_refs performs all the packed-refs delete operations >> inside a single transaction, if any of the deletes fail then all >> them will be skipped. In practice, none of them should fail since >> we verify the hash of each one before calling delete_refs, but some >> network error or odd permissions problem could have different results >> after this change. >> >> Also, since the file-backed deletions are not performed in the same >> transaction, those could succeed even when the packed-refs transaction >> fails. >> >> After deleting refs, report the deletion's success only if the ref was >> actually deleted. For branch deletion, remove the branch config only >> if the branch ref is actually removed. >> >> A manual test deleting 24,000 tags took about 30 minutes using >> delete_ref. It takes about 5 seconds using delete_refs. As I said on v1, it's really nice to have this fixed. Thanks for doing it. >> >> Signed-off-by: Phil Hord <phil.hord@gmail.com> >> --- >> This reroll adds the same delete_refs change to 'git branch'. It checks >> individual refs after the operation to report correctly on each whether >> it was successfully deleted or not. Maybe this is an unnecessary step, >> though. This handles the weird case where some file system error >> prevented us from deleting refs, leaving us with an error from >> delete_refs but without any idea which refs might have been affected. >> >> builtin/branch.c | 50 +++++++++++++++++++++++++++++------------------- >> builtin/tag.c | 45 +++++++++++++++++++++++++++++++++---------- >> 2 files changed, 65 insertions(+), 30 deletions(-) >> >> diff --git a/builtin/branch.c b/builtin/branch.c >> index 2ef214632f..2273239f41 100644 >> --- a/builtin/branch.c >> +++ b/builtin/branch.c >> @@ -202,6 +202,9 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, >> int remote_branch = 0; >> struct strbuf bname = STRBUF_INIT; >> unsigned allowed_interpret; >> + struct string_list refs_to_delete = STRING_LIST_INIT_DUP; >> + struct string_list_item *item; >> + int refname_pos = 0; >> >> switch (kinds) { >> case FILTER_REFS_REMOTES: >> @@ -209,12 +212,13 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, >> /* For subsequent UI messages */ >> remote_branch = 1; >> allowed_interpret = INTERPRET_BRANCH_REMOTE; >> - >> + refname_pos = 13; >> force = 1; >> break; >> case FILTER_REFS_BRANCHES: >> fmt = "refs/heads/%s"; >> allowed_interpret = INTERPRET_BRANCH_LOCAL; >> + refname_pos = 11; >> break; >> default: >> die(_("cannot use -a with -d")); >> @@ -265,30 +269,36 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, >> goto next; >> } >> >> - if (delete_ref(NULL, name, is_null_oid(&oid) ? NULL : &oid, >> - REF_NO_DEREF)) { >> - error(remote_branch >> - ? _("Error deleting remote-tracking branch '%s'") >> - : _("Error deleting branch '%s'"), >> - bname.buf); >> - ret = 1; >> - goto next; The code used to set the return code to 1 if it failed to delete a branch >> - } >> - if (!quiet) { >> - printf(remote_branch >> - ? _("Deleted remote-tracking branch %s (was %s).\n") >> - : _("Deleted branch %s (was %s).\n"), >> - bname.buf, >> - (flags & REF_ISBROKEN) ? "broken" >> - : (flags & REF_ISSYMREF) ? target >> - : find_unique_abbrev(&oid, DEFAULT_ABBREV)); >> - } >> - delete_branch_config(bname.buf); >> + item = string_list_append(&refs_to_delete, name); >> + item->util = xstrdup((flags & REF_ISBROKEN) ? "broken" >> + : (flags & REF_ISSYMREF) ? target >> + : find_unique_abbrev(&oid, DEFAULT_ABBREV)); >> >> next: >> free(target); >> } >> >> + delete_refs(NULL, &refs_to_delete, REF_NO_DEREF); >> + >> + for_each_string_list_item(item, &refs_to_delete) { >> + char * describe_ref = item->util; >> + char * name = item->string; >> + if (ref_exists(name)) >> + ret = 1; Now it sets the return code if the branch still exists after trying to delete. I thought that was subtly different...but I tried doing a branch deletion of a non-existent branch since I thought that would be the only difference -- however, that errors out earlier in the codepath before even getting to the stage of deleting refs. So I think these are effectively the same. >> + else { >> + char * refname = name + refname_pos; >> + if (!quiet) >> + printf(remote_branch >> + ? _("Deleted remote-tracking branch %s (was %s).\n") >> + : _("Deleted branch %s (was %s).\n"), >> + name + refname_pos, describe_ref); Neither remote_branch nor refname_pos are changing throughout this loop, which I at first thought was in error, but it looks like git branch only allows you to delete one type or the other -- not a mixture. So this is correct. >> + >> + delete_branch_config(refname); >> + } >> + free(describe_ref); >> + } >> + string_list_clear(&refs_to_delete, 0); >> + >> free(name); >> strbuf_release(&bname); >> >> diff --git a/builtin/tag.c b/builtin/tag.c >> index e0a4c25382..0d11ffcd04 100644 >> --- a/builtin/tag.c >> +++ b/builtin/tag.c >> @@ -72,10 +72,10 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, >> } >> >> typedef int (*each_tag_name_fn)(const char *name, const char *ref, >> - const struct object_id *oid, const void *cb_data); >> + const struct object_id *oid, void *cb_data); >> >> static int for_each_tag_name(const char **argv, each_tag_name_fn fn, >> - const void *cb_data) >> + void *cb_data) >> { >> const char **p; >> struct strbuf ref = STRBUF_INIT; >> @@ -97,18 +97,43 @@ static int for_each_tag_name(const char **argv, each_tag_name_fn fn, >> return had_error; >> } >> >> -static int delete_tag(const char *name, const char *ref, >> - const struct object_id *oid, const void *cb_data) >> +static int collect_tags(const char *name, const char *ref, >> + const struct object_id *oid, void *cb_data) >> { >> - if (delete_ref(NULL, ref, oid, 0)) >> - return 1; This used to return 1 if it failed to delete a ref. >> - printf(_("Deleted tag '%s' (was %s)\n"), name, >> - find_unique_abbrev(oid, DEFAULT_ABBREV)); >> + struct string_list *ref_list = cb_data; >> + >> + string_list_append(ref_list, ref); >> + ref_list->items[ref_list->nr - 1].util = oiddup(oid); >> return 0; Now it unconditionally returns 0. >> } >> >> +static int delete_tags(const char **argv) >> +{ >> + int result; >> + struct string_list refs_to_delete = STRING_LIST_INIT_DUP; >> + struct string_list_item *item; >> + >> + result = for_each_tag_name(argv, collect_tags, (void *)&refs_to_delete); >> + delete_refs(NULL, &refs_to_delete, REF_NO_DEREF); You now only look at the result of collecting the tags, and ignore the result of trying to delete them... >> + >> + for_each_string_list_item(item, &refs_to_delete) { >> + const char * name = item->string; >> + struct object_id * oid = item->util; >> + if (ref_exists(name)) >> + result = 1; ...except that you check if the refs still exist afterward and set the return code based on it. Like with the branch case, I can't come up with a case where the difference matters. I suspect there's a race condition there somewhere, but once you start going down that road I think the old code may have had a bunch of races too. It might be nice to document with a comment that there's a small race condition with someone else trying to forcibly re-create the ref at the same time you are trying to delete, but I don't think it's a big deal. If you did use the result of delete_refs(), you might have to double check that the callers (git.c:handle_builtin() -> git.c:run_builtin() -> builtin/tag.c:cmd_tag() -> builtin/tag.c:delete_tags()) are all okay with the return code; it looks like handle_builtin() would pass the return code to exit() and the git-tag manpage doesn't document the return status, so you've at least got some leeway in terms of what values are acceptable. Or you could just normalize the return value of delete_refs() down to 0 or 1. But you'd only need to worry about that if the race condition is something we're worried enough to tackle. >> + else >> + printf(_("Deleted tag '%s' (was %s)\n"), >> + item->string + 10, >> + find_unique_abbrev(oid, DEFAULT_ABBREV)); >> + >> + free(oid); >> + } >> + string_list_clear(&refs_to_delete, 0); >> + return result; >> +} >> + >> static int verify_tag(const char *name, const char *ref, >> - const struct object_id *oid, const void *cb_data) >> + const struct object_id *oid, void *cb_data) >> { >> int flags; >> const struct ref_format *format = cb_data; >> @@ -511,7 +536,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix) >> if (filter.merge_commit) >> die(_("--merged and --no-merged options are only allowed in list mode")); >> if (cmdmode == 'd') >> - return for_each_tag_name(argv, delete_tag, NULL); >> + return delete_tags(argv); >> if (cmdmode == 'v') { >> if (format.format && verify_ref_format(&format)) >> usage_with_options(git_tag_usage, options); >> -- >> 2.23.0.rc1.174.g4cc1b04b4c Overall, I like the patch. Peff commented on v1 that the basic idea (use the part of the refs API that batches operations) is the right thing to do. I'm not that familiar with refs-touching code, but your patch makes sense to me. I think I spotted a minor issue (you ignore the return status of delete_refs(), then later check the existence of the refs afterwards to determine success, which I believe is a minor and unlikely race condition), but I'm not sure it's worth fixing; perhaps just mark it with #leftoverbits and move on -- the faster branch and tag deletion is a very nice improvement. I notice Martin said on v1 that there was a testcase that had problems with your patch; I tested v2 and it looks like you fixed any such issues. I think you also addressed the feedback from Junio, though his comments about the return code and the minor race condition I noticed around it might mean it'd be good to get his comments. Anyway, Acked-by: Elijah Newren <newren@gmail.com> I would say Reviewed-by, but I'd like to get Junio's comments on the return code and minor race. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] use delete_refs when deleting tags or branches 2021-01-15 18:43 ` Elijah Newren @ 2021-01-15 20:06 ` Junio C Hamano 2021-01-15 22:04 ` Junio C Hamano 2021-01-16 2:27 ` Phil Hord 2 siblings, 0 replies; 5+ messages in thread From: Junio C Hamano @ 2021-01-15 20:06 UTC (permalink / raw) To: Elijah Newren; +Cc: Phil Hord, Git, Martin Ågren Elijah Newren <newren@gmail.com> writes: > Overall, I like the patch. Peff commented on v1 that the basic idea > (use the part of the refs API that batches operations) is the right > thing to do. I concur; I agree with the general direction. Where do we reject branch deletion based on the merged-ness? When we ask to delete three branches and one is not merged yet and cannot be removed (let's say we are not "--force"ing) but the other two are OK to remove, we do want the other two branches to go. I didn't check what the patch does with respect to this point, but as long as the updated code does not make it "all or none", it would be great. > I would say Reviewed-by, but I'd like to get Junio's comments on the > return code and minor race. It would have to wait a bit more. Thanks for a review. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] use delete_refs when deleting tags or branches 2021-01-15 18:43 ` Elijah Newren 2021-01-15 20:06 ` Junio C Hamano @ 2021-01-15 22:04 ` Junio C Hamano 2021-01-16 2:27 ` Phil Hord 2 siblings, 0 replies; 5+ messages in thread From: Junio C Hamano @ 2021-01-15 22:04 UTC (permalink / raw) To: Elijah Newren; +Cc: Phil Hord, Git, Martin Ågren Elijah Newren <newren@gmail.com> writes: >>> /* For subsequent UI messages */ >>> remote_branch = 1; >>> allowed_interpret = INTERPRET_BRANCH_REMOTE; >>> - >>> + refname_pos = 13; >>> force = 1; >>> break; >>> case FILTER_REFS_BRANCHES: >>> fmt = "refs/heads/%s"; >>> allowed_interpret = INTERPRET_BRANCH_LOCAL; >>> + refname_pos = 11; >>> break; Hmph, I can tell 11 is strlen("refs/heads/") and 13 is probably for "refs/remotes/", but can't we do better? switch (kinds) { case ...: fmt = "refs/remotes/%s"; ... } refname_skip_bytes = strcspn(fmt, "%"); or something like that? I think refname_pos is a misleading variable name. The refname is the whole thing, so it would always be 0. What we are looking for is the position at which we find human readable "branch name" begins, or, the number of bytes to skip to get there from the beginning (the latter is where 'refname_skip_bytes' come from). I am also OK with 'branch_name_pos'. >>> + delete_refs(NULL, &refs_to_delete, REF_NO_DEREF); Earlier in a separate message, I wondered where the branches that are not merged are rejected; it seems that the code reuses the original code that jumps to the end of the loop and refrains from placing such a branch to the refs_to_delete list, so from the point of view of "correctness", the patch would not make it "all or none", which is good. The order in which the failures are reported will change, though. In the original code, we got errors in the order we iterate through the branches. With the patch, we'll see all the "not merged and can not delete" first from A to Z, and then "we tried to delete but this exists after that" in a separate batch from A to Z (we won't report for the same ref twice). I think it probably is OK and not worth refactoring the error reporting mechanism of check_branch_commit() just in order to restore the order of the error messages, but I am pointing it out just for completeness. >>> + for_each_string_list_item(item, &refs_to_delete) { >>> + char * describe_ref = item->util; >>> + char * name = item->string; Style. The asterisk sticks to the identifier, i.e. char *name = item->string; >>> + if (ref_exists(name)) >>> + ret = 1; > > Now it sets the return code if the branch still exists after trying to > delete. I thought that was subtly different...but I tried doing a > branch deletion of a non-existent branch since I thought that would be > the only difference -- however, that errors out earlier in the > codepath before even getting to the stage of deleting refs. So I > think these are effectively the same. It checks if the branch exists after it tried to delete. A branch may exist because delete_refs() failed to delete, or it may exist because somebody created it anew after delete_refs() returned. The original reported if a deletion was successful or not by checking the status returned by delete_ref(). We may want to introduce a variant of delete_refs() that reports which ones it failed to delete to the caller if we really want to report the errors properly. I dunno. >>> diff --git a/builtin/tag.c b/builtin/tag.c >>> ... >>> +static int collect_tags(const char *name, const char *ref, >>> + const struct object_id *oid, void *cb_data) >>> { >>> - if (delete_ref(NULL, ref, oid, 0)) >>> - return 1; > > This used to return 1 if it failed to delete a ref. > >>> - printf(_("Deleted tag '%s' (was %s)\n"), name, >>> - find_unique_abbrev(oid, DEFAULT_ABBREV)); >>> + struct string_list *ref_list = cb_data; >>> + >>> + string_list_append(ref_list, ref); >>> + ref_list->items[ref_list->nr - 1].util = oiddup(oid); >>> return 0; > > Now it unconditionally returns 0. But this is just "collect what to delete" phase, and the return value is to be consumed by for_each_tag_name() as the callback, no? I do not see a problem here. >>> } >>> >>> +static int delete_tags(const char **argv) >>> +{ >>> + int result; >>> + struct string_list refs_to_delete = STRING_LIST_INIT_DUP; >>> + struct string_list_item *item; >>> + >>> + result = for_each_tag_name(argv, collect_tags, (void *)&refs_to_delete); >>> + delete_refs(NULL, &refs_to_delete, REF_NO_DEREF); > > You now only look at the result of collecting the tags, and ignore the > result of trying to delete them... That is a bigger problem, and is shared by the branch side, too. I do think delete_refs() API is misdesigned in that it is not equipped to report what happened to individual refs. It only says "we saw an error" when there is even a single ref that encounters a trouble. If we fix that, I suspect that this patch will become much simpler. Another thing that I think is a misdesign of delete_refs() actually can work for us here. The list of refnames to be deleted is passed as a "struct string_list" (which is not the most natural way to pass bunch of strings, which is why I consider it a misdesign), and for each ref, there is a place for the delete_refs() and its internal helpers to store some piece of info, i.e. string_list_item.util. Not even compile tested, but we could stuff NULL/non-NULL to the .util member so that we can report per-ref status from the API, perhaps like this: diff --git i/refs/files-backend.c w/refs/files-backend.c index 04e85e7002..bdc17f55ef 100644 --- i/refs/files-backend.c +++ w/refs/files-backend.c @@ -1224,8 +1224,12 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, for (i = 0; i < refnames->nr; i++) { const char *refname = refnames->items[i].string; - if (refs_delete_ref(&refs->base, msg, refname, NULL, flags)) - result |= error(_("could not remove reference %s"), refname); + if (!refs_delete_ref(&refs->base, msg, refname, NULL, flags)) { + refnames->items[i].util = NULL; + continue; + } + result |= error(_("could not remove reference %s"), refname); + refnames->items[i].util = "error"; } strbuf_release(&err); diff --git i/refs/packed-backend.c w/refs/packed-backend.c index b912f2505f..a5b5071511 100644 --- i/refs/packed-backend.c +++ w/refs/packed-backend.c @@ -1541,7 +1541,10 @@ static int packed_delete_refs(struct ref_store *ref_store, const char *msg, flags, msg, &err)) { warning(_("could not delete reference %s: %s"), item->string, err.buf); + item->util = "error"; strbuf_reset(&err); + } else { + item->util = NULL; } } ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2] use delete_refs when deleting tags or branches 2021-01-15 18:43 ` Elijah Newren 2021-01-15 20:06 ` Junio C Hamano 2021-01-15 22:04 ` Junio C Hamano @ 2021-01-16 2:27 ` Phil Hord 2 siblings, 0 replies; 5+ messages in thread From: Phil Hord @ 2021-01-16 2:27 UTC (permalink / raw) To: Elijah Newren; +Cc: Git, Martin Ågren, Junio C Hamano On Fri, Jan 15, 2021 at 10:43 AM Elijah Newren <newren@gmail.com> wrote: > > Hi, > > On Thu, Jan 14, 2021 at 6:00 PM Phil Hord <phil.hord@gmail.com> wrote: > > > > I noticed this is still only in my local branch. Can I get an ACK/NAK? > > Sorry for missing this when you posted in August. Thanks for sending > in the update from v1. > > For other reviewers: v1 is over here: > https://lore.kernel.org/git/20190808035935.30023-1-phil.hord@gmail.com/, > and has review comments from Martin, me, Peff, and Junio. > > > On Tue, Aug 13, 2019 at 7:49 PM Phil Hord <phil.hord@gmail.com> wrote: > >> > >> From: Phil Hord <phil.hord@gmail.com> > >> > >> 'git tag -d' and 'git branch -d' both accept one or more refs to > >> delete, but each deletion is done by calling `delete_ref` on each argv. > >> This is very slow when removing from packed refs as packed-refs is > >> locked and rewritten each time. Use delete_refs instead so all the > >> removals can be done inside a single transaction with a single update. > > Awesome, thanks for also fixing up git branch with v2. > > >> Since delete_refs performs all the packed-refs delete operations > >> inside a single transaction, if any of the deletes fail then all > >> them will be skipped. In practice, none of them should fail since > >> we verify the hash of each one before calling delete_refs, but some > >> network error or odd permissions problem could have different results > >> after this change. > >> > >> Also, since the file-backed deletions are not performed in the same > >> transaction, those could succeed even when the packed-refs transaction > >> fails. > >> > >> After deleting refs, report the deletion's success only if the ref was > >> actually deleted. For branch deletion, remove the branch config only > >> if the branch ref is actually removed. > >> > >> A manual test deleting 24,000 tags took about 30 minutes using > >> delete_ref. It takes about 5 seconds using delete_refs. > > As I said on v1, it's really nice to have this fixed. Thanks for doing it. > > >> > >> Signed-off-by: Phil Hord <phil.hord@gmail.com> > >> --- > >> This reroll adds the same delete_refs change to 'git branch'. It checks > >> individual refs after the operation to report correctly on each whether > >> it was successfully deleted or not. Maybe this is an unnecessary step, > >> though. This handles the weird case where some file system error > >> prevented us from deleting refs, leaving us with an error from > >> delete_refs but without any idea which refs might have been affected. > >> > >> builtin/branch.c | 50 +++++++++++++++++++++++++++++------------------- > >> builtin/tag.c | 45 +++++++++++++++++++++++++++++++++---------- > >> 2 files changed, 65 insertions(+), 30 deletions(-) > >> > >> diff --git a/builtin/branch.c b/builtin/branch.c > >> index 2ef214632f..2273239f41 100644 > >> --- a/builtin/branch.c > >> +++ b/builtin/branch.c > >> @@ -202,6 +202,9 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, > >> int remote_branch = 0; > >> struct strbuf bname = STRBUF_INIT; > >> unsigned allowed_interpret; > >> + struct string_list refs_to_delete = STRING_LIST_INIT_DUP; > >> + struct string_list_item *item; > >> + int refname_pos = 0; > >> > >> switch (kinds) { > >> case FILTER_REFS_REMOTES: > >> @@ -209,12 +212,13 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, > >> /* For subsequent UI messages */ > >> remote_branch = 1; > >> allowed_interpret = INTERPRET_BRANCH_REMOTE; > >> - > >> + refname_pos = 13; > >> force = 1; > >> break; > >> case FILTER_REFS_BRANCHES: > >> fmt = "refs/heads/%s"; > >> allowed_interpret = INTERPRET_BRANCH_LOCAL; > >> + refname_pos = 11; > >> break; > >> default: > >> die(_("cannot use -a with -d")); > >> @@ -265,30 +269,36 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, > >> goto next; > >> } > >> > >> - if (delete_ref(NULL, name, is_null_oid(&oid) ? NULL : &oid, > >> - REF_NO_DEREF)) { > >> - error(remote_branch > >> - ? _("Error deleting remote-tracking branch '%s'") > >> - : _("Error deleting branch '%s'"), > >> - bname.buf); > >> - ret = 1; > >> - goto next; > > The code used to set the return code to 1 if it failed to delete a branch > > >> - } > >> - if (!quiet) { > >> - printf(remote_branch > >> - ? _("Deleted remote-tracking branch %s (was %s).\n") > >> - : _("Deleted branch %s (was %s).\n"), > >> - bname.buf, > >> - (flags & REF_ISBROKEN) ? "broken" > >> - : (flags & REF_ISSYMREF) ? target > >> - : find_unique_abbrev(&oid, DEFAULT_ABBREV)); > >> - } > >> - delete_branch_config(bname.buf); > >> + item = string_list_append(&refs_to_delete, name); > >> + item->util = xstrdup((flags & REF_ISBROKEN) ? "broken" > >> + : (flags & REF_ISSYMREF) ? target > >> + : find_unique_abbrev(&oid, DEFAULT_ABBREV)); > >> > >> next: > >> free(target); > >> } > >> > >> + delete_refs(NULL, &refs_to_delete, REF_NO_DEREF); > >> + > >> + for_each_string_list_item(item, &refs_to_delete) { > >> + char * describe_ref = item->util; > >> + char * name = item->string; > >> + if (ref_exists(name)) > >> + ret = 1; > > Now it sets the return code if the branch still exists after trying to > delete. I thought that was subtly different...but I tried doing a > branch deletion of a non-existent branch since I thought that would be > the only difference -- however, that errors out earlier in the > codepath before even getting to the stage of deleting refs. So I > think these are effectively the same. > > >> + else { > >> + char * refname = name + refname_pos; > >> + if (!quiet) > >> + printf(remote_branch > >> + ? _("Deleted remote-tracking branch %s (was %s).\n") > >> + : _("Deleted branch %s (was %s).\n"), > >> + name + refname_pos, describe_ref); > > Neither remote_branch nor refname_pos are changing throughout this > loop, which I at first thought was in error, but it looks like git > branch only allows you to delete one type or the other -- not a > mixture. So this is correct. > > >> + > >> + delete_branch_config(refname); > >> + } > >> + free(describe_ref); > >> + } > >> + string_list_clear(&refs_to_delete, 0); > >> + > >> free(name); > >> strbuf_release(&bname); > >> > >> diff --git a/builtin/tag.c b/builtin/tag.c > >> index e0a4c25382..0d11ffcd04 100644 > >> --- a/builtin/tag.c > >> +++ b/builtin/tag.c > >> @@ -72,10 +72,10 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, > >> } > >> > >> typedef int (*each_tag_name_fn)(const char *name, const char *ref, > >> - const struct object_id *oid, const void *cb_data); > >> + const struct object_id *oid, void *cb_data); > >> > >> static int for_each_tag_name(const char **argv, each_tag_name_fn fn, > >> - const void *cb_data) > >> + void *cb_data) > >> { > >> const char **p; > >> struct strbuf ref = STRBUF_INIT; > >> @@ -97,18 +97,43 @@ static int for_each_tag_name(const char **argv, each_tag_name_fn fn, > >> return had_error; > >> } > >> > >> -static int delete_tag(const char *name, const char *ref, > >> - const struct object_id *oid, const void *cb_data) > >> +static int collect_tags(const char *name, const char *ref, > >> + const struct object_id *oid, void *cb_data) > >> { > >> - if (delete_ref(NULL, ref, oid, 0)) > >> - return 1; > > This used to return 1 if it failed to delete a ref. > > >> - printf(_("Deleted tag '%s' (was %s)\n"), name, > >> - find_unique_abbrev(oid, DEFAULT_ABBREV)); > >> + struct string_list *ref_list = cb_data; > >> + > >> + string_list_append(ref_list, ref); > >> + ref_list->items[ref_list->nr - 1].util = oiddup(oid); > >> return 0; > > Now it unconditionally returns 0. > > >> } > >> > >> +static int delete_tags(const char **argv) > >> +{ > >> + int result; > >> + struct string_list refs_to_delete = STRING_LIST_INIT_DUP; > >> + struct string_list_item *item; > >> + > >> + result = for_each_tag_name(argv, collect_tags, (void *)&refs_to_delete); > >> + delete_refs(NULL, &refs_to_delete, REF_NO_DEREF); > > You now only look at the result of collecting the tags, and ignore the > result of trying to delete them... > > >> + > >> + for_each_string_list_item(item, &refs_to_delete) { > >> + const char * name = item->string; > >> + struct object_id * oid = item->util; > >> + if (ref_exists(name)) > >> + result = 1; > > ...except that you check if the refs still exist afterward and set the > return code based on it. Like with the branch case, I can't come up > with a case where the difference matters. I suspect there's a race > condition there somewhere, but once you start going down that road I > think the old code may have had a bunch of races too. It might be > nice to document with a comment that there's a small race condition > with someone else trying to forcibly re-create the ref at the same > time you are trying to delete, but I don't think it's a big deal. > > If you did use the result of delete_refs(), you might have to double > check that the callers (git.c:handle_builtin() -> git.c:run_builtin() > -> builtin/tag.c:cmd_tag() -> builtin/tag.c:delete_tags()) are all > okay with the return code; it looks like handle_builtin() would pass > the return code to exit() and the git-tag manpage doesn't document the > return status, so you've at least got some leeway in terms of what > values are acceptable. Or you could just normalize the return value > of delete_refs() down to 0 or 1. But you'd only need to worry about > that if the race condition is something we're worried enough to > tackle. Interesting. I was worried about imposing a requirement on delete_refs that any non-zero return must mean that something was not deleted which should have been. Maybe that's not such a worry, though, and it would be acceptable to return a 1 even if all the refs were deleted even though some error occurred further down the line. I tried normalizing the value and then also verifying each ref was removed, but that seemed wrong. Maybe it's ok to just normalize it and not react to still-existing refs. > >> + else > >> + printf(_("Deleted tag '%s' (was %s)\n"), > >> + item->string + 10, > >> + find_unique_abbrev(oid, DEFAULT_ABBREV)); > >> + > >> + free(oid); > >> + } > >> + string_list_clear(&refs_to_delete, 0); > >> + return result; > >> +} > >> + > >> static int verify_tag(const char *name, const char *ref, > >> - const struct object_id *oid, const void *cb_data) > >> + const struct object_id *oid, void *cb_data) > >> { > >> int flags; > >> const struct ref_format *format = cb_data; > >> @@ -511,7 +536,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix) > >> if (filter.merge_commit) > >> die(_("--merged and --no-merged options are only allowed in list mode")); > >> if (cmdmode == 'd') > >> - return for_each_tag_name(argv, delete_tag, NULL); > >> + return delete_tags(argv); > >> if (cmdmode == 'v') { > >> if (format.format && verify_ref_format(&format)) > >> usage_with_options(git_tag_usage, options); > >> -- > >> 2.23.0.rc1.174.g4cc1b04b4c > > Overall, I like the patch. Peff commented on v1 that the basic idea > (use the part of the refs API that batches operations) is the right > thing to do. I'm not that familiar with refs-touching code, but your > patch makes sense to me. I think I spotted a minor issue (you ignore > the return status of delete_refs(), then later check the existence of > the refs afterwards to determine success, which I believe is a minor > and unlikely race condition), but I'm not sure it's worth fixing; > perhaps just mark it with #leftoverbits and move on -- the faster > branch and tag deletion is a very nice improvement. > > I notice Martin said on v1 that there was a testcase that had problems > with your patch; I tested v2 and it looks like you fixed any such > issues. I think you also addressed the feedback from Junio, though > his comments about the return code and the minor race condition I > noticed around it might mean it'd be good to get his comments. > > Anyway, > Acked-by: Elijah Newren <newren@gmail.com> > > I would say Reviewed-by, but I'd like to get Junio's comments on the > return code and minor race. Thanks for the detailed review and thoughts. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-01-16 2:29 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-08-14 2:49 [PATCH v2] use delete_refs when deleting tags or branches Phil Hord [not found] ` <CABURp0rtkmo7MSSCVrdNXT0UzV9XqV_kXOGkC23C+_vMENNJUg@mail.gmail.com> 2021-01-15 18:43 ` Elijah Newren 2021-01-15 20:06 ` Junio C Hamano 2021-01-15 22:04 ` Junio C Hamano 2021-01-16 2:27 ` Phil Hord
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).