From: Junio C Hamano <gitster@pobox.com> To: Phil Hord <phil.hord@gmail.com> Cc: git@vger.kernel.org Subject: Re: [PATCH 1/1] delete multiple tags in a single transaction Date: Thu, 08 Aug 2019 12:39:10 -0700 Message-ID: <xmqq4l2rfnvl.fsf@gitster-ct.c.googlers.com> (raw) In-Reply-To: <20190808035935.30023-1-phil.hord@gmail.com> (Phil Hord's message of "Wed, 7 Aug 2019 20:59:35 -0700") Phil Hord <phil.hord@gmail.com> writes: > From: Phil Hord <phil.hord@gmail.com> > > 'git tag -d' accepts one or more tag refs to delete, but each deletion > is done by calling `delete_ref` on each argv. This is painfully slow > when removing from packed refs. Use delete_refs instead so all the > removals can be done inside a single transaction with a single write. > > I have a repo with 24,000 tags, most of which are not useful to any > developers. Having this many refs slows down many operations that > would otherwise be very fast. Removing these tags when they've been > accidentally fetched again takes about 30 minutes using delete_ref. > > git tag -l feature/* | xargs git tag -d > > Removing the same tags using delete_refs takes less than 5 seconds. Makes sense. As mentioned elsewhere in the thread already, a batched update-ref would open the packed-refs ony once because everything is done in a single transaction, so presumably a pipeline like this git tag -l feature/* | sed -e 's|^|delete refs/tags/|' | git update-ref --stdin may work well, and "git tag -d" that gets these refs on the command line should be capable of doing the same. > -static int delete_tag(const char *name, const char *ref, > - const struct object_id *oid, const void *cb_data) > +struct tag_args { > + char *oid_abbrev; > + char *refname; > +}; > + > +static int make_string_list(const char *name, const char *ref, > + const struct object_id *oid, void *cb_data) Please think about a few more minutes before naming a function like this, and make it a habit for your future patches. We can see that the callback is used to insert more strings into a string list, but the type (i.e. string_list) used to represent the set is not all that important. What is more important is why you are building that set for, and saying what is in the set (as opposed to saying that the container happens to be a string_list) would be a good first step. I presume that you are enumerating the tags to be deleted, together with the data necessary for you to report the deletion of the tags? > { > - if (delete_ref(NULL, ref, oid, 0)) > - return 1; > - printf(_("Deleted tag '%s' (was %s)\n"), name, > - find_unique_abbrev(oid, DEFAULT_ABBREV)); > + struct string_list *ref_list = cb_data; > + struct tag_args *info = xmalloc(sizeof(struct tag_args)); > + > + string_list_append(ref_list, ref); > + > + info->oid_abbrev = xstrdup(find_unique_abbrev(oid, DEFAULT_ABBREV)); > + info->refname = xstrdup(name); > + ref_list->items[ref_list->nr - 1].util = info; > return 0; > } > > +static int delete_tags(const char **argv) > +{ > + int result; > + struct string_list ref_list = STRING_LIST_INIT_DUP; > + struct string_list_item *ref_list_item; > + > + result = for_each_tag_name(argv, make_string_list, (void *) &ref_list); > + if (!result) > + result = delete_refs(NULL, &ref_list, REF_NO_DEREF); > + > + for_each_string_list_item(ref_list_item, &ref_list) { > + struct tag_args * info = ref_list_item->util; > + if (!result) > + printf(_("Deleted tag '%s' (was %s)\n"), info->refname, > + info->oid_abbrev); > + free(info->oid_abbrev); > + free(info->refname); > + free(info); It is not performance critical, but info->refname is computable from ref_list_item->string, isn't it? I am just wondering if we can do this without having to allocate the .util field for each of 20,000 tags. We still need to remember oid (or oid_abbrev, but if I were writing this, I'd record the full oid in .util and make the code that prints call find_unique_abbrev() on it), so I guess we cannot really leave .util NULL. > + } > + string_list_clear(&ref_list, 0); > + return result; We used to return the returned value from for_each_tag_name() that repeatedly called delete_tag(). Now we return value from delete_refs(). Are our caller(s) OK with the values that may come back from that function? Can delete_refs() return a value that is not appropriate to be returned from cmd_tag(), for example a negative value? > +} > + > static int verify_tag(const char *name, const char *ref, > - const struct object_id *oid, const void *cb_data) > + const struct object_id *oid, void *cb_data) > { > int flags; > const struct ref_format *format = cb_data; > @@ -511,7 +543,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix) > if (filter.merge_commit) > die(_("--merged and --no-merged options are only allowed in list mode")); > if (cmdmode == 'd') > - return for_each_tag_name(argv, delete_tag, NULL); > + return delete_tags(argv); Thanks.
next prev parent reply other threads:[~2019-08-08 19:39 UTC|newest] Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-08-08 3:59 Phil Hord 2019-08-08 12:47 ` Martin Ågren 2019-08-08 12:53 ` [PATCH] t7004: check existence of correct tag Martin Ågren 2019-08-08 18:15 ` [PATCH 1/1] delete multiple tags in a single transaction Elijah Newren 2019-08-08 23:43 ` Phil Hord 2019-08-09 3:05 ` Jeff King 2019-08-08 19:39 ` Junio C Hamano [this message] 2019-08-08 23:58 ` Phil Hord
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style List information: http://vger.kernel.org/majordomo-info.html * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=xmqq4l2rfnvl.fsf@gitster-ct.c.googlers.com \ --to=gitster@pobox.com \ --cc=git@vger.kernel.org \ --cc=phil.hord@gmail.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
git@vger.kernel.org list mirror (unofficial, one of many) This inbox may be cloned and mirrored by anyone: git clone --mirror https://public-inbox.org/git git clone --mirror http://ou63pmih66umazou.onion/git git clone --mirror http://czquwvybam4bgbro.onion/git git clone --mirror http://hjrcffqmbrq6wope.onion/git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V1 git git/ https://public-inbox.org/git \ git@vger.kernel.org public-inbox-index git Example config snippet for mirrors. Newsgroups are available over NNTP: nntp://news.public-inbox.org/inbox.comp.version-control.git nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git nntp://news.gmane.io/gmane.comp.version-control.git note: .onion URLs require Tor: https://www.torproject.org/ code repositories for the project(s) associated with this inbox: https://80x24.org/mirrors/git.git AGPL code for this site: git clone https://public-inbox.org/public-inbox.git