git@vger.kernel.org mailing list mirror (one of many)
From: Phil Hord <phil.hord@gmail.com>
To: Elijah Newren <newren@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 1/1] delete multiple tags in a single transaction
Date: Thu, 8 Aug 2019 16:43:16 -0700
Message-ID: <CABURp0p5xbsq+8UsFerMAY8EG-ndXgd19EUsHOgQG-dnDnTAgg@mail.gmail.com>
In-Reply-To: <CABPp-BFH++aJinkzg+qsZDRN6R5-E8LPCG_u+udZLW6o0MGBug@mail.gmail.com>

On Thu, Aug 8, 2019 at 11:15 AM Elijah Newren <newren@gmail.com> wrote:
>
> On Wed, Aug 7, 2019 at 9:11 PM Phil Hord <phil.hord@gmail.com> wrote:
> >
> > From: Phil Hord <phil.hord@gmail.com>
> >
> > 'git tag -d' accepts one or more tag refs to delete, but each deletion
> > is done by calling `delete_ref` on each argv. This is painfully slow
> > when removing from packed refs. Use delete_refs instead so all the
> > removals can be done inside a single transaction with a single write.
>
> Nice, thanks for working on this.
>
> > I have a repo with 24,000 tags, most of which are not useful to any
> > developers. Having this many refs slows down many operations that
> > would otherwise be very fast. Removing these tags when they've been
> > accidentally fetched again takes about 30 minutes using delete_ref.
>
> I also get really slow times on a repo with ~20,000 tags (though order
> ~3 minutes rather than ~30, probably due to having an SSD on this
> machine) -- but ONLY IF the refs are packed first (git pack-refs
> --all).  If the refs are loose, it's relatively quick to delete a
> dozen thousand or so tags (order of a few seconds).  It might be worth
> mentioning in the commit message that this only makes a significant
> difference in the case where the refs are packed.

I'm also using an SSD but I still see about 10 tags per second being
deleted with the current code (and packed-refs).  I see that I'm
CPU-bound, so I guess most of the time is spent searching through
.git/packed-refs.  Probably it will run faster as it progresses. I
guess the 18,000 branches in my repo keep me on the wrong end of O(N).

My VM is on an all-flash storage array, but I can't say much about its
write throughput since it's one VM among many.

Previously I thought I saw a significant speedup between v2.7.4 (on my
development VM) and v2.22.0 (on my laptop). But this week I saw it was
slow again on my laptop.  I looked for the regression but didn't find
anyone touching that code. Then I wrote this patch.

But it should have occurred to me while I was in the code that there
is a different path for unpacked refs which could explain my previous
speeds.  I didn't think I had any unpacked refs, though, since every
time I look in .git/refs for what I want, I find it relatively empty.
I see 'git pack-refs --help' says that new refs should show up loose,
but I can't say that has happened for me.  Maybe a new clone uses
packed-refs for *everything* and only newly fetched things are loose.
Is that it?  I guess since I seldom fetch tags after the first clone,
it makes sense they would all be packed.
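
One way to check whether tags are loose or packed is to count both. The
sketch below is only an illustration, not part of the patch: it builds a
throwaway repository (the tag names and the temporary directory are made
up), and it assumes the default files-based ref storage, where loose refs
are files under refs/ and packed ones are lines in packed-refs.

```shell
#!/bin/sh
# Sketch: count loose vs. packed tags (assumes the files ref backend).
set -e

# Throwaway repository so nothing real is touched (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"

# Three tags that get packed, then two created afterwards.
git tag packed-1
git tag packed-2
git tag packed-3
git pack-refs --all
git tag loose-1
git tag loose-2

gitdir=$(git rev-parse --git-dir)

# Loose tags are individual files under refs/tags.
loose=$(find "$gitdir/refs/tags" -type f | wc -l | tr -d ' ')

# Packed tags are "<object id> refs/tags/<name>" lines in packed-refs.
# grep -c exits non-zero when nothing matches, hence the "|| true".
packed=$(grep -c ' refs/tags/' "$gitdir/packed-refs" || true)

echo "loose tags:  $loose"
echo "packed tags: $packed"
```

In a real repository only the last few commands are needed. In a linked
worktree the refs live in the common directory, so the path would have to
be adjusted there.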

> >     git tag -l feature/* | xargs git tag -d
> >
> > Removing the same tags using delete_refs takes less than 5 seconds.
>
> It appears this same bug also affects `git branch -d` when deleting
> lots of branches (or remote tracking branches) and they are all
> packed; could you apply the same fix there?

Will do.

> In contrast, it appears that `git update-ref --stdin` is fast
> regardless of whether the refs are packed, e.g.
>    git tag -l feature/* | sed -e 's%^%delete refs/tags/%' | git update-ref --stdin
> finishes quickly (order of a few seconds).

Nice!  That trick is going in my wiki for devs to use on their VMs.
Thanks for that.
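
For the wiki, the trick quoted above can be tried out safely first. This
is a minimal sketch under stated assumptions: the repository is a
throwaway one created in a temporary directory, the tag names are
invented for the demo, and the pattern is quoted so the shell does not
expand it against files in the working directory.

```shell
#!/bin/sh
# Sketch: delete many packed tags in one `git update-ref --stdin` run.
set -e

# Throwaway repository so the example cannot delete real tags.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"

# Demo tags (hypothetical names): five to delete, one to keep.
for i in 1 2 3 4 5; do
    git tag "feature/demo-$i"
done
git tag keep-me

# Pack everything, since packed refs are the slow case for `git tag -d`.
git pack-refs --all

# Turn each matching tag name into a "delete <ref>" command and feed
# all of them to a single update-ref invocation.
git tag -l 'feature/*' |
    sed -e 's%^%delete refs/tags/%' |
    git update-ref --stdin

remaining=$(git tag -l)
echo "remaining tags: $remaining"
```

Only the pipeline in the middle is the actual trick; the rest is
scaffolding to make the example self-contained.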
