From: Brian Malehorn <bmalehorn@gmail.com>
To: Elijah Newren <newren@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: git stash --include-untracked walks ignored directories
Date: Tue, 9 Jun 2020 22:56:48 -0700
Message-ID: <CAJB88a23uU2WfB0mnB9NfNbtgmABhNOWNOEMBt7rRVu7uL_C9A@mail.gmail.com>
In-Reply-To: <CABPp-BHv+5XSXMpZ=-kM=a3C0Y+v=JY5m11s-QWj_krjCvvO4g@mail.gmail.com>

Ah, my original message was a bit misleading. I believe git slows down
based on the number of directories, not the number of files. Here's an
updated version of your script that creates a lot of directories
instead of a lot of files:

#!/bin/bash

rm -rf stupid
git init -q stupid
cd stupid

echo ignored >.gitignore
seq 1 10 >numbers-tracked
git add numbers-tracked .gitignore
git commit -q -m initial

seq 11 20 >>numbers-tracked
seq 21 30 >numbers-untracked

# Create 50 x 1000 empty directories under the ignored path.
mkdir ignored
cd ignored || exit 1
for i in $(seq 1 50); do
  for j in $(seq 1 1000); do
    echo "$i/$j"
  done
done | xargs mkdir -p
cd ..

echo "Number of directories in ignored before: $(find ignored -type d | wc -l)"
time git stash --include-untracked
git --version
echo "Number of directories in ignored after: $(find ignored -type d | wc -l)"

----

I got it to reproduce with the versions I have handy:

$ ./repro.sh
Number of directories in ignored before: 50051
Saved working directory and index state WIP on master: 0dabddf initial

real    0m0.023s
user    0m0.012s
sys     0m0.010s
git version 2.25.1
Number of directories in ignored after: 50051


$ ./repro.sh
Number of directories in ignored before: 50051
Saved working directory and index state WIP on master: 175d4ce initial

real    0m0.619s
user    0m0.157s
sys     0m0.452s
git version 2.27.0.83.g0313f36c6e
Number of directories in ignored after: 50051
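
As a sanity check on the count above: 50 x 1000 leaf directories, plus the 50 parents, plus "ignored" itself, gives 50051. Below is a minimal sketch for rebuilding the same tree at other sizes, to see how the timing scales with directory count. The function names (make_dir_tree, count_dirs, expected_dirs) are hypothetical helpers, not part of the original script, and both size arguments are assumed to be at least 1.

```shell
#!/bin/bash
# Hypothetical helpers (not from the original thread).

# Build OUTER x INNER empty directories under ROOT, the same way the
# repro script populates "ignored". Assumes OUTER and INNER are >= 1.
make_dir_tree() {
  local root=$1 outer=$2 inner=$3
  mkdir -p "$root" || return 1
  (
    cd "$root" || exit 1
    for i in $(seq 1 "$outer"); do
      for j in $(seq 1 "$inner"); do
        echo "$i/$j"
      done
    done | xargs mkdir -p
  )
}

# Count directories under ROOT, including ROOT itself.
count_dirs() {
  find "$1" -type d | wc -l | tr -d ' '
}

# Expected count: OUTER*INNER leaves + OUTER parents + the root.
expected_dirs() {
  echo $(( $1 * $2 + $1 + 1 ))
}
```

For example, running make_dir_tree ignored 50 1000 inside the test repository and comparing count_dirs ignored against expected_dirs 50 1000 confirms the tree is complete before timing git stash --include-untracked.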

On Tue, Jun 9, 2020 at 8:21 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Tue, Jun 9, 2020 at 1:39 PM Brian Malehorn <bmalehorn@gmail.com> wrote:
> >
> > Hi,
> >
> > Not sure if this is the right place to send this, but I'm here to
> > report a performance regression with git stash --include-untracked.
> >
> > Here's a quick way to reproduce:
> >
> > 1. make a directory with a lot of ignored files
> >
> > $ find ignored -type f | wc -l
> >    50000
> >
> > $ cat .gitignore
> > ignored
> >
> > 2. touch foo
> >
> > 3. time git stash --include-untracked
> >
> > git version 2.26.0:
> > real    0m0.094s
> >
> > git version 2.27.0.83.g0313f36c6e:
> > real    0m1.913s
> >
> > This is a much bigger pain point on my work repo, which has 1.4
> > million ignored files(!). As you can imagine it takes a long time to
> > run git stash. While it might be valid to question why anyone would
> > need that many files for any purpose, the bottom line is that I told
> > git to ignore this directory, and it didn't ignore it.
> >
> > In the meantime I've reverted to 2.26.0 which doesn't have this
> > performance regression. Let me know if you want any other information
> > related to this issue.
> >
> > Thanks,
> > Brian
>
> I seem to be missing some important step to reproduce; what else is
> needed?  Here's what I see:
>
> <Set path to use git-2.26.0>
> $ ./repro.sh
> Number of files in ignored before: 50000
> Saved working directory and index state WIP on master: e2b0471 initial
>
> real 0m0.029s
> user 0m0.014s
> sys 0m0.014s
> git version 2.26.0
> Number of files in ignored after: 50000
>
> <Set path to use git-2.27.0>
> $ ./repro.sh
> Number of files in ignored before: 50000
> Saved working directory and index state WIP on master: 5c596b8 initial
>
> real 0m0.052s
> user 0m0.014s
> sys 0m0.034s
> git version 2.27.0
> Number of files in ignored after: 50000
>
>
> Where repro.sh is:
>
> #!/bin/bash
>
> rm -rf stupid
> git init -q stupid
> cd stupid
>
> echo ignored >.gitignore
> seq 1 10 >numbers-tracked
> git add numbers-tracked .gitignore
> git commit -q -m initial
>
> seq 11 20 >>numbers-tracked
> seq 21 30 >numbers-untracked
>
> mkdir ignored
> cd ignored
> for i in $(seq 1 50000); do >$i; done
> cd ..
>
> echo "Number of files in ignored before: $(find ignored -type f | wc -l)"
> time git stash --include-untracked
> git --version
> echo "Number of files in ignored after: $(find ignored -type f | wc -l)"


Thread overview: 3+ messages
2020-06-09 19:33 git stash --include-untracked walks ignored directories Brian Malehorn
2020-06-10  3:21 ` Elijah Newren
2020-06-10  5:56   ` Brian Malehorn [this message]
