From: Elijah Newren <newren@gmail.com>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>,
"Git Mailing List" <git@vger.kernel.org>,
"Martin Melka" <martin.melka@gmail.com>,
"Samuel Lijin" <sxlijin@gmail.com>,
"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [PATCH 5/6] dir: replace exponential algorithm with a linear one
Date: Fri, 31 Jan 2020 09:47:45 -0800 [thread overview]
Message-ID: <CABPp-BG8wAmOL03mU2Tf8woVKdWw7o6muXenMH4G-Z6_kBEajQ@mail.gmail.com> (raw)
In-Reply-To: <20200131171341.GH10482@szeder.dev>
On Fri, Jan 31, 2020 at 9:13 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Wed, Jan 29, 2020 at 10:03:42PM +0000, Elijah Newren via GitGitGadget wrote:
> > Part of the tension noted above is that the treatment of a directory can
> > changed based on the files within it, and based on the various settings
>
> s/changed/change/, or perhaps s/changed/be changed/ ?
>
> > Since dir.c is somewhat complex, extra cruft built up around this over
> > time. While trying to unravel it, I noticed several instances where the
> > first call to read_directory_recursive() would return e.g.
> > path_untracked for a some directory and a later one would return e.g.
>
> s/for a some/for some/
>
> > However, on the positive side, it does make the code much faster. For
> > the following simple shell loop in an empty repository:
> >
> > for depth in $(seq 10 25)
> > do
> > dirs=$(for i in $(seq 1 $depth) ; do printf 'dir/' ; done)
> > rm -rf dir
> > mkdir -p $dirs
> > >$dirs/untracked-file
> > /usr/bin/time --format="$depth: %e" git status --ignored >/dev/null
> > done
> >
> > I saw the following timings, in seconds (note that the numbers are a
> > little noisy from run-to-run, but the trend is very clear with every
> > run):
> >
> > 10: 0.03
> > 11: 0.05
> > 12: 0.08
> > 13: 0.19
> > 14: 0.29
> > 15: 0.50
> > 16: 1.05
> > 17: 2.11
> > 18: 4.11
> > 19: 8.60
> > 20: 17.55
> > 21: 33.87
> > 22: 68.71
> > 23: 140.05
> > 24: 274.45
> > 25: 551.15
> >
> > After this fix, those drop to:
> >
> > 10: 0.00
> > 11: 0.00
> > 12: 0.00
> > 13: 0.00
> > 14: 0.00
> > 15: 0.00
> > 16: 0.00
> > 17: 0.00
> > 18: 0.00
> > 19: 0.00
> > 20: 0.00
> > 21: 0.00
> > 22: 0.00
> > 23: 0.00
> > 24: 0.00
> > 25: 0.00
>
> I agree with Derrick here: if you just said that all these report
> 0.00, I would have taken your word for it.
Thanks, I'll include all these fixes. Good timing too, as I was about
to send a re-roll.
> Having said that... I don't know how to get more decimal places out
> of /use/bin/time, but our trace performance facility uses nanosecond
> resolution timestamps. So using this command in the loop above:
>
> GIT_TRACE_PERFORMANCE=2 git status --ignored 2>&1 >/dev/null |
> sed -n -e "s/.* performance: \(.*\): git command.*/$depth: \1/p"
>
> gave me this:
>
> 1: 0.000574302 s
> 2: 0.000584995 s
> 3: 0.000608684 s
> 4: 0.000951336 s
> 5: 0.000762019 s
> 6: 0.000816685 s
> 7: 0.000672516 s
> 8: 0.000912628 s
> 9: 0.000661538 s
> 10: 0.000687465 s
> 11: 0.000708880 s
> 12: 0.000693754 s
> 13: 0.000726120 s
> 14: 0.000737334 s
> 15: 0.000787362 s
> 16: 0.000856687 s
> 17: 0.000780892 s
> 18: 0.000790798 s
> 19: 0.000834411 s
> 20: 0.000859094 s
> 21: 0.001230912 s
> 22: 0.001048852 s
> 23: 0.000891057 s
> 24: 0.000934097 s
> 25: 0.001051704 s
>
> Not sure it's worth including, though.
Yeah, I'm afraid people will spend time trying to analyze it and the
numbers are extremely noisy. I instead included some words about
counting the number of untracked files opened according to strace,
which shows before we had 2^(1+$depth)-2 untracked directories get
opened and after we had exactly $depth get opened.
next prev parent reply other threads:[~2020-01-31 17:48 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-29 22:03 [PATCH 0/6] Avoid multiple recursive calls for same path in read_directory_recursive() Elijah Newren via GitGitGadget
2020-01-29 22:03 ` [PATCH 1/6] dir: consolidate treat_path() and treat_one_path() Elijah Newren via GitGitGadget
2020-01-29 22:03 ` [PATCH 2/6] dir: fix broken comment Elijah Newren via GitGitGadget
2020-01-29 22:03 ` [PATCH 3/6] dir: fix confusion based on variable tense Elijah Newren via GitGitGadget
2020-01-30 15:20 ` Derrick Stolee
2020-01-31 18:04 ` SZEDER Gábor
2020-01-31 18:17 ` Elijah Newren
2020-01-29 22:03 ` [PATCH 4/6] dir: move setting of nested_repo next to its actual usage Elijah Newren via GitGitGadget
2020-01-30 15:33 ` Derrick Stolee
2020-01-30 15:45 ` Elijah Newren
2020-01-30 16:00 ` Derrick Stolee
2020-01-30 16:10 ` Derrick Stolee
2020-01-30 16:20 ` Elijah Newren
2020-01-30 18:17 ` Derrick Stolee
2020-01-29 22:03 ` [PATCH 5/6] dir: replace exponential algorithm with a linear one Elijah Newren via GitGitGadget
2020-01-30 15:55 ` Derrick Stolee
2020-01-30 17:13 ` Elijah Newren
2020-01-30 17:45 ` Elijah Newren
2020-01-31 17:13 ` SZEDER Gábor
2020-01-31 17:47 ` Elijah Newren [this message]
2020-01-29 22:03 ` [PATCH 6/6] t7063: blindly accept diffs Elijah Newren via GitGitGadget
2020-01-31 18:31 ` [PATCH v2 0/6] Avoid multiple recursive calls for same path in read_directory_recursive() Elijah Newren via GitGitGadget
2020-01-31 18:31 ` [PATCH v2 1/6] dir: consolidate treat_path() and treat_one_path() Elijah Newren via GitGitGadget
2020-01-31 18:31 ` [PATCH v2 2/6] dir: fix broken comment Elijah Newren via GitGitGadget
2020-01-31 18:31 ` [PATCH v2 3/6] dir: fix confusion based on variable tense Elijah Newren via GitGitGadget
2020-01-31 18:31 ` [PATCH v2 4/6] dir: refactor treat_directory to clarify control flow Derrick Stolee via GitGitGadget
2020-01-31 18:31 ` [PATCH v2 5/6] dir: replace exponential algorithm with a linear one Elijah Newren via GitGitGadget
2020-01-31 18:31 ` [PATCH v2 6/6] t7063: blindly accept diffs Elijah Newren via GitGitGadget
2020-03-25 19:31 ` [PATCH v3 0/7] Avoid multiple recursive calls for same path in read_directory_recursive() Elijah Newren via GitGitGadget
2020-03-25 19:31 ` [PATCH v3 1/7] t7063: correct broken test expectation Elijah Newren via GitGitGadget
2020-03-26 13:02 ` Derrick Stolee
2020-03-26 21:18 ` Elijah Newren
2020-03-25 19:31 ` [PATCH v3 2/7] dir: fix simple typo in comment Elijah Newren via GitGitGadget
2020-03-25 19:31 ` [PATCH v3 3/7] dir: consolidate treat_path() and treat_one_path() Elijah Newren via GitGitGadget
2020-03-25 19:31 ` [PATCH v3 4/7] dir: fix broken comment Elijah Newren via GitGitGadget
2020-03-25 19:31 ` [PATCH v3 5/7] dir: fix confusion based on variable tense Elijah Newren via GitGitGadget
2020-03-25 19:31 ` [PATCH v3 6/7] dir: refactor treat_directory to clarify control flow Derrick Stolee via GitGitGadget
2020-03-25 19:31 ` [PATCH v3 7/7] dir: replace exponential algorithm with a linear one, fix untracked cache Elijah Newren via GitGitGadget
2020-03-26 13:13 ` Derrick Stolee
2020-03-26 21:27 ` [PATCH v4 0/7] Avoid multiple recursive calls for same path in read_directory_recursive() Elijah Newren via GitGitGadget
2020-03-26 21:27 ` [PATCH v4 1/7] t7063: more thorough status checking Elijah Newren via GitGitGadget
2020-03-27 13:09 ` Derrick Stolee
2020-03-29 18:18 ` Junio C Hamano
2020-03-31 20:15 ` Elijah Newren
2020-03-26 21:27 ` [PATCH v4 2/7] dir: fix simple typo in comment Elijah Newren via GitGitGadget
2020-03-26 21:27 ` [PATCH v4 3/7] dir: consolidate treat_path() and treat_one_path() Elijah Newren via GitGitGadget
2020-03-26 21:27 ` [PATCH v4 4/7] dir: fix broken comment Elijah Newren via GitGitGadget
2020-03-26 21:27 ` [PATCH v4 5/7] dir: fix confusion based on variable tense Elijah Newren via GitGitGadget
2020-03-26 21:27 ` [PATCH v4 6/7] dir: refactor treat_directory to clarify control flow Derrick Stolee via GitGitGadget
2020-03-26 21:27 ` [PATCH v4 7/7] dir: replace exponential algorithm with a linear one Elijah Newren via GitGitGadget
2020-03-27 13:13 ` [PATCH v4 0/7] Avoid multiple recursive calls for same path in read_directory_recursive() Derrick Stolee
2020-03-28 17:33 ` Elijah Newren
2020-03-29 18:20 ` Junio C Hamano
2020-04-01 4:17 ` [PATCH v5 00/12] " Elijah Newren via GitGitGadget
2020-04-01 4:17 ` [PATCH v5 01/12] t7063: more thorough status checking Elijah Newren via GitGitGadget
2020-04-01 4:17 ` [PATCH v5 02/12] t3000: add more testcases testing a variety of ls-files issues Elijah Newren via GitGitGadget
2020-04-01 4:17 ` [PATCH v5 03/12] dir: fix simple typo in comment Elijah Newren via GitGitGadget
2020-04-01 4:17 ` [PATCH v5 04/12] dir: consolidate treat_path() and treat_one_path() Elijah Newren via GitGitGadget
2020-04-01 4:17 ` [PATCH v5 05/12] dir: fix broken comment Elijah Newren via GitGitGadget
2020-04-01 4:17 ` [PATCH v5 06/12] dir: fix confusion based on variable tense Elijah Newren via GitGitGadget
2020-04-01 4:17 ` [PATCH v5 07/12] dir: refactor treat_directory to clarify control flow Derrick Stolee via GitGitGadget
2020-04-01 4:17 ` [PATCH v5 08/12] dir: replace exponential algorithm with a linear one Elijah Newren via GitGitGadget
2020-04-01 13:57 ` Derrick Stolee
2020-04-01 15:59 ` Elijah Newren
2020-04-01 4:17 ` [PATCH v5 09/12] dir: include DIR_KEEP_UNTRACKED_CONTENTS handling in treat_directory() Elijah Newren via GitGitGadget
2020-04-01 4:17 ` [PATCH v5 10/12] dir: replace double pathspec matching with single " Elijah Newren via GitGitGadget
2020-04-01 4:17 ` [PATCH v5 11/12] Fix error-prone fill_directory() API; make it only return matches Elijah Newren via GitGitGadget
2020-07-19 6:33 ` Andreas Schwab
2020-07-19 12:39 ` Martin Ågren
2020-07-20 15:25 ` Elijah Newren
2020-07-20 18:45 ` [PATCH] dir: check pathspecs before returning `path_excluded` Martin Ågren
2020-07-20 18:49 ` Elijah Newren
2020-07-20 18:51 ` Martin Ågren
2020-07-20 20:25 ` Junio C Hamano
2020-07-20 18:58 ` [PATCH v5 11/12] Fix error-prone fill_directory() API; make it only return matches Junio C Hamano
2020-04-01 4:17 ` [PATCH v5 12/12] completion: fix 'git add' on paths under an untracked directory Elijah Newren via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CABPp-BG8wAmOL03mU2Tf8woVKdWw7o6muXenMH4G-Z6_kBEajQ@mail.gmail.com \
--to=newren@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=martin.melka@gmail.com \
--cc=pclouds@gmail.com \
--cc=sxlijin@gmail.com \
--cc=szeder.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).