git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Duy Nguyen <pclouds@gmail.com>
To: Ben Peart <Ben.Peart@microsoft.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Eric Sunshine <sunshine@sunshineco.com>
Subject: Re: [PATCH v2] checkout: optimize "git checkout -b <new_branch>"
Date: Wed, 1 Aug 2018 17:10:25 +0200	[thread overview]
Message-ID: <CACsJy8DMEMsDnKZc65K-0EJcm2udXZ7OKY=xoFmX4COM0dSH=g@mail.gmail.com> (raw)
In-Reply-To: <20180731163909.19004-1-benpeart@microsoft.com>

On Tue, Jul 31, 2018 at 7:03 PM Ben Peart <Ben.Peart@microsoft.com> wrote:
>
> From: Ben Peart <Ben.Peart@microsoft.com>
>
> Skip merging the commit, updating the index and working directory if and
> only if we are creating a new branch via "git checkout -b <new_branch>."
> Any other checkout options will still go through the former code path.

I'd like to see this giant list of checks broken down and pushed down
to smaller areas so that chances of new things being added but checks
not updated become much smaller. And ideally there should just be no
behavior change (I think with your change, "checkout -b" will not
report local changes, but it's not mentioned in the config document;
more things like that can easily slip).

So. I assume this reason for this patch is because on super large worktrees

 - 2-way merge is too slow
 - applying spare checkout patterns on a huge worktree is also slow
 - writing index is, again, slow
 - show_local_changes() slow

For 2-way merge, I believe we can detect inside unpack_trees() that
it's a 2-way merge (fn == twoway_merge), from HEAD to HEAD (simple
enough check), then from the 2-way merge table we know for sure
nothing is going to change and we can just skip traverse_trees() call
in unpack_trees().

On the sparse checkout application. This only needs to be done when
there are new files added, or the spare-checkout file has been updated
since the last time it's been used. We can keep track of these things
(sparse-checkout file change could be kept track with just stat info
maybe as an index extension) then we can skip applying sparse checkout
not for this particular case for but general checkouts as well. Spare
checkout file rarely changes. Big win overall.

And if all go according to plan, there will be no changes made in the
index (by either 2-way merge or sparse checkout stuff) we should be
able to just skip writing down the index, if we haven't done that
already.

show_local_changes() should be sped up significantly with the new
cache-tree optimization I'm working on in another thread.

If I have not made any mistake in my analysis so far, we achieve a big
speedup without adding a new config knob and can still fall back to
slower-but-same-behavior when things are not in the right condition. I
know it will not be the same speedup as this patch because when facing
thousands of items, even counting them takes time. But I think it's a
reasonable speedup without making the code base more fragile.
-- 
Duy

  parent reply	other threads:[~2018-08-01 15:10 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-24 18:01 [PATCH v1] checkout: optionally speed up "git checkout -b foo" Ben Peart
2018-07-24 18:42 ` Eric Sunshine
2018-07-24 19:45   ` Ben Peart
2018-07-26 15:04     ` Junio C Hamano
2018-07-26 18:59       ` Eric Sunshine
2018-07-26 19:08         ` Eric Sunshine
2018-07-24 19:21 ` Junio C Hamano
2018-07-24 20:47   ` Ben Peart
2018-07-31 16:39 ` [PATCH v2] checkout: optimize "git checkout -b <new_branch>" Ben Peart
2018-07-31 20:01   ` Junio C Hamano
2018-08-01 15:10   ` Duy Nguyen [this message]
2018-08-02 18:02     ` Ben Peart
2018-08-03 15:58       ` Duy Nguyen
2018-08-06 14:25         ` Ben Peart
2018-08-15 21:05           ` Ben Peart
2018-08-05  8:57       ` Duy Nguyen
2018-08-16 18:27 ` [PATCH v3] " Ben Peart
2018-08-16 18:37   ` Duy Nguyen
2018-08-17 12:37     ` Ben Peart
2018-08-19  1:44       ` Elijah Newren
2018-08-20 13:40         ` Ben Peart
2018-08-20 18:16           ` Elijah Newren
2018-08-21 14:51             ` Duy Nguyen
2018-08-30 17:22               ` Elijah Newren
2018-09-04 16:46                 ` Duy Nguyen
2018-08-20 18:31         ` Junio C Hamano
2018-09-18  5:34   ` [PATCH] config doc: add missing list separator for checkout.optimizeNewBranch Ævar Arnfjörð Bjarmason
2018-09-18 16:57     ` Taylor Blau
2018-09-18 17:16       ` Jeff King
2018-09-18 17:20         ` Taylor Blau
2018-09-18 17:13     ` Jeff King
2018-09-19  4:41       ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACsJy8DMEMsDnKZc65K-0EJcm2udXZ7OKY=xoFmX4COM0dSH=g@mail.gmail.com' \
    --to=pclouds@gmail.com \
    --cc=Ben.Peart@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).