git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Joshua Redstone <joshua.redstone@fb.com>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Git performance results on a large repository
Date: Thu, 9 Feb 2012 21:06:21 +0000	[thread overview]
Message-ID: <CB599BA0.42A6B%joshua.redstone@fb.com> (raw)
In-Reply-To: <CACsJy8AxOZQ7S42V1g-b0vdBxPpjhFZe6qDkGaALnxQ6LiUssw@mail.gmail.com>

Hi Nguyen,
I like the notion of using --assume-unchanged to cut down the set of
things that git considers may have changed.
It seems to me that there may still be situations that require operations
on the order of the # of files in the repo and hence may still be slow.
Following is a list of potential candidates that occur to me.

1. Switching branches, especially if you switch to an old branch.
Sometimes I've seen branch switching taking a long time for what I thought
was close to where HEAD was.

2. Interactive rebase in which you reorder a few commits close to the tip
of the branch (I observed this taking a long time, but haven't profiled it
yet).  I include here other types of cherry-picking of commits.

3. Any working directory operations that fail part-way through and make
you want to do a 'git reset --hard' or at least a full 'git-status'.  That
is, when you have reason to believe that files with 'assume-unchange' may
have accidentally changed.

4. Operations that require rewriting the index - I think git-add is one?

If the working-tree representation is the full set of all files
materialized on disk and it's the same as the representation of files
changed, then I'm not sure how to avoid some of these without playing file
system games or using wrapper scripts.

What do you (or others) think?


Josh


On 2/7/12 8:43 AM, "Nguyen Thai Ngoc Duy" <pclouds@gmail.com> wrote:

>On Mon, Feb 6, 2012 at 10:40 PM, Joey Hess <joey@kitenet.net> wrote:
>>> Someone on HN suggested making assume-unchanged files read-only to
>>> avoid 90% accidentally changing a file without telling git. When
>>> assume-unchanged bit is cleared, the file is made read-write again.
>>
>> That made me think about using assume-unchanged with git-annex since it
>> already has read-only files.
>>
>> But, here's what seems a misfeature...
>
>because, well.. assume-unchanged was designed to avoid stat() and
>nothing else. We are basing a new feature on top of it.
>
>> If an assume-unstaged file has
>> modifications and I git add it, nothing happens. To stage a change, I
>> have to explicitly git update-index --no-assume-unchanged and only then
>> git add, and then I need to remember to reset the assume-unstaged bit
>> when I'm done working on that file for now. Compare with running git mv
>> on the same file, which does stage the move despite assume-unstaged. (So
>> does git rm.)
>
>This is normal in the lock-based "checkout/edit/checkin" model. mv/rm
>operates on directory content, which is not "locked - no edit allowed"
>(in our case --assume-unchanged) in git. But lock-based model does not
>map really well to git anyway. It does not have the index (which may
>make things more complicated). Also at index level, git does not
>really understand directories.
>
>I think we could add a protection layer to index, where any changes
>(including removal) to an index entry are only allowed if the entry is
>"unlocked" (i.e no assume-unchanged bit). Locked entries are read-only
>and have assume-unchanged bit set. "git (un)lock" are introduced as
>new UI. Does that make assume-unchanged friendlier?
>-- 
>Duy
>--
>To unsubscribe from this list: send the line "unsubscribe git" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2012-02-09 21:06 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-03 14:20 Git performance results on a large repository Joshua Redstone
2012-02-03 14:56 ` Ævar Arnfjörð Bjarmason
2012-02-03 17:00   ` Joshua Redstone
2012-02-03 22:40     ` Sam Vilain
2012-02-03 22:57       ` Sam Vilain
2012-02-07  1:19       ` Nguyen Thai Ngoc Duy
2012-02-03 23:05     ` Matt Graham
2012-02-04  1:25   ` Evgeny Sazhin
2012-02-03 23:35 ` Chris Lee
2012-02-04  0:01 ` Zeki Mokhtarzada
2012-02-04  5:07 ` Joey Hess
2012-02-04  6:53 ` Nguyen Thai Ngoc Duy
2012-02-04 18:05   ` Joshua Redstone
2012-02-05  3:47     ` Nguyen Thai Ngoc Duy
2012-02-06 15:40       ` Joey Hess
2012-02-07 13:43         ` Nguyen Thai Ngoc Duy
2012-02-09 21:06           ` Joshua Redstone [this message]
2012-02-10  7:12             ` Nguyen Thai Ngoc Duy
2012-02-10  9:39               ` Christian Couder
2012-02-10 12:24                 ` Nguyen Thai Ngoc Duy
2012-02-06  7:10     ` David Mohs
2012-02-06 16:23     ` Matt Graham
2012-02-06 20:50       ` Joshua Redstone
2012-02-06 21:07         ` Greg Troxel
2012-02-07  1:28         ` david
2012-02-06 21:17     ` Sam Vilain
2012-02-04 20:05   ` Joshua Redstone
2012-02-05 15:01   ` Tomas Carnecky
2012-02-05 15:17     ` Nguyen Thai Ngoc Duy
2012-02-04  8:57 ` slinky
2012-02-04 21:42 ` Greg Troxel
2012-02-05  4:30 ` david
2012-02-05 11:24   ` David Barr
2012-02-07  8:58 ` Emanuele Zattin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CB599BA0.42A6B%joshua.redstone@fb.com \
    --to=joshua.redstone@fb.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).