git@vger.kernel.org mailing list mirror (one of many)
From: Lucas De Marchi <lucas.demarchi@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: git@vger.kernel.org
Subject: Re: [BUG REPORT] split-index behavior during interactive rebase
Date: Tue, 21 Sep 2021 00:34:02 -0700
Message-ID: <20210921073402.cf4y3gp7yyfirfnq@ldmartin-desk2>
In-Reply-To: <20210916055057.GT3389343@mdroper-desk1.amr.corp.intel.com>

On Wed, Sep 15, 2021 at 10:50:57PM -0700, Matt Roper wrote:
>What did you do before the bug happened? (Steps to reproduce your issue)
>
>  I activated split index mode on a repo ("git config core.splitIndex
>  true"), performed an interactive rebase, modified a commit earlier in
>  the history.
>
>  The steps can be reproduced via a sequence of:
>      $ mkdir tmp && cd tmp && git init
>      $ git config core.splitIndex true
>      $ for x in `seq 20`; do echo $x >> count; git add count; git commit -m "Commit $x"; done
>      $ git rebase -i HEAD~10
>
>      ## Add "x git commit --amend --no-edit" as the first command of
>      ## the todo list.
>
>What did you expect to happen? (Expected behavior)
>
>  My expectation was that there would still only be a single shared index
>  file in the .git directory upon completion of the rebase.
>
>What happened instead? (Actual behavior)
>
>  A large number of distinct sharedindex.* files were generated in the .git
>  directory during the rebase.

This is probably relevant for debugging, although I still haven't figured
out the cause. The following works fine and only one .git/sharedindex.*
file is created:

	git config core.splitIndex true
	git am 000[123].patch
	git config core.splitIndex false

Prepare test:
	git config core.splitIndex false
	git update-index --no-split-index
	rm .git/sharedindex.*
	git reset --hard HEAD~3

	git -c core.splitIndex=true am 000[123].patch

This will create 4 .git/sharedindex.* files.

After that, each call to git status creates 1 more .git/sharedindex.* file
if the current HEAD doesn't match the previous one and the splitIndex
setting doesn't match the one used previously. The count keeps increasing:

	git reset --hard ORIG_HEAD; git -c core.splitIndex=true status; ls -l .git/sharedindex.* | wc -l
	...
	4
	git reset --hard ORIG_HEAD; git -c core.splitIndex=true status; ls -l .git/sharedindex.* | wc -l
	...
	5
	...
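
For anyone repeating this, the `ls -l ... | wc -l` pipeline can be wrapped in a
small helper that reports 0 for a directory with no shared index files instead
of an ls error. This is only a sketch; the name count_sharedindex is made up
for illustration and is not part of git.

```shell
# Hypothetical helper (not a git command): count sharedindex.* files
# in a git directory. Defaults to ./.git when no argument is given.
# The body runs in a subshell, so its variables do not leak.
count_sharedindex() (
    gitdir=${1:-.git}
    n=0
    for f in "$gitdir"/sharedindex.*; do
        # An unmatched glob stays as the literal pattern; skip it.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
)
```

Calling `count_sharedindex` after each `git status` gives the same number as
the pipeline above whenever at least one shared index file exists.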

Note that if I pass -c core.splitIndex=true to git reset, this behavior
goes away. It seems that the splitIndex setting is somehow getting reset
during git-am with multiple patches (or during rebase)... ?
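
The reset/status cycle can also be scripted. A rough sketch, assuming it is
run inside a throwaway test repository right after the git am step, so that
ORIG_HEAD exists; repro_cycles is a hypothetical helper name, not a git
command. Each `git reset --hard ORIG_HEAD` moves ORIG_HEAD to the previous
HEAD, so successive cycles alternate between the same two commits.

```shell
# Hypothetical helper: repeat the reset/status cycle N times (default 3)
# and print the number of shared index files after each cycle.
# WARNING: git reset --hard discards local changes; use a scratch repo.
repro_cycles() (
    cycles=${1:-3}
    i=1
    while [ "$i" -le "$cycles" ]; do
        # reset runs without core.splitIndex set, status runs with it
        git reset --hard ORIG_HEAD >/dev/null || exit 1
        git -c core.splitIndex=true status >/dev/null || exit 1
        count=$(ls .git/sharedindex.* 2>/dev/null | wc -l | tr -d ' ')
        printf 'cycle %s: %s shared index file(s)\n' "$i" "$count"
        i=$((i + 1))
    done
)
```

For example, `repro_cycles 5` prints one line per cycle; a count that grows
from line to line would correspond to the behavior described above.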

Lucas De Marchi

>
>What's different between what you expected and what actually happened?
>
>  Rather than a single shared index file, I wound up with a huge number of
>  large shared index files.  The real repository I was working with (a Linux
>  kernel source tree) had a shared index file size of about 7MB, and I was
>  modifying a commit several hundred back in history (in case it
>  matters, these were all linear commits, no merges), so the resulting
>  collection of shared index files consumed a surprising amount of disk
>  space.
>
>Anything else you want to add:
>
>  As an experiment, I tried setting splitIndex.sharedIndexExpire=now to see
>  if it would avoid the explosion of shared index files, but it appears the
>  stale index files are still not being removed during the rebase, and I
>  still wind up with a huge number at the end of the rebase.  If I manually
>  run "git update-index --split-index" after the rebase completes it will
>  properly delete all of the stale ones at that point.
>
>  Rebases that do not actually modify the history do _not_ trigger the
>  explosion of shared index files (e.g., "git rebase -i HEAD~10 --exec 'echo
>  foo'").
>
>  If I do not set the core.splitIndex setting on the repository, but only
>  activate split index manually via "git update-index --split-index" there
>  is only one shared index file at the end of the rebase, but based on the
>  file size it appears the repository is no longer operating in split index
>  mode.
>
>  Before:
>  $ ll .git | grep index
>  -rw-rw-r--   1 mdroper mdroper   149165 Sep 15 22:21 index
>  -rw-rw-r--   1 mdroper mdroper  7296080 Sep 15 22:21 sharedindex.f916dd59ccc22ca34298f557a4659aca2767dae4
>
>  After (just amending HEAD~1 in this case):
>  $ ls -l .git | grep index
>  -rw-rw-r--   1 mdroper mdroper  7445145 Sep 15 22:22 index
>  -rw-rw-r--   1 mdroper mdroper  7296080 Sep 15 22:22 sharedindex.f916dd59ccc22ca34298f557a4659aca2767dae4
>
>
>[System Info]
>git version:
>git version 2.33.0
>cpu: x86_64
>no commit associated with this build
>sizeof-long: 8
>sizeof-size_t: 8
>shell-path: /bin/sh
>uname: Linux 5.8.18-100.fc31.x86_64 #1 SMP Mon Nov 2 20:32:55 UTC 2020 x86_64
>compiler info: gnuc: 9.3
>libc info: glibc: 2.30
>$SHELL (typically, interactive shell): /bin/bash
>
>
>[Enabled Hooks]
>
>-- 
>Matt Roper
>Graphics Software Engineer
>VTT-OSGC Platform Enablement
>Intel Corporation
>(916) 356-2795

Thread overview: 4+ messages
2021-09-16  5:50 [BUG REPORT] split-index behavior during interactive rebase Matt Roper
2021-09-21  7:34 ` Lucas De Marchi [this message]
2021-09-26 21:57 ` SZEDER Gábor
2021-09-27  2:17   ` Matt Roper
