git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Phillip Wood <phillip.wood123@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Elijah Newren <newren@gmail.com>
Cc: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Philip Oakley <philipoakley@iee.email>
Subject: Re: unifying sequencer's options persisting, was Re: [PATCH v2] sequencer: fix edit handling for cherry-pick and revert messages
Date: Fri, 2 Apr 2021 12:28:43 +0100	[thread overview]
Message-ID: <3b117e65-bf9f-af13-b093-28bbbd6f9bb3@gmail.com> (raw)
In-Reply-To: <nycvar.QRO.7.76.6.2103311533340.52@tvgsbejvaqbjf.bet>

Hi Dscho and Elijah

On 31/03/2021 14:48, Johannes Schindelin wrote:
> Hi,
> 
> On Tue, 30 Mar 2021, Elijah Newren wrote:
> 
>> On Tue, Mar 30, 2021 at 3:13 AM Johannes Schindelin
>> <Johannes.Schindelin@gmx.de> wrote:
>>
>>> I'll allow myself one tangent: the subject of the sequencer's Unix
>>> shell script heritage seems to come up with an increasing frequency,
>>> in particular the awful "let's write out one file per setting"
>>> strategy.
>>>
>>> I would _love_ for `save_opts()` to write a JSON instead (or an INI
>>> via the `git_config_*()` family of functions, as is done already by
>>> the cherry-pick/revert stuff), now that we no longer have any shell
>>> script backend (apart from `--preserve-merges`, but that one is on its
>>> way out anyway).
>>>
>>> The one thing that concerns me with this idea is that I know for a
>>> fact that some enterprisey users play games with those files inside
>>> `<GIT_DIR>/rebase-merge` that should be considered internal
>>> implementation details. Not sure how to deprecate that properly, I
>>> don't think we have a sane way to detect whether users rely on these
>>> implementation details other than breaking their expectations, which
>>> is not really a gentle way to ask them to update their scripts.

I think it depends if users are just reading the files or writing to 
them. If they are reading them (which my scripts do) then we could 
continue to write the important files along side the new single-file 
that git actually reads. I think there is a distinction between the 
files such as git-rebase-todo which hold state and the one-line files 
which hold the options passed by the user. For example I have scripts 
that read git-rebase-todo, rewritten-pending, rewritten-list, amend-head 
and check that author-script exists. If a script starts a rebase then it 
should know what options are in effect without reading them from 
.git/rebase-merge.

>> Ooh, I'm glad you took this tangent.  May I follow it for a second?
>> I'd really, really like this too, for three reasons:
>>
>> 1) I constantly get confused about the massive duplication and
>> difference in control structures and which ones affect which type of
>> operations in sequencer.c.  Having them both use an ini-file would
>> allow us to remove lots of that duplication.  I'm sure there'll still
>> be some rebase-specific or cherry-pick-specific options, but we don't
>> need a separate parallel structure for every piece of config.
>>
>> 2) In my merge-ort optimization work, rebasing 35 commits in linux.git
>> across a massive set of 26K upstream renames has dropped rename
>> detection time from the top spot.  And it also dropped several other
>> things in the merge machinery from their spots as well.  Do you know
>> what's the slowest now?  Wasted time from sequencer.c: the unnecessary
>> process forking and all the useless disk writing (which I suspect is
>> mostly updating the index and working directory but also writing all
>> the individual control files under .git/rebase-merge/).  And this
>> stuff from sequencer.c is not just barely the slowest part, it's the
>> slowest by a wide margin.

I think we do a lot of needless writing which is unrelated to whether we 
write to one file or may files. For example from memory picking a commit 
involves writing the 'message', 'author-script', 'rewritten-pending' 
(which is often immediately deleted), 'rewritten-list' (we append to 
that one) 'CHERRY_PICK_HEAD' (which is immediately deleted unless the 
pick has become empty), 'git-rebase-todo' is completely rewritten, and 
'done' is appended to. None of this is necessary. For rewording and 
merges the only thing that needs to be written is the commit message for 
the external process to use. Fixup and squash add a couple more files 
into the mix.

I think we would save a lot by only syncing the state to disk when we 
stop or run an exec command (the state needs to be synced so exec 
commands can alter the todo list). In those cases we need to write the 
index and possibly run an external process so writing a couple of files 
is probably insignificant.

Where I think we can usefully consolidate is the one-line files which 
store the options rather than state - these are read an written much 
less frequently so I don't think they have much of a performance hit but 
it would be much nicer to just serialize the options to a single file.

>>
>> 3) I also want to allow cherry-picking or rebasing branches that
>> aren't even checked out (assuming no conflicts are triggered;
>> otherwise an error can be shown with the user asked to repeat with the
>> operation connected to an active working directory).

Exciting!

>>  For such an
>> operation, the difference between "cherry-pick" and "rebase" is nearly
>> irrelevant so you'd expect the code to be the same; every time I look
>> at the code, though, it seems that the control structures are
>> separating these two types of operations in more areas than just the
>> reading and writing of the config.

Yes this can be confusing, for example rebase and cherry-pick handle the 
todo list differently. Rebase removes the command before trying to pick 
the commit and adds it back if the pick fails for a non-conflict reason, 
cherry-pick only removes the command if the pick is successful.

Best Wishes

Phillip

> Excellent, we're in agreement, then.
> 
> The remaining question is: how do we want to go about it? Do we just want
> to codify the notion that these are internal details that are nobody's
> business, and if they are using the exact file system layout of the
> current sequencer, then it's their responsibility to adapt?
> 
> Ciao,
> Dscho
> 


  reply	other threads:[~2021-04-02 11:28 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-26  7:16 [PATCH] sequencer: fix edit handling for cherry-pick and revert messages Elijah Newren via GitGitGadget
2021-03-26 12:27 ` Philip Oakley
2021-03-26 15:12   ` Elijah Newren
2021-03-28  1:30     ` Junio C Hamano
2021-03-29  9:23 ` Phillip Wood
2021-03-29 20:52   ` Junio C Hamano
2021-03-29 21:25   ` Elijah Newren
2021-03-30  2:09 ` [PATCH v2] " Elijah Newren via GitGitGadget
2021-03-30 10:13   ` Johannes Schindelin
2021-03-30 18:47     ` Junio C Hamano
2021-03-30 20:16       ` Elijah Newren
2021-03-31 17:36         ` Ævar Arnfjörð Bjarmason
2021-03-31 17:52           ` Elijah Newren
2021-03-31 18:01         ` Junio C Hamano
2021-04-01 16:31           ` Elijah Newren
2021-03-30 19:37     ` Elijah Newren
2021-03-31 13:48       ` unifying sequencer's options persisting, was " Johannes Schindelin
2021-04-02 11:28         ` Phillip Wood [this message]
2021-04-02 13:10           ` Phillip Wood
2021-04-02 21:01           ` Junio C Hamano
2021-04-02 22:18             ` Elijah Newren
2021-04-02 22:27               ` Junio C Hamano
2021-04-08  2:40                 ` Johannes Schindelin
2021-04-08 17:45                   ` Junio C Hamano
2021-04-08 19:58                   ` Christian Couder
2021-04-09 13:53                     ` Johannes Schindelin
2021-03-31  6:52   ` [PATCH v3] " Elijah Newren via GitGitGadget
2021-03-31 14:38     ` Johannes Schindelin
  -- strict thread matches above, loose matches on Subject: below --
2021-04-02 11:40 unifying sequencer's options persisting, was Re: [PATCH v2] " Gabriel Young

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3b117e65-bf9f-af13-b093-28bbbd6f9bb3@gmail.com \
    --to=phillip.wood123@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=newren@gmail.com \
    --cc=philipoakley@iee.email \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).