git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Re: Rebase sequencer changes prevent exec commands from modifying the todo file?
       [not found] <CAKNkOnM366uiJKkz31hS8V3NTa8qksP2pXrH4+F-zodZaNdsqg@mail.gmail.com>
@ 2017-03-02 15:25 ` Johannes Schindelin
       [not found]   ` <CAKNkOnOkSgFei7jpck8Z7tH+jYn_MXvarA86GAadT8jJt4aO-g@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2017-03-02 15:25 UTC (permalink / raw)
  To: Stephen Hicks; +Cc: git

Hi Stephen,

On Wed, 1 Mar 2017, Stephen Hicks wrote:

> I have a preferred rebase workflow wherein I generate my own TODO file
> with 'exec' lines that further modify the TODO file.  I recently noticed
> that these changes weren't sticking: they seemed to persist until the
> end of the exec'd command, and then as soon as the next command ran
> (i.e.  GIT_EDITOR=cat git rebase --edit-todo) they were gone.
> 
> I don't understand the changes that have been going through, but I suspect
> this is a result of the sequencer refactoring.  Is there a way to prevent
> git rebase from overwriting these changes?
> 
> To reproduce:
> $ git rebase -i HEAD -x "echo x false >> \"$(git rev-parse
> --git-dir)/rebase-merge/git-rebase-todo\""
> 
> This should cause the rebase to fail (and indeed, it does on 2.11.0.390)
> since it inserts an "x false".  But somewhere between there and
> 2.12.0.rc1.440, this behavior is changed.

Do you also modify the author-script file to execute arbitrary shell
commands? ;-)

Seriously again, it should not be too much of a deal to handle your use
case by re-reading the git-rebase-todo file if it has changed after an
`exec` has modified it. We already force a re-read of the index.

I won't be able to take care of that immediately, though, as I have a
pressing other patch series I need to get done.

If you want to take a crack at it in the meantime, I think this is where I
would start:

https://github.com/git-for-windows/git/blob/ce4c6ca554/sequencer.c#L2020-L2027

I would probably try to make the code smart by looking at the
timestamp/size/inode fields of the stat data of git-rebase-todo, and only
force a re-read in case those fields are different, the `res` variable is
0, and then the code would need to `continue;` in order to skip the
increment of `todo_list->current`.

And before that, I'd turn your example into a test case in
t/t3405-rebase-malformed.sh...

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Rebase sequencer changes prevent exec commands from modifying the todo file?
       [not found]   ` <CAKNkOnOkSgFei7jpck8Z7tH+jYn_MXvarA86GAadT8jJt4aO-g@mail.gmail.com>
@ 2017-04-12 22:05     ` Johannes Schindelin
  2017-04-12 23:26       ` Stephen Hicks
  0 siblings, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2017-04-12 22:05 UTC (permalink / raw)
  To: Stephen Hicks; +Cc: git

Hi Stephen,

On Tue, 11 Apr 2017, Stephen Hicks wrote:

> Thanks for the tips.  I think I have an approach that works, by simply
> returning sequencer_continue() immediately after a successful exec.

I am not sure that that works really as expected, as you re-enter the
sequencer_continue() and neither the original author nor I expected nested
calls.

> I'm hesitant to only use mtime, size, and inode, since it's quite likely
> that these are all identical even if the file has changed.

Not at all. The mtime and the size will most likely be different.

I am reluctant to take your wholesale approach, as I perform literally
dozens of rebases with >100 commits, including plenty of exec calls, and I
want the rebase to become faster instead of slower.

> Say, the command is simply a `sed -i 's/^exec /#### /g'`, then the
> timestamp (in seconds) will almost definitely be the same, and the size
> and inode will be the same as well.

Try it. The inode is different.

> Granted this is a contrived example, but it would be unfortunate if
> accidentally keeping the size the same were to cause the change to not
> be picked up.
> 
> Another option would be to hash the contents, but at that point, I'm not
> sure it's any better than simply unconditionally re-parsing the TODO.

Again, my intent is to make rebase faster, not slower. Hashing the
contents would make it slower. So would re-reading it.

> https://github.com/git/git/pull/343

Thank you for starting to work on this. I left a couple of comments.
Please do not be offended by their terseness, I really wanted to point out
a couple of things I think we can improve together, but I am also way past
my bedtime.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Rebase sequencer changes prevent exec commands from modifying the todo file?
  2017-04-12 22:05     ` Johannes Schindelin
@ 2017-04-12 23:26       ` Stephen Hicks
  2017-04-13 13:52         ` Johannes Schindelin
  0 siblings, 1 reply; 4+ messages in thread
From: Stephen Hicks @ 2017-04-12 23:26 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On Wed, Apr 12, 2017 at 3:05 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Stephen,
>
> On Tue, 11 Apr 2017, Stephen Hicks wrote:
>
> > Thanks for the tips.  I think I have an approach that works, by simply
> > returning sequencer_continue() immediately after a successful exec.
>
> I am not sure that that works really as expected, as you re-enter the
> sequencer_continue() and neither the original author nor I expected nested
> calls.

Alright, I can reproduce the longer version in place here.

> > I'm hesitant to only use mtime, size, and inode, since it's quite likely
> > that these are all identical even if the file has changed.
>
> Not at all. The mtime and the size will most likely be different.

In my experience, mtime is almost always the same, since the file is
written pretty much immediately before the exec - as long as the
command takes a small fraction of a second (which it usually does) the
mtime (which is in seconds, not millis or micros) will look the same.
For shell redirects, the inode stays the same, though the size will
usually be different, but this is not guaranteed.

> I am reluctant to take your wholesale approach, as I perform literally
> dozens of rebases with >100 commits, including plenty of exec calls, and I
> want the rebase to become faster instead of slower.

I'm in favor of that, but not sure of a reliable way to tell whether
the file is modified.  As mentioned above, mtime, size, and inode only
get us *most* of the way there.  This leaves it open to very
surprising issues when suddenly one edit out of a thousand
accidentally doesn't change any of these things and so doesn't get
picked up.  This would be a very difficult to debug issue as well,
since everything would look very reasonable from the outside.
Potentially it could be mitigated by actually writing a message when
the file is reloaded, making it more obvious when that doesn't occur
(though it's still not the most ergonomic situation).

> > Say, the command is simply a `sed -i 's/^exec /#### /g'`, then the
> > timestamp (in seconds) will almost definitely be the same, and the size
> > and inode will be the same as well.
>
> Try it. The inode is different.

You are correct that sed -i does change the inode.  `c=$(sed 's/^exec
/#### /g' git-rebase-todo); echo "$c" > git-rebase-todo` is a better
example, albeit a bit more contrived.  I've added a failing test case
to demonstrate the problem.

> > Granted this is a contrived example, but it would be unfortunate if
> > accidentally keeping the size the same were to cause the change to not
> > be picked up.
> >
> > Another option would be to hash the contents, but at that point, I'm not
> > sure it's any better than simply unconditionally re-parsing the TODO.
>
> Again, my intent is to make rebase faster, not slower. Hashing the
> contents would make it slower. So would re-reading it.
>
> > https://github.com/git/git/pull/343
>
> Thank you for starting to work on this. I left a couple of comments.
> Please do not be offended by their terseness, I really wanted to point out
> a couple of things I think we can improve together, but I am also way past
> my bedtime.
>
> Ciao,
> Johannes

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Rebase sequencer changes prevent exec commands from modifying the todo file?
  2017-04-12 23:26       ` Stephen Hicks
@ 2017-04-13 13:52         ` Johannes Schindelin
  0 siblings, 0 replies; 4+ messages in thread
From: Johannes Schindelin @ 2017-04-13 13:52 UTC (permalink / raw)
  To: Stephen Hicks; +Cc: git

Hi Stephen,

On Wed, 12 Apr 2017, Stephen Hicks wrote:

> On Wed, Apr 12, 2017 at 3:05 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> > On Tue, 11 Apr 2017, Stephen Hicks wrote:
> >
> > > I'm hesitant to only use mtime, size, and inode, since it's quite
> > > likely that these are all identical even if the file has changed.
> >
> > Not at all. The mtime and the size will most likely be different.
> 
> In my experience, mtime is almost always the same, since the file is
> written pretty much immediately before the exec - as long as the
> command takes a small fraction of a second (which it usually does) the
> mtime (which is in seconds, not millis or micros) will look the same.

Oh, I see, you assume that mtime is in seconds. However, on pretty much
all platforms supported by Git we use nanosecond-precision mtimes [*1*].
In practice, the mtimes are not always nanosecond-precision, as some file
systems have a coarser granularity. It is typically a pretty fine a
granularity, though, much finer than a second [*2*].

My mistake, I should have explained that I wanted to suggest adding a
field of type `struct cache_time` to the `todo_list` structure, and to use
ST_MTIME_NSEC(&st) to populate and compare the `nsec` part of that.

Ciao,
Johannes

Footnote *1*: The platforms for which we disable nanosecond mtimes are:
OSF-1, AIX, HP-UX, Interix, Minix, NONSTOP_KERNEL (?), and QNX. (If you
look at git.git's config.mak.uname, you will see that Git for Windows'
nanosecond support has not yet made it upstream.) In other words, the
major platforms (Windows, MacOSX and Linux) all compare the nanosecond
part of the mtime.

Footnote *2*: the coarsest mtime of which I know is FAT32 (2-second
granularity). If you want to use Git on such a file system, well, that's
what you get. And then some performance penalties, too. It would appear
from a quick web search that ext4 has true nanosecond granularitry. NTFS
has a 100ns granularity which is still plenty enough for our purposes.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-04-13 13:52 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAKNkOnM366uiJKkz31hS8V3NTa8qksP2pXrH4+F-zodZaNdsqg@mail.gmail.com>
2017-03-02 15:25 ` Rebase sequencer changes prevent exec commands from modifying the todo file? Johannes Schindelin
     [not found]   ` <CAKNkOnOkSgFei7jpck8Z7tH+jYn_MXvarA86GAadT8jJt4aO-g@mail.gmail.com>
2017-04-12 22:05     ` Johannes Schindelin
2017-04-12 23:26       ` Stephen Hicks
2017-04-13 13:52         ` Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).