git@vger.kernel.org mailing list mirror (one of many)
* Consist timestamps within a checkout/clone
@ 2022-10-31 19:01 Mark Hills
  2022-10-31 20:17 ` Andreas Schwab
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Mark Hills @ 2022-10-31 19:01 UTC (permalink / raw)
  To: git

Our use case: we commit some compiled objects to the repo, where compiling 
is either slow or requires software which is not always available.

Since upgrading Git 2.26.3 -> 2.32.4 (as part of Alpine Linux OS upgrade) 
we are noticing a change in build behaviour.

Now, after a "git clone" we find the Makefile intermittently attempting 
(and failing) some builds that are not intended.

Indeed, Make is acting reasonably as the source file is sometimes 
marginally newer than the destination (both checked out by Git), example 
below.

I've never had to consider the consistency of timestamps within a Git checkout 
until now.

It's entirely possible there's _never_ a guarantee of consistency here.

But then something has certainly changed in practice, as this fault has 
gone from never happening to now every couple of days.

Imagining I can't be the first person to encounter this, I searched for 
existing threads or docs, but overwhelmingly the results were questions about 
Git tracking the timestamps (as part of the commit), which this is not; 
it's consistency within one checkout.

$ git clone --depth 1 file:///path/to/repo.git

$ stat winner.jpeg
  File: winner.jpeg
  Size: 258243          Blocks: 520        IO Block: 4096   regular file
Device: fd07h/64775d    Inode: 33696       Links: 1
Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
Access: 2022-10-31 16:05:17.756858496 +0000
Modify: 2022-10-31 16:05:17.756858496 +0000
Change: 2022-10-31 16:05:17.756858496 +0000
 Birth: -

$ stat winner.svg
  File: winner.svg
  Size: 52685           Blocks: 112        IO Block: 4096   regular file
Device: fd07h/64775d    Inode: 33697       Links: 1
Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
Access: 2022-10-31 16:05:17.766859030 +0000
Modify: 2022-10-31 16:05:17.766859030 +0000
Change: 2022-10-31 16:05:17.766859030 +0000
 Birth: -

Elsewhere in the repository, it's clear the timestamps are not consistent:

$ stat Makefile
  File: Makefile
  Size: 8369            Blocks: 24         IO Block: 4096   regular file
Device: fd07h/64775d    Inode: 33655       Links: 1
Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
Access: 2022-10-31 16:05:51.628660212 +0000
Modify: 2022-10-31 16:05:17.746857963 +0000
Change: 2022-10-31 16:05:17.746857963 +0000
 Birth: -

-- 
Mark


* Re: Consist timestamps within a checkout/clone
  2022-10-31 19:01 Consist timestamps within a checkout/clone Mark Hills
@ 2022-10-31 20:17 ` Andreas Schwab
  2022-10-31 20:21 ` Ævar Arnfjörð Bjarmason
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2022-10-31 20:17 UTC (permalink / raw)
  To: Mark Hills; +Cc: git

On Oct 31 2022, Mark Hills wrote:

> It's entirely possible there's _never_ a guarantee of consistency here.

I don't think the order in which git writes the individual files is
defined in any way.  Thus depending on the precision of the time stamps
in the file system whether a file ends up newer than another one may
vary each time due to timing differences.
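
To make the effect concrete, here is a minimal sketch of the comparison a
build tool ends up making. The file names are hypothetical placeholders and
the mtimes are set by hand with "touch -t"; nothing here reproduces git's
own write order, it only shows why a one-second (or smaller) difference is
enough to flip the result.

```shell
#!/bin/sh
# Hypothetical demo: file names are placeholders, not the reporter's repository.
demo=$(mktemp -d)
cd "$demo" || exit 1

: > source.svg
: > product.jpeg

# Force a known ordering: the source ends up one second newer than the product.
touch -t 202210311605.17 product.jpeg
touch -t 202210311605.18 source.svg

# This is the same "is the prerequisite newer?" question that make asks.
if [ source.svg -nt product.jpeg ]; then
    result="stale"     # a build tool would try to regenerate product.jpeg
else
    result="fresh"
fi
echo "$result"
```

Swapping the two "touch -t" timestamps makes the same check report the
product as fresh, which is the intermittent behaviour described above.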

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


* Re: Consist timestamps within a checkout/clone
  2022-10-31 19:01 Consist timestamps within a checkout/clone Mark Hills
  2022-10-31 20:17 ` Andreas Schwab
@ 2022-10-31 20:21 ` Ævar Arnfjörð Bjarmason
  2022-10-31 20:36   ` Taylor Blau
  2022-10-31 22:29   ` Mark Hills
  2022-11-01 13:55 ` Marc Branchaud
  2022-11-01 14:34 ` Erik Cervin Edin
  3 siblings, 2 replies; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-31 20:21 UTC (permalink / raw)
  To: Mark Hills; +Cc: git, Matheus Tavares


On Mon, Oct 31 2022, Mark Hills wrote:

> Our use case: we commit some compiled objects to the repo, where compiling 
> is either slow or requires software which is not always available.
>
> Since upgrading Git 2.26.3 -> 2.32.4 (as part of Alpine Linux OS upgrade) 
> we are noticing a change in build behaviour.
>
> Now, after a "git clone" we find the Makefile intermittently attempting 
> (and failing) some builds that are not intended.
>
> Indeed, Make is acting reasonably as the source file is sometimes 
> marginally newer than the destination (both checked out by Git), example 
> below.
>
> I've never had to consider the consistency of timestamps within a Git checkout 
> until now.
>
> It's entirely possible there's _never_ a guarantee of consistency here.
>
> But then something has certainly changed in practice, as this fault has 
> gone from never happening to now every couple of days.
>
> Imagining I can't be the first person to encounter this, I searched for 
> existing threads or docs, but overwhelmingly the results were questions about 
> Git tracking the timestamps (as part of the commit), which this is not; 
> it's consistency within one checkout.
>
> $ git clone --depth 1 file:///path/to/repo.git
>
> $ stat winner.jpeg
>   File: winner.jpeg
>   Size: 258243          Blocks: 520        IO Block: 4096   regular file
> Device: fd07h/64775d    Inode: 33696       Links: 1
> Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> Access: 2022-10-31 16:05:17.756858496 +0000
> Modify: 2022-10-31 16:05:17.756858496 +0000
> Change: 2022-10-31 16:05:17.756858496 +0000
>  Birth: -
>
> $ stat winner.svg
>   File: winner.svg
>   Size: 52685           Blocks: 112        IO Block: 4096   regular file
> Device: fd07h/64775d    Inode: 33697       Links: 1
> Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> Access: 2022-10-31 16:05:17.766859030 +0000
> Modify: 2022-10-31 16:05:17.766859030 +0000
> Change: 2022-10-31 16:05:17.766859030 +0000
>  Birth: -
>
> Elsewhere in the repository, it's clear the timestamps are not consistent:
>
> $ stat Makefile
>   File: Makefile
>   Size: 8369            Blocks: 24         IO Block: 4096   regular file
> Device: fd07h/64775d    Inode: 33655       Links: 1
> Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> Access: 2022-10-31 16:05:51.628660212 +0000
> Modify: 2022-10-31 16:05:17.746857963 +0000
> Change: 2022-10-31 16:05:17.746857963 +0000
>  Birth: -

I think you're almost certainly running into the parallel checkout,
which is new in that revision range. Try tweaking checkout.workers and
checkout.thresholdForParallelism (see "man git-config").
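
As a minimal sketch of that experiment, assuming a throwaway repository
(the paths, the file name and the demo identity below are made up for
illustration), checkout.workers can be pinned to 1 for a single clone with
the -c option of "git clone", which records the setting before any files
are checked out:

```shell
#!/bin/sh
# Hypothetical sandbox: builds a tiny origin repository, then clones it
# with sequential checkout forced on.
work=$(mktemp -d)
cd "$work" || exit 1

git init -q origin
cd origin || exit 1
echo data > a.txt
git add a.txt
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m init
cd "$work" || exit 1

# -c on "git clone" sets the key in the new repository before checkout runs.
git clone -q -c checkout.workers=1 "file://$work/origin" copy

# Confirm what the clone recorded.
workers=$(git -C copy config --get checkout.workers)
echo "checkout.workers=$workers"
```

If the intermittent rebuilds persist with this setting, parallel checkout
is probably not the cause.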

I can't say without looking at the code/Makefile (and even then, I don't
have time to dig here :), but if I had to bet I'd say that your
dependencies have probably always been broken with these checked-in
files, but they happened to work out if they were checked out in sorted
order.

And now with the parallel checkout they're not guaranteed to do that, as
some workers will "race ahead" and finish in an unpredictable order.

But that's all just a guess, perhaps it has nothing to do with parallel
checkout, such dependency issues are sensitive to all sorts of other
things, e.g. maybe git got slightly faster (or slower), so now files
that were always on different seconds (or the same) aren't in the state
they were in before...


* Re: Consist timestamps within a checkout/clone
  2022-10-31 20:21 ` Ævar Arnfjörð Bjarmason
@ 2022-10-31 20:36   ` Taylor Blau
  2022-10-31 22:31     ` Mark Hills
  2022-10-31 22:29   ` Mark Hills
  1 sibling, 1 reply; 17+ messages in thread
From: Taylor Blau @ 2022-10-31 20:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Mark Hills, git, Matheus Tavares

On Mon, Oct 31, 2022 at 09:21:20PM +0100, Ævar Arnfjörð Bjarmason wrote:
> I think you're almost certainly running into the parallel checkout,
> which is new in that revision range. Try tweaking checkout.workers and
> checkout.thresholdForParallelism (see "man git-config").
>
> I can't say without looking at the code/Makefile (and even then, I don't
> have time to dig here:), but if I had to bet I'd say that your
> dependencies have probably always been broken with these checked-in
> files, but they happened to work out if they were checked out in sorted
> order.
>
> And now with the parallel checkout they're not guaranteed to do that, as
> some workers will "race ahead" and finish in an unpredictable order.

Doesn't checkout.thresholdForParallelism only matter when
checkout.workers != 1?

So what you wrote seems like a reasonable explanation, but only if the
original reporter set checkout.workers to imply the non-sequential
behavior in the first place.

That said...

  - I also don't know off-hand of a place where we've defined the order
    in which Git will check out files in the working copy. So depending on
    that behavior isn't a safe thing to do.

  - Committing build artifacts into your repository is generally
    discouraged.

So while I'd guess that setting `checkout.workers` back to "1" (if it
wasn't already) will probably restore the existing behavior, counting
on that behavior in the first place is wrong.

Thanks,
Taylor


* Re: Consist timestamps within a checkout/clone
  2022-10-31 20:21 ` Ævar Arnfjörð Bjarmason
  2022-10-31 20:36   ` Taylor Blau
@ 2022-10-31 22:29   ` Mark Hills
  2022-11-01 17:46     ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 17+ messages in thread
From: Mark Hills @ 2022-10-31 22:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Matheus Tavares


On Mon, 31 Oct 2022, Ævar Arnfjörð Bjarmason wrote:

> 
> On Mon, Oct 31 2022, Mark Hills wrote:
> 
> > Our use case: we commit some compiled objects to the repo, where compiling 
> > is either slow or requires software which is not always available.
> >
> > Since upgrading Git 2.26.3 -> 2.32.4 (as part of Alpine Linux OS upgrade) 
> > we are noticing a change in build behaviour.
> >
> > Now, after a "git clone" we find the Makefile intermittently attempting 
> > (and failing) some builds that are not intended.
> >
> > Indeed, Make is acting reasonably as the source file is sometimes 
> > marginally newer than the destination (both checked out by Git), example 
> > below.
> >
> > I've never had to consider the consistency of timestamps within a Git checkout 
> > until now.
> >
> > It's entirely possible there's _never_ a guarantee of consistency here.
> >
> > But then something has certainly changed in practice, as this fault has 
> > gone from never happening to now every couple of days.
> >
> > Imagining I can't be the first person to encounter this, I searched for 
> > existing threads or docs, but overwhelmingly the results were questions about 
> > Git tracking the timestamps (as part of the commit), which this is not; 
> > it's consistency within one checkout.
> >
> > $ git clone --depth 1 file:///path/to/repo.git
> >
> > $ stat winner.jpeg
> >   File: winner.jpeg
> >   Size: 258243          Blocks: 520        IO Block: 4096   regular file
> > Device: fd07h/64775d    Inode: 33696       Links: 1
> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> > Access: 2022-10-31 16:05:17.756858496 +0000
> > Modify: 2022-10-31 16:05:17.756858496 +0000
> > Change: 2022-10-31 16:05:17.756858496 +0000
> >  Birth: -
> >
> > $ stat winner.svg
> >   File: winner.svg
> >   Size: 52685           Blocks: 112        IO Block: 4096   regular file
> > Device: fd07h/64775d    Inode: 33697       Links: 1
> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> > Access: 2022-10-31 16:05:17.766859030 +0000
> > Modify: 2022-10-31 16:05:17.766859030 +0000
> > Change: 2022-10-31 16:05:17.766859030 +0000
> >  Birth: -
> >
> > Elsewhere in the repository, it's clear the timestamps are not consistent:
> >
> > $ stat Makefile
> >   File: Makefile
> >   Size: 8369            Blocks: 24         IO Block: 4096   regular file
> > Device: fd07h/64775d    Inode: 33655       Links: 1
> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> > Access: 2022-10-31 16:05:51.628660212 +0000
> > Modify: 2022-10-31 16:05:17.746857963 +0000
> > Change: 2022-10-31 16:05:17.746857963 +0000
> >  Birth: -
> 
> I think you're almost certainly running into the parallel checkout,
> which is new in that revision range. Try tweaking checkout.workers and
> checkout.thresholdForParallelism (see "man git-config").

Thanks, it will be interesting to try this and I'll report back.
 
> I can't say without looking at the code/Makefile (and even then, I don't
> have time to dig here:), but if I had to bet I'd say that your
> dependencies have probably always been broken with these checked-in
> files, but they happened to work out if they were checked out in sorted
> order.
>
> And now with the parallel checkout they're not guaranteed to do that, as
> some workers will "race ahead" and finish in an unpredictable order.

These are very simple Makefile rules, I don't think these dependencies are 
broken; but your theory is in good alignment with the observed behaviour.

For example, the rule from the recent case above is:

  %.jpeg:         %.png
                  convert $< $(IMFLAGS) $@

  %.png:          %.svg
                  inkscape --export-type=png --export-filename=$@ $<

As you suggest, perhaps the Git implementation previously checked out 
files in some kind of time order, which happened to provide a useful behaviour.

Specifically with build artefacts: these are likely to have been added to 
the repo after the source file. This could have been providing some 
practical and useful tendency of ordering.

> But that's all just a guess, perhaps it has nothing to do with parallel
> checkout, such dependency issues are sensitive to all sorts of other
> things, e.g. maybe git got slightly faster (or slower), so now files
> that were always on different seconds (or the same) aren't in the state
> they were in before...

Hopefully I'll get to some experiments to narrow this down.

Thanks

-- 
Mark


* Re: Consist timestamps within a checkout/clone
  2022-10-31 20:36   ` Taylor Blau
@ 2022-10-31 22:31     ` Mark Hills
  2022-10-31 22:42       ` rsbecker
  2022-11-01 18:34       ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 17+ messages in thread
From: Mark Hills @ 2022-10-31 22:31 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Ævar Arnfjörð Bjarmason, git, Matheus Tavares


On Mon, 31 Oct 2022, Taylor Blau wrote:

> On Mon, Oct 31, 2022 at 09:21:20PM +0100, Ævar Arnfjörð Bjarmason wrote:
> > I think you're almost certainly running into the parallel checkout,
> > which is new in that revision range. Try tweaking checkout.workers and
> > checkout.thresholdForParallelism (see "man git-config").
> >
> > I can't say without looking at the code/Makefile (and even then, I don't
> > have time to dig here:), but if I had to bet I'd say that your
> > dependencies have probably always been broken with these checked-in
> > files, but they happened to work out if they were checked out in sorted
> > order.
> >
> > And now with the parallel checkout they're not guaranteed to do that, as
> > some workers will "race ahead" and finish in an unpredictable order.
> 
> Doesn't checkout.thresholdForParallelism only matter when
> checkout.workers != 1?
> 
> So what you wrote seems like a reasonable explanation, but only if the
> original reporter set checkout.workers to imply the non-sequential
> behavior in the first place.
> 
> That said...
> 
>   - I also don't know off-hand of a place where we've defined the order
>     where Git will checkout files in the working copy. So depending on
>     that behavior isn't a safe thing to do.
> 
>   - Committing build artifacts into your repository is generally
>     discouraged.

If it's undefined and never implemented this is reasonable.

But "generally" is a caveat, so while I agree with the statement it also 
implies there's valid cases outside of that. Ones which used to work, too.

Here are some useful cases I have seen for the combination of build rule + 
checked in file:

- part of a build requires licensed software that's not always available

- part of the build requires large memory that other builders generally do 
  not have available

- part of the build process uses a different platform or some other system 
  requirement

- to fetch data eg. from a URL, with a record of the URL/automation but 
  also a copy of the file as a record and for offline use

So it's useful, to retain repeatable automation but not always build from 
square one.

Generally discouraged to check in build results yes, but I've found it 
very practical.
 
> So while I'd guess that setting `checkout.workers` back to "1" (if it 
> wasn't already) will probably restore the existing behavior, counting on 
> that behavior in the first place is wrong.

I think perhaps the tail is wagging the dog here, though.

It's 'wrong' because it doesn't work; but I haven't seen anything to make 
me think this is fundamentally or theoretically flawed.

If we had a transactional file system we'd reasonably expect a checkout to 
be an atomic operation -- same timestamp on the files created in that 
step. A discrepancy in timestamps would be considered incorrect; it would 
imply an 'order' to the checkout which, as you say, is order-less.

So what could be the bad outcomes if Git created files stamped with the 
point in time of the "git checkout"?

> Thanks,
> Taylor
> 
> 

-- 
Mark


* RE: Consist timestamps within a checkout/clone
  2022-10-31 22:31     ` Mark Hills
@ 2022-10-31 22:42       ` rsbecker
  2022-11-01 18:34       ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 17+ messages in thread
From: rsbecker @ 2022-10-31 22:42 UTC (permalink / raw)
  To: 'Mark Hills', 'Taylor Blau'
  Cc: 'Ævar Arnfjörð Bjarmason', git,
	'Matheus Tavares'

On October 31, 2022 6:31 PM, Mark Hills wrote:
>On Mon, 31 Oct 2022, Taylor Blau wrote:
>> On Mon, Oct 31, 2022 at 09:21:20PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> > I think you're almost certainly running into the parallel checkout,
>> > which is new in that revision range. Try tweaking checkout.workers
>> > and checkout.thresholdForParallelism (see "man git-config").
>> >
>> > I can't say without looking at the code/Makefile (and even then, I
>> > don't have time to dig here:), but if I had to bet I'd say that your
>> > dependencies have probably always been broken with these checked-in
>> > files, but they happened to work out if they were checked out in
>> > sorted order.
>> >
>> > And now with the parallel checkout they're not guaranteed to do
>> > that, as some workers will "race ahead" and finish in an unpredictable order.
>>
>> Doesn't checkout.thresholdForParallelism only matter when
>> checkout.workers != 1?
>>
>> So what you wrote seems like a reasonable explanation, but only if the
>> original reporter set checkout.workers to imply the non-sequential
>> behavior in the first place.
>>
>> That said...
>>
>>   - I also don't know off-hand of a place where we've defined the order
>>     where Git will checkout files in the working copy. So depending on
>>     that behavior isn't a safe thing to do.
>>
>>   - Committing build artifacts into your repository is generally
>>     discouraged.
>
>If it's undefined and never implemented this is reasonable.
>
>But "generally" is a caveat, so while I agree with the statement it also implies
>there's valid cases outside of that. Ones which used to work, too.
>
>Here are some useful cases I have seen for the combination of build rule +
>checked in file:
>
>- part of a build requires licensed software that's not always available
>
>- part of the build requires large memory that other builders generally do
>  not have available
>
>- part of the build process uses a different platform or some other system
>  requirement
>
>- to fetch data eg. from a URL, with a record of the URL/automation but
>  also a copy of the file as a record and for offline use
>
>So it's useful, to retain repeatable automation but not always build from square
>one.
>
>Generally discouraged to check in build results yes, but I've found it very practical.
>
>> So while I'd guess that setting `checkout.workers` back to "1" (if it
>> wasn't already) will probably restore the existing behavior, counting
>> on that behavior in the first place is wrong.
>
>I think perhaps the tail is wagging the dog here, though.
>
>It's 'wrong' because it doesn't work; but I haven't seen anything to make me think
>this is fundamentally or theoretically flawed.
>
>If we had a transactional file system we'd reasonably expect a checkout to be an
>atomic operation -- same timestamp on the files created in that step. A
>discrepancy in timestamps would be considered incorrect; it would imply an 'order'
>to the checkout which, as you say, is order-less.
>
>So what could be the bad outcomes if Git created files stamped with the point in
>time of the "git checkout"?

Timestamps are written based on when git modifies the file in the working directory. This actually ensures that automation does work. If intermediate contents are checked into repositories (I have people who do this for very justifiable regulatory reasons), the build has to make sure that the timestamps are appropriately separated (by at least 1 second on UNIX-ish systems). On some other boxes that do not even have timestamps for files (you know who you are) this is moot.

However, there is a use case for maintaining timestamps - specifically for debuggers that check timestamps of source files. It is a big pain to make this work in git - but I script around this by setting the timestamps of files to the commit time when doing release builds, and allowing users to set the timestamp to the same for debugging. It helps but should not change the semantics of dev builds.
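
For illustration only (this is not the script referred to above), stamping
tracked files with the commit time could be sketched as follows. The
repository, file name and identity are hypothetical, and the sketch assumes
a touch that accepts "@<epoch>" for -d (GNU coreutils does) and a stat that
supports "-c %Y".

```shell
#!/bin/sh
# Hypothetical sandbox; assumes a touch whose -d accepts "@<epoch>"
# and a stat that understands -c %Y.
work=$(mktemp -d)
cd "$work" || exit 1
git init -q repo
cd repo || exit 1

echo src > winner.svg
git add winner.svg
GIT_COMMITTER_DATE='2022-10-31T16:05:17+00:00' \
    git -c user.name=demo -c user.email=demo@example.invalid commit -q -m init

# Committer time of HEAD, as seconds since the epoch.
commit_ts=$(git log -1 --format=%ct)

# Stamp every tracked file with that single time (NUL-separated for safety).
git ls-files -z | xargs -0 touch -d "@$commit_ts"

# Read the resulting mtime back for comparison.
file_ts=$(stat -c %Y winner.svg)
echo "commit=$commit_ts file=$file_ts"
```

Because every tracked file gets the same mtime, no file is newer than
another, which is what a debugger or a release build comparing timestamps
needs.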

-Randall



* Re: Consist timestamps within a checkout/clone
  2022-10-31 19:01 Consist timestamps within a checkout/clone Mark Hills
  2022-10-31 20:17 ` Andreas Schwab
  2022-10-31 20:21 ` Ævar Arnfjörð Bjarmason
@ 2022-11-01 13:55 ` Marc Branchaud
  2022-11-02 14:45   ` Ævar Arnfjörð Bjarmason
  2022-11-01 14:34 ` Erik Cervin Edin
  3 siblings, 1 reply; 17+ messages in thread
From: Marc Branchaud @ 2022-11-01 13:55 UTC (permalink / raw)
  To: Mark Hills, git


On 2022-10-31 15:01, Mark Hills wrote:
> Our use case: we commit some compiled objects to the repo, where compiling
> is either slow or requires software which is not always available.
> 
> Since upgrading Git 2.26.3 -> 2.32.4 (as part of Alpine Linux OS upgrade)
> we are noticing a change in build behaviour.
> 
> Now, after a "git clone" we find the Makefile intermittently attempting
> (and failing) some builds that are not intended.
> 
> Indeed, Make is acting reasonably as the source file is sometimes
> marginally newer than the destination (both checked out by Git), example
> below.

A fix for this was proposed in 2018 and dismissed [1].

Back then, the problem was that as Git wrote files into a directory 
sometimes the clock would tick over at a bad time, and we'd end up with 
some files being "newer" than others.  This would sour Make runs as you 
describe.

Nominally this is caused by putting generated files in the repo, but 
many times that is unavoidable (e.g. you're forking an upstream that 
puts automake-generated stuff in the repo).

IMHO, dismissing the problem back then was a mistake.  At the time I 
advocated teaching Git to give all the files it touches (creates or 
modifies) in a directory the same mtime (e.g. the time at the start of 
the checkout operation).

Instead the decision was to do nothing in Git, and instead let people 
create their own post-checkout hooks to touch the files.  I (and others) 
argued this was inadequate, to no avail.
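
For readers who want to try the hook route, here is a minimal sketch under
stated assumptions: a throwaway repository, placeholder file names, and a
hook that gives every tracked file the mtime of one reference file. It is
not the hook from the 2018 discussion, and a hook has to be installed in
each clone (or via a template directory), so it does not cover the initial
"git clone" by itself.

```shell
#!/bin/sh
# Hypothetical sandbox repository; file names are placeholders.
work=$(mktemp -d)
cd "$work" || exit 1
git init -q repo
cd repo || exit 1

echo src > winner.svg
echo out > winner.jpeg
git add .
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m init

# Install the hook: stamp every tracked file with one shared mtime.
mkdir -p .git/hooks
cat > .git/hooks/post-checkout <<'EOF'
#!/bin/sh
ref=$(mktemp) || exit 1
git ls-files -z | xargs -0 touch -r "$ref"
rm -f "$ref"
EOF
chmod +x .git/hooks/post-checkout

# Give the files different ages, then trigger the hook with a branch switch.
touch -t 202210311605.17 winner.jpeg
touch -t 202210311605.18 winner.svg
git checkout -q -b demo-branch

# After the hook, neither file should be newer than the other.
if [ winner.svg -nt winner.jpeg ] || [ winner.jpeg -nt winner.svg ]; then
    same="no"
else
    same="yes"
fi
echo "same mtime: $same"
```

Make only rebuilds a target when a prerequisite is strictly newer, so equal
mtimes are enough to keep the checked-in products from being regenerated.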

		M.

[1] https://public-inbox.org/git/20180413170129.15310-1-mgorny@gentoo.org/#r


* Re: Consist timestamps within a checkout/clone
  2022-10-31 19:01 Consist timestamps within a checkout/clone Mark Hills
                   ` (2 preceding siblings ...)
  2022-11-01 13:55 ` Marc Branchaud
@ 2022-11-01 14:34 ` Erik Cervin Edin
  2022-11-01 15:53   ` Ævar Arnfjörð Bjarmason
  3 siblings, 1 reply; 17+ messages in thread
From: Erik Cervin Edin @ 2022-11-01 14:34 UTC (permalink / raw)
  To: Mark Hills; +Cc: git

I have little to add on the underlying issue or non-issue but some
ideas on how to solve your problem

On Mon, Oct 31, 2022 at 8:39 PM Mark Hills <mark@xwax.org> wrote:
>
> ...
> Indeed, Make is acting reasonably as the source file is sometimes
> marginally newer than the destination (both checked out by Git), example
> below.
>
> I've never had to consider the consistency of timestamps within a Git checkout
> until now.
>
> It's entirely possible there's _never_ a guarantee of consistency here.

If your makefile depends on checkout, why not
  git ls-files | xargs touch
or if this is done in an environment where there's not a fresh clone each
time, maybe
  git diff HEAD --name-only --diff-filter=AM | xargs touch
or something along those lines


* Re: Consist timestamps within a checkout/clone
  2022-11-01 14:34 ` Erik Cervin Edin
@ 2022-11-01 15:53   ` Ævar Arnfjörð Bjarmason
  2022-11-03 13:02     ` Erik Cervin Edin
  0 siblings, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-01 15:53 UTC (permalink / raw)
  To: Erik Cervin Edin; +Cc: Mark Hills, git


On Tue, Nov 01 2022, Erik Cervin Edin wrote:

> I have little to add on the underlying issue or non-issue but some
> ideas on how to solve your problem
>
> On Mon, Oct 31, 2022 at 8:39 PM Mark Hills <mark@xwax.org> wrote:
>>
>> ...
>> Indeed, Make is acting reasonably as the source file is sometimes
>> marginally newer than the destination (both checked out by Git), example
>> below.
>>
>> I've never had to consider the consistency of timestamps within a Git checkout
>> until now.
>>
>> It's entirely possible there's _never_ a guarantee of consistency here.
>
> If your makefile depends on checkout, why not
>   git ls-files | xargs touch
> or if this is done in an environment where there's not a fresh clone each
> time, maybe
>   git diff HEAD --name-only --diff-filter=AM | xargs touch
> or something along those lines

I believe you might be trying to re-invent "make -B" :)


* Re: Consist timestamps within a checkout/clone
  2022-10-31 22:29   ` Mark Hills
@ 2022-11-01 17:46     ` Ævar Arnfjörð Bjarmason
  2022-11-02 14:16       ` Matheus Tavares
  0 siblings, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-01 17:46 UTC (permalink / raw)
  To: Mark Hills; +Cc: git, Matheus Tavares


On Mon, Oct 31 2022, Mark Hills wrote:

> On Mon, 31 Oct 2022, Ævar Arnfjörð Bjarmason wrote:
>
>> 
>> On Mon, Oct 31 2022, Mark Hills wrote:
>> 
>> > Our use case: we commit some compiled objects to the repo, where compiling 
>> > is either slow or requires software which is not always available.
>> >
>> > Since upgrading Git 2.26.3 -> 2.32.4 (as part of Alpine Linux OS upgrade) 
>> > we are noticing a change in build behaviour.
>> >
>> > Now, after a "git clone" we find the Makefile intermittently attempting 
>> > (and failing) some builds that are not intended.
>> >
>> > Indeed, Make is acting reasonably as the source file is sometimes 
>> > marginally newer than the destination (both checked out by Git), example 
>> > below.
>> >
>> > I've never had to consider the consistency of timestamps within a Git checkout 
>> > until now.
>> >
>> > It's entirely possible there's _never_ a guarantee of consistency here.
>> >
>> > But then something has certainly changed in practice, as this fault has 
>> > gone from never happening to now every couple of days.
>> >
>> > Imagining I can't be the first person to encounter this, I searched for 
>> > existing threads or docs, but overwhelmingly the results were questions about 
>> > Git tracking the timestamps (as part of the commit), which this is not; 
>> > it's consistency within one checkout.
>> >
>> > $ git clone --depth 1 file:///path/to/repo.git
>> >
>> > $ stat winner.jpeg
>> >   File: winner.jpeg
>> >   Size: 258243          Blocks: 520        IO Block: 4096   regular file
>> > Device: fd07h/64775d    Inode: 33696       Links: 1
>> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
>> > Access: 2022-10-31 16:05:17.756858496 +0000
>> > Modify: 2022-10-31 16:05:17.756858496 +0000
>> > Change: 2022-10-31 16:05:17.756858496 +0000
>> >  Birth: -
>> >
>> > $ stat winner.svg
>> >   File: winner.svg
>> >   Size: 52685           Blocks: 112        IO Block: 4096   regular file
>> > Device: fd07h/64775d    Inode: 33697       Links: 1
>> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
>> > Access: 2022-10-31 16:05:17.766859030 +0000
>> > Modify: 2022-10-31 16:05:17.766859030 +0000
>> > Change: 2022-10-31 16:05:17.766859030 +0000
>> >  Birth: -
>> >
>> > Elsewhere in the repository, it's clear the timestamps are not consistent:
>> >
>> > $ stat Makefile
>> >   File: Makefile
>> >   Size: 8369            Blocks: 24         IO Block: 4096   regular file
>> > Device: fd07h/64775d    Inode: 33655       Links: 1
>> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
>> > Access: 2022-10-31 16:05:51.628660212 +0000
>> > Modify: 2022-10-31 16:05:17.746857963 +0000
>> > Change: 2022-10-31 16:05:17.746857963 +0000
>> >  Birth: -
>> 
>> I think you're almost certainly running into the parallel checkout,
>> which is new in that revision range. Try tweaking checkout.workers and
>> checkout.thresholdForParallelism (see "man git-config").
>
> Thanks, it will be interesting to try this and I'll report back.

FWIW I was under the impression that we'd made it the default, but we
haven't, so unless you opted in it's probably not that.

>> I can't say without looking at the code/Makefile (and even then, I don't
>> have time to dig here:), but if I had to bet I'd say that your
>> dependencies have probably always been broken with these checked-in
>> files, but they happened to work out if they were checked out in sorted
>> order.
>>
>> And now with the parallel checkout they're not guaranteed to do that, as
>> some workers will "race ahead" and finish in an unpredictable order.
>
> These are very simple Makefile rules, I don't think these dependencies are 
> broken; but your theory is in good alignment with the observed behaviour.
>
> For example, the rule from the recent case above is:
>
>   %.jpeg:         %.png
>                   convert $< $(IMFLAGS) $@
>
>   %.png:          %.svg
>                   inkscape --export-type=png --export-filename=$@ $<

From a glance those don't seem broken to me, but I don't know how it
interacts with your built assets.

So e.g. if you are checking in your *.jpeg files those will be more
recent than either the *.png or source *.svg, so they won't be built.

This is fast getting out of scope of Git-specific advice, but you should
run "make --debug" (there's also sub-debug flags) to see if make's idea
of the dependency graph matches yours.
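
A quick way to ask make the same question without running any recipe is
"make -q", which only reports through its exit status whether a target is
up to date. The following is a sketch, assuming GNU make; the recipes use
"cp" as a stand-in for convert/inkscape and the timestamps are set by hand,
so only the rule shape comes from the thread.

```shell
#!/bin/sh
# Hypothetical sandbox: "cp" stands in for convert/inkscape; assumes GNU make.
demo=$(mktemp -d)
cd "$demo" || exit 1

# Same rule shape as quoted above, with placeholder recipes (tab-indented).
printf '%%.jpeg: %%.png\n\tcp $< $@\n\n%%.png: %%.svg\n\tcp $< $@\n' > Makefile

: > winner.svg
: > winner.jpeg

# Case 1: the checked-in product is newer than the source.
touch -t 202210311605.17 winner.svg
touch -t 202210311605.18 winner.jpeg
make -q winner.jpeg
fresh_status=$?

# Case 2: the source ends up newer than the product, as in the report.
touch -t 202210311605.19 winner.svg
make -q winner.jpeg
stale_status=$?

echo "fresh=$fresh_status stale=$stale_status"
```

An exit status of 0 means make sees nothing to do and 1 means it would
rebuild. The missing winner.png is treated as an intermediate file, so only
the svg and jpeg timestamps decide the outcome.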


* Re: Consist timestamps within a checkout/clone
  2022-10-31 22:31     ` Mark Hills
  2022-10-31 22:42       ` rsbecker
@ 2022-11-01 18:34       ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-01 18:34 UTC (permalink / raw)
  To: Mark Hills; +Cc: Taylor Blau, git, Matheus Tavares


On Mon, Oct 31 2022, Mark Hills wrote:

> On Mon, 31 Oct 2022, Taylor Blau wrote:
>
>> On Mon, Oct 31, 2022 at 09:21:20PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> > I think you're almost certainly running into the parallel checkout,
>> > which is new in that revision range. Try tweaking checkout.workers and
>> > checkout.thresholdForParallelism (see "man git-config").
>> >
>> > I can't say without looking at the code/Makefile (and even then, I don't
>> > have time to dig here:), but if I had to bet I'd say that your
>> > dependencies have probably always been broken with these checked-in
>> > files, but they happened to work out if they were checked out in sorted
>> > order.
>> >
>> > And now with the parallel checkout they're not guaranteed to do that, as
>> > some workers will "race ahead" and finish in an unpredictable order.
>> 
>> Doesn't checkout.thresholdForParallelism only matter when
>> checkout.workers != 1?
>> 
>> So what you wrote seems like a reasonable explanation, but only if the
>> original reporter set checkout.workers to imply the non-sequential
>> behavior in the first place.
>> 
>> That said...
>> 
>>   - I also don't know off-hand of a place where we've defined the order
>>     where Git will checkout files in the working copy. So depending on
>>     that behavior isn't a safe thing to do.
>> 
>>   - Committing build artifacts into your repository is generally
>>     discouraged.
>
> If it's undefined and never implemented this is reasonable.
>
> But "generally" is a caveat, so while I agree with the statement it also 
> implies there's valid cases outside of that. Ones which used to work, too.
>
> Here are some useful cases I have seen for the combination of build rule + 
> checked in file:
>
> - part of a build requires licensed software that's not always available
>
> - part of the build requires large memory that other builders generally do 
>   not have available
>
> - part of the build process uses a different platform or some other system 
>   requirement
>
> - to fetch data eg. from a URL, with a record of the URL/automation but 
>   also a copy of the file as a record and for offline use
>
> So it's useful, to retain repeatable automation but not always build from 
> square one.
>
> Generally discouraged to check in build results yes, but I've found it 
> very practical.
>  
>> So while I'd guess that setting `checkout.workers` back to "1" (if it 
>> wasn't already) will probably restore the existing behavior, counting on 
>> that behavior in the first place is wrong.
>
> I think perhaps the tail is wagging the dog here, though.
>
> It's 'wrong' because it doesn't work; but I haven't seen anything to make 
> me think this is fundamentally or theoretically flawed.
>
> If we had a transactional file system we'd reasonably expect a checkout to 
> be an atomic operation -- same timestamp on the files created in that 
> step. A discrepancy in timestamps would be considered incorrect; it would 
> imply an 'order' to the checkout which, as you say, is order-less.
>
> So what could be the bad outcomes if Git created files stamped with the 
> point in time of the "git checkout"?

I agree that it's practical in some scenarios, including checking in
built assets.

But those that are doing that need to be aware that combining that sort
of thing with source control tends to upend your build system's idea of
the world.

E.g. until recently in git.git we had a po/git.pot in-tree, which is a
"compiled file" (although a plain-text one) that was checked in, and
dealing with that in make's dependency graph was a (minor) pain
sometimes.

* Re: Consist timestamps within a checkout/clone
  2022-11-01 17:46     ` Ævar Arnfjörð Bjarmason
@ 2022-11-02 14:16       ` Matheus Tavares
  2022-11-02 14:28         ` Matheus Tavares
  0 siblings, 1 reply; 17+ messages in thread
From: Matheus Tavares @ 2022-11-02 14:16 UTC (permalink / raw)
  To: avarab; +Cc: git, mark, matheus.bernardino

> On Mon, Oct 31 2022, Mark Hills wrote:
>
> > On Mon, 31 Oct 2022, Ævar Arnfjörð Bjarmason wrote:
> >
> >>
> >> On Mon, Oct 31 2022, Mark Hills wrote:
> >>
> >> > Our use case: we commit some compiled objects to the repo, where compiling
> >> > is either slow or requires software which is not always available.
> >> >
> >> > Since upgrading Git 2.26.3 -> 2.32.4 (as part of Alpine Linux OS upgrade)
> >> > we are noticing a change in build behaviour.
> >> >
> >> > Now, after a "git clone" we find the Makefile intermittently attempting
> >> > (and failing) some builds that are not intended.
> >> >
> >> > Indeed, Make is acting reasonably as the source file is sometimes
> >> > marginally newer than the destination (both checked out by Git), example
> >> > below.
> >> >
> >> > I've never had to consider consistency timestamps within a Git checkout
> >> > until now.
> >> >
> >> > It's entirely possible there's _never_ a guarantee of consistency here.
> >> >
> >> > But then something has certainly changed in practice, as this fault has
> >> > gone from never happening to now every couple of days.
> >> >
> > >> > Imagining I can't be the first person to encounter this, I searched for
> > >> > existing threads or docs, but overwhelmingly the results were question of
> >> > Git tracking the timestamps (as part of the commit) which this is not;
> >> > it's consistency within one checkout.
> >> >
> >> > $ git clone --depth 1 file:///path/to/repo.git
> >> >
> >> > $ stat winner.jpeg
> >> >   File: winner.jpeg
> >> >   Size: 258243          Blocks: 520        IO Block: 4096   regular file
> >> > Device: fd07h/64775d    Inode: 33696       Links: 1
> >> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> >> > Access: 2022-10-31 16:05:17.756858496 +0000
> >> > Modify: 2022-10-31 16:05:17.756858496 +0000
> >> > Change: 2022-10-31 16:05:17.756858496 +0000
> >> >  Birth: -
> >> >
> >> > $ stat winner.svg
> >> >   File: winner.svg
> >> >   Size: 52685           Blocks: 112        IO Block: 4096   regular file
> >> > Device: fd07h/64775d    Inode: 33697       Links: 1
> >> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> >> > Access: 2022-10-31 16:05:17.766859030 +0000
> >> > Modify: 2022-10-31 16:05:17.766859030 +0000
> >> > Change: 2022-10-31 16:05:17.766859030 +0000
> >> >  Birth: -
> >> >
> >> > Elsewhere in the repository, it's clear the timestamps are not consistent:
> >> >
> >> > $ stat Makefile
> >> >   File: Makefile
> >> >   Size: 8369            Blocks: 24         IO Block: 4096   regular file
> >> > Device: fd07h/64775d    Inode: 33655       Links: 1
> >> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> >> > Access: 2022-10-31 16:05:51.628660212 +0000
> >> > Modify: 2022-10-31 16:05:17.746857963 +0000
> >> > Change: 2022-10-31 16:05:17.746857963 +0000
> >> >  Birth: -
> >>
> >> I think you're almost certainly running into the parallel checkout,
> >> which is new in that revision range. Try tweaking checkout.workers and
> >> checkout.thresholdForParallelism (see "man git-config").

This does look like something you would see with parallel checkout, yes.
But...

> > Thanks, it will be interesting to try this and I'll report back.
>
> FWIW I was under the impression that we hadn't made it the default, so
> unless you opted-in it's probably not that.

... it indeed should be disabled by default. It seems Mark didn't
manually enable parallel checkout, as the original message only mentions
the git upgrade as a changing factor. And Alpine's git installation
script for 2.32.4 [1] doesn't seem to change our defaults either.

Perhaps, it just happens that 2.32.4 changed the checkout processing
time slightly so that each entry is finished a bit slower (or the system
was overloaded at that moment?). Anyways, the creation order (based on
the mtimes) looks correct to me from a sequential-checkout point of
view: first Makefile, then winner.jpeg, and finally winner.svg. That's
the order in which these files would appear in the index, which is the
order followed by sequential checkout.
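
For reference, the index order can be listed with "git ls-files". The
sketch below uses a scratch repository with empty files named after the
ones in the report (the repository itself is an assumption); the index
is sorted by path bytes, so the order in which the files were added
does not matter.

```shell
# Sketch: show the order in which the index holds the three paths.
# The scratch repository and empty files are for illustration only.
repo=$(mktemp -d)
git init -q "$repo"
: > "$repo/winner.svg"
: > "$repo/winner.jpeg"
: > "$repo/Makefile"

# Added in "reverse" order on purpose; the index sorts them anyway.
git -C "$repo" add winner.svg winner.jpeg Makefile

# ls-files prints one path per line in index order; xargs joins them.
order=$(git -C "$repo" ls-files | xargs)
echo "$order"   # Makefile winner.jpeg winner.svg
```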

[1]: https://git.alpinelinux.org/aports/tree/main/git/APKBUILD?h=3.14-stable&id=0f3285f2cfcb8362460002c27e219fadbf18c885

* Re: Consist timestamps within a checkout/clone
  2022-11-02 14:16       ` Matheus Tavares
@ 2022-11-02 14:28         ` Matheus Tavares
  0 siblings, 0 replies; 17+ messages in thread
From: Matheus Tavares @ 2022-11-02 14:28 UTC (permalink / raw)
  To: avarab; +Cc: git, mark

[Oops, I accidentally sent this from my business account. I'm
quote-replying it now from my correct personal account just in case
the original falls under spam folders for "spoofing".]

On Wed, Nov 2, 2022 at 11:16 AM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> > On Mon, Oct 31 2022, Mark Hills wrote:
> >
> > > On Mon, 31 Oct 2022, Ævar Arnfjörð Bjarmason wrote:
> > >
> > >>
> > >> On Mon, Oct 31 2022, Mark Hills wrote:
> > >>
> > >> > Our use case: we commit some compiled objects to the repo, where compiling
> > >> > is either slow or requires software which is not always available.
> > >> >
> > >> > Since upgrading Git 2.26.3 -> 2.32.4 (as part of Alpine Linux OS upgrade)
> > >> > we are noticing a change in build behaviour.
> > >> >
> > >> > Now, after a "git clone" we find the Makefile intermittently attempting
> > >> > (and failing) some builds that are not intended.
> > >> >
> > >> > Indeed, Make is acting reasonably as the source file is sometimes
> > >> > marginally newer than the destination (both checked out by Git), example
> > >> > below.
> > >> >
> > >> > I've never had to consider consistency timestamps within a Git checkout
> > >> > until now.
> > >> >
> > >> > It's entirely possible there's _never_ a guarantee of consistency here.
> > >> >
> > >> > But then something has certainly changed in practice, as this fault has
> > >> > gone from never happening to now every couple of days.
> > >> >
> > > >> > Imagining I can't be the first person to encounter this, I searched for
> > > >> > existing threads or docs, but overwhelmingly the results were question of
> > >> > Git tracking the timestamps (as part of the commit) which this is not;
> > >> > it's consistency within one checkout.
> > >> >
> > >> > $ git clone --depth 1 file:///path/to/repo.git
> > >> >
> > >> > $ stat winner.jpeg
> > >> >   File: winner.jpeg
> > >> >   Size: 258243          Blocks: 520        IO Block: 4096   regular file
> > >> > Device: fd07h/64775d    Inode: 33696       Links: 1
> > >> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> > >> > Access: 2022-10-31 16:05:17.756858496 +0000
> > >> > Modify: 2022-10-31 16:05:17.756858496 +0000
> > >> > Change: 2022-10-31 16:05:17.756858496 +0000
> > >> >  Birth: -
> > >> >
> > >> > $ stat winner.svg
> > >> >   File: winner.svg
> > >> >   Size: 52685           Blocks: 112        IO Block: 4096   regular file
> > >> > Device: fd07h/64775d    Inode: 33697       Links: 1
> > >> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> > >> > Access: 2022-10-31 16:05:17.766859030 +0000
> > >> > Modify: 2022-10-31 16:05:17.766859030 +0000
> > >> > Change: 2022-10-31 16:05:17.766859030 +0000
> > >> >  Birth: -
> > >> >
> > >> > Elsewhere in the repository, it's clear the timestamps are not consistent:
> > >> >
> > >> > $ stat Makefile
> > >> >   File: Makefile
> > >> >   Size: 8369            Blocks: 24         IO Block: 4096   regular file
> > >> > Device: fd07h/64775d    Inode: 33655       Links: 1
> > >> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> > >> > Access: 2022-10-31 16:05:51.628660212 +0000
> > >> > Modify: 2022-10-31 16:05:17.746857963 +0000
> > >> > Change: 2022-10-31 16:05:17.746857963 +0000
> > >> >  Birth: -
> > >>
> > >> I think you're almost certainly running into the parallel checkout,
> > >> which is new in that revision range. Try tweaking checkout.workers and
> > >> checkout.thresholdForParallelism (see "man git-config").
>
> This does look like something you would see with parallel checkout, yes.
> But...
>
> > > Thanks, it will be interesting to try this and I'll report back.
> >
> > > FWIW I was under the impression that we hadn't made it the default, so
> > > unless you opted-in it's probably not that.
>
> ... it indeed should be disabled by default. It seems Mark didn't
> manually enable parallel checkout, as the original message only mentions
> the git upgrade as a changing factor. And Alpine's git installation
> script for 2.32.4 [1] doesn't seem to change our defaults either.
>
> Perhaps, it just happens that 2.32.4 changed the checkout processing
> time slightly so that each entry is finished a bit slower (or the system
> was overloaded at that moment?). Anyways, the creation order (based on
> the mtimes) looks correct to me from a sequential-checkout point of
> > view: first Makefile, then winner.jpeg, and finally winner.svg. That's
> the order in which these files would appear in the index, which is the
> order followed by sequential checkout.
>
> [1]: https://git.alpinelinux.org/aports/tree/main/git/APKBUILD?h=3.14-stable&id=0f3285f2cfcb8362460002c27e219fadbf18c885

* Re: Consist timestamps within a checkout/clone
  2022-11-01 13:55 ` Marc Branchaud
@ 2022-11-02 14:45   ` Ævar Arnfjörð Bjarmason
  2022-11-03 13:46     ` Marc Branchaud
  0 siblings, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-02 14:45 UTC (permalink / raw)
  To: Marc Branchaud; +Cc: Mark Hills, git, Michał Górny


On Tue, Nov 01 2022, Marc Branchaud wrote:

> On 2022-10-31 15:01, Mark Hills wrote:
>> Our use case: we commit some compiled objects to the repo, where compiling
>> is either slow or requires software which is not always available.
>> Since upgrading Git 2.26.3 -> 2.32.4 (as part of Alpine Linux OS
>> upgrade)
>> we are noticing a change in build behaviour.
>> Now, after a "git clone" we find the Makefile intermittently
>> attempting
>> (and failing) some builds that are not intended.
>> Indeed, Make is acting reasonably as the source file is sometimes
>> marginally newer than the destination (both checked out by Git), example
>> below.
>
> A fix for this was proposed in 2018 and dismissed [1].
>
> Back then, the problem was that as Git wrote files into a directory
> sometimes the clock would tick over at a bad time, and we'd end up
> with some files being "newer" than others.  This would sour Make runs
> as you describe.
>
> Nominally this is caused by putting generated files in the repo, but
> many times that is unavoidable (e.g. you're forking an upstream that 
> puts automake-generated stuff in the repo).
>
> IMHO, dismissing the problem back then was a mistake.  At the time I
> advocated teaching Git to give all the files it touches (creates or 
> modifies) in a directory the same mtime (e.g. the time at the start of
> the checkout operation).
>
> Instead the decision was to do nothing in Git, and instead let people
> create their own post-checkout hooks to touch the files.  I (and
> others) argued this was inadequate, to no avail.
>
> 		M.
>
> [1] https://public-inbox.org/git/20180413170129.15310-1-mgorny@gentoo.org/#r

I think that's the wrong take-away from that thread. Maybe a patch for
this will get rejected in the end, but in that case it wasn't because
the git project is never going to take a patch like this.

Maybe it won't, but:

 * That commit has no tests
 * It's clearly controversial behavior, so *if* we add it I think it's
   better to make it opt-in configurable.
 * Once that's done, you'd need doc changes etc. for that.

Now, maybe a sufficiently polished version would also be "meh" for
whatever reason, I just think it's premature to say that a change in
this direction would never be accepted.

That being said, I do wonder if software in the wild is being
monkeypatched to work around issues with make (or make-like tools)
whether such a change isn't better advocated in e.g. GNU make itself.

If it added "B" to "MAKEFLAGS" if it detected:

 * I'm in a git repository
 * It's the first time I'm running here, or "nothing is built yet"
 * My dependency graph would be different with "-B"

Wouldn't that be what people who want this feature are after?

It's not like it's SCM-agnostic, it already goes to significant trouble
to cater to RCS and SCCS of all things, so I don't see why they'd
categorically reject a patch to cater to modern VCS's.

And, unlike Git, GNU make wouldn't need to guess that munging
timestamps would fix it, it can compute both versions of the dependency
graph, so it would know...

* Re: Consist timestamps within a checkout/clone
  2022-11-01 15:53   ` Ævar Arnfjörð Bjarmason
@ 2022-11-03 13:02     ` Erik Cervin Edin
  0 siblings, 0 replies; 17+ messages in thread
From: Erik Cervin Edin @ 2022-11-03 13:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Mark Hills, git

On Tue, Nov 1, 2022 at 4:53 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> I believe you might be trying to re-invent "make -B" :)

True, in the simple case, but if you
  git diff HEAD --name-only --diff-filter=AM | xargs touch
that should consolidate the modified times on disk of the files of that commit

It needs a bit more work, something like
  pre_checkout=$(git rev-parse HEAD)
  git checkout XYZ &&
  git diff "$pre_checkout"...XYZ --name-only --diff-filter=AM | xargs touch
but something like that can work around the inconsistent ordered
modified times after a checkout
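
A variation on the same idea, as a rough sketch: stamp every changed
file from one reference file with "touch -r", so all of them end up
with an identical mtime even if the clock ticks over midway. The helper
name touch_changed, the scratch repository and the demo identity are
made up for illustration.

```shell
# touch_changed REPO OLD NEW (hypothetical helper): give every file
# added or modified between two commits the same mtime.
touch_changed() {
    ref=$(mktemp) || return 1          # its mtime is the shared stamp
    # Without -z, git quotes unusual path names; plain names are assumed.
    git -C "$1" diff --name-only --diff-filter=AM "$2" "$3" |
    while IFS= read -r path; do
        touch -c -r "$ref" -- "$1/$path"
    done
    rm -f "$ref"
}

# Demo in a scratch repository with two commits.
repo=$(mktemp -d)
git init -q "$repo"
commit() {
    git -C "$repo" add . &&
    git -C "$repo" -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "$1"
}
echo v1 > "$repo/winner.svg"; echo v1 > "$repo/winner.jpeg"
commit first
old=$(git -C "$repo" rev-parse HEAD)
echo v2 > "$repo/winner.svg"; echo v2 > "$repo/winner.jpeg"
commit second

touch -t 202001010000 "$repo/winner.jpeg"   # simulate skewed mtimes
touch_changed "$repo" "$old" HEAD
```

Equal mtimes are enough for make, which only rebuilds a target when a
prerequisite is strictly newer.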

* Re: Consist timestamps within a checkout/clone
  2022-11-02 14:45   ` Ævar Arnfjörð Bjarmason
@ 2022-11-03 13:46     ` Marc Branchaud
  0 siblings, 0 replies; 17+ messages in thread
From: Marc Branchaud @ 2022-11-03 13:46 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Mark Hills, git, Michał Górny


On 2022-11-02 10:45, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Nov 01 2022, Marc Branchaud wrote:
>>
>> Instead the decision was to do nothing in Git, and instead let people
>> create their own post-checkout hooks to touch the files.  I (and others)
>> argued this was inadequate, to no avail.
>>
>> [1] https://public-inbox.org/git/20180413170129.15310-1-mgorny@gentoo.org/#r
> 
> I think that's the wrong take-away from that thread. Maybe a patch for
> this will get rejected in the end, but in that case it wasn't because
> the git project is never going to take a patch like this.
> 
> Maybe it won't, but:
> 
>   * That commit has no tests
>   * It's clearly controversial behavior, so *if* we add it I think it's
>     better to make it opt-in configurable.
>   * Once that's done, you'd need doc changes etc. for that.
> 
> Now, maybe a sufficiently polished version would also be "meh" for
> whatever reason, I just think it's premature to say that a change in
> this direction would never be accepted.

I did not say that it would never be accepted; perhaps I should have 
said "outcome" instead of "decision".  That 2018 thread barely discussed 
changes to the patch itself.  The patch's writer (not me) didn't pursue 
the work, after its frosty initial reception.

Past discussions around these proposals have been negative, which 
discourages people from polishing a submission as you suggest.  I hope 
that this time is different.  (Before you suggest I submit a patch, I 
sadly don't have the time to hack on Git these days.)

> That being said, I do wonder if software in the wild is being
> monkeypatched to work around issues with make (or make-like tools)
> whether such a change isn't better advocated in e.g. GNU make itself.
> 
> If it added "B" to "MAKEFLAGS" if it detected:
> 
>   * I'm in a git repository
>   * It's the first time I'm running here, or "nothing is built yet"
>   * My dependency graph would be different with "-B"
> 
> Wouldn't that be what people who want this feature are after?
> 
> It's not like it's SCM-agnostic, it already goes to significant trouble
> to cater to RCS and SCCS of all things, so I don't see why they'd
> categorically reject a patch to cater to modern VCS's.
> 
> And, unlike Git, GNU make wouldn't need to guess that munging
> timestamps would fix it, it can compute both versions of the dependency
> graph, so it would know...

Fair points about advocating for changes in make.  However, GNU make 
isn't the only flavour out there.  Our builds use both BSD's and GNU's 
makes, for example.  (Also, BSD make has a completely different 
interpretation of -B, and does not have any flag that mirrors GNU make's 
-B.)

Git is really the ideal place to solve this problem, instead of playing 
whack-a-mole with build tools and upstream projects.  Making the 
behaviour opt-in is perfectly reasonable.

		M.

end of thread, other threads:[~2022-11-03 13:46 UTC | newest]

Thread overview: 17+ messages
2022-10-31 19:01 Consist timestamps within a checkout/clone Mark Hills
2022-10-31 20:17 ` Andreas Schwab
2022-10-31 20:21 ` Ævar Arnfjörð Bjarmason
2022-10-31 20:36   ` Taylor Blau
2022-10-31 22:31     ` Mark Hills
2022-10-31 22:42       ` rsbecker
2022-11-01 18:34       ` Ævar Arnfjörð Bjarmason
2022-10-31 22:29   ` Mark Hills
2022-11-01 17:46     ` Ævar Arnfjörð Bjarmason
2022-11-02 14:16       ` Matheus Tavares
2022-11-02 14:28         ` Matheus Tavares
2022-11-01 13:55 ` Marc Branchaud
2022-11-02 14:45   ` Ævar Arnfjörð Bjarmason
2022-11-03 13:46     ` Marc Branchaud
2022-11-01 14:34 ` Erik Cervin Edin
2022-11-01 15:53   ` Ævar Arnfjörð Bjarmason
2022-11-03 13:02     ` Erik Cervin Edin
