git@vger.kernel.org mailing list mirror (one of many)
From: <rsbecker@nexbridge.com>
To: "'Mark Hills'" <mark@xwax.org>, "'Taylor Blau'" <me@ttaylorr.com>
Cc: "'Ævar Arnfjörð Bjarmason'" <avarab@gmail.com>,
	git@vger.kernel.org,
	"'Matheus Tavares'" <matheus.bernardino@usp.br>
Subject: RE: Consist timestamps within a checkout/clone
Date: Mon, 31 Oct 2022 18:42:02 -0400
Message-ID: <005e01d8ed7a$020589a0$06109ce0$@nexbridge.com>
In-Reply-To: <d4db484f-a525-f6db-1bfb-922f788dacd@xwax.org>

On October 31, 2022 6:31 PM, Mark Hills wrote:
>On Mon, 31 Oct 2022, Taylor Blau wrote:
>> On Mon, Oct 31, 2022 at 09:21:20PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> > I think you're almost certainly running into the parallel checkout,
>> > which is new in that revision range. Try tweaking checkout.workers
>> > and checkout.thresholdForParallelism (see "man git-config").
>> >
>> > I can't say without looking at the code/Makefile (and even then, I
>> > don't have time to dig here:), but if I had to bet I'd say that your
>> > dependencies have probably always been broken with these checked-in
>> > files, but they happened to work out if they were checked out in
>> > sorted order.
>> >
>> > And now with the parallel checkout they're not guaranteed to do
>> > that, as some workers will "race ahead" and finish in an unpredictable order.
>>
>> Doesn't checkout.thresholdForParallelism only matter when
>> checkout.workers != 1?
>>
>> So what you wrote seems like a reasonable explanation, but only if the
>> original reporter set checkout.workers to imply the non-sequential
>> behavior in the first place.
>>
>> That said...
>>
>>   - I also don't know off-hand of a place where we've defined the order
>>     where Git will checkout files in the working copy. So depending on
>>     that behavior isn't a safe thing to do.
>>
>>   - Committing build artifacts into your repository is generally
>>     discouraged.
>
>If it's undefined and never implemented this is reasonable.
>
>But "generally" is a caveat, so while I agree with the statement it also implies
>there's valid cases outside of that. Ones which used to work, too.
>
>Here are some useful cases I have seen for the combination of build rule +
>checked in file:
>
>- part of a build requires licensed software that's not always available
>
>- part of the build requires large memory that other builders generally do
>  not have available
>
>- part of the build process uses a different platform or some other system
>  requirement
>
>- to fetch data eg. from a URL, with a record of the URL/automation but
>  also a copy of the file as a record and for offline use
>
>So it's useful, to retain repeatable automation but not always build from square
>one.
>
>Generally discouraged to check in build results yes, but I've found it very practical.
>
>> So while I'd guess that setting `checkout.workers` back to "1" (if it
>> wasn't already) will probably restore the existing behavior, counting
>> on that behavior in the first place is wrong.
>
>I think perhaps the tail is wagging the dog here, though.
>
>It's 'wrong' because it doesn't work; but I haven't seen anything to make me think
>this is fundamentally or theoretically flawed.
>
>If we had a transactional file system we'd reasonably expect a checkout to be an
>atomic operation -- same timestamp on the files created in that step. A
>discrepancy in timestamps would be considered incorrect; it would imply an 'order'
>to the checkout which, as you say, is order-less.
>
>So what could be the bad outcomes if Git created files stamped with the point in
>time of the "git checkout"?

Timestamps are written based on when git modifies the file in the working directory, which is what allows timestamp-based automation to work. If intermediate contents are checked into repositories (I have people who do this for very justifiable regulatory reasons), the build has to make sure that there is an appropriate separation of timestamps (at least 1 second on UNIX-ish systems). On some other boxes that do not even have timestamps for files (you know who you are) this is moot.
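
A rough sketch of that kind of separation, not something git does itself: after a checkout, a build wrapper could move a checked-in generated file's mtime forward until it is at least one second newer than its source, so that make treats it as up to date. The function name, the file pairing, and the one-second default are assumptions made for the example.

```python
import os


def ensure_newer(target, source, gap_seconds=1.0):
    """Make target's mtime at least gap_seconds later than source's.

    Returns True if the target's timestamp had to be moved forward.
    """
    source_mtime = os.stat(source).st_mtime
    target_stat = os.stat(target)
    wanted = source_mtime + gap_seconds
    if target_stat.st_mtime >= wanted:
        return False  # already far enough ahead; leave it alone
    # Keep the access time; only move the modification time forward.
    os.utime(target, (target_stat.st_atime, wanted))
    return True
```

Calling this for each (generated file, source file) pair before invoking the build keeps the checked-in artifacts from being rebuilt, regardless of the order in which the checkout wrote them.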

However, there is a use case for maintaining timestamps, specifically for debuggers that check timestamps of source files. It is a big pain to make this work in git, but I script around it by setting the timestamps of files to the commit time when doing release builds, and allowing users to set the same timestamps for debugging. It helps, but it should not change the semantics of dev builds.
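
A minimal sketch of such a script, under some assumptions: `git` is on PATH, the script runs from the top of the working tree, and "commit time" is taken to mean the committer date of the last commit touching each file (using the HEAD commit's time for every file would be an equally plausible reading). The helper names are hypothetical.

```python
import os
import subprocess


def last_commit_time(path):
    """Committer timestamp (Unix seconds) of the last commit touching path."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%ct", "--", path],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    return int(out) if out else None  # None: path has no history


def set_mtime(path, timestamp):
    """Set both the access and modification time of path to timestamp."""
    os.utime(path, (timestamp, timestamp))


def stamp_tracked_files():
    """Stamp every tracked regular file with its last commit time."""
    files = subprocess.run(
        ["git", "ls-files", "-z"],
        check=True, capture_output=True, text=True,
    ).stdout.split("\0")
    for path in filter(None, files):
        when = last_commit_time(path)
        if when is not None and os.path.isfile(path):
            set_mtime(path, when)
```

Running `stamp_tracked_files()` only in the release-build path, and leaving ordinary checkouts alone, keeps the dev-build behaviour described above unchanged. It spawns one `git log` per file, so it is slow on large trees.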

-Randall

