From: Mark Hills <mark@xwax.org>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, Matheus Tavares <matheus.bernardino@usp.br>
Subject: Re: Consist timestamps within a checkout/clone
Date: Mon, 31 Oct 2022 22:29:04 +0000 (GMT)
Message-ID: <a87ebafd-c83-7a1d-d8d2-953bc9a93184@xwax.org>
In-Reply-To: <221031.86zgdb68p3.gmgdl@evledraar.gmail.com>

On Mon, 31 Oct 2022, Ævar Arnfjörð Bjarmason wrote:

> 
> On Mon, Oct 31 2022, Mark Hills wrote:
> 
> > Our use case: we commit some compiled objects to the repo, where compiling 
> > is either slow or requires software which is not always available.
> >
> > Since upgrading Git 2.26.3 -> 2.32.4 (as part of Alpine Linux OS upgrade) 
> > we are noticing a change in build behaviour.
> >
> > Now, after a "git clone" we find the Makefile intermittently attempting 
> > (and failing) some builds that are not intended.
> >
> > Indeed, Make is acting reasonably as the source file is sometimes 
> > marginally newer than the destination (both checked out by Git), example 
> > below.
> >
> > I've never had to consider consistency of timestamps within a Git 
> > checkout until now.
> >
> > It's entirely possible there's _never_ a guarantee of consistency here.
> >
> > But then something has certainly changed in practice, as this fault has 
> > gone from never happening to now every couple of days.
> >
> > Imagining I can't be the first person to encounter this, I searched for 
> > existing threads or docs, but overwhelmingly the results were questions 
> > about Git tracking the timestamps (as part of the commit), which this is 
> > not; it's consistency within one checkout.
> >
> > $ git clone --depth 1 file:///path/to/repo.git
> >
> > $ stat winner.jpeg
> >   File: winner.jpeg
> >   Size: 258243          Blocks: 520        IO Block: 4096   regular file
> > Device: fd07h/64775d    Inode: 33696       Links: 1
> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> > Access: 2022-10-31 16:05:17.756858496 +0000
> > Modify: 2022-10-31 16:05:17.756858496 +0000
> > Change: 2022-10-31 16:05:17.756858496 +0000
> >  Birth: -
> >
> > $ stat winner.svg
> >   File: winner.svg
> >   Size: 52685           Blocks: 112        IO Block: 4096   regular file
> > Device: fd07h/64775d    Inode: 33697       Links: 1
> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> > Access: 2022-10-31 16:05:17.766859030 +0000
> > Modify: 2022-10-31 16:05:17.766859030 +0000
> > Change: 2022-10-31 16:05:17.766859030 +0000
> >  Birth: -
> >
> > Elsewhere in the repository, it's clear the timestamps are not consistent:
> >
> > $ stat Makefile
> >   File: Makefile
> >   Size: 8369            Blocks: 24         IO Block: 4096   regular file
> > Device: fd07h/64775d    Inode: 33655       Links: 1
> > Access: (0644/-rw-r--r--)  Uid: (  106/ luthier)   Gid: (  106/ luthier)
> > Access: 2022-10-31 16:05:51.628660212 +0000
> > Modify: 2022-10-31 16:05:17.746857963 +0000
> > Change: 2022-10-31 16:05:17.746857963 +0000
> >  Birth: -
> 
> I think you're almost certainly running into the parallel checkout,
> which is new in that revision range. Try tweaking checkout.workers and
> checkout.thresholdForParallelism (see "man git-config").

Thanks, it will be interesting to try this and I'll report back.
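
A minimal sketch of how that experiment might be scripted, assuming
Python 3 and git on the PATH. The helper names sequential_clone_argv and
clone are hypothetical; checkout.workers is the setting named above, and
passing it with "git -c" applies it to that one invocation only. The URL
in the usage note is the placeholder from the original report.

```python
import subprocess


def sequential_clone_argv(url, dest=None, workers=1):
    """Build a 'git clone' command line that pins checkout.workers.

    Hypothetical helper: workers=1 requests a sequential checkout, so a
    clone made this way can be compared against one made with more workers.
    """
    argv = ["git", "-c", f"checkout.workers={workers}",
            "clone", "--depth", "1", url]
    if dest is not None:
        argv.append(dest)
    return argv


def clone(url, dest=None, workers=1):
    """Run the clone; raises CalledProcessError if git exits non-zero."""
    subprocess.run(sequential_clone_argv(url, dest, workers), check=True)
```

For example, clone("file:///path/to/repo.git", "seq", workers=1) and
clone("file:///path/to/repo.git", "par", workers=4) would give two
checkouts whose timestamps can then be compared with stat.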
 
> I can't say without looking at the code/Makefile (and even then, I don't
> have time to dig here:), but if I had to bet I'd say that your
> dependencies have probably always been broken with these checked-in
> files, but they happened to work out if they were checked out in sorted
> order.
>
> And now with the parallel checkout they're not guaranteed to do that, as
> some workers will "race ahead" and finish in an unpredictable order.

These are very simple Makefile rules, I don't think these dependencies are 
broken; but your theory is in good alignment with the observed behaviour.

For example, the rule from the recent case above is:

  %.jpeg:         %.png
                  convert $< $(IMFLAGS) $@

  %.png:          %.svg
                  inkscape --export-type=png --export-filename=$@ $<

As you suggest, perhaps the Git implementation previously checked files out 
in some kind of time order, which happened to provide a useful behaviour.
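
To make the timestamp comparison concrete, here is a rough sketch of the
check Make is effectively applying to these rules, written in Python. It
is a simplification under stated assumptions: the function name is_stale
is hypothetical, only one prerequisite is considered, and the filesystem
is assumed to keep sub-second modification times. It is not Make's
actual implementation.

```python
import os


def is_stale(target, prerequisite):
    """Return True if 'target' would be considered out of date.

    Simplified model: a missing target is stale; otherwise the target is
    stale only when the prerequisite's mtime is strictly newer. Equal
    timestamps do not trigger a rebuild.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.stat(target).st_mtime_ns
    prereq_mtime = os.stat(prerequisite).st_mtime_ns
    return prereq_mtime > target_mtime
```

With the stat output quoted above, winner.svg (16:05:17.766859030) is
about 10 ms newer than winner.jpeg (16:05:17.756858496), so
is_stale("winner.jpeg", "winner.svg") would return True in that checkout,
which matches the unwanted rebuild attempt.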

Specifically with build artefacts: these are likely to have been added to 
the repo after the source file, which could have been providing a 
practical and useful tendency in the ordering.

> But that's all just a guess, perhaps it has nothing to do with parallel
> checkout, such dependency issues are sensitive to all sorts of other
> things, e.g. maybe git got slightly faster (or slower), so now files
> that were always on different seconds (or the same) aren't in the state
> they were in before...

Hopefully I'll get to some experiments to narrow this down.

Thanks

-- 
Mark
