bug-gnulib@gnu.org mirror (unofficial)
* RFC: git-commit based mtime-reproducible tarballs
       [not found] <87h6wtgmhy.fsf__22556.7857896507$1673713908$gmane$org@redhat.com>
@ 2023-01-15 11:01 ` Simon Josefsson via Gnulib discussion list
  2023-01-15 13:21   ` Bruno Haible
  0 siblings, 1 reply; 10+ messages in thread
From: Simon Josefsson via Gnulib discussion list @ 2023-01-15 11:01 UTC (permalink / raw)
  To: bug-gnulib

Hi.  Quoting the recent binutils announcement:

>   As an experiment these tarballs were made with the new "-r <date>"
>   option supported by the src-release.sh script.  This attempts to make
>   reproducible tarballs by sorting the files and passing the
>   "--mtime=<date>" option to tar.  The date used for these tarballs was
>   obtained by running:
>   
>     git log -1 --format=%cd --date=format:%F bfd/version.m4

This got me thinking about git-version-gen and GNUmakefile, and I came
up with the patch below to use the most recent commit as the timestamp
for all files in the tarball.  What do you think?

There are some concerns about this:

1) Having the same mtime on all files in a tarball may cause problems
for some projects that have fragile dependency-systems.  While I think
all dependency checks really should be using >= timestamp tests, I
wouldn't rule out that some use > timestamp tests, which would cause
(sometimes unwanted) rebuilding of some files.  Are there
dependency-constructs where the same mtime for all files in a tarball is
just a bad idea, with no better approach available?

2) The use of TAR_OPTIONS in GNUmakefile is complex and somewhat hard to
debug.  I can't find any cleaner way to provide options to tar for 'make
dist' though.  Automake defines $(AMTAR), but it looks like an internal
symbol and also isn't used (bug?); instead $(am__tar) is used and
defined as am__tar = $${TAR-tar} chof - "$$tardir".  So we can override
TAR in Makefile.am, but it looks like a user variable that we shouldn't
override.  So pending support for an AMTAR (or AM_TAR?) variable in
Makefile.am that actually works, I guess we are stuck with the
TAR_OPTIONS approach.  We could do 'TAR = env TAR_OPTIONS=... tar' in
Makefile.am but it looks like the wrong approach.

3) The Makefile.am snippet in git-version-gen is difficult to maintain.
Can't we put such snippets in a gnulib-owned file and suggest use of
'include gl/top-gl-Makefile.am-include.mk' instead?  The same applies to
the gen-ChangeLog rule.  The logic would have to be a bit more complex to
support per-project modifications to these rules though.

Two small bugs that are possible to fix but not important before we know
whether mtime-reproducible tarballs are useful or not:

4) If there is no .version file when you type 'make dist' my patch below
would fail to provide --mtime=... to tar.  So it fails if you didn't do
'make' before 'make dist' after ./bootstrap + ./configure in a clean
checkout.

5) It is also a bit fragile that it assumes 'git log -1' works without
checking for errors before invoking touch.
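
Concern 5 could be handled along the following lines.  This is only a
sketch, assuming a POSIX shell and GNU touch (for '-d @SECONDS'); the
helper names is_epoch and stamp_from_git are made up for illustration and
are not part of git-version-gen.  It uses --format=%ct, which prints the
committer date as seconds since the epoch, in place of %cd with
--date=unix.

```shell
# Hypothetical helpers, not part of gnulib.

# Succeed only if $1 is a non-empty string of decimal digits.
is_epoch() {
  case $1 in
    '' | *[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Set the mtime of file $1 to the committer date of the most recent commit.
# Leave the file untouched and return non-zero if git fails or prints
# something that is not a plain number of seconds.
stamp_from_git() {
  stamp=$(git log -1 --format=%ct 2>/dev/null) || return 1
  is_epoch "$stamp" || return 1
  touch -m -d "@$stamp" -- "$1"
}
```

A recipe could then run something like 'stamp_from_git $@-t || exit 1'
before the mv, so that a failing git is noticed before touch is invoked.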

/Simon

diff --git a/build-aux/git-version-gen b/build-aux/git-version-gen
index a72057bf2c..0a98cb12dd 100755
--- a/build-aux/git-version-gen
+++ b/build-aux/git-version-gen
@@ -66,6 +66,7 @@ scriptversion=2022-07-09.08; # UTC
 # BUILT_SOURCES = $(top_srcdir)/.version
 # $(top_srcdir)/.version:
 #      echo '$(VERSION)' > $@-t
+#      touch -m -d @$(shell git log -1 --format=%cd --date=unix) $@-t
 #      mv $@-t $@
 # dist-hook:
 #      echo '$(VERSION)' > $(distdir)/.tarball-version
diff --git a/top/GNUmakefile b/top/GNUmakefile
index 07b331fe53..f0dd41b5b4 100644
--- a/top/GNUmakefile
+++ b/top/GNUmakefile
@@ -25,8 +25,14 @@
 _gl-Makefile := $(wildcard [M]akefile)
 ifneq ($(_gl-Makefile),)
 
+_gl-.version := $(wildcard .version)
+ifneq ($(_gl-.version),)
+_tar_mtime := --mtime=.version
+endif
+
 # Make tar archive easier to reproduce.
-export TAR_OPTIONS = --owner=0 --group=0 --numeric-owner --sort=name
+export TAR_OPTIONS = --owner=0 --group=0 --numeric-owner --sort=name \
+       $(_tar_mtime)
 
 # Allow the user to add to this in the Makefile.
 ALL_RECURSIVE_TARGETS =


* Re: RFC: git-commit based mtime-reproducible tarballs
  2023-01-15 11:01 ` RFC: git-commit based mtime-reproducible tarballs Simon Josefsson via Gnulib discussion list
@ 2023-01-15 13:21   ` Bruno Haible
  2023-01-15 16:03     ` Paul Eggert
  2023-01-16  8:28     ` Simon Josefsson via Gnulib discussion list
  0 siblings, 2 replies; 10+ messages in thread
From: Bruno Haible @ 2023-01-15 13:21 UTC (permalink / raw)
  To: bug-gnulib; +Cc: Simon Josefsson

Hi Simon,

> >   This attempts to make
> >   reproducible tarballs by sorting the files and passing the
> >   "--mtime=<date>" option to tar. ...
> Having the same mtime on all files in a tarball

First question: What is the point of doing that?

Reproducibility is about verifying that an artifact A was generated
from a source S.

When I, as a GNU maintainer or uploader, create a tarball and upload it
to ftp.gnu.org, that tarball is the source S. Because that's what I sign
with my GPG key. The commits in the git repo aren't the source, and even
the git checkout on my disk isn't the source, because I am free to
unpack and repack the tarball as I like before I upload it to ftp.gnu.org.

When someone runs a complex build on possibly untrusted servers in the
cloud, then it makes sense to view the tarball as an artifact A and the
git repository as the source S. (This holds only if the git repository is
hosted elsewhere; if it is hosted on the same untrusted servers, it is
not sufficient.)

As a consequence, please make such modifications dependent on an option
or environment variable (maybe SOURCE_DATE_EPOCH [1]?); don't activate
them for everyone.
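
One way to make the behaviour opt-in is sketched below.  It assumes GNU
tar, which accepts '--mtime=@SECONDS', and the helper name
repro_tar_options is hypothetical.  The ownership and sorting options are
the ones GNUmakefile already sets; --mtime is added only when
SOURCE_DATE_EPOCH is set in the environment.

```shell
# Hypothetical helper: print the tar options to use for 'make dist'.
repro_tar_options() {
  # These options are already used unconditionally in top/GNUmakefile.
  opts='--owner=0 --group=0 --numeric-owner --sort=name'
  # Only pin the mtime when the user opted in via SOURCE_DATE_EPOCH.
  if [ -n "${SOURCE_DATE_EPOCH:-}" ]; then
    opts="$opts --mtime=@$SOURCE_DATE_EPOCH"
  fi
  printf '%s\n' "$opts"
}
```

With SOURCE_DATE_EPOCH unset the output is unchanged from today's
TAR_OPTIONS, so nobody gets the new behaviour by accident.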

> 1) Having the same mtime on all files in a tarball may cause problems

Definitely. HP-UX 'make' attempts to rebuild a file Y that depends on
a file X if Y and X have the same timestamp (mtime). It has long been
known that you have to have actually different timestamps for some files.

Bruno

[1] https://reproducible-builds.org/docs/source-date-epoch/






* Re: RFC: git-commit based mtime-reproducible tarballs
  2023-01-15 13:21   ` Bruno Haible
@ 2023-01-15 16:03     ` Paul Eggert
  2023-01-15 22:25       ` Bruno Haible
  2023-01-16  9:45       ` Vivien Kraus
  2023-01-16  8:28     ` Simon Josefsson via Gnulib discussion list
  1 sibling, 2 replies; 10+ messages in thread
From: Paul Eggert @ 2023-01-15 16:03 UTC (permalink / raw)
  To: Bruno Haible; +Cc: Simon Josefsson, bug-gnulib

On 2023-01-15 05:21, Bruno Haible wrote:
> Reproducibility is about verifying that an artifact A was generated
> from a source S.

Quite true. However, there's something else going on: when I do an 'ls 
-l' of a source directory that I got from a distribution tarball, it's 
useful to see the last time the contents of each source file was changed 
upstream. When sources are in a Git repository, I've found the commit 
timestamp to be a good representation for that.

For TZDB, where users have long wanted reproducibility, I use something 
like this in a Makefile recipe for each source file $$file:

	      time=`git log -1 --format='tformat:%ct' $$file` &&
	      touch -cmd @$$time $$file
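
Outside a Makefile, the same idea applied to every tracked file might
look like the sketch below.  It is not TZDB's actual recipe: the function
name set_mtimes_from_git is hypothetical, GNU touch is assumed, and file
names are assumed not to contain newlines.  Files with uncommitted
changes only get a warning and keep the mtime they already have.

```shell
# Hypothetical helper: give each tracked file the committer date of the
# last commit that touched it.  Run from inside a git work tree.
set_mtimes_from_git() {
  git ls-files | while IFS= read -r file; do
    # A file that differs from HEAD has no meaningful commit timestamp.
    if ! git diff --quiet HEAD -- "$file"; then
      echo "warning: $file has uncommitted changes; keeping its mtime" >&2
      continue
    fi
    # %ct is the committer date in seconds since the epoch.
    time=$(git log -1 --format=%ct -- "$file") &&
      [ -n "$time" ] &&
      touch -cmd "@$time" -- "$file"
  done
}
```

Running one 'git log' per file is slow on large repositories; that cost
is accepted here to keep the sketch short.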

Here are three problems I ran into with this approach, and the solutions 
that TZDB uses:

1. As you mentioned, what if you're building a release from sources that 
have not yet been committed? In this case TZDB's Makefile recipe warns 
but goes ahead with the timestamp that the working file already has.

2. What about platform-independent files that are automatically created 
from source files from the repository, and that are shipped in the 
release tarball? In this case, the TZDB Makefile arranges for each such 
file to have a timestamp one second later than the maximum of timestamps 
of files that the file depends on. This step is the biggest hassle, 
since it means I need to repeat in the Makefile the logic that 'make' 
already uses when calculating dependencies.

3. What about tarball metadata other than last-modified time? Here, TZDB 
uses the following GNU Tar options:

   GNUTARFLAGS= --format=pax --pax-option='delete=atime,delete=ctime' \
   --numeric-owner --owner=0 --group=0 \
   --mode=go+u,go-w --sort=name

The need for most of this should be obvious, if one wants the tarball to 
be reproducible. However, some details are less obvious. GNUTARFLAGS 
specifies pax format because the default GNU Tar format becomes 
unportable after 2242-03-16 12:56:32 UTC due to the 33-bit limitation of 
ustar. And GNUTARFLAGS uses delete=atime,delete=ctime so that atime and 
ctime do not leak into the tarball and make it less reproducible; since 
mtime values are always a multiple of 1 second (given steps 1 and 2) 
this means the tarball will be ustar-compatible until 2242, giving users 
*plenty* of time to prepare for pax format timestamps.
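
The 2242 date follows from simple arithmetic: an mtime field of 11 octal
digits holds 33 bits, so the limit is 2^33 seconds after the epoch.  The
sketch below redoes the calculation in shell arithmetic, assuming a shell
with at least 64-bit integers; the last line additionally assumes GNU
date and is skipped silently if '-d @SECONDS' is not supported.

```shell
# 11 octal digits = 33 bits, so timestamps must stay below 2^33.
limit=$((1 << 33))
echo "$limit"               # 8589934592 seconds after 1970-01-01 00:00:00 UTC
echo "$((limit / 86400))"   # 99420 whole days
echo "$((limit % 86400))"   # 46592 seconds into that day, i.e. 12:56:32
# With GNU date (an assumption), print the same instant as a calendar date.
date -u -d "@$limit" '+%Y-%m-%d %H:%M:%S' 2>/dev/null || true
```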

There is an argument that we need not have a fancy GNUTARFLAGS like 
this, because I'm signing the tarballs and users have to trust me 
anyway. Still, some users want to "trust but verify" and a reproducible 
tarball is easier to audit than a non-reproducible one, so for these 
users it can be a win to omit the irrelevant data from the tarball.



* Re: RFC: git-commit based mtime-reproducible tarballs
  2023-01-15 16:03     ` Paul Eggert
@ 2023-01-15 22:25       ` Bruno Haible
  2023-01-16  8:40         ` Simon Josefsson via Gnulib discussion list
  2023-01-16  9:45       ` Vivien Kraus
  1 sibling, 1 reply; 10+ messages in thread
From: Bruno Haible @ 2023-01-15 22:25 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Simon Josefsson, bug-gnulib

Paul Eggert wrote:
> some users want to "trust but verify" and a reproducible 
> tarball is easier to audit than a non-reproducible one, so for these 
> users it can be a win to omit the irrelevant data from the tarball.

Reproducibility can be implemented in different ways:
  - by omitting irrelevant data from the tarball,
  - by having a customized comparison program 'diff', such that
    "diff --ignore-irrelevant-metadata contents1 contents2"
    would ignore the irrelevant parts.

> when I do an 'ls 
> -l' of a source directory that I got from a distribution tarball, it's 
> useful to see the last time the contents of each source file was changed 
> upstream.

OK, now we're discussing different ways to make a tarball reproducible.
That's nice, because Simon's proposal was to make all timestamps equal,
and that puts me off.
In binutils-2.40.tar.bz2 all files are from 2023-01-14.
In android-studio-2021.3.1.17-linux.tar.gz all files are from 2010-01-01.
It gives me as a user no idea whether this tarball is 13 years old,
2 years old, or from yesterday.

I much prefer Paul's approach, since it still conveys meaningful
timestamps:

> For TZDB, where users have long wanted reproducibility, I use something 
> like this in a Makefile recipe for each source file $$file:
> 
> 	      time=`git log -1 --format='tformat:%ct' $$file` &&
> 	      touch -cmd @$$time $$file

That's good for the files that are under version control.

> 2. What about platform-independent files that are automatically created 
> from source files from the repository, and that are shipped in the 
> release tarball?

For these, you could unpack the tarball, see in which order the timestamps
are, and then assign artificial timestamps, in the same order but exactly
2 seconds apart. For example, if the tarball contains
under version control:
  hello.c         2023-01-14 13:28:14
  configure.ac    2023-01-01 14:03:07
and not under version control:
  configure       2023-01-15 04:09:10
  config.h.in     2023-01-15 04:05:19
then you would determine the
  max_timestamp_under_vc = max { 2023-01-14 13:28:14, 2023-01-01 14:03:07 }
                         = 2023-01-14 13:28:14
and then, since config.h.in is older than configure:
  touch -m (max_timestamp_under_vc + 2 seconds) config.h.in
  touch -m (max_timestamp_under_vc + 4 seconds) configure

You can do this without knowing the Makefile rules or scripts which created
config.h.in and configure.

The increment of 2 seconds is, of course, for VFAT file systems, which have
only 2 seconds of resolution for file modification times.
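
The procedure above can be sketched in shell as follows.  This is an
illustration only: the names max_mtime and retime_generated are
hypothetical, GNU stat and GNU touch are assumed, and file names are
assumed not to contain whitespace.

```shell
# Hypothetical helpers; assume GNU stat, GNU touch, and file names
# without whitespace.

# Print the largest mtime (seconds since the epoch) among the given files.
max_mtime() {
  max=0
  for f in "$@"; do
    t=$(stat -c %Y -- "$f") || return 1
    if [ "$t" -gt "$max" ]; then max=$t; fi
  done
  echo "$max"
}

# retime_generated BASE FILE...
# Keep the current mtime order of the FILEs, but move them to
# BASE+2, BASE+4, ... seconds.
retime_generated() {
  base=$1
  shift
  step=0
  # 'ls -tr' lists the oldest file first.
  for f in $(ls -tr -- "$@"); do
    step=$((step + 2))
    touch -m -d "@$((base + step))" -- "$f"
  done
}
```

For the example above this would be invoked as
'retime_generated "$(max_mtime hello.c configure.ac)" configure config.h.in',
which puts config.h.in 2 seconds and configure 4 seconds after hello.c.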

> GNUTARFLAGS uses delete=atime,delete=ctime so that atime and 
> ctime do not leak into the tarball and make it less reproducible

I agree, it's pointless to have atime and ctime in a tarball.

Bruno






* Re: RFC: git-commit based mtime-reproducible tarballs
  2023-01-15 13:21   ` Bruno Haible
  2023-01-15 16:03     ` Paul Eggert
@ 2023-01-16  8:28     ` Simon Josefsson via Gnulib discussion list
  1 sibling, 0 replies; 10+ messages in thread
From: Simon Josefsson via Gnulib discussion list @ 2023-01-16  8:28 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib

Hi Bruno,

> Hi Simon,
>
>> >   This attempts to make
>> >   reproducible tarballs by sorting the files and passing the
>> >   "--mtime=<date>" option to tar. ...
>> Having the same mtime on all files in a tarball
>
> First question: What is the point of doing that?

Good question, I don't know the motivation for the binutils people.  For
me, the motivation would be to get rid of arbitrary/random differences
in non-source artifacts.  Those make auditing non-source artifacts for
reproducibility more difficult and error-prone, and can even be a source of
side channels.  I think the exact motivations are still not fully
understood and that this is an evolving space -- articulating the goals
is useful to measure if we will actually meet them.

To me this is similar to including a build timestamp in a binary.  In
theory it would not cause any problems for anyone, but in practice it
will be one more source of differences that may hide or complicate
finding other more important differences.  Thus from a helicopter
perspective it does make sense to fix that particular non-reproducible
behaviour even though it is difficult to argue that the timestamp by
itself is a serious bug that is important to fix.

> Reproducibility is about verifying that an artifact A was generated
> from a source S.

Right, and I think the proponents of reproducibility suggest that an
even stronger verification should be possible: that there is a
one-to-one correspondence between source S and artifact A [for a
particular environment where A is relevant].

If different artifacts A can be generated from the same source S this
will be a source of irreproducibility and non-deterministic behaviour,
which ultimately can be a security/safety/reliability problem.

> When I, as a GNU maintainer or uploader, create a tarball and upload it
> to ftp.gnu.org, that tarball is the source S. Because that's what I sign
> with my GPG key. The commits in the git repo aren't the source, and even
> the git checkout on my disk aren't the source — because I am free to
> unpack and repack the tarball as I like, before I upload it to ftp.gnu.org.

Yeah, and I think this is what is being challenged recently -- some
people don't consider tarballs the only relevant source code any more.

To me this makes some sense: we all have tried to fix a small bug in a
package by making changes to some source code, and then seen the build
fail catastrophically, sometimes in ways that can't even be resolved
because the necessary tools or source code were omitted from the
released tarball.

I think it is good practice to verify that our tarballs can be
regenerated reproducibly from version controlled sources and free tools.

>> 1) Having the same mtime on all files in a tarball may cause problems
>
> Definitely. HP-UX 'make' attempts to rebuilds a file Y that depends on
> a file X, if Y and X have the same timestamp (mtime). It is long known
> that you have to have actually different timestamps for some files.

Interesting -- I wonder if supporting HP-UX [without GNU make] is worth
more than the benefits from reproducible tarballs.

/Simon


* Re: RFC: git-commit based mtime-reproducible tarballs
  2023-01-15 22:25       ` Bruno Haible
@ 2023-01-16  8:40         ` Simon Josefsson via Gnulib discussion list
  2023-01-16  8:51           ` Jim Meyering
  0 siblings, 1 reply; 10+ messages in thread
From: Simon Josefsson via Gnulib discussion list @ 2023-01-16  8:40 UTC (permalink / raw)
  To: Bruno Haible; +Cc: Paul Eggert, bug-gnulib

Bruno Haible <bruno@clisp.org> writes:

> Paul Eggert wrote:
>> some users want to "trust but verify" and a reproducible 
>> tarball is easier to audit than a non-reproducible one, so for these 
>> users it can be a win to omit the irrelevant data from the tarball.
>
> Reproducibility can be implemented in different ways:
>   - by omitting irrelevant data from the tarball,
>   - by having a customized comparison program 'diff', such that
>     "diff --ignore-irrelevant-metadata contents1 contents2"
>     would ignore the irrelevant parts.

The problem with a --ignore-irrelevant-metadata approach is that it will
be a judgement call what is irrelevant, and two projects may have
different philosophies that are mutually incompatible.

A devil's advocate case: consider a build system that embeds the
source-code timestamp information in the binary, and the binary sends off
a hash of its executable to a remote server for verification
purposes.  In some projects this may be what you want to achieve.  Then
ignoring this particular metadata will be a critical failure for that
project.

I think it is a worthy goal to reach a tarball that is deterministically
and one-way reproducible from git source code [for the same set of tool
versions].

>> when I do an 'ls 
>> -l' of a source directory that I got from a distribution tarball, it's 
>> useful to see the last time the contents of each source file was changed 
>> upstream.
>
> OK, now we're discussing different ways to make a tarball reproducible.
> That's nice, because Simon's proposal was to make all timestamps equal,
> and that puts me off.
> In binutils-2.40.tar.bz2 all files are from 2023-01-14.
> In android-studio-2021.3.1.17-linux.tar.gz all files are from 2010-01-01.
> It gives me as a user no idea whether this tarball is 13 years old,
> 2 years old, or from yesterday.
>
> I much prefer Paul's approach, since it still conveys meaningful
> timestamps:

I agree!

I even wonder whether the binutils tarball builds properly on, say, HP-UX then.

>> For TZDB, where users have long wanted reproducibility, I use something 
>> like this in a Makefile recipe for each source file $$file:
>> 
>> 	      time=`git log -1 --format='tformat:%ct' $$file` &&
>> 	      touch -cmd @$$time $$file
>
> That's good for the files that are under version control.
>
>> 2. What about platform-independent files that are automatically created 
>> from source files from the repository, and that are shipped in the 
>> release tarball?
>
> For these, you could unpack the tarball, see in which order the timestamps
> are, and then assign artificial timestamps, in the same order but exactly
> 2 seconds apart. For example, if the tarball contains
> under version control:
>   hello.c         2023-01-14 13:28:14
>   configure.ac    2023-01-01 14:03:07
> and not under version control:
>   configure       2023-01-15 04:09:10
>   config.h.in     2023-01-15 04:05:19
> then you would determine the
>   max_timestamp_under_vc = max { 2023-01-14 13:28:14, 2023-01-01 14:03:07 }
>                          = 2023-01-14 13:28:14
> and then, since config.h.in is older than configure:
>   touch -m (max_timestamp_under_vc + 2 seconds) config.h.in
>   touch -m (max_timestamp_under_vc + 4 seconds) configure
>
> You can do this without knowing the Makefile rules or scripts which created
> config.h.in and configure.
>
> The increment of 2 seconds is, of course, for VFAT file systems, which have
> only 2 seconds of resolution for file modification times.

Clever!

To implement this we would need a dist-hook to do the 'touch -m ...'
dance on all files.

I somewhat fear that the solution here will be more of a problem than
the original problem due to the complexity.

Does anyone see a problem with this approach?  Do you think it is a good
idea?  I like it and don't see any further problems, except for the
complexity but I don't see a way to reduce it.

/Simon


* Re: RFC: git-commit based mtime-reproducible tarballs
  2023-01-16  8:40         ` Simon Josefsson via Gnulib discussion list
@ 2023-01-16  8:51           ` Jim Meyering
  0 siblings, 0 replies; 10+ messages in thread
From: Jim Meyering @ 2023-01-16  8:51 UTC (permalink / raw)
  To: Simon Josefsson; +Cc: Bruno Haible, Paul Eggert, bug-gnulib@gnu.org List

On Mon, Jan 16, 2023, 12:41 AM Simon Josefsson via Gnulib discussion list <
bug-gnulib@gnu.org> wrote:

> Does anyone see a problem with this approach?  Do you think it is a good
> idea?  I like it and don't see any further problems, except for the
> complexity but I don't see a way to reduce it.
>

I like it, too.


* Re: RFC: git-commit based mtime-reproducible tarballs
  2023-01-15 16:03     ` Paul Eggert
  2023-01-15 22:25       ` Bruno Haible
@ 2023-01-16  9:45       ` Vivien Kraus
  2023-01-16 11:48         ` Bruno Haible
  2023-01-16 23:00         ` Simon Josefsson via Gnulib discussion list
  1 sibling, 2 replies; 10+ messages in thread
From: Vivien Kraus @ 2023-01-16  9:45 UTC (permalink / raw)
  To: Paul Eggert, Bruno Haible; +Cc: Simon Josefsson, bug-gnulib

Hello,

On Sunday, 15 January 2023 at 08:03 -0800, Paul Eggert wrote:
> For TZDB, where users have long wanted reproducibility, I use
> something 
> like this in a Makefile recipe for each source file $$file:
> 
>               time=`git log -1 --format='tformat:%ct' $$file` &&
>               touch -cmd @$$time $$file

If your texinfo file includes version.texi, then its modification date
is very important because it impacts the date that appears in the final
file. Your solution is in my opinion the only correct way to answer the
problem.

However, there are situations in which you only have access to a
shallow clone of the git repository (for instance, Gitlab CI). I am not
sure how this solution would work in that case.

Best regards,

Vivien



* Re: RFC: git-commit based mtime-reproducible tarballs
  2023-01-16  9:45       ` Vivien Kraus
@ 2023-01-16 11:48         ` Bruno Haible
  2023-01-16 23:00         ` Simon Josefsson via Gnulib discussion list
  1 sibling, 0 replies; 10+ messages in thread
From: Bruno Haible @ 2023-01-16 11:48 UTC (permalink / raw)
  To: Vivien Kraus; +Cc: Paul Eggert, Simon Josefsson, bug-gnulib

Vivien Kraus wrote:
> However, there are situations in which you only have access to a
> shallow clone of the git repository (for instance, Gitlab CI). I am not
> sure how this solution would work in that case.

The shallow clone is an optimization that was based on the assumption that
the CI does not need the git history. CI runs that produce tarballs in
this way, and that need to look into the git history, will need a full
clone.

Bruno






* Re: RFC: git-commit based mtime-reproducible tarballs
  2023-01-16  9:45       ` Vivien Kraus
  2023-01-16 11:48         ` Bruno Haible
@ 2023-01-16 23:00         ` Simon Josefsson via Gnulib discussion list
  1 sibling, 0 replies; 10+ messages in thread
From: Simon Josefsson via Gnulib discussion list @ 2023-01-16 23:00 UTC (permalink / raw)
  To: Vivien Kraus; +Cc: Paul Eggert, Bruno Haible, bug-gnulib

Vivien Kraus <vivien@planete-kraus.eu> writes:

> However, there are situations in which you only have access to a
> shallow clone of the git repository (for instance, Gitlab CI). I am not
> sure how this solution would work in that case.

Indeed, good point.  I think 'make dist' should continue to work in
shallow clones, with its obvious consequences (incomplete ChangeLog,
non-deterministic mtime of version controlled files, anything else?).
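
A 'make dist' rule could detect this case and warn without failing.  The
sketch below assumes a git new enough to support
'git rev-parse --is-shallow-repository'; the helper names are
hypothetical.

```shell
# Hypothetical helper: succeed only inside a shallow git clone.
# 'git rev-parse --is-shallow-repository' prints "true" or "false".
is_shallow_clone() {
  [ "$(git rev-parse --is-shallow-repository 2>/dev/null)" = true ]
}

# Warn, but do not fail, so that 'make dist' keeps working.
warn_if_shallow() {
  if is_shallow_clone; then
    echo "warning: shallow clone; per-file commit timestamps may be wrong" >&2
  fi
  return 0
}
```

A CI job that needs correct timestamps could run 'git fetch --unshallow'
first to obtain the full history.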

/Simon
