* [PATCH] supply '-n' to gzip to produce identical tarballs
From: Fraser Tweedale @ 2011-04-10 6:12 UTC (permalink / raw)
To: git; +Cc: Fraser Tweedale
Without the '-n' ('--no-name') argument, gzip includes a timestamp in its
output, which results in different files each time. Important systems like
FreeBSD ports and perhaps many others hash/checksum downloaded files to
ensure integrity. For projects that do not release official archives,
gitweb's snapshot feature would be an excellent stand-in but for the fact
that the files it produces are not identical.
Supply '-n' to gzip to exclude the timestamp from the output and produce
identical output every time.
Signed-off-by: Fraser Tweedale <frase@frase.id.au>
---
gitweb/gitweb.perl | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 46186ab..2ab08da 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -186,7 +186,7 @@ our %known_snapshot_formats = (
'type' => 'application/x-gzip',
'suffix' => '.tar.gz',
'format' => 'tar',
- 'compressor' => ['gzip']},
+ 'compressor' => ['gzip', '-n']},
'tbz2' => {
'display' => 'tar.bz2',
--
1.7.4.3
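The effect described in the commit message can be reproduced without
gitweb. The following is a minimal sketch, assuming Python's standard gzip
module as a stand-in for the gzip command; the payload and the timestamps
are made up for illustration. It shows that two streams compressed from the
same input differ only in the 4-byte MTIME header field, and that storing
zero there (which is what 'gzip -n' does) makes the output repeatable.

```python
import gzip
import io
import struct


def gzip_bytes(data: bytes, mtime: int) -> bytes:
    """Compress data into a gzip stream whose header carries the given MTIME."""
    buf = io.BytesIO()
    # filename="" keeps the optional original-name field out of the header.
    with gzip.GzipFile(filename="", fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def header_mtime(stream: bytes) -> int:
    """Return the 32-bit little-endian MTIME stored at header bytes 4..7."""
    return struct.unpack("<I", stream[4:8])[0]


# Hypothetical payload standing in for the tar stream of a snapshot.
payload = b"pretend this is a tar archive\n" * 100

first = gzip_bytes(payload, mtime=1302415920)    # arbitrary timestamp
second = gzip_bytes(payload, mtime=1302415921)   # one second later
fixed_a = gzip_bytes(payload, mtime=0)           # no timestamp stored
fixed_b = gzip_bytes(payload, mtime=0)

print(first == second)            # False: the headers differ
print(first[8:] == second[8:])    # True: everything after MTIME matches
print(fixed_a == fixed_b)         # True: reproducible output
```

Since only the header differs, a checksum of the whole file changes even
though the compressed data is the same.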
* Re: [PATCH] supply '-n' to gzip to produce identical tarballs
From: Jakub Narebski @ 2011-04-10 7:38 UTC (permalink / raw)
To: Fraser Tweedale; +Cc: git
Fraser Tweedale <frase@frase.id.au> writes:
> Subject: [PATCH] supply '-n' to gzip to produce identical tarballs
>
> Without the '-n' ('--no-name') argument, gzip includes a timestamp in its
> output, which results in different files each time. Important systems like
> FreeBSD ports and perhaps many others hash/checksum downloaded files to
> ensure integrity. For projects that do not release official archives,
> gitweb's snapshot feature would be an excellent stand-in but for the fact
> that the files it produces are not identical.
>
> Supply '-n' to gzip to exclude the timestamp from the output and produce
> identical output every time.
>
> Signed-off-by: Fraser Tweedale <frase@frase.id.au>
Very good description, except subject line should denote which
subsystem this commit affects, i.e.:
gitweb: supply '-n' to gzip to produce identical tarballs
Hmmm... gzip in gitweb's 'snapshot' action gets the data to compress from
standard input, not from the filesystem. Isn't -n / --no-name a no-op then?
Just asking...
> ---
> gitweb/gitweb.perl | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 46186ab..2ab08da 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -186,7 +186,7 @@ our %known_snapshot_formats = (
> 'type' => 'application/x-gzip',
> 'suffix' => '.tar.gz',
> 'format' => 'tar',
> - 'compressor' => ['gzip']},
> + 'compressor' => ['gzip', '-n']},
Perhaps it would be more clear to use
+ 'compressor' => ['gzip', '--no-name']},
>
> 'tbz2' => {
> 'display' => 'tar.bz2',
> --
> 1.7.4.3
>
--
Jakub Narebski
Poland
ShadeHawk on #git
* Re: [PATCH] supply '-n' to gzip to produce identical tarballs
From: Fraser Tweedale @ 2011-04-10 10:13 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
On Sun, Apr 10, 2011 at 12:38:32AM -0700, Jakub Narebski wrote:
> Fraser Tweedale <frase@frase.id.au> writes:
>
> > Subject: [PATCH] supply '-n' to gzip to produce identical tarballs
> >
> > Without the '-n' ('--no-name') argument, gzip includes a timestamp in its
> > output, which results in different files each time. Important systems like
> > FreeBSD ports and perhaps many others hash/checksum downloaded files to
> > ensure integrity. For projects that do not release official archives,
> > gitweb's snapshot feature would be an excellent stand-in but for the fact
> > that the files it produces are not identical.
> >
> > Supply '-n' to gzip to exclude the timestamp from the output and produce
> > identical output every time.
> >
> > Signed-off-by: Fraser Tweedale <frase@frase.id.au>
>
> Very good description, except subject line should denote which
> subsystem this commit affects, i.e.:
>
> gitweb: supply '-n' to gzip to produce identical tarballs
>
Thank you. Do I need to amend the message and resubmit the patch? (first
time submitting a patch to git; I used git send-email).
> Hmmm... gzip in gitweb's 'snapshot' action gets the data to compress from
> standard input, not from the filesystem. Isn't -n / --no-name a no-op then?
> Just asking...
>
It is not a no-op; I have tested to confirm this. I'm not sure whether
a file name is recorded in the stdin case, or if so what it is, but the
timestamp is recorded and that makes the difference.
> > ---
> > gitweb/gitweb.perl | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index 46186ab..2ab08da 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -186,7 +186,7 @@ our %known_snapshot_formats = (
> > 'type' => 'application/x-gzip',
> > 'suffix' => '.tar.gz',
> > 'format' => 'tar',
> > - 'compressor' => ['gzip']},
> > + 'compressor' => ['gzip', '-n']},
>
> Perhaps it would be more clear to use
>
> + 'compressor' => ['gzip', '--no-name']},
>
> >
> > 'tbz2' => {
> > 'display' => 'tar.bz2',
> > --
> > 1.7.4.3
> >
>
Definitely, if the argument is the same (or even present) on all systems.
On FreeBSD and GNU both '-n' and '--no-name' do the job, but an audit
of other systems should be done to ensure they don't break. I chose '-n'
as it seemed the more conservative choice.
> --
> Jakub Narebski
> Poland
> ShadeHawk on #git
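On the open question above of whether a file name is recorded: in the gzip
format (RFC 1952) the original name and the timestamp are separate header
fields, so a stream can carry a timestamp without any name. A minimal
sketch, again assuming Python's gzip module rather than the gzip command,
with hypothetical helper names (make_stream, describe_header) and a made-up
timestamp:

```python
import gzip
import io
import struct

FNAME = 0x08  # FLG bit 3 in RFC 1952: an original file name follows


def make_stream(data: bytes, filename: str, mtime: int) -> bytes:
    """Build a gzip stream, optionally recording an original file name."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=filename, fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def describe_header(stream: bytes) -> dict:
    """Decode the fixed 10-byte gzip header plus the optional name field."""
    magic, method, flags, mtime, _xfl, _os = struct.unpack("<HBBIBB", stream[:10])
    if magic != 0x8B1F or method != 8:
        raise ValueError("not a deflate-compressed gzip stream")
    name = None
    if flags & FNAME:
        # Assumes no FEXTRA field, so the NUL-terminated name starts at offset 10.
        end = stream.index(b"\x00", 10)
        name = stream[10:end].decode("latin-1")
    return {"mtime": mtime, "name": name}


named = make_stream(b"hello\n", "COPYING", 1302415920)
anonymous = make_stream(b"hello\n", "", 1302415920)

print(describe_header(named))      # name recorded, timestamp recorded
print(describe_header(anonymous))  # no name, timestamp still recorded
```

The second stream is the shape one would expect for data read from a pipe:
there is nothing to put in the name field, but the timestamp field is still
present and still affects the output bytes.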
* Re: [PATCH] supply '-n' to gzip to produce identical tarballs
From: Jakub Narebski @ 2011-04-10 13:55 UTC (permalink / raw)
To: Fraser Tweedale; +Cc: git
On Sun, 10 Apr 2011, Fraser Tweedale wrote:
> On Sun, Apr 10, 2011 at 12:38:32AM -0700, Jakub Narebski wrote:
> > Fraser Tweedale <frase@frase.id.au> writes:
> >
> > > Subject: [PATCH] supply '-n' to gzip to produce identical tarballs
> > Very good description, except subject line should denote which
> > subsystem this commit affects, i.e.:
> >
> > gitweb: supply '-n' to gzip to produce identical tarballs
>
> Thank you. Do I need to amend the message and resubmit the patch? (first
> time submitting a patch to git; I used git send-email).
I don't think so. I guess that Junio can make such a trivial amendment
when applying, at the time he adds his sign-off.
> > Hmmm... gzip in gitweb's 'snapshot' action gets the data to compress from
> > standard input, not from the filesystem. Isn't -n / --no-name a no-op then?
> > Just asking...
>
> It is not a no-op; I have tested to confirm this. I'm not sure whether
> a file name is recorded in the stdin case, or if so what it is, but the
> timestamp is recorded and that makes the difference.
Thanks for the clarification.
For what it is worth:
Acked-by: Jakub Narebski <jnareb@gmail.com>
> > > - 'compressor' => ['gzip']},
> > > + 'compressor' => ['gzip', '-n']},
> >
> > Perhaps it would be more clear to use
> >
> > + 'compressor' => ['gzip', '--no-name']},
> Definitely, if the argument is the same (or even present) on all systems.
> On FreeBSD and GNU both '-n' and '--no-name' do the job, but an audit
> of other systems should be done to ensure they don't break. I chose '-n'
> as it seemed the more conservative choice.
So you chose '-n' because it is more likely to be widely supported,
right? Good enough for me.
--
Jakub Narebski
Poland
* Re: [PATCH] supply '-n' to gzip to produce identical tarballs
From: Junio C Hamano @ 2011-04-11 19:24 UTC (permalink / raw)
To: Jakub Narebski; +Cc: Fraser Tweedale, git
Jakub Narebski <jnareb@gmail.com> writes:
>> > Perhaps it would be more clear to use
>> >
>> > + 'compressor' => ['gzip', '--no-name']},
>
>> Definitely, if the argument is the same (or even present) on all systems.
>> On FreeBSD and GNU both '-n' and '--no-name' do the job, but an audit
>> of other systems should be done to ensure they don't break. I chose '-n'
>> as it seemed the more conservative choice.
>
> So you chose '-n' because it is more likely to be widely supported,
> right? Good enough for me.
Interesting. "gzip <COPYING" does get a consistent result because it can
fstat to get the timestamp, but "cat COPYING | gzip" does change its
output every time it is run. Good catch and a solution. Thanks, both.
So I should expect a pull request sometime after 1.7.5 final from you,
with "an audit of other systems" done by others on the list noted in the
final commit message?
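The difference between "gzip <COPYING" and "cat COPYING | gzip" comes down
to what kind of descriptor standard input is. A minimal sketch of that
check, assuming a POSIX-like system; the helper name file_mtime_or_none is
hypothetical, and this only illustrates the fstat distinction, not gzip's
actual source:

```python
import os
import stat
import tempfile


def file_mtime_or_none(fd: int):
    """Return the mtime if fd refers to a regular file, else None."""
    info = os.fstat(fd)
    if stat.S_ISREG(info.st_mode):
        return int(info.st_mtime)
    return None  # a pipe has no file modification time to reuse


# Case 1: like "gzip <COPYING", the descriptor is a regular file.
with tempfile.TemporaryFile() as regular:
    regular.write(b"license text\n")
    regular.flush()
    from_file = file_mtime_or_none(regular.fileno())

# Case 2: like "cat COPYING | gzip", the descriptor is a pipe.
read_end, write_end = os.pipe()
try:
    from_pipe = file_mtime_or_none(read_end)
finally:
    os.close(read_end)
    os.close(write_end)

print(from_file is not None)  # True: a stable timestamp is available
print(from_pipe is None)      # True: nothing stable to record
```

With a regular file the recorded timestamp stays the same between runs as
long as the file is untouched. With a pipe there is no such value, which
fits the observation that the piped output changes from run to run unless
'-n' is given.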
* Re: [PATCH] supply '-n' to gzip to produce identical tarballs
From: Fraser Tweedale @ 2011-04-11 20:59 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jakub Narebski, git
On Mon, Apr 11, 2011 at 12:24:05PM -0700, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>
> >> > Perhaps it would be more clear to use
> >> >
> >> > + 'compressor' => ['gzip', '--no-name']},
> >
> >> Definitely, if the argument is the same (or even present) on all systems.
> >> On FreeBSD and GNU both '-n' and '--no-name' do the job, but an audit
> >> of other systems should be done to ensure they don't break. I chose '-n'
> >> as it seemed the more conservative choice.
> >
> > So you chose '-n' because it is more likely to be widely supported,
> > right? Good enough for me.
>
> Interesting. "gzip <COPYING" does get a consistent result because it can
> fstat to get the timestamp, but "cat COPYING | gzip" does change its
> output every time it is run. Good catch and a solution. Thanks, both.
>
> So I should expect a pull request sometime after 1.7.5 final from you,
> with "an audit of other systems" done by others on the list noted in the
> final commit message?
>
>
Sure, no problem.
Thanks,
Fraser