git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] supply '-n' to gzip to produce identical tarballs
@ 2011-04-10  6:12 Fraser Tweedale
  2011-04-10  7:38 ` Jakub Narebski
  0 siblings, 1 reply; 6+ messages in thread
From: Fraser Tweedale @ 2011-04-10  6:12 UTC (permalink / raw)
  To: git; +Cc: Fraser Tweedale

Without the '-n' ('--no-name') argument, gzip includes a timestamp in
its output, so compressing the same data twice produces different files.
Important systems like FreeBSD ports, and perhaps many others,
hash/checksum downloaded files to ensure integrity.  For projects that do
not release official archives, gitweb's snapshot feature would be an
excellent stand-in but for the fact that the files it produces are not
identical from one download to the next.

Supply '-n' to gzip to exclude the timestamp from the output and produce
identical output every time.

Signed-off-by: Fraser Tweedale <frase@frase.id.au>
---
 gitweb/gitweb.perl |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 46186ab..2ab08da 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -186,7 +186,7 @@ our %known_snapshot_formats = (
 		'type' => 'application/x-gzip',
 		'suffix' => '.tar.gz',
 		'format' => 'tar',
-		'compressor' => ['gzip']},
+		'compressor' => ['gzip', '-n']},
 
 	'tbz2' => {
 		'display' => 'tar.bz2',
-- 
1.7.4.3


* Re: [PATCH] supply '-n' to gzip to produce identical tarballs
  2011-04-10  6:12 [PATCH] supply '-n' to gzip to produce identical tarballs Fraser Tweedale
@ 2011-04-10  7:38 ` Jakub Narebski
  2011-04-10 10:13   ` Fraser Tweedale
  0 siblings, 1 reply; 6+ messages in thread
From: Jakub Narebski @ 2011-04-10  7:38 UTC (permalink / raw)
  To: Fraser Tweedale; +Cc: git

Fraser Tweedale <frase@frase.id.au> writes:

> Subject: [PATCH] supply '-n' to gzip to produce identical tarballs
>
> Without the '-n' ('--no-name') argument, gzip includes timestamp in
> output which results in different files.  Important systems like FreeBSD
> ports and perhaps many others hash/checksum downloaded files to ensure
> integrity.  For projects that do not release official archives, gitweb's
> snapshot feature would be an excellent stand-in but for the fact that the
> files it produces are not identical.
> 
> Supply '-n' to gzip to exclude timestamp from output and produce identical
> output every time.
> 
> Signed-off-by: Fraser Tweedale <frase@frase.id.au>

Very good description, except that the subject line should name the
subsystem this commit affects, i.e.:

  gitweb: supply '-n' to gzip to produce identical tarballs

Hmmm... gzip in gitweb's 'snapshot' action compresses data read from
standard input, not from the filesystem.  Isn't -n / --no-name a no-op
then?  Just asking...

> ---
>  gitweb/gitweb.perl |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 46186ab..2ab08da 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -186,7 +186,7 @@ our %known_snapshot_formats = (
>  		'type' => 'application/x-gzip',
>  		'suffix' => '.tar.gz',
>  		'format' => 'tar',
> -		'compressor' => ['gzip']},
> +		'compressor' => ['gzip', '-n']},

Perhaps it would be more clear to use

  +		'compressor' => ['gzip', '--no-name']},

>  
>  	'tbz2' => {
>  		'display' => 'tar.bz2',
> -- 
> 1.7.4.3
> 

-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: [PATCH] supply '-n' to gzip to produce identical tarballs
  2011-04-10  7:38 ` Jakub Narebski
@ 2011-04-10 10:13   ` Fraser Tweedale
  2011-04-10 13:55     ` Jakub Narebski
  0 siblings, 1 reply; 6+ messages in thread
From: Fraser Tweedale @ 2011-04-10 10:13 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Sun, Apr 10, 2011 at 12:38:32AM -0700, Jakub Narebski wrote:
> Fraser Tweedale <frase@frase.id.au> writes:
> 
> > Subject: [PATCH] supply '-n' to gzip to produce identical tarballs
> >
> > Without the '-n' ('--no-name') argument, gzip includes timestamp in
> > output which results in different files.  Important systems like FreeBSD
> > ports and perhaps many others hash/checksum downloaded files to ensure
> > integrity.  For projects that do not release official archives, gitweb's
> > snapshot feature would be an excellent stand-in but for the fact that the
> > files it produces are not identical.
> > 
> > Supply '-n' to gzip to exclude timestamp from output and produce identical
> > output every time.
> > 
> > Signed-off-by: Fraser Tweedale <frase@frase.id.au>
> 
> Very good description, except subject line should denote which
> subsystem this commit affects, i.e.:
> 
>   gitweb: supply '-n' to gzip to produce identical tarballs
> 
Thank you.  Do I need to amend the message and resubmit the patch?  (This
is my first time submitting a patch to git; I used git send-email.)

> Hmmm... gzip in gitweb's 'snapshot' action gets data compressed from
> standard input, not from filesystem.  Isn't -n / --no-name no-op then?
> Just asking...
> 
It is not a no-op; I have tested to confirm this.  I'm not sure whether
a file name is recorded in the stdin case, or if so what it is, but the
timestamp is recorded and that makes the difference.
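
To make the point concrete, here is a minimal Python sketch.  It is not
part of the patch, and it uses Python's gzip module rather than the gzip
binary that gitweb invokes.  It assumes the standard gzip member layout,
in which bytes 4-7 of the header hold a little-endian MTIME field.  The
names gzip_mtime and payload are illustrative, the two timestamps are
arbitrary, and mtime=0 stands in for what '-n' does on the command line.

```python
import gzip
import struct


def gzip_mtime(blob: bytes) -> int:
    """Return the MTIME field (header bytes 4..7, little-endian)."""
    return struct.unpack("<I", blob[4:8])[0]


# Stand-in for the tar stream that gitweb pipes into the compressor.
payload = b"pretend this is a tar stream\n"

# Same payload, compressed at two different (arbitrary) moments.
early = gzip.compress(payload, mtime=1_000_000_000)
late = gzip.compress(payload, mtime=1_000_000_001)

# Same payload with the timestamp suppressed, as '-n' would do.
plain_a = gzip.compress(payload, mtime=0)
plain_b = gzip.compress(payload, mtime=0)

print("early == late:    ", early == late)
print("plain_a == plain_b:", plain_a == plain_b)
print("MTIME fields:     ", gzip_mtime(early), gzip_mtime(late), gzip_mtime(plain_a))
```

The compressed data is the same in every case; only the four header bytes
differ, which is enough to change any checksum taken over the file.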

> > ---
> >  gitweb/gitweb.perl |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index 46186ab..2ab08da 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -186,7 +186,7 @@ our %known_snapshot_formats = (
> >  		'type' => 'application/x-gzip',
> >  		'suffix' => '.tar.gz',
> >  		'format' => 'tar',
> > -		'compressor' => ['gzip']},
> > +		'compressor' => ['gzip', '-n']},
> 
> Perhaps it would be more clear to use
> 
>   +		'compressor' => ['gzip', '--no-name']},
> 
> >  
> >  	'tbz2' => {
> >  		'display' => 'tar.bz2',
> > -- 
> > 1.7.4.3
> > 
> 
Definitely, if the argument is the same (or even present) on all systems.
On FreeBSD and GNU both '-n' and '--no-name' do the job, but an audit
of other systems should be done to ensure they don't break.  I chose '-n'
as it seemed the more conservative choice.

> -- 
> Jakub Narebski
> Poland
> ShadeHawk on #git


* Re: [PATCH] supply '-n' to gzip to produce identical tarballs
  2011-04-10 10:13   ` Fraser Tweedale
@ 2011-04-10 13:55     ` Jakub Narebski
  2011-04-11 19:24       ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Jakub Narebski @ 2011-04-10 13:55 UTC (permalink / raw)
  To: Fraser Tweedale; +Cc: git

On Sun, 10 Apr 2011, Fraser Tweedale wrote:
> On Sun, Apr 10, 2011 at 12:38:32AM -0700, Jakub Narebski wrote:
> > Fraser Tweedale <frase@frase.id.au> writes:
> > 
> > > Subject: [PATCH] supply '-n' to gzip to produce identical tarballs
 
> > Very good description, except subject line should denote which
> > subsystem this commit affects, i.e.:
> > 
> >   gitweb: supply '-n' to gzip to produce identical tarballs
>  
> Thank you.  Do I need to amend the message and resubmit the patch?  (first
> time submitting a patch to git; I used git send-email).

I don't think so.  I guess that Junio can make such a trivial amendment
when applying, at the time he adds his sign-off.

> > Hmmm... gzip in gitweb's 'snapshot' action gets data compressed from
> > standard input, not from filesystem.  Isn't -n / --no-name no-op then?
> > Just asking...
> 
> It is not no-op; I have tested to confirm this.  I'm not sure whether
> a file name is recorded in the stdin case, or if so what it is, but the
> timestamp is recorded and that makes the difference.

Thanks for the clarification.

For what it is worth:

Acked-by: Jakub Narebski <jnareb@gmail.com>


> > > -		'compressor' => ['gzip']},
> > > +		'compressor' => ['gzip', '-n']},
> > 
> > Perhaps it would be more clear to use
> > 
> >   +		'compressor' => ['gzip', '--no-name']},

> Definitely, if the argument is the same (or even present) on all systems.
> On FreeBSD and GNU both '-n' and '--no-name' do the job, but an audit
> of other systems should be done to ensure they don't break.  I chose '-n'
> as it seemed the more conservative choice.

So you chose '-n' because it is more likely to be widely supported,
didn't you?  Good enough for me.

-- 
Jakub Narebski
Poland


* Re: [PATCH] supply '-n' to gzip to produce identical tarballs
  2011-04-10 13:55     ` Jakub Narebski
@ 2011-04-11 19:24       ` Junio C Hamano
  2011-04-11 20:59         ` Fraser Tweedale
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2011-04-11 19:24 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Fraser Tweedale, git

Jakub Narebski <jnareb@gmail.com> writes:

>> > Perhaps it would be more clear to use
>> > 
>> >   +		'compressor' => ['gzip', '--no-name']},
>
>> Definitely, if the argument is the same (or even present) on all systems.
>> On FreeBSD and GNU both '-n' and '--no-name' do the job, but an audit
>> of other systems should be done to ensure they don't break.  I chose '-n'
>> as it seemed the more conservative choice.
>
> So you choose '-n' because it has more chance of being widely supported,
> isn't it?  Good enough for me.

Interesting.  "gzip <COPYING" does get a consistent result because gzip
can fstat its input to get the timestamp, but "cat COPYING | gzip" does
change its output every time it is run.  Good catch, and a good solution.
Thanks, both.
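
The file-versus-pipe difference can be imitated in a short Python
sketch.  This is only a model of the behaviour described above, not
gzip's actual source: header_stamp, compress_fd, via_file and via_pipe
are hypothetical helpers, the clock argument exists so the demo does not
have to sleep between runs, and mtime=0 stands in for '-n'.

```python
import gzip
import os
import stat
import tempfile
import time


def header_stamp(fd, clock=time.time):
    """Pick a header timestamp: file mtime if fd is a regular file, else 'now'."""
    st = os.fstat(fd)
    if stat.S_ISREG(st.st_mode):
        return int(st.st_mtime)  # regular file: stable across runs
    return int(clock())          # pipe: depends on when we run


def compress_fd(fd, no_name=False, clock=time.time):
    """Read fd to EOF and gzip it; no_name=True imitates 'gzip -n'."""
    chunks = []
    while chunk := os.read(fd, 65536):
        chunks.append(chunk)
    mtime = 0 if no_name else header_stamp(fd, clock)
    return gzip.compress(b"".join(chunks), mtime=mtime)


def via_file(path, **kwargs):
    """Imitate 'gzip <FILE': stdin is the file itself."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return compress_fd(fd, **kwargs)
    finally:
        os.close(fd)


def via_pipe(data, **kwargs):
    """Imitate 'cat FILE | gzip': stdin is a pipe (data must fit the pipe buffer)."""
    r, w = os.pipe()
    try:
        os.write(w, data)
    finally:
        os.close(w)
    try:
        return compress_fd(r, **kwargs)
    finally:
        os.close(r)


data = b"GNU GENERAL PUBLIC LICENSE\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
path = tmp.name

# Two runs of each variant; the pipe runs use fake clocks 100s apart.
file_runs = (via_file(path), via_file(path))
pipe_runs = (via_pipe(data, clock=lambda: 100),
             via_pipe(data, clock=lambda: 200))
pipe_n_runs = (via_pipe(data, no_name=True, clock=lambda: 100),
               via_pipe(data, no_name=True, clock=lambda: 200))
os.unlink(path)

print("file runs identical:   ", file_runs[0] == file_runs[1])
print("pipe runs identical:   ", pipe_runs[0] == pipe_runs[1])
print("pipe -n runs identical:", pipe_n_runs[0] == pipe_n_runs[1])
```

gitweb's snapshot action is in the pipe situation, which is why the
timestamp has to be suppressed rather than taken from a file.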

So I should expect a pull request sometime after 1.7.5 final from you,
with "an audit of other systems" done by others on the list noted in the
final commit message?


* Re: [PATCH] supply '-n' to gzip to produce identical tarballs
  2011-04-11 19:24       ` Junio C Hamano
@ 2011-04-11 20:59         ` Fraser Tweedale
  0 siblings, 0 replies; 6+ messages in thread
From: Fraser Tweedale @ 2011-04-11 20:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, git


On Mon, Apr 11, 2011 at 12:24:05PM -0700, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> >> > Perhaps it would be more clear to use
> >> > 
> >> >   +		'compressor' => ['gzip', '--no-name']},
> >
> >> Definitely, if the argument is the same (or even present) on all systems.
> >> On FreeBSD and GNU both '-n' and '--no-name' do the job, but an audit
> >> of other systems should be done to ensure they don't break.  I chose '-n'
> >> as it seemed the more conservative choice.
> >
> > So you choose '-n' because it has more chance of being widely supported,
> > isn't it?  Good enough for me.
> 
> Interesting.  "gzip <COPYING" does get a consistent result because it can
> fstat to get the timestamp, but "cat COPYING | gzip" does change its
> output every time it is run.  Good catch and a solution.  Thanks, both.
> 
> So I should expect a pull request sometime after 1.7.5 final from you,
> with "an audit of other systems" done by others on the list noted in the
> final commit message?
> 
> 
Sure, no problem.

Thanks,

Fraser

