git@vger.kernel.org mailing list mirror (one of many)
* clean/smudge filters on .zip/.tgz files
@ 2013-02-26 22:38 Tim Chase
  2013-02-27  6:39 ` Johannes Sixt
  0 siblings, 1 reply; 3+ messages in thread
From: Tim Chase @ 2013-02-26 22:38 UTC
  To: git

Various programs that I use ([Open|Libre]Office, Vym, etc) use a
zipped/.tgz'ed file format, usually containing multiple
(usually) plain-text files within.

I'm trying to figure out a way for git to treat these as virtual
directories for purposes of merging/diffing.  

Reading up on clean/smudge filters, it looks like they expect one
file coming in and one file going out, rather than one file
on one side and a directory-tree of files on the other side.

I tried creating my own pair of clean/smudge filters that would
uncompress the files, but there's no good way to put multiple files on
stdout.

Has anybody else played with such a scheme for uncompressing files as
they go into git and recompressing them as they come back out?

-tkc


* Re: clean/smudge filters on .zip/.tgz files
  2013-02-26 22:38 clean/smudge filters on .zip/.tgz files Tim Chase
@ 2013-02-27  6:39 ` Johannes Sixt
  2013-02-27 15:18   ` Michael J Gruber
  0 siblings, 1 reply; 3+ messages in thread
From: Johannes Sixt @ 2013-02-27  6:39 UTC
  To: Tim Chase; +Cc: git

On 2/26/2013 23:38, Tim Chase wrote:
> Various programs that I use ([Open|Libre]Office, Vym, etc) use a
> zipped/.tgz'ed file format, usually containing multiple
> (usually) plain-text files within.
> 
> I'm trying to figure out a way for git to treat these as virtual
> directories for purposes of merging/diffing.  
> 
> Reading up on clean/smudge filters, it looks like they expect one
> file coming in and one file going out, rather than one file
> on one side and a directory-tree of files on the other side.
> 
> I tried creating my own pair of clean/smudge filters that would
> uncompress the files, but there's no good way to put multiple files on
> stdout.
> 
> Has anybody else played with such a scheme for uncompressing files as
> they go into git and recompressing them as they come back out?

I attempted to do something like this for OpenDocument files (I didn't get
very far) until I discovered that LibreOffice can save "flat open document
files". That combined with the option "save files optimized" switched off
results in fairly readable XML in a single file that can even be merged
under some circumstances.

You would still need a clean filter that normalizes the style numbers,
cross reference marks and other stuff that changes each time LibreOffice
saves the file.
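
A rough sketch of what such a normalizing clean filter could look like,
for the style numbers only. It assumes the generated style names look
like "P1" or "T12" and appear in style:name and *:style-name attributes;
that pattern, the regular expression and the function name are
illustrative assumptions, not something checked against real LibreOffice
output. It renumbers those names by order of first appearance and does
not touch cross reference marks.

```python
import re
import sys

# Assumption: generated style names are an uppercase prefix plus a number
# ("P1", "T12") and occur in style:name="..." or <ns>:style-name="..." attributes.
STYLE_ATTR = re.compile(r'((?:style:name|[\w-]+:style-name)=")([A-Z]+)(\d+)(")')


def normalize_style_numbers(xml: str) -> str:
    """Renumber generated style names by order of first appearance."""
    mapping = {}   # old name -> new name
    counters = {}  # prefix -> last number handed out

    def rename(match):
        prefix = match.group(2)
        old = prefix + match.group(3)
        if old not in mapping:
            counters[prefix] = counters.get(prefix, 0) + 1
            mapping[old] = prefix + str(counters[prefix])
        return match.group(1) + mapping[old] + match.group(4)

    return STYLE_ATTR.sub(rename, xml)


if __name__ == "__main__":
    # A clean filter reads the working-tree file on stdin and writes the
    # version to be stored on stdout.
    sys.stdout.write(normalize_style_numbers(sys.stdin.read()))
```

A user-defined style that happens to be named like "H1" would be
renumbered too, so a real filter would need a stricter rule.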

-- Hannes


* Re: clean/smudge filters on .zip/.tgz files
  2013-02-27  6:39 ` Johannes Sixt
@ 2013-02-27 15:18   ` Michael J Gruber
  0 siblings, 0 replies; 3+ messages in thread
From: Michael J Gruber @ 2013-02-27 15:18 UTC
  To: Johannes Sixt; +Cc: Tim Chase, git

Johannes Sixt wrote on 27.02.2013 07:39:
> On 2/26/2013 23:38, Tim Chase wrote:
>> Various programs that I use ([Open|Libre]Office, Vym, etc) use a
>> zipped/.tgz'ed file format, usually containing multiple
>> (usually) plain-text files within.
>>
>> I'm trying to figure out a way for git to treat these as virtual
>> directories for purposes of merging/diffing.  
>>
>> Reading up on clean/smudge filters, it looks like they expect one
>> file coming in and one file going out, rather than one file
>> on one side and a directory-tree of files on the other side.
>>
>> I tried creating my own pair of clean/smudge filters that would
>> uncompress the files, but there's no good way to put multiple files on
>> stdout.
>>
>> Has anybody else played with such a scheme for uncompressing files as
>> they go into git and recompressing them as they come back out?
> 
> I attempted to do something like this for OpenDocument files (I didn't get
> very far) until I discovered that LibreOffice can save "flat open document
> files". That combined with the option "save files optimized" switched off
> results in fairly readable XML in a single file that can even be merged
> under some circumstances.
> 
> You would still need a clean filter that normalizes the style numbers,
> cross reference marks and other stuff that changes each time LibreOffice
> saves the file.
> 
> -- Hannes
> 

In general, using "zip -0" is a good way of getting something that
delta-compresses well and that can give a meaningful diff (and has no
information loss).

The (my) problem is that recompressing a zip archive (i.e. multi-file)
is a pita: you can't just use a pipe like "unzip | zip -0". You'd have
to do that in a temp dir.
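
One possible way around the temp dir, sketched here with Python's zipfile
module instead of the zip/unzip tools: read the whole archive from stdin
into memory and write it back with every member stored uncompressed,
which is the effect of "zip -0". The function name store_uncompressed is
made up for this sketch, and it holds the full archive in memory, so it
only suits files of moderate size.

```python
import io
import sys
import zipfile


def store_uncompressed(data: bytes) -> bytes:
    """Return a copy of the zip archive in `data` with all members stored (no compression)."""
    src = zipfile.ZipFile(io.BytesIO(data))
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", compression=zipfile.ZIP_STORED) as dst:
        for info in src.infolist():
            # Keep the original name, timestamp and permissions rather than
            # stamping the member with the current time.
            stored = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            stored.compress_type = zipfile.ZIP_STORED
            stored.external_attr = info.external_attr
            dst.writestr(stored, src.read(info))
    return out.getvalue()


if __name__ == "__main__":
    # git passes the working-tree file on stdin and takes the result from stdout.
    sys.stdout.buffer.write(store_uncompressed(sys.stdin.buffer.read()))
```

A script like this could be set as the clean command via
filter.<name>.clean, with a matching filter=<name> entry in
.gitattributes. Member order is preserved. Whether every application
opens a stored-only archive without a recompressing smudge filter is not
something this sketch checks.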

Michael
