git@vger.kernel.org mailing list mirror (one of many)
* Un-paged commit messages in git filter-branch's commit-filter?
From: Stefan Tauner @ 2016-06-13  6:28 UTC
  To: git

Hello,

I am trying to do a major cleanup of the repository in one of my
projects (and switch from git-svn to native git). I have developed a
commit-filter script over the last months that massages partially
dreadful commit messages into something acceptable. While I am not 100%
sure I think that upgrading git has broken it partially. AFAICT since
the update the commit-filter does not get the original message anymore
but at least the subject/first paragraph is run through a pager or
something similar:
The first line is broken into multiple lines (i.e. some line breaks are
inserted about every 72 characters where none have been before).

I have tried to run "git --no-pager filter-branch ..." to no avail. I
have briefly looked at the source but could not find any evidence of
paging there. Any hints would be appreciated. This is how I run my script:

tmpvar="$(</home/.....sh)" ; git --no-pager filter-branch -f --commit-filter "$tmpvar" --tag-name-filter cat -- HEAD

-- 
Kind regards, Stefan Tauner

* Re: Un-paged commit messages in git filter-branch's commit-filter?
From: Jeff King @ 2016-06-16  9:59 UTC
  To: Stefan Tauner; +Cc: git

On Mon, Jun 13, 2016 at 08:28:18AM +0200, Stefan Tauner wrote:

> I am trying to do a major cleanup of the repository in one of my
> projects (and switch from git-svn to native git). I have developed a
> commit-filter script over the last months that massages partially
> dreadful commit messages into something acceptable. While I am not 100%
> sure I think that upgrading git has broken it partially. AFAICT since
> the update the commit-filter does not get the original message anymore
> but at least the subject/first paragraph is run through a pager or
> something similar:
> The first line is broken into multiple lines (i.e. some line breaks are
> inserted about every 72 characters where none have been before).

There are some output formats that will wrap lines, but by default,
filter-branch should not be using them (and I could not reproduce the
issue in a simple test). Can you show us what your commit-filter looks
like?
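
One display format that does wrap lines is the %w(...) placeholder of
--format. Whether that is the kind of format meant above is an
assumption. A minimal sketch (the function names are made up for
illustration) that contrasts a message as stored with a re-wrapped
rendering of it:

```shell
# show_stored REV: print the message of REV as stored, with no re-wrapping.
show_stored() {
    git log -1 --format='%B' "$1"
}

# show_wrapped REV: print the same message re-wrapped at 72 columns by the
# %w(72) placeholder. Only the display changes; the commit object does not.
show_wrapped() {
    git log -1 --format='%w(72)%B' "$1"
}
```

If show_stored already prints several lines for a commit, the line
breaks are part of the stored message rather than something added for
display.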

-Peff

* Re: Un-paged commit messages in git filter-branch's commit-filter?
From: Stefan Tauner @ 2016-07-31 16:39 UTC
  To: Jeff King; +Cc: git

On Thu, 16 Jun 2016 05:59:47 -0400
Jeff King <peff@peff.net> wrote:

> On Mon, Jun 13, 2016 at 08:28:18AM +0200, Stefan Tauner wrote:
> 
> > I am trying to do a major cleanup of the repository in one of my
> > projects (and switch from git-svn to native git). I have developed a
> > commit-filter script over the last months that massages partially
> > dreadful commit messages into something acceptable. While I am not 100%
> > sure I think that upgrading git has broken it partially. AFAICT since
> > the update the commit-filter does not get the original message anymore
> > but at least the subject/first paragraph is run through a pager or
> > something similar:
> > The first line is broken into multiple lines (i.e. some line breaks are
> > inserted about every 72 characters where none have been before).  
> 
> There are some output formats that will wrap lines, but by default,
> filter-branch should not be using them (and I could not reproduce the
> issue in a simple test). Can you show us what your commit-filter looks
> like?

Thanks for your answer. I have tried to reproduce it in other (newly
created) repositories but failed. However, it seems to relate to some
kind of persistent paging setting, is that possible?
git config -l does not show anything suspicious.

The following commands produce paged output:
git show hash
git show --pretty=%B
git log hash^..hash
Commit message in gitk


These do NOT produce paged output:
git patch hash^..hash
Commit message in gitg 0.2.7


This is the script I tried to use to reproduce the problem:

#!/bin/bash
export LC_ALL=C
input=$(cat)
echo "===========================
$input
===========================" >> /tmp/paging_bug.txt
git commit-tree "$@" -m "$input"

-- 
Kind regards/Mit freundlichen Grüßen, Stefan Tauner

* Re: Un-paged commit messages in git filter-branch's commit-filter?
From: Jeff King @ 2016-08-01 21:36 UTC
  To: Stefan Tauner; +Cc: git

On Sun, Jul 31, 2016 at 06:39:35PM +0200, Stefan Tauner wrote:

> > There are some output formats that will wrap lines, but by default,
> > filter-branch should not be using them (and I could not reproduce the
> > issue in a simple test). Can you show us what your commit-filter looks
> > like?
> 
> Thanks for your answer. I have tried to reproduce it in other (newly
> created) repositories but failed. However, it seems to relate to some
> kind of persistent paging setting, is that possible?
> git config -l does not show anything suspicious.
> 
> The following commands produce paged output:
> git show hash
> git show --pretty=%B
> git log hash^..hash
> Commit message in gitk
> 
> 
> These do NOT produce paged output:
> git patch hash^..hash
> Commit message in gitg 0.2.7

What is "git patch"? An alias for "format-patch"?

> This is the script I tried to use to reproduce the problem:
> 
> #!/bin/bash
> export LC_ALL=C
> input=$(cat)
> echo "===========================
> $input
> ===========================" >> /tmp/paging_bug.txt
> git commit-tree "$@" -m "$input"

Can you be more specific about the input you're feeding to git and the
output you're seeing?

For instance, if I do:

  git init
  echo content >file
  git add file
  git commit -m "$(perl -e 'print join(" ", 1..100)')"

I get a commit message with one long unwrapped line, which I can view
via git-log, etc. Now if I try to run filter-branch on that:

  git filter-branch --commit-filter '
	input=$(cat)
	{
		echo "===================="
		echo "$input"
		echo "===================="
	} >>/tmp/paging_bug.txt
	git commit-tree "$@" -m "$input"
  '

then the commit remains unchanged, and paging_bug shows one long line.
What am I missing?

(I wondered at first if the extra "cat" and "-m" could be messing up
whitespace for you, but it should not, as the quoting around "$input"
should preserve things like newlines. And anyway, the bug in that case
would be the _opposite_; I'd expect it to stuff everything onto a single
line rather than breaking lines).
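
The quoting point can be checked without git at all. A minimal sketch,
in which the three-line message is a made-up stand-in for what a
commit-filter would read from stdin:

```shell
# Hypothetical stand-in for a multi-line commit message read from stdin.
input=$(printf 'line one\nline two\nline three\n')

# Unquoted: the shell splits the value on whitespace and echo rejoins the
# words with single spaces, so the newlines are lost.
unquoted=$(echo $input)

# Quoted: the value is passed as one word and the newlines survive.
quoted=$(printf '%s\n' "$input")

printf 'unquoted: %s line(s)\n' "$(printf '%s\n' "$unquoted" | wc -l | tr -d ' ')"
printf 'quoted:   %s line(s)\n' "$(printf '%s\n' "$quoted" | wc -l | tr -d ' ')"
```

Either way, a quoting mistake can only remove line breaks; it cannot
insert new ones in the middle of a long line.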

-Peff

* Re: Un-paged commit messages in git filter-branch's commit-filter?
From: Stefan Tauner @ 2016-08-01 21:49 UTC
  To: Jeff King; +Cc: git

On Mon, 1 Aug 2016 17:36:31 -0400
Jeff King <peff@peff.net> wrote:

> On Sun, Jul 31, 2016 at 06:39:35PM +0200, Stefan Tauner wrote:
> 
> > > There are some output formats that will wrap lines, but by default,
> > > filter-branch should not be using them (and I could not reproduce the
> > > issue in a simple test). Can you show us what your commit-filter looks
> > > like?  
> > 
> > Thanks for your answer. I have tried to reproduce it in other (newly
> > created) repositories but failed. However, it seems to relate to some
> > kind of persistent paging setting, is that possible?
> > git config -l does not show anything suspicious.
> > 
> > The following commands produce paged output:
> > git show hash
> > git show --pretty=%B
> > git log hash^..hash
> > Commit message in gitk
> > 
> > 
> > These do NOT produce paged output:
> > git patch hash^..hash
> > Commit message in gitg 0.2.7  
> 
> What is "git patch"? An alias for "format-patch"?

Yes, sorry.
And this is the most amazing thing about this behavior: what is so
different between format-patch and log or show --pretty=%B? Shouldn't
these match 100%?

> 
> > This is the script I tried to use to reproduce the problem:
> > 
> > #!/bin/bash
> > export LC_ALL=C
> > input=$(cat)
> > echo "===========================
> > $input
> > ===========================" >> /tmp/paging_bug.txt
> > git commit-tree "$@" -m "$input"  
> 
> Can you be more specific about the input you're feeding to git and the
> output you're seeing?
> 
> For instance, if I do:
> 
>   git init
>   echo content >file
>   git add file
>   git commit -m "$(perl -e 'print join(" ", 1..100)')"
> 
> I get a commit message with one long unwrapped line, which I can view
> via git-log, etc.

That's approximately what I did in my tests as well. And like you, when
I do this in a fresh repository, it works like that..

> Now if I try to run filter-branch on that:
> 
>   git filter-branch --commit-filter '
> 	input=$(cat)
> 	{
> 		echo "===================="
> 		echo "$input"
> 		echo "===================="
> 	} >>/tmp/paging_bug.txt
> 	git commit-tree "$@" -m "$input"
>   '
> 
> then the commit remains unchanged, and paging_bug shows one long line.

as well as filter-branch. That's what I meant when I wrote that I cannot
reproduce it with a new repository (to create an MWE). I wrote the first
mail under the presumption that filter-branch is somehow involved, but
apparently it is not the only affected git command: it already receives
the mangled input, as the commands listed in my last email show.

> What am I missing?
> 
> (I wondered at first if the extra "cat" and "-m" could be messing up
> whitespace for you, but it should not, as the quoting around "$input"
> should preserve things like newlines. And anyway, the bug in that case
> would be the _opposite_; I'd expect it to stuff everything onto a single
> line rather than breaking lines).

The commit messages I try to process are nothing special really... just
very long and not subject-like (because SVN and not giving too much
thought to them sometimes). The only special thing I can think of is
that they have been processed by git-svn earlier.

-- 
Kind regards/Mit freundlichen Grüßen, Stefan Tauner

* Re: Un-paged commit messages in git filter-branch's commit-filter?
From: Jeff King @ 2016-08-01 23:24 UTC
  To: Stefan Tauner; +Cc: git

On Mon, Aug 01, 2016 at 11:49:09PM +0200, Stefan Tauner wrote:

> > For instance, if I do:
> > 
> >   git init
> >   echo content >file
> >   git add file
> >   git commit -m "$(perl -e 'print join(" ", 1..100)')"
> > 
> > I get a commit message with one long unwrapped line, which I can view
> > via git-log, etc.
> 
> That's approximately what I did in my tests as well. And like you, when
> I do this in a fresh repository, it works like that..

One thing to look at, I guess, is whether they are corrupted coming in
to the repository, or when they are being formatted.

If you do:

  git cat-file commit HEAD

you will get the raw bytes of the commit object stored by git. In the
example above, it obviously shows one long line. Have you checked that
it does so in your cases that misbehave?
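
To tell stored line breaks apart from wrapping done by a terminal or a
pager, the raw object can be reduced to its message with every line end
marked. A minimal sketch; both helper names are hypothetical:

```shell
# commit_message: read a raw commit object on stdin (the output of
# "git cat-file commit <rev>") and print only the message, i.e. everything
# after the first empty line, which terminates the header block.
commit_message() {
    sed '1,/^$/d'
}

# show_line_ends: append a literal "$" to every line so that real newlines
# can be distinguished from lines that merely wrap on screen.
show_line_ends() {
    sed 's/$/$/'
}
```

It could be used as
"git cat-file commit HEAD | commit_message | show_line_ends"; each "$"
in the output then marks a newline that is really stored in the object.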

> > (I wondered at first if the extra "cat" and "-m" could be messing up
> > whitespace for you, but it should not, as the quoting around "$input"
> > should preserve things like newlines. And anyway, the bug in that case
> > would be the _opposite_; I'd expect it to stuff everything onto a single
> > line rather than breaking lines).
> 
> The commit messages I try to process are nothing special really... just
> very long and not subject-like (because SVN and not giving too much
> thought to them sometimes). The only special thing I can think of is
> that they have been processed by git-svn earlier.

Hmm. The usual problem with svn-imported commits is not long lines,
exactly, but rather that the commit message has one big paragraph at the
top, rather than a subject/body split.

So when you ask git for the "subject" in such a case, it may paste many
lines together as a single one. For example:

  $ commit=$(seq 1 5 | git commit-tree HEAD^{tree})
  $ git cat-file commit $commit
  tree 07753f428765ac1afe2020b24e40785869bd4a85
  author Jeff King <peff@peff.net> 1470093739 -0400
  committer Jeff King <peff@peff.net> 1470093739 -0400

  1
  2
  3
  4
  5

  $ git log --format=%s $commit
  1 2 3 4 5

So could it be that your lines actually _are_ broken in the git objects,
but "%s" and other tools try to salvage them as a single subject?
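
This can be checked per commit by counting the lines of the first
paragraph. A minimal sketch that works on a message read from stdin;
the function names are hypothetical, and joined_subject only
approximates what "%s" displays:

```shell
# first_paragraph_lines: read a commit message on stdin and print how many
# lines precede the first empty line. 1 means a conventional one-line
# subject; anything larger is a multi-line first paragraph.
first_paragraph_lines() {
    awk '/^$/ { exit } { n++ } END { print n + 0 }'
}

# joined_subject: join the lines of the first paragraph with single spaces,
# roughly the way the subject is shown in the "%s" example above.
joined_subject() {
    awk '/^$/ { exit } { printf "%s%s", sep, $0; sep = " " } END { print "" }'
}
```

For example, "git log -1 --format=%B $commit | first_paragraph_lines"
should report 5 for the commit created above.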

I don't recall offhand whether git-svn does line-wrapping or any other
commit-message munging.

-Peff

* Re: Un-paged commit messages in git filter-branch's commit-filter?
From: Eric Wong @ 2016-08-02  0:07 UTC
  To: Jeff King; +Cc: Stefan Tauner, git

Jeff King <peff@peff.net> wrote:
> I don't recall offhand whether git-svn does line-wrapping or any other
> commit-message munging.

Definitely no line-wrapping.  Munging is minimal:
it respects i18n.commitencoding, adds a trailing newline,
and adds a "git-svn-id:" line.
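
For a one-way move from git-svn to native git, that trailer can be
dropped while the messages are being cleaned up anyway. A minimal
sketch, assuming the repository will no longer be synced with SVN
afterwards; the function name is hypothetical:

```shell
# strip_svn_id: read a commit message on stdin and drop every line that
# starts with "git-svn-id: ". All other lines pass through unchanged.
strip_svn_id() {
    sed '/^git-svn-id: /d'
}
```

Note that git-svn relies on these lines to map commits back to SVN
revisions, so this is only appropriate once that link is no longer
needed.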

* Re: Un-paged commit messages in git filter-branch's commit-filter?
From: Stefan Tauner @ 2016-08-06  9:40 UTC
  To: Jeff King; +Cc: git

On Mon, 1 Aug 2016 19:24:29 -0400
Jeff King <peff@peff.net> wrote:

> So could it be that your lines actually _are_ broken in the git objects,
> but "%s" and other tools try to salvage them as a single subject?

YES! :)
Thanks so much! I was apparently ignoring this trivial explanation
because I was too convinced that the actual commits had missing
line breaks and only something in git was adding them. But it is
actually the other way around as you said: the few commands that print
the overlong lines are those that rely on %s. gitg for example parses
the output of the following:
git show --numstat --pretty="format:%s%n%n%b%n\x01"
I was not aware that %s lumps lines together if it does not find a
proper subject on the first line.
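
One possible way to deal with such messages in a commit-filter is to
promote the first line to a subject by inserting an empty line after
it. This is only a sketch of one approach, not necessarily the right
clean-up for every message; the function name is hypothetical:

```shell
# split_subject: read a commit message on stdin. If the second line is not
# empty, insert an empty line after the first line so that only the first
# line forms the subject. Messages that already have a subject/body split,
# and one-line messages, pass through unchanged.
split_subject() {
    awk 'NR == 2 && $0 != "" { print "" } { print }'
}
```

Inside a commit-filter, the result could be piped straight into
'git commit-tree "$@"', which reads the message from stdin when neither
-m nor -F is given.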

I can work with that now. Thanks again!
-- 
Kind regards/Mit freundlichen Grüßen, Stefan Tauner
