git@vger.kernel.org mailing list mirror (one of many)
* rev-list pretty format behavior
From: Oliver Runge @ 2015-04-04 23:27 UTC
  To: git

Heyup, everybody.

Apologies if this turns out to be a duplicate. Gmane seems broken, so
I couldn't search the archive.

I'm using git version 2.4.0-rc1. The same behavior exists in 2.1.0.

With git-log it is possible to specify a custom pretty format that
outputs one line per commit:
> git log --pretty=format:"%h ..." HEAD~3...HEAD
826aed5 ...
915e44c ...
067178e ...

Trying the same with rev-list results in:
> git rev-list --pretty=format:"%h ..." HEAD~3...HEAD
commit 826aed50cbb072d8f159e4c8ba0f9bd3df21a234
826aed5 ...
commit 915e44c6357f3bd9d5fa498a201872c4367302d3
915e44c ...
commit 067178ed8a7822e6bc88ad606b707fc33658e6fc
067178e ...

Is the separate "commit <hash>" line mandatory for all formats except
"oneline", or is it possibly a bug?
Based on the git-rev-list man page and git-log, I would expect to be
able to override the format as described, since it is possible to get
the "commit <hash>" line for any format by prefixing it with "commit
%H".

The only way to get similar behaviour is to do something like:
> git rev-list ... | grep -v '^commit'
and that's quite hacky.
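
A slightly stricter variant of that workaround, as a minimal sketch: it
drops only lines that consist of "commit " plus a full 40-hex object
name, so formatted lines that merely start with the word "commit"
survive. The function name strip_commit_headers is made up for
illustration, and the pattern assumes plain header lines (no boundary or
left/right markers in front of the hash).

```shell
# Remove only the header lines that rev-list prints before each
# user-formatted record ("commit " followed by exactly 40 hex digits).
# Reads rev-list output on stdin, writes the remaining lines to stdout.
strip_commit_headers() {
    grep -Ev '^commit [0-9a-f]{40}$'
}
```

It would be used as
git rev-list --pretty=format:"%h ..." HEAD~3...HEAD | strip_commit_headers.
Newer Git releases also document a --no-commit-header option for
rev-list that suppresses this line for custom formats; check
git-rev-list(1) for the installed version before relying on it.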

I looked at the code and the flow in rev-list seems odd to me. The
header_prefix is set outside of show_commit(); it is empty for
format=oneline, but set to "commit " for any other format. It is then
printed inside show_commit(), followed by the (possibly abbreviated)
hash. So the display logic is split across two places, neither of which
knows much about the other, and both make decisions based on the pretty
format specified.

Even if the behavior is correct, would you agree that this could be
refactored a bit, so the output is less stitched together?
I'd happily try to help refactoring or fixing it, if it is indeed a bug.

Thanks for your ears!
  Oliver


* Re: rev-list pretty format behavior
From: Junio C Hamano @ 2015-04-05 21:12 UTC
  To: Oliver Runge; +Cc: git

Oliver Runge <oliver.runge@gmail.com> writes:

> I'm using git version 2.4.0-rc1. The same behavior exists in 2.1.0.
>
> Trying the same with rev-list results in:
>> git rev-list --pretty=format:"%h ..." HEAD~3...HEAD
> commit 826aed50cbb072d8f159e4c8ba0f9bd3df21a234
> 826aed5 ...
> commit 915e44c6357f3bd9d5fa498a201872c4367302d3
> 915e44c ...
> commit 067178ed8a7822e6bc88ad606b707fc33658e6fc
> 067178e ...

This is very much the designed behaviour, I would think.  IIRC, the
user-format support of "rev-list" was designed so that the scripts
can customize the output from "rev-list -v", which was how scripts
were expected to read various pieces of information for each commit
originally.  And the 40-hex commit object name and/or a line that
begins with "commit ..." when a user format is used are meant to
serve as a stable record separator (in that sense, having %H or %h in
the userformat given to rev-list is redundant) when these scripts
are reading output from "rev-list".
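
To illustrate the record-separator role described above, here is a
minimal sketch of how a script might fold a multi-line user format into
one row per commit by keying on the "commit <hash>" line. The function
name records_to_tsv is hypothetical, and the sketch assumes that no
formatted line itself looks like "commit <hex>".

```shell
# Turn rev-list output of the shape
#   commit <hash>
#   <formatted line 1>
#   <formatted line 2>
#   ...
# into one tab-separated row per commit: <hash> TAB <line 1> TAB <line 2> ...
records_to_tsv() {
    awk '
        /^commit [0-9a-f]+$/ {
            # A new record starts: flush the previous one, if any.
            if (seen) print rec
            rec = $2
            seen = 1
            next
        }
        # Any other line belongs to the current record.
        { rec = rec "\t" $0 }
        END { if (seen) print rec }
    '
}
```

A call such as
git rev-list --pretty=format:"%at%n%an%n%s" -n 10 HEAD | records_to_tsv
would then yield one line per commit regardless of how many lines the
format produces.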

A new option to tell "rev-list" that "I am designing an output that
is a-line-per-commit with the userformat and do not need the default
record separator" or "I will arrange record separator myself" would
be an acceptable thing to add, provided that many scripts yet to be
written would benefit from such a feature, though.


* Re: rev-list pretty format behavior
From: Oliver Runge @ 2015-04-06 11:05 UTC
  To: Junio C Hamano; +Cc: git

Hallo, Mr. Hamano.

Thank you for your quick and detailed response.

On 5 April 2015 at 23:12, Junio C Hamano <gitster@pobox.com> wrote:
> This is very much the designed behaviour, I would think.  IIRC, the
> user-format support of "rev-list" was designed so that the scripts
> can customize the output from "rev-list -v", which was how scripts
> were expected to read various pieces of information for each commit
> originally.  And the 40-hex commit object name and/or a line that
> begins with "commit ..." when a user format is used are meant to
> serve as a stable record separator (in that sense, having %H or %h in
> the userformat given to rev-list is redundant) when these scripts
> are reading output from "rev-list".

I see, but then I find it even stranger, because "rev-list -v" without
a "pretty" parameter only outputs the hash as a separator, and "commit
<sha1>" is only introduced if a "pretty" parameter other than
"oneline" is specified. The documentation states that the formatting is
intended to make "git rev-list" behave more like "git log", and apart
from the pretty settings "email" and "format"/"tformat" (which don't
have "commit <sha1>" in "git log") the formatting works exactly like it
does in "git log".

Documentation:
------------------------------------------
Commit Formatting
       Using these options, git-rev-list(1) will act similar to the more
       specialized family of commit log tools: git-log(1), git-show(1),
       and git-whatchanged(1)
------------------------------------------
and
------------------------------------------
- format:<string>
The format:<string> format allows you to specify which information you
want to show. It works a little bit like printf format, with the
notable exception that you get a newline with %n instead of \n.
E.g, format:"The author of %h was %an, %ar%nThe title was >>%s<<%n"
would show something like this:

    The author of fe6e0ee was Junio C Hamano, 23 hours ago
    The title was >>t4119: test autocomputing -p<n> for traditional diff input.<<
------------------------------------------

> A new option to tell "rev-list" that "I am designing an output that
> is a-line-per-commit with the userformat and do not need the default
> record separator" or "I will arrange record separator myself" would
> be an acceptable thing to add, provided that many scripts yet to be
> written would benefit from such a feature, though.

I searched github for usages of "git rev-list --pretty=format" to see
whether I'm alone. I realize this is merely anecdotal, but perhaps
still useful.

Scripts ignoring the separator:
------------------------------------------
# no idea why it always prints those commit lines
git rev-list --pretty=format:" - %s" "$@" |grep -v ^commit
------------------------------------------

------------------------------------------
git rev-list --pretty=format:"%H %h|%an:%s" "$@" |
sed -n "s/^\([0-9a-f]\{40\}\) \(.*\)$/n\1 [$shape label=\"{\2}\"]/p"
------------------------------------------

(shortened with "..." by me)
------------------------------------------
git rev-list --pretty=format:"%H %h %d" "$@" | awk '
...
!/^commit/ {
...
}'
------------------------------------------

Most of the scripts I found hack around the "commit <sha1>" lines,
mostly in a way that would still work if the lines suddenly weren't
there anymore. But unfortunately there are also some examples that
would break:
------------------------------------------
git rev-list --oneline \
--pretty=format:"%C(yellow)%h %C(red)%ad%C(green)%d %C(reset)%s%C(cyan) [%cn]" \
--date=short HEAD~2..HEAD | awk 'NR % 2 == 0'
------------------------------------------

And finally there are a few that really use the current behavior:
------------------------------------------
# tcl
set revisions [$::versioned_interpreter git rev-list \
    "--pretty=format:%at%n%an <%ae>%n%s" -n 10 $revision]
set result {}
foreach {commit date author summary} [split $revisions \n] {
    lappend result [list [lindex $commit 1] $date $author $summary]
}
------------------------------------------

(shortened with "..." by me)
------------------------------------------
save()
{
    awk '{print $2 " '$1'" }' | sort >$R/sha/$1
}
...
make_sha()
{
    git rev-list --pretty=format: ^Research-V6 BSD-1 | save BSD-1
    git rev-list --pretty=format: ^BSD-1 BSD-2 | save BSD-2
    ...
}
------------------------------------------
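
The pattern in that last script (an empty format so that only the
"commit <hash>" lines remain, then taking the second field) can be
sketched without splicing $1 into the awk program text. This is only an
illustration: label_hashes is a hypothetical name, and the assumption
that an empty "format:" leaves nothing but the header lines is taken
from how the script above uses it, not verified here.

```shell
# Read "commit <hash>" lines on stdin and print "<hash> <label>" rows,
# sorted. The label is passed to awk as a variable instead of being
# pasted into the program source, so unusual characters in it are safe.
label_hashes() {
    awk -v label="$1" '$1 == "commit" { print $2, label }' | sort
}
```

With that, a line like
git rev-list --pretty=format: ^BSD-1 BSD-2 | label_hashes BSD-2
would produce the same kind of "hash label" list that save() writes.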

I really feel that omitting the separator should be the default
behavior for "format", since the separator's purpose isn't described in
the documentation and isn't really needed for scripts that want to
provide their own formatting. That being said, I understand that this is
unlikely to happen, especially since it would break quite a few legacy
scripts.

But it would be prudent to update the documentation to highlight the
different behavior for the pretty settings "email" and
"format"/"tformat". And even though I think another option to turn off
the separator lines makes the command more complex, the fact that so
many scripts seem to work around the behavior might justify it.

I'd like to help with both tasks, if you think they are reasonable.

Oliver


* Re: rev-list pretty format behavior
From: Michael J Gruber @ 2015-04-07 13:53 UTC
  To: Oliver Runge, Junio C Hamano; +Cc: git

I'm wondering what the difference is - or should be - between "git log"
and "git rev-list" with (completely) user specified output. That
question goes both ways:

- Why do we need "rev-list" to have completely flexible output when we
have "log" with such flexibility?

- Why do we even have pretty formats for "rev-list"?

I'm thinking of rev-list as a raw (plumbing) revision lister much like
cat-file is the inspection tool for the objects, and log as the human
facing output with appropriate defaults (resp. show).

Note that "rev-list -v" isn't even documented afaics.

Michael


* Re: rev-list pretty format behavior
From: Oliver Runge @ 2015-04-08 17:12 UTC
  To: Michael J Gruber; +Cc: Junio C Hamano, git

Heyup, Dr. Gruber.

On 7 April 2015 at 15:53, Michael J Gruber <git@drmicha.warpmail.net> wrote:
> I'm wondering what the difference is - or should be - between "git log"
> and "git rev-list" with (completely) user specified output. That
> question goes both ways:
>
> - Why do we need "rev-list" to have completely flexible output when we
> have "log" with such flexibility?
>
> - Why do we even have pretty formats for "rev-list"?
>
> I'm thinking of rev-list as a raw (plumbing) revision lister much like
> cat-file is the inspection tool for the objects, and log as the human
> facing output with appropriate defaults (resp. show).
>
> Note that "rev-list -v" isn't even documented afaics.

I can't answer your questions, because I don't have a very deep
understanding of either command, but according to the "log"
documentation, formatting really belongs to "rev-list", and "log" only
adds the diff-* features:
------------------------------------------
The command takes options applicable to the git rev-list command
to control what is shown and how, and options applicable to the
git diff-* commands to control how the changes each commit
introduces are shown.
------------------------------------------

I also feel that "pretty" is perhaps a bit of a misnomer that is
naturally associated with "human readable", but the formatting is vital
for any raw output that scripts can process.
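
As a small sketch of what "raw output that scripts can process" can look
like: a user format with an explicit field delimiter (here a tab,
written as %x09 in the format string) is easy to split in plain shell.
The function name summarize_commits and the chosen field order are
assumptions made for this example only.

```shell
# Read lines of the form "<hash> TAB <author> TAB <subject>" on stdin
# and print "<author>: <subject> (<first 7 hash characters>)".
# Note: tab is an IFS whitespace character, so empty fields collapse;
# the last variable (subject) receives the rest of the line.
summarize_commits() {
    tab=$(printf '\t')
    while IFS="$tab" read -r hash author subject; do
        printf '%s: %s (%.7s)\n' "$author" "$subject" "$hash"
    done
}
```

Input for it could come from
git log --pretty=tformat:'%H%x09%an%x09%s' | summarize_commits.
The "tformat:" variant is used here because it terminates every record
with a newline, whereas "format:" only separates records and would leave
the last line unterminated, which "read" then skips.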

Oliver
