git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [RFC PATCH 0/2] extend --abbrev support to diff-patch format
@ 2020-08-09  2:19 Đoàn Trần Công Danh
  2020-08-09  2:19 ` [RFC PATCH 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
                   ` (6 more replies)
  0 siblings, 7 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-09  2:19 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

For some reason, projects tracks local copy of released version of other
projects, and backports change to earlier version.

Due to lower number of object in simplified history,
abbreviated object id has fewer characters, thus generate some noise
when projects try to compare the backported patch with original patch
if the file hunk is an exact match.

This series try to lower that noise.
Since this is very localised use case,
and the noise is not completely eliminated,
other experienced developers may have better opinions.

Đoàn Trần Công Danh (2):
  revision: differentiate if --no-abbrev asked explicitly
  diff: extend --abbrev support to diff-patch format

 Documentation/diff-options.txt                |  9 +++---
 diff.c                                        |  5 +++-
 revision.c                                    |  2 +-
 t/t4013-diff-various.sh                       |  3 ++
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
 7 files changed, 100 insertions(+), 6 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

-- 
2.28.0.215.g32ffa52ee0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [RFC PATCH 1/2] revision: differentiate if --no-abbrev asked explicitly
  2020-08-09  2:19 [RFC PATCH 0/2] extend --abbrev support to diff-patch format Đoàn Trần Công Danh
@ 2020-08-09  2:19 ` Đoàn Trần Công Danh
  2020-08-09  2:19 ` [RFC PATCH 2/2] diff: extend --abbrev support to diff-patch format Đoàn Trần Công Danh
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-09  2:19 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

When we see `--no-abbrev' in command's arguments, we reset `abbrev' of
`diff_options` to 0, thus, on a later stage, the object id won't
be shortened (by not set object_id[abbrev] to '\0').

While not doing anything is very effective way to show full object id,
we couldn't differentiate if --no-abbrev or not.

In a later change, we want to extend --abbrev support to diff-patch
format.

Let's ask for full object id if we see --no-abbrev instead.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 revision.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index 6de29cdf7a..69cc834662 100644
--- a/revision.c
+++ b/revision.c
@@ -2432,7 +2432,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--always")) {
 		revs->always_show_header = 1;
 	} else if (!strcmp(arg, "--no-abbrev")) {
-		revs->abbrev = 0;
+		revs->abbrev = hexsz;
 	} else if (!strcmp(arg, "--abbrev")) {
 		revs->abbrev = DEFAULT_ABBREV;
 	} else if (skip_prefix(arg, "--abbrev=", &optarg)) {
-- 
2.28.0.215.g32ffa52ee0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [RFC PATCH 2/2] diff: extend --abbrev support to diff-patch format
  2020-08-09  2:19 [RFC PATCH 0/2] extend --abbrev support to diff-patch format Đoàn Trần Công Danh
  2020-08-09  2:19 ` [RFC PATCH 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
@ 2020-08-09  2:19 ` Đoàn Trần Công Danh
  2020-08-09 19:01 ` [RFC PATCH 0/2] " Junio C Hamano
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-09  2:19 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---

Not sure if this part should go to the commit message or not:

To preserve backward compatibility with old script that specify both
--full-index and --abbrev, always shows full object id if --full-index
is specified.

 Documentation/diff-options.txt                |  9 +++---
 diff.c                                        |  5 +++-
 t/t4013-diff-various.sh                       |  3 ++
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
 6 files changed, 99 insertions(+), 5 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 7987d72b02..c11efa7865 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -441,10 +441,11 @@ endif::git-format-patch[]
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
 	name in diff-raw format output and diff-tree header
-	lines, show only a partial prefix.  This is
-	independent of the `--full-index` option above, which controls
-	the diff-patch output format.  Non default number of
-	digits can be specified with `--abbrev=<n>`.
+	lines, show only a partial prefix.
+	In diff-patch output format, `--full-index` takes higher
+	precedent, i.e. if `--full-index` is specified, full blob
+	names will be shown regardless of `--abbrev`.
+	Non default number of digits can be specified with `--abbrev=<n>`.
 
 -B[<n>][/<m>]::
 --break-rewrites[=[<n>][/<m>]]::
diff --git a/diff.c b/diff.c
index d24aaa3047..b5dedb0165 100644
--- a/diff.c
+++ b/diff.c
@@ -4319,7 +4319,10 @@ static void fill_metainfo(struct strbuf *msg,
 	}
 	if (one && two && !oideq(&one->oid, &two->oid)) {
 		const unsigned hexsz = the_hash_algo->hexsz;
-		int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
+		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
+
+		if (o->flags.full_index)
+			abbrev = hexsz;
 
 		if (o->flags.binary) {
 			mmfile_t mf;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 3f60f7d96c..e6eb4dd4c7 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -221,6 +221,9 @@ diff-tree --root -r --abbrev=4 initial
 :noellipses diff-tree --root -r --abbrev=4 initial
 diff-tree -p initial
 diff-tree --root -p initial
+diff-tree --root -p --abbrev=10 initial
+diff-tree --root -p --full-index initial
+diff-tree --root -p --full-index --abbrev=10 initial
 diff-tree --patch-with-stat initial
 diff-tree --root --patch-with-stat initial
 diff-tree --patch-with-raw initial
diff --git a/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
new file mode 100644
index 0000000000..7518a9044e
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000..35d242ba79
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
new file mode 100644
index 0000000000..69f913fbe5
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --full-index --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
new file mode 100644
index 0000000000..1b0b6717fa
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --full-index initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
-- 
2.28.0.215.g32ffa52ee0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 0/2] extend --abbrev support to diff-patch format
  2020-08-09  2:19 [RFC PATCH 0/2] extend --abbrev support to diff-patch format Đoàn Trần Công Danh
  2020-08-09  2:19 ` [RFC PATCH 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
  2020-08-09  2:19 ` [RFC PATCH 2/2] diff: extend --abbrev support to diff-patch format Đoàn Trần Công Danh
@ 2020-08-09 19:01 ` Junio C Hamano
  2020-08-10 10:00   ` Jeff King
  2020-08-11 11:49 ` [PATCH v2 0/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2020-08-09 19:01 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> Đoàn Trần Công Danh (2):
>   revision: differentiate if --no-abbrev asked explicitly
>   diff: extend --abbrev support to diff-patch format

It was not clear, at least to me at all, what these patches are
trying to achieve (i.e. what end-users appreciate) until I saw the
code change X-<.

The changes to fill_metainfo() make sense to me.  It just needs log
messages that explain the intent better.  They do not even make it
clear that they want to make the abbreviation length of the object
names on the "index $from..$to $mode" lines configurable.

Thanks.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 0/2] extend --abbrev support to diff-patch format
  2020-08-09 19:01 ` [RFC PATCH 0/2] " Junio C Hamano
@ 2020-08-10 10:00   ` Jeff King
  2020-08-10 12:31     ` Đoàn Trần Công Danh
  2020-08-10 15:06     ` Junio C Hamano
  0 siblings, 2 replies; 41+ messages in thread
From: Jeff King @ 2020-08-10 10:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Đoàn Trần Công Danh, git

On Sun, Aug 09, 2020 at 12:01:35PM -0700, Junio C Hamano wrote:

> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
> 
> > Đoàn Trần Công Danh (2):
> >   revision: differentiate if --no-abbrev asked explicitly
> >   diff: extend --abbrev support to diff-patch format
> 
> It was not clear, at least to me at all, what these patches are
> trying to achieve (i.e. what end-users appreciate) until I saw the
> code change X-<.
> 
> The changes to fill_metainfo() make sense to me.  It just needs log
> messages that explain the intent better.  They do not even make it
> clear that they want to make the abbreviation length of the object
> names on the "index $from..$to $mode" lines configurable.

After reading the original including cover letter, I'm still confused
using why --full-index is not the solution for most cases. Perhaps that
would be worth touching on, as well.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 0/2] extend --abbrev support to diff-patch format
  2020-08-10 10:00   ` Jeff King
@ 2020-08-10 12:31     ` Đoàn Trần Công Danh
  2020-08-10 15:15       ` Junio C Hamano
  2020-08-10 15:06     ` Junio C Hamano
  1 sibling, 1 reply; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-10 12:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On 2020-08-10 06:00:38-0400, Jeff King <peff@peff.net> wrote:
> On Sun, Aug 09, 2020 at 12:01:35PM -0700, Junio C Hamano wrote:
> 
> > Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
> > 
> > > Đoàn Trần Công Danh (2):
> > >   revision: differentiate if --no-abbrev asked explicitly
> > >   diff: extend --abbrev support to diff-patch format
> > 
> > It was not clear, at least to me at all, what these patches are
> > trying to achieve (i.e. what end-users appreciate) until I saw the
> > code change X-<.
> > 
> > The changes to fill_metainfo() make sense to me.  It just needs log
> > messages that explain the intent better.  They do not even make it
> > clear that they want to make the abbreviation length of the object
> > names on the "index $from..$to $mode" lines configurable.
> 
> After reading the original including cover letter, I'm still confused
> using why --full-index is not the solution for most cases. Perhaps that
> would be worth touching on, as well.

At that time, I'm not really sure what should be written there.
The commit message was inspired by --abbrev documentation.

Reading both of your two's emails, I think this one could be used:
I'll resend this serie if this serie is deemed useful with this
explaination.

	diff: index-line: make object name's abbrev length configurable

	There are some projects that want to archive and track only
	released version of other software projects. They also want
	to backport some changes into those versions unsupported by
	upstream. Most of git hosting services support some method to
	download patches without cloning the full (potentially large)
	repository and/or fiddling with git partial-clone or
	sparse-checkout.

	Most of those git hosting services generate those patches with
	git-format-patch(1) or something alike. Due to its large
	amount of objects, their object names' abbreviation in the
	index-line is usually long but not full.

	A lot of those patches couldn't be applied cleanly to old
	versions of said software, thus requires some changes from
	developer and they needs to be regenerated from their trimmed
	tree. Because the archive tree has significantly fewer
	objects, the abbreviation in the index line is usually shorter
	than the original patch. Thus, it generates some noise when
	said developers try to compare the new patch with the original
	patch if there's an exact file-hunk match.

	Make the object name's abbreviation length configurable to
	lower those noise.

	<Below is the  note in 2/2, I don't know if it's worth put
	into commit message>

	To preserve backward compatibility with old script that specify
	both --full-index and --abbrev, always shows full object id
	if --full-index is specified.


-- 
Danh

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 0/2] extend --abbrev support to diff-patch format
  2020-08-10 10:00   ` Jeff King
  2020-08-10 12:31     ` Đoàn Trần Công Danh
@ 2020-08-10 15:06     ` Junio C Hamano
  1 sibling, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2020-08-10 15:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Đoàn Trần Công Danh, git

Jeff King <peff@peff.net> writes:

> On Sun, Aug 09, 2020 at 12:01:35PM -0700, Junio C Hamano wrote:
>
>> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
>> 
>> > Đoàn Trần Công Danh (2):
>> >   revision: differentiate if --no-abbrev asked explicitly
>> >   diff: extend --abbrev support to diff-patch format
>> 
>> It was not clear, at least to me at all, what these patches are
>> trying to achieve (i.e. what end-users appreciate) until I saw the
>> code change X-<.
>> 
>> The changes to fill_metainfo() make sense to me.  It just needs log
>> messages that explain the intent better.  They do not even make it
>> clear that they want to make the abbreviation length of the object
>> names on the "index $from..$to $mode" lines configurable.
>
> After reading the original including cover letter, I'm still confused
> using why --full-index is not the solution for most cases. Perhaps that
> would be worth touching on, as well.

True.  

Presumably you could force some stability without sacrificing line
length limit by using --abbrev=12 instead of --full-index but I do
not think it is such a big deal.  But it does look odd that we use
a special/single-purpose option --full-index to control the length
only for those two object names on the "index" line, when all the
other object names we see are controlled with the --abbrev option.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 0/2] extend --abbrev support to diff-patch format
  2020-08-10 12:31     ` Đoàn Trần Công Danh
@ 2020-08-10 15:15       ` Junio C Hamano
  2020-08-10 15:27         ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2020-08-10 15:15 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: Jeff King, git

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> Reading both of your two's emails, I think this one could be used:
> I'll resend this serie if this serie is deemed useful with this
> explaination.
>
> 	diff: index-line: make object name's abbrev length configurable
>
> 	There are some projects that want to archive and track only
> 	released version of other software projects. They also want
> 	to backport some changes into those versions unsupported by
> 	upstream. Most of git hosting services support some method to
> 	download patches without cloning the full (potentially large)
> 	repository and/or fiddling with git partial-clone or
> 	sparse-checkout.
>
> 	Most of those git hosting services generate those patches with
> 	git-format-patch(1) or something alike. Due to its large
> 	amount of objects, their object names' abbreviation in the
> 	index-line is usually long but not full.
>
> 	A lot of those patches couldn't be applied cleanly to old
> 	versions of said software, thus requires some changes from
> 	developer and they needs to be regenerated from their trimmed
> 	tree. Because the archive tree has significantly fewer
> 	objects, the abbreviation in the index line is usually shorter
> 	than the original patch. Thus, it generates some noise when
> 	said developers try to compare the new patch with the original
> 	patch if there's an exact file-hunk match.
>
> 	Make the object name's abbreviation length configurable to
> 	lower those noise.

I agree with Peff that with the above as the sole motivating use
case, the "--full-index" option is the right approach.  It is a much
more robust solution than "--abbrev=16 would be long enough for all
project participants to avoid length drift".  IOW these four
paragraphs do not argue _for_ this change, at least to me.

On the other hand, I think you could argue that "--full-index" is
merely a synonym for "--abbrev=40", and the patch fixes the
inconsistency between the object names on the "index" line, which
can choose only between the default abbrev length and the full
abbrev length, and all the other places we show object names, which
uniformly honor the "--abbrev" option.


> 	<Below is the  note in 2/2, I don't know if it's worth put
> 	into commit message>
>
> 	To preserve backward compatibility with old script that specify
> 	both --full-index and --abbrev, always shows full object id
> 	if --full-index is specified.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 0/2] extend --abbrev support to diff-patch format
  2020-08-10 15:15       ` Junio C Hamano
@ 2020-08-10 15:27         ` Jeff King
  2020-08-11  0:33           ` Đoàn Trần Công Danh
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2020-08-10 15:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Đoàn Trần Công Danh, git

On Mon, Aug 10, 2020 at 08:15:41AM -0700, Junio C Hamano wrote:

> > 	A lot of those patches couldn't be applied cleanly to old
> > 	versions of said software, thus requires some changes from
> > 	developer and they needs to be regenerated from their trimmed
> > 	tree. Because the archive tree has significantly fewer
> > 	objects, the abbreviation in the index line is usually shorter
> > 	than the original patch. Thus, it generates some noise when
> > 	said developers try to compare the new patch with the original
> > 	patch if there's an exact file-hunk match.
> >
> > 	Make the object name's abbreviation length configurable to
> > 	lower those noise.
> 
> I agree with Peff that with the above as the sole motivating use
> case, the "--full-index" option is the right approach.  It is a much
> more robust solution than "--abbrev=16 would be long enough for all
> project participants to avoid length drift".  IOW these four
> paragraphs do not argue _for_ this change, at least to me.

Yeah, that's what I was getting at: if you care about robust
machine-readability, then the full index is the best solution. Reading
between the lines, I think the argument may be "using --full-index is
too long and therefore ugly, so people like the short-ish names but with
a bit of extra safety".

There's an extra challenge here, which is that you have to convince the
sender to use the extra --abbrev option, even though they themselves
won't be the ones running into the problem when applying. But I don't
think there's an elegant solution to that (we could just bump the
default abbrev everywhere to 12+, which is enough in practice).

Though I'm not 100% sure that "git apply" is smart enough to only look
at blobs (i.e., if "1234abcd" is ambiguous between a tree and a blob,
ignore the tree since patches always apply to blobs). That might be
another avenue that would make things more likely to work without
anybody having to configure anything.

> On the other hand, I think you could argue that "--full-index" is
> merely a synonym for "--abbrev=40", and the patch fixes the
> inconsistency between the object names on the "index" line, which
> can choose only between the default abbrev length and the full
> abbrev length, and all the other places we show object names, which
> uniformly honor the "--abbrev" option.

Yeah, I certainly don't mind the extra flexibility between "full" and
"default" for "index" lines. I do wonder if people want to configure the
abbreviations for those lines separately from other parts. I don't know
that I've ever particularly cared about that flexibility, but the fact
that they were set up separately all those years ago makes me think
somebody might.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 0/2] extend --abbrev support to diff-patch format
  2020-08-10 15:27         ` Jeff King
@ 2020-08-11  0:33           ` Đoàn Trần Công Danh
  2020-08-11  5:22             ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-11  0:33 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On 2020-08-10 11:27:05-0400, Jeff King <peff@peff.net> wrote:
> On Mon, Aug 10, 2020 at 08:15:41AM -0700, Junio C Hamano wrote:
> 
> > > 	A lot of those patches couldn't be applied cleanly to old
> > > 	versions of said software, thus requires some changes from
> > > 	developer and they needs to be regenerated from their trimmed
> > > 	tree. Because the archive tree has significantly fewer
> > > 	objects, the abbreviation in the index line is usually shorter
> > > 	than the original patch. Thus, it generates some noise when
> > > 	said developers try to compare the new patch with the original
> > > 	patch if there's an exact file-hunk match.
> > >
> > > 	Make the object name's abbreviation length configurable to
> > > 	lower those noise.
> > 
> > I agree with Peff that with the above as the sole motivating use
> > case, the "--full-index" option is the right approach.  It is a much
> > more robust solution than "--abbrev=16 would be long enough for all
> > project participants to avoid length drift".  IOW these four
> > paragraphs do not argue _for_ this change, at least to me.
> 
> Yeah, that's what I was getting at: if you care about robust
> machine-readability, then the full index is the best solution. Reading
> between the lines, I think the argument may be "using --full-index is
> too long and therefore ugly, so people like the short-ish names but with
> a bit of extra safety".

My argument was people can either easily fetch the patch via HTTP like:

	curl -LO https://github.com/git/git/commit/eb12adc74cf22add318f884072be2071d181abaa.patch

or take it from a mailing list archive, bugzilla, instead of
cloning a full repository. With those options, we can't say,
"we prefer full-index, please send us the patch with full-index
instead".

> 
> There's an extra challenge here, which is that you have to convince the
> sender to use the extra --abbrev option, even though they themselves
> won't be the ones running into the problem when applying.

Not really, since the sender tree is usually larger than the archived
tree, their abbrev is usually long enough, and the receiver will use
--abbrev to lengthen their abbrev to reduce the noise instead.

> But I don't
> think there's an elegant solution to that (we could just bump the
> default abbrev everywhere to 12+, which is enough in practice).
> Though I'm not 100% sure that "git apply" is smart enough to only look
> at blobs (i.e., if "1234abcd" is ambiguous between a tree and a blob,
> ignore the tree since patches always apply to blobs). That might be
> another avenue that would make things more likely to work without
> anybody having to configure anything.
> 
> > On the other hand, I think you could argue that "--full-index" is
> > merely a synonym for "--abbrev=40", and the patch fixes the
> > inconsistency between the object names on the "index" line, which
> > can choose only between the default abbrev length and the full
> > abbrev length, and all the other places we show object names, which
> > uniformly honor the "--abbrev" option.

I think this argument could be a way to go.
In fact, I always try to use --abbrev with diff family because I know
it works with a handful with other tools, (describe, blame), then
I surprise that it doesn't work, and the documentation tells me
`--abbrev` only works with diff-raw and diff-tree header line.

Then, I keep forgetting that documentation, and I tried again.

For now, I filtered out the index line before comparing 2 patches.

> Yeah, I certainly don't mind the extra flexibility between "full" and
> "default" for "index" lines. I do wonder if people want to configure the
> abbreviations for those lines separately from other parts. I don't know
> that I've ever particularly cared about that flexibility, but the fact
> that they were set up separately all those years ago makes me think
> somebody might.

I don't think people particularly care about the index line (and to
the extent, its length) that much, since the default is number is
actually a minimum number, if Git can't differentiate object with that
number of characters, Git will show a longer object names anyway.

I think most people scripts will put a regex for:

	/index [a-z0-9]{7,}\.\.[a-z0-9]{7,} [0-7]{6}/

Or even:

	/index [a-z0-9]+\.\.[a-z0-9]+ [0-7]+/

For the former case, we could change the code in 2/2 to set the minimum
default to DEFAULT_ABBREV instead of MINIMUM_ABBREV?

For the historical case that users put both --full-index and --abbrev
into there scripts, we still keep our promise to not break their
script by always respect --full-index, regardless of --abbrev.

-- 
Danh

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 0/2] extend --abbrev support to diff-patch format
  2020-08-11  0:33           ` Đoàn Trần Công Danh
@ 2020-08-11  5:22             ` Jeff King
  2020-08-11 12:07               ` Đoàn Trần Công Danh
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2020-08-11  5:22 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: Junio C Hamano, git

On Tue, Aug 11, 2020 at 07:33:59AM +0700, Đoàn Trần Công Danh wrote:

> > Yeah, that's what I was getting at: if you care about robust
> > machine-readability, then the full index is the best solution. Reading
> > between the lines, I think the argument may be "using --full-index is
> > too long and therefore ugly, so people like the short-ish names but with
> > a bit of extra safety".
> 
> My argument was people can either easily fetch the patch via HTTP like:
> 
> 	curl -LO https://github.com/git/git/commit/eb12adc74cf22add318f884072be2071d181abaa.patch
> 
> or take it from a mailing list archive, bugzilla, instead of
> cloning a full repository. With those options, we can't say,
> "we prefer full-index, please send us the patch with full-index
> instead".

OK. But then how would they use "--abbrev" in that case? I.e., isn't it
too late at that point (especially in the mailing list archive case) to
do change anything in the formatting of the patch?

Maybe I'm confused...

> > There's an extra challenge here, which is that you have to convince the
> > sender to use the extra --abbrev option, even though they themselves
> > won't be the ones running into the problem when applying.
> 
> Not really, since the sender tree is usually larger than the archived
> tree, their abbrev is usually long enough, and the receiver will use
> --abbrev to lengthen their abbrev to reduce the noise instead.

Now I'm doubly confused. If the sender has the larger tree then they'll
have the larger abbrev. So what's the problem?

Going back to re-read your earlier responses...So...this _isn't_ a
problem within Git itself? It's only about people trying to compare
textual patches byte-for-byte and seeing different index lines?

If that's the case, then it seems to me that the byte comparison is the
problem here. If I have:

  index 1234abcd..5678bcde

and

  index 1234abcd87..5678bcde65

those should be considered equivalent to see if two patches are
plausibly the same. And I think tools like git-cherry, etc, would do
that (and we provide git-patch-id for that purpose, too).

> > Yeah, I certainly don't mind the extra flexibility between "full" and
> > "default" for "index" lines. I do wonder if people want to configure the
> > abbreviations for those lines separately from other parts. I don't know
> > that I've ever particularly cared about that flexibility, but the fact
> > that they were set up separately all those years ago makes me think
> > somebody might.
> 
> I don't think people particularly care about the index line (and to
> the extent, its length) that much, since the default is number is
> actually a minimum number, if Git can't differentiate object with that
> number of characters, Git will show a longer object names anyway.
> 
> I think most people scripts will put a regex for:
> 
> 	/index [a-z0-9]{7,}\.\.[a-z0-9]{7,} [0-7]{6}/
> 
> Or even:
> 
> 	/index [a-z0-9]+\.\.[a-z0-9]+ [0-7]+/
> 
> For the former case, we could change the code in 2/2 to set the minimum
> default to DEFAULT_ABBREV instead of MINIMUM_ABBREV?
> 
> For the historical case that users put both --full-index and --abbrev
> into there scripts, we still keep our promise to not break their
> script by always respect --full-index, regardless of --abbrev.

I care less about scripting (as you note, anything consuming abbreviated
objects has to handle longer-than-minimum names anyway), and was more
wondering whether anybody really cared that:

 git log --abbrev=30 -p

kept the short index lines (e.g., because they're easier to read). But
I'm having trouble coming up with a plausible reason somebody would want
long object names in earlier lines like "Merge:" but not in the patch
index lines. And already we respect --abbrev for --raw, so it's not like
the diff code isn't already affected. Making "-p" consistent with all
the rest of it is probably worth doing regardless.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 0/2] diff: index-line: respect --abbrev in object's name
  2020-08-09  2:19 [RFC PATCH 0/2] extend --abbrev support to diff-patch format Đoàn Trần Công Danh
                   ` (2 preceding siblings ...)
  2020-08-09 19:01 ` [RFC PATCH 0/2] " Junio C Hamano
@ 2020-08-11 11:49 ` Đoàn Trần Công Danh
  2020-08-11 11:49   ` [PATCH v2 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
  2020-08-11 11:49   ` [PATCH v2 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
  2020-08-14  0:23 ` [PATCH v3 0/2] " Đoàn Trần Công Danh
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-11 11:49 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that consistent is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's resolve that inconsistency.

Đoàn Trần Công Danh (2):
  revision: differentiate if --no-abbrev asked explicitly
  diff: index-line: respect --abbrev in object's name

 Documentation/diff-options.txt                |  9 +++---
 diff.c                                        |  5 +++-
 revision.c                                    |  2 +-
 t/t4013-diff-various.sh                       |  3 ++
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
 7 files changed, 100 insertions(+), 6 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

Range-diff against v1:
1:  9daef7445c = 1:  9daef7445c revision: differentiate if --no-abbrev asked explicitly
2:  9146313893 ! 2:  12acf1fe5d diff: extend --abbrev support to diff-patch format
    @@ Metadata
     Author: Đoàn Trần Công Danh <congdanhqx@gmail.com>
     
      ## Commit message ##
    -    diff: extend --abbrev support to diff-patch format
    +    diff: index-line: respect --abbrev in object's name
    +
    +    A handful of Git's commands respect `--abbrev' for customizing length
    +    of abbreviation of object names.
    +
    +    For diff-family, Git supports 2 different options for 2 different
    +    purposes, `--full-index' for showing diff-patch object's name in full,
    +    and `--abbrev' to customize the length of object names in diff-raw and
    +    diff-tree header lines, without any options to customise the length of
    +    object names in diff-patch format. When working with diff-patch format,
    +    we only have two options, either full index, or default abbrev length.
    +
    +    Although, that consistent is documented, it doesn't stop users from
    +    trying to use `--abbrev' with the hope of customising diff-patch's
    +    objects' name's abbreviation.
    +
    +    Let's resolve that inconsistency.
    +
    +    To preserve backward compatibility with old script that specify both
    +    `--full-index' and `--abbrev', always shows full object id
    +    if `--full-index' is specified.
     
         Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
     
-- 
2.28.0.215.g32ffa52ee0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 1/2] revision: differentiate if --no-abbrev asked explicitly
  2020-08-11 11:49 ` [PATCH v2 0/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
@ 2020-08-11 11:49   ` Đoàn Trần Công Danh
  2020-08-11 18:54     ` Junio C Hamano
  2020-08-11 11:49   ` [PATCH v2 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
  1 sibling, 1 reply; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-11 11:49 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

When we see `--no-abbrev' in command's arguments, we reset `abbrev' of
`diff_options` to 0, thus, on a later stage, the object id won't
be shortened (by not set object_id[abbrev] to '\0').

While not doing anything is very effective way to show full object id,
we couldn't differentiate if --no-abbrev or not.

In a later change, we want to extend --abbrev support to diff-patch
format.

Let's ask for full object id if we see --no-abbrev instead.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 revision.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index 3dcf689341..24027b1af3 100644
--- a/revision.c
+++ b/revision.c
@@ -2430,7 +2430,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--always")) {
 		revs->always_show_header = 1;
 	} else if (!strcmp(arg, "--no-abbrev")) {
-		revs->abbrev = 0;
+		revs->abbrev = hexsz;
 	} else if (!strcmp(arg, "--abbrev")) {
 		revs->abbrev = DEFAULT_ABBREV;
 	} else if (skip_prefix(arg, "--abbrev=", &optarg)) {
-- 
2.28.0.215.g32ffa52ee0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-11 11:49 ` [PATCH v2 0/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
  2020-08-11 11:49   ` [PATCH v2 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
@ 2020-08-11 11:49   ` Đoàn Trần Công Danh
  1 sibling, 0 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-11 11:49 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

A handful of Git's commands respect `--abbrev' for customizing length
of abbreviation of object names.

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that consistent is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's resolve that inconsistency.

To preserve backward compatibility with old script that specify both
`--full-index' and `--abbrev', always shows full object id
if `--full-index' is specified.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 Documentation/diff-options.txt                |  9 +++---
 diff.c                                        |  5 +++-
 t/t4013-diff-various.sh                       |  3 ++
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
 6 files changed, 99 insertions(+), 5 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 7987d72b02..c11efa7865 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -441,10 +441,11 @@ endif::git-format-patch[]
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
 	name in diff-raw format output and diff-tree header
-	lines, show only a partial prefix.  This is
-	independent of the `--full-index` option above, which controls
-	the diff-patch output format.  Non default number of
-	digits can be specified with `--abbrev=<n>`.
+	lines, show only a partial prefix.
+	In diff-patch output format, `--full-index` takes higher
+	precedent, i.e. if `--full-index` is specified, full blob
+	names will be shown regardless of `--abbrev`.
+	Non default number of digits can be specified with `--abbrev=<n>`.
 
 -B[<n>][/<m>]::
 --break-rewrites[=[<n>][/<m>]]::
diff --git a/diff.c b/diff.c
index f9709de7b4..20dedfe2a9 100644
--- a/diff.c
+++ b/diff.c
@@ -4319,7 +4319,10 @@ static void fill_metainfo(struct strbuf *msg,
 	}
 	if (one && two && !oideq(&one->oid, &two->oid)) {
 		const unsigned hexsz = the_hash_algo->hexsz;
-		int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
+		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
+
+		if (o->flags.full_index)
+			abbrev = hexsz;
 
 		if (o->flags.binary) {
 			mmfile_t mf;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 3f60f7d96c..e6eb4dd4c7 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -221,6 +221,9 @@ diff-tree --root -r --abbrev=4 initial
 :noellipses diff-tree --root -r --abbrev=4 initial
 diff-tree -p initial
 diff-tree --root -p initial
+diff-tree --root -p --abbrev=10 initial
+diff-tree --root -p --full-index initial
+diff-tree --root -p --full-index --abbrev=10 initial
 diff-tree --patch-with-stat initial
 diff-tree --root --patch-with-stat initial
 diff-tree --patch-with-raw initial
diff --git a/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
new file mode 100644
index 0000000000..7518a9044e
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000..35d242ba79
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
new file mode 100644
index 0000000000..69f913fbe5
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --full-index --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
new file mode 100644
index 0000000000..1b0b6717fa
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --full-index initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
-- 
2.28.0.215.g32ffa52ee0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 0/2] extend --abbrev support to diff-patch format
  2020-08-11  5:22             ` Jeff King
@ 2020-08-11 12:07               ` Đoàn Trần Công Danh
  0 siblings, 0 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-11 12:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On 2020-08-11 01:22:26-0400, Jeff King <peff@peff.net> wrote:
> On Tue, Aug 11, 2020 at 07:33:59AM +0700, Đoàn Trần Công Danh wrote:
> 
> > > Yeah, that's what I was getting at: if you care about robust
> > > machine-readability, then the full index is the best solution. Reading
> > > between the lines, I think the argument may be "using --full-index is
> > > too long and therefore ugly, so people like the short-ish names but with
> > > a bit of extra safety".
> > 
> > My argument was people can either easily fetch the patch via HTTP like:
> > 
> > 	curl -LO https://github.com/git/git/commit/eb12adc74cf22add318f884072be2071d181abaa.patch
> > 
> > or take it from a mailing list archive, bugzilla, instead of
> > cloning a full repository. With those options, we can't say,
> > "we prefer full-index, please send us the patch with full-index
> > instead".
> 
> OK. But then how would they use "--abbrev" in that case? I.e., isn't it
> too late at that point (especially in the mailing list archive case) to
> do change anything in the formatting of the patch?
> 
> Maybe I'm confused...
> 
> > > There's an extra challenge here, which is that you have to convince the
> > > sender to use the extra --abbrev option, even though they themselves
> > > won't be the ones running into the problem when applying.
> > 
> > Not really, since the sender tree is usually larger than the archived
> > tree, their abbrev is usually long enough, and the receiver will use
> > --abbrev to lengthen their abbrev to reduce the noise instead.
> 
> Now I'm doubly confused. If the sender has the larger tree then they'll
> have the larger abbrev. So what's the problem?
> 
> Going back to re-read your earlier responses...So...this _isn't_ a
> problem within Git itself?

Correct. It's NOT Git's problems by any mean.

> It's only about people trying to compare
> textual patches byte-for-byte and seeing different index lines?

Yeah, it's about people trying to backport patch to old tree.
Fixing conflicts, and try to compare to old patch to see if they have
made any errors.

Because conflicts resolving is complicated.

> 
> If that's the case, then it seems to me that the byte comparison is the
> problem here. If I have:
> 
>   index 1234abcd..5678bcde
> 
> and
> 
>   index 1234abcd87..5678bcde65
> 
> those should be considered equivalent to see if two patches are
> plausibly the same. And I think tools like git-cherry, etc, would do
> that (and we provide git-patch-id for that purpose, too).

Yes, git-patch-id is very useful tool.

There're time that half of the patch can be applied cleanly with the
exact object names.
Another half needs to be fixed heavily, (maybe removed). git-cherry
and git-patch-id couldn't cope well in those situation.
That condition is true if there's a major change to source tree.

> > > Yeah, I certainly don't mind the extra flexibility between "full" and
> > > "default" for "index" lines. I do wonder if people want to configure the
> > > abbreviations for those lines separately from other parts. I don't know
> > > that I've ever particularly cared about that flexibility, but the fact
> > > that they were set up separately all those years ago makes me think
> > > somebody might.
> > 
> > I don't think people particularly care about the index line (and to
> > the extent, its length) that much, since the default is number is
> > actually a minimum number, if Git can't differentiate object with that
> > number of characters, Git will show a longer object names anyway.
> > 
> > I think most people scripts will put a regex for:
> > 
> > 	/index [a-z0-9]{7,}\.\.[a-z0-9]{7,} [0-7]{6}/
> > 
> > Or even:
> > 
> > 	/index [a-z0-9]+\.\.[a-z0-9]+ [0-7]+/
> > 
> > For the former case, we could change the code in 2/2 to set the minimum
> > default to DEFAULT_ABBREV instead of MINIMUM_ABBREV?
> > 
> > For the historical case that users put both --full-index and --abbrev
> > into there scripts, we still keep our promise to not break their
> > script by always respect --full-index, regardless of --abbrev.
> 
> I care less about scripting (as you note, anything consuming abbreviated
> objects has to handle longer-than-minimum names anyway), and was more
> wondering whether anybody really cared that:
> 
>  git log --abbrev=30 -p
> 
> kept the short index lines (e.g., because they're easier to read). But
> I'm having trouble coming up with a plausible reason somebody would want
> long object names in earlier lines like "Merge:" but not in the patch
> index lines. And already we respect --abbrev for --raw, so it's not like
> the diff code isn't already affected. Making "-p" consistent with all
> the rest of it is probably worth doing regardless.

Yes, I think this is the easier to accept argument.
I've gone with that in the resend.

-- 
Thanks.
Danh

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/2] revision: differentiate if --no-abbrev asked explicitly
  2020-08-11 11:49   ` [PATCH v2 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
@ 2020-08-11 18:54     ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2020-08-11 18:54 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> When we see `--no-abbrev' in command's arguments, we reset `abbrev' of
> `diff_options` to 0, thus, on a later stage, the object id won't
> be shortened (by not set object_id[abbrev] to '\0').

s/by not set/&ting/ probably?  

Please do not write pseudo-code that does not exist anywhere in the
codebase that pretends to be a quote from some real code.  It wasts
reviewer's time looking for non-existing code to see what really is
going on.

I think you meant this line in diff_abbrev_oid():

	if (abbrev)
		hex[abbrev] = '\0';

so you can say that more explicitly, perhaps:

  ... we reset the 'abbrev' field in diff-options to 0 and this
  value is looked at diff_abbrev_oid() to decide not to truncate
  the object name.

or somesuch.

> While not doing anything is very effective way to show full object id,
> we couldn't differentiate if --no-abbrev or not.

You mean you want to tell if --no-abbrev and/or --full-index was
given from the command line and act differently?  You need to
justify why you need to be able to do so (which you attempted in the
remainder of the proposed log message), and also you need to justify
why changing revs->abbrev's meaning is the best way to do so (which
you did not).  In fill_metainfo(), which you are going to touch in
[2/2], you can peek o->flags.full_index to see if --full-index was
given, and the fact that revs->abbrev is not DEFAULT_ABBREV (which
is how the field is initialized in repo_init_revisions()) but is 0
tells us that "--no-abbrev" was given.  --abbrev=0 would have busted
minimum abbrev and set the field to MINIMUM_ABBREV, --abbrev=99 would
have busted hash size and set the field to hexsz.  So 0 is the sign
that --no-abbrev was given.  No?

> In a later change, we want to extend --abbrev support to diff-patch
> format.

And it is unclear why this planned change requires it.

> Let's ask for full object id if we see --no-abbrev instead.

Hence, this "let's ask" is not justified very well.

> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---
>  revision.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/revision.c b/revision.c
> index 3dcf689341..24027b1af3 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -2430,7 +2430,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  	} else if (!strcmp(arg, "--always")) {
>  		revs->always_show_header = 1;
>  	} else if (!strcmp(arg, "--no-abbrev")) {
> -		revs->abbrev = 0;
> +		revs->abbrev = hexsz;
>  	} else if (!strcmp(arg, "--abbrev")) {
>  		revs->abbrev = DEFAULT_ABBREV;
>  	} else if (skip_prefix(arg, "--abbrev=", &optarg)) {

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v3 0/2] diff: index-line: respect --abbrev in object's name
  2020-08-09  2:19 [RFC PATCH 0/2] extend --abbrev support to diff-patch format Đoàn Trần Công Danh
                   ` (3 preceding siblings ...)
  2020-08-11 11:49 ` [PATCH v2 0/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
@ 2020-08-14  0:23 ` Đoàn Trần Công Danh
  2020-08-14  0:23   ` [PATCH v3 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
  2020-08-14  0:23   ` [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
  2020-08-20 12:35 ` [PATCH v4 0/2] " Đoàn Trần Công Danh
  2020-08-21 11:51 ` [PATCH v5 0/2] " Đoàn Trần Công Danh
  6 siblings, 2 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-14  0:23 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, Junio C Hamano, Jeff King

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that consistent is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's resolve that inconsistency.

Đoàn Trần Công Danh (2):
  revision: differentiate if --no-abbrev asked explicitly
  diff: index-line: respect --abbrev in object's name

 Documentation/diff-options.txt                |  9 +++---
 diff.c                                        |  5 +++-
 revision.c                                    |  2 +-
 t/t4013-diff-various.sh                       |  3 ++
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
 7 files changed, 100 insertions(+), 6 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

Range-diff against v2:
1:  9daef7445c ! 1:  9a26c5b611 revision: differentiate if --no-abbrev asked explicitly
    @@ Metadata
      ## Commit message ##
         revision: differentiate if --no-abbrev asked explicitly
     
    -    When we see `--no-abbrev' in command's arguments, we reset `abbrev' of
    -    `diff_options` to 0, thus, on a later stage, the object id won't
    -    be shortened (by not set object_id[abbrev] to '\0').
    +    When we see --no-abbrev in command's arguments, we reset the 'abbrev'
    +    field in diff-options to 0 and this value will be looked at
    +    diff_abbrev_oid() to decide not to truncate the object name.
    +
    +    In a later change, we want to extend --abbrev support to diff-patch
    +    format. When --abbrev supporting diff-patch, we need to differentiate
    +    those below scenarios:
    +
    +    * None of those options --abbrev, --no-abbrev, and --full-index are
    +      asked. diff-patch should keep old behavior of using DEFAULT_ABBREV
    +      for the index length.
    +    * --no-abbrev is asked, diff-patch should treat this option as same as
    +      --full-index and show full object name in index line.
     
         While not doing anything is very effective way to show full object id,
         we couldn't differentiate if --no-abbrev or not.
     
    -    In a later change, we want to extend --abbrev support to diff-patch
    -    format.
    +    We can differentiate those above scenarios by either:
    +    * Add a new field in diff-options to mark if --no-abbrev was asked.
    +      With this option, we have a new field for a single purpose, and one
    +      more thing to worry about.
    +    * Treat --no-abbrev as same as --full-index. This option is problematic,
    +      since we want to allow --abbrev overwrite --no-abbrev again.
    +      On the other hand, we also need to keep our promises with scripter
    +      who uses --full-index to ask for full object names in index line,
    +      so, we need to respsect --full-index regardless of --abbrev.
    +    * Set 'abbrev' field in diff-options to the length of the hash we are
    +      using. With this option, we can differentiate if --no-abbrev was asked
    +      ('abbrev' is hash's length) or none of --[no-]abbrev was asked
    +      ('abbrev' is '0'), script with --full-index still works and our
    +      headache is kept as is.
     
    -    Let's ask for full object id if we see --no-abbrev instead.
    +    Let's ask for full object id if we see --no-abbrev.
     
         Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
     
2:  12acf1fe5d = 2:  760df7782d diff: index-line: respect --abbrev in object's name
-- 
2.28.0.215.g32ffa52ee0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v3 1/2] revision: differentiate if --no-abbrev asked explicitly
  2020-08-14  0:23 ` [PATCH v3 0/2] " Đoàn Trần Công Danh
@ 2020-08-14  0:23   ` Đoàn Trần Công Danh
  2020-08-14  0:50     ` Junio C Hamano
  2020-08-14  0:23   ` [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
  1 sibling, 1 reply; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-14  0:23 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, Junio C Hamano, Jeff King

When we see --no-abbrev in command's arguments, we reset the 'abbrev'
field in diff-options to 0 and this value will be looked at
diff_abbrev_oid() to decide not to truncate the object name.

In a later change, we want to extend --abbrev support to diff-patch
format. When --abbrev supporting diff-patch, we need to differentiate
those below scenarios:

* None of those options --abbrev, --no-abbrev, and --full-index are
  asked. diff-patch should keep old behavior of using DEFAULT_ABBREV
  for the index length.
* --no-abbrev is asked, diff-patch should treat this option as same as
  --full-index and show full object name in index line.

While not doing anything is very effective way to show full object id,
we couldn't differentiate if --no-abbrev or not.

We can differentiate those above scenarios by either:
* Add a new field in diff-options to mark if --no-abbrev was asked.
  With this option, we have a new field for a single purpose, and one
  more thing to worry about.
* Treat --no-abbrev as same as --full-index. This option is problematic,
  since we want to allow --abbrev overwrite --no-abbrev again.
  On the other hand, we also need to keep our promises with scripter
  who uses --full-index to ask for full object names in index line,
  so, we need to respsect --full-index regardless of --abbrev.
* Set 'abbrev' field in diff-options to the length of the hash we are
  using. With this option, we can differentiate if --no-abbrev was asked
  ('abbrev' is hash's length) or none of --[no-]abbrev was asked
  ('abbrev' is '0'), script with --full-index still works and our
  headache is kept as is.

Let's ask for full object id if we see --no-abbrev.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 revision.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index 3dcf689341..24027b1af3 100644
--- a/revision.c
+++ b/revision.c
@@ -2430,7 +2430,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--always")) {
 		revs->always_show_header = 1;
 	} else if (!strcmp(arg, "--no-abbrev")) {
-		revs->abbrev = 0;
+		revs->abbrev = hexsz;
 	} else if (!strcmp(arg, "--abbrev")) {
 		revs->abbrev = DEFAULT_ABBREV;
 	} else if (skip_prefix(arg, "--abbrev=", &optarg)) {
-- 
2.28.0.215.g32ffa52ee0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-14  0:23 ` [PATCH v3 0/2] " Đoàn Trần Công Danh
  2020-08-14  0:23   ` [PATCH v3 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
@ 2020-08-14  0:23   ` Đoàn Trần Công Danh
  2020-08-14 15:18     ` SZEDER Gábor
  1 sibling, 1 reply; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-14  0:23 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, Junio C Hamano, Jeff King

A handful of Git's commands respect `--abbrev' for customizing length
of abbreviation of object names.

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that consistent is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's resolve that inconsistency.

To preserve backward compatibility with old script that specify both
`--full-index' and `--abbrev', always shows full object id
if `--full-index' is specified.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 Documentation/diff-options.txt                |  9 +++---
 diff.c                                        |  5 +++-
 t/t4013-diff-various.sh                       |  3 ++
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
 6 files changed, 99 insertions(+), 5 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 7987d72b02..c11efa7865 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -441,10 +441,11 @@ endif::git-format-patch[]
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
 	name in diff-raw format output and diff-tree header
-	lines, show only a partial prefix.  This is
-	independent of the `--full-index` option above, which controls
-	the diff-patch output format.  Non default number of
-	digits can be specified with `--abbrev=<n>`.
+	lines, show only a partial prefix.
+	In diff-patch output format, `--full-index` takes higher
+	precedent, i.e. if `--full-index` is specified, full blob
+	names will be shown regardless of `--abbrev`.
+	Non default number of digits can be specified with `--abbrev=<n>`.
 
 -B[<n>][/<m>]::
 --break-rewrites[=[<n>][/<m>]]::
diff --git a/diff.c b/diff.c
index f9709de7b4..20dedfe2a9 100644
--- a/diff.c
+++ b/diff.c
@@ -4319,7 +4319,10 @@ static void fill_metainfo(struct strbuf *msg,
 	}
 	if (one && two && !oideq(&one->oid, &two->oid)) {
 		const unsigned hexsz = the_hash_algo->hexsz;
-		int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
+		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
+
+		if (o->flags.full_index)
+			abbrev = hexsz;
 
 		if (o->flags.binary) {
 			mmfile_t mf;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 3f60f7d96c..e6eb4dd4c7 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -221,6 +221,9 @@ diff-tree --root -r --abbrev=4 initial
 :noellipses diff-tree --root -r --abbrev=4 initial
 diff-tree -p initial
 diff-tree --root -p initial
+diff-tree --root -p --abbrev=10 initial
+diff-tree --root -p --full-index initial
+diff-tree --root -p --full-index --abbrev=10 initial
 diff-tree --patch-with-stat initial
 diff-tree --root --patch-with-stat initial
 diff-tree --patch-with-raw initial
diff --git a/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
new file mode 100644
index 0000000000..7518a9044e
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000..35d242ba79
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
new file mode 100644
index 0000000000..69f913fbe5
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --full-index --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
new file mode 100644
index 0000000000..1b0b6717fa
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --full-index initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
-- 
2.28.0.215.g32ffa52ee0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/2] revision: differentiate if --no-abbrev asked explicitly
  2020-08-14  0:23   ` [PATCH v3 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
@ 2020-08-14  0:50     ` Junio C Hamano
  2020-08-14  0:59       ` Đoàn Trần Công Danh
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2020-08-14  0:50 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git, Jeff King

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> When we see --no-abbrev in command's arguments, we reset the 'abbrev'
> field in diff-options to 0 and this value will be looked at
> diff_abbrev_oid() to decide not to truncate the object name.
>
> In a later change, we want to extend --abbrev support to diff-patch
> format. When --abbrev supporting diff-patch, we need to differentiate
> those below scenarios:
>
> * None of those options --abbrev, --no-abbrev, and --full-index are
>   asked. diff-patch should keep old behavior of using DEFAULT_ABBREV
>   for the index length.
> * --no-abbrev is asked, diff-patch should treat this option as same as
>   --full-index and show full object name in index line.

Sorry, but are you saying that the above two cases cannot be
differentiated in the current code?

 * If none of --abbrev, --no-abbrev, --full-index are given, then
   diff.c::prep_parse_options() will leave options->flags.full_index
   and options->abbrev untouched.  They are initialized to false and
   DEFAULT_ABBREV (typically -1 when unconfigured).

 * If --no-abbrev is given, options->abbrev is set to 0.
   options->flags.full_index is not touched.

So you should be able to tell these two apart by only looking at
options->flags.full_index bit.  Perhaps, even though you said "we
need to differentiate", you meant something else?

> While not doing anything is very effective way to show full object id,
> we couldn't differentiate if --no-abbrev or not.

Hmph.  --no-abbrev without --full-index would not set
flags.full_index bit; using --full-index would set the bit.  Are you
planning to do something special when BOTH --no-abbrev and --full-index
is given?  I am confused X-<.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/2] revision: differentiate if --no-abbrev asked explicitly
  2020-08-14  0:50     ` Junio C Hamano
@ 2020-08-14  0:59       ` Đoàn Trần Công Danh
  2020-08-14  1:06         ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-14  0:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

On 2020-08-13 17:50:31-0700, Junio C Hamano <gitster@pobox.com> wrote:
> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
> 
> > When we see --no-abbrev in command's arguments, we reset the 'abbrev'
> > field in diff-options to 0 and this value will be looked at
> > diff_abbrev_oid() to decide not to truncate the object name.
> >
> > In a later change, we want to extend --abbrev support to diff-patch
> > format. When --abbrev supporting diff-patch, we need to differentiate
> > those below scenarios:
> >
> > * None of those options --abbrev, --no-abbrev, and --full-index are
> >   asked. diff-patch should keep old behavior of using DEFAULT_ABBREV
> >   for the index length.
> > * --no-abbrev is asked, diff-patch should treat this option as same as
> >   --full-index and show full object name in index line.
> 
> Sorry, but are you saying that the above two cases cannot be
> differentiated in the current code?
> 
>  * If none of --abbrev, --no-abbrev, --full-index are given, then
>    diff.c::prep_parse_options() will leave options->flags.full_index
>    and options->abbrev untouched.  They are initialized to false and
>    DEFAULT_ABBREV (typically -1 when unconfigured).
> 
>  * If --no-abbrev is given, options->abbrev is set to 0.
>    options->flags.full_index is not touched.
> 
> So you should be able to tell these two apart by only looking at
> options->flags.full_index bit.  Perhaps, even though you said "we
> need to differentiate", you meant something else?

Oops, I shouldn't say anything about --full-index in the second point
to reduce confusion.

Let me list some combination here:

* none of --abbrev --no-abbrev --full-index -> default short index
* --abbrev --full-index                     -> full-index
* --full-index --abbrev                     -> full-index
* --abbrev --no-abbrev                      -> full-index
* --no-abbrev --abbrev=[n]                  -> shortened index to n char

So, we can't use full_index bit, because --no-abbrev can be defeated
by --abbrev, but --full-index will always win --abbrev.

> 
> > While not doing anything is very effective way to show full object id,
> > we couldn't differentiate if --no-abbrev or not.
> 
> Hmph.  --no-abbrev without --full-index would not set
> flags.full_index bit; using --full-index would set the bit.  Are you
> planning to do something special when BOTH --no-abbrev and --full-index
> is given?  I am confused X-<.

I'm not planning for anything special when both --no-abbrev and
--full-index is given.

I'm planning for:

* BOTH --abbrev and --no-abbrev but NOT --full-index;
* BOTH --abbrev AND --full-index

Sorry for the confusion,
I hope it's clear now, and you could help me rephase a bit to reduce
confusion.

-- 
Danh

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/2] revision: differentiate if --no-abbrev asked explicitly
  2020-08-14  0:59       ` Đoàn Trần Công Danh
@ 2020-08-14  1:06         ` Junio C Hamano
  2020-08-14 14:50           ` Đoàn Trần Công Danh
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2020-08-14  1:06 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git, Jeff King

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> Let me list some combination here:
>
> * none of --abbrev --no-abbrev --full-index -> default short index
> * --abbrev --full-index                     -> full-index
> * --full-index --abbrev                     -> full-index
> * --abbrev --no-abbrev                      -> full-index
> * --no-abbrev --abbrev=[n]                  -> shortened index to n char
>
> So, we can't use full_index bit, because --no-abbrev can be defeated
> by --abbrev, but --full-index will always win --abbrev.

Sure, I wasn't suggesting to flip the flags.full_index bit upon
seeing "--no-abbrev".  When --no-abbrev is in effect (i.e. the last
one among --no-abbrev, --abbrev, or --abbrev=n), .abbrev field is
set to 0.  So wouldn't it be sufficient to say

 - If flags.full_index bit is set, show the full object name

 - If abbrev is 0, show the full object name

 - All other cases, after clamping the value of abbrev to reasonable
   value, truncat the object name to that length

What am I missing?

> I'm planning for:
>
> * BOTH --abbrev and --no-abbrev but NOT --full-index;
> * BOTH --abbrev AND --full-index


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/2] revision: differentiate if --no-abbrev asked explicitly
  2020-08-14  1:06         ` Junio C Hamano
@ 2020-08-14 14:50           ` Đoàn Trần Công Danh
  2020-08-19 22:50             ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-14 14:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

On 2020-08-13 18:06:22-0700, Junio C Hamano <gitster@pobox.com> wrote:
> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
> 
> > Let me list some combination here:
> >
> > * none of --abbrev --no-abbrev --full-index -> default short index
> > * --abbrev --full-index                     -> full-index
> > * --full-index --abbrev                     -> full-index
> > * --abbrev --no-abbrev                      -> full-index
> > * --no-abbrev --abbrev=[n]                  -> shortened index to n char
> >
> > So, we can't use full_index bit, because --no-abbrev can be defeated
> > by --abbrev, but --full-index will always win --abbrev.
> 
> Sure, I wasn't suggesting to flip the flags.full_index bit upon
> seeing "--no-abbrev".  When --no-abbrev is in effect (i.e. the last
> one among --no-abbrev, --abbrev, or --abbrev=n), .abbrev field is
> set to 0.  So wouldn't it be sufficient to say
> 
>  - If flags.full_index bit is set, show the full object name
> 
>  - If abbrev is 0, show the full object name
> 
>  - All other cases, after clamping the value of abbrev to reasonable
>    value, truncat the object name to that length
> 
> What am I missing?

No, you didn't miss anything.

It's obviously me, who screwed up the logical thinking.

Originally, I come up with something along the line in 2/2:

	int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
	if (!o->abbrev)
		abbrev = o->abbrev;

I couldn't recalled what I wrote, but that logic requires 1/2,
after I come up with 1/2, I re-analysed 2/2 and come up with current
logic.

I failed to re-visit 1/2 to check if it's necessary.
It's all MY fault.

Sorry for wasting everyone's time in 1/2.

Please eject 1/2 from this series.

-- 
Thanks

Danh

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-14  0:23   ` [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
@ 2020-08-14 15:18     ` SZEDER Gábor
  2020-08-14 17:00       ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: SZEDER Gábor @ 2020-08-14 15:18 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git, Junio C Hamano, Jeff King

On Fri, Aug 14, 2020 at 07:23:10AM +0700, Đoàn Trần Công Danh wrote:
> A handful of Git's commands respect `--abbrev' for customizing length
> of abbreviation of object names.
> 
> For diff-family, Git supports 2 different options for 2 different
> purposes, `--full-index' for showing diff-patch object's name in full,
> and `--abbrev' to customize the length of object names in diff-raw and
> diff-tree header lines, without any options to customise the length of
> object names in diff-patch format. When working with diff-patch format,
> we only have two options, either full index, or default abbrev length.
> 
> Although, that consistent is documented, it doesn't stop users from
> trying to use `--abbrev' with the hope of customising diff-patch's
> objects' name's abbreviation.
> 
> Let's resolve that inconsistency.
> 
> To preserve backward compatibility with old script that specify both
> `--full-index' and `--abbrev', always shows full object id
> if `--full-index' is specified.
> 
> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---
>  Documentation/diff-options.txt                |  9 +++---
>  diff.c                                        |  5 +++-
>  t/t4013-diff-various.sh                       |  3 ++
>  ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
>  ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
>  ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
>  6 files changed, 99 insertions(+), 5 deletions(-)
>  create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
>  create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
>  create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial
> 
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index 7987d72b02..c11efa7865 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -441,10 +441,11 @@ endif::git-format-patch[]
>  --abbrev[=<n>]::
>  	Instead of showing the full 40-byte hexadecimal object
>  	name in diff-raw format output and diff-tree header
> -	lines, show only a partial prefix.  This is
> -	independent of the `--full-index` option above, which controls
> -	the diff-patch output format.  Non default number of
> -	digits can be specified with `--abbrev=<n>`.
> +	lines, show only a partial prefix.
> +	In diff-patch output format, `--full-index` takes higher
> +	precedent, i.e. if `--full-index` is specified, full blob
> +	names will be shown regardless of `--abbrev`.
> +	Non default number of digits can be specified with `--abbrev=<n>`.
>  
>  -B[<n>][/<m>]::
>  --break-rewrites[=[<n>][/<m>]]::
> diff --git a/diff.c b/diff.c
> index f9709de7b4..20dedfe2a9 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4319,7 +4319,10 @@ static void fill_metainfo(struct strbuf *msg,
>  	}
>  	if (one && two && !oideq(&one->oid, &two->oid)) {
>  		const unsigned hexsz = the_hash_algo->hexsz;
> -		int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
> +		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
> +
> +		if (o->flags.full_index)
> +			abbrev = hexsz;
>  
>  		if (o->flags.binary) {
>  			mmfile_t mf;
> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> index 3f60f7d96c..e6eb4dd4c7 100755
> --- a/t/t4013-diff-various.sh
> +++ b/t/t4013-diff-various.sh
> @@ -221,6 +221,9 @@ diff-tree --root -r --abbrev=4 initial
>  :noellipses diff-tree --root -r --abbrev=4 initial
>  diff-tree -p initial
>  diff-tree --root -p initial
> +diff-tree --root -p --abbrev=10 initial
> +diff-tree --root -p --full-index initial
> +diff-tree --root -p --full-index --abbrev=10 initial
>  diff-tree --patch-with-stat initial
>  diff-tree --root --patch-with-stat initial
>  diff-tree --patch-with-raw initial
> diff --git a/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
> new file mode 100644
> index 0000000000..7518a9044e
> --- /dev/null
> +++ b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
> @@ -0,0 +1,29 @@
> +$ git diff-tree --root -p --abbrev=10 initial
> +444ac553ac7612cc88969031b02b3767fb8a353a
> +diff --git a/dir/sub b/dir/sub
> +new file mode 100644
> +index 0000000000..35d242ba79
> +--- /dev/null
> ++++ b/dir/sub
> +@@ -0,0 +1,2 @@
> ++A
> ++B
> +diff --git a/file0 b/file0
> +new file mode 100644
> +index 0000000000..01e79c32a8
> +--- /dev/null
> ++++ b/file0
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +diff --git a/file2 b/file2
> +new file mode 100644
> +index 0000000000..01e79c32a8
> +--- /dev/null
> ++++ b/file2
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +$
> diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
> new file mode 100644
> index 0000000000..69f913fbe5
> --- /dev/null
> +++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
> @@ -0,0 +1,29 @@
> +$ git diff-tree --root -p --full-index --abbrev=10 initial
> +444ac553ac7612cc88969031b02b3767fb8a353a
> +diff --git a/dir/sub b/dir/sub
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
> +--- /dev/null
> ++++ b/dir/sub
> +@@ -0,0 +1,2 @@
> ++A
> ++B
> +diff --git a/file0 b/file0
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
> +--- /dev/null
> ++++ b/file0
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +diff --git a/file2 b/file2
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
> +--- /dev/null
> ++++ b/file2
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +$
> diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
> new file mode 100644
> index 0000000000..1b0b6717fa
> --- /dev/null
> +++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
> @@ -0,0 +1,29 @@
> +$ git diff-tree --root -p --full-index initial
> +444ac553ac7612cc88969031b02b3767fb8a353a
> +diff --git a/dir/sub b/dir/sub
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
> +--- /dev/null
> ++++ b/dir/sub
> +@@ -0,0 +1,2 @@
> ++A
> ++B
> +diff --git a/file0 b/file0
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
> +--- /dev/null
> ++++ b/file0
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +diff --git a/file2 b/file2
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
> +--- /dev/null
> ++++ b/file2
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +$

All these new tests break when run with GIT_TEST_DEFAULT_HASH=sha256
with something like:

  + test_cmp expect actual
  --- expect	2020-08-14 15:05:12.209285397 +0000
  +++ actual	2020-08-14 15:05:12.205285279 +0000
  @@ -2,7 +2,7 @@
   0000000000000000000000000000000000000000000000000000000000000000
   diff --git a/dir/sub b/dir/sub
   new file mode ffff44
  -index ffffffffff..fffffffa79
  +index ffffffffff..fffffff895
   --- /dev/null
   +++ b/dir/sub
   @@ -0,0 +1,2 @@
  @@ -10,7 +10,7 @@
   +B
   diff --git a/file0 b/file0
   new file mode ffff44
  -index ffffffffff..fffffff2a8
  +index ffffffffff..fffffff5d7
   --- /dev/null
   +++ b/file0
   @@ -0,0 +1,3 @@
  @@ -19,7 +19,7 @@
   +3
   diff --git a/file2 b/file2
   new file mode ffff44
  -index ffffffffff..fffffff2a8
  +index ffffffffff..fffffff5d7
   --- /dev/null
   +++ b/file2
   @@ -0,0 +1,3 @@
  error: last command exited with $?=1



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-14 15:18     ` SZEDER Gábor
@ 2020-08-14 17:00       ` Junio C Hamano
  2020-08-14 18:59         ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2020-08-14 17:00 UTC (permalink / raw)
  To: SZEDER Gábor, brian m. carlson
  Cc: Đoàn Trần Công Danh, git, Jeff King

SZEDER Gábor <szeder.dev@gmail.com> writes:

>> +diff --git a/file0 b/file0
>> +new file mode 100644
>> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
>> +--- /dev/null
>> ++++ b/file0
>> +@@ -0,0 +1,3 @@
>> ++1
>> ++2
>> ++3
>> +diff --git a/file2 b/file2
>> +new file mode 100644
>> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
>> +--- /dev/null
>> ++++ b/file2
>> +@@ -0,0 +1,3 @@
>> ++1
>> ++2
>> ++3
>> +$
>
> All these new tests break when run with GIT_TEST_DEFAULT_HASH=sha256
> with something like:
>
>   + test_cmp expect actual
>   --- expect	2020-08-14 15:05:12.209285397 +0000
>   +++ actual	2020-08-14 15:05:12.205285279 +0000
>   @@ -2,7 +2,7 @@
>    0000000000000000000000000000000000000000000000000000000000000000
>    diff --git a/dir/sub b/dir/sub
>    new file mode ffff44
>   -index ffffffffff..fffffffa79
>   +index ffffffffff..fffffff895

Ouch.  These apparently come from

process_diffs () {
	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
	_x07="$_x05[0-9a-f][0-9a-f]" &&
	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
	    -e "s/From $_x40 /From $ZERO_OID /" \
	    -e "s/from $_x40)/from $ZERO_OID)/" \
	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
	    -e "s/^$_x40 /$ZERO_OID /" \
	    -e "s/^$_x40$/$ZERO_OID/" \
	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
	    -e "s/$_x07\.\.\./fffffff.../g" \
	    -e "s/ $_x04\.\.\./ ffff.../g" \
	    -e "s/ $_x04/ ffff/g" \
	    "$1"
}

Hash-independence may be good, but it should not munge expected mode
bits from 100644 to ffff44, which I think is a bug in the original
introduced in 72f936b1 (t4013: make test hash independent,
2020-02-07).

When we are adjusting the abbrev length of the index line, of course
$_x07 would not be sufficient to match the abbreviated object name
in full, so a79 vs 895 can be explained and is a bug in this patch
that did not update the process_diffs helper.

Another thing that I find somewhat problematic in the original
(brian cc'ed) is that it does not special case all-zero object name
specially.  By turning any and all instances of $_x40 to $ZERO_OID,
we lose the distinction between a random-looking object name which
got turned into $ZERO_OID by the processing, and an object name that
was $ZERO_OID from the beginning, so we won't catch a possible
future bug where new file's preimage object name is not $ZERO_OID
(this is plausible when you try to show an intent-to-add entry; the
diff between the index and the working tree would be "new file"
patch, but the index entry records the object name for an empty
blob, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 instead of $ZERO_OID
can easily be emitted by a mistaken implementation).

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-14 17:00       ` Junio C Hamano
@ 2020-08-14 18:59         ` Junio C Hamano
  2020-08-15  0:21           ` brian m. carlson
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2020-08-14 18:59 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: brian m. carlson, Đoàn Trần Công Danh, git,
	Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> Ouch.  These apparently come from
>
> process_diffs () {
> ...
> }
>
> Hash-independence may be good, but it should not munge expected mode
> bits from 100644 to ffff44, which I think is a bug in the original
> introduced in 72f936b1 (t4013: make test hash independent,
> 2020-02-07).
>
> When we are adjusting the abbrev length of the index line, of course
> $_x07 would not be sufficient to match the abbreviated object name
> in full, so a79 vs 895 can be explained and is a bug in this patch
> that did not update the process_diffs helper.
>
> Another thing that I find somewhat problematic in the original
> (brian cc'ed) is that it does not special case all-zero object name
> specially.  By turning any and all instances of $_x40 to $ZERO_OID,
> we lose the distinction between a random-looking object name which
> got turned into $ZERO_OID by the processing, and an object name that
> was $ZERO_OID from the beginning, so we won't catch a possible
> future bug where new file's preimage object name is not $ZERO_OID
> (this is plausible when you try to show an intent-to-add entry; the
> diff between the index and the working tree would be "new file"
> patch, but the index entry records the object name for an empty
> blob, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 instead of $ZERO_OID
> can easily be emitted by a mistaken implementation).

So here is what I came up with as a possible starting point.  The
idea is to grab hexadecimal strings at locations the original tried
to isolate with various contexts, and

 - if the input happens to be all zero, use '0', otherwise use 'f'

 - if the input is 40-bytes (i.e. unabbreviated object name in the
   SHA-1 world), repeat the character chosen in the first step as
   many times as there are chars in $ZERO_OID

 - otherwise, repeat the character chosen in the first step as many
   times as there are chars in the input.

 - regardless of all of the above, special case possible in-tree
   blob modes (100644, 100755 and 120000) and don't munge them.

I haven't tried it with the patch that started this discussion
thread, nor with SHA-256 build, though.


 t/t4013-diff-various.sh | 58 +++++++++++++++++++++++++++++++------------------
 1 file changed, 37 insertions(+), 21 deletions(-)

diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 43267d6024..b33e60ab9d 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -130,27 +130,43 @@ test_expect_success setup '
 EOF
 
 process_diffs () {
-	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
-	_x07="$_x05[0-9a-f][0-9a-f]" &&
-	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
-	    -e "s/From $_x40 /From $ZERO_OID /" \
-	    -e "s/from $_x40)/from $ZERO_OID)/" \
-	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
-	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
-	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
-	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
-	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
-	    -e "s/^$_x40 /$ZERO_OID /" \
-	    -e "s/^$_x40$/$ZERO_OID/" \
-	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
-	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
-	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
-	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
-	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
-	    -e "s/$_x07\.\.\./fffffff.../g" \
-	    -e "s/ $_x04\.\.\./ ffff.../g" \
-	    -e "s/ $_x04/ ffff/g" \
-	    "$1"
+	perl -e '
+		my $oid_length = length($ARGV[0]);
+		my $x40 = "[0-9a-f]{40}";
+		my $xab = "[0-9a-f]{4,16}";
+		my $orx = "[0-9a-f]" x $oid_length;
+
+		sub munge_oid {
+			my ($oid) = @_;
+			my $x;
+
+			if ($oid =~ /^(100644|100755|120000)$/) {
+				return $oid;
+			}
+
+			if ($oid =~ /^0*$/) {
+				$x = "0";
+			} else {
+				$x = "f";
+			}
+
+			if (length($oid) == 40) {
+				return $x x $oid_length;
+			} else {
+				return $x x length($oid);
+			}
+		}
+
+		while (<STDIN>) {
+			s/($orx)/munge_oid($1)/ge;
+			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
+			s/commit ($x40)($| \()/"commit " . munge_oid($1) . $2/ge;
+			s/($x40) /munge_oid($1) . " "/ge;
+			s/^($x40)($| )/munge_oid($1) . $2/e;
+			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
+			print;
+		}
+	' "$ZERO_OID" <"$1"
 }
 
 V=$(git version | sed -e 's/^git version //' -e 's/\./\\./g')

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-14 18:59         ` Junio C Hamano
@ 2020-08-15  0:21           ` brian m. carlson
  2020-08-15  2:27             ` Đoàn Trần Công Danh
  0 siblings, 1 reply; 41+ messages in thread
From: brian m. carlson @ 2020-08-15  0:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, Đoàn Trần Công Danh, git,
	Jeff King

[-- Attachment #1: Type: text/plain, Size: 8854 bytes --]

On 2020-08-14 at 18:59:53, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Ouch.  These apparently come from
> >
> > process_diffs () {
> > ...
> > }
> >
> > Hash-independence may be good, but it should not munge expected mode
> > bits from 100644 to ffff44, which I think is a bug in the original
> > introduced in 72f936b1 (t4013: make test hash independent,
> > 2020-02-07).
> >
> > When we are adjusting the abbrev length of the index line, of course
> > $_x07 would not be sufficient to match the abbreviated object name
> > in full, so a79 vs 895 can be explained and is a bug in this patch
> > that did not update the process_diffs helper.
> >
> > Another thing that I find somewhat problematic in the original
> > (brian cc'ed) is that it does not special case all-zero object name
> > specially.  By turning any and all instances of $_x40 to $ZERO_OID,
> > we lose the distinction between a random-looking object name which
> > got turned into $ZERO_OID by the processing, and an object name that
> > was $ZERO_OID from the beginning, so we won't catch a possible
> > future bug where new file's preimage object name is not $ZERO_OID
> > (this is plausible when you try to show an intent-to-add entry; the
> > diff between the index and the working tree would be "new file"
> > patch, but the index entry records the object name for an empty
> > blob, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 instead of $ZERO_OID
> > can easily be emitted by a mistaken implementation).

Yeah, it wasn't intended that we munge mode bits, and I definitely agree
we'd be better off distinguishing between all-zero and non-zero OIDs.

As you might imagine, this is not my favorite test because have a large
amount of diff formats, and even I think the giant list of sed
expressions I wrote is hideous.  It is, however, reasonably
comprehensive, which is pretty much the only nice thing that can be said
about it.

> So here is what I came up with as a possible starting point.  The
> idea is to grab hexadecimal strings at locations the original tried
> to isolate with various contexts, and
> 
>  - if the input happens to be all zero, use '0', otherwise use 'f'
> 
>  - if the input is 40-bytes (i.e. unabbreviated object name in the
>    SHA-1 world), repeat the character chosen in the first step as
>    many times as there are chars in $ZERO_OID
> 
>  - otherwise, repeat the character chosen in the first step as many
>    times as there are chars in the input.
> 
>  - regardless of all of the above, special case possible in-tree
>    blob modes (100644, 100755 and 120000) and don't munge them.
> 
> I haven't tried it with the patch that started this discussion
> thread, nor with SHA-256 build, though.

This seems like a sane approach.

> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> index 43267d6024..b33e60ab9d 100755
> --- a/t/t4013-diff-various.sh
> +++ b/t/t4013-diff-various.sh
> @@ -130,27 +130,43 @@ test_expect_success setup '
>  EOF
>  
>  process_diffs () {
> -	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
> -	_x07="$_x05[0-9a-f][0-9a-f]" &&
> -	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
> -	    -e "s/From $_x40 /From $ZERO_OID /" \
> -	    -e "s/from $_x40)/from $ZERO_OID)/" \
> -	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
> -	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
> -	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
> -	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
> -	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
> -	    -e "s/^$_x40 /$ZERO_OID /" \
> -	    -e "s/^$_x40$/$ZERO_OID/" \
> -	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
> -	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
> -	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
> -	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
> -	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
> -	    -e "s/$_x07\.\.\./fffffff.../g" \
> -	    -e "s/ $_x04\.\.\./ ffff.../g" \
> -	    -e "s/ $_x04/ ffff/g" \
> -	    "$1"
> +	perl -e '
> +		my $oid_length = length($ARGV[0]);
> +		my $x40 = "[0-9a-f]{40}";
> +		my $xab = "[0-9a-f]{4,16}";
> +		my $orx = "[0-9a-f]" x $oid_length;
> +
> +		sub munge_oid {
> +			my ($oid) = @_;
> +			my $x;
> +
> +			if ($oid =~ /^(100644|100755|120000)$/) {
> +				return $oid;
> +			}
> +
> +			if ($oid =~ /^0*$/) {
> +				$x = "0";
> +			} else {
> +				$x = "f";
> +			}
> +
> +			if (length($oid) == 40) {
> +				return $x x $oid_length;
> +			} else {
> +				return $x x length($oid);
> +			}
> +		}
> +
> +		while (<STDIN>) {
> +			s/($orx)/munge_oid($1)/ge;
> +			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
> +			s/commit ($x40)($| \()/"commit " . munge_oid($1) . $2/ge;
> +			s/($x40) /munge_oid($1) . " "/ge;
> +			s/^($x40)($| )/munge_oid($1) . $2/e;
> +			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
> +			print;
> +		}
> +	' "$ZERO_OID" <"$1"
>  }

This is much nicer, but I think we need the following on top of it
because we have a couple of tricky cases the original didn't consider:

* Some of our 64-bit object IDs get processed twice, turning them into
  88-character object IDs, which don't match.  We therefore need "\b".
* The new unabbreviated index lines aren't accounted for, so I included
  them by possibly matching "\.\.".
* We have some lines that look like "commit $OID (from $OID)" that
  aren't accounted for.  Because we now have an optional OID in
  munge_oid, I had to account for that as well.

So this is what's on top:

diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index b5c7e1a63b..dfc87a0d19 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -140,6 +140,8 @@ process_diffs () {
 			my ($oid) = @_;
 			my $x;
 
+			return "" unless length $oid;
+
 			if ($oid =~ /^(100644|100755|120000)$/) {
 				return $oid;
 			}
@@ -160,8 +162,8 @@ process_diffs () {
 		while (<STDIN>) {
 			s/($orx)/munge_oid($1)/ge;
 			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
-			s/commit ($x40)($| \()/"commit " . munge_oid($1) . $2/ge;
-			s/($x40) /munge_oid($1) . " "/ge;
+			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
+			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
 			s/^($x40)($| )/munge_oid($1) . $2/e;
 			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
 			print;

Or, a fresh original version:

-- %< --
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index e6eb4dd4c7..dfc87a0d19 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -130,27 +130,45 @@ test_expect_success setup '
 EOF
 
 process_diffs () {
-	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
-	_x07="$_x05[0-9a-f][0-9a-f]" &&
-	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
-	    -e "s/From $_x40 /From $ZERO_OID /" \
-	    -e "s/from $_x40)/from $ZERO_OID)/" \
-	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
-	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
-	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
-	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
-	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
-	    -e "s/^$_x40 /$ZERO_OID /" \
-	    -e "s/^$_x40$/$ZERO_OID/" \
-	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
-	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
-	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
-	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
-	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
-	    -e "s/$_x07\.\.\./fffffff.../g" \
-	    -e "s/ $_x04\.\.\./ ffff.../g" \
-	    -e "s/ $_x04/ ffff/g" \
-	    "$1"
+	perl -e '
+		my $oid_length = length($ARGV[0]);
+		my $x40 = "[0-9a-f]{40}";
+		my $xab = "[0-9a-f]{4,16}";
+		my $orx = "[0-9a-f]" x $oid_length;
+
+		sub munge_oid {
+			my ($oid) = @_;
+			my $x;
+
+			return "" unless length $oid;
+
+			if ($oid =~ /^(100644|100755|120000)$/) {
+				return $oid;
+			}
+
+			if ($oid =~ /^0*$/) {
+				$x = "0";
+			} else {
+				$x = "f";
+			}
+
+			if (length($oid) == 40) {
+				return $x x $oid_length;
+			} else {
+				return $x x length($oid);
+			}
+		}
+
+		while (<STDIN>) {
+			s/($orx)/munge_oid($1)/ge;
+			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
+			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
+			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
+			s/^($x40)($| )/munge_oid($1) . $2/e;
+			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
+			print;
+		}
+	' "$ZERO_OID" <"$1"
 }
 
 V=$(git version | sed -e 's/^git version //' -e 's/\./\\./g')
-- %< --

Anyone is welcome to have my sign-off if they pick up any part of this
patch.
-- 
brian m. carlson: Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-15  0:21           ` brian m. carlson
@ 2020-08-15  2:27             ` Đoàn Trần Công Danh
  2020-08-17 16:17               ` Junio C Hamano
  2020-08-20  4:56               ` Junio C Hamano
  0 siblings, 2 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-15  2:27 UTC (permalink / raw)
  To: brian m. carlson, Junio C Hamano, SZEDER Gábor, git, Jeff King

On 2020-08-15 00:21:20+0000, "brian m. carlson" <sandals@crustytoothpaste.net> wrote:
> On 2020-08-14 at 18:59:53, Junio C Hamano wrote:
> > Junio C Hamano <gitster@pobox.com> writes:
> > 
> > > Ouch.  These apparently come from
> > >
> > > process_diffs () {
> > > ...
> > > }
> > >
> > > Hash-independence may be good, but it should not munge expected mode
> > > bits from 100644 to ffff44, which I think is a bug in the original
> > > introduced in 72f936b1 (t4013: make test hash independent,
> > > 2020-02-07).
> > >
> > > When we are adjusting the abbrev length of the index line, of course
> > > $_x07 would not be sufficient to match the abbreviated object name
> > > in full, so a79 vs 895 can be explained and is a bug in this patch
> > > that did not update the process_diffs helper.
> > >
> > > Another thing that I find somewhat problematic in the original
> > > (brian cc'ed) is that it does not special case all-zero object name
> > > specially.  By turning any and all instances of $_x40 to $ZERO_OID,
> > > we lose the distinction between a random-looking object name which
> > > got turned into $ZERO_OID by the processing, and an object name that
> > > was $ZERO_OID from the beginning, so we won't catch a possible
> > > future bug where new file's preimage object name is not $ZERO_OID
> > > (this is plausible when you try to show an intent-to-add entry; the
> > > diff between the index and the working tree would be "new file"
> > > patch, but the index entry records the object name for an empty
> > > blob, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 instead of $ZERO_OID
> > > can easily be emitted by a mistaken implementation).
> 
> Yeah, it wasn't intended that we munge mode bits, and I definitely agree
> we'd be better off distinguishing between all-zero and non-zero OIDs.
> 
> As you might imagine, this is not my favorite test because have a large
> amount of diff formats, and even I think the giant list of sed
> expressions I wrote is hideous.  It is, however, reasonably
> comprehensive, which is pretty much the only nice thing that can be said
> about it.
> 
> > So here is what I came up with as a possible starting point.  The
> > idea is to grab hexadecimal strings at locations the original tried
> > to isolate with various contexts, and
> > 
> >  - if the input happens to be all zero, use '0', otherwise use 'f'
> > 
> >  - if the input is 40-bytes (i.e. unabbreviated object name in the
> >    SHA-1 world), repeat the character chosen in the first step as
> >    many times as there are chars in $ZERO_OID
> > 
> >  - otherwise, repeat the character chosen in the first step as many
> >    times as there are chars in the input.
> > 
> >  - regardless of all of the above, special case possible in-tree
> >    blob modes (100644, 100755 and 120000) and don't munge them.
> > 
> > I haven't tried it with the patch that started this discussion
> > thread, nor with SHA-256 build, though.
> 
> This seems like a sane approach.
> 
> > diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> > index 43267d6024..b33e60ab9d 100755
> > --- a/t/t4013-diff-various.sh
> > +++ b/t/t4013-diff-various.sh
> > @@ -130,27 +130,43 @@ test_expect_success setup '
> >  EOF
> >  
> >  process_diffs () {
> > -	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
> > -	_x07="$_x05[0-9a-f][0-9a-f]" &&
> > -	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
> > -	    -e "s/From $_x40 /From $ZERO_OID /" \
> > -	    -e "s/from $_x40)/from $ZERO_OID)/" \
> > -	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
> > -	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
> > -	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
> > -	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
> > -	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
> > -	    -e "s/^$_x40 /$ZERO_OID /" \
> > -	    -e "s/^$_x40$/$ZERO_OID/" \
> > -	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
> > -	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
> > -	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
> > -	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
> > -	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
> > -	    -e "s/$_x07\.\.\./fffffff.../g" \
> > -	    -e "s/ $_x04\.\.\./ ffff.../g" \
> > -	    -e "s/ $_x04/ ffff/g" \
> > -	    "$1"
> > +	perl -e '
> > +		my $oid_length = length($ARGV[0]);
> > +		my $x40 = "[0-9a-f]{40}";
> > +		my $xab = "[0-9a-f]{4,16}";
> > +		my $orx = "[0-9a-f]" x $oid_length;
> > +
> > +		sub munge_oid {
> > +			my ($oid) = @_;
> > +			my $x;
> > +
> > +			if ($oid =~ /^(100644|100755|120000)$/) {
> > +				return $oid;
> > +			}
> > +
> > +			if ($oid =~ /^0*$/) {
> > +				$x = "0";
> > +			} else {
> > +				$x = "f";
> > +			}
> > +
> > +			if (length($oid) == 40) {
> > +				return $x x $oid_length;
> > +			} else {
> > +				return $x x length($oid);
> > +			}
> > +		}
> > +
> > +		while (<STDIN>) {
> > +			s/($orx)/munge_oid($1)/ge;
> > +			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
> > +			s/commit ($x40)($| \()/"commit " . munge_oid($1) . $2/ge;
> > +			s/($x40) /munge_oid($1) . " "/ge;
> > +			s/^($x40)($| )/munge_oid($1) . $2/e;
> > +			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
> > +			print;
> > +		}
> > +	' "$ZERO_OID" <"$1"
> >  }
> 
> This is much nicer, but I think we need the following on top of it
> because we have a couple of tricky cases the original didn't consider:
> 
> * Some of our 64-bit object IDs get processed twice, turning them into
>   88-character object IDs, which don't match.  We therefore need "\b".
> * The new unabbreviated index lines aren't accounted for, so I included
>   them by possibly matching "\.\.".
> * We have some lines that look like "commit $OID (from $OID)" that
>   aren't accounted for.  Because we now have an optional OID in
>   munge_oid, I had to account for that as well.
> 
> So this is what's on top:
> 
> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> index b5c7e1a63b..dfc87a0d19 100755
> --- a/t/t4013-diff-various.sh
> +++ b/t/t4013-diff-various.sh
> @@ -140,6 +140,8 @@ process_diffs () {
>  			my ($oid) = @_;
>  			my $x;
>  
> +			return "" unless length $oid;
> +
>  			if ($oid =~ /^(100644|100755|120000)$/) {
>  				return $oid;
>  			}
> @@ -160,8 +162,8 @@ process_diffs () {
>  		while (<STDIN>) {
>  			s/($orx)/munge_oid($1)/ge;
>  			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
> -			s/commit ($x40)($| \()/"commit " . munge_oid($1) . $2/ge;
> -			s/($x40) /munge_oid($1) . " "/ge;
> +			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
> +			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
>  			s/^($x40)($| )/munge_oid($1) . $2/e;
>  			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
>  			print;
> 
> Or, a fresh original version:
> 
> -- %< --
> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> index e6eb4dd4c7..dfc87a0d19 100755
> --- a/t/t4013-diff-various.sh
> +++ b/t/t4013-diff-various.sh
> @@ -130,27 +130,45 @@ test_expect_success setup '
>  EOF
>  
>  process_diffs () {
> -	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
> -	_x07="$_x05[0-9a-f][0-9a-f]" &&
> -	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
> -	    -e "s/From $_x40 /From $ZERO_OID /" \
> -	    -e "s/from $_x40)/from $ZERO_OID)/" \
> -	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
> -	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
> -	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
> -	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
> -	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
> -	    -e "s/^$_x40 /$ZERO_OID /" \
> -	    -e "s/^$_x40$/$ZERO_OID/" \
> -	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
> -	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
> -	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
> -	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
> -	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
> -	    -e "s/$_x07\.\.\./fffffff.../g" \
> -	    -e "s/ $_x04\.\.\./ ffff.../g" \
> -	    -e "s/ $_x04/ ffff/g" \
> -	    "$1"
> +	perl -e '
> +		my $oid_length = length($ARGV[0]);
> +		my $x40 = "[0-9a-f]{40}";
> +		my $xab = "[0-9a-f]{4,16}";
> +		my $orx = "[0-9a-f]" x $oid_length;
> +
> +		sub munge_oid {
> +			my ($oid) = @_;
> +			my $x;
> +
> +			return "" unless length $oid;
> +
> +			if ($oid =~ /^(100644|100755|120000)$/) {
> +				return $oid;
> +			}
> +
> +			if ($oid =~ /^0*$/) {
> +				$x = "0";
> +			} else {
> +				$x = "f";
> +			}
> +
> +			if (length($oid) == 40) {
> +				return $x x $oid_length;
> +			} else {
> +				return $x x length($oid);
> +			}
> +		}
> +
> +		while (<STDIN>) {
> +			s/($orx)/munge_oid($1)/ge;
> +			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
> +			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
> +			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
> +			s/^($x40)($| )/munge_oid($1) . $2/e;
> +			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
> +			print;
> +		}
> +	' "$ZERO_OID" <"$1"
>  }
>  
>  V=$(git version | sed -e 's/^git version //' -e 's/\./\\./g')
> -- %< --
> 
> Anyone is welcome to have my sign-off if they pick up any part of this
> patch.

This one seems to work with:

	make -j9 test GIT_TEST_DEFAULT_HASH=sha256

If noone step up and write this into a patch in some days,
I'll take this as first step in my series.

Also waiting some days so other people could come up with better idea,
sed's y seems to be able to work if we don't have the constraint
on all-0 oid.


-- 
Danh

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-15  2:27             ` Đoàn Trần Công Danh
@ 2020-08-17 16:17               ` Junio C Hamano
  2020-08-20  4:56               ` Junio C Hamano
  1 sibling, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2020-08-17 16:17 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: brian m. carlson, SZEDER Gábor, git, Jeff King

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

>> Anyone is welcome to have my sign-off if they pick up any part of this
>> patch.

Likewise.

> This one seems to work with:
>
> 	make -j9 test GIT_TEST_DEFAULT_HASH=sha256
>
> If noone step up and write this into a patch in some days,
> I'll take this as first step in my series.
>
> Also waiting some days so other people could come up with better idea,
> sed's y seems to be able to work if we don't have the constraint
> on all-0 oid.

y/// would help with the "same length" transformation, but I think
the primary reason why "sed" would not work well in this example is
"locate the things to tranform" part is harder to separate from
"here is how to transform a token" part when written in "sed".

Yes, you could use /pattern/{ s/from/to/; s/from2/to/; ...} but
at least I didn't see an obvious way to simplify the original and
reduce its repetitiveness.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/2] revision: differentiate if --no-abbrev asked explicitly
  2020-08-14 14:50           ` Đoàn Trần Công Danh
@ 2020-08-19 22:50             ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2020-08-19 22:50 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git, Jeff King

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

>> What am I missing?
>
> No, you didn't miss anything.
>
> It's obviously me, who screwed up the logical thinking.
>
> Originally, I come up with something along the line in 2/2:
>
> 	int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
> 	if (!o->abbrev)
> 		abbrev = o->abbrev;
>
> I couldn't recalled what I wrote, but that logic requires 1/2,
> after I come up with 1/2, I re-analysed 2/2 and come up with current
> logic.
>
> I failed to re-visit 1/2 to check if it's necessary.
> It's all MY fault.
>
> Sorry for wasting everyone's time in 1/2.
>
> Please eject 1/2 from this series.

OK.  2/2 would work without this step.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-15  2:27             ` Đoàn Trần Công Danh
  2020-08-17 16:17               ` Junio C Hamano
@ 2020-08-20  4:56               ` Junio C Hamano
  1 sibling, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2020-08-20  4:56 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: brian m. carlson, SZEDER Gábor, git, Jeff King

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

>> Anyone is welcome to have my sign-off if they pick up any part of this
>> patch.
>
> This one seems to work with:
>
> 	make -j9 test GIT_TEST_DEFAULT_HASH=sha256
>
> If noone step up and write this into a patch in some days,
> I'll take this as first step in my series.

So, is there anything happening on this front?

Thanks.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v4 0/2] diff: index-line: respect --abbrev in object's name
  2020-08-09  2:19 [RFC PATCH 0/2] extend --abbrev support to diff-patch format Đoàn Trần Công Danh
                   ` (4 preceding siblings ...)
  2020-08-14  0:23 ` [PATCH v3 0/2] " Đoàn Trần Công Danh
@ 2020-08-20 12:35 ` Đoàn Trần Công Danh
  2020-08-20 12:35   ` [PATCH v4 1/2] t4013: improve diff-post-processor logic Đoàn Trần Công Danh
  2020-08-20 12:35   ` [PATCH v4 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
  2020-08-21 11:51 ` [PATCH v5 0/2] " Đoàn Trần Công Danh
  6 siblings, 2 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-20 12:35 UTC (permalink / raw)
  To: git
  Cc: Đoàn Trần Công Danh, SZEDER Gábor,
	brian m . carlson, Junio C Hamano, Jeff King

On 2020-08-19 21:56:33-0700, Junio C Hamano <gitster@pobox.com> wrote:
> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
> 
> >> Anyone is welcome to have my sign-off if they pick up any part of this
> >> patch.
> >
> > This one seems to work with:
> >
> > 	make -j9 test GIT_TEST_DEFAULT_HASH=sha256
> >
> > If noone step up and write this into a patch in some days,
> > I'll take this as first step in my series.
> 
> So, is there anything happening on this front?

Sorry, I was busy with work-life this week.

--------------8<-----------

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that consistent is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's resolve that inconsistency.


brian m. carlson (1):
  t4013: improve diff-post-processor logic

Đoàn Trần Công Danh (1):
  diff: index-line: respect --abbrev in object's name

 Documentation/diff-options.txt                |  9 +--
 diff.c                                        |  5 +-
 t/t4013-diff-various.sh                       | 63 ++++++++++++-------
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++
 6 files changed, 138 insertions(+), 26 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

Range-diff against v3:
1:  7966f6ca1d < -:  ---------- revision: differentiate if --no-abbrev asked explicitly
-:  ---------- > 1:  a52d0e59ec t4013: improve diff-post-processor logic
2:  1d32e791a4 = 2:  3cec490500 diff: index-line: respect --abbrev in object's name
-- 
2.28.0.143.g760df7782d.dirty


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v4 1/2] t4013: improve diff-post-processor logic
  2020-08-20 12:35 ` [PATCH v4 0/2] " Đoàn Trần Công Danh
@ 2020-08-20 12:35   ` Đoàn Trần Công Danh
  2020-08-20 19:49     ` Junio C Hamano
  2020-08-20 12:35   ` [PATCH v4 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
  1 sibling, 1 reply; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-20 12:35 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, SZEDER Gábor, Junio C Hamano, Jeff King,
	Đoàn Trần Công Danh

From: "brian m. carlson" <sandals@crustytoothpaste.net>

From 72f936b1 (t4013: make test hash independent, 2020-02-07),
we started to adjust metadata of git-diff's output in order to
ignore uninteresting metadata which is dependent of underlying hash
algorithm.

However, we forgot to special case all-zero object names, which is
special for not-exist objects, in consequence, we could't catch
possible future bugs where object names is all-zeros including but
not limiting to:
* show intend-to-add entry
* deleted entry
* diff between index and working tree with new file

In addition, in the incoming change, we would like to test for
diff with 10 characters index, which is also not covered by current
diff-processor logic.

Let's fix the bug for all-zero object names and extend object names'
abbrev to 16 while we're at it.

Based-on-patch-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 t/t4013-diff-various.sh | 60 ++++++++++++++++++++++++++---------------
 1 file changed, 39 insertions(+), 21 deletions(-)

diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 5f97dd6d65..f6bdfc13fd 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -130,27 +130,45 @@ test_expect_success setup '
 EOF
 
 process_diffs () {
-	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
-	_x07="$_x05[0-9a-f][0-9a-f]" &&
-	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
-	    -e "s/From $_x40 /From $ZERO_OID /" \
-	    -e "s/from $_x40)/from $ZERO_OID)/" \
-	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
-	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
-	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
-	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
-	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
-	    -e "s/^$_x40 /$ZERO_OID /" \
-	    -e "s/^$_x40$/$ZERO_OID/" \
-	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
-	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
-	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
-	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
-	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
-	    -e "s/$_x07\.\.\./fffffff.../g" \
-	    -e "s/ $_x04\.\.\./ ffff.../g" \
-	    -e "s/ $_x04/ ffff/g" \
-	    "$1"
+	perl -e '
+		my $oid_length = length($ARGV[0]);
+		my $x40 = "[0-9a-f]{40}";
+		my $xab = "[0-9a-f]{4,16}";
+		my $orx = "[0-9a-f]" x $oid_length;
+
+		sub munge_oid {
+			my ($oid) = @_;
+			my $x;
+
+			return "" unless length $oid;
+
+			if ($oid =~ /^(100644|100755|120000)$/) {
+				return $oid;
+			}
+
+			if ($oid =~ /^0*$/) {
+				$x = "0";
+			} else {
+				$x = "f";
+			}
+
+			if (length($oid) == 40) {
+				return $x x $oid_length;
+			} else {
+				return $x x length($oid);
+			}
+		}
+
+		while (<STDIN>) {
+			s/($orx)/munge_oid($1)/ge;
+			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
+			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
+			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
+			s/^($x40)($| )/munge_oid($1) . $2/e;
+			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
+			print;
+		}
+	' "$ZERO_OID" <"$1"
 }
 
 V=$(git version | sed -e 's/^git version //' -e 's/\./\\./g')
-- 
2.28.0.143.g760df7782d.dirty


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v4 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-20 12:35 ` [PATCH v4 0/2] " Đoàn Trần Công Danh
  2020-08-20 12:35   ` [PATCH v4 1/2] t4013: improve diff-post-processor logic Đoàn Trần Công Danh
@ 2020-08-20 12:35   ` Đoàn Trần Công Danh
  2020-08-20 19:58     ` Junio C Hamano
  1 sibling, 1 reply; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-20 12:35 UTC (permalink / raw)
  To: git
  Cc: Đoàn Trần Công Danh, SZEDER Gábor,
	brian m . carlson, Junio C Hamano, Jeff King

A handful of Git's commands respect `--abbrev' for customizing length
of abbreviation of object names.

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that consistent is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's resolve that inconsistency.

To preserve backward compatibility with old script that specify both
`--full-index' and `--abbrev', always shows full object id
if `--full-index' is specified.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 Documentation/diff-options.txt                |  9 +++---
 diff.c                                        |  5 +++-
 t/t4013-diff-various.sh                       |  3 ++
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
 6 files changed, 99 insertions(+), 5 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index b7af973d9c..1bb897d665 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -446,10 +446,11 @@ endif::git-format-patch[]
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
 	name in diff-raw format output and diff-tree header
-	lines, show only a partial prefix.  This is
-	independent of the `--full-index` option above, which controls
-	the diff-patch output format.  Non default number of
-	digits can be specified with `--abbrev=<n>`.
+	lines, show only a partial prefix.
+	In diff-patch output format, `--full-index` takes higher
+	precedent, i.e. if `--full-index` is specified, full blob
+	names will be shown regardless of `--abbrev`.
+	Non default number of digits can be specified with `--abbrev=<n>`.
 
 -B[<n>][/<m>]::
 --break-rewrites[=[<n>][/<m>]]::
diff --git a/diff.c b/diff.c
index f9709de7b4..20dedfe2a9 100644
--- a/diff.c
+++ b/diff.c
@@ -4319,7 +4319,10 @@ static void fill_metainfo(struct strbuf *msg,
 	}
 	if (one && two && !oideq(&one->oid, &two->oid)) {
 		const unsigned hexsz = the_hash_algo->hexsz;
-		int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
+		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
+
+		if (o->flags.full_index)
+			abbrev = hexsz;
 
 		if (o->flags.binary) {
 			mmfile_t mf;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index f6bdfc13fd..5c7b0122b4 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -239,6 +239,9 @@ diff-tree --root -r --abbrev=4 initial
 :noellipses diff-tree --root -r --abbrev=4 initial
 diff-tree -p initial
 diff-tree --root -p initial
+diff-tree --root -p --abbrev=10 initial
+diff-tree --root -p --full-index initial
+diff-tree --root -p --full-index --abbrev=10 initial
 diff-tree --patch-with-stat initial
 diff-tree --root --patch-with-stat initial
 diff-tree --patch-with-raw initial
diff --git a/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
new file mode 100644
index 0000000000..7518a9044e
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000..35d242ba79
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
new file mode 100644
index 0000000000..69f913fbe5
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --full-index --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
new file mode 100644
index 0000000000..1b0b6717fa
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --full-index initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
-- 
2.28.0.143.g760df7782d.dirty


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 1/2] t4013: improve diff-post-processor logic
  2020-08-20 12:35   ` [PATCH v4 1/2] t4013: improve diff-post-processor logic Đoàn Trần Công Danh
@ 2020-08-20 19:49     ` Junio C Hamano
  2020-08-21 12:05       ` Đoàn Trần Công Danh
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2020-08-20 19:49 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: git, brian m. carlson, SZEDER Gábor, Jeff King

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> From: "brian m. carlson" <sandals@crustytoothpaste.net>
>
> From 72f936b1 (t4013: make test hash independent, 2020-02-07),
> we started to adjust metadata of git-diff's output in order to
> ignore uninteresting metadata which is dependent of underlying hash
> algorithm.
>
> However, we forgot to special case all-zero object names, which is
> special for not-exist objects, in consequence, we could't catch

"for missing objects" would probably the best, even though
"non-existing objects" would also work.

> possible future bugs where object names is all-zeros including but

s/is/are/

> not limiting to:

"not limited to", I think.

> * show intend-to-add entry
> * deleted entry
> * diff between index and working tree with new file
>
> In addition, in the incoming change, we would like to test for

s/incoming/upcoming/?

> diff with 10 characters index, which is also not covered by current

"test for customizing the length of abbreviated blob object names on
the index line"?

s/covered/supported/

> diff-processor logic.
>
> Let's fix the bug for all-zero object names and extend object names'
> abbrev to 16 while we're at it.

"and support abbreviation of object names up to 16 bytes"?

Also while we are at it, we fixed the post-processing not to touch
the file modes, which were mistakenly munged in the older code as if
they were object names abbreviated to 7 hexdigits.

> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> index 5f97dd6d65..f6bdfc13fd 100755
> --- a/t/t4013-diff-various.sh
> +++ b/t/t4013-diff-various.sh
> @@ -130,27 +130,45 @@ test_expect_success setup '
>  EOF
>  
>  process_diffs () {
> -	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
> -	_x07="$_x05[0-9a-f][0-9a-f]" &&
> -	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
> -	    -e "s/From $_x40 /From $ZERO_OID /" \
> -	    -e "s/from $_x40)/from $ZERO_OID)/" \
> -	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
> -	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
> -	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
> -	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
> -	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
> -	    -e "s/^$_x40 /$ZERO_OID /" \
> -	    -e "s/^$_x40$/$ZERO_OID/" \
> -	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
> -	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
> -	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
> -	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
> -	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
> -	    -e "s/$_x07\.\.\./fffffff.../g" \
> -	    -e "s/ $_x04\.\.\./ ffff.../g" \
> -	    -e "s/ $_x04/ ffff/g" \
> -	    "$1"
> +	perl -e '
> +		my $oid_length = length($ARGV[0]);
> +		my $x40 = "[0-9a-f]{40}";
> +		my $xab = "[0-9a-f]{4,16}";
> +		my $orx = "[0-9a-f]" x $oid_length;
> +
> +		sub munge_oid {
> +			my ($oid) = @_;
> +			my $x;
> +
> +			return "" unless length $oid;
> +
> +			if ($oid =~ /^(100644|100755|120000)$/) {
> +				return $oid;
> +			}
> +
> +			if ($oid =~ /^0*$/) {
> +				$x = "0";
> +			} else {
> +				$x = "f";
> +			}
> +
> +			if (length($oid) == 40) {
> +				return $x x $oid_length;
> +			} else {
> +				return $x x length($oid);
> +			}
> +		}
> +
> +		while (<STDIN>) {
> +			s/($orx)/munge_oid($1)/ge;
> +			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
> +			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
> +			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
> +			s/^($x40)($| )/munge_oid($1) . $2/e;
> +			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
> +			print;
> +		}
> +	' "$ZERO_OID" <"$1"
>  }
>  
>  V=$(git version | sed -e 's/^git version //' -e 's/\./\\./g')

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-20 12:35   ` [PATCH v4 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
@ 2020-08-20 19:58     ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2020-08-20 19:58 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: git, SZEDER Gábor, brian m . carlson, Jeff King

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> A handful of Git's commands respect `--abbrev' for customizing length
> of abbreviation of object names.
>
> For diff-family, Git supports 2 different options for 2 different
> purposes, `--full-index' for showing diff-patch object's name in full,
> and `--abbrev' to customize the length of object names in diff-raw and
> diff-tree header lines, without any options to customise the length of
> object names in diff-patch format. When working with diff-patch format,
> we only have two options, either full index, or default abbrev length.

Correct.


> Although, that consistent is documented, it doesn't stop users from

I am not sure what you meant by the word "consistent" here.
Dropping the word altogether makes the sentence work just fine, but
it may not be what you wanted to say, so I dunno.

> trying to use `--abbrev' with the hope of customising diff-patch's
> objects' name's abbreviation.

OK.

> Let's resolve that inconsistency.

I am not sure if this should even be called "inconsistency".  We
have two things to control, (1) length of abbreviated object names
in the "raw" output and (2) length of abbreviated object names on
the "index" line, and they can be independently controlled with
different options.  What this patch does is, as the first paragraph
explained, the latter does not take an arbitrary length as end users
may expect.

	Let's allow the blob object names shown on the "index" line
	to be abbreviated to arbitrary length given via the "--abbrev"
	option.

perhaps?

> To preserve backward compatibility with old script that specify both
> `--full-index' and `--abbrev', always shows full object id
> if `--full-index' is specified.

s/shows/show/ but otherwise good.

> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index b7af973d9c..1bb897d665 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -446,10 +446,11 @@ endif::git-format-patch[]
>  --abbrev[=<n>]::
>  	Instead of showing the full 40-byte hexadecimal object
>  	name in diff-raw format output and diff-tree header
> -	lines, show only a partial prefix.  This is
> -	independent of the `--full-index` option above, which controls
> -	the diff-patch output format.  Non default number of
> -	digits can be specified with `--abbrev=<n>`.
> +	lines, show only a partial prefix.
> +	In diff-patch output format, `--full-index` takes higher
> +	precedent, i.e. if `--full-index` is specified, full blob

"precedence", I think.

> +	names will be shown regardless of `--abbrev`.
> +	Non default number of digits can be specified with `--abbrev=<n>`.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v5 0/2] diff: index-line: respect --abbrev in object's name
  2020-08-09  2:19 [RFC PATCH 0/2] extend --abbrev support to diff-patch format Đoàn Trần Công Danh
                   ` (5 preceding siblings ...)
  2020-08-20 12:35 ` [PATCH v4 0/2] " Đoàn Trần Công Danh
@ 2020-08-21 11:51 ` Đoàn Trần Công Danh
  2020-08-21 11:51   ` [PATCH v5 1/2] t4013: improve diff-post-processor logic Đoàn Trần Công Danh
  2020-08-21 11:51   ` [PATCH v5 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
  6 siblings, 2 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-21 11:51 UTC (permalink / raw)
  To: git
  Cc: Đoàn Trần Công Danh, SZEDER Gábor,
	brian m . carlson, Junio C Hamano, Jeff King

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that consistent is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's resolve that inconsistency.


brian m. carlson (1):
  t4013: improve diff-post-processor logic

Đoàn Trần Công Danh (1):
  diff: index-line: respect --abbrev in object's name

 Documentation/diff-options.txt                |  9 +--
 diff.c                                        |  5 +-
 t/t4013-diff-various.sh                       | 63 ++++++++++++-------
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++
 6 files changed, 138 insertions(+), 26 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

Range-diff against v4:
1:  a52d0e59ec ! 1:  d7e0f13eba t4013: improve diff-post-processor logic
    @@ Commit message
         algorithm.
     
         However, we forgot to special case all-zero object names, which is
    -    special for not-exist objects, in consequence, we could't catch
    +    special for missing objects, in consequence, we could't catch
         possible future bugs where object names is all-zeros including but
    -    not limiting to:
    +    not limited to:
         * show intend-to-add entry
         * deleted entry
         * diff between index and working tree with new file
     
    -    In addition, in the incoming change, we would like to test for
    -    diff with 10 characters index, which is also not covered by current
    -    diff-processor logic.
    +    We also mistakenly munged file-modes as if they were object names
    +    abbreviated to 6 hexadecimal digits.
     
    -    Let's fix the bug for all-zero object names and extend object names'
    -    abbrev to 16 while we're at it.
    +    In addition, in the upcoming change, we would like to test for
    +    customizing the length of abbreviated blob objects on the index line,
    +    which is not supported by current diff-processor logic.
    +
    +    Let's fix the bug for all-zero object names, and file modes.
    +    While we're at it, support abbreviation of object names up to 16 bytes.
     
         Based-on-patch-by: Junio C Hamano <gitster@pobox.com>
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
2:  3cec490500 ! 2:  a9eb73ceab diff: index-line: respect --abbrev in object's name
    @@ Commit message
         object names in diff-patch format. When working with diff-patch format,
         we only have two options, either full index, or default abbrev length.
     
    -    Although, that consistent is documented, it doesn't stop users from
    +    Although, that behaviour is documented, it doesn't stop users from
         trying to use `--abbrev' with the hope of customising diff-patch's
         objects' name's abbreviation.
     
    -    Let's resolve that inconsistency.
    +    Let's allow the blob object names shown on the "index" line to be
    +    abbreviated to arbitrary length given via the "--abbrev" option.
     
         To preserve backward compatibility with old script that specify both
    -    `--full-index' and `--abbrev', always shows full object id
    +    `--full-index' and `--abbrev', always show full object id
         if `--full-index' is specified.
     
         Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
    @@ Documentation/diff-options.txt: endif::git-format-patch[]
     -	digits can be specified with `--abbrev=<n>`.
     +	lines, show only a partial prefix.
     +	In diff-patch output format, `--full-index` takes higher
    -+	precedent, i.e. if `--full-index` is specified, full blob
    ++	precedence, i.e. if `--full-index` is specified, full blob
     +	names will be shown regardless of `--abbrev`.
     +	Non default number of digits can be specified with `--abbrev=<n>`.
      
-- 
2.28.0.143.g760df7782d.dirty


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v5 1/2] t4013: improve diff-post-processor logic
  2020-08-21 11:51 ` [PATCH v5 0/2] " Đoàn Trần Công Danh
@ 2020-08-21 11:51   ` Đoàn Trần Công Danh
  2020-08-21 11:51   ` [PATCH v5 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
  1 sibling, 0 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-21 11:51 UTC (permalink / raw)
  To: git
  Cc: brian m. carlson, SZEDER Gábor, Junio C Hamano, Jeff King,
	Đoàn Trần Công Danh

From: "brian m. carlson" <sandals@crustytoothpaste.net>

From 72f936b1 (t4013: make test hash independent, 2020-02-07),
we started to adjust metadata of git-diff's output in order to
ignore uninteresting metadata which is dependent of underlying hash
algorithm.

However, we forgot to special case all-zero object names, which is
special for missing objects, in consequence, we could't catch
possible future bugs where object names is all-zeros including but
not limited to:
* show intend-to-add entry
* deleted entry
* diff between index and working tree with new file

We also mistakenly munged file-modes as if they were object names
abbreviated to 6 hexadecimal digits.

In addition, in the upcoming change, we would like to test for
customizing the length of abbreviated blob objects on the index line,
which is not supported by current diff-processor logic.

Let's fix the bug for all-zero object names, and file modes.
While we're at it, support abbreviation of object names up to 16 bytes.

Based-on-patch-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 t/t4013-diff-various.sh | 60 ++++++++++++++++++++++++++---------------
 1 file changed, 39 insertions(+), 21 deletions(-)

diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 5f97dd6d65..f6bdfc13fd 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -130,27 +130,45 @@ test_expect_success setup '
 EOF
 
 process_diffs () {
-	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
-	_x07="$_x05[0-9a-f][0-9a-f]" &&
-	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
-	    -e "s/From $_x40 /From $ZERO_OID /" \
-	    -e "s/from $_x40)/from $ZERO_OID)/" \
-	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
-	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
-	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
-	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
-	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
-	    -e "s/^$_x40 /$ZERO_OID /" \
-	    -e "s/^$_x40$/$ZERO_OID/" \
-	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
-	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
-	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
-	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
-	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
-	    -e "s/$_x07\.\.\./fffffff.../g" \
-	    -e "s/ $_x04\.\.\./ ffff.../g" \
-	    -e "s/ $_x04/ ffff/g" \
-	    "$1"
+	perl -e '
+		my $oid_length = length($ARGV[0]);
+		my $x40 = "[0-9a-f]{40}";
+		my $xab = "[0-9a-f]{4,16}";
+		my $orx = "[0-9a-f]" x $oid_length;
+
+		sub munge_oid {
+			my ($oid) = @_;
+			my $x;
+
+			return "" unless length $oid;
+
+			if ($oid =~ /^(100644|100755|120000)$/) {
+				return $oid;
+			}
+
+			if ($oid =~ /^0*$/) {
+				$x = "0";
+			} else {
+				$x = "f";
+			}
+
+			if (length($oid) == 40) {
+				return $x x $oid_length;
+			} else {
+				return $x x length($oid);
+			}
+		}
+
+		while (<STDIN>) {
+			s/($orx)/munge_oid($1)/ge;
+			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
+			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
+			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
+			s/^($x40)($| )/munge_oid($1) . $2/e;
+			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
+			print;
+		}
+	' "$ZERO_OID" <"$1"
 }
 
 V=$(git version | sed -e 's/^git version //' -e 's/\./\\./g')
-- 
2.28.0.143.g760df7782d.dirty


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v5 2/2] diff: index-line: respect --abbrev in object's name
  2020-08-21 11:51 ` [PATCH v5 0/2] " Đoàn Trần Công Danh
  2020-08-21 11:51   ` [PATCH v5 1/2] t4013: improve diff-post-processor logic Đoàn Trần Công Danh
@ 2020-08-21 11:51   ` Đoàn Trần Công Danh
  1 sibling, 0 replies; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-21 11:51 UTC (permalink / raw)
  To: git
  Cc: Đoàn Trần Công Danh, SZEDER Gábor,
	brian m . carlson, Junio C Hamano, Jeff King

A handful of Git's commands respect `--abbrev' for customizing length
of abbreviation of object names.

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that behaviour is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's allow the blob object names shown on the "index" line to be
abbreviated to arbitrary length given via the "--abbrev" option.

To preserve backward compatibility with old script that specify both
`--full-index' and `--abbrev', always show full object id
if `--full-index' is specified.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 Documentation/diff-options.txt                |  9 +++---
 diff.c                                        |  5 +++-
 t/t4013-diff-various.sh                       |  3 ++
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
 6 files changed, 99 insertions(+), 5 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index b7af973d9c..573fb9bb71 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -446,10 +446,11 @@ endif::git-format-patch[]
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
 	name in diff-raw format output and diff-tree header
-	lines, show only a partial prefix.  This is
-	independent of the `--full-index` option above, which controls
-	the diff-patch output format.  Non default number of
-	digits can be specified with `--abbrev=<n>`.
+	lines, show only a partial prefix.
+	In diff-patch output format, `--full-index` takes higher
+	precedence, i.e. if `--full-index` is specified, full blob
+	names will be shown regardless of `--abbrev`.
+	Non default number of digits can be specified with `--abbrev=<n>`.
 
 -B[<n>][/<m>]::
 --break-rewrites[=[<n>][/<m>]]::
diff --git a/diff.c b/diff.c
index f9709de7b4..20dedfe2a9 100644
--- a/diff.c
+++ b/diff.c
@@ -4319,7 +4319,10 @@ static void fill_metainfo(struct strbuf *msg,
 	}
 	if (one && two && !oideq(&one->oid, &two->oid)) {
 		const unsigned hexsz = the_hash_algo->hexsz;
-		int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
+		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
+
+		if (o->flags.full_index)
+			abbrev = hexsz;
 
 		if (o->flags.binary) {
 			mmfile_t mf;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index f6bdfc13fd..5c7b0122b4 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -239,6 +239,9 @@ diff-tree --root -r --abbrev=4 initial
 :noellipses diff-tree --root -r --abbrev=4 initial
 diff-tree -p initial
 diff-tree --root -p initial
+diff-tree --root -p --abbrev=10 initial
+diff-tree --root -p --full-index initial
+diff-tree --root -p --full-index --abbrev=10 initial
 diff-tree --patch-with-stat initial
 diff-tree --root --patch-with-stat initial
 diff-tree --patch-with-raw initial
diff --git a/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
new file mode 100644
index 0000000000..7518a9044e
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000..35d242ba79
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
new file mode 100644
index 0000000000..69f913fbe5
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --full-index --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
new file mode 100644
index 0000000000..1b0b6717fa
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
@@ -0,0 +1,29 @@
+$ git diff-tree --root -p --full-index initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
-- 
2.28.0.143.g760df7782d.dirty


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 1/2] t4013: improve diff-post-processor logic
  2020-08-20 19:49     ` Junio C Hamano
@ 2020-08-21 12:05       ` Đoàn Trần Công Danh
  2020-08-21 15:44         ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Đoàn Trần Công Danh @ 2020-08-21 12:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, brian m. carlson, SZEDER Gábor, Jeff King

On 2020-08-20 12:49:05-0700, Junio C Hamano <gitster@pobox.com> wrote:
> > diff-processor logic.
> >
> > Let's fix the bug for all-zero object names and extend object names'
> > abbrev to 16 while we're at it.
> 
> "and support abbreviation of object names up to 16 bytes"?
> 
> Also while we are at it, we fixed the post-processing not to touch
> the file modes, which were mistakenly munged in the older code as if
> they were object names abbreviated to 7 hexdigits.

I've integrated your suggestion into newest series.
I think you meant 6 hex-digits here, and I took the liberty to change
to 7 digits.

Thanks,

-- 
Danh

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 1/2] t4013: improve diff-post-processor logic
  2020-08-21 12:05       ` Đoàn Trần Công Danh
@ 2020-08-21 15:44         ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2020-08-21 15:44 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: git, brian m. carlson, SZEDER Gábor, Jeff King

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> On 2020-08-20 12:49:05-0700, Junio C Hamano <gitster@pobox.com> wrote:
>> > diff-processor logic.
>> >
>> > Let's fix the bug for all-zero object names and extend object names'
>> > abbrev to 16 while we're at it.
>> 
>> "and support abbreviation of object names up to 16 bytes"?
>> 
>> Also while we are at it, we fixed the post-processing not to touch
>> the file modes, which were mistakenly munged in the older code as if
>> they were object names abbreviated to 7 hexdigits.
>
> I've integrated your suggestion into newest series.
> I think you meant 6 hex-digits here, and I took the liberty to change
> to 7 digits.

Thanks, yeah, 100644 has 6 digits but I somehow couldn't count ;-)

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2020-08-21 15:44 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-09  2:19 [RFC PATCH 0/2] extend --abbrev support to diff-patch format Đoàn Trần Công Danh
2020-08-09  2:19 ` [RFC PATCH 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
2020-08-09  2:19 ` [RFC PATCH 2/2] diff: extend --abbrev support to diff-patch format Đoàn Trần Công Danh
2020-08-09 19:01 ` [RFC PATCH 0/2] " Junio C Hamano
2020-08-10 10:00   ` Jeff King
2020-08-10 12:31     ` Đoàn Trần Công Danh
2020-08-10 15:15       ` Junio C Hamano
2020-08-10 15:27         ` Jeff King
2020-08-11  0:33           ` Đoàn Trần Công Danh
2020-08-11  5:22             ` Jeff King
2020-08-11 12:07               ` Đoàn Trần Công Danh
2020-08-10 15:06     ` Junio C Hamano
2020-08-11 11:49 ` [PATCH v2 0/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
2020-08-11 11:49   ` [PATCH v2 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
2020-08-11 18:54     ` Junio C Hamano
2020-08-11 11:49   ` [PATCH v2 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
2020-08-14  0:23 ` [PATCH v3 0/2] " Đoàn Trần Công Danh
2020-08-14  0:23   ` [PATCH v3 1/2] revision: differentiate if --no-abbrev asked explicitly Đoàn Trần Công Danh
2020-08-14  0:50     ` Junio C Hamano
2020-08-14  0:59       ` Đoàn Trần Công Danh
2020-08-14  1:06         ` Junio C Hamano
2020-08-14 14:50           ` Đoàn Trần Công Danh
2020-08-19 22:50             ` Junio C Hamano
2020-08-14  0:23   ` [PATCH v3 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
2020-08-14 15:18     ` SZEDER Gábor
2020-08-14 17:00       ` Junio C Hamano
2020-08-14 18:59         ` Junio C Hamano
2020-08-15  0:21           ` brian m. carlson
2020-08-15  2:27             ` Đoàn Trần Công Danh
2020-08-17 16:17               ` Junio C Hamano
2020-08-20  4:56               ` Junio C Hamano
2020-08-20 12:35 ` [PATCH v4 0/2] " Đoàn Trần Công Danh
2020-08-20 12:35   ` [PATCH v4 1/2] t4013: improve diff-post-processor logic Đoàn Trần Công Danh
2020-08-20 19:49     ` Junio C Hamano
2020-08-21 12:05       ` Đoàn Trần Công Danh
2020-08-21 15:44         ` Junio C Hamano
2020-08-20 12:35   ` [PATCH v4 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh
2020-08-20 19:58     ` Junio C Hamano
2020-08-21 11:51 ` [PATCH v5 0/2] " Đoàn Trần Công Danh
2020-08-21 11:51   ` [PATCH v5 1/2] t4013: improve diff-post-processor logic Đoàn Trần Công Danh
2020-08-21 11:51   ` [PATCH v5 2/2] diff: index-line: respect --abbrev in object's name Đoàn Trần Công Danh

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).