git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix
@ 2016-08-14  8:56 Johannes Schindelin
  2016-08-14 20:42 ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Schindelin @ 2016-08-14  8:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jacob Keller

The '>' character is not a legal part of filenames on Windows. So let's
just not use it in Git's source code. This poses a challenge in the test
script t4013 which distills command-lines into file names (so that the
expected outcome can be stored in files with said names).

We have to take particular care not to confound the existing conversion
of unwanted characters to underscores with the new substitution of '>':
the existing conversion chose to collapse runs of multiple unwanted
characters into a single underscore. If we allowed '>' to be collapsed,
too, the file name generated from the command "diff [...]=-- [...]"
would be identical to the one generated from "diff [...]=--> [...]".

Please squash this patch into
3c90ffd2f01e2d0d080c8e42df2ee89709b324de

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Published-As: https://github.com/dscho/git/releases/tag/mingw-t4013-v1
Fetch-It-Via: git fetch https://github.com/dscho/git mingw-t4013-v1

	For the record: this prevented my beautiful CI jobs from even
	checking out the source code for `pu` in the last days.

	Junio, please let me know if you would prefer this as a separate
	patch.

 t/t4013-diff-various.sh                                                 | 2 +-
 ...aster^_side => diff.diff_--diff-line-prefix=--__master_master^_side} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename t/t4013/{diff.diff_--diff-line-prefix=-->_master_master^_side => diff.diff_--diff-line-prefix=--__master_master^_side} (100%)

diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 5204645..84e2ee0 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -115,7 +115,7 @@ do
 	case "$cmd" in
 	'' | '#'*) continue ;;
 	esac
-	test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g')
+	test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g' -e 'y/>/_/')
 	pfx=$(printf "%04d" $test_count)
 	expect="$TEST_DIRECTORY/t4013/diff.$test"
 	actual="$pfx-diff.$test"
diff --git a/t/t4013/diff.diff_--diff-line-prefix=-->_master_master^_side b/t/t4013/diff.diff_--diff-line-prefix=--__master_master^_side
similarity index 100%
rename from t/t4013/diff.diff_--diff-line-prefix=-->_master_master^_side
rename to t/t4013/diff.diff_--diff-line-prefix=--__master_master^_side
-- 
2.9.2.691.g78954f3

base-commit: 945e149951a152207b56d5e49ff5167d151a4c89

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix
  2016-08-14  8:56 [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix Johannes Schindelin
@ 2016-08-14 20:42 ` Junio C Hamano
  2016-08-15 14:07   ` Johannes Schindelin
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2016-08-14 20:42 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Jacob Keller

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> -	test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g')
> +	test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g' -e 'y/>/_/')

The existing sed scriptlet says "we cannot have slash and do not
want to have space in filename, so we squash runs of them to a
single underscore".  If you have more characters that you do not
want, you should add that to the existing set instead.

While you are at it, it may be sensible to add a colon to that set,
too, no?

Something like this, perhaps?

> -	test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g')
> +	test=$(echo "$cmd" | sed -e 's|[/ <:>][/ <:>]*|_|g')

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix
  2016-08-14 20:42 ` Junio C Hamano
@ 2016-08-15 14:07   ` Johannes Schindelin
  2016-08-15 16:20     ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Schindelin @ 2016-08-15 14:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jacob Keller

Hi Junio,

On Sun, 14 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > -	test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g')
> > +	test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g' -e 'y/>/_/')
> 
> The existing sed scriptlet says "we cannot have slash and do not
> want to have space in filename, so we squash runs of them to a
> single underscore".  If you have more characters that you do not
> want, you should add that to the existing set instead.

No, we cannot do that. I even mentioned it in my commit message why:

	We have to take particular care not to confound the existing
	conversion of unwanted characters to underscores with the new
	substitution of '>': the existing conversion chose to collapse
	runs of multiple unwanted characters into a single underscore. If
	we allowed '>' to be collapsed, too, the file name generated from
	the command "diff [...]=-- [...]" would be identical to the one
	generated from "diff [...]=--> [...]".

And as there is exactly this case (two command-lines, differing only in
the '>' character), doing what you suggested would *break* the test since
it would now look at the wrong file.

I know that this is so because my first iteration of the patch did exactly
what you suggested.

> While you are at it, it may be sensible to add a colon to that set, too,
> no?
> 
> Something like this, perhaps?
> 
> > -	test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g')
> > +	test=$(echo "$cmd" | sed -e 's|[/ <:>][/ <:>]*|_|g')

Maybe. But what other characters are missing? Those are not the only ones
illegal on Windows: [/ \0-\x1f"*:<>?\\|] would be a more complete version
of what you suggested. Except that it is not a valid basic regex.

But is that really necessary?

I think not, for the following reasons:

- my patch was specifically designed to stop my CI from pestering me about
  a totally broken revision that cannot even be checked out (and causes
  subsequent problems because of it),

- as such, my patch was meant not to be an all-encompassing solution to
  the problem of filenames that are illegal on Windows, but really a tiny
  patch that you could apply as quickly as possible so that my CI jobs
  would stop pestering me (which they did not, because `pu` is still
  broken),

- I even meant this little patch to be so small that it can be easily
  squashed into Jacob's patch,

- I do not want to complicate regular expressions unless *really* needed
  (and you have to admit that we do not need to address any more
  characters than the '>' *right now*), as

	- regular expressions are not exactly an easy meal, so it makes
	  sense to keep them as simple as possible both for contributors'
	  as well as for maintainers' sake, and

	- when trying to come up with a super-complete solution, it is
	  really easy not only to spend waaaaay too much time on it, but
	  also to introduce bugs that are not spotted for a loooong time
	  because nothing actually exercises the newly introduced code, and

	- If It Ain't Broke, Don't Fix It.

- the broader solution must come separately, and not as a mere add-on to
  one test case: we really need to ensure that such file names do not
  enter Git's source code.

I will send my proposal to address the larger issue in a moment, in the
meantime I *beg* you to add this here patch as a quick fix to my CI woes,
either by squashing it into the indicated commit, or by appending it to
the topic branch. I do not care which one, as long as `pu` gets fixed.

Thanks,
Dscho



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix
  2016-08-15 14:07   ` Johannes Schindelin
@ 2016-08-15 16:20     ` Junio C Hamano
  2016-08-16 15:39       ` Johannes Schindelin
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2016-08-15 16:20 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Jacob Keller

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Sun, 14 Aug 2016, Junio C Hamano wrote:
>
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>> 
>> > -	test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g')
>> > +	test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g' -e 'y/>/_/')
>> ...
> I know that this is so because my first iteration of the patch did exactly
> what you suggested.

You probably should have stepped back and taken a deep breath before
writing the second round.  Doing so after writing it before sending
would also have been OK.

Among three characters that are special cased here, the problem "if
we squish a run of problematic characters into one underscore, we
risk making the result ambiguous" is NOT limited to '>'; it is not a
new problem with '>', either.  I can think of two other possible
solutions offhand:

 (1) drop "squishing a run", i.e. [/ ]*, from the regexp, and rename
     existing test vectors that would be affected;

 (2) change the string used in the offending test so that squishing
     will not make the result ambiguous.

before special casing "y/>/_/"; as there is nothing in ">" that
inherently causes the ambiguity that won't be caused by "/" or " ",
adding second clause to the sed expression that does things
differently only for ">" is simply wrong.  After all, you only
wanted to affect the "prefix=-->" test and not the whole set of the
tests in the script.

Obviously (1) is a lot of impact with little gain, and as Jacob
already offered to do, I think (2) is a lot more sensible solution
and it also is more in line with your "If it isn't broken, do not
fix it", I would say.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix
  2016-08-15 16:20     ` Junio C Hamano
@ 2016-08-16 15:39       ` Johannes Schindelin
  0 siblings, 0 replies; 5+ messages in thread
From: Johannes Schindelin @ 2016-08-16 15:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jacob Keller

Hi Junio,

On Mon, 15 Aug 2016, Junio C Hamano wrote:

> Obviously (1) is a lot of impact with little gain, and as Jacob
> already offered to do, I think (2) is a lot more sensible solution
> and it also is more in line with your "If it isn't broken, do not
> fix it", I would say.

Yep, that makes most sense to me, too.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-08-16 15:39 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-14  8:56 [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix Johannes Schindelin
2016-08-14 20:42 ` Junio C Hamano
2016-08-15 14:07   ` Johannes Schindelin
2016-08-15 16:20     ` Junio C Hamano
2016-08-16 15:39       ` Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).