git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Phillip Wood <phillip.wood123@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
	Taylor Blau <me@ttaylorr.com>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	Thomas Guyot-Sionnest <tguyot@gmail.com>
Subject: Re: [PATCH v2 4/4] diff --no-index: support reading from named pipes
Date: Wed, 05 Jul 2023 15:19:01 -0700	[thread overview]
Message-ID: <xmqq1qhmvw6i.fsf@gitster.g> (raw)
In-Reply-To: <4e05a0be54f66f2b394642762832e426a545426c.1688586536.git.phillip.wood@dunelm.org.uk> (Phillip Wood's message of "Wed, 5 Jul 2023 20:49:30 +0100")

Phillip Wood <phillip.wood123@gmail.com> writes:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> In some shells, such as bash and zsh, it's possible to use a command
> substitution to provide the output of a command as a file argument to
> another process, like so:
>
>   diff -u <(printf "a\nb\n") <(printf "a\nc\n")

"command substitution" -> "process substitution", I think, are the
words found in the manual pages of both shells cited as examples.

> However, this syntax does not produce useful results with "git diff
> --no-index". On macOS, the arguments to the command are named pipes
> under /dev/fd, and git diff doesn't know how to handle a named pipe. On
> Linux, the arguments are symlinks to pipes, so git diff "helpfully"
> diffs these symlinks, comparing their targets like "pipe:[1234]" and
> "pipe:[5678]".

Concisely and clearly described.  Very nice.

> To address this "diff --no-index" is changed so that if a path given on

"this" -> "this,", probably.

> the commandline is a named pipe or a symbolic link that resolves to a
> named pipe then we read the data to diff from that pipe. This is
> implemented by generalizing the code that already exists to handle
> reading from stdin when the user passes the path "-".

Great.

> If the user tries to compare a named pipe to a directory then we die as
> we do when trying to compare stdin to a directory.
>
> As process substitution is not support by POSIX this change is tested by
> using a pipe and a symbolic link to a pipe.

People who are interested to more directly test the intended use
case can add a test "diff --no-index <(one) <(two)" under some
prerequisite (like BASH).  The test on the underlying mechansim like
this patch does is just fine in the meantime.

> diff --git a/diff-no-index.c b/diff-no-index.c
> index 4470e0271d..4771cf02aa 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -41,74 +41,119 @@ static int read_directory_contents(const char *path, struct string_list *list)
>   */
>  static const char file_from_standard_input[] = "-";
>  
> -static int get_mode(const char *path, int *mode)
> +/*
> + * For paths given on the command-line we treat "-" as stdin and named
> + * pipes and symbolic links to named pipes specially.
> + */
> +enum special {
> +	SPECIAL_NONE,
> +	SPECIAL_STDIN,
> +	SPECIAL_PIPE,
> +};
> +
> +static int get_mode(const char *path, int *mode, enum special *special)
>  {
>  	struct stat st;
>  
> -	if (!path || !strcmp(path, "/dev/null"))
> +	if (!path || !strcmp(path, "/dev/null")) {
>  		*mode = 0;
>  #ifdef GIT_WINDOWS_NATIVE
> -	else if (!strcasecmp(path, "nul"))
> +	} else if (!strcasecmp(path, "nul")) {
>  		*mode = 0;
>  #endif
> -	else if (path == file_from_standard_input)
> +	} else if (path == file_from_standard_input) {
>  		*mode = create_ce_mode(0666);
> -	else if (lstat(path, &st))
> +		*special = SPECIAL_STDIN;
> +	} else if (lstat(path, &st)) {
>  		return error("Could not access '%s'", path);
> -	else
> +	} else {
>  		*mode = st.st_mode;
> +	}
> +	/*
> +	 * For paths on the command-line treat named pipes and symbolic
> +	 * links that resolve to a named pipe specially.
> +	 */

Excellent.  The code structure is very clear.  We do what we have
already done, and in addition, treat pipes and symlinks to pipes
specially.

> +	if (special &&
> +	    (S_ISFIFO(*mode) ||
> +	     (S_ISLNK(*mode) && !stat(path, &st) && S_ISFIFO(st.st_mode)))) {
> +		*mode = create_ce_mode(0666);
> +		*special = SPECIAL_PIPE;
> +	}

What should we do when !stat(path, &st) fails here?  It may be a
dangling symbolic link, causes stat() to fail and we want to be
silent about it without changing *mode.  And that is exactly what
the code does.  Great.

Under what circumstances does a caller pass NULL to the special?  If
we were given a fifo and the caller passes NULL to the special,
*mode at this point would be S_ISFIFO() and not 0666.  It may be
what we want, but I do not immediately see why (I'll realize why
while reading the "recursing?" bit below).

> -static void populate_from_stdin(struct diff_filespec *s)
> +static void populate_common(struct diff_filespec *s, struct strbuf *buf)
>  {
> -	struct strbuf buf = STRBUF_INIT;
>  	size_t size = 0;

Not a new issue, but this initialization assignment is useless as
strbuf_detach() would always overwrite it (or die and would not come
back to us).

> -	if (strbuf_read(&buf, 0, 0) < 0)
> -		die_errno("error while reading from stdin");
> -
>  	s->should_munmap = 0;
> -	s->data = strbuf_detach(&buf, &size);
> +	s->data = strbuf_detach(buf, &size);
>  	s->size = size;

Again not a new issue, but it is unfortunate that s->size is not
size_t but ulong, which requires us to use a temporary variable of
the correct type, only to assign to it.

>  	s->should_free = 1;
>  	s->is_stdin = 1;
>  }
>  
> -static struct diff_filespec *noindex_filespec(const char *name, int mode)
> +static void populate_from_pipe(struct diff_filespec *s)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +	int fd = xopen(s->path, O_RDONLY);
> +
> +	if (strbuf_read(&buf, fd, 0) < 0)
> +		die_errno("error while reading from '%s'", s->path);
> +	close(fd);
> +	populate_common(s, &buf);
> +}
> +
> +static void populate_from_stdin(struct diff_filespec *s)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +
> +	if (strbuf_read(&buf, 0, 0) < 0)
> +		die_errno("error while reading from stdin");
> +	populate_common(s, &buf);
> +}

Both look OK.  I would have named "_common" -> "_finalize" or
something to signal that this is the shared "tail" part of the two
callers, but it is minor.

> +static struct diff_filespec *noindex_filespec(const char *name, int mode,
> +					      enum special special)
>  {
>  	struct diff_filespec *s;
>  
>  	if (!name)
>  		name = "/dev/null";
>  	s = alloc_filespec(name);
>  	fill_filespec(s, null_oid(), 0, mode);
> -	if (name == file_from_standard_input)
> +	if (special == SPECIAL_STDIN)
>  		populate_from_stdin(s);
> +	else if (special == SPECIAL_PIPE)
> +		populate_from_pipe(s);
>  	return s;
>  }

Great.

>  static int queue_diff(struct diff_options *o,
> -		      const char *name1, const char *name2)
> +		      const char *name1, const char *name2, int recursing)
>  {
>  	int mode1 = 0, mode2 = 0;
> +	enum special special1 = SPECIAL_NONE, special2 = SPECIAL_NONE;
>  
> -	if (get_mode(name1, &mode1) || get_mode(name2, &mode2))
> +	/* Paths can only be special if we're not recursing. */
> +	if (get_mode(name1, &mode1, recursing ? NULL : &special1) ||
> +	    get_mode(name2, &mode2, recursing ? NULL : &special2))
>  		return -1;

Ahh, OK.  If we are told to "diff --no-index D1 D2", even if we find
a fifo or symlink to a fifo, they aren't "<(foo)" process
substitutions.  Makes sense.  And because we do not handle FIFO in
such a case, we'd leave *mode (in get_mode()) to a value that we
never use (i.e. S_ISFIFO(*mode) is true); hopefully later part of
the control flow somewhere we will notice it and die?

>  	if (mode1 && mode2 && S_ISDIR(mode1) != S_ISDIR(mode2)) {
>  		struct diff_filespec *d1, *d2;
>  
>  		if (S_ISDIR(mode1)) {
>  			/* 2 is file that is created */
> -			d1 = noindex_filespec(NULL, 0);
> -			d2 = noindex_filespec(name2, mode2);
> +			d1 = noindex_filespec(NULL, 0, SPECIAL_NONE);
> +			d2 = noindex_filespec(name2, mode2, special2);
>  			name2 = NULL;
>  			mode2 = 0;
>  		} else {
>  			/* 1 is file that is deleted */
> -			d1 = noindex_filespec(name1, mode1);
> -			d2 = noindex_filespec(NULL, 0);
> +			d1 = noindex_filespec(name1, mode1, special1);
> +			d2 = noindex_filespec(NULL, 0, SPECIAL_NONE);
>  			name1 = NULL;
>  			mode1 = 0;
>  		}
> @@ -173,7 +218,7 @@ static int queue_diff(struct diff_options *o,
>  				n2 = buffer2.buf;
>  			}
>  
> -			ret = queue_diff(o, n1, n2);
> +			ret = queue_diff(o, n1, n2, 1);
>  		}
>  		string_list_clear(&p1, 0);
>  		string_list_clear(&p2, 0);
> @@ -189,8 +234,8 @@ static int queue_diff(struct diff_options *o,
>  			SWAP(name1, name2);
>  		}
>  
> -		d1 = noindex_filespec(name1, mode1);
> -		d2 = noindex_filespec(name2, mode2);
> +		d1 = noindex_filespec(name1, mode1, special1);
> +		d2 = noindex_filespec(name2, mode2, special2);
>  		diff_queue(&diff_queued_diff, d1, d2);
>  		return 0;
>  	}

Looks quite straight-forward.

> @@ -215,15 +260,27 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
>   */
>  static void fixup_paths(const char **path, struct strbuf *replacement)
>  {
> -	unsigned int isdir0, isdir1;
> -
> -	isdir0 = path[0] != file_from_standard_input && is_directory(path[0]);
> -	isdir1 = path[1] != file_from_standard_input && is_directory(path[1]);
> +	struct stat st;
> +	unsigned int isdir0 = 0, isdir1 = 0;
> +	unsigned int ispipe0 = 0, ispipe1 = 0;
> +
> +	if (path[0] != file_from_standard_input && !stat(path[0], &st)) {
> +		isdir0 = S_ISDIR(st.st_mode);
> +		ispipe0 = S_ISFIFO(st.st_mode);
> +	}
> +
> +	if (path[1] != file_from_standard_input && !stat(path[1], &st)) {
> +		isdir1 = S_ISDIR(st.st_mode);
> +		ispipe1 = S_ISFIFO(st.st_mode);
> +	}
>  
>  	if ((path[0] == file_from_standard_input && isdir1) ||
>  	    (isdir0 && path[1] == file_from_standard_input))
>  		die(_("cannot compare stdin to a directory"));
>  
> +	if ((isdir0 && ispipe1) || (ispipe0 && isdir1))
> +		die(_("cannot compare a named pipe to a directory"));
> +

Nice.


  reply	other threads:[~2023-07-05 22:19 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-27 14:10 [PATCH 0/3] diff --no-index: support reading from named pipes Phillip Wood
2023-06-27 14:10 ` [PATCH 1/3] diff --no-index: die on error reading stdin Phillip Wood
2023-06-27 14:10 ` [PATCH 2/3] t4054: test diff --no-index with stdin Phillip Wood
2023-06-27 14:10 ` [PATCH 3/3] diff --no-index: support reading from named pipes Phillip Wood
2023-06-27 19:44   ` Junio C Hamano
2023-06-28 10:05     ` Phillip Wood
2023-07-05 19:49 ` [PATCH v2 0/4] " Phillip Wood
2023-07-05 19:49   ` [PATCH v2 1/4] diff --no-index: refuse to compare stdin to a directory Phillip Wood
2023-07-05 21:17     ` Junio C Hamano
2023-07-05 19:49   ` [PATCH v2 2/4] diff --no-index: die on error reading stdin Phillip Wood
2023-07-05 21:18     ` Junio C Hamano
2023-07-05 19:49   ` [PATCH v2 3/4] t4054: test diff --no-index with stdin Phillip Wood
2023-07-05 21:22     ` Junio C Hamano
2023-07-05 19:49   ` [PATCH v2 4/4] diff --no-index: support reading from named pipes Phillip Wood
2023-07-05 22:19     ` Junio C Hamano [this message]
2023-08-09 17:17     ` Jeff King
2023-08-10 12:56       ` Phillip Wood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq1qhmvw6i.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=phillip.wood123@gmail.com \
    --cc=sandals@crustytoothpaste.net \
    --cc=tguyot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).