From: Junio C Hamano <gitster@pobox.com>
To: Phillip Wood <phillip.wood123@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
Taylor Blau <me@ttaylorr.com>,
"brian m. carlson" <sandals@crustytoothpaste.net>,
Thomas Guyot-Sionnest <tguyot@gmail.com>
Subject: Re: [PATCH v2 4/4] diff --no-index: support reading from named pipes
Date: Wed, 05 Jul 2023 15:19:01 -0700 [thread overview]
Message-ID: <xmqq1qhmvw6i.fsf@gitster.g> (raw)
In-Reply-To: <4e05a0be54f66f2b394642762832e426a545426c.1688586536.git.phillip.wood@dunelm.org.uk> (Phillip Wood's message of "Wed, 5 Jul 2023 20:49:30 +0100")
Phillip Wood <phillip.wood123@gmail.com> writes:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> In some shells, such as bash and zsh, it's possible to use a command
> substitution to provide the output of a command as a file argument to
> another process, like so:
>
> diff -u <(printf "a\nb\n") <(printf "a\nc\n")
"command substitution" -> "process substitution", I think, are the
words found in the manual pages of both shells cited as examples.
> However, this syntax does not produce useful results with "git diff
> --no-index". On macOS, the arguments to the command are named pipes
> under /dev/fd, and git diff doesn't know how to handle a named pipe. On
> Linux, the arguments are symlinks to pipes, so git diff "helpfully"
> diffs these symlinks, comparing their targets like "pipe:[1234]" and
> "pipe:[5678]".
Concisely and clearly described. Very nice.
> To address this "diff --no-index" is changed so that if a path given on
"this" -> "this,", probably.
> the commandline is a named pipe or a symbolic link that resolves to a
> named pipe then we read the data to diff from that pipe. This is
> implemented by generalizing the code that already exists to handle
> reading from stdin when the user passes the path "-".
Great.
> If the user tries to compare a named pipe to a directory then we die as
> we do when trying to compare stdin to a directory.
>
> As process substitution is not support by POSIX this change is tested by
> using a pipe and a symbolic link to a pipe.
People who are interested to more directly test the intended use
case can add a test "diff --no-index <(one) <(two)" under some
prerequisite (like BASH). The test on the underlying mechansim like
this patch does is just fine in the meantime.
> diff --git a/diff-no-index.c b/diff-no-index.c
> index 4470e0271d..4771cf02aa 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -41,74 +41,119 @@ static int read_directory_contents(const char *path, struct string_list *list)
> */
> static const char file_from_standard_input[] = "-";
>
> -static int get_mode(const char *path, int *mode)
> +/*
> + * For paths given on the command-line we treat "-" as stdin and named
> + * pipes and symbolic links to named pipes specially.
> + */
> +enum special {
> + SPECIAL_NONE,
> + SPECIAL_STDIN,
> + SPECIAL_PIPE,
> +};
> +
> +static int get_mode(const char *path, int *mode, enum special *special)
> {
> struct stat st;
>
> - if (!path || !strcmp(path, "/dev/null"))
> + if (!path || !strcmp(path, "/dev/null")) {
> *mode = 0;
> #ifdef GIT_WINDOWS_NATIVE
> - else if (!strcasecmp(path, "nul"))
> + } else if (!strcasecmp(path, "nul")) {
> *mode = 0;
> #endif
> - else if (path == file_from_standard_input)
> + } else if (path == file_from_standard_input) {
> *mode = create_ce_mode(0666);
> - else if (lstat(path, &st))
> + *special = SPECIAL_STDIN;
> + } else if (lstat(path, &st)) {
> return error("Could not access '%s'", path);
> - else
> + } else {
> *mode = st.st_mode;
> + }
> + /*
> + * For paths on the command-line treat named pipes and symbolic
> + * links that resolve to a named pipe specially.
> + */
Excellent. The code structure is very clear. We do what we have
already done, and in addition, treat pipes and symlinks to pipes
specially.
> + if (special &&
> + (S_ISFIFO(*mode) ||
> + (S_ISLNK(*mode) && !stat(path, &st) && S_ISFIFO(st.st_mode)))) {
> + *mode = create_ce_mode(0666);
> + *special = SPECIAL_PIPE;
> + }
What should we do when !stat(path, &st) fails here? It may be a
dangling symbolic link, causes stat() to fail and we want to be
silent about it without changing *mode. And that is exactly what
the code does. Great.
Under what circumstances does a caller pass NULL to the special? If
we were given a fifo and the caller passes NULL to the special,
*mode at this point would be S_ISFIFO() and not 0666. It may be
what we want, but I do not immediately see why (I'll realize why
while reading the "recursing?" bit below).
> -static void populate_from_stdin(struct diff_filespec *s)
> +static void populate_common(struct diff_filespec *s, struct strbuf *buf)
> {
> - struct strbuf buf = STRBUF_INIT;
> size_t size = 0;
Not a new issue, but this initialization assignment is useless as
strbuf_detach() would always overwrite it (or die and would not come
back to us).
> - if (strbuf_read(&buf, 0, 0) < 0)
> - die_errno("error while reading from stdin");
> -
> s->should_munmap = 0;
> - s->data = strbuf_detach(&buf, &size);
> + s->data = strbuf_detach(buf, &size);
> s->size = size;
Again not a new issue, but it is unfortunate that s->size is not
size_t but ulong, which requires us to use a temporary variable of
the correct type, only to assign to it.
> s->should_free = 1;
> s->is_stdin = 1;
> }
>
> -static struct diff_filespec *noindex_filespec(const char *name, int mode)
> +static void populate_from_pipe(struct diff_filespec *s)
> +{
> + struct strbuf buf = STRBUF_INIT;
> + int fd = xopen(s->path, O_RDONLY);
> +
> + if (strbuf_read(&buf, fd, 0) < 0)
> + die_errno("error while reading from '%s'", s->path);
> + close(fd);
> + populate_common(s, &buf);
> +}
> +
> +static void populate_from_stdin(struct diff_filespec *s)
> +{
> + struct strbuf buf = STRBUF_INIT;
> +
> + if (strbuf_read(&buf, 0, 0) < 0)
> + die_errno("error while reading from stdin");
> + populate_common(s, &buf);
> +}
Both look OK. I would have named "_common" -> "_finalize" or
something to signal that this is the shared "tail" part of the two
callers, but it is minor.
> +static struct diff_filespec *noindex_filespec(const char *name, int mode,
> + enum special special)
> {
> struct diff_filespec *s;
>
> if (!name)
> name = "/dev/null";
> s = alloc_filespec(name);
> fill_filespec(s, null_oid(), 0, mode);
> - if (name == file_from_standard_input)
> + if (special == SPECIAL_STDIN)
> populate_from_stdin(s);
> + else if (special == SPECIAL_PIPE)
> + populate_from_pipe(s);
> return s;
> }
Great.
> static int queue_diff(struct diff_options *o,
> - const char *name1, const char *name2)
> + const char *name1, const char *name2, int recursing)
> {
> int mode1 = 0, mode2 = 0;
> + enum special special1 = SPECIAL_NONE, special2 = SPECIAL_NONE;
>
> - if (get_mode(name1, &mode1) || get_mode(name2, &mode2))
> + /* Paths can only be special if we're not recursing. */
> + if (get_mode(name1, &mode1, recursing ? NULL : &special1) ||
> + get_mode(name2, &mode2, recursing ? NULL : &special2))
> return -1;
Ahh, OK. If we are told to "diff --no-index D1 D2", even if we find
a fifo or symlink to a fifo, they aren't "<(foo)" process
substitutions. Makes sense. And because we do not handle FIFO in
such a case, we'd leave *mode (in get_mode()) to a value that we
never use (i.e. S_ISFIFO(*mode) is true); hopefully later part of
the control flow somewhere we will notice it and die?
> if (mode1 && mode2 && S_ISDIR(mode1) != S_ISDIR(mode2)) {
> struct diff_filespec *d1, *d2;
>
> if (S_ISDIR(mode1)) {
> /* 2 is file that is created */
> - d1 = noindex_filespec(NULL, 0);
> - d2 = noindex_filespec(name2, mode2);
> + d1 = noindex_filespec(NULL, 0, SPECIAL_NONE);
> + d2 = noindex_filespec(name2, mode2, special2);
> name2 = NULL;
> mode2 = 0;
> } else {
> /* 1 is file that is deleted */
> - d1 = noindex_filespec(name1, mode1);
> - d2 = noindex_filespec(NULL, 0);
> + d1 = noindex_filespec(name1, mode1, special1);
> + d2 = noindex_filespec(NULL, 0, SPECIAL_NONE);
> name1 = NULL;
> mode1 = 0;
> }
> @@ -173,7 +218,7 @@ static int queue_diff(struct diff_options *o,
> n2 = buffer2.buf;
> }
>
> - ret = queue_diff(o, n1, n2);
> + ret = queue_diff(o, n1, n2, 1);
> }
> string_list_clear(&p1, 0);
> string_list_clear(&p2, 0);
> @@ -189,8 +234,8 @@ static int queue_diff(struct diff_options *o,
> SWAP(name1, name2);
> }
>
> - d1 = noindex_filespec(name1, mode1);
> - d2 = noindex_filespec(name2, mode2);
> + d1 = noindex_filespec(name1, mode1, special1);
> + d2 = noindex_filespec(name2, mode2, special2);
> diff_queue(&diff_queued_diff, d1, d2);
> return 0;
> }
Looks quite straight-forward.
> @@ -215,15 +260,27 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
> */
> static void fixup_paths(const char **path, struct strbuf *replacement)
> {
> - unsigned int isdir0, isdir1;
> -
> - isdir0 = path[0] != file_from_standard_input && is_directory(path[0]);
> - isdir1 = path[1] != file_from_standard_input && is_directory(path[1]);
> + struct stat st;
> + unsigned int isdir0 = 0, isdir1 = 0;
> + unsigned int ispipe0 = 0, ispipe1 = 0;
> +
> + if (path[0] != file_from_standard_input && !stat(path[0], &st)) {
> + isdir0 = S_ISDIR(st.st_mode);
> + ispipe0 = S_ISFIFO(st.st_mode);
> + }
> +
> + if (path[1] != file_from_standard_input && !stat(path[1], &st)) {
> + isdir1 = S_ISDIR(st.st_mode);
> + ispipe1 = S_ISFIFO(st.st_mode);
> + }
>
> if ((path[0] == file_from_standard_input && isdir1) ||
> (isdir0 && path[1] == file_from_standard_input))
> die(_("cannot compare stdin to a directory"));
>
> + if ((isdir0 && ispipe1) || (ispipe0 && isdir1))
> + die(_("cannot compare a named pipe to a directory"));
> +
Nice.
next prev parent reply other threads:[~2023-07-05 22:19 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-27 14:10 [PATCH 0/3] diff --no-index: support reading from named pipes Phillip Wood
2023-06-27 14:10 ` [PATCH 1/3] diff --no-index: die on error reading stdin Phillip Wood
2023-06-27 14:10 ` [PATCH 2/3] t4054: test diff --no-index with stdin Phillip Wood
2023-06-27 14:10 ` [PATCH 3/3] diff --no-index: support reading from named pipes Phillip Wood
2023-06-27 19:44 ` Junio C Hamano
2023-06-28 10:05 ` Phillip Wood
2023-07-05 19:49 ` [PATCH v2 0/4] " Phillip Wood
2023-07-05 19:49 ` [PATCH v2 1/4] diff --no-index: refuse to compare stdin to a directory Phillip Wood
2023-07-05 21:17 ` Junio C Hamano
2023-07-05 19:49 ` [PATCH v2 2/4] diff --no-index: die on error reading stdin Phillip Wood
2023-07-05 21:18 ` Junio C Hamano
2023-07-05 19:49 ` [PATCH v2 3/4] t4054: test diff --no-index with stdin Phillip Wood
2023-07-05 21:22 ` Junio C Hamano
2023-07-05 19:49 ` [PATCH v2 4/4] diff --no-index: support reading from named pipes Phillip Wood
2023-07-05 22:19 ` Junio C Hamano [this message]
2023-08-09 17:17 ` Jeff King
2023-08-10 12:56 ` Phillip Wood
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqq1qhmvw6i.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=me@ttaylorr.com \
--cc=peff@peff.net \
--cc=phillip.wood123@gmail.com \
--cc=sandals@crustytoothpaste.net \
--cc=tguyot@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).