git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/2] t0021: convert perl script to C test-tool helper
@ 2022-07-22 19:42 Matheus Tavares
  2022-07-22 19:42 ` [PATCH 1/2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
                   ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-07-22 19:42 UTC
  To: git; +Cc: gitster, larsxschneider, christian.couder

This addresses the "left over bits" comment from [1], converting the
t0021/rot13-filter.pl script to a C test-tool helper in order to drop
the PERL dependency from tests using this script.

This series builds on top of mt/checkout-count-fix, also adjusting the
script invocations from that patchset.

[1]: https://lore.kernel.org/git/xmqqfsj4dhfi.fsf@gitster.g/
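
For readers unfamiliar with the filter, the transformation itself is
tiny. Below is a standalone sketch of the rot13 step as the helper in
patch 1 performs it; the name rot13_sketch() and the assert() are
additions made for this illustration only. Rotating by 13 twice is a
full turn of the 26-letter alphabet, so the same function can serve both
the "clean" and the "smudge" direction.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rotate ASCII letters by 13 places, in place. Every other byte
 * (digits, punctuation, non-ASCII) is left untouched.
 * rot13_sketch() is an illustrative name, not part of the series.
 */
char *rot13_sketch(char *str)
{
	char *c;

	assert(str != NULL);
	for (c = str; *c; c++) {
		if (*c >= 'a' && *c <= 'z')
			*c = 'a' + (*c - 'a' + 13) % 26;
		else if (*c >= 'A' && *c <= 'Z')
			*c = 'A' + (*c - 'A' + 13) % 26;
	}
	return str;
}
```

Calling it twice on the same buffer gives back the original text, which
is what lets the tests round-trip files through the filter.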

Matheus Tavares (2):
  t/t0021: convert the rot13-filter.pl script to C
  t/t0021: replace old rot13-filter.pl uses with new test-tool cmd

 Makefile                                |   1 +
 pkt-line.c                              |  13 +-
 pkt-line.h                              |   2 +
 t/helper/test-rot13-filter.c            | 396 ++++++++++++++++++++++++
 t/helper/test-tool.c                    |   1 +
 t/helper/test-tool.h                    |   1 +
 t/t0021-conversion.sh                   |  71 ++---
 t/t0021/rot13-filter.pl                 | 247 ---------------
 t/t2080-parallel-checkout-basics.sh     |   7 +-
 t/t2082-parallel-checkout-attributes.sh |   7 +-
 10 files changed, 450 insertions(+), 296 deletions(-)
 create mode 100644 t/helper/test-rot13-filter.c
 delete mode 100644 t/t0021/rot13-filter.pl

-- 
2.37.1



* [PATCH 1/2] t/t0021: convert the rot13-filter.pl script to C
  2022-07-22 19:42 [PATCH 0/2] t0021: convert perl script to C test-tool helper Matheus Tavares
@ 2022-07-22 19:42 ` Matheus Tavares
  2022-07-23  4:52   ` Ævar Arnfjörð Bjarmason
  2022-07-23  4:59   ` Ævar Arnfjörð Bjarmason
  2022-07-22 19:42 ` [PATCH 2/2] t/t0021: replace old rot13-filter.pl uses with new test-tool cmd Matheus Tavares
  2022-07-24 15:09 ` [PATCH v2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
  2 siblings, 2 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-07-22 19:42 UTC
  To: git; +Cc: gitster, larsxschneider, christian.couder

This script is currently used by three test files: t0021-conversion.sh,
t2080-parallel-checkout-basics.sh, and
t2082-parallel-checkout-attributes.sh. To avoid the need for the PERL
dependency in these tests, let's convert the script to a C test-tool
command. Note, however, that we still keep the script as a thin wrapper
in this commit, in order to minimize the number of changes it introduces
and to help reviewers. In the next commit we will properly remove the
script and adjust the affected tests to use test-tool.

Furthermore, note that there is a small adjustment to test
t0021-conversion.sh because it depended on a specific error message
given by perl's die routine.
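
A minimal sketch of why the old pattern stops matching, assuming Perl's
usual behaviour of appending " at <script> line <N>." to a die message
that lacks a trailing newline, and Git's die() prefixing the message
with "fatal: ". The two sample lines (including the path and the line
number) and the line_contains() helper are made up for illustration;
plain substring search stands in for grep here.

```c
#include <string.h>

/* Illustrative stderr lines; the Perl path and line number are invented. */
const char perl_style_line[] =
	"smudge write error at /path/to/rot13-filter.pl line 220.";
const char c_style_line[] = "fatal: smudge write error";

/* Hypothetical stand-in for `grep <pattern>` applied to a single line. */
int line_contains(const char *line, const char *pattern)
{
	return strstr(line, pattern) != NULL;
}
```

With these inputs, "smudge write error at" is found only in the
Perl-style line, while the shortened "smudge write error" is found in
both, so the relaxed grep works before and after the conversion.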

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
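
A note on the pkt-line.c change, for reviewers: the old script printed
one dot to the debug log per data packet written, so the helper needs to
know how many packets a buffer was split into. The sketch below shows
the arithmetic under the assumption that each packet carries at most
65516 payload bytes (the value of $MAX_PACKET_CONTENT_SIZE in the script
being removed). packets_needed() is a hypothetical name used only here;
the patch obtains the count from the write loop itself rather than
computing it up front.

```c
#include <stddef.h>

/* Payload limit per data packet, taken from the removed Perl script. */
#define MAX_PACKET_CONTENT_SIZE 65516

/*
 * Number of data packets needed to carry len bytes. The terminating
 * flush packet is not counted, and an empty buffer needs no packet.
 */
size_t packets_needed(size_t len)
{
	return len / MAX_PACKET_CONTENT_SIZE +
	       (len % MAX_PACKET_CONTENT_SIZE != 0);
}
```

For example, a 65517-byte blob needs two packets, so the corresponding
log entry would read "OUT: 65517 .. [OK]".
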
 Makefile                     |   1 +
 pkt-line.c                   |  13 +-
 pkt-line.h                   |   2 +
 t/helper/test-rot13-filter.c | 396 +++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c         |   1 +
 t/helper/test-tool.h         |   1 +
 t/t0021-conversion.sh        |   2 +-
 t/t0021/rot13-filter.pl      | 248 +---------------------
 8 files changed, 416 insertions(+), 248 deletions(-)
 create mode 100644 t/helper/test-rot13-filter.c

diff --git a/Makefile b/Makefile
index 04d0fd1fe6..7cfcf3a911 100644
--- a/Makefile
+++ b/Makefile
@@ -764,6 +764,7 @@ TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
 TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
+TEST_BUILTINS_OBJS += test-rot13-filter.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
 TEST_BUILTINS_OBJS += test-run-command.o
diff --git a/pkt-line.c b/pkt-line.c
index 8e43c2def4..4425bdae36 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -309,9 +309,10 @@ int write_packetized_from_fd_no_flush(int fd_in, int fd_out)
 	return err;
 }
 
-int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
+int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
+					     int fd_out, int *count_ptr)
 {
-	int err = 0;
+	int err = 0, count = 0;
 	size_t bytes_written = 0;
 	size_t bytes_to_write;
 
@@ -324,10 +325,18 @@ int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_ou
 			break;
 		err = packet_write_gently(fd_out, src_in + bytes_written, bytes_to_write);
 		bytes_written += bytes_to_write;
+		count++;
 	}
+	if (count_ptr)
+		*count_ptr = count;
 	return err;
 }
 
+int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
+{
+	return write_packetized_from_buf_no_flush_count(src_in, len, fd_out, NULL);
+}
+
 static int get_packet_data(int fd, char **src_buf, size_t *src_size,
 			   void *dst, unsigned size, int options)
 {
diff --git a/pkt-line.h b/pkt-line.h
index 6d2a63db23..43986c525c 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -33,6 +33,8 @@ int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int write_packetized_from_fd_no_flush(int fd_in, int fd_out);
 int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out);
+int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
+					     int fd_out, int *count_ptr);
 
 /*
  * Stdio versions of packet_write functions. When mixing these with fd
diff --git a/t/helper/test-rot13-filter.c b/t/helper/test-rot13-filter.c
new file mode 100644
index 0000000000..bbad031aee
--- /dev/null
+++ b/t/helper/test-rot13-filter.c
@@ -0,0 +1,396 @@
+/*
+ * Example implementation for the Git filter protocol version 2
+ * See Documentation/gitattributes.txt, section "Filter Protocol"
+ *
+ * Usage: test-tool rot13-filter [--always-delay] <log path> <capabilities>
+ *
+ * Log path defines a debug log file that the helper writes to. The
+ * subsequent arguments define a list of supported protocol capabilities
+ * ("clean", "smudge", etc).
+ *
+ * When --always-delay is given all pathnames with the "can-delay" flag
+ * that don't appear on the list below are delayed with a count of 1
+ * (see more below).
+ *
+ * This implementation supports special test cases:
+ * (1) If data with the pathname "clean-write-fail.r" is processed with
+ *     a "clean" operation then the write operation will die.
+ * (2) If data with the pathname "smudge-write-fail.r" is processed with
+ *     a "smudge" operation then the write operation will die.
+ * (3) If data with the pathname "error.r" is processed with any
+ *     operation then the filter signals that it cannot or does not want
+ *     to process the file.
+ * (4) If data with the pathname "abort.r" is processed with any
+ *     operation then the filter signals that it cannot or does not want
+ *     to process the file and any file after that is processed with the
+ *     same command.
+ * (5) If data with a pathname that is a key in the delay hash is
+ *     requested (e.g. "test-delay10.a") then the filter responds with
+ *     a "delayed" status and sets the "requested" field in the delay hash.
+ *     The filter will signal the availability of this object after
+ *     "count" (field in delay hash) "list_available_blobs" commands.
+ * (6) If data with the pathname "missing-delay.a" is processed then the
+ *     filter will drop the path from the "list_available_blobs" response.
+ * (7) If data with the pathname "invalid-delay.a" is processed then the
+ *     filter will add the path "unfiltered" which was not delayed before
+ *     to the "list_available_blobs" response.
+ */
+
+#include "test-tool.h"
+#include "pkt-line.h"
+#include "string-list.h"
+#include "strmap.h"
+
+static FILE *logfile;
+static int always_delay;
+static struct strmap delay = STRMAP_INIT;
+static struct string_list requested_caps = STRING_LIST_INIT_NODUP;
+
+static int has_capability(const char *cap)
+{
+	return unsorted_string_list_has_string(&requested_caps, cap);
+}
+
+static char *rot13(char *str)
+{
+	char *c;
+	for (c = str; *c; c++) {
+		if (*c >= 'a' && *c <= 'z')
+			*c = 'a' + (*c - 'a' + 13) % 26;
+		else if (*c >= 'A' && *c <= 'Z')
+			*c = 'A' + (*c - 'A' + 13) % 26;
+	}
+	return str;
+}
+
+static char *skip_key_dup(const char *buf, size_t size, const char *key)
+{
+	struct strbuf keybuf = STRBUF_INIT;
+	strbuf_addf(&keybuf, "%s=", key);
+	if (!skip_prefix_mem(buf, size, keybuf.buf, &buf, &size) || !size)
+		die("bad %s: '%s'", key, xstrndup(buf, size));
+	strbuf_release(&keybuf);
+	return xstrndup(buf, size);
+}
+
+/*
+ * Read a text packet, expecting that it is in the form "key=value" for
+ * the given key. An EOF does not trigger any error and is reported
+ * back to the caller with NULL. Die if the "key" part of "key=value" does
+ * not match the given key, or the value part is empty.
+ */
+static char *packet_key_val_read(const char *key)
+{
+	int size;
+	char *buf;
+	if (packet_read_line_gently(0, &size, &buf) < 0)
+		return NULL;
+	return skip_key_dup(buf, size, key);
+}
+
+static struct string_list *packet_read_capabilities(void)
+{
+	struct string_list *caps = xmalloc(sizeof(*caps));
+	string_list_init_dup(caps);
+	while (1) {
+		int size;
+		char *buf = packet_read_line(0, &size);
+		if (!buf)
+			break;
+		string_list_append_nodup(caps,
+					 skip_key_dup(buf, size, "capability"));
+	}
+	return caps;
+}
+
+/* Read remote capabilities and check them against capabilities we require */
+static struct string_list *packet_read_and_check_capabilities(
+		struct string_list *required_caps)
+{
+	struct string_list *remote_caps = packet_read_capabilities();
+	struct string_list_item *item;
+	for_each_string_list_item(item, required_caps) {
+		if (!unsorted_string_list_has_string(remote_caps, item->string)) {
+			die("required '%s' capability not available from remote",
+			    item->string);
+		}
+	}
+	return remote_caps;
+}
+
+/*
+ * Check our capabilities we want to advertise against the remote ones
+ * and then advertise our capabilities
+ */
+static void packet_check_and_write_capabilities(struct string_list *remote_caps,
+						struct string_list *our_caps)
+{
+	struct string_list_item *item;
+	for_each_string_list_item(item, our_caps) {
+		if (!unsorted_string_list_has_string(remote_caps, item->string)) {
+			die("our capability '%s' is not available from remote",
+			    item->string);
+		}
+		packet_write_fmt(1, "capability=%s\n", item->string);
+	}
+	packet_flush(1);
+}
+
+struct delay_entry {
+	int requested, count;
+	char *output;
+};
+
+static void command_loop(void)
+{
+	while (1) {
+		char *command = packet_key_val_read("command");
+		if (!command) {
+			fprintf(logfile, "STOP\n");
+			break;
+		}
+		fprintf(logfile, "IN: %s", command);
+
+		if (!strcmp(command, "list_available_blobs")) {
+			struct hashmap_iter iter;
+			struct strmap_entry *ent;
+			struct string_list_item *str_item;
+			struct string_list paths = STRING_LIST_INIT_NODUP;
+
+			/* flush */
+			if (packet_read_line(0, NULL))
+				die("bad list_available_blobs end");
+
+			strmap_for_each_entry(&delay, &iter, ent) {
+				struct delay_entry *delay_entry = ent->value;
+				if (!delay_entry->requested)
+					continue;
+				delay_entry->count--;
+				if (!strcmp(ent->key, "invalid-delay.a")) {
+					/* Send Git a pathname that was not delayed earlier */
+					packet_write_fmt(1, "pathname=unfiltered");
+				}
+				if (!strcmp(ent->key, "missing-delay.a")) {
+					/* Do not signal Git that this file is available */
+				} else if (!delay_entry->count) {
+					string_list_insert(&paths, ent->key);
+					packet_write_fmt(1, "pathname=%s", ent->key);
+				}
+			}
+
+			/* Print paths in sorted order. */
+			for_each_string_list_item(str_item, &paths)
+				fprintf(logfile, " %s", str_item->string);
+			string_list_clear(&paths, 0);
+
+			packet_flush(1);
+
+			fprintf(logfile, " [OK]\n");
+			packet_write_fmt(1, "status=success");
+			packet_flush(1);
+		} else {
+			char *buf, *output;
+			int size;
+			char *pathname;
+			struct delay_entry *entry;
+			struct strbuf input = STRBUF_INIT;
+
+			pathname = packet_key_val_read("pathname");
+			if (!pathname)
+				die("unexpected EOF while expecting pathname");
+			fprintf(logfile, " %s", pathname);
+
+			/* Read until flush */
+			buf = packet_read_line(0, &size);
+			while (buf) {
+				if (!strcmp(buf, "can-delay=1")) {
+					entry = strmap_get(&delay, pathname);
+					if (entry && !entry->requested) {
+						entry->requested = 1;
+					} else if (!entry && always_delay) {
+						entry = xcalloc(1, sizeof(*entry));
+						entry->requested = 1;
+						entry->count = 1;
+						strmap_put(&delay, pathname, entry);
+					}
+				} else if (starts_with(buf, "ref=") ||
+					   starts_with(buf, "treeish=") ||
+					   starts_with(buf, "blob=")) {
+					fprintf(logfile, " %s", buf);
+				} else {
+					/*
+					 * In general, filters need to be graceful about
+					 * new metadata, since it's documented that we
+					 * can pass any key-value pairs, but for tests,
+					 * let's be a little stricter.
+					 */
+					die("Unknown message '%s'", buf);
+				}
+				buf = packet_read_line(0, &size);
+			}
+
+
+			read_packetized_to_strbuf(0, &input, 0);
+			fprintf(logfile, " %"PRIuMAX" [OK] -- ", (uintmax_t)input.len);
+
+			entry = strmap_get(&delay, pathname);
+			if (entry && entry->output) {
+				output = entry->output;
+			} else if (!strcmp(pathname, "error.r") || !strcmp(pathname, "abort.r")) {
+				output = "";
+			} else if (!strcmp(command, "clean") && has_capability("clean")) {
+				output = rot13(input.buf);
+			} else if (!strcmp(command, "smudge") && has_capability("smudge")) {
+				output = rot13(input.buf);
+			} else {
+				die("bad command '%s'", command);
+			}
+
+			if (!strcmp(pathname, "error.r")) {
+				fprintf(logfile, "[ERROR]\n");
+				packet_write_fmt(1, "status=error");
+				packet_flush(1);
+			} else if (!strcmp(pathname, "abort.r")) {
+				fprintf(logfile, "[ABORT]\n");
+				packet_write_fmt(1, "status=abort");
+				packet_flush(1);
+			} else if (!strcmp(command, "smudge") &&
+				   (entry = strmap_get(&delay, pathname)) &&
+				   entry->requested == 1) {
+				fprintf(logfile, "[DELAYED]\n");
+				packet_write_fmt(1, "status=delayed");
+				packet_flush(1);
+				entry->requested = 2;
+				entry->output = xstrdup(output);
+			} else {
+				int i, nr_packets;
+				size_t output_len;
+				struct strbuf sb = STRBUF_INIT;
+				packet_write_fmt(1, "status=success");
+				packet_flush(1);
+
+				strbuf_addf(&sb, "%s-write-fail.r", command);
+				if (!strcmp(pathname, sb.buf)) {
+					fprintf(logfile, "[WRITE FAIL]\n");
+					die("%s write error", command);
+				}
+
+				output_len = strlen(output);
+				fprintf(logfile, "OUT: %"PRIuMAX" ", (uintmax_t)output_len);
+
+				if (write_packetized_from_buf_no_flush_count(output,
+					output_len, 1, &nr_packets))
+					die("failed to write buffer to stdout");
+				packet_flush(1);
+
+				for (i = 0; i < nr_packets; i++)
+					fprintf(logfile, ".");
+				fprintf(logfile, " [OK]\n");
+
+				packet_flush(1);
+				strbuf_release(&sb);
+			}
+			free(pathname);
+			strbuf_release(&input);
+		}
+		free(command);
+	}
+}
+
+static void free_delay_hash(void)
+{
+	struct hashmap_iter iter;
+	struct strmap_entry *ent;
+
+	strmap_for_each_entry(&delay, &iter, ent) {
+		struct delay_entry *delay_entry = ent->value;
+		free(delay_entry->output);
+		free(delay_entry);
+	}
+	strmap_clear(&delay, 0);
+}
+
+static void add_delay_entry(char *pathname, int count)
+{
+	struct delay_entry *entry = xcalloc(1, sizeof(*entry));
+	entry->count = count;
+	if (strmap_put(&delay, pathname, entry))
+		BUG("adding the same path twice to delay hash?");
+}
+
+static void packet_initialize(const char *name, int version)
+{
+	struct strbuf sb = STRBUF_INIT;
+	int size;
+	char *pkt_buf = packet_read_line(0, &size);
+
+	strbuf_addf(&sb, "%s-client", name);
+	if (!pkt_buf || strcmp(pkt_buf, sb.buf))
+		die("bad initialize: '%s'", xstrndup(pkt_buf, size));
+
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "version=%d", version);
+	pkt_buf = packet_read_line(0, &size);
+	if (!pkt_buf || strcmp(pkt_buf, sb.buf))
+		die("bad version: '%s'", xstrndup(pkt_buf, size));
+
+	pkt_buf = packet_read_line(0, &size);
+	if (pkt_buf)
+		die("bad version end: '%s'", xstrndup(pkt_buf, size));
+
+	packet_write_fmt(1, "%s-server", name);
+	packet_write_fmt(1, "version=%d", version);
+	packet_flush(1);
+	strbuf_release(&sb);
+}
+
+static const char *rot13_usage = "test-tool rot13-filter [--always-delay] <log path> <capabilities>";
+
+int cmd__rot13_filter(int argc, const char **argv)
+{
+	int i = 1;
+	struct string_list *remote_caps, supported_caps = STRING_LIST_INIT_NODUP;
+
+	string_list_append(&supported_caps, "clean");
+	string_list_append(&supported_caps, "smudge");
+	string_list_append(&supported_caps, "delay");
+
+	if (argc > 1 && !strcmp(argv[i], "--always-delay")) {
+		always_delay = 1;
+		i++;
+	}
+	if (argc - i < 2)
+		usage(rot13_usage);
+
+	logfile = fopen(argv[i++], "a");
+	if (!logfile)
+		die_errno("failed to open log file");
+
+	for ( ; i < argc; i++)
+		string_list_append(&requested_caps, argv[i]);
+
+	add_delay_entry("test-delay10.a", 1);
+	add_delay_entry("test-delay11.a", 1);
+	add_delay_entry("test-delay20.a", 2);
+	add_delay_entry("test-delay10.b", 1);
+	add_delay_entry("missing-delay.a", 1);
+	add_delay_entry("invalid-delay.a", 1);
+
+	fprintf(logfile, "START\n");
+
+	packet_initialize("git-filter", 2);
+
+	remote_caps = packet_read_and_check_capabilities(&supported_caps);
+	packet_check_and_write_capabilities(remote_caps, &requested_caps);
+	fprintf(logfile, "init handshake complete\n");
+
+	string_list_clear(&supported_caps, 0);
+	string_list_clear(remote_caps, 0);
+
+	command_loop();
+
+	fclose(logfile);
+	string_list_clear(&requested_caps, 0);
+	free_delay_hash();
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 318fdbab0c..d6a560f832 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -65,6 +65,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "rot13-filter", cmd__rot13_filter },
 	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index bb79927163..21a91b1019 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -54,6 +54,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__rot13_filter(int argc, const char **argv);
 int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 1c840348bd..963b66e08c 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -735,7 +735,7 @@ test_expect_success PERL 'process filter should restart after unexpected write f
 		rm -f debug.log &&
 		git checkout --quiet --no-progress . 2>git-stderr.log &&
 
-		grep "smudge write error at" git-stderr.log &&
+		grep "smudge write error" git-stderr.log &&
 		test_i18ngrep "error: external filter" git-stderr.log &&
 
 		cat >expected.log <<-EOF &&
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 7bb93768f3..1447bc0a24 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -1,247 +1,5 @@
-#
-# Example implementation for the Git filter protocol version 2
-# See Documentation/gitattributes.txt, section "Filter Protocol"
-#
-# Usage: rot13-filter.pl [--always-delay] <log path> <capabilities>
-#
-# Log path defines a debug log file that the script writes to. The
-# subsequent arguments define a list of supported protocol capabilities
-# ("clean", "smudge", etc).
-#
-# When --always-delay is given all pathnames with the "can-delay" flag
-# that don't appear on the list bellow are delayed with a count of 1
-# (see more below).
-#
-# This implementation supports special test cases:
-# (1) If data with the pathname "clean-write-fail.r" is processed with
-#     a "clean" operation then the write operation will die.
-# (2) If data with the pathname "smudge-write-fail.r" is processed with
-#     a "smudge" operation then the write operation will die.
-# (3) If data with the pathname "error.r" is processed with any
-#     operation then the filter signals that it cannot or does not want
-#     to process the file.
-# (4) If data with the pathname "abort.r" is processed with any
-#     operation then the filter signals that it cannot or does not want
-#     to process the file and any file after that is processed with the
-#     same command.
-# (5) If data with a pathname that is a key in the DELAY hash is
-#     requested (e.g. "test-delay10.a") then the filter responds with
-#     a "delay" status and sets the "requested" field in the DELAY hash.
-#     The filter will signal the availability of this object after
-#     "count" (field in DELAY hash) "list_available_blobs" commands.
-# (6) If data with the pathname "missing-delay.a" is processed that the
-#     filter will drop the path from the "list_available_blobs" response.
-# (7) If data with the pathname "invalid-delay.a" is processed that the
-#     filter will add the path "unfiltered" which was not delayed before
-#     to the "list_available_blobs" response.
-#
-
 use 5.008;
-sub gitperllib {
-	# Git assumes that all path lists are Unix-y colon-separated ones. But
-	# when the Git for Windows executes the test suite, its MSYS2 Bash
-	# calls git.exe, and colon-separated path lists are converted into
-	# Windows-y semicolon-separated lists of *Windows* paths (which
-	# naturally contain a colon after the drive letter, so splitting by
-	# colons simply does not cut it).
-	#
-	# Detect semicolon-separated path list and handle them appropriately.
 
-	if ($ENV{GITPERLLIB} =~ /;/) {
-		return split(/;/, $ENV{GITPERLLIB});
-	}
-	return split(/:/, $ENV{GITPERLLIB});
-}
-use lib (gitperllib());
-use strict;
-use warnings;
-use IO::File;
-use Git::Packet;
-
-my $MAX_PACKET_CONTENT_SIZE = 65516;
-
-my $always_delay = 0;
-if ( $ARGV[0] eq '--always-delay' ) {
-	$always_delay = 1;
-	shift @ARGV;
-}
-
-my $log_file                = shift @ARGV;
-my @capabilities            = @ARGV;
-
-open my $debug, ">>", $log_file or die "cannot open log file: $!";
-
-my %DELAY = (
-	'test-delay10.a' => { "requested" => 0, "count" => 1 },
-	'test-delay11.a' => { "requested" => 0, "count" => 1 },
-	'test-delay20.a' => { "requested" => 0, "count" => 2 },
-	'test-delay10.b' => { "requested" => 0, "count" => 1 },
-	'missing-delay.a' => { "requested" => 0, "count" => 1 },
-	'invalid-delay.a' => { "requested" => 0, "count" => 1 },
-);
-
-sub rot13 {
-	my $str = shift;
-	$str =~ y/A-Za-z/N-ZA-Mn-za-m/;
-	return $str;
-}
-
-print $debug "START\n";
-$debug->flush();
-
-packet_initialize("git-filter", 2);
-
-my %remote_caps = packet_read_and_check_capabilities("clean", "smudge", "delay");
-packet_check_and_write_capabilities(\%remote_caps, @capabilities);
-
-print $debug "init handshake complete\n";
-$debug->flush();
-
-while (1) {
-	my ( $res, $command ) = packet_key_val_read("command");
-	if ( $res == -1 ) {
-		print $debug "STOP\n";
-		exit();
-	}
-	print $debug "IN: $command";
-	$debug->flush();
-
-	if ( $command eq "list_available_blobs" ) {
-		# Flush
-		packet_compare_lists([1, ""], packet_bin_read()) ||
-			die "bad list_available_blobs end";
-
-		foreach my $pathname ( sort keys %DELAY ) {
-			if ( $DELAY{$pathname}{"requested"} >= 1 ) {
-				$DELAY{$pathname}{"count"} = $DELAY{$pathname}{"count"} - 1;
-				if ( $pathname eq "invalid-delay.a" ) {
-					# Send Git a pathname that was not delayed earlier
-					packet_txt_write("pathname=unfiltered");
-				}
-				if ( $pathname eq "missing-delay.a" ) {
-					# Do not signal Git that this file is available
-				} elsif ( $DELAY{$pathname}{"count"} == 0 ) {
-					print $debug " $pathname";
-					packet_txt_write("pathname=$pathname");
-				}
-			}
-		}
-
-		packet_flush();
-
-		print $debug " [OK]\n";
-		$debug->flush();
-		packet_txt_write("status=success");
-		packet_flush();
-	} else {
-		my ( $res, $pathname ) = packet_key_val_read("pathname");
-		if ( $res == -1 ) {
-			die "unexpected EOF while expecting pathname";
-		}
-		print $debug " $pathname";
-		$debug->flush();
-
-		# Read until flush
-		my ( $done, $buffer ) = packet_txt_read();
-		while ( $buffer ne '' ) {
-			if ( $buffer eq "can-delay=1" ) {
-				if ( exists $DELAY{$pathname} and $DELAY{$pathname}{"requested"} == 0 ) {
-					$DELAY{$pathname}{"requested"} = 1;
-				} elsif ( !exists $DELAY{$pathname} and $always_delay ) {
-					$DELAY{$pathname} = { "requested" => 1, "count" => 1 };
-				}
-			} elsif ($buffer =~ /^(ref|treeish|blob)=/) {
-				print $debug " $buffer";
-			} else {
-				# In general, filters need to be graceful about
-				# new metadata, since it's documented that we
-				# can pass any key-value pairs, but for tests,
-				# let's be a little stricter.
-				die "Unknown message '$buffer'";
-			}
-
-			( $done, $buffer ) = packet_txt_read();
-		}
-		if ( $done == -1 ) {
-			die "unexpected EOF after pathname '$pathname'";
-		}
-
-		my $input = "";
-		{
-			binmode(STDIN);
-			my $buffer;
-			my $done = 0;
-			while ( !$done ) {
-				( $done, $buffer ) = packet_bin_read();
-				$input .= $buffer;
-			}
-			if ( $done == -1 ) {
-				die "unexpected EOF while reading input for '$pathname'";
-			}			
-			print $debug " " . length($input) . " [OK] -- ";
-			$debug->flush();
-		}
-
-		my $output;
-		if ( exists $DELAY{$pathname} and exists $DELAY{$pathname}{"output"} ) {
-			$output = $DELAY{$pathname}{"output"}
-		} elsif ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
-			$output = "";
-		} elsif ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) {
-			$output = rot13($input);
-		} elsif ( $command eq "smudge" and grep( /^smudge$/, @capabilities ) ) {
-			$output = rot13($input);
-		} else {
-			die "bad command '$command'";
-		}
-
-		if ( $pathname eq "error.r" ) {
-			print $debug "[ERROR]\n";
-			$debug->flush();
-			packet_txt_write("status=error");
-			packet_flush();
-		} elsif ( $pathname eq "abort.r" ) {
-			print $debug "[ABORT]\n";
-			$debug->flush();
-			packet_txt_write("status=abort");
-			packet_flush();
-		} elsif ( $command eq "smudge" and
-			exists $DELAY{$pathname} and
-			$DELAY{$pathname}{"requested"} == 1 ) {
-			print $debug "[DELAYED]\n";
-			$debug->flush();
-			packet_txt_write("status=delayed");
-			packet_flush();
-			$DELAY{$pathname}{"requested"} = 2;
-			$DELAY{$pathname}{"output"} = $output;
-		} else {
-			packet_txt_write("status=success");
-			packet_flush();
-
-			if ( $pathname eq "${command}-write-fail.r" ) {
-				print $debug "[WRITE FAIL]\n";
-				$debug->flush();
-				die "${command} write error";
-			}
-
-			print $debug "OUT: " . length($output) . " ";
-			$debug->flush();
-
-			while ( length($output) > 0 ) {
-				my $packet = substr( $output, 0, $MAX_PACKET_CONTENT_SIZE );
-				packet_bin_write($packet);
-				# dots represent the number of packets
-				print $debug ".";
-				if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) {
-					$output = substr( $output, $MAX_PACKET_CONTENT_SIZE );
-				} else {
-					$output = "";
-				}
-			}
-			packet_flush();
-			print $debug " [OK]\n";
-			$debug->flush();
-			packet_flush();
-		}
-	}
-}
+my @quoted_args = map "'$_'", @ARGV;
+exec "test-tool rot13-filter @quoted_args";
+die "failed to exec test-tool";
-- 
2.37.1



* [PATCH 2/2] t/t0021: replace old rot13-filter.pl uses with new test-tool cmd
  2022-07-22 19:42 [PATCH 0/2] t0021: convert perl script to C test-tool helper Matheus Tavares
  2022-07-22 19:42 ` [PATCH 1/2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
@ 2022-07-22 19:42 ` Matheus Tavares
  2022-07-24 15:09 ` [PATCH v2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
  2 siblings, 0 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-07-22 19:42 UTC
  To: git; +Cc: gitster, larsxschneider, christian.couder

Complete the perl-to-C conversion from the previous commit by actually
removing the old perl script and adjusting the test cases to directly
call "test-tool rot13-filter".

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 t/t0021-conversion.sh                   | 69 ++++++++++++-------------
 t/t0021/rot13-filter.pl                 |  5 --
 t/t2080-parallel-checkout-basics.sh     |  7 +--
 t/t2082-parallel-checkout-attributes.sh |  7 +--
 4 files changed, 37 insertions(+), 51 deletions(-)
 delete mode 100644 t/t0021/rot13-filter.pl

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 963b66e08c..aeaa8e02ed 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -17,9 +17,6 @@ tr \
   'nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM'
 EOF
 
-write_script rot13-filter.pl "$PERL_PATH" \
-	<"$TEST_DIRECTORY"/t0021/rot13-filter.pl
-
 generate_random_characters () {
 	LEN=$1
 	NAME=$2
@@ -365,8 +362,8 @@ test_expect_success 'diff does not reuse worktree files that need cleaning' '
 	test_line_count = 0 count
 '
 
-test_expect_success PERL 'required process filter should filter data' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should filter data' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -450,8 +447,8 @@ test_expect_success PERL 'required process filter should filter data' '
 	)
 '
 
-test_expect_success PERL 'required process filter should filter data for various subcommands' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should filter data for various subcommands' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	(
 		cd repo &&
@@ -561,9 +558,9 @@ test_expect_success PERL 'required process filter should filter data for various
 	)
 '
 
-test_expect_success PERL 'required process filter takes precedence' '
+test_expect_success 'required process filter takes precedence' '
 	test_config_global filter.protocol.clean false &&
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -587,8 +584,8 @@ test_expect_success PERL 'required process filter takes precedence' '
 	)
 '
 
-test_expect_success PERL 'required process filter should be used only for "clean" operation only' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
+test_expect_success 'required process filter should be used only for "clean" operation' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -622,8 +619,8 @@ test_expect_success PERL 'required process filter should be used only for "clean
 	)
 '
 
-test_expect_success PERL 'required process filter should process multiple packets' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should process multiple packets' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 
 	rm -rf repo &&
@@ -687,8 +684,8 @@ test_expect_success PERL 'required process filter should process multiple packet
 	)
 '
 
-test_expect_success PERL 'required process filter with clean error should fail' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter with clean error should fail' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -706,8 +703,8 @@ test_expect_success PERL 'required process filter with clean error should fail'
 	)
 '
 
-test_expect_success PERL 'process filter should restart after unexpected write failure' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter should restart after unexpected write failure' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -761,8 +758,8 @@ test_expect_success PERL 'process filter should restart after unexpected write f
 	)
 '
 
-test_expect_success PERL 'process filter should not be restarted if it signals an error' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter should not be restarted if it signals an error' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -804,8 +801,8 @@ test_expect_success PERL 'process filter should not be restarted if it signals a
 	)
 '
 
-test_expect_success PERL 'process filter abort stops processing of all further files' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter abort stops processing of all further files' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -861,10 +858,10 @@ test_expect_success PERL 'invalid process filter must fail (and not hang!)' '
 	)
 '
 
-test_expect_success PERL 'delayed checkout in process filter' '
-	test_config_global filter.a.process "rot13-filter.pl a.log clean smudge delay" &&
+test_expect_success 'delayed checkout in process filter' '
+	test_config_global filter.a.process "test-tool rot13-filter a.log clean smudge delay" &&
 	test_config_global filter.a.required true &&
-	test_config_global filter.b.process "rot13-filter.pl b.log clean smudge delay" &&
+	test_config_global filter.b.process "test-tool rot13-filter b.log clean smudge delay" &&
 	test_config_global filter.b.required true &&
 
 	rm -rf repo &&
@@ -940,8 +937,8 @@ test_expect_success PERL 'delayed checkout in process filter' '
 	)
 '
 
-test_expect_success PERL 'missing file in delayed checkout' '
-	test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&
+test_expect_success 'missing file in delayed checkout' '
+	test_config_global filter.bug.process "test-tool rot13-filter bug.log clean smudge delay" &&
 	test_config_global filter.bug.required true &&
 
 	rm -rf repo &&
@@ -960,8 +957,8 @@ test_expect_success PERL 'missing file in delayed checkout' '
 	grep "error: .missing-delay\.a. was not filtered properly" git-stderr.log
 '
 
-test_expect_success PERL 'invalid file in delayed checkout' '
-	test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&
+test_expect_success 'invalid file in delayed checkout' '
+	test_config_global filter.bug.process "test-tool rot13-filter bug.log clean smudge delay" &&
 	test_config_global filter.bug.required true &&
 
 	rm -rf repo &&
@@ -990,10 +987,10 @@ do
 		mode_prereq='UTF8_NFD_TO_NFC' ;;
 	esac
 
-	test_expect_success PERL,SYMLINKS,$mode_prereq \
+	test_expect_success SYMLINKS,$mode_prereq \
 	"delayed checkout with $mode-collision don't write to the wrong place" '
 		test_config_global filter.delay.process \
-			"\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+			"test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
 		test_config_global filter.delay.required true &&
 
 		git init $mode-collision &&
@@ -1026,12 +1023,12 @@ do
 	'
 done
 
-test_expect_success PERL,SYMLINKS,CASE_INSENSITIVE_FS \
+test_expect_success SYMLINKS,CASE_INSENSITIVE_FS \
 "delayed checkout with submodule collision don't write to the wrong place" '
 	git init collision-with-submodule &&
 	(
 		cd collision-with-submodule &&
-		git config filter.delay.process "\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		# We need Git to treat the submodule "a" and the
@@ -1062,11 +1059,11 @@ test_expect_success PERL,SYMLINKS,CASE_INSENSITIVE_FS \
 	)
 '
 
-test_expect_success PERL 'setup for progress tests' '
+test_expect_success 'setup for progress tests' '
 	git init progress &&
 	(
 		cd progress &&
-		git config filter.delay.process "rot13-filter.pl delay-progress.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter delay-progress.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		echo "*.a filter=delay" >.gitattributes &&
@@ -1132,12 +1129,12 @@ do
 	'
 done
 
-test_expect_success PERL 'delayed checkout correctly reports the number of updated entries' '
+test_expect_success 'delayed checkout correctly reports the number of updated entries' '
 	rm -rf repo &&
 	git init repo &&
 	(
 		cd repo &&
-		git config filter.delay.process "../rot13-filter.pl delayed.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter delayed.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		echo "*.a filter=delay" >.gitattributes &&
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
deleted file mode 100644
index 1447bc0a24..0000000000
--- a/t/t0021/rot13-filter.pl
+++ /dev/null
@@ -1,5 +0,0 @@
-use 5.008;
-
-my @quoted_args = map "'$_'", @ARGV;
-exec "test-tool rot13-filter @quoted_args";
-die "failed to exec test-tool";
diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh
index c683e60007..7d956625ca 100755
--- a/t/t2080-parallel-checkout-basics.sh
+++ b/t/t2080-parallel-checkout-basics.sh
@@ -230,12 +230,9 @@ test_expect_success SYMLINKS 'parallel checkout checks for symlinks in leading d
 # check the final report including sequential, parallel, and delayed entries
 # all at the same time. So we must have finer control of the parallel checkout
 # variables.
-test_expect_success PERL '"git checkout ." report should not include failed entries' '
-	write_script rot13-filter.pl "$PERL_PATH" \
-		<"$TEST_DIRECTORY"/t0021/rot13-filter.pl &&
-
+test_expect_success '"git checkout ." report should not include failed entries' '
 	test_config_global filter.delay.process \
-		"\"$(pwd)/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+		"test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
 	test_config_global filter.delay.required true &&
 	test_config_global filter.cat.clean cat  &&
 	test_config_global filter.cat.smudge cat  &&
diff --git a/t/t2082-parallel-checkout-attributes.sh b/t/t2082-parallel-checkout-attributes.sh
index 2525457961..2df55b9405 100755
--- a/t/t2082-parallel-checkout-attributes.sh
+++ b/t/t2082-parallel-checkout-attributes.sh
@@ -138,12 +138,9 @@ test_expect_success 'parallel-checkout and external filter' '
 # The delayed queue is independent from the parallel queue, and they should be
 # able to work together in the same checkout process.
 #
-test_expect_success PERL 'parallel-checkout and delayed checkout' '
-	write_script rot13-filter.pl "$PERL_PATH" \
-		<"$TEST_DIRECTORY"/t0021/rot13-filter.pl &&
-
+test_expect_success 'parallel-checkout and delayed checkout' '
 	test_config_global filter.delay.process \
-		"\"$(pwd)/rot13-filter.pl\" --always-delay \"$(pwd)/delayed.log\" clean smudge delay" &&
+		"test-tool rot13-filter --always-delay \"$(pwd)/delayed.log\" clean smudge delay" &&
 	test_config_global filter.delay.required true &&
 
 	echo "abcd" >original &&
-- 
2.37.1



* Re: [PATCH 1/2] t/t0021: convert the rot13-filter.pl script to C
  2022-07-22 19:42 ` [PATCH 1/2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
@ 2022-07-23  4:52   ` Ævar Arnfjörð Bjarmason
  2022-07-23  4:59   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 34+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-23  4:52 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, gitster, larsxschneider, christian.couder


On Fri, Jul 22 2022, Matheus Tavares wrote:

> +my @quoted_args = map "'$_'", @ARGV;
> +exec "test-tool rot13-filter @quoted_args";
> +die "failed to exec test-tool";

You end up throwing this away, but this whole escaping business is just
bad use of the API: you can pass a list to "exec", which bypasses the
shell so the arguments need no quoting. See "perldoc -f exec".


* Re: [PATCH 1/2] t/t0021: convert the rot13-filter.pl script to C
  2022-07-22 19:42 ` [PATCH 1/2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
  2022-07-23  4:52   ` Ævar Arnfjörð Bjarmason
@ 2022-07-23  4:59   ` Ævar Arnfjörð Bjarmason
  2022-07-23 13:36     ` Matheus Tavares
  1 sibling, 1 reply; 34+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-23  4:59 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, gitster, larsxschneider, christian.couder


On Fri, Jul 22 2022, Matheus Tavares wrote:

Looking a bit closer...

> however, that we still use the script as a wrapper at
> this commit, in order to minimize the amount of changes it introduces
> and help reviewers. At the next commit we will properly remove the
> script and adjust the affected tests to use test-tool.

I'd prefer if we just squashed this, if you want to avoid some of the
diff verbosity you could leave the PERL prereq on all the
test_expect_success and remove it in a 2/2 (we just wouldn't run the
test until then).

But I think it's all boilerplate, so just doing it in one step would be
better; reasoning about the in-between steps is harder IMO (e.g. "exec"
escaping or whatever).

> +static char *rot13(char *str)
> +{
> +	char *c;
> +	for (c = str; *c; c++) {
> +		if (*c >= 'a' && *c <= 'z')
> +			*c = 'a' + (*c - 'a' + 13) % 26;
> +		else if (*c >= 'A' && *c <= 'Z')
> +			*c = 'A' + (*c - 'A' + 13) % 26;
> +	}
> +	return str;
> +}

Looks fine, but we should probably put in our CodingGuidelines at some
point that we don't care about EBCDIC, as this isn't portable C (but
probably portable enough, as we can probably assume ASCII) :)
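
For illustration only, here is a minimal sketch of a rot13 that does not
rely on the letters being contiguous in the execution character set. The
name rot13_portable() is hypothetical and not part of the patch; the
posted version is fine wherever ASCII can be assumed.

```c
#include <assert.h>
#include <string.h>

/*
 * Rotate letters by 13 places using explicit alphabet tables, so the
 * result does not depend on 'a'..'z' and 'A'..'Z' being contiguous
 * (true in ASCII, not in EBCDIC). Non-letters are left untouched.
 */
static char *rot13_portable(char *str)
{
	static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
	static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
	char *c;

	assert(str);
	for (c = str; *c; c++) {
		const char *p;

		/* *c is never NUL here, so strchr() cannot match the terminator */
		if ((p = strchr(lower, *c)))
			*c = lower[(p - lower + 13) % 26];
		else if ((p = strchr(upper, *c)))
			*c = upper[(p - upper + 13) % 26];
	}
	return str;
}
```

Applying it twice gives back the input, same as the arithmetic version.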

> +static struct string_list *packet_read_capabilities(void)
> +{
> +	struct string_list *caps = xmalloc(sizeof(*caps));

malloc here...

> +	string_list_init_dup(caps);
> +	while (1) {
> +		int size;
> +		char *buf = packet_read_line(0, &size);
> +		if (!buf)
> +			break;
> +		string_list_append_nodup(caps,
> +					 skip_key_dup(buf, size, "capability"));
> +	}
> +	return caps;
> +}
> +
> +/* Read remote capabilities and check them against capabilities we require */
> +static struct string_list *packet_read_and_check_capabilities(
> +		struct string_list *required_caps)
> +{
> +	struct string_list *remote_caps = packet_read_capabilities();

...and here...
> +	struct string_list_item *item;
> +	for_each_string_list_item(item, required_caps) {
> +		if (!unsorted_string_list_has_string(remote_caps, item->string)) {
> +			die("required '%s' capability not available from remote",
> +			    item->string);
> +		}
> +	}
> +	return remote_caps;

...we'll return it...

> +	remote_caps = packet_read_and_check_capabilities(&supported_caps);
> +	packet_check_and_write_capabilities(remote_caps, &requested_caps);
> +	fprintf(logfile, "init handshake complete\n");
> +
> +	string_list_clear(&supported_caps, 0);
> +	string_list_clear(remote_caps, 0);

..and here you're missing a free(), but I wonder why not just declare
this string_list in this function, and pass it down instead?
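
A rough sketch of the ownership pattern being suggested, using a made-up
"struct name_list" rather than Git's string_list so that it compiles on
its own. All names here are hypothetical. The point is only that the
caller owns the struct, so clearing the items is enough and there is no
separate free() to forget.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string list; not Git's string_list API. */
struct name_list {
	char **items;
	size_t nr, alloc;
};
#define NAME_LIST_INIT { NULL, 0, 0 }

static void name_list_append(struct name_list *l, const char *s)
{
	char *copy;

	assert(l && s);
	if (l->nr == l->alloc) {
		size_t alloc = l->alloc ? 2 * l->alloc : 4;
		char **items = realloc(l->items, alloc * sizeof(*items));
		if (!items)
			abort();
		l->items = items;
		l->alloc = alloc;
	}
	copy = malloc(strlen(s) + 1);
	if (!copy)
		abort();
	strcpy(copy, s);
	l->items[l->nr++] = copy;
}

/* Frees the items and the array, but not the struct itself. */
static void name_list_clear(struct name_list *l)
{
	size_t i;

	for (i = 0; i < l->nr; i++)
		free(l->items[i]);
	free(l->items);
	l->items = NULL;
	l->nr = l->alloc = 0;
}

/* The callee only fills a list that the caller declared on its stack. */
static void read_caps(struct name_list *caps)
{
	name_list_append(caps, "clean");
	name_list_append(caps, "smudge");
}
```

The caller would declare "struct name_list caps = NAME_LIST_INIT;", call
read_caps(&caps), and finish with name_list_clear(&caps).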

It's unfortunate that none of these tests seem to pass with
SANITIZE=leak already, but the new command seems not to leak from a
trivial glance except for in that one case.

Not knowing much about the filtering mechanism, I wonder if this code
here wouldn't be better as a built-in some day. I.e. isn't this all the
shim we need to talk to some arbitrary conversion filter, except for
the rot13 part?

So if we just invoked a "tr" with run_command() to do the actual rot13
filtering we could do any sort of arbitrary replacement, and present a
variant of this command as an "if you can't be bothered with
packet-line" option in gitattributes(5)...

...but maybe that's hopeless for some reason I'm missing, in any case,
more #leftoverbits.



* Re: [PATCH 1/2] t/t0021: convert the rot13-filter.pl script to C
  2022-07-23  4:59   ` Ævar Arnfjörð Bjarmason
@ 2022-07-23 13:36     ` Matheus Tavares
  0 siblings, 0 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-07-23 13:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Lars Schneider, Christian Couder

On Sat, Jul 23, 2022 at 2:15 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Fri, Jul 22 2022, Matheus Tavares wrote:
>
> Looking a bit closer...
>
> > however, that we still use the script as a wrapper at
> > this commit, in order to minimize the amount of changes it introduces
> > and help reviewers. At the next commit we will properly remove the
> > script and adjust the affected tests to use test-tool.
>
> I'd prefer if we just squashed this, if you want to avoid some of the
> diff verbosity you could leave the PERL prereq on all the
> test_expect_success and remove it in a 2/2 (we just wouldn't run the
> test until then).
>
> But I think it's all boilerplate, so just doing it in one step would be
> better, reasoning about the in-between steps is harder IMO (e.g. "exec"
> escaping or whatever)

Sure, will do! My split attempt was to try to reduce the mental load
for the reviewers, but if it ended up making it harder instead of
helping, let's squash the two patches.

> > +     remote_caps = packet_read_and_check_capabilities(&supported_caps);
> > +     packet_check_and_write_capabilities(remote_caps, &requested_caps);
> > +     fprintf(logfile, "init handshake complete\n");
> > +
> > +     string_list_clear(&supported_caps, 0);
> > +     string_list_clear(remote_caps, 0);
>
> ..and here you're missing a free(), but I wonder why not just declare
> this string_list in this function, and pass it down instead?

Makes sense, will do.

> Not knowing much about the filtering mechanism, I wonder if this code
> here wouldn't be better as a built-in some day. I.e. isn't this all
> shimmy we need to talk to some arbitrary conversion filter, except for
> the rot13 part?
>
> So if we just invoked a "tr" with run_command() to do the actual rot13
> filtering we could do any sort of arbitrary replacement, and present a
> variant of this command as an "if you can't be bothered with
> packet-line" option in gitattributes(5)...

Hmm, maybe so. But I would expect that someone building a long-running
process filter (as opposed to a "single-shot" filter, like the "tr"
use case) would also want finer control over the communication and
"queueing" mechanics. And I'm not sure that would be feasible via an
off-the-shelf solution packaged with Git itself.

For example, while some filters may process the received paths
sequentially, Git-LFS will use the delay capability to queue and
download blobs in the background, examining the queue every time Git
asks for the list of currently available blobs.
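
To make the queueing idea concrete, here is a small sketch modeled
loosely on the test helper's delay_entry counter (a "requested" flag plus
a "count" of polls before the blob is reported). The names delay_slot,
delay_request() and delay_poll() are made up for this example, and a real
filter such as Git-LFS would track background downloads rather than a
fixed counter.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical per-path state, loosely mirroring the helper's delay_entry. */
struct delay_slot {
	const char *path;
	int requested;	/* set once the path was asked for with can-delay=1 */
	int count;	/* polls remaining until the blob is reported */
};

/* Mark a known path as delayed; returns 1 if found, 0 otherwise. */
static int delay_request(struct delay_slot *slots, size_t nr, const char *path)
{
	size_t i;

	assert(slots && path);
	for (i = 0; i < nr; i++) {
		if (!strcmp(slots[i].path, path)) {
			slots[i].requested = 1;
			return 1;
		}
	}
	return 0;
}

/*
 * One "list_available_blobs" poll: decrement the counter of every
 * requested slot and collect the paths whose counter just reached zero.
 * ready[] must have room for nr entries. Returns how many are ready.
 */
static size_t delay_poll(struct delay_slot *slots, size_t nr, const char **ready)
{
	size_t i, n = 0;

	for (i = 0; i < nr; i++) {
		if (!slots[i].requested)
			continue;
		if (--slots[i].count == 0)
			ready[n++] = slots[i].path;
	}
	return n;
}
```

A slot with count 2 is reported on the second poll, and slots that were
never requested are never reported.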

Anyways, I could see these packet-line routines being exported as a
library for those writing such filters.
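
For readers who have not looked at the wire format, a pkt-line is four
hex digits giving the total length (payload plus the 4-byte header)
followed by the payload, and "0000" is the flush packet. Below is a
standalone sketch of that framing. The function and macro names are
invented for this example and are not Git's pkt-line.h API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Largest payload that fits in one packet: 65520 bytes minus the header. */
#define PKT_MAX_PAYLOAD 65516

/*
 * Format one pkt-line into out: a 4-digit lowercase hex length that
 * counts the header itself, then the payload. A trailing NUL is added
 * only so the result can be printed; it is not part of the packet.
 * Returns the packet length, or -1 if the payload or buffer is unsuitable.
 */
static int pkt_line_format(char *out, size_t outsz,
			   const char *payload, size_t len)
{
	assert(out && payload);
	if (len > PKT_MAX_PAYLOAD || outsz < len + 4 + 1)
		return -1;
	snprintf(out, 5, "%04x", (unsigned)(len + 4));
	memcpy(out + 4, payload, len);
	out[len + 4] = '\0';
	return (int)(len + 4);
}

/* The flush packet is the literal "0000" with no payload. */
static int pkt_flush_format(char *out, size_t outsz)
{
	assert(out);
	if (outsz < 5)
		return -1;
	memcpy(out, "0000", 5);
	return 4;
}
```

With this, the 14-byte payload "command=clean\n" is framed as
"0012command=clean\n", since 14 + 4 = 18 = 0x12.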


* [PATCH v2] t/t0021: convert the rot13-filter.pl script to C
  2022-07-22 19:42 [PATCH 0/2] t0021: convert perl script to C test-tool helper Matheus Tavares
  2022-07-22 19:42 ` [PATCH 1/2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
  2022-07-22 19:42 ` [PATCH 2/2] t/t0021: replace old rot13-filter.pl uses with new test-tool cmd Matheus Tavares
@ 2022-07-24 15:09 ` Matheus Tavares
  2022-07-28 16:58   ` Johannes Schindelin
  2022-07-31 18:19   ` [PATCH v3 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
  2 siblings, 2 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-07-24 15:09 UTC (permalink / raw)
  To: git; +Cc: gitster, larsxschneider, christian.couder, avarab

This script is currently used by three test files: t0021-conversion.sh,
t2080-parallel-checkout-basics.sh, and
t2082-parallel-checkout-attributes.sh. To drop the PERL dependency from
these tests, let's convert the script to a C test-tool command.

Note that a small adjustment is needed in t0021-conversion.sh, because
it depended on a specific error message given by perl's die routine.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---

Changes since v1:
- Squashed the two patches together.
- Declared `remote_caps` on cmd__rot13_filter()'s stack and passed it
  down to the callees instead of allocating it dynamically.

 Makefile                                |   1 +
 pkt-line.c                              |  13 +-
 pkt-line.h                              |   2 +
 t/helper/test-rot13-filter.c            | 393 ++++++++++++++++++++++++
 t/helper/test-tool.c                    |   1 +
 t/helper/test-tool.h                    |   1 +
 t/t0021-conversion.sh                   |  71 ++---
 t/t0021/rot13-filter.pl                 | 247 ---------------
 t/t2080-parallel-checkout-basics.sh     |   7 +-
 t/t2082-parallel-checkout-attributes.sh |   7 +-
 10 files changed, 447 insertions(+), 296 deletions(-)
 create mode 100644 t/helper/test-rot13-filter.c
 delete mode 100644 t/t0021/rot13-filter.pl

diff --git a/Makefile b/Makefile
index 04d0fd1fe6..7cfcf3a911 100644
--- a/Makefile
+++ b/Makefile
@@ -764,6 +764,7 @@ TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
 TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
+TEST_BUILTINS_OBJS += test-rot13-filter.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
 TEST_BUILTINS_OBJS += test-run-command.o
diff --git a/pkt-line.c b/pkt-line.c
index 8e43c2def4..4425bdae36 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -309,9 +309,10 @@ int write_packetized_from_fd_no_flush(int fd_in, int fd_out)
 	return err;
 }
 
-int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
+int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
+					     int fd_out, int *count_ptr)
 {
-	int err = 0;
+	int err = 0, count = 0;
 	size_t bytes_written = 0;
 	size_t bytes_to_write;
 
@@ -324,10 +325,18 @@ int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_ou
 			break;
 		err = packet_write_gently(fd_out, src_in + bytes_written, bytes_to_write);
 		bytes_written += bytes_to_write;
+		count++;
 	}
+	if (count_ptr)
+		*count_ptr = count;
 	return err;
 }
 
+int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
+{
+	return write_packetized_from_buf_no_flush_count(src_in, len, fd_out, NULL);
+}
+
 static int get_packet_data(int fd, char **src_buf, size_t *src_size,
 			   void *dst, unsigned size, int options)
 {
diff --git a/pkt-line.h b/pkt-line.h
index 6d2a63db23..43986c525c 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -33,6 +33,8 @@ int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int write_packetized_from_fd_no_flush(int fd_in, int fd_out);
 int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out);
+int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
+					     int fd_out, int *count_ptr);
 
 /*
  * Stdio versions of packet_write functions. When mixing these with fd
diff --git a/t/helper/test-rot13-filter.c b/t/helper/test-rot13-filter.c
new file mode 100644
index 0000000000..536111f272
--- /dev/null
+++ b/t/helper/test-rot13-filter.c
@@ -0,0 +1,393 @@
+/*
+ * Example implementation for the Git filter protocol version 2
+ * See Documentation/gitattributes.txt, section "Filter Protocol"
+ *
+ * Usage: test-tool rot13-filter [--always-delay] <log path> <capabilities>
+ *
+ * Log path defines a debug log file that the helper writes to. The
+ * subsequent arguments define a list of supported protocol capabilities
+ * ("clean", "smudge", etc).
+ *
+ * When --always-delay is given all pathnames with the "can-delay" flag
+ * that don't appear on the list below are delayed with a count of 1
+ * (see more below).
+ *
+ * This implementation supports special test cases:
+ * (1) If data with the pathname "clean-write-fail.r" is processed with
+ *     a "clean" operation then the write operation will die.
+ * (2) If data with the pathname "smudge-write-fail.r" is processed with
+ *     a "smudge" operation then the write operation will die.
+ * (3) If data with the pathname "error.r" is processed with any
+ *     operation then the filter signals that it cannot or does not want
+ *     to process the file.
+ * (4) If data with the pathname "abort.r" is processed with any
+ *     operation then the filter signals that it cannot or does not want
+ *     to process the file and any file after that is processed with the
+ *     same command.
+ * (5) If data with a pathname that is a key in the delay hash is
+ *     requested (e.g. "test-delay10.a") then the filter responds with
+ *     a "delay" status and sets the "requested" field in the delay hash.
+ *     The filter will signal the availability of this object after
+ *     "count" (field in delay hash) "list_available_blobs" commands.
+ * (6) If data with the pathname "missing-delay.a" is processed then the
+ *     filter will drop the path from the "list_available_blobs" response.
+ * (7) If data with the pathname "invalid-delay.a" is processed then the
+ *     filter will add the path "unfiltered" which was not delayed before
+ *     to the "list_available_blobs" response.
+ */
+
+#include "test-tool.h"
+#include "pkt-line.h"
+#include "string-list.h"
+#include "strmap.h"
+
+static FILE *logfile;
+static int always_delay;
+static struct strmap delay = STRMAP_INIT;
+static struct string_list requested_caps = STRING_LIST_INIT_NODUP;
+
+static int has_capability(const char *cap)
+{
+	return unsorted_string_list_has_string(&requested_caps, cap);
+}
+
+static char *rot13(char *str)
+{
+	char *c;
+	for (c = str; *c; c++) {
+		if (*c >= 'a' && *c <= 'z')
+			*c = 'a' + (*c - 'a' + 13) % 26;
+		else if (*c >= 'A' && *c <= 'Z')
+			*c = 'A' + (*c - 'A' + 13) % 26;
+	}
+	return str;
+}
+
+static char *skip_key_dup(const char *buf, size_t size, const char *key)
+{
+	struct strbuf keybuf = STRBUF_INIT;
+	strbuf_addf(&keybuf, "%s=", key);
+	if (!skip_prefix_mem(buf, size, keybuf.buf, &buf, &size) || !size)
+		die("bad %s: '%s'", key, xstrndup(buf, size));
+	strbuf_release(&keybuf);
+	return xstrndup(buf, size);
+}
+
+/*
+ * Read a text packet, expecting that it is in the form "key=value" for
+ * the given key. An EOF does not trigger any error and is reported
+ * back to the caller with NULL. Die if the "key" part of "key=value" does
+ * not match the given key, or the value part is empty.
+ */
+static char *packet_key_val_read(const char *key)
+{
+	int size;
+	char *buf;
+	if (packet_read_line_gently(0, &size, &buf) < 0)
+		return NULL;
+	return skip_key_dup(buf, size, key);
+}
+
+static void packet_read_capabilities(struct string_list *caps)
+{
+	while (1) {
+		int size;
+		char *buf = packet_read_line(0, &size);
+		if (!buf)
+			break;
+		string_list_append_nodup(caps,
+					 skip_key_dup(buf, size, "capability"));
+	}
+}
+
+/* Read remote capabilities and check them against capabilities we require */
+static void packet_read_and_check_capabilities(struct string_list *remote_caps,
+					       struct string_list *required_caps)
+{
+	struct string_list_item *item;
+	packet_read_capabilities(remote_caps);
+	for_each_string_list_item(item, required_caps) {
+		if (!unsorted_string_list_has_string(remote_caps, item->string)) {
+			die("required '%s' capability not available from remote",
+			    item->string);
+		}
+	}
+}
+
+/*
+ * Check our capabilities we want to advertise against the remote ones
+ * and then advertise our capabilities
+ */
+static void packet_check_and_write_capabilities(struct string_list *remote_caps,
+						struct string_list *our_caps)
+{
+	struct string_list_item *item;
+	for_each_string_list_item(item, our_caps) {
+		if (!unsorted_string_list_has_string(remote_caps, item->string)) {
+			die("our capability '%s' is not available from remote",
+			    item->string);
+		}
+		packet_write_fmt(1, "capability=%s\n", item->string);
+	}
+	packet_flush(1);
+}
+
+struct delay_entry {
+	int requested, count;
+	char *output;
+};
+
+static void command_loop(void)
+{
+	while (1) {
+		char *command = packet_key_val_read("command");
+		if (!command) {
+			fprintf(logfile, "STOP\n");
+			break;
+		}
+		fprintf(logfile, "IN: %s", command);
+
+		if (!strcmp(command, "list_available_blobs")) {
+			struct hashmap_iter iter;
+			struct strmap_entry *ent;
+			struct string_list_item *str_item;
+			struct string_list paths = STRING_LIST_INIT_NODUP;
+
+			/* flush */
+			if (packet_read_line(0, NULL))
+				die("bad list_available_blobs end");
+
+			strmap_for_each_entry(&delay, &iter, ent) {
+				struct delay_entry *delay_entry = ent->value;
+				if (!delay_entry->requested)
+					continue;
+				delay_entry->count--;
+				if (!strcmp(ent->key, "invalid-delay.a")) {
+					/* Send Git a pathname that was not delayed earlier */
+					packet_write_fmt(1, "pathname=unfiltered");
+				}
+				if (!strcmp(ent->key, "missing-delay.a")) {
+					/* Do not signal Git that this file is available */
+				} else if (!delay_entry->count) {
+					string_list_insert(&paths, ent->key);
+					packet_write_fmt(1, "pathname=%s", ent->key);
+				}
+			}
+
+			/* Print paths in sorted order. */
+			for_each_string_list_item(str_item, &paths)
+				fprintf(logfile, " %s", str_item->string);
+			string_list_clear(&paths, 0);
+
+			packet_flush(1);
+
+			fprintf(logfile, " [OK]\n");
+			packet_write_fmt(1, "status=success");
+			packet_flush(1);
+		} else {
+			char *buf, *output;
+			int size;
+			char *pathname;
+			struct delay_entry *entry;
+			struct strbuf input = STRBUF_INIT;
+
+			pathname = packet_key_val_read("pathname");
+			if (!pathname)
+				die("unexpected EOF while expecting pathname");
+			fprintf(logfile, " %s", pathname);
+
+			/* Read until flush */
+			buf = packet_read_line(0, &size);
+			while (buf) {
+				if (!strcmp(buf, "can-delay=1")) {
+					entry = strmap_get(&delay, pathname);
+					if (entry && !entry->requested) {
+						entry->requested = 1;
+					} else if (!entry && always_delay) {
+						entry = xcalloc(1, sizeof(*entry));
+						entry->requested = 1;
+						entry->count = 1;
+						strmap_put(&delay, pathname, entry);
+					}
+				} else if (starts_with(buf, "ref=") ||
+					   starts_with(buf, "treeish=") ||
+					   starts_with(buf, "blob=")) {
+					fprintf(logfile, " %s", buf);
+				} else {
+					/*
+					 * In general, filters need to be graceful about
+					 * new metadata, since it's documented that we
+					 * can pass any key-value pairs, but for tests,
+					 * let's be a little stricter.
+					 */
+					die("Unknown message '%s'", buf);
+				}
+				buf = packet_read_line(0, &size);
+			}
+
+
+			read_packetized_to_strbuf(0, &input, 0);
+			fprintf(logfile, " %"PRIuMAX" [OK] -- ", (uintmax_t)input.len);
+
+			entry = strmap_get(&delay, pathname);
+			if (entry && entry->output) {
+				output = entry->output;
+			} else if (!strcmp(pathname, "error.r") || !strcmp(pathname, "abort.r")) {
+				output = "";
+			} else if (!strcmp(command, "clean") && has_capability("clean")) {
+				output = rot13(input.buf);
+			} else if (!strcmp(command, "smudge") && has_capability("smudge")) {
+				output = rot13(input.buf);
+			} else {
+				die("bad command '%s'", command);
+			}
+
+			if (!strcmp(pathname, "error.r")) {
+				fprintf(logfile, "[ERROR]\n");
+				packet_write_fmt(1, "status=error");
+				packet_flush(1);
+			} else if (!strcmp(pathname, "abort.r")) {
+				fprintf(logfile, "[ABORT]\n");
+				packet_write_fmt(1, "status=abort");
+				packet_flush(1);
+			} else if (!strcmp(command, "smudge") &&
+				   (entry = strmap_get(&delay, pathname)) &&
+				   entry->requested == 1) {
+				fprintf(logfile, "[DELAYED]\n");
+				packet_write_fmt(1, "status=delayed");
+				packet_flush(1);
+				entry->requested = 2;
+				entry->output = xstrdup(output);
+			} else {
+				int i, nr_packets;
+				size_t output_len;
+				struct strbuf sb = STRBUF_INIT;
+				packet_write_fmt(1, "status=success");
+				packet_flush(1);
+
+				strbuf_addf(&sb, "%s-write-fail.r", command);
+				if (!strcmp(pathname, sb.buf)) {
+					fprintf(logfile, "[WRITE FAIL]\n");
+					die("%s write error", command);
+				}
+
+				output_len = strlen(output);
+				fprintf(logfile, "OUT: %"PRIuMAX" ", (uintmax_t)output_len);
+
+				if (write_packetized_from_buf_no_flush_count(output,
+					output_len, 1, &nr_packets))
+					die("failed to write buffer to stdout");
+				packet_flush(1);
+
+				for (i = 0; i < nr_packets; i++)
+					fprintf(logfile, ".");
+				fprintf(logfile, " [OK]\n");
+
+				packet_flush(1);
+				strbuf_release(&sb);
+			}
+			free(pathname);
+			strbuf_release(&input);
+		}
+		free(command);
+	}
+}
+
+static void free_delay_hash(void)
+{
+	struct hashmap_iter iter;
+	struct strmap_entry *ent;
+
+	strmap_for_each_entry(&delay, &iter, ent) {
+		struct delay_entry *delay_entry = ent->value;
+		free(delay_entry->output);
+		free(delay_entry);
+	}
+	strmap_clear(&delay, 0);
+}
+
+static void add_delay_entry(char *pathname, int count)
+{
+	struct delay_entry *entry = xcalloc(1, sizeof(*entry));
+	entry->count = count;
+	if (strmap_put(&delay, pathname, entry))
+		BUG("adding the same path twice to delay hash?");
+}
+
+static void packet_initialize(const char *name, int version)
+{
+	struct strbuf sb = STRBUF_INIT;
+	int size;
+	char *pkt_buf = packet_read_line(0, &size);
+
+	strbuf_addf(&sb, "%s-client", name);
+	if (!pkt_buf || strncmp(pkt_buf, sb.buf, size))
+		die("bad initialize: '%s'", xstrndup(pkt_buf, size));
+
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "version=%d", version);
+	pkt_buf = packet_read_line(0, &size);
+	if (!pkt_buf || strncmp(pkt_buf, sb.buf, size))
+		die("bad version: '%s'", xstrndup(pkt_buf, size));
+
+	pkt_buf = packet_read_line(0, &size);
+	if (pkt_buf)
+		die("bad version end: '%s'", xstrndup(pkt_buf, size));
+
+	packet_write_fmt(1, "%s-server", name);
+	packet_write_fmt(1, "version=%d", version);
+	packet_flush(1);
+	strbuf_release(&sb);
+}
+
+static char *rot13_usage = "test-tool rot13-filter [--always-delay] <log path> <capabilities>";
+
+int cmd__rot13_filter(int argc, const char **argv)
+{
+	int i = 1;
+	struct string_list remote_caps = STRING_LIST_INIT_DUP,
+			   supported_caps = STRING_LIST_INIT_NODUP;
+
+	string_list_append(&supported_caps, "clean");
+	string_list_append(&supported_caps, "smudge");
+	string_list_append(&supported_caps, "delay");
+
+	if (argc > 1 && !strcmp(argv[i], "--always-delay")) {
+		always_delay = 1;
+		i++;
+	}
+	if (argc - i < 2)
+		usage(rot13_usage);
+
+	logfile = fopen(argv[i++], "a");
+	if (!logfile)
+		die_errno("failed to open log file");
+
+	for ( ; i < argc; i++)
+		string_list_append(&requested_caps, argv[i]);
+
+	add_delay_entry("test-delay10.a", 1);
+	add_delay_entry("test-delay11.a", 1);
+	add_delay_entry("test-delay20.a", 2);
+	add_delay_entry("test-delay10.b", 1);
+	add_delay_entry("missing-delay.a", 1);
+	add_delay_entry("invalid-delay.a", 1);
+
+	fprintf(logfile, "START\n");
+
+	packet_initialize("git-filter", 2);
+
+	packet_read_and_check_capabilities(&remote_caps, &supported_caps);
+	packet_check_and_write_capabilities(&remote_caps, &requested_caps);
+	fprintf(logfile, "init handshake complete\n");
+
+	string_list_clear(&supported_caps, 0);
+	string_list_clear(&remote_caps, 0);
+
+	command_loop();
+
+	fclose(logfile);
+	string_list_clear(&requested_caps, 0);
+	free_delay_hash();
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 318fdbab0c..d6a560f832 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -65,6 +65,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "rot13-filter", cmd__rot13_filter },
 	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index bb79927163..21a91b1019 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -54,6 +54,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__rot13_filter(int argc, const char **argv);
 int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 1c840348bd..aeaa8e02ed 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -17,9 +17,6 @@ tr \
   'nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM'
 EOF
 
-write_script rot13-filter.pl "$PERL_PATH" \
-	<"$TEST_DIRECTORY"/t0021/rot13-filter.pl
-
 generate_random_characters () {
 	LEN=$1
 	NAME=$2
@@ -365,8 +362,8 @@ test_expect_success 'diff does not reuse worktree files that need cleaning' '
 	test_line_count = 0 count
 '
 
-test_expect_success PERL 'required process filter should filter data' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should filter data' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -450,8 +447,8 @@ test_expect_success PERL 'required process filter should filter data' '
 	)
 '
 
-test_expect_success PERL 'required process filter should filter data for various subcommands' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should filter data for various subcommands' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	(
 		cd repo &&
@@ -561,9 +558,9 @@ test_expect_success PERL 'required process filter should filter data for various
 	)
 '
 
-test_expect_success PERL 'required process filter takes precedence' '
+test_expect_success 'required process filter takes precedence' '
 	test_config_global filter.protocol.clean false &&
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -587,8 +584,8 @@ test_expect_success PERL 'required process filter takes precedence' '
 	)
 '
 
-test_expect_success PERL 'required process filter should be used only for "clean" operation only' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
+test_expect_success 'required process filter should be used only for "clean" operation only' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -622,8 +619,8 @@ test_expect_success PERL 'required process filter should be used only for "clean
 	)
 '
 
-test_expect_success PERL 'required process filter should process multiple packets' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should process multiple packets' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 
 	rm -rf repo &&
@@ -687,8 +684,8 @@ test_expect_success PERL 'required process filter should process multiple packet
 	)
 '
 
-test_expect_success PERL 'required process filter with clean error should fail' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter with clean error should fail' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -706,8 +703,8 @@ test_expect_success PERL 'required process filter with clean error should fail'
 	)
 '
 
-test_expect_success PERL 'process filter should restart after unexpected write failure' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter should restart after unexpected write failure' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -735,7 +732,7 @@ test_expect_success PERL 'process filter should restart after unexpected write f
 		rm -f debug.log &&
 		git checkout --quiet --no-progress . 2>git-stderr.log &&
 
-		grep "smudge write error at" git-stderr.log &&
+		grep "smudge write error" git-stderr.log &&
 		test_i18ngrep "error: external filter" git-stderr.log &&
 
 		cat >expected.log <<-EOF &&
@@ -761,8 +758,8 @@ test_expect_success PERL 'process filter should restart after unexpected write f
 	)
 '
 
-test_expect_success PERL 'process filter should not be restarted if it signals an error' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter should not be restarted if it signals an error' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -804,8 +801,8 @@ test_expect_success PERL 'process filter should not be restarted if it signals a
 	)
 '
 
-test_expect_success PERL 'process filter abort stops processing of all further files' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter abort stops processing of all further files' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -861,10 +858,10 @@ test_expect_success PERL 'invalid process filter must fail (and not hang!)' '
 	)
 '
 
-test_expect_success PERL 'delayed checkout in process filter' '
-	test_config_global filter.a.process "rot13-filter.pl a.log clean smudge delay" &&
+test_expect_success 'delayed checkout in process filter' '
+	test_config_global filter.a.process "test-tool rot13-filter a.log clean smudge delay" &&
 	test_config_global filter.a.required true &&
-	test_config_global filter.b.process "rot13-filter.pl b.log clean smudge delay" &&
+	test_config_global filter.b.process "test-tool rot13-filter b.log clean smudge delay" &&
 	test_config_global filter.b.required true &&
 
 	rm -rf repo &&
@@ -940,8 +937,8 @@ test_expect_success PERL 'delayed checkout in process filter' '
 	)
 '
 
-test_expect_success PERL 'missing file in delayed checkout' '
-	test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&
+test_expect_success 'missing file in delayed checkout' '
+	test_config_global filter.bug.process "test-tool rot13-filter bug.log clean smudge delay" &&
 	test_config_global filter.bug.required true &&
 
 	rm -rf repo &&
@@ -960,8 +957,8 @@ test_expect_success PERL 'missing file in delayed checkout' '
 	grep "error: .missing-delay\.a. was not filtered properly" git-stderr.log
 '
 
-test_expect_success PERL 'invalid file in delayed checkout' '
-	test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&
+test_expect_success 'invalid file in delayed checkout' '
+	test_config_global filter.bug.process "test-tool rot13-filter bug.log clean smudge delay" &&
 	test_config_global filter.bug.required true &&
 
 	rm -rf repo &&
@@ -990,10 +987,10 @@ do
 		mode_prereq='UTF8_NFD_TO_NFC' ;;
 	esac
 
-	test_expect_success PERL,SYMLINKS,$mode_prereq \
+	test_expect_success SYMLINKS,$mode_prereq \
 	"delayed checkout with $mode-collision don't write to the wrong place" '
 		test_config_global filter.delay.process \
-			"\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+			"test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
 		test_config_global filter.delay.required true &&
 
 		git init $mode-collision &&
@@ -1026,12 +1023,12 @@ do
 	'
 done
 
-test_expect_success PERL,SYMLINKS,CASE_INSENSITIVE_FS \
+test_expect_success SYMLINKS,CASE_INSENSITIVE_FS \
 "delayed checkout with submodule collision don't write to the wrong place" '
 	git init collision-with-submodule &&
 	(
 		cd collision-with-submodule &&
-		git config filter.delay.process "\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		# We need Git to treat the submodule "a" and the
@@ -1062,11 +1059,11 @@ test_expect_success PERL,SYMLINKS,CASE_INSENSITIVE_FS \
 	)
 '
 
-test_expect_success PERL 'setup for progress tests' '
+test_expect_success 'setup for progress tests' '
 	git init progress &&
 	(
 		cd progress &&
-		git config filter.delay.process "rot13-filter.pl delay-progress.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter delay-progress.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		echo "*.a filter=delay" >.gitattributes &&
@@ -1132,12 +1129,12 @@ do
 	'
 done
 
-test_expect_success PERL 'delayed checkout correctly reports the number of updated entries' '
+test_expect_success 'delayed checkout correctly reports the number of updated entries' '
 	rm -rf repo &&
 	git init repo &&
 	(
 		cd repo &&
-		git config filter.delay.process "../rot13-filter.pl delayed.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter delayed.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		echo "*.a filter=delay" >.gitattributes &&
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
deleted file mode 100644
index 7bb93768f3..0000000000
--- a/t/t0021/rot13-filter.pl
+++ /dev/null
@@ -1,247 +0,0 @@
-#
-# Example implementation for the Git filter protocol version 2
-# See Documentation/gitattributes.txt, section "Filter Protocol"
-#
-# Usage: rot13-filter.pl [--always-delay] <log path> <capabilities>
-#
-# Log path defines a debug log file that the script writes to. The
-# subsequent arguments define a list of supported protocol capabilities
-# ("clean", "smudge", etc).
-#
-# When --always-delay is given all pathnames with the "can-delay" flag
-# that don't appear on the list bellow are delayed with a count of 1
-# (see more below).
-#
-# This implementation supports special test cases:
-# (1) If data with the pathname "clean-write-fail.r" is processed with
-#     a "clean" operation then the write operation will die.
-# (2) If data with the pathname "smudge-write-fail.r" is processed with
-#     a "smudge" operation then the write operation will die.
-# (3) If data with the pathname "error.r" is processed with any
-#     operation then the filter signals that it cannot or does not want
-#     to process the file.
-# (4) If data with the pathname "abort.r" is processed with any
-#     operation then the filter signals that it cannot or does not want
-#     to process the file and any file after that is processed with the
-#     same command.
-# (5) If data with a pathname that is a key in the DELAY hash is
-#     requested (e.g. "test-delay10.a") then the filter responds with
-#     a "delay" status and sets the "requested" field in the DELAY hash.
-#     The filter will signal the availability of this object after
-#     "count" (field in DELAY hash) "list_available_blobs" commands.
-# (6) If data with the pathname "missing-delay.a" is processed that the
-#     filter will drop the path from the "list_available_blobs" response.
-# (7) If data with the pathname "invalid-delay.a" is processed that the
-#     filter will add the path "unfiltered" which was not delayed before
-#     to the "list_available_blobs" response.
-#
-
-use 5.008;
-sub gitperllib {
-	# Git assumes that all path lists are Unix-y colon-separated ones. But
-	# when the Git for Windows executes the test suite, its MSYS2 Bash
-	# calls git.exe, and colon-separated path lists are converted into
-	# Windows-y semicolon-separated lists of *Windows* paths (which
-	# naturally contain a colon after the drive letter, so splitting by
-	# colons simply does not cut it).
-	#
-	# Detect semicolon-separated path list and handle them appropriately.
-
-	if ($ENV{GITPERLLIB} =~ /;/) {
-		return split(/;/, $ENV{GITPERLLIB});
-	}
-	return split(/:/, $ENV{GITPERLLIB});
-}
-use lib (gitperllib());
-use strict;
-use warnings;
-use IO::File;
-use Git::Packet;
-
-my $MAX_PACKET_CONTENT_SIZE = 65516;
-
-my $always_delay = 0;
-if ( $ARGV[0] eq '--always-delay' ) {
-	$always_delay = 1;
-	shift @ARGV;
-}
-
-my $log_file                = shift @ARGV;
-my @capabilities            = @ARGV;
-
-open my $debug, ">>", $log_file or die "cannot open log file: $!";
-
-my %DELAY = (
-	'test-delay10.a' => { "requested" => 0, "count" => 1 },
-	'test-delay11.a' => { "requested" => 0, "count" => 1 },
-	'test-delay20.a' => { "requested" => 0, "count" => 2 },
-	'test-delay10.b' => { "requested" => 0, "count" => 1 },
-	'missing-delay.a' => { "requested" => 0, "count" => 1 },
-	'invalid-delay.a' => { "requested" => 0, "count" => 1 },
-);
-
-sub rot13 {
-	my $str = shift;
-	$str =~ y/A-Za-z/N-ZA-Mn-za-m/;
-	return $str;
-}
-
-print $debug "START\n";
-$debug->flush();
-
-packet_initialize("git-filter", 2);
-
-my %remote_caps = packet_read_and_check_capabilities("clean", "smudge", "delay");
-packet_check_and_write_capabilities(\%remote_caps, @capabilities);
-
-print $debug "init handshake complete\n";
-$debug->flush();
-
-while (1) {
-	my ( $res, $command ) = packet_key_val_read("command");
-	if ( $res == -1 ) {
-		print $debug "STOP\n";
-		exit();
-	}
-	print $debug "IN: $command";
-	$debug->flush();
-
-	if ( $command eq "list_available_blobs" ) {
-		# Flush
-		packet_compare_lists([1, ""], packet_bin_read()) ||
-			die "bad list_available_blobs end";
-
-		foreach my $pathname ( sort keys %DELAY ) {
-			if ( $DELAY{$pathname}{"requested"} >= 1 ) {
-				$DELAY{$pathname}{"count"} = $DELAY{$pathname}{"count"} - 1;
-				if ( $pathname eq "invalid-delay.a" ) {
-					# Send Git a pathname that was not delayed earlier
-					packet_txt_write("pathname=unfiltered");
-				}
-				if ( $pathname eq "missing-delay.a" ) {
-					# Do not signal Git that this file is available
-				} elsif ( $DELAY{$pathname}{"count"} == 0 ) {
-					print $debug " $pathname";
-					packet_txt_write("pathname=$pathname");
-				}
-			}
-		}
-
-		packet_flush();
-
-		print $debug " [OK]\n";
-		$debug->flush();
-		packet_txt_write("status=success");
-		packet_flush();
-	} else {
-		my ( $res, $pathname ) = packet_key_val_read("pathname");
-		if ( $res == -1 ) {
-			die "unexpected EOF while expecting pathname";
-		}
-		print $debug " $pathname";
-		$debug->flush();
-
-		# Read until flush
-		my ( $done, $buffer ) = packet_txt_read();
-		while ( $buffer ne '' ) {
-			if ( $buffer eq "can-delay=1" ) {
-				if ( exists $DELAY{$pathname} and $DELAY{$pathname}{"requested"} == 0 ) {
-					$DELAY{$pathname}{"requested"} = 1;
-				} elsif ( !exists $DELAY{$pathname} and $always_delay ) {
-					$DELAY{$pathname} = { "requested" => 1, "count" => 1 };
-				}
-			} elsif ($buffer =~ /^(ref|treeish|blob)=/) {
-				print $debug " $buffer";
-			} else {
-				# In general, filters need to be graceful about
-				# new metadata, since it's documented that we
-				# can pass any key-value pairs, but for tests,
-				# let's be a little stricter.
-				die "Unknown message '$buffer'";
-			}
-
-			( $done, $buffer ) = packet_txt_read();
-		}
-		if ( $done == -1 ) {
-			die "unexpected EOF after pathname '$pathname'";
-		}
-
-		my $input = "";
-		{
-			binmode(STDIN);
-			my $buffer;
-			my $done = 0;
-			while ( !$done ) {
-				( $done, $buffer ) = packet_bin_read();
-				$input .= $buffer;
-			}
-			if ( $done == -1 ) {
-				die "unexpected EOF while reading input for '$pathname'";
-			}			
-			print $debug " " . length($input) . " [OK] -- ";
-			$debug->flush();
-		}
-
-		my $output;
-		if ( exists $DELAY{$pathname} and exists $DELAY{$pathname}{"output"} ) {
-			$output = $DELAY{$pathname}{"output"}
-		} elsif ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
-			$output = "";
-		} elsif ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) {
-			$output = rot13($input);
-		} elsif ( $command eq "smudge" and grep( /^smudge$/, @capabilities ) ) {
-			$output = rot13($input);
-		} else {
-			die "bad command '$command'";
-		}
-
-		if ( $pathname eq "error.r" ) {
-			print $debug "[ERROR]\n";
-			$debug->flush();
-			packet_txt_write("status=error");
-			packet_flush();
-		} elsif ( $pathname eq "abort.r" ) {
-			print $debug "[ABORT]\n";
-			$debug->flush();
-			packet_txt_write("status=abort");
-			packet_flush();
-		} elsif ( $command eq "smudge" and
-			exists $DELAY{$pathname} and
-			$DELAY{$pathname}{"requested"} == 1 ) {
-			print $debug "[DELAYED]\n";
-			$debug->flush();
-			packet_txt_write("status=delayed");
-			packet_flush();
-			$DELAY{$pathname}{"requested"} = 2;
-			$DELAY{$pathname}{"output"} = $output;
-		} else {
-			packet_txt_write("status=success");
-			packet_flush();
-
-			if ( $pathname eq "${command}-write-fail.r" ) {
-				print $debug "[WRITE FAIL]\n";
-				$debug->flush();
-				die "${command} write error";
-			}
-
-			print $debug "OUT: " . length($output) . " ";
-			$debug->flush();
-
-			while ( length($output) > 0 ) {
-				my $packet = substr( $output, 0, $MAX_PACKET_CONTENT_SIZE );
-				packet_bin_write($packet);
-				# dots represent the number of packets
-				print $debug ".";
-				if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) {
-					$output = substr( $output, $MAX_PACKET_CONTENT_SIZE );
-				} else {
-					$output = "";
-				}
-			}
-			packet_flush();
-			print $debug " [OK]\n";
-			$debug->flush();
-			packet_flush();
-		}
-	}
-}
diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh
index c683e60007..7d956625ca 100755
--- a/t/t2080-parallel-checkout-basics.sh
+++ b/t/t2080-parallel-checkout-basics.sh
@@ -230,12 +230,9 @@ test_expect_success SYMLINKS 'parallel checkout checks for symlinks in leading d
 # check the final report including sequential, parallel, and delayed entries
 # all at the same time. So we must have finer control of the parallel checkout
 # variables.
-test_expect_success PERL '"git checkout ." report should not include failed entries' '
-	write_script rot13-filter.pl "$PERL_PATH" \
-		<"$TEST_DIRECTORY"/t0021/rot13-filter.pl &&
-
+test_expect_success '"git checkout ." report should not include failed entries' '
 	test_config_global filter.delay.process \
-		"\"$(pwd)/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+		"test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
 	test_config_global filter.delay.required true &&
 	test_config_global filter.cat.clean cat  &&
 	test_config_global filter.cat.smudge cat  &&
diff --git a/t/t2082-parallel-checkout-attributes.sh b/t/t2082-parallel-checkout-attributes.sh
index 2525457961..2df55b9405 100755
--- a/t/t2082-parallel-checkout-attributes.sh
+++ b/t/t2082-parallel-checkout-attributes.sh
@@ -138,12 +138,9 @@ test_expect_success 'parallel-checkout and external filter' '
 # The delayed queue is independent from the parallel queue, and they should be
 # able to work together in the same checkout process.
 #
-test_expect_success PERL 'parallel-checkout and delayed checkout' '
-	write_script rot13-filter.pl "$PERL_PATH" \
-		<"$TEST_DIRECTORY"/t0021/rot13-filter.pl &&
-
+test_expect_success 'parallel-checkout and delayed checkout' '
 	test_config_global filter.delay.process \
-		"\"$(pwd)/rot13-filter.pl\" --always-delay \"$(pwd)/delayed.log\" clean smudge delay" &&
+		"test-tool rot13-filter --always-delay \"$(pwd)/delayed.log\" clean smudge delay" &&
 	test_config_global filter.delay.required true &&
 
 	echo "abcd" >original &&
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v2] t/t0021: convert the rot13-filter.pl script to C
  2022-07-24 15:09 ` [PATCH v2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
@ 2022-07-28 16:58   ` Johannes Schindelin
  2022-07-28 17:54     ` Junio C Hamano
                       ` (2 more replies)
  2022-07-31 18:19   ` [PATCH v3 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
  1 sibling, 3 replies; 34+ messages in thread
From: Johannes Schindelin @ 2022-07-28 16:58 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, gitster, larsxschneider, christian.couder, avarab

Hi Matheus,

On Sun, 24 Jul 2022, Matheus Tavares wrote:

> This script is currently used by three test files: t0021-conversion.sh,
> t2080-parallel-checkout-basics.sh, and
> t2082-parallel-checkout-attributes.sh. To avoid the need for the PERL
> dependency at these tests, let's convert the script to a C test-tool
> command.

Great!

>  - Squashed the two patches together.

I see why this might have been suggested, but it definitely made it more
challenging for me to review. You see, it is easy to just fly over a patch
that simply removes the `PERL` prereq, but it is much harder to jump back
and forth over all of these removals when the `.c` version of the filter
is added before them and the `.pl` version is removed after them. So I
find that it was bad advice, but I do not fault you for following it (we
all want reviews to just be over already and therefore sometimes pander to
the reviewers, no matter how much or little sense their feedback makes).

It simply would have been easier for me to review if the wheat had been
separated from the chaff, so to speak.

To illustrate my point: it was a bit of a challenge to find the adjustment
of the "smudge write error at" needle in all of that cruft. It would have
made my life as a reviewer substantially easier had the patch series been
organized this way (which I assume is what you had before the feedback you
received demanded that everything be squashed into one hot pile):

	1/3 adjust the needle for the error message
	2/3 implement the rot13-filter in C
	3/3 use the test-tool in the tests and remove the PERL prereq, and
	    remove rot13-filter.pl

> [...]
> diff --git a/pkt-line.c b/pkt-line.c
> index 8e43c2def4..4425bdae36 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -309,9 +309,10 @@ int write_packetized_from_fd_no_flush(int fd_in, int fd_out)
>  	return err;
>  }
>
> -int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
> +int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
> +					     int fd_out, int *count_ptr)
>  {
> -	int err = 0;
> +	int err = 0, count = 0;
>  	size_t bytes_written = 0;
>  	size_t bytes_to_write;
>
> @@ -324,10 +325,18 @@ int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_ou
>  			break;
>  		err = packet_write_gently(fd_out, src_in + bytes_written, bytes_to_write);
>  		bytes_written += bytes_to_write;
> +		count++;
>  	}
> +	if (count_ptr)
> +		*count_ptr = count;

This is not just a counter, but a packet counter, right? In any case, it
would probably make more sense to increment the value directly:

		if (count_ptr)
			(*count_ptr)++;

More on that below, where you use it.
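
To make the difference concrete, here is a minimal standalone sketch of a
chunking loop that bumps a caller-owned counter once per chunk. It is not
the pkt-line code: `copy_chunks()` and `MAX_CHUNK` are hypothetical names,
and a plain memcpy() stands in for writing a packet. Because the counter is
incremented rather than assigned, the caller has to initialize it, and it
accumulates across calls.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CHUNK 4 /* hypothetical limit, standing in for the packet payload size */

/*
 * Copy `len` bytes from `src` to `dst` in chunks of at most MAX_CHUNK bytes.
 * If `count_ptr` is non-NULL, it is incremented once per chunk, so the
 * caller must initialize it and may accumulate over several calls.
 * Returns the number of bytes copied.
 */
static size_t copy_chunks(const char *src, size_t len, char *dst, int *count_ptr)
{
	size_t done = 0;

	while (done < len) {
		size_t n = len - done;

		if (n > MAX_CHUNK)
			n = MAX_CHUNK;
		memcpy(dst + done, src + done, n);
		done += n;
		if (count_ptr)
			(*count_ptr)++;
	}
	return done;
}
```

Passing NULL as `count_ptr` gives the uncounted behavior, which is how a
thin wrapper without the extra parameter could be built on top of it.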

>  	return err;
>  }
>
> +int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
> +{
> +	return write_packetized_from_buf_no_flush_count(src_in, len, fd_out, NULL);
> +}

Have you considered making this a `static inline` in `pkt-line.h`?

> [...]
> diff --git a/t/helper/test-rot13-filter.c b/t/helper/test-rot13-filter.c
> new file mode 100644
> index 0000000000..536111f272
> --- /dev/null
> +++ b/t/helper/test-rot13-filter.c
> @@ -0,0 +1,393 @@
> +/*
> + * Example implementation for the Git filter protocol version 2
> + * See Documentation/gitattributes.txt, section "Filter Protocol"
> + *
> + * Usage: test-tool rot13-filter [--always-delay] <log path> <capabilities>
> + *
> + * Log path defines a debug log file that the script writes to. The
> + * subsequent arguments define a list of supported protocol capabilities
> + * ("clean", "smudge", etc).
> + *
> + * When --always-delay is given all pathnames with the "can-delay" flag
> + * that don't appear on the list bellow are delayed with a count of 1
> + * (see more below).
> + *
> + * This implementation supports special test cases:
> + * (1) If data with the pathname "clean-write-fail.r" is processed with
> + *     a "clean" operation then the write operation will die.
> + * (2) If data with the pathname "smudge-write-fail.r" is processed with
> + *     a "smudge" operation then the write operation will die.
> + * (3) If data with the pathname "error.r" is processed with any
> + *     operation then the filter signals that it cannot or does not want
> + *     to process the file.
> + * (4) If data with the pathname "abort.r" is processed with any
> + *     operation then the filter signals that it cannot or does not want
> + *     to process the file and any file after that is processed with the
> + *     same command.
> + * (5) If data with a pathname that is a key in the delay hash is
> + *     requested (e.g. "test-delay10.a") then the filter responds with
> + *     a "delay" status and sets the "requested" field in the delay hash.
> + *     The filter will signal the availability of this object after
> + *     "count" (field in delay hash) "list_available_blobs" commands.
> + * (6) If data with the pathname "missing-delay.a" is processed that the
> + *     filter will drop the path from the "list_available_blobs" response.
> + * (7) If data with the pathname "invalid-delay.a" is processed that the
> + *     filter will add the path "unfiltered" which was not delayed before
> + *     to the "list_available_blobs" response.
> + */
> +
> +#include "test-tool.h"
> +#include "pkt-line.h"
> +#include "string-list.h"
> +#include "strmap.h"
> +
> +static FILE *logfile;
> +static int always_delay;
> +static struct strmap delay = STRMAP_INIT;
> +static struct string_list requested_caps = STRING_LIST_INIT_NODUP;
> +
> +static int has_capability(const char *cap)
> +{
> +	return unsorted_string_list_has_string(&requested_caps, cap);
> +}
> +
> +static char *rot13(char *str)
> +{
> +	char *c;
> +	for (c = str; *c; c++) {
> +		if (*c >= 'a' && *c <= 'z')
> +			*c = 'a' + (*c - 'a' + 13) % 26;
> +		else if (*c >= 'A' && *c <= 'Z')
> +			*c = 'A' + (*c - 'A' + 13) % 26;

That's quite verbose, but it _is_ correct, if a bit harder than necessary
to validate: I admit that I had to look up whether `%` has higher
precedence than `+` in
https://en.cppreference.com/w/c/language/operator_precedence.

A more concise way (also easier to reason about):

	for (c = str; *c; c++)
		if (isalpha(*c))
			*c += tolower(*c) < 'n' ? 13 : -13;

For fun, you could also look at
https://hea-www.harvard.edu/~fine/Tech/rot13.html to see whether you want
to use yet another approach.
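
For reference, the compact variant can be tried outside of Git with the
following standalone sketch. It assumes the standard <ctype.h> functions in
the "C" locale (Git ships its own ctype replacements, which this sketch
does not use), hence the casts to `unsigned char`.

```c
#include <ctype.h>

/* Rotate ASCII letters by 13 places in place; other bytes are untouched. */
static char *rot13(char *str)
{
	char *c;

	for (c = str; *c; c++) {
		unsigned char uc = (unsigned char)*c;

		/* 'a'..'m' (and 'A'..'M') move forward, 'n'..'z' move back */
		if (isalpha(uc))
			*c += tolower(uc) < 'n' ? 13 : -13;
	}
	return str;
}
```

Applying it twice yields the original string again, which makes for an
easy sanity check.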

> +	}
> +	return str;
> +}
> +
> +static char *skip_key_dup(const char *buf, size_t size, const char *key)
> +{
> +	struct strbuf keybuf = STRBUF_INIT;
> +	strbuf_addf(&keybuf, "%s=", key);
> +	if (!skip_prefix_mem(buf, size, keybuf.buf, &buf, &size) || !size)
> +		die("bad %s: '%s'", key, xstrndup(buf, size));
> +	strbuf_release(&keybuf);
> +	return xstrndup(buf, size);

This does what we want it to do, but it looks as if it was code translated
from a language that does not care one bit about allocations to a language
that cares a lot.

For example, instead of allocating a `strbuf` just to append `=` to the
key, in idiomatic C this code would read like this:

static char *get_value(const char *buf, size_t size, const char *key)
{
	const char *orig_buf = buf;
	int orig_size = (int)size;

	if (!skip_prefix_mem(buf, size, key, &buf, &size) ||
	    !skip_prefix_mem(buf, size, "=", &buf, &size) ||
	    !size)
		die("expected key '%s', got '%.*s'",
		    key, orig_size, orig_buf);

	return xstrndup(buf, size);
}

I was tempted, even, to suggest returning a `const char *` after
NUL-terminating the line (via `buf[size] = '\0';`) instead of
`xstrndup()`ing it, but `packet_read_line()` reads into the singleton
`packet_buffer` and we use e.g. the `command` that is returned from this
function after reading the next packet, so the command would most likely
be overwritten.
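
To illustrate the allocation-free variant and its caveat, here is a
standalone sketch. `skip_prefix_len()` is a hypothetical stand-in that I
wrote to mimic `skip_prefix_mem()`, and `find_value()` is likewise made up
for this example. The returned value points into the caller's buffer, so it
is only valid until that buffer is reused, which is exactly the problem
with a shared packet buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for skip_prefix_mem(): if the first `len` bytes of
 * `buf` start with `prefix`, store the remainder in *out / *outlen and
 * return 1; otherwise return 0 and leave the outputs untouched.
 */
static int skip_prefix_len(const char *buf, size_t len, const char *prefix,
			   const char **out, size_t *outlen)
{
	size_t plen = strlen(prefix);

	if (plen > len || memcmp(buf, prefix, plen))
		return 0;
	*out = buf + plen;
	*outlen = len - plen;
	return 1;
}

/*
 * Given a "key=value" line of `size` bytes (not necessarily NUL-terminated),
 * point *value / *value_len at the value. Returns 0 on success, -1 if the
 * key does not match or the value is empty. Nothing is allocated.
 */
static int find_value(const char *buf, size_t size, const char *key,
		      const char **value, size_t *value_len)
{
	if (!skip_prefix_len(buf, size, key, &buf, &size) ||
	    !skip_prefix_len(buf, size, "=", &buf, &size) ||
	    !size)
		return -1;
	*value = buf;
	*value_len = size;
	return 0;
}
```

A caller that needs the value to outlive the buffer still has to copy it,
which is why the `xstrndup()` stays in the suggestion above.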

> +}
> +
> +/*
> + * Read a text packet, expecting that it is in the form "key=value" for
> + * the given key. An EOF does not trigger any error and is reported
> + * back to the caller with NULL. Die if the "key" part of "key=value" does
> + * not match the given key, or the value part is empty.
> + */
> +static char *packet_key_val_read(const char *key)
> +{
> +	int size;
> +	char *buf;
> +	if (packet_read_line_gently(0, &size, &buf) < 0)
> +		return NULL;
> +	return skip_key_dup(buf, size, key);
> +}
> +
> +static void packet_read_capabilities(struct string_list *caps)
> +{
> +	while (1) {

In Git's source code, I think we prefer `for (;;)`. But not by much:

$ git grep 'while (1)' \*.c | wc
    128     508    3745

$ git grep 'for (;;)' \*.c | wc
    156     614    4389

> +		int size;
> +		char *buf = packet_read_line(0, &size);
> +		if (!buf)
> +			break;
> +		string_list_append_nodup(caps,
> +					 skip_key_dup(buf, size, "capability"));

It is tempting to use unsorted string lists for everything because Perl
makes that relatively easy.

However, in this instance I would strongly recommend using something more
akin to Perl's "hash" data structure, namely a `strset`.

> +	}
> +}
> +
> +/* Read remote capabilities and check them against capabilities we require */
> +static void packet_read_and_check_capabilities(struct string_list *remote_caps,
> +					       struct string_list *required_caps)
> +{
> +	struct string_list_item *item;
> +	packet_read_capabilities(remote_caps);
> +	for_each_string_list_item(item, required_caps) {
> +		if (!unsorted_string_list_has_string(remote_caps, item->string)) {
> +			die("required '%s' capability not available from remote",
> +			    item->string);
> +		}
> +	}
> +}

This is a pretty literal translation from Perl to C, and a couple of years
ago, I would have done the same.

However, these days I would recommend against it. In this instance, we are
really only interested in three capabilities: clean, smudge and delay. It
is much, much simpler to read in the capabilities and then manually verify
that the three required ones were included:

static void read_capabilities(struct strset *remote_caps)
{
	char *cap;

	while ((cap = packet_key_val_read("capability")))
		strset_add(remote_caps, cap);

	if (!strset_contains(remote_caps, "clean"))
		die("required 'clean' capability not available from remote");
	if (!strset_contains(remote_caps, "smudge"))
		die("required 'smudge' capability not available from remote");
	if (!strset_contains(remote_caps, "delay"))
		die("required 'delay' capability not available from remote");
}

> +
> +/*
> + * Check our capabilities we want to advertise against the remote ones
> + * and then advertise our capabilities
> + */
> +static void packet_check_and_write_capabilities(struct string_list *remote_caps,
> +						struct string_list *our_caps)

The list of "our caps" comes from the command-line. In C, this means we
get a `const char **argv` and an `int argc`. So:

static void check_and_write_capabilities(struct strset *remote_caps,
					 const char **caps, int caps_count)
{
	int i;

	for (i = 0; i < caps_count; i++) {
		if (!strset_contains(remote_caps, caps[i]))
			die("our capability '%s' is not available from remote",
			    caps[i]);

		packet_write_fmt(1, "capability=%s\n", caps[i]);
	}
	packet_flush(1);
}

And then we would call it via

	check_and_write_capabilities(remote_caps, argv + 1, argc - 1);

> +
> +struct delay_entry {
> +	int requested, count;
> +	char *output;
> +};

Since you declare this here, it makes most sense to define
`free_delay_hash()` (which should really be named `free_delay_entries()`)
and `add_delay_entry()` here.

> +
> +static void command_loop(void)
> +{
> +	while (1) {
> +		char *command = packet_key_val_read("command");
> +		if (!command) {
> +			fprintf(logfile, "STOP\n");
> +			break;
> +		}
> +		fprintf(logfile, "IN: %s", command);

We will also need to `fflush(logfile)` here, to imitate the Perl script's
behavior more precisely.

> +
> +		if (!strcmp(command, "list_available_blobs")) {
> +			struct hashmap_iter iter;
> +			struct strmap_entry *ent;
> +			struct string_list_item *str_item;
> +			struct string_list paths = STRING_LIST_INIT_NODUP;
> +
> +			/* flush */
> +			if (packet_read_line(0, NULL))
> +				die("bad list_available_blobs end");
> +
> +			strmap_for_each_entry(&delay, &iter, ent) {
> +				struct delay_entry *delay_entry = ent->value;
> +				if (!delay_entry->requested)
> +					continue;
> +				delay_entry->count--;
> +				if (!strcmp(ent->key, "invalid-delay.a")) {
> +					/* Send Git a pathname that was not delayed earlier */
> +					packet_write_fmt(1, "pathname=unfiltered");
> +				}
> +				if (!strcmp(ent->key, "missing-delay.a")) {
> +					/* Do not signal Git that this file is available */
> +				} else if (!delay_entry->count) {
> +					string_list_insert(&paths, ent->key);
> +					packet_write_fmt(1, "pathname=%s", ent->key);
> +				}
> +			}
> +
> +			/* Print paths in sorted order. */

The Perl script does not order them specifically. Do we really have to do
that here?

In any case, it is more performant to append the paths in an unsorted way
and then sort them once in the end (that's O(N log(N)) instead of O(N^2)).
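
To make the complexity argument concrete, here is a minimal standalone
sketch in plain libc. It does not use Git's string_list API; the names
`path_list`, `path_list_append()` and `path_list_sort()` are made up for
illustration. Appending is amortized O(1) per path, and the single sort at
the end is O(N log(N)):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of borrowed strings (not Git's string_list). */
struct path_list {
	const char **items;
	size_t nr, alloc;
};

/* Append without keeping the list sorted; returns 0 on success, -1 on OOM. */
static int path_list_append(struct path_list *list, const char *path)
{
	if (list->nr == list->alloc) {
		size_t new_alloc = list->alloc ? 2 * list->alloc : 8;
		const char **p = realloc(list->items, new_alloc * sizeof(*p));

		if (!p)
			return -1;
		list->items = p;
		list->alloc = new_alloc;
	}
	list->items[list->nr++] = path;
	return 0;
}

/* qsort() comparator: the elements are pointers to C strings. */
static int cmp_paths(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort once, after all paths have been appended. */
static void path_list_sort(struct path_list *list)
{
	if (list->nr)
		qsort(list->items, list->nr, sizeof(*list->items), cmp_paths);
}
```

In the helper itself, the equivalent would presumably be to append to the
string list inside the loop and sort it once before printing.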

> +			for_each_string_list_item(str_item, &paths)
> +				fprintf(logfile, " %s", str_item->string);
> +			string_list_clear(&paths, 0);
> +
> +			packet_flush(1);
> +
> +			fprintf(logfile, " [OK]\n");
> +			packet_write_fmt(1, "status=success");
> +			packet_flush(1);

I know the Perl script uses an else here, but I'd much rather insert a
`continue` at the end of the `list_available_blobs` clause and de-indent
the remainder of the loop body.

> +		} else {
> +			char *buf, *output;
> +			int size;
> +			char *pathname;
> +			struct delay_entry *entry;
> +			struct strbuf input = STRBUF_INIT;
> +
> +			pathname = packet_key_val_read("pathname");
> +			if (!pathname)
> +				die("unexpected EOF while expecting pathname");
> +			fprintf(logfile, " %s", pathname);

Again, let's `fflush(logfile)` here.

> +
> +			/* Read until flush */
> +			buf = packet_read_line(0, &size);
> +			while (buf) {

Let's write this in more idiomatic C:

			while ((buf = packet_read_line(0, &size))) {

> +				if (!strcmp(buf, "can-delay=1")) {
> +					entry = strmap_get(&delay, pathname);
> +					if (entry && !entry->requested) {
> +						entry->requested = 1;
> +					} else if (!entry && always_delay) {
> +						entry = xcalloc(1, sizeof(*entry));
> +						entry->requested = 1;
> +						entry->count = 1;
> +						strmap_put(&delay, pathname, entry);

I guess here is our chance to extend the signature of `add_delay_entry()`
to accept a `requested` parameter, and to call that here.

> +					}
> +				} else if (starts_with(buf, "ref=") ||
> +					   starts_with(buf, "treeish=") ||
> +					   starts_with(buf, "blob=")) {
> +					fprintf(logfile, " %s", buf);
> +				} else {
> +					/*
> +					 * In general, filters need to be graceful about
> +					 * new metadata, since it's documented that we
> +					 * can pass any key-value pairs, but for tests,
> +					 * let's be a little stricter.
> +					 */
> +					die("Unknown message '%s'", buf);
> +				}
> +				buf = packet_read_line(0, &size);
> +			}
> +
> +
> +			read_packetized_to_strbuf(0, &input, 0);
> +			fprintf(logfile, " %"PRIuMAX" [OK] -- ", (uintmax_t)input.len);

This reads _so much nicer_ than the Perl version!

> +
> +			entry = strmap_get(&delay, pathname);
> +			if (entry && entry->output) {
> +				output = entry->output;
> +			} else if (!strcmp(pathname, "error.r") || !strcmp(pathname, "abort.r")) {
> +				output = "";
> +			} else if (!strcmp(command, "clean") && has_capability("clean")) {
> +				output = rot13(input.buf);
> +			} else if (!strcmp(command, "smudge") && has_capability("smudge")) {
> +				output = rot13(input.buf);
> +			} else {
> +				die("bad command '%s'", command);
> +			}
> +
> +			if (!strcmp(pathname, "error.r")) {
> +				fprintf(logfile, "[ERROR]\n");
> +				packet_write_fmt(1, "status=error");
> +				packet_flush(1);
> +			} else if (!strcmp(pathname, "abort.r")) {
> +				fprintf(logfile, "[ABORT]\n");
> +				packet_write_fmt(1, "status=abort");
> +				packet_flush(1);
> +			} else if (!strcmp(command, "smudge") &&
> +				   (entry = strmap_get(&delay, pathname)) &&
> +				   entry->requested == 1) {
> +				fprintf(logfile, "[DELAYED]\n");
> +				packet_write_fmt(1, "status=delayed");
> +				packet_flush(1);
> +				entry->requested = 2;
> +				entry->output = xstrdup(output);

We need to call `free(entry->output)` before that lest we leak memory, but
only if `output` is not identical anyway:

				if (entry->output != output) {
					free(entry->output);
					entry->output = xstrdup(output);
				}


> +			} else {
> +				int i, nr_packets;
> +				size_t output_len;
> +				struct strbuf sb = STRBUF_INIT;
> +				packet_write_fmt(1, "status=success");
> +				packet_flush(1);
> +
> +				strbuf_addf(&sb, "%s-write-fail.r", command);
> +				if (!strcmp(pathname, sb.buf)) {

We can easily avoid allocating the string just for comparing it:

				const char *p;

				if (skip_prefix(pathname, command, &p) &&
				    !strcmp(p, "-write-fail.r")) {
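
For readers without Git's tree at hand, a standalone approximation of that
check follows. `skip_prefix_sketch()` is a hypothetical stand-in that only
mimics how `skip_prefix()` is used here, and `is_write_fail_path()` is an
illustrative name, not something from the patch:

```c
#include <string.h>

/*
 * Hypothetical stand-in for Git's skip_prefix(): if `str` starts with
 * `prefix`, point `*out` past the prefix and return 1; otherwise return 0
 * and leave `*out` untouched.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Is `pathname` exactly "<command>-write-fail.r"? No allocation needed. */
static int is_write_fail_path(const char *pathname, const char *command)
{
	const char *p;

	return skip_prefix_sketch(pathname, command, &p) &&
	       !strcmp(p, "-write-fail.r");
}
```

The comparison walks the two strings in place, so there is no strbuf to
release afterwards.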

> +					fprintf(logfile, "[WRITE FAIL]\n");

					fflush(logfile) ;-)

> +					die("%s write error", command);
> +				}
> +
> +				output_len = strlen(output);
> +				fprintf(logfile, "OUT: %"PRIuMAX" ", (uintmax_t)output_len);
> +
> +				if (write_packetized_from_buf_no_flush_count(output,
> +					output_len, 1, &nr_packets))
> +					die("failed to write buffer to stdout");
> +				packet_flush(1);
> +
> +				for (i = 0; i < nr_packets; i++)
> +					fprintf(logfile, ".");

That's not quite the same as the Perl script does: it prints a '.'
(without flushing, though) _every_ time it wrote a packet.

If you want to emulate that, you will have to copy/edit that loop (and in
that case, the insanely long-named function
`write_packetized_from_buf_no_flush_count()` is unnecessary, too).

> +				fprintf(logfile, " [OK]\n");
> +
> +				packet_flush(1);
> +				strbuf_release(&sb);
> +			}
> +			free(pathname);
> +			strbuf_release(&input);
> +		}
> +		free(command);
> +	}
> +}
> +
> +static void free_delay_hash(void)
> +{
> +	struct hashmap_iter iter;
> +	struct strmap_entry *ent;
> +
> +	strmap_for_each_entry(&delay, &iter, ent) {
> +		struct delay_entry *delay_entry = ent->value;
> +		free(delay_entry->output);
> +		free(delay_entry);
> +	}
> +	strmap_clear(&delay, 0);
> +}
> +
> +static void add_delay_entry(char *pathname, int count)
> +{
> +	struct delay_entry *entry = xcalloc(1, sizeof(*entry));
> +	entry->count = count;
> +	if (strmap_put(&delay, pathname, entry))
> +		BUG("adding the same path twice to delay hash?");
> +}
> +
> +static void packet_initialize(const char *name, int version)
> +{
> +	struct strbuf sb = STRBUF_INIT;
> +	int size;
> +	char *pkt_buf = packet_read_line(0, &size);
> +
> +	strbuf_addf(&sb, "%s-client", name);
> +	if (!pkt_buf || strncmp(pkt_buf, sb.buf, size))

We do not need the flexibility of the Perl package, where `name` is a
parameter. We can hard-code `git-filter-client` here. I.e. something like
this:

	if (!pkt_buf || size != 17 ||
	    strncmp(pkt_buf, "git-filter-client", 17))

> +		die("bad initialize: '%s'", xstrndup(pkt_buf, size));
> +
> +	strbuf_reset(&sb);
> +	strbuf_addf(&sb, "version=%d", version);

Same here. We do not need to allocate a string just to compare it to the
packet's payload.

> +	pkt_buf = packet_read_line(0, &size);
> +	if (!pkt_buf || strncmp(pkt_buf, sb.buf, size))
> +		die("bad version: '%s'", xstrndup(pkt_buf, size));
> +
> +	pkt_buf = packet_read_line(0, &size);
> +	if (pkt_buf)
> +		die("bad version end: '%s'", xstrndup(pkt_buf, size));
> +
> +	packet_write_fmt(1, "%s-server", name);
> +	packet_write_fmt(1, "version=%d", version);
> +	packet_flush(1);
> +	strbuf_release(&sb);
> +}
> +
> +static char *rot13_usage = "test-tool rot13-filter [--always-delay] <log path> <capabilities>";
> +
> +int cmd__rot13_filter(int argc, const char **argv)
> +{
> +	int i = 1;
> +	struct string_list remote_caps = STRING_LIST_INIT_DUP,
> +			   supported_caps = STRING_LIST_INIT_NODUP;
> +
> +	string_list_append(&supported_caps, "clean");
> +	string_list_append(&supported_caps, "smudge");
> +	string_list_append(&supported_caps, "delay");
> +
> +	if (argc > 1 && !strcmp(argv[i], "--always-delay")) {
> +		always_delay = 1;
> +		i++;
> +	}
> +	if (argc - i < 2)
> +		usage(rot13_usage);
> +
> +	logfile = fopen(argv[i++], "a");
> +	if (!logfile)
> +		die_errno("failed to open log file");
> +
> +	for ( ; i < argc; i++)
> +		string_list_append(&requested_caps, argv[i]);
> +
> +	add_delay_entry("test-delay10.a", 1);
> +	add_delay_entry("test-delay11.a", 1);
> +	add_delay_entry("test-delay20.a", 2);
> +	add_delay_entry("test-delay10.b", 1);
> +	add_delay_entry("missing-delay.a", 1);
> +	add_delay_entry("invalid-delay.a", 1);
> +
> +	fprintf(logfile, "START\n");
> +
> +	packet_initialize("git-filter", 2);
> +
> +	packet_read_and_check_capabilities(&remote_caps, &supported_caps);
> +	packet_check_and_write_capabilities(&remote_caps, &requested_caps);
> +	fprintf(logfile, "init handshake complete\n");
> +
> +	string_list_clear(&supported_caps, 0);
> +	string_list_clear(&remote_caps, 0);
> +
> +	command_loop();
> +
> +	fclose(logfile);
> +	string_list_clear(&requested_caps, 0);
> +	free_delay_hash();
> +	return 0;
> +}

Other than that, this looks great!

Thank you,
Dscho


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2] t/t0021: convert the rot13-filter.pl script to C
  2022-07-28 16:58   ` Johannes Schindelin
@ 2022-07-28 17:54     ` Junio C Hamano
  2022-07-28 19:50     ` Ævar Arnfjörð Bjarmason
  2022-07-31  2:52     ` Matheus Tavares
  2 siblings, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2022-07-28 17:54 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Matheus Tavares, git, larsxschneider, christian.couder, avarab

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>>  - Squashed the two patches together.
>
> I see why this might have been suggested, but it definitely made it more
> challenging for me to review. You see, it is easy to just fly over a patch
> that simply removes the `PERL` prereq, but it is much harder to jump back
> and forth over all of these removals when the `.c` version of the filter
> is added before them and the `.pl` version is removed after them.

Yeah, I tend to agree.

> ...
> Other than that, this looks great!

Yup, thanks for an excellent review.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2] t/t0021: convert the rot13-filter.pl script to C
  2022-07-28 16:58   ` Johannes Schindelin
  2022-07-28 17:54     ` Junio C Hamano
@ 2022-07-28 19:50     ` Ævar Arnfjörð Bjarmason
  2022-07-31  2:52     ` Matheus Tavares
  2 siblings, 0 replies; 34+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-28 19:50 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Matheus Tavares, git, gitster, larsxschneider, christian.couder


On Thu, Jul 28 2022, Johannes Schindelin wrote:

> [...]
> I see why this might have been suggested, but it definitely made it more
> challenging for me to review. You see, it is easy to just fly over a patch
> that simply removes the `PERL` prereq, but it is much harder to jump back
> and forth over all of these removals when the `.c` version of the filter
> is added before them and the `.pl` version is removed after them. So I
> find that it was bad advice, but I do not fault you for following it (we
> all want reviews to just be over already and therefore sometimes pander to
> the reviewers, no matter how much or little sense their feedback makes).
> [...]
> To illustrate my point: it was a bit of a challenge to find the adjustment
> of the "smudge write error at" needle in all of that cruft. It would have
> made my life as a reviewer substantially easier had the patch
> series been organized this way (which I assume you had before the feedback
> you received demanded to squash everything in one hot pile):

If you don't think a suggestion of mine makes sense, I'd appreciate it
if you just replied to me directly, instead of sending this sort of comment
to someone else. I find your wording here to be somewhere between snarky
and mean-spirited. I didn't demand anything.

If this was the first time this sort of thing has occurred I wouldn't
say anything about it, but this is far from being the first time.

In any case, if you read more than a few words into
https://lore.kernel.org/git/220723.86pmhwquie.gmgdl@evledraar.gmail.com/
you'll see that I suggested splitting the removal of the PERL prereq
into its own change, which I think would address what you're bringing up
here.

What I was mainly commenting on was that this series could avoid
introducing code in-between the v1 1/2 and 2/2 which is only needed
because of that split-up. I.e. the "exec", and needing to quote those
arguments.

Which I stand by, I think it's much easier to just do a "git show
--word-diff" on this than reason about how that "chain-loading" is
working, and whether the inter-series state is buggy. But again, the
concern you raise about the associated verbosity is easy to mitigate.

On the point of pandering to reviewers I find it really nitpicky to ask
for changes to change some working O(N^2) code in a test-tool to O(N
log(N)), or to avoid a few allocations here & there.

If it was a new git built-in, then sure, but I think our collective time
is much better spent by just letting that sort of thing slide when it
comes to test-tools, which are almost always going to be operating on
the relatively tiny set of test data we expose them to.

Unless Matheus is keenly interested in optimizing this code, that is.

> [...]
> This does what we want it to do, but it looks as if it was code translated
> from a language that does not care one bit about allocations to a language
> that cares a lot.

FWIW Perl cares a lot about allocations, the sort of code you're
commenting on here doesn't involve allocations in Perl in the general
case, since it "allocates ahead", similar to how we use alloc_nr() and
strbuf_reset() patterns.

What it doesn't care about is free()-ing memory, which is an orthogonal
thing. But that's just an optimization, the general assumption is that
if your program ever needs X MB of memory it's likely to need at least
that much again, or it'll exit() and have the OS clean it up.
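
As a rough illustration of the "allocate ahead" pattern, here is a
standalone sketch. It is not Git's strbuf; `growbuf`, `grow_capacity()`,
`growbuf_reserve()` and `growbuf_reset()` are hypothetical names, and the
growth formula is only modeled on what I understand alloc_nr() to be,
(x + 16) * 3 / 2:

```c
#include <stdlib.h>

/* Hypothetical growable byte buffer; not Git's strbuf. */
struct growbuf {
	char *buf;
	size_t len, alloc;
};

/* Growth policy modeled on alloc_nr(): (x + 16) * 3 / 2 (an assumption). */
static size_t grow_capacity(size_t alloc)
{
	return (alloc + 16) * 3 / 2;
}

/* Make room for `extra` more bytes; returns 0 on success, -1 on OOM. */
static int growbuf_reserve(struct growbuf *gb, size_t extra)
{
	size_t needed = gb->len + extra;

	if (needed > gb->alloc) {
		size_t new_alloc = grow_capacity(gb->alloc);
		char *p;

		if (new_alloc < needed)
			new_alloc = needed;
		p = realloc(gb->buf, new_alloc);
		if (!p)
			return -1;
		gb->buf = p;
		gb->alloc = new_alloc;
	}
	return 0;
}

/* Forget the contents but keep the allocation for reuse. */
static void growbuf_reset(struct growbuf *gb)
{
	gb->len = 0;
}
```

After a reset, later requests that fit in the existing capacity do not
touch the allocator at all, which is the point being made above.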




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2] t/t0021: convert the rot13-filter.pl script to C
  2022-07-28 16:58   ` Johannes Schindelin
  2022-07-28 17:54     ` Junio C Hamano
  2022-07-28 19:50     ` Ævar Arnfjörð Bjarmason
@ 2022-07-31  2:52     ` Matheus Tavares
  2022-08-09  9:36       ` Johannes Schindelin
  2 siblings, 1 reply; 34+ messages in thread
From: Matheus Tavares @ 2022-07-31  2:52 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: git, gitster, larsxschneider, christian.couder, avarab

Hi, Dscho

On Thu, Jul 28, 2022 at 1:58 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> > On Sun, 24 Jul 2022, Matheus Tavares wrote:
> >
> > diff --git a/t/helper/test-rot13-filter.c b/t/helper/test-rot13-filter.c
> > +static char *rot13(char *str)
> > +{
> > +     char *c;
> > +     for (c = str; *c; c++) {
> > +             if (*c >= 'a' && *c <= 'z')
> > +                     *c = 'a' + (*c - 'a' + 13) % 26;
> > +             else if (*c >= 'A' && *c <= 'Z')
> > +                     *c = 'A' + (*c - 'A' + 13) % 26;
>
> That's quite verbose, but it _is_ correct (if a bit harder than necessary
> to validate, I admit that I had to look up whether `%`'s precedence is higher
> than `+` in https://en.cppreference.com/w/c/language/operator_precedence).
>
> A conciser way (also easier to reason about):
>
>         for (c = str; *c; c++)
>                 if (isalpha(*c))
>                         *c += tolower(*c) < 'n' ? 13 : -13;

Nice :) Thanks.
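
For reference, a standalone sketch of the simplified routine, using the
libc <ctype.h> functions rather than Git's own ctype replacements (hence
the `unsigned char` cast, which plain libc requires). It assumes ASCII
input and the default "C" locale:

```c
#include <ctype.h>

/*
 * In-place rot13. Letters in the first half of the alphabet move forward
 * by 13, letters in the second half move back by 13; everything else is
 * left alone. Returns `str` for convenience.
 */
static char *rot13(char *str)
{
	char *c;

	for (c = str; *c; c++) {
		unsigned char ch = (unsigned char)*c;

		if (isalpha(ch))
			*c += tolower(ch) < 'n' ? 13 : -13;
	}
	return str;
}
```

Since rot13 is its own inverse, applying it twice gives back the original
string, which makes for a quick sanity check.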

> > [...]
> > +static void packet_read_capabilities(struct string_list *caps)
> > +{
> > +     while (1) {
> > +             int size;
> > +             char *buf = packet_read_line(0, &size);
> > +             if (!buf)
> > +                     break;
> > +             string_list_append_nodup(caps,
> > +                                      skip_key_dup(buf, size, "capability"));
>
> It is tempting to use unsorted string lists for everything because Perl
> makes that relatively easy.
>
> However, in this instance I would strongly recommend using something more
> akin to Perl's "hash" data structure, in this instance a `strset`.

Ok, will do.

> > +
> > +/*
> > + * Check our capabilities we want to advertise against the remote ones
> > + * and then advertise our capabilities
> > + */
> > +static void packet_check_and_write_capabilities(struct string_list *remote_caps,
> > +                                             struct string_list *our_caps)
>
> The list of "our caps" comes from the command-line. In C, this means we
> get a `const char **argv` and an `int argc`. So:
>
> static void check_and_write_capabilities(struct strset *remote_caps,
>                                          const char **caps, int caps_count)
> {
>         int i;
>
>         for (i = 0; i < caps_count; i++) {
>                 if (!strset_contains(remote_caps, caps[i]))
>                         die("our capability '%s' is not available from remote",
>                             caps[i]);
>
>                 packet_write_fmt(1, "capability=%s\n", caps[i]);
>         }
>         packet_flush(1);
> }

Makes sense. We also use the list elsewhere (has_capability()), but we
can easily replace that with two global flags to indicate if we have
the "clean" and "smudge" caps.

> And then we would call it via
>
>         check_and_write_capabilities(remote_caps, argv + 1, argc - 1);
>
> [...]
> > +static void command_loop(void)
> > +{
> > +     while (1) {
> > +             char *command = packet_key_val_read("command");
> > +             if (!command) {
> > +                     fprintf(logfile, "STOP\n");
> > +                     break;
> > +             }
> > +             fprintf(logfile, "IN: %s", command);
>
> We will also need to `fflush(logfile)` here, to imitate the Perl script's
> behavior more precisely.

I was somewhat intrigued as to why the flushes were needed in the Perl
script. But reading [1] and [2], now, it seems to have been an
oversight.

That is, Eric suggested explicitly flushing stdout because it is a
pipe, but the author ended up erroneously disabling autoflush for
stdout too, so that's why we needed the flushes there. They later
acknowledged that and said that they would re-enable it (see [2]),
but it seems to have been forgotten. So I think we can safely drop the
flush calls.

[1]: http://public-inbox.org/git/20160723072721.GA20875%40starla/
[2]: https://lore.kernel.org/git/7F1F1A0E-8FC3-4FBD-81AA-37786DE0EF50@gmail.com/

> > +
> > +             if (!strcmp(command, "list_available_blobs")) {
> > +                     struct hashmap_iter iter;
> > +                     struct strmap_entry *ent;
> > +                     struct string_list_item *str_item;
> > +                     struct string_list paths = STRING_LIST_INIT_NODUP;
> > +
> > +                     /* flush */
> > +                     if (packet_read_line(0, NULL))
> > +                             die("bad list_available_blobs end");
> > +
> > +                     strmap_for_each_entry(&delay, &iter, ent) {
> > +                             struct delay_entry *delay_entry = ent->value;
> > +                             if (!delay_entry->requested)
> > +                                     continue;
> > +                             delay_entry->count--;
> > +                             if (!strcmp(ent->key, "invalid-delay.a")) {
> > +                                     /* Send Git a pathname that was not delayed earlier */
> > +                                     packet_write_fmt(1, "pathname=unfiltered");
> > +                             }
> > +                             if (!strcmp(ent->key, "missing-delay.a")) {
> > +                                     /* Do not signal Git that this file is available */
> > +                             } else if (!delay_entry->count) {
> > +                                     string_list_insert(&paths, ent->key);
> > +                                     packet_write_fmt(1, "pathname=%s", ent->key);
> > +                             }
> > +                     }
> > +
> > +                     /* Print paths in sorted order. */
>
> The Perl script does not order them specifically. Do we really have to do
> that here?

It actually prints them in sorted order:

        foreach my $pathname ( sort keys %DELAY )

That is required because some test cases will compare the output using
this order.

> In any case, it is more performant to append the paths in an unsorted way
> and then sort them once in the end (that's O(N log(N)) instead of O(N^2)).

OK, will do.

> > +                     for_each_string_list_item(str_item, &paths)
> > +                             fprintf(logfile, " %s", str_item->string);
> > +                     string_list_clear(&paths, 0);
> > +
> > +                     packet_flush(1);
> > +
> > +                     fprintf(logfile, " [OK]\n");
> > +                     packet_write_fmt(1, "status=success");
> > +                     packet_flush(1);
>
> I know the Perl script uses an else here, but I'd much rather insert a
> `continue` at the end of the `list_available_blobs` clause and de-indent
> the remainder of the loop body.

Sure! I think we can take a step further and extract the if logic to a
separate function.

> > +             } else {
> > +                     char *buf, *output;
> > +                     int size;
> > +                     char *pathname;
> > +                     struct delay_entry *entry;
> > +                     struct strbuf input = STRBUF_INIT;
> > +
> > +                     pathname = packet_key_val_read("pathname");
> > +                     if (!pathname)
> > +                             die("unexpected EOF while expecting pathname");
> > +                     fprintf(logfile, " %s", pathname);
>
> Again, let's `fflush(logfile)` here.
>
> > +
> > +                     /* Read until flush */
> > +                     buf = packet_read_line(0, &size);
> > +                     while (buf) {
>
> Let's write this in more idiomatic C:
>
>                         while ((buf = packet_read_line(0, &size))) {
>
> > +                             if (!strcmp(buf, "can-delay=1")) {
> > +                                     entry = strmap_get(&delay, pathname);
> > +                                     if (entry && !entry->requested) {
> > +                                             entry->requested = 1;
> > +                                     } else if (!entry && always_delay) {
> > +                                             entry = xcalloc(1, sizeof(*entry));
> > +                                             entry->requested = 1;
> > +                                             entry->count = 1;
> > +                                             strmap_put(&delay, pathname, entry);
>
> I guess here is our chance to extend the signature of `add_delay_entry()`
> to accept a `requested` parameter, and to call that here.
>
> > +                                     }
> > +                             } else if (starts_with(buf, "ref=") ||
> > +                                        starts_with(buf, "treeish=") ||
> > +                                        starts_with(buf, "blob=")) {
> > +                                     fprintf(logfile, " %s", buf);
> > +                             } else {
> > +                                     /*
> > +                                      * In general, filters need to be graceful about
> > +                                      * new metadata, since it's documented that we
> > +                                      * can pass any key-value pairs, but for tests,
> > +                                      * let's be a little stricter.
> > +                                      */
> > +                                     die("Unknown message '%s'", buf);
> > +                             }
> > +                             buf = packet_read_line(0, &size);
> > +                     }
> > +
> > +
> > +                     read_packetized_to_strbuf(0, &input, 0);
> > +                     fprintf(logfile, " %"PRIuMAX" [OK] -- ", (uintmax_t)input.len);
>
> This reads _so much nicer_ than the Perl version!
>
> > +
> > +                     entry = strmap_get(&delay, pathname);
> > +                     if (entry && entry->output) {
> > +                             output = entry->output;
> > +                     } else if (!strcmp(pathname, "error.r") || !strcmp(pathname, "abort.r")) {
> > +                             output = "";
> > +                     } else if (!strcmp(command, "clean") && has_capability("clean")) {
> > +                             output = rot13(input.buf);
> > +                     } else if (!strcmp(command, "smudge") && has_capability("smudge")) {
> > +                             output = rot13(input.buf);
> > +                     } else {
> > +                             die("bad command '%s'", command);
> > +                     }
> > +
> > +                     if (!strcmp(pathname, "error.r")) {
> > +                             fprintf(logfile, "[ERROR]\n");
> > +                             packet_write_fmt(1, "status=error");
> > +                             packet_flush(1);
> > +                     } else if (!strcmp(pathname, "abort.r")) {
> > +                             fprintf(logfile, "[ABORT]\n");
> > +                             packet_write_fmt(1, "status=abort");
> > +                             packet_flush(1);
> > +                     } else if (!strcmp(command, "smudge") &&
> > +                                (entry = strmap_get(&delay, pathname)) &&
> > +                                entry->requested == 1) {
> > +                             fprintf(logfile, "[DELAYED]\n");
> > +                             packet_write_fmt(1, "status=delayed");
> > +                             packet_flush(1);
> > +                             entry->requested = 2;
> > +                             entry->output = xstrdup(output);
>
> We need to call `free(entry->output)` before that lest we leak memory, but
> only if `output` is not identical anyway:
>
>                                 if (entry->output != output) {
>                                         free(entry->output);
>                                         entry->output = xstrdup(output);
>                                 }

I think entry->output will always be NULL here, since we only get
inside this if block after entry->requested has been set to 1 at the
top of the function; and, at that point, we haven't run rot13 yet.
Nevertheless, it doesn't hurt to add the free call anyway :)

>
> > +                     } else {
> > +                             int i, nr_packets;
> > +                             size_t output_len;
> > +                             struct strbuf sb = STRBUF_INIT;
> > +                             packet_write_fmt(1, "status=success");
> > +                             packet_flush(1);
> > +
> > +                             strbuf_addf(&sb, "%s-write-fail.r", command);
> > +                             if (!strcmp(pathname, sb.buf)) {
>
> We can easily avoid allocating the string just for comparing it:
>
>                                 const char *p;
>
>                                 if (skip_prefix(pathname, command, &p) &&
>                                     !strcmp(p, "-write-fail.r")) {
>
> > +                                     fprintf(logfile, "[WRITE FAIL]\n");
>
>                                         fflush(logfile) ;-)
>
> > +                                     die("%s write error", command);
> > +                             }
> > +
> > +                             output_len = strlen(output);
> > +                             fprintf(logfile, "OUT: %"PRIuMAX" ", (uintmax_t)output_len);
> > +
> > +                             if (write_packetized_from_buf_no_flush_count(output,
> > +                                     output_len, 1, &nr_packets))
> > +                                     die("failed to write buffer to stdout");
> > +                             packet_flush(1);
> > +
> > +                             for (i = 0; i < nr_packets; i++)
> > +                                     fprintf(logfile, ".");
>
> That's not quite the same as the Perl script does: it prints a '.'
> (without flushing, though) _every_ time it wrote a packet.
>
> If you want to emulate that, you will have to copy/edit that loop (and in
> that case, the insanely long-named function
> `write_packetized_from_buf_no_flush_count()` is unnecessary, too).

Hmm, I'm not sure we need to emulate that. I do dislike the huge
function name as well, but I also don't quite like to repeat code
copying that loop here...

> > +                             fprintf(logfile, " [OK]\n");
> > +
> > +                             packet_flush(1);
> > +                             strbuf_release(&sb);
> > +                     }
> > +                     free(pathname);
> > +                     strbuf_release(&input);
> > +             }
> > +             free(command);
> > +     }
> > +}
> > [...]
> > +static void packet_initialize(const char *name, int version)
> > +{
> > +     struct strbuf sb = STRBUF_INIT;
> > +     int size;
> > +     char *pkt_buf = packet_read_line(0, &size);
> > +
> > +     strbuf_addf(&sb, "%s-client", name);
> > +     if (!pkt_buf || strncmp(pkt_buf, sb.buf, size))
>
> We do not need the flexibility of the Perl package, where `name` is a
> parameter. We can hard-code `git-filter-client` here. I.e. something like
> this:
>
>         if (!pkt_buf || size != 17 ||
>             strncmp(pkt_buf, "git-filter-client", 17))

Good idea, thanks! Could we perhaps do:

        if (!pkt_buf || strncmp(pkt_buf, "git-filter-client", size))

to avoid the hard-coded and possibly error-prone 17?
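
One caveat with the strncmp()-only form: it compares at most `size` bytes,
so a shorter payload that is a prefix of the expected string (e.g. "git"
with size 3) would also compare equal. A minimal sketch of a check that
avoids both the magic number and the prefix match is below.
`sketch_packet_equals()` is a hypothetical helper for illustration only, not
something from the patch or from pkt-line.h.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 only when the `size`-byte packet
 * payload is exactly equal to the NUL-terminated string `expected`.
 */
static int sketch_packet_equals(const char *pkt_buf, int size,
				const char *expected)
{
	size_t expected_len = strlen(expected);

	return pkt_buf && size >= 0 && (size_t)size == expected_len &&
	       !memcmp(pkt_buf, expected, expected_len);
}
```

The length is derived from the literal itself, so there is no 17 to keep in
sync, and a NULL or truncated payload is rejected.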

> > +             die("bad initialize: '%s'", xstrndup(pkt_buf, size));
> > +
> > +     strbuf_reset(&sb);
> > +     strbuf_addf(&sb, "version=%d", version);

Thanks for a very detailed review and great suggestions!


* [PATCH v3 0/3] t0021: convert perl script to C test-tool helper
  2022-07-24 15:09 ` [PATCH v2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
  2022-07-28 16:58   ` Johannes Schindelin
@ 2022-07-31 18:19   ` Matheus Tavares
  2022-07-31 18:19     ` [PATCH v3 1/3] t0021: avoid grepping for a Perl-specific string at filter output Matheus Tavares
                       ` (3 more replies)
  1 sibling, 4 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-07-31 18:19 UTC (permalink / raw)
  To: git; +Cc: gitster, avarab, johannes.schindelin

Convert t/t0021/rot13-filter.pl to a test-tool helper to avoid the PERL
prereq in various tests.

Changes since v2:

- Split into 3 patches.
- write_packetized_from_buf_no_flush(): s/counter_ptr/packet_counter/ and
  incremented ptr directly.
- Convert write_packetized_from_buf_no_flush_count() to static inline.
- Simplified rot13 routine.
- Avoided memory allocations at skip_key_dup (now get_value()) and
  packet_initialize().
- Replace unsorted list "remote_caps" by strset.
- Simplified packet_read_and_check_capabilities() (now read_capabilities()),
  to test the three capabilities directly.
- check_and_write_capabilities(): operate on (argv, argc) directly, instead of
  creating a list.
- Moved "struct delay_entry" routines closer to the struct declaration.
- command_loop(): sorted paths after their insertion to the list.
- command_loop(): extracted list_available_blobs logic to a separate function.
- Other small refactoring for more idiomatic code.
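
As a sanity check on the "Simplified rot13 routine" item, here is a
standalone sketch comparing the two per-character formulations: the
modular-arithmetic form used in v2 and the "shift each half of the alphabet"
form used in v3. The `sketch_` function names are made up for this
illustration, and the sketch assumes the standard <ctype.h> in the default
"C" locale rather than git's own ctype macros.

```c
#include <ctype.h>

/* v2-style: rotate within each letter case using modular arithmetic. */
static char sketch_rot13_mod(char c)
{
	if (c >= 'a' && c <= 'z')
		return 'a' + (c - 'a' + 13) % 26;
	if (c >= 'A' && c <= 'Z')
		return 'A' + (c - 'A' + 13) % 26;
	return c;
}

/* v3-style: letters before 'n'/'N' move up by 13, the rest move down. */
static char sketch_rot13_half(char c)
{
	unsigned char uc = (unsigned char)c;

	if (isalpha(uc))
		return c + (tolower(uc) < 'n' ? 13 : -13);
	return c;
}
```

Both forms should agree on every ASCII character, and applying either one
twice gives back the original character.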

Matheus Tavares (3):
  t0021: avoid grepping for a Perl-specific string at filter output
  t0021: implementation the rot13-filter.pl script in C
  tests: use the new C rot13-filter helper to avoid PERL prereq

 Makefile                                |   1 +
 pkt-line.c                              |   5 +-
 pkt-line.h                              |   8 +-
 t/helper/test-rot13-filter.c            | 379 ++++++++++++++++++++++++
 t/helper/test-tool.c                    |   1 +
 t/helper/test-tool.h                    |   1 +
 t/t0021-conversion.sh                   |  71 +++--
 t/t0021/rot13-filter.pl                 | 247 ---------------
 t/t2080-parallel-checkout-basics.sh     |   7 +-
 t/t2082-parallel-checkout-attributes.sh |   7 +-
 10 files changed, 431 insertions(+), 296 deletions(-)
 create mode 100644 t/helper/test-rot13-filter.c
 delete mode 100644 t/t0021/rot13-filter.pl

Range-diff against v2:
-:  ---------- > 1:  5ec95c7e69 t0021: avoid grepping for a Perl-specific string at filter output
-:  ---------- > 2:  86e6baba46 t0021: implementation the rot13-filter.pl script in C
1:  f38f722de7 ! 3:  c66fc0a186 t/t0021: convert the rot13-filter.pl script to C
    @@ Metadata
     Author: Matheus Tavares <matheus.bernardino@usp.br>
     
      ## Commit message ##
    -    t/t0021: convert the rot13-filter.pl script to C
    +    tests: use the new C rot13-filter helper to avoid PERL prereq
     
    -    This script is currently used by three test files: t0021-conversion.sh,
    -    t2080-parallel-checkout-basics.sh, and
    -    t2082-parallel-checkout-attributes.sh. To avoid the need for the PERL
    -    dependency at these tests, let's convert the script to a C test-tool
    -    command.
    -
    -    Note that there is a small adjustment needed at test t0021-conversion.sh
    -    because it depended on a specific error message given by perl's die
    -    routine.
    +    The previous commit implemented a C version of the t0021/rot13-filter.pl
    +    script. Let's use this new C helper to eliminate the PERL prereq from
    +    various tests, and also remove the superseded Perl script.
     
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
    - ## Makefile ##
    -@@ Makefile: TEST_BUILTINS_OBJS += test-read-midx.o
    - TEST_BUILTINS_OBJS += test-ref-store.o
    - TEST_BUILTINS_OBJS += test-reftable.o
    - TEST_BUILTINS_OBJS += test-regex.o
    -+TEST_BUILTINS_OBJS += test-rot13-filter.o
    - TEST_BUILTINS_OBJS += test-repository.o
    - TEST_BUILTINS_OBJS += test-revision-walking.o
    - TEST_BUILTINS_OBJS += test-run-command.o
    -
    - ## pkt-line.c ##
    -@@ pkt-line.c: int write_packetized_from_fd_no_flush(int fd_in, int fd_out)
    - 	return err;
    - }
    - 
    --int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
    -+int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
    -+					     int fd_out, int *count_ptr)
    - {
    --	int err = 0;
    -+	int err = 0, count = 0;
    - 	size_t bytes_written = 0;
    - 	size_t bytes_to_write;
    - 
    -@@ pkt-line.c: int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_ou
    - 			break;
    - 		err = packet_write_gently(fd_out, src_in + bytes_written, bytes_to_write);
    - 		bytes_written += bytes_to_write;
    -+		count++;
    - 	}
    -+	if (count_ptr)
    -+		*count_ptr = count;
    - 	return err;
    - }
    - 
    -+int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
    -+{
    -+	return write_packetized_from_buf_no_flush_count(src_in, len, fd_out, NULL);
    -+}
    -+
    - static int get_packet_data(int fd, char **src_buf, size_t *src_size,
    - 			   void *dst, unsigned size, int options)
    - {
    -
    - ## pkt-line.h ##
    -@@ pkt-line.h: int packet_flush_gently(int fd);
    - int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
    - int write_packetized_from_fd_no_flush(int fd_in, int fd_out);
    - int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out);
    -+int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
    -+					     int fd_out, int *count_ptr);
    - 
    - /*
    -  * Stdio versions of packet_write functions. When mixing these with fd
    -
    - ## t/helper/test-rot13-filter.c (new) ##
    -@@
    -+/*
    -+ * Example implementation for the Git filter protocol version 2
    -+ * See Documentation/gitattributes.txt, section "Filter Protocol"
    -+ *
    -+ * Usage: test-tool rot13-filter [--always-delay] <log path> <capabilities>
    -+ *
    -+ * Log path defines a debug log file that the script writes to. The
    -+ * subsequent arguments define a list of supported protocol capabilities
    -+ * ("clean", "smudge", etc).
    -+ *
    -+ * When --always-delay is given all pathnames with the "can-delay" flag
    -+ * that don't appear on the list below are delayed with a count of 1
    -+ * (see more below).
    -+ *
    -+ * This implementation supports special test cases:
    -+ * (1) If data with the pathname "clean-write-fail.r" is processed with
    -+ *     a "clean" operation then the write operation will die.
    -+ * (2) If data with the pathname "smudge-write-fail.r" is processed with
    -+ *     a "smudge" operation then the write operation will die.
    -+ * (3) If data with the pathname "error.r" is processed with any
    -+ *     operation then the filter signals that it cannot or does not want
    -+ *     to process the file.
    -+ * (4) If data with the pathname "abort.r" is processed with any
    -+ *     operation then the filter signals that it cannot or does not want
    -+ *     to process the file and any file after that is processed with the
    -+ *     same command.
    -+ * (5) If data with a pathname that is a key in the delay hash is
    -+ *     requested (e.g. "test-delay10.a") then the filter responds with
    -+ *     a "delay" status and sets the "requested" field in the delay hash.
    -+ *     The filter will signal the availability of this object after
    -+ *     "count" (field in delay hash) "list_available_blobs" commands.
    -+ * (6) If data with the pathname "missing-delay.a" is processed that the
    -+ *     filter will drop the path from the "list_available_blobs" response.
    -+ * (7) If data with the pathname "invalid-delay.a" is processed that the
    -+ *     filter will add the path "unfiltered" which was not delayed before
    -+ *     to the "list_available_blobs" response.
    -+ */
    -+
    -+#include "test-tool.h"
    -+#include "pkt-line.h"
    -+#include "string-list.h"
    -+#include "strmap.h"
    -+
    -+static FILE *logfile;
    -+static int always_delay;
    -+static struct strmap delay = STRMAP_INIT;
    -+static struct string_list requested_caps = STRING_LIST_INIT_NODUP;
    -+
    -+static int has_capability(const char *cap)
    -+{
    -+	return unsorted_string_list_has_string(&requested_caps, cap);
    -+}
    -+
    -+static char *rot13(char *str)
    -+{
    -+	char *c;
    -+	for (c = str; *c; c++) {
    -+		if (*c >= 'a' && *c <= 'z')
    -+			*c = 'a' + (*c - 'a' + 13) % 26;
    -+		else if (*c >= 'A' && *c <= 'Z')
    -+			*c = 'A' + (*c - 'A' + 13) % 26;
    -+	}
    -+	return str;
    -+}
    -+
    -+static char *skip_key_dup(const char *buf, size_t size, const char *key)
    -+{
    -+	struct strbuf keybuf = STRBUF_INIT;
    -+	strbuf_addf(&keybuf, "%s=", key);
    -+	if (!skip_prefix_mem(buf, size, keybuf.buf, &buf, &size) || !size)
    -+		die("bad %s: '%s'", key, xstrndup(buf, size));
    -+	strbuf_release(&keybuf);
    -+	return xstrndup(buf, size);
    -+}
    -+
    -+/*
    -+ * Read a text packet, expecting that it is in the form "key=value" for
    -+ * the given key. An EOF does not trigger any error and is reported
    -+ * back to the caller with NULL. Die if the "key" part of "key=value" does
    -+ * not match the given key, or the value part is empty.
    -+ */
    -+static char *packet_key_val_read(const char *key)
    -+{
    -+	int size;
    -+	char *buf;
    -+	if (packet_read_line_gently(0, &size, &buf) < 0)
    -+		return NULL;
    -+	return skip_key_dup(buf, size, key);
    -+}
    -+
    -+static void packet_read_capabilities(struct string_list *caps)
    -+{
    -+	while (1) {
    -+		int size;
    -+		char *buf = packet_read_line(0, &size);
    -+		if (!buf)
    -+			break;
    -+		string_list_append_nodup(caps,
    -+					 skip_key_dup(buf, size, "capability"));
    -+	}
    -+}
    -+
    -+/* Read remote capabilities and check them against capabilities we require */
    -+static void packet_read_and_check_capabilities(struct string_list *remote_caps,
    -+					       struct string_list *required_caps)
    -+{
    -+	struct string_list_item *item;
    -+	packet_read_capabilities(remote_caps);
    -+	for_each_string_list_item(item, required_caps) {
    -+		if (!unsorted_string_list_has_string(remote_caps, item->string)) {
    -+			die("required '%s' capability not available from remote",
    -+			    item->string);
    -+		}
    -+	}
    -+}
    -+
    -+/*
    -+ * Check our capabilities we want to advertise against the remote ones
    -+ * and then advertise our capabilities
    -+ */
    -+static void packet_check_and_write_capabilities(struct string_list *remote_caps,
    -+						struct string_list *our_caps)
    -+{
    -+	struct string_list_item *item;
    -+	for_each_string_list_item(item, our_caps) {
    -+		if (!unsorted_string_list_has_string(remote_caps, item->string)) {
    -+			die("our capability '%s' is not available from remote",
    -+			    item->string);
    -+		}
    -+		packet_write_fmt(1, "capability=%s\n", item->string);
    -+	}
    -+	packet_flush(1);
    -+}
    -+
    -+struct delay_entry {
    -+	int requested, count;
    -+	char *output;
    -+};
    -+
    -+static void command_loop(void)
    -+{
    -+	while (1) {
    -+		char *command = packet_key_val_read("command");
    -+		if (!command) {
    -+			fprintf(logfile, "STOP\n");
    -+			break;
    -+		}
    -+		fprintf(logfile, "IN: %s", command);
    -+
    -+		if (!strcmp(command, "list_available_blobs")) {
    -+			struct hashmap_iter iter;
    -+			struct strmap_entry *ent;
    -+			struct string_list_item *str_item;
    -+			struct string_list paths = STRING_LIST_INIT_NODUP;
    -+
    -+			/* flush */
    -+			if (packet_read_line(0, NULL))
    -+				die("bad list_available_blobs end");
    -+
    -+			strmap_for_each_entry(&delay, &iter, ent) {
    -+				struct delay_entry *delay_entry = ent->value;
    -+				if (!delay_entry->requested)
    -+					continue;
    -+				delay_entry->count--;
    -+				if (!strcmp(ent->key, "invalid-delay.a")) {
    -+					/* Send Git a pathname that was not delayed earlier */
    -+					packet_write_fmt(1, "pathname=unfiltered");
    -+				}
    -+				if (!strcmp(ent->key, "missing-delay.a")) {
    -+					/* Do not signal Git that this file is available */
    -+				} else if (!delay_entry->count) {
    -+					string_list_insert(&paths, ent->key);
    -+					packet_write_fmt(1, "pathname=%s", ent->key);
    -+				}
    -+			}
    -+
    -+			/* Print paths in sorted order. */
    -+			for_each_string_list_item(str_item, &paths)
    -+				fprintf(logfile, " %s", str_item->string);
    -+			string_list_clear(&paths, 0);
    -+
    -+			packet_flush(1);
    -+
    -+			fprintf(logfile, " [OK]\n");
    -+			packet_write_fmt(1, "status=success");
    -+			packet_flush(1);
    -+		} else {
    -+			char *buf, *output;
    -+			int size;
    -+			char *pathname;
    -+			struct delay_entry *entry;
    -+			struct strbuf input = STRBUF_INIT;
    -+
    -+			pathname = packet_key_val_read("pathname");
    -+			if (!pathname)
    -+				die("unexpected EOF while expecting pathname");
    -+			fprintf(logfile, " %s", pathname);
    -+
    -+			/* Read until flush */
    -+			buf = packet_read_line(0, &size);
    -+			while (buf) {
    -+				if (!strcmp(buf, "can-delay=1")) {
    -+					entry = strmap_get(&delay, pathname);
    -+					if (entry && !entry->requested) {
    -+						entry->requested = 1;
    -+					} else if (!entry && always_delay) {
    -+						entry = xcalloc(1, sizeof(*entry));
    -+						entry->requested = 1;
    -+						entry->count = 1;
    -+						strmap_put(&delay, pathname, entry);
    -+					}
    -+				} else if (starts_with(buf, "ref=") ||
    -+					   starts_with(buf, "treeish=") ||
    -+					   starts_with(buf, "blob=")) {
    -+					fprintf(logfile, " %s", buf);
    -+				} else {
    -+					/*
    -+					 * In general, filters need to be graceful about
    -+					 * new metadata, since it's documented that we
    -+					 * can pass any key-value pairs, but for tests,
    -+					 * let's be a little stricter.
    -+					 */
    -+					die("Unknown message '%s'", buf);
    -+				}
    -+				buf = packet_read_line(0, &size);
    -+			}
    -+
    -+
    -+			read_packetized_to_strbuf(0, &input, 0);
    -+			fprintf(logfile, " %"PRIuMAX" [OK] -- ", (uintmax_t)input.len);
    -+
    -+			entry = strmap_get(&delay, pathname);
    -+			if (entry && entry->output) {
    -+				output = entry->output;
    -+			} else if (!strcmp(pathname, "error.r") || !strcmp(pathname, "abort.r")) {
    -+				output = "";
    -+			} else if (!strcmp(command, "clean") && has_capability("clean")) {
    -+				output = rot13(input.buf);
    -+			} else if (!strcmp(command, "smudge") && has_capability("smudge")) {
    -+				output = rot13(input.buf);
    -+			} else {
    -+				die("bad command '%s'", command);
    -+			}
    -+
    -+			if (!strcmp(pathname, "error.r")) {
    -+				fprintf(logfile, "[ERROR]\n");
    -+				packet_write_fmt(1, "status=error");
    -+				packet_flush(1);
    -+			} else if (!strcmp(pathname, "abort.r")) {
    -+				fprintf(logfile, "[ABORT]\n");
    -+				packet_write_fmt(1, "status=abort");
    -+				packet_flush(1);
    -+			} else if (!strcmp(command, "smudge") &&
    -+				   (entry = strmap_get(&delay, pathname)) &&
    -+				   entry->requested == 1) {
    -+				fprintf(logfile, "[DELAYED]\n");
    -+				packet_write_fmt(1, "status=delayed");
    -+				packet_flush(1);
    -+				entry->requested = 2;
    -+				entry->output = xstrdup(output);
    -+			} else {
    -+				int i, nr_packets;
    -+				size_t output_len;
    -+				struct strbuf sb = STRBUF_INIT;
    -+				packet_write_fmt(1, "status=success");
    -+				packet_flush(1);
    -+
    -+				strbuf_addf(&sb, "%s-write-fail.r", command);
    -+				if (!strcmp(pathname, sb.buf)) {
    -+					fprintf(logfile, "[WRITE FAIL]\n");
    -+					die("%s write error", command);
    -+				}
    -+
    -+				output_len = strlen(output);
    -+				fprintf(logfile, "OUT: %"PRIuMAX" ", (uintmax_t)output_len);
    -+
    -+				if (write_packetized_from_buf_no_flush_count(output,
    -+					output_len, 1, &nr_packets))
    -+					die("failed to write buffer to stdout");
    -+				packet_flush(1);
    -+
    -+				for (i = 0; i < nr_packets; i++)
    -+					fprintf(logfile, ".");
    -+				fprintf(logfile, " [OK]\n");
    -+
    -+				packet_flush(1);
    -+				strbuf_release(&sb);
    -+			}
    -+			free(pathname);
    -+			strbuf_release(&input);
    -+		}
    -+		free(command);
    -+	}
    -+}
    -+
    -+static void free_delay_hash(void)
    -+{
    -+	struct hashmap_iter iter;
    -+	struct strmap_entry *ent;
    -+
    -+	strmap_for_each_entry(&delay, &iter, ent) {
    -+		struct delay_entry *delay_entry = ent->value;
    -+		free(delay_entry->output);
    -+		free(delay_entry);
    -+	}
    -+	strmap_clear(&delay, 0);
    -+}
    -+
    -+static void add_delay_entry(char *pathname, int count)
    -+{
    -+	struct delay_entry *entry = xcalloc(1, sizeof(*entry));
    -+	entry->count = count;
    -+	if (strmap_put(&delay, pathname, entry))
    -+		BUG("adding the same path twice to delay hash?");
    -+}
    -+
    -+static void packet_initialize(const char *name, int version)
    -+{
    -+	struct strbuf sb = STRBUF_INIT;
    -+	int size;
    -+	char *pkt_buf = packet_read_line(0, &size);
    -+
    -+	strbuf_addf(&sb, "%s-client", name);
    -+	if (!pkt_buf || strncmp(pkt_buf, sb.buf, size))
    -+		die("bad initialize: '%s'", xstrndup(pkt_buf, size));
    -+
    -+	strbuf_reset(&sb);
    -+	strbuf_addf(&sb, "version=%d", version);
    -+	pkt_buf = packet_read_line(0, &size);
    -+	if (!pkt_buf || strncmp(pkt_buf, sb.buf, size))
    -+		die("bad version: '%s'", xstrndup(pkt_buf, size));
    -+
    -+	pkt_buf = packet_read_line(0, &size);
    -+	if (pkt_buf)
    -+		die("bad version end: '%s'", xstrndup(pkt_buf, size));
    -+
    -+	packet_write_fmt(1, "%s-server", name);
    -+	packet_write_fmt(1, "version=%d", version);
    -+	packet_flush(1);
    -+	strbuf_release(&sb);
    -+}
    -+
    -+static char *rot13_usage = "test-tool rot13-filter [--always-delay] <log path> <capabilities>";
    -+
    -+int cmd__rot13_filter(int argc, const char **argv)
    -+{
    -+	int i = 1;
    -+	struct string_list remote_caps = STRING_LIST_INIT_DUP,
    -+			   supported_caps = STRING_LIST_INIT_NODUP;
    -+
    -+	string_list_append(&supported_caps, "clean");
    -+	string_list_append(&supported_caps, "smudge");
    -+	string_list_append(&supported_caps, "delay");
    -+
    -+	if (argc > 1 && !strcmp(argv[i], "--always-delay")) {
    -+		always_delay = 1;
    -+		i++;
    -+	}
    -+	if (argc - i < 2)
    -+		usage(rot13_usage);
    -+
    -+	logfile = fopen(argv[i++], "a");
    -+	if (!logfile)
    -+		die_errno("failed to open log file");
    -+
    -+	for ( ; i < argc; i++)
    -+		string_list_append(&requested_caps, argv[i]);
    -+
    -+	add_delay_entry("test-delay10.a", 1);
    -+	add_delay_entry("test-delay11.a", 1);
    -+	add_delay_entry("test-delay20.a", 2);
    -+	add_delay_entry("test-delay10.b", 1);
    -+	add_delay_entry("missing-delay.a", 1);
    -+	add_delay_entry("invalid-delay.a", 1);
    -+
    -+	fprintf(logfile, "START\n");
    -+
    -+	packet_initialize("git-filter", 2);
    -+
    -+	packet_read_and_check_capabilities(&remote_caps, &supported_caps);
    -+	packet_check_and_write_capabilities(&remote_caps, &requested_caps);
    -+	fprintf(logfile, "init handshake complete\n");
    -+
    -+	string_list_clear(&supported_caps, 0);
    -+	string_list_clear(&remote_caps, 0);
    -+
    -+	command_loop();
    -+
    -+	fclose(logfile);
    -+	string_list_clear(&requested_caps, 0);
    -+	free_delay_hash();
    -+	return 0;
    -+}
    -
    - ## t/helper/test-tool.c ##
    -@@ t/helper/test-tool.c: static struct test_cmd cmds[] = {
    - 	{ "read-midx", cmd__read_midx },
    - 	{ "ref-store", cmd__ref_store },
    - 	{ "reftable", cmd__reftable },
    -+	{ "rot13-filter", cmd__rot13_filter },
    - 	{ "dump-reftable", cmd__dump_reftable },
    - 	{ "regex", cmd__regex },
    - 	{ "repository", cmd__repository },
    -
    - ## t/helper/test-tool.h ##
    -@@ t/helper/test-tool.h: int cmd__read_cache(int argc, const char **argv);
    - int cmd__read_graph(int argc, const char **argv);
    - int cmd__read_midx(int argc, const char **argv);
    - int cmd__ref_store(int argc, const char **argv);
    -+int cmd__rot13_filter(int argc, const char **argv);
    - int cmd__reftable(int argc, const char **argv);
    - int cmd__regex(int argc, const char **argv);
    - int cmd__repository(int argc, const char **argv);
    -
      ## t/t0021-conversion.sh ##
     @@ t/t0021-conversion.sh: tr \
        'nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM'
    @@ t/t0021-conversion.sh: test_expect_success PERL 'required process filter with cl
      	rm -rf repo &&
      	mkdir repo &&
      	(
    -@@ t/t0021-conversion.sh: test_expect_success PERL 'process filter should restart after unexpected write f
    - 		rm -f debug.log &&
    - 		git checkout --quiet --no-progress . 2>git-stderr.log &&
    - 
    --		grep "smudge write error at" git-stderr.log &&
    -+		grep "smudge write error" git-stderr.log &&
    - 		test_i18ngrep "error: external filter" git-stderr.log &&
    - 
    - 		cat >expected.log <<-EOF &&
     @@ t/t0021-conversion.sh: test_expect_success PERL 'process filter should restart after unexpected write f
      	)
      '
-- 
2.37.1



* [PATCH v3 1/3] t0021: avoid grepping for a Perl-specific string at filter output
  2022-07-31 18:19   ` [PATCH v3 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
@ 2022-07-31 18:19     ` Matheus Tavares
  2022-08-01 20:41       ` Junio C Hamano
  2022-07-31 18:19     ` [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C Matheus Tavares
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 34+ messages in thread
From: Matheus Tavares @ 2022-07-31 18:19 UTC (permalink / raw)
  To: git; +Cc: gitster, avarab, johannes.schindelin

This test sets the t0021/rot13-filter.pl script as a long-running
process filter for a git checkout command. It then expects the filter to
fail, producing a specific error message on stderr. In the following
commits we are going to replace the script with a C test-tool helper,
but the test currently expects the error message in a Perl-specific
format. That is, when you call `die <msg>` in Perl, it emits
"<msg> at - line 1." In preparation for the conversion, let's avoid the
Perl-specific part and only grep for <msg> itself.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 t/t0021-conversion.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 1c840348bd..963b66e08c 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -735,7 +735,7 @@ test_expect_success PERL 'process filter should restart after unexpected write f
 		rm -f debug.log &&
 		git checkout --quiet --no-progress . 2>git-stderr.log &&
 
-		grep "smudge write error at" git-stderr.log &&
+		grep "smudge write error" git-stderr.log &&
 		test_i18ngrep "error: external filter" git-stderr.log &&
 
 		cat >expected.log <<-EOF &&
-- 
2.37.1



* [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-07-31 18:19   ` [PATCH v3 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
  2022-07-31 18:19     ` [PATCH v3 1/3] t0021: avoid grepping for a Perl-specific string at filter output Matheus Tavares
@ 2022-07-31 18:19     ` Matheus Tavares
  2022-08-01 11:33       ` Ævar Arnfjörð Bjarmason
                         ` (3 more replies)
  2022-07-31 18:19     ` [PATCH v3 3/3] tests: use the new C rot13-filter helper to avoid PERL prereq Matheus Tavares
  2022-08-15  1:06     ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
  3 siblings, 4 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-07-31 18:19 UTC (permalink / raw)
  To: git; +Cc: gitster, avarab, johannes.schindelin, Johannes Schindelin

This script is currently used by three test files: t0021-conversion.sh,
t2080-parallel-checkout-basics.sh, and
t2082-parallel-checkout-attributes.sh. To avoid the need for the PERL
dependency at these tests, let's convert the script to a C test-tool
command. The following commit will modify these tests to use the new
C helper and remove the Perl script.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Makefile                     |   1 +
 pkt-line.c                   |   5 +-
 pkt-line.h                   |   8 +-
 t/helper/test-rot13-filter.c | 379 +++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c         |   1 +
 t/helper/test-tool.h         |   1 +
 6 files changed, 393 insertions(+), 2 deletions(-)
 create mode 100644 t/helper/test-rot13-filter.c

diff --git a/Makefile b/Makefile
index 04d0fd1fe6..7cfcf3a911 100644
--- a/Makefile
+++ b/Makefile
@@ -764,6 +764,7 @@ TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
 TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
+TEST_BUILTINS_OBJS += test-rot13-filter.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
 TEST_BUILTINS_OBJS += test-run-command.o
diff --git a/pkt-line.c b/pkt-line.c
index 8e43c2def4..ce4e73b683 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -309,7 +309,8 @@ int write_packetized_from_fd_no_flush(int fd_in, int fd_out)
 	return err;
 }
 
-int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
+int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
+					     int fd_out, int *packet_counter)
 {
 	int err = 0;
 	size_t bytes_written = 0;
@@ -324,6 +325,8 @@ int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_ou
 			break;
 		err = packet_write_gently(fd_out, src_in + bytes_written, bytes_to_write);
 		bytes_written += bytes_to_write;
+		if (packet_counter)
+			(*packet_counter)++;
 	}
 	return err;
 }
diff --git a/pkt-line.h b/pkt-line.h
index 6d2a63db23..804fe687fb 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -32,7 +32,13 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((f
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int write_packetized_from_fd_no_flush(int fd_in, int fd_out);
-int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out);
+int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
+					     int fd_out, int *packet_counter);
+static inline int write_packetized_from_buf_no_flush(const char *src_in,
+						     size_t len, int fd_out)
+{
+	return write_packetized_from_buf_no_flush_count(src_in, len, fd_out, NULL);
+}
 
 /*
  * Stdio versions of packet_write functions. When mixing these with fd
diff --git a/t/helper/test-rot13-filter.c b/t/helper/test-rot13-filter.c
new file mode 100644
index 0000000000..d584511f8e
--- /dev/null
+++ b/t/helper/test-rot13-filter.c
@@ -0,0 +1,379 @@
+/*
+ * Example implementation for the Git filter protocol version 2
+ * See Documentation/gitattributes.txt, section "Filter Protocol"
+ *
+ * Usage: test-tool rot13-filter [--always-delay] <log path> <capabilities>
+ *
+ * Log path defines a debug log file that the script writes to. The
+ * subsequent arguments define a list of supported protocol capabilities
+ * ("clean", "smudge", etc).
+ *
+ * When --always-delay is given all pathnames with the "can-delay" flag
+ * that don't appear on the list below are delayed with a count of 1
+ * (see more below).
+ *
+ * This implementation supports special test cases:
+ * (1) If data with the pathname "clean-write-fail.r" is processed with
+ *     a "clean" operation then the write operation will die.
+ * (2) If data with the pathname "smudge-write-fail.r" is processed with
+ *     a "smudge" operation then the write operation will die.
+ * (3) If data with the pathname "error.r" is processed with any
+ *     operation then the filter signals that it cannot or does not want
+ *     to process the file.
+ * (4) If data with the pathname "abort.r" is processed with any
+ *     operation then the filter signals that it cannot or does not want
+ *     to process the file and any file after that is processed with the
+ *     same command.
+ * (5) If data with a pathname that is a key in the delay hash is
+ *     requested (e.g. "test-delay10.a") then the filter responds with
+ *     a "delay" status and sets the "requested" field in the delay hash.
+ *     The filter will signal the availability of this object after
+ *     "count" (field in delay hash) "list_available_blobs" commands.
+ * (6) If data with the pathname "missing-delay.a" is processed that the
+ *     filter will drop the path from the "list_available_blobs" response.
+ * (7) If data with the pathname "invalid-delay.a" is processed that the
+ *     filter will add the path "unfiltered" which was not delayed before
+ *     to the "list_available_blobs" response.
+ */
+
+#include "test-tool.h"
+#include "pkt-line.h"
+#include "string-list.h"
+#include "strmap.h"
+
+static FILE *logfile;
+static int always_delay, has_clean_cap, has_smudge_cap;
+static struct strmap delay = STRMAP_INIT;
+
+static char *rot13(char *str)
+{
+	char *c;
+	for (c = str; *c; c++)
+		if (isalpha(*c))
+			*c += tolower(*c) < 'n' ? 13 : -13;
+	return str;
+}
+
+static char *get_value(char *buf, size_t size, const char *key)
+{
+	const char *orig_buf = buf;
+	int orig_size = (int)size;
+
+	if (!skip_prefix_mem((const char *)buf, size, key, (const char **)&buf, &size) ||
+	    !skip_prefix_mem((const char *)buf, size, "=", (const char **)&buf, &size) ||
+	    !size)
+		die("expected key '%s', got '%.*s'",
+		    key, orig_size, orig_buf);
+
+	buf[size] = '\0';
+	return buf;
+}
+
+/*
+ * Read a text packet, expecting that it is in the form "key=value" for
+ * the given key. An EOF does not trigger any error and is reported
+ * back to the caller with NULL. Die if the "key" part of "key=value" does
+ * not match the given key, or the value part is empty.
+ */
+static char *packet_key_val_read(const char *key)
+{
+	int size;
+	char *buf;
+	if (packet_read_line_gently(0, &size, &buf) < 0)
+		return NULL;
+	return xstrdup(get_value(buf, size, key));
+}
+
+static inline void assert_remote_capability(struct strset *caps, const char *cap)
+{
+	if (!strset_contains(caps, cap))
+		die("required '%s' capability not available from remote", cap);
+}
+
+static void read_capabilities(struct strset *remote_caps)
+{
+	for (;;) {
+		int size;
+		char *buf = packet_read_line(0, &size);
+		if (!buf)
+			break;
+		strset_add(remote_caps, get_value(buf, size, "capability"));
+	}
+
+	assert_remote_capability(remote_caps, "clean");
+	assert_remote_capability(remote_caps, "smudge");
+	assert_remote_capability(remote_caps, "delay");
+}
+
+static void check_and_write_capabilities(struct strset *remote_caps,
+					 const char **caps, int caps_count)
+{
+	int i;
+	for (i = 0; i < caps_count; i++) {
+		if (!strset_contains(remote_caps, caps[i]))
+			die("our capability '%s' is not available from remote",
+			    caps[i]);
+		packet_write_fmt(1, "capability=%s\n", caps[i]);
+	}
+	packet_flush(1);
+}
+
+struct delay_entry {
+	int requested, count;
+	char *output;
+};
+
+static void free_delay_entries(void)
+{
+	struct hashmap_iter iter;
+	struct strmap_entry *ent;
+
+	strmap_for_each_entry(&delay, &iter, ent) {
+		struct delay_entry *delay_entry = ent->value;
+		free(delay_entry->output);
+		free(delay_entry);
+	}
+	strmap_clear(&delay, 0);
+}
+
+static void add_delay_entry(char *pathname, int count, int requested)
+{
+	struct delay_entry *entry = xcalloc(1, sizeof(*entry));
+	entry->count = count;
+	entry->requested = requested;
+	if (strmap_put(&delay, pathname, entry))
+		BUG("adding the same path twice to delay hash?");
+}
+
+static void reply_list_available_blobs_cmd(void)
+{
+	struct hashmap_iter iter;
+	struct strmap_entry *ent;
+	struct string_list_item *str_item;
+	struct string_list paths = STRING_LIST_INIT_NODUP;
+
+	/* flush */
+	if (packet_read_line(0, NULL))
+		die("bad list_available_blobs end");
+
+	strmap_for_each_entry(&delay, &iter, ent) {
+		struct delay_entry *delay_entry = ent->value;
+		if (!delay_entry->requested)
+			continue;
+		delay_entry->count--;
+		if (!strcmp(ent->key, "invalid-delay.a")) {
+			/* Send Git a pathname that was not delayed earlier */
+			packet_write_fmt(1, "pathname=unfiltered");
+		}
+		if (!strcmp(ent->key, "missing-delay.a")) {
+			/* Do not signal Git that this file is available */
+		} else if (!delay_entry->count) {
+			string_list_append(&paths, ent->key);
+			packet_write_fmt(1, "pathname=%s", ent->key);
+		}
+	}
+
+	/* Print paths in sorted order. */
+	string_list_sort(&paths);
+	for_each_string_list_item(str_item, &paths)
+		fprintf(logfile, " %s", str_item->string);
+	string_list_clear(&paths, 0);
+
+	packet_flush(1);
+
+	fprintf(logfile, " [OK]\n");
+	packet_write_fmt(1, "status=success");
+	packet_flush(1);
+}
+
+static void command_loop(void)
+{
+	for (;;) {
+		char *buf, *output;
+		int size;
+		char *pathname;
+		struct delay_entry *entry;
+		struct strbuf input = STRBUF_INIT;
+		char *command = packet_key_val_read("command");
+
+		if (!command) {
+			fprintf(logfile, "STOP\n");
+			break;
+		}
+		fprintf(logfile, "IN: %s", command);
+
+		if (!strcmp(command, "list_available_blobs")) {
+			reply_list_available_blobs_cmd();
+			free(command);
+			continue;
+		}
+
+		pathname = packet_key_val_read("pathname");
+		if (!pathname)
+			die("unexpected EOF while expecting pathname");
+		fprintf(logfile, " %s", pathname);
+
+		/* Read until flush */
+		while ((buf = packet_read_line(0, &size))) {
+			if (!strcmp(buf, "can-delay=1")) {
+				entry = strmap_get(&delay, pathname);
+				if (entry && !entry->requested) {
+					entry->requested = 1;
+				} else if (!entry && always_delay) {
+					add_delay_entry(pathname, 1, 1);
+				}
+			} else if (starts_with(buf, "ref=") ||
+				   starts_with(buf, "treeish=") ||
+				   starts_with(buf, "blob=")) {
+				fprintf(logfile, " %s", buf);
+			} else {
+				/*
+				 * In general, filters need to be graceful about
+				 * new metadata, since it's documented that we
+				 * can pass any key-value pairs, but for tests,
+				 * let's be a little stricter.
+				 */
+				die("Unknown message '%s'", buf);
+			}
+		}
+
+
+		read_packetized_to_strbuf(0, &input, 0);
+		fprintf(logfile, " %"PRIuMAX" [OK] -- ", (uintmax_t)input.len);
+
+		entry = strmap_get(&delay, pathname);
+		if (entry && entry->output) {
+			output = entry->output;
+		} else if (!strcmp(pathname, "error.r") || !strcmp(pathname, "abort.r")) {
+			output = "";
+		} else if (!strcmp(command, "clean") && has_clean_cap) {
+			output = rot13(input.buf);
+		} else if (!strcmp(command, "smudge") && has_smudge_cap) {
+			output = rot13(input.buf);
+		} else {
+			die("bad command '%s'", command);
+		}
+
+		if (!strcmp(pathname, "error.r")) {
+			fprintf(logfile, "[ERROR]\n");
+			packet_write_fmt(1, "status=error");
+			packet_flush(1);
+		} else if (!strcmp(pathname, "abort.r")) {
+			fprintf(logfile, "[ABORT]\n");
+			packet_write_fmt(1, "status=abort");
+			packet_flush(1);
+		} else if (!strcmp(command, "smudge") &&
+			   (entry = strmap_get(&delay, pathname)) &&
+			   entry->requested == 1) {
+			fprintf(logfile, "[DELAYED]\n");
+			packet_write_fmt(1, "status=delayed");
+			packet_flush(1);
+			entry->requested = 2;
+			if (entry->output != output) {
+				free(entry->output);
+				entry->output = xstrdup(output);
+			}
+		} else {
+			int i, nr_packets = 0;
+			size_t output_len;
+			const char *p;
+			packet_write_fmt(1, "status=success");
+			packet_flush(1);
+
+			if (skip_prefix(pathname, command, &p) &&
+			    !strcmp(p, "-write-fail.r")) {
+				fprintf(logfile, "[WRITE FAIL]\n");
+				die("%s write error", command);
+			}
+
+			output_len = strlen(output);
+			fprintf(logfile, "OUT: %"PRIuMAX" ", (uintmax_t)output_len);
+
+			if (write_packetized_from_buf_no_flush_count(output,
+				output_len, 1, &nr_packets))
+				die("failed to write buffer to stdout");
+			packet_flush(1);
+
+			for (i = 0; i < nr_packets; i++)
+				fprintf(logfile, ".");
+			fprintf(logfile, " [OK]\n");
+
+			packet_flush(1);
+		}
+		free(pathname);
+		strbuf_release(&input);
+		free(command);
+	}
+}
+
+static void packet_initialize(void)
+{
+	int size;
+	char *pkt_buf = packet_read_line(0, &size);
+
+	if (!pkt_buf || strncmp(pkt_buf, "git-filter-client", size))
+		die("bad initialize: '%.*s'", (int)size, pkt_buf);
+
+	pkt_buf = packet_read_line(0, &size);
+	if (!pkt_buf || strncmp(pkt_buf, "version=2", size))
+		die("bad version: '%.*s'", (int)size, pkt_buf);
+
+	pkt_buf = packet_read_line(0, &size);
+	if (pkt_buf)
+		die("bad version end: '%.*s'", (int)size, pkt_buf);
+
+	packet_write_fmt(1, "git-filter-server");
+	packet_write_fmt(1, "version=2");
+	packet_flush(1);
+}
+
+static char *rot13_usage = "test-tool rot13-filter [--always-delay] <log path> <capabilities>";
+
+int cmd__rot13_filter(int argc, const char **argv)
+{
+	const char **caps;
+	int cap_count, i = 1;
+	struct strset remote_caps = STRSET_INIT;
+
+	if (argc > 1 && !strcmp(argv[1], "--always-delay")) {
+		always_delay = 1;
+		i++;
+	}
+	if (argc - i < 2)
+		usage(rot13_usage);
+
+	logfile = fopen(argv[i++], "a");
+	if (!logfile)
+		die_errno("failed to open log file");
+
+	caps = argv + i;
+	cap_count = argc - i;
+
+	for (i = 0; i < cap_count; i++) {
+		if (!strcmp(caps[i], "clean"))
+			has_clean_cap = 1;
+		else if (!strcmp(caps[i], "smudge"))
+			has_smudge_cap = 1;
+	}
+
+	add_delay_entry("test-delay10.a", 1, 0);
+	add_delay_entry("test-delay11.a", 1, 0);
+	add_delay_entry("test-delay20.a", 2, 0);
+	add_delay_entry("test-delay10.b", 1, 0);
+	add_delay_entry("missing-delay.a", 1, 0);
+	add_delay_entry("invalid-delay.a", 1, 0);
+
+	fprintf(logfile, "START\n");
+	packet_initialize();
+
+	read_capabilities(&remote_caps);
+	check_and_write_capabilities(&remote_caps, caps, cap_count);
+	fprintf(logfile, "init handshake complete\n");
+	strset_clear(&remote_caps);
+
+	command_loop();
+
+	fclose(logfile);
+	free_delay_entries();
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 318fdbab0c..d6a560f832 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -65,6 +65,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "rot13-filter", cmd__rot13_filter },
 	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index bb79927163..21a91b1019 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -54,6 +54,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__rot13_filter(int argc, const char **argv);
 int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
-- 
2.37.1



* [PATCH v3 3/3] tests: use the new C rot13-filter helper to avoid PERL prereq
  2022-07-31 18:19   ` [PATCH v3 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
  2022-07-31 18:19     ` [PATCH v3 1/3] t0021: avoid grepping for a Perl-specific string at filter output Matheus Tavares
  2022-07-31 18:19     ` [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C Matheus Tavares
@ 2022-07-31 18:19     ` Matheus Tavares
  2022-08-15  1:06     ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
  3 siblings, 0 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-07-31 18:19 UTC (permalink / raw)
  To: git; +Cc: gitster, avarab, johannes.schindelin

The previous commit implemented a C version of the t0021/rot13-filter.pl
script. Let's use this new C helper to eliminate the PERL prereq from
various tests, and also remove the superseded Perl script.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 t/t0021-conversion.sh                   |  69 ++++---
 t/t0021/rot13-filter.pl                 | 247 ------------------------
 t/t2080-parallel-checkout-basics.sh     |   7 +-
 t/t2082-parallel-checkout-attributes.sh |   7 +-
 4 files changed, 37 insertions(+), 293 deletions(-)
 delete mode 100644 t/t0021/rot13-filter.pl

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 963b66e08c..aeaa8e02ed 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -17,9 +17,6 @@ tr \
   'nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM'
 EOF
 
-write_script rot13-filter.pl "$PERL_PATH" \
-	<"$TEST_DIRECTORY"/t0021/rot13-filter.pl
-
 generate_random_characters () {
 	LEN=$1
 	NAME=$2
@@ -365,8 +362,8 @@ test_expect_success 'diff does not reuse worktree files that need cleaning' '
 	test_line_count = 0 count
 '
 
-test_expect_success PERL 'required process filter should filter data' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should filter data' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -450,8 +447,8 @@ test_expect_success PERL 'required process filter should filter data' '
 	)
 '
 
-test_expect_success PERL 'required process filter should filter data for various subcommands' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should filter data for various subcommands' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	(
 		cd repo &&
@@ -561,9 +558,9 @@ test_expect_success PERL 'required process filter should filter data for various
 	)
 '
 
-test_expect_success PERL 'required process filter takes precedence' '
+test_expect_success 'required process filter takes precedence' '
 	test_config_global filter.protocol.clean false &&
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -587,8 +584,8 @@ test_expect_success PERL 'required process filter takes precedence' '
 	)
 '
 
-test_expect_success PERL 'required process filter should be used only for "clean" operation only' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
+test_expect_success 'required process filter should be used only for "clean" operation only' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -622,8 +619,8 @@ test_expect_success PERL 'required process filter should be used only for "clean
 	)
 '
 
-test_expect_success PERL 'required process filter should process multiple packets' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should process multiple packets' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 
 	rm -rf repo &&
@@ -687,8 +684,8 @@ test_expect_success PERL 'required process filter should process multiple packet
 	)
 '
 
-test_expect_success PERL 'required process filter with clean error should fail' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter with clean error should fail' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -706,8 +703,8 @@ test_expect_success PERL 'required process filter with clean error should fail'
 	)
 '
 
-test_expect_success PERL 'process filter should restart after unexpected write failure' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter should restart after unexpected write failure' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -761,8 +758,8 @@ test_expect_success PERL 'process filter should restart after unexpected write f
 	)
 '
 
-test_expect_success PERL 'process filter should not be restarted if it signals an error' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter should not be restarted if it signals an error' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -804,8 +801,8 @@ test_expect_success PERL 'process filter should not be restarted if it signals a
 	)
 '
 
-test_expect_success PERL 'process filter abort stops processing of all further files' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter abort stops processing of all further files' '
+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -861,10 +858,10 @@ test_expect_success PERL 'invalid process filter must fail (and not hang!)' '
 	)
 '
 
-test_expect_success PERL 'delayed checkout in process filter' '
-	test_config_global filter.a.process "rot13-filter.pl a.log clean smudge delay" &&
+test_expect_success 'delayed checkout in process filter' '
+	test_config_global filter.a.process "test-tool rot13-filter a.log clean smudge delay" &&
 	test_config_global filter.a.required true &&
-	test_config_global filter.b.process "rot13-filter.pl b.log clean smudge delay" &&
+	test_config_global filter.b.process "test-tool rot13-filter b.log clean smudge delay" &&
 	test_config_global filter.b.required true &&
 
 	rm -rf repo &&
@@ -940,8 +937,8 @@ test_expect_success PERL 'delayed checkout in process filter' '
 	)
 '
 
-test_expect_success PERL 'missing file in delayed checkout' '
-	test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&
+test_expect_success 'missing file in delayed checkout' '
+	test_config_global filter.bug.process "test-tool rot13-filter bug.log clean smudge delay" &&
 	test_config_global filter.bug.required true &&
 
 	rm -rf repo &&
@@ -960,8 +957,8 @@ test_expect_success PERL 'missing file in delayed checkout' '
 	grep "error: .missing-delay\.a. was not filtered properly" git-stderr.log
 '
 
-test_expect_success PERL 'invalid file in delayed checkout' '
-	test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&
+test_expect_success 'invalid file in delayed checkout' '
+	test_config_global filter.bug.process "test-tool rot13-filter bug.log clean smudge delay" &&
 	test_config_global filter.bug.required true &&
 
 	rm -rf repo &&
@@ -990,10 +987,10 @@ do
 		mode_prereq='UTF8_NFD_TO_NFC' ;;
 	esac
 
-	test_expect_success PERL,SYMLINKS,$mode_prereq \
+	test_expect_success SYMLINKS,$mode_prereq \
 	"delayed checkout with $mode-collision don't write to the wrong place" '
 		test_config_global filter.delay.process \
-			"\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+			"test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
 		test_config_global filter.delay.required true &&
 
 		git init $mode-collision &&
@@ -1026,12 +1023,12 @@ do
 	'
 done
 
-test_expect_success PERL,SYMLINKS,CASE_INSENSITIVE_FS \
+test_expect_success SYMLINKS,CASE_INSENSITIVE_FS \
 "delayed checkout with submodule collision don't write to the wrong place" '
 	git init collision-with-submodule &&
 	(
 		cd collision-with-submodule &&
-		git config filter.delay.process "\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		# We need Git to treat the submodule "a" and the
@@ -1062,11 +1059,11 @@ test_expect_success PERL,SYMLINKS,CASE_INSENSITIVE_FS \
 	)
 '
 
-test_expect_success PERL 'setup for progress tests' '
+test_expect_success 'setup for progress tests' '
 	git init progress &&
 	(
 		cd progress &&
-		git config filter.delay.process "rot13-filter.pl delay-progress.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter delay-progress.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		echo "*.a filter=delay" >.gitattributes &&
@@ -1132,12 +1129,12 @@ do
 	'
 done
 
-test_expect_success PERL 'delayed checkout correctly reports the number of updated entries' '
+test_expect_success 'delayed checkout correctly reports the number of updated entries' '
 	rm -rf repo &&
 	git init repo &&
 	(
 		cd repo &&
-		git config filter.delay.process "../rot13-filter.pl delayed.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter delayed.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		echo "*.a filter=delay" >.gitattributes &&
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
deleted file mode 100644
index 7bb93768f3..0000000000
--- a/t/t0021/rot13-filter.pl
+++ /dev/null
@@ -1,247 +0,0 @@
-#
-# Example implementation for the Git filter protocol version 2
-# See Documentation/gitattributes.txt, section "Filter Protocol"
-#
-# Usage: rot13-filter.pl [--always-delay] <log path> <capabilities>
-#
-# Log path defines a debug log file that the script writes to. The
-# subsequent arguments define a list of supported protocol capabilities
-# ("clean", "smudge", etc).
-#
-# When --always-delay is given all pathnames with the "can-delay" flag
-# that don't appear on the list bellow are delayed with a count of 1
-# (see more below).
-#
-# This implementation supports special test cases:
-# (1) If data with the pathname "clean-write-fail.r" is processed with
-#     a "clean" operation then the write operation will die.
-# (2) If data with the pathname "smudge-write-fail.r" is processed with
-#     a "smudge" operation then the write operation will die.
-# (3) If data with the pathname "error.r" is processed with any
-#     operation then the filter signals that it cannot or does not want
-#     to process the file.
-# (4) If data with the pathname "abort.r" is processed with any
-#     operation then the filter signals that it cannot or does not want
-#     to process the file and any file after that is processed with the
-#     same command.
-# (5) If data with a pathname that is a key in the DELAY hash is
-#     requested (e.g. "test-delay10.a") then the filter responds with
-#     a "delay" status and sets the "requested" field in the DELAY hash.
-#     The filter will signal the availability of this object after
-#     "count" (field in DELAY hash) "list_available_blobs" commands.
-# (6) If data with the pathname "missing-delay.a" is processed that the
-#     filter will drop the path from the "list_available_blobs" response.
-# (7) If data with the pathname "invalid-delay.a" is processed that the
-#     filter will add the path "unfiltered" which was not delayed before
-#     to the "list_available_blobs" response.
-#
-
-use 5.008;
-sub gitperllib {
-	# Git assumes that all path lists are Unix-y colon-separated ones. But
-	# when the Git for Windows executes the test suite, its MSYS2 Bash
-	# calls git.exe, and colon-separated path lists are converted into
-	# Windows-y semicolon-separated lists of *Windows* paths (which
-	# naturally contain a colon after the drive letter, so splitting by
-	# colons simply does not cut it).
-	#
-	# Detect semicolon-separated path list and handle them appropriately.
-
-	if ($ENV{GITPERLLIB} =~ /;/) {
-		return split(/;/, $ENV{GITPERLLIB});
-	}
-	return split(/:/, $ENV{GITPERLLIB});
-}
-use lib (gitperllib());
-use strict;
-use warnings;
-use IO::File;
-use Git::Packet;
-
-my $MAX_PACKET_CONTENT_SIZE = 65516;
-
-my $always_delay = 0;
-if ( $ARGV[0] eq '--always-delay' ) {
-	$always_delay = 1;
-	shift @ARGV;
-}
-
-my $log_file                = shift @ARGV;
-my @capabilities            = @ARGV;
-
-open my $debug, ">>", $log_file or die "cannot open log file: $!";
-
-my %DELAY = (
-	'test-delay10.a' => { "requested" => 0, "count" => 1 },
-	'test-delay11.a' => { "requested" => 0, "count" => 1 },
-	'test-delay20.a' => { "requested" => 0, "count" => 2 },
-	'test-delay10.b' => { "requested" => 0, "count" => 1 },
-	'missing-delay.a' => { "requested" => 0, "count" => 1 },
-	'invalid-delay.a' => { "requested" => 0, "count" => 1 },
-);
-
-sub rot13 {
-	my $str = shift;
-	$str =~ y/A-Za-z/N-ZA-Mn-za-m/;
-	return $str;
-}
-
-print $debug "START\n";
-$debug->flush();
-
-packet_initialize("git-filter", 2);
-
-my %remote_caps = packet_read_and_check_capabilities("clean", "smudge", "delay");
-packet_check_and_write_capabilities(\%remote_caps, @capabilities);
-
-print $debug "init handshake complete\n";
-$debug->flush();
-
-while (1) {
-	my ( $res, $command ) = packet_key_val_read("command");
-	if ( $res == -1 ) {
-		print $debug "STOP\n";
-		exit();
-	}
-	print $debug "IN: $command";
-	$debug->flush();
-
-	if ( $command eq "list_available_blobs" ) {
-		# Flush
-		packet_compare_lists([1, ""], packet_bin_read()) ||
-			die "bad list_available_blobs end";
-
-		foreach my $pathname ( sort keys %DELAY ) {
-			if ( $DELAY{$pathname}{"requested"} >= 1 ) {
-				$DELAY{$pathname}{"count"} = $DELAY{$pathname}{"count"} - 1;
-				if ( $pathname eq "invalid-delay.a" ) {
-					# Send Git a pathname that was not delayed earlier
-					packet_txt_write("pathname=unfiltered");
-				}
-				if ( $pathname eq "missing-delay.a" ) {
-					# Do not signal Git that this file is available
-				} elsif ( $DELAY{$pathname}{"count"} == 0 ) {
-					print $debug " $pathname";
-					packet_txt_write("pathname=$pathname");
-				}
-			}
-		}
-
-		packet_flush();
-
-		print $debug " [OK]\n";
-		$debug->flush();
-		packet_txt_write("status=success");
-		packet_flush();
-	} else {
-		my ( $res, $pathname ) = packet_key_val_read("pathname");
-		if ( $res == -1 ) {
-			die "unexpected EOF while expecting pathname";
-		}
-		print $debug " $pathname";
-		$debug->flush();
-
-		# Read until flush
-		my ( $done, $buffer ) = packet_txt_read();
-		while ( $buffer ne '' ) {
-			if ( $buffer eq "can-delay=1" ) {
-				if ( exists $DELAY{$pathname} and $DELAY{$pathname}{"requested"} == 0 ) {
-					$DELAY{$pathname}{"requested"} = 1;
-				} elsif ( !exists $DELAY{$pathname} and $always_delay ) {
-					$DELAY{$pathname} = { "requested" => 1, "count" => 1 };
-				}
-			} elsif ($buffer =~ /^(ref|treeish|blob)=/) {
-				print $debug " $buffer";
-			} else {
-				# In general, filters need to be graceful about
-				# new metadata, since it's documented that we
-				# can pass any key-value pairs, but for tests,
-				# let's be a little stricter.
-				die "Unknown message '$buffer'";
-			}
-
-			( $done, $buffer ) = packet_txt_read();
-		}
-		if ( $done == -1 ) {
-			die "unexpected EOF after pathname '$pathname'";
-		}
-
-		my $input = "";
-		{
-			binmode(STDIN);
-			my $buffer;
-			my $done = 0;
-			while ( !$done ) {
-				( $done, $buffer ) = packet_bin_read();
-				$input .= $buffer;
-			}
-			if ( $done == -1 ) {
-				die "unexpected EOF while reading input for '$pathname'";
-			}			
-			print $debug " " . length($input) . " [OK] -- ";
-			$debug->flush();
-		}
-
-		my $output;
-		if ( exists $DELAY{$pathname} and exists $DELAY{$pathname}{"output"} ) {
-			$output = $DELAY{$pathname}{"output"}
-		} elsif ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
-			$output = "";
-		} elsif ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) {
-			$output = rot13($input);
-		} elsif ( $command eq "smudge" and grep( /^smudge$/, @capabilities ) ) {
-			$output = rot13($input);
-		} else {
-			die "bad command '$command'";
-		}
-
-		if ( $pathname eq "error.r" ) {
-			print $debug "[ERROR]\n";
-			$debug->flush();
-			packet_txt_write("status=error");
-			packet_flush();
-		} elsif ( $pathname eq "abort.r" ) {
-			print $debug "[ABORT]\n";
-			$debug->flush();
-			packet_txt_write("status=abort");
-			packet_flush();
-		} elsif ( $command eq "smudge" and
-			exists $DELAY{$pathname} and
-			$DELAY{$pathname}{"requested"} == 1 ) {
-			print $debug "[DELAYED]\n";
-			$debug->flush();
-			packet_txt_write("status=delayed");
-			packet_flush();
-			$DELAY{$pathname}{"requested"} = 2;
-			$DELAY{$pathname}{"output"} = $output;
-		} else {
-			packet_txt_write("status=success");
-			packet_flush();
-
-			if ( $pathname eq "${command}-write-fail.r" ) {
-				print $debug "[WRITE FAIL]\n";
-				$debug->flush();
-				die "${command} write error";
-			}
-
-			print $debug "OUT: " . length($output) . " ";
-			$debug->flush();
-
-			while ( length($output) > 0 ) {
-				my $packet = substr( $output, 0, $MAX_PACKET_CONTENT_SIZE );
-				packet_bin_write($packet);
-				# dots represent the number of packets
-				print $debug ".";
-				if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) {
-					$output = substr( $output, $MAX_PACKET_CONTENT_SIZE );
-				} else {
-					$output = "";
-				}
-			}
-			packet_flush();
-			print $debug " [OK]\n";
-			$debug->flush();
-			packet_flush();
-		}
-	}
-}
diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh
index c683e60007..7d956625ca 100755
--- a/t/t2080-parallel-checkout-basics.sh
+++ b/t/t2080-parallel-checkout-basics.sh
@@ -230,12 +230,9 @@ test_expect_success SYMLINKS 'parallel checkout checks for symlinks in leading d
 # check the final report including sequential, parallel, and delayed entries
 # all at the same time. So we must have finer control of the parallel checkout
 # variables.
-test_expect_success PERL '"git checkout ." report should not include failed entries' '
-	write_script rot13-filter.pl "$PERL_PATH" \
-		<"$TEST_DIRECTORY"/t0021/rot13-filter.pl &&
-
+test_expect_success '"git checkout ." report should not include failed entries' '
 	test_config_global filter.delay.process \
-		"\"$(pwd)/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+		"test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
 	test_config_global filter.delay.required true &&
 	test_config_global filter.cat.clean cat  &&
 	test_config_global filter.cat.smudge cat  &&
diff --git a/t/t2082-parallel-checkout-attributes.sh b/t/t2082-parallel-checkout-attributes.sh
index 2525457961..2df55b9405 100755
--- a/t/t2082-parallel-checkout-attributes.sh
+++ b/t/t2082-parallel-checkout-attributes.sh
@@ -138,12 +138,9 @@ test_expect_success 'parallel-checkout and external filter' '
 # The delayed queue is independent from the parallel queue, and they should be
 # able to work together in the same checkout process.
 #
-test_expect_success PERL 'parallel-checkout and delayed checkout' '
-	write_script rot13-filter.pl "$PERL_PATH" \
-		<"$TEST_DIRECTORY"/t0021/rot13-filter.pl &&
-
+test_expect_success 'parallel-checkout and delayed checkout' '
 	test_config_global filter.delay.process \
-		"\"$(pwd)/rot13-filter.pl\" --always-delay \"$(pwd)/delayed.log\" clean smudge delay" &&
+		"test-tool rot13-filter --always-delay \"$(pwd)/delayed.log\" clean smudge delay" &&
 	test_config_global filter.delay.required true &&
 
 	echo "abcd" >original &&
-- 
2.37.1



* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-07-31 18:19     ` [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C Matheus Tavares
@ 2022-08-01 11:33       ` Ævar Arnfjörð Bjarmason
  2022-08-02  0:16         ` Matheus Tavares
  2022-08-01 11:39       ` Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 34+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-01 11:33 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, gitster, Johannes Schindelin


On Sun, Jul 31 2022, Matheus Tavares wrote:


> +static char *rot13_usage = "test-tool rot13-filter [--always-delay] <log path> <capabilities>";
> +
> +int cmd__rot13_filter(int argc, const char **argv)
> +{
> +	const char **caps;
> +	int cap_count, i = 1;
> +	struct strset remote_caps = STRSET_INIT;
> +
> +	if (argc > 1 && !strcmp(argv[1], "--always-delay")) {
> +		always_delay = 1;
> +		i++;
> +	}
> +	if (argc - i < 2)
> +		usage(rot13_usage);
> +
> +	logfile = fopen(argv[i++], "a");
> +	if (!logfile)
> +		die_errno("failed to open log file");
> +
> +	caps = argv + i;
> +	cap_count = argc - i;

Since you need to change every single caller consider just starting out
with parse_options() here instead of rolling your own parsing. You could
use it for --always-delay in any case, but you could also just add a
--log-path and --capability (an OPT_STRING_LIST), so:

	test-tool rot13-filter [--always-delay] --log-path=<path> [--capability <capability>]...
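
To make the proposed command line concrete, here is a rough standalone
sketch. It uses getopt_long() purely as a stand-in for parse_options(),
so that it compiles outside the Git tree; an in-tree version would
presumably use OPT_BOOL, OPT_STRING and OPT_STRING_LIST instead. The
names "struct rot13_opts", "parse_rot13_opts" and MAX_CAPS are
hypothetical and appear nowhere in the patch.

```c
#include <getopt.h>
#include <stddef.h>

#define MAX_CAPS 8 /* arbitrary limit chosen for this sketch */

struct rot13_opts {
	int always_delay;
	const char *log_path;
	const char *caps[MAX_CAPS];
	int nr_caps;
};

/*
 * Parse "[--always-delay] --log-path=<path> [--capability <cap>]...".
 * Returns 0 on success and -1 on a usage error.
 */
static int parse_rot13_opts(int argc, char **argv, struct rot13_opts *o)
{
	static const struct option longopts[] = {
		{ "always-delay", no_argument, NULL, 'd' },
		{ "log-path", required_argument, NULL, 'l' },
		{ "capability", required_argument, NULL, 'c' },
		{ NULL, 0, NULL, 0 }
	};
	int c;

	o->always_delay = 0;
	o->log_path = NULL;
	o->nr_caps = 0;
	optind = 1; /* restart scanning if called more than once */

	while ((c = getopt_long(argc, argv, "", longopts, NULL)) != -1) {
		switch (c) {
		case 'd':
			o->always_delay = 1;
			break;
		case 'l':
			o->log_path = optarg;
			break;
		case 'c':
			/* each --capability appends one entry */
			if (o->nr_caps == MAX_CAPS)
				return -1;
			o->caps[o->nr_caps++] = optarg;
			break;
		default:
			/* unknown option or missing argument */
			return -1;
		}
	}

	/* the log path is mandatory; stray positional arguments are rejected */
	if (!o->log_path || optind != argc)
		return -1;
	return 0;
}
```

With named options, a caller no longer depends on the position of the
log path relative to the capability list.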

> +
> +	for (i = 0; i < cap_count; i++) {
> +		if (!strcmp(caps[i], "clean"))
> +			has_clean_cap = 1;
> +		else if (!strcmp(caps[i], "smudge"))
> +			has_smudge_cap = 1;

In any case, maybe BUG() in an "else" here with "unknown capability"?
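
A minimal sketch of what I mean, written as standalone C so it can be
compiled on its own. Note that the callers in this series also pass
"delay", so a strict "else" has to accept that name as well. The names
"struct filter_caps" and "parse_caps" are made up for illustration, and
the -1 return stands in for where the helper would call BUG().

```c
#include <stddef.h>
#include <string.h>

struct filter_caps {
	int has_clean, has_smudge, has_delay;
};

/*
 * Record the capabilities named in caps[0..n-1].
 * Returns 0 on success. On an unknown name, stores it in *bad and
 * returns -1 (the point where the real helper would BUG()).
 */
static int parse_caps(const char **caps, int n,
		      struct filter_caps *out, const char **bad)
{
	int i;

	memset(out, 0, sizeof(*out));
	for (i = 0; i < n; i++) {
		if (!strcmp(caps[i], "clean"))
			out->has_clean = 1;
		else if (!strcmp(caps[i], "smudge"))
			out->has_smudge = 1;
		else if (!strcmp(caps[i], "delay"))
			out->has_delay = 1;
		else {
			*bad = caps[i];
			return -1;
		}
	}
	return 0;
}
```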

> +	fclose(logfile);

Perhaps check the return value & die_errno() if we fail to fclose()
(happens e.g. if the disk fills up).
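
Roughly like this, as a standalone sketch. The helper name
"close_log_checked" is hypothetical, and it reports through
fprintf()/strerror() only so that it builds outside the Git tree; in
the helper itself the error branch would simply be a die_errno().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Close a log stream without ignoring the result. Buffered data is
 * written out by fclose(), so a full disk can surface only here.
 * Returns 0 on success and -1 after reporting the error.
 */
static int close_log_checked(FILE *fp, const char *path)
{
	if (fclose(fp)) {
		fprintf(stderr, "failed to close log file '%s': %s\n",
			path, strerror(errno));
		return -1;
	}
	return 0;
}
```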


* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-07-31 18:19     ` [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C Matheus Tavares
  2022-08-01 11:33       ` Ævar Arnfjörð Bjarmason
@ 2022-08-01 11:39       ` Ævar Arnfjörð Bjarmason
  2022-08-01 21:18       ` Junio C Hamano
  2022-08-09 10:47       ` Johannes Schindelin
  3 siblings, 0 replies; 34+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-01 11:39 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, gitster, Johannes Schindelin


On Sun, Jul 31 2022, Matheus Tavares wrote:

> +static void reply_list_available_blobs_cmd(void)
> +{
> +	struct hashmap_iter iter;
> +	struct strmap_entry *ent;
> +	struct string_list_item *str_item;
> +	struct string_list paths = STRING_LIST_INIT_NODUP;
> +
> +	/* flush */
> +	if (packet_read_line(0, NULL))
> +		die("bad list_available_blobs end");

Shouldn't anything that's not an OS error (e.g. write error) be a BUG()
instead in this code? I.e. it would be a bug in our own testcode if we
feed the wrong data here, or if pkt-line doesn't work as we expect...

> +
> +	strmap_for_each_entry(&delay, &iter, ent) {
> +		struct delay_entry *delay_entry = ent->value;
> +		if (!delay_entry->requested)
> +			continue;
> +		delay_entry->count--;
> +		if (!strcmp(ent->key, "invalid-delay.a")) {
> +			/* Send Git a pathname that was not delayed earlier */
> +			packet_write_fmt(1, "pathname=unfiltered");
> +		}
> +		if (!strcmp(ent->key, "missing-delay.a")) {
> +			/* Do not signal Git that this file is available */
> +		} else if (!delay_entry->count) {
> +			string_list_append(&paths, ent->key);
> +			packet_write_fmt(1, "pathname=%s", ent->key);
> +		}
> +	}
> +
> +	/* Print paths in sorted order. */
> +	string_list_sort(&paths);
> +	for_each_string_list_item(str_item, &paths)
> +		fprintf(logfile, " %s", str_item->string);
> +	string_list_clear(&paths, 0);
> +
> +	packet_flush(1);
> +
> +	fprintf(logfile, " [OK]\n");

I think it should be called out in the commit message that this is not
what the Perl version is doing, i.e. it does things like:

	print $debug " [OK]\n";
	$debug->flush();

After having previously printed the equivalent of your
for_each_string_list_item() to the log file.

In Perl anything that uses PerlIO is subject to internal buffering,
which doesn't have the same semantics as stdio buffering.

I think in this case it won't matter, since you're not expecting to have
concurrent writers. You could even use fputc() here.

But a faithful reproduction of the Perl version would be something like
appending the output here to a "struct strbuf", and then "flushing" it
at the end when the perl version does a "$debug->flush()".

I don't think that's worth the effort here, and we should just say that
it doesn't matter. I just think we should note it. Thanks!
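
For illustration only, the "collect, then flush" variant could look
roughly like this. It is a sketch that assumes a fixed-size buffer in
place of a "struct strbuf"; log_buf, log_addf() and log_flush() are
made-up names, not existing Git API:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical fixed-size log buffer standing in for a "struct strbuf". */
struct log_buf {
	char data[4096];
	size_t len;
};

/* Append formatted text; output that does not fit is truncated. */
static void log_addf(struct log_buf *lb, const char *fmt, ...)
{
	va_list ap;
	int n;
	size_t room = sizeof(lb->data) - lb->len;

	va_start(ap, fmt);
	n = vsnprintf(lb->data + lb->len, room, fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	lb->len += (size_t)n < room ? (size_t)n : room - 1;
}

/* Write everything collected so far and reset, like "$debug->flush()". */
static int log_flush(struct log_buf *lb, FILE *out)
{
	size_t written = fwrite(lb->data, 1, lb->len, out);
	int err = written != lb->len || fflush(out);

	lb->len = 0;
	return err ? -1 : 0;
}
```

Each fprintf(logfile, ...) call would become a log_addf() call, with a
single log_flush() at the points where the Perl script flushes.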

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] t0021: avoid grepping for a Perl-specific string at filter output
  2022-07-31 18:19     ` [PATCH v3 1/3] t0021: avoid grepping for a Perl-specific string at filter output Matheus Tavares
@ 2022-08-01 20:41       ` Junio C Hamano
  0 siblings, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2022-08-01 20:41 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, avarab, johannes.schindelin

Matheus Tavares <matheus.bernardino@usp.br> writes:

> This test sets the t0021/rot13-filter.pl script as a long-running
> process filter for a git checkout command. It then expects the filter to
> fail producing a specific error message at stderr. In the following
> commits we are going to replace the script with a C test-tool helper,
> but the test currently expects the error message in a Perl-specific
> format. That is, when you call `die <msg>` in Perl, it emits
> "<msg> at - line 1." In preparation for the conversion, let's avoid the
> Perl-specific part and only grep for <msg> itself.

Sounds sane.  I am a bit surprised that we check for messages from
the external filter tool, actually, rather than messages we would
emit in response to an error by the filter tool, which ought to be
more stable no matter how the external tool expresses its failures.

But the posted change gets the job done perfectly fine, so it is OK.

Thanks.

> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  t/t0021-conversion.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> index 1c840348bd..963b66e08c 100755
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -735,7 +735,7 @@ test_expect_success PERL 'process filter should restart after unexpected write f
>  		rm -f debug.log &&
>  		git checkout --quiet --no-progress . 2>git-stderr.log &&
>  
> -		grep "smudge write error at" git-stderr.log &&
> +		grep "smudge write error" git-stderr.log &&
>  		test_i18ngrep "error: external filter" git-stderr.log &&
>  
>  		cat >expected.log <<-EOF &&

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-07-31 18:19     ` [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C Matheus Tavares
  2022-08-01 11:33       ` Ævar Arnfjörð Bjarmason
  2022-08-01 11:39       ` Ævar Arnfjörð Bjarmason
@ 2022-08-01 21:18       ` Junio C Hamano
  2022-08-02  0:13         ` Matheus Tavares
                           ` (2 more replies)
  2022-08-09 10:47       ` Johannes Schindelin
  3 siblings, 3 replies; 34+ messages in thread
From: Junio C Hamano @ 2022-08-01 21:18 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, avarab, johannes.schindelin

Matheus Tavares <matheus.bernardino@usp.br> writes:

> +static char *get_value(char *buf, size_t size, const char *key)
> +{
> +	const char *orig_buf = buf;
> +	int orig_size = (int)size;
> +
> +	if (!skip_prefix_mem((const char *)buf, size, key, (const char **)&buf, &size) ||
> +	    !skip_prefix_mem((const char *)buf, size, "=", (const char **)&buf, &size) ||
> +	    !size)

So, skip_prefix_mem(), when it successfully parses the prefix out,
advances buf[] to skip the prefix and shortens size by the same
amount, so buf[size] is pointing at the same byte.  The code wants
to make sure buf[] begins with the "<key>=", skip that part, so
presumably buf[] after the above part moves to the beginning of
<value> in the "<key>=<value>" string?  It also wants to reject
"<key>=", i.e. an empty string as the <value>?

> +		die("expected key '%s', got '%.*s'",
> +		    key, orig_size, orig_buf);
> +
> +	buf[size] = '\0';

I find this assignment somewhat strange, but primarily because it
uses the updated buf[size] that ought to be pointing at the same
byte as the original buf[size].  Is this necessary because buf[size]
upon the entry to this function does not necessarily have NUL there?

Reading ahead,

 * packet_key_val_read() feeds the buffer taken from
   packet_read_line_gently(), so buf[size] should be NUL terminated
   already.

 * read_capabilities() feeds the buffer taken from
   packet_read_line(), so buf[size] should be NUL terminated
   already.

> +	return buf;
> +}

And the caller gets the byte position that begins the <value> part.

> +static char *packet_key_val_read(const char *key)
> +{
> +	int size;
> +	char *buf;
> +	if (packet_read_line_gently(0, &size, &buf) < 0)
> +		return NULL;
> +	return xstrdup(get_value(buf, size, key));
> +}

The returned value from get_value() is pointing into
pkt-line.c::packet_buffer[], so we return a copy to the caller,
which takes the ownership.  OK.

> +static inline void assert_remote_capability(struct strset *caps, const char *cap)
> +{
> +	if (!strset_contains(caps, cap))
> +		die("required '%s' capability not available from remote", cap);
> +}
> +
> +static void read_capabilities(struct strset *remote_caps)
> +{
> +	for (;;) {
> +		int size;
> +		char *buf = packet_read_line(0, &size);
> +		if (!buf)
> +			break;
> +		strset_add(remote_caps, get_value(buf, size, "capability"));
> +	}

strset_add() creates a copy of what get_value() borrowed from
pkt-line.c::packet_buffer[] here, which is good.

> +	assert_remote_capability(remote_caps, "clean");
> +	assert_remote_capability(remote_caps, "smudge");
> +	assert_remote_capability(remote_caps, "delay");
> +}

> +static void command_loop(void)
> +{
> +	for (;;) {
> +		char *buf, *output;
> +		int size;
> +		char *pathname;
> +		struct delay_entry *entry;
> +		struct strbuf input = STRBUF_INIT;
> +		char *command = packet_key_val_read("command");
> +
> +		if (!command) {
> +			fprintf(logfile, "STOP\n");
> +			break;
> +		}
> +		fprintf(logfile, "IN: %s", command);
> +
> +		if (!strcmp(command, "list_available_blobs")) {
> +			reply_list_available_blobs_cmd();
> +			free(command);
> +			continue;
> +		}

OK.

> +		pathname = packet_key_val_read("pathname");
> +		if (!pathname)
> +			die("unexpected EOF while expecting pathname");
> +		fprintf(logfile, " %s", pathname);
> +
> +		/* Read until flush */
> +		while ((buf = packet_read_line(0, &size))) {
> +			if (!strcmp(buf, "can-delay=1")) {
> +				entry = strmap_get(&delay, pathname);
> +				if (entry && !entry->requested) {
> +					entry->requested = 1;
> +				} else if (!entry && always_delay) {
> +					add_delay_entry(pathname, 1, 1);
> +				}

These are unnecessary {} around single statement blocks, but let's
let it pass in a test helper.

> +			} else if (starts_with(buf, "ref=") ||
> +				   starts_with(buf, "treeish=") ||
> +				   starts_with(buf, "blob=")) {
> +				fprintf(logfile, " %s", buf);
> +			} else {
> +				/*
> +				 * In general, filters need to be graceful about
> +				 * new metadata, since it's documented that we
> +				 * can pass any key-value pairs, but for tests,
> +				 * let's be a little stricter.
> +				 */
> +				die("Unknown message '%s'", buf);
> +			}
> +		}
> +
> +
> +		read_packetized_to_strbuf(0, &input, 0);

I do not see a need for double blank lines above.

> +		fprintf(logfile, " %"PRIuMAX" [OK] -- ", (uintmax_t)input.len);
> +
> +		entry = strmap_get(&delay, pathname);
> +		if (entry && entry->output) {
> +			output = entry->output;
> +		} else if (!strcmp(pathname, "error.r") || !strcmp(pathname, "abort.r")) {
> +			output = "";
> +		} else if (!strcmp(command, "clean") && has_clean_cap) {
> +			output = rot13(input.buf);
> +		} else if (!strcmp(command, "smudge") && has_smudge_cap) {
> +			output = rot13(input.buf);
> +		} else {
> +			die("bad command '%s'", command);
> +		}

Good.  At this point, output always points at something and itself
does not own the memory it is pointing at.

> +		if (!strcmp(pathname, "error.r")) {
> +			fprintf(logfile, "[ERROR]\n");
> +			packet_write_fmt(1, "status=error");
> +			packet_flush(1);
> +		} else if (!strcmp(pathname, "abort.r")) {
> +			fprintf(logfile, "[ABORT]\n");
> +			packet_write_fmt(1, "status=abort");
> +			packet_flush(1);
> +		} else if (!strcmp(command, "smudge") &&
> +			   (entry = strmap_get(&delay, pathname)) &&
> +			   entry->requested == 1) {
> +			fprintf(logfile, "[DELAYED]\n");
> +			packet_write_fmt(1, "status=delayed");
> +			packet_flush(1);
> +			entry->requested = 2;
> +			if (entry->output != output) {
> +				free(entry->output);
> +				entry->output = xstrdup(output);
> +			}
> +		} else {
> +			int i, nr_packets = 0;
> +			size_t output_len;
> +			const char *p;
> +			packet_write_fmt(1, "status=success");
> +			packet_flush(1);
> +
> +			if (skip_prefix(pathname, command, &p) &&
> +			    !strcmp(p, "-write-fail.r")) {
> +				fprintf(logfile, "[WRITE FAIL]\n");
> +				die("%s write error", command);
> +			}
> +
> +			output_len = strlen(output);
> +			fprintf(logfile, "OUT: %"PRIuMAX" ", (uintmax_t)output_len);
> +
> +			if (write_packetized_from_buf_no_flush_count(output,
> +				output_len, 1, &nr_packets))
> +				die("failed to write buffer to stdout");
> +			packet_flush(1);
> +
> +			for (i = 0; i < nr_packets; i++)
> +				fprintf(logfile, ".");
> +			fprintf(logfile, " [OK]\n");
> +
> +			packet_flush(1);
> +		}
> +		free(pathname);
> +		strbuf_release(&input);
> +		free(command);
> +	}
> +}

OK, at this point we are done with pathname and command so we can
free them for the next round.  input was used as a scratch buffer
and we are done with it, too.

Looking good.

Thanks.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-08-01 21:18       ` Junio C Hamano
@ 2022-08-02  0:13         ` Matheus Tavares
  2022-08-09 10:00         ` Johannes Schindelin
  2022-08-09 10:37         ` Johannes Schindelin
  2 siblings, 0 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-08-02  0:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, avarab, johannes.schindelin

On Mon, Aug 1, 2022 at 6:18 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Matheus Tavares <matheus.bernardino@usp.br> writes:
>
> > +             die("expected key '%s', got '%.*s'",
> > +                 key, orig_size, orig_buf);
> > +
> > +     buf[size] = '\0';
>
> I find this assignment somewhat strange, but primarily because it
> uses the updated buf[size] that ought to be pointing at the same
> byte as the original buf[size].  Is this necessary because buf[size]
> upon the entry to this function does not necessarily have NUL there?
>
> Reading ahead,
>
>  * packet_key_val_read() feeds the buffer taken from
>    packet_read_line_gently(), so buf[size] should be NUL terminated
>    already.
>
>  * read_capabilities() feeds the buffer taken from
>    packet_read_line(), so buf[size] should be NUL terminated
>    already.
>
> > +     return buf;
> > +}
>
> And the caller gets the byte position that begins the <value> part.

Good point. I'll remove the buf[size] = '\0' assignment.

> > +                             if (entry && !entry->requested) {
> > +                                     entry->requested = 1;
> > +                             } else if (!entry && always_delay) {
> > +                                     add_delay_entry(pathname, 1, 1);
> > +                             }
>
> These are unnecessary {} around single statement blocks, but let's
> let it pass in a test helper.
> > [...]
> > +                             die("Unknown message '%s'", buf);
> > +                     }
> > +             }
> > +
> > +
> > +             read_packetized_to_strbuf(0, &input, 0);
>
> I do not see a need for double blank lines above.

Oops, I will fix these too. Thanks.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-08-01 11:33       ` Ævar Arnfjörð Bjarmason
@ 2022-08-02  0:16         ` Matheus Tavares
  2022-08-09  9:45           ` Johannes Schindelin
  0 siblings, 1 reply; 34+ messages in thread
From: Matheus Tavares @ 2022-08-02  0:16 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, gitster, Johannes Schindelin

On Mon, Aug 1, 2022 at 8:37 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> On Sun, Jul 31 2022, Matheus Tavares wrote:
> >
> > +
> > +     caps = argv + i;
> > +     cap_count = argc - i;
>
> Since you need to change every single caller consider just starting out
> with parse_options() here instead of rolling your own parsing. You could
> use it for --always-delay in any case, but you could also just add a
> --log-path and --capability (an OPT_STRING_LIST), so:
>
>         test-tool rot13-filter [--always-delay] --log-path=<path> [--capability <capability>]...

Ah, makes sense. Thanks

> > +
> > +     for (i = 0; i < cap_count; i++) {
> > +             if (!strcmp(caps[i], "clean"))
> > +                     has_clean_cap = 1;
> > +             else if (!strcmp(caps[i], "smudge"))
> > +                     has_smudge_cap = 1;
>
> In any case, maybe BUG() in an "else" here with "unknown capability"?

Yup, will do.

> > +     fclose(logfile);
>
> Perhaps check the return value & die_errno() if we fail to fclose()
> (happens e.g. if the disk fills up).

Sure. Thanks.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2] t/t0021: convert the rot13-filter.pl script to C
  2022-07-31  2:52     ` Matheus Tavares
@ 2022-08-09  9:36       ` Johannes Schindelin
  0 siblings, 0 replies; 34+ messages in thread
From: Johannes Schindelin @ 2022-08-09  9:36 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, gitster, larsxschneider, christian.couder, avarab

Hi Matheus,

On Sat, 30 Jul 2022, Matheus Tavares wrote:

> On Thu, Jul 28, 2022 at 1:58 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> > > On Sun, 24 Jul 2022, Matheus Tavares wrote:
> > >
> > > +static void command_loop(void)
> > > +{
> > > +     while (1) {
> > > +             char *command = packet_key_val_read("command");
> > > +             if (!command) {
> > > +                     fprintf(logfile, "STOP\n");
> > > +                     break;
> > > +             }
> > > +             fprintf(logfile, "IN: %s", command);
> >
> > We will also need to `fflush(logfile)` here, to imitate the Perl script's
> > behavior more precisely.
>
> I was somewhat intrigued as to why the flushes were needed in the Perl
> script. But reading [1] and [2], now, it seems to have been an
> oversight.
>
> That is, Eric suggested explicitly flushing stdout because it is a
> pipe, but the author ended up erroneously disabling autoflush for
> stdout too, so that's why we needed the flushes there. They later
> acknowledged that and said that they would re-enable it (see [2]),
> but it seems to have been forgotten. So I think we can safely drop the
> flush calls.
>
> [1]: http://public-inbox.org/git/20160723072721.GA20875%40starla/
> [2]: https://lore.kernel.org/git/7F1F1A0E-8FC3-4FBD-81AA-37786DE0EF50@gmail.com/

I am somewhat wary of introducing a change of behavior while
reimplementing a Perl script in C at the same time, but in this instance I
think that the benefit of _not_ touching the `pkt-line.c` code is a
convincing reason to do so.

> > > +
> > > +             if (!strcmp(command, "list_available_blobs")) {
> > > +                     struct hashmap_iter iter;
> > > +                     struct strmap_entry *ent;
> > > +                     struct string_list_item *str_item;
> > > +                     struct string_list paths = STRING_LIST_INIT_NODUP;
> > > +
> > > +                     /* flush */
> > > +                     if (packet_read_line(0, NULL))
> > > +                             die("bad list_available_blobs end");
> > > +
> > > +                     strmap_for_each_entry(&delay, &iter, ent) {
> > > +                             struct delay_entry *delay_entry = ent->value;
> > > +                             if (!delay_entry->requested)
> > > +                                     continue;
> > > +                             delay_entry->count--;
> > > +                             if (!strcmp(ent->key, "invalid-delay.a")) {
> > > +                                     /* Send Git a pathname that was not delayed earlier */
> > > +                                     packet_write_fmt(1, "pathname=unfiltered");
> > > +                             }
> > > +                             if (!strcmp(ent->key, "missing-delay.a")) {
> > > +                                     /* Do not signal Git that this file is available */
> > > +                             } else if (!delay_entry->count) {
> > > +                                     string_list_insert(&paths, ent->key);
> > > +                                     packet_write_fmt(1, "pathname=%s", ent->key);
> > > +                             }
> > > +                     }
> > > +
> > > +                     /* Print paths in sorted order. */
> >
> > The Perl script does not order them specifically. Do we really have to do
> > that here?
>
> It actually prints them in sorted order:
>
>         foreach my $pathname ( sort keys %DELAY )

Whoops, sorry for missing that!

> > > +                             fprintf(logfile, " [OK]\n");
> > > +
> > > +                             packet_flush(1);
> > > +                             strbuf_release(&sb);
> > > +                     }
> > > +                     free(pathname);
> > > +                     strbuf_release(&input);
> > > +             }
> > > +             free(command);
> > > +     }
> > > +}
> > > [...]
> > > +static void packet_initialize(const char *name, int version)
> > > +{
> > > +     struct strbuf sb = STRBUF_INIT;
> > > +     int size;
> > > +     char *pkt_buf = packet_read_line(0, &size);
> > > +
> > > +     strbuf_addf(&sb, "%s-client", name);
> > > +     if (!pkt_buf || strncmp(pkt_buf, sb.buf, size))
> >
> > We do not need the flexibility of the Perl package, where `name` is a
> > parameter. We can hard-code `git-filter-client` here. I.e. something like
> > this:
> >
> >         if (!pkt_buf || size != 17 ||
> >             strncmp(pkt_buf, "git-filter-client", 17))
>
> Good idea! Thanks. Perhaps, can't we do:
>
>         if (!pkt_buf || strncmp(pkt_buf, "git-filter-client", size))
>
> to avoid the hard-coded and possibly error-prone 17?

I am afraid that this is not equivalent. If `pkt_buf` is "git" and `size`
is 3, then the suggested `strncmp()` would return 0, but we would want it
to be non-zero.

The best way to avoid the hard-coded 17 would be to introduce a local
constant and use `strlen()` on it (which modern compilers would evaluate
already at compile time).
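
Something along these lines, as a sketch only; the constant and the
is_filter_client() helper are names made up for illustration and do
not appear in the patch:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical local constant, so that no literal 17 is needed. */
static const char filter_client[] = "git-filter-client";

/* Returns 1 only when buf[0..size) is exactly the expected greeting. */
static int is_filter_client(const char *buf, size_t size)
{
	return buf &&
	       size == strlen(filter_client) &&
	       !strncmp(buf, filter_client, size);
}
```

With the length check in place, a 3-byte packet "git" is rejected even
though strncmp("git", "git-filter-client", 3) returns 0.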

> > > +             die("bad initialize: '%s'", xstrndup(pkt_buf, size));
> > > +
> > > +     strbuf_reset(&sb);
> > > +     strbuf_addf(&sb, "version=%d", version);
>
> Thanks for a very detailed review and great suggestions!

Thank you for your contribution that is very much relevant to my
interests!

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-08-02  0:16         ` Matheus Tavares
@ 2022-08-09  9:45           ` Johannes Schindelin
  0 siblings, 0 replies; 34+ messages in thread
From: Johannes Schindelin @ 2022-08-09  9:45 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: Ævar Arnfjörð Bjarmason, git, gitster

Hi Matheus,

On Mon, 1 Aug 2022, Matheus Tavares wrote:

> On Mon, Aug 1, 2022 at 8:37 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> >
> > On Sun, Jul 31 2022, Matheus Tavares wrote:
> > >
> > > +
> > > +     for (i = 0; i < cap_count; i++) {
> > > +             if (!strcmp(caps[i], "clean"))
> > > +                     has_clean_cap = 1;
> > > +             else if (!strcmp(caps[i], "smudge"))
> > > +                     has_smudge_cap = 1;
> >
> > In any case, maybe BUG() in an "else" here with "unknown capability"?
>
> Yup, will do.

Please don't, the suggestion is unsound.

The idea here is to find out whether the command-line listed the "clean"
and/or the "smudge" capabilities, ignoring all others for the moment.

To error out here with a BUG() would most likely break the invocation
in t0021 where we also pass the `delay` capability.
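
To spell out the intended behavior, here is a small sketch that scans
the list and silently skips anything it does not care about. The
filter_caps struct and scan_caps() are hypothetical and only mirror
the loop quoted above:

```c
#include <string.h>

struct filter_caps {
	int clean;
	int smudge;
};

/* Hypothetical helper: record "clean"/"smudge", skip everything else. */
static struct filter_caps scan_caps(int count, const char **caps)
{
	struct filter_caps found = { 0, 0 };
	int i;

	for (i = 0; i < count; i++) {
		if (!strcmp(caps[i], "clean"))
			found.clean = 1;
		else if (!strcmp(caps[i], "smudge"))
			found.smudge = 1;
		/* anything else, e.g. "delay", is deliberately ignored */
	}
	return found;
}
```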

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-08-01 21:18       ` Junio C Hamano
  2022-08-02  0:13         ` Matheus Tavares
@ 2022-08-09 10:00         ` Johannes Schindelin
  2022-08-10 18:37           ` Junio C Hamano
  2022-08-09 10:37         ` Johannes Schindelin
  2 siblings, 1 reply; 34+ messages in thread
From: Johannes Schindelin @ 2022-08-09 10:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Matheus Tavares, git, avarab

Hi Junio,

On Mon, 1 Aug 2022, Junio C Hamano wrote:

> Matheus Tavares <matheus.bernardino@usp.br> writes:
>
> > +		/* Read until flush */
> > +		while ((buf = packet_read_line(0, &size))) {
> > +			if (!strcmp(buf, "can-delay=1")) {
> > +				entry = strmap_get(&delay, pathname);
> > +				if (entry && !entry->requested) {
> > +					entry->requested = 1;
> > +				} else if (!entry && always_delay) {
> > +					add_delay_entry(pathname, 1, 1);
> > +				}
>
> These are unnecessary {} around single statement blocks, but let's
> let it pass in a test helper.

I would like to encourage you to think of ways how this project could
avoid the cost (mental space, reviewer time, back and forth between
contributor and reviewer) of such trivial code formatting issues.

My favored solution would be to adjust the code formatting rules in Git to
such an extent that it can be completely automated, whether via a
`clang-format-diff` rule [*1*] or via an adapted `checkpatch` [*2*] or via
something that is modeled after cURL's `checksrc` script [*3*].

It costs us too much time, and is too annoying all around, having to spend
so many brain cycles on code style (which people like me find much less
interesting than the actual, functional changes).

I'd much rather focus on the implementation of the rot13 filter and
potentially how this patch could give rise to even broader enhancements to
Git's source code that eventually have a user-visible, positive impact.

Ciao,
Dscho

Footnote *1*: https://lore.kernel.org/git/YstJl+5BPyR5RWnR@tapette.crustytoothpaste.net/
Footnote *2*: https://lore.kernel.org/git/xmqqbktvl0s4.fsf@gitster.g/
Footnote *3*: https://github.com/curl/curl/blob/master/scripts/checksrc.pl

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-08-01 21:18       ` Junio C Hamano
  2022-08-02  0:13         ` Matheus Tavares
  2022-08-09 10:00         ` Johannes Schindelin
@ 2022-08-09 10:37         ` Johannes Schindelin
  2 siblings, 0 replies; 34+ messages in thread
From: Johannes Schindelin @ 2022-08-09 10:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Matheus Tavares, git, avarab

Hi Junio,

On Mon, 1 Aug 2022, Junio C Hamano wrote:

>  * read_capabilities() feeds the buffer taken from
>    packet_read_line(), so buf[size] should be NUL terminated
>    already.

Could you help me agree?

In `packet_read_line()`, we call `packet_read()` with the
`PACKET_READ_CHOMP_NEWLINE` option, but we do not NUL-terminate the
buffer.

See https://github.com/git/git/blob/v2.37.1/pkt-line.c#L488-L494

In `packet_read()`, we call `packet_read_with_status()`, but do not
NUL-terminate the buffer.

See https://github.com/git/git/blob/v2.37.1/pkt-line.c#L478-L486

In `packet_read_with_status()`, I see that we call `get_packet_data()`
which does not NUL-terminate the buffer. Then we parse the length via
`packet_length()` which does not NUL-terminate the buffer.

Then, crucially, if the packet length is smaller than 3, we set the length
that is returned to 0 and return early indicating the conditions
`PACKET_READ_FLUSH`, `PACKET_READ_DELIM`, or `PACKET_READ_RESPONSE_END`,
which are ignored by `packet_read()`.

In this instance, the buffer is not NUL-terminated, I think. But if you
see that I missed something, I would like to know.

See https://github.com/git/git/blob/v2.37.1/pkt-line.c#L399-L476

And yes, in the case that there is a regular payload,
https://github.com/git/git/blob/v2.37.1/pkt-line.c#L456 NUL-terminates the
buffer.

And the proposed `get_value()` function would avoid returning a not
NUL-terminated buffer by virtue of using the `skip_prefix_mem()` function
with a non-empty prefix but a zero length buffer.

Therefore it is _still_ safe to skip the `buf[size] = '\0';` assignment
despite what I wrote above, even if it adds yet another piece of code to
Git's source code which is harder than necessary to reason about.

After all, it took me half an hour to research and write up this mail,
when reading `buf[size] = '\0';` would have taken all of two seconds to
verify that the code is safe.
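
To illustrate the property I am relying on, here is a stand-alone
sketch of a length-checked prefix skip. It is written from the
semantics described earlier in this thread and is an assumption about
how `skip_prefix_mem()` behaves, not a copy of Git's implementation:

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch of a skip_prefix_mem()-like helper: succeed only if the first
 * strlen(prefix) bytes of buf[0..len) match, and never look past len.
 */
static int skip_prefix_mem_sketch(const char *buf, size_t len,
				  const char *prefix,
				  const char **out, size_t *outlen)
{
	size_t prefix_len = strlen(prefix);

	if (prefix_len <= len && !memcmp(buf, prefix, prefix_len)) {
		*out = buf + prefix_len;
		*outlen = len - prefix_len;
		return 1;
	}
	return 0;
}
```

A zero-length buffer can never match a non-empty prefix here, because
the length comparison fails before any byte of the buffer is read.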

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-07-31 18:19     ` [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C Matheus Tavares
                         ` (2 preceding siblings ...)
  2022-08-01 21:18       ` Junio C Hamano
@ 2022-08-09 10:47       ` Johannes Schindelin
  3 siblings, 0 replies; 34+ messages in thread
From: Johannes Schindelin @ 2022-08-09 10:47 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, gitster, avarab

Hi Matheus,

On Sun, 31 Jul 2022, Matheus Tavares wrote:

> diff --git a/pkt-line.c b/pkt-line.c
> index 8e43c2def4..ce4e73b683 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -309,7 +309,8 @@ int write_packetized_from_fd_no_flush(int fd_in, int fd_out)
>  	return err;
>  }
>
> -int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
> +int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
> +					     int fd_out, int *packet_counter)
>  {
>  	int err = 0;
>  	size_t bytes_written = 0;
> @@ -324,6 +325,8 @@ int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_ou
>  			break;
>  		err = packet_write_gently(fd_out, src_in + bytes_written, bytes_to_write);
>  		bytes_written += bytes_to_write;
> +		if (packet_counter)
> +			(*packet_counter)++;

The only reason why we do this here is to try to imitate the Perl script
that prints out a dot for every packet written, right?

But the Perl script wrote out those dots immediately and individually, not
in one go after writing all the packets.

Unless the tests rely on the dots in the output, I would therefore
recommend simply scrapping this functionality (and explaining that in the
commit message, with the rationale that it does not fit into the current C
code's paradigms and would require intrusive changes of questionable
benefit) and not touching `pkt-line.[ch]` at all.

> [...]
> diff --git a/pkt-line.h b/pkt-line.h
> [...]
> +static void packet_initialize(void)
> +{
> +	int size;
> +	char *pkt_buf = packet_read_line(0, &size);
> +
> +	if (!pkt_buf || strncmp(pkt_buf, "git-filter-client", size))
> +		die("bad initialize: '%s'", xstrndup(pkt_buf, size));
> +
> +	pkt_buf = packet_read_line(0, &size);
> +	if (!pkt_buf || strncmp(pkt_buf, "version=2", size))
> +		die("bad version: '%.*s'", (int)size, pkt_buf);

This would mistake a packet `v` for being valid.

Junio pointed out in his review that `packet_read_line()` already
NUL-terminates the buffer (except when it returns `NULL`), therefore we
can write this instead:

	if (!pkt_buf || strcmp(pkt_buf, "version=2"))

Likewise with `"git-filter-client"`.

> +
> +	pkt_buf = packet_read_line(0, &size);
> +	if (pkt_buf)
> +		die("bad version end: '%.*s'", (int)size, pkt_buf);
> +
> +	packet_write_fmt(1, "git-filter-server");
> +	packet_write_fmt(1, "version=2");
> +	packet_flush(1);
> +}
> +
> +static char *rot13_usage = "test-tool rot13-filter [--always-delay] <log path> <capabilities>";
> +
> +int cmd__rot13_filter(int argc, const char **argv)
> +{
> +	const char **caps;
> +	int cap_count, i = 1;
> +	struct strset remote_caps = STRSET_INIT;
> +
> +	if (argc > 1 && !strcmp(argv[1], "--always-delay")) {
> +		always_delay = 1;
> +		i++;
> +	}

This is so much simpler to read than if it used `parse_options()`,
therefore I think that this is good as-is.

It is probably obvious that I did not spend as much time on reviewing this
round as I did the previous time (after all, if one spends three hours
here and three hours there, pretty soon one ends up having missed lunch
before knowing it). However, it is equally obvious that you did a great
job addressing my review of the previous round.

Thank you,
Dscho

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-08-09 10:00         ` Johannes Schindelin
@ 2022-08-10 18:37           ` Junio C Hamano
  2022-08-10 19:58             ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2022-08-10 18:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Matheus Tavares, git, avarab

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I would like to encourage you to think of ways how this project could
> avoid the cost (mental space, reviewer time, back and forth between
> contributor and reviewer) of such trivial code formatting issues.

I do not need your encouragement.  I am sure the submitter could
have run clang-format or checkpatch.pl or whatever and noticed the
issue.  Small style diversions in submitted patches are distracting
enough to prevent me from concentrating on and noticing problems in
the more important aspects like correctness and leakiness.  That is
why people get formatting issues pointed out and CodingGuidelines
talks about styles.

Checkpatch is OK, but IIRC, you cannot ask to check "only the code I
changed in this patch" to clang-format, which may be the show
stopper.  Otherwise, I would quite welcome an automated "pre-flight"
automation, like "make" target, that submitters can use and GGG can
help them use.

Thanks.



* Re: [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-08-10 18:37           ` Junio C Hamano
@ 2022-08-10 19:58             ` Junio C Hamano
  0 siblings, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2022-08-10 19:58 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Matheus Tavares, git, avarab

Junio C Hamano <gitster@pobox.com> writes:

> Checkpatch is OK, but IIRC, you cannot ask to check "only the code I
> changed in this patch" to clang-format, which may be the show
> stopper.  Otherwise, I would quite welcome an automated "pre-flight"
> automation, like "make" target, that submitters can use and GGG can
> help them use.

Let me step back a bit.  I do not think any automated tool would be
free of false positives, so it is OK to configure the tool loosely and
have "judgement calls" still be dealt with by human reviewers, but if
the automation is overly strict, that would probably waste too much of
submitters' time.

You would need to accept that the new contributors are human and are
capable of learning and configuring editors on their end, and after
they get reminded of the style rules once or twice and they get used
to the process, they would also help coach even newer
contributors.

I personally feel that the level of style issues that need to be
pointed out among the recent list traffic is not overly excessive.

Thanks.


* [PATCH v4 0/3] t0021: convert perl script to C test-tool helper
  2022-07-31 18:19   ` [PATCH v3 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
                       ` (2 preceding siblings ...)
  2022-07-31 18:19     ` [PATCH v3 3/3] tests: use the new C rot13-filter helper to avoid PERL prereq Matheus Tavares
@ 2022-08-15  1:06     ` Matheus Tavares
  2022-08-15  1:06       ` [PATCH v4 1/3] t0021: avoid grepping for a Perl-specific string at filter output Matheus Tavares
                         ` (3 more replies)
  3 siblings, 4 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-08-15  1:06 UTC (permalink / raw)
  To: git; +Cc: gitster, avarab, johannes.schindelin

Convert t/t0021/rot13-filter.pl to a test-tool helper to avoid the PERL
prereq in various tests.

Main changes since v3:
Patch 2:
- Mentioned in commit message why we removed the flush() calls for the
  log file handler.
- Removed 'buf[size] = \0' and relied on the fact that packet_read()
  already 0-terminates the buffer. This also allows us to use NULL
  instead of &size in many places, dropping the unneeded variable.
- Used parse-options instead of manual argv fiddling. I don't feel
  strongly one way or the other, but I found parse-options slightly
  easier to extend with new options that may be added in the future.
- Style: removed unnecessary {} and newline.
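
As a rough illustration of the get_value() simplification described in
the list above, here is a standalone sketch that parses a NUL-terminated
"key=value" line without tracking its size. The names
sketch_skip_prefix() and sketch_get_value() are hypothetical stand-ins
written for this example; the real helper uses Git's skip_prefix() and
calls die() on a mismatch instead of returning NULL.

```c
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for Git's skip_prefix(): if str starts with prefix, store a
 * pointer to the remainder in *out and return 1; otherwise return 0.
 */
static int sketch_skip_prefix(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Return the value part of a NUL-terminated "key=value" line, or NULL
 * if the line is missing, has a different key, or has an empty value.
 * (The helper in the patch calls die() instead of returning NULL.)
 */
static const char *sketch_get_value(const char *buf, const char *key)
{
	if (!buf ||
	    !sketch_skip_prefix(buf, key, &buf) ||
	    !sketch_skip_prefix(buf, "=", &buf) ||
	    !*buf)
		return NULL;
	return buf;
}
```

Because the buffer is already NUL-terminated, no length needs to be
passed around, which is what lets the callers hand NULL to the packet
reading functions instead of &size.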

Notes:
- About the s/die()/BUG()/ suggestion: I ended up leaving the die()
  calls because this seems to be the preferred mechanism in the
  t/helper/*.c files.

- About Dscho's suggestion of dropping the dot printing: I really
  wish we could do that because I dislike the huge function name in
  pkt-line.*. Unfortunately, though, many tests in t0021-conversion.sh
  do seem to rely on the number of dots printed to the log file to check
  the proper number of packets sent. See e.g. the test 'required process
  filter should process multiple packets'.
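
To make the packet-count dependency concrete, here is a minimal sketch
of the chunking loop that the new packet_counter argument observes, with
the I/O stripped out. It assumes one dot is logged per packet and that
the per-packet payload limit equals LARGE_PACKET_DATA_MAX from
pkt-line.h (65520 - 4); sketch_count_packets() and
SKETCH_PACKET_DATA_MAX are hypothetical names used only for this
example.

```c
#include <stddef.h>

/* Assumed to match LARGE_PACKET_DATA_MAX in pkt-line.h (65520 - 4). */
#define SKETCH_PACKET_DATA_MAX 65516

/*
 * Mirror the chunking loop of write_packetized_from_buf_no_flush_count()
 * without doing any I/O: return how many packets a payload of len bytes
 * is split into.
 */
static int sketch_count_packets(size_t len)
{
	size_t bytes_written = 0;
	int packet_counter = 0;

	for (;;) {
		size_t bytes_to_write = len - bytes_written;

		if (bytes_to_write > SKETCH_PACKET_DATA_MAX)
			bytes_to_write = SKETCH_PACKET_DATA_MAX;
		if (!bytes_to_write)
			break;
		bytes_written += bytes_to_write;
		packet_counter++;
	}
	return packet_counter;
}
```

Under these assumptions an empty payload produces no packets, and a
payload one byte over the limit produces two, which is the kind of
boundary the multiple-packets test exercises.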

Matheus Tavares (3):
  t0021: avoid grepping for a Perl-specific string at filter output
  t0021: implementation the rot13-filter.pl script in C
  tests: use the new C rot13-filter helper to avoid PERL prereq

 Makefile                                |   1 +
 pkt-line.c                              |   5 +-
 pkt-line.h                              |   8 +-
 t/helper/test-rot13-filter.c            | 382 ++++++++++++++++++++++++
 t/helper/test-tool.c                    |   1 +
 t/helper/test-tool.h                    |   1 +
 t/t0021-conversion.sh                   |  71 +++--
 t/t0021/rot13-filter.pl                 | 247 ---------------
 t/t2080-parallel-checkout-basics.sh     |   7 +-
 t/t2082-parallel-checkout-attributes.sh |   7 +-
 10 files changed, 434 insertions(+), 296 deletions(-)
 create mode 100644 t/helper/test-rot13-filter.c
 delete mode 100644 t/t0021/rot13-filter.pl

Range-diff against v3:
1:  5ec95c7e69 = 1:  64dc9af1ad t0021: avoid grepping for a Perl-specific string at filter output
2:  86e6baba46 ! 2:  99d8458f35 t0021: implementation the rot13-filter.pl script in C
    @@ Commit message
         command. The following commit will take care of actually modifying the
         said tests to use the new C helper and removing the Perl script.
     
    +    The Perl script flushes the log file handler after each write. As
    +    commented in [1], this seems to be an early design decision that was
    +    later reconsidered, but possibly ended up being left in the code by
    +    accident:
    +
    +            >> +$debug->flush();
    +            >
    +            > Isn't $debug flushed automatically?
    +
    +            Maybe, but autoflush is not explicitly enabled. I will
    +            enable it again (I disabled it because of Eric's comment
    +            but I re-read the comment and he is only talking about
    +            pipes).
    +
    +    Anyways, this behavior is not really needed for the tests and the
    +    flush() calls make the code slightly larger, so let's avoid them
    +    altogether in the new C version.
    +
    +    [1]: https://lore.kernel.org/git/7F1F1A0E-8FC3-4FBD-81AA-37786DE0EF50@gmail.com/
    +
         Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
    @@ t/helper/test-rot13-filter.c (new)
     + * Example implementation for the Git filter protocol version 2
     + * See Documentation/gitattributes.txt, section "Filter Protocol"
     + *
    -+ * Usage: test-tool rot13-filter [--always-delay] <log path> <capabilities>
    ++ * Usage: test-tool rot13-filter [--always-delay] --log=<path> <capabilities>
     + *
     + * Log path defines a debug log file that the script writes to. The
     + * subsequent arguments define a list of supported protocol capabilities
    @@ t/helper/test-rot13-filter.c (new)
     +#include "pkt-line.h"
     +#include "string-list.h"
     +#include "strmap.h"
    ++#include "parse-options.h"
     +
     +static FILE *logfile;
     +static int always_delay, has_clean_cap, has_smudge_cap;
     +static struct strmap delay = STRMAP_INIT;
     +
    ++static inline const char *str_or_null(const char *str)
    ++{
    ++	return str ? str : "(null)";
    ++}
    ++
     +static char *rot13(char *str)
     +{
     +	char *c;
    @@ t/helper/test-rot13-filter.c (new)
     +	return str;
     +}
     +
    -+static char *get_value(char *buf, size_t size, const char *key)
    ++static char *get_value(char *buf, const char *key)
     +{
     +	const char *orig_buf = buf;
    -+	int orig_size = (int)size;
    -+
    -+	if (!skip_prefix_mem((const char *)buf, size, key, (const char **)&buf, &size) ||
    -+	    !skip_prefix_mem((const char *)buf, size, "=", (const char **)&buf, &size) ||
    -+	    !size)
    -+		die("expected key '%s', got '%.*s'",
    -+		    key, orig_size, orig_buf);
    -+
    -+	buf[size] = '\0';
    ++	if (!buf ||
    ++	    !skip_prefix((const char *)buf, key, (const char **)&buf) ||
    ++	    !skip_prefix((const char *)buf, "=", (const char **)&buf) ||
    ++	    !*buf)
    ++		die("expected key '%s', got '%s'", key, str_or_null(orig_buf));
     +	return buf;
     +}
     +
    @@ t/helper/test-rot13-filter.c (new)
     + */
     +static char *packet_key_val_read(const char *key)
     +{
    -+	int size;
     +	char *buf;
    -+	if (packet_read_line_gently(0, &size, &buf) < 0)
    ++	if (packet_read_line_gently(0, NULL, &buf) < 0)
     +		return NULL;
    -+	return xstrdup(get_value(buf, size, key));
    ++	return xstrdup(get_value(buf, key));
     +}
     +
     +static inline void assert_remote_capability(struct strset *caps, const char *cap)
    @@ t/helper/test-rot13-filter.c (new)
     +static void read_capabilities(struct strset *remote_caps)
     +{
     +	for (;;) {
    -+		int size;
    -+		char *buf = packet_read_line(0, &size);
    ++		char *buf = packet_read_line(0, NULL);
     +		if (!buf)
     +			break;
    -+		strset_add(remote_caps, get_value(buf, size, "capability"));
    ++		strset_add(remote_caps, get_value(buf, "capability"));
     +	}
     +
     +	assert_remote_capability(remote_caps, "clean");
    @@ t/helper/test-rot13-filter.c (new)
     +}
     +
     +static void check_and_write_capabilities(struct strset *remote_caps,
    -+					 const char **caps, int caps_count)
    ++					 const char **caps, int nr_caps)
     +{
     +	int i;
    -+	for (i = 0; i < caps_count; i++) {
    ++	for (i = 0; i < nr_caps; i++) {
     +		if (!strset_contains(remote_caps, caps[i]))
     +			die("our capability '%s' is not available from remote",
     +			    caps[i]);
    @@ t/helper/test-rot13-filter.c (new)
     +{
     +	for (;;) {
     +		char *buf, *output;
    -+		int size;
     +		char *pathname;
     +		struct delay_entry *entry;
     +		struct strbuf input = STRBUF_INIT;
    @@ t/helper/test-rot13-filter.c (new)
     +		fprintf(logfile, " %s", pathname);
     +
     +		/* Read until flush */
    -+		while ((buf = packet_read_line(0, &size))) {
    ++		while ((buf = packet_read_line(0, NULL))) {
     +			if (!strcmp(buf, "can-delay=1")) {
     +				entry = strmap_get(&delay, pathname);
    -+				if (entry && !entry->requested) {
    ++				if (entry && !entry->requested)
     +					entry->requested = 1;
    -+				} else if (!entry && always_delay) {
    ++				else if (!entry && always_delay)
     +					add_delay_entry(pathname, 1, 1);
    -+				}
     +			} else if (starts_with(buf, "ref=") ||
     +				   starts_with(buf, "treeish=") ||
     +				   starts_with(buf, "blob=")) {
    @@ t/helper/test-rot13-filter.c (new)
     +			}
     +		}
     +
    -+
     +		read_packetized_to_strbuf(0, &input, 0);
     +		fprintf(logfile, " %"PRIuMAX" [OK] -- ", (uintmax_t)input.len);
     +
    @@ t/helper/test-rot13-filter.c (new)
     +
     +static void packet_initialize(void)
     +{
    -+	int size;
    -+	char *pkt_buf = packet_read_line(0, &size);
    ++	char *pkt_buf = packet_read_line(0, NULL);
     +
    -+	if (!pkt_buf || strncmp(pkt_buf, "git-filter-client", size))
    -+		die("bad initialize: '%s'", xstrndup(pkt_buf, size));
    ++	if (!pkt_buf || strcmp(pkt_buf, "git-filter-client"))
    ++		die("bad initialize: '%s'", str_or_null(pkt_buf));
     +
    -+	pkt_buf = packet_read_line(0, &size);
    -+	if (!pkt_buf || strncmp(pkt_buf, "version=2", size))
    -+		die("bad version: '%.*s'", (int)size, pkt_buf);
    ++	pkt_buf = packet_read_line(0, NULL);
    ++	if (!pkt_buf || strcmp(pkt_buf, "version=2"))
    ++		die("bad version: '%s'", str_or_null(pkt_buf));
     +
    -+	pkt_buf = packet_read_line(0, &size);
    ++	pkt_buf = packet_read_line(0, NULL);
     +	if (pkt_buf)
    -+		die("bad version end: '%.*s'", (int)size, pkt_buf);
    ++		die("bad version end: '%s'", pkt_buf);
     +
     +	packet_write_fmt(1, "git-filter-server");
     +	packet_write_fmt(1, "version=2");
     +	packet_flush(1);
     +}
     +
    -+static char *rot13_usage = "test-tool rot13-filter [--always-delay] <log path> <capabilities>";
    ++static const char *rot13_usage[] = {
    ++	"test-tool rot13-filter [--always-delay] --log=<path> <capabilities>",
    ++	NULL
    ++};
     +
     +int cmd__rot13_filter(int argc, const char **argv)
     +{
    -+	const char **caps;
    -+	int cap_count, i = 1;
    ++	int i, nr_caps;
     +	struct strset remote_caps = STRSET_INIT;
    ++	const char *log_path = NULL;
     +
    -+	if (argc > 1 && !strcmp(argv[1], "--always-delay")) {
    -+		always_delay = 1;
    -+		i++;
    -+	}
    -+	if (argc - i < 2)
    -+		usage(rot13_usage);
    ++	struct option options[] = {
    ++		OPT_BOOL(0, "always-delay", &always_delay,
    ++			 "delay all paths with the can-delay flag"),
    ++		OPT_STRING(0, "log", &log_path, "path",
    ++			   "path to the debug log file"),
    ++		OPT_END()
    ++	};
    ++	nr_caps = parse_options(argc, argv, NULL, options, rot13_usage,
    ++				PARSE_OPT_STOP_AT_NON_OPTION);
     +
    -+	logfile = fopen(argv[i++], "a");
    ++	if (!log_path || !nr_caps)
    ++		usage_with_options(rot13_usage, options);
    ++
    ++	logfile = fopen(log_path, "a");
     +	if (!logfile)
     +		die_errno("failed to open log file");
     +
    -+	caps = argv + i;
    -+	cap_count = argc - i;
    -+
    -+	for (i = 0; i < cap_count; i++) {
    -+		if (!strcmp(caps[i], "clean"))
    -+			has_clean_cap = 1;
    -+		else if (!strcmp(caps[i], "smudge"))
    ++	for (i = 0; i < nr_caps; i++) {
    ++		if (!strcmp(argv[i], "smudge"))
     +			has_smudge_cap = 1;
    ++		if (!strcmp(argv[i], "clean"))
    ++			has_clean_cap = 1;
     +	}
     +
     +	add_delay_entry("test-delay10.a", 1, 0);
    @@ t/helper/test-rot13-filter.c (new)
     +	packet_initialize();
     +
     +	read_capabilities(&remote_caps);
    -+	check_and_write_capabilities(&remote_caps, caps, cap_count);
    ++	check_and_write_capabilities(&remote_caps, argv, nr_caps);
     +	fprintf(logfile, "init handshake complete\n");
     +	strset_clear(&remote_caps);
     +
     +	command_loop();
     +
    -+	fclose(logfile);
    ++	if (fclose(logfile))
    ++		die_errno("error closing logfile");
     +	free_delay_entries();
     +	return 0;
     +}
3:  c66fc0a186 ! 3:  d6033abbce tests: use the new C rot13-filter helper to avoid PERL prereq
    @@ t/t0021-conversion.sh: test_expect_success 'diff does not reuse worktree files t
     -test_expect_success PERL 'required process filter should filter data' '
     -	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
     +test_expect_success 'required process filter should filter data' '
    -+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
    ++	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
      	test_config_global filter.protocol.required true &&
      	rm -rf repo &&
      	mkdir repo &&
    @@ t/t0021-conversion.sh: test_expect_success PERL 'required process filter should
     -test_expect_success PERL 'required process filter should filter data for various subcommands' '
     -	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
     +test_expect_success 'required process filter should filter data for various subcommands' '
    -+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
    ++	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
      	test_config_global filter.protocol.required true &&
      	(
      		cd repo &&
    @@ t/t0021-conversion.sh: test_expect_success PERL 'required process filter should
     +test_expect_success 'required process filter takes precedence' '
      	test_config_global filter.protocol.clean false &&
     -	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
    -+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean" &&
    ++	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean" &&
      	test_config_global filter.protocol.required true &&
      	rm -rf repo &&
      	mkdir repo &&
    @@ t/t0021-conversion.sh: test_expect_success PERL 'required process filter takes p
     -test_expect_success PERL 'required process filter should be used only for "clean" operation only' '
     -	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
     +test_expect_success 'required process filter should be used only for "clean" operation only' '
    -+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean" &&
    ++	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean" &&
      	rm -rf repo &&
      	mkdir repo &&
      	(
    @@ t/t0021-conversion.sh: test_expect_success PERL 'required process filter should
     -test_expect_success PERL 'required process filter should process multiple packets' '
     -	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
     +test_expect_success 'required process filter should process multiple packets' '
    -+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
    ++	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
      	test_config_global filter.protocol.required true &&
      
      	rm -rf repo &&
    @@ t/t0021-conversion.sh: test_expect_success PERL 'required process filter should
     -test_expect_success PERL 'required process filter with clean error should fail' '
     -	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
     +test_expect_success 'required process filter with clean error should fail' '
    -+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
    ++	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
      	test_config_global filter.protocol.required true &&
      	rm -rf repo &&
      	mkdir repo &&
    @@ t/t0021-conversion.sh: test_expect_success PERL 'required process filter with cl
     -test_expect_success PERL 'process filter should restart after unexpected write failure' '
     -	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
     +test_expect_success 'process filter should restart after unexpected write failure' '
    -+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
    ++	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
      	rm -rf repo &&
      	mkdir repo &&
      	(
    @@ t/t0021-conversion.sh: test_expect_success PERL 'process filter should restart a
     -test_expect_success PERL 'process filter should not be restarted if it signals an error' '
     -	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
     +test_expect_success 'process filter should not be restarted if it signals an error' '
    -+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
    ++	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
      	rm -rf repo &&
      	mkdir repo &&
      	(
    @@ t/t0021-conversion.sh: test_expect_success PERL 'process filter should not be re
     -test_expect_success PERL 'process filter abort stops processing of all further files' '
     -	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
     +test_expect_success 'process filter abort stops processing of all further files' '
    -+	test_config_global filter.protocol.process "test-tool rot13-filter debug.log clean smudge" &&
    ++	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
      	rm -rf repo &&
      	mkdir repo &&
      	(
    @@ t/t0021-conversion.sh: test_expect_success PERL 'invalid process filter must fai
     -test_expect_success PERL 'delayed checkout in process filter' '
     -	test_config_global filter.a.process "rot13-filter.pl a.log clean smudge delay" &&
     +test_expect_success 'delayed checkout in process filter' '
    -+	test_config_global filter.a.process "test-tool rot13-filter a.log clean smudge delay" &&
    ++	test_config_global filter.a.process "test-tool rot13-filter --log=a.log clean smudge delay" &&
      	test_config_global filter.a.required true &&
     -	test_config_global filter.b.process "rot13-filter.pl b.log clean smudge delay" &&
    -+	test_config_global filter.b.process "test-tool rot13-filter b.log clean smudge delay" &&
    ++	test_config_global filter.b.process "test-tool rot13-filter --log=b.log clean smudge delay" &&
      	test_config_global filter.b.required true &&
      
      	rm -rf repo &&
    @@ t/t0021-conversion.sh: test_expect_success PERL 'delayed checkout in process fil
     -test_expect_success PERL 'missing file in delayed checkout' '
     -	test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&
     +test_expect_success 'missing file in delayed checkout' '
    -+	test_config_global filter.bug.process "test-tool rot13-filter bug.log clean smudge delay" &&
    ++	test_config_global filter.bug.process "test-tool rot13-filter --log=bug.log clean smudge delay" &&
      	test_config_global filter.bug.required true &&
      
      	rm -rf repo &&
    @@ t/t0021-conversion.sh: test_expect_success PERL 'missing file in delayed checkou
     -test_expect_success PERL 'invalid file in delayed checkout' '
     -	test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&
     +test_expect_success 'invalid file in delayed checkout' '
    -+	test_config_global filter.bug.process "test-tool rot13-filter bug.log clean smudge delay" &&
    ++	test_config_global filter.bug.process "test-tool rot13-filter --log=bug.log clean smudge delay" &&
      	test_config_global filter.bug.required true &&
      
      	rm -rf repo &&
    @@ t/t0021-conversion.sh: do
      	"delayed checkout with $mode-collision don't write to the wrong place" '
      		test_config_global filter.delay.process \
     -			"\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
    -+			"test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
    ++			"test-tool rot13-filter --always-delay --log=delayed.log clean smudge delay" &&
      		test_config_global filter.delay.required true &&
      
      		git init $mode-collision &&
    @@ t/t0021-conversion.sh: do
      	(
      		cd collision-with-submodule &&
     -		git config filter.delay.process "\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
    -+		git config filter.delay.process "test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
    ++		git config filter.delay.process "test-tool rot13-filter --always-delay --log=delayed.log clean smudge delay" &&
      		git config filter.delay.required true &&
      
      		# We need Git to treat the submodule "a" and the
    @@ t/t0021-conversion.sh: test_expect_success PERL,SYMLINKS,CASE_INSENSITIVE_FS \
      	(
      		cd progress &&
     -		git config filter.delay.process "rot13-filter.pl delay-progress.log clean smudge delay" &&
    -+		git config filter.delay.process "test-tool rot13-filter delay-progress.log clean smudge delay" &&
    ++		git config filter.delay.process "test-tool rot13-filter --log=delay-progress.log clean smudge delay" &&
      		git config filter.delay.required true &&
      
      		echo "*.a filter=delay" >.gitattributes &&
    @@ t/t0021-conversion.sh: do
      	(
      		cd repo &&
     -		git config filter.delay.process "../rot13-filter.pl delayed.log clean smudge delay" &&
    -+		git config filter.delay.process "test-tool rot13-filter delayed.log clean smudge delay" &&
    ++		git config filter.delay.process "test-tool rot13-filter --log=delayed.log clean smudge delay" &&
      		git config filter.delay.required true &&
      
      		echo "*.a filter=delay" >.gitattributes &&
    @@ t/t2080-parallel-checkout-basics.sh: test_expect_success SYMLINKS 'parallel chec
     +test_expect_success '"git checkout ." report should not include failed entries' '
      	test_config_global filter.delay.process \
     -		"\"$(pwd)/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
    -+		"test-tool rot13-filter --always-delay delayed.log clean smudge delay" &&
    ++		"test-tool rot13-filter --always-delay --log=delayed.log clean smudge delay" &&
      	test_config_global filter.delay.required true &&
      	test_config_global filter.cat.clean cat  &&
      	test_config_global filter.cat.smudge cat  &&
    @@ t/t2082-parallel-checkout-attributes.sh: test_expect_success 'parallel-checkout
     +test_expect_success 'parallel-checkout and delayed checkout' '
      	test_config_global filter.delay.process \
     -		"\"$(pwd)/rot13-filter.pl\" --always-delay \"$(pwd)/delayed.log\" clean smudge delay" &&
    -+		"test-tool rot13-filter --always-delay \"$(pwd)/delayed.log\" clean smudge delay" &&
    ++		"test-tool rot13-filter --always-delay --log=\"$(pwd)/delayed.log\" clean smudge delay" &&
      	test_config_global filter.delay.required true &&
      
      	echo "abcd" >original &&
-- 
2.37.1



* [PATCH v4 1/3] t0021: avoid grepping for a Perl-specific string at filter output
  2022-08-15  1:06     ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
@ 2022-08-15  1:06       ` Matheus Tavares
  2022-08-15  1:06       ` [PATCH v4 2/3] t0021: implementation the rot13-filter.pl script in C Matheus Tavares
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-08-15  1:06 UTC (permalink / raw)
  To: git; +Cc: gitster, avarab, johannes.schindelin

This test sets the t0021/rot13-filter.pl script as a long-running
process filter for a git checkout command. It then expects the filter to
fail, producing a specific error message on stderr. In the following
commits we are going to replace the script with a C test-tool helper,
but the test currently expects the error message in a Perl-specific
format. That is, when you call `die <msg>` in Perl, it emits
"<msg> at - line 1." In preparation for the conversion, let's avoid the
Perl-specific part and only grep for <msg> itself.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 t/t0021-conversion.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 1c840348bd..963b66e08c 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -735,7 +735,7 @@ test_expect_success PERL 'process filter should restart after unexpected write f
 		rm -f debug.log &&
 		git checkout --quiet --no-progress . 2>git-stderr.log &&
 
-		grep "smudge write error at" git-stderr.log &&
+		grep "smudge write error" git-stderr.log &&
 		test_i18ngrep "error: external filter" git-stderr.log &&
 
 		cat >expected.log <<-EOF &&
-- 
2.37.1



* [PATCH v4 2/3] t0021: implementation the rot13-filter.pl script in C
  2022-08-15  1:06     ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
  2022-08-15  1:06       ` [PATCH v4 1/3] t0021: avoid grepping for a Perl-specific string at filter output Matheus Tavares
@ 2022-08-15  1:06       ` Matheus Tavares
  2022-08-15  1:06       ` [PATCH v4 3/3] tests: use the new C rot13-filter helper to avoid PERL prereq Matheus Tavares
  2022-08-15 13:01       ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Johannes Schindelin
  3 siblings, 0 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-08-15  1:06 UTC (permalink / raw)
  To: git; +Cc: gitster, avarab, johannes.schindelin, Johannes Schindelin

This script is currently used by three test files: t0021-conversion.sh,
t2080-parallel-checkout-basics.sh, and
t2082-parallel-checkout-attributes.sh. To avoid the need for the PERL
dependency at these tests, let's convert the script to a C test-tool
command. The following commit will take care of actually modifying the
said tests to use the new C helper and removing the Perl script.

The Perl script flushes the log file handler after each write. As
commented in [1], this seems to be an early design decision that was
later reconsidered, but possibly ended up being left in the code by
accident:

	>> +$debug->flush();
	>
	> Isn't $debug flushed automatically?

	Maybe, but autoflush is not explicitly enabled. I will
	enable it again (I disabled it because of Eric's comment
	but I re-read the comment and he is only talking about
	pipes).

Anyways, this behavior is not really needed for the tests and the
flush() calls make the code slightly larger, so let's avoid them
altogether in the new C version.

[1]: https://lore.kernel.org/git/7F1F1A0E-8FC3-4FBD-81AA-37786DE0EF50@gmail.com/

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Makefile                     |   1 +
 pkt-line.c                   |   5 +-
 pkt-line.h                   |   8 +-
 t/helper/test-rot13-filter.c | 382 +++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c         |   1 +
 t/helper/test-tool.h         |   1 +
 6 files changed, 396 insertions(+), 2 deletions(-)
 create mode 100644 t/helper/test-rot13-filter.c

diff --git a/Makefile b/Makefile
index 2ec9b2dc6b..ae7def7c66 100644
--- a/Makefile
+++ b/Makefile
@@ -772,6 +772,7 @@ TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
 TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
+TEST_BUILTINS_OBJS += test-rot13-filter.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
 TEST_BUILTINS_OBJS += test-run-command.o
diff --git a/pkt-line.c b/pkt-line.c
index 8e43c2def4..ce4e73b683 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -309,7 +309,8 @@ int write_packetized_from_fd_no_flush(int fd_in, int fd_out)
 	return err;
 }
 
-int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out)
+int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
+					     int fd_out, int *packet_counter)
 {
 	int err = 0;
 	size_t bytes_written = 0;
@@ -324,6 +325,8 @@ int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_ou
 			break;
 		err = packet_write_gently(fd_out, src_in + bytes_written, bytes_to_write);
 		bytes_written += bytes_to_write;
+		if (packet_counter)
+			(*packet_counter)++;
 	}
 	return err;
 }
diff --git a/pkt-line.h b/pkt-line.h
index 1f623de60a..79c538b99e 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -32,7 +32,13 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((f
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int write_packetized_from_fd_no_flush(int fd_in, int fd_out);
-int write_packetized_from_buf_no_flush(const char *src_in, size_t len, int fd_out);
+int write_packetized_from_buf_no_flush_count(const char *src_in, size_t len,
+					     int fd_out, int *packet_counter);
+static inline int write_packetized_from_buf_no_flush(const char *src_in,
+						     size_t len, int fd_out)
+{
+	return write_packetized_from_buf_no_flush_count(src_in, len, fd_out, NULL);
+}
 
 /*
  * Stdio versions of packet_write functions. When mixing these with fd
diff --git a/t/helper/test-rot13-filter.c b/t/helper/test-rot13-filter.c
new file mode 100644
index 0000000000..f8d564c622
--- /dev/null
+++ b/t/helper/test-rot13-filter.c
@@ -0,0 +1,382 @@
+/*
+ * Example implementation for the Git filter protocol version 2
+ * See Documentation/gitattributes.txt, section "Filter Protocol"
+ *
+ * Usage: test-tool rot13-filter [--always-delay] --log=<path> <capabilities>
+ *
+ * Log path defines a debug log file that the helper writes to. The
+ * subsequent arguments define a list of supported protocol capabilities
+ * ("clean", "smudge", etc).
+ *
+ * When --always-delay is given, all pathnames with the "can-delay" flag
+ * that don't appear on the list below are delayed with a count of 1
+ * (see more below).
+ *
+ * This implementation supports special test cases:
+ * (1) If data with the pathname "clean-write-fail.r" is processed with
+ *     a "clean" operation then the write operation will die.
+ * (2) If data with the pathname "smudge-write-fail.r" is processed with
+ *     a "smudge" operation then the write operation will die.
+ * (3) If data with the pathname "error.r" is processed with any
+ *     operation then the filter signals that it cannot or does not want
+ *     to process the file.
+ * (4) If data with the pathname "abort.r" is processed with any
+ *     operation then the filter signals that it cannot or does not want
+ *     to process the file and any file after that is processed with the
+ *     same command.
+ * (5) If data with a pathname that is a key in the delay hash is
+ *     requested (e.g. "test-delay10.a") then the filter responds with
+ *     a "delayed" status and sets the "requested" field in the delay hash.
+ *     The filter will signal the availability of this object after
+ *     "count" (field in delay hash) "list_available_blobs" commands.
+ * (6) If data with the pathname "missing-delay.a" is processed then the
+ *     filter will drop the path from the "list_available_blobs" response.
+ * (7) If data with the pathname "invalid-delay.a" is processed then the
+ *     filter will add the path "unfiltered" which was not delayed before
+ *     to the "list_available_blobs" response.
+ */
+
+#include "test-tool.h"
+#include "pkt-line.h"
+#include "string-list.h"
+#include "strmap.h"
+#include "parse-options.h"
+
+static FILE *logfile;
+static int always_delay, has_clean_cap, has_smudge_cap;
+static struct strmap delay = STRMAP_INIT;
+
+static inline const char *str_or_null(const char *str)
+{
+	return str ? str : "(null)";
+}
+
+static char *rot13(char *str)
+{
+	char *c;
+	for (c = str; *c; c++)
+		if (isalpha(*c))
+			*c += tolower(*c) < 'n' ? 13 : -13;
+	return str;
+}
+
+static char *get_value(char *buf, const char *key)
+{
+	const char *orig_buf = buf;
+	if (!buf ||
+	    !skip_prefix((const char *)buf, key, (const char **)&buf) ||
+	    !skip_prefix((const char *)buf, "=", (const char **)&buf) ||
+	    !*buf)
+		die("expected key '%s', got '%s'", key, str_or_null(orig_buf));
+	return buf;
+}
+
+/*
+ * Read a text packet, expecting that it is in the form "key=value" for
+ * the given key. An EOF does not trigger any error and is reported
+ * back to the caller with NULL. Die if the "key" part of "key=value" does
+ * not match the given key, or the value part is empty.
+ */
+static char *packet_key_val_read(const char *key)
+{
+	char *buf;
+	if (packet_read_line_gently(0, NULL, &buf) < 0)
+		return NULL;
+	return xstrdup(get_value(buf, key));
+}
+
+static inline void assert_remote_capability(struct strset *caps, const char *cap)
+{
+	if (!strset_contains(caps, cap))
+		die("required '%s' capability not available from remote", cap);
+}
+
+static void read_capabilities(struct strset *remote_caps)
+{
+	for (;;) {
+		char *buf = packet_read_line(0, NULL);
+		if (!buf)
+			break;
+		strset_add(remote_caps, get_value(buf, "capability"));
+	}
+
+	assert_remote_capability(remote_caps, "clean");
+	assert_remote_capability(remote_caps, "smudge");
+	assert_remote_capability(remote_caps, "delay");
+}
+
+static void check_and_write_capabilities(struct strset *remote_caps,
+					 const char **caps, int nr_caps)
+{
+	int i;
+	for (i = 0; i < nr_caps; i++) {
+		if (!strset_contains(remote_caps, caps[i]))
+			die("our capability '%s' is not available from remote",
+			    caps[i]);
+		packet_write_fmt(1, "capability=%s\n", caps[i]);
+	}
+	packet_flush(1);
+}
+
+struct delay_entry {
+	int requested, count;
+	char *output;
+};
+
+static void free_delay_entries(void)
+{
+	struct hashmap_iter iter;
+	struct strmap_entry *ent;
+
+	strmap_for_each_entry(&delay, &iter, ent) {
+		struct delay_entry *delay_entry = ent->value;
+		free(delay_entry->output);
+		free(delay_entry);
+	}
+	strmap_clear(&delay, 0);
+}
+
+static void add_delay_entry(char *pathname, int count, int requested)
+{
+	struct delay_entry *entry = xcalloc(1, sizeof(*entry));
+	entry->count = count;
+	entry->requested = requested;
+	if (strmap_put(&delay, pathname, entry))
+		BUG("adding the same path twice to delay hash?");
+}
+
+static void reply_list_available_blobs_cmd(void)
+{
+	struct hashmap_iter iter;
+	struct strmap_entry *ent;
+	struct string_list_item *str_item;
+	struct string_list paths = STRING_LIST_INIT_NODUP;
+
+	/* flush */
+	if (packet_read_line(0, NULL))
+		die("bad list_available_blobs end");
+
+	strmap_for_each_entry(&delay, &iter, ent) {
+		struct delay_entry *delay_entry = ent->value;
+		if (!delay_entry->requested)
+			continue;
+		delay_entry->count--;
+		if (!strcmp(ent->key, "invalid-delay.a")) {
+			/* Send Git a pathname that was not delayed earlier */
+			packet_write_fmt(1, "pathname=unfiltered");
+		}
+		if (!strcmp(ent->key, "missing-delay.a")) {
+			/* Do not signal Git that this file is available */
+		} else if (!delay_entry->count) {
+			string_list_append(&paths, ent->key);
+			packet_write_fmt(1, "pathname=%s", ent->key);
+		}
+	}
+
+	/* Print paths in sorted order. */
+	string_list_sort(&paths);
+	for_each_string_list_item(str_item, &paths)
+		fprintf(logfile, " %s", str_item->string);
+	string_list_clear(&paths, 0);
+
+	packet_flush(1);
+
+	fprintf(logfile, " [OK]\n");
+	packet_write_fmt(1, "status=success");
+	packet_flush(1);
+}
+
+static void command_loop(void)
+{
+	for (;;) {
+		char *buf, *output;
+		char *pathname;
+		struct delay_entry *entry;
+		struct strbuf input = STRBUF_INIT;
+		char *command = packet_key_val_read("command");
+
+		if (!command) {
+			fprintf(logfile, "STOP\n");
+			break;
+		}
+		fprintf(logfile, "IN: %s", command);
+
+		if (!strcmp(command, "list_available_blobs")) {
+			reply_list_available_blobs_cmd();
+			free(command);
+			continue;
+		}
+
+		pathname = packet_key_val_read("pathname");
+		if (!pathname)
+			die("unexpected EOF while expecting pathname");
+		fprintf(logfile, " %s", pathname);
+
+		/* Read until flush */
+		while ((buf = packet_read_line(0, NULL))) {
+			if (!strcmp(buf, "can-delay=1")) {
+				entry = strmap_get(&delay, pathname);
+				if (entry && !entry->requested)
+					entry->requested = 1;
+				else if (!entry && always_delay)
+					add_delay_entry(pathname, 1, 1);
+			} else if (starts_with(buf, "ref=") ||
+				   starts_with(buf, "treeish=") ||
+				   starts_with(buf, "blob=")) {
+				fprintf(logfile, " %s", buf);
+			} else {
+				/*
+				 * In general, filters need to be graceful about
+				 * new metadata, since it's documented that we
+				 * can pass any key-value pairs, but for tests,
+				 * let's be a little stricter.
+				 */
+				die("Unknown message '%s'", buf);
+			}
+		}
+
+		read_packetized_to_strbuf(0, &input, 0);
+		fprintf(logfile, " %"PRIuMAX" [OK] -- ", (uintmax_t)input.len);
+
+		entry = strmap_get(&delay, pathname);
+		if (entry && entry->output) {
+			output = entry->output;
+		} else if (!strcmp(pathname, "error.r") || !strcmp(pathname, "abort.r")) {
+			output = "";
+		} else if (!strcmp(command, "clean") && has_clean_cap) {
+			output = rot13(input.buf);
+		} else if (!strcmp(command, "smudge") && has_smudge_cap) {
+			output = rot13(input.buf);
+		} else {
+			die("bad command '%s'", command);
+		}
+
+		if (!strcmp(pathname, "error.r")) {
+			fprintf(logfile, "[ERROR]\n");
+			packet_write_fmt(1, "status=error");
+			packet_flush(1);
+		} else if (!strcmp(pathname, "abort.r")) {
+			fprintf(logfile, "[ABORT]\n");
+			packet_write_fmt(1, "status=abort");
+			packet_flush(1);
+		} else if (!strcmp(command, "smudge") &&
+			   (entry = strmap_get(&delay, pathname)) &&
+			   entry->requested == 1) {
+			fprintf(logfile, "[DELAYED]\n");
+			packet_write_fmt(1, "status=delayed");
+			packet_flush(1);
+			entry->requested = 2;
+			if (entry->output != output) {
+				free(entry->output);
+				entry->output = xstrdup(output);
+			}
+		} else {
+			int i, nr_packets = 0;
+			size_t output_len;
+			const char *p;
+			packet_write_fmt(1, "status=success");
+			packet_flush(1);
+
+			if (skip_prefix(pathname, command, &p) &&
+			    !strcmp(p, "-write-fail.r")) {
+				fprintf(logfile, "[WRITE FAIL]\n");
+				die("%s write error", command);
+			}
+
+			output_len = strlen(output);
+			fprintf(logfile, "OUT: %"PRIuMAX" ", (uintmax_t)output_len);
+
+			if (write_packetized_from_buf_no_flush_count(output,
+				output_len, 1, &nr_packets))
+				die("failed to write buffer to stdout");
+			packet_flush(1);
+
+			for (i = 0; i < nr_packets; i++)
+				fprintf(logfile, ".");
+			fprintf(logfile, " [OK]\n");
+
+			packet_flush(1);
+		}
+		free(pathname);
+		strbuf_release(&input);
+		free(command);
+	}
+}
+
+static void packet_initialize(void)
+{
+	char *pkt_buf = packet_read_line(0, NULL);
+
+	if (!pkt_buf || strcmp(pkt_buf, "git-filter-client"))
+		die("bad initialize: '%s'", str_or_null(pkt_buf));
+
+	pkt_buf = packet_read_line(0, NULL);
+	if (!pkt_buf || strcmp(pkt_buf, "version=2"))
+		die("bad version: '%s'", str_or_null(pkt_buf));
+
+	pkt_buf = packet_read_line(0, NULL);
+	if (pkt_buf)
+		die("bad version end: '%s'", pkt_buf);
+
+	packet_write_fmt(1, "git-filter-server");
+	packet_write_fmt(1, "version=2");
+	packet_flush(1);
+}
+
+static const char *rot13_usage[] = {
+	"test-tool rot13-filter [--always-delay] --log=<path> <capabilities>",
+	NULL
+};
+
+int cmd__rot13_filter(int argc, const char **argv)
+{
+	int i, nr_caps;
+	struct strset remote_caps = STRSET_INIT;
+	const char *log_path = NULL;
+
+	struct option options[] = {
+		OPT_BOOL(0, "always-delay", &always_delay,
+			 "delay all paths with the can-delay flag"),
+		OPT_STRING(0, "log", &log_path, "path",
+			   "path to the debug log file"),
+		OPT_END()
+	};
+	nr_caps = parse_options(argc, argv, NULL, options, rot13_usage,
+				PARSE_OPT_STOP_AT_NON_OPTION);
+
+	if (!log_path || !nr_caps)
+		usage_with_options(rot13_usage, options);
+
+	logfile = fopen(log_path, "a");
+	if (!logfile)
+		die_errno("failed to open log file");
+
+	for (i = 0; i < nr_caps; i++) {
+		if (!strcmp(argv[i], "smudge"))
+			has_smudge_cap = 1;
+		if (!strcmp(argv[i], "clean"))
+			has_clean_cap = 1;
+	}
+
+	add_delay_entry("test-delay10.a", 1, 0);
+	add_delay_entry("test-delay11.a", 1, 0);
+	add_delay_entry("test-delay20.a", 2, 0);
+	add_delay_entry("test-delay10.b", 1, 0);
+	add_delay_entry("missing-delay.a", 1, 0);
+	add_delay_entry("invalid-delay.a", 1, 0);
+
+	fprintf(logfile, "START\n");
+	packet_initialize();
+
+	read_capabilities(&remote_caps);
+	check_and_write_capabilities(&remote_caps, argv, nr_caps);
+	fprintf(logfile, "init handshake complete\n");
+	strset_clear(&remote_caps);
+
+	command_loop();
+
+	if (fclose(logfile))
+		die_errno("error closing logfile");
+	free_delay_entries();
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 318fdbab0c..d6a560f832 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -65,6 +65,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "rot13-filter", cmd__rot13_filter },
 	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index bb79927163..21a91b1019 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -54,6 +54,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__rot13_filter(int argc, const char **argv);
 int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
-- 
2.37.1


* [PATCH v4 3/3] tests: use the new C rot13-filter helper to avoid PERL prereq
  2022-08-15  1:06     ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
  2022-08-15  1:06       ` [PATCH v4 1/3] t0021: avoid grepping for a Perl-specific string at filter output Matheus Tavares
  2022-08-15  1:06       ` [PATCH v4 2/3] t0021: implementation the rot13-filter.pl script in C Matheus Tavares
@ 2022-08-15  1:06       ` Matheus Tavares
  2022-08-15 13:01       ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Johannes Schindelin
  3 siblings, 0 replies; 34+ messages in thread
From: Matheus Tavares @ 2022-08-15  1:06 UTC (permalink / raw)
  To: git; +Cc: gitster, avarab, johannes.schindelin

The previous commit implemented a C version of the t0021/rot13-filter.pl
script. Let's use this new C helper to eliminate the PERL prereq from
various tests, and also remove the superseded Perl script.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 t/t0021-conversion.sh                   |  69 ++++---
 t/t0021/rot13-filter.pl                 | 247 ------------------------
 t/t2080-parallel-checkout-basics.sh     |   7 +-
 t/t2082-parallel-checkout-attributes.sh |   7 +-
 4 files changed, 37 insertions(+), 293 deletions(-)
 delete mode 100644 t/t0021/rot13-filter.pl
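
Note for readers of the log-based assertions (not part of the commit): the
run of dots after "OUT: <size>" in the debug log stands for the number of
data packets the filter wrote. The deleted Perl script split its output at
$MAX_PACKET_CONTENT_SIZE = 65516 bytes; the sketch below assumes the C
helper splits at the same limit (LARGE_PACKET_DATA_MAX in pkt-line.h). The
names sketch_count_packets() and SKETCH_PACKET_DATA_MAX are made up for
illustration and do not exist in Git.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: same value as the Perl script's $MAX_PACKET_CONTENT_SIZE
 * and, presumably, LARGE_PACKET_DATA_MAX in pkt-line.h.
 */
#define SKETCH_PACKET_DATA_MAX 65516

/*
 * Hypothetical helper, not part of Git: return how many data packets a
 * buffer of `len` bytes is split into when every packet carries at most
 * SKETCH_PACKET_DATA_MAX bytes. This is the number of dots expected in
 * the filter's debug log for an output of that size.
 */
int sketch_count_packets(size_t len)
{
	int nr_packets = 0;
	size_t bytes_written = 0;

	while (bytes_written < len) {
		size_t bytes_to_write = len - bytes_written;

		if (bytes_to_write > SKETCH_PACKET_DATA_MAX)
			bytes_to_write = SKETCH_PACKET_DATA_MAX;
		bytes_written += bytes_to_write;
		nr_packets++;
	}
	return nr_packets;
}

/* Small self-check of the boundary cases. */
void sketch_selfcheck(void)
{
	assert(sketch_count_packets(0) == 0);
	assert(sketch_count_packets(SKETCH_PACKET_DATA_MAX) == 1);
	assert(sketch_count_packets(SKETCH_PACKET_DATA_MAX + 1) == 2);
}
```

Under these assumptions, a 65517-byte blob would be logged with two dots
and an empty blob with none.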

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 963b66e08c..abecd75e4e 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -17,9 +17,6 @@ tr \
   'nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM'
 EOF
 
-write_script rot13-filter.pl "$PERL_PATH" \
-	<"$TEST_DIRECTORY"/t0021/rot13-filter.pl
-
 generate_random_characters () {
 	LEN=$1
 	NAME=$2
@@ -365,8 +362,8 @@ test_expect_success 'diff does not reuse worktree files that need cleaning' '
 	test_line_count = 0 count
 '
 
-test_expect_success PERL 'required process filter should filter data' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should filter data' '
+	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -450,8 +447,8 @@ test_expect_success PERL 'required process filter should filter data' '
 	)
 '
 
-test_expect_success PERL 'required process filter should filter data for various subcommands' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should filter data for various subcommands' '
+	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	(
 		cd repo &&
@@ -561,9 +558,9 @@ test_expect_success PERL 'required process filter should filter data for various
 	)
 '
 
-test_expect_success PERL 'required process filter takes precedence' '
+test_expect_success 'required process filter takes precedence' '
 	test_config_global filter.protocol.clean false &&
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
+	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -587,8 +584,8 @@ test_expect_success PERL 'required process filter takes precedence' '
 	)
 '
 
-test_expect_success PERL 'required process filter should be used only for "clean" operation only' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean" &&
+test_expect_success 'required process filter should be used only for "clean" operation only' '
+	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -622,8 +619,8 @@ test_expect_success PERL 'required process filter should be used only for "clean
 	)
 '
 
-test_expect_success PERL 'required process filter should process multiple packets' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter should process multiple packets' '
+	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 
 	rm -rf repo &&
@@ -687,8 +684,8 @@ test_expect_success PERL 'required process filter should process multiple packet
 	)
 '
 
-test_expect_success PERL 'required process filter with clean error should fail' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'required process filter with clean error should fail' '
+	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
 	test_config_global filter.protocol.required true &&
 	rm -rf repo &&
 	mkdir repo &&
@@ -706,8 +703,8 @@ test_expect_success PERL 'required process filter with clean error should fail'
 	)
 '
 
-test_expect_success PERL 'process filter should restart after unexpected write failure' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter should restart after unexpected write failure' '
+	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -761,8 +758,8 @@ test_expect_success PERL 'process filter should restart after unexpected write f
 	)
 '
 
-test_expect_success PERL 'process filter should not be restarted if it signals an error' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter should not be restarted if it signals an error' '
+	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -804,8 +801,8 @@ test_expect_success PERL 'process filter should not be restarted if it signals a
 	)
 '
 
-test_expect_success PERL 'process filter abort stops processing of all further files' '
-	test_config_global filter.protocol.process "rot13-filter.pl debug.log clean smudge" &&
+test_expect_success 'process filter abort stops processing of all further files' '
+	test_config_global filter.protocol.process "test-tool rot13-filter --log=debug.log clean smudge" &&
 	rm -rf repo &&
 	mkdir repo &&
 	(
@@ -861,10 +858,10 @@ test_expect_success PERL 'invalid process filter must fail (and not hang!)' '
 	)
 '
 
-test_expect_success PERL 'delayed checkout in process filter' '
-	test_config_global filter.a.process "rot13-filter.pl a.log clean smudge delay" &&
+test_expect_success 'delayed checkout in process filter' '
+	test_config_global filter.a.process "test-tool rot13-filter --log=a.log clean smudge delay" &&
 	test_config_global filter.a.required true &&
-	test_config_global filter.b.process "rot13-filter.pl b.log clean smudge delay" &&
+	test_config_global filter.b.process "test-tool rot13-filter --log=b.log clean smudge delay" &&
 	test_config_global filter.b.required true &&
 
 	rm -rf repo &&
@@ -940,8 +937,8 @@ test_expect_success PERL 'delayed checkout in process filter' '
 	)
 '
 
-test_expect_success PERL 'missing file in delayed checkout' '
-	test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&
+test_expect_success 'missing file in delayed checkout' '
+	test_config_global filter.bug.process "test-tool rot13-filter --log=bug.log clean smudge delay" &&
 	test_config_global filter.bug.required true &&
 
 	rm -rf repo &&
@@ -960,8 +957,8 @@ test_expect_success PERL 'missing file in delayed checkout' '
 	grep "error: .missing-delay\.a. was not filtered properly" git-stderr.log
 '
 
-test_expect_success PERL 'invalid file in delayed checkout' '
-	test_config_global filter.bug.process "rot13-filter.pl bug.log clean smudge delay" &&
+test_expect_success 'invalid file in delayed checkout' '
+	test_config_global filter.bug.process "test-tool rot13-filter --log=bug.log clean smudge delay" &&
 	test_config_global filter.bug.required true &&
 
 	rm -rf repo &&
@@ -990,10 +987,10 @@ do
 		mode_prereq='UTF8_NFD_TO_NFC' ;;
 	esac
 
-	test_expect_success PERL,SYMLINKS,$mode_prereq \
+	test_expect_success SYMLINKS,$mode_prereq \
 	"delayed checkout with $mode-collision don't write to the wrong place" '
 		test_config_global filter.delay.process \
-			"\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+			"test-tool rot13-filter --always-delay --log=delayed.log clean smudge delay" &&
 		test_config_global filter.delay.required true &&
 
 		git init $mode-collision &&
@@ -1026,12 +1023,12 @@ do
 	'
 done
 
-test_expect_success PERL,SYMLINKS,CASE_INSENSITIVE_FS \
+test_expect_success SYMLINKS,CASE_INSENSITIVE_FS \
 "delayed checkout with submodule collision don't write to the wrong place" '
 	git init collision-with-submodule &&
 	(
 		cd collision-with-submodule &&
-		git config filter.delay.process "\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter --always-delay --log=delayed.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		# We need Git to treat the submodule "a" and the
@@ -1062,11 +1059,11 @@ test_expect_success PERL,SYMLINKS,CASE_INSENSITIVE_FS \
 	)
 '
 
-test_expect_success PERL 'setup for progress tests' '
+test_expect_success 'setup for progress tests' '
 	git init progress &&
 	(
 		cd progress &&
-		git config filter.delay.process "rot13-filter.pl delay-progress.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter --log=delay-progress.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		echo "*.a filter=delay" >.gitattributes &&
@@ -1132,12 +1129,12 @@ do
 	'
 done
 
-test_expect_success PERL 'delayed checkout correctly reports the number of updated entries' '
+test_expect_success 'delayed checkout correctly reports the number of updated entries' '
 	rm -rf repo &&
 	git init repo &&
 	(
 		cd repo &&
-		git config filter.delay.process "../rot13-filter.pl delayed.log clean smudge delay" &&
+		git config filter.delay.process "test-tool rot13-filter --log=delayed.log clean smudge delay" &&
 		git config filter.delay.required true &&
 
 		echo "*.a filter=delay" >.gitattributes &&
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
deleted file mode 100644
index 7bb93768f3..0000000000
--- a/t/t0021/rot13-filter.pl
+++ /dev/null
@@ -1,247 +0,0 @@
-#
-# Example implementation for the Git filter protocol version 2
-# See Documentation/gitattributes.txt, section "Filter Protocol"
-#
-# Usage: rot13-filter.pl [--always-delay] <log path> <capabilities>
-#
-# Log path defines a debug log file that the script writes to. The
-# subsequent arguments define a list of supported protocol capabilities
-# ("clean", "smudge", etc).
-#
-# When --always-delay is given all pathnames with the "can-delay" flag
-# that don't appear on the list bellow are delayed with a count of 1
-# (see more below).
-#
-# This implementation supports special test cases:
-# (1) If data with the pathname "clean-write-fail.r" is processed with
-#     a "clean" operation then the write operation will die.
-# (2) If data with the pathname "smudge-write-fail.r" is processed with
-#     a "smudge" operation then the write operation will die.
-# (3) If data with the pathname "error.r" is processed with any
-#     operation then the filter signals that it cannot or does not want
-#     to process the file.
-# (4) If data with the pathname "abort.r" is processed with any
-#     operation then the filter signals that it cannot or does not want
-#     to process the file and any file after that is processed with the
-#     same command.
-# (5) If data with a pathname that is a key in the DELAY hash is
-#     requested (e.g. "test-delay10.a") then the filter responds with
-#     a "delay" status and sets the "requested" field in the DELAY hash.
-#     The filter will signal the availability of this object after
-#     "count" (field in DELAY hash) "list_available_blobs" commands.
-# (6) If data with the pathname "missing-delay.a" is processed that the
-#     filter will drop the path from the "list_available_blobs" response.
-# (7) If data with the pathname "invalid-delay.a" is processed that the
-#     filter will add the path "unfiltered" which was not delayed before
-#     to the "list_available_blobs" response.
-#
-
-use 5.008;
-sub gitperllib {
-	# Git assumes that all path lists are Unix-y colon-separated ones. But
-	# when the Git for Windows executes the test suite, its MSYS2 Bash
-	# calls git.exe, and colon-separated path lists are converted into
-	# Windows-y semicolon-separated lists of *Windows* paths (which
-	# naturally contain a colon after the drive letter, so splitting by
-	# colons simply does not cut it).
-	#
-	# Detect semicolon-separated path list and handle them appropriately.
-
-	if ($ENV{GITPERLLIB} =~ /;/) {
-		return split(/;/, $ENV{GITPERLLIB});
-	}
-	return split(/:/, $ENV{GITPERLLIB});
-}
-use lib (gitperllib());
-use strict;
-use warnings;
-use IO::File;
-use Git::Packet;
-
-my $MAX_PACKET_CONTENT_SIZE = 65516;
-
-my $always_delay = 0;
-if ( $ARGV[0] eq '--always-delay' ) {
-	$always_delay = 1;
-	shift @ARGV;
-}
-
-my $log_file                = shift @ARGV;
-my @capabilities            = @ARGV;
-
-open my $debug, ">>", $log_file or die "cannot open log file: $!";
-
-my %DELAY = (
-	'test-delay10.a' => { "requested" => 0, "count" => 1 },
-	'test-delay11.a' => { "requested" => 0, "count" => 1 },
-	'test-delay20.a' => { "requested" => 0, "count" => 2 },
-	'test-delay10.b' => { "requested" => 0, "count" => 1 },
-	'missing-delay.a' => { "requested" => 0, "count" => 1 },
-	'invalid-delay.a' => { "requested" => 0, "count" => 1 },
-);
-
-sub rot13 {
-	my $str = shift;
-	$str =~ y/A-Za-z/N-ZA-Mn-za-m/;
-	return $str;
-}
-
-print $debug "START\n";
-$debug->flush();
-
-packet_initialize("git-filter", 2);
-
-my %remote_caps = packet_read_and_check_capabilities("clean", "smudge", "delay");
-packet_check_and_write_capabilities(\%remote_caps, @capabilities);
-
-print $debug "init handshake complete\n";
-$debug->flush();
-
-while (1) {
-	my ( $res, $command ) = packet_key_val_read("command");
-	if ( $res == -1 ) {
-		print $debug "STOP\n";
-		exit();
-	}
-	print $debug "IN: $command";
-	$debug->flush();
-
-	if ( $command eq "list_available_blobs" ) {
-		# Flush
-		packet_compare_lists([1, ""], packet_bin_read()) ||
-			die "bad list_available_blobs end";
-
-		foreach my $pathname ( sort keys %DELAY ) {
-			if ( $DELAY{$pathname}{"requested"} >= 1 ) {
-				$DELAY{$pathname}{"count"} = $DELAY{$pathname}{"count"} - 1;
-				if ( $pathname eq "invalid-delay.a" ) {
-					# Send Git a pathname that was not delayed earlier
-					packet_txt_write("pathname=unfiltered");
-				}
-				if ( $pathname eq "missing-delay.a" ) {
-					# Do not signal Git that this file is available
-				} elsif ( $DELAY{$pathname}{"count"} == 0 ) {
-					print $debug " $pathname";
-					packet_txt_write("pathname=$pathname");
-				}
-			}
-		}
-
-		packet_flush();
-
-		print $debug " [OK]\n";
-		$debug->flush();
-		packet_txt_write("status=success");
-		packet_flush();
-	} else {
-		my ( $res, $pathname ) = packet_key_val_read("pathname");
-		if ( $res == -1 ) {
-			die "unexpected EOF while expecting pathname";
-		}
-		print $debug " $pathname";
-		$debug->flush();
-
-		# Read until flush
-		my ( $done, $buffer ) = packet_txt_read();
-		while ( $buffer ne '' ) {
-			if ( $buffer eq "can-delay=1" ) {
-				if ( exists $DELAY{$pathname} and $DELAY{$pathname}{"requested"} == 0 ) {
-					$DELAY{$pathname}{"requested"} = 1;
-				} elsif ( !exists $DELAY{$pathname} and $always_delay ) {
-					$DELAY{$pathname} = { "requested" => 1, "count" => 1 };
-				}
-			} elsif ($buffer =~ /^(ref|treeish|blob)=/) {
-				print $debug " $buffer";
-			} else {
-				# In general, filters need to be graceful about
-				# new metadata, since it's documented that we
-				# can pass any key-value pairs, but for tests,
-				# let's be a little stricter.
-				die "Unknown message '$buffer'";
-			}
-
-			( $done, $buffer ) = packet_txt_read();
-		}
-		if ( $done == -1 ) {
-			die "unexpected EOF after pathname '$pathname'";
-		}
-
-		my $input = "";
-		{
-			binmode(STDIN);
-			my $buffer;
-			my $done = 0;
-			while ( !$done ) {
-				( $done, $buffer ) = packet_bin_read();
-				$input .= $buffer;
-			}
-			if ( $done == -1 ) {
-				die "unexpected EOF while reading input for '$pathname'";
-			}			
-			print $debug " " . length($input) . " [OK] -- ";
-			$debug->flush();
-		}
-
-		my $output;
-		if ( exists $DELAY{$pathname} and exists $DELAY{$pathname}{"output"} ) {
-			$output = $DELAY{$pathname}{"output"}
-		} elsif ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
-			$output = "";
-		} elsif ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) {
-			$output = rot13($input);
-		} elsif ( $command eq "smudge" and grep( /^smudge$/, @capabilities ) ) {
-			$output = rot13($input);
-		} else {
-			die "bad command '$command'";
-		}
-
-		if ( $pathname eq "error.r" ) {
-			print $debug "[ERROR]\n";
-			$debug->flush();
-			packet_txt_write("status=error");
-			packet_flush();
-		} elsif ( $pathname eq "abort.r" ) {
-			print $debug "[ABORT]\n";
-			$debug->flush();
-			packet_txt_write("status=abort");
-			packet_flush();
-		} elsif ( $command eq "smudge" and
-			exists $DELAY{$pathname} and
-			$DELAY{$pathname}{"requested"} == 1 ) {
-			print $debug "[DELAYED]\n";
-			$debug->flush();
-			packet_txt_write("status=delayed");
-			packet_flush();
-			$DELAY{$pathname}{"requested"} = 2;
-			$DELAY{$pathname}{"output"} = $output;
-		} else {
-			packet_txt_write("status=success");
-			packet_flush();
-
-			if ( $pathname eq "${command}-write-fail.r" ) {
-				print $debug "[WRITE FAIL]\n";
-				$debug->flush();
-				die "${command} write error";
-			}
-
-			print $debug "OUT: " . length($output) . " ";
-			$debug->flush();
-
-			while ( length($output) > 0 ) {
-				my $packet = substr( $output, 0, $MAX_PACKET_CONTENT_SIZE );
-				packet_bin_write($packet);
-				# dots represent the number of packets
-				print $debug ".";
-				if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) {
-					$output = substr( $output, $MAX_PACKET_CONTENT_SIZE );
-				} else {
-					$output = "";
-				}
-			}
-			packet_flush();
-			print $debug " [OK]\n";
-			$debug->flush();
-			packet_flush();
-		}
-	}
-}
diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh
index c683e60007..00ce3033d3 100755
--- a/t/t2080-parallel-checkout-basics.sh
+++ b/t/t2080-parallel-checkout-basics.sh
@@ -230,12 +230,9 @@ test_expect_success SYMLINKS 'parallel checkout checks for symlinks in leading d
 # check the final report including sequential, parallel, and delayed entries
 # all at the same time. So we must have finer control of the parallel checkout
 # variables.
-test_expect_success PERL '"git checkout ." report should not include failed entries' '
-	write_script rot13-filter.pl "$PERL_PATH" \
-		<"$TEST_DIRECTORY"/t0021/rot13-filter.pl &&
-
+test_expect_success '"git checkout ." report should not include failed entries' '
 	test_config_global filter.delay.process \
-		"\"$(pwd)/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+		"test-tool rot13-filter --always-delay --log=delayed.log clean smudge delay" &&
 	test_config_global filter.delay.required true &&
 	test_config_global filter.cat.clean cat  &&
 	test_config_global filter.cat.smudge cat  &&
diff --git a/t/t2082-parallel-checkout-attributes.sh b/t/t2082-parallel-checkout-attributes.sh
index 2525457961..f3511cd43a 100755
--- a/t/t2082-parallel-checkout-attributes.sh
+++ b/t/t2082-parallel-checkout-attributes.sh
@@ -138,12 +138,9 @@ test_expect_success 'parallel-checkout and external filter' '
 # The delayed queue is independent from the parallel queue, and they should be
 # able to work together in the same checkout process.
 #
-test_expect_success PERL 'parallel-checkout and delayed checkout' '
-	write_script rot13-filter.pl "$PERL_PATH" \
-		<"$TEST_DIRECTORY"/t0021/rot13-filter.pl &&
-
+test_expect_success 'parallel-checkout and delayed checkout' '
 	test_config_global filter.delay.process \
-		"\"$(pwd)/rot13-filter.pl\" --always-delay \"$(pwd)/delayed.log\" clean smudge delay" &&
+		"test-tool rot13-filter --always-delay --log=\"$(pwd)/delayed.log\" clean smudge delay" &&
 	test_config_global filter.delay.required true &&
 
 	echo "abcd" >original &&
-- 
2.37.1
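
The packet-writing loop in the removed rot13-filter.pl (split the output
into chunks of at most $MAX_PACKET_CONTENT_SIZE bytes, write each chunk as
one packet, then flush) can be sketched in standalone C as follows. This is
only a minimal illustration, not the code from t/helper/test-rot13-filter.c:
the names write_packet(), write_flush() and write_chunked() are
hypothetical, the sketch writes to a stdio FILE instead of using git's
pkt-line.c helpers, and the 65516 limit assumes the usual pkt-line maximum
of 65520 bytes including the 4-byte length header.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed limit: 65520 bytes per pkt-line minus the 4-byte length header. */
#define MAX_PACKET_CONTENT_SIZE 65516

/* Write one pkt-line: 4 hex digits holding (payload length + 4), then the payload. */
static int write_packet(FILE *out, const char *data, size_t len)
{
	assert(len > 0 && len <= MAX_PACKET_CONTENT_SIZE);
	if (fprintf(out, "%04x", (unsigned int)(len + 4)) != 4)
		return -1;
	return fwrite(data, 1, len, out) == len ? 0 : -1;
}

/* Write a flush packet ("0000"), which terminates a list of packets. */
static int write_flush(FILE *out)
{
	return fputs("0000", out) == EOF ? -1 : 0;
}

/*
 * Split buf into packets of at most max_payload bytes and end with one flush.
 * Returns the number of data packets written (the "dots" in the Perl debug
 * log), or -1 on a write error.
 */
static long write_chunked(FILE *out, const char *buf, size_t len, size_t max_payload)
{
	long packets = 0;

	assert(max_payload > 0 && max_payload <= MAX_PACKET_CONTENT_SIZE);
	while (len > 0) {
		size_t chunk = len < max_payload ? len : max_payload;

		if (write_packet(out, buf, chunk) < 0)
			return -1;
		buf += chunk;
		len -= chunk;
		packets++;
	}
	return write_flush(out) < 0 ? -1 : packets;
}
```

Calling write_chunked(stdout, "abcdefg", 7, 3) emits
0007abc0007def0005g0000 and returns 3. In the long-running filter protocol
the content is followed by a second flush packet that ends an empty trailing
status list, which is why the removed Perl calls packet_flush() twice; a
caller of this sketch would issue that second write_flush() itself.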



* Re: [PATCH v4 0/3] t0021: convert perl script to C test-tool helper
  2022-08-15  1:06     ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
                         ` (2 preceding siblings ...)
  2022-08-15  1:06       ` [PATCH v4 3/3] tests: use the new C rot13-filter helper to avoid PERL prereq Matheus Tavares
@ 2022-08-15 13:01       ` Johannes Schindelin
  2022-08-19 22:17         ` Junio C Hamano
  3 siblings, 1 reply; 34+ messages in thread
From: Johannes Schindelin @ 2022-08-15 13:01 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, gitster, avarab

Hi Matheus,

On Sun, 14 Aug 2022, Matheus Tavares wrote:

> Main changes since v3:
> Patch 2:
> - Mentioned in commit message why we removed the flush() calls for the
>   log file handler.
> - Removed 'buf[size] = \0' and relied on the fact that packet_read()
>   already 0-terminates the buffer. This also allows us to use NULL
>   instead of &size in many places, dropping the now-unneeded variable.
> - Used parse-options instead of manual argv fiddling. I don't feel
>   strongly one way or the other, but I found parse-options slightly
>   easier to extend with new options that may be added in the future.
> - Style: removed unnecessary {} and newline.

While I think that the switch to `parse-options` was unnecessary churn, I
won't object because I find that I cannot motivate myself to care all that
much (other reviewers seem to find this type of aspect super exciting, a
sentiment I do not share). I care much more about the essence, about the
actual improvement brought about by your patch series, which is to reduce
Git's test suite's reliance on scripting.

The range-diff looks good to me, and I think this iteration is good to go.

Thanks,
Dscho


* Re: [PATCH v4 0/3] t0021: convert perl script to C test-tool helper
  2022-08-15 13:01       ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Johannes Schindelin
@ 2022-08-19 22:17         ` Junio C Hamano
  0 siblings, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2022-08-19 22:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Matheus Tavares, git, avarab

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Sun, 14 Aug 2022, Matheus Tavares wrote:
>
>> Main changes since v3:
>> Patch 2:
>> - Mentioned in commit message why we removed the flush() calls for the
>>   log file handler.
>> - Removed 'buf[size] = \0' and relied on the fact that packet_read()
>>   already 0-terminates the buffer. This also allows us to use NULL
> ...
> The range-diff looks good to me, and I think this iteration is good to go.

Thanks, both.  Let's merge it down.



end of thread

Thread overview: 34+ messages
2022-07-22 19:42 [PATCH 0/2] t0021: convert perl script to C test-tool helper Matheus Tavares
2022-07-22 19:42 ` [PATCH 1/2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
2022-07-23  4:52   ` Ævar Arnfjörð Bjarmason
2022-07-23  4:59   ` Ævar Arnfjörð Bjarmason
2022-07-23 13:36     ` Matheus Tavares
2022-07-22 19:42 ` [PATCH 2/2] t/t0021: replace old rot13-filter.pl uses with new test-tool cmd Matheus Tavares
2022-07-24 15:09 ` [PATCH v2] t/t0021: convert the rot13-filter.pl script to C Matheus Tavares
2022-07-28 16:58   ` Johannes Schindelin
2022-07-28 17:54     ` Junio C Hamano
2022-07-28 19:50     ` Ævar Arnfjörð Bjarmason
2022-07-31  2:52     ` Matheus Tavares
2022-08-09  9:36       ` Johannes Schindelin
2022-07-31 18:19   ` [PATCH v3 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
2022-07-31 18:19     ` [PATCH v3 1/3] t0021: avoid grepping for a Perl-specific string at filter output Matheus Tavares
2022-08-01 20:41       ` Junio C Hamano
2022-07-31 18:19     ` [PATCH v3 2/3] t0021: implementation the rot13-filter.pl script in C Matheus Tavares
2022-08-01 11:33       ` Ævar Arnfjörð Bjarmason
2022-08-02  0:16         ` Matheus Tavares
2022-08-09  9:45           ` Johannes Schindelin
2022-08-01 11:39       ` Ævar Arnfjörð Bjarmason
2022-08-01 21:18       ` Junio C Hamano
2022-08-02  0:13         ` Matheus Tavares
2022-08-09 10:00         ` Johannes Schindelin
2022-08-10 18:37           ` Junio C Hamano
2022-08-10 19:58             ` Junio C Hamano
2022-08-09 10:37         ` Johannes Schindelin
2022-08-09 10:47       ` Johannes Schindelin
2022-07-31 18:19     ` [PATCH v3 3/3] tests: use the new C rot13-filter helper to avoid PERL prereq Matheus Tavares
2022-08-15  1:06     ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Matheus Tavares
2022-08-15  1:06       ` [PATCH v4 1/3] t0021: avoid grepping for a Perl-specific string at filter output Matheus Tavares
2022-08-15  1:06       ` [PATCH v4 2/3] t0021: implementation the rot13-filter.pl script in C Matheus Tavares
2022-08-15  1:06       ` [PATCH v4 3/3] tests: use the new C rot13-filter helper to avoid PERL prereq Matheus Tavares
2022-08-15 13:01       ` [PATCH v4 0/3] t0021: convert perl script to C test-tool helper Johannes Schindelin
2022-08-19 22:17         ` Junio C Hamano
