From: larsxschneider@gmail.com
To: git@vger.kernel.org
Cc: peff@peff.net, gitster@pobox.com, sbeller@google.com,
Johannes.Schindelin@gmx.de, jnareb@gmail.com, mlbright@gmail.com,
tboegi@web.de, jacob.keller@gmail.com,
Lars Schneider <larsxschneider@gmail.com>
Subject: [PATCH v7 00/10] Git filter protocol
Date: Thu, 8 Sep 2016 20:21:22 +0200 [thread overview]
Message-ID: <20160908182132.50788-1-larsxschneider@gmail.com> (raw)
From: Lars Schneider <larsxschneider@gmail.com>
The goal of this series is to avoid launching a new clean/smudge filter
process for each file that is filtered.
A short summary about v1 to v5 can be found here:
https://git.github.io/rev_news/2016/08/17/edition-18/
This series is also published on web:
https://github.com/larsxschneider/git/pull/11
Thanks a lot to
Stefan, Junio, Jakub (Narebski and Keller), Peff, and Torsten
for very helpful reviews,
Lars
## Major changes since v6
* series rebased to Git v2.10.0
* simplified and improved multi-packet test
* v2 filter takes precedence over v1 filter
* rename "error-all" to "abort"
* patch 07/13 "pack-protocol: fix maximum pkt-line size" submitted separately:
http://public-inbox.org/git/20160829175509.69050-1-larsxschneider@gmail.com/
* removed "10/13 convert: generate large test files only once"
* patch 13/13 "read-cache: make sure file handles are not inherited by child
processes" moved to an own series:
http://public-inbox.org/git/20160905211111.72956-1-larsxschneider@gmail.com/
## All changes since v6
### Stefan
* http://public-inbox.org/git/CAGZ79kaJn-yf28+Ls2JyqSs6Jt-Oj2JW-sAXQn-Oe5xaxSyjMQ@mail.gmail.com/
* submit patch "pack-protocol: fix maximum pkt-line size" separately:
http://public-inbox.org/git/20160829175509.69050-1-larsxschneider@gmail.com/
* http://public-inbox.org/git/CAGZ79kbEoYVM_+MWkajT2pHN1OKZXATe5d+Dv_uSJ7dyPGxUeg@mail.gmail.com/
* trace failures in new *_gently() pkt-line functions
### Junio
* http://public-inbox.org/git/xmqq8tvkle6o.fsf@gitster.mtv.corp.google.com/
* share code between packet_write_fmt() and packet_write_fmt_gently()
* http://public-inbox.org/git/xmqq4m68ldrg.fsf@gitster.mtv.corp.google.com/
* remove "Second, it will..." from "pkt-line: add packet_write_gently()"
commit message
* use separate buffer in write_packetized_from_fd()
* fix write order in packet_write_gently()
* http://public-inbox.org/git/xmqqzio0jxh7.fsf@gitster.mtv.corp.google.com/
* rename read/write flush terminated packet streams functions
* remove meaningless "bytes_to_write > ..." check
* reuse packet_read() function in read_packetized_to_buf()
* http://public-inbox.org/git/xmqqk2f3fgbd.fsf@gitster.mtv.corp.google.com/
* revert test_config replacement in test setup
* http://public-inbox.org/git/xmqq37lncvj6.fsf@gitster.mtv.corp.google.com/
* improve multi-packet test
* http://public-inbox.org/git/xmqqzinv6wtg.fsf@gitster.mtv.corp.google.com/
* give v2 filter precedence over v1 filter
* http://public-inbox.org/git/xmqqmvjt3nht.fsf@gitster.mtv.corp.google.com/
* explain that "error" behavior mimics v1 filter
* explain that Git closes the pipe on exit to allow the filter a graceful
exit
### Jakub
* http://public-inbox.org/git/7fbd02a1-ad62-7391-df2a-835aae8a62a7@gmail.com/
* rename "error-all" to "abort"
* add a simple Perl v2 filter example in contrib
## Interdiff (v6..v7)
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index df059e9..ac000ea 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -470,20 +470,11 @@ packet: git< status=error\n
packet: git< 0000
------------------------
-In case the filter cannot or does not want to process the content
-as well as any future content for the lifetime of the Git process,
-it is expected to respond with an "error-all" status. Depending on
-the `filter.<driver>.required` flag Git will interpret that as error
-but it will not stop or restart the filter process.
-------------------------
-packet: git< status=error-all\n
-packet: git< 0000
-------------------------
-
If the filter experiences an error during processing, then it can
-send the status "error". Depending on the `filter.<driver>.required`
-flag Git will interpret that as error but it will not stop or restart
-the filter process.
+send the status "error" after the content was (partially or
+completely) sent. Depending on the `filter.<driver>.required` flag
+Git will interpret that as error but it will not stop or restart the
+filter process.
------------------------
packet: git< status=success\n
packet: git< 0000
@@ -495,21 +486,38 @@ packet: git< 0000
If the filter dies during the communication or does not adhere to
the protocol then Git will stop the filter process and restart it
-with the next file that needs to be processed.
+with the next file that needs to be processed. Depending on the
+`filter.<driver>.required` flag Git will interpret that as error.
+
+The error handling for all cases above mimic the behavior of
+the `filter.<driver>.clean` / `filter.<driver>.smudge` error
+handling.
+
+In case the filter cannot or does not want to process the content
+as well as any future content for the lifetime of the Git process,
+it is expected to respond with an "abort" status. Depending on
+the `filter.<driver>.required` flag Git will interpret that as error
+for the content as well as any future content for the lifetime of the
+Git process but it will not stop or restart the filter process.
+------------------------
+packet: git< status=abort\n
+packet: git< 0000
+------------------------
After the filter has processed a blob it is expected to wait for
-the next "key=value" list containing a command. When the Git process
-terminates, it will send a kill signal to the filter in that stage.
+the next "key=value" list containing a command. Git will close
+the command pipe on exit. The filter is expected to detect EOF
+and exit gracefully on its own.
A long running filter demo implementation can be found in
-`t/t0021/rot13-filter.pl` located in the Git core repository.
-If you develop your own long running filter process then the
-`GIT_TRACE_PACKET` environment variables can be very helpful
-for debugging (see linkgit:git[1]).
-
-If a `filter.<driver>.clean` or `filter.<driver>.smudge` command
-is configured then these commands always take precedence over
-a configured `filter.<driver>.process` command.
+`contrib/long-running-filter/example.pl` located in the Git
+core repository. If you develop your own long running filter
+process then the `GIT_TRACE_PACKET` environment variables can be
+very helpful for debugging (see linkgit:git[1]).
+
+If a `filter.<driver>.process` command is configured then it
+always takes precedence over a configured `filter.<driver>.clean`
+or `filter.<driver>.smudge` command.
Please note that you cannot use an existing `filter.<driver>.clean`
or `filter.<driver>.smudge` command with `filter.<driver>.process`
diff --git a/contrib/long-running-filter/example.pl b/contrib/long-running-filter/example.pl
new file mode 100755
index 0000000..279fbfb
--- /dev/null
+++ b/contrib/long-running-filter/example.pl
@@ -0,0 +1,111 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git filter protocol version 2
+# See Documentation/gitattributes.txt, section "Filter Protocol"
+#
+
+use strict;
+use warnings;
+
+my $MAX_PACKET_CONTENT_SIZE = 65516;
+
+sub packet_read {
+ my $buffer;
+ my $bytes_read = read STDIN, $buffer, 4;
+ if ( $bytes_read == 0 ) {
+
+ # EOF - Git stopped talking to us!
+ exit();
+ }
+ elsif ( $bytes_read != 4 ) {
+ die "invalid packet size '$bytes_read' field";
+ }
+ my $pkt_size = hex($buffer);
+ if ( $pkt_size == 0 ) {
+ return ( 1, "" );
+ }
+ elsif ( $pkt_size > 4 ) {
+ my $content_size = $pkt_size - 4;
+ $bytes_read = read STDIN, $buffer, $content_size;
+ if ( $bytes_read != $content_size ) {
+ die "invalid packet ($content_size expected; $bytes_read read)";
+ }
+ return ( 0, $buffer );
+ }
+ else {
+ die "invalid packet size";
+ }
+}
+
+sub packet_write {
+ my ($packet) = @_;
+ print STDOUT sprintf( "%04x", length($packet) + 4 );
+ print STDOUT $packet;
+ STDOUT->flush();
+}
+
+sub packet_flush {
+ print STDOUT sprintf( "%04x", 0 );
+ STDOUT->flush();
+}
+
+( packet_read() eq ( 0, "git-filter-client" ) ) || die "bad initialization";
+( packet_read() eq ( 0, "version=2" ) ) || die "bad version";
+( packet_read() eq ( 1, "" ) ) || die "bad version end";
+
+packet_write("git-filter-server\n");
+packet_write("version=2\n");
+
+( packet_read() eq ( 0, "clean=true" ) ) || die "bad capability";
+( packet_read() eq ( 0, "smudge=true" ) ) || die "bad capability";
+( packet_read() eq ( 1, "" ) ) || die "bad capability end";
+
+packet_write( "clean=true\n" );
+packet_write( "smudge=true\n" );
+packet_flush();
+
+while (1) {
+ my ($command) = packet_read() =~ /^command=([^=]+)\n$/;
+ my ($pathname) = packet_read() =~ /^pathname=([^=]+)\n$/;
+
+ packet_read();
+
+ my $input = "";
+ {
+ binmode(STDIN);
+ my $buffer;
+ my $done = 0;
+ while ( !$done ) {
+ ( $done, $buffer ) = packet_read();
+ $input .= $buffer;
+ }
+ }
+
+ my $output;
+ if ( $command eq "clean" ) {
+ ### Perform clean here ###
+ $output = $input;
+ }
+ elsif ( $command eq "smudge" ) {
+ ### Perform smudge here ###
+ $output = $input;
+ }
+ else {
+ die "bad command '$command'";
+ }
+
+ packet_write("status=success\n");
+ packet_flush();
+ while ( length($output) > 0 ) {
+ my $packet = substr( $output, 0, $MAX_PACKET_CONTENT_SIZE );
+ packet_write($packet);
+ if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) {
+ $output = substr( $output, $MAX_PACKET_CONTENT_SIZE );
+ }
+ else {
+ $output = "";
+ }
+ }
+ packet_flush(); # flush content!
+ packet_flush(); # empty list!
+}
diff --git a/convert.c b/convert.c
index 0269a49..0ed48ed 100644
--- a/convert.c
+++ b/convert.c
@@ -723,9 +723,9 @@ static int apply_multi_file_filter(const char *path, const char *src, size_t len
goto done;
if (fd >= 0)
- err = packet_write_stream_with_flush_from_fd(fd, process->in);
+ err = write_packetized_from_fd(fd, process->in);
else
- err = packet_write_stream_with_flush_from_buf(src, len, process->in);
+ err = write_packetized_from_buf(src, len, process->in);
if (err)
goto done;
@@ -734,7 +734,7 @@ static int apply_multi_file_filter(const char *path, const char *src, size_t len
if (err)
goto done;
- err = packet_read_till_flush(process->out, &nbuf) < 0;
+ err = read_packetized_to_buf(process->out, &nbuf) < 0;
if (err)
goto done;
@@ -747,7 +747,7 @@ static int apply_multi_file_filter(const char *path, const char *src, size_t len
if (err || errno == EPIPE) {
if (!strcmp(filter_status.buf, "error")) {
/* The filter signaled a problem with the file. */
- } else if (!strcmp(filter_status.buf, "error-all")) {
+ } else if (!strcmp(filter_status.buf, "abort")) {
/*
* The filter signaled a permanent problem. Don't try to filter
* files with the same command for the lifetime of the current
@@ -790,9 +790,9 @@ static int apply_filter(const char *path, const char *src, size_t len,
if (!dst)
return 1;
- if ((CAP_CLEAN & wanted_capability) && drv->clean)
+ if (!drv->process && (CAP_CLEAN & wanted_capability) && drv->clean)
cmd = drv->clean;
- else if ((CAP_SMUDGE & wanted_capability) && drv->smudge)
+ else if (!drv->process && (CAP_SMUDGE & wanted_capability) && drv->smudge)
cmd = drv->smudge;
if (cmd && *cmd)
diff --git a/pkt-line.c b/pkt-line.c
index 3033aa3..5001a07 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -3,7 +3,6 @@
#include "run-command.h"
char packet_buffer[LARGE_PACKET_MAX];
-static char packet_write_buffer[LARGE_PACKET_MAX];
static const char *packet_trace_prefix = "git";
static struct trace_key trace_packet = TRACE_KEY_INIT(PACKET);
static struct trace_key trace_pack = TRACE_KEY_INIT(PACKFILE);
@@ -95,7 +94,10 @@ void packet_flush(int fd)
int packet_flush_gently(int fd)
{
packet_trace("0000", 4, 1);
- return (write_in_full(fd, "0000", 4) == 4 ? 0 : -1);
+ if (write_in_full(fd, "0000", 4) == 4)
+ return 0;
+ error("flush packet write failed");
+ return -1;
}
void packet_buf_flush(struct strbuf *buf)
@@ -132,39 +134,70 @@ static void format_packet(struct strbuf *out, const char *fmt, va_list args)
packet_trace(out->buf + orig_len + 4, n - 4, 1);
}
+static int packet_write_fmt_1(int fd, int gently,
+ const char *fmt, va_list args)
+{
+ struct strbuf buf = STRBUF_INIT;
+ size_t count;
+
+ format_packet(&buf, fmt, args);
+ count = write_in_full(fd, buf.buf, buf.len);
+ if (count == buf.len)
+ return 0;
+
+ if (!gently) {
+ if (errno == EPIPE) {
+ if (in_async())
+ async_exit(141);
+
+ signal(SIGPIPE, SIG_DFL);
+ raise(SIGPIPE);
+ /* Should never happen, but just in case... */
+ exit(141);
+ }
+ die_errno("packet write error");
+ }
+ error("packet write failed");
+ return -1;
+}
+
void packet_write_fmt(int fd, const char *fmt, ...)
{
- static struct strbuf buf = STRBUF_INIT;
va_list args;
- strbuf_reset(&buf);
va_start(args, fmt);
- format_packet(&buf, fmt, args);
+ packet_write_fmt_1(fd, 0, fmt, args);
va_end(args);
- write_or_die(fd, buf.buf, buf.len);
}
int packet_write_fmt_gently(int fd, const char *fmt, ...)
{
- static struct strbuf buf = STRBUF_INIT;
+ int status;
va_list args;
- strbuf_reset(&buf);
va_start(args, fmt);
- format_packet(&buf, fmt, args);
+ status = packet_write_fmt_1(fd, 1, fmt, args);
va_end(args);
- return (write_in_full(fd, buf.buf, buf.len) == buf.len ? 0 : -1);
+ return status;
}
int packet_write_gently(const int fd_out, const char *buf, size_t size)
{
- if (size > sizeof(packet_write_buffer) - 4)
+ static char packet_write_buffer[LARGE_PACKET_MAX];
+
+ if (size > sizeof(packet_write_buffer) - 4) {
+ error("packet write failed");
return -1;
+ }
packet_trace(buf, size, 1);
- memmove(packet_write_buffer + 4, buf, size);
size += 4;
set_packet_header(packet_write_buffer, size);
- return (write_in_full(fd_out, packet_write_buffer, size) == size ? 0 : -1);
+ memcpy(packet_write_buffer + 4, buf, size - 4);
+ if (write_in_full(fd_out, packet_write_buffer, size) == size)
+ return 0;
+
+ error("packet write failed");
+ return -1;
}
void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
@@ -176,35 +209,35 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
va_end(args);
}
-int packet_write_stream_with_flush_from_fd(int fd_in, int fd_out)
+int write_packetized_from_fd(int fd_in, int fd_out)
{
+ static char buf[PKTLINE_DATA_MAXLEN];
int err = 0;
ssize_t bytes_to_write;
while (!err) {
- bytes_to_write = xread(fd_in, packet_write_buffer, sizeof(packet_write_buffer) - 4);
+ bytes_to_write = xread(fd_in, buf, sizeof(buf));
if (bytes_to_write < 0)
return COPY_READ_ERROR;
if (bytes_to_write == 0)
break;
- if (bytes_to_write > sizeof(packet_write_buffer) - 4)
- return COPY_WRITE_ERROR;
- err = packet_write_gently(fd_out, packet_write_buffer, bytes_to_write);
+ err = packet_write_gently(fd_out, buf, bytes_to_write);
}
if (!err)
err = packet_flush_gently(fd_out);
return err;
}
-int packet_write_stream_with_flush_from_buf(const char *src_in, size_t len, int fd_out)
+int write_packetized_from_buf(const char *src_in, size_t len, int fd_out)
{
+ static char buf[PKTLINE_DATA_MAXLEN];
int err = 0;
size_t bytes_written = 0;
size_t bytes_to_write;
while (!err) {
- if ((len - bytes_written) > sizeof(packet_write_buffer) - 4)
- bytes_to_write = sizeof(packet_write_buffer) - 4;
+ if ((len - bytes_written) > sizeof(buf))
+ bytes_to_write = sizeof(buf);
else
bytes_to_write = len - bytes_written;
if (bytes_to_write == 0)
@@ -327,52 +360,29 @@ char *packet_read_line_buf(char **src, size_t *src_len, int *dst_len)
return packet_read_line_generic(-1, src, src_len, dst_len);
}
-ssize_t packet_read_till_flush(int fd_in, struct strbuf *sb_out)
+ssize_t read_packetized_to_buf(int fd_in, struct strbuf *sb_out)
{
- int len, ret;
+ int paket_len;
int options = PACKET_READ_GENTLE_ON_EOF;
- char linelen[4];
size_t oldlen = sb_out->len;
size_t oldalloc = sb_out->alloc;
for (;;) {
- /* Read packet header */
- ret = get_packet_data(fd_in, NULL, NULL, linelen, 4, options);
- if (ret < 0)
- goto done;
- len = packet_length(linelen);
- if (len < 0)
- die("protocol error: bad line length character: %.4s", linelen);
- if (!len) {
- /* Found a flush packet - Done! */
- packet_trace("0000", 4, 0);
+ strbuf_grow(sb_out, PKTLINE_DATA_MAXLEN+1);
+ paket_len = packet_read(fd_in, NULL, NULL,
+ sb_out->buf + sb_out->len, PKTLINE_DATA_MAXLEN+1, options);
+ if (paket_len <= 0)
break;
- }
- len -= 4;
-
- /* Read packet content */
- strbuf_grow(sb_out, len);
- ret = get_packet_data(fd_in, NULL, NULL, sb_out->buf + sb_out->len, len, options);
- if (ret < 0)
- goto done;
-
- if (ret != len) {
- error("protocol error: incomplete read (expected %d, got %d)", len, ret);
- goto done;
- }
-
- packet_trace(sb_out->buf + sb_out->len, len, 0);
- sb_out->len += len;
+ sb_out->len += paket_len;
}
-done:
- if (ret < 0) {
+ if (paket_len < 0) {
if (oldalloc == 0)
strbuf_release(sb_out);
else
strbuf_setlen(sb_out, oldlen);
- return ret; /* unexpected EOF */
+ return paket_len;
}
return sb_out->len - oldlen;
}
diff --git a/pkt-line.h b/pkt-line.h
index 6a3b7cf..3d873f3 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -25,8 +25,8 @@ void packet_buf_flush(struct strbuf *buf);
void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
int packet_flush_gently(int fd);
int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
-int packet_write_stream_with_flush_from_fd(int fd_in, int fd_out);
-int packet_write_stream_with_flush_from_buf(const char *src_in, size_t len, int fd_out);
+int write_packetized_from_fd(int fd_in, int fd_out);
+int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
/*
* Read a packetized line into the buffer, which must be at least size bytes
@@ -82,7 +82,7 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
/*
* Reads a stream of variable sized packets until a flush packet is detected.
*/
-ssize_t packet_read_till_flush(int fd_in, struct strbuf *sb_out);
+ssize_t read_packetized_to_buf(int fd_in, struct strbuf *sb_out);
#define DEFAULT_PACKET_MAX 1000
#define LARGE_PACKET_MAX 65520
diff --git a/read-cache.c b/read-cache.c
index 02f74d3..491e52d 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -156,7 +156,7 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
{
int match = -1;
- int fd = open(ce->name, O_RDONLY | O_CLOEXEC);
+ int fd = open(ce->name, O_RDONLY);
if (fd >= 0) {
unsigned char sha1[20];
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 94b0bf3..1c98ac3 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -4,15 +4,6 @@ test_description='blob conversion via gitattributes'
. ./test-lib.sh
-if test_have_prereq EXPENSIVE
-then
- T0021_LARGE_FILE_SIZE=2048
- T0021_LARGISH_FILE_SIZE=100
-else
- T0021_LARGE_FILE_SIZE=30
- T0021_LARGISH_FILE_SIZE=2
-fi
-
cat <<EOF >rot13.sh
#!$SHELL_PATH
tr \
@@ -22,8 +13,8 @@ EOF
chmod +x rot13.sh
test_expect_success setup '
- test_config filter.rot13.smudge ./rot13.sh &&
- test_config filter.rot13.clean ./rot13.sh &&
+ git config filter.rot13.smudge ./rot13.sh &&
+ git config filter.rot13.clean ./rot13.sh &&
{
echo "*.t filter=rot13"
@@ -43,26 +34,7 @@ test_expect_success setup '
git checkout -- test test.t test.i &&
echo "content-test2" >test2.o &&
- echo "content-test3-subdir" >test3-subdir.o &&
-
- mkdir generated-test-data &&
- for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
- do
- RANDOM_STRING="$(test-genrandom end $i | tr -dc "A-Za-z0-9" )"
- ROT_RANDOM_STRING="$(echo $RANDOM_STRING | ./rot13.sh )"
- # Generate 1MB of empty data and 100 bytes of random characters
- # printf "$(test-genrandom start $i)"
- printf "%1048576d" 1 >>generated-test-data/large.file &&
- printf "$RANDOM_STRING" >>generated-test-data/large.file &&
- printf "%1048576d" 1 >>generated-test-data/large.file.rot13 &&
- printf "$ROT_RANDOM_STRING" >>generated-test-data/large.file.rot13 &&
-
- if test $i = $T0021_LARGISH_FILE_SIZE
- then
- cat generated-test-data/large.file >generated-test-data/largish.file &&
- cat generated-test-data/large.file.rot13 >generated-test-data/largish.file.rot13
- fi
- done
+ echo "content-test3-subdir" >test3-subdir.o
'
script='s/^\$Id: \([0-9a-f]*\) \$/\1/p'
@@ -230,9 +202,9 @@ test_expect_success 'required filter clean failure' '
test_expect_success 'filtering large input to small output should use little memory' '
test_config filter.devnull.clean "cat >/dev/null" &&
test_config filter.devnull.required true &&
- cp generated-test-data/large.file large.file &&
- echo "large.file filter=devnull" >.gitattributes &&
- GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add large.file
+ for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB &&
+ echo "30MB filter=devnull" >.gitattributes &&
+ GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB
'
test_expect_success 'filter that does not read is fine' '
@@ -245,15 +217,15 @@ test_expect_success 'filter that does not read is fine' '
test_cmp expect actual
'
-test_expect_success 'filter large file' '
+test_expect_success EXPENSIVE 'filter large file' '
test_config filter.largefile.smudge cat &&
test_config filter.largefile.clean cat &&
- echo "large.file filter=largefile" >.gitattributes &&
- cp generated-test-data/large.file large.file &&
- git add large.file 2>err &&
+ for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&
+ echo "2GB filter=largefile" >.gitattributes &&
+ git add 2GB 2>err &&
test_must_be_empty err &&
- rm -f large.file &&
- git checkout -- large.file 2>err &&
+ rm -f 2GB &&
+ git checkout -- 2GB 2>err &&
test_must_be_empty err
'
@@ -374,22 +346,24 @@ test_expect_success PERL 'required process filter should filter data' '
check_filter \
git add . \
<<-\EOF &&
- 1 IN: clean test.r 57 [OK] -- OUT: 57 [OK]
- 1 IN: clean test2.r 14 [OK] -- OUT: 14 [OK]
- 1 IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
- 1 IN: clean testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 [OK]
- 1 start
+ 1 IN: clean test.r 57 [OK] -- OUT: 57 . [OK]
+ 1 IN: clean test2.r 14 [OK] -- OUT: 14 . [OK]
+ 1 IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
+ 1 IN: clean testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 . [OK]
+ 1 START
+ 1 STOP
1 wrote filter header
EOF
check_filter_count_clean \
git commit . -m "test commit" \
<<-\EOF &&
- x IN: clean test.r 57 [OK] -- OUT: 57 [OK]
- x IN: clean test2.r 14 [OK] -- OUT: 14 [OK]
- x IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
- x IN: clean testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 [OK]
- 1 start
+ x IN: clean test.r 57 [OK] -- OUT: 57 . [OK]
+ x IN: clean test2.r 14 [OK] -- OUT: 14 . [OK]
+ x IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
+ x IN: clean testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 . [OK]
+ 1 START
+ 1 STOP
1 wrote filter header
EOF
@@ -398,28 +372,31 @@ test_expect_success PERL 'required process filter should filter data' '
check_filter_ignore_clean \
git checkout . \
<<-\EOF &&
- start
+ START
wrote filter header
- IN: smudge test2.r 14 [OK] -- OUT: 14 [OK]
- IN: smudge testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 [OK]
+ IN: smudge test2.r 14 [OK] -- OUT: 14 . [OK]
+ IN: smudge testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 . [OK]
+ STOP
EOF
check_filter_ignore_clean \
git checkout empty \
<<-\EOF &&
- start
+ START
wrote filter header
+ STOP
EOF
check_filter_ignore_clean \
git checkout master \
<<-\EOF &&
- start
+ START
wrote filter header
- IN: smudge test.r 57 [OK] -- OUT: 57 [OK]
- IN: smudge test2.r 14 [OK] -- OUT: 14 [OK]
- IN: smudge test4-empty.r 0 [OK] -- OUT: 0 [OK]
- IN: smudge testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 [OK]
+ IN: smudge test.r 57 [OK] -- OUT: 57 . [OK]
+ IN: smudge test2.r 14 [OK] -- OUT: 14 . [OK]
+ IN: smudge test4-empty.r 0 [OK] -- OUT: 0 [OK]
+ IN: smudge testsubdir/test3-subdir.r 21 [OK] -- OUT: 21 . [OK]
+ STOP
EOF
check_rot13 ../test.o test.r &&
@@ -428,57 +405,8 @@ test_expect_success PERL 'required process filter should filter data' '
)
'
-test_expect_success PERL 'required process filter should filter smudge data and one-shot filter should clean' '
+test_expect_success PERL 'required process filter should clean only and take precedence' '
test_config_global filter.protocol.clean ./../rot13.sh &&
- test_config_global filter.protocol.process "$TEST_DIRECTORY/t0021/rot13-filter.pl smudge" &&
- test_config_global filter.protocol.required true &&
- rm -rf repo &&
- mkdir repo &&
- (
- cd repo &&
- git init &&
-
- echo "*.r filter=protocol" >.gitattributes &&
- git add . &&
- git commit . -m "test commit" &&
- git branch empty &&
-
- cp ../test.o test.r &&
- cp ../test2.o test2.r &&
-
- check_filter_no_call \
- git add . &&
-
- check_filter_no_call \
- git commit . -m "test commit" &&
-
- rm -f test?.r testsubdir/test3-subdir.r &&
-
- check_filter_ignore_clean \
- git checkout . \
- <<-\EOF &&
- start
- wrote filter header
- IN: smudge test2.r 14 [OK] -- OUT: 14 [OK]
- EOF
-
- git checkout empty &&
-
- check_filter_ignore_clean \
- git checkout master\
- <<-\EOF &&
- start
- wrote filter header
- IN: smudge test.r 57 [OK] -- OUT: 57 [OK]
- IN: smudge test2.r 14 [OK] -- OUT: 14 [OK]
- EOF
-
- check_rot13 ../test.o test.r &&
- check_rot13 ../test2.o test2.r
- )
-'
-
-test_expect_success PERL 'required process filter should clean only' '
test_config_global filter.protocol.process "$TEST_DIRECTORY/t0021/rot13-filter.pl clean" &&
test_config_global filter.protocol.required true &&
rm -rf repo &&
@@ -497,42 +425,74 @@ test_expect_success PERL 'required process filter should clean only' '
check_filter \
git add . \
<<-\EOF &&
- 1 IN: clean test.r 57 [OK] -- OUT: 57 [OK]
- 1 start
+ 1 IN: clean test.r 57 [OK] -- OUT: 57 . [OK]
+ 1 START
+ 1 STOP
1 wrote filter header
EOF
check_filter_count_clean \
git commit . -m "test commit" \
<<-\EOF
- x IN: clean test.r 57 [OK] -- OUT: 57 [OK]
- 1 start
+ x IN: clean test.r 57 [OK] -- OUT: 57 . [OK]
+ 1 START
+ 1 STOP
1 wrote filter header
EOF
)
'
-test_expect_success PERL 'required process filter should process binary files larger LARGE_PACKET_MAX' '
+generate_test_data () {
+ LEN=$1
+ NAME=$2
+ test-genrandom end $LEN |
+ perl -pe "s/./chr((ord($&) % 26) + 97)/sge" >../$NAME.file &&
+ cp ../$NAME.file . &&
+ ./../rot13.sh <../$NAME.file >../$NAME.file.rot13
+}
+
+test_expect_success PERL 'required process filter should process multiple packets' '
test_config_global filter.protocol.process "$TEST_DIRECTORY/t0021/rot13-filter.pl clean smudge" &&
test_config_global filter.protocol.required true &&
+
rm -rf repo &&
mkdir repo &&
(
cd repo &&
git init &&
- echo "*.file filter=protocol" >.gitattributes &&
- cat ../generated-test-data/largish.file.rot13 >large.rot13 &&
- cat ../generated-test-data/largish.file >large.file &&
- cat large.file >large.original &&
+ # Generate data that requires 3 packets
+ PKTLINE_DATA_MAXLEN=65516 &&
+
+ generate_test_data $(($PKTLINE_DATA_MAXLEN )) 1pkt_1__ &&
+ generate_test_data $(($PKTLINE_DATA_MAXLEN + 1)) 2pkt_1+1 &&
+ generate_test_data $(($PKTLINE_DATA_MAXLEN * 2 - 1)) 2pkt_2-1 &&
+ generate_test_data $(($PKTLINE_DATA_MAXLEN * 2 )) 2pkt_2__ &&
+ generate_test_data $(($PKTLINE_DATA_MAXLEN * 2 + 1)) 3pkt_2+1 &&
- git add large.file .gitattributes &&
+ echo "*.file filter=protocol" >.gitattributes &&
+ check_filter \
+ git add *.file .gitattributes \
+ <<-\EOF &&
+ 1 IN: clean 1pkt_1__.file 65516 [OK] -- OUT: 65516 . [OK]
+ 1 IN: clean 2pkt_1+1.file 65517 [OK] -- OUT: 65517 .. [OK]
+ 1 IN: clean 2pkt_2-1.file 131031 [OK] -- OUT: 131031 .. [OK]
+ 1 IN: clean 2pkt_2__.file 131032 [OK] -- OUT: 131032 .. [OK]
+ 1 IN: clean 3pkt_2+1.file 131033 [OK] -- OUT: 131033 ... [OK]
+ 1 START
+ 1 STOP
+ 1 wrote filter header
+ EOF
git commit . -m "test commit" &&
- rm -f large.file &&
- git checkout -- large.file &&
- git cat-file blob :large.file >actual &&
- test_cmp large.rot13 actual
+ rm -f *.file &&
+ git checkout -- *.file &&
+
+ for f in *.file
+ do
+ git cat-file blob :$f >actual &&
+ test_cmp ../$f.rot13 actual
+ done
)
'
@@ -577,13 +537,14 @@ test_expect_success PERL 'process filter should restart after unexpected write f
check_filter_ignore_clean \
git checkout . \
<<-\EOF &&
- start
+ START
wrote filter header
IN: smudge smudge-write-fail.r 22 [OK] -- OUT: 22 [WRITE FAIL]
- start
+ START
wrote filter header
- IN: smudge test.r 57 [OK] -- OUT: 57 [OK]
- IN: smudge test2.r 14 [OK] -- OUT: 14 [OK]
+ IN: smudge test.r 57 [OK] -- OUT: 57 . [OK]
+ IN: smudge test2.r 14 [OK] -- OUT: 14 . [OK]
+ STOP
EOF
check_rot13 ../test.o test.r &&
@@ -617,11 +578,12 @@ test_expect_success PERL 'process filter should not restart in case of an error'
check_filter_ignore_clean \
git checkout . \
<<-\EOF &&
- start
+ START
wrote filter header
IN: smudge error.r 25 [OK] -- OUT: 0 [ERROR]
- IN: smudge test.r 57 [OK] -- OUT: 57 [OK]
- IN: smudge test2.r 14 [OK] -- OUT: 14 [OK]
+ IN: smudge test.r 57 [OK] -- OUT: 57 . [OK]
+ IN: smudge test2.r 14 [OK] -- OUT: 14 . [OK]
+ STOP
EOF
check_rot13 ../test.o test.r &&
@@ -642,8 +604,8 @@ test_expect_success PERL 'process filter should be able to signal an error for a
cp ../test.o test.r &&
cp ../test2.o test2.r &&
- echo "error this blob and all future blobs" >error-all.o &&
- cp error-all.o error-all.r &&
+ echo "error this blob and all future blobs" >abort.o &&
+ cp abort.o abort.r &&
git add . &&
git commit . -m "test commit" &&
rm -f *.r &&
@@ -651,14 +613,15 @@ test_expect_success PERL 'process filter should be able to signal an error for a
check_filter_ignore_clean \
git checkout . \
<<-\EOF &&
- start
+ START
wrote filter header
- IN: smudge error-all.r 37 [OK] -- OUT: 0 [ERROR-ALL]
+ IN: smudge abort.r 37 [OK] -- OUT: 0 [ABORT]
+ STOP
EOF
test_cmp ../test.o test.r &&
test_cmp ../test2.o test2.r &&
- test_cmp error-all.o error-all.r
+ test_cmp abort.o abort.r
)
'
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 0e1181d..8e27877 100755
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -14,7 +14,7 @@
# (3) If data with the pathname "error.r" is processed with any
# operation then the filter signals that it cannot or does not want
# to process the file.
-# (4) If data with the pathname "error-all.r" is processed with any
+# (4) If data with the pathname "abort.r" is processed with any
# operation then the filter signals that it cannot or does not want
# to process the file and any file after that is processed with the
# same command.
@@ -26,6 +26,8 @@ use warnings;
my $MAX_PACKET_CONTENT_SIZE = 65516;
my @capabilities = @ARGV;
+open my $debug, ">>", "rot13-filter.log";
+
sub rot13 {
my ($str) = @_;
$str =~ y/A-Za-z/N-ZA-Mn-za-m/;
@@ -38,6 +40,7 @@ sub packet_read {
if ( $bytes_read == 0 ) {
# EOF - Git stopped talking to us!
+ print $debug "STOP\n";
exit();
}
elsif ( $bytes_read != 4 ) {
@@ -72,8 +75,7 @@ sub packet_flush {
STDOUT->flush();
}
-open my $debug, ">>", "rot13-filter.log";
-print $debug "start\n";
+print $debug "START\n";
$debug->flush();
( packet_read() eq ( 0, "git-filter-client" ) ) || die "bad initialization";
@@ -120,7 +122,7 @@ while (1) {
}
my $output;
- if ( $pathname eq "error.r" or $pathname eq "error-all.r" ) {
+ if ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
$output = "";
}
elsif ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) {
@@ -142,10 +144,10 @@ while (1) {
packet_write("status=error\n");
packet_flush();
}
- elsif ( $pathname eq "error-all.r" ) {
- print $debug "[ERROR-ALL]\n";
+ elsif ( $pathname eq "abort.r" ) {
+ print $debug "[ABORT]\n";
$debug->flush();
- packet_write("status=error-all\n");
+ packet_write("status=abort\n");
packet_flush();
}
else {
@@ -161,6 +163,7 @@ while (1) {
while ( length($output) > 0 ) {
my $packet = substr( $output, 0, $MAX_PACKET_CONTENT_SIZE );
packet_write($packet);
+ print $debug ".";
if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) {
$output = substr( $output, $MAX_PACKET_CONTENT_SIZE );
}
@@ -169,7 +172,7 @@ while (1) {
}
}
packet_flush();
- print $debug "[OK]\n";
+ print $debug " [OK]\n";
$debug->flush();
packet_flush();
}
Lars Schneider (10):
pkt-line: rename packet_write() to packet_write_fmt()
pkt-line: extract set_packet_header()
pkt-line: add packet_write_fmt_gently()
pkt-line: add packet_flush_gently()
pkt-line: add packet_write_gently()
pkt-line: add functions to read/write flush terminated packet streams
convert: quote filter names in error messages
convert: modernize tests
convert: make apply_filter() adhere to standard Git error handling
convert: add filter.<driver>.process option
Documentation/gitattributes.txt | 154 +++++++++++-
builtin/archive.c | 4 +-
builtin/receive-pack.c | 4 +-
builtin/remote-ext.c | 4 +-
builtin/upload-archive.c | 4 +-
connect.c | 2 +-
contrib/long-running-filter/example.pl | 111 +++++++++
convert.c | 373 +++++++++++++++++++++++++----
daemon.c | 2 +-
http-backend.c | 2 +-
pkt-line.c | 160 ++++++++++++-
pkt-line.h | 12 +-
shallow.c | 2 +-
t/t0021-conversion.sh | 423 ++++++++++++++++++++++++++++++---
t/t0021/rot13-filter.pl | 179 ++++++++++++++
unpack-trees.c | 1 +
upload-pack.c | 30 +--
17 files changed, 1356 insertions(+), 111 deletions(-)
create mode 100755 contrib/long-running-filter/example.pl
create mode 100755 t/t0021/rot13-filter.pl
--
2.10.0
next reply other threads:[~2016-09-08 18:21 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-09-08 18:21 larsxschneider [this message]
2016-09-08 18:21 ` [PATCH v7 01/10] pkt-line: rename packet_write() to packet_write_fmt() larsxschneider
2016-09-08 18:21 ` [PATCH v7 02/10] pkt-line: extract set_packet_header() larsxschneider
2016-09-08 18:21 ` [PATCH v7 03/10] pkt-line: add packet_write_fmt_gently() larsxschneider
2016-09-08 21:18 ` Stefan Beller
2016-09-11 11:36 ` Lars Schneider
2016-09-11 16:01 ` Stefan Beller
2016-09-12 9:22 ` Lars Schneider
2016-09-08 18:21 ` [PATCH v7 04/10] pkt-line: add packet_flush_gently() larsxschneider
2016-09-12 23:30 ` Junio C Hamano
2016-09-13 22:12 ` Lars Schneider
2016-09-13 22:44 ` Junio C Hamano
2016-09-15 16:42 ` Lars Schneider
2016-09-15 19:44 ` Jeff King
2016-09-15 20:19 ` Lars Schneider
2016-09-15 20:33 ` Junio C Hamano
2016-09-08 18:21 ` [PATCH v7 05/10] pkt-line: add packet_write_gently() larsxschneider
2016-09-08 21:24 ` Stefan Beller
2016-09-11 11:44 ` Lars Schneider
2016-09-12 23:31 ` Junio C Hamano
2016-09-08 18:21 ` [PATCH v7 06/10] pkt-line: add functions to read/write flush terminated packet streams larsxschneider
2016-09-08 21:49 ` Stefan Beller
2016-09-11 12:33 ` Lars Schneider
2016-09-11 16:03 ` Stefan Beller
2016-09-11 21:42 ` Junio C Hamano
2016-09-08 18:21 ` [PATCH v7 07/10] convert: quote filter names in error messages larsxschneider
2016-09-08 18:21 ` [PATCH v7 08/10] convert: modernize tests larsxschneider
2016-09-08 22:05 ` Stefan Beller
2016-09-11 12:34 ` Lars Schneider
2016-09-08 18:21 ` [PATCH v7 09/10] convert: make apply_filter() adhere to standard Git error handling larsxschneider
2016-09-08 18:21 ` [PATCH v7 10/10] convert: add filter.<driver>.process option larsxschneider
2016-09-10 6:29 ` Torsten Bögershausen
2016-09-12 9:49 ` Lars Schneider
2016-09-13 14:44 ` Torsten Bögershausen
2016-09-13 16:42 ` Junio C Hamano
2016-09-15 17:23 ` Lars Schneider
2016-09-15 20:04 ` Junio C Hamano
2016-09-29 6:33 ` Torsten Bögershausen
2016-09-29 9:37 ` Jakub Narębski
2016-09-10 16:40 ` Torsten Bögershausen
2016-09-13 22:04 ` Lars Schneider
2016-09-13 15:22 ` Junio C Hamano
2016-09-15 20:16 ` Lars Schneider
2016-09-15 20:24 ` Junio C Hamano
2016-09-13 16:00 ` [PATCH v7 00/10] Git filter protocol Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160908182132.50788-1-larsxschneider@gmail.com \
--to=larsxschneider@gmail.com \
--cc=Johannes.Schindelin@gmx.de \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jacob.keller@gmail.com \
--cc=jnareb@gmail.com \
--cc=mlbright@gmail.com \
--cc=peff@peff.net \
--cc=sbeller@google.com \
--cc=tboegi@web.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).