From: Lars Schneider <larsxschneider@gmail.com>
To: "Jakub Narębski" <jnareb@gmail.com>
Cc: "Git Mailing List" <git@vger.kernel.org>,
"Junio C Hamano" <gitster@pobox.com>,
"Torsten Bögershausen" <tboegi@web.de>,
"Martin-Louis Bright" <mlbright@gmail.com>,
"Eric Wong" <e@80x24.org>, "Jeff King" <peff@peff.net>
Subject: Re: [PATCH v3 10/10] convert: add filter.<driver>.process option
Date: Wed, 3 Aug 2016 15:10:19 +0200 [thread overview]
Message-ID: <9DDA993E-2AFD-4C69-8E22-58601EEC8A40@gmail.com> (raw)
In-Reply-To: <2f4743d1-3c93-406d-8b44-da0eb075e65c@gmail.com>
> On 01 Aug 2016, at 00:19, Jakub Narębski <jnareb@gmail.com> wrote:
>
> W dniu 30.07.2016 o 01:38, larsxschneider@gmail.com pisze:
>
[LONG SNIP]
First part answered here:
http://public-inbox.org/git/5180D54D-92C4-4875-AEB3-801663D70A8B%40gmail.com/
>
>> + }
>> + process = &entry->process;
>> +
>> + if (!(wanted_capability & entry->supported_capabilities))
>> + return 1; // it is OK if the wanted capability is not supported
>> +
>> + if FILTER_SUPPORTS_CLEAN(wanted_capability)
>> + filter_type = "clean";
>> + else if FILTER_SUPPORTS_SMUDGE(wanted_capability)
>> + filter_type = "smudge";
>> + else
>> + die("unexpected filter type");
>
> Style: it should be
>
> + if (FILTER_SUPPORTS_CLEAN(wanted_capability))
> + filter_type = "clean";
> + else if (FILTER_SUPPORTS_SMUDGE(wanted_capability))
> + filter_type = "smudge";
> + else
> + die("unexpected filter type");
>
> even though by accident the macro provides the parentheses to "if".
Agreed.
> Can we make an error/die message more detailed? Maybe it is
> not possible...
Yeah, I don't see an easy way...
>
>> +
>> + if (fd >= 0 && !src) {
>> + if (fstat(fd, &file_stat) == -1)
>> + return 0;
>> + len = file_stat.st_size;
>> + }
>
> All right, when fstat() can fail? Could we then send contents without
> size upfront, or is it better to require size to make it more consistent
> for filter drivers scripts?
If fstat() fails then there is clearly something wrong and the filter
should fail.
> Could this whole "send single file" be put in a separate function?
> Or is it not worth it?
This function would have almost the same signature as apply_protocol2_filter
and therefore I would say it's not worth it since the function is not
crazy long.
>
>> +
>> + sigchain_push(SIGPIPE, SIG_IGN);
>
> Hmmm... ignoring SIGPIPE was good for one-shot filters. Is it still
> O.K. for per-command persistent ones?
Very good question. You are right... we don't want to ignore any errors
during the protocol... I will remove it.
>
>> +
>> + packet_buf_write(&nbuf, "%s\n", filter_type);
>> + ret &= !direct_packet_write(process->in, nbuf.buf, nbuf.len, 1);
>> +
>> + if (ret) {
>> + strbuf_reset(&nbuf);
>> + packet_buf_write(&nbuf, "filename=%s\n", path);
>> + ret = !direct_packet_write(process->in, nbuf.buf, nbuf.len, 1);
>> + }
>
> Perhaps a better solution would be
>
> if (err)
> goto fin_error;
>
> rather than this.
OK, I change it to goto error handling style.
>
>> +
>> + if (ret) {
>> + strbuf_reset(&nbuf);
>> + packet_buf_write(&nbuf, "size=%"PRIuMAX"\n", (uintmax_t)len);
>> + ret = !direct_packet_write(process->in, nbuf.buf, nbuf.len, 1);
>> + }
>
> Or maybe extract writing the header for a file into a separate function?
> This one gets a bit long...
Maybe... but I think that would make it harder to understand the protocol. I
think I would prefer to have all the communication in one function layer.
>> +
>> + if (ret) {
>> + if (fd >= 0)
>> + ret = !multi_packet_write_from_fd(fd, process->in);
>> + else
>> + ret = !multi_packet_write_from_buf(src, len, process->in);
>> + }
>
> This is not streaming. The above sends whole file, or whole string to
> the filter process, without draining filter output. If the filter were
> to read some, then write some, it might deadlock on full buffers, isn't it?
> Or am I mistaken?
Correct.
>> +
>> + if (ret && !FILTER_SUPPORTS_STREAM(entry->supported_capabilities)) {
>> + strbuf = packet_read_line(process->out, NULL);
>> + if (strlen(strbuf) > 5 && !strncmp("size=", strbuf, 5)) {
>> + expected_bytes = (off_t)strtol(strbuf + 5, &strtol_end, 10);
>> + ret = (strtol_end != strbuf && errno != ERANGE);
>> + } else {
>> + ret = 0;
>> + }
>> + }
>> +
>> + if (ret) {
>> + strbuf_reset(&nbuf);
>> + ret = !multi_packet_read(process->out, &nbuf, expected_bytes,
>> + FILTER_SUPPORTS_STREAM(entry->supported_capabilities));
>> + }
>
> What happens if the output of filter does not fit in size_t? I see that
> (I think) this problem is inherited from the original implementation.
Correct. And therefore I would prefer not to change this in this series.
>> +
>> + if (ret) {
>> + filter_result = packet_read_line(process->out, NULL);
>> + ret = !strcmp(filter_result, "success");
>> + }
>> +
>> + sigchain_pop(SIGPIPE);
>> +
>> + if (ret) {
>> + strbuf_swap(dst, &nbuf);
>> + } else {
>> + if (!filter_result || strcmp(filter_result, "reject")) {
>> + // Something went wrong with the protocol filter. Force shutdown!
>> + error("external filter '%s' failed", cmd);
>> + kill_protocol2_filter(&cmd_process_map, entry);
>> + }
>> + }
>
> So if Git gets finish signal "success" from filter, it accepts the output.
> If Git gets finish signal "reject" from filter, it restarts filter (and
> reject the output - user can retry the command himself / herself).
> If Git gets any other finish signal, for example "error" (but this is not
> standarized), then it rejects the output, keeping the unfiltered result,
> but keeps filtering.
>
> I think it is not described in this detail in the documentation of the
> new protocol.
Agreed, will add!
>
>> + strbuf_release(&nbuf);
>> + return ret;
>> +}
>
> I wonder if this point might be start of the new patch... but then you
> would have no way to test what you wrote.
>
>> +
>> static struct convert_driver {
>> const char *name;
>> struct convert_driver *next;
>> const char *smudge;
>> const char *clean;
>> + const char *process;
>> int required;
>> } *user_convert, **user_convert_tail;
>
> All right.
>
>>
>> @@ -526,6 +871,10 @@ static int read_convert_config(const char *var, const char *value, void *cb)
>> if (!strcmp("clean", key))
>> return git_config_string(&drv->clean, var, value);
>>
>> + if (!strcmp("process", key)) {
>> + return git_config_string(&drv->process, var, value);
>> + }
>> +
>
> All right.
>
>> if (!strcmp("required", key)) {
>> drv->required = git_config_bool(var, value);
>> return 0;
>> @@ -823,7 +1172,12 @@ int would_convert_to_git_filter_fd(const char *path)
>> if (!ca.drv->required)
>> return 0;
>>
>> - return apply_filter(path, NULL, 0, -1, NULL, ca.drv->clean);
>> + if (!ca.drv->clean && ca.drv->process)
>> + return apply_protocol2_filter(
>> + path, NULL, 0, -1, NULL, ca.drv->process, FILTER_CAPABILITIES_CLEAN
>> + );
>> + else
>> + return apply_filter(path, NULL, 0, -1, NULL, ca.drv->clean);
>
> Could we augment apply_filter() instead, so that the invocation is
>
> return apply_filter(path, NULL, 0, -1, NULL, ca.drv, FILTER_CLEAN);
>
> Though I am not sure if moving this conditional to apply_filter would
> be a good idea; maybe wrapper around augmented apply_filter_do()?
Yes, a wrapper makes it way cleaner!
>> }
>>
>> const char *get_convert_attr_ascii(const char *path)
>> @@ -856,17 +1210,24 @@ int convert_to_git(const char *path, const char *src, size_t len,
>> struct strbuf *dst, enum safe_crlf checksafe)
>> {
>> int ret = 0;
>> - const char *filter = NULL;
>> + const char *clean_filter = NULL;
>> + const char *process_filter = NULL;
>> int required = 0;
>> struct conv_attrs ca;
>>
>> convert_attrs(&ca, path);
>> if (ca.drv) {
>> - filter = ca.drv->clean;
>> + clean_filter = ca.drv->clean;
>> + process_filter = ca.drv->process;
>> required = ca.drv->required;
>> }
>
> All right (assuming un-augmented apply_filter()).
>
>>
>> - ret |= apply_filter(path, src, len, -1, dst, filter);
>> + if (!clean_filter && process_filter)
>> + ret |= apply_protocol2_filter(
>> + path, src, len, -1, dst, process_filter, FILTER_CAPABILITIES_CLEAN
>> + );
>> + else
>> + ret |= apply_filter(path, src, len, -1, dst, clean_filter);
>
> I wonder if it would be more readable to write it like this
> (and of course elsewhere too):
>
> + if (!clean_filter && process_filter)
> + ret |= apply_protocol2_filter(
> + path, src, len, -1, dst, process_filter, FILTER_CAPABILITIES_CLEAN
> + );
> + else
> + ret |= apply_filter(
> + path, src, len, -1, dst, clean_filter);
> + );
>
>
> Though it would screw up "git blame -C -C -w"
Obsolete with the wrapper mentioned above.
>> if (!ret && required)
>> die("%s: clean filter '%s' failed", path, ca.drv->name);
>>
>> @@ -885,13 +1246,21 @@ int convert_to_git(const char *path, const char *src, size_t len,
>> void convert_to_git_filter_fd(const char *path, int fd, struct strbuf *dst,
>> enum safe_crlf checksafe)
>> {
>> + int ret = 0;
>
> Right, 'ret' is needed because we now have two possibilities:
> `clean` filter and `process` filter.
>
>> struct conv_attrs ca;
>> convert_attrs(&ca, path);
>>
>> assert(ca.drv);
>> - assert(ca.drv->clean);
>> + assert(ca.drv->clean || ca.drv->process);
>> +
>> + if (!ca.drv->clean && ca.drv->process)
>> + ret = apply_protocol2_filter(
>> + path, NULL, 0, fd, dst, ca.drv->process, FILTER_CAPABILITIES_CLEAN
>> + );
>> + else
>> + ret = apply_filter(path, NULL, 0, fd, dst, ca.drv->clean);
>>
>> - if (!apply_filter(path, NULL, 0, fd, dst, ca.drv->clean))
>> + if (!ret)
>> die("%s: clean filter '%s' failed", path, ca.drv->name);
>>
>> crlf_to_git(path, dst->buf, dst->len, dst, ca.crlf_action, checksafe);
>> @@ -902,14 +1271,16 @@ static int convert_to_working_tree_internal(const char *path, const char *src,
>> size_t len, struct strbuf *dst,
>> int normalizing)
>> {
>> - int ret = 0, ret_filter = 0;
>> - const char *filter = NULL;
>> + int ret = 0, ret_filter;
>
> Why the change:
>
> - int ret = 0, ret_filter = 0;
> + int ret = 0, ret_filter;
Reverted with the wrapper.
>> + const char *smudge_filter = NULL;
>> + const char *process_filter = NULL;
>> int required = 0;
>> struct conv_attrs ca;
>>
>> convert_attrs(&ca, path);
>> if (ca.drv) {
>> - filter = ca.drv->smudge;
>> + process_filter = ca.drv->process;
>> + smudge_filter = ca.drv->smudge;
>> required = ca.drv->required;
>> }
>
> All right, the same.
>
> [...]
>> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
>> index 34c8eb9..e8a7703 100755
>> --- a/t/t0021-conversion.sh
>> +++ b/t/t0021-conversion.sh
>> @@ -296,4 +296,409 @@ test_expect_success 'disable filter with empty override' '
>> test_must_be_empty err
>> '
>>
>> +test_expect_success PERL 'required process filter should filter data' '
>> + test_config_global filter.protocol.process "$TEST_DIRECTORY/t0021/rot13-filter.pl clean smudge shutdown" &&
>> + test_config_global filter.protocol.required true &&
>> + rm -rf repo &&
>> + mkdir repo &&
>> + (
>> + cd repo &&
>> + git init &&
>> +
>> + echo "*.r filter=protocol" >.gitattributes &&
>> + git add . &&
>> + git commit . -m "test commit" &&
>
> This is more of "Initial commit", not that it matters
>
>> + git branch empty &&
>> +
>> + cat ../test.o >test.r &&
>
> Err, the above is just copying file, isn't it?
> Maybe it was copied from other tests, I have not checked.
It was created in the "setup" test.
>> + echo "test22" >test2.r &&
>> + mkdir testsubdir &&
>> + echo "test333" >testsubdir/test3.r &&
>
> All right, we test text file, we test binary file (I assume), we test
> file in a subdirectory. What about testing empty file? Or large file
> which would not fit in the stdin/stdout buffer (as EXPENSIVE test)?
No binary file. The main reason for this test is to check multiple files.
I'll add a empty file. A large file is tested in the next test.
>
>> +
>> + rm -f rot13-filter.log &&
>> + git add . &&
>
> So this runs "clean" filter, storing cleaned contents in the index.
Correct.
>> + sort rot13-filter.log | uniq -c | sed "s/^[ ]*//" >uniq-rot13-filter.log &&
>> + cat >expected_add.log <<-\EOF &&
>> + 1 IN: clean test.r 57 [OK] -- OUT: 57 [OK]
>> + 1 IN: clean test2.r 7 [OK] -- OUT: 7 [OK]
>> + 1 IN: clean testsubdir/test3.r 8 [OK] -- OUT: 8 [OK]
>
> And we check the "know size upfront" case (mistakenly called non-"stream").
Correct - however, I removed non-stream
>> + 1 IN: shutdown -- [OK]
>
> And test "shutdown" capability (not as separate test).
Fixed.
>> + 1 start
>> + 1 wrote filter header
>> + EOF
>
> And we are required to keep the expected_add.log file sorted by hand???
Well, the clean invocations (and therefore their order of appearance)
are not deterministic. See my discussion with Junio here:
http://public-inbox.org/git/xmqqshv18i8i.fsf%40gitster.mtv.corp.google.com/
>
>> + test_cmp expected_add.log uniq-rot13-filter.log &&
>> +
>> + >rot13-filter.log &&
>
> Truncate log. Still in the same test.
>
>> + git commit . -m "test commit" &&
>
> This is test commit with files undergoing "clean" part of filter.
>
>> + sort rot13-filter.log | uniq -c | sed "s/^[ ]*//" |
>> + sed "s/^\([0-9]\) IN: clean/x IN: clean/" >uniq-rot13-filter.log &&
>
> There is known performance regression, in that filter is run more
> than once on given file.
>
> Actually... why it does not use cleaned-up contents from the index?
See discussion here: http://public-inbox.org/git/20160722152753.GA6859%40sigill.intra.peff.net/
>> + cat >expected_commit.log <<-\EOF &&
>> + x IN: clean test.r 57 [OK] -- OUT: 57 [OK]
>> + x IN: clean test2.r 7 [OK] -- OUT: 7 [OK]
>> + x IN: clean testsubdir/test3.r 8 [OK] -- OUT: 8 [OK]
>> + 1 IN: shutdown -- [OK]
>> + 1 start
>> + 1 wrote filter header
>
> Right, this is the goal of the patch series: for filter to be started
> only once per git command invocation.
>
>> + EOF
>> + test_cmp expected_commit.log uniq-rot13-filter.log &&
>> +
>
> Still in the same test, even though we would be testing "smudge"
> capability now.
>
> It's a pity that t/test-lib.sh does not support subtests from
> the TAP specification (Test Anything Protocol that Git testsuite
> uses).
>
>> + >rot13-filter.log &&
>> + rm -f test?.r testsubdir/test3.r &&
>> + git checkout . &&
>
> All right, we removed some files so that "git checkout ." could
> restore them to life.
>
>> + cat rot13-filter.log | grep -v "IN: clean" >smudge-rot13-filter.log &&
>
> Useless use of cat
>
> + grep -v "IN: clean" rot13-filter.log >smudge-rot13-filter.log &&
Fixed, thanks!
> Also: why 'git checkout <path>' would run "clean" filter?
> Is it existing strange behaviour?
AFAIK, that's existing behavior.
>> + cat >expected_checkout.log <<-\EOF &&
>> + start
>> + wrote filter header
>> + IN: smudge test2.r 7 [OK] -- OUT: 7 [OK]
>> + IN: smudge testsubdir/test3.r 8 [OK] -- OUT: 8 [OK]
>> + IN: shutdown -- [OK]
>> + EOF
>
> This time without 'sort | uniq -c'.
Yes, because the smudge calls are deterministic!
> Is it really needed for the
> "good" case, or is it there for two cases to look similar?
I am not sure what you mean?!
>> + test_cmp expected_checkout.log smudge-rot13-filter.log &&
>> +
>> + git checkout empty &&
>
> Shouldn't we check that switching to branch 'empty' does not run
> filters, or is it covered by other tests? Or perhaps this simply
> does not matter here, is it?
Easy enough to check. I will add this.
>
>> +
>> + >rot13-filter.log &&
>> + git checkout master &&
>
> Does it test different callpath than 'git checkout .'? Well, the
> set of files is different...
>
>> + cat rot13-filter.log | grep -v "IN: clean" >smudge-rot13-filter.log &&
>> + cat >expected_checkout_master.log <<-\EOF &&
>> + start
>> + wrote filter header
>> + IN: smudge test.r 57 [OK] -- OUT: 57 [OK]
>> + IN: smudge test2.r 7 [OK] -- OUT: 7 [OK]
>> + IN: smudge testsubdir/test3.r 8 [OK] -- OUT: 8 [OK]
>> + IN: shutdown -- [OK]
>> + EOF
>> + test_cmp expected_checkout_master.log smudge-rot13-filter.log &&
>> +
>
> And here we start checking that the filter did filter,
> that is the content in the repository is "clean"ed-up.
> Still the same test.
>
>> + ./../rot13.sh <test.r >expected &&
>> + git cat-file blob :test.r >actual &&
>> + test_cmp expected actual &&
>> +
>> + ./../rot13.sh <test2.r >expected &&
>> + git cat-file blob :test2.r >actual &&
>> + test_cmp expected actual &&
>> +
>> + ./../rot13.sh <testsubdir/test3.r >expected &&
>> + git cat-file blob :testsubdir/test3.r >actual &&
>> + test_cmp expected actual
>> + )
>> +'
>> +
>> +test_expect_success PERL 'required process filter should filter data stream' '
>> + test_config_global filter.protocol.process "$TEST_DIRECTORY/t0021/rot13-filter.pl stream clean smudge" &&
>> + test_config_global filter.protocol.required true &&
>
> Errr... I don't see how it is different from the previous test.
> [...]
stream/non-stream ... but this is obsolete in the next roll. fixed!
>
>> +
>> +test_expect_success PERL 'required process filter should filter smudge data and one-shot filter should clean' '
>
> All right, so this tests the precedence... well, it doesn't.
>
> It tests that `process` filter with "smudge" capability only works well
> with one-shot `clean` filter.
True. Isn't that what the test description indicates?
>> + test_config_global filter.protocol.clean ./../rot13.sh &&
>> + test_config_global filter.protocol.process "$TEST_DIRECTORY/t0021/rot13-filter.pl smudge" &&
>
> Why the difference in pathnames (the directory part) between those two?
rot13.sh is generated in the header of the file.
rot13-filter.pl is part of the test suite
>> + test_config_global filter.protocol.required true &&
>> + rm -rf repo &&
>> + mkdir repo &&
>> + (
>> + cd repo &&
>> + git init &&
>> +
>> + echo "*.r filter=protocol" >.gitattributes &&
>> + git add . &&
>> + git commit . -m "test commit" &&
>> + git branch empty &&
>> +
>> + cat ../test.o >test.r &&
>> + echo "test22" >test2.r &&
>> + mkdir testsubdir &&
>> + echo "test333" >testsubdir/test3.r &&
>> +
>> + rm -f rot13-filter.log &&
>> + git add . &&
>> + test_must_be_empty rot13-filter.log &&
>> +
>> + >rot13-filter.log &&
>> + git commit . -m "test commit" &&
>> + test_must_be_empty rot13-filter.log &&
>
> All right, these tests that `process` filter is not ran. But we don't
> know if it is because it lacks capability, or because it is overriden
> by one-shot filter (well, that comes later).
Only the clean one shot filter is configured. Therefore that shouldn't be a
problem, right?
>> +
>> + >rot13-filter.log &&
>> + rm -f test?.r testsubdir/test3.r &&
>> + git checkout . &&
>> + cat rot13-filter.log | grep -v "IN: clean" >smudge-rot13-filter.log &&
>> + cat >expected_checkout.log <<-\EOF &&
>> + start
>> + wrote filter header
>> + IN: smudge test2.r 7 [OK] -- OUT: 7 [OK]
>> + IN: smudge testsubdir/test3.r 8 [OK] -- OUT: 8 [OK]
>> + EOF
>> + test_cmp expected_checkout.log smudge-rot13-filter.log &&
>
> This part is repeated many, many times. Maybe add some helper
> shell function for this?
Good idea! Will add!
> [...]
>> + ./../rot13.sh <test.r >expected &&
>> + git cat-file blob :test.r >actual &&
>> + test_cmp expected actual &&
>> +
>> + ./../rot13.sh <test2.r >expected &&
>> + git cat-file blob :test2.r >actual &&
>> + test_cmp expected actual &&
>> +
>> + ./../rot13.sh <testsubdir/test3.r >expected &&
>> + git cat-file blob :testsubdir/test3.r >actual &&
>> + test_cmp expected actual
>
> Here we test that equivalent one-shot cleanup filter was run.
> Here also we have repeated contents; maybe some helper function
> would make it shorter?
Agreed!
>> + )
>> +'
>
> Here I am stopping examining tests in detail.
>
>> +test_expect_success PERL 'required process filter should clean only' '
>> +test_expect_success PERL 'required process filter should process files larger LARGE_PACKET_MAX' '
>
> Those two tests do not depend on being required or not; it is only
> that without required they would fail softly in case of latter test
> (which we can detect too).
True, but since they fail hard it is easier to check.
>> +test_expect_success PERL 'required process filter should with clean error should fail' '
>> +test_expect_success PERL 'process filter should restart after unexpected write failure' '
>
> So these two are sort of complimentary. When `process` is required,
> then it should fail if it cannot filter some file. If it is not,
> it should keep processing other files.
True.
>> +test_expect_success PERL 'process filter should not restart after intentionally rejected file' '
>
> Uh... all right, so "reject" means that filter cannot continue?
> Strange meaning for 'reject', though ;-)
No, with reject a filter can say "I don't want to process that file". This is a legitimate
response and I don't Git to restart the filter in that case.
>> test_done
>> diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
>> new file mode 100755
>> index 0000000..cb0925d
>> --- /dev/null
>> +++ b/t/t0021/rot13-filter.pl
>> @@ -0,0 +1,177 @@
>> +#!/usr/bin/perl
>> +#
>> +# Example implementation for the Git filter protocol version 2
>> +# See Documentation/gitattributes.txt, section "Filter Protocol"
>> +#
>> +# The script takes the list of supported protocol capabilities as
>> +# arguments ("stream", "clean", and "smudge" are supported).
>
> What about "shutdown"?
Will fix.
>> +#
>> +# This implementation supports three special test cases:
>> +# (1) If data with the filename "clean-write-fail.r" is processed with
>> +# a "clean" operation then the write operation will die.
>> +# (2) If data with the filename "smudge-write-fail.r" is processed with
>> +# a "smudge" operation then the write operation will die.
>
> All right, so it is hard failure with filter script dying.
Correct.
>
>> +# (3) If data with the filename "failure.r" is processed with any
>> +# operation then the filter signals that the operation was not
>> +# successful.
>
> All right, so it is failure detected by filter script and signalled to Git.
>
>> +#
>> +
>> +use strict;
>> +use warnings;
>
> So no more "use autodie", because of compatibility with old Perls.
>
>> +
>> +my $MAX_PACKET_CONTENT_SIZE = 65516;
>> +my @capabilities = @ARGV;
>
> No autoflush this time?
Eric recommended to disable it:
http://public-inbox.org/git/20160723072721.GA20875%40starla/
>> +
>> +sub rot13 {
>> + my ($str) = @_;
>> + $str =~ y/A-Za-z/N-ZA-Mn-za-m/;
>> + return $str;
>> +}
>> +
>> +sub packet_read {
>> + my $buffer;
>> + my $bytes_read = read STDIN, $buffer, 4;
>> + if ( $bytes_read == 0 ) {
>> + return;
>> + }
>> + elsif ( $bytes_read != 4 ) {
>> + die "invalid packet size '$bytes_read' field";
>> + }
>> + my $pkt_size = hex($buffer);
>> + if ( $pkt_size == 0 ) {
>> + return ( 1, "" );
>
> Unusual return convention. Though it is a test script, so
> it doesn't matter much.
>
>> + }
>> + elsif ( $pkt_size > 4 ) {
>> + my $content_size = $pkt_size - 4;
>> + $bytes_read = read STDIN, $buffer, $content_size;
>> + if ( $bytes_read != $content_size ) {
>> + die "invalid packet";
>
> More detailed error message, maybe?
OK
>> + }
>> + return ( 0, $buffer );
>> + }
>> + else {
>> + die "invalid packet size";
>> + }
>> +}
>> +
>> +sub packet_write {
>> + my ($packet) = @_;
>> + print STDOUT sprintf( "%04x", length($packet) + 4 );
>> + print STDOUT $packet;
>> + STDOUT->flush();
>> +}
>> +
>> +sub packet_flush {
>> + print STDOUT sprintf( "%04x", 0 );
>> + STDOUT->flush();
>> +}
>> +
>> +open my $debug, ">>", "rot13-filter.log";
>> +print $debug "start\n";
>> +$debug->flush();
>> +
>> +packet_write("git-filter-protocol\n");
>> +packet_write("version 2\n");
>> +packet_write( "capabilities " . join( ' ', @capabilities ) . "\n" );
>> +print $debug "wrote filter header\n";
>> +$debug->flush();
>> +
>> +while (1) {
>> + my $command = packet_read();
>> + unless ( defined($command) ) {
>> + exit();
>> + }
>> + chomp $command;
>> + print $debug "IN: $command";
>> + $debug->flush();
>> +
>> + if ( $command eq "shutdown" ) {
>> + print $debug " -- [OK]";
>> + $debug->flush();
>> + packet_write("done\n");
>> + exit();
>> + }
>> +
>> + my ($filename) = packet_read() =~ /filename=([^=]+)\n/;
>> + print $debug " $filename";
>> + $debug->flush();
>> + my ($filelen) = packet_read() =~ /size=([^=]+)\n/;
>> + chomp $filelen;
>
> I think this chomp is not needed, as "\n" is not included.
> Though the regexp should probably be anchored.
Agreed.
>> + print $debug " $filelen";
>> + $debug->flush();
>> +
>> + $filelen =~ /\A\d+\z/ or die "bad filelen: $filelen";
>> + my $output;
>> +
>> + if ( $filelen > 0 ) {
>
> So here is a special case for $filelen = 0.
> Negative $filelen is not allowed, via regexp.
Obsolete in v4.
>> + my $input = "";
>> + {
>> + binmode(STDIN);
>> + my $buffer;
>> + my $done = 0;
>> + while ( !$done ) {
>> + ( $done, $buffer ) = packet_read();
>> + $input .= $buffer;
>> + }
>> + print $debug " [OK] -- ";
>> + $debug->flush();
>> + }
>> +
>> + if ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) {
>> + $output = rot13($input);
>> + }
>> + elsif ( $command eq "smudge" and grep( /^smudge$/, @capabilities ) ) {
>> + $output = rot13($input);
>> + }
>
> These two conditionals could be shortened, but then they would be less
> readable. Or not:
>
> if ( grep { $_ eq $command } @capabilities ) {
> $output = rot13($input);
> }
I would like to keep it that way for readability since
the test script also serves as example implementation.
>> + else {
>> + die "bad command $command";
>> + }
>> + }
>> +
>> + my $output_len = length($output);
>> + if ( $filename eq "reject.r" ) {
>> + $output_len = 0;
>> + }
>> +
>> + if ( grep( /^stream$/, @capabilities ) ) {
>> + print $debug "OUT: STREAM ";
>> + }
>> + else {
>> + packet_write("size=$output_len\n");
>> + print $debug "OUT: $output_len ";
>> + }
>> + $debug->flush();
>> +
>> + if ( $filename eq "reject.r" ) {
>> + packet_write("reject\n");
>> + print $debug "[REJECT]\n"; # Could also be an error
>
> How if could be an error?
Removed.
>
>> + $debug->flush();
>> + }
>> +
>> + if ( $output_len > 0 ) {
>> + if (( $command eq "clean" and $filename eq "clean-write-fail.r" )
>> + or
>> + ( $command eq "smudge" and $filename eq "smudge-write-fail.r" ))
>
> Perhaps simply:
>
> + if ( $filename eq "${command}-write-fail.r" ) {
Nice! Will fix!
>> + {
>> + print $debug "[WRITE FAIL]\n";
>> + $debug->flush();
>> + die "write error";
>> + }
>> + else {
>> + while ( length($output) > 0 ) {
>> + my $packet = substr( $output, 0, $MAX_PACKET_CONTENT_SIZE );
>> + packet_write($packet);
>> + if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) {
>> + $output = substr( $output, $MAX_PACKET_CONTENT_SIZE );
>> + }
>> + else {
>> + $output = "";
>> + }
>> + }
>> + packet_flush();
>> + packet_write("success\n");
>> + print $debug "[OK]\n";
>> + $debug->flush();
>> + }
>> + }
>> +}
>>
>
Thank you very much (again!) for your extensive review,
Lars
next prev parent reply other threads:[~2016-08-03 13:11 UTC|newest]
Thread overview: 120+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20160727000605.49982-1-larsxschneider%40gmail.com/>
2016-07-29 23:37 ` [PATCH v3 00/10] Git filter protocol larsxschneider
2016-07-29 23:37 ` [PATCH v3 01/10] pkt-line: extract set_packet_header() larsxschneider
2016-07-30 10:30 ` Jakub Narębski
2016-08-01 11:33 ` Lars Schneider
2016-08-03 20:05 ` Jakub Narębski
2016-08-05 11:52 ` Lars Schneider
2016-07-29 23:37 ` [PATCH v3 02/10] pkt-line: add direct_packet_write() and direct_packet_write_data() larsxschneider
2016-07-30 10:49 ` Jakub Narębski
2016-08-01 12:00 ` Lars Schneider
2016-08-03 20:12 ` Jakub Narębski
2016-08-05 12:02 ` Lars Schneider
2016-07-29 23:37 ` [PATCH v3 03/10] pkt-line: add packet_flush_gentle() larsxschneider
2016-07-30 12:04 ` Jakub Narębski
2016-08-01 12:28 ` Lars Schneider
2016-07-31 20:36 ` Torstem Bögershausen
2016-07-31 21:45 ` Lars Schneider
2016-08-02 19:56 ` Torsten Bögershausen
2016-08-05 9:59 ` Lars Schneider
2016-07-29 23:37 ` [PATCH v3 04/10] pkt-line: call packet_trace() only if a packet is actually send larsxschneider
2016-07-30 12:29 ` Jakub Narębski
2016-08-01 12:18 ` Lars Schneider
2016-08-03 20:15 ` Jakub Narębski
2016-07-29 23:37 ` [PATCH v3 05/10] pack-protocol: fix maximum pkt-line size larsxschneider
2016-07-30 13:58 ` Jakub Narębski
2016-08-01 12:23 ` Lars Schneider
2016-07-29 23:37 ` [PATCH v3 06/10] run-command: add clean_on_exit_handler larsxschneider
2016-07-30 9:50 ` Johannes Sixt
2016-08-01 11:14 ` Lars Schneider
2016-08-02 5:53 ` Johannes Sixt
2016-08-02 7:41 ` Lars Schneider
2016-07-29 23:37 ` [PATCH v3 07/10] convert: quote filter names in error messages larsxschneider
2016-07-29 23:37 ` [PATCH v3 08/10] convert: modernize tests larsxschneider
2016-07-29 23:38 ` [PATCH v3 09/10] convert: generate large test files only once larsxschneider
2016-07-29 23:38 ` [PATCH v3 10/10] convert: add filter.<driver>.process option larsxschneider
2016-07-30 22:05 ` Jakub Narębski
2016-07-31 9:42 ` Jakub Narębski
2016-07-31 19:49 ` Lars Schneider
2016-07-31 22:59 ` Jakub Narębski
2016-08-01 13:32 ` Lars Schneider
2016-08-03 18:30 ` Designing the filter process protocol (was: Re: [PATCH v3 10/10] convert: add filter.<driver>.process option) Jakub Narębski
2016-08-05 10:32 ` Lars Schneider
2016-08-06 18:24 ` Lars Schneider
2016-08-03 22:47 ` [PATCH v3 10/10] convert: add filter.<driver>.process option Jakub Narębski
2016-07-31 22:19 ` Jakub Narębski
2016-08-01 17:55 ` Lars Schneider
2016-08-04 0:42 ` Jakub Narębski
2016-08-03 13:10 ` Lars Schneider [this message]
2016-08-04 10:18 ` Jakub Narębski
2016-08-05 13:20 ` Lars Schneider
2016-08-03 16:42 ` [PATCH v4 00/12] Git filter protocol larsxschneider
2016-08-03 16:42 ` [PATCH v4 01/12] pkt-line: extract set_packet_header() larsxschneider
2016-08-03 20:18 ` Junio C Hamano
2016-08-03 21:12 ` Jeff King
2016-08-03 21:27 ` Jeff King
2016-08-04 16:14 ` Junio C Hamano
2016-08-05 14:55 ` Lars Schneider
2016-08-05 16:31 ` Junio C Hamano
2016-08-05 17:31 ` Lars Schneider
2016-08-05 17:41 ` Junio C Hamano
2016-08-03 21:56 ` Lars Schneider
2016-08-03 16:42 ` [PATCH v4 02/12] pkt-line: add direct_packet_write() and direct_packet_write_data() larsxschneider
2016-08-03 16:42 ` [PATCH v4 03/12] pkt-line: add packet_flush_gentle() larsxschneider
2016-08-03 21:39 ` Jeff King
2016-08-03 22:56 ` [PATCH 0/7] minor trace fixes and cosmetic improvements Jeff King
2016-08-03 22:56 ` [PATCH 1/7] trace: handle NULL argument in trace_disable() Jeff King
2016-08-03 22:58 ` [PATCH 2/7] trace: stop using write_or_whine_pipe() Jeff King
2016-08-03 22:58 ` [PATCH 3/7] trace: use warning() for printing trace errors Jeff King
2016-08-04 20:41 ` Junio C Hamano
2016-08-04 21:21 ` Jeff King
2016-08-04 21:28 ` Junio C Hamano
2016-08-05 7:56 ` Jeff King
2016-08-05 7:59 ` Christian Couder
2016-08-05 18:41 ` Junio C Hamano
2016-08-03 23:00 ` [PATCH 4/7] trace: cosmetic fixes for error messages Jeff King
2016-08-04 20:42 ` Junio C Hamano
2016-08-05 8:00 ` Jeff King
2016-08-03 23:00 ` [PATCH 5/7] trace: correct variable name in write() error message Jeff King
2016-08-03 23:01 ` [PATCH 6/7] trace: disable key after write error Jeff King
2016-08-04 20:45 ` Junio C Hamano
2016-08-04 21:22 ` Jeff King
2016-08-05 7:58 ` Jeff King
2016-08-03 23:01 ` [PATCH 7/7] write_or_die: drop write_or_whine_pipe() Jeff King
2016-08-03 23:04 ` [PATCH 0/7] minor trace fixes and cosmetic improvements Jeff King
2016-08-04 16:16 ` [PATCH v4 03/12] pkt-line: add packet_flush_gentle() Junio C Hamano
2016-08-03 16:42 ` [PATCH v4 04/12] pkt-line: call packet_trace() only if a packet is actually send larsxschneider
2016-08-03 16:42 ` [PATCH v4 05/12] pkt-line: add functions to read/write flush terminated packet streams larsxschneider
2016-08-03 16:42 ` [PATCH v4 06/12] pack-protocol: fix maximum pkt-line size larsxschneider
2016-08-03 16:42 ` [PATCH v4 07/12] run-command: add clean_on_exit_handler larsxschneider
2016-08-03 21:24 ` Jeff King
2016-08-03 22:15 ` Lars Schneider
2016-08-03 22:53 ` Jeff King
2016-08-03 23:09 ` Lars Schneider
2016-08-03 23:15 ` Jeff King
2016-08-05 13:08 ` Lars Schneider
2016-08-05 21:19 ` Torsten Bögershausen
2016-08-05 21:50 ` Lars Schneider
2016-08-03 16:42 ` [PATCH v4 08/12] convert: quote filter names in error messages larsxschneider
2016-08-03 16:42 ` [PATCH v4 09/12] convert: modernize tests larsxschneider
2016-08-03 16:42 ` [PATCH v4 10/12] convert: generate large test files only once larsxschneider
2016-08-03 16:42 ` [PATCH v4 11/12] convert: add filter.<driver>.process option larsxschneider
2016-08-03 17:45 ` Junio C Hamano
2016-08-03 21:48 ` Lars Schneider
2016-08-03 22:46 ` Jeff King
2016-08-05 12:53 ` Lars Schneider
2016-08-03 20:29 ` Junio C Hamano
2016-08-03 21:37 ` Lars Schneider
2016-08-03 21:43 ` Junio C Hamano
2016-08-03 22:01 ` Lars Schneider
2016-08-05 21:34 ` Torsten Bögershausen
2016-08-05 21:49 ` Lars Schneider
2016-08-05 22:06 ` Junio C Hamano
2016-08-05 22:27 ` Jeff King
2016-08-06 11:55 ` Lars Schneider
2016-08-06 12:14 ` Jeff King
2016-08-06 18:19 ` Lars Schneider
2016-08-08 15:02 ` Jeff King
2016-08-08 16:21 ` Lars Schneider
2016-08-08 16:26 ` Jeff King
2016-08-06 20:40 ` Torsten Bögershausen
2016-08-03 16:42 ` [PATCH v4 12/12] convert: add filter.<driver>.process shutdown command option larsxschneider
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9DDA993E-2AFD-4C69-8E22-58601EEC8A40@gmail.com \
--to=larsxschneider@gmail.com \
--cc=e@80x24.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jnareb@gmail.com \
--cc=mlbright@gmail.com \
--cc=peff@peff.net \
--cc=tboegi@web.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).