git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Jakub Narębski" <jnareb@gmail.com>
To: Lars Schneider <larsxschneider@gmail.com>
Cc: "Git Mailing List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Torsten Bögershausen" <tboegi@web.de>,
	"Martin-Louis Bright" <mlbright@gmail.com>,
	"Eric Wong" <e@80x24.org>, "Jeff King" <peff@peff.net>
Subject: Re: [PATCH v3 10/10] convert: add filter.<driver>.process option
Date: Mon, 1 Aug 2016 00:59:31 +0200	[thread overview]
Message-ID: <7255ef06-a9a0-91b7-b6da-a90322de926b@gmail.com> (raw)
In-Reply-To: <6765D972-876A-4F94-A170-468002498296@gmail.com>

W dniu 31.07.2016 o 21:49, Lars Schneider pisze: 
> On 31 Jul 2016, at 11:42, Jakub Narębski <jnareb@gmail.com> wrote:
>> W dniu 31.07.2016 o 00:05, Jakub Narębski pisze:
>>> W dniu 30.07.2016 o 01:38, larsxschneider@gmail.com pisze:
[...]
>>> I think it would be nice to have here at least summary of the benchmarks
>>> you did in https://github.com/github/git-lfs/pull/1382

This would be nice to have in the commit message: real benchmarks.

>>
>> Note that this feature is especially useful if startup time is long,
>> that is if you are using an operating system with costly fork / new process
>> startup time like MS Windows (which you have mentioned), or writing
>> filter in a programming language with large startup time like Java
>> or Python (the latter may have changed since).
>>
>>  https://gnustavo.wordpress.com/2012/06/28/programming-languages-start-up-times/
> 
> OK, I will add this. Is it OK to add the link to the commit message?
> (since I don't know how long the link will be available).

I don't think it is needed.  Perhaps only a sentence or half to notify
where you could get most from this feature, but even then it is not
necessary.

I'm sorry for the confusion.

>> See below for proposal with two places to signal errors: before sending
>> first byte, and after.
> 
> Right now the protocol is implemented covering the following cases:
> 
> ## CASE 1 - no stream success

It is less "stream", more "size unknown".  Real streaming is interleaving
reading and writing, which is currently not supported due to lack of
start_async() - I think.

> 
> packet:          git< size=57\n
> packet:          git< SMUDGED_CONTENT
> packet:          git< 0000
> packet:          git< success\n

Right.  What happens if either length(SMUDGED_CONTENT) < size,
or length(SMUDGED_CONTENT) > size?  It could conceivably happen,
e.g. due to an error in size calculation.

NOTE that without using flush packet to signal end of contents,
we would be not able to signal a situation when filter encounters
an error (per-file, or long temporary) when it have written some
content already.  For example this may happen for git-LFS filter,
if the server hosting artifactory (or even whole network) gets
down during cleanup / smudging.

Well, unless we would use other special packets:
 - empty packet, that is "0004" pkt-line
 - invalid packet, that is "0001", "0002", "0003" pkt-line
to signal premature end of SMUDGED_CONTENT.

> 
> 
> ## CASE 2 - no stream success but 0 byte response
> 
> packet:          git< size=0\n
> packet:          git< success\n

Why there is need to special case 0 byte (empty file) response?

  packet:          git< size=0\n
  packet:          git< 0000
  packet:          git< success\n

is perfectly fine.
  
> ## CASE 3 - no stream filter; filter doesn't want to process the file
> 
> packet:          git< size=0\n
> packet:          git< reject\n

Why not simply
 
  packet:          git< reject\n

Or, if we are going success/reject/whatever route

  packet:          git< size=0\n
  packet:          git< 0000
  packet:          git< reject\n

> ## CASE 4 - no stream filter; filter error
> 
> packet:          git< size=57\n
> packet:          git< SMUDGED_CONTENT
> packet:          git< 0000
> packet:          git< error\n
> 
> CASE 4 is not explicitly checked. If a final message is neither
> "success" nor "reject" then it is interpreted as error. If that
> happens then Git will shutdown and restart the filter process
> if there is another file to filter. 

This should be documented.

> 
> Alternatively a filter process can shutdown itself, too, to signal
> an error.
> 
> The corresponding stream filter look like this:
> 
> ## CASE 1 - stream success
> 
> packet:          git< SMUDGED_CONTENT
> packet:          git< 0000
> packet:          git< success\n
> 
> 
> ## CASE 2 - stream success but 0 byte response
> 
> packet:          git< 0000
> packet:          git< success\n
> 
> 
> ## CASE 3 - stream filter; filter doesn't want to process the file
> 
> packet:          git< 0000
> packet:          git< reject\n
> 
> 
> ## CASE 4 - stream filter; filter error
> 
> packet:          git< SMUDGED_CONTENT
> packet:          git< 0000
> packet:          git< error\n
> 
> --
> 
> I just realized that the size 0 case is a bit inconsistent
> in the no stream case as it has no flush packet. Maybe I 
> should indeed remove the flush packet in the no stream case
> completely?!

That's what I wrote about SPOT (single point of truth), of using
either size or flush packet, but not both.  But...

As I wrote, you need some mechanism to signal premature end
of contents, and start of an error description.

> 
> Do the cases above make sense to you?

Except for the inconsistency of the size 0 case.  This what
I meant to say.

> 
> Regarding error handling. I would prefer it if the filter prints
> all errors to STDERR by itself. I think that is the safest
> option to communicate errors to the users because if the communication
> got into a bad state then Git might not be able to read the errors
> properly.
> 
> See Peff's response on the topic, too:
> http://public-inbox.org/git/20160729165018.GA6553%40sigill.intra.peff.net/

Actually it looks like Peff is slightly against using stderr.

JK> Git-LFS sends to stderr because there's no other option. I wonder if it
JK> would be nicer to make it Git's responsibility to talk to the user,
JK> because then it could respect things like "--quiet". I guess error
JK> messages are generally printed regardless of verbosity, though, so
JK> printing them unconditionally is OK.

I think it should be O.K., and it makes writing filter drivers
simpler if we don't have multiplex channels.

>> NOTE: there is a bit of mixed and possibly confusing notation, that
>> is 0000 is flush packet, not packet with 0000 as content.  Perhaps
>> write pkt-line in full?
> 
> I am not sure I understand what you mean (maybe it's too late for me...).
> Can you try to rephrase or give an example?

Compare

  packet:          git< 0000

with

  packet:          git< success\n

The former as pkt-line is

  git< 0000

the latter is

  git< 000csuccess\n
       ^^^^
           \-- packet header

-- 
Jakub Narębski


  reply	other threads:[~2016-07-31 23:00 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20160727000605.49982-1-larsxschneider%40gmail.com/>
2016-07-29 23:37 ` [PATCH v3 00/10] Git filter protocol larsxschneider
2016-07-29 23:37   ` [PATCH v3 01/10] pkt-line: extract set_packet_header() larsxschneider
2016-07-30 10:30     ` Jakub Narębski
2016-08-01 11:33       ` Lars Schneider
2016-08-03 20:05         ` Jakub Narębski
2016-08-05 11:52           ` Lars Schneider
2016-07-29 23:37   ` [PATCH v3 02/10] pkt-line: add direct_packet_write() and direct_packet_write_data() larsxschneider
2016-07-30 10:49     ` Jakub Narębski
2016-08-01 12:00       ` Lars Schneider
2016-08-03 20:12         ` Jakub Narębski
2016-08-05 12:02           ` Lars Schneider
2016-07-29 23:37   ` [PATCH v3 03/10] pkt-line: add packet_flush_gentle() larsxschneider
2016-07-30 12:04     ` Jakub Narębski
2016-08-01 12:28       ` Lars Schneider
2016-07-31 20:36     ` Torstem Bögershausen
2016-07-31 21:45       ` Lars Schneider
2016-08-02 19:56         ` Torsten Bögershausen
2016-08-05  9:59           ` Lars Schneider
2016-07-29 23:37   ` [PATCH v3 04/10] pkt-line: call packet_trace() only if a packet is actually send larsxschneider
2016-07-30 12:29     ` Jakub Narębski
2016-08-01 12:18       ` Lars Schneider
2016-08-03 20:15         ` Jakub Narębski
2016-07-29 23:37   ` [PATCH v3 05/10] pack-protocol: fix maximum pkt-line size larsxschneider
2016-07-30 13:58     ` Jakub Narębski
2016-08-01 12:23       ` Lars Schneider
2016-07-29 23:37   ` [PATCH v3 06/10] run-command: add clean_on_exit_handler larsxschneider
2016-07-30  9:50     ` Johannes Sixt
2016-08-01 11:14       ` Lars Schneider
2016-08-02  5:53         ` Johannes Sixt
2016-08-02  7:41           ` Lars Schneider
2016-07-29 23:37   ` [PATCH v3 07/10] convert: quote filter names in error messages larsxschneider
2016-07-29 23:37   ` [PATCH v3 08/10] convert: modernize tests larsxschneider
2016-07-29 23:38   ` [PATCH v3 09/10] convert: generate large test files only once larsxschneider
2016-07-29 23:38   ` [PATCH v3 10/10] convert: add filter.<driver>.process option larsxschneider
2016-07-30 22:05     ` Jakub Narębski
2016-07-31  9:42       ` Jakub Narębski
2016-07-31 19:49         ` Lars Schneider
2016-07-31 22:59           ` Jakub Narębski [this message]
2016-08-01 13:32       ` Lars Schneider
2016-08-03 18:30         ` Designing the filter process protocol (was: Re: [PATCH v3 10/10] convert: add filter.<driver>.process option) Jakub Narębski
2016-08-05 10:32           ` Lars Schneider
2016-08-06 18:24           ` Lars Schneider
2016-08-03 22:47         ` [PATCH v3 10/10] convert: add filter.<driver>.process option Jakub Narębski
2016-07-31 22:19     ` Jakub Narębski
2016-08-01 17:55       ` Lars Schneider
2016-08-04  0:42         ` Jakub Narębski
2016-08-03 13:10       ` Lars Schneider
2016-08-04 10:18         ` Jakub Narębski
2016-08-05 13:20           ` Lars Schneider
2016-08-03 16:42   ` [PATCH v4 00/12] Git filter protocol larsxschneider
2016-08-03 16:42     ` [PATCH v4 01/12] pkt-line: extract set_packet_header() larsxschneider
2016-08-03 20:18       ` Junio C Hamano
2016-08-03 21:12         ` Jeff King
2016-08-03 21:27           ` Jeff King
2016-08-04 16:14           ` Junio C Hamano
2016-08-05 14:55             ` Lars Schneider
2016-08-05 16:31               ` Junio C Hamano
2016-08-05 17:31             ` Lars Schneider
2016-08-05 17:41               ` Junio C Hamano
2016-08-03 21:56         ` Lars Schneider
2016-08-03 16:42     ` [PATCH v4 02/12] pkt-line: add direct_packet_write() and direct_packet_write_data() larsxschneider
2016-08-03 16:42     ` [PATCH v4 03/12] pkt-line: add packet_flush_gentle() larsxschneider
2016-08-03 21:39       ` Jeff King
2016-08-03 22:56         ` [PATCH 0/7] minor trace fixes and cosmetic improvements Jeff King
2016-08-03 22:56           ` [PATCH 1/7] trace: handle NULL argument in trace_disable() Jeff King
2016-08-03 22:58           ` [PATCH 2/7] trace: stop using write_or_whine_pipe() Jeff King
2016-08-03 22:58           ` [PATCH 3/7] trace: use warning() for printing trace errors Jeff King
2016-08-04 20:41             ` Junio C Hamano
2016-08-04 21:21               ` Jeff King
2016-08-04 21:28                 ` Junio C Hamano
2016-08-05  7:56                   ` Jeff King
2016-08-05  7:59                   ` Christian Couder
2016-08-05 18:41                     ` Junio C Hamano
2016-08-03 23:00           ` [PATCH 4/7] trace: cosmetic fixes for error messages Jeff King
2016-08-04 20:42             ` Junio C Hamano
2016-08-05  8:00               ` Jeff King
2016-08-03 23:00           ` [PATCH 5/7] trace: correct variable name in write() error message Jeff King
2016-08-03 23:01           ` [PATCH 6/7] trace: disable key after write error Jeff King
2016-08-04 20:45             ` Junio C Hamano
2016-08-04 21:22               ` Jeff King
2016-08-05  7:58                 ` Jeff King
2016-08-03 23:01           ` [PATCH 7/7] write_or_die: drop write_or_whine_pipe() Jeff King
2016-08-03 23:04           ` [PATCH 0/7] minor trace fixes and cosmetic improvements Jeff King
2016-08-04 16:16         ` [PATCH v4 03/12] pkt-line: add packet_flush_gentle() Junio C Hamano
2016-08-03 16:42     ` [PATCH v4 04/12] pkt-line: call packet_trace() only if a packet is actually send larsxschneider
2016-08-03 16:42     ` [PATCH v4 05/12] pkt-line: add functions to read/write flush terminated packet streams larsxschneider
2016-08-03 16:42     ` [PATCH v4 06/12] pack-protocol: fix maximum pkt-line size larsxschneider
2016-08-03 16:42     ` [PATCH v4 07/12] run-command: add clean_on_exit_handler larsxschneider
2016-08-03 21:24       ` Jeff King
2016-08-03 22:15         ` Lars Schneider
2016-08-03 22:53           ` Jeff King
2016-08-03 23:09             ` Lars Schneider
2016-08-03 23:15               ` Jeff King
2016-08-05 13:08                 ` Lars Schneider
2016-08-05 21:19                   ` Torsten Bögershausen
2016-08-05 21:50                     ` Lars Schneider
2016-08-03 16:42     ` [PATCH v4 08/12] convert: quote filter names in error messages larsxschneider
2016-08-03 16:42     ` [PATCH v4 09/12] convert: modernize tests larsxschneider
2016-08-03 16:42     ` [PATCH v4 10/12] convert: generate large test files only once larsxschneider
2016-08-03 16:42     ` [PATCH v4 11/12] convert: add filter.<driver>.process option larsxschneider
2016-08-03 17:45       ` Junio C Hamano
2016-08-03 21:48         ` Lars Schneider
2016-08-03 22:46           ` Jeff King
2016-08-05 12:53             ` Lars Schneider
2016-08-03 20:29       ` Junio C Hamano
2016-08-03 21:37         ` Lars Schneider
2016-08-03 21:43           ` Junio C Hamano
2016-08-03 22:01             ` Lars Schneider
2016-08-05 21:34       ` Torsten Bögershausen
2016-08-05 21:49         ` Lars Schneider
2016-08-05 22:06         ` Junio C Hamano
2016-08-05 22:27           ` Jeff King
2016-08-06 11:55             ` Lars Schneider
2016-08-06 12:14               ` Jeff King
2016-08-06 18:19                 ` Lars Schneider
2016-08-08 15:02                   ` Jeff King
2016-08-08 16:21                     ` Lars Schneider
2016-08-08 16:26                       ` Jeff King
2016-08-06 20:40           ` Torsten Bögershausen
2016-08-03 16:42     ` [PATCH v4 12/12] convert: add filter.<driver>.process shutdown command option larsxschneider

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7255ef06-a9a0-91b7-b6da-a90322de926b@gmail.com \
    --to=jnareb@gmail.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=larsxschneider@gmail.com \
    --cc=mlbright@gmail.com \
    --cc=peff@peff.net \
    --cc=tboegi@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).