git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Lars Schneider <larsxschneider@gmail.com>
To: "Jakub Narębski" <jnareb@gmail.com>
Cc: "Jeff King" <peff@peff.net>, "Git List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Torsten Bögershausen" <tboegi@web.de>, "Eric Wong" <e@80x24.org>,
	"Taylor Blau" <ttaylorr@github.com>
Subject: Re: [PATCH v2] convert: add "status=delayed" to filter process protocol
Date: Sun, 9 Apr 2017 20:41:02 +0200	[thread overview]
Message-ID: <6410F3B1-3A85-4B2C-9425-17912F57622B@gmail.com> (raw)
In-Reply-To: <7fbf63df-c61f-3d4b-99a8-eea52049709d@gmail.com>


> On 27 Feb 2017, at 23:11, Jakub Narębski <jnareb@gmail.com> wrote:
> 
> W dniu 27.02.2017 o 11:32, Lars Schneider pisze:
>> 
>>> On 27 Feb 2017, at 10:58, Jeff King <peff@peff.net> wrote:
>>> 
>>> On Sun, Feb 26, 2017 at 07:48:16PM +0100, Lars Schneider wrote:
>>> 
>>>> +If the request cannot be fulfilled within a reasonable amount of time
>>>> +then the filter can respond with a "delayed" status and a flush packet.
>>>> +Git will perform the same request at a later point in time, again. The
>>>> +filter can delay a response multiple times for a single request.
>>>> +------------------------
>>>> +packet:          git< status=delayed
>>>> +packet:          git< 0000
>>>> +------------------------
> 
> Is it something that happens instead of filter process sending the contents

Correct! I'll clarify this in v3!


>> 
>> I completely agree - I need to change that. However, the goal of the v2
>> iteration was to get the "convert" interface in an acceptable state.
>> That's what I intended to say in the patch comment section:
>> 
>>    "Please ignore all changes behind async_convert_to_working_tree() and 
>>     async_filter_finish() for now as I plan to change the implementation 
>>     as soon as the interface is in an acceptable state."
> 
> I think that it is more important to start with a good abstraction,
> and the proposal for protocol, rather than getting bogged down in
> implementation details that may change as the idea for protocol
> extension changes.

I'll send out v3 shortly as proposal for a complete solution.


>>> I think it would be much more efficient to do something like:
>>> 
>>> [Git issues a request and gives it an opaque index id]
>>> git> command=smudge
>>> git> pathname=foo
>>> git> index=0
>>> git> 0000
>>> git> CONTENT
>>> git> 0000
>>> 
>>> [The data isn't ready yet, so the filter tells us so...]
>>> git< status=delayed
>>> git< 0000
> 
> So is it only as replacement for "status=success" + contents or
> "status=abort", that is upfront before sending any part of the file?

Yes.


> Or, as one can assume from the point of the paragraph with the
> "status=delayed", it is about replacing null list for success or
> "status=error" after sending some part (maybe empty) of a file,
> that is:

No. As this would complicate things I don't want to support it. 
(and I clarified that in the docs in v3).


> If it would not be undue burden on the filter driver process, we might
> require for it to say where to continue at (in bytes), e.g.
> 
>    git< from=16426
> 
> That should, of course, go below index/pathname line.

This would make the protocol even more complicated. That's why I don't
want to support splitting the response.


>>> git< index=0
> 
> Or a filter driver could have used pathname as an index, that is
> 
>    git< pathname=path/testfile.dat

In v3 I've used an index to help Git finding the right cache entry
quickly.


> 
>>> git< 0000
>>> git< CONTENT
>>> git< 0000
>>> 
>>> From Git's side, the loop is something like:
>>> 
>>> while (delayed_items > 0) {
>>> 	/* issue a wait, and get back the status/index pair */
>>> 	status = send_wait(&index);
>>> 	delayed_items--;
> 
> This looks like my 'event loop' proposal[1][2], see below.

I implemented something similar in v3.


>> That could work! I had something like that in mind:
>> 
>> I teach Git a new command "list_completed" or similar. The filter
>> blocks this call until at least one item is ready for Git. 
>> Then the filter responds with a list of paths that identify the
>> "ready items". Then Git asks for these ready items just with the
>> path and not with any content. Could that work? Wouldn't the path
>> be "unique" to identify a blob per filter run?
> 
> Why in the "drain" phase it is still Git that needs to ask filter for
> contents, one file after another?  Wouldn't it be easier and simpler
> for filter to finish sending contents, and send signal that it has
> finished continue'ing?
> 
> To summarize my earlier emails, current proposal looks for me as if
> it were a "busy loop" solution, that is[2]:

In v3 the implementation still uses kind of a busy loop (I expect the
filter to block if there nothing ready, yet). An event loop would
complicate the protocol as the filter would need to initiate an action.
Right now only Git initiates actions.


> Footnotes:
> ----------
> a) We don't send the Git-side contents of blob again, isn't it?
>   So we need some protocol extension / new understanding anyway.
>   for example that we don't send contents if we request path again.
Correct - v3 doesn't send the content again.


> Also, one thing that we need to be solved, assuming that the proposed
> extension allows to send partial data from filter to be delayed and
> continued later, is that Git needs to keep this partial response in buf;
> this is because of precedence of gitattributes applying:

As mentioned above I don't want to support partial data as this
complicates things and is of no use for my Git LFS problem case.

Thanks,
Lars


      reply	other threads:[~2017-04-09 18:48 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-26 18:48 [PATCH v2] convert: add "status=delayed" to filter process protocol Lars Schneider
2017-02-27  9:58 ` Jeff King
2017-02-27 10:32   ` Lars Schneider
2017-02-27 10:53     ` Jeff King
2017-04-09 18:17       ` Lars Schneider
2017-02-27 22:11     ` Jakub Narębski
2017-04-09 18:41       ` Lars Schneider [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6410F3B1-3A85-4B2C-9425-17912F57622B@gmail.com \
    --to=larsxschneider@gmail.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jnareb@gmail.com \
    --cc=peff@peff.net \
    --cc=tboegi@web.de \
    --cc=ttaylorr@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).