git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Lars Schneider <larsxschneider@gmail.com>
Cc: git@vger.kernel.org, peff@peff.net, tboegi@web.de, e@80x24.org,
	ttaylorr@github.com, peartben@gmail.com
Subject: Re: [PATCH v7 6/6] convert: add "status=delayed" to filter process protocol
Date: Tue, 27 Jun 2017 12:00:15 -0700	[thread overview]
Message-ID: <xmqqbmp91cjk.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <20170627121027.99209-7-larsxschneider@gmail.com> (Lars Schneider's message of "Tue, 27 Jun 2017 14:10:27 +0200")

Lars Schneider <larsxschneider@gmail.com> writes:

> @@ -533,7 +534,8 @@ static int start_multi_file_filter_fn(struct subprocess_entry *subprocess)
>  	if (err)
>  		goto done;
>  
> -	err = packet_writel(process->in, "capability=clean", "capability=smudge", NULL);
> +	err = packet_writel(process->in,
> +		"capability=clean", "capability=smudge", "capability=delay", NULL);
>  
>  	for (;;) {
>  		cap_buf = packet_read_line(process->out, NULL);
> @@ -549,6 +551,8 @@ static int start_multi_file_filter_fn(struct subprocess_entry *subprocess)
>  			entry->supported_capabilities |= CAP_CLEAN;
>  		} else if (!strcmp(cap_name, "smudge")) {
>  			entry->supported_capabilities |= CAP_SMUDGE;
> +		} else if (!strcmp(cap_name, "delay")) {
> +			entry->supported_capabilities |= CAP_DELAY;
>  		} else {
>  			warning(
>  				"external filter '%s' requested unsupported filter capability '%s'",

I thought you said something about attempting to make this more
table-driven; did the attempt produce a better result?  Just being
curious.

> @@ -590,9 +594,11 @@ static void handle_filter_error(const struct strbuf *filter_status,
>  
>  static int apply_multi_file_filter(const char *path, const char *src, size_t len,
>  				   int fd, struct strbuf *dst, const char *cmd,
> -				   const unsigned int wanted_capability)
> +				   const unsigned int wanted_capability,
> +				   struct delayed_checkout *dco)
>  {
>  	int err;
> +	int can_delay = 0;
>  	struct cmd2process *entry;
>  	struct child_process *process;
>  	struct strbuf nbuf = STRBUF_INIT;
> @@ -647,6 +653,14 @@ static int apply_multi_file_filter(const char *path, const char *src, size_t len
>  	if (err)
>  		goto done;
>  
> +	if ((entry->supported_capabilities & CAP_DELAY) &&
> +	    dco && dco->state == CE_CAN_DELAY) {
> +		can_delay = 1;
> +		err = packet_write_fmt_gently(process->in, "can-delay=1\n");
> +		if (err)
> +			goto done;
> +	}
> +
>  	err = packet_flush_gently(process->in);
>  	if (err)
>  		goto done;
> @@ -662,14 +676,74 @@ static int apply_multi_file_filter(const char *path, const char *src, size_t len
>  	if (err)
>  		goto done;
>  
> -	err = strcmp(filter_status.buf, "success");
> +	if (can_delay && !strcmp(filter_status.buf, "delayed")) {
> +		dco->is_delayed = 1;
> +		string_list_insert(&dco->filters, cmd);
> +		string_list_insert(&dco->paths, path);
> +	} else {
> +		/* The filter got the blob and wants to send us a response. */
> +		err = strcmp(filter_status.buf, "success");
> +		if (err)
> +			goto done;
> +
> +		err = read_packetized_to_strbuf(process->out, &nbuf) < 0;
> +		if (err)
> +			goto done;
> +
> +		err = subprocess_read_status(process->out, &filter_status);
> +		if (err)
> +			goto done;
> +
> +		err = strcmp(filter_status.buf, "success");
> +	}
> +
> +done:
> +	sigchain_pop(SIGPIPE);
> +
> +	if (err)
> +		handle_filter_error(&filter_status, entry, wanted_capability);
> +	else
> +		strbuf_swap(dst, &nbuf);
> +	strbuf_release(&nbuf);
> +	return !err;
> +}

This I can understand better than the previous round ;-)

> diff --git a/convert.h b/convert.h
> index 82871a11d5..cdb91ab99a 100644
> --- a/convert.h
> +++ b/convert.h
> @@ -4,6 +4,8 @@
>  #ifndef CONVERT_H
>  #define CONVERT_H
>  
> +#include "string-list.h"
> +
>  enum safe_crlf {
>  	SAFE_CRLF_FALSE = 0,
>  	SAFE_CRLF_FAIL = 1,
> @@ -32,6 +34,27 @@ enum eol {
>  #endif
>  };
>  
> +enum ce_delay_state {
> +	CE_NO_DELAY = 0,
> +	CE_CAN_DELAY = 1,
> +	CE_RETRY = 2
> +};

This feels more natural and makes it easy to imagine the state
transition diagram.  enable-delay will take us from no-delay to
can-delay, and we tell the filter that it is OK to delay while
making the initial pass of the requests, and then after that we go
into retry state to collect the delayed reponses.

> +struct delayed_checkout {
> +	/* State of the currently processed cache entry. If the state is
> +	 * CE_CAN_DELAY, then the filter can change the 'is_delayed' flag
> +	 * to signal that the current cache entry was delayed. If the state is
> +	 * CE_RETRY, then this signals the filter that the cache entry was
> +	 * requested before.
> +	 */

        /*
         * Our multi-line comment look like this; slash-aster 
         * and aster-slash that opens and closes the block are
         * on their own lines.
         */

> +	enum ce_delay_state state;
> +	int is_delayed;

Hmph, I do not terribly mind but is this thing really needed?

Wouldn't filters and paths being non-empty be a good enough sign
that the backend said "ok, I am allowed to give a delayed response
so I acknowledge this path but would not give a result right away"?

> +int finish_delayed_checkout(struct checkout *state)
> +{
> +	int errs = 0;
> +	struct string_list_item *filter, *path;
> +	struct delayed_checkout *dco = state->delayed_checkout;
> +
> +	if (!state->delayed_checkout)
> +		return errs;
> +
> +	dco->state = CE_RETRY;
> +	while (dco->filters.nr > 0) {
> +		for_each_string_list_item(filter, &dco->filters) {
> +			struct string_list available_paths = STRING_LIST_INIT_NODUP;
> +
> +			if (!async_query_available_blobs(filter->string, &available_paths)) {
> +				/* Filter reported an error */
> +				errs = 1;
> +				filter->string = "";
> +				continue;
> +			}
> +			if (available_paths.nr <= 0) {
> +				/* Filter responded with no entries. That means
> +				 * the filter is done and we can remove the
> +				 * filter from the list (see
> +				 * "string_list_remove_empty_items" call below).
> +				 */
> +				filter->string = "";
> +				continue;
> +			}
> +
> +			/* In dco->paths we store a list of all delayed paths.
> +			 * The filter just send us a list of available paths.
> +			 * Remove them from the list.
> +			 */
> +			filter_string_list(&dco->paths, 0,
> +				&remove_available_paths, &available_paths);
> +
> +			for_each_string_list_item(path, &available_paths) {
> +				struct cache_entry* ce;
> +
> +				if (!path->util) {
> +					error("external filter '%s' signaled that '%s' "
> +					      "is now available although it has not been "
> +					      "delayed earlier",
> +					      filter->string, path->string);
> +					errs |= 1;
> +
> +					/* Do not ask the filter for available blobs,
> +					 * again, as the filter is likely buggy.
> +					 */
> +					filter->string = "";
> +					continue;
> +				}
> +				ce = index_file_exists(state->istate, path->string,
> +						       strlen(path->string), 0);
> +				assert(dco->state == CE_RETRY);

Can anything futz with dco->state at this late in the game?  You
entered into CE_RETRY state at the beginning of this function, and
this loop is going through each delayed ones. At this point, you are
going to make , but not yet have made, a request to the backend via
another call to checkout_entry() again.

Just wondering what kind of programming errors you are protecting
yourself against.  I briefly wondered perhaps you are afraid of a
bug in checkout_entry() that may flip the state back, but it that
is the case then the assert() would be after checkout_entry().

> +				errs |= (ce ? checkout_entry(ce, state, NULL) : 1);

> +			}
> +		}
> +		string_list_remove_empty_items(&dco->filters, 0);
> +	}
> +	string_list_clear(&dco->filters, 0);
> +
> +	/* At this point we should not have any delayed paths anymore. */
> +	errs |= dco->paths.nr;
> +	for_each_string_list_item(path, &dco->paths) {
> +		error("'%s' was not filtered properly", path->string);
> +	}
> +	string_list_clear(&dco->paths, 0);
> +
> +	free(dco);
> +	state->delayed_checkout = NULL;
> +
> +	return errs;
> +}

Thanks.

  reply	other threads:[~2017-06-27 19:00 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-27 12:10 [PATCH v7 0/6] convert: add "status=delayed" to filter process protocol Lars Schneider
2017-06-27 12:10 ` [PATCH v7 1/6] t0021: keep filter log files on comparison Lars Schneider
2017-06-27 12:10 ` [PATCH v7 2/6] t0021: make debug log file name configurable Lars Schneider
2017-06-27 12:10 ` [PATCH v7 3/6] t0021: write "OUT <size>" only on success Lars Schneider
2017-06-27 18:44   ` Junio C Hamano
2017-06-27 20:51     ` Lars Schneider
2017-06-27 21:26       ` Junio C Hamano
2017-06-27 12:10 ` [PATCH v7 4/6] convert: put the flags field before the flag itself for consistent style Lars Schneider
2017-06-27 12:10 ` [PATCH v7 5/6] convert: move multiple file filter error handling to separate function Lars Schneider
2017-06-27 12:10 ` [PATCH v7 6/6] convert: add "status=delayed" to filter process protocol Lars Schneider
2017-06-27 19:00   ` Junio C Hamano [this message]
2017-06-27 20:34     ` Lars Schneider
2017-06-27 21:30       ` Junio C Hamano
2017-06-27 21:58         ` Lars Schneider
2017-06-27 21:49       ` Lars Schneider

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqbmp91cjk.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=larsxschneider@gmail.com \
    --cc=peartben@gmail.com \
    --cc=peff@peff.net \
    --cc=tboegi@web.de \
    --cc=ttaylorr@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).