git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org,
	Christian Couder <christian.couder@gmail.com>,
	Jeff King <peff@peff.net>
Subject: Re: [PATCH v2 4/4] fetch: implement support for atomic reference updates
Date: Tue, 12 Jan 2021 13:22:40 +0100	[thread overview]
Message-ID: <X/2UkKS0AhT6jLXj@ncase> (raw)
In-Reply-To: <xmqq5z438ddv.fsf@gitster.c.googlers.com>

[-- Attachment #1: Type: text/plain, Size: 3858 bytes --]

On Mon, Jan 11, 2021 at 11:47:08AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> It would be much more preferrable to see this series go like so:
> >> 
> >>     [1/4] create append_fetch_head() that writes out to
> >>           fetch_head->fp
> >> 
> >>     [1.5/4] convert append_fetch_head() to ALWAYS accumulate in
> >>             fetch_head->buf, and ALWAYS flushes what is accumulated
> >>             at the end.
> >
> > This is a change I'm hesitant to make. The thing is that FETCH_HEAD is
> > quite weird as the current design allows different `git fetch --append`
> > processes to write to FETCH_HEAD at the same time.
> 
> Hmph, do you mean 
> 
> 	git fetch --append remote-a & git fetch --append remote-b
> 
> or something else [*1*]?

That's what I mean.

> With the current write-out codepath, there is no guarantee that ...
> 
> > > +		fprintf(fetch_head->fp, "%s\t%s\t%s",
> > > +			old_oid, merge_status_marker, note);
> > > +		for (i = 0; i < url_len; ++i)
> > > +			if ('\n' == url[i])
> > > +				fputs("\\n", fetch_head->fp);
> > > +			else
> > > +				fputc(url[i], fetch_head->fp);
> > > +		fputc('\n', fetch_head->fp);
> 
> ... these stdio operations for a single record would result in a
> single atomic write() that would not compete with writes by other
> processes.  So I wouldn't call "the current design allows ... at the
> same time."  It has been broken for years and nobody complained.

Fair enough. I somehow blindly assumed that `git fetch --all`, which
does invoke `git fetch --append`, did perform the fetch in parallel. If
that's not the case: all the better.

> > If we change to
> > always accumulate first and flush once we're done, then we essentially
> > have a change in behaviour as FETCH_HEADs aren't written in an
> > interleaving fashion anymore, but only at the end.
> 
> I view it a good thing, actually, for another reason [*2*], but that
> would take a follow-up fix that is much more involved, so let's not
> go there (yet).  At least buffering a single line's worth of output
> in a buffer and writing it out in a single write() would get us much
> closer to solving the "mixed up output" from multiple processes
> problem the current code seems to already have.

The buffering of a single line is exactly what we're doing now in the
non-atomic case. Previously there had been multiple writes, now we only
call `strbuf_write()` once on the buffer for each reference.

> > Also, if there is any
> > unexpected error in between updating references which causes us to die,
> > then we wouldn't have written the already updated references to
> > FETCH_HEAD now.
> 
> That one may be a valid concern.
> 
> Thanks.

Just to be explicit: are you fine with changes in this patch as-is? They
don't solve concurrent writes to FETCH_HEAD, but they should make it
easier to solve that if we ever need to.

Patrick

> 
> [Footnote]
> 
> *1* "git fetch --all/--multiple" was strictly serial operation to
>     avoid such a mixed-up output problem, but perhaps we weren't
>     careful enough when we introduced parallel fetch and broke it?
> 
> *2* When fetching from a single remote, the code makes sure that a
>     record that is not marked as 'not-for-merge' is listed first by
>     reordering the records, but it does not work well when fetching
>     from multiple remotes.  Queuing all in the buffer before showing
>     them would give us an opportunity to sort, but as I said, it
>     takes coordination among the processes---instead of each process
>     writing to FETCH_HEAD on their own, somebody has to collect the
>     records from them, reorder as needed and write them all out.
> 
>     cf. https://lore.kernel.org/git/X8oL190Vl03B0cQ%2F@coredump.intra.peff.net/
> 
>     

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2021-01-12 12:25 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-07 13:51 [PATCH 0/2] fetch: implement support for atomic reference updates Patrick Steinhardt
2021-01-07 13:51 ` [PATCH 1/2] fetch: allow passing a transaction to `s_update_ref()` Patrick Steinhardt
2021-01-07 22:59   ` Junio C Hamano
2021-01-08  0:45     ` Christian Couder
2021-01-08  7:18       ` Junio C Hamano
2021-01-07 13:51 ` [PATCH 2/2] fetch: implement support for atomic reference updates Patrick Steinhardt
2021-01-08  0:19   ` Junio C Hamano
2021-01-08 12:11 ` [PATCH v2 0/4] " Patrick Steinhardt
2021-01-08 12:11   ` [PATCH v2 1/4] fetch: extract writing to FETCH_HEAD Patrick Steinhardt
2021-01-08 23:40     ` Junio C Hamano
2021-01-11 10:26       ` Patrick Steinhardt
2021-01-11 19:24         ` Junio C Hamano
2021-01-08 12:11   ` [PATCH v2 2/4] fetch: refactor `s_update_ref` to use common exit path Patrick Steinhardt
2021-01-08 23:50     ` Junio C Hamano
2021-01-11 10:28       ` Patrick Steinhardt
2021-01-08 12:11   ` [PATCH v2 3/4] fetch: allow passing a transaction to `s_update_ref()` Patrick Steinhardt
2021-01-08 23:53     ` Junio C Hamano
2021-01-08 12:11   ` [PATCH v2 4/4] fetch: implement support for atomic reference updates Patrick Steinhardt
2021-01-09  0:05     ` Junio C Hamano
2021-01-11 10:42       ` Patrick Steinhardt
2021-01-11 19:47         ` Junio C Hamano
2021-01-12 12:22           ` Patrick Steinhardt [this message]
2021-01-12 12:29             ` Patrick Steinhardt
2021-01-12 19:19             ` Junio C Hamano
2021-01-11 11:05 ` [PATCH v3 0/5] " Patrick Steinhardt
2021-01-11 11:05   ` [PATCH v3 1/5] fetch: extract writing to FETCH_HEAD Patrick Steinhardt
2021-01-11 23:16     ` Junio C Hamano
2021-01-11 11:05   ` [PATCH v3 2/5] fetch: use strbuf to format FETCH_HEAD updates Patrick Steinhardt
2021-01-11 11:10     ` Patrick Steinhardt
2021-01-11 11:17     ` Christian Couder
2021-01-11 23:21     ` Junio C Hamano
2021-01-11 11:05   ` [PATCH v3 3/5] fetch: refactor `s_update_ref` to use common exit path Patrick Steinhardt
2021-01-11 11:05   ` [PATCH v3 4/5] fetch: allow passing a transaction to `s_update_ref()` Patrick Steinhardt
2021-01-11 11:05   ` [PATCH v3 5/5] fetch: implement support for atomic reference updates Patrick Steinhardt
2021-01-12  0:04   ` [PATCH v3 0/5] " Junio C Hamano
2021-01-12 12:27 ` [PATCH v4 " Patrick Steinhardt
2021-01-12 12:27   ` [PATCH v4 1/5] fetch: extract writing to FETCH_HEAD Patrick Steinhardt
2021-01-12 12:27   ` [PATCH v4 2/5] fetch: use strbuf to format FETCH_HEAD updates Patrick Steinhardt
2021-01-12 12:27   ` [PATCH v4 3/5] fetch: refactor `s_update_ref` to use common exit path Patrick Steinhardt
2021-01-12 12:27   ` [PATCH v4 4/5] fetch: allow passing a transaction to `s_update_ref()` Patrick Steinhardt
2021-01-12 12:27   ` [PATCH v4 5/5] fetch: implement support for atomic reference updates Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=X/2UkKS0AhT6jLXj@ncase \
    --to=ps@pks.im \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).