git@vger.kernel.org mailing list mirror (one of many)
From: Christian Couder <christian.couder@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Taylor Blau <me@ttaylorr.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Elijah Newren <newren@gmail.com>, git <git@vger.kernel.org>
Subject: Re: [Summit topic] Server-side merge/rebase: needs and wants?
Date: Tue, 30 Nov 2021 11:06:09 +0100
Message-ID: <CAP8UFD1LgfZ0MT9=cMvxCcox++_MBBhWG9Twf42cMiXL42AdpQ@mail.gmail.com>
In-Reply-To: <211109.861r3qdpt8.gmgdl@evledraar.gmail.com>

On Tue, Nov 9, 2021 at 1:18 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
>
> On Mon, Nov 08 2021, Taylor Blau wrote:
>
> > I was discussing this with Elijah today in IRC. I thought that I sent
> > the following message to the list, but somehow dropped it from the CC
> > list, and only sent it to Elijah and Johannes.
> >
> > Here it is in its entirety, this time copying the list.
> >
> > On Thu, Oct 21, 2021 at 01:56:06PM +0200, Johannes Schindelin wrote:
> >>  5.  The challenge is not necessarily the technical challenges, but the UX for
> >>      server tools that live “above” the git executable.
> >>
> >>      1. What kind of output is needed? Machine-readable error messages?
> >>
> >>      2. What Git objects must be created: a tree? A commit?
> >>
> >>      3. How to handle, report, and store conflicts? Index is not typically
> >>         available on the server.
> >
> > I looked a little bit more into what GitHub would need in order to make
> > the switch. For background, we currently perform merges and rebases
> > using libgit2 as the backend, for the obvious reason which is that in a
> > pre-ORT world we could not write an intermediate result without having
> > an index around.
> >
> > (As a fun aside, we used to expand our bare copy of a repository into a
> > temporary working directory, perform the merge there, and then delete
> > the directory. We definitely don't do that anymore ;)).
> >
> > It looks like our current libgit2 usage more-or-less returns an
> > (object_id, list<file>) tuple, where:
> >
> >   - a non-NULL object_id is the result of a successful (i.e.,
> >     conflict-free) merge; specifically the oid of the resulting root
> >     tree
> >
> >   - a NULL object_id and a non-empty list of files indicates that the
> >     merge could not be completed without manual conflict resolution, and
> >     the list of files indicates where the conflicts were
> >
> > When we try to process a conflicted merge, we display the list of files
> > where conflicts were present in the web UI. We do have a UI to resolve
> > conflicts, but we populate the contents of that UI by telling libgit2 to
> > perform the same merge on *just that file*, and writing out the file
> > with its conflict markers as the result (and sending that result out to
> > a web editor).
> >
> > So I think an ORT-powered server-side merge would have to be able to:
> >
> >   - write out the contents of a merge (with a tree, not a commit), and
> >     indicate whether or not that merge was successful with an exit code
> >
> >   - write out the list of files that had conflicts upon failure
> >
> > Given my limited knowledge of the ORT implementation, it seems like
> > writing out the conflicts themselves would be pretty easy. But GitHub
> > probably wouldn't use it, or at least not immediately, since we rely
> > heavily on being able to recreate the conflicts file-by-file as they are
> > needed.
> >
> > Anyway, I happened to be looking into all of this during the summit, but
> > never wrote any of it down. So I figured that this might be helpful in
> > case folks are interested in pursuing this further. If so, let me know
> > if there are any other questions about what GitHub might want on the
> > backend, and I'll try to answer as best I can.
>
> That's very informative, thanks.

Yeah, thanks!
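
To check my understanding of that contract, here is a minimal sketch of
how a caller could consume such a command. Everything in it is an
assumption for illustration: `try_merge` is a hypothetical helper, and
the backend it runs is a made-up command that exits 0 and prints the
root tree OID on success, or exits 1 and prints the conflicted paths,
one per line, on failure. No existing Git command is implied.

```shell
# try_merge <command> [<args>...]
# Runs a merge backend that is ASSUMED to follow this contract:
#   exit 0: stdout is the OID of the merged root tree
#   exit 1: stdout lists the conflicted paths, one per line
# Any other exit status is treated as a backend failure.
try_merge() {
    if output=$("$@"); then
        # Clean merge: report the resulting root tree.
        printf 'tree %s\n' "$output"
    else
        status=$?
        if [ "$status" -eq 1 ]; then
            # Conflicts: report every path that needs manual resolution.
            printf '%s\n' "$output" | sed 's/^/conflict /'
        else
            echo "merge backend failed with status $status" >&2
        fi
        return "$status"
    fi
}
```

A web UI could then show the `conflict` lines as the list of files to
resolve, and only use the `tree` line when the status is 0.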

> Not that "ort" won't be much better at this,

I think the optimizations in "ort" could still be useful. Wouldn't it
be nice if rename detection were optimized, for example?

> but FWIW git-merge-tree
> sort of gets most of the way-ish to what you're describing already in
> terms of a command interface.

Yeah, but if the engine is not up to date, I am not sure it's worth it
to reuse it just for the current very limited command interface.

> I.e. I'm not the first or last to have (not for anything serious)
> implemented a dry-run bare-repo merge with something like:
>
>     git merge-tree origin/master git-for-windows/main origin/seen >diff
>     # Better regex needed, but basically this
>     grep "^\+<<<<<<< \.our$" diff && conflict=t
>
> So with some parsing of that command output you can get a diff with one
> side or the other applied.

Yeah, it looks like it would be easy to add options like --ours,
--theirs, etc, to get only the part we are interested in. And we
already easily see if the merge conflicted or not from the current
output, as it seems to output:

"0 mode sha1 filename"

in case of a successful merge, and:

"1 mode sha1 filename"
"2 mode sha1 filename"
"3 mode sha1 filename"

in case of conflicts.
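
A minimal sketch of that check, assuming the output lines really have
the "<stage> <mode> <sha1> <filename>" shape described above (I have
not verified this against the output of any particular Git version;
`merge_status` and `conflicted_paths` are hypothetical helper names):

```shell
# Both helpers read lines of the ASSUMED form
#   "<stage> <mode> <sha1> <filename>"
# on stdin. Stage 0 is a cleanly merged entry; stages 1, 2 and 3 are the
# base, "our" and "their" versions of a conflicted path.

# Prints "conflict" if any entry has a non-zero stage, "clean" otherwise.
merge_status() {
    if awk '$1 ~ /^[123]$/ { found = 1 } END { exit !found }'; then
        echo conflict
    else
        echo clean
    fi
}

# Prints each conflicted path once, keeping spaces in filenames intact.
conflicted_paths() {
    awk '$1 ~ /^[123]$/ { sub(/^[^ ]+ [^ ]+ [^ ]+ /, ""); print }' | sort -u
}
```

The second helper would give the list of conflicted files that Taylor
mentioned needing for the web UI.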

> From there it's a matter of applying the patch, and from there you'd get
> blobs/trees, which is painful from just having a diff & no index, so
> it's a common use-case of libgit2 for just such basic usage.
>
> But to the extent that we were talking about plumbing interfaces
> wouldn't basically a git-merge-tree on steroids (or extension thereof)
> do, i.e.:
>
>  * Ask it to merge X heads, returns whether it worked or not
>  * ... and can return a diff with conflict markers like this
>  * ... for just some <pathspec>
>  * ... maybe with the conflict already "resolved" one way or the other?
>  * ... optionally, after some markers write one/both sides, spew out the
>    relevant tree/blob OIDs
>  * ... which again, could be limited by the <pathspec> above.
>
> I'm thinking of something that basically works like git for-each-ref --format="". So:
>
>     git merge-tree --format="..." <heads> -- <pathspec>
>
> Where that <format> can be custom \0-delimited (or whatever) sections of
> payload that could have whatever combination of the above you'd need. I
> think git-for-each-ref is probably the best example we've got of a
> plumbing interface in this category, i.e. being able to extract
> arbitrary payloads via format specifiers & "path" (well, ref)
> limitation.

The current synopsis is:

git merge-tree <base-tree> <branch1> <branch2>

which is quite different from what you are proposing.

Given that it seems worth it to use a different underlying engine
(actually the "ort" one) than the current one, I think that it might
be better to start from scratch with a new command using the "ort"
engine.

> Elijah probably has much better ideas already, I'm just spitballing.

Yeah, I'd be interested in knowing Elijah's opinion on this. Maybe I
misunderstood, but I thought that Elijah had plans to send patches
related to this to the list after v2.34.

> But if something like that worked it would be mostly a matter of
> stealing code from for-each-ref and the like, and then the <handwaving>
> mapping that to ORT callbacks somehow.

Yeah, but what would be left from the original git merge-tree then?

Wouldn't it make more sense to start with a new command that has
roughly the same features as git merge-tree and a similar interface
(though maybe not quite the same as we could anticipate some future
extensions and maybe learn a bit from other commands), but uses "ort"?
Then we could grow it as we want, without being burdened by the git
merge-tree legacy, in the same way as "ort" was developed without
being burdened by the recursive merge legacy?

> And then it could even learn a --batch mode, which with those formats
> could allow calling it without paying the price for command
> re-invocation, something like the update-ref/proposed cat-file interface
> discussed in another thread at [1].

Yeah, sure.
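
For what it's worth, consuming the kind of \0-delimited, format-driven
output discussed above should be cheap for callers. Here is a rough
sketch; the record layout (a status field and a path field, each
terminated by NUL) is invented purely for illustration, and
`print_records` is a hypothetical helper, not an existing interface:

```shell
# Hypothetical record layout, invented for illustration only:
#   <status> NUL <path> NUL   (repeated once per path)
# Reads such a stream on stdin and prints "<status><TAB><path>" per record.
# Caveat: a path containing a newline would break this simple translation;
# a real consumer would split on NUL directly.
print_records() {
    tr '\0' '\n' | while IFS= read -r status && IFS= read -r path; do
        printf '%s\t%s\n' "$status" "$path"
    done
}
```

With a --batch mode, a long-running caller could keep reading records
like these from a single process instead of re-invoking the command for
every merge.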
