From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>
Subject: [PATCH 00/20] fundamentals of merge-ort implementation
Date: Sun, 29 Nov 2020 07:43:03 +0000 [thread overview]
Message-ID: <pull.923.git.git.1606635803.gitgitgadget@gmail.com> (raw)
This is actually v3 of this series; but v2 depended on two topics that
hadn't graduated yet so I couldn't easily use gitgitgadget and get its
testing. Now that the topics have graduated, I have rebased on master. You
can see v2 and comments on it over here:
https://lore.kernel.org/git/20201102204344.342633-1-newren@gmail.com/
The goal of this series is to show the new design and structure behind
merge-ort, particularly the bits that are completely different to how
merge-recursive operates. There are still multiple important codepaths that
die with a "Not yet implemented" message, so the new merge algorithm is
still not very usable. However, it can handle very trivial rebases or
cherry-picks at the end of the series, and the number of test failures when
run under GIT_TEST_MERGE_ALGORITHM=ort drops from 2281 down to 1453.
At a high level, merge-ort avoids unpack_trees() and the index, instead
using traverse_trees() and its own data structure. After it is done
processing each path, it writes a tree. Only after it has created a new tree
will it touch the working copy or the index. It does so by using a simple
checkout-like step to switch from head to the newly created tree. If there
are conflicted entries, it touches up the index after the checkout-like step
to record those higher order stages.
In the series:
* Patch 1 adds some basic data structures.
* Patch 2 documents the high-level steps.
* Patches 3-5 are some simple setup.
* Patches 6-10 collect data from the traverse_trees() operation.
* Patches 11-15 process the individual paths and create a tree.
* Patches 16-19 handle checkout-and-then-write-higher-order-stages.
* Patch 20 frees data from the merge_options_internal data structure
Changes since v2 (Thanks to Stolee and Jonathan Tan for their excellent and
detailed reviews!):
* Add thorough code comments to data structures, to try to answer multiple
questions the reviewers brought up.
* Add comments in various areas of the code that were a bit tricky.
* As per Stolee's request/suggestion, restructured accesses to
conflict_info* to make the code document that we
* As per Jonathan's suggestion, restructured tree writing to have
"END_TREE" markers for directories -- namely, the directory itself.
* Removed path_conflict field (will add it back in a later series when it
is actually used)
* Improved requested commit messages
* Settled on "conflicted" instead of mix of "conflicted" and "unmerged"
* Fixed various typos, missing words, etc.
* Fixed various spaces around operators, missing blank lines, etc.
* Some other small tweaks I probably forgot or overlooked while typing up
this summary.
Things for reviewers to concentrate on:
* Patch 1: I added lots of comments describing the data structures, based
on reviewer questions/comments. Are they clear?
* Patch 9: Stolee was worried about the allocation of merged_info OR
conflict_info and whether we were safely accessing the fields. Since this
is our primary and largest data structure and many times most entries
only need a smaller merged_info, I really do want the space savings of
not always allocating the larger type. I typically access as a
merged_info* now, and added some accessor macros to document why each
access as a conflict_info* is okay. Does this seem like a good solution
to the concern?
* Patch 15: Jonathan felt the previous version of this patch was hard to
follow. I've restructured so that we process the directories in
opt->priv->paths directly; you can kind of view them as non-synthetic
end-of-tree markers. They may not stand out as such, though (since they
aren't synthetic with special names or handling), so I've added some
pretty big comments to explain things. Does this address concerns?
* Patches 16-20: these have not yet been reviewed, though these are easier
patches to review than many of the first 15! A quick guide: * Patches 16,
18, and 20 are very straightforward; patches 17 and 19 are the ones
that would benefit more from review.
* Patch 17 is
basically the twoway_merge subset of merge_working_tree() from
builtin/checkout.c. Find that bit of code and it's a direct
comparison.
* Patch 19
amounts to "how do I remove stage 0 entries in the index and replace
them with 1-3 higher order stages?".
Elijah Newren (20):
merge-ort: setup basic internal data structures
merge-ort: add some high-level algorithm structure
merge-ort: port merge_start() from merge-recursive
merge-ort: use histogram diff
merge-ort: add an err() function similar to one from merge-recursive
merge-ort: implement a very basic collect_merge_info()
merge-ort: avoid repeating fill_tree_descriptor() on the same tree
merge-ort: compute a few more useful fields for collect_merge_info
merge-ort: record stage and auxiliary info for every path
merge-ort: avoid recursing into identical trees
merge-ort: add a preliminary simple process_entries() implementation
merge-ort: have process_entries operate in a defined order
merge-ort: step 1 of tree writing -- record basenames, modes, and oids
merge-ort: step 2 of tree writing -- function to create tree object
merge-ort: step 3 of tree writing -- handling subdirectories as we go
merge-ort: basic outline for merge_switch_to_result()
merge-ort: add implementation of checkout()
tree: enable cmp_cache_name_compare() to be used elsewhere
merge-ort: add implementation of record_conflicted_index_entries()
merge-ort: free data structures in merge_finalize()
merge-ort.c | 1207 ++++++++++++++++++++++++++++++++++++++++++++++++++-
tree.c | 2 +-
tree.h | 2 +
3 files changed, 1207 insertions(+), 4 deletions(-)
base-commit: e67fbf927dfdf13d0b21dc6ea15dc3c7ef448ea0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-923%2Fnewren%2Fort-basics-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-923/newren/ort-basics-v1
Pull-Request: https://github.com/git/git/pull/923
--
gitgitgadget
next reply other threads:[~2020-11-29 7:46 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-29 7:43 Elijah Newren via GitGitGadget [this message]
2020-11-29 7:43 ` [PATCH 01/20] merge-ort: setup basic internal data structures Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 02/20] merge-ort: add some high-level algorithm structure Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 03/20] merge-ort: port merge_start() from merge-recursive Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 04/20] merge-ort: use histogram diff Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 05/20] merge-ort: add an err() function similar to one from merge-recursive Elijah Newren via GitGitGadget
2020-11-29 10:23 ` Ævar Arnfjörð Bjarmason
2020-11-30 16:56 ` Elijah Newren
2020-11-29 10:26 ` Ævar Arnfjörð Bjarmason
2020-11-29 7:43 ` [PATCH 06/20] merge-ort: implement a very basic collect_merge_info() Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 07/20] merge-ort: avoid repeating fill_tree_descriptor() on the same tree Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 08/20] merge-ort: compute a few more useful fields for collect_merge_info Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 09/20] merge-ort: record stage and auxiliary info for every path Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 10/20] merge-ort: avoid recursing into identical trees Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 11/20] merge-ort: add a preliminary simple process_entries() implementation Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 12/20] merge-ort: have process_entries operate in a defined order Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 13/20] merge-ort: step 1 of tree writing -- record basenames, modes, and oids Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 14/20] merge-ort: step 2 of tree writing -- function to create tree object Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 15/20] merge-ort: step 3 of tree writing -- handling subdirectories as we go Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 16/20] merge-ort: basic outline for merge_switch_to_result() Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 17/20] merge-ort: add implementation of checkout() Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 18/20] tree: enable cmp_cache_name_compare() to be used elsewhere Elijah Newren via GitGitGadget
2020-11-29 7:43 ` [PATCH 19/20] merge-ort: add implementation of record_conflicted_index_entries() Elijah Newren via GitGitGadget
2020-11-29 10:20 ` Ævar Arnfjörð Bjarmason
2020-11-29 7:43 ` [PATCH 20/20] merge-ort: free data structures in merge_finalize() Elijah Newren via GitGitGadget
2020-11-29 7:47 ` [PATCH 00/20] fundamentals of merge-ort implementation Elijah Newren
2020-12-04 20:47 ` [PATCH v2 " Elijah Newren via GitGitGadget
2020-12-04 20:47 ` [PATCH v2 01/20] merge-ort: setup basic internal data structures Elijah Newren via GitGitGadget
2020-12-04 20:47 ` [PATCH v2 02/20] merge-ort: add some high-level algorithm structure Elijah Newren via GitGitGadget
2020-12-04 20:47 ` [PATCH v2 03/20] merge-ort: port merge_start() from merge-recursive Elijah Newren via GitGitGadget
2020-12-04 20:47 ` [PATCH v2 04/20] merge-ort: use histogram diff Elijah Newren via GitGitGadget
2020-12-04 20:47 ` [PATCH v2 05/20] merge-ort: add an err() function similar to one from merge-recursive Elijah Newren via GitGitGadget
2020-12-04 20:47 ` [PATCH v2 06/20] merge-ort: implement a very basic collect_merge_info() Elijah Newren via GitGitGadget
2020-12-04 20:47 ` [PATCH v2 07/20] merge-ort: avoid repeating fill_tree_descriptor() on the same tree Elijah Newren via GitGitGadget
2020-12-04 20:47 ` [PATCH v2 08/20] merge-ort: compute a few more useful fields for collect_merge_info Elijah Newren via GitGitGadget
2020-12-04 20:47 ` [PATCH v2 09/20] merge-ort: record stage and auxiliary info for every path Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 10/20] merge-ort: avoid recursing into identical trees Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 11/20] merge-ort: add a preliminary simple process_entries() implementation Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 12/20] merge-ort: have process_entries operate in a defined order Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 13/20] merge-ort: step 1 of tree writing -- record basenames, modes, and oids Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 14/20] merge-ort: step 2 of tree writing -- function to create tree object Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 15/20] merge-ort: step 3 of tree writing -- handling subdirectories as we go Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 16/20] merge-ort: basic outline for merge_switch_to_result() Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 17/20] merge-ort: add implementation of checkout() Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 18/20] tree: enable cmp_cache_name_compare() to be used elsewhere Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 19/20] merge-ort: add implementation of record_conflicted_index_entries() Elijah Newren via GitGitGadget
2020-12-04 20:48 ` [PATCH v2 20/20] merge-ort: free data structures in merge_finalize() Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 00/20] fundamentals of merge-ort implementation Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 01/20] merge-ort: setup basic internal data structures Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 02/20] merge-ort: add some high-level algorithm structure Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 03/20] merge-ort: port merge_start() from merge-recursive Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 04/20] merge-ort: use histogram diff Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 05/20] merge-ort: add an err() function similar to one from merge-recursive Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 06/20] merge-ort: implement a very basic collect_merge_info() Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 07/20] merge-ort: avoid repeating fill_tree_descriptor() on the same tree Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 08/20] merge-ort: compute a few more useful fields for collect_merge_info Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 09/20] merge-ort: record stage and auxiliary info for every path Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 10/20] merge-ort: avoid recursing into identical trees Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 11/20] merge-ort: add a preliminary simple process_entries() implementation Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 12/20] merge-ort: have process_entries operate in a defined order Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 13/20] merge-ort: step 1 of tree writing -- record basenames, modes, and oids Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 14/20] merge-ort: step 2 of tree writing -- function to create tree object Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 15/20] merge-ort: step 3 of tree writing -- handling subdirectories as we go Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 16/20] merge-ort: basic outline for merge_switch_to_result() Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 17/20] merge-ort: add implementation of checkout() Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 18/20] tree: enable cmp_cache_name_compare() to be used elsewhere Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 19/20] merge-ort: add implementation of record_conflicted_index_entries() Elijah Newren via GitGitGadget
2020-12-13 8:04 ` [PATCH v3 20/20] merge-ort: free data structures in merge_finalize() Elijah Newren via GitGitGadget
2020-12-14 14:24 ` [PATCH v3 00/20] fundamentals of merge-ort implementation Felipe Contreras
2020-12-14 16:24 ` Elijah Newren
-- strict thread matches above, loose matches on Subject: below --
2020-10-30 3:41 [PATCH " Elijah Newren
2020-10-30 3:58 ` Elijah Newren
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.923.git.git.1606635803.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=newren@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).