git@vger.kernel.org mailing list mirror (one of many)
From: Elijah Newren <newren@gmail.com>
To: Victoria Dye via GitGitGadget <gitgitgadget@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Victoria Dye <vdye@github.com>,
	Derrick Stolee <derrickstolee@github.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 6/7] read-tree: make two-way merge sparse-aware
Date: Sat, 26 Feb 2022 00:05:01 -0800
Message-ID: <CABPp-BHihsVQZWTE4ppOcFyk8-eVa+zZ1MhkssiTByxjPO4kcg@mail.gmail.com>
In-Reply-To: <9fdcab038b2962b7f954363e32d04591476cf219.1645640717.git.gitgitgadget@gmail.com>

On Wed, Feb 23, 2022 at 4:09 PM Victoria Dye via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Victoria Dye <vdye@github.com>
>
> Enable two-way merge with 'git read-tree' without expanding the sparse
> index. When in a sparse index, a two-way merge will trivially succeed as
> long as there are not changes to the same sparse directory in multiple trees
> (i.e., sparse directory-level "edit-edit" conflicts). If there are such
> conflicts, the merge will fail despite the possibility that individual files
> could merge cleanly.
>
> In order to resolve these "edit-edit" conflicts, "conflicted" sparse
> directories are - rather than rejected - merged by traversing their
> associated trees by OID. For each child of the sparse directory:
>
> 1. Files are merged as normal (see Documentation/git-read-tree.txt for
>    details).
> 2. Subdirectories are treated as sparse directories and merged in
>    'twoway_merge'. If there are no conflicts, they are merged according to
>    the rules in Documentation/git-read-tree.txt; otherwise, the subdirectory
>    is recursively traversed and merged.
>
> This process allows sparse directories to be individually merged at the
> necessary depth *without* expanding a full index.

The idea of merging directory-level entries turns out to be
problematic _if_ rename detection is involved, but read-tree-style
merges are only trivial merges that ignore rename detection.  As such,
this idea is perfectly reasonable, and is a good way to go.  Nicely
done.
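
For readers following along, the recursion described in the commit
message can be modeled roughly as below. This is a toy sketch, not
git's implementation: 'struct toy_entry', 'toy_find', 'toy_oid' and
'toy_twoway' are hypothetical names, object IDs are plain integers, and
the merge rule is a simplified subset of the two-way table in
Documentation/git-read-tree.txt. An empty directory is treated like a
file here, purely to keep the model small.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a tree entry; none of these names exist in git. */
struct toy_entry {
	const char *name;
	int oid;                      /* fake object ID; 0 means "absent" */
	int nr;                       /* child count; 0 for a file */
	const struct toy_entry *kids; /* children when this is a directory */
};

/* Look up a child by name; a NULL directory has no children. */
static const struct toy_entry *toy_find(const struct toy_entry *dir,
					const char *name)
{
	int i;

	assert(name);
	for (i = 0; dir && i < dir->nr; i++)
		if (!strcmp(dir->kids[i].name, name))
			return &dir->kids[i];
	return NULL;
}

static int toy_oid(const struct toy_entry *e)
{
	return e ? e->oid : 0;
}

/*
 * Simplified two-way rule for (current, oldtree, newtree): returns 0
 * when the entry merges trivially and -1 on a real conflict.
 * Directories whose three IDs all differ are descended into instead of
 * being rejected outright.
 */
static int toy_twoway(const struct toy_entry *cur,
		      const struct toy_entry *oldtree,
		      const struct toy_entry *newtree)
{
	const struct toy_entry *side[3] = { cur, oldtree, newtree };
	int s, i;

	if (toy_oid(oldtree) == toy_oid(newtree) ||
	    toy_oid(cur) == toy_oid(newtree))
		return 0; /* nothing to take from newtree; keep current */
	if (toy_oid(cur) == toy_oid(oldtree))
		return 0; /* current is unmodified; take newtree */
	if (!cur || !oldtree || !newtree ||
	    !cur->nr || !oldtree->nr || !newtree->nr)
		return -1; /* file-level (or file/directory) conflict */

	/* Directory-level "edit-edit": visit each child name once. */
	for (s = 0; s < 3; s++)
		for (i = 0; i < side[s]->nr; i++) {
			const char *name = side[s]->kids[i].name;

			if ((s > 0 && toy_find(side[0], name)) ||
			    (s > 1 && toy_find(side[1], name)))
				continue; /* seen via an earlier side */
			if (toy_twoway(toy_find(cur, name),
				       toy_find(oldtree, name),
				       toy_find(newtree, name)))
				return -1;
		}
	return 0;
}
```

In this model, a directory in which the current side changed one file
and newtree changed a different file merges cleanly, while one in which
both sides changed the same file to different contents is reported as a
conflict.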

Mostly the patch looks good.  There's one thing I'm wondering about, though...

>
> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
>  builtin/read-tree.c                      |  5 --
>  t/t1092-sparse-checkout-compatibility.sh |  3 +-
>  unpack-trees.c                           | 75 ++++++++++++++++++++++++
>  3 files changed, 77 insertions(+), 6 deletions(-)
>
> diff --git a/builtin/read-tree.c b/builtin/read-tree.c
> index a7b7f822281..5a421de2629 100644
> --- a/builtin/read-tree.c
> +++ b/builtin/read-tree.c
> @@ -225,11 +225,6 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
>                         opts.fn = opts.prefix ? bind_merge : oneway_merge;
>                         break;
>                 case 2:
> -                       /*
> -                        * TODO: update twoway_merge to handle edit/edit conflicts in
> -                        * sparse directories.
> -                        */
> -                       ensure_full_index(&the_index);
>                         opts.fn = twoway_merge;
>                         opts.initial_checkout = is_cache_unborn();
>                         break;
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index a404be0a10f..d6f19682d65 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -1411,7 +1411,8 @@ test_expect_success 'sparse index is not expanded: read-tree' '
>         init_repos &&
>
>         ensure_not_expanded checkout -b test-branch update-folder1 &&
> -       for MERGE_TREES in "update-folder2"
> +       for MERGE_TREES in "update-folder2" \
> +                          "base update-folder2"
>         do
>                 ensure_not_expanded read-tree -mu $MERGE_TREES &&
>                 ensure_not_expanded reset --hard HEAD || return 1
> diff --git a/unpack-trees.c b/unpack-trees.c
> index dba122a02bb..a4ace53904e 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1360,6 +1360,42 @@ static int is_sparse_directory_entry(struct cache_entry *ce,
>         return sparse_dir_matches_path(ce, info, name);
>  }
>
> +static int unpack_sparse_callback(int n, unsigned long mask, unsigned long dirmask, struct name_entry *names, struct traverse_info *info)
> +{
> +       struct cache_entry *src[MAX_UNPACK_TREES + 1] = { NULL, };
> +       struct unpack_trees_options *o = info->data;
> +       int ret;
> +
> +       assert(o->merge);
> +
> +       /*
> +        * Unlike in 'unpack_callback', where src[0] is derived from the index when
> +        * merging, src[0] is a transient cache entry derived from the first tree
> +        * provided. Create the temporary entry as if it came from a non-sparse index.
> +        */
> +       if (!is_null_oid(&names[0].oid)) {
> +               src[0] = create_ce_entry(info, &names[0], 0,
> +                                       &o->result, 1,
> +                                       dirmask & (1ul << 0));
> +               src[0]->ce_flags |= (CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);
> +       }
> +
> +       /*
> +        * 'unpack_single_entry' assumes that src[0] is derived directly from
> +        * the index, rather than from an entry in 'names'. This is *not* true when
> +        * merging a sparse directory, in which case names[0] is the "index" source
> +        * entry. To match the expectations of 'unpack_single_entry', shift past the
> +        * "index" tree (i.e., names[0]) and adjust 'names', 'n', 'mask', and
> +        * 'dirmask' accordingly.
> +        */
> +       ret = unpack_single_entry(n - 1, mask >> 1, dirmask >> 1, src, names + 1, info);

So, you're passing one fewer entry to unpack_single_entry() once you've
traversed into a sparse directory.  Won't the traversal into the next
subdirectory down then drop yet another entry, so that after recursing
a directory or two only one tree is left, nothing conflicts with it,
and the merge just uses that remaining tree?  (Or maybe it passes the
wrong number of arguments into twoway_merge()?)  Did I miss something
in the logic somewhere that avoids that issue?  It'd be nice to test
it out, which brings me to...
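
As a sketch of the arithmetic in question, the snippet below mirrors
only the argument adjustment in the quoted call ('n - 1', 'mask >> 1',
'dirmask >> 1'). 'struct toy_args' and 'toy_shift_past_index' are
hypothetical names made up for illustration; nothing here is git code,
and it does not model what merged_sparse_dir() receives when it is
re-entered.

```c
#include <assert.h>

/* Hypothetical bundle of the three values adjusted in the quoted call. */
struct toy_args {
	int n;                 /* number of entries passed along */
	unsigned long mask;    /* bit i set: entry i exists */
	unsigned long dirmask; /* bit i set: entry i is a directory */
};

/* Drop slot 0 (the "index" tree) and renumber the remaining slots from 0. */
static struct toy_args toy_shift_past_index(struct toy_args in)
{
	struct toy_args out;

	assert(in.n >= 1);
	out.n = in.n - 1;
	out.mask = in.mask >> 1;
	out.dirmask = in.dirmask >> 1;
	return out;
}
```

Starting from n = 3 with all three entries present, one application
leaves n = 2.  Feeding that result back in would leave n = 1, which is
the scenario being asked about; whether a second application can
actually happen depends on how the recursion re-enters
merged_sparse_dir().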

> +
> +       if (src[0])
> +               discard_cache_entry(src[0]);
> +
> +       return ret >= 0 ? mask : -1;
> +}
> +
>  /*
>   * Note that traverse_by_cache_tree() duplicates some logic in this function
>   * without actually calling it. If you change the logic here you may need to
> @@ -2464,6 +2500,37 @@ static int merged_entry(const struct cache_entry *ce,
>         return 1;
>  }
>
> +static int merged_sparse_dir(const struct cache_entry * const *src, int n,
> +                            struct unpack_trees_options *o)
> +{
> +       struct tree_desc t[MAX_UNPACK_TREES + 1];
> +       void * tree_bufs[MAX_UNPACK_TREES + 1];
> +       struct traverse_info info;
> +       int i, ret;
> +
> +       /*
> +        * Create the tree traversal information for traversing into *only* the
> +        * sparse directory.
> +        */
> +       setup_traverse_info(&info, src[0]->name);
> +       info.fn = unpack_sparse_callback;
> +       info.data = o;
> +       info.show_all_errors = o->show_all_errors;
> +       info.pathspec = o->pathspec;
> +
> +       /* Get the tree descriptors of the sparse directory in each of the merging trees */
> +       for (i = 0; i < n; i++)
> +               tree_bufs[i] = fill_tree_descriptor(o->src_index->repo, &t[i],
> +                                                   src[i] && !is_null_oid(&src[i]->oid) ? &src[i]->oid : NULL);
> +
> +       ret = traverse_trees(o->src_index, n, t, &info);
> +
> +       for (i = 0; i < n; i++)
> +               free(tree_bufs[i]);
> +
> +       return ret;
> +}
> +
>  static int deleted_entry(const struct cache_entry *ce,
>                          const struct cache_entry *old,
>                          struct unpack_trees_options *o)
> @@ -2734,6 +2801,14 @@ int twoway_merge(const struct cache_entry * const *src,
>                          * reject the merge instead.
>                          */
>                         return merged_entry(newtree, current, o);
> +               } else if (S_ISSPARSEDIR(current->ce_mode)) {
> +                       /*
> +                        * The sparse directories differ, but we don't know whether that's
> +                        * because of two different files in the directory being modified
> +                        * (can be trivially merged) or if there is a real file conflict.
> +                        * Merge the sparse directory by OID to compare file-by-file.
> +                        */
> +                       return merged_sparse_dir(src, 3, o);
>                 } else
>                         return reject_merge(current, o);
>         }
> --
> gitgitgadget

It would be nice to have a couple of tests.  In particular, one
designed to see what happens when we need to traverse into
subdirectories of sparse directory entries and paths differ between
the two trees being merged.
