git@vger.kernel.org mailing list mirror (one of many)
From: Duy Nguyen <pclouds@gmail.com>
To: Elijah Newren <newren@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH v3] unpack_trees: fix breakage when o->src_index != o->dst_index
Date: Sun, 29 Apr 2018 20:05:42 +0200
Message-ID: <CACsJy8DyP_mXXJKn52Jzqe63N3GLpXePCr8ha97Lv9hr6u-M0w@mail.gmail.com>
In-Reply-To: <20180424065045.13905-1-newren@gmail.com>

On Tue, Apr 24, 2018 at 8:50 AM, Elijah Newren <newren@gmail.com> wrote:
> Currently, all callers of unpack_trees() set o->src_index == o->dst_index.
> The code in unpack_trees() does not correctly handle them being different.
> There are two separate issues:
>
> First, there is the possibility of memory corruption.  Since
> unpack_trees() creates a temporary index in o->result and then discards
> o->dst_index and overwrites it with o->result, in the special case that
> o->src_index == o->dst_index, it is safe to just reuse o->src_index's
> split_index for o->result.  However, when src and dst are different,
> reusing o->src_index's split_index for o->result will cause the
> split_index to be shared.  If either index then has entries replaced or
> removed, it will result in the other index referring to free()'d memory.
>
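
For anyone following along, the first issue can be pictured with a toy
model. This is only a sketch: toy_split, toy_index and the toy_* helpers
are made-up stand-ins, not git's real structures, and the only thing
modeled is the refcount and the three-way branch the patch introduces.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; only the refcount is modeled. */
struct toy_split { int refcount; };
struct toy_index { struct toy_split *split; };

static struct toy_split *toy_split_new(void)
{
	struct toy_split *s = calloc(1, sizeof(*s));
	if (!s)
		abort();
	s->refcount = 1;
	return s;
}

/* Drop one reference; free the shared data when the last one goes. */
static void toy_split_release(struct toy_index *idx)
{
	if (idx->split && --idx->split->refcount == 0)
		free(idx->split);
	idx->split = NULL;
}

/* Same shape as the three-way branch in the patch. */
static void toy_setup_result(struct toy_index *result,
			     const struct toy_index *src,
			     const struct toy_index *dst)
{
	if (!src->split) {
		result->split = NULL;
	} else if (src == dst) {
		/* dst will be overwritten by result, so sharing is safe */
		result->split = src->split;
		result->split->refcount++;
	} else {
		/* src stays alive on its own: give result a fresh one */
		result->split = toy_split_new();
	}
}

/* Returns 0 if both cases behave as described above. */
static int toy_demo_split(void)
{
	struct toy_index src = { toy_split_new() };
	struct toy_index dst = { NULL };
	struct toy_index result = { NULL };
	int ok = 1;

	toy_setup_result(&result, &src, &src);	/* src == dst */
	ok &= (result.split == src.split && src.split->refcount == 2);
	toy_split_release(&result);

	toy_setup_result(&result, &src, &dst);	/* src != dst */
	ok &= (result.split != src.split && src.split->refcount == 1);
	toy_split_release(&result);

	toy_split_release(&src);
	return ok ? 0 : 1;
}
```

In the src != dst case the two indexes never point at the same split
data, so replacing or removing entries in one cannot leave the other
holding a pointer to freed memory.
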
> Second, we can lose the index extensions.  Previously, we were moving
> index extensions from o->dst_index to o->result.  Since o->src_index is
> the one that will have the necessary extensions (o->dst_index is likely to
> be a new temporary index created to store the results), we should be
> moving the index extensions from there.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>
> Differences from v2:
>   - Don't NULLify src_index until we're done using it
>   - Actually built and tested[1]
>
> But it now passes the testsuite on both linux and mac[2], and I even re-merged
> all 53288 merge commits in linux.git (with a merge of this patch together with
> the directory rename detection series) for good measure.  [Only 7 commits
> showed a difference, all due to directory rename detection kicking in.]
>
> [1] Turns out that getting all fancy with an m4.10xlarge and nice levels of
> parallelization are great until you realize that your new setup omitted a
> critical step, leaving you running a slightly stale version of git instead...
> :-(
>
> [2] Actually, I get two test failures on mac from t0050-filesystem.sh, both
> with unicode normalization tests, but those two tests fail before my changes
> too.  All the other tests pass.
>
>  unpack-trees.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index e73745051e..49526d70aa 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1284,9 +1284,20 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>         o->result.timestamp.sec = o->src_index->timestamp.sec;
>         o->result.timestamp.nsec = o->src_index->timestamp.nsec;
>         o->result.version = o->src_index->version;
> -       o->result.split_index = o->src_index->split_index;
> -       if (o->result.split_index)
> +       if (!o->src_index->split_index) {
> +               o->result.split_index = NULL;
> +       } else if (o->src_index == o->dst_index) {
> +               /*
> +                * o->dst_index (and thus o->src_index) will be discarded
> +                * and overwritten with o->result at the end of this function,
> +                * so just use src_index's split_index to avoid having to
> +                * create a new one.
> +                */
> +               o->result.split_index = o->src_index->split_index;
>                 o->result.split_index->refcount++;
> +       } else {
> +               o->result.split_index = init_split_index(&o->result);
> +       }
>         hashcpy(o->result.sha1, o->src_index->sha1);
>         o->merge_size = len;
>         mark_all_ce_unused(o->src_index);
> @@ -1401,7 +1412,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>                 }
>         }
>
> -       o->src_index = NULL;
>         ret = check_updates(o) ? (-2) : 0;
>         if (o->dst_index) {
>                 if (!ret) {
> @@ -1412,12 +1422,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>                                                   WRITE_TREE_SILENT |
>                                                   WRITE_TREE_REPAIR);
>                 }
> -               move_index_extensions(&o->result, o->dst_index);
> +               move_index_extensions(&o->result, o->src_index);

While this looks like the right thing to do on paper, I believe it's
actually broken in a specific untracked cache case. In short,
please do not touch this line. I will send a patch to revert
edf3b90553 (unpack-trees: preserve index extensions - 2017-05-08),
which essentially deletes this line, with a proper explanation and
perhaps a test if I can come up with one.

When we update the index, we depend on the fact that every update
correctly invalidates the affected part of the untracked cache. In this
unpack operation, we start by copying entries over from src to result.
Since 'result' (at least at the beginning) does not have an untracked
cache, it has nothing to invalidate when we copy entries over. By the
time we are done preparing 'result', what's recorded in src's (or
dst's for that matter) untracked cache may or may not apply to the
'result' index anymore. Copying the extension over only leads to more
problems when the untracked cache is used.
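
To make the concern concrete, here is a toy model. It is only a sketch:
toy_ucache, toy_index and the toy_* functions are invented for
illustration and are not git's real data structures or API. The
assumption is that adding an entry invalidates the cache only if the
index has one attached at that moment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-directory untracked cache: just a validity flag. */
struct toy_ucache { int valid; };

struct toy_index {
	int nr_entries;
	struct toy_ucache *uc;	/* NULL while 'result' is being built */
};

/* Every update is expected to invalidate the cache it can see. */
static void toy_add_entry(struct toy_index *idx)
{
	idx->nr_entries++;
	if (idx->uc)
		idx->uc->valid = 0;
}

/* Stand-in for moving the extension from one index to another. */
static void toy_move_extensions(struct toy_index *dst, struct toy_index *src)
{
	dst->uc = src->uc;
	src->uc = NULL;
}

/* Returns 1 if the moved cache still claims to be valid. */
static int toy_demo_stale_cache(void)
{
	struct toy_ucache uc = { 1 };
	struct toy_index src = { 3, &uc };
	struct toy_index result = { 0, NULL };
	int i;

	/* Copy src's entries; result has no cache, nothing is invalidated. */
	for (i = 0; i < src.nr_entries; i++)
		toy_add_entry(&result);
	/* The unpack also brings in a path src never had. */
	toy_add_entry(&result);

	toy_move_extensions(&result, &src);
	return result.uc->valid;
}
```

toy_demo_stale_cache() returns 1 here: the cache arrives on 'result'
still marked valid even though 'result' gained an entry the cache never
heard about.
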

Sorry I didn't notice this earlier :(

>                 discard_index(o->dst_index);
>                 *o->dst_index = o->result;
>         } else {
>                 discard_index(&o->result);
>         }
> +       o->src_index = NULL;
>
>  done:
>         clear_exclude_list(&el);
> --
> 2.17.0.253.g32393f1d0a
>
-- 
Duy
