git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Taylor Blau <me@ttaylorr.com>, Derrick Stolee <stolee@gmail.com>,
	Jeff King <peff@peff.net>,
	Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v4 0/2] commit-graph: ignore duplicates when merging layers
Date: Fri, 09 Oct 2020 20:53:50 +0000	[thread overview]
Message-ID: <pull.747.v4.git.1602276832.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.747.v3.git.1602169479482.gitgitgadget@gmail.com>

Here is v4 of a series that was previously only one patch.

Thanks to the efforts of Thomas, Peff, and Taylor, we have a full
understanding of what went wrong here. Details are in the patches
themselves, but generally when writing a commit-graph chain we rely on the
commit-graph parsing to know which commits are already in the lower layers
of the chain. When 'core.commitGraph' is disabled, then we don't do that
parsing, and then we add all reachable commits to the top layer and merge
down. This causes us to hit the previous die() statement.

This fixes the problem by first handling any duplicates we see during a
merge (this is important for handling any other data out there in this
situation) and also to disable commit-graph writes when 'core.commitGraph'
is disabled.

Thanks, -Stolee

Derrick Stolee (2):
  commit-graph: ignore duplicates when merging layers
  commit-graph: don't write commit-graph when disabled

 Documentation/git-commit-graph.txt |  4 +++-
 commit-graph.c                     | 21 ++++++++++++++++++---
 t/t5324-split-commit-graph.sh      | 13 +++++++++++++
 3 files changed, 34 insertions(+), 4 deletions(-)


base-commit: d98273ba77e1ab9ec755576bc86c716a97bf59d7
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-747%2Fderrickstolee%2Fcommit-graph-dup-commits-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-747/derrickstolee/commit-graph-dup-commits-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/747

Range-diff vs v3:

 1:  9e760f07ac ! 1:  9279aea3ef commit-graph: ignore duplicates when merging layers
     @@ Commit message
          pointers. This allows us to get the end result we want without extra
          memory costs and minimal CPU time.
      
     -    Since the root cause for producing commit-graph layers with these
     -    duplicate commits is currently unknown, it is difficult to create a test
     -    for this scenario. For now, we must rely on testing the example data
     -    graciously provided in [1]. My local test successfully merged layers,
     -    and 'git commit-graph verify' passed.
     +    The root cause is due to disabling core.commitGraph, which prevents
     +    parsing commits from the lower layers during a 'git commit-graph write
     +    --split' command. Since we use the 'graph_pos' value to determine
     +    whether a commit is in a lower layer, we never discover that those
     +    commits are already in the commit-graph chain and add them to the top
     +    layer. This layer is then merged down, creating duplicates.
     +
     +    The test added in t5324-split-commit-graph.sh fails without this change.
     +    However, we still have not completely removed the need for this
     +    duplicate check. That will come in a follow-up change.
      
          Reported-by: Thomas Braun <thomas.braun@virtuell-zuhause.de>
          Helped-by: Taylor Blau <me@ttaylorr.com>
     @@ commit-graph.c: static void sort_and_scan_merged_commits(struct write_commit_gra
       	stop_progress(&ctx->progress);
       }
       
     +
     + ## t/t5324-split-commit-graph.sh ##
     +@@ t/t5324-split-commit-graph.sh: test_expect_success '--split=replace with partial Bloom data' '
     + 	verify_chain_files_exist $graphdir
     + '
     + 
     ++test_expect_success 'prevent regression for duplicate commits across layers' '
     ++	git init dup &&
     ++	git -C dup config core.commitGraph false &&
     ++	git -C dup commit --allow-empty -m one &&
     ++	git -C dup commit-graph write --split=no-merge --reachable &&
     ++	git -C dup commit --allow-empty -m two &&
     ++	git -C dup commit-graph write --split=no-merge --reachable &&
     ++	git -C dup commit --allow-empty -m three &&
     ++	git -C dup commit-graph write --split --reachable &&
     ++	git -C dup commit-graph verify
     ++'
     ++
     + test_done
 -:  ---------- > 2:  4439e8ae8f commit-graph: don't write commit-graph when disabled

-- 
gitgitgadget

  parent reply	other threads:[~2020-10-09 20:53 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-08 13:56 [PATCH] commit-graph: ignore duplicates when merging layers Derrick Stolee via GitGitGadget
2020-10-08 14:15 ` Taylor Blau
2020-10-08 14:29   ` Derrick Stolee
2020-10-08 14:59 ` [PATCH v2] " Derrick Stolee via GitGitGadget
2020-10-08 15:04   ` [PATCH v3] " Derrick Stolee via GitGitGadget
2020-10-08 15:53     ` Jeff King
2020-10-08 16:26       ` Derrick Stolee
2020-10-08 16:42         ` Taylor Blau
2020-10-08 16:43         ` Jeff King
2020-10-09 20:53     ` Derrick Stolee via GitGitGadget [this message]
2020-10-09 20:53       ` [PATCH v4 1/2] " Derrick Stolee via GitGitGadget
2020-10-09 20:53       ` [PATCH v4 2/2] commit-graph: don't write commit-graph when disabled Derrick Stolee via GitGitGadget
2020-10-09 21:12         ` Junio C Hamano
2020-10-09 21:17         ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.747.v4.git.1602276832.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).