From: Jeff King <peff@peff.net> To: Arnaud Morin <arnaud.morin@gmail.com> Cc: git@vger.kernel.org, Kevin Willford <kewillf@microsoft.com> Subject: Re: rev-list with multiple commits sharing same patch-id Date: Tue, 12 Jan 2021 10:11:29 -0500 Message-ID: <X/28IXBpse2dNZQH@coredump.intra.peff.net> (raw) In-Reply-To: <X/2vgbnxWSmst5yB@coredump.intra.peff.net> On Tue, Jan 12, 2021 at 09:17:37AM -0500, Jeff King wrote: > It looks like this was broken in v2.10.0, via dfb7a1b4d0 (patch-ids: > stop using a hand-rolled hashmap implementation, 2016-07-29). > > I think the issue is that it is allowing duplicate entries in the > hashmap. The algorithm is something like: > > - iterate over left-hand commits, inserting patch-id for each into > hashmap > > - iterate over right-hand commits, seeing if any are present in > hashmap. If so, we exclude the commit _and_ mark the patch-id as > "seen" > > - iterate again over left-hand commits, omitting any whose patch-ids > are marked as "seen" > > So if two commits on the left-hand side have the same patch-id, if we > insert two entries into the hashmap, then only one of them is going to > get its "seen" flag marked in the middle step. Yeah, that's definitely it. Here's what the fix would look like directly on top of dfb7a1b4d0: diff --git a/patch-ids.c b/patch-ids.c index db31fa647a..a8ed4f0872 100644 --- a/patch-ids.c +++ b/patch-ids.c @@ -66,12 +66,24 @@ struct patch_id *add_commit_patch_id(struct commit *commit, struct patch_ids *ids) { struct patch_id *key = xcalloc(1, sizeof(*key)); + struct patch_id *old; if (init_patch_id_entry(key, commit, ids)) { free(key); return NULL; } + /* + * Don't allow duplicates. Note that we can't use hashmap_put here; we + * must retain the _old_ struct, because some commit->util is pointing + * to it. + */ + old = hashmap_get(&ids->patches, key, ids); + if (old) { + free(key); + return old; + } + hashmap_add(&ids->patches, key); return key; } diff --git a/t/t6007-rev-list-cherry-pick-file.sh b/t/t6007-rev-list-cherry-pick-file.sh index 28d4f6b259..0c5f1dcc54 100755 --- a/t/t6007-rev-list-cherry-pick-file.sh +++ b/t/t6007-rev-list-cherry-pick-file.sh @@ -207,4 +207,16 @@ test_expect_success '--count --left-right' ' test_cmp expect actual ' +test_expect_success '--cherry-pick with duplicates on each side' ' + git checkout -b dup-orig && + test_commit dup-base && + git revert dup-base && + git cherry-pick dup-base && + git checkout -b dup-side HEAD~3 && + test_tick && + git cherry-pick -3 dup-orig && + git rev-list --cherry-pick dup-orig...dup-side >actual && + test_must_be_empty actual +' + test_done Unfortunately we can't do the same thing now. The "seen" flag went away in 683f17ec44 (patch-ids: replace the seen indicator with a commit pointer, 2016-07-29), in favor of having each patch_id struct point to a "struct commit". We could revert that, or turn that pointer into a list, but it gets worse. Later, b3dfeebb92 (rebase: avoid computing unnecessary patch IDs, 2016-07-29) built on that by lazily computing the patch-ids. So each patch_id struct must point to exactly one commit! We'll have to either: - split them apart into two structs at the time of the lazy computation or - accept having duplicate patch_ids, but on lookup make sure we find _all_ matches, not just the first one. Hmm, that last one might not be too bad, since we have hashmap_get_next() (it does mean iterating over a chained bucket linearly, but that's already the worst-case for a hash lookup). -Peff
next prev parent reply other threads:[~2021-01-12 15:14 UTC|newest] Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-01-09 16:24 Arnaud Morin 2021-01-11 7:59 ` Arnaud Morin 2021-01-11 9:54 ` Christian Couder 2021-01-11 18:25 ` Arnaud Morin 2021-01-12 14:17 ` Jeff King 2021-01-12 15:11 ` Jeff King [this message] 2021-01-12 15:34 ` Arnaud Morin 2021-01-12 15:52 ` [PATCH] patch-ids: handle duplicate hashmap entries Jeff King 2021-01-12 16:24 ` Arnaud Morin 2021-01-12 19:13 ` Junio C Hamano 2021-01-13 9:24 ` Arnaud Morin 2021-01-13 12:59 ` Jeff King 2021-01-13 20:21 ` Junio C Hamano 2021-01-13 20:33 ` Jeff King 2021-01-13 19:28 ` Junio C Hamano
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style List information: http://vger.kernel.org/majordomo-info.html * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=X/28IXBpse2dNZQH@coredump.intra.peff.net \ --to=peff@peff.net \ --cc=arnaud.morin@gmail.com \ --cc=git@vger.kernel.org \ --cc=kewillf@microsoft.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
git@vger.kernel.org list mirror (unofficial, one of many) This inbox may be cloned and mirrored by anyone: git clone --mirror https://public-inbox.org/git git clone --mirror http://ou63pmih66umazou.onion/git git clone --mirror http://czquwvybam4bgbro.onion/git git clone --mirror http://hjrcffqmbrq6wope.onion/git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V1 git git/ https://public-inbox.org/git \ git@vger.kernel.org public-inbox-index git Example config snippet for mirrors. Newsgroups are available over NNTP: nntp://news.public-inbox.org/inbox.comp.version-control.git nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git nntp://news.gmane.io/gmane.comp.version-control.git note: .onion URLs require Tor: https://www.torproject.org/ code repositories for the project(s) associated with this inbox: https://80x24.org/mirrors/git.git AGPL code for this site: git clone https://public-inbox.org/public-inbox.git