From: Derrick Stolee <stolee@gmail.com>
To: Jonathan Tan <jonathantanmy@google.com>, git@vger.kernel.org
Cc: congdanhqx@gmail.com, newren@gmail.com, gitster@pobox.com
Subject: Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
Date: Mon, 30 Mar 2020 08:13:32 -0400 [thread overview]
Message-ID: <e917f451-c00a-c819-7f78-888ba6e8f5ea@gmail.com> (raw)
In-Reply-To: <20200330040621.13701-1-jonathantanmy@google.com>
On 3/30/2020 12:06 AM, Jonathan Tan wrote:
> When rebasing against an upstream that has had many commits since the
> original branch was created:
>
> O -- O -- ... -- O -- O (upstream)
> \
> -- O (my-dev-branch)
>
> it must read the contents of every novel upstream commit, in addition to
> the tip of the upstream and the merge base, because "git rebase"
> attempts to exclude commits that are duplicates of upstream ones. This
> can be a significant performance hit, especially in a partial clone,
> wherein a read of an object may end up being a fetch.
>
> Add a flag to "git rebase" to allow suppression of this feature. This
> flag only works when using the "merge" backend.
So this is the behavior that already exists, and you are providing a way
to suppress it. However, you also change the default in this patch, which
may surprise users expecting this behavior to continue.
> This flag changes the behavior of sequencer_make_script(), called from
> do_interactive_rebase() <- run_rebase_interactive() <-
> run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
> (indirectly called from sequencer_make_script() through
> prepare_revision_walk()) will no longer call cherry_pick_list(), and
> thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
> means that the intermediate commits in upstream are no longer read (as
> shown by the test) and means that no PATCHSAME-caused skipping of
> commits is done by sequencer_make_script(), either directly or through
> make_script_with_merges().
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> This commit contains Junio's sign-off because I based it on
> jt/rebase-allow-duplicate.
>
> This does not include the fix by Đoàn Trần Công Danh. If we want all
> commits to pass all tests (whether run by Busybox or not) it seems like
> we should squash that patch instead of having it as a separate commit.
> If we do squash, maybe include a "Helped-by" with Đoàn Trần Công Danh's
> name.
>
> Junio wrote [1]:
>
>> Sounds much better to me. I do not mind --[no-]keep-cherry-pick,
>> either, by the way. I know I raised the possibility of having to
>> make it non-bool later, but since then I haven't thought of a good
>> third option myself anyway, so...
>
> In that case, I think it's better to stick to bool. This also means that
> the change from the version in jt/rebase-allow-duplicate is very small,
> hopefully aiding reviewers - mostly a replacement of
> --skip-cherry-pick-detection with --keep-cherry-pick (which mean the
> same thing).
>
> [1] https://lore.kernel.org/git/xmqq4kuakjcn.fsf@gitster.c.googlers.com/
> ---
> Documentation/git-rebase.txt | 21 +++++++++-
> builtin/rebase.c | 7 ++++
> sequencer.c | 3 +-
> sequencer.h | 2 +-
> t/t3402-rebase-merge.sh | 77 ++++++++++++++++++++++++++++++++++++
> 5 files changed, 107 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 0c4f038dd6..f4f8afeb9a 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -318,6 +318,21 @@ See also INCOMPATIBLE OPTIONS below.
> +
> See also INCOMPATIBLE OPTIONS below.
>
> +--keep-cherry-pick::
> +--no-keep-cherry-pick::
I noticed that this _could_ have been simplified to
--[no-]keep-cherry-pick::
but I also see several uses of either in our documentation. Do we
have a preference? By inspecting the lines before a "no-" string,
I see that some have these two lines, some use the [no-] pattern,
and others highlight the --no-<option> flag completely separately.
> + Control rebase's behavior towards commits in the working
> + branch that are already present upstream, i.e. cherry-picks.
I think the "already present upstream" is misleading. We don't rebase
things that are _reachable_ already, but this is probably better as
Specify if rebase should include commits in the working branch
that have diffs equivalent to other commits upstream. For example,
a cherry-picked commit has an equivalent diff.
> ++
> +By default, these commits will be dropped. Because this necessitates
> +reading all upstream commits, this can be expensive in repos with a
> +large number of upstream commits that need to be read.
Now I'm confused. Are they dropped by default? Which option does what?
--keep-cherry-pick makes me think that cherry-picked commits will come
along for the rebase, so we will not check for them. But you have documented
that --no-keep-cherry-pick is the default.
(Also, I keep writing "--[no-]keep-cherry-picks" (plural) because that
seems more natural to me. Then I go back and fix it when I notice.)
> ++
> +If `--keep-cherry-pick is given`, all commits (including these) will be
Bad tick marks: "`--keep-cherry-pick` is given"
> +re-applied. This allows rebase to forgo reading all upstream commits,
> +potentially improving performance.
This reasoning is good. Could you also introduce a config option to make
--keep-cherry-pick the default? I would like to enable that option by
default in Scalar, but could also see partial clones wanting to enable that
by default, too.
> ++
> +See also INCOMPATIBLE OPTIONS below.
> +
Could we just say that his only applies with the --merge option? Why require
the jump to the end of the options section? (After writing this, I go look
at the rest of the doc file and see this is a common pattern.)
> --rerere-autoupdate::
> --no-rerere-autoupdate::
> Allow the rerere mechanism to update the index with the
> @@ -568,6 +583,9 @@ In addition, the following pairs of options are incompatible:
> * --keep-base and --onto
> * --keep-base and --root
>
> +Also, the --keep-cherry-pick option requires the use of the merge backend
> +(e.g., through --merge).
> +
Will the command _fail_ if someone says --keep-cherry-pick without the merge
backend, or just have no effect? Also, specify the option with ticks and
`--[no-]keep-cherry-pick`
It seems that none of the options in this section are back-ticked, which I think
is a doc bug.
> BEHAVIORAL DIFFERENCES
> -----------------------
>
> @@ -866,7 +884,8 @@ Only works if the changes (patch IDs based on the diff contents) on
> 'subsystem' did.
>
> In that case, the fix is easy because 'git rebase' knows to skip
> -changes that are already present in the new upstream. So if you say
> +changes that are already present in the new upstream (unless
> +`--keep-cherry-pick` is given). So if you say
> (assuming you're on 'topic')
> ------------
> $ git rebase subsystem
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 8081741f8a..626549b0b2 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -88,6 +88,7 @@ struct rebase_options {
> struct strbuf git_format_patch_opt;
> int reschedule_failed_exec;
> int use_legacy_rebase;
> + int keep_cherry_pick;
> };
>
> #define REBASE_OPTIONS_INIT { \
> @@ -381,6 +382,7 @@ static int run_rebase_interactive(struct rebase_options *opts,
> flags |= opts->rebase_cousins > 0 ? TODO_LIST_REBASE_COUSINS : 0;
> flags |= opts->root_with_onto ? TODO_LIST_ROOT_WITH_ONTO : 0;
> flags |= command == ACTION_SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
> + flags |= opts->keep_cherry_pick ? TODO_LIST_KEEP_CHERRY_PICK : 0;
Since opts->keep_cherry_pick is initialized as zero, did you change the default
behavior? Do we not have a test that verifies this behavior when using the merge
backend an no "--keep-cherry-pick" option?
If you initialize it to -1, then you can tell if the --no-keep-cherry-pick option
is specified, which is relevant to my concern below.
>
> switch (command) {
> case ACTION_NONE: {
> @@ -1515,6 +1517,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
> OPT_BOOL(0, "reschedule-failed-exec",
> &reschedule_failed_exec,
> N_("automatically re-schedule any `exec` that fails")),
> + OPT_BOOL(0, "keep-cherry-pick", &options.keep_cherry_pick,
> + N_("apply all changes, even those already present upstream")),
> OPT_END(),
> };
> int i;
> @@ -1848,6 +1852,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
> "interactive or merge options"));
> }
>
> + if (options.keep_cherry_pick && !is_interactive(&options))
> + die(_("--keep-cherry-pick does not work with the 'apply' backend"));
> +
I see you are failing here. Is this the right decision?
The apply backend will "keep" cherry-picks because it will not look for them upstream.
If anything, shouldn't it be that "--no-keep-cherry-pick" is incompatible?
> if (options.signoff) {
> if (options.type == REBASE_PRESERVE_MERGES)
> die("cannot combine '--signoff' with "
> diff --git a/sequencer.c b/sequencer.c
> index b9dbf1adb0..7bbb63f444 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4800,12 +4800,13 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
> int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
> const char *insn = flags & TODO_LIST_ABBREVIATE_CMDS ? "p" : "pick";
> int rebase_merges = flags & TODO_LIST_REBASE_MERGES;
> + int keep_cherry_pick = flags & TODO_LIST_KEEP_CHERRY_PICK;
>
> repo_init_revisions(r, &revs, NULL);
> revs.verbose_header = 1;
> if (!rebase_merges)
> revs.max_parents = 1;
> - revs.cherry_mark = 1;
> + revs.cherry_mark = !keep_cherry_pick;
> revs.limited = 1;
> revs.reverse = 1;
> revs.right_only = 1;
> diff --git a/sequencer.h b/sequencer.h
> index 9f9ae291e3..298b7de1c8 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -148,7 +148,7 @@ int sequencer_remove_state(struct replay_opts *opts);
> * `--onto`, we do not want to re-generate the root commits.
> */
> #define TODO_LIST_ROOT_WITH_ONTO (1U << 6)
> -
> +#define TODO_LIST_KEEP_CHERRY_PICK (1U << 7)
>
> int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
> const char **argv, unsigned flags);
> diff --git a/t/t3402-rebase-merge.sh b/t/t3402-rebase-merge.sh
> index a1ec501a87..64200c5f20 100755
> --- a/t/t3402-rebase-merge.sh
> +++ b/t/t3402-rebase-merge.sh
> @@ -162,4 +162,81 @@ test_expect_success 'rebase --skip works with two conflicts in a row' '
> git rebase --skip
> '
>
> +test_expect_success '--keep-cherry-pick' '
> + git init repo &&
> +
> + # O(1-10) -- O(1-11) -- O(0-10) master
> + # \
> + # -- O(1-11) -- O(1-12) otherbranch
> +
> + printf "Line %d\n" $(test_seq 1 10) >repo/file.txt &&
> + git -C repo add file.txt &&
> + git -C repo commit -m "base commit" &&
> +
> + printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
> + git -C repo commit -a -m "add 11" &&
> +
> + printf "Line %d\n" $(test_seq 0 10) >repo/file.txt &&
> + git -C repo commit -a -m "add 0 delete 11" &&
> +
> + git -C repo checkout -b otherbranch HEAD^^ &&
> + printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
> + git -C repo commit -a -m "add 11 in another branch" &&
> +
> + printf "Line %d\n" $(test_seq 1 12) >repo/file.txt &&
> + git -C repo commit -a -m "add 12 in another branch" &&
> +
> + # Regular rebase fails, because the 1-11 commit is deduplicated
> + test_must_fail git -C repo rebase --merge master 2> err &&
> + test_i18ngrep "error: could not apply.*add 12 in another branch" err &&
> + git -C repo rebase --abort &&
OK. So here you are demonstrating that the --no-keep-cherry-pick is the
new default. Just trying to be sure that this was intended.
> +
> + # With --keep-cherry-pick, it works
> + git -C repo rebase --merge --keep-cherry-pick master
> +'
> +
> +test_expect_success '--keep-cherry-pick refrains from reading unneeded blobs' '
> + git init server &&
> +
> + # O(1-10) -- O(1-11) -- O(1-12) master
> + # \
> + # -- O(0-10) otherbranch
> +
> + printf "Line %d\n" $(test_seq 1 10) >server/file.txt &&
> + git -C server add file.txt &&
> + git -C server commit -m "merge base" &&
> +
> + printf "Line %d\n" $(test_seq 1 11) >server/file.txt &&
> + git -C server commit -a -m "add 11" &&
> +
> + printf "Line %d\n" $(test_seq 1 12) >server/file.txt &&
> + git -C server commit -a -m "add 12" &&
> +
> + git -C server checkout -b otherbranch HEAD^^ &&
> + printf "Line %d\n" $(test_seq 0 10) >server/file.txt &&
> + git -C server commit -a -m "add 0" &&
> +
> + test_config -C server uploadpack.allowfilter 1 &&
> + test_config -C server uploadpack.allowanysha1inwant 1 &&
> +
> + git clone --filter=blob:none "file://$(pwd)/server" client &&
> + git -C client checkout origin/master &&
> + git -C client checkout origin/otherbranch &&
> +
> + # Sanity check to ensure that the blobs from the merge base and "add
> + # 11" are missing
> + git -C client rev-list --objects --all --missing=print >missing_list &&
> + MERGE_BASE_BLOB=$(git -C server rev-parse master^^:file.txt) &&
> + ADD_11_BLOB=$(git -C server rev-parse master^:file.txt) &&
> + grep "\\?$MERGE_BASE_BLOB" missing_list &&
> + grep "\\?$ADD_11_BLOB" missing_list &&
> +
> + git -C client rebase --merge --keep-cherry-pick origin/master &&
> +
> + # The blob from the merge base had to be fetched, but not "add 11"
> + git -C client rev-list --objects --all --missing=print >missing_list &&
> + ! grep "\\?$MERGE_BASE_BLOB" missing_list &&
> + grep "\\?$ADD_11_BLOB" missing_list
> +'
I appreciate this test showing that this is accomplishing the goal in
a partial clone.
Thanks,
-Stolee
next prev parent reply other threads:[~2020-03-30 12:13 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-09 20:55 [PATCH] rebase --merge: optionally skip upstreamed commits Jonathan Tan
2020-03-10 2:10 ` Taylor Blau
2020-03-10 15:51 ` Jonathan Tan
2020-03-10 12:17 ` Johannes Schindelin
2020-03-10 16:00 ` Jonathan Tan
2020-03-10 18:56 ` Elijah Newren
2020-03-10 22:56 ` Jonathan Tan
2020-03-12 18:04 ` Jonathan Tan
2020-03-12 22:40 ` Elijah Newren
2020-03-14 8:04 ` Elijah Newren
2020-03-17 3:03 ` Jonathan Tan
2020-03-18 17:30 ` [PATCH v2] " Jonathan Tan
2020-03-18 18:47 ` Junio C Hamano
2020-03-18 19:28 ` Jonathan Tan
2020-03-18 19:55 ` Junio C Hamano
2020-03-18 20:41 ` Elijah Newren
2020-03-18 23:39 ` Junio C Hamano
2020-03-19 0:17 ` Elijah Newren
2020-03-18 20:20 ` Junio C Hamano
2020-03-26 17:50 ` Jonathan Tan
2020-03-26 19:17 ` Elijah Newren
2020-03-26 19:27 ` Junio C Hamano
2020-03-29 10:12 ` [PATCH v2 4/4] t3402: use POSIX compliant regex(7) Đoàn Trần Công Danh
2020-03-30 4:06 ` [PATCH v3] rebase --merge: optionally skip upstreamed commits Jonathan Tan
2020-03-30 5:09 ` Junio C Hamano
2020-03-30 5:22 ` Danh Doan
2020-03-30 12:13 ` Derrick Stolee [this message]
2020-03-30 16:49 ` Junio C Hamano
2020-03-30 16:57 ` Jonathan Tan
2020-03-31 11:55 ` Derrick Stolee
2020-03-31 16:27 ` Elijah Newren
2020-03-31 18:34 ` Junio C Hamano
2020-03-31 18:43 ` Junio C Hamano
2020-04-10 22:27 ` Jonathan Tan
2020-04-11 0:06 ` Elijah Newren
2020-04-11 1:11 ` Jonathan Tan
2020-04-11 2:46 ` Elijah Newren
-- strict thread matches above, loose matches on Subject: below --
2020-03-26 7:35 [PATCH 0/3] add travis job for linux with musl libc Đoàn Trần Công Danh
2020-03-26 7:35 ` [PATCH 1/3] ci: libify logic for usage and checking CI_USER Đoàn Trần Công Danh
2020-03-26 7:35 ` [PATCH 2/3] ci: refactor docker runner script Đoàn Trần Công Danh
2020-03-26 16:06 ` Eric Sunshine
2020-03-28 17:53 ` SZEDER Gábor
2020-03-29 6:36 ` Danh Doan
2020-03-26 7:35 ` [PATCH 3/3] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
2020-03-29 5:49 ` [PATCH 0/3] add travis job for linux with musl libc Junio C Hamano
2020-03-29 10:12 ` [PATCH v2 0/4] Travis + Azure jobs " Đoàn Trần Công Danh
2020-03-29 10:12 ` [PATCH v2 1/4] ci: libify logic for usage and checking CI_USER Đoàn Trần Công Danh
2020-03-29 10:12 ` [PATCH v2 2/4] ci: refactor docker runner script Đoàn Trần Công Danh
2020-04-01 21:51 ` SZEDER Gábor
2020-03-29 10:12 ` [PATCH v2 3/4] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
2020-04-01 22:18 ` SZEDER Gábor
2020-04-02 1:42 ` Danh Doan
2020-04-07 14:53 ` Johannes Schindelin
2020-04-07 21:35 ` Junio C Hamano
2020-04-10 13:38 ` Johannes Schindelin
2020-03-29 16:23 ` [PATCH v2 0/4] Travis + Azure jobs for linux with musl libc Junio C Hamano
2020-04-02 13:03 ` [PATCH v3 0/6] " Đoàn Trần Công Danh
2020-04-02 13:04 ` [PATCH v3 1/6] ci: make MAKEFLAGS available inside the Docker container in the Linux32 job Đoàn Trần Công Danh
2020-04-02 13:04 ` [PATCH v3 2/6] ci/lib-docker: preserve required environment variables Đoàn Trần Công Danh
2020-04-03 8:22 ` SZEDER Gábor
2020-04-03 10:09 ` Danh Doan
2020-04-03 19:55 ` SZEDER Gábor
2020-04-02 13:04 ` [PATCH v3 3/6] ci/linux32: parameterise command to switch arch Đoàn Trần Công Danh
2020-04-02 13:04 ` [PATCH v3 4/6] ci: refactor docker runner script Đoàn Trần Công Danh
2020-04-02 13:04 ` [PATCH v3 5/6] ci/linux32: libify install-dependencies step Đoàn Trần Công Danh
2020-04-02 13:04 ` [PATCH v3 6/6] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
2020-04-02 17:53 ` [PATCH v3 0/6] Travis + Azure jobs for linux with musl libc Junio C Hamano
2020-04-03 0:23 ` Danh Doan
2020-04-04 1:08 ` [PATCH v4 0/6] Travis " Đoàn Trần Công Danh
2020-04-04 1:08 ` [PATCH v4 1/6] ci: make MAKEFLAGS available inside the Docker container in the Linux32 job Đoàn Trần Công Danh
2020-04-04 1:08 ` [PATCH v4 2/6] ci/lib-docker: preserve required environment variables Đoàn Trần Công Danh
2020-04-04 1:08 ` [PATCH v4 3/6] ci/linux32: parameterise command to switch arch Đoàn Trần Công Danh
2020-04-04 1:08 ` [PATCH v4 4/6] ci: refactor docker runner script Đoàn Trần Công Danh
2020-04-04 1:08 ` [PATCH v4 5/6] ci/linux32: libify install-dependencies step Đoàn Trần Công Danh
2020-04-04 1:08 ` [PATCH v4 6/6] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
2020-04-05 20:39 ` [PATCH v4 0/6] Travis jobs for linux with musl libc Junio C Hamano
2020-04-07 14:55 ` Johannes Schindelin
2020-04-07 19:25 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e917f451-c00a-c819-7f78-888ba6e8f5ea@gmail.com \
--to=stolee@gmail.com \
--cc=congdanhqx@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=newren@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).