From: Derrick Stolee <stolee@gmail.com>
To: Elijah Newren <newren@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH v2 03/20] merge-ort: port merge_start() from merge-recursive
Date: Wed, 11 Nov 2020 08:52:59 -0500
Message-ID: <3f1cd4b6-695d-7524-4cb7-7c31f370fe36@gmail.com>
In-Reply-To: <20201102204344.342633-4-newren@gmail.com>

On 11/2/2020 3:43 PM, Elijah Newren wrote:
> merge_start() basically does a bunch of sanity checks, then allocates
> and initializes opt->priv -- a struct merge_options_internal.
> 
> Most the sanity checks are usable as-is.  The allocation/intialization

s/Most the/Most of the/

> The weirdest part here is that merge-ort and merge-recursive use the
> same struct merge_options, even though merge_options has a number of
> fields that are oddly specific to merge-recursive's internal
> implementation and don't even make sense with merge-ort's high-level
> design (e.g. buffer_output, which merge-ort has to always do).  I reused
> the same data structure because:
>   * most the fields made sense to both merge algorithms
>   * making a new struct would have required making new enums or somehow
>     externalizing them, and that was getting messy.
>   * it simplifies converting the existing callers by not having to
>     have different code paths for merge_options setup.

I think this is appropriate. The other option would be to split the
struct into "common options" and "specific options", but that starts
to get messy if we add yet another merge strategy that changes what
should be "common". Could we instead group related options within the
struct merge_options definition to make the split visible?

For now, the assertions are a good approach.
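
To illustrate the grouping idea together with the assertion-style
checks, here is a minimal sketch. Everything in it is hypothetical:
struct example_merge_options is a trimmed-down stand-in, not the real
struct merge_options, and EXAMPLE_MAX_SCORE and EXAMPLE_DETECT_MAX are
placeholder bounds rather than git's actual constants.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bounds for illustration only (not git's real values). */
#define EXAMPLE_MAX_SCORE 60000
#define EXAMPLE_DETECT_MAX 2

/* Hypothetical stand-in for struct merge_options, grouped by user. */
struct example_merge_options {
	/* Fields that make sense to both merge backends. */
	const char *branch1;
	const char *branch2;
	int rename_limit;
	int rename_score;

	/* Fields only the "recursive" backend acts on; "ort" ignores them. */
	int detect_renames;
	int verbosity;
	unsigned buffer_output;
};

/* Returns 1 if every field is within its expected range, 0 otherwise. */
static int example_options_sane(const struct example_merge_options *opt)
{
	/* Checks on the shared fields. */
	if (!opt->branch1 || !opt->branch2)
		return 0;
	if (opt->rename_limit < -1)
		return 0;
	if (opt->rename_score < 0 || opt->rename_score > EXAMPLE_MAX_SCORE)
		return 0;

	/* Ignored fields are still range-checked to catch caller bugs. */
	if (opt->detect_renames < -1 ||
	    opt->detect_renames > EXAMPLE_DETECT_MAX)
		return 0;
	if (opt->verbosity < 0 || opt->verbosity > 5)
		return 0;
	if (opt->buffer_output > 2)
		return 0;
	return 1;
}
```

A caller could wrap this as assert(example_options_sane(opt)) at the
start of a merge. Grouping the ignored fields under one comment keeps
the "common" versus "backend-specific" split readable without
committing to two separate structs.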

> I also marked detect_renames as ignored.  We can revisit that later, but
> in short: merge-recursive allowed turning off rename detection because
> it was sometimes glacially slow.  When you speed something up by a few
> orders of magnitude, it's worth revisiting whether that justification is
> still relevant.  Besides, if folks find it's still too slow, perhaps
> they have a better scaling case than I could find and maybe it turns up
> some more optimizations we can add.  If it still is needed as an option,
> it is easy to add later.

As long as it is easy to add later, I don't see much of a problem. Usually
a knob to disable a feature is needed to mitigate risk, but here we can
simply adjust config to fall back to the non-ort algorithm if we notice a
data shape where rename detection makes ort slow or unusable.

>  static void merge_start(struct merge_options *opt, struct merge_result *result)
>  {
> -	die("Not yet implemented.");
> +	/* Sanity checks on opt */
> +	assert(opt->repo);
> +
> +	assert(opt->branch1 && opt->branch2);
> +
> +	assert(opt->detect_directory_renames >= MERGE_DIRECTORY_RENAMES_NONE &&
> +	       opt->detect_directory_renames <= MERGE_DIRECTORY_RENAMES_TRUE);
> +	assert(opt->rename_limit >= -1);
> +	assert(opt->rename_score >= 0 && opt->rename_score <= MAX_SCORE);
> +	assert(opt->show_rename_progress >= 0 && opt->show_rename_progress <= 1);
> +
> +	assert(opt->xdl_opts >= 0);
> +	assert(opt->recursive_variant >= MERGE_VARIANT_NORMAL &&
> +	       opt->recursive_variant <= MERGE_VARIANT_THEIRS);
> +
> +	/*
> +	 * detect_renames, verbosity, buffer_output, and obuf are ignored
> +	 * fields that were used by "recursive" rather than "ort" -- but
> +	 * sanity check them anyway.
> +	 */
> +	assert(opt->detect_renames >= -1 &&
> +	       opt->detect_renames <= DIFF_DETECT_COPY);
> +	assert(opt->verbosity >= 0 && opt->verbosity <= 5);
> +	assert(opt->buffer_output <= 2);
> +	assert(opt->obuf.len == 0);
> +
> +	assert(opt->priv == NULL);
> +
> +	/* Initialization of opt->priv, our internal merge data */
> +	opt->priv = xcalloc(1, sizeof(*opt->priv));

nit: I would insert an empty line between this code and the
multi-line comment below.

> +	/*
> +	 * Although we initialize opt->priv->paths with strdup_strings=0,
> +	 * that's just to avoid making yet another copy of an allocated
> +	 * string.  Putting the entry into paths means we are taking
> +	 * ownership, so we will later free it.
> +	 *
> +	 * In contrast, unmerged just has a subset of keys from paths, so
> +	 * we don't want to free those (it'd be a duplicate free).
> +	 */
> +	strmap_init_with_options(&opt->priv->paths, NULL, 0);
> +	strmap_init_with_options(&opt->priv->unmerged, NULL, 0);
>  }

This approach looks fine to me.
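
The ownership rule described in the quoted comment (one container owns
its keys, a second one only borrows a subset of them) can be sketched
outside of git as follows. This is a toy under stated assumptions:
struct key_list and its helpers are hypothetical and are not the strmap
API; they only show why the borrowing container must not free keys.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy container; it stores key pointers without copying. */
struct key_list {
	char **keys;
	size_t nr, alloc;
};

static void key_list_add(struct key_list *l, char *key)
{
	if (l->nr == l->alloc) {
		size_t new_alloc = l->alloc ? 2 * l->alloc : 4;
		char **grown = realloc(l->keys, new_alloc * sizeof(*grown));
		if (!grown)
			abort();
		l->keys = grown;
		l->alloc = new_alloc;
	}
	l->keys[l->nr++] = key; /* keeps the caller's pointer as-is */
}

/* free_keys is 1 for the owning list and 0 for a borrowing list. */
static void key_list_clear(struct key_list *l, int free_keys)
{
	size_t i;

	if (free_keys)
		for (i = 0; i < l->nr; i++)
			free(l->keys[i]);
	free(l->keys);
	memset(l, 0, sizeof(*l));
}

/* Allocate a private copy of s; the caller decides who owns it. */
static char *dup_key(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/* Returns how many keys the borrowing list held before cleanup. */
static size_t example_ownership(void)
{
	struct key_list paths = { NULL, 0, 0 };
	struct key_list unmerged = { NULL, 0, 0 };
	char *a = dup_key("dir/a.c");
	char *b = dup_key("dir/b.c");
	size_t borrowed;

	key_list_add(&paths, a);    /* paths takes ownership of a and b */
	key_list_add(&paths, b);
	key_list_add(&unmerged, b); /* same pointer, borrowed from paths */
	borrowed = unmerged.nr;

	key_list_clear(&unmerged, 0); /* borrower: release the array only */
	key_list_clear(&paths, 1);    /* owner: free each key exactly once */
	return borrowed;
}
```

Passing 1 to both key_list_clear() calls would free "dir/b.c" twice,
which is the duplicate free the comment warns about.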

Thanks,
-Stolee
