git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
@ 2023-02-23  9:14 Elijah Newren via GitGitGadget
  2023-02-23  9:14 ` [PATCH 01/11] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
                   ` (13 more replies)
  0 siblings, 14 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

I wrote this patch series about a year and a half ago, but never submitted
it. I rebased and updated it due to [0].

Some time ago, I noticed that struct dir_struct and struct
unpack_trees_options both have numerous fields meant for internal use only,
most of which are not marked as such. This has resulted in callers
accidentally trying to initialize some of these fields, and in at least one
case required a fair amount of review to verify other changes were okay --
review that would have been simplified with the a priori knowledge that a
combination of multiple fields was internal-only[1]. Looking closer, I
found that only 6 out of 18 fields in dir_struct were actually meant to be
public[2], and noted that unpack_trees_options also had 11 internal-only
fields (out of 36).

This patch series is primarily about moving internal-only fields within
these two structs into an embedded internal struct. Patch breakdown:

 * Patches 1-3: Restructuring dir_struct
   * Patch 1: Splitting off internal-use-only fields
   * Patch 2: Add important usage note to avoid accidentally using
     deprecated API
   * Patch 3: Mark output-only fields as such
 * Patches 4-11: Restructuring unpack_trees_options
   * Patches 4-6: Preparatory cleanup
   * Patches 7-10: Splitting off internal-use-only fields
   * Patch 11: Mark output-only field as such

To make the benefit clearer, here are compressed versions of dir_struct
both before and after the changes. First, before:

struct dir_struct {
    int nr;
    int alloc;
    int ignored_nr;
    int ignored_alloc;
    enum [...] flags;
    struct dir_entry **entries;
    struct dir_entry **ignored;
    const char *exclude_per_dir;
#define EXC_CMDL 0
#define EXC_DIRS 1
#define EXC_FILE 2
    struct exclude_list_group exclude_list_group[3];
    struct exclude_stack *exclude_stack;
    struct path_pattern *pattern;
    struct strbuf basebuf;
    struct untracked_cache *untracked;
    struct oid_stat ss_info_exclude;
    struct oid_stat ss_excludes_file;
    unsigned unmanaged_exclude_files;
    unsigned visited_paths;
    unsigned visited_directories;
};


And after the changes:

struct dir_struct {
    enum [...] flags;
    int nr; /* output only */
    int ignored_nr; /* output only */
    struct dir_entry **entries; /* output only */
    struct dir_entry **ignored; /* output only */
    struct untracked_cache *untracked;
    const char *exclude_per_dir; /* deprecated */
    struct dir_struct_internal {
        int alloc;
        int ignored_alloc;
#define EXC_CMDL 0
#define EXC_DIRS 1
#define EXC_FILE 2
        struct exclude_list_group exclude_list_group[3];
        struct exclude_stack *exclude_stack;
        struct path_pattern *pattern;
        struct strbuf basebuf;
        struct oid_stat ss_info_exclude;
        struct oid_stat ss_excludes_file;
        unsigned unmanaged_exclude_files;
        unsigned visited_paths;
        unsigned visited_directories;
    } internal;
};


The former version has 18 fields (and 3 magic constants) which API users
will have to figure out. The latter makes it clear that there are at most 2
fields you should set on input and at most 4 that you read on output; the
rest (including all the magic constants) you can ignore.
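
As a rough illustration of the layout described above, here is a minimal
sketch using a hypothetical name_list struct (not git's real dir_struct).
The names name_list, name_list_add and name_list_clear are assumptions made
up for this example; only the idea of an embedded "internal" member comes
from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical miniature of the layout; not git's real dir_struct. */
struct name_list {
	unsigned flags;        /* input: set by the caller */
	size_t nr;             /* output only */
	const char **entries;  /* output only */

	struct name_list_internal {
		size_t alloc;  /* capacity of entries[]; callers never touch it */
	} internal;
};

#define NAME_LIST_INIT { 0 }

/* Append a name, growing entries[] via the internal bookkeeping field. */
void name_list_add(struct name_list *list, const char *name)
{
	assert(list && name);
	if (list->nr == list->internal.alloc) {
		size_t new_alloc = list->internal.alloc ? 2 * list->internal.alloc : 4;
		const char **grown = realloc(list->entries, new_alloc * sizeof(*grown));
		if (!grown)
			abort();
		list->entries = grown;
		list->internal.alloc = new_alloc;
	}
	list->entries[list->nr++] = name;
}

/* Free the array and reset every field, public and internal alike. */
void name_list_clear(struct name_list *list)
{
	struct name_list blank = NAME_LIST_INIT;
	free(list->entries);
	*list = blank;
}
```

A caller of this sketch would only ever write list.flags and read list.nr
and list.entries; anything spelled list.internal.* in calling code stands
out in review as a likely mistake.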

[0] Search for "Extremely yes" in
https://lore.kernel.org/git/CAJoAoZm+TkCL0Jpg_qFgKottxbtiG2QOiY0qGrz3-uQy+=waPg@mail.gmail.com/
[1]
https://lore.kernel.org/git/CABPp-BFSFN3WM6q7KzkD5mhrwsz--St_-ej5LbaY8Yr2sZzj=w@mail.gmail.com/
[2]
https://lore.kernel.org/git/CABPp-BHgot=CPNyK_xNfog_SqsNPNoCGfiSb-gZoS2sn_741dQ@mail.gmail.com/

Elijah Newren (11):
  dir: separate public from internal portion of dir_struct
  dir: add a usage note to exclude_per_dir
  dir: mark output only fields of dir_struct as such
  unpack-trees: clean up some flow control
  sparse-checkout: avoid using internal API of unpack-trees
  sparse-checkout: avoid using internal API of unpack-trees, take 2
  unpack_trees: start splitting internal fields from public API
  unpack-trees: mark fields only used internally as internal
  unpack-trees: rewrap a few overlong lines from previous patch
  unpack-trees: special case read-tree debugging as internal usage
  unpack-trees: add usage notices around df_conflict_entry

 builtin/read-tree.c       |  10 +-
 builtin/sparse-checkout.c |   4 +-
 dir.c                     | 114 +++++++++---------
 dir.h                     | 110 +++++++++--------
 unpack-trees.c            | 245 ++++++++++++++++++++------------------
 unpack-trees.h            |  42 ++++---
 6 files changed, 275 insertions(+), 250 deletions(-)


base-commit: 06dd2baa8da4a73421b959ec026a43711b9d77f9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1149%2Fnewren%2Fclarify-api-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1149/newren/clarify-api-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1149
-- 
gitgitgadget


* [PATCH 01/11] dir: separate public from internal portion of dir_struct
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
@ 2023-02-23  9:14 ` Elijah Newren via GitGitGadget
  2023-02-23  9:14 ` [PATCH 02/11] dir: add a usage note to exclude_per_dir Elijah Newren via GitGitGadget
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In order to make it clearer to callers what portions of dir_struct are
public API, and avoid errors from them setting fields that are meant as
internal API, split the fields used for internal implementation reasons
into a separate embedded struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.c | 114 +++++++++++++++++++++++++++++-----------------------------
 dir.h |  86 +++++++++++++++++++++++---------------------
 2 files changed, 104 insertions(+), 96 deletions(-)

diff --git a/dir.c b/dir.c
index 4e99f0c868f..7adf242026e 100644
--- a/dir.c
+++ b/dir.c
@@ -1190,7 +1190,7 @@ struct pattern_list *add_pattern_list(struct dir_struct *dir,
 	struct pattern_list *pl;
 	struct exclude_list_group *group;
 
-	group = &dir->exclude_list_group[group_type];
+	group = &dir->internal.exclude_list_group[group_type];
 	ALLOC_GROW(group->pl, group->nr + 1, group->alloc);
 	pl = &group->pl[group->nr++];
 	memset(pl, 0, sizeof(*pl));
@@ -1211,7 +1211,7 @@ static void add_patterns_from_file_1(struct dir_struct *dir, const char *fname,
 	 * differently when dir->untracked is non-NULL.
 	 */
 	if (!dir->untracked)
-		dir->unmanaged_exclude_files++;
+		dir->internal.unmanaged_exclude_files++;
 	pl = add_pattern_list(dir, EXC_FILE, fname);
 	if (add_patterns(fname, "", 0, pl, NULL, 0, oid_stat) < 0)
 		die(_("cannot use %s as an exclude file"), fname);
@@ -1219,7 +1219,7 @@ static void add_patterns_from_file_1(struct dir_struct *dir, const char *fname,
 
 void add_patterns_from_file(struct dir_struct *dir, const char *fname)
 {
-	dir->unmanaged_exclude_files++; /* see validate_untracked_cache() */
+	dir->internal.unmanaged_exclude_files++; /* see validate_untracked_cache() */
 	add_patterns_from_file_1(dir, fname, NULL);
 }
 
@@ -1519,7 +1519,7 @@ static struct path_pattern *last_matching_pattern_from_lists(
 	struct exclude_list_group *group;
 	struct path_pattern *pattern;
 	for (i = EXC_CMDL; i <= EXC_FILE; i++) {
-		group = &dir->exclude_list_group[i];
+		group = &dir->internal.exclude_list_group[i];
 		for (j = group->nr - 1; j >= 0; j--) {
 			pattern = last_matching_pattern_from_list(
 				pathname, pathlen, basename, dtype_p,
@@ -1545,20 +1545,20 @@ static void prep_exclude(struct dir_struct *dir,
 	struct untracked_cache_dir *untracked;
 	int current;
 
-	group = &dir->exclude_list_group[EXC_DIRS];
+	group = &dir->internal.exclude_list_group[EXC_DIRS];
 
 	/*
 	 * Pop the exclude lists from the EXCL_DIRS exclude_list_group
 	 * which originate from directories not in the prefix of the
 	 * path being checked.
 	 */
-	while ((stk = dir->exclude_stack) != NULL) {
+	while ((stk = dir->internal.exclude_stack) != NULL) {
 		if (stk->baselen <= baselen &&
-		    !strncmp(dir->basebuf.buf, base, stk->baselen))
+		    !strncmp(dir->internal.basebuf.buf, base, stk->baselen))
 			break;
-		pl = &group->pl[dir->exclude_stack->exclude_ix];
-		dir->exclude_stack = stk->prev;
-		dir->pattern = NULL;
+		pl = &group->pl[dir->internal.exclude_stack->exclude_ix];
+		dir->internal.exclude_stack = stk->prev;
+		dir->internal.pattern = NULL;
 		free((char *)pl->src); /* see strbuf_detach() below */
 		clear_pattern_list(pl);
 		free(stk);
@@ -1566,7 +1566,7 @@ static void prep_exclude(struct dir_struct *dir,
 	}
 
 	/* Skip traversing into sub directories if the parent is excluded */
-	if (dir->pattern)
+	if (dir->internal.pattern)
 		return;
 
 	/*
@@ -1574,12 +1574,12 @@ static void prep_exclude(struct dir_struct *dir,
 	 * memset(dir, 0, sizeof(*dir)) before use. Changing all of
 	 * them seems lots of work for little benefit.
 	 */
-	if (!dir->basebuf.buf)
-		strbuf_init(&dir->basebuf, PATH_MAX);
+	if (!dir->internal.basebuf.buf)
+		strbuf_init(&dir->internal.basebuf, PATH_MAX);
 
 	/* Read from the parent directories and push them down. */
 	current = stk ? stk->baselen : -1;
-	strbuf_setlen(&dir->basebuf, current < 0 ? 0 : current);
+	strbuf_setlen(&dir->internal.basebuf, current < 0 ? 0 : current);
 	if (dir->untracked)
 		untracked = stk ? stk->ucd : dir->untracked->root;
 	else
@@ -1599,32 +1599,33 @@ static void prep_exclude(struct dir_struct *dir,
 				die("oops in prep_exclude");
 			cp++;
 			untracked =
-				lookup_untracked(dir->untracked, untracked,
+				lookup_untracked(dir->untracked,
+						 untracked,
 						 base + current,
 						 cp - base - current);
 		}
-		stk->prev = dir->exclude_stack;
+		stk->prev = dir->internal.exclude_stack;
 		stk->baselen = cp - base;
 		stk->exclude_ix = group->nr;
 		stk->ucd = untracked;
 		pl = add_pattern_list(dir, EXC_DIRS, NULL);
-		strbuf_add(&dir->basebuf, base + current, stk->baselen - current);
-		assert(stk->baselen == dir->basebuf.len);
+		strbuf_add(&dir->internal.basebuf, base + current, stk->baselen - current);
+		assert(stk->baselen == dir->internal.basebuf.len);
 
 		/* Abort if the directory is excluded */
 		if (stk->baselen) {
 			int dt = DT_DIR;
-			dir->basebuf.buf[stk->baselen - 1] = 0;
-			dir->pattern = last_matching_pattern_from_lists(dir,
+			dir->internal.basebuf.buf[stk->baselen - 1] = 0;
+			dir->internal.pattern = last_matching_pattern_from_lists(dir,
 									istate,
-				dir->basebuf.buf, stk->baselen - 1,
-				dir->basebuf.buf + current, &dt);
-			dir->basebuf.buf[stk->baselen - 1] = '/';
-			if (dir->pattern &&
-			    dir->pattern->flags & PATTERN_FLAG_NEGATIVE)
-				dir->pattern = NULL;
-			if (dir->pattern) {
-				dir->exclude_stack = stk;
+				dir->internal.basebuf.buf, stk->baselen - 1,
+				dir->internal.basebuf.buf + current, &dt);
+			dir->internal.basebuf.buf[stk->baselen - 1] = '/';
+			if (dir->internal.pattern &&
+			    dir->internal.pattern->flags & PATTERN_FLAG_NEGATIVE)
+				dir->internal.pattern = NULL;
+			if (dir->internal.pattern) {
+				dir->internal.exclude_stack = stk;
 				return;
 			}
 		}
@@ -1647,15 +1648,15 @@ static void prep_exclude(struct dir_struct *dir,
 		      */
 		     !is_null_oid(&untracked->exclude_oid))) {
 			/*
-			 * dir->basebuf gets reused by the traversal, but we
-			 * need fname to remain unchanged to ensure the src
-			 * member of each struct path_pattern correctly
+			 * dir->internal.basebuf gets reused by the traversal,
+			 * but we need fname to remain unchanged to ensure the
+			 * src member of each struct path_pattern correctly
 			 * back-references its source file.  Other invocations
 			 * of add_pattern_list provide stable strings, so we
 			 * strbuf_detach() and free() here in the caller.
 			 */
 			struct strbuf sb = STRBUF_INIT;
-			strbuf_addbuf(&sb, &dir->basebuf);
+			strbuf_addbuf(&sb, &dir->internal.basebuf);
 			strbuf_addstr(&sb, dir->exclude_per_dir);
 			pl->src = strbuf_detach(&sb, NULL);
 			add_patterns(pl->src, pl->src, stk->baselen, pl, istate,
@@ -1681,10 +1682,10 @@ static void prep_exclude(struct dir_struct *dir,
 			invalidate_gitignore(dir->untracked, untracked);
 			oidcpy(&untracked->exclude_oid, &oid_stat.oid);
 		}
-		dir->exclude_stack = stk;
+		dir->internal.exclude_stack = stk;
 		current = stk->baselen;
 	}
-	strbuf_setlen(&dir->basebuf, baselen);
+	strbuf_setlen(&dir->internal.basebuf, baselen);
 }
 
 /*
@@ -1704,8 +1705,8 @@ struct path_pattern *last_matching_pattern(struct dir_struct *dir,
 
 	prep_exclude(dir, istate, pathname, basename-pathname);
 
-	if (dir->pattern)
-		return dir->pattern;
+	if (dir->internal.pattern)
+		return dir->internal.pattern;
 
 	return last_matching_pattern_from_lists(dir, istate, pathname, pathlen,
 			basename, dtype_p);
@@ -1742,7 +1743,7 @@ static struct dir_entry *dir_add_name(struct dir_struct *dir,
 	if (index_file_exists(istate, pathname, len, ignore_case))
 		return NULL;
 
-	ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);
+	ALLOC_GROW(dir->entries, dir->nr+1, dir->internal.alloc);
 	return dir->entries[dir->nr++] = dir_entry_new(pathname, len);
 }
 
@@ -1753,7 +1754,7 @@ struct dir_entry *dir_add_ignored(struct dir_struct *dir,
 	if (!index_name_is_other(istate, pathname, len))
 		return NULL;
 
-	ALLOC_GROW(dir->ignored, dir->ignored_nr+1, dir->ignored_alloc);
+	ALLOC_GROW(dir->ignored, dir->ignored_nr+1, dir->internal.ignored_alloc);
 	return dir->ignored[dir->ignored_nr++] = dir_entry_new(pathname, len);
 }
 
@@ -2569,7 +2570,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 
 	if (open_cached_dir(&cdir, dir, untracked, istate, &path, check_only))
 		goto out;
-	dir->visited_directories++;
+	dir->internal.visited_directories++;
 
 	if (untracked)
 		untracked->check_only = !!check_only;
@@ -2578,7 +2579,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* check how the file or directory should be treated */
 		state = treat_path(dir, untracked, &cdir, istate, &path,
 				   baselen, pathspec);
-		dir->visited_paths++;
+		dir->internal.visited_paths++;
 
 		if (state > dir_state)
 			dir_state = state;
@@ -2586,7 +2587,8 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* recurse into subdir if instructed by treat_path */
 		if (state == path_recurse) {
 			struct untracked_cache_dir *ud;
-			ud = lookup_untracked(dir->untracked, untracked,
+			ud = lookup_untracked(dir->untracked,
+					      untracked,
 					      path.buf + baselen,
 					      path.len - baselen);
 			subdir_state =
@@ -2846,7 +2848,7 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
 	 * condition also catches running setup_standard_excludes()
 	 * before setting dir->untracked!
 	 */
-	if (dir->unmanaged_exclude_files)
+	if (dir->internal.unmanaged_exclude_files)
 		return NULL;
 
 	/*
@@ -2875,7 +2877,7 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
 	 * EXC_CMDL is not considered in the cache. If people set it,
 	 * skip the cache.
 	 */
-	if (dir->exclude_list_group[EXC_CMDL].nr)
+	if (dir->internal.exclude_list_group[EXC_CMDL].nr)
 		return NULL;
 
 	if (!ident_in_untracked(dir->untracked)) {
@@ -2935,15 +2937,15 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
 
 	/* Validate $GIT_DIR/info/exclude and core.excludesfile */
 	root = dir->untracked->root;
-	if (!oideq(&dir->ss_info_exclude.oid,
+	if (!oideq(&dir->internal.ss_info_exclude.oid,
 		   &dir->untracked->ss_info_exclude.oid)) {
 		invalidate_gitignore(dir->untracked, root);
-		dir->untracked->ss_info_exclude = dir->ss_info_exclude;
+		dir->untracked->ss_info_exclude = dir->internal.ss_info_exclude;
 	}
-	if (!oideq(&dir->ss_excludes_file.oid,
+	if (!oideq(&dir->internal.ss_excludes_file.oid,
 		   &dir->untracked->ss_excludes_file.oid)) {
 		invalidate_gitignore(dir->untracked, root);
-		dir->untracked->ss_excludes_file = dir->ss_excludes_file;
+		dir->untracked->ss_excludes_file = dir->internal.ss_excludes_file;
 	}
 
 	/* Make sure this directory is not dropped out at saving phase */
@@ -2969,9 +2971,9 @@ static void emit_traversal_statistics(struct dir_struct *dir,
 	}
 
 	trace2_data_intmax("read_directory", repo,
-			   "directories-visited", dir->visited_directories);
+			   "directories-visited", dir->internal.visited_directories);
 	trace2_data_intmax("read_directory", repo,
-			   "paths-visited", dir->visited_paths);
+			   "paths-visited", dir->internal.visited_paths);
 
 	if (!dir->untracked)
 		return;
@@ -2993,8 +2995,8 @@ int read_directory(struct dir_struct *dir, struct index_state *istate,
 	struct untracked_cache_dir *untracked;
 
 	trace2_region_enter("dir", "read_directory", istate->repo);
-	dir->visited_paths = 0;
-	dir->visited_directories = 0;
+	dir->internal.visited_paths = 0;
+	dir->internal.visited_directories = 0;
 
 	if (has_symlink_leading_path(path, len)) {
 		trace2_region_leave("dir", "read_directory", istate->repo);
@@ -3342,14 +3344,14 @@ void setup_standard_excludes(struct dir_struct *dir)
 		excludes_file = xdg_config_home("ignore");
 	if (excludes_file && !access_or_warn(excludes_file, R_OK, 0))
 		add_patterns_from_file_1(dir, excludes_file,
-					 dir->untracked ? &dir->ss_excludes_file : NULL);
+					 dir->untracked ? &dir->internal.ss_excludes_file : NULL);
 
 	/* per repository user preference */
 	if (startup_info->have_repository) {
 		const char *path = git_path_info_exclude();
 		if (!access_or_warn(path, R_OK, 0))
 			add_patterns_from_file_1(dir, path,
-						 dir->untracked ? &dir->ss_info_exclude : NULL);
+						 dir->untracked ? &dir->internal.ss_info_exclude : NULL);
 	}
 }
 
@@ -3405,7 +3407,7 @@ void dir_clear(struct dir_struct *dir)
 	struct dir_struct new = DIR_INIT;
 
 	for (i = EXC_CMDL; i <= EXC_FILE; i++) {
-		group = &dir->exclude_list_group[i];
+		group = &dir->internal.exclude_list_group[i];
 		for (j = 0; j < group->nr; j++) {
 			pl = &group->pl[j];
 			if (i == EXC_DIRS)
@@ -3422,13 +3424,13 @@ void dir_clear(struct dir_struct *dir)
 	free(dir->ignored);
 	free(dir->entries);
 
-	stk = dir->exclude_stack;
+	stk = dir->internal.exclude_stack;
 	while (stk) {
 		struct exclude_stack *prev = stk->prev;
 		free(stk);
 		stk = prev;
 	}
-	strbuf_release(&dir->basebuf);
+	strbuf_release(&dir->internal.basebuf);
 
 	memcpy(dir, &new, sizeof(*dir));
 }
diff --git a/dir.h b/dir.h
index 8acfc044181..33fd848fc8d 100644
--- a/dir.h
+++ b/dir.h
@@ -215,14 +215,9 @@ struct dir_struct {
 	/* The number of members in `entries[]` array. */
 	int nr;
 
-	/* Internal use; keeps track of allocation of `entries[]` array.*/
-	int alloc;
-
 	/* The number of members in `ignored[]` array. */
 	int ignored_nr;
 
-	int ignored_alloc;
-
 	/* bit-field of options */
 	enum {
 
@@ -296,51 +291,62 @@ struct dir_struct {
 	 */
 	struct dir_entry **ignored;
 
+	/* Enable/update untracked file cache if set */
+	struct untracked_cache *untracked;
+
 	/**
 	 * The name of the file to be read in each directory for excluded files
 	 * (typically `.gitignore`).
 	 */
 	const char *exclude_per_dir;
 
-	/*
-	 * We maintain three groups of exclude pattern lists:
-	 *
-	 * EXC_CMDL lists patterns explicitly given on the command line.
-	 * EXC_DIRS lists patterns obtained from per-directory ignore files.
-	 * EXC_FILE lists patterns from fallback ignore files, e.g.
-	 *   - .git/info/exclude
-	 *   - core.excludesfile
-	 *
-	 * Each group contains multiple exclude lists, a single list
-	 * per source.
-	 */
+	struct dir_struct_internal {
+		/* Keeps track of allocation of `entries[]` array.*/
+		int alloc;
+
+		/* Keeps track of allocation of `ignored[]` array. */
+		int ignored_alloc;
+
+		/*
+		 * We maintain three groups of exclude pattern lists:
+		 *
+		 * EXC_CMDL lists patterns explicitly given on the command line.
+		 * EXC_DIRS lists patterns obtained from per-directory ignore
+		 *          files.
+		 * EXC_FILE lists patterns from fallback ignore files, e.g.
+		 *   - .git/info/exclude
+		 *   - core.excludesfile
+		 *
+		 * Each group contains multiple exclude lists, a single list
+		 * per source.
+		 */
 #define EXC_CMDL 0
 #define EXC_DIRS 1
 #define EXC_FILE 2
-	struct exclude_list_group exclude_list_group[3];
-
-	/*
-	 * Temporary variables which are used during loading of the
-	 * per-directory exclude lists.
-	 *
-	 * exclude_stack points to the top of the exclude_stack, and
-	 * basebuf contains the full path to the current
-	 * (sub)directory in the traversal. Exclude points to the
-	 * matching exclude struct if the directory is excluded.
-	 */
-	struct exclude_stack *exclude_stack;
-	struct path_pattern *pattern;
-	struct strbuf basebuf;
-
-	/* Enable untracked file cache if set */
-	struct untracked_cache *untracked;
-	struct oid_stat ss_info_exclude;
-	struct oid_stat ss_excludes_file;
-	unsigned unmanaged_exclude_files;
+		struct exclude_list_group exclude_list_group[3];
 
-	/* Stats about the traversal */
-	unsigned visited_paths;
-	unsigned visited_directories;
+		/*
+		 * Temporary variables which are used during loading of the
+		 * per-directory exclude lists.
+		 *
+		 * exclude_stack points to the top of the exclude_stack, and
+		 * basebuf contains the full path to the current
+		 * (sub)directory in the traversal. Exclude points to the
+		 * matching exclude struct if the directory is excluded.
+		 */
+		struct exclude_stack *exclude_stack;
+		struct path_pattern *pattern;
+		struct strbuf basebuf;
+
+		/* Additional metadata related to 'untracked' */
+		struct oid_stat ss_info_exclude;
+		struct oid_stat ss_excludes_file;
+		unsigned unmanaged_exclude_files;
+
+		/* Stats about the traversal */
+		unsigned visited_paths;
+		unsigned visited_directories;
+	} internal;
 };
 
 #define DIR_INIT { 0 }
-- 
gitgitgadget



* [PATCH 02/11] dir: add a usage note to exclude_per_dir
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
  2023-02-23  9:14 ` [PATCH 01/11] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
@ 2023-02-23  9:14 ` Elijah Newren via GitGitGadget
  2023-02-24 22:31   ` Jonathan Tan
  2023-02-23  9:14 ` [PATCH 03/11] dir: mark output only fields of dir_struct as such Elijah Newren via GitGitGadget
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/dir.h b/dir.h
index 33fd848fc8d..2196e12630c 100644
--- a/dir.h
+++ b/dir.h
@@ -295,8 +295,12 @@ struct dir_struct {
 	struct untracked_cache *untracked;
 
 	/**
-	 * The name of the file to be read in each directory for excluded files
-	 * (typically `.gitignore`).
+	 * Deprecated: ls-files is the only allowed caller; all other callers
+	 * should leave this as NULL; it pre-dated the
+	 * setup_standard_excludes() mechanism that replaces this.
+	 *
+	 * This field tracks the name of the file to be read in each directory
+	 * for excluded files (typically `.gitignore`).
 	 */
 	const char *exclude_per_dir;
 
-- 
gitgitgadget



* [PATCH 03/11] dir: mark output only fields of dir_struct as such
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
  2023-02-23  9:14 ` [PATCH 01/11] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
  2023-02-23  9:14 ` [PATCH 02/11] dir: add a usage note to exclude_per_dir Elijah Newren via GitGitGadget
@ 2023-02-23  9:14 ` Elijah Newren via GitGitGadget
  2023-02-23  9:14 ` [PATCH 04/11] unpack-trees: clean up some flow control Elijah Newren via GitGitGadget
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

While at it, also group these fields together for convenience.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/dir.h b/dir.h
index 2196e12630c..e8106e1ecac 100644
--- a/dir.h
+++ b/dir.h
@@ -212,12 +212,6 @@ struct untracked_cache {
  */
 struct dir_struct {
 
-	/* The number of members in `entries[]` array. */
-	int nr;
-
-	/* The number of members in `ignored[]` array. */
-	int ignored_nr;
-
 	/* bit-field of options */
 	enum {
 
@@ -282,14 +276,20 @@ struct dir_struct {
 		DIR_SKIP_NESTED_GIT = 1<<9
 	} flags;
 
+	/* The number of members in `entries[]` array. */
+	int nr; /* output only */
+
+	/* The number of members in `ignored[]` array. */
+	int ignored_nr; /* output only */
+
 	/* An array of `struct dir_entry`, each element of which describes a path. */
-	struct dir_entry **entries;
+	struct dir_entry **entries; /* output only */
 
 	/**
 	 * used for ignored paths with the `DIR_SHOW_IGNORED_TOO` and
 	 * `DIR_COLLECT_IGNORED` flags.
 	 */
-	struct dir_entry **ignored;
+	struct dir_entry **ignored; /* output only */
 
 	/* Enable/update untracked file cache if set */
 	struct untracked_cache *untracked;
-- 
gitgitgadget



* [PATCH 04/11] unpack-trees: clean up some flow control
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (2 preceding siblings ...)
  2023-02-23  9:14 ` [PATCH 03/11] dir: mark output only fields of dir_struct as such Elijah Newren via GitGitGadget
@ 2023-02-23  9:14 ` Elijah Newren via GitGitGadget
  2023-02-24 22:33   ` Jonathan Tan
  2023-02-23  9:14 ` [PATCH 05/11] sparse-checkout: avoid using internal API of unpack-trees Elijah Newren via GitGitGadget
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The update_sparsity() function was introduced in commit 7af7a25853
("unpack-trees: add a new update_sparsity() function", 2020-03-27).
Prior to that, unpack_trees() was used, but that had a few bugs because
the needs of the caller were different, and different enough that
unpack_trees() could not easily be modified to handle both use cases.

update_sparsity() was written by copying unpack_trees(), streamlining
it, and then modifying it in the needed ways.  That history still shows
through as leftover vestiges in both functions that are no longer
needed.  Clean them up.  In particular:

  * update_sparsity() allows a pattern list to be passed in, but
    unpack_trees() should never use a different pattern list.  Add a
    check and a BUG() if this gets violated.
  * update_sparsity() has a check early on that will BUG() if
    o->skip_sparse_checkout is set; as such, there's no need to check
    for that condition again later in the code.  We can simply remove
    the check and its corresponding goto label.
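
The guard described in the first bullet can be sketched in isolation as
follows.  This is only an illustration under stated assumptions: the
op_options struct, its scratch field, and both functions are hypothetical,
and the BUG() macro here is a simplified stand-in for git's own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for git's BUG(); this one only prints and aborts. */
#define BUG(msg) do { fprintf(stderr, "BUG: %s\n", (msg)); abort(); } while (0)

/* Hypothetical options struct; only `update` is meant for callers. */
struct op_options {
	int update;     /* public input */
	void *scratch;  /* for internal use only */
};

/* Return a message if the caller set an internal field, else NULL. */
const char *internal_field_misuse(const struct op_options *o)
{
	assert(o != NULL);
	if (o->scratch)
		return "o->scratch is for internal use only";
	return NULL;
}

/* Reject misuse up front, then let the implementation own the field. */
int run_operation(struct op_options *o)
{
	const char *misuse = internal_field_misuse(o);
	int local_state = 0;

	if (misuse)
		BUG(misuse);

	o->scratch = &local_state;  /* implementation owns the field from here */
	local_state = o->update ? 1 : 0;
	o->scratch = NULL;          /* do not leave a dangling pointer behind */
	return local_state;
}
```

Checking once at entry is what makes a later re-check of the same
condition redundant, which is the reasoning behind the second bullet.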

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3d05e45a279..0887d157df4 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1873,6 +1873,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
 	if (o->dir)
 		BUG("o->dir is for internal use only");
+	if (o->pl)
+		BUG("o->pl is for internal use only");
 
 	trace_performance_enter();
 	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
@@ -1899,7 +1901,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 
 	if (!core_apply_sparse_checkout || !o->update)
 		o->skip_sparse_checkout = 1;
-	if (!o->skip_sparse_checkout && !o->pl) {
+	if (!o->skip_sparse_checkout) {
 		memset(&pl, 0, sizeof(pl));
 		free_pattern_list = 1;
 		populate_from_existing_patterns(o, &pl);
@@ -2113,8 +2115,6 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 		memset(&pl, 0, sizeof(pl));
 		free_pattern_list = 1;
 		populate_from_existing_patterns(o, &pl);
-		if (o->skip_sparse_checkout)
-			goto skip_sparse_checkout;
 	}
 
 	/* Expand sparse directories as needed */
@@ -2142,7 +2142,6 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 			ret = UPDATE_SPARSITY_WARNINGS;
 	}
 
-skip_sparse_checkout:
 	if (check_updates(o, o->src_index))
 		ret = UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES;
 
-- 
gitgitgadget



* [PATCH 05/11] sparse-checkout: avoid using internal API of unpack-trees
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (3 preceding siblings ...)
  2023-02-23  9:14 ` [PATCH 04/11] unpack-trees: clean up some flow control Elijah Newren via GitGitGadget
@ 2023-02-23  9:14 ` Elijah Newren via GitGitGadget
  2023-02-24 22:37   ` Jonathan Tan
  2023-02-23  9:14 ` [PATCH 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2 Elijah Newren via GitGitGadget
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

struct unpack_trees_options has the following field and comment:

	struct pattern_list *pl; /* for internal use */

Despite the internal-use comment, commit e091228e17 ("sparse-checkout:
update working directory in-process", 2019-11-21) started setting this
field from an external caller.  At the time, the only way around that
would have been to modify unpack_trees() to take an extra pattern_list
argument, and there are a lot of callers of that function.  However, when
we split update_sparsity() off as a separate function, with
sparse-checkout being the sole caller, the need to update other callers
went away.  Fix this API problem by adding a pattern_list argument to
update_sparsity() and stop setting the internal o.pl field directly.
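
The shape of that change (an optional argument, with the function
allocating and later freeing its own copy when none is given) can be
sketched on its own.  Everything below is a hypothetical stand-in:
rule_list, sparsity_options, load_recorded_rules and apply_rules are names
invented for this example, and resetting the internal field before
returning is a choice of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not git's pattern_list or unpack_trees_options. */
struct rule_list {
	int nr;
};

struct sparsity_options {
	struct rule_list *rules;  /* for internal use */
};

/* Pretend to load the recorded rules; a real version would read a file. */
void load_recorded_rules(struct rule_list *rules)
{
	rules->nr = 3;
}

/*
 * Apply `rules`, or the recorded ones when `rules` is NULL.
 * Returns the number of rules applied.
 */
int apply_rules(struct sparsity_options *o, struct rule_list *rules)
{
	int owns_rules = 0;
	int applied;

	assert(o != NULL);
	if (!rules) {
		rules = calloc(1, sizeof(*rules));
		if (!rules)
			abort();
		owns_rules = 1;
		load_recorded_rules(rules);
	}
	o->rules = rules;  /* only the implementation sets this field */

	applied = o->rules->nr;

	o->rules = NULL;
	if (owns_rules)
		free(rules);  /* free only what this function allocated */
	return applied;
}
```

With this shape the caller never writes to the internal field: it either
passes its own list or passes NULL and lets the function fall back to the
recorded one.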

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/sparse-checkout.c |  3 +--
 unpack-trees.c            | 17 ++++++++++-------
 unpack-trees.h            |  3 ++-
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index c3738154918..4b7390ce367 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -219,14 +219,13 @@ static int update_working_directory(struct pattern_list *pl)
 	o.dst_index = r->index;
 	index_state_init(&o.result, r);
 	o.skip_sparse_checkout = 0;
-	o.pl = pl;
 
 	setup_work_tree();
 
 	repo_hold_locked_index(r, &lock_file, LOCK_DIE_ON_ERROR);
 
 	setup_unpack_trees_porcelain(&o, "sparse-checkout");
-	result = update_sparsity(&o);
+	result = update_sparsity(&o, pl);
 	clear_unpack_trees_porcelain(&o);
 
 	if (result == UPDATE_SPARSITY_WARNINGS)
diff --git a/unpack-trees.c b/unpack-trees.c
index 0887d157df4..d9c9f330233 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2091,10 +2091,10 @@ return_failed:
  *
  * CE_NEW_SKIP_WORKTREE is used internally.
  */
-enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
+enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
+					    struct pattern_list *pl)
 {
 	enum update_sparsity_result ret = UPDATE_SPARSITY_SUCCESS;
-	struct pattern_list pl;
 	int i;
 	unsigned old_show_all_errors;
 	int free_pattern_list = 0;
@@ -2111,11 +2111,12 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 	trace_performance_enter();
 
 	/* If we weren't given patterns, use the recorded ones */
-	if (!o->pl) {
-		memset(&pl, 0, sizeof(pl));
+	if (!pl) {
 		free_pattern_list = 1;
-		populate_from_existing_patterns(o, &pl);
+		pl = xcalloc(1, sizeof(*pl));
+		populate_from_existing_patterns(o, pl);
 	}
+	o->pl = pl;
 
 	/* Expand sparse directories as needed */
 	expand_index(o->src_index, o->pl);
@@ -2147,8 +2148,10 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 
 	display_warning_msgs(o);
 	o->show_all_errors = old_show_all_errors;
-	if (free_pattern_list)
-		clear_pattern_list(&pl);
+	if (free_pattern_list) {
+		clear_pattern_list(pl);
+		free(pl);
+	}
 	trace_performance_leave("update_sparsity");
 	return ret;
 }
diff --git a/unpack-trees.h b/unpack-trees.h
index 3a7b3e5f007..f3a6e4f90ef 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -112,7 +112,8 @@ enum update_sparsity_result {
 	UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES = -2
 };
 
-enum update_sparsity_result update_sparsity(struct unpack_trees_options *options);
+enum update_sparsity_result update_sparsity(struct unpack_trees_options *options,
+					    struct pattern_list *pl);
 
 int verify_uptodate(const struct cache_entry *ce,
 		    struct unpack_trees_options *o);
-- 
gitgitgadget



* [PATCH 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (4 preceding siblings ...)
  2023-02-23  9:14 ` [PATCH 05/11] sparse-checkout: avoid using internal API of unpack-trees Elijah Newren via GitGitGadget
@ 2023-02-23  9:14 ` Elijah Newren via GitGitGadget
  2023-02-24 23:22   ` Jonathan Tan
  2023-02-23  9:14 ` [PATCH 07/11] unpack_trees: start splitting internal fields from public API Elijah Newren via GitGitGadget
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Commit 2f6b1eb794 ("cache API: add a "INDEX_STATE_INIT" macro/function,
add release_index()", 2023-01-12) mistakenly added some initialization
of a member of unpack_trees_options that was intended to be
internal-only.  Further, it served no purpose as it simply duplicated
the initialization that unpack-trees.c code was already doing.  Remove
the redundant index_state_init() call on o.result.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/sparse-checkout.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 4b7390ce367..8d5ae6f2a60 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -217,7 +217,6 @@ static int update_working_directory(struct pattern_list *pl)
 	o.head_idx = -1;
 	o.src_index = r->index;
 	o.dst_index = r->index;
-	index_state_init(&o.result, r);
 	o.skip_sparse_checkout = 0;
 
 	setup_work_tree();
-- 
gitgitgadget



* [PATCH 07/11] unpack_trees: start splitting internal fields from public API
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (5 preceding siblings ...)
  2023-02-23  9:14 ` [PATCH 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2 Elijah Newren via GitGitGadget
@ 2023-02-23  9:14 ` Elijah Newren via GitGitGadget
  2023-02-23  9:14 ` [PATCH 08/11] unpack-trees: mark fields only used internally as internal Elijah Newren via GitGitGadget
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

This just moves the two fields already marked as internal-only (pl and
dir) into a separate embedded struct named "internal".  Future commits
will move more fields into the same struct: those that were meant to be
internal-only but were never explicitly marked as such.
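
As a rough illustration of the layout this series is moving towards,
here is a small standalone sketch of an options struct with an embedded
"internal" member, plus a guard that rejects callers who pre-set internal
state.  struct walk_options and walk() are made-up names for this
example.  The real code calls BUG() in that situation; the sketch returns
-1 instead so that it can be exercised without aborting.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical example type; not part of git. */
struct walk_options {
	/* Public fields: callers may set these. */
	int verbose;
	const char *prefix;

	/* Everything in here belongs to the implementation. */
	struct walk_options_internal {
		int depth;
		const char *scratch;
	} internal;
};

/*
 * Returns 0 on success, or -1 if the caller set any internal field
 * before calling (the real unpack_trees() uses BUG() for that case).
 */
int walk(struct walk_options *o)
{
	if (o->internal.depth || o->internal.scratch)
		return -1;

	o->internal.depth = 1;
	o->internal.scratch = o->prefix ? o->prefix : "";

	/* ... the actual work would go here ... */

	/* Leave no internal state behind for the next call. */
	memset(&o->internal, 0, sizeof(o->internal));
	return 0;
}
```

With this layout, any access from outside the implementation has to
spell out "o.internal.", which makes misuse easy to spot in review.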

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 40 ++++++++++++++++++++--------------------
 unpack-trees.h |  7 +++++--
 2 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index d9c9f330233..e6b5fb980cb 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1809,7 +1809,7 @@ static void populate_from_existing_patterns(struct unpack_trees_options *o,
 	if (get_sparse_checkout_patterns(pl) < 0)
 		o->skip_sparse_checkout = 1;
 	else
-		o->pl = pl;
+		o->internal.pl = pl;
 }
 
 static void update_sparsity_for_prefix(const char *prefix,
@@ -1871,10 +1871,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 
 	if (len > MAX_UNPACK_TREES)
 		die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
-	if (o->dir)
-		BUG("o->dir is for internal use only");
-	if (o->pl)
-		BUG("o->pl is for internal use only");
+	if (o->internal.dir)
+		BUG("o->internal.dir is for internal use only");
+	if (o->internal.pl)
+		BUG("o->internal.pl is for internal use only");
 
 	trace_performance_enter();
 	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
@@ -1891,9 +1891,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		BUG("UNPACK_RESET_OVERWRITE_UNTRACKED incompatible with preserved ignored files");
 
 	if (!o->preserve_ignored) {
-		o->dir = &dir;
-		o->dir->flags |= DIR_SHOW_IGNORED;
-		setup_standard_excludes(o->dir);
+		o->internal.dir = &dir;
+		o->internal.dir->flags |= DIR_SHOW_IGNORED;
+		setup_standard_excludes(o->internal.dir);
 	}
 
 	if (o->prefix)
@@ -1943,7 +1943,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	 * Sparse checkout loop #1: set NEW_SKIP_WORKTREE on existing entries
 	 */
 	if (!o->skip_sparse_checkout)
-		mark_new_skip_worktree(o->pl, o->src_index, 0,
+		mark_new_skip_worktree(o->internal.pl, o->src_index, 0,
 				       CE_NEW_SKIP_WORKTREE, o->verbose_update);
 
 	if (!dfc)
@@ -2009,7 +2009,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		 * If they will have NEW_SKIP_WORKTREE, also set CE_SKIP_WORKTREE
 		 * so apply_sparse_checkout() won't attempt to remove it from worktree
 		 */
-		mark_new_skip_worktree(o->pl, &o->result,
+		mark_new_skip_worktree(o->internal.pl, &o->result,
 				       CE_ADDED, CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE,
 				       o->verbose_update);
 
@@ -2067,9 +2067,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 done:
 	if (free_pattern_list)
 		clear_pattern_list(&pl);
-	if (o->dir) {
-		dir_clear(o->dir);
-		o->dir = NULL;
+	if (o->internal.dir) {
+		dir_clear(o->internal.dir);
+		o->internal.dir = NULL;
 	}
 	trace2_region_leave("unpack_trees", "unpack_trees", the_repository);
 	trace_performance_leave("unpack_trees");
@@ -2116,14 +2116,14 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 		pl = xcalloc(1, sizeof(*pl));
 		populate_from_existing_patterns(o, pl);
 	}
-	o->pl = pl;
+	o->internal.pl = pl;
 
 	/* Expand sparse directories as needed */
-	expand_index(o->src_index, o->pl);
+	expand_index(o->src_index, o->internal.pl);
 
 	/* Set NEW_SKIP_WORKTREE on existing entries. */
 	mark_all_ce_unused(o->src_index);
-	mark_new_skip_worktree(o->pl, o->src_index, 0,
+	mark_new_skip_worktree(o->internal.pl, o->src_index, 0,
 			       CE_NEW_SKIP_WORKTREE, o->verbose_update);
 
 	/* Then loop over entries and update/remove as needed */
@@ -2338,8 +2338,8 @@ static int verify_clean_subdirectory(const struct cache_entry *ce,
 	pathbuf = xstrfmt("%.*s/", namelen, ce->name);
 
 	memset(&d, 0, sizeof(d));
-	if (o->dir)
-		d.exclude_per_dir = o->dir->exclude_per_dir;
+	if (o->internal.dir)
+		d.exclude_per_dir = o->internal.dir->exclude_per_dir;
 	i = read_directory(&d, o->src_index, pathbuf, namelen+1, NULL);
 	dir_clear(&d);
 	free(pathbuf);
@@ -2393,8 +2393,8 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 	if (ignore_case && icase_exists(o, name, len, st))
 		return 0;
 
-	if (o->dir &&
-	    is_excluded(o->dir, o->src_index, name, &dtype))
+	if (o->internal.dir &&
+	    is_excluded(o->internal.dir, o->src_index, name, &dtype))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
 		 * overwrite it.
diff --git a/unpack-trees.h b/unpack-trees.h
index f3a6e4f90ef..5c1a9314a06 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -97,9 +97,12 @@ struct unpack_trees_options {
 	struct index_state *src_index;
 	struct index_state result;
 
-	struct pattern_list *pl; /* for internal use */
-	struct dir_struct *dir; /* for internal use only */
 	struct checkout_metadata meta;
+
+	struct unpack_trees_options_internal {
+		struct pattern_list *pl;
+		struct dir_struct *dir;
+	} internal;
 };
 
 int unpack_trees(unsigned n, struct tree_desc *t,
-- 
gitgitgadget



* [PATCH 08/11] unpack-trees: mark fields only used internally as internal
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (6 preceding siblings ...)
  2023-02-23  9:14 ` [PATCH 07/11] unpack_trees: start splitting internal fields from public API Elijah Newren via GitGitGadget
@ 2023-02-23  9:14 ` Elijah Newren via GitGitGadget
  2023-02-23  9:14 ` [PATCH 09/11] unpack-trees: rewrap a few overlong lines from previous patch Elijah Newren via GitGitGadget
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Continue the work from the previous patch by finding additional fields
which are only used internally but not yet explicitly marked as such,
and include them in the internal fields struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 157 +++++++++++++++++++++++++------------------------
 unpack-trees.h |  26 ++++----
 2 files changed, 94 insertions(+), 89 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index e6b5fb980cb..d89eb3d8bf0 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -66,8 +66,8 @@ static const char *unpack_plumbing_errors[NB_UNPACK_TREES_WARNING_TYPES] = {
 };
 
 #define ERRORMSG(o,type) \
-	( ((o) && (o)->msgs[(type)]) \
-	  ? ((o)->msgs[(type)])      \
+	( ((o) && (o)->internal.msgs[(type)]) \
+	  ? ((o)->internal.msgs[(type)])      \
 	  : (unpack_plumbing_errors[(type)]) )
 
 static const char *super_prefixed(const char *path, const char *super_prefix)
@@ -108,10 +108,10 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 				  const char *cmd)
 {
 	int i;
-	const char **msgs = opts->msgs;
+	const char **msgs = opts->internal.msgs;
 	const char *msg;
 
-	strvec_init(&opts->msgs_to_free);
+	strvec_init(&opts->internal.msgs_to_free);
 
 	if (!strcmp(cmd, "checkout"))
 		msg = advice_enabled(ADVICE_COMMIT_BEFORE_MERGE)
@@ -129,7 +129,7 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 			  "Please commit your changes or stash them before you %s.")
 		      : _("Your local changes to the following files would be overwritten by %s:\n%%s");
 	msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] =
-		strvec_pushf(&opts->msgs_to_free, msg, cmd, cmd);
+		strvec_pushf(&opts->internal.msgs_to_free, msg, cmd, cmd);
 
 	msgs[ERROR_NOT_UPTODATE_DIR] =
 		_("Updating the following directories would lose untracked files in them:\n%s");
@@ -153,7 +153,7 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 			  "Please move or remove them before you %s.")
 		      : _("The following untracked working tree files would be removed by %s:\n%%s");
 	msgs[ERROR_WOULD_LOSE_UNTRACKED_REMOVED] =
-		strvec_pushf(&opts->msgs_to_free, msg, cmd, cmd);
+		strvec_pushf(&opts->internal.msgs_to_free, msg, cmd, cmd);
 
 	if (!strcmp(cmd, "checkout"))
 		msg = advice_enabled(ADVICE_COMMIT_BEFORE_MERGE)
@@ -171,7 +171,7 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 			  "Please move or remove them before you %s.")
 		      : _("The following untracked working tree files would be overwritten by %s:\n%%s");
 	msgs[ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN] =
-		strvec_pushf(&opts->msgs_to_free, msg, cmd, cmd);
+		strvec_pushf(&opts->internal.msgs_to_free, msg, cmd, cmd);
 
 	/*
 	 * Special case: ERROR_BIND_OVERLAP refers to a pair of paths, we
@@ -189,16 +189,16 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 	msgs[WARNING_SPARSE_ORPHANED_NOT_OVERWRITTEN] =
 		_("The following paths were already present and thus not updated despite sparse patterns:\n%s");
 
-	opts->show_all_errors = 1;
+	opts->internal.show_all_errors = 1;
 	/* rejected paths may not have a static buffer */
-	for (i = 0; i < ARRAY_SIZE(opts->unpack_rejects); i++)
-		opts->unpack_rejects[i].strdup_strings = 1;
+	for (i = 0; i < ARRAY_SIZE(opts->internal.unpack_rejects); i++)
+		opts->internal.unpack_rejects[i].strdup_strings = 1;
 }
 
 void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
 {
-	strvec_clear(&opts->msgs_to_free);
-	memset(opts->msgs, 0, sizeof(opts->msgs));
+	strvec_clear(&opts->internal.msgs_to_free);
+	memset(opts->internal.msgs, 0, sizeof(opts->internal.msgs));
 }
 
 static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
@@ -210,7 +210,7 @@ static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
 		set |= CE_WT_REMOVE;
 
 	ce->ce_flags = (ce->ce_flags & ~clear) | set;
-	return add_index_entry(&o->result, ce,
+	return add_index_entry(&o->internal.result, ce,
 			       ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE);
 }
 
@@ -218,7 +218,7 @@ static void add_entry(struct unpack_trees_options *o,
 		      const struct cache_entry *ce,
 		      unsigned int set, unsigned int clear)
 {
-	do_add_entry(o, dup_cache_entry(ce, &o->result), set, clear);
+	do_add_entry(o, dup_cache_entry(ce, &o->internal.result), set, clear);
 }
 
 /*
@@ -233,7 +233,7 @@ static int add_rejected_path(struct unpack_trees_options *o,
 	if (o->quiet)
 		return -1;
 
-	if (!o->show_all_errors)
+	if (!o->internal.show_all_errors)
 		return error(ERRORMSG(o, e), super_prefixed(path,
 							    o->super_prefix));
 
@@ -241,7 +241,7 @@ static int add_rejected_path(struct unpack_trees_options *o,
 	 * Otherwise, insert in a list for future display by
 	 * display_(error|warning)_msgs()
 	 */
-	string_list_append(&o->unpack_rejects[e], path);
+	string_list_append(&o->internal.unpack_rejects[e], path);
 	return -1;
 }
 
@@ -253,7 +253,7 @@ static void display_error_msgs(struct unpack_trees_options *o)
 	int e;
 	unsigned error_displayed = 0;
 	for (e = 0; e < NB_UNPACK_TREES_ERROR_TYPES; e++) {
-		struct string_list *rejects = &o->unpack_rejects[e];
+		struct string_list *rejects = &o->internal.unpack_rejects[e];
 
 		if (rejects->nr > 0) {
 			int i;
@@ -281,7 +281,7 @@ static void display_warning_msgs(struct unpack_trees_options *o)
 	unsigned warning_displayed = 0;
 	for (e = NB_UNPACK_TREES_ERROR_TYPES + 1;
 	     e < NB_UNPACK_TREES_WARNING_TYPES; e++) {
-		struct string_list *rejects = &o->unpack_rejects[e];
+		struct string_list *rejects = &o->internal.unpack_rejects[e];
 
 		if (rejects->nr > 0) {
 			int i;
@@ -600,13 +600,14 @@ static void mark_ce_used(struct cache_entry *ce, struct unpack_trees_options *o)
 {
 	ce->ce_flags |= CE_UNPACKED;
 
-	if (o->cache_bottom < o->src_index->cache_nr &&
-	    o->src_index->cache[o->cache_bottom] == ce) {
-		int bottom = o->cache_bottom;
+	if (o->internal.cache_bottom < o->src_index->cache_nr &&
+	    o->src_index->cache[o->internal.cache_bottom] == ce) {
+		int bottom = o->internal.cache_bottom;
+
 		while (bottom < o->src_index->cache_nr &&
 		       o->src_index->cache[bottom]->ce_flags & CE_UNPACKED)
 			bottom++;
-		o->cache_bottom = bottom;
+		o->internal.cache_bottom = bottom;
 	}
 }
 
@@ -652,7 +653,7 @@ static void mark_ce_used_same_name(struct cache_entry *ce,
 static struct cache_entry *next_cache_entry(struct unpack_trees_options *o)
 {
 	const struct index_state *index = o->src_index;
-	int pos = o->cache_bottom;
+	int pos = o->internal.cache_bottom;
 
 	while (pos < index->cache_nr) {
 		struct cache_entry *ce = index->cache[pos];
@@ -711,7 +712,7 @@ static void restore_cache_bottom(struct traverse_info *info, int bottom)
 
 	if (o->diff_index_cached)
 		return;
-	o->cache_bottom = bottom;
+	o->internal.cache_bottom = bottom;
 }
 
 static int switch_cache_bottom(struct traverse_info *info)
@@ -721,13 +722,13 @@ static int switch_cache_bottom(struct traverse_info *info)
 
 	if (o->diff_index_cached)
 		return 0;
-	ret = o->cache_bottom;
+	ret = o->internal.cache_bottom;
 	pos = find_cache_pos(info->prev, info->name, info->namelen);
 
 	if (pos < -1)
-		o->cache_bottom = -2 - pos;
+		o->internal.cache_bottom = -2 - pos;
 	else if (pos < 0)
-		o->cache_bottom = o->src_index->cache_nr;
+		o->internal.cache_bottom = o->src_index->cache_nr;
 	return ret;
 }
 
@@ -873,9 +874,9 @@ static int traverse_trees_recursive(int n, unsigned long dirmask,
 		 * save and restore cache_bottom anyway to not miss
 		 * unprocessed entries before 'pos'.
 		 */
-		bottom = o->cache_bottom;
+		bottom = o->internal.cache_bottom;
 		ret = traverse_by_cache_tree(pos, nr_entries, n, info);
-		o->cache_bottom = bottom;
+		o->internal.cache_bottom = bottom;
 		return ret;
 	}
 
@@ -1212,7 +1213,7 @@ static int unpack_single_entry(int n, unsigned long mask,
 		 * cache entry from the index aware logic.
 		 */
 		src[i + o->merge] = create_ce_entry(info, names + i, stage,
-						    &o->result, o->merge,
+						    &o->internal.result, o->merge,
 						    bit & dirmask);
 	}
 
@@ -1237,7 +1238,7 @@ static int unpack_single_entry(int n, unsigned long mask,
 
 static int unpack_failed(struct unpack_trees_options *o, const char *message)
 {
-	discard_index(&o->result);
+	discard_index(&o->internal.result);
 	if (!o->quiet && !o->exiting_early) {
 		if (message)
 			return error("%s", message);
@@ -1260,7 +1261,7 @@ static int find_cache_pos(struct traverse_info *info,
 	struct index_state *index = o->src_index;
 	int pfxlen = info->pathlen;
 
-	for (pos = o->cache_bottom; pos < index->cache_nr; pos++) {
+	for (pos = o->internal.cache_bottom; pos < index->cache_nr; pos++) {
 		const struct cache_entry *ce = index->cache[pos];
 		const char *ce_name, *ce_slash;
 		int cmp, ce_len;
@@ -1271,8 +1272,8 @@ static int find_cache_pos(struct traverse_info *info,
 			 * we can never match it; don't check it
 			 * again.
 			 */
-			if (pos == o->cache_bottom)
-				++o->cache_bottom;
+			if (pos == o->internal.cache_bottom)
+				++o->internal.cache_bottom;
 			continue;
 		}
 		if (!ce_in_traverse_path(ce, info)) {
@@ -1450,7 +1451,7 @@ static int unpack_sparse_callback(int n, unsigned long mask, unsigned long dirma
 	 */
 	if (!is_null_oid(&names[0].oid)) {
 		src[0] = create_ce_entry(info, &names[0], 0,
-					&o->result, 1,
+					&o->internal.result, 1,
 					dirmask & (1ul << 0));
 		src[0]->ce_flags |= (CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);
 	}
@@ -1560,7 +1561,7 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 				 * in 'mark_ce_used()'
 				 */
 				if (!src[0] || !S_ISSPARSEDIR(src[0]->ce_mode))
-					o->cache_bottom += matches;
+					o->internal.cache_bottom += matches;
 				return mask;
 			}
 		}
@@ -1907,37 +1908,37 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		populate_from_existing_patterns(o, &pl);
 	}
 
-	index_state_init(&o->result, o->src_index->repo);
-	o->result.initialized = 1;
-	o->result.timestamp.sec = o->src_index->timestamp.sec;
-	o->result.timestamp.nsec = o->src_index->timestamp.nsec;
-	o->result.version = o->src_index->version;
+	index_state_init(&o->internal.result, o->src_index->repo);
+	o->internal.result.initialized = 1;
+	o->internal.result.timestamp.sec = o->src_index->timestamp.sec;
+	o->internal.result.timestamp.nsec = o->src_index->timestamp.nsec;
+	o->internal.result.version = o->src_index->version;
 	if (!o->src_index->split_index) {
-		o->result.split_index = NULL;
+		o->internal.result.split_index = NULL;
 	} else if (o->src_index == o->dst_index) {
 		/*
 		 * o->dst_index (and thus o->src_index) will be discarded
-		 * and overwritten with o->result at the end of this function,
+		 * and overwritten with o->internal.result at the end of this function,
 		 * so just use src_index's split_index to avoid having to
 		 * create a new one.
 		 */
-		o->result.split_index = o->src_index->split_index;
-		o->result.split_index->refcount++;
+		o->internal.result.split_index = o->src_index->split_index;
+		o->internal.result.split_index->refcount++;
 	} else {
-		o->result.split_index = init_split_index(&o->result);
+		o->internal.result.split_index = init_split_index(&o->internal.result);
 	}
-	oidcpy(&o->result.oid, &o->src_index->oid);
+	oidcpy(&o->internal.result.oid, &o->src_index->oid);
 	o->merge_size = len;
 	mark_all_ce_unused(o->src_index);
 
-	o->result.fsmonitor_last_update =
+	o->internal.result.fsmonitor_last_update =
 		xstrdup_or_null(o->src_index->fsmonitor_last_update);
-	o->result.fsmonitor_has_run_once = o->src_index->fsmonitor_has_run_once;
+	o->internal.result.fsmonitor_has_run_once = o->src_index->fsmonitor_has_run_once;
 
 	if (!o->src_index->initialized &&
 	    !repo->settings.command_requires_full_index &&
-	    is_sparse_index_allowed(&o->result, 0))
-		o->result.sparse_index = 1;
+	    is_sparse_index_allowed(&o->internal.result, 0))
+		o->internal.result.sparse_index = 1;
 
 	/*
 	 * Sparse checkout loop #1: set NEW_SKIP_WORKTREE on existing entries
@@ -1957,7 +1958,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		setup_traverse_info(&info, prefix);
 		info.fn = unpack_callback;
 		info.data = o;
-		info.show_all_errors = o->show_all_errors;
+		info.show_all_errors = o->internal.show_all_errors;
 		info.pathspec = o->pathspec;
 
 		if (o->prefix) {
@@ -1998,7 +1999,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	}
 	mark_all_ce_unused(o->src_index);
 
-	if (o->trivial_merges_only && o->nontrivial_merge) {
+	if (o->trivial_merges_only && o->internal.nontrivial_merge) {
 		ret = unpack_failed(o, "Merge requires file-level merging");
 		goto done;
 	}
@@ -2009,13 +2010,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		 * If they will have NEW_SKIP_WORKTREE, also set CE_SKIP_WORKTREE
 		 * so apply_sparse_checkout() won't attempt to remove it from worktree
 		 */
-		mark_new_skip_worktree(o->internal.pl, &o->result,
+		mark_new_skip_worktree(o->internal.pl, &o->internal.result,
 				       CE_ADDED, CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE,
 				       o->verbose_update);
 
 		ret = 0;
-		for (i = 0; i < o->result.cache_nr; i++) {
-			struct cache_entry *ce = o->result.cache[i];
+		for (i = 0; i < o->internal.result.cache_nr; i++) {
+			struct cache_entry *ce = o->internal.result.cache[i];
 
 			/*
 			 * Entries marked with CE_ADDED in merged_entry() do not have
@@ -2029,7 +2030,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			    verify_absent(ce, WARNING_SPARSE_ORPHANED_NOT_OVERWRITTEN, o))
 				ret = 1;
 
-			if (apply_sparse_checkout(&o->result, ce, o))
+			if (apply_sparse_checkout(&o->internal.result, ce, o))
 				ret = 1;
 		}
 		if (ret == 1) {
@@ -2037,30 +2038,30 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			 * Inability to sparsify or de-sparsify individual
 			 * paths is not an error, but just a warning.
 			 */
-			if (o->show_all_errors)
+			if (o->internal.show_all_errors)
 				display_warning_msgs(o);
 			ret = 0;
 		}
 	}
 
-	ret = check_updates(o, &o->result) ? (-2) : 0;
+	ret = check_updates(o, &o->internal.result) ? (-2) : 0;
 	if (o->dst_index) {
-		move_index_extensions(&o->result, o->src_index);
+		move_index_extensions(&o->internal.result, o->src_index);
 		if (!ret) {
 			if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
-				cache_tree_verify(the_repository, &o->result);
+				cache_tree_verify(the_repository, &o->internal.result);
 			if (!o->skip_cache_tree_update &&
-			    !cache_tree_fully_valid(o->result.cache_tree))
-				cache_tree_update(&o->result,
+			    !cache_tree_fully_valid(o->internal.result.cache_tree))
+				cache_tree_update(&o->internal.result,
 						  WRITE_TREE_SILENT |
 						  WRITE_TREE_REPAIR);
 		}
 
-		o->result.updated_workdir = 1;
+		o->internal.result.updated_workdir = 1;
 		discard_index(o->dst_index);
-		*o->dst_index = o->result;
+		*o->dst_index = o->internal.result;
 	} else {
-		discard_index(&o->result);
+		discard_index(&o->internal.result);
 	}
 	o->src_index = NULL;
 
@@ -2076,7 +2077,7 @@ done:
 	return ret;
 
 return_failed:
-	if (o->show_all_errors)
+	if (o->internal.show_all_errors)
 		display_error_msgs(o);
 	mark_all_ce_unused(o->src_index);
 	ret = unpack_failed(o, NULL);
@@ -2099,8 +2100,8 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 	unsigned old_show_all_errors;
 	int free_pattern_list = 0;
 
-	old_show_all_errors = o->show_all_errors;
-	o->show_all_errors = 1;
+	old_show_all_errors = o->internal.show_all_errors;
+	o->internal.show_all_errors = 1;
 
 	/* Sanity checks */
 	if (!o->update || o->index_only || o->skip_sparse_checkout)
@@ -2147,7 +2148,7 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 		ret = UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES;
 
 	display_warning_msgs(o);
-	o->show_all_errors = old_show_all_errors;
+	o->internal.show_all_errors = old_show_all_errors;
 	if (free_pattern_list) {
 		clear_pattern_list(pl);
 		free(pl);
@@ -2246,15 +2247,15 @@ static int verify_uptodate_sparse(const struct cache_entry *ce,
 }
 
 /*
- * TODO: We should actually invalidate o->result, not src_index [1].
+ * TODO: We should actually invalidate o->internal.result, not src_index [1].
  * But since cache tree and untracked cache both are not copied to
- * o->result until unpacking is complete, we invalidate them on
+ * o->internal.result until unpacking is complete, we invalidate them on
  * src_index instead with the assumption that they will be copied to
  * dst_index at the end.
  *
  * [1] src_index->cache_tree is also used in unpack_callback() so if
- * we invalidate o->result, we need to update it to use
- * o->result.cache_tree as well.
+ * we invalidate o->internal.result, we need to update it to use
+ * o->internal.result.cache_tree as well.
  */
 static void invalidate_ce_path(const struct cache_entry *ce,
 			       struct unpack_trees_options *o)
@@ -2422,7 +2423,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 	 * delete this path, which is in a subdirectory that
 	 * is being replaced with a blob.
 	 */
-	result = index_file_exists(&o->result, name, len, 0);
+	result = index_file_exists(&o->internal.result, name, len, 0);
 	if (result) {
 		if (result->ce_flags & CE_REMOVE)
 			return 0;
@@ -2523,7 +2524,7 @@ static int merged_entry(const struct cache_entry *ce,
 			struct unpack_trees_options *o)
 {
 	int update = CE_UPDATE;
-	struct cache_entry *merge = dup_cache_entry(ce, &o->result);
+	struct cache_entry *merge = dup_cache_entry(ce, &o->internal.result);
 
 	if (!old) {
 		/*
@@ -2618,7 +2619,7 @@ static int merged_sparse_dir(const struct cache_entry * const *src, int n,
 	setup_traverse_info(&info, src[0]->name);
 	info.fn = unpack_sparse_callback;
 	info.data = o;
-	info.show_all_errors = o->show_all_errors;
+	info.show_all_errors = o->internal.show_all_errors;
 	info.pathspec = o->pathspec;
 
 	/* Get the tree descriptors of the sparse directory in each of the merging trees */
@@ -2836,7 +2837,7 @@ int threeway_merge(const struct cache_entry * const *stages,
 			return -1;
 	}
 
-	o->nontrivial_merge = 1;
+	o->internal.nontrivial_merge = 1;
 
 	/* #2, #3, #4, #6, #7, #9, #10, #11. */
 	count = 0;
diff --git a/unpack-trees.h b/unpack-trees.h
index 5c1a9314a06..0335c89bc75 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -59,7 +59,6 @@ struct unpack_trees_options {
 		     preserve_ignored,
 		     clone,
 		     index_only,
-		     nontrivial_merge,
 		     trivial_merges_only,
 		     verbose_update,
 		     aggressive,
@@ -70,22 +69,13 @@ struct unpack_trees_options {
 		     skip_sparse_checkout,
 		     quiet,
 		     exiting_early,
-		     show_all_errors,
 		     dry_run,
 		     skip_cache_tree_update;
 	enum unpack_trees_reset_type reset;
 	const char *prefix;
 	const char *super_prefix;
-	int cache_bottom;
 	struct pathspec *pathspec;
 	merge_fn_t fn;
-	const char *msgs[NB_UNPACK_TREES_WARNING_TYPES];
-	struct strvec msgs_to_free;
-	/*
-	 * Store error messages in an array, each case
-	 * corresponding to a error message type
-	 */
-	struct string_list unpack_rejects[NB_UNPACK_TREES_WARNING_TYPES];
 
 	int head_idx;
 	int merge_size;
@@ -95,11 +85,25 @@ struct unpack_trees_options {
 
 	struct index_state *dst_index;
 	struct index_state *src_index;
-	struct index_state result;
 
 	struct checkout_metadata meta;
 
 	struct unpack_trees_options_internal {
+		unsigned int nontrivial_merge,
+			     show_all_errors;
+
+		int cache_bottom;
+		const char *msgs[NB_UNPACK_TREES_WARNING_TYPES];
+		struct strvec msgs_to_free;
+
+		/*
+		 * Store error messages in an array, each case
+		 * corresponding to a error message type
+		 */
+		struct string_list unpack_rejects[NB_UNPACK_TREES_WARNING_TYPES];
+
+		struct index_state result;
+
 		struct pattern_list *pl;
 		struct dir_struct *dir;
 	} internal;
-- 
gitgitgadget



* [PATCH 09/11] unpack-trees: rewrap a few overlong lines from previous patch
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (7 preceding siblings ...)
  2023-02-23  9:14 ` [PATCH 08/11] unpack-trees: mark fields only used internally as internal Elijah Newren via GitGitGadget
@ 2023-02-23  9:14 ` Elijah Newren via GitGitGadget
  2023-02-23  9:14 ` [PATCH 10/11] unpack-trees: special case read-tree debugging as internal usage Elijah Newren via GitGitGadget
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The previous patch made many lines a little longer, resulting in four
becoming a bit too long.  They were left as-is for the previous patch
to facilitate reviewers verifying that we were just adding "internal."
in a bunch of places, but rewrap them now.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index d89eb3d8bf0..fa186a27ccc 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1213,8 +1213,8 @@ static int unpack_single_entry(int n, unsigned long mask,
 		 * cache entry from the index aware logic.
 		 */
 		src[i + o->merge] = create_ce_entry(info, names + i, stage,
-						    &o->internal.result, o->merge,
-						    bit & dirmask);
+						    &o->internal.result,
+						    o->merge, bit & dirmask);
 	}
 
 	if (o->merge) {
@@ -1918,14 +1918,15 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	} else if (o->src_index == o->dst_index) {
 		/*
 		 * o->dst_index (and thus o->src_index) will be discarded
-		 * and overwritten with o->internal.result at the end of this function,
-		 * so just use src_index's split_index to avoid having to
-		 * create a new one.
+		 * and overwritten with o->internal.result at the end of
+		 * this function, so just use src_index's split_index to
+		 * avoid having to create a new one.
 		 */
 		o->internal.result.split_index = o->src_index->split_index;
 		o->internal.result.split_index->refcount++;
 	} else {
-		o->internal.result.split_index = init_split_index(&o->internal.result);
+		o->internal.result.split_index =
+			init_split_index(&o->internal.result);
 	}
 	oidcpy(&o->internal.result.oid, &o->src_index->oid);
 	o->merge_size = len;
@@ -2049,7 +2050,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		move_index_extensions(&o->internal.result, o->src_index);
 		if (!ret) {
 			if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
-				cache_tree_verify(the_repository, &o->internal.result);
+				cache_tree_verify(the_repository,
+						  &o->internal.result);
 			if (!o->skip_cache_tree_update &&
 			    !cache_tree_fully_valid(o->internal.result.cache_tree))
 				cache_tree_update(&o->internal.result,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 10/11] unpack-trees: special case read-tree debugging as internal usage
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (8 preceding siblings ...)
  2023-02-23  9:14 ` [PATCH 09/11] unpack-trees: rewrap a few overlong lines from previous patch Elijah Newren via GitGitGadget
@ 2023-02-23  9:14 ` Elijah Newren via GitGitGadget
  2023-02-23  9:15 ` [PATCH 11/11] unpack-trees: add usage notices around df_conflict_entry Elijah Newren via GitGitGadget
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:14 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

builtin/read-tree.c has some special functionality explicitly designed
for debugging unpack-trees.[ch].  Associated with that are two fields
that no other external caller would or should use.  Mark these as
internal to unpack-trees, but allow builtin/read-tree to read or write
them for this special case.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/read-tree.c | 10 +++++-----
 unpack-trees.c      | 22 +++++++++++-----------
 unpack-trees.h      |  6 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 3ce75417833..6034408d486 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -87,9 +87,9 @@ static int debug_merge(const struct cache_entry * const *stages,
 {
 	int i;
 
-	printf("* %d-way merge\n", o->merge_size);
+	printf("* %d-way merge\n", o->internal.merge_size);
 	debug_stage("index", stages[0], o);
-	for (i = 1; i <= o->merge_size; i++) {
+	for (i = 1; i <= o->internal.merge_size; i++) {
 		char buf[24];
 		xsnprintf(buf, sizeof(buf), "ent#%d", i);
 		debug_stage(buf, stages[i], o);
@@ -144,7 +144,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 		OPT__DRY_RUN(&opts.dry_run, N_("don't update the index or the work tree")),
 		OPT_BOOL(0, "no-sparse-checkout", &opts.skip_sparse_checkout,
 			 N_("skip applying sparse checkout filter")),
-		OPT_BOOL(0, "debug-unpack", &opts.debug_unpack,
+		OPT_BOOL(0, "debug-unpack", &opts.internal.debug_unpack,
 			 N_("debug unpack-trees")),
 		OPT_CALLBACK_F(0, "recurse-submodules", NULL,
 			    "checkout", "control recursive updating of submodules",
@@ -247,7 +247,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 			opts.head_idx = 1;
 	}
 
-	if (opts.debug_unpack)
+	if (opts.internal.debug_unpack)
 		opts.fn = debug_merge;
 
 	/* If we're going to prime_cache_tree later, skip cache tree update */
@@ -263,7 +263,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 	if (unpack_trees(nr_trees, t, &opts))
 		return 128;
 
-	if (opts.debug_unpack || opts.dry_run)
+	if (opts.internal.debug_unpack || opts.dry_run)
 		return 0; /* do not write the index out */
 
 	/*
diff --git a/unpack-trees.c b/unpack-trees.c
index fa186a27ccc..60b6e38fd69 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -839,7 +839,7 @@ static int traverse_by_cache_tree(int pos, int nr_entries, int nr_names,
 		mark_ce_used(src[0], o);
 	}
 	free(tree_ce);
-	if (o->debug_unpack)
+	if (o->internal.debug_unpack)
 		printf("Unpacked %d entries from %s to %s using cache-tree\n",
 		       nr_entries,
 		       o->src_index->cache[pos]->name,
@@ -1488,7 +1488,7 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 	while (!p->mode)
 		p++;
 
-	if (o->debug_unpack)
+	if (o->internal.debug_unpack)
 		debug_unpack_callback(n, mask, dirmask, names, info);
 
 	/* Are we supposed to look at the index too? */
@@ -1929,7 +1929,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			init_split_index(&o->internal.result);
 	}
 	oidcpy(&o->internal.result.oid, &o->src_index->oid);
-	o->merge_size = len;
+	o->internal.merge_size = len;
 	mark_all_ce_unused(o->src_index);
 
 	o->internal.result.fsmonitor_last_update =
@@ -2880,9 +2880,9 @@ int twoway_merge(const struct cache_entry * const *src,
 	const struct cache_entry *oldtree = src[1];
 	const struct cache_entry *newtree = src[2];
 
-	if (o->merge_size != 2)
+	if (o->internal.merge_size != 2)
 		return error("Cannot do a twoway merge of %d trees",
-			     o->merge_size);
+			     o->internal.merge_size);
 
 	if (oldtree == o->df_conflict_entry)
 		oldtree = NULL;
@@ -2962,9 +2962,9 @@ int bind_merge(const struct cache_entry * const *src,
 	const struct cache_entry *old = src[0];
 	const struct cache_entry *a = src[1];
 
-	if (o->merge_size != 1)
+	if (o->internal.merge_size != 1)
 		return error("Cannot do a bind merge of %d trees",
-			     o->merge_size);
+			     o->internal.merge_size);
 	if (a && old)
 		return o->quiet ? -1 :
 			error(ERRORMSG(o, ERROR_BIND_OVERLAP),
@@ -2988,9 +2988,9 @@ int oneway_merge(const struct cache_entry * const *src,
 	const struct cache_entry *old = src[0];
 	const struct cache_entry *a = src[1];
 
-	if (o->merge_size != 1)
+	if (o->internal.merge_size != 1)
 		return error("Cannot do a oneway merge of %d trees",
-			     o->merge_size);
+			     o->internal.merge_size);
 
 	if (!a || a == o->df_conflict_entry)
 		return deleted_entry(old, old, o);
@@ -3025,8 +3025,8 @@ int stash_worktree_untracked_merge(const struct cache_entry * const *src,
 	const struct cache_entry *worktree = src[1];
 	const struct cache_entry *untracked = src[2];
 
-	if (o->merge_size != 2)
-		BUG("invalid merge_size: %d", o->merge_size);
+	if (o->internal.merge_size != 2)
+		BUG("invalid merge_size: %d", o->internal.merge_size);
 
 	if (worktree && untracked)
 		return error(_("worktree and untracked commit have duplicate entries: %s"),
diff --git a/unpack-trees.h b/unpack-trees.h
index 0335c89bc75..e8737adfeda 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -65,7 +65,6 @@ struct unpack_trees_options {
 		     skip_unmerged,
 		     initial_checkout,
 		     diff_index_cached,
-		     debug_unpack,
 		     skip_sparse_checkout,
 		     quiet,
 		     exiting_early,
@@ -78,7 +77,6 @@ struct unpack_trees_options {
 	merge_fn_t fn;
 
 	int head_idx;
-	int merge_size;
 
 	struct cache_entry *df_conflict_entry;
 	void *unpack_data;
@@ -90,8 +88,10 @@ struct unpack_trees_options {
 
 	struct unpack_trees_options_internal {
 		unsigned int nontrivial_merge,
-			     show_all_errors;
+			     show_all_errors,
+			     debug_unpack; /* used by read-tree debugging */
 
+		int merge_size; /* used by read-tree debugging */
 		int cache_bottom;
 		const char *msgs[NB_UNPACK_TREES_WARNING_TYPES];
 		struct strvec msgs_to_free;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 11/11] unpack-trees: add usage notices around df_conflict_entry
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (9 preceding siblings ...)
  2023-02-23  9:14 ` [PATCH 10/11] unpack-trees: special case read-tree debugging as internal usage Elijah Newren via GitGitGadget
@ 2023-02-23  9:15 ` Elijah Newren via GitGitGadget
  2023-02-23 15:18 ` [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Derrick Stolee
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-23  9:15 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Avoid making users believe they need to initialize df_conflict_entry
to something (as happened with other output only fields before) with
a quick comment and a small sanity check.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 2 ++
 unpack-trees.h | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 60b6e38fd69..583132f1510 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1876,6 +1876,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		BUG("o->internal.dir is for internal use only");
 	if (o->internal.pl)
 		BUG("o->internal.pl is for internal use only");
+	if (o->df_conflict_entry)
+		BUG("o->df_conflict_entry is an output only field");
 
 	trace_performance_enter();
 	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
diff --git a/unpack-trees.h b/unpack-trees.h
index e8737adfeda..61c06eb7c50 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -78,7 +78,7 @@ struct unpack_trees_options {
 
 	int head_idx;
 
-	struct cache_entry *df_conflict_entry;
+	struct cache_entry *df_conflict_entry; /* output only */
 	void *unpack_data;
 
 	struct index_state *dst_index;
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (10 preceding siblings ...)
  2023-02-23  9:15 ` [PATCH 11/11] unpack-trees: add usage notices around df_conflict_entry Elijah Newren via GitGitGadget
@ 2023-02-23 15:18 ` Derrick Stolee
  2023-02-23 15:26   ` Derrick Stolee
                     ` (3 more replies)
  2023-02-24 23:36 ` Jonathan Tan
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
  13 siblings, 4 replies; 57+ messages in thread
From: Derrick Stolee @ 2023-02-23 15:18 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git; +Cc: Elijah Newren

On 2/23/2023 4:14 AM, Elijah Newren via GitGitGadget wrote:
> This patch is primarily about moving internal-only fields within these two
> structs into an embedded internal struct. Patch breakdown:
> 
>  * Patches 1-3: Restructuring dir_struct
>    * Patch 1: Splitting off internal-use-only fields
>    * Patch 2: Add important usage note to avoid accidentally using
>      deprecated API
>    * Patch 3: Mark output-only fields as such
>  * Patches 4-11: Restructuring unpack_trees_options
>    * Patches 4-6: Preparatory cleanup
>    * Patches 7-10: Splitting off internal-use-only fields
>    * Patch 11: Mark output-only field as such
 
> And after the changes:
> 
> struct dir_struct {
>     enum [...] flags;
>     int nr; /* output only */
>     int ignored_nr; /* output only */
>     struct dir_entry **entries; /* output only */
>     struct dir_entry **ignored; /* output only */
>     struct untracked_cache *untracked;
>     const char *exclude_per_dir; /* deprecated */
>     struct dir_struct_internal {
>         int alloc;
>         int ignored_alloc;
> #define EXC_CMDL 0
> #define EXC_DIRS 1
> #define EXC_FILE 2
>         struct exclude_list_group exclude_list_group[3];
>         struct exclude_stack *exclude_stack;
>         struct path_pattern *pattern;
>         struct strbuf basebuf;
>         struct oid_stat ss_info_exclude;
>         struct oid_stat ss_excludes_file;
>         unsigned unmanaged_exclude_files;
>         unsigned visited_paths;
>         unsigned visited_directories;
>     } internal;
> };

This does present a very clear structure to avoid callers being
confused when writing these changes. It doesn't, however, present
any way to guarantee that callers can't mutate this state.

...here I go on a side track thinking of an alternative...

One way to track this would be to anonymously declare 'struct
dir_struct_internal' in the header file and let 'struct dir_struct'
contain a _pointer_ to the internal struct. The dir_struct_internal
can then be defined inside the .c file, limiting its scope. (It
must be a pointer in dir_struct or else callers would not be able
to create a dir_struct without using a pointer and an initializer
method.)

The major downside to this pointer approach is that the internal
struct needs to be initialized within API calls and somehow cleared
by all callers. The internal data could be initialized by the common
initializers read_directory() or fill_directory(). There is a
dir_clear() that _should_ be called by all callers (but I notice we
are leaking the struct in at least one place in add-interactive.c,
and likely others).

This alternative adds some complexity to the structure, but
provides compiler-level guarantees that these internals are not used
outside of dir.c. I thought it worth exploring, even if we decide
that the complexity is not worth those guarantees.
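
For illustration only, the pointer approach might look roughly like the
sketch below. Every name in it (demo_dir, demo_dir_internal,
demo_dir_fill, demo_dir_clear) is hypothetical and not part of dir.[ch];
the header and .c portions are shown in one block, so the hiding only
takes effect once they live in separate files.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what a header might expose (all names hypothetical) ---- */
struct demo_dir_internal;	/* incomplete type: layout hidden from callers */

struct demo_dir {
	int nr;					/* output only */
	struct demo_dir_internal *internal;	/* opaque to callers */
};

void demo_dir_fill(struct demo_dir *d, int n);
void demo_dir_clear(struct demo_dir *d);

/* ---- what the .c file might contain ---- */
struct demo_dir_internal {
	int alloc;
};

void demo_dir_fill(struct demo_dir *d, int n)
{
	/* lazily create the internal state on first API use */
	if (!d->internal) {
		d->internal = calloc(1, sizeof(*d->internal));
		assert(d->internal);
	}
	if (d->internal->alloc < n)
		d->internal->alloc = n;
	d->nr = n;
}

void demo_dir_clear(struct demo_dir *d)
{
	/* every caller has to reach this, or the internal struct leaks */
	free(d->internal);
	d->internal = NULL;
	d->nr = 0;
}
```

A caller that only sees the header part can still write
"struct demo_dir d = { 0 };" on the stack, but any attempt to touch
d.internal->alloc fails to compile because the type is incomplete there.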

The best news is that your existing series makes it easier to flip
to the internal pointer method in the future, since we can shift
the "d->internal.member" uses into "d->internal->member" in a
mechanical way. Thus, the change you are proposing does not lock us
into this approach if we change our minds later.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
  2023-02-23 15:18 ` [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Derrick Stolee
@ 2023-02-23 15:26   ` Derrick Stolee
  2023-02-23 20:35     ` Elijah Newren
  2023-02-23 20:31   ` Elijah Newren
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 57+ messages in thread
From: Derrick Stolee @ 2023-02-23 15:26 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git; +Cc: Elijah Newren

On 2/23/2023 10:18 AM, Derrick Stolee wrote:
> On 2/23/2023 4:14 AM, Elijah Newren via GitGitGadget wrote:
>> This patch is primarily about moving internal-only fields within these two
>> structs into an embedded internal struct. Patch breakdown:
>>
>>  * Patches 1-3: Restructuring dir_struct
>>    * Patch 1: Splitting off internal-use-only fields
>>    * Patch 2: Add important usage note to avoid accidentally using
>>      deprecated API
>>    * Patch 3: Mark output-only fields as such
>>  * Patches 4-11: Restructuring unpack_trees_options
>>    * Patches 4-6: Preparatory cleanup
>>    * Patches 7-10: Splitting off internal-use-only fields
>>    * Patch 11: Mark output-only field as such
...
> The best news is that your existing series makes it easier to flip
> to the internal pointer method in the future, since we can shift
> the 'd->internal.member" uses into "d->internal->member" in a
> mechanical way. Thus, the change you are proposing does not lock us
> into this approach if we change our minds later.

And now that I've read the series in its entirety, I think it is
well organized and does not need any updates. It creates a better
situation than what we already have, and any changes to split the
internal structs to be anonymous to callers can be done as a
follow-up.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
  2023-02-23 15:18 ` [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Derrick Stolee
  2023-02-23 15:26   ` Derrick Stolee
@ 2023-02-23 20:31   ` Elijah Newren
  2023-02-24  1:24   ` Junio C Hamano
  2023-02-24  5:54   ` Jacob Keller
  3 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren @ 2023-02-23 20:31 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren via GitGitGadget, git

On Thu, Feb 23, 2023 at 7:18 AM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 2/23/2023 4:14 AM, Elijah Newren via GitGitGadget wrote:
> > This patch is primarily about moving internal-only fields within these two
> > structs into an embedded internal struct. Patch breakdown:
> >
> >  * Patches 1-3: Restructuring dir_struct
> >    * Patch 1: Splitting off internal-use-only fields
> >    * Patch 2: Add important usage note to avoid accidentally using
> >      deprecated API
> >    * Patch 3: Mark output-only fields as such
> >  * Patches 4-11: Restructuring unpack_trees_options
> >    * Patches 4-6: Preparatory cleanup
> >    * Patches 7-10: Splitting off internal-use-only fields
> >    * Patch 11: Mark output-only field as such
>
> > And after the changes:
> >
> > struct dir_struct {
> >     enum [...] flags;
> >     int nr; /* output only */
> >     int ignored_nr; /* output only */
> >     struct dir_entry **entries; /* output only */
> >     struct dir_entry **ignored; /* output only */
> >     struct untracked_cache *untracked;
> >     const char *exclude_per_dir; /* deprecated */
> >     struct dir_struct_internal {
> >         int alloc;
> >         int ignored_alloc;
> > #define EXC_CMDL 0
> > #define EXC_DIRS 1
> > #define EXC_FILE 2
> >         struct exclude_list_group exclude_list_group[3];
> >         struct exclude_stack *exclude_stack;
> >         struct path_pattern *pattern;
> >         struct strbuf basebuf;
> >         struct oid_stat ss_info_exclude;
> >         struct oid_stat ss_excludes_file;
> >         unsigned unmanaged_exclude_files;
> >         unsigned visited_paths;
> >         unsigned visited_directories;
> >     } internal;
> > };
>
> This does present a very clear structure to avoid callers being
> confused when writing these changes. It doesn't, however, present
> any way to guarantee that callers can't mutate this state.
>
> ...here I go on a side track thinking of an alternative...
>
> One way to track this would be to anonymously declare 'struct
> dir_struct_internal' in the header file and let 'struct dir_struct'
> contain a _pointer_ to the internal struct. The dir_struct_internal
> can then be defined inside the .c file, limiting its scope. (It
> must be a pointer in dir_struct or else callers would not be able
> to create a dir_struct without using a pointer and an initializer
> method.
>
> The major downside to this pointer approach is that the internal
> struct needs to be initialized within API calls and somehow cleared
> by all callers. The internal data could be initialized by the common
> initializers read_directory() or fill_directory(). There is a
> dir_clear() that _should_ be called by all callers (but I notice we
> are leaking the struct in at least one place in add-interactive.c,
> and likely others).
>
> This alternative adds some complexity to the structure, but
> provides compiler-level guarantees that these internals are not used
> outside of dir.c. I thought it worth exploring, even if we decide
> that the complexity is not worth those guarantees.

In addition to the guarantees that others won't muck with those
fields, such an approach would also buy us the following:

  * The implementation can change and internal data structures be
modified/extended/appended-to, without requiring recompiling all files
that depend upon our header.
  * Related to the last point, the ABI doesn't change when the
implementation and internal data structures change.  Might be helpful
for libification efforts.

For all three of these reasons, I pursued this alternate strategy you
bring up in merge-recursive and merge-ort (they both use a "struct
merge_options_internal *priv" and then define _very_ different "struct
merge_options_internal" in their respective C files).  I thought about
using this strategy you suggest here, but was worried at the time I
created this patch that it might create more friction for Ævar who was
pushing his struct initialization work and memory leak cleanups in the
same area heavily at the time (and back then we had a few long threads
back and forth because our work was overlapping and clashing
slightly).  But I figured that doing at least this much was good,
because of what you point out:

> The best news is that your existing series makes it easier to flip
> to the internal pointer method in the future, since we can shift
> the 'd->internal.member" uses into "d->internal->member" in a
> mechanical way. Thus, the change you are proposing does not lock us
> into this approach if we change our minds later.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
  2023-02-23 15:26   ` Derrick Stolee
@ 2023-02-23 20:35     ` Elijah Newren
  0 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren @ 2023-02-23 20:35 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren via GitGitGadget, git

On Thu, Feb 23, 2023 at 7:26 AM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 2/23/2023 10:18 AM, Derrick Stolee wrote:
> > On 2/23/2023 4:14 AM, Elijah Newren via GitGitGadget wrote:
> >> This patch is primarily about moving internal-only fields within these two
> >> structs into an embedded internal struct. Patch breakdown:
> >>
> >>  * Patches 1-3: Restructuring dir_struct
> >>    * Patch 1: Splitting off internal-use-only fields
> >>    * Patch 2: Add important usage note to avoid accidentally using
> >>      deprecated API
> >>    * Patch 3: Mark output-only fields as such
> >>  * Patches 4-11: Restructuring unpack_trees_options
> >>    * Patches 4-6: Preparatory cleanup
> >>    * Patches 7-10: Splitting off internal-use-only fields
> >>    * Patch 11: Mark output-only field as such
> ...
> > The best news is that your existing series makes it easier to flip
> > to the internal pointer method in the future, since we can shift
> > the 'd->internal.member" uses into "d->internal->member" in a
> > mechanical way. Thus, the change you are proposing does not lock us
> > into this approach if we change our minds later.
>
> And now that I've read the series in its entirety, I think it is
> well organized and does not need any updates. It creates a better
> situation than what we already have, and any changes to split the
> internal structs to be anonymous to callers can be done as a
> follow-up.

Wow, I was a bit worried pushing a couple dozen patches last night
that it'd be weeks before anyone took a look, and perhaps even that
I'd again get comments that I was pushing too many to the list.  You
read and reviewed all of them across both series, including some
comments showing you read pretty carefully, all before I had even
woken up.  Very cool; thanks.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
  2023-02-23 15:18 ` [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Derrick Stolee
  2023-02-23 15:26   ` Derrick Stolee
  2023-02-23 20:31   ` Elijah Newren
@ 2023-02-24  1:24   ` Junio C Hamano
  2023-02-24  5:54   ` Jacob Keller
  3 siblings, 0 replies; 57+ messages in thread
From: Junio C Hamano @ 2023-02-24  1:24 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren via GitGitGadget, git, Elijah Newren

Derrick Stolee <derrickstolee@github.com> writes:

> The major downside to this pointer approach is that the internal
> struct needs to be initialized within API calls and somehow cleared
> by all callers. The internal data could be initialized by the common
> initializers read_directory() or fill_directory(). There is a
> dir_clear() that _should_ be called by all callers (but I notice we
> are leaking the struct in at least one place in add-interactive.c,
> and likely others).
>
> This alternative adds some complexity to the structure, but
> provides compiler-level guarantees that these internals are not used
> outside of dir.c. I thought it worth exploring, even if we decide
> that the complexity is not worth those guarantees.

I actually think the current structure may be a good place to stop
at.  Or we could use the original flat structure, but with members
that are supposed to be private prefixed with a longer prefix that
is very specific to the dir.c file, say "private_to_dir_c_".

Then have a block of #define

	#define alloc private_to_dir_c_alloc
	#define ignored_alloc private_to_dir_c_ignored_alloc
	...
	#define visited_directories private_to_dir_c_visited_directories

at the beginning of dir.c to hide the cruft out of the implementation.
"git grep private_to_dir_c ':!dir.c'" would catch any outsider
peeking into the part of the struct that they shouldn't be touching.
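
A minimal sketch of that idea, using a made-up struct rather than the
real dir_struct (demo_dir and demo_dir_grow are hypothetical names, and
only one private member is shown):

```c
#include <assert.h>

/* ---- header side (hypothetical struct, not git's dir_struct) ---- */
struct demo_dir {
	int nr;				/* public, output only */
	int private_to_dir_c_alloc;	/* long name discourages outside use */
};

/* ---- implementation side: short alias, meant for this file only ---- */
#define alloc private_to_dir_c_alloc

void demo_dir_grow(struct demo_dir *d, int n)
{
	assert(n >= 0);
	if (d->alloc < n)	/* expands to d->private_to_dir_c_alloc */
		d->alloc = n;
	d->nr = n;
}
```

One caveat with this sketch: the #define rewrites every later use of
the identifier "alloc" in the file, including unrelated local variables,
so the short names need to be chosen with some care.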

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
  2023-02-23 15:18 ` [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Derrick Stolee
                     ` (2 preceding siblings ...)
  2023-02-24  1:24   ` Junio C Hamano
@ 2023-02-24  5:54   ` Jacob Keller
  3 siblings, 0 replies; 57+ messages in thread
From: Jacob Keller @ 2023-02-24  5:54 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren via GitGitGadget, git, Elijah Newren

On Thu, Feb 23, 2023 at 7:29 AM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 2/23/2023 4:14 AM, Elijah Newren via GitGitGadget wrote:
> > This patch is primarily about moving internal-only fields within these two
> > structs into an embedded internal struct. Patch breakdown:
> >
> >  * Patches 1-3: Restructuring dir_struct
> >    * Patch 1: Splitting off internal-use-only fields
> >    * Patch 2: Add important usage note to avoid accidentally using
> >      deprecated API
> >    * Patch 3: Mark output-only fields as such
> >  * Patches 4-11: Restructuring unpack_trees_options
> >    * Patches 4-6: Preparatory cleanup
> >    * Patches 7-10: Splitting off internal-use-only fields
> >    * Patch 11: Mark output-only field as such
>
> > And after the changes:
> >
> > struct dir_struct {
> >     enum [...] flags;
> >     int nr; /* output only */
> >     int ignored_nr; /* output only */
> >     struct dir_entry **entries; /* output only */
> >     struct dir_entry **ignored; /* output only */
> >     struct untracked_cache *untracked;
> >     const char *exclude_per_dir; /* deprecated */
> >     struct dir_struct_internal {
> >         int alloc;
> >         int ignored_alloc;
> > #define EXC_CMDL 0
> > #define EXC_DIRS 1
> > #define EXC_FILE 2
> >         struct exclude_list_group exclude_list_group[3];
> >         struct exclude_stack *exclude_stack;
> >         struct path_pattern *pattern;
> >         struct strbuf basebuf;
> >         struct oid_stat ss_info_exclude;
> >         struct oid_stat ss_excludes_file;
> >         unsigned unmanaged_exclude_files;
> >         unsigned visited_paths;
> >         unsigned visited_directories;
> >     } internal;
> > };
>
> This does present a very clear structure to avoid callers being
> confused when writing these changes. It doesn't, however, present
> any way to guarantee that callers can't mutate this state.
>
> ...here I go on a side track thinking of an alternative...
>
> One way to track this would be to anonymously declare 'struct
> dir_struct_internal' in the header file and let 'struct dir_struct'
> contain a _pointer_ to the internal struct. The dir_struct_internal
> can then be defined inside the .c file, limiting its scope. (It
> must be a pointer in dir_struct or else callers would not be able
> to create a dir_struct without using a pointer and an initializer
> method.
>
> The major downside to this pointer approach is that the internal
> struct needs to be initialized within API calls and somehow cleared
> by all callers. The internal data could be initialized by the common
> initializers read_directory() or fill_directory(). There is a
> dir_clear() that _should_ be called by all callers (but I notice we
> are leaking the struct in at least one place in add-interactive.c,
> and likely others).
>
> This alternative adds some complexity to the structure, but
> provides compiler-level guarantees that these internals are not used
> outside of dir.c. I thought it worth exploring, even if we decide
> that the complexity is not worth those guarantees.
>

Another approach, if you don't mind structure pointer math, is to
create two structures:

a) the external public one in the public header file

struct dir_entry {
  <public stuff>
};

b) a private structure in the source file:

struct dir_entry_private {
  struct dir_entry entry;
  <private stuff>
};

In the source file you also define a macro/function that can take a
pointer to dir_entry and get a pointer to dir_entry_private:

struct dir_entry_private *dir_entry_to_private(struct dir_entry *entry)
{
  /* step back from the embedded member to the start of its container */
  return (struct dir_entry_private *)((char *)entry -
      offsetof(struct dir_entry_private, entry));
}

In the Linux kernel this is container_of; I'm not sure whether git
already has it defined, but it's a common pattern.

Then you can set the code up such that the only way to allocate a
dir_entry is to call some function in the dir code. If a new entry
needs to be allocated, implement an alloc function that has the full
private structure definition.

This way you don't need an extra field in the dir_entry struct at all,
but at the cost of requiring special allocations. It works great for
code where the only way to get a dir_entry is already some other
functions that would ensure the private version is set up correctly.
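
Putting the pieces above together, a self-contained sketch could look
like this. All names (demo_entry, demo_entry_private, to_private and the
demo_entry_* functions) are hypothetical, and the public struct is
deliberately placed second in the private one so that the offset is
non-zero and the pointer math actually does something:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* ---- public part, as it might appear in a header ---- */
struct demo_entry {
	int nr;
};

/* ---- private part, known only to the implementation file ---- */
struct demo_entry_private {
	int alloc;
	struct demo_entry entry;	/* embedded public struct */
};

/* container_of-style conversion from the public to the private struct */
static struct demo_entry_private *to_private(struct demo_entry *entry)
{
	return (struct demo_entry_private *)
		((char *)entry - offsetof(struct demo_entry_private, entry));
}

/* the only sanctioned way to obtain a demo_entry */
struct demo_entry *demo_entry_alloc(void)
{
	struct demo_entry_private *p = calloc(1, sizeof(*p));
	assert(p);
	return &p->entry;
}

/* grows the private bookkeeping and returns the current capacity */
int demo_entry_reserve(struct demo_entry *entry, int n)
{
	struct demo_entry_private *p = to_private(entry);
	if (p->alloc < n)
		p->alloc = n;
	return p->alloc;
}

void demo_entry_free(struct demo_entry *entry)
{
	if (entry)
		free(to_private(entry));
}
```

The conversion is only valid for entries that came from
demo_entry_alloc(); a demo_entry declared on the stack has no private
part around it, which is why allocation has to be funneled through the
API.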

> The best news is that your existing series makes it easier to flip
> to the internal pointer method in the future, since we can shift
> the 'd->internal.member" uses into "d->internal->member" in a
> mechanical way. Thus, the change you are proposing does not lock us
> into this approach if we change our minds later.
>

I think either approach is good. I like container_of because I'm quite
used to it in low level kernel code and it's a good way to provide
private/public split abstractions there. The private pointer variation
is also a common approach to this problem and I think it sounds like
we already use it in a few places. It's perhaps better for those
reasons.

> Thanks,
> -Stolee

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/11] dir: add a usage note to exclude_per_dir
  2023-02-23  9:14 ` [PATCH 02/11] dir: add a usage note to exclude_per_dir Elijah Newren via GitGitGadget
@ 2023-02-24 22:31   ` Jonathan Tan
  2023-02-25  0:23     ` Elijah Newren
  0 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2023-02-24 22:31 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Jonathan Tan, git, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> diff --git a/dir.h b/dir.h
> index 33fd848fc8d..2196e12630c 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -295,8 +295,12 @@ struct dir_struct {
>  	struct untracked_cache *untracked;
>  
>  	/**
> -	 * The name of the file to be read in each directory for excluded files
> -	 * (typically `.gitignore`).
> +	 * Deprecated: ls-files is the only allowed caller; all other callers
> +	 * should leave this as NULL; it pre-dated the
> +	 * setup_standard_excludes() mechanism that replaces this.
> +	 *
> +	 * This field tracks the name of the file to be read in each directory
> +	 * for excluded files (typically `.gitignore`).
>  	 */
>  	const char *exclude_per_dir;

I'm not sure what is meant by "allowed caller", but I wouldn't have
expected this to also mean that unpack-trees would need to know to
propagate this from o->internal.dir to d in verify_clean_subdirectory.

I would be OK with excluding this from the patch set - I think the other
changes can stand independent of this.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 04/11] unpack-trees: clean up some flow control
  2023-02-23  9:14 ` [PATCH 04/11] unpack-trees: clean up some flow control Elijah Newren via GitGitGadget
@ 2023-02-24 22:33   ` Jonathan Tan
  0 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2023-02-24 22:33 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Jonathan Tan, git, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>   * update_sparsity() has a check early on that will BUG() if
>     o->skip_sparse_checkout is set; as such, there's no need to check
>     for that condition again later in the code.  We can simply remove
>     the check and its corresponding goto label.

[snip]

> @@ -2113,8 +2115,6 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
>  		memset(&pl, 0, sizeof(pl));
>  		free_pattern_list = 1;
>  		populate_from_existing_patterns(o, &pl);
> -		if (o->skip_sparse_checkout)
> -			goto skip_sparse_checkout;

I've verified that indeed there is a prior check that
o->skip_sparse_checkout is not set (not visible in the diff).

Up to here, everything looks good.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 05/11] sparse-checkout: avoid using internal API of unpack-trees
  2023-02-23  9:14 ` [PATCH 05/11] sparse-checkout: avoid using internal API of unpack-trees Elijah Newren via GitGitGadget
@ 2023-02-24 22:37   ` Jonathan Tan
  2023-02-25  0:33     ` Elijah Newren
  0 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2023-02-24 22:37 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Jonathan Tan, git, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
> index c3738154918..4b7390ce367 100644
> --- a/builtin/sparse-checkout.c
> +++ b/builtin/sparse-checkout.c
> @@ -219,14 +219,13 @@ static int update_working_directory(struct pattern_list *pl)
>  	o.dst_index = r->index;
>  	index_state_init(&o.result, r);
>  	o.skip_sparse_checkout = 0;
> -	o.pl = pl;
>  
>  	setup_work_tree();
>  
>  	repo_hold_locked_index(r, &lock_file, LOCK_DIE_ON_ERROR);
>  
>  	setup_unpack_trees_porcelain(&o, "sparse-checkout");
> -	result = update_sparsity(&o);
> +	result = update_sparsity(&o, pl);

This makes sense - setup_unpack_trees_porcelain() does not use o.pl, so
there is no logic change.

> @@ -2111,11 +2111,12 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
>  	trace_performance_enter();
>  
>  	/* If we weren't given patterns, use the recorded ones */
> -	if (!o->pl) {
> -		memset(&pl, 0, sizeof(pl));
> +	if (!pl) {
>  		free_pattern_list = 1;
> -		populate_from_existing_patterns(o, &pl);
> +		pl = xcalloc(1, sizeof(*pl));
> +		populate_from_existing_patterns(o, pl);
>  	}
> +	o->pl = pl;
>  
>  	/* Expand sparse directories as needed */
>  	expand_index(o->src_index, o->pl);
> @@ -2147,8 +2148,10 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
>  
>  	display_warning_msgs(o);
>  	o->show_all_errors = old_show_all_errors;
> -	if (free_pattern_list)
> -		clear_pattern_list(&pl);
> +	if (free_pattern_list) {
> +		clear_pattern_list(pl);
> +		free(pl);
> +	}

When free_pattern_list is true, we free "pl" which was previously
assigned to "o->pl". Do we thus need to also clear "o->pl"?


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2
  2023-02-23  9:14 ` [PATCH 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2 Elijah Newren via GitGitGadget
@ 2023-02-24 23:22   ` Jonathan Tan
  2023-02-25  0:40     ` Elijah Newren
  0 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2023-02-24 23:22 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Jonathan Tan, git, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Elijah Newren <newren@gmail.com>
> 
> Commit 2f6b1eb794 ("cache API: add a "INDEX_STATE_INIT" macro/function,
> add release_index()", 2023-01-12) mistakenly added some initialization
> of a member of unpack_trees_options that was intended to be
> internal-only.  Further, it served no purpose as it simply duplicated
> the initialization that unpack-trees.c code was already doing.
> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/sparse-checkout.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
> index 4b7390ce367..8d5ae6f2a60 100644
> --- a/builtin/sparse-checkout.c
> +++ b/builtin/sparse-checkout.c
> @@ -217,7 +217,6 @@ static int update_working_directory(struct pattern_list *pl)
>  	o.head_idx = -1;
>  	o.src_index = r->index;
>  	o.dst_index = r->index;
> -	index_state_init(&o.result, r);
>  	o.skip_sparse_checkout = 0;
>  
>  	setup_work_tree();

The commit message seems to imply that in this code path, there is some
code in unpack-trees.c that runs index_state_init(), but that doesn't
seem to be the case. memset-ting the result field with a junk value
causes valgrind to fail with the following trace:

  ==2035705== Invalid read of size 8
  ==2035705==    at 0x30D982: lazy_init_name_hash (name-hash.c:602)
  ==2035705==    by 0x30DDDA: index_file_exists (name-hash.c:721)
  ==2035705==    by 0x3F71A8: check_ok_to_remove (unpack-trees.c:2430)
  ==2035705==    by 0x3F74EE: verify_absent_1 (unpack-trees.c:2495)
  ==2035705==    by 0x3F75C6: verify_absent_sparse (unpack-trees.c:2523)
  ==2035705==    by 0x3F2A15: apply_sparse_checkout (unpack-trees.c:566)
  ==2035705==    by 0x3F6849: update_sparsity (unpack-trees.c:2147)
  ==2035705==    by 0x1FC105: update_working_directory (sparse-checkout.c:228)

so it might be better to move the init invocation to update_sparsity()
instead of only removing it.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (11 preceding siblings ...)
  2023-02-23 15:18 ` [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Derrick Stolee
@ 2023-02-24 23:36 ` Jonathan Tan
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
  13 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2023-02-24 23:36 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Jonathan Tan, git, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> I wrote this patch series about a year and a half ago, but never submitted
> it. I rebased and updated it due to [0].

[snip]

> [0] Search for "Extremely yes" in
> https://lore.kernel.org/git/CAJoAoZm+TkCL0Jpg_qFgKottxbtiG2QOiY0qGrz3-uQy+=waPg@mail.gmail.com/

I left some minor comments, but otherwise, this looks good. I do share
the desire to avoid unnecessary refactoring churn, but demarcation
of private and public fields does make code much clearer and can
potentially avoid a whole class of bugs, so I would be happy if this
patch set was merged in.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/11] dir: add a usage note to exclude_per_dir
  2023-02-24 22:31   ` Jonathan Tan
@ 2023-02-25  0:23     ` Elijah Newren
  2023-02-25  1:54       ` Jonathan Tan
  0 siblings, 1 reply; 57+ messages in thread
From: Elijah Newren @ 2023-02-25  0:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Elijah Newren via GitGitGadget, git

On Fri, Feb 24, 2023 at 2:31 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > diff --git a/dir.h b/dir.h
> > index 33fd848fc8d..2196e12630c 100644
> > --- a/dir.h
> > +++ b/dir.h
> > @@ -295,8 +295,12 @@ struct dir_struct {
> >       struct untracked_cache *untracked;
> >
> >       /**
> > -      * The name of the file to be read in each directory for excluded files
> > -      * (typically `.gitignore`).
> > +      * Deprecated: ls-files is the only allowed caller; all other callers
> > +      * should leave this as NULL; it pre-dated the
> > +      * setup_standard_excludes() mechanism that replaces this.
> > +      *
> > +      * This field tracks the name of the file to be read in each directory
> > +      * for excluded files (typically `.gitignore`).
> >        */
> >       const char *exclude_per_dir;
>
> I'm not sure what is meant by "allowed caller", but I wouldn't have
> expected this to also mean that unpack-trees would need to know to
> propagate this from o->internal.dir to d in verify_clean_subdirectory.

Are you confusing fields that are internal to dir, with fields that
are internal to unpack-trees?

This series does not make exclude_per_dir an internal field within dir_struct.

Later in this series, we do mark the embedded dir_struct within
unpack_trees_options as internal.  Thus every access of any field of
dir within unpack_trees will have an "internal." prefix, but that has
nothing to do with this patch and would still be true even if this
patch were dropped.

> I would be OK with excluding this from the patch set - I think the other
> changes can stand independent of this.

Trying to get a consistent look and feel between commands is
important.  setup_standard_excludes() is one small area where
we've achieved that, and it's shared by _almost_ all commands
(builtin/ls-files being the only exception, and it persists for
backward compatibility reasons).  So I definitely want to keep the
deprecation notice and warn people away from using this field.  That's
what this patch is for.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 05/11] sparse-checkout: avoid using internal API of unpack-trees
  2023-02-24 22:37   ` Jonathan Tan
@ 2023-02-25  0:33     ` Elijah Newren
  0 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren @ 2023-02-25  0:33 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Elijah Newren via GitGitGadget, git

On Fri, Feb 24, 2023 at 2:37 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
> > index c3738154918..4b7390ce367 100644
> > --- a/builtin/sparse-checkout.c
> > +++ b/builtin/sparse-checkout.c
> > @@ -219,14 +219,13 @@ static int update_working_directory(struct pattern_list *pl)
> >       o.dst_index = r->index;
> >       index_state_init(&o.result, r);
> >       o.skip_sparse_checkout = 0;
> > -     o.pl = pl;
> >
> >       setup_work_tree();
> >
> >       repo_hold_locked_index(r, &lock_file, LOCK_DIE_ON_ERROR);
> >
> >       setup_unpack_trees_porcelain(&o, "sparse-checkout");
> > -     result = update_sparsity(&o);
> > +     result = update_sparsity(&o, pl);
>
> This makes sense - setup_unpack_trees_porcelain() does not use o.pl, so
> there is no logic change.
>
> > @@ -2111,11 +2111,12 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
> >       trace_performance_enter();
> >
> >       /* If we weren't given patterns, use the recorded ones */
> > -     if (!o->pl) {
> > -             memset(&pl, 0, sizeof(pl));
> > +     if (!pl) {
> >               free_pattern_list = 1;
> > -             populate_from_existing_patterns(o, &pl);
> > +             pl = xcalloc(1, sizeof(*pl));
> > +             populate_from_existing_patterns(o, pl);
> >       }
> > +     o->pl = pl;
> >
> >       /* Expand sparse directories as needed */
> >       expand_index(o->src_index, o->pl);
> > @@ -2147,8 +2148,10 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
> >
> >       display_warning_msgs(o);
> >       o->show_all_errors = old_show_all_errors;
> > -     if (free_pattern_list)
> > -             clear_pattern_list(&pl);
> > +     if (free_pattern_list) {
> > +             clear_pattern_list(pl);
> > +             free(pl);
> > +     }
>
> When free_pattern_list is true, we free "pl" which was previously
> assigned to "o->pl". Do we thus need to also clear "o->pl"?

Yeah, I don't think the existing code will ever attempt to use o->pl
again under these circumstances, but setting it to NULL for
future-proofing makes sense.  I'll make that tweak; thanks for reading
carefully!
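
As a rough illustration of that tweak, here is a minimal sketch of the
"clear the borrowed pointer after freeing" idea. The struct and
function names (patterns, options, run_with_patterns) are stand-ins
invented for the example, not the real unpack-trees types:

```c
#include <assert.h>
#include <stdlib.h>

struct patterns { int nr; };              /* stand-in for struct pattern_list */
struct options { struct patterns *pl; };  /* stand-in for unpack_trees_options */

/*
 * If the caller passes no patterns, allocate a temporary list, expose it
 * through o->pl for the duration of the call, and make sure o->pl does not
 * keep pointing at freed memory afterwards.
 */
int run_with_patterns(struct options *o, struct patterns *pl)
{
	int free_pattern_list = 0;
	int ret;

	if (!pl) {
		free_pattern_list = 1;
		pl = calloc(1, sizeof(*pl));
		if (!pl)
			return -1;
	}
	o->pl = pl;

	ret = o->pl->nr;   /* placeholder for the real work */

	if (free_pattern_list) {
		free(pl);
		o->pl = NULL;  /* avoid leaving a dangling pointer behind */
	}
	return ret;
}
```

When the caller supplies its own list, o->pl is left pointing at it,
since that memory is still owned by the caller.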

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2
  2023-02-24 23:22   ` Jonathan Tan
@ 2023-02-25  0:40     ` Elijah Newren
  0 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren @ 2023-02-25  0:40 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Elijah Newren via GitGitGadget, git

On Fri, Feb 24, 2023 at 3:22 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > From: Elijah Newren <newren@gmail.com>
> >
> > Commit 2f6b1eb794 ("cache API: add a "INDEX_STATE_INIT" macro/function,
> > add release_index()", 2023-01-12) mistakenly added some initialization
> > of a member of unpack_trees_options that was intended to be
> > internal-only.  Further, it served no purpose as it simply duplicated
> > the initialization that unpack-trees.c code was already doing.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  builtin/sparse-checkout.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
> > index 4b7390ce367..8d5ae6f2a60 100644
> > --- a/builtin/sparse-checkout.c
> > +++ b/builtin/sparse-checkout.c
> > @@ -217,7 +217,6 @@ static int update_working_directory(struct pattern_list *pl)
> >       o.head_idx = -1;
> >       o.src_index = r->index;
> >       o.dst_index = r->index;
> > -     index_state_init(&o.result, r);
> >       o.skip_sparse_checkout = 0;
> >
> >       setup_work_tree();
>
> The commit message seems to imply that in this code path, there is some
> code in unpack-trees.c that runs index_state_init(), but that doesn't
> seem to be the case. memset-ting the result field with a junk value
> causes valgrind to fail with the following trace:
>
>   ==2035705== Invalid read of size 8
>   ==2035705==    at 0x30D982: lazy_init_name_hash (name-hash.c:602)
>   ==2035705==    by 0x30DDDA: index_file_exists (name-hash.c:721)
>   ==2035705==    by 0x3F71A8: check_ok_to_remove (unpack-trees.c:2430)
>   ==2035705==    by 0x3F74EE: verify_absent_1 (unpack-trees.c:2495)
>   ==2035705==    by 0x3F75C6: verify_absent_sparse (unpack-trees.c:2523)
>   ==2035705==    by 0x3F2A15: apply_sparse_checkout (unpack-trees.c:566)
>   ==2035705==    by 0x3F6849: update_sparsity (unpack-trees.c:2147)
>   ==2035705==    by 0x1FC105: update_working_directory (sparse-checkout.c:228)
>
> so it might be better to move the init invocation to update_sparsity()
> instead of only removing it.

Actually, my commit message was implying o->result is only used by
unpack_trees() and not update_sparsity().  While that implication is
*mostly* true, I forgot about check_ok_to_remove().  Doh!

Thanks for reviewing so carefully; I'll move the initialization to
update_sparsity() instead.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/11] dir: add a usage note to exclude_per_dir
  2023-02-25  0:23     ` Elijah Newren
@ 2023-02-25  1:54       ` Jonathan Tan
  2023-02-25  3:23         ` Elijah Newren
  0 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2023-02-25  1:54 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Jonathan Tan, Elijah Newren via GitGitGadget, git

Elijah Newren <newren@gmail.com> writes:
> On Fri, Feb 24, 2023 at 2:31 PM Jonathan Tan <jonathantanmy@google.com> wrote:
> >
> > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > > diff --git a/dir.h b/dir.h
> > > index 33fd848fc8d..2196e12630c 100644
> > > --- a/dir.h
> > > +++ b/dir.h
> > > @@ -295,8 +295,12 @@ struct dir_struct {
> > >       struct untracked_cache *untracked;
> > >
> > >       /**
> > > -      * The name of the file to be read in each directory for excluded files
> > > -      * (typically `.gitignore`).
> > > +      * Deprecated: ls-files is the only allowed caller; all other callers
> > > +      * should leave this as NULL; it pre-dated the
> > > +      * setup_standard_excludes() mechanism that replaces this.
> > > +      *
> > > +      * This field tracks the name of the file to be read in each directory
> > > +      * for excluded files (typically `.gitignore`).
> > >        */
> > >       const char *exclude_per_dir;
> >
> > I'm not sure what is meant by "allowed caller", but I wouldn't have
> > expected this to also mean that unpack-trees would need to know to
> > propagate this from o->internal.dir to d in verify_clean_subdirectory.
> 
> Are you confusing fields that are internal to dir, with fields that
> are internal to unpack-trees?
> 
> This series does not make exclude_per_dir an internal field within dir_struct.

Agreed, but the comment says that ls-files is the only allowed caller,
and I would have expected that non-"allowed callers" would not need to
write to exclude_per_dir. But in unpack-trees.c:

  if (o->internal.dir)
          d.exclude_per_dir = o->internal.dir->exclude_per_dir;

Both "d" and "o->internal.dir" are of type "struct dir_struct" (well,
one is not a pointer and one is). I would not have expected such
non-ls-files code to read from or write to this field. (But if
unpack-trees is considered part of ls-files and/or copying the same
field to another struct is not considered "calling", then this patch
is fine, I guess.)

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
  2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
                   ` (12 preceding siblings ...)
  2023-02-24 23:36 ` Jonathan Tan
@ 2023-02-25  2:25 ` Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 01/11] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
                     ` (12 more replies)
  13 siblings, 13 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren

Changes since v1 (thanks to Jonathan Tan for the careful reviews!)

 * Clear o->pl when freeing pl, to avoid risking use-after-free.
 * Initialize o->result in update_sparsity() since it is actually used (by
   check_ok_to_remove()).

Some time ago, I noticed that struct dir_struct and struct
unpack_trees_options both have numerous fields meant for internal use only,
most of which are not marked as such. This has resulted in callers
accidentally trying to initialize some of these fields, and in at least one
case required a fair amount of review to verify other changes were okay --
review that would have been simplified with the a priori knowledge that a
combination of multiple fields was internal-only[1]. Looking closer, I
found that only 6 out of 18 fields in dir_struct were actually meant to be
public[2], and noted that unpack_trees_options also had 11 internal-only
fields (out of 36).

This series is primarily about moving internal-only fields within these two
structs into an embedded internal struct. Patch breakdown:

 * Patches 1-3: Restructuring dir_struct
   * Patch 1: Splitting off internal-use-only fields
   * Patch 2: Add important usage note to avoid accidentally using
     deprecated API
   * Patch 3: Mark output-only fields as such
 * Patches 4-11: Restructuring unpack_trees_options
   * Patches 4-6: Preparatory cleanup
   * Patches 7-10: Splitting off internal-use-only fields
   * Patch 11: Mark output-only field as such

To make the benefit more clear, here are compressed versions of dir_struct
both before and after the changes. First, before:

struct dir_struct {
    int nr;
    int alloc;
    int ignored_nr;
    int ignored_alloc;
    enum [...] flags;
    struct dir_entry **entries;
    struct dir_entry **ignored;
    const char *exclude_per_dir;
#define EXC_CMDL 0
#define EXC_DIRS 1
#define EXC_FILE 2
    struct exclude_list_group exclude_list_group[3];
    struct exclude_stack *exclude_stack;
    struct path_pattern *pattern;
    struct strbuf basebuf;
    struct untracked_cache *untracked;
    struct oid_stat ss_info_exclude;
    struct oid_stat ss_excludes_file;
    unsigned unmanaged_exclude_files;
    unsigned visited_paths;
    unsigned visited_directories;
};


And after the changes:

struct dir_struct {
    enum [...] flags;
    int nr; /* output only */
    int ignored_nr; /* output only */
    struct dir_entry **entries; /* output only */
    struct dir_entry **ignored; /* output only */
    struct untracked_cache *untracked;
    const char *exclude_per_dir; /* deprecated */
    struct dir_struct_internal {
        int alloc;
        int ignored_alloc;
#define EXC_CMDL 0
#define EXC_DIRS 1
#define EXC_FILE 2
        struct exclude_list_group exclude_list_group[3];
        struct exclude_stack *exclude_stack;
        struct path_pattern *pattern;
        struct strbuf basebuf;
        struct oid_stat ss_info_exclude;
        struct oid_stat ss_excludes_file;
        unsigned unmanaged_exclude_files;
        unsigned visited_paths;
        unsigned visited_directories;
    } internal;
};


The former version has 18 fields (and 3 magic constants) which API users
will have to figure out. The latter makes it clear there are only at most 2
fields you should be setting upon input, and at most 4 which you read at
output, and the rest (including all the magic constants) you can ignore.
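
To illustrate how such a split reads from the caller's side, here is a
small self-contained sketch using a made-up struct. The names scan,
scan_add() and scan_clear() are hypothetical and only mirror the shape
of the layout above; they are not part of dir.h:

```c
#include <assert.h>
#include <stdlib.h>

struct scan {
	unsigned flags;   /* input: set by the caller */
	int nr;           /* output only */
	int *entries;     /* output only */
	struct scan_internal {
		int alloc;    /* bookkeeping, owned by the implementation */
	} internal;
};

/* Implementation side: the only code that touches s->internal. */
void scan_add(struct scan *s, int value)
{
	if (s->nr == s->internal.alloc) {
		int new_alloc = s->internal.alloc ? 2 * s->internal.alloc : 4;
		int *grown = realloc(s->entries, new_alloc * sizeof(*grown));

		if (!grown)
			abort();
		s->entries = grown;
		s->internal.alloc = new_alloc;
	}
	s->entries[s->nr++] = value;
}

/* Release everything and return the struct to its zeroed state. */
void scan_clear(struct scan *s)
{
	free(s->entries);
	s->entries = NULL;
	s->nr = 0;
	s->internal.alloc = 0;
}
```

A caller zero-initializes the struct, sets flags if needed, reads nr
and entries afterwards, and never has to look inside the "internal"
member.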

[0] Search for "Extremely yes" in
https://lore.kernel.org/git/CAJoAoZm+TkCL0Jpg_qFgKottxbtiG2QOiY0qGrz3-uQy+=waPg@mail.gmail.com/
[1]
https://lore.kernel.org/git/CABPp-BFSFN3WM6q7KzkD5mhrwsz--St_-ej5LbaY8Yr2sZzj=w@mail.gmail.com/
[2]
https://lore.kernel.org/git/CABPp-BHgot=CPNyK_xNfog_SqsNPNoCGfiSb-gZoS2sn_741dQ@mail.gmail.com/

Elijah Newren (11):
  dir: separate public from internal portion of dir_struct
  dir: add a usage note to exclude_per_dir
  dir: mark output only fields of dir_struct as such
  unpack-trees: clean up some flow control
  sparse-checkout: avoid using internal API of unpack-trees
  sparse-checkout: avoid using internal API of unpack-trees, take 2
  unpack_trees: start splitting internal fields from public API
  unpack-trees: mark fields only used internally as internal
  unpack-trees: rewrap a few overlong lines from previous patch
  unpack-trees: special case read-tree debugging as internal usage
  unpack-trees: add usage notices around df_conflict_entry

 builtin/read-tree.c       |  10 +-
 builtin/sparse-checkout.c |   4 +-
 dir.c                     | 114 +++++++++---------
 dir.h                     | 110 +++++++++--------
 unpack-trees.c            | 247 ++++++++++++++++++++------------------
 unpack-trees.h            |  42 ++++---
 6 files changed, 277 insertions(+), 250 deletions(-)


base-commit: 06dd2baa8da4a73421b959ec026a43711b9d77f9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1149%2Fnewren%2Fclarify-api-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1149/newren/clarify-api-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1149

Range-diff vs v1:

  1:  7f59ad548d0 =  1:  7f59ad548d0 dir: separate public from internal portion of dir_struct
  2:  239b10e1181 =  2:  239b10e1181 dir: add a usage note to exclude_per_dir
  3:  b8aa14350d3 =  3:  b8aa14350d3 dir: mark output only fields of dir_struct as such
  4:  f5a58123034 =  4:  f5a58123034 unpack-trees: clean up some flow control
  5:  508837fc182 !  5:  975dec0f0eb sparse-checkout: avoid using internal API of unpack-trees
     @@ unpack-trees.c: enum update_sparsity_result update_sparsity(struct unpack_trees_
      +	if (free_pattern_list) {
      +		clear_pattern_list(pl);
      +		free(pl);
     ++		o->pl = NULL;
      +	}
       	trace_performance_leave("update_sparsity");
       	return ret;
  6:  8955b45e354 !  6:  429f195dcfe sparse-checkout: avoid using internal API of unpack-trees, take 2
     @@ Commit message
          Commit 2f6b1eb794 ("cache API: add a "INDEX_STATE_INIT" macro/function,
          add release_index()", 2023-01-12) mistakenly added some initialization
          of a member of unpack_trees_options that was intended to be
     -    internal-only.  Further, it served no purpose as it simply duplicated
     -    the initialization that unpack-trees.c code was already doing.
     +    internal-only.  This initialization should be done within
     +    update_sparsity() instead.
     +
     +    Note that while o->result is mostly meant for unpack_trees() and
     +    update_sparsity() mostly operates without o->result,
     +    check_ok_to_remove() does consult it so we need to ensure it is properly
     +    initialized.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     @@ builtin/sparse-checkout.c: static int update_working_directory(struct pattern_li
       	o.skip_sparse_checkout = 0;
       
       	setup_work_tree();
     +
     + ## unpack-trees.c ##
     +@@ unpack-trees.c: enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
     + 
     + 	old_show_all_errors = o->show_all_errors;
     + 	o->show_all_errors = 1;
     ++	index_state_init(&o->result, o->src_index->repo);
     + 
     + 	/* Sanity checks */
     + 	if (!o->update || o->index_only || o->skip_sparse_checkout)
  7:  63ee57478ed !  7:  993da584dbb unpack_trees: start splitting internal fields from public API
     @@ unpack-trees.c: enum update_sparsity_result update_sparsity(struct unpack_trees_
       			       CE_NEW_SKIP_WORKTREE, o->verbose_update);
       
       	/* Then loop over entries and update/remove as needed */
     +@@ unpack-trees.c: enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
     + 	if (free_pattern_list) {
     + 		clear_pattern_list(pl);
     + 		free(pl);
     +-		o->pl = NULL;
     ++		o->internal.pl = NULL;
     + 	}
     + 	trace_performance_leave("update_sparsity");
     + 	return ret;
      @@ unpack-trees.c: static int verify_clean_subdirectory(const struct cache_entry *ce,
       	pathbuf = xstrfmt("%.*s/", namelen, ce->name);
       
  8:  081578b3210 !  8:  8ecb24a45f0 unpack-trees: mark fields only used internally as internal
     @@ unpack-trees.c: enum update_sparsity_result update_sparsity(struct unpack_trees_
       
      -	old_show_all_errors = o->show_all_errors;
      -	o->show_all_errors = 1;
     +-	index_state_init(&o->result, o->src_index->repo);
      +	old_show_all_errors = o->internal.show_all_errors;
      +	o->internal.show_all_errors = 1;
     ++	index_state_init(&o->internal.result, o->src_index->repo);
       
       	/* Sanity checks */
       	if (!o->update || o->index_only || o->skip_sparse_checkout)
  9:  f492ab27b19 =  9:  36ca49c3624 unpack-trees: rewrap a few overlong lines from previous patch
 10:  a5048ea00b2 = 10:  5af04d7fe23 unpack-trees: special case read-tree debugging as internal usage
 11:  efec74c8b49 = 11:  c4f31237634 unpack-trees: add usage notices around df_conflict_entry

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 01/11] dir: separate public from internal portion of dir_struct
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
@ 2023-02-25  2:25   ` Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 02/11] dir: add a usage note to exclude_per_dir Elijah Newren via GitGitGadget
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In order to make it clearer to callers what portions of dir_struct are
public API, and avoid errors from them setting fields that are meant as
internal API, split the fields used for internal implementation reasons
into a separate embedded struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.c | 114 +++++++++++++++++++++++++++++-----------------------------
 dir.h |  86 +++++++++++++++++++++++---------------------
 2 files changed, 104 insertions(+), 96 deletions(-)

diff --git a/dir.c b/dir.c
index 4e99f0c868f..7adf242026e 100644
--- a/dir.c
+++ b/dir.c
@@ -1190,7 +1190,7 @@ struct pattern_list *add_pattern_list(struct dir_struct *dir,
 	struct pattern_list *pl;
 	struct exclude_list_group *group;
 
-	group = &dir->exclude_list_group[group_type];
+	group = &dir->internal.exclude_list_group[group_type];
 	ALLOC_GROW(group->pl, group->nr + 1, group->alloc);
 	pl = &group->pl[group->nr++];
 	memset(pl, 0, sizeof(*pl));
@@ -1211,7 +1211,7 @@ static void add_patterns_from_file_1(struct dir_struct *dir, const char *fname,
 	 * differently when dir->untracked is non-NULL.
 	 */
 	if (!dir->untracked)
-		dir->unmanaged_exclude_files++;
+		dir->internal.unmanaged_exclude_files++;
 	pl = add_pattern_list(dir, EXC_FILE, fname);
 	if (add_patterns(fname, "", 0, pl, NULL, 0, oid_stat) < 0)
 		die(_("cannot use %s as an exclude file"), fname);
@@ -1219,7 +1219,7 @@ static void add_patterns_from_file_1(struct dir_struct *dir, const char *fname,
 
 void add_patterns_from_file(struct dir_struct *dir, const char *fname)
 {
-	dir->unmanaged_exclude_files++; /* see validate_untracked_cache() */
+	dir->internal.unmanaged_exclude_files++; /* see validate_untracked_cache() */
 	add_patterns_from_file_1(dir, fname, NULL);
 }
 
@@ -1519,7 +1519,7 @@ static struct path_pattern *last_matching_pattern_from_lists(
 	struct exclude_list_group *group;
 	struct path_pattern *pattern;
 	for (i = EXC_CMDL; i <= EXC_FILE; i++) {
-		group = &dir->exclude_list_group[i];
+		group = &dir->internal.exclude_list_group[i];
 		for (j = group->nr - 1; j >= 0; j--) {
 			pattern = last_matching_pattern_from_list(
 				pathname, pathlen, basename, dtype_p,
@@ -1545,20 +1545,20 @@ static void prep_exclude(struct dir_struct *dir,
 	struct untracked_cache_dir *untracked;
 	int current;
 
-	group = &dir->exclude_list_group[EXC_DIRS];
+	group = &dir->internal.exclude_list_group[EXC_DIRS];
 
 	/*
 	 * Pop the exclude lists from the EXCL_DIRS exclude_list_group
 	 * which originate from directories not in the prefix of the
 	 * path being checked.
 	 */
-	while ((stk = dir->exclude_stack) != NULL) {
+	while ((stk = dir->internal.exclude_stack) != NULL) {
 		if (stk->baselen <= baselen &&
-		    !strncmp(dir->basebuf.buf, base, stk->baselen))
+		    !strncmp(dir->internal.basebuf.buf, base, stk->baselen))
 			break;
-		pl = &group->pl[dir->exclude_stack->exclude_ix];
-		dir->exclude_stack = stk->prev;
-		dir->pattern = NULL;
+		pl = &group->pl[dir->internal.exclude_stack->exclude_ix];
+		dir->internal.exclude_stack = stk->prev;
+		dir->internal.pattern = NULL;
 		free((char *)pl->src); /* see strbuf_detach() below */
 		clear_pattern_list(pl);
 		free(stk);
@@ -1566,7 +1566,7 @@ static void prep_exclude(struct dir_struct *dir,
 	}
 
 	/* Skip traversing into sub directories if the parent is excluded */
-	if (dir->pattern)
+	if (dir->internal.pattern)
 		return;
 
 	/*
@@ -1574,12 +1574,12 @@ static void prep_exclude(struct dir_struct *dir,
 	 * memset(dir, 0, sizeof(*dir)) before use. Changing all of
 	 * them seems lots of work for little benefit.
 	 */
-	if (!dir->basebuf.buf)
-		strbuf_init(&dir->basebuf, PATH_MAX);
+	if (!dir->internal.basebuf.buf)
+		strbuf_init(&dir->internal.basebuf, PATH_MAX);
 
 	/* Read from the parent directories and push them down. */
 	current = stk ? stk->baselen : -1;
-	strbuf_setlen(&dir->basebuf, current < 0 ? 0 : current);
+	strbuf_setlen(&dir->internal.basebuf, current < 0 ? 0 : current);
 	if (dir->untracked)
 		untracked = stk ? stk->ucd : dir->untracked->root;
 	else
@@ -1599,32 +1599,33 @@ static void prep_exclude(struct dir_struct *dir,
 				die("oops in prep_exclude");
 			cp++;
 			untracked =
-				lookup_untracked(dir->untracked, untracked,
+				lookup_untracked(dir->untracked,
+						 untracked,
 						 base + current,
 						 cp - base - current);
 		}
-		stk->prev = dir->exclude_stack;
+		stk->prev = dir->internal.exclude_stack;
 		stk->baselen = cp - base;
 		stk->exclude_ix = group->nr;
 		stk->ucd = untracked;
 		pl = add_pattern_list(dir, EXC_DIRS, NULL);
-		strbuf_add(&dir->basebuf, base + current, stk->baselen - current);
-		assert(stk->baselen == dir->basebuf.len);
+		strbuf_add(&dir->internal.basebuf, base + current, stk->baselen - current);
+		assert(stk->baselen == dir->internal.basebuf.len);
 
 		/* Abort if the directory is excluded */
 		if (stk->baselen) {
 			int dt = DT_DIR;
-			dir->basebuf.buf[stk->baselen - 1] = 0;
-			dir->pattern = last_matching_pattern_from_lists(dir,
+			dir->internal.basebuf.buf[stk->baselen - 1] = 0;
+			dir->internal.pattern = last_matching_pattern_from_lists(dir,
 									istate,
-				dir->basebuf.buf, stk->baselen - 1,
-				dir->basebuf.buf + current, &dt);
-			dir->basebuf.buf[stk->baselen - 1] = '/';
-			if (dir->pattern &&
-			    dir->pattern->flags & PATTERN_FLAG_NEGATIVE)
-				dir->pattern = NULL;
-			if (dir->pattern) {
-				dir->exclude_stack = stk;
+				dir->internal.basebuf.buf, stk->baselen - 1,
+				dir->internal.basebuf.buf + current, &dt);
+			dir->internal.basebuf.buf[stk->baselen - 1] = '/';
+			if (dir->internal.pattern &&
+			    dir->internal.pattern->flags & PATTERN_FLAG_NEGATIVE)
+				dir->internal.pattern = NULL;
+			if (dir->internal.pattern) {
+				dir->internal.exclude_stack = stk;
 				return;
 			}
 		}
@@ -1647,15 +1648,15 @@ static void prep_exclude(struct dir_struct *dir,
 		      */
 		     !is_null_oid(&untracked->exclude_oid))) {
 			/*
-			 * dir->basebuf gets reused by the traversal, but we
-			 * need fname to remain unchanged to ensure the src
-			 * member of each struct path_pattern correctly
+			 * dir->internal.basebuf gets reused by the traversal,
+			 * but we need fname to remain unchanged to ensure the
+			 * src member of each struct path_pattern correctly
 			 * back-references its source file.  Other invocations
 			 * of add_pattern_list provide stable strings, so we
 			 * strbuf_detach() and free() here in the caller.
 			 */
 			struct strbuf sb = STRBUF_INIT;
-			strbuf_addbuf(&sb, &dir->basebuf);
+			strbuf_addbuf(&sb, &dir->internal.basebuf);
 			strbuf_addstr(&sb, dir->exclude_per_dir);
 			pl->src = strbuf_detach(&sb, NULL);
 			add_patterns(pl->src, pl->src, stk->baselen, pl, istate,
@@ -1681,10 +1682,10 @@ static void prep_exclude(struct dir_struct *dir,
 			invalidate_gitignore(dir->untracked, untracked);
 			oidcpy(&untracked->exclude_oid, &oid_stat.oid);
 		}
-		dir->exclude_stack = stk;
+		dir->internal.exclude_stack = stk;
 		current = stk->baselen;
 	}
-	strbuf_setlen(&dir->basebuf, baselen);
+	strbuf_setlen(&dir->internal.basebuf, baselen);
 }
 
 /*
@@ -1704,8 +1705,8 @@ struct path_pattern *last_matching_pattern(struct dir_struct *dir,
 
 	prep_exclude(dir, istate, pathname, basename-pathname);
 
-	if (dir->pattern)
-		return dir->pattern;
+	if (dir->internal.pattern)
+		return dir->internal.pattern;
 
 	return last_matching_pattern_from_lists(dir, istate, pathname, pathlen,
 			basename, dtype_p);
@@ -1742,7 +1743,7 @@ static struct dir_entry *dir_add_name(struct dir_struct *dir,
 	if (index_file_exists(istate, pathname, len, ignore_case))
 		return NULL;
 
-	ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);
+	ALLOC_GROW(dir->entries, dir->nr+1, dir->internal.alloc);
 	return dir->entries[dir->nr++] = dir_entry_new(pathname, len);
 }
 
@@ -1753,7 +1754,7 @@ struct dir_entry *dir_add_ignored(struct dir_struct *dir,
 	if (!index_name_is_other(istate, pathname, len))
 		return NULL;
 
-	ALLOC_GROW(dir->ignored, dir->ignored_nr+1, dir->ignored_alloc);
+	ALLOC_GROW(dir->ignored, dir->ignored_nr+1, dir->internal.ignored_alloc);
 	return dir->ignored[dir->ignored_nr++] = dir_entry_new(pathname, len);
 }
 
@@ -2569,7 +2570,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 
 	if (open_cached_dir(&cdir, dir, untracked, istate, &path, check_only))
 		goto out;
-	dir->visited_directories++;
+	dir->internal.visited_directories++;
 
 	if (untracked)
 		untracked->check_only = !!check_only;
@@ -2578,7 +2579,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* check how the file or directory should be treated */
 		state = treat_path(dir, untracked, &cdir, istate, &path,
 				   baselen, pathspec);
-		dir->visited_paths++;
+		dir->internal.visited_paths++;
 
 		if (state > dir_state)
 			dir_state = state;
@@ -2586,7 +2587,8 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* recurse into subdir if instructed by treat_path */
 		if (state == path_recurse) {
 			struct untracked_cache_dir *ud;
-			ud = lookup_untracked(dir->untracked, untracked,
+			ud = lookup_untracked(dir->untracked,
+					      untracked,
 					      path.buf + baselen,
 					      path.len - baselen);
 			subdir_state =
@@ -2846,7 +2848,7 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
 	 * condition also catches running setup_standard_excludes()
 	 * before setting dir->untracked!
 	 */
-	if (dir->unmanaged_exclude_files)
+	if (dir->internal.unmanaged_exclude_files)
 		return NULL;
 
 	/*
@@ -2875,7 +2877,7 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
 	 * EXC_CMDL is not considered in the cache. If people set it,
 	 * skip the cache.
 	 */
-	if (dir->exclude_list_group[EXC_CMDL].nr)
+	if (dir->internal.exclude_list_group[EXC_CMDL].nr)
 		return NULL;
 
 	if (!ident_in_untracked(dir->untracked)) {
@@ -2935,15 +2937,15 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
 
 	/* Validate $GIT_DIR/info/exclude and core.excludesfile */
 	root = dir->untracked->root;
-	if (!oideq(&dir->ss_info_exclude.oid,
+	if (!oideq(&dir->internal.ss_info_exclude.oid,
 		   &dir->untracked->ss_info_exclude.oid)) {
 		invalidate_gitignore(dir->untracked, root);
-		dir->untracked->ss_info_exclude = dir->ss_info_exclude;
+		dir->untracked->ss_info_exclude = dir->internal.ss_info_exclude;
 	}
-	if (!oideq(&dir->ss_excludes_file.oid,
+	if (!oideq(&dir->internal.ss_excludes_file.oid,
 		   &dir->untracked->ss_excludes_file.oid)) {
 		invalidate_gitignore(dir->untracked, root);
-		dir->untracked->ss_excludes_file = dir->ss_excludes_file;
+		dir->untracked->ss_excludes_file = dir->internal.ss_excludes_file;
 	}
 
 	/* Make sure this directory is not dropped out at saving phase */
@@ -2969,9 +2971,9 @@ static void emit_traversal_statistics(struct dir_struct *dir,
 	}
 
 	trace2_data_intmax("read_directory", repo,
-			   "directories-visited", dir->visited_directories);
+			   "directories-visited", dir->internal.visited_directories);
 	trace2_data_intmax("read_directory", repo,
-			   "paths-visited", dir->visited_paths);
+			   "paths-visited", dir->internal.visited_paths);
 
 	if (!dir->untracked)
 		return;
@@ -2993,8 +2995,8 @@ int read_directory(struct dir_struct *dir, struct index_state *istate,
 	struct untracked_cache_dir *untracked;
 
 	trace2_region_enter("dir", "read_directory", istate->repo);
-	dir->visited_paths = 0;
-	dir->visited_directories = 0;
+	dir->internal.visited_paths = 0;
+	dir->internal.visited_directories = 0;
 
 	if (has_symlink_leading_path(path, len)) {
 		trace2_region_leave("dir", "read_directory", istate->repo);
@@ -3342,14 +3344,14 @@ void setup_standard_excludes(struct dir_struct *dir)
 		excludes_file = xdg_config_home("ignore");
 	if (excludes_file && !access_or_warn(excludes_file, R_OK, 0))
 		add_patterns_from_file_1(dir, excludes_file,
-					 dir->untracked ? &dir->ss_excludes_file : NULL);
+					 dir->untracked ? &dir->internal.ss_excludes_file : NULL);
 
 	/* per repository user preference */
 	if (startup_info->have_repository) {
 		const char *path = git_path_info_exclude();
 		if (!access_or_warn(path, R_OK, 0))
 			add_patterns_from_file_1(dir, path,
-						 dir->untracked ? &dir->ss_info_exclude : NULL);
+						 dir->untracked ? &dir->internal.ss_info_exclude : NULL);
 	}
 }
 
@@ -3405,7 +3407,7 @@ void dir_clear(struct dir_struct *dir)
 	struct dir_struct new = DIR_INIT;
 
 	for (i = EXC_CMDL; i <= EXC_FILE; i++) {
-		group = &dir->exclude_list_group[i];
+		group = &dir->internal.exclude_list_group[i];
 		for (j = 0; j < group->nr; j++) {
 			pl = &group->pl[j];
 			if (i == EXC_DIRS)
@@ -3422,13 +3424,13 @@ void dir_clear(struct dir_struct *dir)
 	free(dir->ignored);
 	free(dir->entries);
 
-	stk = dir->exclude_stack;
+	stk = dir->internal.exclude_stack;
 	while (stk) {
 		struct exclude_stack *prev = stk->prev;
 		free(stk);
 		stk = prev;
 	}
-	strbuf_release(&dir->basebuf);
+	strbuf_release(&dir->internal.basebuf);
 
 	memcpy(dir, &new, sizeof(*dir));
 }
diff --git a/dir.h b/dir.h
index 8acfc044181..33fd848fc8d 100644
--- a/dir.h
+++ b/dir.h
@@ -215,14 +215,9 @@ struct dir_struct {
 	/* The number of members in `entries[]` array. */
 	int nr;
 
-	/* Internal use; keeps track of allocation of `entries[]` array.*/
-	int alloc;
-
 	/* The number of members in `ignored[]` array. */
 	int ignored_nr;
 
-	int ignored_alloc;
-
 	/* bit-field of options */
 	enum {
 
@@ -296,51 +291,62 @@ struct dir_struct {
 	 */
 	struct dir_entry **ignored;
 
+	/* Enable/update untracked file cache if set */
+	struct untracked_cache *untracked;
+
 	/**
 	 * The name of the file to be read in each directory for excluded files
 	 * (typically `.gitignore`).
 	 */
 	const char *exclude_per_dir;
 
-	/*
-	 * We maintain three groups of exclude pattern lists:
-	 *
-	 * EXC_CMDL lists patterns explicitly given on the command line.
-	 * EXC_DIRS lists patterns obtained from per-directory ignore files.
-	 * EXC_FILE lists patterns from fallback ignore files, e.g.
-	 *   - .git/info/exclude
-	 *   - core.excludesfile
-	 *
-	 * Each group contains multiple exclude lists, a single list
-	 * per source.
-	 */
+	struct dir_struct_internal {
+		/* Keeps track of allocation of `entries[]` array.*/
+		int alloc;
+
+		/* Keeps track of allocation of `ignored[]` array. */
+		int ignored_alloc;
+
+		/*
+		 * We maintain three groups of exclude pattern lists:
+		 *
+		 * EXC_CMDL lists patterns explicitly given on the command line.
+		 * EXC_DIRS lists patterns obtained from per-directory ignore
+		 *          files.
+		 * EXC_FILE lists patterns from fallback ignore files, e.g.
+		 *   - .git/info/exclude
+		 *   - core.excludesfile
+		 *
+		 * Each group contains multiple exclude lists, a single list
+		 * per source.
+		 */
 #define EXC_CMDL 0
 #define EXC_DIRS 1
 #define EXC_FILE 2
-	struct exclude_list_group exclude_list_group[3];
-
-	/*
-	 * Temporary variables which are used during loading of the
-	 * per-directory exclude lists.
-	 *
-	 * exclude_stack points to the top of the exclude_stack, and
-	 * basebuf contains the full path to the current
-	 * (sub)directory in the traversal. Exclude points to the
-	 * matching exclude struct if the directory is excluded.
-	 */
-	struct exclude_stack *exclude_stack;
-	struct path_pattern *pattern;
-	struct strbuf basebuf;
-
-	/* Enable untracked file cache if set */
-	struct untracked_cache *untracked;
-	struct oid_stat ss_info_exclude;
-	struct oid_stat ss_excludes_file;
-	unsigned unmanaged_exclude_files;
+		struct exclude_list_group exclude_list_group[3];
 
-	/* Stats about the traversal */
-	unsigned visited_paths;
-	unsigned visited_directories;
+		/*
+		 * Temporary variables which are used during loading of the
+		 * per-directory exclude lists.
+		 *
+		 * exclude_stack points to the top of the exclude_stack, and
+		 * basebuf contains the full path to the current
+		 * (sub)directory in the traversal. Exclude points to the
+		 * matching exclude struct if the directory is excluded.
+		 */
+		struct exclude_stack *exclude_stack;
+		struct path_pattern *pattern;
+		struct strbuf basebuf;
+
+		/* Additional metadata related to 'untracked' */
+		struct oid_stat ss_info_exclude;
+		struct oid_stat ss_excludes_file;
+		unsigned unmanaged_exclude_files;
+
+		/* Stats about the traversal */
+		unsigned visited_paths;
+		unsigned visited_directories;
+	} internal;
 };
 
 #define DIR_INIT { 0 }
-- 
gitgitgadget


* [PATCH v2 02/11] dir: add a usage note to exclude_per_dir
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 01/11] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
@ 2023-02-25  2:25   ` Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 03/11] dir: mark output only fields of dir_struct as such Elijah Newren via GitGitGadget
                     ` (10 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/dir.h b/dir.h
index 33fd848fc8d..2196e12630c 100644
--- a/dir.h
+++ b/dir.h
@@ -295,8 +295,12 @@ struct dir_struct {
 	struct untracked_cache *untracked;
 
 	/**
-	 * The name of the file to be read in each directory for excluded files
-	 * (typically `.gitignore`).
+	 * Deprecated: ls-files is the only allowed caller; all other callers
+	 * should leave this as NULL; it pre-dated the
+	 * setup_standard_excludes() mechanism that replaces this.
+	 *
+	 * This field tracks the name of the file to be read in each directory
+	 * for excluded files (typically `.gitignore`).
 	 */
 	const char *exclude_per_dir;
 
-- 
gitgitgadget


* [PATCH v2 03/11] dir: mark output only fields of dir_struct as such
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 01/11] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 02/11] dir: add a usage note to exclude_per_dir Elijah Newren via GitGitGadget
@ 2023-02-25  2:25   ` Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 04/11] unpack-trees: clean up some flow control Elijah Newren via GitGitGadget
                     ` (9 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

While at it, also group these fields together for convenience.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/dir.h b/dir.h
index 2196e12630c..e8106e1ecac 100644
--- a/dir.h
+++ b/dir.h
@@ -212,12 +212,6 @@ struct untracked_cache {
  */
 struct dir_struct {
 
-	/* The number of members in `entries[]` array. */
-	int nr;
-
-	/* The number of members in `ignored[]` array. */
-	int ignored_nr;
-
 	/* bit-field of options */
 	enum {
 
@@ -282,14 +276,20 @@ struct dir_struct {
 		DIR_SKIP_NESTED_GIT = 1<<9
 	} flags;
 
+	/* The number of members in `entries[]` array. */
+	int nr; /* output only */
+
+	/* The number of members in `ignored[]` array. */
+	int ignored_nr; /* output only */
+
 	/* An array of `struct dir_entry`, each element of which describes a path. */
-	struct dir_entry **entries;
+	struct dir_entry **entries; /* output only */
 
 	/**
 	 * used for ignored paths with the `DIR_SHOW_IGNORED_TOO` and
 	 * `DIR_COLLECT_IGNORED` flags.
 	 */
-	struct dir_entry **ignored;
+	struct dir_entry **ignored; /* output only */
 
 	/* Enable/update untracked file cache if set */
 	struct untracked_cache *untracked;
-- 
gitgitgadget


* [PATCH v2 04/11] unpack-trees: clean up some flow control
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (2 preceding siblings ...)
  2023-02-25  2:25   ` [PATCH v2 03/11] dir: mark output only fields of dir_struct as such Elijah Newren via GitGitGadget
@ 2023-02-25  2:25   ` Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 05/11] sparse-checkout: avoid using internal API of unpack-trees Elijah Newren via GitGitGadget
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The update_sparsity() function was introduced in commit 7af7a25853
("unpack-trees: add a new update_sparsity() function", 2020-03-27).
Prior to that, unpack_trees() was used, but that had a few bugs because
the needs of the caller were different, and different enough that
unpack_trees() could not easily be modified to handle both use cases.

update_sparsity() was written by copying unpack_trees(), streamlining it,
and then modifying it as needed.  That history still shows: both
functions contain leftover vestiges that are no longer needed.  Clean
them up.  In particular:

  * update_sparsity() allows a pattern list to be passed in, but
    unpack_trees() should never use a different pattern list.  Add a
    check and a BUG() if this gets violated.
  * update_sparsity() has a check early on that will BUG() if
    o->skip_sparse_checkout is set; as such, there's no need to check
    for that condition again later in the code.  We can simply remove
    the check and its corresponding goto label.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)
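
The entry-point guard added here can be sketched in standalone C.  The
struct and function names below (walk_options, misused_internal_field(),
walk()) are hypothetical, and fprintf() plus abort() stand in for git's
BUG(); this is only an illustration of the idea under those assumptions:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical options struct with two internal-only pointers. */
struct walk_options {
	int verbose; /* public knob */
	void *dir;   /* for internal use only */
	void *pl;    /* for internal use only */
};

/* Return the name of the first internal field a caller has set, or NULL. */
static const char *misused_internal_field(const struct walk_options *o)
{
	if (o->dir)
		return "dir";
	if (o->pl)
		return "pl";
	return NULL;
}

/* Entry point: refuse to run if a caller pre-set an internal field. */
static int walk(struct walk_options *o)
{
	const char *bad = misused_internal_field(o);

	if (bad) {
		/* Stand-in for BUG(): report the misuse and abort. */
		fprintf(stderr, "BUG: o->%s is for internal use only\n", bad);
		abort();
	}
	return 0;
}
```

Checking once at the entry point is what lets later code drop its own
tests of the same field.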

diff --git a/unpack-trees.c b/unpack-trees.c
index 3d05e45a279..0887d157df4 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1873,6 +1873,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
 	if (o->dir)
 		BUG("o->dir is for internal use only");
+	if (o->pl)
+		BUG("o->pl is for internal use only");
 
 	trace_performance_enter();
 	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
@@ -1899,7 +1901,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 
 	if (!core_apply_sparse_checkout || !o->update)
 		o->skip_sparse_checkout = 1;
-	if (!o->skip_sparse_checkout && !o->pl) {
+	if (!o->skip_sparse_checkout) {
 		memset(&pl, 0, sizeof(pl));
 		free_pattern_list = 1;
 		populate_from_existing_patterns(o, &pl);
@@ -2113,8 +2115,6 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 		memset(&pl, 0, sizeof(pl));
 		free_pattern_list = 1;
 		populate_from_existing_patterns(o, &pl);
-		if (o->skip_sparse_checkout)
-			goto skip_sparse_checkout;
 	}
 
 	/* Expand sparse directories as needed */
@@ -2142,7 +2142,6 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 			ret = UPDATE_SPARSITY_WARNINGS;
 	}
 
-skip_sparse_checkout:
 	if (check_updates(o, o->src_index))
 		ret = UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES;
 
-- 
gitgitgadget


* [PATCH v2 05/11] sparse-checkout: avoid using internal API of unpack-trees
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (3 preceding siblings ...)
  2023-02-25  2:25   ` [PATCH v2 04/11] unpack-trees: clean up some flow control Elijah Newren via GitGitGadget
@ 2023-02-25  2:25   ` Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2 Elijah Newren via GitGitGadget
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

struct unpack_trees_options has the following field and comment:

	struct pattern_list *pl; /* for internal use */

Despite the internal-use comment, commit e091228e17 ("sparse-checkout:
update working directory in-process", 2019-11-21) started setting this
field from an external caller.  At the time, the only way around that
would have been to modify unpack_trees() to take an extra pattern_list
argument, and there are a lot of callers of that function.  However, when
we split update_sparsity() off as a separate function, with
sparse-checkout being the sole caller, the need to update other callers
went away.  Fix this API problem by adding a pattern_list argument to
update_sparsity() and stop setting the internal o.pl field directly.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/sparse-checkout.c |  3 +--
 unpack-trees.c            | 18 +++++++++++-------
 unpack-trees.h            |  3 ++-
 3 files changed, 14 insertions(+), 10 deletions(-)
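
The ownership rule this patch introduces (use the caller's pattern list if
one is given, otherwise allocate one locally and free it before returning)
can be sketched in standalone C.  The names below (struct patterns,
struct options, load_recorded(), apply_patterns()) are hypothetical
stand-ins, and the hard-coded pattern count is made up for illustration:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pattern_list. */
struct patterns {
	int nr;
};

struct options {
	int update; /* public knob */
	struct options_internal {
		struct patterns *pl; /* set only by apply_patterns() */
	} internal;
};

/* Stand-in for loading the recorded patterns; the count is made up. */
static void load_recorded(struct patterns *pl)
{
	pl->nr = 3;
}

/* Returns the number of patterns that were used. */
static int apply_patterns(struct options *o, struct patterns *pl)
{
	int owned = 0;
	int used;

	/* If we weren't given patterns, use the recorded ones. */
	if (!pl) {
		pl = calloc(1, sizeof(*pl));
		if (!pl)
			abort();
		load_recorded(pl);
		owned = 1;
	}
	o->internal.pl = pl;

	/* The real work would consult o->internal.pl here. */
	used = o->internal.pl->nr;

	/* Free only what this function allocated. */
	if (owned)
		free(pl);
	/*
	 * Simplification in this sketch: clear the pointer unconditionally.
	 * The patch clears o->pl only when the list was allocated locally.
	 */
	o->internal.pl = NULL;
	return used;
}
```

With this shape the caller never writes to the internal field; it either
passes its own list or passes NULL to get the recorded patterns.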

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index c3738154918..4b7390ce367 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -219,14 +219,13 @@ static int update_working_directory(struct pattern_list *pl)
 	o.dst_index = r->index;
 	index_state_init(&o.result, r);
 	o.skip_sparse_checkout = 0;
-	o.pl = pl;
 
 	setup_work_tree();
 
 	repo_hold_locked_index(r, &lock_file, LOCK_DIE_ON_ERROR);
 
 	setup_unpack_trees_porcelain(&o, "sparse-checkout");
-	result = update_sparsity(&o);
+	result = update_sparsity(&o, pl);
 	clear_unpack_trees_porcelain(&o);
 
 	if (result == UPDATE_SPARSITY_WARNINGS)
diff --git a/unpack-trees.c b/unpack-trees.c
index 0887d157df4..639e48cc6bb 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2091,10 +2091,10 @@ return_failed:
  *
  * CE_NEW_SKIP_WORKTREE is used internally.
  */
-enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
+enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
+					    struct pattern_list *pl)
 {
 	enum update_sparsity_result ret = UPDATE_SPARSITY_SUCCESS;
-	struct pattern_list pl;
 	int i;
 	unsigned old_show_all_errors;
 	int free_pattern_list = 0;
@@ -2111,11 +2111,12 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 	trace_performance_enter();
 
 	/* If we weren't given patterns, use the recorded ones */
-	if (!o->pl) {
-		memset(&pl, 0, sizeof(pl));
+	if (!pl) {
 		free_pattern_list = 1;
-		populate_from_existing_patterns(o, &pl);
+		pl = xcalloc(1, sizeof(*pl));
+		populate_from_existing_patterns(o, pl);
 	}
+	o->pl = pl;
 
 	/* Expand sparse directories as needed */
 	expand_index(o->src_index, o->pl);
@@ -2147,8 +2148,11 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 
 	display_warning_msgs(o);
 	o->show_all_errors = old_show_all_errors;
-	if (free_pattern_list)
-		clear_pattern_list(&pl);
+	if (free_pattern_list) {
+		clear_pattern_list(pl);
+		free(pl);
+		o->pl = NULL;
+	}
 	trace_performance_leave("update_sparsity");
 	return ret;
 }
diff --git a/unpack-trees.h b/unpack-trees.h
index 3a7b3e5f007..f3a6e4f90ef 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -112,7 +112,8 @@ enum update_sparsity_result {
 	UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES = -2
 };
 
-enum update_sparsity_result update_sparsity(struct unpack_trees_options *options);
+enum update_sparsity_result update_sparsity(struct unpack_trees_options *options,
+					    struct pattern_list *pl);
 
 int verify_uptodate(const struct cache_entry *ce,
 		    struct unpack_trees_options *o);
-- 
gitgitgadget


* [PATCH v2 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (4 preceding siblings ...)
  2023-02-25  2:25   ` [PATCH v2 05/11] sparse-checkout: avoid using internal API of unpack-trees Elijah Newren via GitGitGadget
@ 2023-02-25  2:25   ` Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 07/11] unpack_trees: start splitting internal fields from public API Elijah Newren via GitGitGadget
                     ` (6 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Commit 2f6b1eb794 ("cache API: add a "INDEX_STATE_INIT" macro/function,
add release_index()", 2023-01-12) mistakenly added some initialization
of a member of unpack_trees_options that was intended to be
internal-only.  This initialization should be done within
update_sparsity() instead.

Note that while o->result is mostly meant for unpack_trees() and
update_sparsity() mostly operates without o->result,
check_ok_to_remove() does consult it so we need to ensure it is properly
initialized.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/sparse-checkout.c | 1 -
 unpack-trees.c            | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 4b7390ce367..8d5ae6f2a60 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -217,7 +217,6 @@ static int update_working_directory(struct pattern_list *pl)
 	o.head_idx = -1;
 	o.src_index = r->index;
 	o.dst_index = r->index;
-	index_state_init(&o.result, r);
 	o.skip_sparse_checkout = 0;
 
 	setup_work_tree();
diff --git a/unpack-trees.c b/unpack-trees.c
index 639e48cc6bb..4fca051cbea 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2101,6 +2101,7 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 
 	old_show_all_errors = o->show_all_errors;
 	o->show_all_errors = 1;
+	index_state_init(&o->result, o->src_index->repo);
 
 	/* Sanity checks */
 	if (!o->update || o->index_only || o->skip_sparse_checkout)
-- 
gitgitgadget


* [PATCH v2 07/11] unpack_trees: start splitting internal fields from public API
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (5 preceding siblings ...)
  2023-02-25  2:25   ` [PATCH v2 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2 Elijah Newren via GitGitGadget
@ 2023-02-25  2:25   ` Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 08/11] unpack-trees: mark fields only used internally as internal Elijah Newren via GitGitGadget
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

This just splits the two fields already marked as internal-only into a
separate internal struct.  Future commits will add more fields that
were meant to be internal-only but were not explicitly marked as such
to the same struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 42 +++++++++++++++++++++---------------------
 unpack-trees.h |  7 +++++--
 2 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 4fca051cbea..c659af67c62 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1809,7 +1809,7 @@ static void populate_from_existing_patterns(struct unpack_trees_options *o,
 	if (get_sparse_checkout_patterns(pl) < 0)
 		o->skip_sparse_checkout = 1;
 	else
-		o->pl = pl;
+		o->internal.pl = pl;
 }
 
 static void update_sparsity_for_prefix(const char *prefix,
@@ -1871,10 +1871,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 
 	if (len > MAX_UNPACK_TREES)
 		die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
-	if (o->dir)
-		BUG("o->dir is for internal use only");
-	if (o->pl)
-		BUG("o->pl is for internal use only");
+	if (o->internal.dir)
+		BUG("o->internal.dir is for internal use only");
+	if (o->internal.pl)
+		BUG("o->internal.pl is for internal use only");
 
 	trace_performance_enter();
 	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
@@ -1891,9 +1891,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		BUG("UNPACK_RESET_OVERWRITE_UNTRACKED incompatible with preserved ignored files");
 
 	if (!o->preserve_ignored) {
-		o->dir = &dir;
-		o->dir->flags |= DIR_SHOW_IGNORED;
-		setup_standard_excludes(o->dir);
+		o->internal.dir = &dir;
+		o->internal.dir->flags |= DIR_SHOW_IGNORED;
+		setup_standard_excludes(o->internal.dir);
 	}
 
 	if (o->prefix)
@@ -1943,7 +1943,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	 * Sparse checkout loop #1: set NEW_SKIP_WORKTREE on existing entries
 	 */
 	if (!o->skip_sparse_checkout)
-		mark_new_skip_worktree(o->pl, o->src_index, 0,
+		mark_new_skip_worktree(o->internal.pl, o->src_index, 0,
 				       CE_NEW_SKIP_WORKTREE, o->verbose_update);
 
 	if (!dfc)
@@ -2009,7 +2009,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		 * If they will have NEW_SKIP_WORKTREE, also set CE_SKIP_WORKTREE
 		 * so apply_sparse_checkout() won't attempt to remove it from worktree
 		 */
-		mark_new_skip_worktree(o->pl, &o->result,
+		mark_new_skip_worktree(o->internal.pl, &o->result,
 				       CE_ADDED, CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE,
 				       o->verbose_update);
 
@@ -2067,9 +2067,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 done:
 	if (free_pattern_list)
 		clear_pattern_list(&pl);
-	if (o->dir) {
-		dir_clear(o->dir);
-		o->dir = NULL;
+	if (o->internal.dir) {
+		dir_clear(o->internal.dir);
+		o->internal.dir = NULL;
 	}
 	trace2_region_leave("unpack_trees", "unpack_trees", the_repository);
 	trace_performance_leave("unpack_trees");
@@ -2117,14 +2117,14 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 		pl = xcalloc(1, sizeof(*pl));
 		populate_from_existing_patterns(o, pl);
 	}
-	o->pl = pl;
+	o->internal.pl = pl;
 
 	/* Expand sparse directories as needed */
-	expand_index(o->src_index, o->pl);
+	expand_index(o->src_index, o->internal.pl);
 
 	/* Set NEW_SKIP_WORKTREE on existing entries. */
 	mark_all_ce_unused(o->src_index);
-	mark_new_skip_worktree(o->pl, o->src_index, 0,
+	mark_new_skip_worktree(o->internal.pl, o->src_index, 0,
 			       CE_NEW_SKIP_WORKTREE, o->verbose_update);
 
 	/* Then loop over entries and update/remove as needed */
@@ -2152,7 +2152,7 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 	if (free_pattern_list) {
 		clear_pattern_list(pl);
 		free(pl);
-		o->pl = NULL;
+		o->internal.pl = NULL;
 	}
 	trace_performance_leave("update_sparsity");
 	return ret;
@@ -2340,8 +2340,8 @@ static int verify_clean_subdirectory(const struct cache_entry *ce,
 	pathbuf = xstrfmt("%.*s/", namelen, ce->name);
 
 	memset(&d, 0, sizeof(d));
-	if (o->dir)
-		d.exclude_per_dir = o->dir->exclude_per_dir;
+	if (o->internal.dir)
+		d.exclude_per_dir = o->internal.dir->exclude_per_dir;
 	i = read_directory(&d, o->src_index, pathbuf, namelen+1, NULL);
 	dir_clear(&d);
 	free(pathbuf);
@@ -2395,8 +2395,8 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 	if (ignore_case && icase_exists(o, name, len, st))
 		return 0;
 
-	if (o->dir &&
-	    is_excluded(o->dir, o->src_index, name, &dtype))
+	if (o->internal.dir &&
+	    is_excluded(o->internal.dir, o->src_index, name, &dtype))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
 		 * overwrite it.
diff --git a/unpack-trees.h b/unpack-trees.h
index f3a6e4f90ef..5c1a9314a06 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -97,9 +97,12 @@ struct unpack_trees_options {
 	struct index_state *src_index;
 	struct index_state result;
 
-	struct pattern_list *pl; /* for internal use */
-	struct dir_struct *dir; /* for internal use only */
 	struct checkout_metadata meta;
+
+	struct unpack_trees_options_internal {
+		struct pattern_list *pl;
+		struct dir_struct *dir;
+	} internal;
 };
 
 int unpack_trees(unsigned n, struct tree_desc *t,
-- 
gitgitgadget



* [PATCH v2 08/11] unpack-trees: mark fields only used internally as internal
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (6 preceding siblings ...)
  2023-02-25  2:25   ` [PATCH v2 07/11] unpack_trees: start splitting internal fields from public API Elijah Newren via GitGitGadget
@ 2023-02-25  2:25   ` Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 09/11] unpack-trees: rewrap a few overlong lines from previous patch Elijah Newren via GitGitGadget
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Continue the work from the previous patch by finding additional fields
which are only used internally but not yet explicitly marked as such,
and include them in the internal fields struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 159 +++++++++++++++++++++++++------------------------
 unpack-trees.h |  26 ++++----
 2 files changed, 95 insertions(+), 90 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index c659af67c62..f5294194aa1 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -66,8 +66,8 @@ static const char *unpack_plumbing_errors[NB_UNPACK_TREES_WARNING_TYPES] = {
 };
 
 #define ERRORMSG(o,type) \
-	( ((o) && (o)->msgs[(type)]) \
-	  ? ((o)->msgs[(type)])      \
+	( ((o) && (o)->internal.msgs[(type)]) \
+	  ? ((o)->internal.msgs[(type)])      \
 	  : (unpack_plumbing_errors[(type)]) )
 
 static const char *super_prefixed(const char *path, const char *super_prefix)
@@ -108,10 +108,10 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 				  const char *cmd)
 {
 	int i;
-	const char **msgs = opts->msgs;
+	const char **msgs = opts->internal.msgs;
 	const char *msg;
 
-	strvec_init(&opts->msgs_to_free);
+	strvec_init(&opts->internal.msgs_to_free);
 
 	if (!strcmp(cmd, "checkout"))
 		msg = advice_enabled(ADVICE_COMMIT_BEFORE_MERGE)
@@ -129,7 +129,7 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 			  "Please commit your changes or stash them before you %s.")
 		      : _("Your local changes to the following files would be overwritten by %s:\n%%s");
 	msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] =
-		strvec_pushf(&opts->msgs_to_free, msg, cmd, cmd);
+		strvec_pushf(&opts->internal.msgs_to_free, msg, cmd, cmd);
 
 	msgs[ERROR_NOT_UPTODATE_DIR] =
 		_("Updating the following directories would lose untracked files in them:\n%s");
@@ -153,7 +153,7 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 			  "Please move or remove them before you %s.")
 		      : _("The following untracked working tree files would be removed by %s:\n%%s");
 	msgs[ERROR_WOULD_LOSE_UNTRACKED_REMOVED] =
-		strvec_pushf(&opts->msgs_to_free, msg, cmd, cmd);
+		strvec_pushf(&opts->internal.msgs_to_free, msg, cmd, cmd);
 
 	if (!strcmp(cmd, "checkout"))
 		msg = advice_enabled(ADVICE_COMMIT_BEFORE_MERGE)
@@ -171,7 +171,7 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 			  "Please move or remove them before you %s.")
 		      : _("The following untracked working tree files would be overwritten by %s:\n%%s");
 	msgs[ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN] =
-		strvec_pushf(&opts->msgs_to_free, msg, cmd, cmd);
+		strvec_pushf(&opts->internal.msgs_to_free, msg, cmd, cmd);
 
 	/*
 	 * Special case: ERROR_BIND_OVERLAP refers to a pair of paths, we
@@ -189,16 +189,16 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 	msgs[WARNING_SPARSE_ORPHANED_NOT_OVERWRITTEN] =
 		_("The following paths were already present and thus not updated despite sparse patterns:\n%s");
 
-	opts->show_all_errors = 1;
+	opts->internal.show_all_errors = 1;
 	/* rejected paths may not have a static buffer */
-	for (i = 0; i < ARRAY_SIZE(opts->unpack_rejects); i++)
-		opts->unpack_rejects[i].strdup_strings = 1;
+	for (i = 0; i < ARRAY_SIZE(opts->internal.unpack_rejects); i++)
+		opts->internal.unpack_rejects[i].strdup_strings = 1;
 }
 
 void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
 {
-	strvec_clear(&opts->msgs_to_free);
-	memset(opts->msgs, 0, sizeof(opts->msgs));
+	strvec_clear(&opts->internal.msgs_to_free);
+	memset(opts->internal.msgs, 0, sizeof(opts->internal.msgs));
 }
 
 static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
@@ -210,7 +210,7 @@ static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
 		set |= CE_WT_REMOVE;
 
 	ce->ce_flags = (ce->ce_flags & ~clear) | set;
-	return add_index_entry(&o->result, ce,
+	return add_index_entry(&o->internal.result, ce,
 			       ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE);
 }
 
@@ -218,7 +218,7 @@ static void add_entry(struct unpack_trees_options *o,
 		      const struct cache_entry *ce,
 		      unsigned int set, unsigned int clear)
 {
-	do_add_entry(o, dup_cache_entry(ce, &o->result), set, clear);
+	do_add_entry(o, dup_cache_entry(ce, &o->internal.result), set, clear);
 }
 
 /*
@@ -233,7 +233,7 @@ static int add_rejected_path(struct unpack_trees_options *o,
 	if (o->quiet)
 		return -1;
 
-	if (!o->show_all_errors)
+	if (!o->internal.show_all_errors)
 		return error(ERRORMSG(o, e), super_prefixed(path,
 							    o->super_prefix));
 
@@ -241,7 +241,7 @@ static int add_rejected_path(struct unpack_trees_options *o,
 	 * Otherwise, insert in a list for future display by
 	 * display_(error|warning)_msgs()
 	 */
-	string_list_append(&o->unpack_rejects[e], path);
+	string_list_append(&o->internal.unpack_rejects[e], path);
 	return -1;
 }
 
@@ -253,7 +253,7 @@ static void display_error_msgs(struct unpack_trees_options *o)
 	int e;
 	unsigned error_displayed = 0;
 	for (e = 0; e < NB_UNPACK_TREES_ERROR_TYPES; e++) {
-		struct string_list *rejects = &o->unpack_rejects[e];
+		struct string_list *rejects = &o->internal.unpack_rejects[e];
 
 		if (rejects->nr > 0) {
 			int i;
@@ -281,7 +281,7 @@ static void display_warning_msgs(struct unpack_trees_options *o)
 	unsigned warning_displayed = 0;
 	for (e = NB_UNPACK_TREES_ERROR_TYPES + 1;
 	     e < NB_UNPACK_TREES_WARNING_TYPES; e++) {
-		struct string_list *rejects = &o->unpack_rejects[e];
+		struct string_list *rejects = &o->internal.unpack_rejects[e];
 
 		if (rejects->nr > 0) {
 			int i;
@@ -600,13 +600,14 @@ static void mark_ce_used(struct cache_entry *ce, struct unpack_trees_options *o)
 {
 	ce->ce_flags |= CE_UNPACKED;
 
-	if (o->cache_bottom < o->src_index->cache_nr &&
-	    o->src_index->cache[o->cache_bottom] == ce) {
-		int bottom = o->cache_bottom;
+	if (o->internal.cache_bottom < o->src_index->cache_nr &&
+	    o->src_index->cache[o->internal.cache_bottom] == ce) {
+		int bottom = o->internal.cache_bottom;
+
 		while (bottom < o->src_index->cache_nr &&
 		       o->src_index->cache[bottom]->ce_flags & CE_UNPACKED)
 			bottom++;
-		o->cache_bottom = bottom;
+		o->internal.cache_bottom = bottom;
 	}
 }
 
@@ -652,7 +653,7 @@ static void mark_ce_used_same_name(struct cache_entry *ce,
 static struct cache_entry *next_cache_entry(struct unpack_trees_options *o)
 {
 	const struct index_state *index = o->src_index;
-	int pos = o->cache_bottom;
+	int pos = o->internal.cache_bottom;
 
 	while (pos < index->cache_nr) {
 		struct cache_entry *ce = index->cache[pos];
@@ -711,7 +712,7 @@ static void restore_cache_bottom(struct traverse_info *info, int bottom)
 
 	if (o->diff_index_cached)
 		return;
-	o->cache_bottom = bottom;
+	o->internal.cache_bottom = bottom;
 }
 
 static int switch_cache_bottom(struct traverse_info *info)
@@ -721,13 +722,13 @@ static int switch_cache_bottom(struct traverse_info *info)
 
 	if (o->diff_index_cached)
 		return 0;
-	ret = o->cache_bottom;
+	ret = o->internal.cache_bottom;
 	pos = find_cache_pos(info->prev, info->name, info->namelen);
 
 	if (pos < -1)
-		o->cache_bottom = -2 - pos;
+		o->internal.cache_bottom = -2 - pos;
 	else if (pos < 0)
-		o->cache_bottom = o->src_index->cache_nr;
+		o->internal.cache_bottom = o->src_index->cache_nr;
 	return ret;
 }
 
@@ -873,9 +874,9 @@ static int traverse_trees_recursive(int n, unsigned long dirmask,
 		 * save and restore cache_bottom anyway to not miss
 		 * unprocessed entries before 'pos'.
 		 */
-		bottom = o->cache_bottom;
+		bottom = o->internal.cache_bottom;
 		ret = traverse_by_cache_tree(pos, nr_entries, n, info);
-		o->cache_bottom = bottom;
+		o->internal.cache_bottom = bottom;
 		return ret;
 	}
 
@@ -1212,7 +1213,7 @@ static int unpack_single_entry(int n, unsigned long mask,
 		 * cache entry from the index aware logic.
 		 */
 		src[i + o->merge] = create_ce_entry(info, names + i, stage,
-						    &o->result, o->merge,
+						    &o->internal.result, o->merge,
 						    bit & dirmask);
 	}
 
@@ -1237,7 +1238,7 @@ static int unpack_single_entry(int n, unsigned long mask,
 
 static int unpack_failed(struct unpack_trees_options *o, const char *message)
 {
-	discard_index(&o->result);
+	discard_index(&o->internal.result);
 	if (!o->quiet && !o->exiting_early) {
 		if (message)
 			return error("%s", message);
@@ -1260,7 +1261,7 @@ static int find_cache_pos(struct traverse_info *info,
 	struct index_state *index = o->src_index;
 	int pfxlen = info->pathlen;
 
-	for (pos = o->cache_bottom; pos < index->cache_nr; pos++) {
+	for (pos = o->internal.cache_bottom; pos < index->cache_nr; pos++) {
 		const struct cache_entry *ce = index->cache[pos];
 		const char *ce_name, *ce_slash;
 		int cmp, ce_len;
@@ -1271,8 +1272,8 @@ static int find_cache_pos(struct traverse_info *info,
 			 * we can never match it; don't check it
 			 * again.
 			 */
-			if (pos == o->cache_bottom)
-				++o->cache_bottom;
+			if (pos == o->internal.cache_bottom)
+				++o->internal.cache_bottom;
 			continue;
 		}
 		if (!ce_in_traverse_path(ce, info)) {
@@ -1450,7 +1451,7 @@ static int unpack_sparse_callback(int n, unsigned long mask, unsigned long dirma
 	 */
 	if (!is_null_oid(&names[0].oid)) {
 		src[0] = create_ce_entry(info, &names[0], 0,
-					&o->result, 1,
+					&o->internal.result, 1,
 					dirmask & (1ul << 0));
 		src[0]->ce_flags |= (CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);
 	}
@@ -1560,7 +1561,7 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 				 * in 'mark_ce_used()'
 				 */
 				if (!src[0] || !S_ISSPARSEDIR(src[0]->ce_mode))
-					o->cache_bottom += matches;
+					o->internal.cache_bottom += matches;
 				return mask;
 			}
 		}
@@ -1907,37 +1908,37 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		populate_from_existing_patterns(o, &pl);
 	}
 
-	index_state_init(&o->result, o->src_index->repo);
-	o->result.initialized = 1;
-	o->result.timestamp.sec = o->src_index->timestamp.sec;
-	o->result.timestamp.nsec = o->src_index->timestamp.nsec;
-	o->result.version = o->src_index->version;
+	index_state_init(&o->internal.result, o->src_index->repo);
+	o->internal.result.initialized = 1;
+	o->internal.result.timestamp.sec = o->src_index->timestamp.sec;
+	o->internal.result.timestamp.nsec = o->src_index->timestamp.nsec;
+	o->internal.result.version = o->src_index->version;
 	if (!o->src_index->split_index) {
-		o->result.split_index = NULL;
+		o->internal.result.split_index = NULL;
 	} else if (o->src_index == o->dst_index) {
 		/*
 		 * o->dst_index (and thus o->src_index) will be discarded
-		 * and overwritten with o->result at the end of this function,
+		 * and overwritten with o->internal.result at the end of this function,
 		 * so just use src_index's split_index to avoid having to
 		 * create a new one.
 		 */
-		o->result.split_index = o->src_index->split_index;
-		o->result.split_index->refcount++;
+		o->internal.result.split_index = o->src_index->split_index;
+		o->internal.result.split_index->refcount++;
 	} else {
-		o->result.split_index = init_split_index(&o->result);
+		o->internal.result.split_index = init_split_index(&o->internal.result);
 	}
-	oidcpy(&o->result.oid, &o->src_index->oid);
+	oidcpy(&o->internal.result.oid, &o->src_index->oid);
 	o->merge_size = len;
 	mark_all_ce_unused(o->src_index);
 
-	o->result.fsmonitor_last_update =
+	o->internal.result.fsmonitor_last_update =
 		xstrdup_or_null(o->src_index->fsmonitor_last_update);
-	o->result.fsmonitor_has_run_once = o->src_index->fsmonitor_has_run_once;
+	o->internal.result.fsmonitor_has_run_once = o->src_index->fsmonitor_has_run_once;
 
 	if (!o->src_index->initialized &&
 	    !repo->settings.command_requires_full_index &&
-	    is_sparse_index_allowed(&o->result, 0))
-		o->result.sparse_index = 1;
+	    is_sparse_index_allowed(&o->internal.result, 0))
+		o->internal.result.sparse_index = 1;
 
 	/*
 	 * Sparse checkout loop #1: set NEW_SKIP_WORKTREE on existing entries
@@ -1957,7 +1958,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		setup_traverse_info(&info, prefix);
 		info.fn = unpack_callback;
 		info.data = o;
-		info.show_all_errors = o->show_all_errors;
+		info.show_all_errors = o->internal.show_all_errors;
 		info.pathspec = o->pathspec;
 
 		if (o->prefix) {
@@ -1998,7 +1999,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	}
 	mark_all_ce_unused(o->src_index);
 
-	if (o->trivial_merges_only && o->nontrivial_merge) {
+	if (o->trivial_merges_only && o->internal.nontrivial_merge) {
 		ret = unpack_failed(o, "Merge requires file-level merging");
 		goto done;
 	}
@@ -2009,13 +2010,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		 * If they will have NEW_SKIP_WORKTREE, also set CE_SKIP_WORKTREE
 		 * so apply_sparse_checkout() won't attempt to remove it from worktree
 		 */
-		mark_new_skip_worktree(o->internal.pl, &o->result,
+		mark_new_skip_worktree(o->internal.pl, &o->internal.result,
 				       CE_ADDED, CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE,
 				       o->verbose_update);
 
 		ret = 0;
-		for (i = 0; i < o->result.cache_nr; i++) {
-			struct cache_entry *ce = o->result.cache[i];
+		for (i = 0; i < o->internal.result.cache_nr; i++) {
+			struct cache_entry *ce = o->internal.result.cache[i];
 
 			/*
 			 * Entries marked with CE_ADDED in merged_entry() do not have
@@ -2029,7 +2030,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			    verify_absent(ce, WARNING_SPARSE_ORPHANED_NOT_OVERWRITTEN, o))
 				ret = 1;
 
-			if (apply_sparse_checkout(&o->result, ce, o))
+			if (apply_sparse_checkout(&o->internal.result, ce, o))
 				ret = 1;
 		}
 		if (ret == 1) {
@@ -2037,30 +2038,30 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			 * Inability to sparsify or de-sparsify individual
 			 * paths is not an error, but just a warning.
 			 */
-			if (o->show_all_errors)
+			if (o->internal.show_all_errors)
 				display_warning_msgs(o);
 			ret = 0;
 		}
 	}
 
-	ret = check_updates(o, &o->result) ? (-2) : 0;
+	ret = check_updates(o, &o->internal.result) ? (-2) : 0;
 	if (o->dst_index) {
-		move_index_extensions(&o->result, o->src_index);
+		move_index_extensions(&o->internal.result, o->src_index);
 		if (!ret) {
 			if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
-				cache_tree_verify(the_repository, &o->result);
+				cache_tree_verify(the_repository, &o->internal.result);
 			if (!o->skip_cache_tree_update &&
-			    !cache_tree_fully_valid(o->result.cache_tree))
-				cache_tree_update(&o->result,
+			    !cache_tree_fully_valid(o->internal.result.cache_tree))
+				cache_tree_update(&o->internal.result,
 						  WRITE_TREE_SILENT |
 						  WRITE_TREE_REPAIR);
 		}
 
-		o->result.updated_workdir = 1;
+		o->internal.result.updated_workdir = 1;
 		discard_index(o->dst_index);
-		*o->dst_index = o->result;
+		*o->dst_index = o->internal.result;
 	} else {
-		discard_index(&o->result);
+		discard_index(&o->internal.result);
 	}
 	o->src_index = NULL;
 
@@ -2076,7 +2077,7 @@ done:
 	return ret;
 
 return_failed:
-	if (o->show_all_errors)
+	if (o->internal.show_all_errors)
 		display_error_msgs(o);
 	mark_all_ce_unused(o->src_index);
 	ret = unpack_failed(o, NULL);
@@ -2099,9 +2100,9 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 	unsigned old_show_all_errors;
 	int free_pattern_list = 0;
 
-	old_show_all_errors = o->show_all_errors;
-	o->show_all_errors = 1;
-	index_state_init(&o->result, o->src_index->repo);
+	old_show_all_errors = o->internal.show_all_errors;
+	o->internal.show_all_errors = 1;
+	index_state_init(&o->internal.result, o->src_index->repo);
 
 	/* Sanity checks */
 	if (!o->update || o->index_only || o->skip_sparse_checkout)
@@ -2148,7 +2149,7 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 		ret = UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES;
 
 	display_warning_msgs(o);
-	o->show_all_errors = old_show_all_errors;
+	o->internal.show_all_errors = old_show_all_errors;
 	if (free_pattern_list) {
 		clear_pattern_list(pl);
 		free(pl);
@@ -2248,15 +2249,15 @@ static int verify_uptodate_sparse(const struct cache_entry *ce,
 }
 
 /*
- * TODO: We should actually invalidate o->result, not src_index [1].
+ * TODO: We should actually invalidate o->internal.result, not src_index [1].
  * But since cache tree and untracked cache both are not copied to
- * o->result until unpacking is complete, we invalidate them on
+ * o->internal.result until unpacking is complete, we invalidate them on
  * src_index instead with the assumption that they will be copied to
  * dst_index at the end.
  *
  * [1] src_index->cache_tree is also used in unpack_callback() so if
- * we invalidate o->result, we need to update it to use
- * o->result.cache_tree as well.
+ * we invalidate o->internal.result, we need to update it to use
+ * o->internal.result.cache_tree as well.
  */
 static void invalidate_ce_path(const struct cache_entry *ce,
 			       struct unpack_trees_options *o)
@@ -2424,7 +2425,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 	 * delete this path, which is in a subdirectory that
 	 * is being replaced with a blob.
 	 */
-	result = index_file_exists(&o->result, name, len, 0);
+	result = index_file_exists(&o->internal.result, name, len, 0);
 	if (result) {
 		if (result->ce_flags & CE_REMOVE)
 			return 0;
@@ -2525,7 +2526,7 @@ static int merged_entry(const struct cache_entry *ce,
 			struct unpack_trees_options *o)
 {
 	int update = CE_UPDATE;
-	struct cache_entry *merge = dup_cache_entry(ce, &o->result);
+	struct cache_entry *merge = dup_cache_entry(ce, &o->internal.result);
 
 	if (!old) {
 		/*
@@ -2620,7 +2621,7 @@ static int merged_sparse_dir(const struct cache_entry * const *src, int n,
 	setup_traverse_info(&info, src[0]->name);
 	info.fn = unpack_sparse_callback;
 	info.data = o;
-	info.show_all_errors = o->show_all_errors;
+	info.show_all_errors = o->internal.show_all_errors;
 	info.pathspec = o->pathspec;
 
 	/* Get the tree descriptors of the sparse directory in each of the merging trees */
@@ -2838,7 +2839,7 @@ int threeway_merge(const struct cache_entry * const *stages,
 			return -1;
 	}
 
-	o->nontrivial_merge = 1;
+	o->internal.nontrivial_merge = 1;
 
 	/* #2, #3, #4, #6, #7, #9, #10, #11. */
 	count = 0;
diff --git a/unpack-trees.h b/unpack-trees.h
index 5c1a9314a06..0335c89bc75 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -59,7 +59,6 @@ struct unpack_trees_options {
 		     preserve_ignored,
 		     clone,
 		     index_only,
-		     nontrivial_merge,
 		     trivial_merges_only,
 		     verbose_update,
 		     aggressive,
@@ -70,22 +69,13 @@ struct unpack_trees_options {
 		     skip_sparse_checkout,
 		     quiet,
 		     exiting_early,
-		     show_all_errors,
 		     dry_run,
 		     skip_cache_tree_update;
 	enum unpack_trees_reset_type reset;
 	const char *prefix;
 	const char *super_prefix;
-	int cache_bottom;
 	struct pathspec *pathspec;
 	merge_fn_t fn;
-	const char *msgs[NB_UNPACK_TREES_WARNING_TYPES];
-	struct strvec msgs_to_free;
-	/*
-	 * Store error messages in an array, each case
-	 * corresponding to a error message type
-	 */
-	struct string_list unpack_rejects[NB_UNPACK_TREES_WARNING_TYPES];
 
 	int head_idx;
 	int merge_size;
@@ -95,11 +85,25 @@ struct unpack_trees_options {
 
 	struct index_state *dst_index;
 	struct index_state *src_index;
-	struct index_state result;
 
 	struct checkout_metadata meta;
 
 	struct unpack_trees_options_internal {
+		unsigned int nontrivial_merge,
+			     show_all_errors;
+
+		int cache_bottom;
+		const char *msgs[NB_UNPACK_TREES_WARNING_TYPES];
+		struct strvec msgs_to_free;
+
+		/*
+		 * Store error messages in an array, each case
+		 * corresponding to a error message type
+		 */
+		struct string_list unpack_rejects[NB_UNPACK_TREES_WARNING_TYPES];
+
+		struct index_state result;
+
 		struct pattern_list *pl;
 		struct dir_struct *dir;
 	} internal;
-- 
gitgitgadget



* [PATCH v2 09/11] unpack-trees: rewrap a few overlong lines from previous patch
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (7 preceding siblings ...)
  2023-02-25  2:25   ` [PATCH v2 08/11] unpack-trees: mark fields only used internally as internal Elijah Newren via GitGitGadget
@ 2023-02-25  2:25   ` Elijah Newren via GitGitGadget
  2023-02-25  2:25   ` [PATCH v2 10/11] unpack-trees: special case read-tree debugging as internal usage Elijah Newren via GitGitGadget
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The previous patch made many lines a little longer, resulting in four
becoming a bit too long.  They were left as-is for the previous patch
to facilitate reviewers verifying that we were just adding "internal."
in a bunch of places, but rewrap them now.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index f5294194aa1..985896d6af6 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1213,8 +1213,8 @@ static int unpack_single_entry(int n, unsigned long mask,
 		 * cache entry from the index aware logic.
 		 */
 		src[i + o->merge] = create_ce_entry(info, names + i, stage,
-						    &o->internal.result, o->merge,
-						    bit & dirmask);
+						    &o->internal.result,
+						    o->merge, bit & dirmask);
 	}
 
 	if (o->merge) {
@@ -1918,14 +1918,15 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	} else if (o->src_index == o->dst_index) {
 		/*
 		 * o->dst_index (and thus o->src_index) will be discarded
-		 * and overwritten with o->internal.result at the end of this function,
-		 * so just use src_index's split_index to avoid having to
-		 * create a new one.
+		 * and overwritten with o->internal.result at the end of
+		 * this function, so just use src_index's split_index to
+		 * avoid having to create a new one.
 		 */
 		o->internal.result.split_index = o->src_index->split_index;
 		o->internal.result.split_index->refcount++;
 	} else {
-		o->internal.result.split_index = init_split_index(&o->internal.result);
+		o->internal.result.split_index =
+			init_split_index(&o->internal.result);
 	}
 	oidcpy(&o->internal.result.oid, &o->src_index->oid);
 	o->merge_size = len;
@@ -2049,7 +2050,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		move_index_extensions(&o->internal.result, o->src_index);
 		if (!ret) {
 			if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
-				cache_tree_verify(the_repository, &o->internal.result);
+				cache_tree_verify(the_repository,
+						  &o->internal.result);
 			if (!o->skip_cache_tree_update &&
 			    !cache_tree_fully_valid(o->internal.result.cache_tree))
 				cache_tree_update(&o->internal.result,
-- 
gitgitgadget



* [PATCH v2 10/11] unpack-trees: special case read-tree debugging as internal usage
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (8 preceding siblings ...)
  2023-02-25  2:25   ` [PATCH v2 09/11] unpack-trees: rewrap a few overlong lines from previous patch Elijah Newren via GitGitGadget
@ 2023-02-25  2:25   ` Elijah Newren via GitGitGadget
  2023-02-25  2:26   ` [PATCH v2 11/11] unpack-trees: add usage notices around df_conflict_entry Elijah Newren via GitGitGadget
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:25 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

builtin/read-tree.c has some special functionality explicitly designed
for debugging unpack-trees.[ch].  Associated with that are two fields
that no other external caller would or should use.  Mark these as
internal to unpack-trees, but allow builtin/read-tree to read or write
them for this special case.
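
As a rough sketch of this special case, assume a hypothetical
demo_options struct and two hypothetical helpers, demo_unpack and
demo_format_merge_header; none of these names exist in git.  The
machinery writes merge_size, and only the debugging front end is
expected to read it back through the internal struct.

```c
#include <stdio.h>

struct demo_options {
	unsigned int dry_run;			/* public */

	struct demo_options_internal {
		unsigned int debug_unpack;	/* set only by the debug front end */
		int merge_size;			/* written by the machinery */
	} internal;
};

/* Machinery side: record how many trees are being merged. */
void demo_unpack(struct demo_options *o, int ntrees)
{
	o->internal.merge_size = ntrees;
}

/*
 * Debug side: the one sanctioned reader of internal.merge_size.
 * Returns the value returned by snprintf().
 */
int demo_format_merge_header(const struct demo_options *o,
			     char *buf, size_t len)
{
	return snprintf(buf, len, "* %d-way merge", o->internal.merge_size);
}
```

The "internal." prefix at the access site documents the exception in
the code itself.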

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/read-tree.c | 10 +++++-----
 unpack-trees.c      | 22 +++++++++++-----------
 unpack-trees.h      |  6 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 3ce75417833..6034408d486 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -87,9 +87,9 @@ static int debug_merge(const struct cache_entry * const *stages,
 {
 	int i;
 
-	printf("* %d-way merge\n", o->merge_size);
+	printf("* %d-way merge\n", o->internal.merge_size);
 	debug_stage("index", stages[0], o);
-	for (i = 1; i <= o->merge_size; i++) {
+	for (i = 1; i <= o->internal.merge_size; i++) {
 		char buf[24];
 		xsnprintf(buf, sizeof(buf), "ent#%d", i);
 		debug_stage(buf, stages[i], o);
@@ -144,7 +144,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 		OPT__DRY_RUN(&opts.dry_run, N_("don't update the index or the work tree")),
 		OPT_BOOL(0, "no-sparse-checkout", &opts.skip_sparse_checkout,
 			 N_("skip applying sparse checkout filter")),
-		OPT_BOOL(0, "debug-unpack", &opts.debug_unpack,
+		OPT_BOOL(0, "debug-unpack", &opts.internal.debug_unpack,
 			 N_("debug unpack-trees")),
 		OPT_CALLBACK_F(0, "recurse-submodules", NULL,
 			    "checkout", "control recursive updating of submodules",
@@ -247,7 +247,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 			opts.head_idx = 1;
 	}
 
-	if (opts.debug_unpack)
+	if (opts.internal.debug_unpack)
 		opts.fn = debug_merge;
 
 	/* If we're going to prime_cache_tree later, skip cache tree update */
@@ -263,7 +263,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 	if (unpack_trees(nr_trees, t, &opts))
 		return 128;
 
-	if (opts.debug_unpack || opts.dry_run)
+	if (opts.internal.debug_unpack || opts.dry_run)
 		return 0; /* do not write the index out */
 
 	/*
diff --git a/unpack-trees.c b/unpack-trees.c
index 985896d6af6..e58f0f6a867 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -839,7 +839,7 @@ static int traverse_by_cache_tree(int pos, int nr_entries, int nr_names,
 		mark_ce_used(src[0], o);
 	}
 	free(tree_ce);
-	if (o->debug_unpack)
+	if (o->internal.debug_unpack)
 		printf("Unpacked %d entries from %s to %s using cache-tree\n",
 		       nr_entries,
 		       o->src_index->cache[pos]->name,
@@ -1488,7 +1488,7 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 	while (!p->mode)
 		p++;
 
-	if (o->debug_unpack)
+	if (o->internal.debug_unpack)
 		debug_unpack_callback(n, mask, dirmask, names, info);
 
 	/* Are we supposed to look at the index too? */
@@ -1929,7 +1929,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			init_split_index(&o->internal.result);
 	}
 	oidcpy(&o->internal.result.oid, &o->src_index->oid);
-	o->merge_size = len;
+	o->internal.merge_size = len;
 	mark_all_ce_unused(o->src_index);
 
 	o->internal.result.fsmonitor_last_update =
@@ -2882,9 +2882,9 @@ int twoway_merge(const struct cache_entry * const *src,
 	const struct cache_entry *oldtree = src[1];
 	const struct cache_entry *newtree = src[2];
 
-	if (o->merge_size != 2)
+	if (o->internal.merge_size != 2)
 		return error("Cannot do a twoway merge of %d trees",
-			     o->merge_size);
+			     o->internal.merge_size);
 
 	if (oldtree == o->df_conflict_entry)
 		oldtree = NULL;
@@ -2964,9 +2964,9 @@ int bind_merge(const struct cache_entry * const *src,
 	const struct cache_entry *old = src[0];
 	const struct cache_entry *a = src[1];
 
-	if (o->merge_size != 1)
+	if (o->internal.merge_size != 1)
 		return error("Cannot do a bind merge of %d trees",
-			     o->merge_size);
+			     o->internal.merge_size);
 	if (a && old)
 		return o->quiet ? -1 :
 			error(ERRORMSG(o, ERROR_BIND_OVERLAP),
@@ -2990,9 +2990,9 @@ int oneway_merge(const struct cache_entry * const *src,
 	const struct cache_entry *old = src[0];
 	const struct cache_entry *a = src[1];
 
-	if (o->merge_size != 1)
+	if (o->internal.merge_size != 1)
 		return error("Cannot do a oneway merge of %d trees",
-			     o->merge_size);
+			     o->internal.merge_size);
 
 	if (!a || a == o->df_conflict_entry)
 		return deleted_entry(old, old, o);
@@ -3027,8 +3027,8 @@ int stash_worktree_untracked_merge(const struct cache_entry * const *src,
 	const struct cache_entry *worktree = src[1];
 	const struct cache_entry *untracked = src[2];
 
-	if (o->merge_size != 2)
-		BUG("invalid merge_size: %d", o->merge_size);
+	if (o->internal.merge_size != 2)
+		BUG("invalid merge_size: %d", o->internal.merge_size);
 
 	if (worktree && untracked)
 		return error(_("worktree and untracked commit have duplicate entries: %s"),
diff --git a/unpack-trees.h b/unpack-trees.h
index 0335c89bc75..e8737adfeda 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -65,7 +65,6 @@ struct unpack_trees_options {
 		     skip_unmerged,
 		     initial_checkout,
 		     diff_index_cached,
-		     debug_unpack,
 		     skip_sparse_checkout,
 		     quiet,
 		     exiting_early,
@@ -78,7 +77,6 @@ struct unpack_trees_options {
 	merge_fn_t fn;
 
 	int head_idx;
-	int merge_size;
 
 	struct cache_entry *df_conflict_entry;
 	void *unpack_data;
@@ -90,8 +88,10 @@ struct unpack_trees_options {
 
 	struct unpack_trees_options_internal {
 		unsigned int nontrivial_merge,
-			     show_all_errors;
+			     show_all_errors,
+			     debug_unpack; /* used by read-tree debugging */
 
+		int merge_size; /* used by read-tree debugging */
 		int cache_bottom;
 		const char *msgs[NB_UNPACK_TREES_WARNING_TYPES];
 		struct strvec msgs_to_free;
-- 
gitgitgadget


* [PATCH v2 11/11] unpack-trees: add usage notices around df_conflict_entry
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (9 preceding siblings ...)
  2023-02-25  2:25   ` [PATCH v2 10/11] unpack-trees: special case read-tree debugging as internal usage Elijah Newren via GitGitGadget
@ 2023-02-25  2:26   ` Elijah Newren via GitGitGadget
  2023-02-25 23:30   ` [PATCH v2 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Junio C Hamano
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-25  2:26 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Avoid making users believe they need to initialize df_conflict_entry
to something (as happened with other output only fields before) with
a quick comment and a small sanity check.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 2 ++
 unpack-trees.h | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index e58f0f6a867..aafc5eca791 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1876,6 +1876,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		BUG("o->internal.dir is for internal use only");
 	if (o->internal.pl)
 		BUG("o->internal.pl is for internal use only");
+	if (o->df_conflict_entry)
+		BUG("o->df_conflict_entry is an output only field");
 
 	trace_performance_enter();
 	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
diff --git a/unpack-trees.h b/unpack-trees.h
index e8737adfeda..61c06eb7c50 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -78,7 +78,7 @@ struct unpack_trees_options {
 
 	int head_idx;
 
-	struct cache_entry *df_conflict_entry;
+	struct cache_entry *df_conflict_entry; /* output only */
 	void *unpack_data;
 
 	struct index_state *dst_index;
-- 
gitgitgadget

* Re: [PATCH 02/11] dir: add a usage note to exclude_per_dir
  2023-02-25  1:54       ` Jonathan Tan
@ 2023-02-25  3:23         ` Elijah Newren
  0 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren @ 2023-02-25  3:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Elijah Newren via GitGitGadget, git

On Fri, Feb 24, 2023 at 5:54 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
> > On Fri, Feb 24, 2023 at 2:31 PM Jonathan Tan <jonathantanmy@google.com> wrote:
> > >
> > > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > > > diff --git a/dir.h b/dir.h
> > > > index 33fd848fc8d..2196e12630c 100644
> > > > --- a/dir.h
> > > > +++ b/dir.h
> > > > @@ -295,8 +295,12 @@ struct dir_struct {
> > > >       struct untracked_cache *untracked;
> > > >
> > > >       /**
> > > > -      * The name of the file to be read in each directory for excluded files
> > > > -      * (typically `.gitignore`).
> > > > +      * Deprecated: ls-files is the only allowed caller; all other callers
> > > > +      * should leave this as NULL; it pre-dated the
> > > > +      * setup_standard_excludes() mechanism that replaces this.
> > > > +      *
> > > > +      * This field tracks the name of the file to be read in each directory
> > > > +      * for excluded files (typically `.gitignore`).
> > > >        */
> > > >       const char *exclude_per_dir;
> > >
> > > I'm not sure what is meant by "allowed caller", but I wouldn't have
> > > expected this to also mean that unpack-trees would need to know to
> > > propagate this from o->internal.dir to d in verify_clean_subdirectory.
> >
> > Are you confusing fields that are internal to dir, with fields that
> > are internal to unpack-trees?
> >
> > This series does not make exclude_per_dir an internal field within dir_struct.
>
> Agreed, but the comment says that ls-files is the only allowed caller,
> and I would have expected that non-"allowed callers" would not need to
> write to exclude_per_dir. But in unpack-trees.c:
>
>   2346          if (o->internal.dir)
>   2347                  d.exclude_per_dir = o->internal.dir->exclude_per_dir;
>
> Both "d" and "o->internal.dir" are of type "struct dir_struct" (well,
> one is not a pointer and one is). I would not have expected such non-
> ls-files code to read or write from this field. (But if unpack-trees
> is considered part of ls-files and/or copying the same field to another
> struct is not considered "calling", then this patch is fine, I guess.)

Ah, gotcha, sorry for misunderstanding earlier.  And you are right;
the code in unpack_trees() is buggy and should not be using
exclude_per_dir; the fact that it is using it directly actually hints
that it has bugs when exclusions are specified in .git/info/exclude
instead of .gitignore.  I've found a testcase that demonstrates this;
I'll update the series with a new patch fixing it.

* Re: [PATCH v2 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (10 preceding siblings ...)
  2023-02-25  2:26   ` [PATCH v2 11/11] unpack-trees: add usage notices around df_conflict_entry Elijah Newren via GitGitGadget
@ 2023-02-25 23:30   ` Junio C Hamano
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
  12 siblings, 0 replies; 57+ messages in thread
From: Junio C Hamano @ 2023-02-25 23:30 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Changes since v1 (thanks to Jonathan Tan for the careful reviews!)
>
>  * Clear o->pl when freeing pl, to avoid risking use-after-free.
>  * Initialize o->result in update_sparsity() since it is actually used (by
>    check_ok_to_remove()).
>
> Some time ago, I noticed that struct dir_struct and struct
> unpack_trees_options both have numerous fields meant for internal use only,
> most of which are not marked as such. This has resulted in callers
> accidentally trying to initialize some of these fields, and in at least one
> case required a fair amount of review to verify other changes were okay --
> review that would have been simplified with the apriori knowledge that a
> combination of multiple fields were internal-only[1]. Looking closer, I
> found that only 6 out of 18 fields in dir_struct were actually meant to be
> public[2], and noted that unpack_trees_options also had 11 internal-only
> fields (out of 36).

Nice.

Will queue.

Thanks.

* [PATCH v3 00/13] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
  2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (11 preceding siblings ...)
  2023-02-25 23:30   ` [PATCH v2 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Junio C Hamano
@ 2023-02-27 15:28   ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 01/13] t2021: fix platform-specific leftover cruft Elijah Newren via GitGitGadget
                       ` (12 more replies)
  12 siblings, 13 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren

Changes since v2:

 * Two new patches:
   * one patch (2nd in the series) that fixes a bug in unpack-trees caused
     by the code never having been updated to use setup_standard_excludes(),
     and adds a test to guard against regressing it
   * a preliminary patch (1st in the series) that fixes a separate latent
     issue in the test file modified by that patch, so that the new test
     actually works on all platforms

Changes since v1 (thanks to Jonathan Tan for the careful reviews!):

 * Clear o->pl when freeing pl, to avoid risking use-after-free.
 * Initialize o->result in update_sparsity() since it is actually used (by
   check_ok_to_remove()).

Some time ago, I noticed that struct dir_struct and struct
unpack_trees_options both have numerous fields meant for internal use only,
most of which are not marked as such. This has resulted in callers
accidentally trying to initialize some of these fields, and in at least one
case required a fair amount of review to verify other changes were okay --
review that would have been simplified with the a priori knowledge that a
combination of multiple fields was internal-only[1]. Looking closer, I
found that only 6 out of 18 fields in dir_struct were actually meant to be
public[2], and noted that unpack_trees_options also had 11 internal-only
fields (out of 36).

This patch series is primarily about moving internal-only fields within
these two structs into an embedded internal struct. Patch breakdown:

 * Patches 1-2: Fix a bug uncovered during review
   * Patch 1: Fix test cleanup so the new test works on all platforms
   * Patch 2: Make unpack-trees use setup_standard_excludes(), with a test
 * Patches 3-5: Restructuring dir_struct
   * Patch 3: Splitting off internal-use-only fields
   * Patch 4: Add important usage note to avoid accidentally using
     deprecated API
   * Patch 5: Mark output-only fields as such
 * Patches 6-13: Restructuring unpack_trees_options
   * Patches 6-8: Preparatory cleanup
   * Patches 9-12: Splitting off internal-use-only fields
   * Patch 13: Mark output-only field as such

To make the benefit clearer, here are compressed versions of dir_struct
both before and after the changes. First, before:

struct dir_struct {
    int nr;
    int alloc;
    int ignored_nr;
    int ignored_alloc;
    enum [...] flags;
    struct dir_entry **entries;
    struct dir_entry **ignored;
    const char *exclude_per_dir;
#define EXC_CMDL 0
#define EXC_DIRS 1
#define EXC_FILE 2
    struct exclude_list_group exclude_list_group[3];
    struct exclude_stack *exclude_stack;
    struct path_pattern *pattern;
    struct strbuf basebuf;
    struct untracked_cache *untracked;
    struct oid_stat ss_info_exclude;
    struct oid_stat ss_excludes_file;
    unsigned unmanaged_exclude_files;
    unsigned visited_paths;
    unsigned visited_directories;
};


And after the changes:

struct dir_struct {
    enum [...] flags;
    int nr; /* output only */
    int ignored_nr; /* output only */
    struct dir_entry **entries; /* output only */
    struct dir_entry **ignored; /* output only */
    struct untracked_cache *untracked;
    const char *exclude_per_dir; /* deprecated */
    struct dir_struct_internal {
        int alloc;
        int ignored_alloc;
#define EXC_CMDL 0
#define EXC_DIRS 1
#define EXC_FILE 2
        struct exclude_list_group exclude_list_group[3];
        struct exclude_stack *exclude_stack;
        struct path_pattern *pattern;
        struct strbuf basebuf;
        struct oid_stat ss_info_exclude;
        struct oid_stat ss_excludes_file;
        unsigned unmanaged_exclude_files;
        unsigned visited_paths;
        unsigned visited_directories;
    } internal;
};


The former version has 18 fields (and 3 magic constants) which API users
will have to figure out. The latter makes it clear there are only at most 2
fields you should be setting upon input, and at most 4 which you read at
output, and the rest (including all the magic constants) you can ignore.

[0] Search for "Extremely yes" in
https://lore.kernel.org/git/CAJoAoZm+TkCL0Jpg_qFgKottxbtiG2QOiY0qGrz3-uQy+=waPg@mail.gmail.com/
[1]
https://lore.kernel.org/git/CABPp-BFSFN3WM6q7KzkD5mhrwsz--St_-ej5LbaY8Yr2sZzj=w@mail.gmail.com/
[2]
https://lore.kernel.org/git/CABPp-BHgot=CPNyK_xNfog_SqsNPNoCGfiSb-gZoS2sn_741dQ@mail.gmail.com/

Elijah Newren (13):
  t2021: fix platform-specific leftover cruft
  unpack-trees: heed requests to overwrite ignored files
  dir: separate public from internal portion of dir_struct
  dir: add a usage note to exclude_per_dir
  dir: mark output only fields of dir_struct as such
  unpack-trees: clean up some flow control
  sparse-checkout: avoid using internal API of unpack-trees
  sparse-checkout: avoid using internal API of unpack-trees, take 2
  unpack_trees: start splitting internal fields from public API
  unpack-trees: mark fields only used internally as internal
  unpack-trees: rewrap a few overlong lines from previous patch
  unpack-trees: special case read-tree debugging as internal usage
  unpack-trees: add usage notices around df_conflict_entry

 builtin/read-tree.c           |  10 +-
 builtin/sparse-checkout.c     |   4 +-
 dir.c                         | 114 ++++++++--------
 dir.h                         | 110 ++++++++-------
 t/t2021-checkout-overwrite.sh |  16 ++-
 unpack-trees.c                | 247 ++++++++++++++++++----------------
 unpack-trees.h                |  42 +++---
 7 files changed, 292 insertions(+), 251 deletions(-)


base-commit: 06dd2baa8da4a73421b959ec026a43711b9d77f9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1149%2Fnewren%2Fclarify-api-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1149/newren/clarify-api-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1149

Range-diff vs v2:

  -:  ----------- >  1:  7bbc4577a57 t2021: fix platform-specific leftover cruft
  -:  ----------- >  2:  8ffdb6c8a8a unpack-trees: heed requests to overwrite ignored files
  1:  7f59ad548d0 =  3:  879a93ac2d7 dir: separate public from internal portion of dir_struct
  2:  239b10e1181 !  4:  4ce9fae5e7f dir: add a usage note to exclude_per_dir
     @@ Metadata
       ## Commit message ##
          dir: add a usage note to exclude_per_dir
      
     +    As evidenced by the fix a couple commits ago, places in the code using
     +    exclude_per_dir are likely buggy and should be adapted to call
     +    setup_standard_excludes() instead.  Unfortunately, the usage of
     +    exclude_per_dir has been hardcoded into the arguments ls-files accepts,
     +    so we cannot actually remove it.  Add a note that it is deprecated and
     +    no other callers should use it directly.
     +
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
       ## dir.h ##
  3:  b8aa14350d3 =  5:  12344400fa0 dir: mark output only fields of dir_struct as such
  4:  f5a58123034 =  6:  4e86e39506c unpack-trees: clean up some flow control
  5:  975dec0f0eb =  7:  cd3d4894afb sparse-checkout: avoid using internal API of unpack-trees
  6:  429f195dcfe =  8:  09140cb2ac5 sparse-checkout: avoid using internal API of unpack-trees, take 2
  7:  993da584dbb !  9:  27f2d477116 unpack_trees: start splitting internal fields from public API
     @@ unpack-trees.c: static int verify_clean_subdirectory(const struct cache_entry *c
       
       	memset(&d, 0, sizeof(d));
      -	if (o->dir)
     --		d.exclude_per_dir = o->dir->exclude_per_dir;
      +	if (o->internal.dir)
     -+		d.exclude_per_dir = o->internal.dir->exclude_per_dir;
     + 		setup_standard_excludes(&d);
       	i = read_directory(&d, o->src_index, pathbuf, namelen+1, NULL);
       	dir_clear(&d);
     - 	free(pathbuf);
      @@ unpack-trees.c: static int check_ok_to_remove(const char *name, int len, int dtype,
       	if (ignore_case && icase_exists(o, name, len, st))
       		return 0;
  8:  8ecb24a45f0 = 10:  4236c0d80c7 unpack-trees: mark fields only used internally as internal
  9:  36ca49c3624 = 11:  76f4a544e4b unpack-trees: rewrap a few overlong lines from previous patch
 10:  5af04d7fe23 = 12:  ee36935adb5 unpack-trees: special case read-tree debugging as internal usage
 11:  c4f31237634 = 13:  6575c007577 unpack-trees: add usage notices around df_conflict_entry

-- 
gitgitgadget

* [PATCH v3 01/13] t2021: fix platform-specific leftover cruft
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 19:11       ` Derrick Stolee
  2023-02-27 15:28     ` [PATCH v3 02/13] unpack-trees: heed requests to overwrite ignored files Elijah Newren via GitGitGadget
                       ` (11 subsequent siblings)
  12 siblings, 1 reply; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

t2021.6 existed to test the status of a symlink that was left around by
previous tests.  It tried to also clean up the symlink after it was done
so that subsequent tests wouldn't be tripped up by it.  Unfortunately,
since this test had a SYMLINK prerequisite, that made the cleanup
platform dependent...and made a testcase I was trying to add to this
testsuite fail (that testcase will be included in the next patch).
Before we go and add new testcases, fix this cleanup by moving it into a
separate test.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t2021-checkout-overwrite.sh | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/t2021-checkout-overwrite.sh b/t/t2021-checkout-overwrite.sh
index 713c3fa6038..baca66e1a31 100755
--- a/t/t2021-checkout-overwrite.sh
+++ b/t/t2021-checkout-overwrite.sh
@@ -50,10 +50,13 @@ test_expect_success 'checkout commit with dir must not remove untracked a/b' '
 
 test_expect_success SYMLINKS 'the symlink remained' '
 
-	test_when_finished "rm a/b" &&
 	test -h a/b
 '
 
+test_expect_success 'cleanup after previous symlink tests' '
+	rm a/b
+'
+
 test_expect_success SYMLINKS 'checkout -f must not follow symlinks when removing entries' '
 	git checkout -f start &&
 	mkdir dir &&
-- 
gitgitgadget


* [PATCH v3 02/13] unpack-trees: heed requests to overwrite ignored files
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 01/13] t2021: fix platform-specific leftover cruft Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 23:20       ` Jonathan Tan
  2023-02-27 15:28     ` [PATCH v3 03/13] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
                       ` (10 subsequent siblings)
  12 siblings, 1 reply; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When a directory exists but has only ignored files within it and we are
trying to switch to a branch that has a file where that directory is,
the behavior depends upon --[no-]overwrite-ignore.  If the user wants to
--overwrite-ignore (the default), then we should delete the ignored files
and directory and switch to the new branch.

The code to handle this in verify_clean_subdirectory() in unpack-trees
tried to handle this via paying attention to the exclude_per_dir setting
of the internal dir field.  This came from commit c81935348b ("Fix
switching to a branch with D/F when current branch has file D.",
2007-03-15), which pre-dated 039bc64e88 ("core.excludesfile clean-up",
2007-11-14), and thus did not pay attention to ignore patterns from
other relevant files.  Change it to use setup_standard_excludes() so
that it is also aware of excludes specified in other locations.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t2021-checkout-overwrite.sh | 11 +++++++++++
 unpack-trees.c                |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/t/t2021-checkout-overwrite.sh b/t/t2021-checkout-overwrite.sh
index baca66e1a31..034f62c13c5 100755
--- a/t/t2021-checkout-overwrite.sh
+++ b/t/t2021-checkout-overwrite.sh
@@ -69,4 +69,15 @@ test_expect_success SYMLINKS 'checkout -f must not follow symlinks when removing
 	test_path_is_file untracked/f
 '
 
+test_expect_success 'checkout --overwrite-ignore should succeed if only ignored files in the way' '
+	git checkout -b df_conflict &&
+	test_commit contents some_dir &&
+	git checkout start &&
+	mkdir some_dir &&
+	echo autogenerated information >some_dir/ignore &&
+	echo ignore >.git/info/exclude &&
+	git checkout --overwrite-ignore df_conflict &&
+	! test_path_is_dir some_dir
+'
+
 test_done
diff --git a/unpack-trees.c b/unpack-trees.c
index 3d05e45a279..4518d33ed99 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2337,7 +2337,7 @@ static int verify_clean_subdirectory(const struct cache_entry *ce,
 
 	memset(&d, 0, sizeof(d));
 	if (o->dir)
-		d.exclude_per_dir = o->dir->exclude_per_dir;
+		setup_standard_excludes(&d);
 	i = read_directory(&d, o->src_index, pathbuf, namelen+1, NULL);
 	dir_clear(&d);
 	free(pathbuf);
-- 
gitgitgadget


* [PATCH v3 03/13] dir: separate public from internal portion of dir_struct
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 01/13] t2021: fix platform-specific leftover cruft Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 02/13] unpack-trees: heed requests to overwrite ignored files Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 04/13] dir: add a usage note to exclude_per_dir Elijah Newren via GitGitGadget
                       ` (9 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In order to make it clearer to callers what portions of dir_struct are
public API, and avoid errors from them setting fields that are meant as
internal API, split the fields used for internal implementation reasons
into a separate embedded struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.c | 114 +++++++++++++++++++++++++++++-----------------------------
 dir.h |  86 +++++++++++++++++++++++---------------------
 2 files changed, 104 insertions(+), 96 deletions(-)

diff --git a/dir.c b/dir.c
index 4e99f0c868f..7adf242026e 100644
--- a/dir.c
+++ b/dir.c
@@ -1190,7 +1190,7 @@ struct pattern_list *add_pattern_list(struct dir_struct *dir,
 	struct pattern_list *pl;
 	struct exclude_list_group *group;
 
-	group = &dir->exclude_list_group[group_type];
+	group = &dir->internal.exclude_list_group[group_type];
 	ALLOC_GROW(group->pl, group->nr + 1, group->alloc);
 	pl = &group->pl[group->nr++];
 	memset(pl, 0, sizeof(*pl));
@@ -1211,7 +1211,7 @@ static void add_patterns_from_file_1(struct dir_struct *dir, const char *fname,
 	 * differently when dir->untracked is non-NULL.
 	 */
 	if (!dir->untracked)
-		dir->unmanaged_exclude_files++;
+		dir->internal.unmanaged_exclude_files++;
 	pl = add_pattern_list(dir, EXC_FILE, fname);
 	if (add_patterns(fname, "", 0, pl, NULL, 0, oid_stat) < 0)
 		die(_("cannot use %s as an exclude file"), fname);
@@ -1219,7 +1219,7 @@ static void add_patterns_from_file_1(struct dir_struct *dir, const char *fname,
 
 void add_patterns_from_file(struct dir_struct *dir, const char *fname)
 {
-	dir->unmanaged_exclude_files++; /* see validate_untracked_cache() */
+	dir->internal.unmanaged_exclude_files++; /* see validate_untracked_cache() */
 	add_patterns_from_file_1(dir, fname, NULL);
 }
 
@@ -1519,7 +1519,7 @@ static struct path_pattern *last_matching_pattern_from_lists(
 	struct exclude_list_group *group;
 	struct path_pattern *pattern;
 	for (i = EXC_CMDL; i <= EXC_FILE; i++) {
-		group = &dir->exclude_list_group[i];
+		group = &dir->internal.exclude_list_group[i];
 		for (j = group->nr - 1; j >= 0; j--) {
 			pattern = last_matching_pattern_from_list(
 				pathname, pathlen, basename, dtype_p,
@@ -1545,20 +1545,20 @@ static void prep_exclude(struct dir_struct *dir,
 	struct untracked_cache_dir *untracked;
 	int current;
 
-	group = &dir->exclude_list_group[EXC_DIRS];
+	group = &dir->internal.exclude_list_group[EXC_DIRS];
 
 	/*
 	 * Pop the exclude lists from the EXCL_DIRS exclude_list_group
 	 * which originate from directories not in the prefix of the
 	 * path being checked.
 	 */
-	while ((stk = dir->exclude_stack) != NULL) {
+	while ((stk = dir->internal.exclude_stack) != NULL) {
 		if (stk->baselen <= baselen &&
-		    !strncmp(dir->basebuf.buf, base, stk->baselen))
+		    !strncmp(dir->internal.basebuf.buf, base, stk->baselen))
 			break;
-		pl = &group->pl[dir->exclude_stack->exclude_ix];
-		dir->exclude_stack = stk->prev;
-		dir->pattern = NULL;
+		pl = &group->pl[dir->internal.exclude_stack->exclude_ix];
+		dir->internal.exclude_stack = stk->prev;
+		dir->internal.pattern = NULL;
 		free((char *)pl->src); /* see strbuf_detach() below */
 		clear_pattern_list(pl);
 		free(stk);
@@ -1566,7 +1566,7 @@ static void prep_exclude(struct dir_struct *dir,
 	}
 
 	/* Skip traversing into sub directories if the parent is excluded */
-	if (dir->pattern)
+	if (dir->internal.pattern)
 		return;
 
 	/*
@@ -1574,12 +1574,12 @@ static void prep_exclude(struct dir_struct *dir,
 	 * memset(dir, 0, sizeof(*dir)) before use. Changing all of
 	 * them seems lots of work for little benefit.
 	 */
-	if (!dir->basebuf.buf)
-		strbuf_init(&dir->basebuf, PATH_MAX);
+	if (!dir->internal.basebuf.buf)
+		strbuf_init(&dir->internal.basebuf, PATH_MAX);
 
 	/* Read from the parent directories and push them down. */
 	current = stk ? stk->baselen : -1;
-	strbuf_setlen(&dir->basebuf, current < 0 ? 0 : current);
+	strbuf_setlen(&dir->internal.basebuf, current < 0 ? 0 : current);
 	if (dir->untracked)
 		untracked = stk ? stk->ucd : dir->untracked->root;
 	else
@@ -1599,32 +1599,33 @@ static void prep_exclude(struct dir_struct *dir,
 				die("oops in prep_exclude");
 			cp++;
 			untracked =
-				lookup_untracked(dir->untracked, untracked,
+				lookup_untracked(dir->untracked,
+						 untracked,
 						 base + current,
 						 cp - base - current);
 		}
-		stk->prev = dir->exclude_stack;
+		stk->prev = dir->internal.exclude_stack;
 		stk->baselen = cp - base;
 		stk->exclude_ix = group->nr;
 		stk->ucd = untracked;
 		pl = add_pattern_list(dir, EXC_DIRS, NULL);
-		strbuf_add(&dir->basebuf, base + current, stk->baselen - current);
-		assert(stk->baselen == dir->basebuf.len);
+		strbuf_add(&dir->internal.basebuf, base + current, stk->baselen - current);
+		assert(stk->baselen == dir->internal.basebuf.len);
 
 		/* Abort if the directory is excluded */
 		if (stk->baselen) {
 			int dt = DT_DIR;
-			dir->basebuf.buf[stk->baselen - 1] = 0;
-			dir->pattern = last_matching_pattern_from_lists(dir,
+			dir->internal.basebuf.buf[stk->baselen - 1] = 0;
+			dir->internal.pattern = last_matching_pattern_from_lists(dir,
 									istate,
-				dir->basebuf.buf, stk->baselen - 1,
-				dir->basebuf.buf + current, &dt);
-			dir->basebuf.buf[stk->baselen - 1] = '/';
-			if (dir->pattern &&
-			    dir->pattern->flags & PATTERN_FLAG_NEGATIVE)
-				dir->pattern = NULL;
-			if (dir->pattern) {
-				dir->exclude_stack = stk;
+				dir->internal.basebuf.buf, stk->baselen - 1,
+				dir->internal.basebuf.buf + current, &dt);
+			dir->internal.basebuf.buf[stk->baselen - 1] = '/';
+			if (dir->internal.pattern &&
+			    dir->internal.pattern->flags & PATTERN_FLAG_NEGATIVE)
+				dir->internal.pattern = NULL;
+			if (dir->internal.pattern) {
+				dir->internal.exclude_stack = stk;
 				return;
 			}
 		}
@@ -1647,15 +1648,15 @@ static void prep_exclude(struct dir_struct *dir,
 		      */
 		     !is_null_oid(&untracked->exclude_oid))) {
 			/*
-			 * dir->basebuf gets reused by the traversal, but we
-			 * need fname to remain unchanged to ensure the src
-			 * member of each struct path_pattern correctly
+			 * dir->internal.basebuf gets reused by the traversal,
+			 * but we need fname to remain unchanged to ensure the
+			 * src member of each struct path_pattern correctly
 			 * back-references its source file.  Other invocations
 			 * of add_pattern_list provide stable strings, so we
 			 * strbuf_detach() and free() here in the caller.
 			 */
 			struct strbuf sb = STRBUF_INIT;
-			strbuf_addbuf(&sb, &dir->basebuf);
+			strbuf_addbuf(&sb, &dir->internal.basebuf);
 			strbuf_addstr(&sb, dir->exclude_per_dir);
 			pl->src = strbuf_detach(&sb, NULL);
 			add_patterns(pl->src, pl->src, stk->baselen, pl, istate,
@@ -1681,10 +1682,10 @@ static void prep_exclude(struct dir_struct *dir,
 			invalidate_gitignore(dir->untracked, untracked);
 			oidcpy(&untracked->exclude_oid, &oid_stat.oid);
 		}
-		dir->exclude_stack = stk;
+		dir->internal.exclude_stack = stk;
 		current = stk->baselen;
 	}
-	strbuf_setlen(&dir->basebuf, baselen);
+	strbuf_setlen(&dir->internal.basebuf, baselen);
 }
 
 /*
@@ -1704,8 +1705,8 @@ struct path_pattern *last_matching_pattern(struct dir_struct *dir,
 
 	prep_exclude(dir, istate, pathname, basename-pathname);
 
-	if (dir->pattern)
-		return dir->pattern;
+	if (dir->internal.pattern)
+		return dir->internal.pattern;
 
 	return last_matching_pattern_from_lists(dir, istate, pathname, pathlen,
 			basename, dtype_p);
@@ -1742,7 +1743,7 @@ static struct dir_entry *dir_add_name(struct dir_struct *dir,
 	if (index_file_exists(istate, pathname, len, ignore_case))
 		return NULL;
 
-	ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);
+	ALLOC_GROW(dir->entries, dir->nr+1, dir->internal.alloc);
 	return dir->entries[dir->nr++] = dir_entry_new(pathname, len);
 }
 
@@ -1753,7 +1754,7 @@ struct dir_entry *dir_add_ignored(struct dir_struct *dir,
 	if (!index_name_is_other(istate, pathname, len))
 		return NULL;
 
-	ALLOC_GROW(dir->ignored, dir->ignored_nr+1, dir->ignored_alloc);
+	ALLOC_GROW(dir->ignored, dir->ignored_nr+1, dir->internal.ignored_alloc);
 	return dir->ignored[dir->ignored_nr++] = dir_entry_new(pathname, len);
 }
 
@@ -2569,7 +2570,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 
 	if (open_cached_dir(&cdir, dir, untracked, istate, &path, check_only))
 		goto out;
-	dir->visited_directories++;
+	dir->internal.visited_directories++;
 
 	if (untracked)
 		untracked->check_only = !!check_only;
@@ -2578,7 +2579,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* check how the file or directory should be treated */
 		state = treat_path(dir, untracked, &cdir, istate, &path,
 				   baselen, pathspec);
-		dir->visited_paths++;
+		dir->internal.visited_paths++;
 
 		if (state > dir_state)
 			dir_state = state;
@@ -2586,7 +2587,8 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* recurse into subdir if instructed by treat_path */
 		if (state == path_recurse) {
 			struct untracked_cache_dir *ud;
-			ud = lookup_untracked(dir->untracked, untracked,
+			ud = lookup_untracked(dir->untracked,
+					      untracked,
 					      path.buf + baselen,
 					      path.len - baselen);
 			subdir_state =
@@ -2846,7 +2848,7 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
 	 * condition also catches running setup_standard_excludes()
 	 * before setting dir->untracked!
 	 */
-	if (dir->unmanaged_exclude_files)
+	if (dir->internal.unmanaged_exclude_files)
 		return NULL;
 
 	/*
@@ -2875,7 +2877,7 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
 	 * EXC_CMDL is not considered in the cache. If people set it,
 	 * skip the cache.
 	 */
-	if (dir->exclude_list_group[EXC_CMDL].nr)
+	if (dir->internal.exclude_list_group[EXC_CMDL].nr)
 		return NULL;
 
 	if (!ident_in_untracked(dir->untracked)) {
@@ -2935,15 +2937,15 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
 
 	/* Validate $GIT_DIR/info/exclude and core.excludesfile */
 	root = dir->untracked->root;
-	if (!oideq(&dir->ss_info_exclude.oid,
+	if (!oideq(&dir->internal.ss_info_exclude.oid,
 		   &dir->untracked->ss_info_exclude.oid)) {
 		invalidate_gitignore(dir->untracked, root);
-		dir->untracked->ss_info_exclude = dir->ss_info_exclude;
+		dir->untracked->ss_info_exclude = dir->internal.ss_info_exclude;
 	}
-	if (!oideq(&dir->ss_excludes_file.oid,
+	if (!oideq(&dir->internal.ss_excludes_file.oid,
 		   &dir->untracked->ss_excludes_file.oid)) {
 		invalidate_gitignore(dir->untracked, root);
-		dir->untracked->ss_excludes_file = dir->ss_excludes_file;
+		dir->untracked->ss_excludes_file = dir->internal.ss_excludes_file;
 	}
 
 	/* Make sure this directory is not dropped out at saving phase */
@@ -2969,9 +2971,9 @@ static void emit_traversal_statistics(struct dir_struct *dir,
 	}
 
 	trace2_data_intmax("read_directory", repo,
-			   "directories-visited", dir->visited_directories);
+			   "directories-visited", dir->internal.visited_directories);
 	trace2_data_intmax("read_directory", repo,
-			   "paths-visited", dir->visited_paths);
+			   "paths-visited", dir->internal.visited_paths);
 
 	if (!dir->untracked)
 		return;
@@ -2993,8 +2995,8 @@ int read_directory(struct dir_struct *dir, struct index_state *istate,
 	struct untracked_cache_dir *untracked;
 
 	trace2_region_enter("dir", "read_directory", istate->repo);
-	dir->visited_paths = 0;
-	dir->visited_directories = 0;
+	dir->internal.visited_paths = 0;
+	dir->internal.visited_directories = 0;
 
 	if (has_symlink_leading_path(path, len)) {
 		trace2_region_leave("dir", "read_directory", istate->repo);
@@ -3342,14 +3344,14 @@ void setup_standard_excludes(struct dir_struct *dir)
 		excludes_file = xdg_config_home("ignore");
 	if (excludes_file && !access_or_warn(excludes_file, R_OK, 0))
 		add_patterns_from_file_1(dir, excludes_file,
-					 dir->untracked ? &dir->ss_excludes_file : NULL);
+					 dir->untracked ? &dir->internal.ss_excludes_file : NULL);
 
 	/* per repository user preference */
 	if (startup_info->have_repository) {
 		const char *path = git_path_info_exclude();
 		if (!access_or_warn(path, R_OK, 0))
 			add_patterns_from_file_1(dir, path,
-						 dir->untracked ? &dir->ss_info_exclude : NULL);
+						 dir->untracked ? &dir->internal.ss_info_exclude : NULL);
 	}
 }
 
@@ -3405,7 +3407,7 @@ void dir_clear(struct dir_struct *dir)
 	struct dir_struct new = DIR_INIT;
 
 	for (i = EXC_CMDL; i <= EXC_FILE; i++) {
-		group = &dir->exclude_list_group[i];
+		group = &dir->internal.exclude_list_group[i];
 		for (j = 0; j < group->nr; j++) {
 			pl = &group->pl[j];
 			if (i == EXC_DIRS)
@@ -3422,13 +3424,13 @@ void dir_clear(struct dir_struct *dir)
 	free(dir->ignored);
 	free(dir->entries);
 
-	stk = dir->exclude_stack;
+	stk = dir->internal.exclude_stack;
 	while (stk) {
 		struct exclude_stack *prev = stk->prev;
 		free(stk);
 		stk = prev;
 	}
-	strbuf_release(&dir->basebuf);
+	strbuf_release(&dir->internal.basebuf);
 
 	memcpy(dir, &new, sizeof(*dir));
 }
diff --git a/dir.h b/dir.h
index 8acfc044181..33fd848fc8d 100644
--- a/dir.h
+++ b/dir.h
@@ -215,14 +215,9 @@ struct dir_struct {
 	/* The number of members in `entries[]` array. */
 	int nr;
 
-	/* Internal use; keeps track of allocation of `entries[]` array.*/
-	int alloc;
-
 	/* The number of members in `ignored[]` array. */
 	int ignored_nr;
 
-	int ignored_alloc;
-
 	/* bit-field of options */
 	enum {
 
@@ -296,51 +291,62 @@ struct dir_struct {
 	 */
 	struct dir_entry **ignored;
 
+	/* Enable/update untracked file cache if set */
+	struct untracked_cache *untracked;
+
 	/**
 	 * The name of the file to be read in each directory for excluded files
 	 * (typically `.gitignore`).
 	 */
 	const char *exclude_per_dir;
 
-	/*
-	 * We maintain three groups of exclude pattern lists:
-	 *
-	 * EXC_CMDL lists patterns explicitly given on the command line.
-	 * EXC_DIRS lists patterns obtained from per-directory ignore files.
-	 * EXC_FILE lists patterns from fallback ignore files, e.g.
-	 *   - .git/info/exclude
-	 *   - core.excludesfile
-	 *
-	 * Each group contains multiple exclude lists, a single list
-	 * per source.
-	 */
+	struct dir_struct_internal {
+		/* Keeps track of allocation of `entries[]` array.*/
+		int alloc;
+
+		/* Keeps track of allocation of `ignored[]` array. */
+		int ignored_alloc;
+
+		/*
+		 * We maintain three groups of exclude pattern lists:
+		 *
+		 * EXC_CMDL lists patterns explicitly given on the command line.
+		 * EXC_DIRS lists patterns obtained from per-directory ignore
+		 *          files.
+		 * EXC_FILE lists patterns from fallback ignore files, e.g.
+		 *   - .git/info/exclude
+		 *   - core.excludesfile
+		 *
+		 * Each group contains multiple exclude lists, a single list
+		 * per source.
+		 */
 #define EXC_CMDL 0
 #define EXC_DIRS 1
 #define EXC_FILE 2
-	struct exclude_list_group exclude_list_group[3];
-
-	/*
-	 * Temporary variables which are used during loading of the
-	 * per-directory exclude lists.
-	 *
-	 * exclude_stack points to the top of the exclude_stack, and
-	 * basebuf contains the full path to the current
-	 * (sub)directory in the traversal. Exclude points to the
-	 * matching exclude struct if the directory is excluded.
-	 */
-	struct exclude_stack *exclude_stack;
-	struct path_pattern *pattern;
-	struct strbuf basebuf;
-
-	/* Enable untracked file cache if set */
-	struct untracked_cache *untracked;
-	struct oid_stat ss_info_exclude;
-	struct oid_stat ss_excludes_file;
-	unsigned unmanaged_exclude_files;
+		struct exclude_list_group exclude_list_group[3];
 
-	/* Stats about the traversal */
-	unsigned visited_paths;
-	unsigned visited_directories;
+		/*
+		 * Temporary variables which are used during loading of the
+		 * per-directory exclude lists.
+		 *
+		 * exclude_stack points to the top of the exclude_stack, and
+		 * basebuf contains the full path to the current
+		 * (sub)directory in the traversal. Exclude points to the
+		 * matching exclude struct if the directory is excluded.
+		 */
+		struct exclude_stack *exclude_stack;
+		struct path_pattern *pattern;
+		struct strbuf basebuf;
+
+		/* Additional metadata related to 'untracked' */
+		struct oid_stat ss_info_exclude;
+		struct oid_stat ss_excludes_file;
+		unsigned unmanaged_exclude_files;
+
+		/* Stats about the traversal */
+		unsigned visited_paths;
+		unsigned visited_directories;
+	} internal;
 };
 
 #define DIR_INIT { 0 }
-- 
gitgitgadget
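
The layout this patch arrives at can be illustrated outside of git. What follows is a minimal sketch, not git's code: `toy_dir`, `toy_dir_add()`, `toy_dir_clear()` and `TOY_DIR_INIT` are hypothetical names, and a plain `realloc()` doubling stands in for the ALLOC_GROW() macro. It only shows the idea of keeping bookkeeping in an embedded `internal` struct while a `{ 0 }` initializer still covers everything.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for dir_struct; NOT git's real definition. */
struct toy_dir {
	/* Public: callers may read these after filling. */
	int nr;
	char **entries;

	/* Private bookkeeping, grouped so outside use stands out in review. */
	struct toy_dir_internal {
		int alloc;              /* capacity of entries[] */
		unsigned visited_paths; /* traversal statistic */
	} internal;
};

/* Zero-initialization covers the nested struct as well. */
#define TOY_DIR_INIT { 0 }

/* Append a copy of name, growing entries[] when it is full. */
static const char *toy_dir_add(struct toy_dir *dir, const char *name)
{
	if (dir->nr + 1 > dir->internal.alloc) {
		int new_alloc = dir->internal.alloc ? dir->internal.alloc * 2 : 4;
		char **grown = realloc(dir->entries, new_alloc * sizeof(*grown));
		assert(grown);
		dir->entries = grown;
		dir->internal.alloc = new_alloc;
	}
	dir->internal.visited_paths++;
	dir->entries[dir->nr] = malloc(strlen(name) + 1);
	assert(dir->entries[dir->nr]);
	strcpy(dir->entries[dir->nr], name);
	return dir->entries[dir->nr++];
}

/* Free everything, then reset by copying a freshly initialized struct. */
static void toy_dir_clear(struct toy_dir *dir)
{
	struct toy_dir fresh = TOY_DIR_INIT;
	int i;

	for (i = 0; i < dir->nr; i++)
		free(dir->entries[i]);
	free(dir->entries);
	memcpy(dir, &fresh, sizeof(*dir));
}
```

A caller would declare `struct toy_dir d = TOY_DIR_INIT;`, call `toy_dir_add()` as often as needed, read `d.nr` and `d.entries`, and finish with `toy_dir_clear(&d)`, never touching `d.internal` directly.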



* [PATCH v3 04/13] dir: add a usage note to exclude_per_dir
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
                       ` (2 preceding siblings ...)
  2023-02-27 15:28     ` [PATCH v3 03/13] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 05/13] dir: mark output only fields of dir_struct as such Elijah Newren via GitGitGadget
                       ` (8 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

As evidenced by the fix a couple commits ago, places in the code using
exclude_per_dir are likely buggy and should be adapted to call
setup_standard_excludes() instead.  Unfortunately, the usage of
exclude_per_dir has been hardcoded into the arguments ls-files accepts,
so we cannot actually remove it.  Add a note that it is deprecated and
no other callers should use it directly.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/dir.h b/dir.h
index 33fd848fc8d..2196e12630c 100644
--- a/dir.h
+++ b/dir.h
@@ -295,8 +295,12 @@ struct dir_struct {
 	struct untracked_cache *untracked;
 
 	/**
-	 * The name of the file to be read in each directory for excluded files
-	 * (typically `.gitignore`).
+	 * Deprecated: ls-files is the only allowed caller; all other callers
+	 * should leave this as NULL; it pre-dated the
+	 * setup_standard_excludes() mechanism that replaces this.
+	 *
+	 * This field tracks the name of the file to be read in each directory
+	 * for excluded files (typically `.gitignore`).
 	 */
 	const char *exclude_per_dir;
 
-- 
gitgitgadget



* [PATCH v3 05/13] dir: mark output only fields of dir_struct as such
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
                       ` (3 preceding siblings ...)
  2023-02-27 15:28     ` [PATCH v3 04/13] dir: add a usage note to exclude_per_dir Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 06/13] unpack-trees: clean up some flow control Elijah Newren via GitGitGadget
                       ` (7 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

While at it, also group these fields together for convenience.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/dir.h b/dir.h
index 2196e12630c..e8106e1ecac 100644
--- a/dir.h
+++ b/dir.h
@@ -212,12 +212,6 @@ struct untracked_cache {
  */
 struct dir_struct {
 
-	/* The number of members in `entries[]` array. */
-	int nr;
-
-	/* The number of members in `ignored[]` array. */
-	int ignored_nr;
-
 	/* bit-field of options */
 	enum {
 
@@ -282,14 +276,20 @@ struct dir_struct {
 		DIR_SKIP_NESTED_GIT = 1<<9
 	} flags;
 
+	/* The number of members in `entries[]` array. */
+	int nr; /* output only */
+
+	/* The number of members in `ignored[]` array. */
+	int ignored_nr; /* output only */
+
 	/* An array of `struct dir_entry`, each element of which describes a path. */
-	struct dir_entry **entries;
+	struct dir_entry **entries; /* output only */
 
 	/**
 	 * used for ignored paths with the `DIR_SHOW_IGNORED_TOO` and
 	 * `DIR_COLLECT_IGNORED` flags.
 	 */
-	struct dir_entry **ignored;
+	struct dir_entry **ignored; /* output only */
 
 	/* Enable/update untracked file cache if set */
 	struct untracked_cache *untracked;
-- 
gitgitgadget
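
As a rough illustration of what "output only" means for a caller, here is a small sketch under stated assumptions: `toy_listing`, `toy_fill()`, `TOY_MAX` and `TOY_COLLECT_IGNORED` are hypothetical and do not exist in git, and the ".o means ignored" rule is invented purely for the example. Resetting the counters inside `toy_fill()` is a choice of this sketch, not a statement about read_directory().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8
#define TOY_COLLECT_IGNORED (1 << 0) /* hypothetical flag */

struct toy_listing {
	unsigned flags;               /* input: set by the caller */

	int nr;                       /* output only */
	int ignored_nr;               /* output only */
	const char *entries[TOY_MAX]; /* output only */
	const char *ignored[TOY_MAX]; /* output only */
};

/* Treat names ending in ".o" as ignored; everything else is an entry. */
static void toy_fill(struct toy_listing *l, const char **names, size_t n)
{
	size_t i;

	/* Outputs are overwritten here, so callers never preset them. */
	l->nr = 0;
	l->ignored_nr = 0;

	for (i = 0; i < n && i < TOY_MAX; i++) {
		size_t len = strlen(names[i]);
		int is_ignored = len >= 2 && !strcmp(names[i] + len - 2, ".o");

		if (!is_ignored)
			l->entries[l->nr++] = names[i];
		else if (l->flags & TOY_COLLECT_IGNORED)
			l->ignored[l->ignored_nr++] = names[i];
	}
}
```

The caller's side of the contract is then short: set `flags`, call `toy_fill()`, and only read `nr`, `ignored_nr`, `entries` and `ignored` afterwards.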



* [PATCH v3 06/13] unpack-trees: clean up some flow control
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
                       ` (4 preceding siblings ...)
  2023-02-27 15:28     ` [PATCH v3 05/13] dir: mark output only fields of dir_struct as such Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 07/13] sparse-checkout: avoid using internal API of unpack-trees Elijah Newren via GitGitGadget
                       ` (6 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The update_sparsity() function was introduced in commit 7af7a25853
("unpack-trees: add a new update_sparsity() function", 2020-03-27).
Prior to that, unpack_trees() was used, but that had a few bugs because
the needs of the caller were different, and different enough that
unpack_trees() could not easily be modified to handle both use cases.

update_sparsity() was written by copying unpack_trees(), streamlining
it, and then modifying it as needed.  That history still shows through
as leftover vestiges in both functions that are no longer needed.
Clean them up.  In particular:

  * update_sparsity() allows a pattern list to be passed in, but
    unpack_trees() should never use a different pattern list.  Add a
    check and a BUG() if this gets violated.
  * update_sparsity() has a check early on that will BUG() if
    o->skip_sparse_checkout is set; as such, there's no need to check
    for that condition again later in the code.  We can simply remove
    the check and its corresponding goto label.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 4518d33ed99..bad3120a76e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1873,6 +1873,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
 	if (o->dir)
 		BUG("o->dir is for internal use only");
+	if (o->pl)
+		BUG("o->pl is for internal use only");
 
 	trace_performance_enter();
 	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
@@ -1899,7 +1901,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 
 	if (!core_apply_sparse_checkout || !o->update)
 		o->skip_sparse_checkout = 1;
-	if (!o->skip_sparse_checkout && !o->pl) {
+	if (!o->skip_sparse_checkout) {
 		memset(&pl, 0, sizeof(pl));
 		free_pattern_list = 1;
 		populate_from_existing_patterns(o, &pl);
@@ -2113,8 +2115,6 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 		memset(&pl, 0, sizeof(pl));
 		free_pattern_list = 1;
 		populate_from_existing_patterns(o, &pl);
-		if (o->skip_sparse_checkout)
-			goto skip_sparse_checkout;
 	}
 
 	/* Expand sparse directories as needed */
@@ -2142,7 +2142,6 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 			ret = UPDATE_SPARSITY_WARNINGS;
 	}
 
-skip_sparse_checkout:
 	if (check_updates(o, o->src_index))
 		ret = UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES;
 
-- 
gitgitgadget



* [PATCH v3 07/13] sparse-checkout: avoid using internal API of unpack-trees
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
                       ` (5 preceding siblings ...)
  2023-02-27 15:28     ` [PATCH v3 06/13] unpack-trees: clean up some flow control Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 08/13] sparse-checkout: avoid using internal API of unpack-trees, take 2 Elijah Newren via GitGitGadget
                       ` (5 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

struct unpack_trees_options has the following field and comment:

	struct pattern_list *pl; /* for internal use */

Despite the internal-use comment, commit e091228e17 ("sparse-checkout:
update working directory in-process", 2019-11-21) started setting this
field from an external caller.  At the time, the only way around that
would have been to modify unpack_trees() to take an extra pattern_list
argument, and there are a lot of callers of that function.  However, when
we split update_sparsity() off as a separate function, with
sparse-checkout being the sole caller, the need to update other callers
went away.  Fix this API problem by adding a pattern_list argument to
update_sparsity() and stop setting the internal o.pl field directly.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/sparse-checkout.c |  3 +--
 unpack-trees.c            | 18 +++++++++++-------
 unpack-trees.h            |  3 ++-
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index c3738154918..4b7390ce367 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -219,14 +219,13 @@ static int update_working_directory(struct pattern_list *pl)
 	o.dst_index = r->index;
 	index_state_init(&o.result, r);
 	o.skip_sparse_checkout = 0;
-	o.pl = pl;
 
 	setup_work_tree();
 
 	repo_hold_locked_index(r, &lock_file, LOCK_DIE_ON_ERROR);
 
 	setup_unpack_trees_porcelain(&o, "sparse-checkout");
-	result = update_sparsity(&o);
+	result = update_sparsity(&o, pl);
 	clear_unpack_trees_porcelain(&o);
 
 	if (result == UPDATE_SPARSITY_WARNINGS)
diff --git a/unpack-trees.c b/unpack-trees.c
index bad3120a76e..6e4ca6fe800 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2091,10 +2091,10 @@ return_failed:
  *
  * CE_NEW_SKIP_WORKTREE is used internally.
  */
-enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
+enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
+					    struct pattern_list *pl)
 {
 	enum update_sparsity_result ret = UPDATE_SPARSITY_SUCCESS;
-	struct pattern_list pl;
 	int i;
 	unsigned old_show_all_errors;
 	int free_pattern_list = 0;
@@ -2111,11 +2111,12 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 	trace_performance_enter();
 
 	/* If we weren't given patterns, use the recorded ones */
-	if (!o->pl) {
-		memset(&pl, 0, sizeof(pl));
+	if (!pl) {
 		free_pattern_list = 1;
-		populate_from_existing_patterns(o, &pl);
+		pl = xcalloc(1, sizeof(*pl));
+		populate_from_existing_patterns(o, pl);
 	}
+	o->pl = pl;
 
 	/* Expand sparse directories as needed */
 	expand_index(o->src_index, o->pl);
@@ -2147,8 +2148,11 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 
 	display_warning_msgs(o);
 	o->show_all_errors = old_show_all_errors;
-	if (free_pattern_list)
-		clear_pattern_list(&pl);
+	if (free_pattern_list) {
+		clear_pattern_list(pl);
+		free(pl);
+		o->pl = NULL;
+	}
 	trace_performance_leave("update_sparsity");
 	return ret;
 }
diff --git a/unpack-trees.h b/unpack-trees.h
index 3a7b3e5f007..f3a6e4f90ef 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -112,7 +112,8 @@ enum update_sparsity_result {
 	UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES = -2
 };
 
-enum update_sparsity_result update_sparsity(struct unpack_trees_options *options);
+enum update_sparsity_result update_sparsity(struct unpack_trees_options *options,
+					    struct pattern_list *pl);
 
 int verify_uptodate(const struct cache_entry *ce,
 		    struct unpack_trees_options *o);
-- 
gitgitgadget



* [PATCH v3 08/13] sparse-checkout: avoid using internal API of unpack-trees, take 2
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
                       ` (6 preceding siblings ...)
  2023-02-27 15:28     ` [PATCH v3 07/13] sparse-checkout: avoid using internal API of unpack-trees Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 09/13] unpack_trees: start splitting internal fields from public API Elijah Newren via GitGitGadget
                       ` (4 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Commit 2f6b1eb794 ("cache API: add a "INDEX_STATE_INIT" macro/function,
add release_index()", 2023-01-12) mistakenly added some initialization
of a member of unpack_trees_options that was intended to be
internal-only.  This initialization should be done within
update_sparsity() instead.

Note that while o->result is mostly meant for unpack_trees() and
update_sparsity() mostly operates without o->result,
check_ok_to_remove() does consult it so we need to ensure it is properly
initialized.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/sparse-checkout.c | 1 -
 unpack-trees.c            | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 4b7390ce367..8d5ae6f2a60 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -217,7 +217,6 @@ static int update_working_directory(struct pattern_list *pl)
 	o.head_idx = -1;
 	o.src_index = r->index;
 	o.dst_index = r->index;
-	index_state_init(&o.result, r);
 	o.skip_sparse_checkout = 0;
 
 	setup_work_tree();
diff --git a/unpack-trees.c b/unpack-trees.c
index 6e4ca6fe800..c8dacd76c5f 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2101,6 +2101,7 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 
 	old_show_all_errors = o->show_all_errors;
 	o->show_all_errors = 1;
+	index_state_init(&o->result, o->src_index->repo);
 
 	/* Sanity checks */
 	if (!o->update || o->index_only || o->skip_sparse_checkout)
-- 
gitgitgadget



* [PATCH v3 09/13] unpack_trees: start splitting internal fields from public API
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
                       ` (7 preceding siblings ...)
  2023-02-27 15:28     ` [PATCH v3 08/13] sparse-checkout: avoid using internal API of unpack-trees, take 2 Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 10/13] unpack-trees: mark fields only used internally as internal Elijah Newren via GitGitGadget
                       ` (3 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

This just splits the two fields already marked as internal-only into a
separate internal struct.  Future commits will add more fields that
were meant to be internal-only but were not explicitly marked as such
to the same struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 40 ++++++++++++++++++++--------------------
 unpack-trees.h |  7 +++++--
 2 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index c8dacd76c5f..ecf89d5bfeb 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1809,7 +1809,7 @@ static void populate_from_existing_patterns(struct unpack_trees_options *o,
 	if (get_sparse_checkout_patterns(pl) < 0)
 		o->skip_sparse_checkout = 1;
 	else
-		o->pl = pl;
+		o->internal.pl = pl;
 }
 
 static void update_sparsity_for_prefix(const char *prefix,
@@ -1871,10 +1871,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 
 	if (len > MAX_UNPACK_TREES)
 		die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
-	if (o->dir)
-		BUG("o->dir is for internal use only");
-	if (o->pl)
-		BUG("o->pl is for internal use only");
+	if (o->internal.dir)
+		BUG("o->internal.dir is for internal use only");
+	if (o->internal.pl)
+		BUG("o->internal.pl is for internal use only");
 
 	trace_performance_enter();
 	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
@@ -1891,9 +1891,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		BUG("UNPACK_RESET_OVERWRITE_UNTRACKED incompatible with preserved ignored files");
 
 	if (!o->preserve_ignored) {
-		o->dir = &dir;
-		o->dir->flags |= DIR_SHOW_IGNORED;
-		setup_standard_excludes(o->dir);
+		o->internal.dir = &dir;
+		o->internal.dir->flags |= DIR_SHOW_IGNORED;
+		setup_standard_excludes(o->internal.dir);
 	}
 
 	if (o->prefix)
@@ -1943,7 +1943,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	 * Sparse checkout loop #1: set NEW_SKIP_WORKTREE on existing entries
 	 */
 	if (!o->skip_sparse_checkout)
-		mark_new_skip_worktree(o->pl, o->src_index, 0,
+		mark_new_skip_worktree(o->internal.pl, o->src_index, 0,
 				       CE_NEW_SKIP_WORKTREE, o->verbose_update);
 
 	if (!dfc)
@@ -2009,7 +2009,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		 * If they will have NEW_SKIP_WORKTREE, also set CE_SKIP_WORKTREE
 		 * so apply_sparse_checkout() won't attempt to remove it from worktree
 		 */
-		mark_new_skip_worktree(o->pl, &o->result,
+		mark_new_skip_worktree(o->internal.pl, &o->result,
 				       CE_ADDED, CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE,
 				       o->verbose_update);
 
@@ -2067,9 +2067,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 done:
 	if (free_pattern_list)
 		clear_pattern_list(&pl);
-	if (o->dir) {
-		dir_clear(o->dir);
-		o->dir = NULL;
+	if (o->internal.dir) {
+		dir_clear(o->internal.dir);
+		o->internal.dir = NULL;
 	}
 	trace2_region_leave("unpack_trees", "unpack_trees", the_repository);
 	trace_performance_leave("unpack_trees");
@@ -2117,14 +2117,14 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 		pl = xcalloc(1, sizeof(*pl));
 		populate_from_existing_patterns(o, pl);
 	}
-	o->pl = pl;
+	o->internal.pl = pl;
 
 	/* Expand sparse directories as needed */
-	expand_index(o->src_index, o->pl);
+	expand_index(o->src_index, o->internal.pl);
 
 	/* Set NEW_SKIP_WORKTREE on existing entries. */
 	mark_all_ce_unused(o->src_index);
-	mark_new_skip_worktree(o->pl, o->src_index, 0,
+	mark_new_skip_worktree(o->internal.pl, o->src_index, 0,
 			       CE_NEW_SKIP_WORKTREE, o->verbose_update);
 
 	/* Then loop over entries and update/remove as needed */
@@ -2152,7 +2152,7 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 	if (free_pattern_list) {
 		clear_pattern_list(pl);
 		free(pl);
-		o->pl = NULL;
+		o->internal.pl = NULL;
 	}
 	trace_performance_leave("update_sparsity");
 	return ret;
@@ -2340,7 +2340,7 @@ static int verify_clean_subdirectory(const struct cache_entry *ce,
 	pathbuf = xstrfmt("%.*s/", namelen, ce->name);
 
 	memset(&d, 0, sizeof(d));
-	if (o->dir)
+	if (o->internal.dir)
 		setup_standard_excludes(&d);
 	i = read_directory(&d, o->src_index, pathbuf, namelen+1, NULL);
 	dir_clear(&d);
@@ -2395,8 +2395,8 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 	if (ignore_case && icase_exists(o, name, len, st))
 		return 0;
 
-	if (o->dir &&
-	    is_excluded(o->dir, o->src_index, name, &dtype))
+	if (o->internal.dir &&
+	    is_excluded(o->internal.dir, o->src_index, name, &dtype))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
 		 * overwrite it.
diff --git a/unpack-trees.h b/unpack-trees.h
index f3a6e4f90ef..5c1a9314a06 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -97,9 +97,12 @@ struct unpack_trees_options {
 	struct index_state *src_index;
 	struct index_state result;
 
-	struct pattern_list *pl; /* for internal use */
-	struct dir_struct *dir; /* for internal use only */
 	struct checkout_metadata meta;
+
+	struct unpack_trees_options_internal {
+		struct pattern_list *pl;
+		struct dir_struct *dir;
+	} internal;
 };
 
 int unpack_trees(unsigned n, struct tree_desc *t,
-- 
gitgitgadget



* [PATCH v3 10/13] unpack-trees: mark fields only used internally as internal
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
                       ` (8 preceding siblings ...)
  2023-02-27 15:28     ` [PATCH v3 09/13] unpack_trees: start splitting internal fields from public API Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 11/13] unpack-trees: rewrap a few overlong lines from previous patch Elijah Newren via GitGitGadget
                       ` (2 subsequent siblings)
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Continue the work from the previous patch by finding additional fields
which are only used internally but not yet explicitly marked as such,
and include them in the internal fields struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 159 +++++++++++++++++++++++++------------------------
 unpack-trees.h |  26 ++++----
 2 files changed, 95 insertions(+), 90 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index ecf89d5bfeb..dd4b55ef49e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -66,8 +66,8 @@ static const char *unpack_plumbing_errors[NB_UNPACK_TREES_WARNING_TYPES] = {
 };
 
 #define ERRORMSG(o,type) \
-	( ((o) && (o)->msgs[(type)]) \
-	  ? ((o)->msgs[(type)])      \
+	( ((o) && (o)->internal.msgs[(type)]) \
+	  ? ((o)->internal.msgs[(type)])      \
 	  : (unpack_plumbing_errors[(type)]) )
 
 static const char *super_prefixed(const char *path, const char *super_prefix)
@@ -108,10 +108,10 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 				  const char *cmd)
 {
 	int i;
-	const char **msgs = opts->msgs;
+	const char **msgs = opts->internal.msgs;
 	const char *msg;
 
-	strvec_init(&opts->msgs_to_free);
+	strvec_init(&opts->internal.msgs_to_free);
 
 	if (!strcmp(cmd, "checkout"))
 		msg = advice_enabled(ADVICE_COMMIT_BEFORE_MERGE)
@@ -129,7 +129,7 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 			  "Please commit your changes or stash them before you %s.")
 		      : _("Your local changes to the following files would be overwritten by %s:\n%%s");
 	msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] =
-		strvec_pushf(&opts->msgs_to_free, msg, cmd, cmd);
+		strvec_pushf(&opts->internal.msgs_to_free, msg, cmd, cmd);
 
 	msgs[ERROR_NOT_UPTODATE_DIR] =
 		_("Updating the following directories would lose untracked files in them:\n%s");
@@ -153,7 +153,7 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 			  "Please move or remove them before you %s.")
 		      : _("The following untracked working tree files would be removed by %s:\n%%s");
 	msgs[ERROR_WOULD_LOSE_UNTRACKED_REMOVED] =
-		strvec_pushf(&opts->msgs_to_free, msg, cmd, cmd);
+		strvec_pushf(&opts->internal.msgs_to_free, msg, cmd, cmd);
 
 	if (!strcmp(cmd, "checkout"))
 		msg = advice_enabled(ADVICE_COMMIT_BEFORE_MERGE)
@@ -171,7 +171,7 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 			  "Please move or remove them before you %s.")
 		      : _("The following untracked working tree files would be overwritten by %s:\n%%s");
 	msgs[ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN] =
-		strvec_pushf(&opts->msgs_to_free, msg, cmd, cmd);
+		strvec_pushf(&opts->internal.msgs_to_free, msg, cmd, cmd);
 
 	/*
 	 * Special case: ERROR_BIND_OVERLAP refers to a pair of paths, we
@@ -189,16 +189,16 @@ void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
 	msgs[WARNING_SPARSE_ORPHANED_NOT_OVERWRITTEN] =
 		_("The following paths were already present and thus not updated despite sparse patterns:\n%s");
 
-	opts->show_all_errors = 1;
+	opts->internal.show_all_errors = 1;
 	/* rejected paths may not have a static buffer */
-	for (i = 0; i < ARRAY_SIZE(opts->unpack_rejects); i++)
-		opts->unpack_rejects[i].strdup_strings = 1;
+	for (i = 0; i < ARRAY_SIZE(opts->internal.unpack_rejects); i++)
+		opts->internal.unpack_rejects[i].strdup_strings = 1;
 }
 
 void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
 {
-	strvec_clear(&opts->msgs_to_free);
-	memset(opts->msgs, 0, sizeof(opts->msgs));
+	strvec_clear(&opts->internal.msgs_to_free);
+	memset(opts->internal.msgs, 0, sizeof(opts->internal.msgs));
 }
 
 static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
@@ -210,7 +210,7 @@ static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
 		set |= CE_WT_REMOVE;
 
 	ce->ce_flags = (ce->ce_flags & ~clear) | set;
-	return add_index_entry(&o->result, ce,
+	return add_index_entry(&o->internal.result, ce,
 			       ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE);
 }
 
@@ -218,7 +218,7 @@ static void add_entry(struct unpack_trees_options *o,
 		      const struct cache_entry *ce,
 		      unsigned int set, unsigned int clear)
 {
-	do_add_entry(o, dup_cache_entry(ce, &o->result), set, clear);
+	do_add_entry(o, dup_cache_entry(ce, &o->internal.result), set, clear);
 }
 
 /*
@@ -233,7 +233,7 @@ static int add_rejected_path(struct unpack_trees_options *o,
 	if (o->quiet)
 		return -1;
 
-	if (!o->show_all_errors)
+	if (!o->internal.show_all_errors)
 		return error(ERRORMSG(o, e), super_prefixed(path,
 							    o->super_prefix));
 
@@ -241,7 +241,7 @@ static int add_rejected_path(struct unpack_trees_options *o,
 	 * Otherwise, insert in a list for future display by
 	 * display_(error|warning)_msgs()
 	 */
-	string_list_append(&o->unpack_rejects[e], path);
+	string_list_append(&o->internal.unpack_rejects[e], path);
 	return -1;
 }
 
@@ -253,7 +253,7 @@ static void display_error_msgs(struct unpack_trees_options *o)
 	int e;
 	unsigned error_displayed = 0;
 	for (e = 0; e < NB_UNPACK_TREES_ERROR_TYPES; e++) {
-		struct string_list *rejects = &o->unpack_rejects[e];
+		struct string_list *rejects = &o->internal.unpack_rejects[e];
 
 		if (rejects->nr > 0) {
 			int i;
@@ -281,7 +281,7 @@ static void display_warning_msgs(struct unpack_trees_options *o)
 	unsigned warning_displayed = 0;
 	for (e = NB_UNPACK_TREES_ERROR_TYPES + 1;
 	     e < NB_UNPACK_TREES_WARNING_TYPES; e++) {
-		struct string_list *rejects = &o->unpack_rejects[e];
+		struct string_list *rejects = &o->internal.unpack_rejects[e];
 
 		if (rejects->nr > 0) {
 			int i;
@@ -600,13 +600,14 @@ static void mark_ce_used(struct cache_entry *ce, struct unpack_trees_options *o)
 {
 	ce->ce_flags |= CE_UNPACKED;
 
-	if (o->cache_bottom < o->src_index->cache_nr &&
-	    o->src_index->cache[o->cache_bottom] == ce) {
-		int bottom = o->cache_bottom;
+	if (o->internal.cache_bottom < o->src_index->cache_nr &&
+	    o->src_index->cache[o->internal.cache_bottom] == ce) {
+		int bottom = o->internal.cache_bottom;
+
 		while (bottom < o->src_index->cache_nr &&
 		       o->src_index->cache[bottom]->ce_flags & CE_UNPACKED)
 			bottom++;
-		o->cache_bottom = bottom;
+		o->internal.cache_bottom = bottom;
 	}
 }
 
@@ -652,7 +653,7 @@ static void mark_ce_used_same_name(struct cache_entry *ce,
 static struct cache_entry *next_cache_entry(struct unpack_trees_options *o)
 {
 	const struct index_state *index = o->src_index;
-	int pos = o->cache_bottom;
+	int pos = o->internal.cache_bottom;
 
 	while (pos < index->cache_nr) {
 		struct cache_entry *ce = index->cache[pos];
@@ -711,7 +712,7 @@ static void restore_cache_bottom(struct traverse_info *info, int bottom)
 
 	if (o->diff_index_cached)
 		return;
-	o->cache_bottom = bottom;
+	o->internal.cache_bottom = bottom;
 }
 
 static int switch_cache_bottom(struct traverse_info *info)
@@ -721,13 +722,13 @@ static int switch_cache_bottom(struct traverse_info *info)
 
 	if (o->diff_index_cached)
 		return 0;
-	ret = o->cache_bottom;
+	ret = o->internal.cache_bottom;
 	pos = find_cache_pos(info->prev, info->name, info->namelen);
 
 	if (pos < -1)
-		o->cache_bottom = -2 - pos;
+		o->internal.cache_bottom = -2 - pos;
 	else if (pos < 0)
-		o->cache_bottom = o->src_index->cache_nr;
+		o->internal.cache_bottom = o->src_index->cache_nr;
 	return ret;
 }
 
@@ -873,9 +874,9 @@ static int traverse_trees_recursive(int n, unsigned long dirmask,
 		 * save and restore cache_bottom anyway to not miss
 		 * unprocessed entries before 'pos'.
 		 */
-		bottom = o->cache_bottom;
+		bottom = o->internal.cache_bottom;
 		ret = traverse_by_cache_tree(pos, nr_entries, n, info);
-		o->cache_bottom = bottom;
+		o->internal.cache_bottom = bottom;
 		return ret;
 	}
 
@@ -1212,7 +1213,7 @@ static int unpack_single_entry(int n, unsigned long mask,
 		 * cache entry from the index aware logic.
 		 */
 		src[i + o->merge] = create_ce_entry(info, names + i, stage,
-						    &o->result, o->merge,
+						    &o->internal.result, o->merge,
 						    bit & dirmask);
 	}
 
@@ -1237,7 +1238,7 @@ static int unpack_single_entry(int n, unsigned long mask,
 
 static int unpack_failed(struct unpack_trees_options *o, const char *message)
 {
-	discard_index(&o->result);
+	discard_index(&o->internal.result);
 	if (!o->quiet && !o->exiting_early) {
 		if (message)
 			return error("%s", message);
@@ -1260,7 +1261,7 @@ static int find_cache_pos(struct traverse_info *info,
 	struct index_state *index = o->src_index;
 	int pfxlen = info->pathlen;
 
-	for (pos = o->cache_bottom; pos < index->cache_nr; pos++) {
+	for (pos = o->internal.cache_bottom; pos < index->cache_nr; pos++) {
 		const struct cache_entry *ce = index->cache[pos];
 		const char *ce_name, *ce_slash;
 		int cmp, ce_len;
@@ -1271,8 +1272,8 @@ static int find_cache_pos(struct traverse_info *info,
 			 * we can never match it; don't check it
 			 * again.
 			 */
-			if (pos == o->cache_bottom)
-				++o->cache_bottom;
+			if (pos == o->internal.cache_bottom)
+				++o->internal.cache_bottom;
 			continue;
 		}
 		if (!ce_in_traverse_path(ce, info)) {
@@ -1450,7 +1451,7 @@ static int unpack_sparse_callback(int n, unsigned long mask, unsigned long dirma
 	 */
 	if (!is_null_oid(&names[0].oid)) {
 		src[0] = create_ce_entry(info, &names[0], 0,
-					&o->result, 1,
+					&o->internal.result, 1,
 					dirmask & (1ul << 0));
 		src[0]->ce_flags |= (CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);
 	}
@@ -1560,7 +1561,7 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 				 * in 'mark_ce_used()'
 				 */
 				if (!src[0] || !S_ISSPARSEDIR(src[0]->ce_mode))
-					o->cache_bottom += matches;
+					o->internal.cache_bottom += matches;
 				return mask;
 			}
 		}
@@ -1907,37 +1908,37 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		populate_from_existing_patterns(o, &pl);
 	}
 
-	index_state_init(&o->result, o->src_index->repo);
-	o->result.initialized = 1;
-	o->result.timestamp.sec = o->src_index->timestamp.sec;
-	o->result.timestamp.nsec = o->src_index->timestamp.nsec;
-	o->result.version = o->src_index->version;
+	index_state_init(&o->internal.result, o->src_index->repo);
+	o->internal.result.initialized = 1;
+	o->internal.result.timestamp.sec = o->src_index->timestamp.sec;
+	o->internal.result.timestamp.nsec = o->src_index->timestamp.nsec;
+	o->internal.result.version = o->src_index->version;
 	if (!o->src_index->split_index) {
-		o->result.split_index = NULL;
+		o->internal.result.split_index = NULL;
 	} else if (o->src_index == o->dst_index) {
 		/*
 		 * o->dst_index (and thus o->src_index) will be discarded
-		 * and overwritten with o->result at the end of this function,
+		 * and overwritten with o->internal.result at the end of this function,
 		 * so just use src_index's split_index to avoid having to
 		 * create a new one.
 		 */
-		o->result.split_index = o->src_index->split_index;
-		o->result.split_index->refcount++;
+		o->internal.result.split_index = o->src_index->split_index;
+		o->internal.result.split_index->refcount++;
 	} else {
-		o->result.split_index = init_split_index(&o->result);
+		o->internal.result.split_index = init_split_index(&o->internal.result);
 	}
-	oidcpy(&o->result.oid, &o->src_index->oid);
+	oidcpy(&o->internal.result.oid, &o->src_index->oid);
 	o->merge_size = len;
 	mark_all_ce_unused(o->src_index);
 
-	o->result.fsmonitor_last_update =
+	o->internal.result.fsmonitor_last_update =
 		xstrdup_or_null(o->src_index->fsmonitor_last_update);
-	o->result.fsmonitor_has_run_once = o->src_index->fsmonitor_has_run_once;
+	o->internal.result.fsmonitor_has_run_once = o->src_index->fsmonitor_has_run_once;
 
 	if (!o->src_index->initialized &&
 	    !repo->settings.command_requires_full_index &&
-	    is_sparse_index_allowed(&o->result, 0))
-		o->result.sparse_index = 1;
+	    is_sparse_index_allowed(&o->internal.result, 0))
+		o->internal.result.sparse_index = 1;
 
 	/*
 	 * Sparse checkout loop #1: set NEW_SKIP_WORKTREE on existing entries
@@ -1957,7 +1958,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		setup_traverse_info(&info, prefix);
 		info.fn = unpack_callback;
 		info.data = o;
-		info.show_all_errors = o->show_all_errors;
+		info.show_all_errors = o->internal.show_all_errors;
 		info.pathspec = o->pathspec;
 
 		if (o->prefix) {
@@ -1998,7 +1999,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	}
 	mark_all_ce_unused(o->src_index);
 
-	if (o->trivial_merges_only && o->nontrivial_merge) {
+	if (o->trivial_merges_only && o->internal.nontrivial_merge) {
 		ret = unpack_failed(o, "Merge requires file-level merging");
 		goto done;
 	}
@@ -2009,13 +2010,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		 * If they will have NEW_SKIP_WORKTREE, also set CE_SKIP_WORKTREE
 		 * so apply_sparse_checkout() won't attempt to remove it from worktree
 		 */
-		mark_new_skip_worktree(o->internal.pl, &o->result,
+		mark_new_skip_worktree(o->internal.pl, &o->internal.result,
 				       CE_ADDED, CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE,
 				       o->verbose_update);
 
 		ret = 0;
-		for (i = 0; i < o->result.cache_nr; i++) {
-			struct cache_entry *ce = o->result.cache[i];
+		for (i = 0; i < o->internal.result.cache_nr; i++) {
+			struct cache_entry *ce = o->internal.result.cache[i];
 
 			/*
 			 * Entries marked with CE_ADDED in merged_entry() do not have
@@ -2029,7 +2030,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			    verify_absent(ce, WARNING_SPARSE_ORPHANED_NOT_OVERWRITTEN, o))
 				ret = 1;
 
-			if (apply_sparse_checkout(&o->result, ce, o))
+			if (apply_sparse_checkout(&o->internal.result, ce, o))
 				ret = 1;
 		}
 		if (ret == 1) {
@@ -2037,30 +2038,30 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			 * Inability to sparsify or de-sparsify individual
 			 * paths is not an error, but just a warning.
 			 */
-			if (o->show_all_errors)
+			if (o->internal.show_all_errors)
 				display_warning_msgs(o);
 			ret = 0;
 		}
 	}
 
-	ret = check_updates(o, &o->result) ? (-2) : 0;
+	ret = check_updates(o, &o->internal.result) ? (-2) : 0;
 	if (o->dst_index) {
-		move_index_extensions(&o->result, o->src_index);
+		move_index_extensions(&o->internal.result, o->src_index);
 		if (!ret) {
 			if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
-				cache_tree_verify(the_repository, &o->result);
+				cache_tree_verify(the_repository, &o->internal.result);
 			if (!o->skip_cache_tree_update &&
-			    !cache_tree_fully_valid(o->result.cache_tree))
-				cache_tree_update(&o->result,
+			    !cache_tree_fully_valid(o->internal.result.cache_tree))
+				cache_tree_update(&o->internal.result,
 						  WRITE_TREE_SILENT |
 						  WRITE_TREE_REPAIR);
 		}
 
-		o->result.updated_workdir = 1;
+		o->internal.result.updated_workdir = 1;
 		discard_index(o->dst_index);
-		*o->dst_index = o->result;
+		*o->dst_index = o->internal.result;
 	} else {
-		discard_index(&o->result);
+		discard_index(&o->internal.result);
 	}
 	o->src_index = NULL;
 
@@ -2076,7 +2077,7 @@ done:
 	return ret;
 
 return_failed:
-	if (o->show_all_errors)
+	if (o->internal.show_all_errors)
 		display_error_msgs(o);
 	mark_all_ce_unused(o->src_index);
 	ret = unpack_failed(o, NULL);
@@ -2099,9 +2100,9 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 	unsigned old_show_all_errors;
 	int free_pattern_list = 0;
 
-	old_show_all_errors = o->show_all_errors;
-	o->show_all_errors = 1;
-	index_state_init(&o->result, o->src_index->repo);
+	old_show_all_errors = o->internal.show_all_errors;
+	o->internal.show_all_errors = 1;
+	index_state_init(&o->internal.result, o->src_index->repo);
 
 	/* Sanity checks */
 	if (!o->update || o->index_only || o->skip_sparse_checkout)
@@ -2148,7 +2149,7 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
 		ret = UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES;
 
 	display_warning_msgs(o);
-	o->show_all_errors = old_show_all_errors;
+	o->internal.show_all_errors = old_show_all_errors;
 	if (free_pattern_list) {
 		clear_pattern_list(pl);
 		free(pl);
@@ -2248,15 +2249,15 @@ static int verify_uptodate_sparse(const struct cache_entry *ce,
 }
 
 /*
- * TODO: We should actually invalidate o->result, not src_index [1].
+ * TODO: We should actually invalidate o->internal.result, not src_index [1].
  * But since cache tree and untracked cache both are not copied to
- * o->result until unpacking is complete, we invalidate them on
+ * o->internal.result until unpacking is complete, we invalidate them on
  * src_index instead with the assumption that they will be copied to
  * dst_index at the end.
  *
  * [1] src_index->cache_tree is also used in unpack_callback() so if
- * we invalidate o->result, we need to update it to use
- * o->result.cache_tree as well.
+ * we invalidate o->internal.result, we need to update it to use
+ * o->internal.result.cache_tree as well.
  */
 static void invalidate_ce_path(const struct cache_entry *ce,
 			       struct unpack_trees_options *o)
@@ -2424,7 +2425,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 	 * delete this path, which is in a subdirectory that
 	 * is being replaced with a blob.
 	 */
-	result = index_file_exists(&o->result, name, len, 0);
+	result = index_file_exists(&o->internal.result, name, len, 0);
 	if (result) {
 		if (result->ce_flags & CE_REMOVE)
 			return 0;
@@ -2525,7 +2526,7 @@ static int merged_entry(const struct cache_entry *ce,
 			struct unpack_trees_options *o)
 {
 	int update = CE_UPDATE;
-	struct cache_entry *merge = dup_cache_entry(ce, &o->result);
+	struct cache_entry *merge = dup_cache_entry(ce, &o->internal.result);
 
 	if (!old) {
 		/*
@@ -2620,7 +2621,7 @@ static int merged_sparse_dir(const struct cache_entry * const *src, int n,
 	setup_traverse_info(&info, src[0]->name);
 	info.fn = unpack_sparse_callback;
 	info.data = o;
-	info.show_all_errors = o->show_all_errors;
+	info.show_all_errors = o->internal.show_all_errors;
 	info.pathspec = o->pathspec;
 
 	/* Get the tree descriptors of the sparse directory in each of the merging trees */
@@ -2838,7 +2839,7 @@ int threeway_merge(const struct cache_entry * const *stages,
 			return -1;
 	}
 
-	o->nontrivial_merge = 1;
+	o->internal.nontrivial_merge = 1;
 
 	/* #2, #3, #4, #6, #7, #9, #10, #11. */
 	count = 0;
diff --git a/unpack-trees.h b/unpack-trees.h
index 5c1a9314a06..0335c89bc75 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -59,7 +59,6 @@ struct unpack_trees_options {
 		     preserve_ignored,
 		     clone,
 		     index_only,
-		     nontrivial_merge,
 		     trivial_merges_only,
 		     verbose_update,
 		     aggressive,
@@ -70,22 +69,13 @@ struct unpack_trees_options {
 		     skip_sparse_checkout,
 		     quiet,
 		     exiting_early,
-		     show_all_errors,
 		     dry_run,
 		     skip_cache_tree_update;
 	enum unpack_trees_reset_type reset;
 	const char *prefix;
 	const char *super_prefix;
-	int cache_bottom;
 	struct pathspec *pathspec;
 	merge_fn_t fn;
-	const char *msgs[NB_UNPACK_TREES_WARNING_TYPES];
-	struct strvec msgs_to_free;
-	/*
-	 * Store error messages in an array, each case
-	 * corresponding to a error message type
-	 */
-	struct string_list unpack_rejects[NB_UNPACK_TREES_WARNING_TYPES];
 
 	int head_idx;
 	int merge_size;
@@ -95,11 +85,25 @@ struct unpack_trees_options {
 
 	struct index_state *dst_index;
 	struct index_state *src_index;
-	struct index_state result;
 
 	struct checkout_metadata meta;
 
 	struct unpack_trees_options_internal {
+		unsigned int nontrivial_merge,
+			     show_all_errors;
+
+		int cache_bottom;
+		const char *msgs[NB_UNPACK_TREES_WARNING_TYPES];
+		struct strvec msgs_to_free;
+
+		/*
+		 * Store error messages in an array, each case
+		 * corresponding to a error message type
+		 */
+		struct string_list unpack_rejects[NB_UNPACK_TREES_WARNING_TYPES];
+
+		struct index_state result;
+
 		struct pattern_list *pl;
 		struct dir_struct *dir;
 	} internal;
-- 
gitgitgadget
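
For readers skimming the diff above, the pattern it applies can be
sketched in isolation.  This is only a minimal illustration, not code
from the series: struct sketch_options and sketch_advance() are
hypothetical names, and the two internal fields merely stand in for the
ones moved by the patch.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct unpack_trees_options. */
struct sketch_options {
	/* Public: callers may set these before invoking the API. */
	unsigned int quiet;
	const char *prefix;

	/* Internal: owned by the implementation; callers leave it zeroed. */
	struct sketch_options_internal {
		unsigned int show_all_errors;
		int cache_bottom;
	} internal;
};

/*
 * Implementation-side helper (hypothetical): every access to private
 * bookkeeping is spelled o->internal.<field>, so it is obvious at the
 * call site that the field is not part of the public API.
 */
int sketch_advance(struct sketch_options *o, int nr)
{
	if (o->internal.cache_bottom < nr)
		o->internal.cache_bottom++;
	return o->internal.cache_bottom;
}
```

A caller that initializes only the public members with a designated
initializer gets a zeroed internal struct for free, which is presumably
part of the appeal of grouping the fields this way.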



* [PATCH v3 11/13] unpack-trees: rewrap a few overlong lines from previous patch
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
                       ` (9 preceding siblings ...)
  2023-02-27 15:28     ` [PATCH v3 10/13] unpack-trees: mark fields only used internally as internal Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 12/13] unpack-trees: special case read-tree debugging as internal usage Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 13/13] unpack-trees: add usage notices around df_conflict_entry Elijah Newren via GitGitGadget
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The previous patch made many lines a little longer, resulting in four
becoming a bit too long.  They were left as-is for the previous patch
to facilitate reviewers verifying that we were just adding "internal."
in a bunch of places, but rewrap them now.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index dd4b55ef49e..cac5dd0da37 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1213,8 +1213,8 @@ static int unpack_single_entry(int n, unsigned long mask,
 		 * cache entry from the index aware logic.
 		 */
 		src[i + o->merge] = create_ce_entry(info, names + i, stage,
-						    &o->internal.result, o->merge,
-						    bit & dirmask);
+						    &o->internal.result,
+						    o->merge, bit & dirmask);
 	}
 
 	if (o->merge) {
@@ -1918,14 +1918,15 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	} else if (o->src_index == o->dst_index) {
 		/*
 		 * o->dst_index (and thus o->src_index) will be discarded
-		 * and overwritten with o->internal.result at the end of this function,
-		 * so just use src_index's split_index to avoid having to
-		 * create a new one.
+		 * and overwritten with o->internal.result at the end of
+		 * this function, so just use src_index's split_index to
+		 * avoid having to create a new one.
 		 */
 		o->internal.result.split_index = o->src_index->split_index;
 		o->internal.result.split_index->refcount++;
 	} else {
-		o->internal.result.split_index = init_split_index(&o->internal.result);
+		o->internal.result.split_index =
+			init_split_index(&o->internal.result);
 	}
 	oidcpy(&o->internal.result.oid, &o->src_index->oid);
 	o->merge_size = len;
@@ -2049,7 +2050,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		move_index_extensions(&o->internal.result, o->src_index);
 		if (!ret) {
 			if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
-				cache_tree_verify(the_repository, &o->internal.result);
+				cache_tree_verify(the_repository,
+						  &o->internal.result);
 			if (!o->skip_cache_tree_update &&
 			    !cache_tree_fully_valid(o->internal.result.cache_tree))
 				cache_tree_update(&o->internal.result,
-- 
gitgitgadget



* [PATCH v3 12/13] unpack-trees: special case read-tree debugging as internal usage
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
                       ` (10 preceding siblings ...)
  2023-02-27 15:28     ` [PATCH v3 11/13] unpack-trees: rewrap a few overlong lines from previous patch Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  2023-02-27 15:28     ` [PATCH v3 13/13] unpack-trees: add usage notices around df_conflict_entry Elijah Newren via GitGitGadget
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

builtin/read-tree.c has some special functionality explicitly designed
for debugging unpack-trees.[ch].  Associated with that are two fields
that no other external caller would or should use.  Mark these as
internal to unpack-trees, but allow builtin/read-tree to read or write
them for this special case.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/read-tree.c | 10 +++++-----
 unpack-trees.c      | 22 +++++++++++-----------
 unpack-trees.h      |  6 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 3ce75417833..6034408d486 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -87,9 +87,9 @@ static int debug_merge(const struct cache_entry * const *stages,
 {
 	int i;
 
-	printf("* %d-way merge\n", o->merge_size);
+	printf("* %d-way merge\n", o->internal.merge_size);
 	debug_stage("index", stages[0], o);
-	for (i = 1; i <= o->merge_size; i++) {
+	for (i = 1; i <= o->internal.merge_size; i++) {
 		char buf[24];
 		xsnprintf(buf, sizeof(buf), "ent#%d", i);
 		debug_stage(buf, stages[i], o);
@@ -144,7 +144,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 		OPT__DRY_RUN(&opts.dry_run, N_("don't update the index or the work tree")),
 		OPT_BOOL(0, "no-sparse-checkout", &opts.skip_sparse_checkout,
 			 N_("skip applying sparse checkout filter")),
-		OPT_BOOL(0, "debug-unpack", &opts.debug_unpack,
+		OPT_BOOL(0, "debug-unpack", &opts.internal.debug_unpack,
 			 N_("debug unpack-trees")),
 		OPT_CALLBACK_F(0, "recurse-submodules", NULL,
 			    "checkout", "control recursive updating of submodules",
@@ -247,7 +247,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 			opts.head_idx = 1;
 	}
 
-	if (opts.debug_unpack)
+	if (opts.internal.debug_unpack)
 		opts.fn = debug_merge;
 
 	/* If we're going to prime_cache_tree later, skip cache tree update */
@@ -263,7 +263,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 	if (unpack_trees(nr_trees, t, &opts))
 		return 128;
 
-	if (opts.debug_unpack || opts.dry_run)
+	if (opts.internal.debug_unpack || opts.dry_run)
 		return 0; /* do not write the index out */
 
 	/*
diff --git a/unpack-trees.c b/unpack-trees.c
index cac5dd0da37..3e5f4bd2355 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -839,7 +839,7 @@ static int traverse_by_cache_tree(int pos, int nr_entries, int nr_names,
 		mark_ce_used(src[0], o);
 	}
 	free(tree_ce);
-	if (o->debug_unpack)
+	if (o->internal.debug_unpack)
 		printf("Unpacked %d entries from %s to %s using cache-tree\n",
 		       nr_entries,
 		       o->src_index->cache[pos]->name,
@@ -1488,7 +1488,7 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 	while (!p->mode)
 		p++;
 
-	if (o->debug_unpack)
+	if (o->internal.debug_unpack)
 		debug_unpack_callback(n, mask, dirmask, names, info);
 
 	/* Are we supposed to look at the index too? */
@@ -1929,7 +1929,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			init_split_index(&o->internal.result);
 	}
 	oidcpy(&o->internal.result.oid, &o->src_index->oid);
-	o->merge_size = len;
+	o->internal.merge_size = len;
 	mark_all_ce_unused(o->src_index);
 
 	o->internal.result.fsmonitor_last_update =
@@ -2882,9 +2882,9 @@ int twoway_merge(const struct cache_entry * const *src,
 	const struct cache_entry *oldtree = src[1];
 	const struct cache_entry *newtree = src[2];
 
-	if (o->merge_size != 2)
+	if (o->internal.merge_size != 2)
 		return error("Cannot do a twoway merge of %d trees",
-			     o->merge_size);
+			     o->internal.merge_size);
 
 	if (oldtree == o->df_conflict_entry)
 		oldtree = NULL;
@@ -2964,9 +2964,9 @@ int bind_merge(const struct cache_entry * const *src,
 	const struct cache_entry *old = src[0];
 	const struct cache_entry *a = src[1];
 
-	if (o->merge_size != 1)
+	if (o->internal.merge_size != 1)
 		return error("Cannot do a bind merge of %d trees",
-			     o->merge_size);
+			     o->internal.merge_size);
 	if (a && old)
 		return o->quiet ? -1 :
 			error(ERRORMSG(o, ERROR_BIND_OVERLAP),
@@ -2990,9 +2990,9 @@ int oneway_merge(const struct cache_entry * const *src,
 	const struct cache_entry *old = src[0];
 	const struct cache_entry *a = src[1];
 
-	if (o->merge_size != 1)
+	if (o->internal.merge_size != 1)
 		return error("Cannot do a oneway merge of %d trees",
-			     o->merge_size);
+			     o->internal.merge_size);
 
 	if (!a || a == o->df_conflict_entry)
 		return deleted_entry(old, old, o);
@@ -3027,8 +3027,8 @@ int stash_worktree_untracked_merge(const struct cache_entry * const *src,
 	const struct cache_entry *worktree = src[1];
 	const struct cache_entry *untracked = src[2];
 
-	if (o->merge_size != 2)
-		BUG("invalid merge_size: %d", o->merge_size);
+	if (o->internal.merge_size != 2)
+		BUG("invalid merge_size: %d", o->internal.merge_size);
 
 	if (worktree && untracked)
 		return error(_("worktree and untracked commit have duplicate entries: %s"),
diff --git a/unpack-trees.h b/unpack-trees.h
index 0335c89bc75..e8737adfeda 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -65,7 +65,6 @@ struct unpack_trees_options {
 		     skip_unmerged,
 		     initial_checkout,
 		     diff_index_cached,
-		     debug_unpack,
 		     skip_sparse_checkout,
 		     quiet,
 		     exiting_early,
@@ -78,7 +77,6 @@ struct unpack_trees_options {
 	merge_fn_t fn;
 
 	int head_idx;
-	int merge_size;
 
 	struct cache_entry *df_conflict_entry;
 	void *unpack_data;
@@ -90,8 +88,10 @@ struct unpack_trees_options {
 
 	struct unpack_trees_options_internal {
 		unsigned int nontrivial_merge,
-			     show_all_errors;
+			     show_all_errors,
+			     debug_unpack; /* used by read-tree debugging */
 
+		int merge_size; /* used by read-tree debugging */
 		int cache_bottom;
 		const char *msgs[NB_UNPACK_TREES_WARNING_TYPES];
 		struct strvec msgs_to_free;
-- 
gitgitgadget
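
As a rough illustration of how a merge callback ends up reading
merge_size once it lives in the internal struct, here is a minimal
sketch.  It is not code from the series: struct dbg_options,
dbg_begin() and dbg_twoway() are hypothetical names that loosely mirror
the shape of the merge_size checks in the diff above.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct unpack_trees_options. */
struct dbg_options {
	int head_idx; /* public */

	struct dbg_options_internal {
		unsigned int debug_unpack; /* debugging aid only */
		int merge_size;            /* filled in by the implementation */
	} internal;
};

/* Implementation side: record how many trees are being merged. */
void dbg_begin(struct dbg_options *o, int nr_trees)
{
	o->internal.merge_size = nr_trees;
}

/* Merge callback: refuses to run unless exactly two trees were given. */
int dbg_twoway(const struct dbg_options *o)
{
	if (o->internal.merge_size != 2) {
		fprintf(stderr, "Cannot do a twoway merge of %d trees\n",
			o->internal.merge_size);
		return -1;
	}
	return 0;
}
```

The callback only reads the field; in this sketch the single writer is
dbg_begin(), which plays the role of the entry point that knows the
tree count.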



* [PATCH v3 13/13] unpack-trees: add usage notices around df_conflict_entry
  2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
                       ` (11 preceding siblings ...)
  2023-02-27 15:28     ` [PATCH v3 12/13] unpack-trees: special case read-tree debugging as internal usage Elijah Newren via GitGitGadget
@ 2023-02-27 15:28     ` Elijah Newren via GitGitGadget
  12 siblings, 0 replies; 57+ messages in thread
From: Elijah Newren via GitGitGadget @ 2023-02-27 15:28 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Elijah Newren, Jacob Keller, Jonathan Tan,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Avoid making users believe they need to initialize df_conflict_entry
to something (as happened with other output-only fields before) with
a quick comment and a small sanity check.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 unpack-trees.c | 2 ++
 unpack-trees.h | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3e5f4bd2355..a37ab292bbd 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1876,6 +1876,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		BUG("o->internal.dir is for internal use only");
 	if (o->internal.pl)
 		BUG("o->internal.pl is for internal use only");
+	if (o->df_conflict_entry)
+		BUG("o->df_conflict_entry is an output only field");
 
 	trace_performance_enter();
 	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
diff --git a/unpack-trees.h b/unpack-trees.h
index e8737adfeda..61c06eb7c50 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -78,7 +78,7 @@ struct unpack_trees_options {
 
 	int head_idx;
 
-	struct cache_entry *df_conflict_entry;
+	struct cache_entry *df_conflict_entry; /* output only */
 	void *unpack_data;
 
 	struct index_state *dst_index;
-- 
gitgitgadget
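
The "output only" check in the patch above can be sketched on its own.
This is a minimal illustration under stated assumptions, not Git code:
struct out_options, struct out_entry and out_run() are hypothetical,
and the sketch returns -1 where the patch calls BUG(), purely so the
behaviour can be exercised without aborting.

```c
#include <stddef.h>

/* Hypothetical placeholder for the entry type being handed back. */
struct out_entry {
	int stage;
};

/* Hypothetical, simplified options struct with one output-only field. */
struct out_options {
	int head_idx;                     /* input */
	struct out_entry *conflict_entry; /* output only */
};

static struct out_entry the_conflict_entry;

/*
 * Returns 0 on success and -1 if the caller pre-set the output-only
 * field.  (Assumption: an error return replaces BUG() in this sketch.)
 */
int out_run(struct out_options *o)
{
	if (o->conflict_entry)
		return -1;
	o->conflict_entry = &the_conflict_entry;
	return 0;
}
```

One consequence worth noting in this sketch: reusing the same struct
for a second call without clearing the field trips the check, so a
caller would need to reset it or start from a fresh struct.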


* Re: [PATCH v3 01/13] t2021: fix platform-specific leftover cruft
  2023-02-27 15:28     ` [PATCH v3 01/13] t2021: fix platform-specific leftover cruft Elijah Newren via GitGitGadget
@ 2023-02-27 19:11       ` Derrick Stolee
  0 siblings, 0 replies; 57+ messages in thread
From: Derrick Stolee @ 2023-02-27 19:11 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git
  Cc: Elijah Newren, Jacob Keller, Jonathan Tan

On 2/27/2023 10:28 AM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>

>  test_expect_success SYMLINKS 'the symlink remained' '
>  
> -	test_when_finished "rm a/b" &&
>  	test -h a/b
>  '
>  
> +test_expect_success 'cleanup after previous symlink tests' '
> +	rm a/b
> +'

I was confused about why this worked without "rm -f a/b", but it seems
the path exists in all cases; it is only a symlink on the filesystem
when the SYMLINKS prerequisite is satisfied.

Thanks,
-Stolee


* Re: [PATCH v3 02/13] unpack-trees: heed requests to overwrite ignored files
  2023-02-27 15:28     ` [PATCH v3 02/13] unpack-trees: heed requests to overwrite ignored files Elijah Newren via GitGitGadget
@ 2023-02-27 23:20       ` Jonathan Tan
  0 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2023-02-27 23:20 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: Jonathan Tan, git, Derrick Stolee, Elijah Newren, Jacob Keller

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 3d05e45a279..4518d33ed99 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -2337,7 +2337,7 @@ static int verify_clean_subdirectory(const struct cache_entry *ce,
>  
>  	memset(&d, 0, sizeof(d));
>  	if (o->dir)
> -		d.exclude_per_dir = o->dir->exclude_per_dir;
> +		setup_standard_excludes(&d);
>  	i = read_directory(&d, o->src_index, pathbuf, namelen+1, NULL);
>  	dir_clear(&d);
>  	free(pathbuf);

Thanks to the later patches in this patch set, I only needed to look at
unpack-trees.c to see how o->dir (later, o->internal.dir) is set. The
only place it is set is in unpack_trees(), in which a flag is set and
setup_standard_excludes() is called. So the RHS of the diff here does
effectively the same thing as the LHS. (As for the flag, it is not set
in the RHS, but it was not set in the LHS in the first place, so that's
fine.)
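
The equivalence argued above can be sketched as follows. This is a simplified model under stated assumptions, not git's code: every `sketch_` name and `SKETCH_OUTER_FLAG` is a hypothetical stand-in, and `sketch_setup_standard_excludes()` models only the exclude_per_dir assignment (the real setup_standard_excludes() does more than this).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the parts of dir_struct discussed here. */
struct sketch_dir {
	int flags;
	const char *exclude_per_dir;
};

#define SKETCH_OUTER_FLAG 1 /* hypothetical flag set only on the outer dir */

/* Assumption: models only the per-directory exclude file name. */
static void sketch_setup_standard_excludes(struct sketch_dir *d)
{
	d->exclude_per_dir = ".gitignore";
}

/* Models the single place where o->dir is set up. */
static void sketch_init_outer(struct sketch_dir *outer)
{
	memset(outer, 0, sizeof(*outer));
	outer->flags |= SKETCH_OUTER_FLAG;
	sketch_setup_standard_excludes(outer);
}

/* Old side of the diff: copy exclude_per_dir from the outer dir. */
static void sketch_old_side(struct sketch_dir *d, const struct sketch_dir *outer)
{
	memset(d, 0, sizeof(*d));
	if (outer)
		d->exclude_per_dir = outer->exclude_per_dir;
}

/* New side of the diff: call the setup function directly. */
static void sketch_new_side(struct sketch_dir *d, const struct sketch_dir *outer)
{
	memset(d, 0, sizeof(*d));
	if (outer)
		sketch_setup_standard_excludes(d);
}
```

Because the outer dir is only ever initialized through the setup function, both sides leave d with the same exclude_per_dir, and neither side copies the flag.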

Thanks - all 13 patches in this patch set look good.


Thread overview: 57+ messages
2023-02-23  9:14 [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Elijah Newren via GitGitGadget
2023-02-23  9:14 ` [PATCH 01/11] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
2023-02-23  9:14 ` [PATCH 02/11] dir: add a usage note to exclude_per_dir Elijah Newren via GitGitGadget
2023-02-24 22:31   ` Jonathan Tan
2023-02-25  0:23     ` Elijah Newren
2023-02-25  1:54       ` Jonathan Tan
2023-02-25  3:23         ` Elijah Newren
2023-02-23  9:14 ` [PATCH 03/11] dir: mark output only fields of dir_struct as such Elijah Newren via GitGitGadget
2023-02-23  9:14 ` [PATCH 04/11] unpack-trees: clean up some flow control Elijah Newren via GitGitGadget
2023-02-24 22:33   ` Jonathan Tan
2023-02-23  9:14 ` [PATCH 05/11] sparse-checkout: avoid using internal API of unpack-trees Elijah Newren via GitGitGadget
2023-02-24 22:37   ` Jonathan Tan
2023-02-25  0:33     ` Elijah Newren
2023-02-23  9:14 ` [PATCH 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2 Elijah Newren via GitGitGadget
2023-02-24 23:22   ` Jonathan Tan
2023-02-25  0:40     ` Elijah Newren
2023-02-23  9:14 ` [PATCH 07/11] unpack_trees: start splitting internal fields from public API Elijah Newren via GitGitGadget
2023-02-23  9:14 ` [PATCH 08/11] unpack-trees: mark fields only used internally as internal Elijah Newren via GitGitGadget
2023-02-23  9:14 ` [PATCH 09/11] unpack-trees: rewrap a few overlong lines from previous patch Elijah Newren via GitGitGadget
2023-02-23  9:14 ` [PATCH 10/11] unpack-trees: special case read-tree debugging as internal usage Elijah Newren via GitGitGadget
2023-02-23  9:15 ` [PATCH 11/11] unpack-trees: add usage notices around df_conflict_entry Elijah Newren via GitGitGadget
2023-02-23 15:18 ` [PATCH 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Derrick Stolee
2023-02-23 15:26   ` Derrick Stolee
2023-02-23 20:35     ` Elijah Newren
2023-02-23 20:31   ` Elijah Newren
2023-02-24  1:24   ` Junio C Hamano
2023-02-24  5:54   ` Jacob Keller
2023-02-24 23:36 ` Jonathan Tan
2023-02-25  2:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
2023-02-25  2:25   ` [PATCH v2 01/11] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
2023-02-25  2:25   ` [PATCH v2 02/11] dir: add a usage note to exclude_per_dir Elijah Newren via GitGitGadget
2023-02-25  2:25   ` [PATCH v2 03/11] dir: mark output only fields of dir_struct as such Elijah Newren via GitGitGadget
2023-02-25  2:25   ` [PATCH v2 04/11] unpack-trees: clean up some flow control Elijah Newren via GitGitGadget
2023-02-25  2:25   ` [PATCH v2 05/11] sparse-checkout: avoid using internal API of unpack-trees Elijah Newren via GitGitGadget
2023-02-25  2:25   ` [PATCH v2 06/11] sparse-checkout: avoid using internal API of unpack-trees, take 2 Elijah Newren via GitGitGadget
2023-02-25  2:25   ` [PATCH v2 07/11] unpack_trees: start splitting internal fields from public API Elijah Newren via GitGitGadget
2023-02-25  2:25   ` [PATCH v2 08/11] unpack-trees: mark fields only used internally as internal Elijah Newren via GitGitGadget
2023-02-25  2:25   ` [PATCH v2 09/11] unpack-trees: rewrap a few overlong lines from previous patch Elijah Newren via GitGitGadget
2023-02-25  2:25   ` [PATCH v2 10/11] unpack-trees: special case read-tree debugging as internal usage Elijah Newren via GitGitGadget
2023-02-25  2:26   ` [PATCH v2 11/11] unpack-trees: add usage notices around df_conflict_entry Elijah Newren via GitGitGadget
2023-02-25 23:30   ` [PATCH v2 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal Junio C Hamano
2023-02-27 15:28   ` [PATCH v3 00/13] " Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 01/13] t2021: fix platform-specific leftover cruft Elijah Newren via GitGitGadget
2023-02-27 19:11       ` Derrick Stolee
2023-02-27 15:28     ` [PATCH v3 02/13] unpack-trees: heed requests to overwrite ignored files Elijah Newren via GitGitGadget
2023-02-27 23:20       ` Jonathan Tan
2023-02-27 15:28     ` [PATCH v3 03/13] dir: separate public from internal portion of dir_struct Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 04/13] dir: add a usage note to exclude_per_dir Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 05/13] dir: mark output only fields of dir_struct as such Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 06/13] unpack-trees: clean up some flow control Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 07/13] sparse-checkout: avoid using internal API of unpack-trees Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 08/13] sparse-checkout: avoid using internal API of unpack-trees, take 2 Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 09/13] unpack_trees: start splitting internal fields from public API Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 10/13] unpack-trees: mark fields only used internally as internal Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 11/13] unpack-trees: rewrap a few overlong lines from previous patch Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 12/13] unpack-trees: special case read-tree debugging as internal usage Elijah Newren via GitGitGadget
2023-02-27 15:28     ` [PATCH v3 13/13] unpack-trees: add usage notices around df_conflict_entry Elijah Newren via GitGitGadget
