From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Derrick Stolee <derrickstolee@github.com>,
	Elijah Newren <newren@gmail.com>,
	Jacob Keller <jacob.keller@gmail.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Elijah Newren <newren@gmail.com>
Subject: [PATCH v2 00/11] Clarify API for dir.[ch] and unpack-trees.[ch] -- mark relevant fields as internal
Date: Sat, 25 Feb 2023 02:25:49 +0000
Message-ID: <pull.1149.v2.git.1677291960.gitgitgadget@gmail.com>
In-Reply-To: <pull.1149.git.1677143700.gitgitgadget@gmail.com>

Changes since v1 (thanks to Jonathan Tan for the careful reviews!)

 * Clear o->pl when freeing pl, to avoid risking use-after-free.
 * Initialize o->result in update_sparsity() since it is actually used (by
   check_ok_to_remove()).
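
As an illustration of the first item, here is a minimal sketch of the
"clear the pointer after freeing it" pattern. The sketch_* types and
functions are hypothetical stand-ins, not the real definitions from
unpack-trees.h or dir.h; only the shape of the fix is taken from the
range-diff below.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not git's real structs. */
struct sketch_pattern_list {
	int nr;
};

struct sketch_opts {
	struct sketch_pattern_list *pl;
};

/*
 * Release the pattern list if this function owns it, and reset the
 * pointer so that later readers see NULL instead of freed memory.
 */
static void sketch_finish(struct sketch_opts *o, int free_pattern_list)
{
	assert(o != NULL);
	if (free_pattern_list) {
		free(o->pl);   /* free(NULL) is a no-op, so repeat calls are safe */
		o->pl = NULL;  /* without this, o->pl would dangle */
	}
}
```

With the pointer reset, any later code that tests o->pl before using it
takes the "no pattern list" path rather than dereferencing freed memory.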

Some time ago, I noticed that struct dir_struct and struct
unpack_trees_options both have numerous fields meant for internal use only,
most of which are not marked as such. This has resulted in callers
accidentally trying to initialize some of these fields, and in at least one
case required a fair amount of review to verify other changes were okay --
review that would have been simplified by the a priori knowledge that a
combination of multiple fields was internal-only[1]. Looking closer, I
found that only 6 out of 18 fields in dir_struct were actually meant to be
public[2], and noted that unpack_trees_options also had 11 internal-only
fields (out of 36).

This series is primarily about moving internal-only fields within these two
structs into an embedded internal struct. Patch breakdown:

 * Patches 1-3: Restructuring dir_struct
   * Patch 1: Splitting off internal-use-only fields
   * Patch 2: Add important usage note to avoid accidentally using
     deprecated API
   * Patch 3: Mark output-only fields as such
 * Patches 4-11: Restructuring unpack_trees_options
   * Patches 4-6: Preparatory cleanup
   * Patches 7-10: Splitting off internal-use-only fields
   * Patch 11: Mark output-only field as such

To make the benefit clearer, here are compressed versions of dir_struct
both before and after the changes. First, before:

struct dir_struct {
    int nr;
    int alloc;
    int ignored_nr;
    int ignored_alloc;
    enum [...] flags;
    struct dir_entry **entries;
    struct dir_entry **ignored;
    const char *exclude_per_dir;
#define EXC_CMDL 0
#define EXC_DIRS 1
#define EXC_FILE 2
    struct exclude_list_group exclude_list_group[3];
    struct exclude_stack *exclude_stack;
    struct path_pattern *pattern;
    struct strbuf basebuf;
    struct untracked_cache *untracked;
    struct oid_stat ss_info_exclude;
    struct oid_stat ss_excludes_file;
    unsigned unmanaged_exclude_files;
    unsigned visited_paths;
    unsigned visited_directories;
};


And after the changes:

struct dir_struct {
    enum [...] flags;
    int nr; /* output only */
    int ignored_nr; /* output only */
    struct dir_entry **entries; /* output only */
    struct dir_entry **ignored; /* output only */
    struct untracked_cache *untracked;
    const char *exclude_per_dir; /* deprecated */
    struct dir_struct_internal {
        int alloc;
        int ignored_alloc;
#define EXC_CMDL 0
#define EXC_DIRS 1
#define EXC_FILE 2
        struct exclude_list_group exclude_list_group[3];
        struct exclude_stack *exclude_stack;
        struct path_pattern *pattern;
        struct strbuf basebuf;
        struct oid_stat ss_info_exclude;
        struct oid_stat ss_excludes_file;
        unsigned unmanaged_exclude_files;
        unsigned visited_paths;
        unsigned visited_directories;
    } internal;
};


The former version has 18 fields (and 3 magic constants) which API users
will have to figure out. The latter makes it clear that there are at most 2
fields you should set on input and at most 4 you read on output; the rest
(including all the magic constants) you can ignore.

[0] Search for "Extremely yes" in
https://lore.kernel.org/git/CAJoAoZm+TkCL0Jpg_qFgKottxbtiG2QOiY0qGrz3-uQy+=waPg@mail.gmail.com/
[1]
https://lore.kernel.org/git/CABPp-BFSFN3WM6q7KzkD5mhrwsz--St_-ej5LbaY8Yr2sZzj=w@mail.gmail.com/
[2]
https://lore.kernel.org/git/CABPp-BHgot=CPNyK_xNfog_SqsNPNoCGfiSb-gZoS2sn_741dQ@mail.gmail.com/

Elijah Newren (11):
  dir: separate public from internal portion of dir_struct
  dir: add a usage note to exclude_per_dir
  dir: mark output only fields of dir_struct as such
  unpack-trees: clean up some flow control
  sparse-checkout: avoid using internal API of unpack-trees
  sparse-checkout: avoid using internal API of unpack-trees, take 2
  unpack_trees: start splitting internal fields from public API
  unpack-trees: mark fields only used internally as internal
  unpack-trees: rewrap a few overlong lines from previous patch
  unpack-trees: special case read-tree debugging as internal usage
  unpack-trees: add usage notices around df_conflict_entry

 builtin/read-tree.c       |  10 +-
 builtin/sparse-checkout.c |   4 +-
 dir.c                     | 114 +++++++++---------
 dir.h                     | 110 +++++++++--------
 unpack-trees.c            | 247 ++++++++++++++++++++------------------
 unpack-trees.h            |  42 ++++---
 6 files changed, 277 insertions(+), 250 deletions(-)


base-commit: 06dd2baa8da4a73421b959ec026a43711b9d77f9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1149%2Fnewren%2Fclarify-api-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1149/newren/clarify-api-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1149

Range-diff vs v1:

  1:  7f59ad548d0 =  1:  7f59ad548d0 dir: separate public from internal portion of dir_struct
  2:  239b10e1181 =  2:  239b10e1181 dir: add a usage note to exclude_per_dir
  3:  b8aa14350d3 =  3:  b8aa14350d3 dir: mark output only fields of dir_struct as such
  4:  f5a58123034 =  4:  f5a58123034 unpack-trees: clean up some flow control
  5:  508837fc182 !  5:  975dec0f0eb sparse-checkout: avoid using internal API of unpack-trees
     @@ unpack-trees.c: enum update_sparsity_result update_sparsity(struct unpack_trees_
      +	if (free_pattern_list) {
      +		clear_pattern_list(pl);
      +		free(pl);
     ++		o->pl = NULL;
      +	}
       	trace_performance_leave("update_sparsity");
       	return ret;
  6:  8955b45e354 !  6:  429f195dcfe sparse-checkout: avoid using internal API of unpack-trees, take 2
     @@ Commit message
          Commit 2f6b1eb794 ("cache API: add a "INDEX_STATE_INIT" macro/function,
          add release_index()", 2023-01-12) mistakenly added some initialization
          of a member of unpack_trees_options that was intended to be
     -    internal-only.  Further, it served no purpose as it simply duplicated
     -    the initialization that unpack-trees.c code was already doing.
     +    internal-only.  This initialization should be done within
     +    update_sparsity() instead.
     +
     +    Note that while o->result is mostly meant for unpack_trees() and
     +    update_sparsity() mostly operates without o->result,
     +    check_ok_to_remove() does consult it so we need to ensure it is properly
     +    initialized.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     @@ builtin/sparse-checkout.c: static int update_working_directory(struct pattern_li
       	o.skip_sparse_checkout = 0;
       
       	setup_work_tree();
     +
     + ## unpack-trees.c ##
     +@@ unpack-trees.c: enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
     + 
     + 	old_show_all_errors = o->show_all_errors;
     + 	o->show_all_errors = 1;
     ++	index_state_init(&o->result, o->src_index->repo);
     + 
     + 	/* Sanity checks */
     + 	if (!o->update || o->index_only || o->skip_sparse_checkout)
  7:  63ee57478ed !  7:  993da584dbb unpack_trees: start splitting internal fields from public API
     @@ unpack-trees.c: enum update_sparsity_result update_sparsity(struct unpack_trees_
       			       CE_NEW_SKIP_WORKTREE, o->verbose_update);
       
       	/* Then loop over entries and update/remove as needed */
     +@@ unpack-trees.c: enum update_sparsity_result update_sparsity(struct unpack_trees_options *o,
     + 	if (free_pattern_list) {
     + 		clear_pattern_list(pl);
     + 		free(pl);
     +-		o->pl = NULL;
     ++		o->internal.pl = NULL;
     + 	}
     + 	trace_performance_leave("update_sparsity");
     + 	return ret;
      @@ unpack-trees.c: static int verify_clean_subdirectory(const struct cache_entry *ce,
       	pathbuf = xstrfmt("%.*s/", namelen, ce->name);
       
  8:  081578b3210 !  8:  8ecb24a45f0 unpack-trees: mark fields only used internally as internal
     @@ unpack-trees.c: enum update_sparsity_result update_sparsity(struct unpack_trees_
       
      -	old_show_all_errors = o->show_all_errors;
      -	o->show_all_errors = 1;
     +-	index_state_init(&o->result, o->src_index->repo);
      +	old_show_all_errors = o->internal.show_all_errors;
      +	o->internal.show_all_errors = 1;
     ++	index_state_init(&o->internal.result, o->src_index->repo);
       
       	/* Sanity checks */
       	if (!o->update || o->index_only || o->skip_sparse_checkout)
  9:  f492ab27b19 =  9:  36ca49c3624 unpack-trees: rewrap a few overlong lines from previous patch
 10:  a5048ea00b2 = 10:  5af04d7fe23 unpack-trees: special case read-tree debugging as internal usage
 11:  efec74c8b49 = 11:  c4f31237634 unpack-trees: add usage notices around df_conflict_entry

-- 
gitgitgadget
