git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/3] some minor memory allocation cleanups
@ 2020-01-30  9:51 Jeff King
  2020-01-30  9:52 ` [PATCH 1/3] normalize_path_copy(): document "dst" size expectations Jeff King
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Jeff King @ 2020-01-30  9:51 UTC (permalink / raw)
  To: git

These are a result of me poking at the results of:

  git grep 'x[mc]alloc.*[+*] '

looking for any buffer allocation computations that could overflow (and
hence under-allocate).

There are a few hits left after this in the commit-graph code. Those
will be dealt with in a separate series (coming soon!) since they have
other problems, as discussed in:

  https://lore.kernel.org/git/20191027042116.GA5801@sigill.intra.peff.net/

(those have to do with normalize_path_copy(), hence the only
semi-related documentation patch here).

  [1/3]: normalize_path_copy(): document "dst" size expectations
  [2/3]: walker_fetch(): avoid raw array length computation
  [3/3]: traverse_trees(): use stack array for name entries

 path.c      |  2 ++
 tree-walk.c | 13 ++++++++-----
 walker.c    |  4 +++-
 3 files changed, 13 insertions(+), 6 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] normalize_path_copy(): document "dst" size expectations
  2020-01-30  9:51 [PATCH 0/3] some minor memory allocation cleanups Jeff King
@ 2020-01-30  9:52 ` Jeff King
  2020-01-30 20:12   ` Taylor Blau
  2020-01-30  9:52 ` [PATCH 2/3] walker_fetch(): avoid raw array length computation Jeff King
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2020-01-30  9:52 UTC (permalink / raw)
  To: git

We take a "dst" buffer to write into, but there's no matching "len"
parameter. The hidden assumption is that normalizing always makes things
smaller, so we're OK as long as "dst" is at least as big as "src". Let's
document that explicitly.

Signed-off-by: Jeff King <peff@peff.net>
---
 path.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/path.c b/path.c
index a76eec8b96..88cf593007 100644
--- a/path.c
+++ b/path.c
@@ -1077,6 +1077,8 @@ const char *remove_leading_path(const char *in, const char *prefix)
 
 /*
  * It is okay if dst == src, but they should not overlap otherwise.
+ * The "dst" buffer must be at least as long as "src"; normalizing may shrink
+ * the size of the path, but will never grow it.
  *
  * Performs the following normalizations on src, storing the result in dst:
  * - Ensures that components are separated by '/' (Windows only)
-- 
2.25.0.515.gaba5347bc6


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] walker_fetch(): avoid raw array length computation
  2020-01-30  9:51 [PATCH 0/3] some minor memory allocation cleanups Jeff King
  2020-01-30  9:52 ` [PATCH 1/3] normalize_path_copy(): document "dst" size expectations Jeff King
@ 2020-01-30  9:52 ` Jeff King
  2020-01-30  9:53 ` [PATCH 3/3] traverse_trees(): use stack array for name entries Jeff King
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2020-01-30  9:52 UTC (permalink / raw)
  To: git

We compute the length of an array of object_id's with a raw
multiplication. In theory this could trigger an integer overflow which
would cause an under-allocation (and eventually an out of bounds write).

I doubt this can be triggered in practice, since you'd need to feed it
an enormous number of target objects, which would typically come from
the ref advertisement and be using proportional memory. And even on
64-bit systems, where "int" is much smaller than "size_t", that should
hold: even though "targets" is an int, the multiplication will be done
as a size_t because of the use of sizeof().

But we can easily fix it by using ALLOC_ARRAY(), which uses st_mult()
under the hood.

Signed-off-by: Jeff King <peff@peff.net>
---
 walker.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/walker.c b/walker.c
index 06cd2bd569..bb010f7a2b 100644
--- a/walker.c
+++ b/walker.c
@@ -261,12 +261,14 @@ int walker_fetch(struct walker *walker, int targets, char **target,
 	struct strbuf refname = STRBUF_INIT;
 	struct strbuf err = STRBUF_INIT;
 	struct ref_transaction *transaction = NULL;
-	struct object_id *oids = xmalloc(targets * sizeof(struct object_id));
+	struct object_id *oids;
 	char *msg = NULL;
 	int i, ret = -1;
 
 	save_commit_buffer = 0;
 
+	ALLOC_ARRAY(oids, targets);
+
 	if (write_ref) {
 		transaction = ref_transaction_begin(&err);
 		if (!transaction) {
-- 
2.25.0.515.gaba5347bc6


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] traverse_trees(): use stack array for name entries
  2020-01-30  9:51 [PATCH 0/3] some minor memory allocation cleanups Jeff King
  2020-01-30  9:52 ` [PATCH 1/3] normalize_path_copy(): document "dst" size expectations Jeff King
  2020-01-30  9:52 ` [PATCH 2/3] walker_fetch(): avoid raw array length computation Jeff King
@ 2020-01-30  9:53 ` Jeff King
  2020-01-30 14:57   ` Elijah Newren
  2020-01-30 14:59 ` [PATCH 0/3] some minor memory allocation cleanups Elijah Newren
  2020-01-30 23:03 ` Taylor Blau
  4 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2020-01-30  9:53 UTC (permalink / raw)
  To: git

We heap-allocate our arrays of name_entry structs, etc, with one entry
per tree we're asked to traverse. The code does a raw multiplication in
the xmalloc() call, which I find when auditing for integer overflows
during allocation.

We could "fix" this by using ALLOC_ARRAY() instead. But as it turns out,
the maximum size of these arrays is limited at compile time:

  - merge_trees() always passes in 3 trees

  - unpack_trees() and its brethren never pass in more than
    MAX_UNPACK_TREES

So we can simplify even further by just using a stack array and bounding
it with MAX_UNPACK_TREES. There should be no concern with overflowing
the stack, since MAX_UNPACK_TREES is only 8 and the structs themselves
are small.

Note that since we're replacing xcalloc(), we have to move one of the
NULL initializations into a loop.

Signed-off-by: Jeff King <peff@peff.net>
---

This does increase the coupling between tree-walk and unpack-trees a
bit. I'd be OK just switching to ALLOC_ARRAY(), too. I doubt the
performance improvement is measurable, and the cleanup free() calls are
already there.

 tree-walk.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index d5a8e096a6..3093cf7098 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -410,15 +410,20 @@ int traverse_trees(struct index_state *istate,
 		   struct traverse_info *info)
 {
 	int error = 0;
-	struct name_entry *entry = xmalloc(n*sizeof(*entry));
+	struct name_entry entry[MAX_UNPACK_TREES];
 	int i;
-	struct tree_desc_x *tx = xcalloc(n, sizeof(*tx));
+	struct tree_desc_x tx[ARRAY_SIZE(entry)];
 	struct strbuf base = STRBUF_INIT;
 	int interesting = 1;
 	char *traverse_path;
 
-	for (i = 0; i < n; i++)
+	if (n >= ARRAY_SIZE(entry))
+		BUG("traverse_trees() called with too many trees (%d)", n);
+
+	for (i = 0; i < n; i++) {
 		tx[i].d = t[i];
+		tx[i].skip = NULL;
+	}
 
 	if (info->prev) {
 		strbuf_make_traverse_path(&base, info->prev,
@@ -506,10 +511,8 @@ int traverse_trees(struct index_state *istate,
 			if (mask & (1ul << i))
 				update_extended_entry(tx + i, entry + i);
 	}
-	free(entry);
 	for (i = 0; i < n; i++)
 		free_extended_entry(tx + i);
-	free(tx);
 	free(traverse_path);
 	info->traverse_path = NULL;
 	strbuf_release(&base);
-- 
2.25.0.515.gaba5347bc6

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] traverse_trees(): use stack array for name entries
  2020-01-30  9:53 ` [PATCH 3/3] traverse_trees(): use stack array for name entries Jeff King
@ 2020-01-30 14:57   ` Elijah Newren
  2020-01-31  9:29     ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Elijah Newren @ 2020-01-30 14:57 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Thu, Jan 30, 2020 at 1:54 AM Jeff King <peff@peff.net> wrote:
>
> We heap-allocate our arrays of name_entry structs, etc, with one entry
> per tree we're asked to traverse. The code does a raw multiplication in
> the xmalloc() call, which I find when auditing for integer overflows
> during allocation.
>
> We could "fix" this by using ALLOC_ARRAY() instead. But as it turns out,
> the maximum size of these arrays is limited at compile time:
>
>   - merge_trees() always passes in 3 trees
>
>   - unpack_trees() and its brethren never pass in more than
>     MAX_UNPACK_TREES
>
> So we can simplify even further by just using a stack array and bounding
> it with MAX_UNPACK_TREES. There should be no concern with overflowing
> the stack, since MAX_UNPACK_TREES is only 8 and the structs themselves
> are small.
>
> Note that since we're replacing xcalloc(), we have to move one of the
> NULL initializations into a loop.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>
> This does increase the coupling between tree-walk and unpack-trees a
> bit. I'd be OK just switching to ALLOC_ARRAY(), too. I doubt the
> performance improvement is measurable, and the cleanup free() calls are
> already there.

Could we undo this cyclic dependency between tree-walk and
unpack-trees by defining MAX_TRAVERSE_TREES in tree-walk.h, making
MAX_UNPACK_TREES in unpack-trees.h be defined to MAX_TRAVERSE_TREES,
and remove the include of unpack-trees.h in tree-walk.c?

>  tree-walk.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/tree-walk.c b/tree-walk.c
> index d5a8e096a6..3093cf7098 100644
> --- a/tree-walk.c
> +++ b/tree-walk.c
> @@ -410,15 +410,20 @@ int traverse_trees(struct index_state *istate,
>                    struct traverse_info *info)
>  {
>         int error = 0;
> -       struct name_entry *entry = xmalloc(n*sizeof(*entry));
> +       struct name_entry entry[MAX_UNPACK_TREES];
>         int i;
> -       struct tree_desc_x *tx = xcalloc(n, sizeof(*tx));
> +       struct tree_desc_x tx[ARRAY_SIZE(entry)];
>         struct strbuf base = STRBUF_INIT;
>         int interesting = 1;
>         char *traverse_path;
>
> -       for (i = 0; i < n; i++)
> +       if (n >= ARRAY_SIZE(entry))
> +               BUG("traverse_trees() called with too many trees (%d)", n);
> +
> +       for (i = 0; i < n; i++) {
>                 tx[i].d = t[i];
> +               tx[i].skip = NULL;
> +       }
>
>         if (info->prev) {
>                 strbuf_make_traverse_path(&base, info->prev,
> @@ -506,10 +511,8 @@ int traverse_trees(struct index_state *istate,
>                         if (mask & (1ul << i))
>                                 update_extended_entry(tx + i, entry + i);
>         }
> -       free(entry);
>         for (i = 0; i < n; i++)
>                 free_extended_entry(tx + i);
> -       free(tx);
>         free(traverse_path);
>         info->traverse_path = NULL;
>         strbuf_release(&base);
> --
> 2.25.0.515.gaba5347bc6

Looks good to me otherwise.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] some minor memory allocation cleanups
  2020-01-30  9:51 [PATCH 0/3] some minor memory allocation cleanups Jeff King
                   ` (2 preceding siblings ...)
  2020-01-30  9:53 ` [PATCH 3/3] traverse_trees(): use stack array for name entries Jeff King
@ 2020-01-30 14:59 ` Elijah Newren
  2020-01-30 23:03 ` Taylor Blau
  4 siblings, 0 replies; 13+ messages in thread
From: Elijah Newren @ 2020-01-30 14:59 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Thu, Jan 30, 2020 at 1:53 AM Jeff King <peff@peff.net> wrote:
>
> These are a result of me poking at the results of:
>
>   git grep 'x[mc]alloc.*[+*] '
>
> looking for any buffer allocation computations that could overflow (and
> hence under-allocate).
>
> There are a few hits left after this in the commit-graph code. Those
> will be dealt with in a separate series (coming soon!) since they have
> other problems, as discussed in:
>
>   https://lore.kernel.org/git/20191027042116.GA5801@sigill.intra.peff.net/
>
> (those have to do with normalize_path_copy(), hence the only
> semi-related documentation patch here).
>
>   [1/3]: normalize_path_copy(): document "dst" size expectations
>   [2/3]: walker_fetch(): avoid raw array length computation
>   [3/3]: traverse_trees(): use stack array for name entries
>
>  path.c      |  2 ++
>  tree-walk.c | 13 ++++++++-----
>  walker.c    |  4 +++-
>  3 files changed, 13 insertions(+), 6 deletions(-)
>
> -Peff

Other than introducing (or extending?) a cyclic dependency between
tree-walk and unpack-trees that I'd prefer to remove, this series
looks good to me.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] normalize_path_copy(): document "dst" size expectations
  2020-01-30  9:52 ` [PATCH 1/3] normalize_path_copy(): document "dst" size expectations Jeff King
@ 2020-01-30 20:12   ` Taylor Blau
  2020-01-31  8:45     ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Taylor Blau @ 2020-01-30 20:12 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Thu, Jan 30, 2020 at 04:52:19AM -0500, Jeff King wrote:
> We take a "dst" buffer to write into, but there's no matching "len"
> parameter. The hidden assumption is that normalizing always makes things
> smaller, so we're OK as long as "dst" is at least as big as "src". Let's
> document that explicitly.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  path.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/path.c b/path.c
> index a76eec8b96..88cf593007 100644
> --- a/path.c
> +++ b/path.c
> @@ -1077,6 +1077,8 @@ const char *remove_leading_path(const char *in, const char *prefix)
>
>  /*
>   * It is okay if dst == src, but they should not overlap otherwise.
> + * The "dst" buffer must be at least as long as "src"; normalizing may shrink
> + * the size of the path, but will never grow it.

Thanks for documenting this. It's quite helpful, and hopefully should
prevent bugs like the one you alluded to in your cover letter from
getting in in the future.

>   *
>   * Performs the following normalizations on src, storing the result in dst:
>   * - Ensures that components are separated by '/' (Windows only)
> --
> 2.25.0.515.gaba5347bc6

This looks obviously good to me.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] some minor memory allocation cleanups
  2020-01-30  9:51 [PATCH 0/3] some minor memory allocation cleanups Jeff King
                   ` (3 preceding siblings ...)
  2020-01-30 14:59 ` [PATCH 0/3] some minor memory allocation cleanups Elijah Newren
@ 2020-01-30 23:03 ` Taylor Blau
  4 siblings, 0 replies; 13+ messages in thread
From: Taylor Blau @ 2020-01-30 23:03 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Hi Peff,

On Thu, Jan 30, 2020 at 04:51:55AM -0500, Jeff King wrote:
> These are a result of me poking at the results of:
>
>   git grep 'x[mc]alloc.*[+*] '
>
> looking for any buffer allocation computations that could overflow (and
> hence under-allocate).
>
> There are a few hits left after this in the commit-graph code. Those
> will be dealt with in a separate series (coming soon!) since they have
> other problems, as discussed in:

I just sent the series you're referring to in [1].

>   https://lore.kernel.org/git/20191027042116.GA5801@sigill.intra.peff.net/
>
> (those have to do with normalize_path_copy(), hence the only
> semi-related documentation patch here).
>
>   [1/3]: normalize_path_copy(): document "dst" size expectations
>   [2/3]: walker_fetch(): avoid raw array length computation
>   [3/3]: traverse_trees(): use stack array for name entries
>
>  path.c      |  2 ++
>  tree-walk.c | 13 ++++++++-----
>  walker.c    |  4 +++-
>  3 files changed, 13 insertions(+), 6 deletions(-)
>
> -Peff

[1]: https://lore.kernel.org/git/cover.1580424766.git.me@ttaylorr.com/

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] normalize_path_copy(): document "dst" size expectations
  2020-01-30 20:12   ` Taylor Blau
@ 2020-01-31  8:45     ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2020-01-31  8:45 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On Thu, Jan 30, 2020 at 12:12:47PM -0800, Taylor Blau wrote:

> > @@ -1077,6 +1077,8 @@ const char *remove_leading_path(const char *in, const char *prefix)
> >
> >  /*
> >   * It is okay if dst == src, but they should not overlap otherwise.
> > + * The "dst" buffer must be at least as long as "src"; normalizing may shrink
> > + * the size of the path, but will never grow it.
> 
> Thanks for documenting this. It's quite helpful, and hopefully should
> prevent bugs like the one you alluded to in your cover letter from
> getting in in the future.

To be picky, I didn't find an actual bug around buffer lengths; the
problem was a failure to check the error code. This was just something I
happened to find confusing auditing the code.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] traverse_trees(): use stack array for name entries
  2020-01-30 14:57   ` Elijah Newren
@ 2020-01-31  9:29     ` Jeff King
  2020-01-31 18:52       ` Elijah Newren
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2020-01-31  9:29 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

On Thu, Jan 30, 2020 at 06:57:26AM -0800, Elijah Newren wrote:

> > This does increase the coupling between tree-walk and unpack-trees a
> > bit. I'd be OK just switching to ALLOC_ARRAY(), too. I doubt the
> > performance improvement is measurable, and the cleanup free() calls are
> > already there.
> 
> Could we undo this cyclic dependency between tree-walk and
> unpack-trees by defining MAX_TRAVERSE_TREES in tree-walk.h, making
> MAX_UNPACK_TREES in unpack-trees.h be defined to MAX_TRAVERSE_TREES,
> and remove the include of unpack-trees.h in tree-walk.c?

I don't mind doing that, but I had a hard time trying to write a commit
message. I.e., I can explain the current use of MAX_UNPACK_TREES here,
or defining MAX_TRAVERSE_TREES as MAX_UNPACK_TREES by saying "this is an
arbitrary limit, but it's the highest value any caller would use with
us".

But to define MAX_UNPACK_TREES in terms of MAX_TRAVERSE_TREES, I feel
we've created a circular reasoning in the justification.

I'm not even sure whether the current value of 8 is meaningful. It comes
from this commit:

  commit ca885a4fe6444bed840295378848904106c87c85
  Author: Junio C Hamano <gitster@pobox.com>
  Date:   Thu Mar 13 22:07:18 2008 -0700
  
      read-tree() and unpack_trees(): use consistent limit
      
      read-tree -m can read up to MAX_TREES, which was arbitrarily set to 8 since
      August 2007 (4 is needed to deal with 2 merge-base case).
      
      However, the updated unpack_trees() code had an advertised limit of 4
      (which it enforced).  In reality the code was prepared to take only 3
      trees and giving 4 caused it to stomp on its stack.  Rename the MAX_TREES
      constant to MAX_UNPACK_TREES, move it to the unpack-trees.h common header
      file, and use it from both places to avoid future confusion.

which kind of makes me wonder if we should just go back to calling it
MAX_TREES. I guess that's too vague, though.

So I dunno. It would be easy to do as you asked, but I'm not convinced
it actually untangles anything meaningful.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] traverse_trees(): use stack array for name entries
  2020-01-31  9:29     ` Jeff King
@ 2020-01-31 18:52       ` Elijah Newren
  2020-02-01 11:39         ` [PATCH] tree-walk.c: break circular dependency with unpack-trees Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Elijah Newren @ 2020-01-31 18:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Fri, Jan 31, 2020 at 1:29 AM Jeff King <peff@peff.net> wrote:
>
> On Thu, Jan 30, 2020 at 06:57:26AM -0800, Elijah Newren wrote:
>
> > > This does increase the coupling between tree-walk and unpack-trees a
> > > bit. I'd be OK just switching to ALLOC_ARRAY(), too. I doubt the
> > > performance improvement is measurable, and the cleanup free() calls are
> > > already there.
> >
> > Could we undo this cyclic dependency between tree-walk and
> > unpack-trees by defining MAX_TRAVERSE_TREES in tree-walk.h, making
> > MAX_UNPACK_TREES in unpack-trees.h be defined to MAX_TRAVERSE_TREES,
> > and remove the include of unpack-trees.h in tree-walk.c?
>
> I don't mind doing that, but I had a hard time trying to write a commit
> message. I.e., I can explain the current use of MAX_UNPACK_TREES here,
> or defining MAX_TRAVERSE_TREES as MAX_UNPACK_TREES by saying "this is an
> arbitrary limit, but it's the highest value any caller would use with
> us".
>
> But to define MAX_UNPACK_TREES in terms of MAX_TRAVERSE_TREES, I feel
> we've created a circular reasoning in the justification.

It may be circular in how it came about, but I don't see how it's
circular from first principles: unpack_trees() uses traverse_trees()
to do its work (whereas tree-walk.c shouldn't call any functions from
unpack-trees).  For simplicity and lacking any justification for huge
numbers of trees, we want to impose a small limit on both for how many
trees they can simultaneously operate on -- though we want to leave a
little wiggle room by generously setting the limit much higher than we
expect any caller to ever want.  unpack-trees certainly can't have a
higher limit than tree-walk, and we don't know of callers outside
unpack_trees that would want to traverse more trees than
unpack_trees() does, so we just set them both to the same value, 8.

> I'm not even sure whether the current value of 8 is meaningful. It comes
> from this commit:
>
>   commit ca885a4fe6444bed840295378848904106c87c85
>   Author: Junio C Hamano <gitster@pobox.com>
>   Date:   Thu Mar 13 22:07:18 2008 -0700
>
>       read-tree() and unpack_trees(): use consistent limit
>
>       read-tree -m can read up to MAX_TREES, which was arbitrarily set to 8 since
>       August 2007 (4 is needed to deal with 2 merge-base case).
>
>       However, the updated unpack_trees() code had an advertised limit of 4
>       (which it enforced).  In reality the code was prepared to take only 3
>       trees and giving 4 caused it to stomp on its stack.  Rename the MAX_TREES
>       constant to MAX_UNPACK_TREES, move it to the unpack-trees.h common header
>       file, and use it from both places to avoid future confusion.
>
> which kind of makes me wonder if we should just go back to calling it
> MAX_TREES. I guess that's too vague, though.
>
> So I dunno. It would be easy to do as you asked, but I'm not convinced
> it actually untangles anything meaningful.

As far as I can tell, before your change, we can remove the
   #include "unpack-trees.h"
from tree-walk.c; nothing uses it.  I do like snipping includes and
dependencies where they aren't necessary, and this one seems like one
that could be removed.

But, it's not a big deal; if you want to leave that for future work
for someone else (perhaps even asking me to turn my paragraph above
into a commit message that rips it out and defines one #define in
terms of the other), that's fine.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] tree-walk.c: break circular dependency with unpack-trees
  2020-01-31 18:52       ` Elijah Newren
@ 2020-02-01 11:39         ` Jeff King
  2020-02-01 15:32           ` Elijah Newren
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2020-02-01 11:39 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

On Fri, Jan 31, 2020 at 10:52:30AM -0800, Elijah Newren wrote:

> As far as I can tell, before your change, we can remove the
>    #include "unpack-trees.h"
> from tree-walk.c; nothing uses it.  I do like snipping includes and
> dependencies where they aren't necessary, and this one seems like one
> that could be removed.

Yeah, I think that is the case.

> But, it's not a big deal; if you want to leave that for future work
> for someone else (perhaps even asking me to turn my paragraph above
> into a commit message that rips it out and defines one #define in
> terms of the other), that's fine.

Let's do it now while we're thinking about it. How about the patch below
on top of my series?

-- >8 --
Subject: [PATCH] tree-walk.c: break circular dependency with unpack-trees

The unpack-trees API depends on the tree-walk API. But we've recently
introduced a dependency in tree-walk.c on MAX_UNPACK_TREES, which
doesn't otherwise care about unpack-trees at all.

Let's break that dependency by reversing the constants: we'll introduce
a new MAX_TRAVERSE_TREES which belongs to the tree-walk API. And then we
can define MAX_UNPACK_TREES in terms of that (since unpack-trees cannot
possibly work with more trees than it can traverse at once via
tree-walk).

The value for both will remain at 8. This is somewhat arbitrary and
probably more than is necessary, per ca885a4fe6 (read-tree() and
unpack_trees(): use consistent limit, 2008-03-13), but there's not
really any pressing need to reduce it.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 tree-walk.c    | 3 +--
 tree-walk.h    | 2 ++
 unpack-trees.h | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index 3093cf7098..bb0ad34c54 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -1,6 +1,5 @@
 #include "cache.h"
 #include "tree-walk.h"
-#include "unpack-trees.h"
 #include "dir.h"
 #include "object-store.h"
 #include "tree.h"
@@ -410,7 +409,7 @@ int traverse_trees(struct index_state *istate,
 		   struct traverse_info *info)
 {
 	int error = 0;
-	struct name_entry entry[MAX_UNPACK_TREES];
+	struct name_entry entry[MAX_TRAVERSE_TREES];
 	int i;
 	struct tree_desc_x tx[ARRAY_SIZE(entry)];
 	struct strbuf base = STRBUF_INIT;
diff --git a/tree-walk.h b/tree-walk.h
index 826396c8ed..a5058469e9 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -3,6 +3,8 @@
 
 #include "cache.h"
 
+#define MAX_TRAVERSE_TREES 8
+
 /**
  * The tree walking API is used to traverse and inspect trees.
  */
diff --git a/unpack-trees.h b/unpack-trees.h
index ca94a421a5..ae1557fb80 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -6,7 +6,7 @@
 #include "string-list.h"
 #include "tree-walk.h"
 
-#define MAX_UNPACK_TREES 8
+#define MAX_UNPACK_TREES MAX_TRAVERSE_TREES
 
 struct cache_entry;
 struct unpack_trees_options;
-- 
2.25.0.538.g1d5d7023b0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] tree-walk.c: break circular dependency with unpack-trees
  2020-02-01 11:39         ` [PATCH] tree-walk.c: break circular dependency with unpack-trees Jeff King
@ 2020-02-01 15:32           ` Elijah Newren
  0 siblings, 0 replies; 13+ messages in thread
From: Elijah Newren @ 2020-02-01 15:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Sat, Feb 1, 2020 at 3:39 AM Jeff King <peff@peff.net> wrote:
>
> On Fri, Jan 31, 2020 at 10:52:30AM -0800, Elijah Newren wrote:
>
> > As far as I can tell, before your change, we can remove the
> >    #include "unpack-trees.h"
> > from tree-walk.c; nothing uses it.  I do like snipping includes and
> > dependencies where they aren't necessary, and this one seems like one
> > that could be removed.
>
> Yeah, I think that is the case.
>
> > But, it's not a big deal; if you want to leave that for future work
> > for someone else (perhaps even asking me to turn my paragraph above
> > into a commit message that rips it out and defines one #define in
> > terms of the other), that's fine.
>
> Let's do it now while we're thinking about it. How about the patch below
> on top of my series?
>
> -- >8 --
> Subject: [PATCH] tree-walk.c: break circular dependency with unpack-trees
>
> The unpack-trees API depends on the tree-walk API. But we've recently
> introduced a dependency in tree-walk.c on MAX_UNPACK_TREES, which
> doesn't otherwise care about unpack-trees at all.
>
> Let's break that dependency by reversing the constants: we'll introduce
> a new MAX_TRAVERSE_TREES which belongs to the tree-walk API. And then we
> can define MAX_UNPACK_TREES in terms of that (since unpack-trees cannot
> possibly work with more trees than it can traverse at once via
> tree-walk).
>
> The value for both will remain at 8. This is somewhat arbitrary and
> probably more than is necessary, per ca885a4fe6 (read-tree() and
> unpack_trees(): use consistent limit, 2008-03-13), but there's not
> really any pressing need to reduce it.
>
> Suggested-by: Elijah Newren <newren@gmail.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  tree-walk.c    | 3 +--
>  tree-walk.h    | 2 ++
>  unpack-trees.h | 2 +-
>  3 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/tree-walk.c b/tree-walk.c
> index 3093cf7098..bb0ad34c54 100644
> --- a/tree-walk.c
> +++ b/tree-walk.c
> @@ -1,6 +1,5 @@
>  #include "cache.h"
>  #include "tree-walk.h"
> -#include "unpack-trees.h"
>  #include "dir.h"
>  #include "object-store.h"
>  #include "tree.h"
> @@ -410,7 +409,7 @@ int traverse_trees(struct index_state *istate,
>                    struct traverse_info *info)
>  {
>         int error = 0;
> -       struct name_entry entry[MAX_UNPACK_TREES];
> +       struct name_entry entry[MAX_TRAVERSE_TREES];
>         int i;
>         struct tree_desc_x tx[ARRAY_SIZE(entry)];
>         struct strbuf base = STRBUF_INIT;
> diff --git a/tree-walk.h b/tree-walk.h
> index 826396c8ed..a5058469e9 100644
> --- a/tree-walk.h
> +++ b/tree-walk.h
> @@ -3,6 +3,8 @@
>
>  #include "cache.h"
>
> +#define MAX_TRAVERSE_TREES 8
> +
>  /**
>   * The tree walking API is used to traverse and inspect trees.
>   */
> diff --git a/unpack-trees.h b/unpack-trees.h
> index ca94a421a5..ae1557fb80 100644
> --- a/unpack-trees.h
> +++ b/unpack-trees.h
> @@ -6,7 +6,7 @@
>  #include "string-list.h"
>  #include "tree-walk.h"
>
> -#define MAX_UNPACK_TREES 8
> +#define MAX_UNPACK_TREES MAX_TRAVERSE_TREES
>
>  struct cache_entry;
>  struct unpack_trees_options;
> --
> 2.25.0.538.g1d5d7023b0

Looks great, thanks!

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-02-01 15:32 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-30  9:51 [PATCH 0/3] some minor memory allocation cleanups Jeff King
2020-01-30  9:52 ` [PATCH 1/3] normalize_path_copy(): document "dst" size expectations Jeff King
2020-01-30 20:12   ` Taylor Blau
2020-01-31  8:45     ` Jeff King
2020-01-30  9:52 ` [PATCH 2/3] walker_fetch(): avoid raw array length computation Jeff King
2020-01-30  9:53 ` [PATCH 3/3] traverse_trees(): use stack array for name entries Jeff King
2020-01-30 14:57   ` Elijah Newren
2020-01-31  9:29     ` Jeff King
2020-01-31 18:52       ` Elijah Newren
2020-02-01 11:39         ` [PATCH] tree-walk.c: break circular dependency with unpack-trees Jeff King
2020-02-01 15:32           ` Elijah Newren
2020-01-30 14:59 ` [PATCH 0/3] some minor memory allocation cleanups Elijah Newren
2020-01-30 23:03 ` Taylor Blau

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).