git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Derrick Stolee <stolee@gmail.com>,
	git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH 5/5] load_ref_decorations(): avoid parsing non-tag objects
Date: Tue, 22 Jun 2021 15:08:26 -0400
Message-ID: <YNI1KmkGTYDCW3jZ@coredump.intra.peff.net>
In-Reply-To: <87zgvh20j3.fsf@evledraar.gmail.com>

On Tue, Jun 22, 2021 at 08:27:53PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > Whoops, thanks for catching that. I originally called it "enum
> > object_type type", but then of course the compiler informed me that there
> > was already a "type" variable in the function. So I renamed it to
> > "objtype" but missed updating that line. But it still compiled. Yikes. :)
> 
> [Enter Captain Hindsight]
> 
> If you use a slightly different coding style and leverage the
> information the compiler has to work with you'd get it to error for you,
> e.g. this on your original patch would catch it:
> 
> 	diff --git a/log-tree.c b/log-tree.c
> 	index 8b700e9c142..7e3a011b533 100644
> 	--- a/log-tree.c
> 	+++ b/log-tree.c
> 	@@ -157,9 +157,12 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
> 	 	}
> 	 
> 	 	objtype = oid_object_info(the_repository, oid, NULL);
> 	-	if (type < 0)
> 	+	switch (type) {
> 	+	case OBJ_BAD:
> 	 		return 0;
> 	-	obj = lookup_object_by_type(the_repository, oid, objtype);
> 	+	default:
> 	+		obj = lookup_object_by_type(the_repository, oid, objtype);
> 	+	}
> 	 
> 	 	if (starts_with(refname, "refs/heads/"))
> 	 		type = DECORATION_REF_LOCAL;

Yeah, I agree that would find it in this case. I do find that style
slightly harder to read, though. And...

> IMO the real problem is an over-reliance on C being so happy to treat
> enums as ints (well, with them being ints). If you consistently use
> labels you get the compiler to do the checking. For me with gcc and
> clang with that on top:
> 	
> 	log-tree.c:161:2: error: case value ‘4294967295’ not in enumerated type ‘enum decoration_type’ [-Werror=switch]
> 	  case OBJ_BAD:
> 	  ^~~~
> 	log-tree.c:161:7: error: case value not in enumerated type 'enum decoration_type' [-Werror,-Wswitch]
> 	        case OBJ_BAD:
> 	             ^

...it would help in this case only because OBJ_BAD happens to have a
value that is not defined for decoration_type. If that value were
defined there, the compiler would be quite happy to consider them
equivalent.
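
A minimal sketch of that coincidence, using made-up enums (obj_kind and
deco_kind are hypothetical stand-ins, not git's real definitions). As
far as I can tell, a check like the -Wswitch one quoted above has
nothing to object to here, since -1 is a valid deco_kind value:

```c
/* Hypothetical enums for illustration only; not git's real types. */
enum obj_kind { KIND_BAD = -1, KIND_COMMIT = 1 };
enum deco_kind { DECO_UNSET = -1, DECO_LOCAL = 1 };

/* Returns 0 when "deco" matches the (wrong-enum) case label, else 1. */
static int usable(enum deco_kind deco)
{
	switch (deco) {
	case KIND_BAD: /* wrong enum, but same value as DECO_UNSET */
		return 0;
	default:
		return 1;
	}
}
```

So usable(DECO_UNSET) returns 0 even though the label names a constant
from an unrelated enum.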

So I don't disagree with you exactly, but I'm not sure that the tradeoff
of always using switches instead of conditionals (which IMHO is less
readable), in exchange for compiler safety that only works sometimes, is
worth it.

> I think we've disagreed on that exact point before recently, i.e. you
> think we shouldn't rely on OBJ_BAD in that way, and instead check for
> any negative value:
> https://lore.kernel.org/git/YHCZh5nLNVEHCWV2@coredump.intra.peff.net/

To be clear, my complaint about checking for OBJ_BAD exactly is that it
closes the door on other negative return values. And indeed, the switch()
you showed above would become a silent bug if we introduced
OBJ_BAD_FOR_ANOTHER_READ as "-2" (with no warning from the compiler,
because of the "default" case in the switch statement).
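
To make that concrete, here is a small sketch with a hypothetical enum
(obj_kind and KIND_BAD_OTHER are invented for illustration and are not
git's real names). The exact-match switch lets the new error value
through, while the sign check still rejects it:

```c
/* Hypothetical enum for illustration only; not git's real type. */
enum obj_kind {
	KIND_BAD_OTHER = -2, /* imagined error value added later */
	KIND_BAD = -1,
	KIND_COMMIT = 1
};

/* Exact-match style: only KIND_BAD is treated as an error. */
static int ok_by_switch(enum obj_kind kind)
{
	switch (kind) {
	case KIND_BAD:
		return 0;
	default:
		return 1; /* KIND_BAD_OTHER lands here silently */
	}
}

/* Range style: any negative value is treated as an error. */
static int ok_by_sign(enum obj_kind kind)
{
	return kind >= 0;
}
```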

Now that's somewhat hypothetical, but in the near-term it also means
confirming that any of the functions which get converted from "int" to
"enum object_type" are not in fact passing back "-2" in any
circumstances.

That said...

> In practice I don't think it's too verbose, because once you start
> consistently using the pattern you'll usually not be doing conversions
> all over the place, and would just do this sort of thing via a helper
> that does the type checking, e.g. something like this (or anything else
> where you don't lose the type & labels):
> [...]
> 	-	objtype = oid_object_info(the_repository, oid, NULL);
> 	-	if (type < 0)
> 	+	if (!oid_object_info_ok(the_repository, oid, &type, NULL))
> 	 		return 0;

Yes, that would deal with that problem. It's definitely a different
style, but one that I could get used to. It's a lot more object-oriented
("you are not allowed to do numeric logic on an object type; you can
only use these accessor methods to query it"). If we were going that
route, I would stop having "enum object_type" at all, and instead make
it "struct object_type { enum { ... } value }". That would prevent
anybody from accidentally just looking at it, and instead force people
into that object-oriented style.
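
Roughly what I have in mind, as a sketch only (the struct, its members,
and both accessors are hypothetical names, not anything in git today).
Because relational operators do not apply to struct operands in C,
something like "if (kind < 0)" would fail to compile and callers would
have to go through the accessors:

```c
/* Hypothetical wrapper; git's real "enum object_type" is a bare enum. */
struct obj_kind {
	enum { KIND_BAD = -1, KIND_COMMIT = 1, KIND_TAG = 4 } value;
};

/* Accessor: true when the wrapped value is not an error. */
static int obj_kind_ok(struct obj_kind kind)
{
	return kind.value >= 0;
}

/* Accessor: true when the wrapped value is the tag kind. */
static int obj_kind_is_tag(struct obj_kind kind)
{
	return kind.value == KIND_TAG;
}
```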

I dunno. It is a big departure from how we do things now. And the bug
here notwithstanding, I don't feel like enum confusion has generally
been a big source of error for us.

-Peff

Thread overview: 16+ messages
2021-06-22 16:03 [PATCH 0/5] some "log --decorate" optimizations Jeff King
2021-06-22 16:03 ` [PATCH 1/5] pretty.h: update and expand docstring for userformat_find_requirements() Jeff King
2021-06-22 16:04 ` [PATCH 2/5] log: avoid loading decorations for userformats that don't need it Jeff King
2021-06-22 16:05 ` [PATCH 3/5] object.h: expand docstring for lookup_unknown_object() Jeff King
2021-06-22 16:06 ` [PATCH 4/5] object.h: add lookup_object_by_type() function Jeff King
2021-06-22 16:08 ` [PATCH 5/5] load_ref_decorations(): avoid parsing non-tag objects Jeff King
2021-06-22 16:35   ` Derrick Stolee
2021-06-22 17:06     ` Jeff King
2021-06-22 17:09       ` Jeff King
2021-06-22 17:25         ` Derrick Stolee
2021-06-22 18:27       ` Ævar Arnfjörð Bjarmason
2021-06-22 19:08         ` Jeff King [this message]
2021-06-22 17:06   ` Ævar Arnfjörð Bjarmason
2021-06-22 18:57     ` Jeff King
2021-06-23  2:46   ` Taylor Blau
2021-06-23 21:51     ` Jeff King
