git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Derrick Stolee <stolee@gmail.com>
Cc: Taylor Blau <me@ttaylorr.com>, Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org
Subject: Re: [RFC PATCH 1/1] commit-graph.c: die on un-parseable commits
Date: Fri, 6 Sep 2019 13:04:17 -0400	[thread overview]
Message-ID: <20190906170417.GA23181@sigill.intra.peff.net> (raw)
In-Reply-To: <36bf0064-b563-74ed-4ae5-01745ced5d2e@gmail.com>

On Fri, Sep 06, 2019 at 12:48:05PM -0400, Derrick Stolee wrote:

> > diff --git a/revision.h b/revision.h
> > index 4134dc6029..5c0b831b37 100644
> > --- a/revision.h
> > +++ b/revision.h
> > @@ -33,7 +33,7 @@
> >  #define ALL_REV_FLAGS	(((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR)
> >  
> >  #define TOPO_WALK_EXPLORED	(1u<<27)
> > -#define TOPO_WALK_INDEGREE	(1u<<28)
> > +#define TOPO_WALK_INDEGREE	(1u<<24)
> 
> As an aside, these flag bit modifications look fine, but would need to
> be explained. I'm guessing that since you are adding a bit of data
> to struct object you want to avoid increasing the struct size across
> a 32-bit boundary. Are we sure that bit 24 is not used anywhere else?
> (My search for "1u<<24" found nothing, and "1 << 24" found a bit in
> the cache-entry flags, so this seems safe.)

Yeah, I'd definitely break this up into several commits with explanation
(though see an alternate I posted that just uses the parsed flag without
any new bits).

Bit 24 isn't used according to the table in objects.h, which is
_supposed_ to be the source of truth, though of course there's no
compiler-level checking. (One aside: is there a reason TOPO_WALK_* isn't
part of ALL_REV_FLAGS?).

And yes, the goal was to keep things to the 32-bit boundary. But in the
course of this, I discovered something interesting: 64-bit systems are
now padding this up to the 8-byte boundary!

The culprit is the switch of GIT_MAX_RAWSZ for sha256. Before then, our
object_id was 20 bytes for sha1. Adding 4 bytes of flags still left us
at 24 bytes, which is both 4- and 8-byte aligned.

With the switch to sha256, object_id is now 32 bytes. Adding 4 bytes
takes us to 36, and then 8-byte aligning the struct takes us to 40
bytes, with 4 bytes of wasted padding.

I'm sorely tempted to use this as an opportunity to move commit->index
into "struct object". That would actually shrink commit object sizes by
4 bytes, and would let all object types do the commit-slab trick to
store object data with constant-time lookup. This would make it possible
to migrate some uses of flags to per-operation bitfields (so e.g., two
traversals would have their _own_ flag data, and wouldn't risk stomping
on each other's bits).

The one downside would be that the index space would become more sparse.
I.e., right now if you're only storing things for commits in a slab, you
know that every slot you allocate is for a commit. But if we allocate an
index for each object, then the commits are less likely to be together
(so wasted memory and worse cache performance). That might be solvable
by assigning a per-type index (with a few hacks to handle OBJ_NONE).

Anyway, all of that is rather off the topic of this discussion.

-Peff

  reply	other threads:[~2019-09-06 17:04 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-04  2:22 [RFC PATCH 0/1] commit-graph.c: handle corrupt commit trees Taylor Blau
2019-09-04  2:22 ` [RFC PATCH 1/1] commit-graph.c: die on un-parseable commits Taylor Blau
2019-09-04  3:04   ` Jeff King
2019-09-04 21:18     ` Taylor Blau
2019-09-05  6:47       ` Jeff King
2019-09-06 16:48         ` Derrick Stolee
2019-09-06 17:04           ` Jeff King [this message]
2019-09-06 17:19             ` Derrick Stolee
2019-09-06 17:20             ` Derrick Stolee
2019-09-05 22:19     ` Junio C Hamano
2019-09-06  6:35       ` Jeff King
2019-09-06  6:56         ` Jeff King
2019-09-06 16:59         ` Junio C Hamano
2019-09-06 17:04           ` Jeff King
2019-09-09 16:39             ` Junio C Hamano
2019-09-09 16:54               ` Jeff King
2019-09-04 18:25 ` [RFC PATCH 0/1] commit-graph.c: handle corrupt commit trees Garima Singh
2019-09-04 21:21   ` Taylor Blau
2019-09-05  6:08     ` Jeff King
2019-09-06 16:48     ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190906170417.GA23181@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).