git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Duy Nguyen <pclouds@gmail.com>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"René Scharfe" <l.s.r@web.de>, "Git List" <git@vger.kernel.org>,
	"Johannes Schindelin" <johannes.schindelin@gmx.de>
Subject: Re: [PATCH] fast-export: avoid NULL pointer arithmetic
Date: Fri, 11 May 2018 04:56:34 -0400	[thread overview]
Message-ID: <20180511085634.GC22086@sigill.intra.peff.net> (raw)
In-Reply-To: <CACsJy8AN4nssu1+x0x9Kmz1BB1aXO7_UBFCjpyULMeC5K-Fzvw@mail.gmail.com>

On Fri, May 11, 2018 at 08:19:59AM +0200, Duy Nguyen wrote:

> On Fri, May 11, 2018 at 6:49 AM, Junio C Hamano <gitster@pobox.com> wrote:
> > Junio C Hamano <gitster@pobox.com> writes:
> >
> >> René Scharfe <l.s.r@web.de> writes:
> >>
> >>>> But it somehow feels backwards in spirit to me, as the reason why we
> >>>> use "void *" there in the decoration field is because we expect that
> >>>> we'd have a pointer to some struture most of the time, and we have
> >>>> to occasionally store a small integer there.
> >>>
> >>> Yes, fast-export seems to be the only place that stores an integer as
> >>> a decoration.
> >>
> >> With the decoration subsystem that might be the case, but I think
> >> we have other codepaths where "void * .util" field in the structure
> >> is used to store (void *)1, expecting that a normal allocation will
> >> never yield a pointer that is indistinguishable from that value.
> >
> > I was looking at a different topic and noticed that bisect.c uses
> > commit->util (which is of type "void *") to always store an int (it
> > never stores a pointer and there is no mixing).  This one is equally
> > unportable as fast-export after your fix.
> >
> 
> In both cases we should be able to use commit-slab instead of
> commit->util. We could go even further and kill "util" pointer but
> that's more work.

I would love it if we could kill the util pointer in favor of
commit-slab. Unfortunately the fast-export case is decorating non-commit
objects, too.

I'm not sure how possible/easy it would be to have an "object slab".
Some complications are:

  1. It would increase the size of "struct tree" and "struct blob" by at
     least 4 bytes (possibly 8, due to alignment) to store the slab
     index. Commits would stay the same, but my experience is that most
     repositories have 5-10 times as many trees/blobs as commits.

     Some of that memory might be reclaimable. E.g., if we moved
     tree->buffer and tree->size into a slab, callers which don't use
     them would actually come out ahead (and more callers than you might
     expect could do this, since many only need to look at the tree
     contents inside a single function).

  2. If we allocate the indices for all types as a single sequence, then
     callers who only care about a specific type will have very sparse
     slabs (so they'll over-allocate, and it will have poor cache
     behavior).

     It might be possible to do something more clever. E.g., give each
     object type its own index counter, but then for a full object-slab,
     store per-type slabs and dereference obj->type to know which slab
     to look in.

     There'd be some complication with any_object. We'd probably need an
     OBJ_NONE slab, and then to allocate a type-specific index when the
     type is assigned. There's already a central point for this in
     object_as_type(). But we'd also have to migrate any
     previously-stored slab data (stored when it was OBJ_NONE), which
     implies having a global list of slabs.

So I dunno. It seems do-able but complicated.

-Peff

  reply	other threads:[~2018-05-11  8:56 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-09 21:06 [PATCH] fast-export: avoid NULL pointer arithmetic René Scharfe
2018-05-09 21:43 ` Johannes Schindelin
2018-05-10  9:24 ` René Scharfe
2018-05-10 10:51   ` Junio C Hamano
2018-05-10 19:47     ` René Scharfe
2018-05-11  2:16       ` Junio C Hamano
2018-05-11  4:49         ` Junio C Hamano
2018-05-11  6:19           ` Duy Nguyen
2018-05-11  8:56             ` Jeff King [this message]
2018-05-11 13:11               ` Duy Nguyen
2018-05-11 13:34                 ` Duy Nguyen
2018-05-11 17:42                   ` Jeff King
2018-05-12  8:45                     ` René Scharfe
2018-05-12  8:49                       ` René Scharfe
2018-05-14  1:37                       ` Junio C Hamano
2018-05-15 19:36                         ` René Scharfe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180511085634.GC22086@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=l.s.r@web.de \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).