From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: Kevin Willford <kcwillford@gmail.com>,
	git@vger.kernel.org, Kevin Willford <kewillf@microsoft.com>
Subject: Re: [[PATCH v2] 1/4] patch-ids: stop using a hand-rolled hashmap implementation
Date: Tue, 2 Aug 2016 12:30:36 +0200 (CEST)
Message-ID: <alpine.DEB.2.20.1608021013010.79248@virtualbox>
In-Reply-To: <xmqqy44gi7bp.fsf@gitster.mtv.corp.google.com>

Hi Junio,

On Mon, 1 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > It would be a serious bug if hashmap_entry_init() played games with
> > references, given its signature (that this function does not have any
> > access to the hashmap structure, only to the entry itself):
> >
> > 	void hashmap_entry_init(void *entry, unsigned int hash)
> 
> I do not think we are on the same page.  The "reference to other
> resource" I wondered was inside the hashmap_entry structure, IOW,
> "the entry itself".

Oh, I see now.

> Which is declared to be opaque to the API users,

Actually, not really. We cannot do that in C: we need to define the struct
in hashmap.h so that its size is known to the users.
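
To illustrate why the definition has to be visible, here is a minimal sketch.
`struct entry_like` and `struct patch_id_like` are made-up stand-ins for this
example, not the actual definitions in hashmap.h or patch-ids.h. The point is
only that embedding a struct by value requires a complete type:

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the entry struct. If the header only said
 * "struct entry_like;" (an incomplete type), the embedding below would
 * fail to compile, because the compiler could not know its size.
 */
struct entry_like {
	struct entry_like *next;
	unsigned int hash;
};

/* A user-defined struct that embeds the entry by value. */
struct patch_id_like {
	struct entry_like ent;
	int payload;
};

/* Returns the number of bytes the embedded entry occupies in the user struct. */
size_t embedded_entry_size(void)
{
	struct patch_id_like p;

	return sizeof(p.ent);
}
```

A truly opaque type would force users to hold only a pointer to the entry,
which would defeat the embed-by-value pattern.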

> so whoever defined that API cannot blame me for not checking its
> definition to see that it only has "unsigned int hash" and no allocated
> memory or open file descriptor in it that needs freeing.

That is the reason, I guess, why we have the documentation in
Documentation/technical/api-hashmap.txt: it would have to talk about your
hypothetical hashmap_entry_clear() (which would better be named
*_release() BTW, unless I misunderstood what you want a hypothetical
future version of that function to do).

And quite frankly, unless we *have* to, I would rather try to avoid
introducing that function as much as possible, as it would make using the
hashmap API even more finicky than it already is.

> By the way, the first parameter of the function being "void *" is
> merely to help lazy API users, who have their own structure that
> embeds the hashmap_entry as its first element, as API documentation
> tells them to do, e.g.
> 
> 	struct foo {
> 		struct hashmap_entry e;
> 		... other "foo" specific fields come here ...
> 	} foo;
> 
> and because of the lazy "void *", they do not have to do this:
> 
> 	hashmap_entry_init(&foo->e, ...);
> 
> which would be required if the first parameter were "struct
> hashmap_entry *", but they can just do this:
> 
> 	hashmap_entry_init(&foo, ...);

Yes, I know that. It is the common way to simulate subclassing in C, for
lack of a more compile-safe construct.
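
A minimal sketch of that pattern, under the assumption that the entry is the
first member. `struct entry_like`, `entry_like_init()` and `struct foo` are
hypothetical names for this example and only mimic the shape of the hashmap
API. C guarantees that a pointer to a struct, suitably converted, points to
its first member, which is what makes the `void *` parameter work:

```c
#include <stddef.h>

/* Hypothetical stand-in for the "base class". */
struct entry_like {
	struct entry_like *next;
	unsigned int hash;
};

/* Takes void * so callers can pass the embedding struct directly. */
static void entry_like_init(void *entry, unsigned int hash)
{
	struct entry_like *e = entry;

	e->hash = hash;
	e->next = NULL;
}

/* The "subclass": the entry must be the first member. */
struct foo {
	struct entry_like e;
	int value;
};

/* The compiler cannot check the void * call, so guard the layout instead. */
_Static_assert(offsetof(struct foo, e) == 0,
	       "entry must be the first member of struct foo");

/* Initializes the embedded entry through a pointer to the whole struct. */
unsigned int init_foo_hash(struct foo *f, unsigned int hash)
{
	entry_like_init(f, hash);	/* same address as &f->e */
	return f->e.hash;
}
```

The static assertion is an addition for this sketch; it is one way to recover
some of the compile-time safety that the `void *` parameter gives up.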

> I have a slight preference to avoid the lazy "void *", but that is
> an unrelated tangent.

Oh, we are already safely in Unrelated Tangent Land for a while, I would
think. Nothing of what we are discussing in this thread has anything to do
with Kevin's patch series, which is about trying to use resources more
sensibly when using the revision machinery's --cherry-pick option.

And since we are already there, I'll offer an opinion in favor of
`void *`: doing the &foo->e dance could quite possibly suggest that `e` is
a field just like any other field (and does not necessarily *need* to be
the first).

But again, this has nothing to do with the patch series we are discussing
here.

> >> The fact that hashmap_entry_init() is there but there is no
> >> corresponding hashmap_entry_clear() hints that there is nothing to be
> >> worried about and I can see from the implementation of
> >> hashmap_entry_init() that no extra resource is held inside, but an
> >> API user should not have to guess.  We may want to do one of the two
> >> things:
> >> 
> > >>  * document that an embedded hashmap_entry does not hold any
> > >>    resources that need to be released and it is safe to free the user
> > >>    structure that embeds one; or
> >> 
> >>  * implement hashmap_entry_clear() that currently is a no-op.
> >
> > Urgh. The only reason we have hashmap_entry_init() is that we *may* want
> > to extend `struct hashmap_entry` at some point. That is *already*
> > over-engineered because that point in time seems quite unlikely to arrive,
> > like, ever.
> 
> I am saying that an uneven over-engineering is bad.

Hmm. I guess that the _init() function could be replaced by an _INIT macro
a la STRBUF_INIT. Not sure it is really worth the effort, though.
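
For illustration only, such a macro might look roughly like the sketch below.
`ENTRY_LIKE_INIT` and `struct entry_like` are hypothetical names, not anything
that exists in hashmap.h. Unlike STRBUF_INIT, the macro would need to take the
hash as an argument:

```c
#include <stddef.h>

/* Hypothetical stand-in for the entry struct. */
struct entry_like {
	struct entry_like *next;
	unsigned int hash;
};

/* Hypothetical initializer macro: no next entry yet, hash as given. */
#define ENTRY_LIKE_INIT(h) { NULL, (h) }

struct foo {
	struct entry_like e;
	int value;
};

/* Builds a struct foo with the macro and returns the stored hash. */
unsigned int make_foo_hash(unsigned int hash)
{
	struct foo f = { ENTRY_LIKE_INIT(hash), 0 };

	return f.e.hash;
}
```

An initializer like this only works at the point of declaration, so code that
fills in an already allocated struct would still need a function.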

Ciao,
Dscho
