git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Nieder <jrnieder@gmail.com>
To: Jeff King <peff@peff.net>
Cc: "Martin Ågren" <martin.agren@gmail.com>,
	git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>
Subject: Re: [PATCH] revision: replace "struct cmdline_pathspec" with argv_array
Date: Wed, 20 Sep 2017 20:49:00 -0700	[thread overview]
Message-ID: <20170921034900.GB88098@aiede.mtv.corp.google.com> (raw)
In-Reply-To: <20170921030424.akqaou7tqj2updgr@sigill.intra.peff.net>

Hi,

Jeff King wrote:

> But mostly I am fundamentally against using UNLEAK() in a case like
> this, because it does not match either of the properties which justified
> adding UNLEAK() in the first place:
>
>   1. We are about to exit the program, so the "leak" is only caused by
>      the memory going out of scope at that exit.
>
>      By contrast, the revision machinery may be called many times in the
>      same program.
>
>   2. The memory remains useful until around the time of program exit.
>
>      This most certainly does not, as it would not even be reachable.
[...]
> On Wed, Sep 20, 2017 at 03:48:26PM -0700, Jonathan Nieder wrote:

>> In other words, proposed changes:
>>
>>  1. Could the commit message describe what effect this would have on
>>     maximum heap usage, if any?  (In qualitative terms is fine, though
>>     actual numbers would be even better if it's easy to get them.)
>>     That would make it easier to justify not using UNLEAK.
>
> What wording are you looking for? It was a leak, and now it's gone.  The
> size of the leak depends on how much you feed to --stdin. IMHO using
> UNLEAK is totally inappropriate for this case, and doesn't even seem
> like an alternative worth rejecting.
>
>>  2. Can parse_pathspec get a comment in pathspec.h saying that it
>>     defensively copies anything it needs from args so the caller is
>>     free to modify or free it?  That way, it should be more obvious
>>     to people in the future modifying parse_pathspec() that callers
>>     may rely on that.  (The current API comment describes argv as
>>     "command line arguments", which I fear would send the opposite
>>     message to implementors.)
>
> I certainly agree that the pathspec interface could use better
> documentation. Patches welcome? :)

I think I failed at communicating here.  That is not what I meant at
all.

The context is that Git (especially older parts of it) suffers from a
pretty severe lack of API documentation.  For newcomers that is
especially obvious --- long-time git contributors, on the other hand,
may get used to patterns and common interfaces and may have trouble
seeing just how bad the lack of clearly communicated API contracts is.

There is a bit of a "broken window" problem, too: authors of one-off
patches may reasonably assume from existing code that this is just the
way it is and, lacking examples of how to document an API, add more
underdocumented API contracts.

The patch I am replying to tightens the contract for parse_pathspec().
I am not asking for comprehensive documentation of that function ---
that would be clearly off-topic for the patch.  Instead, I am saying
that we should document what we are newly requiring of the function in
the patch.  That way, the documented contract becomes clearer over
time as people document what they are relying on.  I think of that as
generally a good practice.

In other words, it was not at all obvious that "(2) The memory remains
useful until around the time of program exit" did not hold.  To a
casual reader it instead looks like (2) does hold, and there's no
documentation short of delving into the implementation of
parse_pathspec() to convince a reader otherwise.  The documentation
that is present leads to the opposite conclusion.

The assertion (1) that this allocation is going to happen multiple
times in a program isn't true either.  As you noted, we only read
stdin once.  But that doesn't matter as long as (2) doesn't hold.

TBH saying that I should write the one-sentence doc patch feels like
a cop-out.  Just like it is not sustainable for those reviewers that
are interested in good test coverage to be the only ones who write
tests, I think we cannot avoid treating documentation of API contracts
as a shared responsibility.

Thanks and hope that clarifies,
Jonathan

  reply	other threads:[~2017-09-21  3:50 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-20 19:47 [PATCH] revision: fix memory leaks with `struct cmdline_pathspec` Martin Ågren
2017-09-20 20:25 ` Jeff King
2017-09-20 20:36   ` [PATCH] revision: replace "struct cmdline_pathspec" with argv_array Jeff King
2017-09-20 22:48     ` Jonathan Nieder
2017-09-21  3:04       ` Jeff King
2017-09-21  3:49         ` Jonathan Nieder [this message]
2017-09-21  4:48           ` Jeff King
2017-09-21  4:41         ` Jonathan Nieder
2017-09-21  4:50           ` Jeff King
2017-09-21  5:10           ` Junio C Hamano
2017-09-21  3:57     ` Martin Ågren
2017-09-21  4:11     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170921034900.GB88098@aiede.mtv.corp.google.com \
    --to=jrnieder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=martin.agren@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).