From: Jeff Hostetler <>
To: Junio C Hamano <>, Josh Steadmon <>
Subject: Re: [PATCH] run-command: don't spam trace2_child_exit()
Date: Tue, 3 May 2022 10:59:52 -0400
Message-ID: <>
In-Reply-To: <xmqqr15gev94.fsf@gitster.g>

On 4/28/22 5:46 PM, Junio C Hamano wrote:
> Josh Steadmon <> writes:
>> In rare cases, wait_or_whine() cannot determine a child process's exit
>> status (and will return -1 in this case). This can cause Git to issue
>> trace2 child_exit events despite the fact that the child is still
>> running.

I'm curious what is causing the spurious return values.
Could you instrument wait_or_whine() and see which of the
if/else arms is returning the -1?

That routine is rather complicated and looks like it has 3
different ways that a -1 could be returned.
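
To illustrate the kind of instrumentation I mean, here is a minimal
standalone sketch. It is not the actual wait_or_whine() from
run-command.c; it is a simplified waitpid() wrapper, and the names
wait_and_classify() and enum wait_arm are hypothetical. It records
which arm was taken, so a -1 result can be attributed to a waitpid()
failure, an unexpected pid, or a status that is neither "exited" nor
"signaled".

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical labels for the arm taken; the last three leave code at -1. */
enum wait_arm {
	ARM_EXITED,         /* normal exit; code is the exit status */
	ARM_SIGNALED,       /* killed by a signal; code is 128 + signal number */
	ARM_WAITPID_ERROR,  /* waitpid() itself failed, e.g. with ECHILD */
	ARM_WRONG_PID,      /* waitpid() returned a pid we did not ask for */
	ARM_UNKNOWN_STATUS  /* status is neither WIFEXITED nor WIFSIGNALED */
};

/*
 * Wait for `pid`, store the arm taken in `*arm`, and return the exit
 * code, or -1 if the exit status could not be determined.
 */
static int wait_and_classify(pid_t pid, enum wait_arm *arm)
{
	int status = 0, code = -1, saved_errno = 0;
	pid_t waiting;

	/* Retry when the wait is interrupted by a signal. */
	while ((waiting = waitpid(pid, &status, 0)) < 0 && errno == EINTR)
		; /* nothing */

	if (waiting < 0) {
		saved_errno = errno;
		*arm = ARM_WAITPID_ERROR;
	} else if (waiting != pid) {
		*arm = ARM_WRONG_PID;
	} else if (WIFSIGNALED(status)) {
		*arm = ARM_SIGNALED;
		code = WTERMSIG(status) + 128;
	} else if (WIFEXITED(status)) {
		*arm = ARM_EXITED;
		code = WEXITSTATUS(status);
	} else {
		*arm = ARM_UNKNOWN_STATUS;
	}

	/* The instrumentation: report which arm produced the -1. */
	if (code == -1)
		fprintf(stderr, "pid %ld: -1 from arm %d (errno %d)\n",
			(long)pid, (int)*arm, saved_errno);

	errno = saved_errno;
	return code;
}
```

A caller that sees -1 can then check *arm (or the stderr line) to tell
a failed waitpid() apart from a confusing status value.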

> Rather, we do not even know if the child is still running when it
> happens, right?  It is curious what "rare cases" makes the symptom
> appear.  Do we know?
> The patch looks OK from the "we do not know the child exited in this
> case, so we shouldn't be reporting the child exit" point of view, of
> course.  Having one event that started a child in the log and then
> having millions of events that report the exit of the (same) child
> is way too broken.  With this change, we remove these phoney exit
> events from the log.
> Do we know whether, for a child process that caused these millions of
> phoney exit events, we got a real exit event at the end?  Otherwise,
> we'd still have a similar problem in the opposite direction, i.e. a
> child has a start event recorded and many exit events discarded, but
> the log lacks the true exit event for the child, implying that the
> child is still running because we failed to log its exit.
>>   int finish_command_in_signal(struct child_process *cmd)
>>   {
>>   	int ret = wait_or_whine(cmd->pid, cmd->args.v[0], 1);
>> -	trace2_child_exit(cmd, ret);
>> +	if (ret != -1)
>> +		trace2_child_exit(cmd, ret);
>>   	return ret;
>>   }

Since this is only called from pager.c and is used to set up the
pager, I have to wonder whether you're getting these spurious events
only for the pager process or for other child processes as well.

I also wonder whether they are received while the pager is alive and
working properly, when you're trying to quit the pager, or when the
pager is trying to signal EOF.

> Will queue; thanks.
