git@vger.kernel.org mailing list mirror (one of many)
From: Ben Peart <Ben.Peart@microsoft.com>
To: Jonathan Tan <jonathantanmy@google.com>,
	Ben Peart <peartben@gmail.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: RE: [PATCH v2 2/2] sub-process: refactor the filter process code into a reusable module
Date: Mon, 27 Mar 2017 23:54:36 +0000
Message-ID: <BL2PR03MB3232F7FE70707532B5FB178F4330@BL2PR03MB323.namprd03.prod.outlook.com>
In-Reply-To: <c8f85b4a-e69e-76ab-9a8d-66857968fb4d@google.com>

> -----Original Message-----
> From: Jonathan Tan [mailto:jonathantanmy@google.com]
> Sent: Monday, March 27, 2017 3:00 PM
> To: Ben Peart <peartben@gmail.com>; git@vger.kernel.org
> Cc: Ben Peart <Ben.Peart@microsoft.com>
> Subject: Re: [PATCH v2 2/2] sub-process: refactor the filter process code into
> a reusable module
> 
> On 03/24/2017 08:27 AM, Ben Peart wrote:
> > Refactor the filter.<driver>.process code into a separate sub-process
> > module that can be used to reduce the cost of starting up a
> > sub-process for multiple commands.  It does this by keeping the
> > external process running and processing all commands by communicating
> > over standard input and standard output using the packet format
> > (pkt-line) based protocol.
> > Full documentation is in Documentation/technical/api-sub-process.txt.
> 
> Thanks - this looks like something useful to have.

Thanks for the review and feedback.

> 
> When you create a "struct subprocess_entry" to be entered into the system,
> it is not a true "struct subprocess_entry" - it is a "struct subprocess_entry"
> plus some extra variables at the end. Since the sub-process hashmap is
> keyed solely on the command, what happens if another component uses the
> same trick (but with different extra variables) when using a sub-process
> with the same command?

Keying on the command alone is sufficient: the command is executed as a
process by run_command, and there can't be multiple different processes
with the same name.

> 
> I can think of at least two ways to solve this: (i) each component can have its
> own sub-process hashmap, or (ii) add a component key to the hashmap. (i)
> seems more elegant to me, but I'm not sure what the code will look like.
> 
> Also, I saw some minor code improvements possible (e.g. using "starts_with"
> when you're checking for the "status=<foo>" line) but I'll comment on those
> and look into the code more thoroughly once the questions in this e-mail are
> resolved.
> 
> > diff --git a/sub-process.h b/sub-process.h
> > new file mode 100644
> > index 0000000000..d1492f476d
> > --- /dev/null
> > +++ b/sub-process.h
> > @@ -0,0 +1,46 @@
> > +#ifndef SUBPROCESS_H
> > +#define SUBPROCESS_H
> > +
> > +#include "git-compat-util.h"
> > +#include "hashmap.h"
> > +#include "run-command.h"
> > +
> > +/*
> > + * Generic implementation of background process infrastructure.
> > + * See Documentation/technical/api-background-process.txt.
> > + */
> > +
> > + /* data structures */
> > +
> > +struct subprocess_entry {
> > +	struct hashmap_entry ent; /* must be the first member! */
> > +	struct child_process process;
> > +	const char *cmd;
> > +};
> 
> I notice from the documentation (and from "subprocess_get_child_process"
> below) that this is meant to be opaque, but I think this can be non-opaque
> (like "run-command").
> 
> Also, I would prefer adding a "util" pointer here instead of using it as an
> embedded struct. There is no clue here that it is embeddable or meant to be
> embedded.
> 

The structure is intentionally opaque to provide the benefits of
encapsulation.  Obviously, the C language doesn't enforce that design
principle, but we do what we can.

The embedded struct follows the same design pattern used elsewhere in
git (hashmap, for example), simply for consistency.
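
To illustrate the embedding pattern, here is a minimal sketch.  It is
not the code from the patch: the struct below is a simplified stand-in
for "struct subprocess_entry", and example_entry, to_generic() and
to_example() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the generic entry; not the real git struct. */
struct subprocess_entry {
	unsigned int hash;	/* stands in for struct hashmap_entry */
	const char *cmd;
};

/* Hypothetical caller-specific entry: generic part first, extras after. */
struct example_entry {
	struct subprocess_entry subprocess;	/* must be the first member! */
	unsigned int supported_capabilities;
};

/* Upcast: hand the generic part to the generic code. */
static struct subprocess_entry *to_generic(struct example_entry *e)
{
	return &e->subprocess;
}

/*
 * Downcast: only valid when the generic part is the first member and
 * the entry really was allocated as a struct example_entry.
 */
static struct example_entry *to_example(struct subprocess_entry *entry)
{
	assert(offsetof(struct example_entry, subprocess) == 0);
	return (struct example_entry *)entry;
}
```

The generic code only ever sees the "struct subprocess_entry" part; the
caller casts back to its own type to reach the extra fields.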

> > +
> > +/* subprocess functions */
> > +
> > +typedef int(*subprocess_start_fn)(struct subprocess_entry *entry);
> > +int subprocess_start(struct subprocess_entry *entry, const char *cmd,
> > +		subprocess_start_fn startfn);
> 
> I'm not sure if it is useful to take a callback here - I think the caller of this
> function can just run whatever it wants after a successful subprocess_start.

The purpose of doing the subprocess-specific initialization via a
callback is so that, if it encounters an error (for example, it can't
negotiate a common interface version), subprocess_start can detect that
and ensure the hashmap does not contain the invalid/unusable subprocess.
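
Roughly, the idea looks like the sketch below.  This is a simplified
model, not the patch itself: it does not launch a real process, a small
array stands in for the hashmap, and every name in it (entry,
start_entry, find_entry, handshake_ok, handshake_fail) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Simplified stand-in for struct subprocess_entry (no real process). */
struct entry {
	const char *cmd;
};

typedef int (*start_fn)(struct entry *entry);

/* Toy registry standing in for the hashmap keyed on the command. */
static struct entry *registry[MAX_ENTRIES];

static struct entry *find_entry(const char *cmd)
{
	size_t i;

	for (i = 0; i < MAX_ENTRIES; i++)
		if (registry[i] && !strcmp(registry[i]->cmd, cmd))
			return registry[i];
	return NULL;
}

static void remove_entry(struct entry *entry)
{
	size_t i;

	for (i = 0; i < MAX_ENTRIES; i++)
		if (registry[i] == entry)
			registry[i] = NULL;
}

/* Register the entry, then run the caller's initialization callback. */
static int start_entry(struct entry *entry, const char *cmd, start_fn startfn)
{
	size_t i;
	int err;

	assert(startfn != NULL);
	entry->cmd = cmd;
	for (i = 0; i < MAX_ENTRIES && registry[i]; i++)
		; /* find a free slot */
	if (i == MAX_ENTRIES)
		return -1;
	registry[i] = entry;

	err = startfn(entry);
	if (err)
		remove_entry(entry); /* never leave an unusable entry behind */
	return err;
}

/* Hypothetical callbacks: one handshake succeeds, one fails. */
static int handshake_ok(struct entry *entry)
{
	(void)entry;
	return 0;
}

static int handshake_fail(struct entry *entry)
{
	(void)entry;
	return -1;
}
```

Because the callback runs inside the start function, a failed handshake
is cleaned up in one place and a later lookup for that command finds
nothing.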

> 
> Alternatively, if you add the "util" pointer (as I described above), then it
> makes sense to add a subprocess_get_or_start() function (and now it makes
> sense to take the callback). This way, the data structure will own, create, and
> destroy all the "struct subprocess_entry" that it needs, creating a nice
> separation of concerns.
> 
> > +
> > +void subprocess_stop(struct subprocess_entry *entry);
> 
> (continued from above) And it would be clear that this would free
> "entry", for example.
> 
> > +
> > +struct subprocess_entry *subprocess_find_entry(const char *cmd);
> > +
> > +/* subprocess helper functions */
> > +
> > +static inline struct child_process *subprocess_get_child_process(
> > +		struct subprocess_entry *entry)
> > +{
> > +	return &entry->process;
> > +}
> > +
> > +/*
> > + * Helper function that will read packets looking for "status=<foo>"
> > + * key/value pairs and return the value from the last "status" packet
> > + */
> > +
> > +int subprocess_read_status(int fd, struct strbuf *status);
> > +
> > +#endif
> >
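
For reference, the "last status wins" behavior described in the comment
on subprocess_read_status can be sketched as below.  This assumes the
packets have already been decoded into an array of strings; the real
function reads pkt-lines from fd into a strbuf, and last_status() and
its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: scan already-decoded packet lines and return the
 * value of the last "status=<foo>" line, or NULL if there is none.
 */
static const char *last_status(const char **lines, size_t nr)
{
	static const char prefix[] = "status=";
	const char *value = NULL;
	size_t i;

	assert(lines != NULL || nr == 0);
	for (i = 0; i < nr; i++)
		if (!strncmp(lines[i], prefix, strlen(prefix)))
			value = lines[i] + strlen(prefix); /* later lines win */
	return value;
}
```

The prefix comparison plays the same role as the "starts_with" check
suggested in the review.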
