git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Ben Peart <peartben@gmail.com>
To: git@vger.kernel.org
Cc: benpeart@microsoft.com, christian.couder@gmail.com,
	larsxschneider@gmail.com
Subject: [PATCH v1 2/3] sub-process: refactor the filter process code into a reusable module
Date: Wed, 22 Mar 2017 12:52:19 -0400	[thread overview]
Message-ID: <20170322165220.5660-3-benpeart@microsoft.com> (raw)
In-Reply-To: <20170322165220.5660-1-benpeart@microsoft.com>

Create a new sub-process module that can be used to reduce the cost of
starting up a sub-process for multiple commands.  It does this by
keeping the external process running and processing all commands by
communicating over standard input and standard output using the packet
format (pkt-line) based protocol.  Full documentation is contained in
Documentation/technical/api-sub-process.txt.

This code is refactored from:

	Commit edcc85814c ("convert: add filter.<driver>.process option", 2016-10-16)
	keeps the external process running and processes all commands

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Documentation/technical/api-sub-process.txt |  55 ++++++++++++++
 Makefile                                    |   1 +
 sub-process.c                               | 113 ++++++++++++++++++++++++++++
 sub-process.h                               |  46 +++++++++++
 4 files changed, 215 insertions(+)
 create mode 100644 Documentation/technical/api-sub-process.txt
 create mode 100644 sub-process.c
 create mode 100644 sub-process.h

diff --git a/Documentation/technical/api-sub-process.txt b/Documentation/technical/api-sub-process.txt
new file mode 100644
index 0000000000..8471875611
--- /dev/null
+++ b/Documentation/technical/api-sub-process.txt
@@ -0,0 +1,55 @@
+sub-process API
+===============
+
+The sub-process API makes it possible to run background sub-processes
+that should run until the git command exits and communicate with it
+through stdin and stdout.  This reduces the overhead of having to fork
+a new process each time it needs to be communicated with.
+
+The sub-processes are kept in a hashmap by command name and looked up
+via the subprocess_find_entry function.  If an existing instance can not
+be found then a new process should be created and started.  When the
+parent git command terminates, all sub-processes are also terminated.
+
+This API is based on the run-command API.
+
+Data structures
+---------------
+
+* `struct subprocess_entry`
+
+The sub-process structure.  Members should not be accessed directly.
+
+Types
+-----
+
+'int(*subprocess_start_fn)(struct subprocess_entry *entry)'::
+
+	User-supplied function to initialize the sub-process.  This is
+	typically used to negoiate the interface version and capabilities.
+
+
+Functions
+---------
+
+`subprocess_start`::
+
+	Start a subprocess and add it to the subprocess hashmap.
+
+`subprocess_stop`::
+
+	Kill a subprocess and remove it from the subprocess hashmap.
+
+`subprocess_find_entry`::
+
+	Find a subprocess in the subprocess hashmap.
+
+`subprocess_get_child_process`::
+
+	Get the underlying `struct child_process` from a subprocess.
+
+`subprocess_read_status`::
+
+	Helper function to read packets looking for the last "status=<foo>"
+	key/value pair.
+
diff --git a/Makefile b/Makefile
index a5a11e721a..8afe733092 100644
--- a/Makefile
+++ b/Makefile
@@ -830,6 +830,7 @@ LIB_OBJS += streaming.o
 LIB_OBJS += string-list.o
 LIB_OBJS += submodule.o
 LIB_OBJS += submodule-config.o
+LIB_OBJS += sub-process.o
 LIB_OBJS += symlinks.o
 LIB_OBJS += tag.o
 LIB_OBJS += tempfile.o
diff --git a/sub-process.c b/sub-process.c
new file mode 100644
index 0000000000..02050b6867
--- /dev/null
+++ b/sub-process.c
@@ -0,0 +1,113 @@
+/*
+ * Generic implementation of background process infrastructure.
+ */
+#include "sub-process.h"
+#include "sigchain.h"
+#include "pkt-line.h"
+
+static int subprocess_map_initialized;
+static struct hashmap subprocess_map;
+
+static int name2process_cmp(const struct subprocess_entry *e1,
+		const struct subprocess_entry *e2, const void *unused)
+{
+	return strcmp(e1->cmd, e2->cmd);
+}
+
+static void subprocess_exit_handler(struct child_process *process)
+{
+	sigchain_push(SIGPIPE, SIG_IGN);
+	/* Closing the pipe signals the filter to initiate a shutdown. */
+	close(process->in);
+	close(process->out);
+	sigchain_pop(SIGPIPE);
+	/* Finish command will wait until the shutdown is complete. */
+	finish_command(process);
+}
+
+int subprocess_start(struct subprocess_entry *entry, const char *cmd,
+		subprocess_start_fn startfn)
+{
+	int err;
+	const char *argv[] = { cmd, NULL };
+
+	if (!subprocess_map_initialized) {
+		hashmap_init(&subprocess_map, (hashmap_cmp_fn)name2process_cmp, 0);
+		subprocess_map_initialized = 1;
+	}
+
+	entry->cmd = cmd;
+
+	child_process_init(&entry->process);
+	entry->process.argv = argv;
+	entry->process.use_shell = 1;
+	entry->process.in = -1;
+	entry->process.out = -1;
+	entry->process.clean_on_exit = 1;
+	entry->process.clean_on_exit_handler = subprocess_exit_handler;
+
+	err = start_command(&entry->process);
+	if (err) {
+		error("cannot fork to run sub-process '%s'", entry->cmd);
+		return err;
+	}
+
+	err = startfn(entry);
+	if (err) {
+		error("initialization for sub-process '%s' failed", entry->cmd);
+		subprocess_stop(entry);
+		return err;
+	}
+
+	hashmap_entry_init(entry, strhash(entry->cmd));
+	hashmap_add(&subprocess_map, entry);
+
+	return 0;
+}
+
+void subprocess_stop(struct subprocess_entry *entry)
+{
+	if (!entry)
+		return;
+
+	entry->process.clean_on_exit = 0;
+	kill(entry->process.pid, SIGTERM);
+	finish_command(&entry->process);
+
+	hashmap_remove(&subprocess_map, entry, NULL);
+}
+
+struct subprocess_entry *subprocess_find_entry(const char *cmd)
+{
+	struct subprocess_entry key;
+
+	if (!subprocess_map_initialized) {
+		hashmap_init(&subprocess_map, (hashmap_cmp_fn)name2process_cmp, 0);
+		subprocess_map_initialized = 1;
+		return NULL;
+	}
+
+	hashmap_entry_init(&key, strhash(cmd));
+	key.cmd = cmd;
+	return hashmap_get(&subprocess_map, &key, NULL);
+}
+
+void subprocess_read_status(int fd, struct strbuf *status)
+{
+	struct strbuf **pair;
+	char *line;
+	for (;;) {
+		line = packet_read_line(fd, NULL);
+		if (!line)
+			break;
+		pair = strbuf_split_str(line, '=', 2);
+		if (pair[0] && pair[0]->len && pair[1]) {
+			/* the last "status=<foo>" line wins */
+			if (!strcmp(pair[0]->buf, "status=")) {
+				strbuf_reset(status);
+				strbuf_addbuf(status, pair[1]);
+			}
+		}
+		strbuf_list_free(pair);
+	}
+}
diff --git a/sub-process.h b/sub-process.h
new file mode 100644
index 0000000000..235f1e5fa3
--- /dev/null
+++ b/sub-process.h
@@ -0,0 +1,46 @@
+#ifndef SUBPROCESS_H
+#define SUBPROCESS_H
+
+#include "git-compat-util.h"
+#include "hashmap.h"
+#include "run-command.h"
+
+/*
+ * Generic implementation of background process infrastructure.
+ * See Documentation/technical/api-background-process.txt.
+ */
+
+ /* data structures */
+
+struct subprocess_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	struct child_process process;
+	const char *cmd;
+};
+
+/* subprocess functions */
+
+typedef int(*subprocess_start_fn)(struct subprocess_entry *entry);
+int subprocess_start(struct subprocess_entry *entry, const char *cmd,
+		subprocess_start_fn startfn);
+
+void subprocess_stop(struct subprocess_entry *entry);
+
+struct subprocess_entry *subprocess_find_entry(const char *cmd);
+
+/* subprocess helper functions */
+
+static inline struct child_process *subprocess_get_child_process(
+		struct subprocess_entry *entry)
+{
+	return &entry->process;
+}
+
+/*
+ * Helper function that will read packets looking for "status=<foo>"
+ * key/value pairs and return the value from the last "status" packet
+ */
+
+void subprocess_read_status(int fd, struct strbuf *status);
+
+#endif
-- 
2.12.0.gvfs.1.42.g0b7328eac2


  parent reply	other threads:[~2017-03-22 17:02 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-22 16:52 [PATCH v1 0/3] Add support for downloading blobs on demand Ben Peart
2017-03-22 16:52 ` [PATCH v1 1/3] pkt-line: add packet_write_list_gently() Ben Peart
2017-03-22 20:21   ` Junio C Hamano
2017-03-24 12:34     ` Ben Peart
2017-03-22 16:52 ` Ben Peart [this message]
2017-03-23  6:16   ` [PATCH v1 2/3] sub-process: refactor the filter process code into a reusable module Junio C Hamano
2017-03-24 12:39     ` Ben Peart
2017-03-24 16:10       ` Junio C Hamano
2017-03-24 17:15         ` Junio C Hamano
2017-03-27 22:04           ` Ben Peart
2017-03-22 16:52 ` [PATCH v1 3/3] convert: use new sub-process module for filter processes Ben Peart
2017-03-25 11:59 ` [PATCH v1 0/3] Add support for downloading blobs on demand Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170322165220.5660-3-benpeart@microsoft.com \
    --to=peartben@gmail.com \
    --cc=benpeart@microsoft.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=larsxschneider@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).