git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Eric DeCosta via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Eric Sunshine <sunshine@sunshineco.com>,
	Glen Choo <chooglen@google.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Taylor Blau <me@ttaylorr.com>,
	Eric DeCosta <edecosta@mathworks.com>
Subject: Re: [PATCH v4 3/6] fsmonitor: implement filesystem change listener for Linux
Date: Mon, 12 Dec 2022 11:42:26 +0100	[thread overview]
Message-ID: <221212.86h6y06gca.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <80282efef5779728022a317388da6790e5db44e2.1669230044.git.gitgitgadget@gmail.com>


On Wed, Nov 23 2022, Eric DeCosta via GitGitGadget wrote:

> From: Eric DeCosta <edecosta@mathworks.com>
> [...]
> +}
> +
> +/*
> + * Remove the inotify watch, the watch descriptor to path mapping
> + * and the reverse mapping.
> + */
> +static void remove_watch(struct watch_entry *w,
> +	struct fsm_listen_data *data)
> +{
> +	struct watch_entry k1, k2, *w1, *w2;
> +
> +	/* remove watch, ignore error if kernel already did it */
> +	if (inotify_rm_watch(data->fd_inotify, w->wd) && errno != EINVAL)
> +		error_errno("inotify_rm_watch() failed");

If we error_errno(), shouldn't this function have a return value?

> +
> +	hashmap_entry_init(&k1.ent, memhash(&w->wd, sizeof(int)));
> +	w1 = hashmap_remove_entry(&data->watches, &k1, ent, NULL);
> +	if (!w1)
> +		BUG("Double remove of watch for '%s'", w->dir);
> +
> +	if (w1->cookie)
> +		BUG("Removing watch for '%s' which has a pending rename", w1->dir);
> +
> +	hashmap_entry_init(&k2.ent, memhash(w->dir, strlen(w->dir)));
> +	w2 = hashmap_remove_entry(&data->revwatches, &k2, ent, NULL);
> +	if (!w2)
> +		BUG("Double remove of reverse watch for '%s'", w->dir);

For the BUG() additions: Start with lower-case in messages, see
CodingGuidelines. I.e. "double remove of ..." etc. Ditto below..

> [...]
> +/*
> + * Handle directory renames
> + *
> + * Once a IN_MOVED_TO event is received, lookup the rename tracking information
> + * via the event cookie and use this information to update the watch.
> + */
> +static void rename_dir(uint32_t cookie, const char *path,
> +	struct fsm_listen_data *data)
> +{
> +	struct rename_entry rek, *re;
> +	struct watch_entry k, *w;
> +
> +	/* lookup a pending rename to match */
> +	rek.cookie = cookie;
> +	hashmap_entry_init(&rek.ent, memhash(&rek.cookie, sizeof(uint32_t)));
> +	re = hashmap_get_entry(&data->renames, &rek, ent, NULL);
> +	if (re) {
> +		k.dir = re->dir;
> +		hashmap_entry_init(&k.ent, memhash(k.dir, strlen(k.dir)));
> +		w = hashmap_get_entry(&data->revwatches, &k, ent, NULL);
> +		if (w) {
> +			w->cookie = 0; /* rename handled */
> +			remove_watch(w, data);
> +			add_watch(path, data);

Elsewhere you check the add_watch() return value, but not here?

> +		} else {
> +			BUG("No matching watch");
> +		}
> +	} else {
> +		BUG("No matching cookie");
> +	}
> +}
> +
> +/*
> + * Recursively add watches to every directory under path
> + */
> +static int register_inotify(const char *path,
> +	struct fsmonitor_daemon_state *state,
> +	struct fsmonitor_batch *batch)
> +{
> +	DIR *dir;
> +	const char *rel;
> +	struct strbuf current = STRBUF_INIT;
> +	struct dirent *de;
> +	struct stat fs;
> +	int ret = -1;
> +
> +	dir = opendir(path);
> +	if (!dir)
> +		return error_errno("opendir('%s') failed", path);
> +
> +	while ((de = readdir_skip_dot_and_dotdot(dir)) != NULL) {

ditto drop the "!= NULL"

> +	struct strbuf msg = STRBUF_INIT;
> +
> +	if (mask & IN_ACCESS)
> +		strbuf_addstr(&msg, "IN_ACCESS|");
> +	if (mask & IN_MODIFY)
> +		strbuf_addstr(&msg, "IN_MODIFY|");
> +	if (mask & IN_ATTRIB)
> +		strbuf_addstr(&msg, "IN_ATTRIB|");
> +	if (mask & IN_CLOSE_WRITE)
> +		strbuf_addstr(&msg, "IN_CLOSE_WRITE|");
> +	if (mask & IN_CLOSE_NOWRITE)
> +		strbuf_addstr(&msg, "IN_CLOSE_NOWRITE|");
> +	if (mask & IN_OPEN)
> +		strbuf_addstr(&msg, "IN_OPEN|");
> +	if (mask & IN_MOVED_FROM)
> +		strbuf_addstr(&msg, "IN_MOVED_FROM|");
> +	if (mask & IN_MOVED_TO)
> +		strbuf_addstr(&msg, "IN_MOVED_TO|");
> +	if (mask & IN_CREATE)
> +		strbuf_addstr(&msg, "IN_CREATE|");
> +	if (mask & IN_DELETE)
> +		strbuf_addstr(&msg, "IN_DELETE|");
> +	if (mask & IN_DELETE_SELF)
> +		strbuf_addstr(&msg, "IN_DELETE_SELF|");
> +	if (mask & IN_MOVE_SELF)
> +		strbuf_addstr(&msg, "IN_MOVE_SELF|");
> +	if (mask & IN_UNMOUNT)
> +		strbuf_addstr(&msg, "IN_UNMOUNT|");
> +	if (mask & IN_Q_OVERFLOW)
> +		strbuf_addstr(&msg, "IN_Q_OVERFLOW|");
> +	if (mask & IN_IGNORED)
> +		strbuf_addstr(&msg, "IN_IGNORED|");
> +	if (mask & IN_ISDIR)
> +		strbuf_addstr(&msg, "IN_ISDIR|");

Shouldn't the very last addition to mask omit the "|"?

I think it would be worth just making this a NULL-delimited "int, const char *" array like:

	{
		{ IN_ACCESS, "IN_ACCESS" },
		...
		NULL,
	};

Then looping over it, or maybe not...

> +	trace_printf_key(&trace_fsmonitor, "inotify_event: '%s', mask=%#8.8x %s",
> +				path, mask, msg.buf);
> +
> +	strbuf_release(&msg);
> +}
> +
> +int fsm_listen__ctor(struct fsmonitor_daemon_state *state)
> +{
> +	int fd;
> +	int ret = 0;
> +	struct fsm_listen_data *data;
> +
> +	CALLOC_ARRAY(data, 1);
> +	state->listen_data = data;
> +	state->listen_error_code = -1;
> +	data->shutdown = SHUTDOWN_ERROR;
> +
> +	fd = inotify_init1(O_NONBLOCK);
> +	if (fd < 0)
> +		return error_errno("inotify_init1() failed");

Here you leak the "data" you just allocated on error.

> +
> +	data->fd_inotify = fd;
> +
> +	hashmap_init(&data->watches, watch_entry_cmp, NULL, 0);
> +	hashmap_init(&data->renames, rename_entry_cmp, NULL, 0);
> +	hashmap_init(&data->revwatches, revwatches_entry_cmp, NULL, 0);
> +
> +	if (add_watch(state->path_worktree_watch.buf, data))

I.e. we should only avoid free()-ing it if we can successfully hand it
over to add_watch(), shouldn't we?n

> +		ret = -1;
> +	else if (register_inotify(state->path_worktree_watch.buf, state, NULL))
> +		ret = -1;

We add {}'s to all if/else if branches if one needs it, see CodingGuidelines.
> +	else if (state->nr_paths_watching > 1) {
> +		if (add_watch(state->path_gitdir_watch.buf, data))
> +			ret = -1;
> +		else if (register_inotify(state->path_gitdir_watch.buf, state, NULL))
> +			ret = -1;

Can't this be:

	else if (state->nr_paths_watching > 1 &&
		 (add_watch(...) || register_inotify(...)))
		ret = -1;

> +	switch (fsmonitor_classify_path_absolute(state, path)) {
> +		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:

Don't indent the "case" label here more than the "switch", see
CodingGuidelines.


> +static void handle_events(struct fsmonitor_daemon_state *state)
> +{
> +	 /* See https://man7.org/linux/man-pages/man7/inotify.7.html */

Extra " " here after the "\t".

> +	char buf[4096]
> +		__attribute__ ((aligned(__alignof__(struct inotify_event))));
> +
> +	struct hashmap watches = state->listen_data->watches;
> +	struct fsmonitor_batch *batch = NULL;
> +	struct string_list cookie_list = STRING_LIST_INIT_DUP;
> +	struct watch_entry k, *w;
> +	struct strbuf path;
> +	const struct inotify_event *event;
> +	int fd = state->listen_data->fd_inotify;
> +	ssize_t len;
> +	char *ptr, *p;
> +
> +	strbuf_init(&path, PATH_MAX);

Just use "struct strbuf path = STRBUF_INIT" instead? I.e...
> +
> +	for(;;) {
> +		len = read(fd, buf, sizeof(buf));
> +		if (len == -1 && errno != EAGAIN) {
> +			error_errno(_("reading inotify message stream failed"));
> +			state->listen_data->shutdown = SHUTDOWN_ERROR;
> +			goto done;
> +		}
> +
> +		/* nothing to read */
> +		if (len <= 0)
> +			goto done;
> +
> +		/* Loop over all events in the buffer. */
> +		for (ptr = buf; ptr < buf + len;
> +			 ptr += sizeof(struct inotify_event) + event->len) {
> +
> +			event = (const struct inotify_event *) ptr;
> +
> +			if (em_ignore(event->mask))
> +				continue;
> +
> +			/* File system was unmounted or event queue overflowed */
> +			if (em_force_shutdown(event->mask)) {
> +				if (trace_pass_fl(&trace_fsmonitor))
> +					log_mask_set("Forcing shutdown", event->mask);
> +				state->listen_data->shutdown = SHUTDOWN_FORCE;
> +				goto done;
> +			}
> +
> +			hashmap_entry_init(&k.ent, memhash(&event->wd, sizeof(int)));
> +			k.wd = event->wd;
> +
> +			w = hashmap_get_entry(&watches, &k, ent, NULL);
> +			if (!w) /* should never happen */
> +				BUG("No watch for '%s'", event->name);
> +
> +			/* directory watch was removed */
> +			if (em_remove_watch(event->mask)) {
> +				remove_watch(w, state->listen_data);
> +				continue;
> +			}
> +
> +			strbuf_reset(&path);
> +			strbuf_add(&path, w->dir, strlen(w->dir));
> +			strbuf_addch(&path, '/');
> +			strbuf_addstr(&path, event->name);

... we may not even get to this, so we may pointlessly pre-grow it, if
we're considering the micro-optimization.

But if we do need it it'll quickly get up to size in the loop, which is
probably smaller than PATH_MAX, and paths can exceed PATH_MAX....

> +	for(;;) {
> +		switch (state->listen_data->shutdown) {
> +			case SHUTDOWN_CONTINUE:
> +				poll_num = poll(fds, 1, 1);
> +				if (poll_num == -1) {
> +					if (errno == EINTR)
> +						continue;
> +					error_errno(_("polling inotify message stream failed"));
> +					state->listen_data->shutdown = SHUTDOWN_ERROR;
> +					continue;
> +				}
> +
> +				if ((time(NULL) - checked) >= interval) {
> +					checked = time(NULL);

As this is linux-specific code, shouldn't we use the linux-specific API
to get the guaranteed atomically growing time here, arther than time()?

Maybe not, but I wonder if this has funny side-effects with ntpd
adjusting the time concurrently...

  reply	other threads:[~2022-12-12 11:09 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-09 14:37 [PATCH 00/12] fsmonitor: Implement fsmonitor for Linux Eric DeCosta via GitGitGadget
2022-10-09 14:37 ` [PATCH 01/12] fsmonitor: refactor filesystem checks to common interface Eric DeCosta via GitGitGadget
2022-10-18 12:29   ` Ævar Arnfjörð Bjarmason
2022-10-09 14:37 ` [PATCH 02/12] fsmonitor: relocate socket file if .git directory is remote Eric DeCosta via GitGitGadget
2022-10-09 14:37 ` [PATCH 03/12] fsmonitor: avoid socket location check if using hook Eric DeCosta via GitGitGadget
2022-10-09 14:37 ` [PATCH 04/12] fsmonitor: deal with synthetic firmlinks on macOS Eric DeCosta via GitGitGadget
2022-10-09 14:37 ` [PATCH 05/12] fsmonitor: check for compatability before communicating with fsmonitor Eric DeCosta via GitGitGadget
2022-10-09 14:37 ` [PATCH 06/12] fsmonitor: add documentation for allowRemote and socketDir options Eric DeCosta via GitGitGadget
2022-10-09 14:37 ` [PATCH 07/12] fsmonitor: prepare to share code between Mac OS and Linux Eric DeCosta via GitGitGadget
2022-10-09 22:36   ` Junio C Hamano
2022-10-10 16:23   ` Jeff Hostetler
2022-10-09 14:37 ` [PATCH 08/12] fsmonitor: determine if filesystem is local or remote Eric DeCosta via GitGitGadget
2022-10-10 10:04   ` Ævar Arnfjörð Bjarmason
2022-10-14 19:38     ` Eric DeCosta
2022-10-09 14:37 ` [PATCH 09/12] fsmonitor: implement filesystem change listener for Linux Eric DeCosta via GitGitGadget
2022-10-09 14:37 ` [PATCH 10/12] fsmonitor: enable fsmonitor " Eric DeCosta via GitGitGadget
2022-10-09 14:37 ` [PATCH 11/12] fsmonitor: test updates Eric DeCosta via GitGitGadget
2022-10-18 12:42   ` Ævar Arnfjörð Bjarmason
2022-10-09 14:37 ` [PATCH 12/12] fsmonitor: update doc for Linux Eric DeCosta via GitGitGadget
2022-10-18 12:43   ` Ævar Arnfjörð Bjarmason
2022-10-09 22:24 ` [PATCH 00/12] fsmonitor: Implement fsmonitor " Junio C Hamano
2022-10-10  0:19   ` Eric Sunshine
2022-10-10 21:08 ` Junio C Hamano
2022-10-14 21:45 ` [PATCH v2 " Eric DeCosta via GitGitGadget
2022-10-14 21:45   ` [PATCH v2 01/12] fsmonitor: refactor filesystem checks to common interface Eric DeCosta via GitGitGadget
2022-10-14 21:45   ` [PATCH v2 02/12] fsmonitor: relocate socket file if .git directory is remote Eric DeCosta via GitGitGadget
2022-10-18 12:12     ` Ævar Arnfjörð Bjarmason
2022-10-14 21:45   ` [PATCH v2 03/12] fsmonitor: avoid socket location check if using hook Eric DeCosta via GitGitGadget
2022-10-14 21:45   ` [PATCH v2 04/12] fsmonitor: deal with synthetic firmlinks on macOS Eric DeCosta via GitGitGadget
2022-10-14 21:45   ` [PATCH v2 05/12] fsmonitor: check for compatability before communicating with fsmonitor Eric DeCosta via GitGitGadget
2022-10-14 21:45   ` [PATCH v2 06/12] fsmonitor: add documentation for allowRemote and socketDir options Eric DeCosta via GitGitGadget
2022-10-14 21:45   ` [PATCH v2 07/12] fsmonitor: prepare to share code between Mac OS and Linux Eric DeCosta via GitGitGadget
2022-10-14 23:46     ` Junio C Hamano
2022-10-17 21:30       ` Eric DeCosta
2022-10-18  6:31         ` Junio C Hamano
2022-10-14 21:45   ` [PATCH v2 08/12] fsmonitor: determine if filesystem is local or remote Eric DeCosta via GitGitGadget
2022-10-14 21:45   ` [PATCH v2 09/12] fsmonitor: implement filesystem change listener for Linux Eric DeCosta via GitGitGadget
2022-10-18 11:59     ` Ævar Arnfjörð Bjarmason
2022-10-14 21:45   ` [PATCH v2 10/12] fsmonitor: enable fsmonitor " Eric DeCosta via GitGitGadget
2022-10-14 21:45   ` [PATCH v2 11/12] fsmonitor: test updates Eric DeCosta via GitGitGadget
2022-10-14 21:45   ` [PATCH v2 12/12] fsmonitor: update doc for Linux Eric DeCosta via GitGitGadget
2022-10-14 23:32   ` [PATCH v2 00/12] fsmonitor: Implement fsmonitor " Junio C Hamano
2022-10-17 21:32     ` Eric DeCosta
2022-10-17 22:22       ` Junio C Hamano
2022-10-18 14:02       ` Johannes Schindelin
2022-10-17 22:14   ` Glen Choo
2022-10-18  4:17     ` Junio C Hamano
2022-10-18 17:03       ` Glen Choo
2022-10-18 17:11         ` Junio C Hamano
2022-10-19  1:19         ` Ævar Arnfjörð Bjarmason
2022-10-19  2:28           ` Eric Sunshine
2022-10-19 16:58             ` Junio C Hamano
2022-10-19 19:11             ` Ævar Arnfjörð Bjarmason
2022-10-19 20:14               ` Eric Sunshine
2022-10-19  1:04       ` Ævar Arnfjörð Bjarmason
2022-10-19 16:33         ` Junio C Hamano
2022-10-20 16:13       ` Junio C Hamano
2022-10-20 16:37         ` Eric Sunshine
2022-10-20 17:01           ` Junio C Hamano
2022-10-20 17:54             ` Eric Sunshine
2022-10-20 15:43     ` Junio C Hamano
2022-10-20 22:01       ` Junio C Hamano
2022-11-16 23:23   ` [PATCH v3 0/6] " Eric DeCosta via GitGitGadget
2022-11-16 23:23     ` [PATCH v3 1/6] fsmonitor: prepare to share code between Mac OS and Linux Eric DeCosta via GitGitGadget
2022-11-16 23:23     ` [PATCH v3 2/6] fsmonitor: determine if filesystem is local or remote Eric DeCosta via GitGitGadget
2022-11-16 23:23     ` [PATCH v3 3/6] fsmonitor: implement filesystem change listener for Linux Eric DeCosta via GitGitGadget
2022-11-16 23:23     ` [PATCH v3 4/6] fsmonitor: enable fsmonitor " Eric DeCosta via GitGitGadget
2022-11-16 23:23     ` [PATCH v3 5/6] fsmonitor: test updates Eric DeCosta via GitGitGadget
2022-11-16 23:23     ` [PATCH v3 6/6] fsmonitor: update doc for Linux Eric DeCosta via GitGitGadget
2022-11-16 23:27     ` [PATCH v3 0/6] fsmonitor: Implement fsmonitor " Taylor Blau
2022-11-23 19:00     ` [PATCH v4 " Eric DeCosta via GitGitGadget
2022-11-23 19:00       ` [PATCH v4 1/6] fsmonitor: prepare to share code between Mac OS and Linux Eric DeCosta via GitGitGadget
2022-11-23 19:00       ` [PATCH v4 2/6] fsmonitor: determine if filesystem is local or remote Eric DeCosta via GitGitGadget
2022-11-25  7:31         ` Junio C Hamano
2022-12-12 10:24         ` Ævar Arnfjörð Bjarmason
2022-11-23 19:00       ` [PATCH v4 3/6] fsmonitor: implement filesystem change listener for Linux Eric DeCosta via GitGitGadget
2022-12-12 10:42         ` Ævar Arnfjörð Bjarmason [this message]
2022-11-23 19:00       ` [PATCH v4 4/6] fsmonitor: enable fsmonitor " Eric DeCosta via GitGitGadget
2022-11-23 19:00       ` [PATCH v4 5/6] fsmonitor: test updates Eric DeCosta via GitGitGadget
2022-11-23 19:00       ` [PATCH v4 6/6] fsmonitor: update doc for Linux Eric DeCosta via GitGitGadget
2022-12-12 21:57       ` [PATCH v5 0/6] fsmonitor: Implement fsmonitor " Eric DeCosta via GitGitGadget
2022-12-12 21:58         ` [PATCH v5 1/6] fsmonitor: prepare to share code between Mac OS and Linux Eric DeCosta via GitGitGadget
2022-12-12 21:58         ` [PATCH v5 2/6] fsmonitor: determine if filesystem is local or remote Eric DeCosta via GitGitGadget
2022-12-12 21:58         ` [PATCH v5 3/6] fsmonitor: implement filesystem change listener for Linux Eric DeCosta via GitGitGadget
2022-12-12 21:58         ` [PATCH v5 4/6] fsmonitor: enable fsmonitor " Eric DeCosta via GitGitGadget
2022-12-12 21:58         ` [PATCH v5 5/6] fsmonitor: test updates Eric DeCosta via GitGitGadget
2022-12-12 21:58         ` [PATCH v5 6/6] fsmonitor: update doc for Linux Eric DeCosta via GitGitGadget
2023-04-12  5:42         ` [PATCH v5 0/6] fsmonitor: Implement fsmonitor " Junio C Hamano
2022-10-18 12:47 ` [PATCH 00/12] " Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=221212.86h6y06gca.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=chooglen@google.com \
    --cc=edecosta@mathworks.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).