From: "René Scharfe" <l.s.r@web.de>
To: Junio C Hamano <gitster@pobox.com>, Jessica Clarke <jrtc27@jrtc27.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] apply: Avoid ambiguous pointer provenance for CHERI/Arm's Morello
Date: Fri, 7 Jan 2022 13:16:53 +0100
Message-ID: <8739caad-aa3d-1f0f-b5dd-6174a8e059f6@web.de>
In-Reply-To: <xmqqczl4bhmo.fsf@gitster.g>
On 06.01.22 at 23:53, Junio C Hamano wrote:
> Jessica Clarke <jrtc27@jrtc27.com> writes:
>
>> On CHERI, and thus Arm's Morello prototype, pointers are implemented as
>> hardware capabilities which, as well as having a normal integer address,
>> have additional bounds, permissions and other metadata in a second word.
>> In order to preserve this metadata, uintptr_t is also implemented as a
>> capability, not a plain integer, which causes problems for binary
>> operators, as the metadata preserved in the output can only come from
>> one of the inputs. In most cases this is clear, as normally at least one
>> operand is provably a plain integer, but if both operands are uintptr_t
>> and have no indication they're just plain integers then it is ambiguous,
>> and the current implementation will arbitrarily, but deterministically,
>> pick the left-hand side, due to empirical evidence that it is more
>> likely to be correct.
>
> What's left-hand side in the context of the code you changed?
> Between "what" vs "ent->util" you take "what"? That cannot be
> true. Are you referring to the (usually hidden and useless when we
> use it as an integer) "hardware capabilities" word as "left" vs the
> value of the pointer as "right"?
>
>> static uintptr_t register_symlink_changes(struct apply_state *state,
>> const char *path,
>> - uintptr_t what)
>> + size_t what)
>> {
>> struct string_list_item *ent;
>>
>> @@ -3823,7 +3823,7 @@ static uintptr_t register_symlink_changes(struct apply_state *state,
>> ent = string_list_insert(&state->symlink_changes, path);
>> ent->util = (void *)0;
>> }
>> - ent->util = (void *)(what | ((uintptr_t)ent->util));
>> + ent->util = (void *)((uintptr_t)what | ((uintptr_t)ent->util));
>> return (uintptr_t)ent->util;
>> }
>
> I actually wonder if it results in code that is much easier to
> follow if we did:
>
> * Introduce an "enum apply_symlink_treatment" that has
> APPLY_SYMLINK_GOES_AWAY and APPLY_SYMLINK_IN_RESULT as its
> possible values;
>
> * Make register_symlink_changes() and check_symlink_changes()
> work with "enum apply_symlink_treatment";
>
> * (optional) stop using string_list() to store the symlink_changes;
> use strintmap and use strintmap_set() and strintmap_get() to
> access its entries, so that the ugly implementation detail
> (i.e. "the container type we use only has a (void *) field to
> store caller-supplied data, so we cast an integer and a pointer
> back and forth") can be safely hidden.
>
Or strsets -- we only need two.
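
To illustrate the idea outside of git, here is a minimal standalone sketch. It assumes a hypothetical toy_strset type (a plain array of strings with linear lookup) as a stand-in for git's real strset from strmap.h; none of the toy_* names exist in git. Instead of OR-ing flag bits into a fake pointer value, each kind of change gets its own set, and a path that is both removed and re-created simply ends up in both.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's strset: an unsorted array of strings. */
struct toy_strset {
	char **items;
	size_t nr, alloc;
};

/* Returns 1 if s is in the set, 0 otherwise (linear scan). */
static int toy_strset_contains(const struct toy_strset *set, const char *s)
{
	assert(set && s);
	for (size_t i = 0; i < set->nr; i++)
		if (!strcmp(set->items[i], s))
			return 1;
	return 0;
}

/* Returns 1 if s was added, 0 if already present, -1 if out of memory. */
static int toy_strset_add(struct toy_strset *set, const char *s)
{
	size_t len;
	char *copy;

	if (toy_strset_contains(set, s))
		return 0;
	if (set->nr == set->alloc) {
		size_t alloc = set->alloc ? 2 * set->alloc : 8;
		char **items = realloc(set->items, alloc * sizeof(*items));
		if (!items)
			return -1;
		set->items = items;
		set->alloc = alloc;
	}
	len = strlen(s) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, s, len);
	set->items[set->nr++] = copy;
	return 1;
}

/* Frees all strings and resets the set to its empty state. */
static void toy_strset_clear(struct toy_strset *set)
{
	for (size_t i = 0; i < set->nr; i++)
		free(set->items[i]);
	free(set->items);
	set->items = NULL;
	set->nr = set->alloc = 0;
}
```

With two such sets, say removed and kept, both zero-initialized, registering a change is a single toy_strset_add() call and each check is a single toy_strset_contains() call. No integer is ever stored in a pointer, so the question of which operand supplies the capability metadata does not come up.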
--- >8 ---
Subject: [PATCH] apply: use strsets to track symlinks
Symlink changes are tracked in a string_list, with the util pointer
value indicating whether a symlink is kept or removed. Using fake
pointer values requires awkward casts. Use one strset for each type of
change instead to simplify and shorten the code.
Original-patch-by: Jessica Clarke <jrtc27@jrtc27.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
apply.c | 42 ++++++++----------------------------------
apply.h | 26 +++++++++++---------------
2 files changed, 19 insertions(+), 49 deletions(-)
diff --git a/apply.c b/apply.c
index fed195250b..7deb4f79fd 100644
--- a/apply.c
+++ b/apply.c
@@ -103,7 +103,8 @@ int init_apply_state(struct apply_state *state,
state->linenr = 1;
string_list_init_nodup(&state->fn_table);
string_list_init_nodup(&state->limit_by_name);
- string_list_init_nodup(&state->symlink_changes);
+ strset_init(&state->removed_symlinks);
+ strset_init(&state->kept_symlinks);
strbuf_init(&state->root, 0);
git_apply_config();
@@ -117,7 +118,8 @@ int init_apply_state(struct apply_state *state,
void clear_apply_state(struct apply_state *state)
{
string_list_clear(&state->limit_by_name, 0);
- string_list_clear(&state->symlink_changes, 0);
+ strset_clear(&state->removed_symlinks);
+ strset_clear(&state->kept_symlinks);
strbuf_release(&state->root);
/* &state->fn_table is cleared at the end of apply_patch() */
@@ -3812,59 +3814,31 @@ static int check_to_create(struct apply_state *state,
return 0;
}
-static uintptr_t register_symlink_changes(struct apply_state *state,
- const char *path,
- uintptr_t what)
-{
- struct string_list_item *ent;
-
- ent = string_list_lookup(&state->symlink_changes, path);
- if (!ent) {
- ent = string_list_insert(&state->symlink_changes, path);
- ent->util = (void *)0;
- }
- ent->util = (void *)(what | ((uintptr_t)ent->util));
- return (uintptr_t)ent->util;
-}
-
-static uintptr_t check_symlink_changes(struct apply_state *state, const char *path)
-{
- struct string_list_item *ent;
-
- ent = string_list_lookup(&state->symlink_changes, path);
- if (!ent)
- return 0;
- return (uintptr_t)ent->util;
-}
-
static void prepare_symlink_changes(struct apply_state *state, struct patch *patch)
{
for ( ; patch; patch = patch->next) {
if ((patch->old_name && S_ISLNK(patch->old_mode)) &&
(patch->is_rename || patch->is_delete))
/* the symlink at patch->old_name is removed */
- register_symlink_changes(state, patch->old_name, APPLY_SYMLINK_GOES_AWAY);
+ strset_add(&state->removed_symlinks, patch->old_name);
if (patch->new_name && S_ISLNK(patch->new_mode))
/* the symlink at patch->new_name is created or remains */
- register_symlink_changes(state, patch->new_name, APPLY_SYMLINK_IN_RESULT);
+ strset_add(&state->kept_symlinks, patch->new_name);
}
}
static int path_is_beyond_symlink_1(struct apply_state *state, struct strbuf *name)
{
do {
- unsigned int change;
-
while (--name->len && name->buf[name->len] != '/')
; /* scan backwards */
if (!name->len)
break;
name->buf[name->len] = '\0';
- change = check_symlink_changes(state, name->buf);
- if (change & APPLY_SYMLINK_IN_RESULT)
+ if (strset_contains(&state->kept_symlinks, name->buf))
return 1;
- if (change & APPLY_SYMLINK_GOES_AWAY)
+ if (strset_contains(&state->removed_symlinks, name->buf))
/*
* This cannot be "return 0", because we may
* see a new one created at a higher level.
diff --git a/apply.h b/apply.h
index 16202da160..4052da50c0 100644
--- a/apply.h
+++ b/apply.h
@@ -4,6 +4,7 @@
#include "hash.h"
#include "lockfile.h"
#include "string-list.h"
+#include "strmap.h"
struct repository;
@@ -25,20 +26,6 @@ enum apply_verbosity {
verbosity_verbose = 1
};
-/*
- * We need to keep track of how symlinks in the preimage are
- * manipulated by the patches. A patch to add a/b/c where a/b
- * is a symlink should not be allowed to affect the directory
- * the symlink points at, but if the same patch removes a/b,
- * it is perfectly fine, as the patch removes a/b to make room
- * to create a directory a/b so that a/b/c can be created.
- *
- * See also "struct string_list symlink_changes" in "struct
- * apply_state".
- */
-#define APPLY_SYMLINK_GOES_AWAY 01
-#define APPLY_SYMLINK_IN_RESULT 02
-
struct apply_state {
const char *prefix;
@@ -86,7 +73,16 @@ struct apply_state {
/* Various "current state" */
int linenr; /* current line number */
- struct string_list symlink_changes; /* we have to track symlinks */
+ /*
+ * We need to keep track of how symlinks in the preimage are
+ * manipulated by the patches. A patch to add a/b/c where a/b
+ * is a symlink should not be allowed to affect the directory
+ * the symlink points at, but if the same patch removes a/b,
+ * it is perfectly fine, as the patch removes a/b to make room
+ * to create a directory a/b so that a/b/c can be created.
+ */
+ struct strset removed_symlinks;
+ struct strset kept_symlinks;
/*
* For "diff-stat" like behaviour, we keep track of the biggest change
--
2.34.1