From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: X-Spam-Status: No, score=-3.6 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.6 Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org [IPv6:2604:1380:40f1:3f00::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id D198E1F44D for ; Fri, 12 Apr 2024 16:34:54 +0000 (UTC) Authentication-Results: dcvr.yhbt.net; dkim=pass (1024-bit key; unprotected) header.d=pobox.com header.i=@pobox.com header.a=rsa-sha256 header.s=sasl header.b=JRtfO51y; dkim-atps=neutral Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id CAB71B21549 for ; Fri, 12 Apr 2024 16:34:52 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id D7E5D14AD17; Fri, 12 Apr 2024 16:34:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=pobox.com header.i=@pobox.com header.b="JRtfO51y" Received: from pb-smtp20.pobox.com (pb-smtp20.pobox.com [173.228.157.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 228EE1E53A for ; Fri, 12 Apr 2024 16:34:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=173.228.157.52 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712939683; cv=none; b=XTfyrikoVkjSHbGEYovB/dm0/DWwYrIFpNBP+CaHoVUW77pkMdIEgLGt+3PUe+cllAAOGa4GHEfwW6Qt+gYlX3sLyrtefb2W89g+J9RdBcsE0iW2kqzR4GxYfNkcTX5sGGROKaaE8knVTTVRGxPgVZyi05vbAfoIIU+IhrOFs2g= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712939683; c=relaxed/simple; bh=1m3cTP8uZc2PfpcTZFFjK2WXMtl99Q0hF6kSdE8mdiU=; h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID: MIME-Version:Content-Type; b=IT+Yxc8kdaUt9qdFP3LwEkh0gYp39eMfuMTJbFmk+CGWiP1yWgbrONZtTPBSUTjgWt3S5QPGWY4HpmtKmWhu+ZHjEBrnXEn/hq5RDLBogZ9ZAe7odVH38Xtp6AwsjSzhmHFQkMgR4VNuViXPebdQTsupqkwM9wIEQzztQOD7q+4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=pobox.com; spf=pass smtp.mailfrom=pobox.com; dkim=pass (1024-bit key) header.d=pobox.com header.i=@pobox.com header.b=JRtfO51y; arc=none smtp.client-ip=173.228.157.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=pobox.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pobox.com Received: from pb-smtp20.pobox.com (unknown [127.0.0.1]) by pb-smtp20.pobox.com (Postfix) with ESMTP id B9DD218F4C; Fri, 12 Apr 2024 12:34:35 -0400 (EDT) (envelope-from junio@pobox.com) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=pobox.com; h=from:to:cc :subject:in-reply-to:references:date:message-id:mime-version :content-type; s=sasl; bh=1m3cTP8uZc2PfpcTZFFjK2WXMtl99Q0hF6kSdE 8mdiU=; b=JRtfO51yoQ+RkXp/PneksmXNCD9uDQOqS+Ew5aojVUNOnKljo1SUi5 IRg1qAQvCg1DIZJLg4FXN1+1w7veKbTmdDYtxUeIuTA+Bc0/mXbm2LIiBVymNdFc odQQT/+yIZsqUULSvSmEx8MRIz0uaFgRbpsjk/MTYnWBdOjeDA61k= Received: from pb-smtp20.sea.icgroup.com (unknown [127.0.0.1]) by pb-smtp20.pobox.com (Postfix) with ESMTP id B1F7518F4B; Fri, 12 Apr 2024 12:34:35 -0400 (EDT) (envelope-from junio@pobox.com) Received: from pobox.com (unknown [34.125.229.118]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by pb-smtp20.pobox.com (Postfix) with ESMTPSA id EA13C18F47; Fri, 12 Apr 2024 12:34:29 -0400 (EDT) (envelope-from junio@pobox.com) From: Junio C Hamano To: Thalia Archibald Cc: git@vger.kernel.org, Patrick Steinhardt , Chris Torek , Elijah Newren Subject: Re: [PATCH v4 1/8] fast-import: tighten path unquoting In-Reply-To: (Thalia Archibald's message of "Fri, 12 Apr 2024 08:02:56 +0000") References: <20240322000304.76810-1-thalia@archibald.dev> Date: Fri, 12 Apr 2024 09:34:28 -0700 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain X-Pobox-Relay-ID: 890242AE-F8EA-11EE-98AF-F515D2CDFF5E-77302942!pb-smtp20.pobox.com Thalia Archibald writes: > diff --git a/builtin/fast-import.c b/builtin/fast-import.c > index 782bda007c..ce9231afe6 100644 > --- a/builtin/fast-import.c > +++ b/builtin/fast-import.c > @@ -2258,10 +2258,56 @@ static uintmax_t parse_mark_ref_space(const char **p) > return mark; > } > > +/* > + * Parse the path string into the strbuf. It may be quoted with escape sequences > + * or unquoted without escape sequences. When unquoted, it may only contain a > + * space if `include_spaces` is nonzero. > + */ It took me three reads to understand the last sentence. It would have been easier if it were written as "it may contain a space only if ...". I'd also named it "allow_unquoted_spaces"---it is not like this function includes extra spaces on top of whatever was given. > +static void parse_path(struct strbuf *sb, const char *p, const char **endp, > + int include_spaces, const char *field) > +{ > + if (*p == '"') { > + if (unquote_c_style(sb, p, endp)) > + die("Invalid %s: %s", field, command_buf.buf); > + } else { > + if (include_spaces) > + *endp = p + strlen(p); > + else > + *endp = strchrnul(p, ' '); > + strbuf_add(sb, p, *endp - p); > + } > +} A very straigtht-forward implementation. Makes sense. > +/* > + * Parse the path string into the strbuf, and complain if this is not the end of > + * the string. It may contain spaces even when unquoted. > + */ > +static void parse_path_eol(struct strbuf *sb, const char *p, const char *field) > +{ > + const char *end; > + > + parse_path(sb, p, &end, 1, field); > + if (*end) > + die("Garbage after %s: %s", field, command_buf.buf); > +} OK. > +/* > + * Parse the path string into the strbuf, and ensure it is followed by a space. > + * It may not contain spaces when unquoted. Update *endp to point to the first > + * character after the space. > + */ > +static void parse_path_space(struct strbuf *sb, const char *p, > + const char **endp, const char *field) > +{ > + parse_path(sb, p, endp, 0, field); > + if (**endp != ' ') > + die("Missing space after %s: %s", field, command_buf.buf); > + (*endp)++; > +} OK. The updated callers that use the above helper functions do read a lot more easily, while filling the gaps in the original implementation. Very nicely done. Thanks.