From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS53758 23.128.96.0/24 X-Spam-Status: No, score=-3.9 required=3.0 tests=AWL,BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by dcvr.yhbt.net (Postfix) with ESMTP id 531431F5AE for ; Thu, 29 Apr 2021 20:06:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236720AbhD2UHm (ORCPT ); Thu, 29 Apr 2021 16:07:42 -0400 Received: from mav.lukeshu.com ([104.207.138.63]:44930 "EHLO mav.lukeshu.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234930AbhD2UHm (ORCPT ); Thu, 29 Apr 2021 16:07:42 -0400 Received: from lukeshu-dw-thinkpad (unknown [IPv6:2601:281:8200:26:527b:9dff:fe2b:180a]) by mav.lukeshu.com (Postfix) with ESMTPSA id 5357780590; Thu, 29 Apr 2021 16:06:52 -0400 (EDT) Date: Thu, 29 Apr 2021 14:06:52 -0600 Message-ID: <87o8dwq2hv.wl-lukeshu@lukeshu.com> From: Luke Shumaker To: Junio C Hamano Cc: Luke Shumaker , git@vger.kernel.org, Elijah Newren , Jeff King , Johannes Schindelin , =?UTF-8?B?Tmd1eeG7hW4g?= =?ISO-8859-1?Q?Th?= =?ISO-8859-1?Q?=E1i_?= =?UTF-8?B?Tmfhu41j?= Duy , Taylor Blau , "brian m . carlson" , Eric Sunshine , Luke Shumaker Subject: Re: [PATCH v3 3/3] fast-export, fast-import: implement signed-commits In-Reply-To: References: <20210422002749.2413359-1-lukeshu@lukeshu.com> <20210423164118.693197-1-lukeshu@lukeshu.com> <20210423164118.693197-4-lukeshu@lukeshu.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.2 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Tue, 27 Apr 2021 22:02:47 -0600, Junio C Hamano wrote: > > Luke Shumaker writes: > > > fast-export has an existing --signed-tags= flag that controls how to > > Don't call a command line option "a flag", especially when it is not > a boolean. Good to know, perhaps this should be mentioned in CodingGuidelines or SubmittingPatches.txt? I see lots of instances in the docs of "flag" being used. > "has an existing" feels redundantly repeticious. I guess I did this to make it clearer that that paragraph is describing the state of things before the patch, rather than after the patch. This is of course the required way of writing messages for git.git, but I worded it that way to make it clearer to reviewers that I'm following that requirement (especially since I haven't gotten a commit landed in git.git before). > > So, implement signed-commits. > > That's misleading. You are not inventing "git commit --signed" > here. > > So implement `--signed-commits=` that mirrors the > `--signed-tags=` option. Ack. > > +--signed-commits=(verbatim|warn-verbatim|warn-strip|strip|abort):: > > + Specify how to handle signed commits. Behaves exactly as > > + --signed-tags (but for commits), except that the default is > > + 'warn-strip' rather than 'abort'. > > Why deliberate inconsistency? I am not sure "historically we did a > wrong thing" is a good reason (if we view that silently stripping > was a disservice to the users, aborting would be a bugfix). I *almost* agree. I agree in principle, but disagree in practice because I know that it would break a bunch of existing tooling, including git-filter-repo. > > > > +enum sign_mode { SIGN_ABORT, SIGN_VERBATIM, SIGN_STRIP, SIGN_VERBATIM_WARN, SIGN_STRIP_WARN }; > > + > > static int progress; > > -static enum { SIGNED_TAG_ABORT, VERBATIM, WARN, WARN_STRIP, STRIP } signed_tag_mode = SIGNED_TAG_ABORT; > > Giving the enum values consistent prefix "SIGN_" is a great > improvement. On the other hand, swapping the word order, > e.g. WARN_STRIP to SIGN_STRIP_WARN, is unwarranted. Flipping it around made the switch statements read better, I thought. But I can change it back. > > +static enum sign_mode signed_tag_mode = SIGN_ABORT; > > +static enum sign_mode signed_commit_mode = SIGN_STRIP_WARN; > > I think it is safer to abort for both and sell it as a bugfix > ("silently stripping commit signatures was wrong. we should abort > the same way by default when encountering a signed tag"). > > > -static int parse_opt_signed_tag_mode(const struct option *opt, > > +static int parse_opt_sign_mode(const struct option *opt, > > const char *arg, int unset) > > { > > - if (unset || !strcmp(arg, "abort")) > > - signed_tag_mode = SIGNED_TAG_ABORT; > > + enum sign_mode *valptr = opt->value; > > + if (unset) > > + return 0; > > + else if (!strcmp(arg, "abort")) > > + *valptr = SIGN_ABORT; > > else if (!strcmp(arg, "verbatim") || !strcmp(arg, "ignore")) > > - signed_tag_mode = VERBATIM; > > + *valptr = SIGN_VERBATIM; > > Interesting and not a new issue at all, but "ignore" is a confusing > symonym to "verbatim"---I would have expected "ignore", if accepted > as a choice, would strip the signature. Not documenting it is > probably good, but perhaps we would eventually remove it? Indeed, it was renamed from "ignore" to "verbatim" because "ignore" was such a confusing name. It was renamed (in ee4bc3715f (fast-export: rename the signed tag mode 'ignore' to 'verbatim', 2007-12-03)) by the original fast-export author, pretty much immediately after fast-export was originally introduced. There was never a released version of Git that had fast-export but didn't have the 'ignore'->'verbatim' rename (fast-export was first released in v1.5.4, and the rename was already present then). > > @@ -499,6 +505,60 @@ static void show_filemodify(struct diff_queue_struct *q, > > } > > } > > > > +static const char *find_signature(const char *begin, const char *end, const char *key) > > This is only for in-header signature used in commit objects, and not > for the traditional "attached to the end" signature used in tag > objects, right? > > The name of this function should be designed to answer the above > question, but find_signature() that does not say either commit or > tag implies it can accept both (which would be a horrible interface, > though). If this is only for in-header signature, rename it to make > sure that the fact is readable out of its name? > > > +{ > > + static struct strbuf needle = STRBUF_INIT; > > + char *bod, *eod, *eol; > > + > > + strbuf_reset(&needle); > > + strbuf_addch(&needle, '\n'); > > + strbuf_addstr(&needle, key); > > + strbuf_addch(&needle, ' '); > > strbuf_addf(), perhaps? Currently, strbuf_addf is only used by fast-export.c in the "anonymize_" functions. I took that to mean "avoid strbuf_addf if you can", figuring that fast-export and fast-import seem to go reasonably far out of their way to avoid dynamic things (like printf) in the happy-path. But I can change if it I'm mis-reading fast-export's paranoia. > > + bod = memmem(begin, end ? end - begin : strlen(begin), > > + needle.buf, needle.len); > > + if (!bod) > > + return NULL; > > + bod += needle.len; > > + > > + /* > > + * In the commit object, multi-line header values are stored > > + * by prefixing continuation lines begin with a space. So > > "by prefixig continuation lines with a space" Oops. > > + * within the commit object, it looks like > > + * > > + * "gpgsig -----BEGIN PGP SIGNATURE-----\n" > > + * " Version: GnuPG v1.4.5 (GNU/Linux)\n" > > + * " \n" > > + * " base64_pem_here\n" > > + * " -----END PGP SIGNATURE-----\n" > > + * > > + * So we need to look for the first '\n' that *isn't* followed > > + * by a ' ' (or the first '\0', if no such '\n' exists). > > + */ > > > + eod = strchrnul(bod, '\n'); > > + while (eod[0] == '\n' && eod[1] == ' ') { > > + eod = strchrnul(eod+1, '\n'); > > + } > > SP on both sides of '+'; no {} around a block that consists of a > single statement. Ack. > > + *eod = '\0'; > > The begin and end pointers pointed to a piece of memory that is > supposed to be read-only, but this pointer points into that region > of memory and then updates a byte? The function signature is > misleading---if you intend to muck with the string, accept them as > mutable pointers. > > Better yet, don't butcher the region of memory pointed by the > "message" variable the caller uses to keep reading from the > remainder of the commit object buffer with this and memmove() > below. Perhaps have the caller pass a strbuf to fill in the > signature found by this helper as another parameter, and then return > a bool "Yes, I found a sig" as its return value? That all sounds very sane, but I was mimicking the existing `find_encoding`. You aren't supposed to modify the memory from get_commit_buffer, but fast-export does anyway. I assume that Johannes knew what he was doing when he wrote it (that it's safe because fast-export never traverses the same object twice?) and that he did it as an allocation-avoiding optimization. Part of me thinks that it would be better to just use the standard functions for this, like read_commit_extra_headers or find_commit_header? > > + > > + /* > > + * We now have the value as it's stored in the commit object. > > + * However, we want the raw value; we want to return > > + * > > + * "-----BEGIN PGP SIGNATURE-----\n" > > + * "Version: GnuPG v1.4.5 (GNU/Linux)\n" > > + * "\n" > > + * "base64_pem_here\n" > > + * "-----END PGP SIGNATURE-----\n" > > + * > > + * So now we need to strip out all of those extra spaces. > > + */ > > + while ((eol = strstr(bod, "\n "))) > > + memmove(eol+1, eol+2, strlen(eol+1)); > > Besides, this is O(n^2), isn't it, as it always starts scanning at > bod while there are lines in the signature block to be processed, it > needs to skip over the lines that the loop already has processed. Indeed, I'm embarrassed that made it in to something I submitted. -- Happy hacking, ~ Luke Shumaker