From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS53758 23.128.96.0/24 X-Spam-Status: No, score=-3.9 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_PASS,URIBL_CSS, URIBL_CSS_A shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by dcvr.yhbt.net (Postfix) with ESMTP id AC52F1F953 for ; Thu, 28 Oct 2021 22:46:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231346AbhJ1WtM (ORCPT ); Thu, 28 Oct 2021 18:49:12 -0400 Received: from pb-smtp21.pobox.com ([173.228.157.53]:63338 "EHLO pb-smtp21.pobox.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231162AbhJ1WtK (ORCPT ); Thu, 28 Oct 2021 18:49:10 -0400 Received: from pb-smtp21.pobox.com (unknown [127.0.0.1]) by pb-smtp21.pobox.com (Postfix) with ESMTP id D44A61561F0; Thu, 28 Oct 2021 18:46:42 -0400 (EDT) (envelope-from junio@pobox.com) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; s=sasl; bh=+P3WlFpfEPyokuqvGE8M9iCgcnS0dMzY+DfU2i 7SLsI=; b=HeU+5MlTDsv6RdHtOHqNslUn1UJ6Jivg68BnQX17Tq1hqDicQzZhfz lg/rzHGO0q25yeN9tO2/LQAhIDHaow2TbFOIJQ6DTB6xTwKEncCv8b6En/EUqgdL dAErBBSEDKmdYqVdhpkxiIoFnDhACTQUiwuVJUQfesuceqTuPECAM= Received: from pb-smtp21.sea.icgroup.com (unknown [127.0.0.1]) by pb-smtp21.pobox.com (Postfix) with ESMTP id CD1241561EF; Thu, 28 Oct 2021 18:46:42 -0400 (EDT) (envelope-from junio@pobox.com) Received: from pobox.com (unknown [104.133.2.91]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by pb-smtp21.pobox.com (Postfix) with ESMTPSA id 33D991561EE; Thu, 28 Oct 2021 18:46:40 -0400 (EDT) (envelope-from junio@pobox.com) From: Junio C Hamano To: Ivan Frade Cc: Ivan Frade via GitGitGadget , git@vger.kernel.org, Jonathan Tan Subject: Re: [PATCH v4 1/2] fetch-pack: redact packfile urls in traces References: <973a250752c39c3fe835d69f3fbe8f009fc4fa74.1635288599.git.gitgitgadget@gmail.com> Date: Thu, 28 Oct 2021 15:46:39 -0700 In-Reply-To: (Ivan Frade's message of "Thu, 28 Oct 2021 15:15:05 -0700") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Pobox-Relay-ID: EA575016-3840-11EC-B47A-98D80D944F46-77302942!pb-smtp21.pobox.com Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Ivan Frade writes: >> Hmph, the format we expect is " "; don't we need to >> validate the leading followed by SP? > > I was trying to find a uri in a packet in general, not counting on the > packfile-uri line format. That is probably an overgeneralization. Ah, I see. This is merely a tracing, so we might benefit from a generalized version of redactor, and from that point of view, the use of strstr and stopping at the whitespace do make sort-of sense to me, but then we lack any attempt to redact more than one instance of URL in a packet, so the generalization may have quite a limited usefulness. > Next patch version follows these suggestions to look for a packfile-uri line. Yeah, I think that is a good way to go, at least for now. When we want a more general one, we can revisit it, but not now. >> > - packet_trace(buffer, len, 0); >> > + if (options & PACKET_READ_REDACT_URL_PATH && >> > + (url_path_start = find_url_path(buffer, &url_path_len))) { >> > + const char *redacted = ""; >> > + struct strbuf tracebuf = STRBUF_INIT; >> > + strbuf_insert(&tracebuf, 0, buffer, len); >> > + strbuf_splice(&tracebuf, url_path_start - buffer, >> > + url_path_len, redacted, strlen(redacted)); >> > + packet_trace(tracebuf.buf, tracebuf.len, 0); >> > + strbuf_release(&tracebuf); >> >> I briefly wondered if the repeated allocation (and more >> fundamentally, preparing the redacted copy of packet whether we are >> actually tracing the packet in the first place) is blindly wasting >> the resources too much, but this only happens in the protocol header >> part, so it might be OK. > > We only allocate and redact if it looks like a packfile-uri line, so > it shouldn't happen too frequently. I was mostly wondering about the cost of determining "if it looks like?". But we do this only for the protocol header part, so we won't have thousands of attempts to match, I guess. Oh, or if we also do this for the ref advertisement packets, then we might have quite a many. Hmph. > I move the set/unset of the redacting flag to the FETCH_GET_PACK > around the "packfile-uris" section. > There is no need to check every incoming packet for a packfile-uri > line, we know when they should come. Yeah, that is quite a wise design decision, I would think. Thanks.