Date: Tue, 23 Jun 2020 14:35:25 -0400
From: Jeff King
To: Eric Sunshine
Cc: Git List, Junio C Hamano, Johannes Schindelin
Subject: Re: [PATCH 09/10] fast-export: allow seeding the anonymized mapping
Message-ID: <20200623183525.GB1444619@coredump.intra.peff.net>
References: <20200623152436.GA50925@coredump.intra.peff.net>
 <20200623152505.GI1435482@coredump.intra.peff.net>
X-Mailing-List: git@vger.kernel.org

On Tue, Jun 23, 2020 at 02:11:51PM -0400, Eric Sunshine wrote:

> On Tue, Jun 23, 2020 at 11:25 AM Jeff King wrote:
> > Let's make it possible to seed the anonymization map. This lets users
> > either:
> > [...]
> > Signed-off-by: Jeff King
> > ---
> > diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt
> > @@ -119,6 +119,11 @@ by keeping the marks the same across runs.
> > +--seed-anonymized=<from>[:<to>]::
> > +       Convert token `<from>` to `<to>` in the anonymized output. If
> > +       `<to>` is omitted, map `<from>` to itself (i.e., do not
> > +       anonymize it). See the section on `ANONYMIZING` below.
>
> By the way (possible bikeshedding ahead), "seed anonymous" seems
> overly technical. I wonder if a name such as
> '--anonymize-to=<from>[:<to>]' might be clearer and easier for people
> to understand.

I wrestled with the name, and I agree "seed" is overly technical. And I
came up with many similar variations of "anonymize-to", but they all
seemed ambiguous (e.g., it could be "to" a file that we're storing the
data in).

Perhaps "--anonymize-map" would be less technical?

> In fact, in an earlier email, I asked whether --seed-anonymized should
> imply --anonymize. Thinking further on this, I wonder if we even need
> the second option name. It should be possible to overload the existing
> --anonymize to handle all functions. For instance:
>
>     '--anonymize' would anonymize everything
>
>     '--anonymize=<from>[:<to>]' would anonymize and map <from> to <to>
>
> So, the example you give in the documentation would become:
>
>     git fast-export --all \
>         --anonymize=foo.c:secret.c \
>         --anonymize=mybranch >stream
>
> Or is that too cryptic?

Yeah, that was another one I considered, but it both seemed cryptic
(after all, we're saying what _not_ to anonymize), and it squats on the
"anonymize" option. So imagine we had another option later, like
"anonymize blobs and paths, but not refs", that could easily be
"--anonymize=blobs,path" or "--anonymize=!refs". I'd rather not paint
ourselves in a corner.

-Peff
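
For readers following the naming discussion, the mapping semantics quoted above (a token with an optional replacement, where an omitted replacement maps the token to itself) can be sketched in a few lines. This is only an illustration in Python, not git's implementation (which is in C). The names `parse_map_spec` and `build_seed_map` are hypothetical, and splitting on the first ':' is an assumption made for the sketch.

```python
def parse_map_spec(spec):
    """Split a 'from[:to]' option value into a (from, to) pair.

    Assumptions for illustration only: the first ':' separates the two
    tokens, and a missing 'to' means "map the token to itself", i.e.
    leave it un-anonymized.
    """
    if not spec:
        raise ValueError("empty mapping spec")
    src, sep, dst = spec.partition(":")
    if not src or (sep and not dst):
        raise ValueError(f"malformed mapping spec: {spec!r}")
    return src, (dst if sep else src)


def build_seed_map(specs):
    """Collect repeated option values into one token -> replacement dict."""
    return dict(parse_map_spec(s) for s in specs)


# The two values from the documentation example discussed in the thread.
seed = build_seed_map(["foo.c:secret.c", "mybranch"])
print(seed)  # {'foo.c': 'secret.c', 'mybranch': 'mybranch'}
```

The sketch also hints at the concern about overloading `--anonymize`: a value such as `blobs` parses as "keep the token blobs as-is", so a later syntax like `--anonymize=blobs,path` would be hard to tell apart from a mapping.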