From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-4.2 required=3.0 tests=AWL,BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by dcvr.yhbt.net (Postfix) with ESMTP id 0522720248 for ; Mon, 15 Apr 2019 21:54:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727201AbfDOVyX (ORCPT ); Mon, 15 Apr 2019 17:54:23 -0400 Received: from mail-qt1-f195.google.com ([209.85.160.195]:39018 "EHLO mail-qt1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726609AbfDOVyX (ORCPT ); Mon, 15 Apr 2019 17:54:23 -0400 Received: by mail-qt1-f195.google.com with SMTP id t28so21060722qte.6 for ; Mon, 15 Apr 2019 14:54:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=platin-gs.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=Z5R3DwuSpeVP3Em9LmFxQvJu9SuXQ9zNV+TLEpjjKLc=; b=ka6fAEfLvACnD8CjuLml7lBfG3m7DPSutHm4Z8KaRe9s8AtGdeBkyrdibI2hLoA+fA 5x0i2Il5CQZjimgN/2p89hAgdSmzOGk80bpd9pR55KVu8n8nrn4GrZgGpFN1Eb3QMlyQ GbROLpfh9q34Y7p27cOJakwjgGJuTXRVsaAxh5F9+rey0Kbv3MWug606z0JrAcTWNkVL d8hHK9itspc9Uuai+N1XpcH71rXNC0ieD17hGyfr8qU6yTJXOW4eUcJyvGK4gey1hDPj sCG8aplyWUF+ZJ8YHa9/xNb4mM35irJXpplAWTQ6zMJOrigt7ncVzynZISSojyMJRKyP c1zw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Z5R3DwuSpeVP3Em9LmFxQvJu9SuXQ9zNV+TLEpjjKLc=; b=OB7kdRX72kZpt/g51RUIlXYXIQSzwANN77NOVIcNo11pRYVqIjXTkiIZzUnv5WsL8b PAkZkdB6ASL9W/MXw2/V6ibiDHkEFX7tdTbsost/TvM7QIadQKORZ22TvLdlc5hWIOlR 9QTHamEXTAv4iOvnu+XNdyIdQOIuGvpfPQaPVtYz/xlnlMWEgdpEWr4ISQIHGg/uhTJl AhA94yHz7FvekQt41oy1x+kAstlBhsfc03oNv/+7UWiFhD92EjetM2J5eoGlM6x1MS1G wiyrt+S0WyqOljQGD35Kz4rQgSdj07wuUanch9P4Jv9KqMY3PqHTL7ADQYEzFgsnrt2U VBWg== X-Gm-Message-State: APjAAAX6hyHHQJl8sH16MN19JcExKJUCoyareB2RvlN4uW1VA4WHf6no yzm1Sd47p6/MKmuL5qCsCYJvj1wDS1Wf81JNWiQ= X-Google-Smtp-Source: APXvYqx08q1XT2JtXmE7EcsQpUgKWdmqIs/zRQHW3OifB1Hhy8A13THjpgvsWFSYtvfE0hE7Sk/i/Ih9TH6R74NdFkU= X-Received: by 2002:aed:3641:: with SMTP id e59mr62277408qtb.235.1555365262447; Mon, 15 Apr 2019 14:54:22 -0700 (PDT) MIME-Version: 1.0 References: <20190410162409.117264-1-brho@google.com> <9439c697-246f-3bcb-4d34-85099e577e8b@google.com> In-Reply-To: <9439c697-246f-3bcb-4d34-85099e577e8b@google.com> From: Michael Platings Date: Mon, 15 Apr 2019 22:54:10 +0100 Message-ID: Subject: Re: [PATCH v6 0/6] blame: add the ability to ignore commits To: Barret Rhoden Cc: Git mailing list , =?UTF-8?B?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= , David Kastrup , Jeff King , Jeff Smith , Johannes Schindelin , Junio C Hamano , =?UTF-8?Q?Ren=C3=A9_Scharfe?= , Stefan Beller Content-Type: text/plain; charset="UTF-8" Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org > My main concerns: > - Can your version reach outside of a diff chunk? Currently no. It's optimised for reformatting and renaming, both of which preserve ordering. I could look into allowing disordered matches where the similarity is high, while still being biased towards ordered matches. If you can post more examples that would be helpful. > - Complexity and possibly performance. The recursive stuff made me > wonder about it a bit. It's no reason not to use it, just need to check > it more closely. Complexity I can't deny, I can only mitigate it with documentation/comments. I optimised the code pretty heavily and tested on some contrived worst-case scenarios and the performance was still good so I'm not worried about that. > Is the latest version of your stuff still the one you posted last week > or so? Yes. But reaching outside the chunk might lead to a significantly different API in the next version... > Similarly, do you think the "two pass" approach I have (check the chunk, > then check the parent file) would work with your recursive partitioning > style? That might make yours able to handle the "split diff chunk" case. Yes, should do. I'll see what I can come up with this week. > No algorithm will work for all cases. The one you just gave had the > simple heuristic working better than a complex one. We could make it > more complex, but then another example may be worse. I can live with > some inaccuracy in exchange for simplicity. Exactly, no algorithm will work for all cases. So what I'm suggesting is that it might be best to let the user choose which heuristic is appropriate for a given commit. If they know that the simple heuristic works best then perhaps we should let them choose that rather than only offering a one-size-fits-all option. But if we do want to go for one-size-fits-all then I'm very keen to make sure it at least solves the specific cases that we know about.