Subject: Re: RFC: Enable delayed responses to Git clean/smudge filter requests
From: Lars Schneider
Date: Wed, 16 Nov 2016 10:53:17 +0100
To: Junio C Hamano
Cc: Eric Wong, git@vger.kernel.org
Message-Id: <17709AFF-3C2D-4EC0-97DC-BD750F514D0B@gmail.com>
References: <20161115010356.GA29602@starla> <5BC69AC1-5499-4F73-816A-D8786106D796@gmail.com>

On 15 Nov 2016, at 19:03, Junio C Hamano wrote:

> Lars Schneider writes:
>
>>> The filter itself would need to be aware of parallelism
>>> if it lives for multiple objects, right?
>>
>> Correct.
>> This way Git doesn't need to deal with threading...
>
> I think you need to be careful about three things (at least; there
> may be more):
>
> * Codepaths that check out multiple cache entries do rely on the
>   order of checkout. We checkout removals first to make room so
>   that creation of a path X can succeed if an existing path X/Y
>   that used to want to see X as a directory can succeed (see the
>   use of checkout_entry() by "git checkout", which does have two
>   separate loops to explicitly guarantee this), for example. I
>   think "remove all and then create" you do not specifically have
>   to worry about with the proposed change, but you may need to
>   inspect and verify there aren't other kinds of order dependencies.

OK

> * Done naively, it will lead to unmaintainable code, like this:
>
>     + struct list_of_cache_entries *list = ...;
>       for (i = 0; i < active_nr; i++)
>     -     checkout_entry(active_cache[i], state, NULL);
>     +     if (checkout_entry(active_cache[i], state, NULL) == DELAYED)
>     +         add_cache_to_queue(&list, active_cache[i]);
>     + while (list) {
>     +     wait_for_checkout_to_finish(*list);
>     +     list = list->next;
>     + }
>
>   I do not think we want to see such a rewrite all over the
>   codepaths. It might be OK to add such a "these entries are known
>   to be delayed" list in struct checkout so that the above becomes
>   more like this:
>
>       for (i = 0; i < active_nr; i++)
>           checkout_entry(active_cache[i], state, NULL);
>     + checkout_entry_finish(state);
>
>   That is, addition of a single "some of the checkout_entry() calls
>   done so far might have been lazy, and I'll give them a chance to
>   clean up" might be palatable. Anything more than that on the
>   caller side is not.

I haven't thought hard about the implementation yet, but I'll try to
stick to your suggestion and change as little code as possible on the
caller sides.

> * You'd need to rein in the maximum parallelism somehow, as you do
>   not want to see hundreds of competing filter processes starting
>   only to tell the main loop over an index with hundreds of entries
>   that they are delayed checkouts.

I intend to implement this feature only for the new long-running
filter process protocol. OK with you?

Thanks,
Lars
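
PS: To make the "delayed list in struct checkout" idea a bit more
concrete, here is a rough sketch of the shape I have in mind. This is
illustration only, not a patch: apart from checkout_entry(), struct
checkout, struct cache_entry, and xmalloc(), every name in it
(delayed_entry, enqueue_delayed, wait_and_write_entry,
checkout_entry_finish) is made up for the example.

    /* Sketch only -- hypothetical names, not existing Git API. */

    struct delayed_entry {
        struct cache_entry *ce;        /* entry whose filter said "later" */
        struct delayed_entry *next;
    };

    /* struct checkout would grow one new field: */
    struct checkout {
        /* ...existing fields (base_dir, refresh_cache, ...)... */
        struct delayed_entry *delayed; /* entries waiting on their filter */
    };

    /* checkout_entry() would call this instead of blocking on the filter. */
    static void enqueue_delayed(struct checkout *state, struct cache_entry *ce)
    {
        struct delayed_entry *d = xmalloc(sizeof(*d));
        d->ce = ce;
        d->next = state->delayed;
        state->delayed = d;
    }

    /*
     * The one new call at each call site: drain the list, i.e. wait
     * for every outstanding filter response and write the entry out.
     * wait_and_write_entry() stands in for the actual "receive filter
     * result, write file" step.
     */
    int checkout_entry_finish(struct checkout *state)
    {
        int errs = 0;
        while (state->delayed) {
            struct delayed_entry *d = state->delayed;
            state->delayed = d->next;
            errs |= wait_and_write_entry(d->ce, state);
            free(d);
        }
        return errs;
    }

The caller side would then stay exactly as in your second snippet: the
loop over active_cache[] is unchanged, followed by a single call to
checkout_entry_finish(state).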