From: Tao Klerks
Date: Wed, 2 Jun 2021 07:04:30 +0200
Subject: Re: Removing Partial Clone / Filtered Clone on a repo
To: Derrick Stolee
Cc: git@vger.kernel.org
References: <032cabb2-652a-1d88-2e12-601b40a4020c@gmail.com> <0b57cba9-3ab3-dfdf-5589-a0016eaea634@gmail.com>

I understand replying to myself is bad form, but I need to add a
correction/clarification to a statement I made below:

On Tue, Jun 1, 2021 at 6:54 PM Tao Klerks wrote:
> > it would be good to design such a feature to have other
> > custom knobs, such as:
> > * Get only "recent" history, perhaps with a "--since="
> >   kind of flag. This would walk commits only to a certain date,
> >   then find all missing blobs reachable from their root trees.
>
> As long as you know at initial clone time that this is what you want,
> combining shallow clone with sparse clone already enables this today
> (shallow clone, set up filter, unshallow, and potentially remove
> filter).
> You can even do more complicated things like unshallowing
> with different increasingly-aggressive filters in multiple
> steps/fetches over different time periods. The main challenge that I
> perceive at the moment is that you're effectively locked into "one
> shot". As soon as you've retrieved the commits with blobs missing,
> "filling them in" at scale seems to be orders of magnitude more
> expensive than an equivalent clone would have been.

As I just noted in another thread, there seems to be one extra step
needed to pull this off: you need to add a *.promisor file for the
initial shallow clone's packfile. Otherwise (at least with the 2.31
client that I am using), later "git fetch" calls take forever doing
something with rev-list that I don't understand, presumably due to the
relationship between promisor packfiles and non-promisor packfiles...
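
For illustration only, the sequence described above (shallow clone, set
up filter, mark the initial pack as a promisor pack, unshallow) might
look roughly like the sketch below. The function names, the remote name
"origin", and the default "blob:none" filter are assumptions made for
the example, not something prescribed in this thread. The sketch also
assumes that an empty *.promisor file next to the pack is enough to mark
it. It has not been verified against a 2.31 client.

```shell
#!/bin/sh

# Hypothetical helper: create an empty <name>.promisor file next to every
# <name>.pack in the given pack directory, leaving existing ones untouched.
mark_packs_as_promisor() {
    pack_dir=$1
    for pack in "$pack_dir"/*.pack; do
        [ -e "$pack" ] || continue              # glob matched nothing
        promisor="${pack%.pack}.promisor"
        [ -e "$promisor" ] || : > "$promisor"   # do not clobber existing files
    done
}

# Hypothetical wrapper for the whole sequence. Arguments are placeholders:
#   $1 = repository URL, $2 = target directory, $3 = filter spec (optional)
shallow_then_filtered_unshallow() {
    url=$1
    dir=$2
    filter=${3:-blob:none}

    # 1. Shallow clone: only the tip commit, with all of its blobs.
    git clone --depth=1 "$url" "$dir" || return 1

    # 2. Set up the filter by declaring the remote a promisor remote.
    git -C "$dir" config remote.origin.promisor true
    git -C "$dir" config remote.origin.partialclonefilter "$filter"

    # 3. The extra step: mark the initial shallow clone's pack as promisor.
    mark_packs_as_promisor "$dir/.git/objects/pack"

    # 4. Unshallow; older history arrives filtered (the filter is passed
    #    explicitly here as well, which should be redundant with the config).
    git -C "$dir" fetch --filter="$filter" --unshallow origin
}
```

Calling shallow_then_filtered_unshallow with a URL and a target directory
would run all four steps. mark_packs_as_promisor can also be run on its
own against an existing clone's .git/objects/pack directory.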