git@vger.kernel.org mailing list mirror (one of many)
From: Derrick Stolee <stolee@gmail.com>
To: Taylor Blau <me@ttaylorr.com>, git@vger.kernel.org
Cc: gitster@pobox.com, larsxschneider@gmail.com, peff@peff.net,
	tytso@mit.edu
Subject: Re: [PATCH 08/17] builtin/pack-objects.c: --cruft without expiration
Date: Mon, 6 Dec 2021 16:44:31 -0500
Message-ID: <b3a30e27-7821-1fcb-bacc-07a6d2b3df76@gmail.com>
In-Reply-To: <66165917a4660f63ce60b820d178d52a51304d20.1638224692.git.me@ttaylorr.com>

On 11/29/2021 5:25 PM, Taylor Blau wrote:
> Generating a non-expiring cruft packs works as follows:

I had trouble parsing the documentation changes below, so I came back
to this commit message to see if that helps.
 
>   - Callers provide a list of every pack they know about, and indicate
>     which packs are about to be removed.

This corresponds to the list over stdin.
 
>   - All packs which are going to be removed (we'll call these the
>     redundant ones) are marked as kept in-core, as well as any packs
>     that `pack-objects` found but the caller did not specify.

Ok, so as an implementation detail we mark these as keep packs.

>     These packs are presumed to have entered the repository between
>     the caller collecting packs and invoking `pack-objects`. Since we
>     do not want to include objects in these packs (because we don't know
>     which of their objects are or aren't reachable), these are also
>     marked as kept in-core.

Here, "are presumed" is doing a lot of work. Theoretically, there could
be three categories:

1. This pack was just repacked and will be removed because all of its
   objects were placed into new packs.

2. Either this pack was repacked and contains important reachable objects
   OR we did a repack of reachable objects and this pack contained some
   extra, unreachable objects.

3. This pack was added to the repository while creating those repacked
   packs from category 2, so we don't know if things are reachable or
   not.

So, the packs that we discover on-disk but that are not specified over
stdin are in this third category, but they are grouped with category 1
because we will treat them the same.

>   - Then, we enumerate all objects in the repository, and add them to
>     our packing list if they do not appear in an in-core kept pack.

Here, we are looking at all of the objects in category 2 as well as
loose objects.
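
To check my reading, here is a rough model of the steps quoted so far,
using plain sets. This is only a sketch and not the pack-objects code:
the function name and every parameter are hypothetical, and the choice
of which packs count as kept follows the commit message as quoted.

```python
def cruft_candidates(on_disk, listed, redundant, pack_objects, loose):
    """Model the quoted steps with sets (all names are hypothetical).

    on_disk:      pack names found in the repository
    listed:       pack names the caller provided
    redundant:    subset of `listed` that is about to be removed
    pack_objects: dict mapping pack name -> set of object ids
    loose:        set of loose object ids
    """
    # Packs the caller never mentioned are presumed to have arrived
    # between collecting the list and invoking pack-objects.
    unlisted = set(on_disk) - set(listed)

    # Per the quoted commit message, both groups are kept in-core.
    kept = set(redundant) | unlisted
    kept_objects = set()
    for name in kept:
        kept_objects |= pack_objects.get(name, set())

    # Enumerate every known object, then drop those in a kept pack.
    everything = set(loose)
    for name in on_disk:
        everything |= pack_objects.get(name, set())
    return everything - kept_objects
```

In this model an object that lives in both a kept pack and a non-kept
pack is dropped, since it "appears in an in-core kept pack".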

> This results in a new cruft pack which contains all known objects that
> aren't included in the kept packs. When the kept pack is the result of
> `git repack -A`, the resulting pack contains all unreachable objects.

This now describes how 'git repack' will interface with this new change
to pack-objects. I'll keep an eye out for that.

> +--cruft::

Now getting to this description.

> +	Packs unreachable objects into a separate "cruft" pack, denoted
> +	by the existence of a `.mtimes` file. Pack names provided over
> +	stdin indicate which packs will remain after a `git repack`.
> +	Pack names prefixed with a `-` indicate those which will be
> +	removed. (...)

This description is too tied to 'git repack'. Can we describe the
input using terms independent of the 'git repack' operation? I need
to keep reading.
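
For reference, the input format described in the quoted text could be
split up as in the sketch below. The function name is hypothetical, and
the only rule assumed is the one quoted above: a leading `-` marks a
pack that will be removed, anything else is a pack that will remain.

```python
def parse_pack_list(lines):
    """Split stdin-style lines into (remaining, removed) pack names."""
    remaining, removed = [], []
    for raw in lines:
        name = raw.strip()
        if not name:
            # Skip blank lines (an assumption; the quoted text does
            # not say how they are handled).
            continue
        if name.startswith("-"):
            removed.append(name[1:])
        else:
            remaining.append(name)
    return remaining, removed
```

Nothing in this split depends on 'git repack'; the two lists are just
"packs to treat as surviving" and "packs to treat as going away".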

> (...) The contents of the cruft pack are all objects not
> +	contained in the surviving packs specified by `--keep-pack`)

Now you use --keep-pack, a way of specifying a pack as "in-core keep"
that was not mentioned in your commit message. Here, we also don't
link the packs over stdin to the concept of keep packs.

> +	which have not exceeded the grace period (see
> +	`--cruft-expiration` below), or which have exceeded the grace
> +	period, but are reachable from an other object which hasn't.

And now we think about the grace period! There is so much going on
that I need to break it down to understand.

  An object is _excluded_ from the new cruft pack if:

  1. It is reachable from at least one reference.
  2. It is in a pack from stdin prefixed with "-".
  3. It is in a pack specified by `--keep-pack`.
  4. It is in an existing cruft pack and the .mtimes file states
     that its mtime is at least as recent as the time specified by
     the --cruft-expiration option.

Breaking it down into a list like this helps me, at least. I'm not
sure what the best way would look like.
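
The same list can be written as a predicate. This is a minimal sketch
of my list exactly as written, not of the implementation (which I have
not read yet): the `ObjectInfo` type, its fields, and the function name
are all hypothetical, and timestamps are assumed to be plain integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectInfo:
    reachable: bool                    # condition 1
    in_removed_pack: bool              # condition 2: pack given with "-"
    in_keep_pack: bool                 # condition 3: pack from --keep-pack
    cruft_mtime: Optional[int] = None  # None if not in a cruft pack


def excluded_from_cruft_pack(obj, expiration=None):
    """Return True if any of the four conditions in the list holds."""
    if obj.reachable or obj.in_removed_pack or obj.in_keep_pack:
        return True
    # Condition 4 only applies when the object is already in a cruft
    # pack and an expiration time was given.
    if obj.cruft_mtime is not None and expiration is not None:
        return obj.cruft_mtime >= expiration
    return False
```

Writing it this way makes it clear that the four conditions are
independent and that any one of them is enough to exclude an object.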

(Needing to pause here and look at the implementation later.)

Thanks,
-Stolee
