git@vger.kernel.org mailing list mirror (one of many)
From: Taylor Blau <me@ttaylorr.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
	Derrick Stolee <derrickstolee@github.com>,
	Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH] builtin/pack-objects.c: introduce `pack.extraCruftTips`
Date: Thu, 20 Apr 2023 16:48:39 -0400
Message-ID: <ZEGlJx4ibYSp6qmD@nand.local>
In-Reply-To: <xmqq1qkez5yg.fsf@gitster.g>

On Thu, Apr 20, 2023 at 12:52:55PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> >> But it makes me wonder if it would make the life of end-users simpler
> >> if we reserve a special ref hierarchy, say "refs/crufts/*", than
> >> having to write a program for doing something like this.
> >
> > Ideally, yes. But I think there are certain instances where there are
> > far too many (disconnected) objects that creating a reference for each
> > part of the unreachable object graph that we want to keep is infeasible.
> >
> > Another way to think about pack.extraCruftTips is that the program
> > invocation is acting like the refs/crufts hierarchy would if it existed,
> > but without actually having to write all of those references down.
>
> [...] Is there a less hand-wavy use case you have in mind?

Sure. The use-case I have in mind directly is keeping certain entries
from GitHub's `audit_log` file (see a description from Peff in [1])
while excluding others.

We use the audit_log to track every single reference change, like the
reflog but with the reference name prepended to each entry and some
optional metadata attached to the end of each entry.

Our goal is to be able to prune the test-merge objects that GitHub
creates without (usually) pruning any objects that were pushed by a user.
E.g., even if a user force-pushes their branch from A to B (where B is
not strictly ahead of A) we want to keep the objects from A around, even
though a reference is no longer pointing at them.

This already works with reflogs: the objects they mention are considered
reachable when generating a cruft pack (that is, they go in the "big"
pack with the rest of the reachable objects). pack.extraCruftTips extends
that mechanism to work with GitHub's custom format, but does so in a way
that is not tied to that format whatsoever.
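
To make that concrete, here is a rough sketch of the kind of program
pack.extraCruftTips could invoke. It is an illustration only, not what
GitHub runs. It assumes an entry layout of
"<refname> <old-oid> <new-oid> ..." (the reference name prepended to a
reflog-style entry), a made-up "refs/test-merge/" prefix for the
test-merge references, and that the program reports one hex object ID
per line. None of those details are spelled out above.

```python
import re
import sys
from typing import Iterable, Iterator

# 40 hex digits (SHA-1) or 64 hex digits (SHA-256).
OID_RE = re.compile(r"[0-9a-f]{40}|[0-9a-f]{64}")

# Hypothetical prefix for references that hold test merges.
TEST_MERGE_PREFIX = "refs/test-merge/"


def extra_tips(lines: Iterable[str]) -> Iterator[str]:
    """Yield, once each, the object IDs named by entries we want to keep.

    Assumed entry layout: "<refname> <old-oid> <new-oid> <rest...>".
    """
    seen = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed entry
        refname, old, new = fields[:3]
        if refname.startswith(TEST_MERGE_PREFIX):
            continue  # test merges are allowed to be pruned
        for oid in (old, new):
            if not OID_RE.fullmatch(oid):
                continue  # not a hex object ID
            if set(oid) == {"0"}:
                continue  # all-zeros ID: ref creation or deletion
            if oid not in seen:
                seen.add(oid)
                yield oid


# Only runs when a log path is passed as the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for tip in extra_tips(log):
            print(tip)
```

With something like this, the force-push from A to B above leaves an
entry naming both A and B, so A is still reported as a tip even though
no reference points at it, while entries under the test-merge prefix
contribute nothing.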

The hope is that others may find it useful in other special
circumstances like the one above.

Thanks,
Taylor

[1]: https://lore.kernel.org/git/20150624094919.GC5436@peff.net/

