From: Elijah Newren <>
To: Jonathan Tan <>
Cc:, Kaushik Srenevasan <>,
	Git Mailing List <>
Subject: Re: [RFC] Extending git-replace
Date: Tue, 14 Jan 2020 12:39:49 -0800
Message-ID: <>
In-Reply-To: <>

On Tue, Jan 14, 2020 at 11:05 AM Jonathan Tan <> wrote:
> > That is, would it be sufficient if every replaced file were replaced
> > with the exact text "me caga en la leche" instead of a custom hand-
> > crafted replacement?  I guess it's a bit complicated because while
> > that's a reasonable blob, it's not a valid commit.  So maybe this
> > mechanism would be limited to blobs.  I thought about whether we could
> > have a different flavor of replacement for commits, but those generally have
> > to be custom because they each have different parents.
>
> Since the original email just discussed blobs, I'll confine myself to
> discussing blobs. (Commits are trickier, as you said.)
>
> > And if that would be sufficient, could promisors be used for this?  I
> > don't know how those interact with fsck and the other commands that
> > you're worried about.  Basically, the idea would be to use most of the
> > existing promisor code, and then have a mode where instead of visiting
> > the promisor, we just always return "me caga en la leche" (and this
> > does not have its SHA checked, of course).

Maybe; it doesn't necessarily need to be the same object returned, and
these replacements could be user-specified via replace refs...

> Missing promisor objects do not prevent fsck from passing - this is part
> of the original design (any packfiles we download from the specifically
> designated promisor remote are marked as such, and any objects that the
> objects in the packfile refer to are considered OK to be missing).

Is there ever a risk that objects in the downloaded packfile come
across as deltas against other objects that are missing/excluded, or
does the partial clone machinery ensure that doesn't happen?  (Because
this was certainly the biggest pain-point with my "fake cheap clone"
hacks.)

> Currently, when a missing object is read, it is first fetched (there are
> some more details that I can go over if you have any specific
> questions). What you're suggesting here is to return a fake blob with
> wrong hash - I haven't looked at all the callers of read-object
> functions in detail, but I don't think all of them are ready for such a
> behavioral change.

git-replace already took care of that for you and provides that
guarantee, modulo the --no-replace-objects & fsck & prune & fetch &
whatnot cases that ignore replace objects as Kaushik mentioned.  I
took advantage of this to great effect with my "fake cheap clone"
hacks.  Based in part on your other email where you made a suggestion
about promisors, I'm starting to think a pretty good first cut
solution might look like the following:

  * user manually adds a bunch of replace refs to map the unwanted big
    blobs to something else (e.g. a README about how the files were
    stripped, or something similar to this)
  * a partial clone specification that says "exclude objects that are
    referenced by replace refs"
  * add a fake promisor to the downloaded promisor pack so that if
    anyone runs with --no-replace-objects or similar then they get an
    error saying the specified objects don't exist and can't be
Anyone see any obvious problems with this?
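
To make the first bullet concrete, here is a rough sketch of mapping one
unwanted blob to a small stub with existing commands.  The throwaway
repository, the file name big.bin and the stub text are made up for
illustration; only git hash-object, git replace, git cat-file and
--no-replace-objects are real.  The partial clone specification and the
fake promisor from the other two bullets do not exist today and are not
shown.

```shell
#!/bin/sh
set -e

# Throwaway repository (path and contents are illustrative only).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Commit a stand-in for an unwanted large blob.
printf 'pretend this is a huge binary\n' > big.bin
git add big.bin
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add big.bin'
big=$(git rev-parse HEAD:big.bin)

# Write the stub blob into the object database and remember its ID.
stub=$(printf 'big.bin was stripped; see README\n' | git hash-object -w --stdin)

# Create refs/replace/$big pointing at the stub.
git replace "$big" "$stub"

# Ordinary reads of $big now return the stub ...
seen=$(git cat-file -p "$big")
# ... while --no-replace-objects still reads the original object.
raw=$(git --no-replace-objects cat-file -p "$big")

echo "with replace refs:    $seen"
echo "without replace refs: $raw"
```

In the proposed scheme the original blob would not be in the clone at
all, so the second read is where the error from the fake promisor would
show up.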

> Maybe it would be sufficient to just make this work
> in a more limited scope (e.g. checkout only - and if we need different
> replacement blobs for different object IDs, maybe we could have
> something similar to the clean/smudge filters).

> > This could work together with some sort of refs/blacklist mechanism to
> > enable the server to choose which objects the client replaces.
> In the original email, Kaushik mentioned objects larger than a certain
> size - we already have support for that (--filter=blob:limit=1000000,
> for example). Having said that, Git is already able to tolerate any
> exclusion (of tree or blob) from the server - we already need this in
> order to support changing of filters, for example.
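
For reference, a small local sketch of the size filter mentioned above.
The repository layout, file names and the 1000-byte limit are chosen
only to keep the demo tiny; uploadpack.allowFilter has to be enabled on
the serving side, and --no-checkout is used so that nothing gets lazily
fetched during the clone.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)

# "Server" repository with one small and one large blob.
git init -q "$work/src"
cd "$work/src"
printf 'small\n' > small.txt
head -c 2000 /dev/zero | tr '\0' 'x' > large.txt
git add small.txt large.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'small and large'
# The serving side must opt in to object filters.
git config uploadpack.allowFilter true
small=$(git rev-parse HEAD:small.txt)
large=$(git rev-parse HEAD:large.txt)

# Partial clone that omits blobs of 1000 bytes or more.
cd "$work"
git clone -q --no-checkout --filter=blob:limit=1000 "file://$work/src" dst

# Objects that are reachable but absent are printed with a leading "?".
missing=$(git -C dst rev-list --objects --missing=print HEAD | sed -n 's/^?//p')
echo "missing from the clone: $missing"

# fsck is expected to tolerate objects promised by the promisor remote.
if git -C dst fsck; then
    echo "fsck passed despite the missing blob"
fi
```

Only large.txt's blob should be listed as missing; the small blob, the
tree and the commit arrive in the promisor pack.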


Thread overview: 9+ messages
2020-01-14  5:33 Kaushik Srenevasan
2020-01-14  6:55 ` Elijah Newren
2020-01-14 19:11   ` Jonathan Tan
2020-01-16  3:30   ` Kaushik Srenevasan
2020-01-14 18:19 ` David Turner
2020-01-14 19:03   ` Jonathan Tan
2020-01-14 20:39     ` Elijah Newren [this message]
2020-01-14 21:57       ` Jonathan Tan
2020-01-14 22:46         ` Elijah Newren
