git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Christian Couder <christian.couder@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, John Cai <johncai86@gmail.com>,
	 Patrick Steinhardt <ps@pks.im>,
	Christian Couder <chriscool@tuxfamily.org>
Subject: Re: [PATCH v2 3/3] upload-pack: allow configuring a missing-action
Date: Tue, 28 May 2024 12:10:31 +0200	[thread overview]
Message-ID: <CAP8UFD1_aHwbhF12v-miCTWEbbgjtpjTCmkRmFHu4Vusezq6dA@mail.gmail.com> (raw)
In-Reply-To: <xmqqjzjikhdz.fsf@gitster.g>

On Fri, May 24, 2024 at 11:51 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> >> Repository S borrows from its "promisor" X, and repository C which
> >> initially cloned from S borrows from its "promisor" S.  Even if C
> >> wants an object in order to fill in the gap in its object graph, S
> >> may not have it (and S itself may have no need for that object), and
> >> in such a case, bypassing S and let C go directly to X would make
> >> sense.
> > ...
> >>
> >> It feels utterly irresponsible to give an option to set up a server
> >> that essentially declares: I'll serve objects you ask me as best
> >> efforts basis, the pack stream I'll give you may not have all
> >> objects you asked for and missing some objects, and when I do so, I
> >> am not telling you which objects I omitted.
> >
> > I don't think it's irresponsible. The client anyways checks that it
> > got something usable in the same way as it does when it performs a
> > partial fetch or clone. The fetch or clone fails if that's not the
> > case. For example if the checkout part of a clone needs some objects
> > but cannot get them, the whole clone fails.
>
> But then what can the repository C do after seeing such a failure?

It's basically the same as when a regular clone or a partial clone or
a clone using bundle-uri fails or when using a regular bundle fails.
If it failed because the remote was not properly configured, then that
config can be fixed. If it fails because the remote doesn't have some
objects, then maybe the missing objects can be transferred to the
remote. And so on.

The feature doesn't create any new kind of failure. In particular,
when you use a partial clone, even a very simple one with a single
remote, there is always the risk of not being able to get some missing
objects as there is the risk of the remote being unreachable for some
reason (like if you take a plane and don't have an internet
connection, or if there is an outage on the server side). There are
some added risks because the feature requires added configuration and
it can be wrong like any configuration, and because there are 2
remotes instead of just one. But these are not new kinds of risks.
These risks already exist if one uses multiple promisor remotes.

> With the design, S does not even consult C to see if C knows about
> X.

If S is managed by a company like GitLab or GitHub, then S will
certainly advertise, for example by showing a command that can easily
be copy-pasted from the web page of the project onto the user's
command line, some way for C to use X.

In the cover letter I give the example of the following command that
can be used (and advertised by S):

  GIT_NO_LAZY_FETCH=0 git clone
      -c remote.my_promisor.promisor=true \
      -c remote.my_promisor.fetch="+refs/heads/*:refs/remotes/my_promisor/*" \
      -c remote.my_promisor.url=<MY_PROMISOR_URL> \
      --filter="blob:limit=5k" server

I also agree in the cover letter that this is not the most user
friendly clone command and I suggest that I could work on improving on
that by saying:

"it would be nice if there was a capability for the client to say
that it would like the server to give it information about the
promisor that it could use, so that the user doesn't have to pass all
the "remote.my_promisor.XXX" config options on the command like."

and by saying that this could be added later.

If you think that such a capability should definitely be part of this
work, for example because it wouldn't be sane to require users to use
such a long and complex command and it could avoid difficult to debug
failures, then I would be willing to work on this and add it to this
series.

> Without knowing that, it cannot safely decide that it does not
> have to send objects that can be obtained from X to C.

In the above command C is asking for a partial clone, as it uses a
--filter option. This means that C knows very well that it might not
get from S all the objects needed for a complete object graph. So why
can't S safely decide not to send some objects to C? Why would it be
Ok if C wanted a partial clone but didn't want to get some objects
from X at the same time, but would not be Ok if C wants the same
partial clone but also with the possibility to get some of the objects
from X right away? To me it seems less risky to ask for some objects
from X right away.

>  Instead, S
> simply say "if C requests an object that I do not have, just ignore
> it and let C grab it from somewhere else".  How would it not be an
> irresponsible design?

Again when using a regular partial clone omitting the same set of
objects, C also requests some objects that S doesn't have. And this is
not considered an issue or something irresponsible. It already works
like this. And then C still has the possibility to configure X as a
promisor remote and get missing objects from there. So why is it Ok
when it's done in several steps but not in one?


  reply	other threads:[~2024-05-28 10:11 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-12 13:51 [PATCH 0/3] Implement filtering repacks Christian Couder
2022-10-12 13:51 ` [PATCH 1/3] pack-objects: allow --filter without --stdout Christian Couder
2022-10-12 13:51 ` [PATCH 2/3] repack: add --filter=<filter-spec> option Christian Couder
2022-10-12 13:51 ` [PATCH 3/3] repack: introduce --force to force filtering Christian Couder
2022-10-14 16:46 ` [PATCH 0/3] Implement filtering repacks Junio C Hamano
2022-10-20 11:23   ` Christian Couder
2022-10-28 19:49     ` Taylor Blau
2022-10-28 20:26       ` Junio C Hamano
2022-11-07  9:12         ` Christian Couder
2022-11-07  9:00       ` Christian Couder
2022-10-25 12:28 ` [PATCH v2 0/2] " Christian Couder
2022-10-25 12:28   ` [PATCH v2 1/2] pack-objects: allow --filter without --stdout Christian Couder
2022-10-25 12:28   ` [PATCH v2 2/2] repack: add --filter=<filter-spec> option Christian Couder
2022-10-28 19:54   ` [PATCH v2 0/2] Implement filtering repacks Taylor Blau
2022-11-07  9:29     ` Christian Couder
2022-11-22 17:51   ` [PATCH v3 " Christian Couder
2022-11-22 17:51     ` [PATCH v3 1/2] pack-objects: allow --filter without --stdout Christian Couder
2022-11-22 17:51     ` [PATCH v3 2/2] repack: add --filter=<filter-spec> option Christian Couder
2022-11-23  0:31     ` [PATCH v3 0/2] Implement filtering repacks Junio C Hamano
2022-12-21  3:53       ` Christian Couder
2022-11-23  0:35     ` Junio C Hamano
2022-12-21  4:04     ` [PATCH v4 0/3] " Christian Couder
2022-12-21  4:04       ` [PATCH v4 1/3] pack-objects: allow --filter without --stdout Christian Couder
2023-01-04 14:56         ` Patrick Steinhardt
2022-12-21  4:04       ` [PATCH v4 2/3] repack: add --filter=<filter-spec> option Christian Couder
2023-01-04 14:56         ` Patrick Steinhardt
2023-01-05  1:39           ` Junio C Hamano
2022-12-21  4:04       ` [PATCH v4 3/3] gc: add gc.repackFilter config option Christian Couder
2023-01-04 14:57         ` Patrick Steinhardt
2024-05-15 13:25 ` [PATCH v2 0/3] upload-pack: support a missing-action Christian Couder
2024-05-15 13:25   ` [PATCH v2 1/3] rev-list: refactor --missing=<missing-action> Christian Couder
2024-05-15 16:16     ` Junio C Hamano
2024-05-15 13:25   ` [PATCH v2 2/3] pack-objects: use the missing action API Christian Couder
2024-05-15 16:46     ` Junio C Hamano
2024-05-24 16:40       ` Christian Couder
2024-05-15 13:25   ` [PATCH v2 3/3] upload-pack: allow configuring a missing-action Christian Couder
2024-05-15 17:08     ` Junio C Hamano
2024-05-24 16:41       ` Christian Couder
2024-05-24 21:51         ` Junio C Hamano
2024-05-28 10:10           ` Christian Couder [this message]
2024-05-28 15:54             ` Junio C Hamano
2024-05-31 20:43               ` Christian Couder
2024-06-01  9:43                 ` Junio C Hamano
2024-06-03 15:01                   ` Christian Couder
2024-06-03 17:29                     ` Junio C Hamano
2024-05-15 13:59   ` [PATCH v2 0/3] upload-pack: support " Christian Couder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAP8UFD1_aHwbhF12v-miCTWEbbgjtpjTCmkRmFHu4Vusezq6dA@mail.gmail.com \
    --to=christian.couder@gmail.com \
    --cc=chriscool@tuxfamily.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johncai86@gmail.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).