From: Christian Couder <christian.couder@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git <git@vger.kernel.org>, Jeff King <peff@peff.net>,
	Ben Peart <Ben.Peart@microsoft.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Jonathan Nieder <jrnieder@gmail.com>,
	Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
	Mike Hommey <mh@glandium.org>,
	Lars Schneider <larsxschneider@gmail.com>,
	Eric Wong <e@80x24.org>,
	Christian Couder <chriscool@tuxfamily.org>,
	Jeff Hostetler <jeffhost@microsoft.com>,
	Eric Sunshine <sunshine@sunshineco.com>,
	Beat Bolli <dev+git@drbeat.li>
Subject: Re: [PATCH v3 02/11] Add initial support for many promisor remotes
Date: Mon, 1 Apr 2019 18:41:33 +0200
Message-ID: <CAP8UFD208vY=0tduwSipBHYTPJCrBtsME6GouZMiKrnXJ=0zAw@mail.gmail.com>
In-Reply-To: <xmqqtvg7e7pn.fsf@gitster-ct.c.googlers.com>

On Wed, Mar 13, 2019 at 5:09 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > +struct promisor_remote *promisor_remote_new(const char *remote_name)
> > +{
>
> Shouldn't this be static?  The config callback that calls this
> function is inside this file.

Yeah, I made it static.

> > +     struct promisor_remote *o;
> > +
> > +     o = xcalloc(1, sizeof(*o));
> > +     o->remote_name = xstrdup(remote_name);
>
> A comment on this later...
>
> > +static struct promisor_remote *promisor_remote_look_up(const char *remote_name,
> > +                                                    struct promisor_remote **previous)
>
> In our codebase, this operation is far more often called "lookup",
> one word, according to "git grep -e look_up \*.h".

Ok, I changed it to "lookup".

> > +{
> > +     struct promisor_remote *o, *p;
> > +
> > +     for (p = NULL, o = promisors; o; p = o, o = o->next)
> > +             if (o->remote_name && !strcmp(o->remote_name, remote_name)) {
> > +                     if (previous)
> > +                             *previous = p;
>
> I think the "previous" thing is for the callers to learn what
> pointer points at the found entry, allowing e.g. an element to be
> inserted just before the found element.

Actually it's to make it easy to move the found element.

> If so, would it make more
> sense to use the more familiar pattern to use
>
>         *previous = &promisors;
>
> here?

If I do that I get an "error: assignment from incompatible pointer
type" as "*previous" is of type "struct promisor_remote *" while
"&promisors" is of type "struct promisor_remote **".

Maybe you mean:

         *previous = promisors;

but I fail to see how that would correctly pass the previous element
when the found one is not the first one.

> That would remove the need to switch on NULL-ness of previous
> in the caller.

In the only caller that passes a non-NULL previous, we call
promisor_remote_move_to_tail(), which does:

    if (previous)
        previous->next = o->next;
    else
        promisors = o->next ? o->next : o;

So yes, we do check the NULL-ness of previous, but if previous were set
to promisors, then "previous->next = o->next" would not update promisors
correctly.

I guess this is not a case where the familiar pattern you are thinking
of can be applied. Or is there an example, maybe in the Git source
code, that I could learn from?
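
For reference, one possible reading of the "familiar pattern" is the
pointer-to-pointer idiom, where the lookup returns the address of the
pointer that points at the found entry (either &promisors or
&prev->next) instead of the previous element. The following is only a
minimal sketch under that assumption, not code from the patch; the
names lookup_slot(), append() and move_to_tail() and the tail variable
are hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct promisor_remote {
	struct promisor_remote *next;
	const char *name;
};

/* Head of the list, and the address of the last "next" slot (the tail). */
static struct promisor_remote *promisors;
static struct promisor_remote **promisors_tail = &promisors;

/*
 * Return the address of the pointer that points at the entry called
 * "name": either &promisors or &prev->next. If nothing matches, the
 * returned slot is the tail slot, which holds NULL.
 */
static struct promisor_remote **lookup_slot(const char *name)
{
	struct promisor_remote **slot;

	for (slot = &promisors; *slot; slot = &(*slot)->next)
		if (!strcmp((*slot)->name, name))
			break;
	return slot;
}

/* Link "r" in as the new last element. */
static void append(struct promisor_remote *r)
{
	r->next = NULL;
	*promisors_tail = r;
	promisors_tail = &r->next;
}

/* Unlink the entry reached through "slot" and re-append it at the tail. */
static void move_to_tail(struct promisor_remote **slot)
{
	struct promisor_remote *r = *slot;

	if (!r || !r->next)
		return;		/* not found, or already the last element */
	*slot = r->next;	/* same statement for head and non-head entries */
	append(r);
}
```

With this shape the caller never tests whether the found entry is the
head, because "*slot = r->next" updates promisors itself when slot is
&promisors. The early return for the last element still matters:
without it the tail pointer would be left pointing into the entry that
is being moved.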

Another possibility is to just use a hashmap, as you suggest below, or
list.h. It might be a bit wasteful, but the code simplification might
be worth it.

> > diff --git a/promisor-remote.h b/promisor-remote.h
> > new file mode 100644
> > index 0000000000..bfbf7c0f21
> > --- /dev/null
> > +++ b/promisor-remote.h
> > @@ -0,0 +1,17 @@
> > +#ifndef PROMISOR_REMOTE_H
> > +#define PROMISOR_REMOTE_H
> > +
> > +/*
> > + * Promisor remote linked list
> > + * Its information come from remote.XXX config entries.
> > + */
> > +struct promisor_remote {
> > +     const char *remote_name;
> > +     struct promisor_remote *next;
> > +};
>
> Would it make the management of storage easier to make it
>
>         struct promisor_remote {
>                 struct promisor_remote *next;
>                 const char name[FLEX_ARRAY];
>         };
>
> that will allow allocation with
>
>         struct promisor_remote *r;
>         FLEX_ALLOC_STR(r, name, remote_name);

I am OK with using a flex array. If we ever use arrays or hashmaps of
promisor remotes, we might have to go back to not using one.
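
To make the trade-off concrete, here is a rough plain-C approximation
of what allocating with a flex array amounts to. It is a sketch only:
it uses calloc() instead of Git's FLEX_ALLOC_STR() and xcalloc(), the
function name promisor_remote_alloc() is hypothetical, and the name
field is left non-const to avoid the cast that a const member would
need.

```c
#include <stdlib.h>
#include <string.h>

struct promisor_remote {
	struct promisor_remote *next;
	char name[];	/* C99 flexible array member; must be the last field */
};

/*
 * One zeroed allocation holds both the struct and the NUL-terminated
 * name, so there is no separate xstrdup()ed string to manage.
 */
static struct promisor_remote *promisor_remote_alloc(const char *remote_name)
{
	size_t len = strlen(remote_name);
	struct promisor_remote *r = calloc(1, sizeof(*r) + len + 1);

	if (!r)
		return NULL;	/* Git's xcalloc() would die() here instead */
	memcpy(r->name, remote_name, len);	/* calloc already wrote the NUL */
	return r;
}
```

A single free(r) then releases the entry together with its name. The
cost is that entries have different sizes, so they cannot be stored by
value as elements of an array; they can only be referenced through
pointers.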

> Or if the remote_name field must be a pointer, perhaps use
> FLEXPTR_ALLOC_STR().

[...]

> Can the name of promisor be any string?  If they end up getting used
> as part of a path on the filesystem, we'd need to worry about case
> sensitivity and UTF-8 normalization issues as well.

It looks like for regular remotes we only check whether the name starts
with '/'. So I don't think we need to do more than that for promisor
remotes. I added that check.
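
As an illustration, such a check could look like the sketch below. The
helper name promisor_remote_name_ok(), the warning text, and the
refusal of empty names are assumptions made for this example, not the
code from the patch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Return 1 if "remote_name" is acceptable as a promisor remote name,
 * 0 otherwise. A leading '/' is rejected, as described above; NULL
 * and empty names are also refused, which is an extra assumption of
 * this sketch.
 */
static int promisor_remote_name_ok(const char *remote_name)
{
	if (!remote_name || !*remote_name)
		return 0;
	if (*remote_name == '/') {
		fprintf(stderr,
			"warning: promisor remote name cannot begin with '/': %s\n",
			remote_name);
		return 0;
	}
	return 1;
}
```

A config callback would call this before creating a new entry and
skip the entry when it returns 0.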

> In a large enough project where multi-promisor makes sense, what is
> the expected number of promisors a repository would define?  10s?
> 1000s?  Would a linked list still make sense when deployed in the
> real world, or would we be forced to move to something like hashmap
> later?

I am OK with using a hashmap, to make this similar to regular remotes.

For now, though, I don't expect large projects to use more than tens of
promisors. They are defined in the config file, and I don't think people
will be happy if they have to manage more than tens of promisors there.
If people really start to use more than that, they are likely to ask us
for a new mechanism to manage them (and to have them configured
automatically from servers). So maybe we can change that if/when we
have to work on such a mechanism.




> You do not have to have the answers to all these questions, and even
> the ones with concrete answers, you do not necessarily have to act
> on them right now (e.g. you may anticipate the eventual need to move
> to hashmap, but prototyping with linked list is perfectly fine;
> being aware of the possibility alone would force us to be careful to
> make sure that the implementation detail does not leak through too
> much and confined within _lookup(), _find(), etc. functions, and
> that awareness is good enough at this point).
>
> Thanks.
