From: Ramsay Jones <ramsay@ramsayjones.plus.com>
To: Christian Couder <christian.couder@gmail.com>
Cc: git <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>,
Jeff King <peff@peff.net>, Duy Nguyen <pclouds@gmail.com>,
Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Stefan Beller <sbeller@google.com>,
Christian Couder <chriscool@tuxfamily.org>,
SZEDER Gabor <szeder.dev@gmail.com>, Jeff King <peff@peff.net>
Subject: Re: [PATCH v4 1/7] Add delta-islands.{c,h}
Date: Mon, 13 Aug 2018 13:17:18 +0100 [thread overview]
Message-ID: <7a780fe9-e8bf-804a-82e6-8df81cd5c41c@ramsayjones.plus.com> (raw)
In-Reply-To: <CAP8UFD1tX+rAxQc47o-50Kzo6hnX9mTWH2BPSq9HiO_OgBzYTw@mail.gmail.com>
On 13/08/18 04:33, Christian Couder wrote:
> On Mon, Aug 13, 2018 at 3:14 AM, Ramsay Jones
[snip]
>> And neither the sha1 or str hash-maps are destroyed here.
>> (That is not necessarily a problem, of course! ;-) )
>
> The instances are declared as static:
>
> static khash_sha1 *island_marks;
>
> static kh_str_t *remote_islands;
>
> so it maybe ok.
Yes, that's fine.
>
>>> +struct island_bitmap {
>>> + uint32_t refcount;
>>> + uint32_t bits[];
>>
>> Use FLEX_ARRAY here? We are slowly moving toward requiring
>> certain C99 features, but I can't remember a flex array
>> weather-balloon patch.
>
> This was already discussed by Junio and Peff there:
>
> https://public-inbox.org/git/20180727130229.GB18599@sigill.intra.peff.net/
>
That is a fine discussion, without a firm conclusion, but I don't
think you can simply do nothing here:
$ cat -n junk.c
1 #include <stdint.h>
2
3 struct island_bitmap {
4 uint32_t refcount;
5 uint32_t bits[];
6 };
7
$ gcc --std=c89 --pedantic -c junk.c
junk.c:5:11: warning: ISO C90 does not support flexible array members [-Wpedantic]
uint32_t bits[];
^~~~
$ gcc --std=c99 --pedantic -c junk.c
$
>>> +};
>
>>> +int in_same_island(const struct object_id *trg_oid, const struct object_id *src_oid)
>>
>> Hmm, what does the trg_ prefix stand for?
>>
>>> +{
>>> + khiter_t trg_pos, src_pos;
>>> +
>>> + /* If we aren't using islands, assume everything goes together. */
>>> + if (!island_marks)
>>> + return 1;
>>> +
>>> + /*
>>> + * If we don't have a bitmap for the target, we can delta it
>>
>> ... Ah, OK, trg_ => target.
>
> I am ok to replace "trg" with "target" (or maybe "dst"? or something
> else) and "src" with "source" if you think it would make things
> clearer.
If it had been dst_ (or target), I would not have had a 'huh?'
moment, but it is not all that important.
>
>>> +static void add_ref_to_island(const char *island_name, const struct object_id *oid)
>>> +{
>>> + uint64_t sha_core;
>>> + struct remote_island *rl = NULL;
>>> +
>>> + int hash_ret;
>>> + khiter_t pos = kh_put_str(remote_islands, island_name, &hash_ret);
>>> +
>>> + if (hash_ret) {
>>> + kh_key(remote_islands, pos) = xstrdup(island_name);
>>> + kh_value(remote_islands, pos) = xcalloc(1, sizeof(struct remote_island));
>>> + }
>>> +
>>> + rl = kh_value(remote_islands, pos);
>>> + oid_array_append(&rl->oids, oid);
>>> +
>>> + memcpy(&sha_core, oid->hash, sizeof(uint64_t));
>>> + rl->hash += sha_core;
>>
>> Hmm, so the first 64-bits of the oid of each ref that is part of
>> this island is added together as a 'hash' for the island. And this
>> is used to de-duplicate the islands? Any false positives? (does it
>> matter - it would only affect performance, not correctness, right?)
>
> I would think that a false positive from pure chance is very unlikely.
> We would need to approach billions of delta islands (as 2 to the power
> 64/2 is in the order of billions) for the probability to be
> significant. GitHub has less than 50 millions users and it is very
> unlikely that a significant proportion of these users will fork the
> same repo.
>
> Now if there is a false positive because two forks have exactly the
> same refs, then it is not a problem if they are considered the same,
> because they are actually the same.
Yep, good point.
ATB,
Ramsay Jones
next prev parent reply other threads:[~2018-08-13 12:17 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-12 5:11 [PATCH v4 0/7] Add delta islands support Christian Couder
2018-08-12 5:11 ` [PATCH v4 1/7] Add delta-islands.{c,h} Christian Couder
2018-08-13 1:14 ` Ramsay Jones
2018-08-13 3:33 ` Christian Couder
2018-08-13 12:17 ` Ramsay Jones [this message]
2018-08-13 18:11 ` Jeff King
2018-08-16 6:02 ` Christian Couder
2018-08-13 19:00 ` Jeff King
2018-08-16 6:04 ` Christian Couder
2018-08-12 5:11 ` [PATCH v4 2/7] pack-objects: refactor code into compute_layer_order() Christian Couder
2018-08-12 5:11 ` [PATCH v4 3/7] pack-objects: add delta-islands support Christian Couder
2018-08-12 5:11 ` [PATCH v4 4/7] repack: " Christian Couder
2018-08-12 5:11 ` [PATCH v4 5/7] t: add t5319-delta-islands.sh Christian Couder
2018-08-12 5:11 ` [PATCH v4 6/7] pack-objects: move tree_depth into 'struct packing_data' Christian Couder
2018-08-12 5:11 ` [PATCH v4 7/7] pack-objects: move 'layer' " Christian Couder
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7a780fe9-e8bf-804a-82e6-8df81cd5c41c@ramsayjones.plus.com \
--to=ramsay@ramsayjones.plus.com \
--cc=Johannes.Schindelin@gmx.de \
--cc=chriscool@tuxfamily.org \
--cc=christian.couder@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=pclouds@gmail.com \
--cc=peff@peff.net \
--cc=sbeller@google.com \
--cc=szeder.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).