git@vger.kernel.org list mirror (unofficial, one of many)
From: Derrick Stolee <stolee@gmail.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Stefan Beller <sbeller@google.com>,
	Derrick Stolee <dstolee@microsoft.com>,
	Ævar Arnfjörð Bjarmason <avarab@gmail.com>,
	Jonathan Nieder <jrnieder@gmail.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Martin Fick <mfick@codeaurora.org>
Subject: Re: [PATCH 11/23] midx: sort and deduplicate objects from packfiles
Date: Thu, 21 Jun 2018 13:54:11 -0400
Message-ID: <dcd19f97-51de-8254-9977-56ad9a54cbb4@gmail.com>
In-Reply-To: <CACsJy8CJ0GNkciHVVUm_7a_MtG5RnZSWpV_zNAwtvMR8aRq42A@mail.gmail.com>

On 6/9/2018 1:07 PM, Duy Nguyen wrote:
> On Thu, Jun 7, 2018 at 4:06 PM Derrick Stolee <stolee@gmail.com> wrote:
>> Before writing a list of objects and their offsets to a multi-pack-index
>> (MIDX), we need to collect the list of objects contained in the
>> packfiles. There may be multiple copies of some objects, so this list
>> must be deduplicated.
> Can you just do merge-sort with a slight modification to ignore duplicates?

Are you proposing a multi-way merge of the already-sorted object lists 
in the pack-indexes, skipping duplicates? As I picture it, this would 
work as follows:

1. Keep an array of positions within each of the pack-indexes for the 
"current lex-least OID not already in my sorted list"

2. Scan the list of P pack-indexes to find the lex-least OID among all 
candidates. Advance the position of that pack-index as we put that OID 
in the list (and advance the position of pack-indexes with duplicates).

This would take O(P * N) time, where P is the number of packfiles and N 
is the total number of objects. It gets somewhat better when there are 
duplicates: if we have P identical lists of n objects each, then 
N = n * P and we need only N steps, because each scan advances every 
pack-index past the duplicate value and never revisits it. However, we 
do not expect duplicates at that density.
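
To make the two steps above concrete, here is a rough sketch of the 
scanning merge. It is not code from the series: the names sorted_list 
and merge_scan are made up for illustration, and OIDs are shrunk to 
uint32_t so the example stays self-contained. It also keeps whichever 
copy it sees first instead of preferring the newest packfile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one pack-index: a sorted array of OIDs.
 * Real OIDs are full hashes; uint32_t keeps the sketch short. */
struct sorted_list {
	const uint32_t *oids;
	size_t nr;
};

/*
 * Merge nr_lists sorted lists into `out`, skipping duplicates.
 * `pos` is caller-provided scratch space with nr_lists entries.
 * Returns the number of distinct OIDs written to `out`.
 * Every output value costs one scan over all lists: O(P * N) worst case.
 */
static size_t merge_scan(const struct sorted_list *lists, size_t nr_lists,
			 size_t *pos, uint32_t *out)
{
	size_t nr_out = 0;
	size_t i;

	assert(lists && pos && out);

	/* Step 1: one position per list, all starting at the front. */
	for (i = 0; i < nr_lists; i++)
		pos[i] = 0;

	for (;;) {
		int found = 0;
		uint32_t min = 0;

		/* Step 2: find the least OID among the current candidates. */
		for (i = 0; i < nr_lists; i++) {
			if (pos[i] == lists[i].nr)
				continue; /* this list is exhausted */
			if (!found || lists[i].oids[pos[i]] < min) {
				min = lists[i].oids[pos[i]];
				found = 1;
			}
		}
		if (!found)
			break;

		out[nr_out++] = min;

		/* Advance every list whose candidate equals the chosen OID. */
		for (i = 0; i < nr_lists; i++)
			while (pos[i] < lists[i].nr &&
			       lists[i].oids[pos[i]] == min)
				pos[i]++;
	}
	return nr_out;
}
```

The caller has to size `out` for the worst case (no duplicates at all), 
which is the sum of the list lengths.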

By adding some complexity to the algorithm, we could keep the 
pack-indexes sorted by their current lex-least OIDs and update the order 
as we advance, or rather use a min-heap to find the proper pack-index, 
which would bring the cost down to O(N * log(P)). This is most likely to 
be valuable when updating a large MIDX by adding a list of smaller IDX 
files (which we expect not to be the "best" choice for most of the 
selections). I'm not sure the complexity is worth it (would need to 
measure!).
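
For comparison, a min-heap version might look roughly like the sketch 
below. Again this is only an illustration under the same assumptions 
(uint32_t OIDs, made-up names cursor, sift_down and merge_heap), not 
something I have measured or plan to send as-is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor into one sorted OID list (one per pack-index). */
struct cursor {
	const uint32_t *oids;
	size_t nr, pos;
};

/* Restore the min-heap property below index i, keyed on the current OID. */
static void sift_down(struct cursor *heap, size_t nr, size_t i)
{
	for (;;) {
		size_t l = 2 * i + 1, r = l + 1, m = i;
		struct cursor tmp;

		if (l < nr && heap[l].oids[heap[l].pos] < heap[m].oids[heap[m].pos])
			m = l;
		if (r < nr && heap[r].oids[heap[r].pos] < heap[m].oids[heap[m].pos])
			m = r;
		if (m == i)
			return;
		tmp = heap[i];
		heap[i] = heap[m];
		heap[m] = tmp;
		i = m;
	}
}

/*
 * Merge the cursors in `heap` (reordered in place) into `out`,
 * dropping duplicates. Returns the number of distinct OIDs.
 * Each of the N entries costs O(log P) heap work.
 */
static size_t merge_heap(struct cursor *heap, size_t nr, uint32_t *out)
{
	size_t nr_out = 0, i;

	assert(heap && out);

	/* Drop empty lists so every heap entry has a current OID. */
	for (i = 0; i < nr; ) {
		if (heap[i].pos >= heap[i].nr)
			heap[i] = heap[--nr];
		else
			i++;
	}

	/* Build the heap bottom-up. */
	for (i = nr / 2; i-- > 0; )
		sift_down(heap, nr, i);

	while (nr) {
		uint32_t oid = heap[0].oids[heap[0].pos];

		/* Output is non-decreasing, so compare with the last value. */
		if (!nr_out || out[nr_out - 1] != oid)
			out[nr_out++] = oid;

		if (++heap[0].pos == heap[0].nr)
			heap[0] = heap[--nr]; /* list exhausted: remove it */
		sift_down(heap, nr, 0);
	}
	return nr_out;
}
```

Unlike the scanning version, duplicates are popped one at a time here, 
so a pile of identical lists does not get cheaper.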

By concatenating the lists within each fanout value and sorting, as the 
patch does, we do 256 sorts of size ~N/256, giving O(N * log(N/256)) 
performance. This method also needs an extra array of size ~N/200 to 
store the batches, resulting in extra copies being pushed around.
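
As a reduced illustration of that batching (not the patch itself), the 
sketch below treats the top byte of a uint32_t as the fanout byte and 
walks each sorted list with a cursor instead of reading a real IDX 
fanout table. The names pack, entry, entry_cmp and dedup_by_fanout are 
invented for the example, and the batch buffer is sized by the caller 
rather than grown on demand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the structures in the patch. */
struct entry {
	uint32_t oid;     /* top byte plays the role of the fanout byte */
	uint32_t pack_id;
	long mtime;
};

struct pack {
	const uint32_t *oids; /* sorted */
	size_t nr;
	long mtime;
};

/* Order by OID, then newest pack first, then pack id. */
static int entry_cmp(const void *va, const void *vb)
{
	const struct entry *a = va, *b = vb;

	if (a->oid != b->oid)
		return a->oid < b->oid ? -1 : 1;
	if (a->mtime != b->mtime)
		return a->mtime > b->mtime ? -1 : 1;
	return (a->pack_id > b->pack_id) - (a->pack_id < b->pack_id);
}

/*
 * For each of the 256 fanout values: gather matching entries from all
 * packs into `batch`, sort, and copy the first of each duplicate run
 * to `out`. `pos` needs nr_packs slots; `batch` and `out` must hold
 * the total number of entries in the worst case.
 */
static size_t dedup_by_fanout(const struct pack *packs, size_t nr_packs,
			      size_t *pos, struct entry *batch,
			      struct entry *out)
{
	size_t nr_out = 0, i, p;
	unsigned fanout;

	assert(packs && pos && batch && out);
	for (p = 0; p < nr_packs; p++)
		pos[p] = 0;

	for (fanout = 0; fanout < 256; fanout++) {
		size_t nr_batch = 0;

		for (p = 0; p < nr_packs; p++) {
			while (pos[p] < packs[p].nr &&
			       (packs[p].oids[pos[p]] >> 24) == fanout) {
				batch[nr_batch].oid = packs[p].oids[pos[p]++];
				batch[nr_batch].pack_id = (uint32_t)p;
				batch[nr_batch].mtime = packs[p].mtime;
				nr_batch++;
			}
		}

		qsort(batch, nr_batch, sizeof(*batch), entry_cmp);

		/* Sorted by OID then mtime (descending): keep the first. */
		for (i = 0; i < nr_batch; i++) {
			if (i && batch[i - 1].oid == batch[i].oid)
				continue;
			out[nr_out++] = batch[i];
		}
	}
	return nr_out;
}
```

Because the comparison puts the newest pack first among equal OIDs, the 
copy that survives is the one from the most recently modified packfile.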

You've convinced me that your approach may be better, especially in the 
typical case of adding a small number of packfiles to an existing MIDX 
file. Some work is needed to be sure it is better in general (such as 
reported cases of 5000 packfiles!). I'll leave a note to revisit this 
between v2 and v3.

>
>> It is possible to artificially get into a state where there are many
>> duplicate copies of objects. That can create high memory pressure if we
>> are to create a list of all objects before de-duplication. To reduce
>> this memory pressure without a significant performance drop,
>> automatically group objects by the first byte of their object id. Use
>> the IDX fanout tables to group the data, copy to a local array, then
>> sort.
>>
>> Copy only the de-duplicated entries. Select the duplicate based on the
>> most-recent modified time of a packfile containing the object.
>>
>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>> ---
>>   midx.c | 138 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 138 insertions(+)
>>
>> diff --git a/midx.c b/midx.c
>> index 923acda72e..b20d52713c 100644
>> --- a/midx.c
>> +++ b/midx.c
>> @@ -4,6 +4,7 @@
>>   #include "csum-file.h"
>>   #include "lockfile.h"
>>   #include "object-store.h"
>> +#include "packfile.h"
>>   #include "midx.h"
>>
>>   #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */
>> @@ -190,6 +191,140 @@ static void sort_packs_by_name(char **pack_names, uint32_t nr_packs, uint32_t *p
>>          }
>>   }
>>
>> +static uint32_t get_pack_fanout(struct packed_git *p, uint32_t value)
>> +{
>> +       const uint32_t *level1_ofs = p->index_data;
>> +
>> +       if (!level1_ofs) {
>> +               if (open_pack_index(p))
>> +                       return 0;
>> +               level1_ofs = p->index_data;
>> +       }
>> +
>> +       if (p->index_version > 1) {
>> +               level1_ofs += 2;
>> +       }
>> +
>> +       return ntohl(level1_ofs[value]);
>> +}
> Maybe keep this in packfile.c, refactor fanout code in there if
> necessary, keep .idx file format info in that file instead of
> spreading it out more.
>
>> +
>> +struct pack_midx_entry {
>> +       struct object_id oid;
>> +       uint32_t pack_int_id;
>> +       time_t pack_mtime;
>> +       uint64_t offset;
>> +};
>> +
>> +static int midx_oid_compare(const void *_a, const void *_b)
>> +{
>> +       struct pack_midx_entry *a = (struct pack_midx_entry *)_a;
>> +       struct pack_midx_entry *b = (struct pack_midx_entry *)_b;
> Try not to lose "const" while typecasting.
>
>> +       int cmp = oidcmp(&a->oid, &b->oid);
>> +
>> +       if (cmp)
>> +               return cmp;
>> +
>> +       if (a->pack_mtime > b->pack_mtime)
>> +               return -1;
>> +       else if (a->pack_mtime < b->pack_mtime)
>> +               return 1;
>> +
>> +       return a->pack_int_id - b->pack_int_id;
>> +}
>> +
>> +static void fill_pack_entry(uint32_t pack_int_id,
>> +                           struct packed_git *p,
>> +                           uint32_t cur_object,
>> +                           struct pack_midx_entry *entry)
>> +{
>> +       if (!nth_packed_object_oid(&entry->oid, p, cur_object))
>> +               die("failed to locate object %d in packfile", cur_object);
> _()
>
>> +
>> +       entry->pack_int_id = pack_int_id;
>> +       entry->pack_mtime = p->mtime;
>> +
>> +       entry->offset = nth_packed_object_offset(p, cur_object);
>> +}
>> +
>> +/*
>> + * It is possible to artificially get into a state where there are many
>> + * duplicate copies of objects. That can create high memory pressure if
>> + * we are to create a list of all objects before de-duplication. To reduce
>> + * this memory pressure without a significant performance drop, automatically
>> + * group objects by the first byte of their object id. Use the IDX fanout
>> + * tables to group the data, copy to a local array, then sort.
>> + *
>> + * Copy only the de-duplicated entries (selected by most-recent modified time
>> + * of a packfile containing the object).
>> + */
>> +static struct pack_midx_entry *get_sorted_entries(struct packed_git **p,
>> +                                                 uint32_t *perm,
>> +                                                 uint32_t nr_packs,
>> +                                                 uint32_t *nr_objects)
>> +{
>> +       uint32_t cur_fanout, cur_pack, cur_object;
>> +       uint32_t nr_fanout, alloc_fanout, alloc_objects, total_objects = 0;
>> +       struct pack_midx_entry *entries_by_fanout = NULL;
>> +       struct pack_midx_entry *deduplicated_entries = NULL;
>> +
>> +       for (cur_pack = 0; cur_pack < nr_packs; cur_pack++) {
>> +               if (open_pack_index(p[cur_pack]))
>> +                       continue;
> Is it a big problem if you fail to open .idx for a certain pack?
> Should we error out and abort instead of continuing on? Later on, in
> the second pack loop, when get_pack_fanout() returns zero (failure),
> you don't seem to catch it and skip the pack.
>
>> +
>> +               total_objects += p[cur_pack]->num_objects;
>> +       }
>> +
>> +       /*
>> +        * As we de-duplicate by fanout value, we expect the fanout
>> +        * slices to be evenly distributed, with some noise. Hence,
>> +        * allocate slightly more than one 256th.
>> +        */
>> +       alloc_objects = alloc_fanout = total_objects > 3200 ? total_objects / 200 : 16;
>> +
>> +       ALLOC_ARRAY(entries_by_fanout, alloc_fanout);
>> +       ALLOC_ARRAY(deduplicated_entries, alloc_objects);
>> +       *nr_objects = 0;
>> +
>> +       for (cur_fanout = 0; cur_fanout < 256; cur_fanout++) {
>> +               nr_fanout = 0;
> Keep variable scope small, declare nr_fanout here instead of at the
> top of the function.
>
>> +
>> +               for (cur_pack = 0; cur_pack < nr_packs; cur_pack++) {
>> +                       uint32_t start = 0, end;
>> +
>> +                       if (cur_fanout)
>> +                               start = get_pack_fanout(p[cur_pack], cur_fanout - 1);
>> +                       end = get_pack_fanout(p[cur_pack], cur_fanout);
>> +
>> +                       for (cur_object = start; cur_object < end; cur_object++) {
>> +                               ALLOC_GROW(entries_by_fanout, nr_fanout + 1, alloc_fanout);
>> +                               fill_pack_entry(perm[cur_pack], p[cur_pack], cur_object, &entries_by_fanout[nr_fanout]);
>> +                               nr_fanout++;
>> +                       }
>> +               }
>> +
>> +               QSORT(entries_by_fanout, nr_fanout, midx_oid_compare);
>> +
>> +               /*
>> +                * The batch is now sorted by OID and then mtime (descending).
>> +                * Take only the first duplicate.
>> +                */
>> +               for (cur_object = 0; cur_object < nr_fanout; cur_object++) {
>> +                       if (cur_object && !oidcmp(&entries_by_fanout[cur_object - 1].oid,
>> +                                                 &entries_by_fanout[cur_object].oid))
>> +                               continue;
>> +
>> +                       ALLOC_GROW(deduplicated_entries, *nr_objects + 1, alloc_objects);
>> +                       memcpy(&deduplicated_entries[*nr_objects],
>> +                              &entries_by_fanout[cur_object],
>> +                              sizeof(struct pack_midx_entry));
>> +                       (*nr_objects)++;
>> +               }
>> +       }
>> +
>> +       FREE_AND_NULL(entries_by_fanout);
>> +       return deduplicated_entries;
>> +}
>> +
>>   static size_t write_midx_pack_lookup(struct hashfile *f,
>>                                       char **pack_names,
>>                                       uint32_t nr_packs)
>> @@ -254,6 +389,7 @@ int write_midx_file(const char *object_dir)
>>          uint64_t written = 0;
>>          uint32_t chunk_ids[MIDX_MAX_CHUNKS + 1];
>>          uint64_t chunk_offsets[MIDX_MAX_CHUNKS + 1];
>> +       uint32_t nr_entries;
>>
>>          midx_name = get_midx_filename(object_dir);
>>          if (safe_create_leading_directories(midx_name)) {
>> @@ -312,6 +448,8 @@ int write_midx_file(const char *object_dir)
>>          ALLOC_ARRAY(pack_perm, nr_packs);
>>          sort_packs_by_name(pack_names, nr_packs, pack_perm);
>>
>> +       get_sorted_entries(packs, pack_perm, nr_packs, &nr_entries);
> Intentionally ignoring the return value (and temporarily leaking as a
> result) should have at least a comment to acknowledge it and save
> reviewers some head scratching. Or even better, just free it now, even
> if you don't use it.
>
>> +
>>          hold_lock_file_for_update(&lk, midx_name, LOCK_DIE_ON_ERROR);
>>          f = hashfd(lk.tempfile->fd, lk.tempfile->filename.buf);
>>          FREE_AND_NULL(midx_name);
>> --
>> 2.18.0.rc1
>>
>

