Re: [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

From: Neeraj Singh <nksingh85@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Git List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Patrick Steinhardt <ps@pks.im>,
	Bagas Sanjaya <bagasdotme@gmail.com>,
	Neeraj Singh <neerajsi@microsoft.com>
Subject: Re: [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags
Date: Wed, 23 Mar 2022 13:19:32 -0700	[thread overview]
Message-ID: <CANQDOddpo+a8r_0yghgy_1bHvfUe5XQaaaWc7D-OLqX6Anhgiw@mail.gmail.com> (raw)
In-Reply-To: <220323.86sfr9ndpr.gmgdl@evledraar.gmail.com>

I'm going to respond in more detail to your individual patches,
(expect the last mail to contain a comment at the end "LAST MAIL").

On Wed, Mar 23, 2022 at 3:52 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Tue, Mar 22 2022, Neeraj Singh wrote:
>
> > On Tue, Mar 22, 2022 at 8:48 PM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >> As with unpack-objects in a preceding commit have update-index.c make
> >> use of the HASH_N_OBJECTS{,_{FIRST,LAST}} flags. We now have a "batch"
> >> mode again for "update-index".
> >>
> >> Adding the t/* directory from git.git on a Linux ramdisk is a bit
> >> faster than with the tmp-objdir indirection:
> >>
> >>         git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3' -p 'rm -rf repo && git init repo && cp -R t repo/' 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' --warmup 1 -r 10
> >>         Benchmark 1: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync
> >>           Time (mean ± σ):     289.8 ms ±   4.0 ms    [User: 186.3 ms, System: 103.2 ms]
> >>           Range (min … max):   285.6 ms … 297.0 ms    10 runs
> >>
> >>         Benchmark 2: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD
> >>           Time (mean ± σ):     273.9 ms ±   7.3 ms    [User: 189.3 ms, System: 84.1 ms]
> >>           Range (min … max):   267.8 ms … 291.3 ms    10 runs
> >>
> >>         Summary
> >>           'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
> >>             1.06 ± 0.03 times faster than 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'
> >>
> >> And as before running that with "strace --summary-only" slows things
> >> down a bit (probably mimicking slower I/O a bit). I then get:
> >>
> >>         Summary
> >>           'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
> >>             1.21 ± 0.02 times faster than 'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'
> >>
> >> We also go from ~51k syscalls to ~39k, with ~2x the number of link()
> >> and unlink() in ns/batched-fsync.
> >>
> >> In the process of doing this conversion we lost the "bulk" mode for
> >> files added on the command-line. I don't think it's useful to optimize
> >> that, but we could if anyone cared.
> >>
> >> We've also converted this to a string_list, we could walk with
> >> getline_fn() and get one line "ahead" to see what we have left, but I
> >> found that state machine a bit painful, and at least in my testing
> >> buffering this doesn't harm things. But we could also change this to
> >> stream again, at the cost of some getline_fn() twiddling.
> >>
> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> >> ---
> >>  builtin/update-index.c | 31 +++++++++++++++++++++++++++----
> >>  1 file changed, 27 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/builtin/update-index.c b/builtin/update-index.c
> >> index af02ff39756..c7cbfe1123b 100644
> >> --- a/builtin/update-index.c
> >> +++ b/builtin/update-index.c
> >> @@ -1194,15 +1194,38 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
> >>         }
> >>
> >>         if (read_from_stdin) {
> >> +               struct string_list list = STRING_LIST_INIT_NODUP;
> >>                 struct strbuf line = STRBUF_INIT;
> >>                 struct strbuf unquoted = STRBUF_INIT;
> >> +               size_t i, nr;
> >> +               unsigned oflags;
> >>
> >>                 setup_work_tree();
> >> -               while (getline_fn(&line, stdin) != EOF)
> >> -                       line_from_stdin(&line, &unquoted, prefix, prefix_length,
> >> -                                       nul_term_line, set_executable_bit, 0);
> >> +               while (getline_fn(&line, stdin) != EOF) {
> >> +                       size_t len = line.len;
> >> +                       char *str = strbuf_detach(&line, NULL);
> >> +
> >> +                       string_list_append_nodup(&list, str)->util = (void *)len;
> >> +               }
> >> +
> >> +               nr = list.nr;
> >> +               oflags = nr > 1 ? HASH_N_OBJECTS : 0;
> >> +               for (i = 0; i < nr; i++) {
> >> +                       size_t nth = i + 1;
> >> +                       unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
> >> +                                 nr == nth ? HASH_N_OBJECTS_LAST : 0;
> >> +                       struct strbuf buf = STRBUF_INIT;
> >> +                       struct string_list_item *item = list.items + i;
> >> +                       const size_t len = (size_t)item->util;
> >> +
> >> +                       strbuf_attach(&buf, item->string, len, len);
> >> +                       line_from_stdin(&buf, &unquoted, prefix, prefix_length,
> >> +                                       nul_term_line, set_executable_bit,
> >> +                                       oflags | f);
> >> +                       strbuf_release(&buf);
> >> +               }
> >>                 strbuf_release(&unquoted);
> >> -               strbuf_release(&line);
> >> +               string_list_clear(&list, 0);
> >>         }
> >>
> >>         if (split_index > 0) {
> >> --
> >> 2.35.1.1428.g1c1a0152d61
> >>
> >
> > This buffering introduces the same potential risk of the
> > "stdin-feeder" process not being able to see objects right away as my
> > version had. I'm planning to mitigate the issue by unplugging the bulk
> > checkin when issuing a verbose report so that anyone who's using that
> > output to synchronize can still see what they're expecting.
>
> I was rather terse in the commit message, I meant (but forgot some
> words) "doesn't harm thing for performance [in the above test]", but
> converting this to a string_list is clearly & regression that shouldn't
> be kept.
>
> I just wanted to demonstrate method of doing this by passing down the
> HASH_* flags, and found that writing the state-machine to "buffer ahead"
> by one line so that we can eventually know in the loop if we're in the
> "last" line or not was tedious, so I came up with this POC. But we
> clearly shouldn't lose the "streaming" aspect.
>

From my experience working on several state machines in the Windows
OS, they are notoriously difficult to understand and extend.  I
wouldn't want every top-level command that does something interesting
to have to deal with that.

> But anyway, now that I look at this again the smart thing here (surely?)
> is to keep the simple getline() loop and not ever issue a
> HASH_N_OBJECTS_LAST for the Nth item, instead we should in this case do
> the "checkpoint fsync" at the point that we write the actual index.
>
> Because an existing redundancy in your series is that you'll do the
> fsync() the same way for "git unpack-objects" as for "git
> {update-index,add}".
>
> I.e. in the former case adding the N objects is all we're doing, so the
> "last object" is the point at which we need to flush the previous N to
> disk.
>
> But for "update-index/add" you'll do at least 2 fsync()'s in the bulk
> mode, when it should be one. I.e. the equivalent of (leaving aside the
> tmp-objdir migration part of it), if writing objects A && B:
>
>     ## METHOD ONE
>     # A
>     write(objects/A.tmp)
>     bulk_fsync(objects/A.tmp)
>     rename(objects/A.tmp, objects/A)
>     # B
>     write(objects/B.tmp)
>     bulk_fsync(objects/B.tmp)
>     rename(objects/B.tmp, objects/B)
>     # "cookie"
>     write(bulk_fsync_XXXXXX)
>     fsync(bulk_fsync_XXXXXX)
>     # ref
>     write(INDEX.tmp, $(git rev-parse B))
>     fsync(INDEX.tmp)
>     rename(INDEX.tmp, INDEX)
>
> This series on top changes that so we know that we're doing N, so we
> don't need the seperate "cookie", we can just use the B object as the
> cookie, as we know it comes last;
>
>     ## METHOD TWO
>     # A -- SAME as above
>     write(objects/A.tmp)
>     bulk_fsync(objects/A.tmp)
>     rename(objects/A.tmp, objects/A)
>     # B -- SAME as above, with s/bulk_fsync/fsync/
>     write(objects/B.tmp)
>     fsync(objects/B.tmp)
>     rename(objects/B.tmp, objects/B)
>     # "cookie" -- GONE!
>     # ref -- SAME
>     write(INDEX.tmp, $(git rev-parse B))
>     fsync(INDEX.tmp)
>     rename(INDEX.tmp, INDEX)
>
> But really, we should instead realize that we're not doing
> "unpack-objects", but have a "ref update" at the end (whether that's a
> ref, or an index etc.) and do:
>
>     ## METHOD THREE
>     # A -- SAME as above
>     write(objects/A.tmp)
>     bulk_fsync(objects/A.tmp)
>     rename(objects/A.tmp, objects/A)
>     # B -- SAME as the first
>     write(objects/B.tmp)
>     bulk_fsync(objects/B.tmp)
>     rename(objects/B.tmp, objects/B)
>     # ref -- SAME
>     write(INDEX.tmp, $(git rev-parse B))
>     fsync(INDEX.tmp)
>     rename(INDEX.tmp, INDEX)
>
> Which cuts our number of fsync() operations down from 2 to 1, ina
> addition to removing the need for the "cookie", which is only there
> because we didn't keep track of where we were in the sequence as in my
> 2/7 and 5/7.
>

I agree that this is a great direction to go in as an extension to
this work (i.e. a subsequent patch).  I saw in one of your mails on v2
of your rfc series that you mentioned a "lightweight transaction-y
thing".  I've been thinking along the same lines myself, but wanted to
treat that as a separable concern.  In my ideal world, we'd just use a
real database for loose objects, the index, and refs and let that
handle the transaction management.  But in lieu of that, having a
transaction that looks across the ODB, index, and refs would let us
batch syncs optimally.

> And it would be the same for tmp-objdir, the rename dance is a bit
> different, but we'd do the "full" fsync() while on the INDEX.tmp, then
> migrate() the tmp-objdir, and once that's done do the final:
>
>     rename(INDEX.tmp, INDEX)
>
> I.e. we'd fsync() the content once, and only have the renme() or link()
> operations left. For POSIX we'd need a few more fsync() for the
> metadata, but this (i.e. your) series already makes the hard assumption
> that we don't need to do that for rename().
>
> > I think the code you've presented here is a lot of diff to accomplish
> > the same thing that my series does, where this specific update-index
> > caller has been roto-tilled to provide the needed
> > begin/end-transaction points.
>
> Any caller of these APIs will need the "unsigned oflags" sooner than
> later anyway, as they need to pass down e.g. HASH_WRITE_OBJECT. We just
> do it slightly earlier.
>
> And because of that in the general case it's really not the same, I
> think it's a better approach. You've already got the bug in yours of
> needlessly setting up the bulk checkin for !HASH_WRITE_OBJECT in
> update-index, which this neatly solves by deferring the "bulk" mechanism
> until the codepath that's past that and into the "real" object writing.
>
> We can also die() or error out in the object writing before ever getting
> to writing the object, in which case we'd do some setup that we'd need
> to tear down again, by deferring it until the last moment...
>

I'll be submitting a new version to the list which sets up the tmp
objdir lazily on first actual write, so the concern about writing to
the ODB needlessly should go away.

> > And I think there will be a lot of
> > complexity in supporting the same hints for command-line additions
> > (which is roughly equivalent to the git-add workflow).
>
> I left that out due to Junio's comment in
> https://lore.kernel.org/git/xmqqzgljyz34.fsf@gitster.g/; i.e. I don't
> see why we'd find it worthwhile to optimize that case, but we easily
> could (especially per the "just sync the INDEX.tmp" above).
>
> But even if we don't do "THREE" above I think it's still easy, for "TWO"
> we already have as parse_options() state machine to parse argv as it
> comes in. Doing the fsync() on the last object is just a matter of
> "looking ahead" there).
>
> > Every caller
> > that wants batch treatment will have to either implement a state
> > machine or implement a buffering mechanism in order to figure out the
> > begin-end points. Having a separate plug/unplug call eliminates this
> > complexity on each caller.
>
> This is subjective, but I really think that's rather easy to do, and
> much easier to reason about than the global state on the side via
> singletons that your method of avoiding modifying these callers and
> instead having them all consult global state via bulk-checkin.c and
> cache.h demands.

The nice thing about having the ODB handle the batch stuff internally
is that it can present a nice minimal interface to all of the callers.
Yes, it has a complex implementation internally, but that complexity
backs a rather simple API surface:
1. Begin/end transaction (plug/unplug checkin).
2. Find-object by SHA
3. Add object if it doesn't exist
4. Get the SHA without adding anything.

The ODB work is implemented once and callers can easily adopt the
transaction API without having to implement their own stuff on the
side.  Future series can make the transaction span nicely across the
ODB, index, and refs.

> That API also currently assumes single-threaded writers, if we start
> writing some of this in parallel in e.g. "unpack-objects" we'd need
> mutexes in bulk-object.[ch]. Isn't that a lot easier when the caller
> would instead know something about the special nature of the transaction
> they're interacting with, and that the 1st and last item are important
> (for a "BEGIN" and "FLUSH").
>

The API as sketched above doesn't deeply assume single-threadedness
for the "find object by SHA" or "add object if it doesn't exist".
There is a single-threaded assumption for begin/end-transaction.  The
implementation can use pthread_once to handle anything that needs to
be done lazily when adding objects.

> > Btw, I'm planning in a future series to reduce the system calls
> > involved in renaming a file by taking advantage of the renameat2
> > system call and equivalents on other platforms.  There's a pretty
> > strong motivation to do that on Windows.
>
> What do you have in mind for renameat2() specifically?  I.e. which of
> the 3x flags it implements will benefit us? RENAME_NOREPLACE to "move"
> the tmp_OBJ to an eventual OBJ?
>

Yes RENAME_NOREPLACE.  I'd want to introduce a helper called
git_rename_noreplace and use it instead of the link dance.

> Generally: There's some low-hanging fruit there. E.g. for tmp-objdir we
> slavishly go through the motion of writing an tmp_OBJ, writing (and
> possibly syncing it), then renaming that tmp_OBJ to OBJ.
>
> We could clearly just avoid that in some/all cases that use
> tmp-objdir. I.e. we're writing to a temporary store anyway, so why the
> tmp_OBJ files? We could just write to the final destinations instead,
> they're not reachable (by ref or OID lookup) from anyone else yet.
>

We were thinking before that there could be some concurrency in the
tmp_objdir, though I personally don't believe it's possible for the
typical bulk checkin case.  Using the final name in the tmp objdir
would be a nice optimization, but I think that it's a separable
concern that shouldn't block the bigger win from eliminating the cache
flushes.

> But even then I don't see how you'd get away with reducing some classes
> of syscalls past the 2x increase for some (leading an overall increase,
> but not a ~2x overall increase as noted in:
> https://lore.kernel.org/git/RFC-patch-7.7-481f1d771cb-20220323T033928Z-avarab@gmail.com/)
> as long as you use the tmp-objdir API. It's always going to have to
> write tmpdir/OBJ and link()/rename() that to OBJ.
>
> Now, I do think there's an easy way by extending the API use I've
> introduced in this RFC to do it. I.e. we'd just do:
>
>     ## METHOD FOUR
>     # A -- SAME as THREE, except no rename()
>     write(objects/A.tmp)
>     bulk_fsync(objects/A.tmp)
>     # B -- SAME as THREE, except no rename()
>     write(objects/B.tmp)
>     bulk_fsync(objects/B.tmp)
>     # ref -- SAME
>     write(INDEX.tmp, $(git rev-parse B))
>     fsync(INDEX.tmp)
>     # NEW: do all the renames at the end:
>     rename(objects/A.tmp, objects/A)
>     rename(objects/B.tmp, objects/B)
>     rename(INDEX.tmp, INDEX)
>
> That seems like an obvious win to me in any case. I.e. the tmp-objdir
> API isn't really a close fit for what we *really* want to do in this
> case.
>

I think this is the right place to get to eventually.  I believe the
best way to get there is to keep the plug/unplug bulk checkin
functionality (rebranding it as an 'ODB transaction') and then make
that a sub-transaction of a larger 'git repo transaction.'

> I.e. the reason it does everything this way is because it was explicitly
> designed for 722ff7f876c (receive-pack: quarantine objects until
> pre-receive accepts, 2016-10-03), where it's the right trade-off,
> because we'd like to cheaply "rm -rf" the whole thing if e.g. the
> "pre-receive" hook rejects the push.
>
> *AND* because it's made for the case of other things concurrently
> needing access to those objects. So pedantically you would need it for
> some modes of "git update-index", but not e.g. "git unpack-objects"
> where we really are expecting to keep all of them.
>
> > Thanks for the concrete code,
>
> ..but no thanks? I.e. it would be useful to explicitly know if you're
> interested or open to running with some of the approach in this RFC.

I'm still at the point of arguing with you about your RFC, but I'm
_not_ currently leaning toward adopting your approach.  I think from a
separation-of-concerns perspective, we shouldn't change top-level git
commands to try hard to track first/last object.  The ODB should
conceptually handle it internally as part of a higher-level
transaction.  Consider cmd_add, which does its interesting
add_file_to_index from the update_callback coming from the diff code:
I believe it would be hopelessly complex/impossible to do the tracking
required to pass the LAST_OF_N flag to a multiplexed write API.

We have a pretty clear example from the database world that
begin/end-transaction is the right way to design the API for the task
we want to accomplish.  It's also how many filesystems work
internally.  I don't want to reinvent the bicycle here.

Thanks,
Neeraj

next prev parent reply	other threads:[~2022-03-23 20:19 UTC|newest]

Thread overview: 175+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-16  5:33   ` Junio C Hamano
2022-03-16  7:33     ` Neeraj Singh
2022-03-16 16:14       ` Junio C Hamano
2022-03-16 17:59         ` Neeraj Singh
2022-03-16 18:10           ` Junio C Hamano
2022-03-16 19:50             ` Neeraj Singh
2022-03-15 21:30 ` [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-16  7:31   ` Patrick Steinhardt
2022-03-16 18:21     ` Neeraj Singh
2022-03-17  5:48       ` Patrick Steinhardt
2022-03-16 11:50   ` Bagas Sanjaya
2022-03-16 19:59     ` Neeraj Singh
2022-03-15 21:30 ` [PATCH 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 5/7] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 7/7] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
2022-03-20  7:15   ` [PATCH v2 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-20  7:15   ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-21 14:41     ` Ævar Arnfjörð Bjarmason
2022-03-21 18:28       ` Neeraj Singh
2022-03-21 15:47     ` Ævar Arnfjörð Bjarmason
2022-03-21 20:14       ` Neeraj Singh
2022-03-21 20:18         ` Ævar Arnfjörð Bjarmason
2022-03-22  0:13           ` Neeraj Singh
2022-03-22  8:52             ` Ævar Arnfjörð Bjarmason
2022-03-22 20:05               ` Neeraj Singh
2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 1/7] write-or-die.c: remove unused fsync_component() function Ævar Arnfjörð Bjarmason
2022-03-23  5:27                     ` Neeraj Singh
2022-03-23  3:47                   ` [RFC PATCH 2/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 3/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 4/7] update-index: use a utility function for stdin consumption Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 5/7] update-index: pass down an "oflags" argument Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 6/7] update-index: rename "buf" to "line" Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
2022-03-23  5:51                     ` Neeraj Singh
2022-03-23  9:48                       ` Ævar Arnfjörð Bjarmason
2022-03-23 20:19                         ` Neeraj Singh [this message]
2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 1/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
2022-03-23 20:23                       ` Neeraj Singh
2022-03-23 14:18                     ` [RFC PATCH v2 2/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
2022-03-23 20:25                       ` Neeraj Singh
2022-03-23 14:18                     ` [RFC PATCH v2 3/7] update-index: pass down skeleton "oflags" argument Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 4/7] update-index: have the index fsync() flush the loose objects Ævar Arnfjörð Bjarmason
2022-03-23 20:30                       ` Neeraj Singh
2022-03-23 14:18                     ` [RFC PATCH v2 5/7] add: use WLI_NEED_LOOSE_FSYNC for new "only the index" bulk fsync() Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 6/7] fsync docs: update for new syncing semantics Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 7/7] fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old Ævar Arnfjörð Bjarmason
2022-03-23 21:08                       ` Neeraj Singh
2022-03-21 17:30     ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Junio C Hamano
2022-03-21 20:23       ` Neeraj Singh
2022-03-23 13:26     ` Ævar Arnfjörð Bjarmason
2022-03-24  2:04       ` Neeraj Singh
2022-03-20  7:15   ` [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-21 15:01     ` Ævar Arnfjörð Bjarmason
2022-03-21 22:09       ` Neeraj Singh
2022-03-21 23:16         ` Ævar Arnfjörð Bjarmason
2022-03-21 17:50     ` Junio C Hamano
2022-03-21 22:18       ` Neeraj Singh
2022-03-20  7:15   ` [PATCH v2 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-21 17:55     ` Junio C Hamano
2022-03-21 23:02       ` Neeraj Singh
2022-03-22 20:54         ` Neeraj Singh
2022-03-20  7:15   ` [PATCH v2 5/7] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-20  7:15   ` [PATCH v2 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-21 18:34     ` Junio C Hamano
2022-03-22  5:54       ` Neeraj Singh
2022-03-20  7:16   ` [PATCH v2 7/7] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-21 17:03   ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
2022-03-21 18:14     ` Neeraj Singh
2022-03-21 20:49       ` Junio C Hamano
2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 01/11] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
2022-03-24 16:10       ` Ævar Arnfjörð Bjarmason
2022-03-24 17:52         ` Neeraj Singh
2022-03-24  4:58     ` [PATCH v3 02/11] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 03/11] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 04/11] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-24 18:18       ` Junio C Hamano
2022-03-24 20:25         ` Neeraj Singh
2022-03-24 21:34           ` Junio C Hamano
2022-03-24 22:21             ` Neeraj Singh
2022-03-24  4:58     ` [PATCH v3 06/11] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 07/11] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 08/11] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 09/11] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-24 16:29       ` Ævar Arnfjörð Bjarmason
2022-03-24 18:23         ` Neeraj Singh
2022-03-26 15:35           ` Ævar Arnfjörð Bjarmason
2022-03-24  4:58     ` [PATCH v3 10/11] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 11/11] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
2022-03-24 17:44     ` [PATCH v3 00/11] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
2022-03-24 19:21       ` Neeraj Singh
2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 01/13] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 02/13] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 03/13] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 04/13] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 05/13] cache-tree: use ODB transaction around writing a tree Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 06/13] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 07/13] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 08/13] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 09/13] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 10/13] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 11/13] t/perf: add iteration setup mechanism to perf-lib Neeraj Singh via GitGitGadget
2022-03-29 17:14         ` Neeraj Singh
2022-03-29 18:50           ` Junio C Hamano
2022-03-29  0:42       ` [PATCH v4 12/13] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-29 17:38         ` Neeraj Singh
2022-03-29  0:42       ` [PATCH v4 13/13] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
2022-03-29 10:47       ` [PATCH v4 00/13] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Ævar Arnfjörð Bjarmason
2022-03-29 17:09         ` Neeraj Singh
2022-03-29 11:45       ` Ævar Arnfjörð Bjarmason
2022-03-29 16:51         ` Neeraj Singh
2022-03-30  5:05       ` [PATCH v5 00/14] " Neeraj K. Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 01/14] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-30 17:11           ` Junio C Hamano
2022-03-30 18:34             ` Neeraj Singh
2022-03-30 20:24               ` Junio C Hamano
2022-03-31  4:17                 ` Neeraj Singh
2022-03-31 17:50                   ` Junio C Hamano
2022-03-31 19:08                     ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 02/14] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
2022-03-30 17:17           ` Junio C Hamano
2022-03-31  5:51             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 03/14] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
2022-03-30 17:18           ` Junio C Hamano
2022-03-30 17:54             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 04/14] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-30 17:37           ` Junio C Hamano
2022-03-31  6:28             ` Neeraj Singh
2022-03-31 18:05               ` Junio C Hamano
2022-03-31 19:18                 ` Neeraj Singh
2022-04-01 15:56                   ` Junio C Hamano
2022-03-30  5:05         ` [PATCH v5 05/14] cache-tree: use ODB transaction around writing a tree Neeraj Singh via GitGitGadget
2022-03-30 17:46           ` Junio C Hamano
2022-03-30 19:04             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 06/14] builtin/add: add ODB transaction around add_files_to_cache Neeraj Singh via GitGitGadget
2022-03-30 17:47           ` Junio C Hamano
2022-03-30  5:05         ` [PATCH v5 07/14] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-30 17:52           ` Junio C Hamano
2022-03-30 19:09             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 08/14] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 09/14] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 10/14] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 11/14] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-30 18:13           ` Junio C Hamano
2022-03-31  3:55             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 12/14] t/perf: add iteration setup mechanism to perf-lib Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 13/14] core.fsyncmethod: performance tests for batch mode Neeraj Singh via GitGitGadget
2022-03-31  4:09           ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 14/14] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
2022-04-05  5:20         ` [PATCH v6 00/12] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects nksingh85
2022-04-06 20:32           ` Junio C Hamano
2022-05-19 21:47             ` Junio C Hamano
2022-05-19 21:54               ` Neeraj Singh
2022-05-24 12:31                 ` Johannes Schindelin
2022-04-05  5:20         ` [PATCH v6 01/12] bulk-checkin: rename 'state' variable and separate 'plugged' boolean nksingh85
2022-04-05  5:20         ` [PATCH v6 02/12] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' nksingh85
2022-04-05  5:20         ` [PATCH v6 03/12] core.fsyncmethod: batched disk flushes for loose-objects nksingh85
2022-04-05  5:20         ` [PATCH v6 04/12] cache-tree: use ODB transaction around writing a tree nksingh85
2022-04-05  5:20         ` [PATCH v6 05/12] builtin/add: add ODB transaction around add_files_to_cache nksingh85
2022-04-05  5:20         ` [PATCH v6 06/12] update-index: use the bulk-checkin infrastructure nksingh85
2022-04-05  5:20         ` [PATCH v6 07/12] unpack-objects: " nksingh85
2022-04-05  5:20         ` [PATCH v6 08/12] core.fsync: use batch mode and sync loose objects by default on Windows nksingh85
2022-04-05  5:20         ` [PATCH v6 09/12] test-lib-functions: add parsing helpers for ls-files and ls-tree nksingh85
2022-04-05  5:20         ` [PATCH v6 10/12] core.fsyncmethod: tests for batch mode nksingh85
2022-04-05  5:20         ` [PATCH v6 11/12] t/perf: add iteration setup mechanism to perf-lib nksingh85
2022-04-05  5:20         ` [PATCH v6 12/12] core.fsyncmethod: performance tests for batch mode nksingh85

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANQDOddpo+a8r_0yghgy_1bHvfUe5XQaaaWc7D-OLqX6Anhgiw@mail.gmail.com \
    --to=nksingh85@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=avarab@gmail.com \
    --cc=bagasdotme@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=neerajsi@microsoft.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).