git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: Neeraj Singh <nksingh85@gmail.com>
Cc: "Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com>,
	"Git List" <git@vger.kernel.org>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Patrick Steinhardt" <ps@pks.im>,
	"Bagas Sanjaya" <bagasdotme@gmail.com>,
	"Neeraj K. Singh" <neerajsi@microsoft.com>
Subject: Re: [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure
Date: Thu, 24 Mar 2022 14:34:06 -0700
Message-ID: <xmqq8rszf31t.fsf@gitster.g>
In-Reply-To: <CANQDOdcuBRvWx7iMYBvLYEEb6A_=SURLAGumk026ZyDODpfAsQ@mail.gmail.com> (Neeraj Singh's message of "Thu, 24 Mar 2022 13:25:38 -0700")

Neeraj Singh <nksingh85@gmail.com> writes:

>> IOW, I am not sure end_if_active() should exist in the first place.
>> Shouldn't end_transaction() do that instead?
>>
>
> Today there's an "assert(bulk_checkin_plugged)" in
> end_odb_transaction. In principle we could just drop the assert and
> allow a transaction to be ended multiple times.  But maybe in the long
> run for composability we'd like to have nested callers to begin/end
> transaction (e.g. we could have a nested transaction around writing
> the cache tree to the ODB to minimize fsyncs there).

I am not convinced that "transaction" is a good mental model for
this mechanism to begin with, in the sense that it is not a bug or
failure of the implementation if two or more operations in the same
<begin,end> bracket did not all happen (or all fail to happen)
atomically, or if 'begin' and 'end' were not properly nested.  With
the design getting more complex with things like a tentative object
store that needs to be explicitly migrated after the outermost level
of end-transaction, we may end up _requiring_ that a sufficient
number of 'end' calls come once we have issued 'begin', which I am
not sure is necessarily a good thing.

In any case, if we aspire/envision to have a nested plug/unplug, I
think it is a good thing.  A helper for one subsystem may have its
large batch of operations inside a plug/unplug pair, another helper
may do the same, and the caller of these two helpers may want to say

	plug
		call helper A
			A does plug
			A does many things
			A does unplug
		call helper B
			B does plug
			B does many things
			B does unplug
	unplug

to "cancel" the unplug that helpers A and B each have.

> In that world,
> having a subsystem not maintain a balanced pairing could be a problem.

And in such a world, you never want to use end-if-active to
implement what you are doing here, as you may end up not being
properly nested:

	begin
		begin
			do many things
			if some condition
				end_if_active
			do more things
		end
	end
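
To make the imbalance concrete, here is a minimal sketch in C.  It
assumes end_if_active simply ends one level when a bracket is open;
the names bracket_begin, bracket_end, bracket_end_if_active and
replay_sequence are hypothetical and are not taken from the patch.

```c
#include <assert.h>

static int depth; /* current nesting level of the hypothetical brackets */

static void bracket_begin(void)
{
	depth++;
}

/* Returns 0 on success, -1 if there was no matching begin. */
static int bracket_end(void)
{
	if (depth == 0)
		return -1;
	depth--;
	return 0;
}

/* Assumed semantics: end one level, but only when a bracket is open. */
static void bracket_end_if_active(void)
{
	if (depth > 0)
		(void)bracket_end();
}

/* Replays the sequence above; returns the result of the final end. */
static int replay_sequence(void)
{
	bracket_begin();          /* outer begin: depth 1 */
	bracket_begin();          /* inner begin: depth 2 */
	bracket_end_if_active();  /* "some condition" holds: depth 1 */
	bracket_end();            /* inner end: depth 0, outer closed early */
	return bracket_end();     /* outer end: nothing left to close */
}
```

Under that assumption the outer bracket is already closed when the
inner 'end' runs, so the final 'end' has no matching 'begin'.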

> An alternative API here could be to have an "flush_odb_transaction"
> call to make the objects visible at this point.

Yes, what you want is a forced-flush instead, I think.

So I suspect you'd want these three primitives, perhaps?

 * begin increments the nesting level
   - if outermost, you may have to do real "setup" things
   - otherwise, you may not have anything other than just counting
     the nesting level

 * flush implements unplug, fsync, etc. and does so immediately,
   even when plugged.

 * end decrements the nesting level
   - if outermost, you'd do "flush".
   - otherwise, you may only count the nesting level and do nothing else,
     but doing "flush" when you realize that you've queued too many
     is not a bug or a crime.
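
The three primitives above could be sketched in C roughly as follows.
This is only an illustration under stated assumptions: the names
txn_begin, txn_flush, txn_end and queue_write, and the counters, are
hypothetical, and the "flush" body is a stand-in for the real unplug
and fsync work.

```c
#include <assert.h>

static int depth;    /* nesting level of begin/end brackets */
static int pending;  /* writes queued but not yet flushed */
static int flushes;  /* number of flushes that did real work */

/* Stand-in for queuing one object write. */
static void queue_write(void)
{
	pending++;
}

/* flush: does its work immediately, even while plugged. */
static void txn_flush(void)
{
	if (!pending)
		return;
	pending = 0; /* placeholder for unplug, fsync, etc. */
	flushes++;
}

/* begin: only the outermost level would do real setup. */
static void txn_begin(void)
{
	if (depth++ == 0) {
		/* outermost: real "setup" would go here */
	}
}

/* end: only the outermost level flushes. */
static void txn_end(void)
{
	assert(depth > 0);
	if (--depth == 0)
		txn_flush();
}
```

With this shape, a caller that needs its objects visible in the middle
of a bracket calls txn_flush() and leaves the nesting count alone, so
every 'begin' still pairs with exactly one 'end'.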

