git@vger.kernel.org mailing list mirror (one of many)
From: Kevin Willford <Kevin.Willford@microsoft.com>
To: Utsav Shah <ukshah2@illinois.edu>
Cc: Junio C Hamano <gitster@pobox.com>,
	Utsav Shah via GitGitGadget <gitgitgadget@gmail.com>,
	William Baker <William.Baker@microsoft.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: RE: [PATCH 0/1] fsmonitor: skip sanity check if the index is split
Date: Mon, 11 Nov 2019 18:21:19 +0000
Message-ID: <BN6PR21MB078648200A11238D0FD992FC91740@BN6PR21MB0786.namprd21.prod.outlook.com>
In-Reply-To: <CABhpXbdyfxF3xVrjHUDzN7inu6O+CVVxG1wc9shhy33XiDoNPA@mail.gmail.com>

> From: Utsav Shah <ukshah2@illinois.edu>
> Sent: Monday, November 11, 2019 10:26 AM
> 
> On Mon, Nov 11, 2019 at 8:55 AM Kevin Willford
> <Kevin.Willford@microsoft.com> wrote:
> >
> > > From: git-owner@vger.kernel.org <git-owner@vger.kernel.org> On
> > > Behalf Of Junio C Hamano
> > > Sent: Sunday, November 10, 2019 7:01 PM
> > >
> > > Junio C Hamano <gitster@pobox.com> writes:
> > >
> > > > "Utsav Shah via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > > >
> > > > >> At the very least, this patch mitigates an over-eager check for
> > > > >> split index users while maintaining good invariants for the
> > > > >> standard case.
> > > >
> > > > OK, it sounds more like this "it does not make any sense to
> > > > compare the position in the fsmonitor bitmap (which covers the
> > > > entire thing) with the position in just a split part of the index
> > > > (which covers only the delta over the base index)"?  If that is
> > > > the case, it means that the "check" is even worse than being
> > > > "over-eager"---it simply is not correct.
> > >
> > > > Having said all that, I wonder if we are doing the right thing with
> > > > or without 3444ec2e ("fsmonitor: don't fill bitmap with entries to
> > > > be removed", 2019-10-11) in the split-index mode in the first place.
> > >
> > > > The fact that your "loosen the check and allow 'pos' that identifies
> > > > a tracked path used by the fsmonitor bitmap to be larger than the
> > > > size of the istate->cache[]" patch under discussion is needed means
> > > > that 'pos' may sometimes be larger than istate->cache[], no?  Then
> > > > what happens in this hunk, for example?
> > >
> > > > diff --git a/fsmonitor.c b/fsmonitor.c
> > > > index 231e83a94d..1f4aa1b150 100644
> > > > --- a/fsmonitor.c
> > > > +++ b/fsmonitor.c
> > > > @@ -14,8 +14,13 @@ struct trace_key trace_fsmonitor = TRACE_KEY_INIT(FSMONITOR);
> > > >  static void fsmonitor_ewah_callback(size_t pos, void *is)
> > > >  {
> > >       struct index_state *istate = (struct index_state *)is;
> > > -     struct cache_entry *ce = istate->cache[pos];
> > > +     struct cache_entry *ce;
> > >
> > > +     if (pos >= istate->cache_nr)
> > > > +             BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" >= %u)",
> > > +                 (uintmax_t)pos, istate->cache_nr);
> > > +
> > > +     ce = istate->cache[pos];
> > >       ce->ce_flags &= ~CE_FSMONITOR_VALID;
> > >
> > > > The istate->cache[] is a dynamic array whose size is managed via the
> > > > usual ALLOC_GROW() using istate->cache_nr and istate->cache_alloc,
> > > > whether or not the split-index feature is in use.  When your patch
> > > > makes a difference, then, doesn't the access to istate->cache[] pick
> > > > up random garbage and then flip the bit?
> > >
> > > Puzzled...  In any case, "check is worse than over-eager, it simply
> > > is wrong" I wrote in the message I am responding to is totally
> > > > incorrect, it seems.  It smells like lifting the check would just
> > > > sweep the underlying problem under the rug?
> >
> > > I agree.  The only 2 places where excluding the split-index makes
> > > sense are read_fsmonitor_extension and write_fsmonitor_extension,
> > > because the index_state that is passed into those functions could be
> > > the delta index.  In that case the fsmonitor bitmap would almost
> > > always have more entries than the index_state, and the BUG would be
> > > hit when it should not be.
> >
> > > The reason it is not needed, and should not be, in the other 2 places
> > > is that they are run from tweak_fsmonitor, which is run at
> > > post_read_index_from.  That is after the base and delta indexes have
> > > been loaded into the index_state, so the index_state will have all
> > > the entries, and if the fsmonitor bitmap is bigger than the number of
> > > entries then the BUG should be hit.
> 
> Thanks. What exactly is the delta index? Is it the "split" index, vs the shared
> indices?

Yes, the delta index is the same as the split index described here:
https://git-scm.com/docs/git-update-index#_split_index

> I was surprised to see cache_nr being zero. My understanding was
> that cache and cache_nr would always be the materialized version of the
> entire index, which is clearly incorrect.

Most of the time that is correct, but if you look in read_index_from, the
index is first loaded with the call to

ret = do_read_index(istate, path, 0);

This reads the index extensions, so read_fsmonitor_extension is called
while the cache holds only the entries from the split/delta index.

The base/shared index is not loaded into the cache until later, when
merge_base_index(istate); is called.  That happens right before the call
to post_read_index_from, which is where tweak_fsmonitor gets called.
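
To make the ordering concrete, here is a rough toy model of the sequence
described above.  It is only a sketch, not the actual git code: every
toy_* name, the struct and its fields are made up for illustration, and
the comparison is a simplified stand-in for the real sanity check.  It
assumes the merged index never has fewer entries than the delta index.

```c
#include <assert.h>
#include <stddef.h>

/* Outcome of the "bitmap vs. number of entries" sanity check. */
enum toy_check { TOY_CHECK_OK, TOY_CHECK_SKIPPED, TOY_CHECK_BUG };

/* Hypothetical stand-in for struct index_state; only what the sketch needs. */
struct toy_index {
	size_t cache_nr;    /* entries currently materialized in cache[] */
	size_t delta_nr;    /* entries stored in the split/delta index */
	size_t merged_nr;   /* entries once base and delta are merged */
	size_t bitmap_bits; /* size of the fsmonitor bitmap */
	int split_index;    /* non-zero when the split index is in use */
};

/* The check itself: the bitmap must not cover more than cache[] holds. */
static enum toy_check toy_bitmap_check(const struct toy_index *idx)
{
	return idx->bitmap_bits > idx->cache_nr ? TOY_CHECK_BUG : TOY_CHECK_OK;
}

/* Runs while only the delta entries are loaded, so skip for split index. */
static enum toy_check toy_read_fsmonitor_extension(const struct toy_index *idx)
{
	if (idx->split_index)
		return TOY_CHECK_SKIPPED;
	return toy_bitmap_check(idx);
}

/* Runs after the merge, so the check always applies. */
static enum toy_check toy_tweak_fsmonitor(const struct toy_index *idx)
{
	return toy_bitmap_check(idx);
}

/* Models the order of steps in read_index_from as described above. */
static void toy_read_index_from(struct toy_index *idx,
				enum toy_check *at_read,
				enum toy_check *at_tweak)
{
	/* do_read_index() step: entries and extensions of this file only */
	idx->cache_nr = idx->split_index ? idx->delta_nr : idx->merged_nr;
	*at_read = toy_read_fsmonitor_extension(idx);

	/* merge_base_index() step: cache[] now holds the whole index */
	if (idx->split_index) {
		assert(idx->merged_nr >= idx->delta_nr); /* toy assumption */
		idx->cache_nr = idx->merged_nr;
	}

	/* post_read_index_from() step, where tweak_fsmonitor runs */
	*at_tweak = toy_tweak_fsmonitor(idx);
}
```

With split_index set, delta_nr = 0, merged_nr = 5 and bitmap_bits = 5,
the check at extension-read time is skipped (cache_nr is still 0 there)
and the check at tweak time passes.  With bitmap_bits = 6 the tweak-time
check reports the BUG case, which is what we want.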


Thread overview: 14+ messages
2019-11-08  7:09 [PATCH 0/1] fsmonitor: skip sanity check if the index is split Utsav Shah via GitGitGadget
2019-11-08  7:09 ` [PATCH 1/1] " Utsav Shah via GitGitGadget
2019-11-12 11:18   ` SZEDER Gábor
2019-11-12 21:08     ` Utsav Shah
2019-11-11  1:43 ` [PATCH 0/1] " Junio C Hamano
2019-11-11  2:01   ` Junio C Hamano
2019-11-11 16:55     ` Kevin Willford
2019-11-11 17:25       ` Utsav Shah
2019-11-11 18:21         ` Kevin Willford [this message]
2019-11-11 17:30       ` William Baker
2019-11-13  1:30       ` Junio C Hamano
2019-11-14  2:55         ` Utsav Shah
2019-11-14 16:41         ` William Baker
2019-11-15  5:04           ` Junio C Hamano
