From: David Turner <dturner@twopensource.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH v2 03/17] index-helper: new daemon for caching index and related stuff
Date: Tue, 29 Mar 2016 17:51:59 -0400
Message-ID: <1459288319.2976.16.camel@twopensource.com>
In-Reply-To: <CACsJy8Bk19NNacDwez6BzicThnVUDQEoGe3m+ThiorP8uP9+eA@mail.gmail.com>

On Tue, 2016-03-29 at 09:31 +0700, Duy Nguyen wrote:
> On Sat, Mar 19, 2016 at 8:04 AM, David Turner <dturner@twopensource.com> wrote:
> > From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> > 
> > Instead of reading the index from disk and worrying about disk
> > corruption, the index is cached in memory (memory bit-flips happen
> > too, but hopefully less often). The result is faster read. Read
> > time is reduced by 70%.
> > 
> > The biggest gain is not having to verify the trailing SHA-1, which
> > takes lots of time especially on large index files. But this also
> > opens doors for further optimizations:
> > 
> >  - we could create an in-memory format that's essentially the
> >    memory dump of the index to eliminate most of parsing/allocation
> >    overhead. The mmap'd memory can be used straight away.
> >    Experiment [1] shows we could reduce read time by 88%.
> 
> This reference [1] is missing (even in my old version). I believe
> it's
> http://thread.gmane.org/gmane.comp.version-control.git/247268/focus=248771,
> comparing 256.442ms in that mail with v4 number, 2245.113ms in 0/8
> mail from the same thread.
> 
> > Git can poke the daemon via named pipes to tell it to refresh the
> > index cache, or to keep it alive some more minutes. It can't give
> > any real index data directly to the daemon. Real data goes to disk
> > first, then the daemon reads and verifies it from there. Poking only
> > happens for $GIT_DIR/index, not temporary index files.
> 
> I think we should go with unix socket on *nix platform instead of
> named pipe. UNIX named pipe only allows one communication channel at
> a time. Windows named pipe is different and allows multiple clients,
> which is the same as unix socket.
> 
> 
> > $GIT_DIR/index-helper.pipe is the named pipe for daemon process.
> > The daemon reads from the pipe and executes commands.  Commands that
> > need replies from the daemon will have to open their own pipe, since
> > a named pipe should only have one reader.  Unix domain sockets don't
> > have this problem, but are less portable.
> 
> Hm..NO_UNIX_SOCKETS is only set for Windows in config.mak.uname and
> Windows will need to be specially tailored anyway, I think unix
> socket would be more elegant.

One annoyance with unix sockets is that they must have short paths
(UNIX_PATH_MAX -- about a hundred characters).  This basically means
they should be in $TMPDIR.  I guess we could go back to pid files in
$GIT_DIR, and then have a socket named after the pid.  There are also
some security issues, but it actually looks like there's a simple
enough workaround for them.
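
To make the path-length limit concrete, here is a minimal sketch of the
check a caller would need before bind() or connect().  It is not from the
patch: fill_unix_addr() is a hypothetical name, and the only assumption is
the standard struct sockaddr_un, whose sun_path is 108 bytes on Linux and
104 on the BSDs.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper (not part of the patch): fill in a sockaddr_un
 * for `path`. Returns 0 on success, or -1 if the path, including its
 * terminating NUL, does not fit in sun_path.
 */
static int fill_unix_addr(struct sockaddr_un *sa, const char *path)
{
	size_t len = strlen(path);

	if (len >= sizeof(sa->sun_path))
		return -1;

	memset(sa, 0, sizeof(*sa));
	sa->sun_family = AF_UNIX;
	memcpy(sa->sun_path, path, len + 1);
	return 0;
}
```

A deeply nested $GIT_DIR can easily fail this check, which is why a short
path under $TMPDIR looks like the safer place for the socket.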

I'll change this, but it might take a bit as I'm busy with other things
this week.

> > +static void share_index(struct index_state *istate, struct shm *is)
> > +{
> > +       void *new_mmap;
> > +       if (istate->mmap_size <= 20 ||
> > +           hashcmp(istate->sha1,
> > +                   (unsigned char *)istate->mmap + istate->mmap_size - 20) ||
> > +           !hashcmp(istate->sha1, is->sha1) ||
> > +           git_shm_map(O_CREAT | O_EXCL | O_RDWR, 0700, istate->mmap_size,
> > +                       &new_mmap, PROT_READ | PROT_WRITE, MAP_SHARED,
> > +                       "git-index-%s", sha1_to_hex(istate->sha1)) < 0)
> > +               return;
> > +
> > +       release_index_shm(is);
> > +       is->size = istate->mmap_size;
> > +       is->shm = new_mmap;
> > +       hashcpy(is->sha1, istate->sha1);
> > +       memcpy(new_mmap, istate->mmap, istate->mmap_size - 20);
> > +
> > +       /*
> > +        * The trailing hash must be written last after everything is
> > +        * written. It's the indication that the shared memory is now
> > +        * ready.
> > +        */
> > +       hashcpy((unsigned char *)new_mmap + istate->mmap_size - 20, is->sha1);
> 
> You commented here [1] a long time ago about memory barrier. I'm not
> entirely sure if compilers dare to reorder function calls, but when
> hashcpy is inlined and memcpy is builtin, I suppose that's possible...
> 
> [1] http://article.gmane.org/gmane.comp.version-control.git/280729

Oh, right.  Will fix.
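
For reference, one possible shape of the fix, as a sketch only and not
the code that will go into the next version.  It assumes C11
<stdatomic.h> is available, which git does not currently require; the
names publish_index(), index_is_ready() and HASH_LEN are hypothetical.
Strictly speaking the C11 model wants the "ready" marker itself to be an
atomic object, so treat the fences here as the practical compiler and CPU
barrier rather than a formally complete proof of ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20

/*
 * Hypothetical writer: copy `size` bytes of index data from `src` into
 * the shared mapping `shm`, storing the trailing hash after the body.
 */
static void publish_index(unsigned char *shm, const unsigned char *src,
			  size_t size)
{
	assert(size > HASH_LEN);

	/* Body first; the trailing HASH_LEN bytes are not written yet. */
	memcpy(shm, src, size - HASH_LEN);

	/*
	 * Release fence: the body stores above may not be moved below
	 * the hash store that follows.
	 */
	atomic_thread_fence(memory_order_release);

	/* Trailing hash last: it marks the shared memory as ready. */
	memcpy(shm + size - HASH_LEN, src + size - HASH_LEN, HASH_LEN);
}

/*
 * Hypothetical reader: returns 1 if the trailing hash matches
 * `expected`, in which case the body may be read afterwards.
 */
static int index_is_ready(const unsigned char *shm, size_t size,
			  const unsigned char *expected)
{
	if (memcmp(shm + size - HASH_LEN, expected, HASH_LEN))
		return 0;

	/* Acquire fence: body reads must not be hoisted above the check. */
	atomic_thread_fence(memory_order_acquire);
	return 1;
}
```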

Thread overview: 25+ messages
2016-03-19  1:04 [PATCH v2 00/15] index-helper, watchman David Turner
2016-03-19  1:04 ` [PATCH v2 01/17] read-cache.c: fix constness of verify_hdr() David Turner
2016-03-19  1:04 ` [PATCH v2 02/17] read-cache: allow to keep mmap'd memory after reading David Turner
2016-03-19  1:04 ` [PATCH v2 03/17] index-helper: new daemon for caching index and related stuff David Turner
2016-03-29  2:31   ` Duy Nguyen
2016-03-29 21:51     ` David Turner [this message]
2016-03-19  1:04 ` [PATCH v2 04/17] index-helper: add --strict David Turner
2016-03-19  1:04 ` [PATCH v2 05/17] daemonize(): set a flag before exiting the main process David Turner
2016-03-19  1:04 ` [PATCH v2 06/17] index-helper: add --detach David Turner
2016-03-29  2:35   ` Duy Nguyen
2016-03-19  1:04 ` [PATCH v2 07/17] read-cache: add watchman 'WAMA' extension David Turner
2016-03-19  1:04 ` [PATCH v2 08/17] read-cache: invalidate untracked cache data when reading WAMA David Turner
2016-03-29  2:50   ` Duy Nguyen
2016-03-29 21:22     ` David Turner
2016-03-19  1:04 ` [PATCH v2 09/17] Add watchman support to reduce index refresh cost David Turner
2016-03-24 13:47   ` Jeff Hostetler
2016-03-24 17:58     ` David Turner
2016-03-19  1:04 ` [PATCH v2 10/17] index-helper: use watchman to avoid refreshing index with lstat() David Turner
2016-03-19  1:04 ` [PATCH v2 11/17] update-index: enable/disable watchman support David Turner
2016-03-19  1:04 ` [PATCH v2 12/17] unpack-trees: preserve index extensions David Turner
2016-03-19  1:04 ` [PATCH v2 13/17] index-helper: kill mode David Turner
2016-03-19  1:04 ` [PATCH v2 14/17] index-helper: don't run if already running David Turner
2016-03-19  1:04 ` [PATCH v2 15/17] index-helper: autorun mode David Turner
2016-03-19  1:04 ` [PATCH v2 16/17] index-helper: optionally automatically run David Turner
2016-03-19  1:04 ` [PATCH v2 17/17] read-cache: config for waiting for index-helper David Turner
