git@vger.kernel.org mailing list mirror (one of many)
From: Duy Nguyen <pclouds@gmail.com>
To: David Turner <dturner@twopensource.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH v2 03/17] index-helper: new daemon for caching index and related stuff
Date: Tue, 29 Mar 2016 09:31:39 +0700
Message-ID: <CACsJy8Bk19NNacDwez6BzicThnVUDQEoGe3m+ThiorP8uP9+eA@mail.gmail.com>
In-Reply-To: <1458349490-1704-4-git-send-email-dturner@twopensource.com>

On Sat, Mar 19, 2016 at 8:04 AM, David Turner <dturner@twopensource.com> wrote:
> From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>
> Instead of reading the index from disk and worrying about disk
> corruption, the index is cached in memory (memory bit-flips happen
> too, but hopefully less often). The result is faster read. Read time
> is reduced by 70%.
>
> The biggest gain is not having to verify the trailing SHA-1, which
> takes lots of time especially on large index files. But this also
> opens doors for further optimizations:
>
>  - we could create an in-memory format that's essentially the memory
>    dump of the index to eliminate most of parsing/allocation
>    overhead. The mmap'd memory can be used straight away. Experiment
>    [1] shows we could reduce read time by 88%.

The reference [1] is missing here (and in my old version too). I
believe it is
http://thread.gmane.org/gmane.comp.version-control.git/247268/focus=248771.
The 88% comes from comparing the 256.442ms in that mail with the v4
number, 2245.113ms, in the 0/8 mail from the same thread
(1 - 256.442/2245.113 is about 0.886).

> Git can poke the daemon via named pipes to tell it to refresh the
> index cache, or to keep it alive some more minutes. It can't give any
> real index data directly to the daemon. Real data goes to disk first,
> then the daemon reads and verifies it from there. Poking only happens
> for $GIT_DIR/index, not temporary index files.

I think we should go with a unix socket on *nix platforms instead of a
named pipe. A UNIX named pipe only allows one communication channel at
a time. A Windows named pipe is different: it allows multiple clients,
the same as a unix socket.
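
To illustrate the point, here is a rough sketch (not from the patch) of
a unix socket setup where every client gets its own bidirectional
connection, so a reply can go back on the same descriptor the command
came in on. The helper names (fill_addr, helper_listen, helper_connect)
and the backlog of 5 are made up for this example.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: fill in a sockaddr_un, -1 if the path is too long. */
static int fill_addr(struct sockaddr_un *sa, const char *path)
{
	if (strlen(path) >= sizeof(sa->sun_path))
		return -1;
	memset(sa, 0, sizeof(*sa));
	sa->sun_family = AF_UNIX;
	strcpy(sa->sun_path, path);
	return 0;
}

/* Daemon side (hypothetical): create a listening socket at "path". */
static int helper_listen(const char *path)
{
	struct sockaddr_un sa;
	int fd;

	if (fill_addr(&sa, path) < 0)
		return -1;
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	unlink(path); /* drop a stale socket file, if any */
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
	    listen(fd, 5) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Client side (hypothetical): each caller gets a private connection. */
static int helper_connect(const char *path)
{
	struct sockaddr_un sa;
	int fd;

	if (fill_addr(&sa, path) < 0)
		return -1;
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The daemon would accept() on the listening descriptor and get one new
descriptor per client, so commands that need replies would not have to
open a second channel.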


> $GIT_DIR/index-helper.pipe is the named pipe for daemon process. The
> daemon reads from the pipe and executes commands.  Commands that need
> replies from the daemon will have to open their own pipe, since a
> named pipe should only have one reader.  Unix domain sockets don't
> have this problem, but are less portable.

Hm.. NO_UNIX_SOCKETS is only set for Windows in config.mak.uname, and
Windows will need specially tailored code anyway. I think a unix socket
would be more elegant.

> +static void share_index(struct index_state *istate, struct shm *is)
> +{
> +       void *new_mmap;
> +       if (istate->mmap_size <= 20 ||
> +           hashcmp(istate->sha1,
> +                   (unsigned char *)istate->mmap + istate->mmap_size - 20) ||
> +           !hashcmp(istate->sha1, is->sha1) ||
> +           git_shm_map(O_CREAT | O_EXCL | O_RDWR, 0700, istate->mmap_size,
> +                       &new_mmap, PROT_READ | PROT_WRITE, MAP_SHARED,
> +                       "git-index-%s", sha1_to_hex(istate->sha1)) < 0)
> +               return;
> +
> +       release_index_shm(is);
> +       is->size = istate->mmap_size;
> +       is->shm = new_mmap;
> +       hashcpy(is->sha1, istate->sha1);
> +       memcpy(new_mmap, istate->mmap, istate->mmap_size - 20);
> +
> +       /*
> +        * The trailing hash must be written last after everything is
> +        * written. It's the indication that the shared memory is now
> +        * ready.
> +        */
> +       hashcpy((unsigned char *)new_mmap + istate->mmap_size - 20, is->sha1);

You commented here [1] a long time ago about a memory barrier. I'm not
entirely sure compilers dare to reorder function calls, but when
hashcpy is inlined and memcpy is a builtin, I suppose that's possible...

[1] http://article.gmane.org/gmane.comp.version-control.git/280729
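
For illustration only, a barrier here might look something like the
sketch below. It assumes C11 <stdatomic.h> is available, which git may
not be able to rely on. publish_index, index_is_ready and HASH_LEN are
hypothetical names, not from the patch. Strictly speaking, the C11
model defines fence ordering in terms of atomic accesses, and says
nothing about separate processes sharing memory, so treat this as a
practical ordering hint rather than a formal guarantee.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20 /* SHA-1 size, matching the 20 in the patch */

/*
 * Hypothetical writer: copy the index body, then publish it by
 * writing the trailing hash last.
 */
static void publish_index(unsigned char *shm, const unsigned char *index,
			  size_t size, const unsigned char *hash)
{
	memcpy(shm, index, size - HASH_LEN);
	/* Keep the body stores from being reordered after the hash store. */
	atomic_thread_fence(memory_order_release);
	memcpy(shm + size - HASH_LEN, hash, HASH_LEN);
}

/*
 * Hypothetical reader: returns 1 if the trailing hash matches the
 * expected one, i.e. the writer has finished publishing.
 */
static int index_is_ready(const unsigned char *shm, size_t size,
			  const unsigned char *expected)
{
	if (size <= HASH_LEN ||
	    memcmp(shm + size - HASH_LEN, expected, HASH_LEN))
		return 0;
	/* Pair with the writer's fence before touching the body. */
	atomic_thread_fence(memory_order_acquire);
	return 1;
}
```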

> +}
-- 
Duy


Thread overview: 25+ messages
2016-03-19  1:04 [PATCH v2 00/15] index-helper, watchman David Turner
2016-03-19  1:04 ` [PATCH v2 01/17] read-cache.c: fix constness of verify_hdr() David Turner
2016-03-19  1:04 ` [PATCH v2 02/17] read-cache: allow to keep mmap'd memory after reading David Turner
2016-03-19  1:04 ` [PATCH v2 03/17] index-helper: new daemon for caching index and related stuff David Turner
2016-03-29  2:31   ` Duy Nguyen [this message]
2016-03-29 21:51     ` David Turner
2016-03-19  1:04 ` [PATCH v2 04/17] index-helper: add --strict David Turner
2016-03-19  1:04 ` [PATCH v2 05/17] daemonize(): set a flag before exiting the main process David Turner
2016-03-19  1:04 ` [PATCH v2 06/17] index-helper: add --detach David Turner
2016-03-29  2:35   ` Duy Nguyen
2016-03-19  1:04 ` [PATCH v2 07/17] read-cache: add watchman 'WAMA' extension David Turner
2016-03-19  1:04 ` [PATCH v2 08/17] read-cache: invalidate untracked cache data when reading WAMA David Turner
2016-03-29  2:50   ` Duy Nguyen
2016-03-29 21:22     ` David Turner
2016-03-19  1:04 ` [PATCH v2 09/17] Add watchman support to reduce index refresh cost David Turner
2016-03-24 13:47   ` Jeff Hostetler
2016-03-24 17:58     ` David Turner
2016-03-19  1:04 ` [PATCH v2 10/17] index-helper: use watchman to avoid refreshing index with lstat() David Turner
2016-03-19  1:04 ` [PATCH v2 11/17] update-index: enable/disable watchman support David Turner
2016-03-19  1:04 ` [PATCH v2 12/17] unpack-trees: preserve index extensions David Turner
2016-03-19  1:04 ` [PATCH v2 13/17] index-helper: kill mode David Turner
2016-03-19  1:04 ` [PATCH v2 14/17] index-helper: don't run if already running David Turner
2016-03-19  1:04 ` [PATCH v2 15/17] index-helper: autorun mode David Turner
2016-03-19  1:04 ` [PATCH v2 16/17] index-helper: optionally automatically run David Turner
2016-03-19  1:04 ` [PATCH v2 17/17] read-cache: config for waiting for index-helper David Turner
