From: "Ondřej Bílka" <neleai@seznam.cz>
To: David Lang <david@lang.hm>
Cc: Dennis Luehring <dl.soluz@gmx.net>, git@vger.kernel.org
Subject: Re: question about: Facebook makes Mercurial faster than Git
Date: Tue, 11 Mar 2014 15:23:25 +0100
Message-ID: <20140311142325.GB17336@domone.podge>
In-Reply-To: <alpine.DEB.2.02.1403101053120.20306@nftneq.ynat.uz>

On Mon, Mar 10, 2014 at 10:56:51AM -0700, David Lang wrote:
> On Mon, 10 Mar 2014, Ondřej Bílka wrote:
> 
> >On Mon, Mar 10, 2014 at 03:13:45AM -0700, David Lang wrote:
> >>On Mon, 10 Mar 2014, Dennis Luehring wrote:
> >>
> >>>according to these blog posts
> >>>
> >>>http://www.infoq.com/news/2014/01/facebook-scaling-hg
> >>>https://code.facebook.com/posts/218678814984400/scaling-mercurial-at-facebook/
> >>>
> >>>mercurial "can" be faster than git
> >>>
> >>>but I haven't found any reply from the git community on whether it is a
> >>>real problem or whether there are ongoing (maybe git 2.0) changes to
> >>>compete better in this case
> >>
> >>As I understand this, the biggest part of what happened is that
> >>Facebook made a tweak to mercurial so that when it needs to know
> >>what files have changed in their massive tree, their version asks
> >>their special storage array, while git would have to look at it
> >>through the filesystem interface (by doing stat calls on the
> >>directories and files to see if anything has changed)
> >>
> >That is mostly a kernel problem. Long ago a patch was proposed to
> >add a recursive mtime so you could check which subtrees changed. If
> >somebody resurrected that patch, it would give a similar boost.
> 
> btrfs could actually implement this efficiently, but for a lot of
> other filesystems this could be very expensive. The question is if it
> could be enough of a win to make it a good choice for people who are
> doing a heavy git workload as opposed to more generic uses.
>
See the next paragraph for how to do that efficiently: a directory update
needs to be done only once between application runs. Also, there is no
overhead when the feature is not used (except if it makes headers bigger).
 
> there's also the issue of managed vs generated files, if you update
> the mtime all the way up the tree because a source file was compiled
> and a binary created, that will quickly defeat the value of the
> recursive mtime.
>
You could do the marking on a per-file basis. I am not sure that is needed,
as larger projects use makefiles to avoid recompiling everything, so a file
is probably recompiled because a source file in the same directory changed.
Also, if your compile time is five minutes, a half-second status would not
make much difference.

 
> 
> >There are two issues that need to be handled. First, if you are concerned
> >about one mtime change causing a lot of updates, an application needs to
> >mark all directories it is interested in; when we do an update we unmark
> >the directory, and by that we update each directory at most once per
> >application run.
> >
> >Second, there is the problem of hard links, where probably the best
> >course is to keep a list of these and stat them separately.
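
To make the marking scheme quoted above concrete, here is a minimal
user-space simulation in Python. It is only a sketch under stated
assumptions, not the proposed kernel patch: the names Dir, mark_tree,
file_changed and changed_dirs are hypothetical, and stopping the upward
walk at the first unmarked directory is my reading of "update each
directory at most once per application run".

```python
class Dir:
    """Hypothetical in-memory directory node (not a kernel structure)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.marked = False   # set by the application, cleared on update
        self.rmtime = 0       # simulated recursive mtime
        self.updates = 0      # how often rmtime was written (illustration only)
        if parent is not None:
            parent.children.append(self)


def mark_tree(d):
    """The application marks every directory it is interested in."""
    d.marked = True
    for child in d.children:
        mark_tree(child)


def file_changed(d, now):
    """Simulate a file in directory d being modified at time `now`.

    Walk towards the root, updating and unmarking each marked directory.
    Assumption: an unmarked directory already propagated a change upward
    since it was last marked, so the walk can stop there.
    """
    node = d
    while node is not None and node.marked:
        node.rmtime = now
        node.marked = False
        node.updates += 1
        node = node.parent


def changed_dirs(d):
    """Return names of directories with changes below them, and re-mark them.

    A still-marked directory has seen no change in its subtree, so the
    whole subtree is skipped without looking at any of its entries.
    """
    if d.marked:
        return []
    d.marked = True
    names = [d.name]
    for child in d.children:
        names.extend(changed_dirs(child))
    return names
```

With this scheme, repeated writes in the same subtree between two
application runs cost one upward walk in total. The price is that the
stored rmtime only says "changed since the last marking", not when the
most recent change happened.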
