git@vger.kernel.org mailing list mirror (one of many)
From: David Lang <david@lang.hm>
To: "Ondřej Bílka" <neleai@seznam.cz>
Cc: Dennis Luehring <dl.soluz@gmx.net>, git@vger.kernel.org
Subject: Re: question about: Facebook makes Mercurial faster than Git
Date: Mon, 10 Mar 2014 10:56:51 -0700 (PDT)
Message-ID: <alpine.DEB.2.02.1403101053120.20306@nftneq.ynat.uz>
In-Reply-To: <20140310175102.GA17336@domone.podge>

On Mon, 10 Mar 2014, Ondřej Bílka wrote:

> On Mon, Mar 10, 2014 at 03:13:45AM -0700, David Lang wrote:
>> On Mon, 10 Mar 2014, Dennis Luehring wrote:
>>
>>> according to these blog posts
>>>
>>> http://www.infoq.com/news/2014/01/facebook-scaling-hg
>>> https://code.facebook.com/posts/218678814984400/scaling-mercurial-at-facebook/
>>>
>>> mercurial "can" be faster than git
>>>
>>> but I haven't found any reply from the git community on whether it is a real
>>> problem or whether there are ongoing (maybe git 2.0) changes to compete better in this case
>>
>> As I understand this, the biggest part of what happened is that
>> Facebook made a tweak to mercurial so that when it needs to know
>> what files have changed in their massive tree, their version asks
>> their special storage array, while git would have to look at it
>> through the filesystem interface (by doing stat calls on the
>> directories and files to see if anything has changed)
>>
> That is mostly a kernel problem. Long ago a patch was proposed to
> add a recursive mtime so you could check which subtrees changed. If
> somebody resurrected that patch it would give a similar boost.

btrfs could actually implement this efficiently, but for a lot of other
filesystems this could be very expensive. The question is whether it would be
enough of a win to make it a good choice for people running a heavy git workload,
as opposed to more generic uses.
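
To make the idea concrete, here is a rough sketch of how a scanner could use a
recursive mtime to skip unchanged subtrees. It is a pure in-memory model: as far
as I know no mainline kernel exposes such a field, so `rmtime`, `touch` and
`changed_since` are names made up for illustration, not a real interface.

```python
class Node:
    """A file or directory in a toy tree (hypothetical model, not a real fs)."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.rmtime = 0        # hypothetical recursive mtime
        self.children = {}     # empty for plain files
        if parent is not None:
            parent.children[name] = self

def touch(node, now):
    """Record a change and propagate it to every ancestor."""
    while node is not None:
        node.rmtime = now
        node = node.parent

def changed_since(node, since, visited):
    """Return changed files, descending only into subtrees whose rmtime moved."""
    visited.append(node.name)          # one "stat" per visited node
    if node.rmtime <= since:
        return []                      # whole subtree is unchanged, prune it
    if not node.children:
        return [node.name]
    changed = []
    for child in node.children.values():
        changed.extend(changed_since(child, since, visited))
    return changed

root = Node("root")
src = Node("src", root)
doc = Node("doc", root)
a_c = Node("a.c", src)
b_c = Node("b.c", src)
readme = Node("README", doc)

touch(a_c, now=5)
visited = []
changed = changed_since(root, since=3, visited=visited)
print(changed)   # files that changed after time 3
print(visited)   # nodes that had to be looked at
```

In this toy tree the scan looks at `doc` once and never descends into it, which
is the whole saving. The cost is on the write side: every change has to walk up
to the root.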

There's also the issue of managed vs. generated files: if you update the mtime
all the way up the tree because a source file was compiled and a binary created,
that will quickly defeat the value of the recursive mtime.
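
A small sketch of that failure mode, again as an in-memory model with made-up
names (`bump`, `dirs_to_rescan`, `bump_filtered`). The filtered variant assumes
something on the write path knows which files are generated, which a kernel
would not know by itself.

```python
def bump(rmtime, path, now):
    """Propagate a change to `path` into every ancestor directory ("" is the root)."""
    parts = path.split("/")
    for i in range(len(parts) - 1, -1, -1):
        rmtime["/".join(parts[:i])] = now

def dirs_to_rescan(rmtime, last_scan):
    """Directories whose recursive mtime moved since the last scan."""
    return sorted(d for d, t in rmtime.items() if t > last_scan)

def bump_filtered(rmtime, path, now, is_generated):
    """Hypothetical variant that skips files known to be build output."""
    if not is_generated(path):
        bump(rmtime, path, now)

last_scan = 1

# A compile writes src/foo.o after the last scan: no tracked file changed,
# yet the root and src/ both look dirty.
naive = {"": 0, "src": 0, "doc": 0}
bump(naive, "src/foo.o", now=2)
naive_dirty = dirs_to_rescan(naive, last_scan)

# With the (assumed) filter, the same write leaves everything clean.
filtered = {"": 0, "src": 0, "doc": 0}
bump_filtered(filtered, "src/foo.o", 2, lambda p: p.endswith(".o"))
filtered_dirty = dirs_to_rescan(filtered, last_scan)

print(naive_dirty, filtered_dirty)
```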

David Lang

> There are two issues that need to be handled. First, if you are concerned
> about one mtime change causing a lot of updates, an application needs to mark
> all directories it is interested in; when we do an update we unmark the
> directory, and that way we update each directory at most once per
> application run.
>
> The second problem is hard links, where probably the best course is to keep a
> list of these and stat them separately.
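
One possible reading of the mark/unmark scheme quoted above, sketched as an
in-memory model. The class and method names are invented for illustration, and
the sketch assumes the application marks a directory together with all of its
ancestors, so the upward walk can stop at the first unmarked directory.

```python
class MarkedTree:
    """Toy model: marked directories are 'clean and being watched'."""
    def __init__(self, parent_of):
        self.parent_of = parent_of   # directory -> parent directory (None for root)
        self.marked = set()
        self.updates = 0             # how many per-directory updates were done

    def mark(self, dirs):
        """Application declares interest in these directories."""
        self.marked.update(dirs)

    def on_change(self, directory):
        """Called on every modification inside `directory`."""
        d = directory
        # Unmark upward; stop as soon as we hit an already-unmarked directory,
        # since its ancestors were unmarked by an earlier change.
        while d is not None and d in self.marked:
            self.marked.discard(d)
            self.updates += 1
            d = self.parent_of.get(d)

parent_of = {"/": None, "/src": "/", "/src/lib": "/src", "/doc": "/"}
tree = MarkedTree(parent_of)
watched = set(parent_of)
tree.mark(watched)

for _ in range(1000):            # many writes to the same place
    tree.on_change("/src/lib")
tree.on_change("/src")

dirty = sorted(watched - tree.marked)   # what the application must rescan
print(tree.updates, dirty)
```

A thousand writes cost three directory updates in total here, and `/doc` stays
marked, so the application can skip it. Hard links are not modelled.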
