On Mon, 10 Mar 2014, Ondřej Bílka wrote:

> On Mon, Mar 10, 2014 at 03:13:45AM -0700, David Lang wrote:
>> On Mon, 10 Mar 2014, Dennis Luehring wrote:
>>
>>> according to these blog posts
>>>
>>> http://www.infoq.com/news/2014/01/facebook-scaling-hg
>>> https://code.facebook.com/posts/218678814984400/scaling-mercurial-at-facebook/
>>>
>>> mercurial "can" be faster than git
>>>
>>> but I haven't found any reply from the git community on whether this
>>> is a real problem, or whether there are ongoing changes (maybe git
>>> 2.0) to compete better in this case
>>
>> As I understand this, the biggest part of what happened is that
>> Facebook made a tweak to mercurial so that when it needs to know
>> what files have changed in their massive tree, their version asks
>> their special storage array, while git would have to look at it
>> through the filesystem interface (by doing stat calls on the
>> directories and files to see if anything has changed)
>>
> That is mostly a kernel problem. Long ago there was a proposed patch to
> add a recursive mtime so you could check which subtrees had changed. If
> somebody resurrected that patch it would give a similar boost.

btrfs could actually implement this efficiently, but for a lot of other
filesystems it could be very expensive. The question is whether it would
be enough of a win to make it a good choice for people doing a heavy git
workload, as opposed to more generic uses.

There's also the issue of managed vs. generated files: if you update the
mtime all the way up the tree because a source file was compiled and a
binary created, that will quickly defeat the value of the recursive
mtime.

David Lang

> There are two issues that need to be handled. First, if you are
> concerned about one mtime change causing a lot of updates, an
> application needs to mark all the directories it is interested in;
> when we do an update we unmark the directory, so each directory is
> updated at most once per application run.
>
> The second problem is hard links, where probably the best course is to
> keep a list of these and stat them separately.
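To make the discussion above concrete, here is a minimal sketch (mine, not from the thread) of the stat-based scan git effectively does today, plus the pruning that a recursive mtime would enable. Ordinary POSIX semantics only bump a directory's mtime when its direct entries change, so the `recursive_mtime=True` branch assumes the hypothetical kernel feature being discussed:

```python
import os

def changed_paths(root, last_mtimes, recursive_mtime=False):
    """Return files whose mtime differs from a recorded snapshot.

    last_mtimes maps path -> mtime from the previous scan.
    With recursive_mtime=True we assume the (hypothetical) kernel
    feature where a directory's mtime is bumped whenever anything
    anywhere beneath it changes, which lets us skip whole unchanged
    subtrees instead of stat()ing every file.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        if recursive_mtime and os.stat(dirpath).st_mtime == last_mtimes.get(dirpath):
            # Assumed recursive-mtime semantics: an unchanged directory
            # mtime means nothing below it changed, so prune the walk.
            dirnames[:] = []
            continue
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime != last_mtimes.get(path):
                changed.append(path)
    return changed
```

With `recursive_mtime=False` this is the full O(files) stat walk; with it enabled, the cost drops toward the number of changed subtrees, which is the boost the proposed patch was after. It also shows why generated files hurt: a compiler writing a binary would dirty every directory mtime up to the root and defeat the pruning.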