git@vger.kernel.org mailing list mirror (one of many)
From: Ivan Kanis <expire-by-2010-08-16@kanis.fr>
To: Avery Pennarun <apenwarr@gmail.com>
Cc: Dmitry Potapov <dpotapov@gmail.com>,
	Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
	jaredhance@gmail.com, jnareb@gmail.com, git <git@vger.kernel.org>
Subject: Re: Excessive mmap [was Git server eats all memory]
Date: Wed, 11 Aug 2010 17:47:36 +0200
Message-ID: <weshbj1xiav.fsf@kanis.fr>
In-Reply-To: <AANLkTiktriuvciNTNPD4941AG3th6rWwUYT4v_UnaAz3@mail.gmail.com> (Avery Pennarun's message of "Mon, 9 Aug 2010 12:50:30 -0400")

Hi Avery,

Avery Pennarun <apenwarr@gmail.com> wrote:

> ... When you access some of the pages of the mmap'd file, the kernel
> will swap those pages into memory, which increases RSS.  This uses
> *real* memory on the system...

Thanks for the very clear explanations.

> Now, the kernel is supposed to be smart enough to release old pages
> out of RSS if you stop using them; it's no different from what the
> kernel does with any cached file data.  So it shouldn't be expensive
> to mmap instead of just reading the file.

How can the kernel release old pages? There does not seem to be any way
to tell it that the process no longer needs a given memory block.

>> Looking some more into it today the bulk of the memory allocation
>> happens in write_pack_file in the following loop.
>>
>> for (; i < nr_objects; i++) {
>>    if (!write_one(f, objects + i, &offset))
>>        break;
>>    display_progress(progress_state, written);
>> }
>>
>> This eventually calls write_object. Here I am wondering whether the
>> unuse_pack function is doing its job. As far as I can tell it writes a
>> null in memory, which I think is not enough to reclaim memory.
>
> What do you mean by the "memory allocation" happens here?  How are you
> measuring it?

I run top and look at the RES column. I put a printf before and after
the code block and watch the memory go up and up.
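
The same before/after measurement can be taken from inside the process
instead of watching top. This is a minimal sketch, assuming Linux and
its /proc/self/statm file; current_rss_bytes is a hypothetical helper
name, not something in git.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of git): return this process's
 * resident set size in bytes, or -1 on error. Linux only.
 *
 * The second field of /proc/self/statm is the number of resident
 * pages, which is what top reports (scaled) in its RES column.
 */
static long current_rss_bytes(void)
{
    long total_pages, resident_pages;
    FILE *f = fopen("/proc/self/statm", "r");

    if (!f)
        return -1;
    if (fscanf(f, "%ld %ld", &total_pages, &resident_pages) != 2) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return resident_pages * sysconf(_SC_PAGESIZE);
}
```

Calling it once before and once after the loop and printing the
difference to stderr gives the growth for that block alone.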

>> I also looked at the use_pack function where the mmap is
>> happening. Would it be worth refactoring this function so that it uses
>> an index within a file instead of mmap?
>>
>> Unless I hear of a better idea I'll be trying that tomorrow...
>
> I wouldn't expect this to help, but I would be interested to hear if
> it does.

I got caught up with other things at work, but I think I'll be able to
try on Friday.

> If the problem is simply that you're flooding the kernel disk cache
> with data you'll use only once, to the detriment of everything else on
> the system, then one thing that might help could be posix_fadvise:
>
>     posix_fadvise(fd, ofs, len, POSIX_FADV_DONTNEED);

Sounds interesting, I'll try sticking that in the unuse_pack function
Friday.

> On the other hand, perhaps a more important question is: why does git
> feel like it needs to generate entirely new packs for each person
> doing a clone on your system?  Shouldn't it be reusing existing ones
> and just streaming them straight out to the recipient?

Ah, interesting point. Two things make me suspect the mmap is not shared
between processes. One is that the mmap is done with the MAP_PRIVATE flag,
which according to the man page doesn't share between processes. The
second is that the mmap is done on a temporary file created by
odb_mkstemp, and I don't believe the name is identical between the two
processes.
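
A minimal sketch of the MAP_PRIVATE behaviour the man page describes,
assuming Linux: a write through a private mapping is copy-on-write and
never reaches the underlying file. The function name is hypothetical.
The sketch only demonstrates that writes stay private; it does not show
whether pages that are merely read are shared.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo (not part of git): change the first byte of fd
 * through a MAP_PRIVATE mapping, then read that byte back from the
 * file itself with pread().
 *
 * fd must be readable and refer to a file of at least one byte.
 * Returns 1 if the file is unchanged, 0 if it changed, -1 on error.
 */
static int private_write_stays_private(int fd)
{
    char before, after;
    char *map;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    if (pread(fd, &before, 1, 0) != 1)
        return -1;

    map = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    /* The store triggers a private copy of this page. */
    map[0] = (char)(before + 1);

    if (pread(fd, &after, 1, 0) != 1) {
        munmap(map, page);
        return -1;
    }
    munmap(map, page);
    return after == before;
}
```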

Take care,
-- 
Ivan Kanis
http://kanis.fr

Nobody ever went broke underestimating the intelligence of the
American public.
    -- H L Mencken 
