From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-3.0 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_HI,T_RP_MATCHES_RCVD shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by dcvr.yhbt.net (Postfix) with ESMTP id BE5A11F406 for ; Mon, 15 Jan 2018 23:47:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750764AbeAOXrh (ORCPT ); Mon, 15 Jan 2018 18:47:37 -0500 Received: from mxin.vub.ac.be ([134.184.129.114]:13511 "EHLO mxin.vub.ac.be" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750740AbeAOXrh (ORCPT ); Mon, 15 Jan 2018 18:47:37 -0500 X-Greylist: delayed 583 seconds by postgrey-1.27 at vger.kernel.org; Mon, 15 Jan 2018 18:47:36 EST X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: =?us-ascii?q?A2FwAQCPOl1a/wsxuIZcGQEBAQEBAQEBA?= =?us-ascii?q?QEBAQcBAQEBAYUbj1KPbRRokTiHDgqFOwKFFBQBAQEBAQEBAQEDaCiFJAEBAQM?= =?us-ascii?q?BJxM/BQsLDgoJJQ8FKCGKPgipfTqJSwEBAQEBAQQBAQEBJIgogWmCeDaIaoIUI?= =?us-ascii?q?AWjZJU+giaJezeHRUiWXYE8NiOBUEwgGIJohFdAjgYBAQE?= X-IPAS-Result: =?us-ascii?q?A2FwAQCPOl1a/wsxuIZcGQEBAQEBAQEBAQEBAQcBAQEBAYU?= =?us-ascii?q?bj1KPbRRokTiHDgqFOwKFFBQBAQEBAQEBAQEDaCiFJAEBAQMBJxM/BQsLDgoJJ?= =?us-ascii?q?Q8FKCGKPgipfTqJSwEBAQEBAQQBAQEBJIgogWmCeDaIaoIUIAWjZJU+giaJeze?= =?us-ascii?q?HRUiWXYE8NiOBUEwgGIJohFdAjgYBAQE?= Received: from igwe11.vub.ac.be (HELO igwe.rave.org) ([134.184.49.11]) by smtp.vub.ac.be with ESMTP/TLS/AES128-GCM-SHA256; 16 Jan 2018 00:37:51 +0100 Received: from kgybels by igwe.rave.org with local (Exim 4.84_2) (envelope-from ) id 1ebEK7-00014s-6f; Tue, 16 Jan 2018 00:37:51 +0100 Date: Tue, 16 Jan 2018 00:37:51 +0100 From: Kim Gybels To: Jeff King Cc: git@vger.kernel.org, Junio C Hamano , Johannes Schindelin , Michael Haggerty Subject: Re: [PATCH] packed_ref_cache: don't use mmap() for small files Message-ID: <20180115233751.GA1781@infogroep.be> References: <20180113161149.9564-1-kgybels@infogroep.be> <20180115211505.GA4778@sigill.intra.peff.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20180115211505.GA4778@sigill.intra.peff.net> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On (15/01/18 16:15), Jeff King wrote: > On Sat, Jan 13, 2018 at 05:11:49PM +0100, Kim Gybels wrote: > > > Take a hint from commit ea68b0ce9f8ce8da3e360aed3cbd6720159ffbee and use > > read() instead of mmap() for small packed-refs files. > > > > This also fixes the problem[1] where xmmap() returns NULL for zero > > length[2], for which munmap() later fails. > > > > Alternatively, we could simply check for NULL before munmap(), or > > introduce an xmunmap() that could be used together with xmmap(). > > This looks good to me, and since it's a recent-ish regression, I think > we should take the minimal fix here. The minimal fix being a simple NULL check before munmap()? > But it does make me wonder whether xmmap() ought to be doing this "small > mmap" optimization for us. Obviously that only works when we do > MAP_PRIVATE and never write to the result. But that's how we always use > it anyway, and we're restricted to that to work with the NO_MMAP wrapper > in compat/mmap.c. Maybe I should have left the optimization for small files out of the patch for the zero length regression. After all, read() vs mmap() performance might depend on other factors than just size. > > @@ -489,21 +491,21 @@ static int load_contents(struct snapshot *snapshot) > > die_errno("couldn't stat %s", snapshot->refs->path); > > size = xsize_t(st.st_size); > > > > - switch (mmap_strategy) { > > - case MMAP_NONE: > > + if (!size) { > > + snapshot->buf = NULL; > > + snapshot->eof = NULL; > > + snapshot->mmapped = 0; > > + } else if (size <= SMALL_FILE_SIZE || mmap_strategy == MMAP_NONE) { > > snapshot->buf = xmalloc(size); > > bytes_read = read_in_full(fd, snapshot->buf, size); > > if (bytes_read < 0 || bytes_read != size) > > die_errno("couldn't read %s", snapshot->refs->path); > > snapshot->eof = snapshot->buf + size; > > snapshot->mmapped = 0; > > If the "!size" case is just lumped in with "size <= SMALL_FILE_SIZE", > then we'd try to xmalloc(0), which is guaranteed to work (we fallback to > a 1-byte allocation if necessary). Would that make things simpler and > more consistent for the rest of the code to always have snapshot->buf be > a valid pointer (just based on seeing Michael's follow-up patches)? Indeed, all those patches are to avoid using the NULL pointers in ways that are undefined. We could also copy index_core's way of handling the zero length case: ret = index_mem(sha1, "", size, type, path, flags); Point to some static memory instead of NULL, then all the pointer arithmetic is defined. -Kim