git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Elijah Newren <newren@gmail.com>
Cc: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 0/5] Add struct strmap and associated utility functions
Date: Fri, 28 Aug 2020 03:03:35 -0400	[thread overview]
Message-ID: <20200828070335.GB2105050@coredump.intra.peff.net> (raw)
In-Reply-To: <CABPp-BGYiphp-93Bf=2z-ZLd-Y=buTA0BCp6zuTJF39n1x3Rfw@mail.gmail.com>

On Fri, Aug 21, 2020 at 02:33:54PM -0700, Elijah Newren wrote:

> However, there's an important difference here between what I've done
> and what you've suggested for hashmap: my method did not deallocate
> hashmap->table in hashmap_clear() and then use lazy initialization.
> In fact, I think not deallocating the table was part of the charm --
> the table had already naturally grown to the right size, and because
> the repository has approximately the same number of paths in various
> commits, this provided me a way of getting a table preallocated to a
> reasonable size for all merges after the first (and there are multiple
> merges either when recursiveness is needed due to multiple merge
> bases, OR when rebasing or cherry-picking a sequence of commits).
> This prevented, as hashmap.h puts it, "expensive resizing".
> 
> So, once again, my performance ideas might be clashing with some of
> your desires for the API.  Any clever ideas for resolving that?

If the magic is in pre-sizing the hash, then it seems like the callers
ought to be feeding the size hint. That does make a little more work for
them, but I think there's real value in having consistent semantics for
"clear" across our data structures.

However, one cheat would be to free the memory but retain the size hint
after a clear. And then if we lazy-init, grow immediately to the hint
size. That's more expensive than a true reuse, because we do reallocate
the memory. But it avoids the repeated re-allocation during growth.

It may also be a sign that we should be growing the hash more
aggressively in the first place. Of course all of this is predicated
having some benchmarks. It would be useful to know which part actually
provided the speedup.

-Peff

  parent reply	other threads:[~2020-08-28  7:04 UTC|newest]

Thread overview: 144+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-21 18:52 Elijah Newren via GitGitGadget
2020-08-21 18:52 ` [PATCH 1/5] hashmap: add usage documentation explaining hashmap_free[_entries]() Elijah Newren via GitGitGadget
2020-08-21 19:22   ` Jeff King
2020-08-21 18:52 ` [PATCH 2/5] strmap: new utility functions Elijah Newren via GitGitGadget
2020-08-21 19:48   ` Jeff King
2020-08-21 18:52 ` [PATCH 3/5] strmap: add more " Elijah Newren via GitGitGadget
2020-08-21 19:58   ` Jeff King
2020-08-21 18:52 ` [PATCH 4/5] strmap: add strdup_strings option Elijah Newren via GitGitGadget
2020-08-21 20:01   ` Jeff King
2020-08-21 20:41     ` Elijah Newren
2020-08-21 21:03       ` Jeff King
2020-08-21 22:25         ` Elijah Newren
2020-08-28  7:08           ` Jeff King
2020-08-28 17:20             ` Elijah Newren
2020-08-21 18:52 ` [PATCH 5/5] strmap: add functions facilitating use as a string->int map Elijah Newren via GitGitGadget
2020-08-21 20:10   ` Jeff King
2020-08-21 20:51     ` Elijah Newren
2020-08-21 21:05       ` Jeff King
2020-08-21 20:16 ` [PATCH 0/5] Add struct strmap and associated utility functions Jeff King
2020-08-21 21:33   ` Elijah Newren
2020-08-21 22:28     ` Elijah Newren
2020-08-28  7:03     ` Jeff King [this message]
2020-08-28 15:29       ` Elijah Newren
2020-09-01  9:27         ` Jeff King
2020-10-13  0:40 ` [PATCH v2 00/10] " Elijah Newren via GitGitGadget
2020-10-13  0:40   ` [PATCH v2 01/10] hashmap: add usage documentation explaining hashmap_free[_entries]() Elijah Newren via GitGitGadget
2020-10-30 12:50     ` Jeff King
2020-10-30 19:55       ` Elijah Newren
2020-11-03 16:26         ` Jeff King
2020-11-03 16:48           ` Elijah Newren
2020-10-13  0:40   ` [PATCH v2 02/10] hashmap: adjust spacing to fix argument alignment Elijah Newren via GitGitGadget
2020-10-30 12:51     ` Jeff King
2020-10-13  0:40   ` [PATCH v2 03/10] hashmap: allow re-use after hashmap_free() Elijah Newren via GitGitGadget
2020-10-30 13:35     ` Jeff King
2020-10-30 15:37       ` Elijah Newren
2020-11-03 16:08         ` Jeff King
2020-11-03 16:16           ` Elijah Newren
2020-10-13  0:40   ` [PATCH v2 04/10] hashmap: introduce a new hashmap_partial_clear() Elijah Newren via GitGitGadget
2020-10-30 13:41     ` Jeff King
2020-10-30 16:03       ` Elijah Newren
2020-11-03 16:10         ` Jeff King
2020-10-13  0:40   ` [PATCH v2 05/10] strmap: new utility functions Elijah Newren via GitGitGadget
2020-10-30 14:12     ` Jeff King
2020-10-30 16:26       ` Elijah Newren
2020-10-13  0:40   ` [PATCH v2 06/10] strmap: add more " Elijah Newren via GitGitGadget
2020-10-30 14:23     ` Jeff King
2020-10-30 16:43       ` Elijah Newren
2020-11-03 16:12         ` Jeff King
2020-10-13  0:40   ` [PATCH v2 07/10] strmap: enable faster clearing and reusing of strmaps Elijah Newren via GitGitGadget
2020-10-30 14:27     ` Jeff King
2020-10-13  0:40   ` [PATCH v2 08/10] strmap: add functions facilitating use as a string->int map Elijah Newren via GitGitGadget
2020-10-30 14:39     ` Jeff King
2020-10-30 17:28       ` Elijah Newren
2020-11-03 16:20         ` Jeff King
2020-11-03 16:46           ` Elijah Newren
2020-10-13  0:40   ` [PATCH v2 09/10] strmap: add a strset sub-type Elijah Newren via GitGitGadget
2020-10-30 14:44     ` Jeff King
2020-10-30 18:02       ` Elijah Newren
2020-10-13  0:40   ` [PATCH v2 10/10] strmap: enable allocations to come from a mem_pool Elijah Newren via GitGitGadget
2020-10-30 14:56     ` Jeff King
2020-10-30 19:31       ` Elijah Newren
2020-11-03 16:24         ` Jeff King
2020-11-02 18:55   ` [PATCH v3 00/13] Add struct strmap and associated utility functions Elijah Newren via GitGitGadget
2020-11-02 18:55     ` [PATCH v3 01/13] hashmap: add usage documentation explaining hashmap_free[_entries]() Elijah Newren via GitGitGadget
2020-11-02 18:55     ` [PATCH v3 02/13] hashmap: adjust spacing to fix argument alignment Elijah Newren via GitGitGadget
2020-11-02 18:55     ` [PATCH v3 03/13] hashmap: allow re-use after hashmap_free() Elijah Newren via GitGitGadget
2020-11-02 18:55     ` [PATCH v3 04/13] hashmap: introduce a new hashmap_partial_clear() Elijah Newren via GitGitGadget
2020-11-02 18:55     ` [PATCH v3 05/13] hashmap: provide deallocation function names Elijah Newren via GitGitGadget
2020-11-02 18:55     ` [PATCH v3 06/13] strmap: new utility functions Elijah Newren via GitGitGadget
2020-11-02 18:55     ` [PATCH v3 07/13] strmap: add more " Elijah Newren via GitGitGadget
2020-11-04 20:13       ` Jeff King
2020-11-04 20:24         ` Elijah Newren
2020-11-02 18:55     ` [PATCH v3 08/13] strmap: enable faster clearing and reusing of strmaps Elijah Newren via GitGitGadget
2020-11-02 18:55     ` [PATCH v3 09/13] strmap: add functions facilitating use as a string->int map Elijah Newren via GitGitGadget
2020-11-04 20:21       ` Jeff King
2020-11-02 18:55     ` [PATCH v3 10/13] strmap: add a strset sub-type Elijah Newren via GitGitGadget
2020-11-04 20:31       ` Jeff King
2020-11-02 18:55     ` [PATCH v3 11/13] strmap: enable allocations to come from a mem_pool Elijah Newren via GitGitGadget
2020-11-02 18:55     ` [PATCH v3 12/13] strmap: take advantage of FLEXPTR_ALLOC_STR when relevant Elijah Newren via GitGitGadget
2020-11-04 20:43       ` Jeff King
2020-11-02 18:55     ` [PATCH v3 13/13] Use new HASHMAP_INIT macro to simplify hashmap initialization Elijah Newren via GitGitGadget
2020-11-04 20:48       ` Jeff King
2020-11-04 20:52     ` [PATCH v3 00/13] Add struct strmap and associated utility functions Jeff King
2020-11-04 22:20       ` Elijah Newren
2020-11-05  0:22     ` [PATCH v4 " Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 01/13] hashmap: add usage documentation explaining hashmap_free[_entries]() Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 02/13] hashmap: adjust spacing to fix argument alignment Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 03/13] hashmap: allow re-use after hashmap_free() Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 04/13] hashmap: introduce a new hashmap_partial_clear() Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 05/13] hashmap: provide deallocation function names Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 06/13] strmap: new utility functions Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 07/13] strmap: add more " Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 08/13] strmap: enable faster clearing and reusing of strmaps Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 09/13] strmap: add functions facilitating use as a string->int map Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 10/13] strmap: add a strset sub-type Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 11/13] strmap: enable allocations to come from a mem_pool Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 12/13] strmap: take advantage of FLEXPTR_ALLOC_STR when relevant Elijah Newren via GitGitGadget
2020-11-05  0:22       ` [PATCH v4 13/13] Use new HASHMAP_INIT macro to simplify hashmap initialization Elijah Newren via GitGitGadget
2020-11-05 13:29       ` [PATCH v4 00/13] Add struct strmap and associated utility functions Jeff King
2020-11-05 20:25         ` Junio C Hamano
2020-11-05 21:17           ` Jeff King
2020-11-05 21:22           ` Elijah Newren
2020-11-05 22:15             ` Junio C Hamano
2020-11-06  0:24       ` [PATCH v5 00/15] " Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 01/15] hashmap: add usage documentation explaining hashmap_free[_entries]() Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 02/15] hashmap: adjust spacing to fix argument alignment Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 03/15] hashmap: allow re-use after hashmap_free() Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 04/15] hashmap: introduce a new hashmap_partial_clear() Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 05/15] hashmap: provide deallocation function names Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 06/15] strmap: new utility functions Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 07/15] strmap: add more " Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 08/15] strmap: enable faster clearing and reusing of strmaps Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 09/15] strmap: add functions facilitating use as a string->int map Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 10/15] strmap: split create_entry() out of strmap_put() Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 11/15] strmap: add a strset sub-type Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 12/15] strmap: enable allocations to come from a mem_pool Elijah Newren via GitGitGadget
2020-11-11 17:33           ` Phillip Wood
2020-11-11 18:49             ` Elijah Newren
2020-11-11 19:01             ` Jeff King
2020-11-11 20:34               ` Chris Torek
2020-11-06  0:24         ` [PATCH v5 13/15] strmap: take advantage of FLEXPTR_ALLOC_STR when relevant Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 14/15] Use new HASHMAP_INIT macro to simplify hashmap initialization Elijah Newren via GitGitGadget
2020-11-06  0:24         ` [PATCH v5 15/15] shortlog: use strset from strmap.h Elijah Newren via GitGitGadget
2020-11-06  2:00         ` [PATCH v5 00/15] Add struct strmap and associated utility functions Junio C Hamano
2020-11-06  2:42           ` Elijah Newren
2020-11-06  2:48             ` Jeff King
2020-11-06 17:32               ` Junio C Hamano
2020-11-11 20:02         ` [PATCH v6 " Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 01/15] hashmap: add usage documentation explaining hashmap_free[_entries]() Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 02/15] hashmap: adjust spacing to fix argument alignment Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 03/15] hashmap: allow re-use after hashmap_free() Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 04/15] hashmap: introduce a new hashmap_partial_clear() Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 05/15] hashmap: provide deallocation function names Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 06/15] strmap: new utility functions Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 07/15] strmap: add more " Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 08/15] strmap: enable faster clearing and reusing of strmaps Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 09/15] strmap: add functions facilitating use as a string->int map Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 10/15] strmap: split create_entry() out of strmap_put() Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 11/15] strmap: add a strset sub-type Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 12/15] strmap: enable allocations to come from a mem_pool Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 13/15] strmap: take advantage of FLEXPTR_ALLOC_STR when relevant Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 14/15] Use new HASHMAP_INIT macro to simplify hashmap initialization Elijah Newren via GitGitGadget
2020-11-11 20:02           ` [PATCH v6 15/15] shortlog: use strset from strmap.h Elijah Newren via GitGitGadget
2020-11-11 20:07           ` [PATCH v6 00/15] Add struct strmap and associated utility functions Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200828070335.GB2105050@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=newren@gmail.com \
    --subject='Re: [PATCH 0/5] Add struct strmap and associated utility functions' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ https://public-inbox.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/inbox.comp.version-control.git
	nntp://ie5yzdi7fg72h7s4sdcztq5evakq23rdt33mfyfcddc5u3ndnw24ogqd.onion/inbox.comp.version-control.git
	nntp://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git