git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
From: Jonathan Nieder <jrnieder@gmail.com>
To: git@vger.kernel.org
Cc: Stefan Beller <sbeller@google.com>, "brian m. carlson" <sandals@crustytoothpaste.net>
Subject: [PATCH 00/39] per-repository object store, part 1
Date: Tue, 29 Aug 2017 23:46:34 -0700
Message-ID: <20170830064634.GA153983@aiede.mtv.corp.google.com> (raw)

Hi,

Most of the credit for this series should go to Stefan Beller.  I just
decided to pull the trigger on sending out what we have so far.

This series is about API.  It makes no functional change yet.

Today, when a git command wants to operate on some objects from another
repository (e.g., a submodule), it has two choices:

 A. Use run_command to operate on that repository in a separate process.

 B. Use add_to_alternates_memory to pretend the repository is an
    alternate.  This has a number of downsides.  Aside from aesthetics,
    one particularly painful consequence is that as alternates
    accumulate, the number of packs git has to check for objects
    increases, which can cause significant slowdowns.

Brandon Williams's recent work to introduce "struct repository" points
to a better way.  Encapsulating object access in struct repository
would mean:

  i. The API for accessing objects in another repository becomes more
     simple and familiar (instead of using the CLI or abusing alternates).

  ii. Operations on one repository do not interfere with another,
      neither in semantics (e.g. replace objects do not work correctly
      with the approach (B) above) nor performance (already described
      above).

 iii. Resources associated with access to a repository could be freed
      when done with that repo.

  iv. Thread-safe multiple readers to a single repository also become
      straightforward, by using multiple repository objects for the same
      repo.

This series is a small step in that direction.

At the end of this series, sha1_loose_object_info takes a repository
argument and can be independently called for multiple repositories.
Not incredibly useful on its own, but a future series will do the same
for sha1_object_info, which will be enough to migrate a caller in
submodule.c (which uses the object store for commit existence checks).

This series has a few phases:

 1. Patch 1 is a cleanup that made some of the later patches easier.

 2. Patches 2-6 create a struct object_store field inside struct
    repository and move some globals to it.

 3. Patches 7-24 are mechanical changes that update some functions to
    accept a repository argument. The only goal is to make the later
    patches that teach these functions to actual handle a repository
    other than the_repository easier to review.  The patches enforce
    at compile time that no caller passes a repository other than
    the_repository --- see patch 7 in particular for details on how
    that works.

 4. Patches 25-39 update the implementations of those functions to
    handle a repository other than the_repository.  This means the
    safety check introduced in phase 3 goes away completely --- all
    functions that gained a repository argument are safe to use with
    a repository argument other than the_repository.

Patches 2-6 and 25-39 should be the most interesting to review.  I'd
particularly appreciate if people can look over 25-39 carefully.  We
were careful not to leave any calls to functions that assume they are
operating on the_repository, but a triple-check is always welcome.

Thanks as well to brian m. carlson, who showed us how such a long and
potentially tedius series can be made bearable for reviewers.

Thoughts of all kinds welcome, as always.

Thanks,
Jonathan Nieder (24):
  pack: make packed_git_mru global a value instead of a pointer
  object-store: move packed_git and packed_git_mru to object store
    struct
  pack: move prepare_packed_git_run_once to object store struct
  pack: move approximate object count to object store struct
  pack: add repository argument to install_packed_git
  pack: add repository argument to prepare_packed_git_one
  pack: add repository argument to rearrange_packed_git
  pack: add repository argument to prepare_packed_git_mru
  pack: add repository argument to prepare_packed_git
  pack: add repository argument to reprepare_packed_git
  pack: add repository argument to sha1_file_name
  pack: add repository argument to map_sha1_file
  pack: allow install_packed_git to handle arbitrary repositories
  pack: allow rearrange_packed_git to handle arbitrary repositories
  pack: allow prepare_packed_git_mru to handle arbitrary repositories
  pack: allow prepare_packed_git_one to handle arbitrary repositories
  pack: allow prepare_packed_git to handle arbitrary repositories
  pack: allow reprepare_packed_git to handle arbitrary repositories
  pack: allow sha1_file_name to handle arbitrary repositories
  pack: allow stat_sha1_file to handle arbitrary repositories
  pack: allow open_sha1_file to handle arbitrary repositories
  pack: allow map_sha1_file_1 to handle arbitrary repositories
  pack: allow map_sha1_file to handle arbitrary repositories
  pack: allow sha1_loose_object_info to handle arbitrary repositories

Stefan Beller (15):
  repository: introduce object store field
  object-store: move alt_odb_list and alt_odb_tail to object store
    struct
  sha1_file: add repository argument to alt_odb_usable
  sha1_file: add repository argument to link_alt_odb_entry
  sha1_file: add repository argument to read_info_alternates
  sha1_file: add repository argument to link_alt_odb_entries
  sha1_file: add repository argument to stat_sha1_file
  sha1_file: add repository argument to open_sha1_file
  sha1_file: add repository argument to map_sha1_file_1
  sha1_file: add repository argument to sha1_loose_object_info
  object-store: add repository argument to prepare_alt_odb
  object-store: add repository argument to foreach_alt_odb
  sha1_file: allow alt_odb_usable to handle arbitrary repositories
  object-store: allow prepare_alt_odb to handle arbitrary repositories
  object-store: allow foreach_alt_odb to handle arbitrary repositories

 builtin/count-objects.c             |  10 ++-
 builtin/fsck.c                      |  15 ++--
 builtin/gc.c                        |   8 +-
 builtin/index-pack.c                |   1 +
 builtin/pack-objects.c              |  23 +++--
 builtin/pack-redundant.c            |   8 +-
 builtin/receive-pack.c              |   4 +-
 builtin/submodule--helper.c         |   4 +-
 bulk-checkin.c                      |   3 +-
 cache.h                             |  50 ++---------
 contrib/coccinelle/packed_git.cocci |  15 ++++
 fast-import.c                       |  10 ++-
 fetch-pack.c                        |   3 +-
 http-backend.c                      |   8 +-
 http-push.c                         |   1 +
 http-walker.c                       |   4 +-
 http.c                              |   9 +-
 mru.h                               |   1 +
 object-store.h                      |  71 ++++++++++++++++
 pack-bitmap.c                       |   6 +-
 pack-check.c                        |   1 +
 pack-revindex.c                     |   1 +
 packfile.c                          |  94 ++++++++++----------
 packfile.h                          |   6 +-
 reachable.c                         |   1 +
 repository.c                        |   4 +-
 repository.h                        |   7 ++
 server-info.c                       |   8 +-
 sha1_file.c                         | 165 ++++++++++++++++++++----------------
 sha1_name.c                         |  11 ++-
 streaming.c                         |   5 +-
 transport.c                         |   4 +-
 32 files changed, 344 insertions(+), 217 deletions(-)
 create mode 100644 contrib/coccinelle/packed_git.cocci
 create mode 100644 object-store.h

-- 
2.14.1.581.gf28d330327


             reply index

Thread overview: 46+ messages in thread (expand / mbox.gz / Atom feed / [top])
2017-08-30  6:46 Jonathan Nieder [this message]
2017-08-30  6:48 ` [PATCH 01/39] pack: make packed_git_mru global a value instead of a pointer Jonathan Nieder
2017-08-30 19:44   ` Jeff King
2017-09-06 19:51     ` Junio C Hamano
2017-08-30  6:52 ` [PATCH 02/39] repository: introduce object store field Jonathan Nieder
2017-08-30  6:53 ` [PATCH 03/39] object-store: move alt_odb_list and alt_odb_tail to object store Jonathan Nieder
2017-08-30  6:54 ` [PATCH 04/39] object-store: move packed_git and packed_git_mru " Jonathan Nieder
2017-08-30  6:54 ` [PATCH 05/39] pack: move prepare_packed_git_run_once " Jonathan Nieder
2017-08-30  6:55 ` [PATCH 06/39] pack: move approximate object count " Jonathan Nieder
2017-08-30  6:56 ` [PATCH 07/39] sha1_file: add repository argument to alt_odb_usable Jonathan Nieder
2017-08-30 22:40   ` Brandon Williams
2017-08-30  6:57 ` [PATCH 08/39] sha1_file: add repository argument to link_alt_odb_entry Jonathan Nieder
2017-08-30  6:58 ` [PATCH 09/39] sha1_file: add repository argument to read_info_alternates Jonathan Nieder
2017-08-30  6:58 ` [PATCH 10/39] sha1_file: add repository argument to link_alt_odb_entries Jonathan Nieder
2017-08-30  6:59 ` [PATCH 11/39] sha1_file: add repository argument to stat_sha1_file Jonathan Nieder
2017-08-30  6:59 ` [PATCH 12/39] sha1_file: add repository argument to open_sha1_file Jonathan Nieder
2017-08-30  7:00 ` [PATCH 13/39] sha1_file: add repository argument to map_sha1_file_1 Jonathan Nieder
2017-08-30  7:01 ` [PATCH 14/39] sha1_file: add repository argument to sha1_loose_object_info Jonathan Nieder
2017-08-30  7:01 ` [PATCH 15/39] object-store: add repository argument to prepare_alt_odb Jonathan Nieder
2017-08-30  7:02 ` [PATCH 16/39] object-store: add repository argument to foreach_alt_odb Jonathan Nieder
2017-08-30  7:02 ` [PATCH 17/39] pack: add repository argument to install_packed_git Jonathan Nieder
2017-08-30  7:03 ` [PATCH 18/39] pack: add repository argument to prepare_packed_git_one Jonathan Nieder
2017-08-30  7:03 ` [PATCH 19/39] pack: add repository argument to rearrange_packed_git Jonathan Nieder
2017-08-30  7:04 ` [PATCH 20/39] pack: add repository argument to prepare_packed_git_mru Jonathan Nieder
2017-08-30  7:05 ` [PATCH 21/39] pack: add repository argument to prepare_packed_git Jonathan Nieder
2017-08-30  7:06 ` [PATCH 22/39] pack: add repository argument to reprepare_packed_git Jonathan Nieder
2017-08-30  7:06 ` [PATCH 23/39] pack: add repository argument to sha1_file_name Jonathan Nieder
2017-08-30  7:07 ` [PATCH 24/39] pack: add repository argument to map_sha1_file Jonathan Nieder
2017-08-30  7:08 ` [PATCH 25/39] sha1_file: allow alt_odb_usable to handle arbitrary repositories Jonathan Nieder
2017-09-06 20:01   ` Junio C Hamano
2017-09-06 21:59     ` Stefan Beller
2017-08-30  7:10 ` [PATCH 26/39] object-store: allow prepare_alt_odb " Jonathan Nieder
2017-08-30  7:11 ` [PATCH 27/39] object-store: allow foreach_alt_odb " Jonathan Nieder
2017-08-30  7:11 ` [PATCH 28/39] pack: allow install_packed_git " Jonathan Nieder
2017-08-30  7:12 ` [PATCH 29/39] pack: allow rearrange_packed_git " Jonathan Nieder
2017-08-30  7:12 ` [PATCH 30/39] pack: allow prepare_packed_git_mru " Jonathan Nieder
2017-08-30  7:13 ` [PATCH 31/39] pack: allow prepare_packed_git_one " Jonathan Nieder
2017-08-30  7:13 ` [PATCH 32/39] pack: allow prepare_packed_git " Jonathan Nieder
2017-08-30  7:14 ` [PATCH 33/39] pack: allow reprepare_packed_git " Jonathan Nieder
2017-08-30  7:14 ` [PATCH 34/39] pack: allow sha1_file_name " Jonathan Nieder
2017-08-30  7:15 ` [PATCH 35/39] pack: allow stat_sha1_file " Jonathan Nieder
2017-08-30  7:16 ` [PATCH 36/39] pack: allow open_sha1_file " Jonathan Nieder
2017-08-30  7:16 ` [PATCH 37/39] pack: allow map_sha1_file_1 " Jonathan Nieder
2017-08-30  7:16 ` [PATCH 38/39] pack: allow map_sha1_file " Jonathan Nieder
2017-08-30  7:18 ` [PATCH 39/39] pack: allow sha1_loose_object_info " Jonathan Nieder
2017-08-30 23:07 ` [PATCH 00/39] per-repository object store, part 1 Brandon Williams

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply to all the recipients using the --to, --cc,
  and --in-reply-to switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170830064634.GA153983@aiede.mtv.corp.google.com \
    --to=jrnieder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=sandals@crustytoothpaste.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox