git@vger.kernel.org mailing list mirror (one of many)
From: Christian Couder <christian.couder@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
	Duy Nguyen <pclouds@gmail.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Stefan Beller <sbeller@google.com>,
	Christian Couder <chriscool@tuxfamily.org>
Subject: [PATCH v2 0/6] Add delta islands support
Date: Sun,  5 Aug 2018 19:25:19 +0200
Message-ID: <20180805172525.15278-1-chriscool@tuxfamily.org>

This patch series upstreams work done by GitHub and available in:

https://github.com/peff/git/commits/jk/delta-islands

The above work has already been described in the following article:

https://githubengineering.com/counting-objects/

The above branch contains only one patch. In this patch series that
patch has been split into 5 patches (1/6 to 5/6), each with its own
commit message, and one new patch (6/6) has been added on top. The new
patch implements a change that was requested following the previous
iteration.

I kept Peff as the author of the first 5 patches and took the liberty
to add his Signed-off-by to them.

As explained in detail in the "Counting Objects" article referenced
above, the goal of the delta island mechanism is to let a hosting
provider have the "forks" of a repository share as much storage as
possible while preventing object packs from containing deltas between
different forks.

If deltas between different forks are not prevented, then when users
clone or fetch a fork, preparing the pack to send them can be very
costly CPU-wise: objects from a different fork must not be sent, so
many deltas might need to be computed again (instead of reusing
existing deltas).
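
To make the rule concrete, here is a minimal sketch (not code from this
series) of the kind of check that keeps deltas inside an island. It
assumes each object carries a set of the islands (forks) that can reach
it, modelled here as a hypothetical 32-bit mask named island_set; the
function names are made up for illustration. Under that assumption a
delta is only kept when every island that contains the target also
contains the base:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: bit i is set when the object is reachable from
 * island (fork) i. A real implementation would need more than 32 bits.
 */
typedef uint32_t island_set;

/* Return the set containing only island number `fork`. */
island_set island_bit(unsigned int fork)
{
	assert(fork < 32);
	return (island_set)1 << fork;
}

/* True when every island in `sub` is also present in `super`. */
bool island_is_subset(island_set sub, island_set super)
{
	return (sub & ~super) == 0;
}

/*
 * A delta is safe to keep if every fork that can reach the target can
 * also reach the base. Then the base is part of any pack that needs
 * the target, and the stored delta can be reused as-is.
 */
bool delta_allowed(island_set target, island_set base)
{
	return island_is_subset(target, base);
}
```

With this model, an object seen only by fork 1 may delta against a base
shared by forks 0 and 1, but a shared object may not delta against a
base that only fork 1 can reach.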


The following changes have been made since the previous iteration:

* suggested by Dscho: explain in the cover letter what the patches are
  all about

* suggested by Peff and Junio: improve the commit messages

* suggested by Junio: add comment before get_delta_base() in
  "packfile.h" in patch 1/6

* suggested by Duy: move 'pack.island' documentation (in
  "Documentation/config.txt") from patch 2/6 to patch 3/6

* suggested by Junio: improve pack.island documentation (in
  "Documentation/config.txt") to tell that it is an ERE in patch 3/6

* suggested by Peff: add doc about 'pack.islandCore' in patch 3/6

* suggested by Peff: add info about repacking with a big --window to
  avoid the delta window being clogged, in
  "Documentation/git-pack-objects.txt" in patch 3/6

* suggested by Duy: remove `#include "builtin.h"` from delta-islands.c
  in patch 2/6

* suggested by Duy: mark strings for translation in patch 2/6

* suggested by Peff: modernize code using ALLOC_ARRAY, QSORT() and
  free_tree_buffer() in patch 2/6

* suggested by Peff: use "respect islands during delta compression" as
  help text for --delta-islands in "builtin/pack-objects.c" in patch
  3/6

* suggested by Junio: improve documentation explaining how capture groups from
  the pack.island regexes are concatenated in
  Documentation/git-pack-objects.txt in patch 3/6

* suggested by Junio: add that only up to 7 capture groups are supported in
  the pack.island regexes in Documentation/git-pack-objects.txt in patch
  3/6

* suggested by Peff: move test script from the t99XX range to the t53XX range
  in patch 5/6

* suggested by Duy: move field 'tree_depth' from 'struct object_entry'
  to 'struct packing_data' in pack-objects.h in new patch 6/6
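
As an illustration of the pack.island items above (an ERE whose capture
groups are concatenated, with up to 7 groups), here is a rough sketch
using POSIX regcomp()/regexec(). It is not the code from
delta-islands.c: the helper name island_name_for_ref(), the '-'
separator between groups, and the "refs/virtual/..." layout in the
usage note are assumptions made for this example.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

#define MAX_GROUPS 7 /* limit mentioned in the changelog above */

/*
 * Hypothetical helper: match `ref` against the ERE `pattern` and write
 * the concatenated capture groups into `out`. Returns 0 on a match and
 * -1 on no match, bad pattern, or if the name does not fit in `out`.
 * The '-' separator is an assumption of this sketch.
 */
int island_name_for_ref(const char *pattern, const char *ref,
			char *out, size_t outlen)
{
	regex_t re;
	regmatch_t m[MAX_GROUPS + 1]; /* m[0] is the whole match */
	size_t used = 0;
	int i;

	assert(pattern && ref && out);
	if (outlen == 0 || regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	if (regexec(&re, ref, MAX_GROUPS + 1, m, 0)) {
		regfree(&re);
		return -1;
	}
	regfree(&re);

	out[0] = '\0';
	for (i = 1; i <= MAX_GROUPS; i++) {
		size_t len, need;

		if (m[i].rm_so == -1)
			continue; /* group unused or did not participate */
		len = (size_t)(m[i].rm_eo - m[i].rm_so);
		need = len + (used > 0 ? 1 : 0);
		if (used + need + 1 > outlen)
			return -1; /* name would not fit */
		if (used > 0)
			out[used++] = '-';
		memcpy(out + used, ref + m[i].rm_so, len);
		used += len;
		out[used] = '\0';
	}
	return 0;
}
```

For example, with the made-up pattern
"^refs/virtual/([0-9]+)/(heads|tags)/" the ref
"refs/virtual/42/heads/master" yields the name "42-heads" in this
sketch, and a pattern without capture groups yields an empty name.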


The following changes have been suggested in the previous iteration,
but have not been implemented:

* suggested by Peff: rename get_delta_base() in patch 1/6

I am not sure which name to use, especially as there are a number of
other functions static to "packfile.c" with a name that starts with
"get_delta_base" and they should probably be renamed too.

* suggested by Duy: move field 'layer' from 'struct object_entry' to 'struct
  packing_data' in pack-objects.h

I will respond in the original email about this.

* suggested by Peff: using FLEX_ALLOC_MEM() in island_bitmap_new() in
  patch 2/6

In his email Peff says that it would waste 4 bytes per struct, so it
is not worth it in my opinion.


This patch series is also available on GitHub in:

https://github.com/chriscool/git/commits/delta-islands

The previous version is available here:

https://github.com/chriscool/git/commits/delta-islands6
https://public-inbox.org/git/20180722054836.28935-1-chriscool@tuxfamily.org/

Christian Couder (1):
  pack-objects: move tree_depth into 'struct packing_data'

Jeff King (5):
  packfile: make get_delta_base() non static
  Add delta-islands.{c,h}
  pack-objects: add delta-islands support
  repack: add delta-islands support
  t: add t5319-delta-islands.sh

 Documentation/config.txt           |  19 ++
 Documentation/git-pack-objects.txt |  97 ++++++
 Documentation/git-repack.txt       |   5 +
 Makefile                           |   1 +
 builtin/pack-objects.c             | 142 ++++++---
 builtin/repack.c                   |   9 +
 delta-islands.c                    | 496 +++++++++++++++++++++++++++++
 delta-islands.h                    |  11 +
 pack-objects.h                     |   6 +
 packfile.c                         |  10 +-
 packfile.h                         |   7 +
 t/t5319-delta-islands.sh           | 143 +++++++++
 12 files changed, 900 insertions(+), 46 deletions(-)
 create mode 100644 delta-islands.c
 create mode 100644 delta-islands.h
 create mode 100755 t/t5319-delta-islands.sh

-- 
2.18.0.327.ga7d188ab43

