From: Christian Couder <christian.couder@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
Duy Nguyen <pclouds@gmail.com>,
Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Stefan Beller <sbeller@google.com>,
Christian Couder <chriscool@tuxfamily.org>
Subject: [PATCH v2 0/6] Add delta islands support
Date: Sun, 5 Aug 2018 19:25:19 +0200
Message-ID: <20180805172525.15278-1-chriscool@tuxfamily.org>
This patch series is upstreaming work made by GitHub and available in:
https://github.com/peff/git/commits/jk/delta-islands
The above work has already been described in the following article:
https://githubengineering.com/counting-objects/
The above branch contains only one patch. In this patch series that
patch has been split into 5 patches (1/6 to 5/6), each with its own
commit message, and one more patch (6/6) has been added on top. Patch
6/6 implements something that was requested following the previous
iteration.
I kept Peff as the author of the first 5 patches and took the liberty
to add his Signed-off-by to them.
As explained in detail in the Counting Objects article referenced
above, the goal of the delta island mechanism is to let a hosting
provider have the "forks" of a repository share as much storage as
possible while preventing object packs from containing deltas between
different forks.
If deltas between different forks are not prevented, then when users
clone or fetch a fork, preparing the pack to send to them can be very
costly CPU-wise: objects from a different fork must not be sent, so
a lot of deltas might need to be computed again (instead of reusing
existing deltas).
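To make the constraint concrete, here is a minimal sketch in Python.
It is not code from this series. It assumes the rule described in the
series' documentation: an object may be stored as a delta only against
a base that is present in all of the object's islands. The function
name can_delta_against() and the "fork-a"/"fork-b" island labels are
hypothetical.

```python
def can_delta_against(target_islands, base_islands):
    """Return True if an object may be stored as a delta against a base.

    Assumed rule: the base must be present in every island that
    contains the target, i.e. the target's islands are a subset of
    the base's islands.
    """
    return set(target_islands) <= set(base_islands)


# Hypothetical objects: one reachable from both forks, one from fork-a only.
shared = {"fork-a", "fork-b"}
only_a = {"fork-a"}

# Anyone who fetches fork-a also receives the shared object,
# so a delta against it can be reused as-is.
print(can_delta_against(only_a, shared))  # True

# Someone who fetches fork-b never receives the fork-a-only object,
# so a delta against it would have to be recomputed.
print(can_delta_against(shared, only_a))  # False
```

With this rule a pack may be slightly larger, because some delta
opportunities are skipped, but serving a single fork never requires
undoing a delta that crosses an island boundary.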
The following changes have been made since the previous iteration:
* suggested by Dscho: explain in the cover letter what the patches are
all about
* suggested by Peff and Junio: improve the commit messages
* suggested by Junio: add comment before get_delta_base() in
"packfile.h" in patch 1/6
* suggested by Duy: move 'pack.island' documentation (in
"Documentation/config.txt") from patch 2/6 to patch 3/6
* suggested by Junio: improve pack.island documentation (in
"Documentation/config.txt") to tell that it is an ERE in patch 3/6
* suggested by Peff: add doc about 'pack.islandCore' in patch 3/6
* suggested by Peff: add info about repacking with a big --window to
avoid the delta window being clogged, in
"Documentation/git-pack-objects.txt" in patch 3/6
* suggested by Duy: remove `#include "builtin.h"` from delta-islands.c
in patch 2/6
* suggested by Duy: mark strings for translation in patch 2/6
* suggested by Peff: modernize code using ALLOC_ARRAY, QSORT() and
free_tree_buffer() in patch 2/6
* suggested by Peff: use "respect islands during delta compression" as
help text for --delta-islands in "builtin/pack-objects.c" in patch
3/6
* suggested by Junio: improve documentation explaining how capture groups from
the pack.island regexes are concatenated in
Documentation/git-pack-objects.txt in patch 3/6
* suggested by Junio: add that only up to 7 capture groups are supported in
the pack.island regexes in Documentation/git-pack-objects.txt in patch
3/6
* suggested by Peff: move test script from the t99XX range to the t53XX range
in patch 5/6
* suggested by Duy: move field 'tree_depth' from 'struct object_entry'
to 'struct packing_data' in pack-objects.h in new patch 6/6
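The capture-group behavior mentioned in the pack.island items above
can be sketched in Python. This is only an illustration, not the C
code from patch 2/6, and Python's "re" module stands in for POSIX ERE.
It assumes, as the documentation in patch 3/6 describes, that each
regex is anchored to the beginning of the ref name and that the island
name is the concatenation of the capture groups joined by '-'. The
function name island_name() and the refs/virtual/ layout are
hypothetical, and raising an error past 7 groups is this sketch's own
choice.

```python
import re

MAX_CAPTURE_GROUPS = 7  # limit stated in the changelog above


def island_name(pattern, refname):
    """Return the island name for refname, or None if it is in no island."""
    regex = re.compile(pattern)
    if regex.groups > MAX_CAPTURE_GROUPS:
        # Sketch-only behavior; the real code may react differently.
        raise ValueError("too many capture groups")
    match = regex.match(refname)  # match() anchors at the start of the ref
    if match is None:
        return None
    # Join the groups that took part in the match with '-'.
    return "-".join(g for g in match.groups() if g is not None)


print(island_name(r"refs/virtual/([0-9]+)/heads/",
                  "refs/virtual/1234/heads/master"))   # 1234
print(island_name(r"refs/virtual/([0-9]+)/(pull)/",
                  "refs/virtual/1234/pull/7"))         # 1234-pull
print(repr(island_name(r"refs/heads/", "refs/heads/master")))  # ''
print(island_name(r"refs/heads/", "refs/tags/v1.0"))   # None
```

A regex without capture groups yields the empty string as the island
name, so all refs matching such regexes end up in one unnamed island.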
The following changes have been suggested in the previous iteration,
but have not been implemented:
* suggested by Peff: rename get_delta_base() in patch 1/6
I am not sure which name to use, especially as there are a number of
other functions static to "packfile.c" with a name that starts with
"get_delta_base" and they should probably be renamed too.
* suggested by Duy: move field 'layer' from 'struct object_entry' to 'struct
packing_data' in pack-objects.h
I will respond to the original email about this.
* suggested by Peff: use FLEX_ALLOC_MEM() in island_bitmap_new() in
patch 2/6
In his email Peff says that'd waste 4 bytes per struct, so it's not
worth it in my opinion.
This patch series is also available on GitHub in:
https://github.com/chriscool/git/commits/delta-islands
The previous version is available here:
https://github.com/chriscool/git/commits/delta-islands6
https://public-inbox.org/git/20180722054836.28935-1-chriscool@tuxfamily.org/
Christian Couder (1):
  pack-objects: move tree_depth into 'struct packing_data'

Jeff King (5):
  packfile: make get_delta_base() non static
  Add delta-islands.{c,h}
  pack-objects: add delta-islands support
  repack: add delta-islands support
  t: add t5319-delta-islands.sh

Documentation/config.txt | 19 ++
Documentation/git-pack-objects.txt | 97 ++++++
Documentation/git-repack.txt | 5 +
Makefile | 1 +
builtin/pack-objects.c | 142 ++++++---
builtin/repack.c | 9 +
delta-islands.c | 496 +++++++++++++++++++++++++++++
delta-islands.h | 11 +
pack-objects.h | 6 +
packfile.c | 10 +-
packfile.h | 7 +
t/t5319-delta-islands.sh | 143 +++++++++
12 files changed, 900 insertions(+), 46 deletions(-)
create mode 100644 delta-islands.c
create mode 100644 delta-islands.h
create mode 100755 t/t5319-delta-islands.sh
--
2.18.0.327.ga7d188ab43