From: Nicolas Pitre <nico@fluxnic.net>
To: Avery Pennarun <apenwarr@gmail.com>
Cc: Dmitry Potapov <dpotapov@gmail.com>,
Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Zygo Blaxell <zblaxell@esightcorp.com>,
Ilari Liusvaara <ilari.liusvaara@elisanet.fi>,
Thomas Rast <trast@student.ethz.ch>,
Jonathan Nieder <jrnieder@gmail.com>,
git@vger.kernel.org
Subject: Re: [PATCH] don't use mmap() to hash files
Date: Mon, 15 Feb 2010 00:48:41 -0500 (EST)
Message-ID: <alpine.LFD.2.00.1002150016110.1946@xanadu.home>
In-Reply-To: <32541b131002142101i226663cfk90d1ba14f1031788@mail.gmail.com>
On Mon, 15 Feb 2010, Avery Pennarun wrote:
> - git-prune only prunes unpacked objects
>
> - git-repack claims to be willing to explode unreachable objects back
> into loose objects with -A, but I'm not quite sure if its definition
> of "unreachable" is the same as mine.
Unreachable means not referenced by the specified rev-list
specification. So if you give it --all --reflog then it means any
object that is not reachable through your branches, tags or
reflog entries.
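
To illustrate that definition, here is a minimal Python sketch that collects
the set of reachable object names by running git rev-list --objects --all
--reflog. It is not part of the original message; the function names
parse_rev_list_objects and reachable_objects and the repo parameter are
hypothetical, chosen only for this example.

```python
import subprocess


def parse_rev_list_objects(output):
    """Collect object names from `git rev-list --objects` output.

    Each non-empty line is either "<name>" (commits) or "<name> <path>"
    (trees and blobs); only the first field is kept.
    """
    names = set()
    for line in output.splitlines():
        fields = line.split(None, 1)
        if fields:
            names.add(fields[0])
    return names


def reachable_objects(repo="."):
    """Return every object reachable from all refs and reflog entries.

    `repo` is a hypothetical parameter: the path of the repository to inspect.
    """
    result = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all", "--reflog"],
        capture_output=True, check=True, text=True, errors="replace",
    )
    return parse_rev_list_objects(result.stdout)
```

Under that definition, any object name that is absent from the returned set
counts as unreachable.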
> And I'm not sure rewriting a
> pack with -A makes the old pack reliably unreachable according to -d.
Reachability doesn't apply to packs. That applies to objects. And
unreachable objects may be copied to loose objects with -A, or simply
forgotten about with -a. Then -d will literally delete the old pack
file.
> - there seems to be no documented situation in which you can ever
> delete unused objects from a pack without using repack -a or -A, which
> can be amazingly slow if your packs are huge. (Ideally you'd only
> repack the particular packs that you want to shrink.) For example, my
> bup repo is currently 200 GB.
Ideally you don't keep volatile objects in huge packs. That's why we
have .keep to flag those packs that are huge and pure, so as not to touch
them anymore.
Incremental repacking is there to gather only those _reachable_ loose
objects into a new pack. The objects that you're likely to make
unreachable are probably going to come from a temporary branch that you
deleted which is likely to affect objects only from that latest and
small pack.
And repacking can be done unattended and in parallel to normal Git
operations with no issues. So even if it is slow to repack huge packs,
it is something that you might do during the night and only once in a
while.
But if you really want to shrink only one pack without touching the
other packs, and you do know which objects have to be removed from that
pack, then it is trivial to write a small script using git-show-index,
sorting the output by offset, filtering out the unwanted objects, keeping
only the SHA1 column, and feeding the result into git-pack-objects. Oh,
and delete the original pack when done, of course. It is also trivial to
generate the list of all packed objects, compare it to the list of all
reachable objects, and prune from the packs those objects which are
not found in the reachable object list.
Nicolas
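
The script described in the last paragraph could be sketched in Python roughly
as follows. This is an illustration under stated assumptions, not the author's
own script: the names surviving_objects, shrink_pack, idx_path, unwanted and
new_pack_base are hypothetical, and the sketch assumes git show-index prints
one "offset name" pair per line (optionally followed by a CRC in parentheses).
Deleting the original pack is deliberately left as a manual step.

```python
import subprocess


def surviving_objects(show_index_output, unwanted):
    """Return the object names to keep, ordered by their offset in the pack."""
    entries = []
    for line in show_index_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        offset, name = int(fields[0]), fields[1]
        if name not in unwanted:
            entries.append((offset, name))
    entries.sort()  # sort by pack offset
    return [name for _, name in entries]


def shrink_pack(idx_path, unwanted, new_pack_base):
    """Write a new pack holding every object of the old pack except `unwanted`.

    `idx_path` is the .idx file of the pack to shrink and `new_pack_base` is
    the base name handed to git pack-objects (both hypothetical parameters).
    Must run inside the repository that owns the pack. Returns whatever
    git pack-objects prints on stdout.
    """
    with open(idx_path, "rb") as idx:
        listing = subprocess.run(
            ["git", "show-index"], stdin=idx,
            capture_output=True, check=True, text=True,
        ).stdout
    keep = surviving_objects(listing, unwanted)
    result = subprocess.run(
        ["git", "pack-objects", new_pack_base],
        input="\n".join(keep) + "\n",
        capture_output=True, check=True, text=True,
    )
    return result.stdout.strip()
```

The set passed as unwanted could be the packed object names that do not
appear in the list of reachable objects, which is the comparison the message
describes.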