From mboxrd@z Thu Jan  1 00:00:00 1970
From: Junio C Hamano
Subject: Re: Subject: [PATCH] git-merge-pack
Date: Thu, 06 Sep 2007 21:43:26 -0700
Message-ID: <7vwsv36q6p.fsf@gitster.siamese.dyndns.org>
References: <20070905074206.GA31750@artemis.corp>
	<87odgh0zn6.fsf@hades.wkstn.nix>
	<46DEF1FA.4050500@midwinter.com>
	<877in50y7p.fsf@hades.wkstn.nix>
	<7vr6lcj2zi.fsf@gitster.siamese.dyndns.org>
	<7vk5r3adlx.fsf@gitster.siamese.dyndns.org>
	<7v1wdb9ymf.fsf_-_@gitster.siamese.dyndns.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Linus Torvalds, Johannes Schindelin, Nix, Steven Grimm,
	Git Mailing List
To: Nicolas Pitre
X-From: git-owner@vger.kernel.org Fri Sep 07 06:43:49 2007
Return-path:
Envelope-to: gcvg-git@gmane.org
Received: from vger.kernel.org ([209.132.176.167])
	by lo.gmane.org with esmtp (Exim 4.50)
	id 1ITVgz-0001Tw-65
	for gcvg-git@gmane.org; Fri, 07 Sep 2007 06:43:41 +0200
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751316AbXIGEng (ORCPT ); Fri, 7 Sep 2007 00:43:36 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751525AbXIGEng (ORCPT ); Fri, 7 Sep 2007 00:43:36 -0400
Received: from rune.sasl.smtp.pobox.com ([208.210.124.37]:37207
	"EHLO sasl.smtp.pobox.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751434AbXIGEnf (ORCPT );
	Fri, 7 Sep 2007 00:43:35 -0400
Received: from pobox.com (ip68-225-240-77.oc.oc.cox.net [68.225.240.77])
	(using TLSv1 with cipher AES128-SHA (128/128 bits))
	(No client certificate requested)
	by rune.sasl.smtp.pobox.com (Postfix) with ESMTP id F2D3812E1E5;
	Fri, 7 Sep 2007 00:43:49 -0400 (EDT)
In-Reply-To: (Nicolas Pitre's message of "Thu, 06 Sep 2007 20:51:58 -0400 (EDT)")
User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (gnu/linux)
Sender: git-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Archived-At:

Nicolas Pitre writes:

> I would have concatenated all packs provided on the command line into a
> single one, simply by
> reading data from existing packs and writing it
> back without any processing at all.  The offset for OBJ_OFS_DELTA is
> relative so a simple concatenation will just work.

As I was planning to do this outside of pack-objects, I did not
want to write something that intimately knows the details of the
packfile format, but see below.

> All data is read once and written once making it no more costly than a
> simple file copy.  On the flip side it wouldn't get rid of duplicated
> objects (I don't know if that matters i.e. if something might break with
> the same object twice in a pack).

I do not think duplicates create problems, as long as the pack
idx remains sane.  But a bigger issue is for people who fetch
over dumb protocols, from a repository that repacks with "-a -d"
every once in a while.  There, many duplicates are the norm.

> In fact, since we want to _also_ perform a repack of loose objects in
> the context of automatic repacking, I wonder why we wouldn't use that
> --unpacked= argument to also repack smallish packs at the same time in
> only one pack-objects pass.  Or maybe I'm missing something?

I think this is a much better idea.  You obviously need some
twist to pack-objects, and, being lazy, that was the reason I
did not want to do this that way.

When a new parameter, perhaps --lossless, is given, together
with the --unpacked= parameters, we can change pack-objects to
iterate over all objects in the --unpacked= packs, and add the
ones that are not yet marked for inclusion to the set of objects
to be packed, after doing the usual "objects to be packed"
discovery.

I am not sure --lossless is a good option name from a marketing
point of view, though.
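
To make the selection step described above a bit more concrete,
here is a rough sketch in Python.  It is only an illustration, not
a patch against pack-objects: the function name objects_to_pack and
the idea of passing each --unpacked= pack as a plain list of object
names are assumptions made up for this example.  Real code would
walk the pack idx files instead.

```python
def objects_to_pack(discovered, unpacked_packs):
    """Hypothetical sketch of the proposed --lossless selection.

    discovered:     object names found by the usual "objects to be
                    packed" discovery, in discovery order.
    unpacked_packs: dict mapping a pack name to the list of object
                    names stored in that pack (an assumed stand-in
                    for reading the pack idx).
    """
    # Start from the usual discovery result, dropping repeats while
    # keeping the original order.
    to_pack = list(dict.fromkeys(discovered))
    marked = set(to_pack)

    # Then sweep every object in the named packs and add the ones
    # that are not yet marked for inclusion.  Each object ends up in
    # the result exactly once, so duplicates across packs go away.
    for pack_name in sorted(unpacked_packs):
        for oid in unpacked_packs[pack_name]:
            if oid not in marked:
                marked.add(oid)
                to_pack.append(oid)
    return to_pack


if __name__ == "__main__":
    packs = {"pack-1": ["b", "c"], "pack-2": ["c", "d", "a"]}
    print(objects_to_pack(["a", "b"], packs))
```

The point of the sweep is that nothing stored in the small packs
is lost even when it is unreachable from the usual discovery, while
an object that appears in several packs is still written only once.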