From: Junio C Hamano
To: René Scharfe
Cc: Johannes Schindelin via GitGitGadget, git@vger.kernel.org,
    Johannes Schindelin
Subject: Re: [PATCH] import-tars: ignore the global PAX header
Date: Mon, 23 Mar 2020 10:41:20 -0700

René Scharfe writes:

> Am 23.03.20 um 14:08 schrieb Johannes Schindelin via GitGitGadget:
>> From: Johannes Schindelin
>>
>> Git's own `git archive` inserts that header, but it often gets into the
>> way of `import-tars.perl` e.g. when a prefix was specified (for example
>> via `--prefix=my-project-1.0.0/`, or when downloading a `.tar.gz` from
>> GitHub releases): this prefix _should_ be stripped.
>>
>> Let's just skip it.
>
> git archive uses a global pax header to pass the ID of the archived
> commit as a comment, and for mtime values after 2242-03-16.  Ignoring it
> in a simple importer seems reasonable for now, but I don't understand
> how this relates to prefixes.  Is it because the header is treated as a
> regular file with the full path "pax_global_header" (independently from
> any prefix for actual files) and can thus be placed outside the expected
> destination directory?

Thanks for asking the question, as I was also curious whether we are
throwing away too much (perhaps "prefix is given as a global pax
header, and ignoring all global pax headers is the most expedient way"
was the reason the patch was written that way?).

I agree with you that for the purposes of a simple-minded importer, it
is probably acceptable to take such a short-cut, but it would help
future developers if we clearly documented that it is a short-cut that
throws away too much.  That would welcome their effort to enhance the
importer, if they find it more useful to keep some other information
found in global headers, without breaking the intent of this change.

Having said all that, even before "git archive" existed, release
tarballs from many projects had a leading prefix so that extracting a
tarball would create a versioned directory.  To truly help users of
the importer, doesn't the logic that lets the user say "please strip
one leading level of directory from all the tarballs I feed you, as I
know they are versioned directories" belong in a command line option
of the importer?
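
For illustration only, here is a minimal Python sketch of the two ideas
discussed in this thread: skipping a global pax header (tar typeflag
'g') and stripping a fixed number of leading directory levels.  It is
not code from the patch or from import-tars.perl.  The names
strip_components() and list_entries() and the "strip" parameter are
hypothetical, and the sketch assumes plain ustar-style headers: it
ignores the ustar prefix field, per-file 'x' extended headers and
checksums.

```python
BLOCK = 512  # tar archives are a sequence of 512-byte blocks


def strip_components(path, count):
    """Drop `count` leading directory levels from `path`.

    Returns None when nothing is left after stripping (hypothetical helper).
    """
    parts = [p for p in path.split("/") if p not in ("", ".")]
    if len(parts) <= count:
        return None
    return "/".join(parts[count:])


def list_entries(data, strip=0):
    """Return [(name, size), ...] for the entries in raw tar bytes `data`.

    Entries whose typeflag is 'g' (global pax header) are skipped, so
    they are never mistaken for a regular file. `strip` is a hypothetical
    option: the number of leading directory levels to remove.
    """
    entries = []
    offset = 0
    while offset + BLOCK <= len(data):
        header = data[offset:offset + BLOCK]
        if header == b"\0" * BLOCK:
            break  # an all-zero block marks the end of the archive
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8")
        size_field = header[124:136].rstrip(b"\0 ").decode("ascii")
        size = int(size_field, 8) if size_field else 0
        typeflag = header[156:157]
        # Advance past this header and its payload, padded to a full block.
        offset += BLOCK + ((size + BLOCK - 1) // BLOCK) * BLOCK
        if typeflag == b"g":
            continue  # global pax header: ignore it wholesale
        stripped = strip_components(name, strip)
        if stripped is not None:
            entries.append((stripped, size))
    return entries
```

With strip=0 and without the typeflag check, an archive produced with
a prefix would list "pax_global_header" next to the versioned
directory, which is the situation René describes.  With the check in
place and strip=1, only the paths below the versioned directory
remain.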