git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Cc: git <git@vger.kernel.org>, "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Thomas Gummerer" <t.gummerer@gmail.com>,
	"Christian Couder" <christian.couder@gmail.com>
Subject: Re: [RFC PATCH v3 1/5] clone: test for our behavior on odd objects/* content
Date: Fri, 01 Mar 2019 14:49:50 +0100	[thread overview]
Message-ID: <87o96uvh4x.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <CAHd-oW71N94vBvN4wYQVzXyUZjvVs3Ca9tM3VUEuqB66YNGDtA@mail.gmail.com>


On Thu, Feb 28 2019, Matheus Tavares Bernardino wrote:

> Hi, Ævar
>
> I'm finishing the required changes in this series to send a v4, but
> when submitting to travis ci, I got some errors on the
> t5604-clone-reference test:
> https://travis-ci.org/MatheusBernardino/git/builds/500007587

I don't have access to an OSX box, but could reproduce the failure on
NetBSD.

It's because there link() when faced with a symlink behaves
differently. On GNU/Linux link()-ing a symlink will produce another
symlink like it, on NetBSD (and presumably OSX) doing that will produce
a hardlink to the file the existing symlink points to.

I've pushed out a version of mine here which you might want to pull in:
https://github.com/git/git/compare/master...avar:clone-dir-iterator-3

I.e. this whole thing is silly, but just preserving the notion that
we're not going to introduce behavior changes as we're refactoring.

So it adds a commit right after the tests I added to detect this case,
and use symlink() or link() as appropriate instead of link().

There's then a commit at the end you might want to squash in that
reproduces this behavior on top of your iterator refactoring.

Of course the DIR_ITERATOR_FOLLOW_SYMLINKS flag at this point is rather
silly. We're telling it to stat(), and then end up needing both stat()
and lstat() data.

I'm starting to think that this interface which previously only had one
caller, but now has two exists at the wrong abstraction level. I.e. it
itself needs to call lstat(). Seems sensible to always do that and leave
it to the caller to call stat() if they need, as I believe Duy pointed
out. Also noticed that dir-iterator.h still has a comment to the effect
that it'll call "lstat()", even though we now have a "stat() or
lstat()?" flag.

> On Tue, Feb 26, 2019 at 9:28 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> Add tests for what happens when we locally clone .git/objects
>> directories where some of the loose objects or packs are symlinked, or
>> when when there's unknown files there.
>>
>> I'm bending over backwards here to avoid a SHA1 dependency. See [1]
>> for an earlier and simpler version that hardcoded a SHA-1s.
>>
>> This behavior has been the same for a *long* time, but hasn't been
>> tested for.
>>
>> There's a good post-hoc argument to be made for copying over unknown
>> things, e.g. I'd like a git version that doesn't know about the
>> commit-graph to copy it under "clone --local" so a newer git version
>> can make use of it.
>>
>> But the behavior showed where with symlinks seems pretty
>> random. E.g. if "pack" is a symlink we end up with two copies of the
>> contents, and only transfer some symlinks as-is.
>>
>> In follow-up commits we'll look at changing some of this behavior, but
>> for now let's just assert it as-is so we'll notice what we'll change
>> later.
>>
>> 1. https://public-inbox.org/git/20190226002625.13022-5-avarab@gmail.com/
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  t/t5604-clone-reference.sh | 142 +++++++++++++++++++++++++++++++++++++
>>  1 file changed, 142 insertions(+)
>>
>> diff --git a/t/t5604-clone-reference.sh b/t/t5604-clone-reference.sh
>> index 4320082b1b..cb0dc22d14 100755
>> --- a/t/t5604-clone-reference.sh
>> +++ b/t/t5604-clone-reference.sh
>> @@ -221,4 +221,146 @@ test_expect_success 'clone, dissociate from alternates' '
>>         ( cd C && git fsck )
>>  '
>>
>> +test_expect_success 'setup repo with garbage in objects/*' '
>> +       git init S &&
>> +       (
>> +               cd S &&
>> +               test_commit A &&
>> +
>> +               cd .git/objects &&
>> +               >.some-hidden-file &&
>> +               >some-file &&
>> +               mkdir .some-hidden-dir &&
>> +               >.some-hidden-dir/some-file &&
>> +               >.some-hidden-dir/.some-dot-file &&
>> +               mkdir some-dir &&
>> +               >some-dir/some-file &&
>> +               >some-dir/.some-dot-file
>> +       )
>> +'
>> +
>> +test_expect_success 'clone a repo with garbage in objects/*' '
>> +       for option in --local --no-hardlinks --shared --dissociate
>> +       do
>> +               git clone $option S S$option || return 1 &&
>> +               git -C S$option fsck || return 1
>> +       done &&
>> +       find S-* -name "*some*" | sort >actual &&
>> +       cat >expected <<-EOF &&
>> +       S--dissociate/.git/objects/.some-hidden-file
>> +       S--dissociate/.git/objects/some-dir
>> +       S--dissociate/.git/objects/some-dir/.some-dot-file
>> +       S--dissociate/.git/objects/some-dir/some-file
>> +       S--dissociate/.git/objects/some-file
>> +       S--local/.git/objects/.some-hidden-file
>> +       S--local/.git/objects/some-dir
>> +       S--local/.git/objects/some-dir/.some-dot-file
>> +       S--local/.git/objects/some-dir/some-file
>> +       S--local/.git/objects/some-file
>> +       S--no-hardlinks/.git/objects/.some-hidden-file
>> +       S--no-hardlinks/.git/objects/some-dir
>> +       S--no-hardlinks/.git/objects/some-dir/.some-dot-file
>> +       S--no-hardlinks/.git/objects/some-dir/some-file
>> +       S--no-hardlinks/.git/objects/some-file
>> +       EOF
>> +       test_cmp expected actual
>> +'
>> +
>> +test_expect_success SYMLINKS 'setup repo with manually symlinked objects/*' '
>> +       git init T &&
>> +       (
>> +               cd T &&
>> +               test_commit A &&
>> +               git gc &&
>> +               (
>> +                       cd .git/objects &&
>> +                       mv pack packs &&
>> +                       ln -s packs pack
>> +               ) &&
>> +               test_commit B &&
>> +               (
>> +                       cd .git/objects &&
>> +                       find ?? -type d >loose-dirs &&
>> +                       last_loose=$(tail -n 1 loose-dirs) &&
>> +                       mv $last_loose a-loose-dir &&
>> +                       ln -s a-loose-dir $last_loose &&
>> +                       first_loose=$(head -n 1 loose-dirs) &&
>> +                       (
>> +                               cd $first_loose &&
>> +                               obj=$(ls *) &&
>> +                               mv $obj ../an-object &&
>> +                               ln -s ../an-object $obj
>> +                       ) &&
>> +                       find . -type f | sort >../../../T.objects-files.raw &&
>> +                       find . -type l | sort >../../../T.objects-links.raw
>> +               )
>> +       ) &&
>> +       git -C T fsck &&
>> +       git -C T rev-list --all --objects >T.objects
>> +'
>> +
>> +
>> +test_expect_success SYMLINKS 'clone repo with symlinked objects/*' '
>> +       for option in --local --no-hardlinks --shared --dissociate
>> +       do
>> +               git clone $option T T$option || return 1 &&
>> +               git -C T$option fsck || return 1 &&
>> +               git -C T$option rev-list --all --objects >T$option.objects &&
>> +               test_cmp T.objects T$option.objects &&
>> +               (
>> +                       cd T$option/.git/objects &&
>> +                       find . -type f | sort >../../../T$option.objects-files.raw &&
>> +                       find . -type l | sort >../../../T$option.objects-links.raw
>> +               )
>> +       done &&
>> +
>> +       for raw in $(ls T*.raw)
>> +       do
>> +               sed -e "s!/..\$!/X!; s!/../!/Y/!; s![0-9a-f]\{38,\}!Z!" <$raw >$raw.de-sha || return 1
>> +       done &&
>> +
>> +       cat >expected-files <<-EOF &&
>> +       ./Y/Z
>> +       ./a-loose-dir/Z
>> +       ./an-object
>> +       ./Y/Z
>> +       ./info/packs
>> +       ./loose-dirs
>> +       ./pack/pack-Z.idx
>> +       ./pack/pack-Z.pack
>> +       ./packs/pack-Z.idx
>> +       ./packs/pack-Z.pack
>> +       EOF
>> +       cat >expected-links <<-EOF &&
>> +       ./Y/Z
>> +       EOF
>> +       for option in --local --dissociate
>> +       do
>> +               test_cmp expected-files T$option.objects-files.raw.de-sha || return 1 &&
>> +               test_cmp expected-links T$option.objects-links.raw.de-sha || return 1
>> +       done &&
>> +
>> +       cat >expected-files <<-EOF &&
>> +       ./Y/Z
>> +       ./Y/Z
>> +       ./a-loose-dir/Z
>> +       ./an-object
>> +       ./Y/Z
>> +       ./info/packs
>> +       ./loose-dirs
>> +       ./pack/pack-Z.idx
>> +       ./pack/pack-Z.pack
>> +       ./packs/pack-Z.idx
>> +       ./packs/pack-Z.pack
>> +       EOF
>> +       test_cmp expected-files T--no-hardlinks.objects-files.raw.de-sha &&
>> +       test_must_be_empty T--no-hardlinks.objects-links.raw.de-sha &&
>> +
>> +       cat >expected-files <<-EOF &&
>> +       ./info/alternates
>> +       EOF
>> +       test_cmp expected-files T--shared.objects-files.raw &&
>> +       test_must_be_empty T--shared.objects-links.raw
>> +'
>> +
>>  test_done
>> --
>> 2.21.0.rc2.261.ga7da99ff1b
>>

  reply	other threads:[~2019-03-01 13:49 UTC|newest]

Thread overview: 127+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-26  5:17 [WIP RFC PATCH v2 0/5] clone: dir iterator refactoring with tests Matheus Tavares
2019-02-26  5:18 ` [WIP RFC PATCH v2 1/5] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-02-26 12:01   ` Duy Nguyen
2019-02-27 13:59     ` Matheus Tavares Bernardino
2019-02-26  5:18 ` [WIP RFC PATCH v2 2/5] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-02-26  5:18 ` [WIP RFC PATCH v2 3/5] clone: copy hidden paths at local clone Matheus Tavares
2019-02-26 12:13   ` Duy Nguyen
2019-02-26  5:18 ` [WIP RFC PATCH v2 4/5] clone: extract function from copy_or_link_directory Matheus Tavares
2019-02-26 12:18   ` Duy Nguyen
2019-02-27 17:30     ` Matheus Tavares Bernardino
2019-02-27 22:45       ` Thomas Gummerer
2019-02-27 22:50         ` Matheus Tavares Bernardino
2019-02-26  5:18 ` [WIP RFC PATCH v2 5/5] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-02-26 11:35   ` Ævar Arnfjörð Bjarmason
2019-02-26 12:32   ` Duy Nguyen
2019-02-26 12:50     ` Ævar Arnfjörð Bjarmason
2019-02-27 17:40     ` Matheus Tavares Bernardino
2019-02-28  7:13       ` Duy Nguyen
2019-02-28  7:53       ` Ævar Arnfjörð Bjarmason
2019-02-26 11:36 ` [WIP RFC PATCH v2 0/5] clone: dir iterator refactoring with tests Ævar Arnfjörð Bjarmason
2019-02-26 12:20   ` Duy Nguyen
2019-02-26 12:28 ` [RFC PATCH v3 " Ævar Arnfjörð Bjarmason
2019-02-26 20:56   ` Matheus Tavares Bernardino
2019-03-22 23:22   ` [GSoC][PATCH v4 0/7] clone: dir-iterator " Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 1/7] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-03-24 18:09       ` Matheus Tavares Bernardino
2019-03-24 20:56       ` SZEDER Gábor
2019-03-26 19:43         ` Matheus Tavares Bernardino
2019-03-28 21:49       ` Thomas Gummerer
2019-03-29 14:06         ` Matheus Tavares Bernardino
2019-03-29 19:31           ` Thomas Gummerer
2019-03-29 19:42             ` SZEDER Gábor
2019-03-30  2:49             ` Matheus Tavares Bernardino
2019-03-22 23:22     ` [GSoC][PATCH v4 2/7] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-03-28 22:10       ` Thomas Gummerer
2019-03-29  8:38         ` Ævar Arnfjörð Bjarmason
2019-03-29 20:15           ` Thomas Gummerer
2019-03-29 14:27         ` Matheus Tavares Bernardino
2019-03-29 20:05           ` Thomas Gummerer
2019-03-30  5:32             ` Matheus Tavares Bernardino
2019-03-30 19:27               ` Thomas Gummerer
2019-04-01  3:56                 ` Matheus Tavares Bernardino
2019-03-29 15:40         ` Johannes Schindelin
2019-03-22 23:22     ` [GSoC][PATCH v4 3/7] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-03-28 22:19       ` Thomas Gummerer
2019-03-29 13:16         ` Matheus Tavares Bernardino
2019-03-22 23:22     ` [GSoC][PATCH v4 4/7] clone: copy hidden paths at local clone Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 5/7] clone: extract function from copy_or_link_directory Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 6/7] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 7/7] clone: Replace strcmp by fspathcmp Matheus Tavares
2019-03-30 22:49     ` [GSoC][PATCH v5 0/7] clone: dir-iterator refactoring with tests Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 1/7] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 2/7] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-03-31 17:40         ` Thomas Gummerer
2019-04-01  3:59           ` Matheus Tavares Bernardino
2019-03-30 22:49       ` [GSoC][PATCH v5 3/7] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-03-31 18:12         ` Thomas Gummerer
2019-04-10 20:24           ` Matheus Tavares Bernardino
2019-04-11 21:09             ` Thomas Gummerer
2019-04-23 17:07               ` Matheus Tavares Bernardino
2019-04-24 18:36                 ` Thomas Gummerer
2019-04-26  4:13                   ` Matheus Tavares Bernardino
2019-03-30 22:49       ` [GSoC][PATCH v5 4/7] clone: copy hidden paths at local clone Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 5/7] clone: extract function from copy_or_link_directory Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 6/7] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 7/7] clone: replace strcmp by fspathcmp Matheus Tavares
2019-03-31 18:16       ` [GSoC][PATCH v5 0/7] clone: dir-iterator refactoring with tests Thomas Gummerer
2019-04-01 13:56         ` Matheus Tavares Bernardino
2019-05-02 14:48       ` [GSoC][PATCH v6 00/10] " Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 01/10] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 02/10] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 03/10] dir-iterator: add tests for dir-iterator API Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 04/10] dir-iterator: use warning_errno when possible Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 05/10] dir-iterator: refactor state machine model Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 06/10] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 07/10] clone: copy hidden paths at local clone Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 08/10] clone: extract function from copy_or_link_directory Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 09/10] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 10/10] clone: replace strcmp by fspathcmp Matheus Tavares
2019-06-18 23:27         ` [GSoC][PATCH v7 00/10] clone: dir-iterator refactoring with tests Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 01/10] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 02/10] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 03/10] dir-iterator: add tests for dir-iterator API Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 04/10] dir-iterator: use warning_errno when possible Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 05/10] dir-iterator: refactor state machine model Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 06/10] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-06-25 18:00             ` Junio C Hamano
2019-06-25 18:11               ` Matheus Tavares Bernardino
2019-06-26 13:34             ` Johannes Schindelin
2019-06-26 18:04               ` Junio C Hamano
2019-06-27  9:20                 ` Duy Nguyen
2019-06-27 17:23                 ` Matheus Tavares Bernardino
2019-06-27 18:48                   ` Johannes Schindelin
2019-06-27 19:33                     ` Matheus Tavares Bernardino
2019-06-28 12:51                       ` Johannes Schindelin
2019-06-28 14:16                         ` Matheus Tavares Bernardino
2019-07-01 12:15                           ` Johannes Schindelin
2019-07-03  8:57             ` SZEDER Gábor
2019-07-08 22:21               ` Matheus Tavares Bernardino
2019-06-18 23:27           ` [GSoC][PATCH v7 07/10] clone: copy hidden paths at local clone Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 08/10] clone: extract function from copy_or_link_directory Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 09/10] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 10/10] clone: replace strcmp by fspathcmp Matheus Tavares
2019-06-19  4:36           ` [GSoC][PATCH v7 00/10] clone: dir-iterator refactoring with tests Matheus Tavares Bernardino
2019-06-20 20:18           ` Junio C Hamano
2019-06-21 13:41             ` Matheus Tavares Bernardino
2019-07-10 23:58           ` [GSoC][PATCH v8 " Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 01/10] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 02/10] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 03/10] dir-iterator: add tests for dir-iterator API Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 04/10] dir-iterator: use warning_errno when possible Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 05/10] dir-iterator: refactor state machine model Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 06/10] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 07/10] clone: copy hidden paths at local clone Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 08/10] clone: extract function from copy_or_link_directory Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 09/10] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 10/10] clone: replace strcmp by fspathcmp Matheus Tavares
2019-07-11 11:56             ` [GSoC][PATCH v8 00/10] clone: dir-iterator refactoring with tests Johannes Schindelin
2019-07-11 15:24               ` Matheus Tavares Bernardino
2019-02-26 12:28 ` [RFC PATCH v3 1/5] clone: test for our behavior on odd objects/* content Ævar Arnfjörð Bjarmason
2019-02-28 21:19   ` Matheus Tavares Bernardino
2019-03-01 13:49     ` Ævar Arnfjörð Bjarmason [this message]
2019-03-13  3:17       ` Matheus Tavares
2019-02-26 12:28 ` [RFC PATCH v3 2/5] dir-iterator: add flags parameter to dir_iterator_begin Ævar Arnfjörð Bjarmason
2019-02-26 12:28 ` [RFC PATCH v3 3/5] clone: copy hidden paths at local clone Ævar Arnfjörð Bjarmason
2019-02-26 12:28 ` [RFC PATCH v3 4/5] clone: extract function from copy_or_link_directory Ævar Arnfjörð Bjarmason
2019-02-26 12:28 ` [RFC PATCH v3 5/5] clone: use dir-iterator to avoid explicit dir traversal Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o96uvh4x.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=matheus.bernardino@usp.br \
    --cc=pclouds@gmail.com \
    --cc=t.gummerer@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).