git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, peff@peff.net, avarab@gmail.com,
	jrnieder@gmail.com, Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v4 3/6] pack-objects: add --sparse option
Date: Tue, 15 Jan 2019 10:06:53 -0500
Message-ID: <8c50e066-b132-cf6e-ed0f-ce843cf3a634@gmail.com> (raw)
In-Reply-To: <xmqqmuo6yghg.fsf@gitster-ct.c.googlers.com>

On 1/11/2019 5:30 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> Add a '--sparse' option flag to the pack-objects builtin. This
>> allows the user to specify that they want to use the new logic
>> for walking trees. This logic currently does not differ from the
>> existing output, but will in a later change.
>>
>> Create a new test script, t5322-pack-objects-sparse.sh, to ensure
>> the object list that is selected matches what we expect. When we
>> update the logic to walk in a sparse fashion, the final test will
>> be updated to show the extra objects that are added.
> Somehow checking the "these are exactly what we expect" feels
> brittle.  In a history with three relevant commits A---B---C,
> packing B..C could omit trees and blobs in C that appear in A but
> not in B, but traditionally, because we stop traversal at B and do
> not even look at A, we do not notice that such objects that need to
> complete C's tree are already available in a repository that has B.
> The approach of test in this patch feels similar to saying "we must
> send these duplicates because we must stay dumb", even though with a
> perfect knowledge about the history, e.g. bitmap, we would be able
> to omit these objects in A that appear in C but not in B.
My intention with this test was to show that the existing algorithm also 
sends "extra" objects in certain situations, which is later contrasted 
by a test where the new algorithm sends objects the old algorithm did not.

Instead of "we must stay dumb" I wanted to say "we are not dumber (in 
this case)".

I understand your brittle feeling. If the logic changed in either to be 
smarter, then we would need to change the test. In some sense, that does 
mean we have extra coverage of "this is how it works now" so we would 
understand the behavior change of that hypothetical future change.

> I think we want to test test both "are we sending enough, even
> though we might be wasting some bandwidth by not noticing the other
> side already have some?" and "are we still avoiding from becoming
> overly stupid, even though we may be cheating to save traversal cost
> and risking to redundantly send some objects?"
>
> IOW, a set of tests to make sure that truly new objects are all sent
> by seeing what is sent is a strict superset of what we expect.  And
> another set of tests to make sure that objects that are so obviously
> (this criterion may be highly subjective) be present in the
> receiving repository (e.g. the tree object of commit B and what it
> contains when seinding B..C) are never sent, even when using an
> algorithm that are tuned for traversal cost over bandwidth
> consumption.

To properly test "these objects are included," do we have an established 
pattern for saying "this file is a subset of that file"? Or, should I 
use something like `comm expected actual >common && test_cmp expected 
common`?

> The code change in this step looks all trivially good, and it may
> want to be squashed into the previous step to become a single patch.
> Otherwise, [2/6] would be adding a dead code that nobody exercises
> until this step.

Will squash.

Thanks,
-Stolee

  reply index

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-28 21:52 [PATCH 0/5] Add a new "sparse" tree walk algorithm Derrick Stolee via GitGitGadget
2018-11-28 21:52 ` [PATCH 1/5] revision: add mark_tree_uninteresting_sparse Derrick Stolee via GitGitGadget
2018-11-28 21:52 ` [PATCH 2/5] list-objects: consume sparse tree walk Derrick Stolee via GitGitGadget
2018-11-28 21:52 ` [PATCH 3/5] pack-objects: add --sparse option Derrick Stolee via GitGitGadget
2018-11-28 22:11   ` Stefan Beller
2018-11-29 14:20     ` Derrick Stolee
2018-11-30  2:39       ` Junio C Hamano
2018-11-30 15:53         ` Derrick Stolee
2018-11-28 21:52 ` [PATCH 4/5] revision: implement sparse algorithm Derrick Stolee via GitGitGadget
2018-11-28 21:52 ` [PATCH 5/5] pack-objects: create pack.useSparse setting Derrick Stolee via GitGitGadget
2018-11-28 22:18 ` [PATCH 0/5] Add a new "sparse" tree walk algorithm Ævar Arnfjörð Bjarmason
2018-11-29  4:05   ` Derrick Stolee
2018-11-29 14:24 ` [PATCH v2 0/6] " Derrick Stolee via GitGitGadget
2018-11-29 14:24   ` [PATCH v2 1/6] revision: add mark_tree_uninteresting_sparse Derrick Stolee via GitGitGadget
2018-11-29 14:24   ` [PATCH v2 2/6] list-objects: consume sparse tree walk Derrick Stolee via GitGitGadget
2018-11-29 14:24   ` [PATCH v2 3/6] pack-objects: add --sparse option Derrick Stolee via GitGitGadget
2018-11-29 14:24   ` [PATCH v2 4/6] revision: implement sparse algorithm Derrick Stolee via GitGitGadget
2018-11-29 14:24   ` [PATCH v2 5/6] pack-objects: create pack.useSparse setting Derrick Stolee via GitGitGadget
2018-11-29 14:24   ` [PATCH v2 6/6] pack-objects: create GIT_TEST_PACK_SPARSE Derrick Stolee via GitGitGadget
2018-12-10 16:42   ` [PATCH v3 0/6] Add a new "sparse" tree walk algorithm Derrick Stolee via GitGitGadget
2018-12-10 16:42     ` [PATCH v3 1/6] revision: add mark_tree_uninteresting_sparse Derrick Stolee via GitGitGadget
2018-12-10 16:42     ` [PATCH v3 2/6] list-objects: consume sparse tree walk Derrick Stolee via GitGitGadget
2018-12-10 16:42     ` [PATCH v3 3/6] pack-objects: add --sparse option Derrick Stolee via GitGitGadget
2018-12-10 16:42     ` [PATCH v3 4/6] revision: implement sparse algorithm Derrick Stolee via GitGitGadget
2018-12-10 16:42     ` [PATCH v3 5/6] pack-objects: create pack.useSparse setting Derrick Stolee via GitGitGadget
2018-12-10 16:42     ` [PATCH v3 6/6] pack-objects: create GIT_TEST_PACK_SPARSE Derrick Stolee via GitGitGadget
2018-12-14 21:22     ` [PATCH v4 0/6] Add a new "sparse" tree walk algorithm Derrick Stolee via GitGitGadget
2018-12-14 21:22       ` [PATCH v4 1/6] revision: add mark_tree_uninteresting_sparse Derrick Stolee via GitGitGadget
2019-01-11 19:43         ` Junio C Hamano
2019-01-11 20:25           ` Junio C Hamano
2019-01-11 22:05             ` Derrick Stolee
2018-12-14 21:22       ` [PATCH v4 2/6] list-objects: consume sparse tree walk Derrick Stolee via GitGitGadget
2019-01-11 23:20         ` Junio C Hamano
2018-12-14 21:22       ` [PATCH v4 3/6] pack-objects: add --sparse option Derrick Stolee via GitGitGadget
2019-01-11 22:30         ` Junio C Hamano
2019-01-15 15:06           ` Derrick Stolee [this message]
2019-01-15 18:23             ` Junio C Hamano
2018-12-14 21:22       ` [PATCH v4 4/6] revision: implement sparse algorithm Derrick Stolee via GitGitGadget
2018-12-14 23:32         ` Ævar Arnfjörð Bjarmason
2018-12-17 14:20           ` Derrick Stolee
2018-12-17 14:26             ` Ævar Arnfjörð Bjarmason
2018-12-17 14:50               ` Derrick Stolee
2019-01-11 23:20         ` Junio C Hamano
2018-12-14 21:22       ` [PATCH v4 5/6] pack-objects: create pack.useSparse setting Derrick Stolee via GitGitGadget
2018-12-14 21:22       ` [PATCH v4 6/6] pack-objects: create GIT_TEST_PACK_SPARSE Derrick Stolee via GitGitGadget
2019-01-16 18:25       ` [PATCH v5 0/5] Add a new "sparse" tree walk algorithm Derrick Stolee via GitGitGadget
2019-01-16 18:25         ` [PATCH v5 2/5] list-objects: consume sparse tree walk Derrick Stolee via GitGitGadget
2019-01-16 18:25         ` [PATCH v5 1/5] revision: add mark_tree_uninteresting_sparse Derrick Stolee via GitGitGadget
2019-01-16 18:25         ` [PATCH v5 3/5] revision: implement sparse algorithm Derrick Stolee via GitGitGadget
2019-01-16 18:26         ` [PATCH v5 4/5] pack-objects: create pack.useSparse setting Derrick Stolee via GitGitGadget
2019-01-16 18:26         ` [PATCH v5 5/5] pack-objects: create GIT_TEST_PACK_SPARSE Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8c50e066-b132-cf6e-ed0f-ce843cf3a634@gmail.com \
    --to=stolee@gmail.com \
    --cc=avarab@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Example config snippet for mirrors

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git