[PATCH 11/13] t6112: rev-list object filtering test

From: Jeff Hostetler <git@jeffhostetler.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, jonathantanmy@google.com,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: [PATCH 11/13] t6112: rev-list object filtering test
Date: Fri, 22 Sep 2017 20:30:15 +0000
Message-ID: <20170922203017.53986-12-git@jeffhostetler.com>
In-Reply-To: <20170922203017.53986-6-git@jeffhostetler.com>

From: Jeff Hostetler <jeffhost@microsoft.com>

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/t6112-rev-list-filters-objects.sh | 237 ++++++++++++++++++++++++++++++++++++
 1 file changed, 237 insertions(+)
 create mode 100755 t/t6112-rev-list-filters-objects.sh
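
Reader's note on the size-boundary tests below (limits 500, 1000, 1001,
1k and 1m): their expectations are consistent with a rule where a blob is
omitted when its size is greater than or equal to the limit. The sketch
here models only that rule in plain shell. It is an illustration under
that assumption, not the filter implementation; omitted_for_limit is a
hypothetical helper that is not part of this patch or of git, and it
works on sizes rather than on real blobs. The sizes 1000 and 10000 are
the ones 'setup_large' creates with printf, and 1024 stands in for "1k"
on the assumption that the suffix is binary.

```shell
#!/bin/sh

# Hypothetical helper: print every size that would be omitted for a
# given limit, assuming "omit when size >= limit".
omitted_for_limit () {
	_limit=$1
	shift
	for _size in "$@"
	do
		if test "$_size" -ge "$_limit"
		then
			printf "%s\n" "$_size"
		fi
	done
}

# printf pads X to the requested width, so large.1000 is exactly 1000 bytes.
printf "%1000s" X | wc -c

# Walk the limits used by the tests against the two blob sizes.
for limit in 500 1000 1001 1024
do
	echo "limit=$limit omits:" $(omitted_for_limit "$limit" 1000 10000)
done
```

Under this model, limits 500 and 1000 omit both sizes, while 1001 and
1024 omit only the 10000-byte one, which matches what the tests expect.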

diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh
new file mode 100755
index 0000000..66ff022
--- /dev/null
+++ b/t/t6112-rev-list-filters-objects.sh
@@ -0,0 +1,237 @@
+#!/bin/sh
+
+test_description='git rev-list with object filtering'
+
+. ./test-lib.sh
+
+# test the omit-all filter
+
+test_expect_success 'setup' '
+	echo "{print \$1}" >print_1.awk &&
+	echo "{print \$2}" >print_2.awk &&
+
+	for n in 1 2 3 4 5
+	do
+		echo $n >file.$n &&
+		git add file.$n &&
+		git commit -m "$n" || return 1
+	done
+'
+
+# Verify the omitted ("~OID") lines match the predicted list of OIDs.
+test_expect_success 'omit-all-blobs omitted 5 blobs' '
+	git ls-files -s file.1 file.2 file.3 file.4 file.5 \
+		| awk -f print_2.awk \
+		| sort >expected &&
+	git rev-list HEAD --quiet --objects --filter-print-omitted --filter-omit-all-blobs \
+		| awk -f print_1.awk \
+		| sed "s/~//" | sort >observed &&
+	test_cmp observed expected
+'
+
+# Verify the complete OID list matches the unfiltered OIDs plus the omitted OIDs.
+test_expect_success 'omit-all-blobs nothing else changed' '
+	git rev-list HEAD --objects \
+		| awk -f print_1.awk \
+		| sort >expected &&
+	git rev-list HEAD --objects --filter-print-omitted --filter-omit-all-blobs \
+		| awk -f print_1.awk \
+		| sed "s/~//" \
+		| sort >observed &&
+	test_cmp observed expected
+'
+
+# test the size-based filtering.
+
+test_expect_success 'setup_large' '
+	for n in 1000 10000
+	do
+		printf "%"$n"s" X >large.$n &&
+		git add large.$n &&
+		git commit -m "$n" || return 1
+	done
+'
+
+test_expect_success 'omit-large-blobs omit 2 blobs' '
+	git ls-files -s large.1000 large.10000 \
+		| awk -f print_2.awk \
+		| sort >expected &&
+	git rev-list HEAD --quiet --objects --filter-print-omitted --filter-omit-large-blobs=500 \
+		| awk -f print_1.awk \
+		| sed "s/~//" | sort >observed &&
+	test_cmp observed expected
+'
+
+test_expect_success 'omit-large-blobs nothing else changed' '
+	git rev-list HEAD --objects \
+		| awk -f print_1.awk \
+		| sort >expected &&
+	git rev-list HEAD --objects --filter-print-omitted --filter-omit-large-blobs=500 \
+		| awk -f print_1.awk \
+		| sed "s/~//" \
+		| sort >observed &&
+	test_cmp observed expected
+'
+
+# Boundary test around the size parameter.
+# A blob is kept only if its size is strictly less than the value, so 500 and
+# 1000 should give the same results, but 1001 should omit one blob fewer.
+
+test_expect_success 'omit-large-blobs omit 2 blobs (1000)' '
+	git ls-files -s large.1000 large.10000 \
+		| awk -f print_2.awk \
+		| sort >expected &&
+	git rev-list HEAD --quiet --objects --filter-print-omitted --filter-omit-large-blobs=1000 \
+		| awk -f print_1.awk \
+		| sed "s/~//" | sort >observed &&
+	test_cmp observed expected
+'
+
+test_expect_success 'omit-large-blobs omit 1 blob' '
+	git ls-files -s large.10000 \
+		| awk -f print_2.awk \
+		| sort >expected &&
+	git rev-list HEAD --quiet --objects --filter-print-omitted --filter-omit-large-blobs=1001 \
+		| awk -f print_1.awk \
+		| sed "s/~//" >observed &&
+	test_cmp observed expected
+'
+
+test_expect_success 'omit-large-blobs omit 1 blob (1k)' '
+	git ls-files -s large.10000 \
+		| awk -f print_2.awk \
+		| sort >expected &&
+	git rev-list HEAD --quiet --objects --filter-print-omitted --filter-omit-large-blobs=1k \
+		| awk -f print_1.awk \
+		| sed "s/~//" >observed &&
+	test_cmp observed expected
+'
+
+test_expect_success 'omit-large-blobs omit no blob (1m)' '
+	cat </dev/null >expected &&
+	git rev-list HEAD --quiet --objects --filter-print-omitted --filter-omit-large-blobs=1m \
+		| awk -f print_1.awk \
+		| sed "s/~//" >observed &&
+	test_cmp observed expected
+'
+
+# Test sparse-pattern filtering (using explicit local patterns).
+# We use the same disk format as sparse-checkout to specify the
+# filtering, but do not require sparse-checkout to be enabled.
+
+test_expect_success 'setup using sparse file' '
+	mkdir dir1 &&
+	for n in sparse1 sparse2
+	do
+		echo $n >$n &&
+		git add $n &&
+		echo dir1/$n >dir1/$n &&
+		git add dir1/$n || return 1
+	done &&
+	git commit -m "sparse" &&
+	echo dir1/ >pattern1 &&
+	echo sparse1 >pattern2
+'
+
+# pattern1 should include only the 2 dir1/* files
+# and omit the 5 file.*, 2 large.*, and 2 top-level sparse* files.
+test_expect_success 'sparse using path pattern1' '
+	git rev-list HEAD --objects --filter-print-omitted --filter-use-path=pattern1 >out &&
+
+	grep "^~" out >blobs.omitted &&
+	test $(cat blobs.omitted | wc -l) = 9 &&
+
+	grep "dir1/sparse" out >blobs.included &&
+	test $(cat blobs.included | wc -l) = 2
+'
+
+# pattern2 should include sparse1 and dir1/sparse1
+# and omit the 5 file.*, 2 large.*, and the 2 sparse2 files.
+test_expect_success 'sparse using path pattern2' '
+	git rev-list HEAD --objects --filter-print-omitted --filter-use-path=pattern2 >out &&
+
+	grep "^~" out >blobs.omitted &&
+	test $(cat blobs.omitted | wc -l) = 9 &&
+
+	grep "sparse1" out >blobs.included &&
+	test $(cat blobs.included | wc -l) = 2
+'
+
+# Test sparse-pattern filtering (using a blob in the repo).
+# This could be used to later let pack-objects do filtering.
+
+# pattern1 should include only the 2 dir1/* files
+# and omit the 5 file.*, 2 large.*, 2 top-level sparse*, and 1 pattern file.
+test_expect_success 'sparse using OID for pattern1' '
+	git add pattern1 &&
+	git commit -m "pattern1" &&
+
+	git rev-list HEAD --objects >normal.output &&
+	grep "pattern1" <normal.output | awk "{print \$1;}" >pattern1.oid &&
+
+	git rev-list HEAD --objects --filter-print-omitted --filter-use-blob=$(cat pattern1.oid) >out &&
+
+	grep "^~" out >blobs.omitted &&
+	test $(cat blobs.omitted | wc -l) = 10 &&
+
+	grep "dir1/sparse" out >blobs.included &&
+	test $(cat blobs.included | wc -l) = 2
+'
+
+# repeat previous test but use blob-ish expression rather than OID.
+test_expect_success 'sparse using blob-ish to get OID for pattern spec' '
+	git rev-list HEAD --objects --filter-print-omitted --filter-use-blob=HEAD:pattern1 >out &&
+
+	grep "^~" out >blobs.omitted &&
+	test $(cat blobs.omitted | wc -l) = 10 &&
+
+	grep "dir1/sparse" out >blobs.included &&
+	test $(cat blobs.included | wc -l) = 2
+'
+
+# pattern2 should include sparse1 and dir1/sparse1
+# and omit the 5 file.*, 2 large.*, the 2 sparse2 files, and 2 pattern files.
+test_expect_success 'sparse using OID for pattern2' '
+	git add pattern2 &&
+	git commit -m "pattern2" &&
+
+	git rev-list HEAD --objects >normal.output &&
+	grep "pattern2" <normal.output | awk "{print \$1;}" >pattern2.oid &&
+
+	git rev-list HEAD --objects --filter-print-omitted --filter-use-blob=$(cat pattern2.oid) >out &&
+
+	grep "^~" out >blobs.omitted &&
+	test $(cat blobs.omitted | wc -l) = 11 &&
+
+	grep "sparse1" out >blobs.included &&
+	test $(cat blobs.included | wc -l) = 2
+'
+
+# repeat previous test but use blob-ish expression rather than OID.
+test_expect_success 'sparse using blob-ish rather than OID for pattern2' '
+	git rev-list HEAD --objects --filter-print-omitted --filter-use-blob=HEAD:pattern2 >out &&
+
+	grep "^~" out >blobs.omitted &&
+	test $(cat blobs.omitted | wc -l) = 11 &&
+
+	grep "sparse1" out >blobs.included &&
+	test $(cat blobs.included | wc -l) = 2
+'
+
+# delete some loose objects and test rev-list printing them as missing.
+test_expect_success 'print missing objects' '
+	git ls-files -s file.1 file.2 file.3 file.4 file.5 \
+		| awk -f print_2.awk \
+		| sort >expected &&
+	for id in $(sed "s|..|&/|" expected)
+	do
+		rm .git/objects/$id || return 1
+	done &&
+	git rev-list --quiet HEAD --filter-print-missing --objects \
+		| awk -f print_1.awk \
+		| sed "s/?//" \
+		| sort >observed &&
+	test_cmp observed expected
+'
+
+test_done
-- 
2.9.3
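
Reader's note on the 'print missing objects' test: it deletes loose
objects by rewriting each OID into the fan-out path used under
.git/objects, where the first two hex digits name a directory and the
remaining digits name the file. A small sketch of that rewrite follows.
The OID is a made-up placeholder, and oid_to_loose_path is a hypothetical
helper introduced for illustration; the patch itself inlines the sed
expression.

```shell
#!/bin/sh

# Hypothetical helper: turn an OID into its path relative to .git/objects.
oid_to_loose_path () {
	# Without the "g" flag, sed rewrites only the first two characters,
	# so exactly one slash is inserted.
	echo "$1" | sed "s|..|&/|"
}

# Placeholder value, not a real object in any repository.
oid=0123456789abcdef0123456789abcdef01234567
echo ".git/objects/$(oid_to_loose_path "$oid")"
# prints .git/objects/01/23456789abcdef0123456789abcdef01234567
```

The rewrite only finds objects that are still stored loose; the test
appears to depend on nothing having been packed before this point.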
