git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Christian Couder <christian.couder@gmail.com>
To: Chandra Pratap <chandrapratap3519@gmail.com>
Cc: Patrick Steinhardt <ps@pks.im>, git <git@vger.kernel.org>,
	 karthik nayak <karthik.188@gmail.com>,
	Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Subject: Re: [RFC][GSoC] Proposal v2: Move and improve reftable tests in the unit testing framework
Date: Tue, 2 Apr 2024 12:01:59 +0200	[thread overview]
Message-ID: <CAP8UFD26BzjOdJPjcbVy-xGD7nBhFsfSu9CyureCLdHp9eUMZg@mail.gmail.com> (raw)
In-Reply-To: <63059d3b-e883-4fa2-8364-64f7bbd064f7@gmail.com>

On Fri, Mar 29, 2024 at 8:08 AM Chandra Pratap
<chandrapratap3519@gmail.com> wrote:
>
> Thanks for the feedback, Christian and Patrick! With your advice, I
> decided to split my original proposal into two to conform to what was
> suggested by the SoC 2024 Ideas page.
>
> This is the proposal for the reftable tests migration project. However,
> I am unsure of what would be a good project size for this project.
> I have quite a long summer vacation and don't really have any other
> plans other than GSoC as of now so I decided to go with large project
> size on GSoC's website but please let me know if another size would
> be more appropriate.

On our Idea page, we have "Expected Project Size: 175 hours or 350
hours" for this project. So it's fine for us if you pick one of those
(medium or large).

> Before GSoC
> -----------
>
> -----Synopsis-----
>
> A new unit testing framework was introduced to the Git mailing list last
> year with the aim of simplifying testing and improving maintainability
> by moving the current testing setup from shell scripts and helper files
> to a framework written wholly in C. The idea was accepted and merged
> into master on 09/11/2023. The choice of testing framework and the
> rationale behind the choice is thoroughly described in
> Documentation/technical/unit-tests.txt.
>
> This project aims to extend that work by moving the reftable tests from
> the current setup to the new unit testing framework and improving the
> tests themselves. The difficulty for the project should be medium
> and it should take somewhat around 175-350 hours.

It's fine to give information about the project that is already in the
Idea page (or even to just copy from that page), but we are more
interested to know how you approach the project, and how you want to
work on it. So if you give information that is already in the Idea
page, please say it clearly, so we can easily skip reading that if we
don't have much time.

As we prefer that you give information specific to your approach, I
think it would have been more interesting to say that you decided to
go with a "large" project size for example.

> In GSoC
> -------
>
> -----Background for reftable-----
>
> Git’s internals consist of mainly three objects: blobs, tree objects and
> commit objects. The blobs and tree objects are responsible for storing a
> repository’s content while the commit objects store information about
> commits in the repo and are responsible for capturing the repo’s
> history.

When explaining refs with as much detail, I think tag objects are
important too, as refs often point to them too. On the contrary, blobs
and tree objects are very rarely pointed to by refs. So it's a bit
strange that you don't talk about tag objects, but that you talk about
blobs and trees which are less important than tag objects in this
context.

> Every one of these objects can be accessed through a unique key
> generated by a SHA-256 (previously SHA-1) algorithm.

Actually both SHA-256 and SHA-1 are supported.

> To make life
> easier, instead of remembering the hash key for commit objects, we can
> assign a simple name to them, store these names in a file and use that
> file whenever we need access to the commits. These names are called
> ‘references’ or ‘refs’.
>
> Since a repository can contain a lot of commits and branches and hence,
> a lot of refs, Git used packed-refs to save space by storing unused refs
> in a single file. However, this arrangement doesn’t scale well in terms
> of both space and performance. This is where reftable comes in. A
> reftable file is a portable binary file format customized for storing
> references. Some objectives of reftable are:
> -  Sorted references enabling advanced scans like binary search.
> -  Near constant time lookup for any single reference.
> -  Efficient enumeration of an entire namespace like refs/tags/
> -  Combined reflog storage with ref storage for small transactions and
>    separate reflog storage for base refs and historical logs.
> -  Near constant time verification if an object name is referred to by at
>    least one reference.
>
> -----Plan-----
>
> The reftable tests are different from other tests in the test directory

Not sure which "test directory" you are talking about. I guess it's "t/".

> because they perform unit testing with the help of a custom test framework
> rather than the usual ‘helper file + shell script’ combination.

There are actually already unit tests in t/unit-tests which don't need
a shell script.

I think it would have been clearer if this paragraph was started with
something like "Compared to unit tests that use both a "t/*.sh" test
script in shell and a test-tool helper in "t/helper/", ..."

> Reftable tests do have a helper file and a shell script invoking the
> helper file, but rather than performing the tests, this combination is
> used to invoke tests defined in the reftable directory.

It looks like you are talking about the "reftable/*_test.c" files, but
I think it would be clearer if you spelled that out.

> The reftable directory consists of nine tests:

I think using "contains" instead of "consists" would be a bit better,
as there are other things than the nine tests in the "reftable/"
directory.

>
> •  basics test
> •  record test
> •  block test
> •  tree test
> •  pq test
> •  stack test
> •  merged test
> •  refname test
> •  read-write test

Yeah, this corresponds to the output of `ls reftable/*_test.c`.

> Each of these tests is written in C using a custom reftable testing
> framework defined by reftable/test_framework (also written in C).

Using "reftable/test_framework.{c,h}" would be a bit clearer as there
is no "reftable/test_framework" file.

> Since the reftable test framework is written in C like the unit testing
> framework, we can create a direct translation of the features mentioned
> above using the existing tools in the unit testing framework with the
> following plan:
>
> •  EXPECT_ERR(c): Can be replaced by check(!c) or check_int(c, “==”, 0).
>
> •  EXPECT_STREQ(a, b): Can be replaced by check_str(a, b).
>
> •  EXPECT(c): Can be replaced by check_int(), similar to EXPECT_ERR.
>    E.g. expect(a >= b) --> check_int(a, “>=”, b)
>
> •  RUN_TEST(f): Can be replaced by TEST(f(), “message explaining the test”).
>
> The information contained in the diagnostic messages of these macros is
> replicated in the unit testing framework by default. Any additional
> information can be displayed using the test_msg() functionality in the
> framework. The additional functions set_test_hash() and strbuf_add_void()
> may be moved to reftable/stack.c and reftable/refname.c respectively.
>
> The plan itself is basic and does need improvements, but using this plan,
> I have already created a working albeit primitive copy for two out of the
> nine tests (basics test and tree test) as can be seen here:
> (link: https://github.com/gitgitgadget/git/pull/1698)

Nice, but it looks like "unit-tests/bin/t-reftable-tree" sometimes
fails, perhaps because it's missing a call to test_done().

The rest of your proposal LGTM.

Thanks!


      reply	other threads:[~2024-04-02 10:03 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-19 17:11 [RFC][GSoC] Proposal: Move reftable and other tests to the unit testing framework Chandra Pratap
2024-03-20 13:17 ` Christian Couder
2024-03-20 19:35   ` Chandra Pratap
2024-03-21 12:27   ` Patrick Steinhardt
2024-03-29  6:54     ` [RFC][GSoC] Proposal v2: Move more " Chandra Pratap
2024-03-29 17:51       ` Karthik Nayak
2024-03-29  7:08     ` [RFC][GSoC] Proposal v2: Move and improve reftable tests in " Chandra Pratap
2024-04-02 10:01       ` Christian Couder [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAP8UFD26BzjOdJPjcbVy-xGD7nBhFsfSu9CyureCLdHp9eUMZg@mail.gmail.com \
    --to=christian.couder@gmail.com \
    --cc=chandrapratap3519@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=kaartic.sivaraam@gmail.com \
    --cc=karthik.188@gmail.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).