git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys <hanwen@google.com>,
	Johannes Schindelin <johannes.schindelin@gmx.de>
Subject: [PATCH 0/6] Minimal patch series to fix the CI runs of hn/reftable
Date: Mon, 04 May 2020 13:31:20 +0000	[thread overview]
Message-ID: <pull.623.git.1588599086.gitgitgadget@gmail.com> (raw)

These patches are not intended to be complete, not by any stretch of
imagination. They are just enough to get the CI run to pass, even in the
Windows-specific parts.

As I mentioned elsewhere, I would much prefer for hn/reftable to not 
re-invent get_be*(), struct strbuf, struct string_list, struct lock_file 
etc.

However, in the context of the test failures, I do not think that this would
have made a big difference. Apart from the unportable constructs, and from
the "delete/rename while there is still a handle on the file" issues, it
would appear that one big reason why it was so hard to debug and fix the
test is the recursive complexity of the data structures.

To elaborate on that: struct reftable_stack has an attribute called merged 
that has an array of struct reftable_reader * (confusingly called "stack").
Each of these readers has an attribute of type struct reftable_block_source 
that uses a vtable and an opaque pointer to abstract three types of block
sources: file, slice (which is apparently unused, i.e. it is apparently just
dead weight for now) and malloc.

I am not sure that this abstraction fest serves us well here.

Quite honestly, I would have expected the packed_git data structure to be
imitated more closely, as it strikes me as similar in purpose, and it has
seen a ton of testing over the past decade and a half. I could not recognize
that design in the reftable, though.

It is quite obvious, of course, that the code tries to imitate the
object-oriented nature of the Go code, but it seems quite obvious from my
difficulties to address the problem where stack_compact_range() would try to
delete stale reftable files (without even so much as a warning when the
files could not be deleted!) without releasing all file handles to all
reftable files, even the ones that do not need to be deleted. To be smarter
about this, the code has to know more about the nature of the block source
than the abstraction layer suggests. It has to know that a block source
refers to a file, and that that file is marked for deletion. My heavy-handed
work-around, even if it works, is not really a good solution, but it is a
testament that there is a lot of opportunity to simplify the code
drastically while at the same time simplifying the design, too.

I know you have been putting a lot of effort into this library, so I feel a
bit bad about saying the following: The hn/reftable patches will need
substantial work before we can merge it with confidence. Part of the reason
why it is so hard to review the reftable patches is that they intentionally 
refuse to integrate well within Git's source code, such as (re-)implementing
its own fundamental data structures, intentionally using a totally different
coding style, and it concerns me that the stated requirement for bug fixes
is to treat Git's source code as a downstream of the actual project. I am
not too bad a C developer and would consider myself competent in debugging
issues, even hard ones, in Git, and yet... it was really hard to wade
through the layers of abstraction to figure out where the file handles
should be closed that were opened and prevented deleting/renaming files.

At this point, I don't feel that it makes sense to keep insisting on having
this in a separate library. The only other user of that library will be
libgit2, and I really think that given libgit2's heritage, it won't be a
problem to adapt the code after it stabilized in git.git (and since libgit2
treats git.git as upstream for the libxdiff code, it won't be a problem to
do the same for the reftable code, too). I believe that the best course of
action is to reuse the data structures libgit.a provides, and to delete the
re-implementations in reftable/. Only then can we start working effectively
on refactoring the code to simplify the data structures in order to clarify
resource usage (which was the root cause for the bugs I fixed, although I am
sure that there are way more lurking in there, hidden by the fact that the
code is not covered thoroughly by our tests).

Johannes Schindelin (6):
  fixup! Add reftable library
  fixup! Add reftable library
  fixup! Add reftable library
  fixup! Add reftable library
  fixup! Add reftable library
  vcxproj: adjust for the reftable changes

 config.mak.uname                           |  2 +-
 contrib/buildsystems/Generators/Vcxproj.pm | 11 ++++++++++-
 reftable/record.c                          |  2 +-
 reftable/stack.c                           | 17 +++++++++++++++--
 4 files changed, 27 insertions(+), 5 deletions(-)


base-commit: 1b749f9d62f2c8b5a8e91f382d2be14902459cba
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-623%2Fdscho%2Freftable-on-windows-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-623/dscho/reftable-on-windows-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/623
-- 
gitgitgadget

             reply	other threads:[~2020-05-04 13:31 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-04 13:31 Johannes Schindelin via GitGitGadget [this message]
2020-05-04 13:31 ` [PATCH 1/6] fixup! Add reftable library Johannes Schindelin via GitGitGadget
2020-05-04 13:31 ` [PATCH 2/6] " Johannes Schindelin via GitGitGadget
2020-05-04 13:31 ` [PATCH 3/6] " Johannes Schindelin via GitGitGadget
2020-05-04 13:31 ` [PATCH 4/6] " Johannes Schindelin via GitGitGadget
2020-05-04 13:31 ` [PATCH 5/6] " Johannes Schindelin via GitGitGadget
2020-05-04 13:31 ` [PATCH 6/6] vcxproj: adjust for the reftable changes Johannes Schindelin via GitGitGadget
2020-05-04 17:12   ` Junio C Hamano
2020-05-04 15:56 ` [PATCH 0/6] Minimal patch series to fix the CI runs of hn/reftable Junio C Hamano
2020-05-05  8:02   ` Johannes Schindelin
2020-05-04 17:04 ` Han-Wen Nienhuys
2020-05-05  8:00   ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.623.git.1588599086.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=hanwen@google.com \
    --cc=johannes.schindelin@gmx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).