git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/8] Additional metadata for filter processes
@ 2020-03-10 18:20 brian m. carlson
From: brian m. carlson @ 2020-03-10 18:20 UTC
  To: git; +Cc: Taylor Blau

Smudge and clean filters are currently provided with one particular
piece of data: the pathname of the file being smudged.  While this is
helpful, there are a variety of situations where people would like to
have more data.

One such situation is for users who would like to have a custom
ident-style filter that contains the branch name.  In many cases, it's
sufficient to look up this information based on HEAD, but during
checkout, HEAD does not point to the right place, since it's updated
after the files are written.

Users also frequently want the commit's object ID and the object ID of
the blob being filtered.  For example, if filtering is expensive and the
filter process sees duplicate blobs during checkout, it can cache the
result and avoid computing the filter twice.
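
As a rough illustration of the caching idea (a sketch only, not code
from this series), a filter process could memoize its output keyed on
the blob object ID.  The names make_cached_filter and transform are
hypothetical, and the sketch assumes the filter's output depends only
on the blob contents; a filter that also depends on the pathname or
ref would need a wider cache key.

```python
def make_cached_filter(transform):
    """Wrap `transform` (bytes -> bytes) with a cache keyed on blob ID.

    `meta` is assumed to be a dict of the metadata keys received for one
    request; the "blob" key may be absent, in which case nothing is cached.
    """
    cache = {}

    def run(meta, content):
        blob = meta.get("blob")
        if blob is None:
            # No blob ID was provided, so we cannot identify duplicates.
            return transform(content)
        if blob not in cache:
            cache[blob] = transform(content)
        return cache[blob]

    return run


if __name__ == "__main__":
    calls = []

    def expensive(content):
        # Stand-in for a costly filter; records each real invocation.
        calls.append(content)
        return content.upper()

    smudge = make_cached_filter(expensive)
    meta = {"pathname": "a.txt", "blob": "1" * 40}
    smudge(meta, b"hello")
    smudge(dict(meta, pathname="b.txt"), b"hello")
    print("transform ran", len(calls), "time(s)")
```

Keying on the blob alone means two paths with identical contents share
one computation, which is the duplicate-blob case described above.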

This series provides an additional set of metadata to the filter
process with the keys "ref", "treeish", and "blob".  We prefer to
provide a commit as the treeish whenever possible, but in some cases,
such as when git archive is invoked with a tree, there is no commit, and
we use the tree instead.
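
To make the shape of this concrete, here is a minimal sketch of how a
filter process might read the per-request key/value list, assuming the
new keys arrive as ordinary "key=value" pkt-lines alongside "command"
and "pathname" and are terminated by a flush packet.  The helper names
(read_pkt_line, read_metadata, encode_pkt_line) are made up for
illustration, and the handshake and content transfer are omitted.

```python
import io


def read_pkt_line(stream):
    """Read one pkt-line from a binary stream; return None on a flush packet."""
    header = stream.read(4)
    if len(header) != 4:
        raise EOFError("truncated pkt-line header")
    length = int(header.decode("ascii"), 16)
    if length == 0:
        return None  # "0000" is the flush packet
    if length < 4:
        raise ValueError("unsupported special pkt-line")
    payload = stream.read(length - 4)
    if len(payload) != length - 4:
        raise EOFError("truncated pkt-line payload")
    return payload


def read_metadata(stream):
    """Collect key=value packets up to the next flush packet into a dict."""
    meta = {}
    while True:
        packet = read_pkt_line(stream)
        if packet is None:
            return meta
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = packet.decode("utf-8").rstrip("\n").partition("=")
        meta[key] = value


def encode_pkt_line(text):
    """Encode text as a pkt-line (the 4-digit hex length counts itself)."""
    data = text.encode("utf-8")
    return ("%04x" % (len(data) + 4)).encode("ascii") + data


if __name__ == "__main__":
    request = b"".join([
        encode_pkt_line("command=smudge\n"),
        encode_pkt_line("pathname=docs/readme.txt\n"),
        encode_pkt_line("ref=refs/heads/master\n"),
        encode_pkt_line("treeish=" + "a" * 40 + "\n"),
        encode_pkt_line("blob=" + "b" * 40 + "\n"),
        b"0000",
    ])
    meta = read_metadata(io.BytesIO(request))
    # Any of the new keys may be absent, so look them up defensively.
    print(meta.get("ref"), meta.get("treeish"), meta.get("blob"))
```

Since only the pathname is guaranteed, a filter written this way
should treat "ref", "treeish", and "blob" as optional.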

Note that we don't provide this metadata in all cases.  Sometimes it is
trivial for the filter to do a simple "git rev-parse HEAD", and in such
cases, metadata other than the blob may not be provided.  We also don't
handle the case where the user is using a smudge or clean command
instead of a filter process command: if the user wants the additional
metadata, it should be possible for them to write a small filter
process, which is reasonably trivial in most languages.  Our
documentation already permits us to add additional metadata and
guarantees only that the pathname will be provided.

My particular use case for this is prefetching and precomputing data
during archive generation, since we don't permit delayed filters there
due to archives needing to be in a predictable order.  I have tried to
make it as generally applicable as possible, since I can imagine (and
have indeed seen requests for) many other useful applications of this
elsewhere.

Feedback is of course welcome.

brian m. carlson (8):
  builtin/checkout: pass branch info down to checkout_worktree
  convert: permit passing additional metadata to filter processes
  convert: provide additional metadata to filters
  builtin/checkout: compute checkout metadata for checkouts
  builtin/clone: compute checkout metadata for clones
  builtin/rebase: compute checkout metadata for rebases
  builtin/reset: compute checkout metadata for reset
  t0021: test filter metadata for additional cases

 apply.c                 |   2 +-
 archive.c               |  13 ++-
 archive.h               |   1 +
 builtin/cat-file.c      |   5 +-
 builtin/checkout.c      |  54 +++++++----
 builtin/clone.c         |   6 +-
 builtin/rebase.c        |   1 +
 builtin/reset.c         |  16 +++-
 cache.h                 |   1 +
 convert.c               |  66 ++++++++++++--
 convert.h               |  29 +++++-
 diff.c                  |   5 +-
 entry.c                 |   7 +-
 merge-recursive.c       |   2 +-
 merge.c                 |   1 +
 sequencer.c             |   1 +
 t/t0021-conversion.sh   | 198 ++++++++++++++++++++++++++++++++++------
 t/t0021/rot13-filter.pl |   6 ++
 unpack-trees.c          |   1 +
 unpack-trees.h          |   1 +
 20 files changed, 341 insertions(+), 75 deletions(-)



end of thread, other threads:[~2020-03-15 19:30 UTC | newest]

Thread overview: 14+ messages
2020-03-10 18:20 [PATCH 0/8] Additional metadata for filter processes brian m. carlson
2020-03-10 18:20 ` [PATCH 1/8] builtin/checkout: pass branch info down to checkout_worktree brian m. carlson
2020-03-10 18:20 ` [PATCH 2/8] convert: permit passing additional metadata to filter processes brian m. carlson
2020-03-11 20:38   ` Junio C Hamano
2020-03-12  0:39     ` brian m. carlson
2020-03-10 18:20 ` [PATCH 3/8] convert: provide additional metadata to filters brian m. carlson
2020-03-10 18:20 ` [PATCH 4/8] builtin/checkout: compute checkout metadata for checkouts brian m. carlson
2020-03-10 18:20 ` [PATCH 5/8] builtin/clone: compute checkout metadata for clones brian m. carlson
2020-03-15 10:39   ` SZEDER Gábor
2020-03-15 17:44     ` brian m. carlson
2020-03-15 19:30     ` Junio C Hamano
2020-03-10 18:20 ` [PATCH 6/8] builtin/rebase: compute checkout metadata for rebases brian m. carlson
2020-03-10 18:20 ` [PATCH 7/8] builtin/reset: compute checkout metadata for reset brian m. carlson
2020-03-10 18:20 ` [PATCH 8/8] t0021: test filter metadata for additional cases brian m. carlson
