On 10/24/21 1:15 AM, Junio C Hamano wrote: > Eli Schwartz writes: > >> The %(describe) placeholder by default, like `git describe`, uses a >> seven-character abbreviated commit hash. This may not be sufficient to > > "hash" -> "object name". > >> fully describe all git repos, resulting in a placeholder replacement > > "all git repos" -> "all commits in a given repository" (there may be > other valid way to clarify, but the point is that 'describe' does > not describe 'git repos' in the sense that my repository gets > description X while your repository gets description Y). Good points. >> changing its length because the repository grew in size. This could >> cause the output of git-archive to change. >> >> Add the --abbrev option to `git describe` to the placeholder interface >> in order to provide tools to the user for fine-tuning project defaults >> and ensure reproducible archives. > > Note that it is sad that --abbrev= does not necessarily ensure > reproducibility. To be more precise, I do not think it sacrifices > uniqueness to make the output reproducible. You can get more than N > hex-digits in the output if N is too small to ensure uniquness. > > So it indeed is that this line of thought ... > >> One alternative would be to just always specify --abbrev=40 but this may >> be a bit too biased... > > ... to use --abbrev=999 (because 40 is not the length of a full > object name in the SHA-2 world) is the only reasonable way, if what > you care about is the reproducibility. Right, I keep forgetting about the current work towards SHA-2... that being said I somehow feel that 40 hex-digits will probably be reasonably sufficient even if commit object ids can become longer than that. So it is technically true... > Side note. I think "git describe --no-abbrev" is buggy in that > it does not give a full object name; I didn't check the code, > but it appears to be behaving the same way as "git describe > --abbrev=0" (show no hexdigits). Fixing this bug may possibly > be a low-hanging fruit. I... did not realize that --abbrev, which takes an integer value, could be specified with a leading --no- in the first place. :o > But even if the feature cannot be used to guarantee a full > reproducibility, it is a good thing that we can now add this feature > with minimum effort thanks to the previous two steps. > > The refactoring I suggested in my review for the previous step will > shine, if we want to do a good job parsing the --abbrev= option, > since such a code organization would make it a fairly easy addition > to introduce "integer" type that calls match_placeholder_arg_value() > to read the option value (like "string" does) and validate that the > value is indeed an integer. > > Would we want to support "--contains" as another boolean type? How > about "--all" and "--long"? All three sound plausible candidates. I didn't have any immediate use for these options. To be honest, I don't entirely understand the purpose of: --all, which causes git describe to report things like "heads/master" --contains, which causes git describe to report different output depending on the status of later commits being tagged They may have their uses, but I'm not sure those uses include writing metadata to a git-archive tarball at least. I believe the results would inevitably end up changing after the fact, whereas only matching existing tags will tend to be pretty reliable as a tag is, in most (all?) cases, pushed at the same time as, or before: - the commit it describes - commits which are later than the tag On the other hand... --long could be interesting to some people, although for, say, generating software version numbers, a tagged release will typically omit that information in my experience. For the sake of thoroughness I could add that too. -- Eli Schwartz Arch Linux Bug Wrangler and Trusted User