Date: Sat, 21 Oct 2017 15:56:51 +0200 (CEST)
From: nicolas.mailhot@laposte.net
To: Stefan Beller
Cc: git@vger.kernel.org
Message-ID: <1760206035.56692.1508594211365.JavaMail.zimbra@laposte.net>
References: <2089146798.216127.1508487262593.JavaMail.zimbra@laposte.net> <1290947539.4254.1508496039812.JavaMail.zimbra@laposte.net>
Subject: Re: [RFE] Add minimal universal release management capabilities to GIT

----- Original Message -----
From: "Stefan Beller"

>> Unfortunately Git is so good more and more developers start to procrastinate on any activity that happens outside of GIT,
>> starting with cutting releases.
>> The meme "one only needs a git commit hash" is going strong, even infecting institutions
>> like lwn and glibc (https://lwn.net/SubscriberLink/736429/e5a8c8888cc85cc8/)

> For release you would want to include more than just "the code" into
> the hash, such as compiler versions, environment variables, the phase
> of the moon, what have you, that may impact the release build.

Yes and no. Yes because you do want to limit failure cases, and no because it's very easy to overspecify and block code reuse possibilities. Anyway, I don't see a strong consensus on how to do those yet, they are very language-specific, and the first step is being able to identify the other code you depend on, which requires some sort of release id, which is what my message was about. You can't build any compatibility matrix before being able to name the dimensions of the matrix.

> It sounds to me as if you assume that if X, Y, Z were numbers (or
> rather had some order), this can be easily deduced.

It's a lot easier to use "option foo was introduced in version 2.3.4 and takes Y parameters" than "option foo was introduced in commit hash #############################################, you have version hash $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$", good luck.

> The output of git-describe ought to be sufficient for an ordering
> scheme to rely on?

That relies on git access to the repo of every bit of code your computer runs. This is not practical past the deployment phase. For deployment the ordering needs to be extracted from all the git data so you only need to manipulate short human- and tool-friendly ids. You need low coupling, not the strong coupling of git repo access.

>> - hashes are not ranked. You can not guess, looking at a hash, if it corresponds to a project stability point, or is in a
>> middle of a refactoring sequence, where things are expected to break.
>> Evaluating every hash of every project you use
>> quickly becomes prohibitive, with the only possible strategy being to just use the latest commit at a given time and pray
>> (and if you are lucky never update afterwards unless you have lots of fixing and testing time to waste).

> That is up to the hash function. One could imagine a hash function
> that generates bit patterns that you can use to obtain an order from.

No, that is not up to the hash function. First because hashes are too long to be manipulated by humans, and second because no hash will ever capture human intent. You need an explicit human action to mark "I want others to use this particular state of my project because I am confident it is solid".

>> - commit mixing is broken by design.

> In Git terms a repository is the whole universe.
> If you want relationships between different projects, you need to
> include these projects e.g. via subtree or submodules.
> It scales even up to linux distributions (e.g.
> https://github.com/gittup/gittup, includes nethack!)

This is still the pre-deployment phase. And I wouldn't qualify this as a "full linux distro", it's very small scale. If anything it demonstrates that even on a smallish perimeter relying on git alone as it stands today is too hard (3 updates in the whole of 2017!).

>> One can not adapt the user of a piece of code to changes in this piece of code before those changes are committed in the
>> first place. There will always be moments where the latest commit of a project is incompatible with the latest commit of
>> downstream users of this project. It is not a problem in developer environments and automated testers, where you want things
>> to break early and be fixed early. It is a huge problem when you follow the same early commit push strategy for actual
>> production code, where failures are not just a red light in a build farm dashboard, but have real-world consequences.
>> And the more interlinked git repositories you pile on one another, the higher the probability is two commits won't work with
>> one another with failures cascading down

> That is software engineering in general, I am not sure how Git relates
> to this? Any change that you make (with or without utilizing Git) can
> break the downstream world.

It's a lot easier to manage when you have discrete release synchronisation points and not just a flow of commits.

>> - commits are a bad inter-project synchronisation point. There are too many of them, they are not ranked, everyone is
>> choosing a different commit to deploy, that effectively kills the network effects that helped making traditional releases
>> solid (because distributors used the same release state, and could share feedback and audit results).

> There are different strategies. Relevant open source projects (kernel,
> glibc, git) are pretty good at not breaking the downstream users with
> every commit.

Just pick any random kernel commit during merge windows, try to build/run it and we'll talk again ;)

What those projects are pretty good at is a clear release strategy that helps their users identify good project states which are safe to run.

Except the releasing happens outside git, and it's still fairly manual. All I'm proposing is to integrate the basic functions in git to simplify the life of those projects and help smaller projects that want completely integrated git workflows.

> If you want faster velocity, you have to couple the projects more
> (submodules or a large repo including everything)

Just try to do it. You'll get slower velocity because of the difficulty inherent in managing a large number of projects with strong git coupling. And if you tell me "I'll just not update everything in parallel all the time" you've just reinvented releasing without the help of explicit release states.

> I am not convinced, yet.
> As said initially the release handling needs
> to take more things into account (compiler version, hardware version
> of the fleet, etc) which is usually not tracked in Git. Well you
> could, but that is the job of the release management tool, no?

Yes, and it is so fun to herd hundreds of management tools with different conventions and quirks. About as much fun as managing dozens of scm before most projects settled on git. All commonalities need to migrate into the common git layer to simplify management, and release id is the first of those. Besides, the first thing those tools want is a way to identify the states to use, so they'll be the first consumers of release integration in git.

>> 1. "release versions" are first class objects that can be attached to a commit (not just freestyle tags that look like
>> versions, but may be something else entirely). Tools can identify release IDs reliably.

> git tags ?

Too loosely defined to be relied on by project-agnostic tools. That's why most tools won't ever try to use those. Anything you define around tags as they stand is unlikely to work on someone else's project.

>> 2. "release versions" have strong format constraints, that allow humans and tools to deduce their ordering without needing
>> access to something else (full git history or project-specific conventions). The usual string of numbers separated by dots
>> is probably simple and universal enough (if you start to allow letters people will try to use clever schemes like alpha or
>> roman numerals, that break automation). There needs to be at least two numbers in the string to allow tracking patchlevels.

> git tags are pretty open ended in their naming. the strictness would
> need to be enforced by the given requirement of the environment.

You've just lost.
You can't build any complex system without some level of shared conventions. If you limit the conventions to the project level you limit what you can build above the project level, starting with tooling.

> (Some
> want to have just one integer number going up; others want patch
> levels, i.e. 4 ints;

That's why I don't propose to set any constraint on the number of levels, except for a minimum (2, because in practice every project will need a point release at some time).

This is sufficient for automation, and pretty much half of what linux distros do to manage complex multi-project systems (convert loosely under-specified versioning to chains of numbers that deb or rpm can understand). And distributions manage to do that because that's already pretty much the release id convention everyone uses, with minor variations.

> yet others want dates?)

That never worked so well: half the time you miss the date because of delays and it's too late to change the naming everyone expects. But anyway, nothing prevents you from using 2017.10.21.0 as release id, the proposed scheme allows this.

>> 3. several such objects can be attached to a commit (a project may wish to promote a minor release to major one after it
>> passes QA, versioning history should not be lost).

> Multiple git tags can be attached to the same commit. You can even tag
> a tag or tag a blob.

Again, the problem with tags is that they can be anything: you can't rely on a tag being a release id, you can't rely on a tag having ordering constraints, you can't build any tooling around those above the project level.

>> 4. absent human intervention the release state of a repo is initialised at 0.0, for its first commit (tools can rely on at
>> least one release existing in a repo).

> An initial repo doesn't have tags, which comes close to 0.

And it's not defined anywhere, so some will insist history starts at 1 or at -52 BC or whatever.
Explicit convention enforced by tooling that other tools can rely on trumps implicit convention that can be argued to death all the time.

>> 5. a command, such as "git release", allows a human with control of the repo to set an explicit release version to a commit.

> This sounds fairly specific to an environment that you are in, maybe
> write git-release for your environment and then open source it. The
> world will love it (assuming they have the same environment and
> needs).

If you take the time to look at it, it is not specific, it is generic.

But anyway, yet another project bubble presents no value. The value of conventions is that they are shared, not that they are better than the neighbour's. I'll applaud anything done at git level because all the other tools and humans rely on this level. I'm sick of looking at conversion heuristics between higher-level tools, needed because they can't rely on scm-level conventions.

>> 9. a command, such as "git release cut",

> git-archive comes to mind, doing a subset here.

It is not complex to do. The value is not in its complexity, the value is in setting conventions others can rely on.

>> 11. when no releasing has been done in a repo for some time (I'd suggest 3 months to balance freshness with churn, it can
>> be user-overridable in repo config), git reminds its human masters at the next commit events they should think about
>> stabilizing state and cutting a release.

> This is all process specific to your environment. Consider e.g. the
> C++ standard committee tracking the C++ Standard in Git .
> https://isocpp.org/std/the-committee
> They make a release every 10 years or such, so 3 months is off!

Actually you'll find out that they do intermediary pre-standard releases way more often. 3 months is the average that works for most projects. I don't propose to set it in stone, just as a sane default.
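To make the ordering argument of point 2 concrete, here is a minimal Python sketch of how a project-agnostic tool could validate and rank such release ids with no repo access at all. It assumes only the rule stated above (two or more dot-separated numbers); the names `parse_release_id` and `newest` are hypothetical and nothing like them exists in git today.

```python
import re

# Assumed format from point 2: at least two dot-separated non-negative integers.
_RELEASE_ID = re.compile(r"[0-9]+(?:\.[0-9]+)+")


def parse_release_id(text):
    """Return the release id as a tuple of ints, or raise ValueError."""
    if not _RELEASE_ID.fullmatch(text):
        raise ValueError(f"not a valid release id: {text!r}")
    return tuple(int(part) for part in text.split("."))


def newest(release_ids):
    """Pick the highest release id using plain tuple comparison."""
    return max(release_ids, key=parse_release_id)


# Hypothetical ids, including the date-like one mentioned above.
ids = ["2.10", "0.0", "2017.10.21.0", "2.3.4"]
print(sorted(ids, key=parse_release_id))
print(newest(["2.3.4", "2.10"]))
```

Comparing tuples of integers ranks 2.10 above 2.3.4, which a plain string sort would get wrong. One consequence of this particular choice is that 2.3 sorts before 2.3.0; a real convention would have to say whether those are the same release.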
> Integrating with CI and release is definitely important, but Git
> itself has no idea about the requirements and environments of the
> project specifics,

The proposal is not just about CI. The life of software does not end when a dev pushes code to CI. You need to identify software during its whole lifecycle, and the id needs to start in the scm, because that's where the lifecycle starts.

Right now the only shared id git offers that does not depend on the project environment is the commit hash, and it is terrible in the post-dev stages of the lifecycle.

Regards,

-- 
Nicolas Mailhot