From: Elijah Newren <newren@gmail.com>
To: "brian m. carlson" <sandals@crustytoothpaste.net>,
Elijah Newren <newren@gmail.com>,
Christian Couder <christian.couder@gmail.com>,
git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
Taylor Blau <me@ttaylorr.com>,
Rick Sanders <rick@sfconservancy.org>,
Git at SFC <git@sfconservancy.org>,
Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Patrick Steinhardt <ps@pks.im>,
Christian Couder <chriscool@tuxfamily.org>
Subject: Re: [PATCH v2] SubmittingPatches: add section about AI
Date: Tue, 7 Oct 2025 21:18:09 -0700
Message-ID: <CABPp-BHNaWdjkFuWs7uHdNweuurDGhb3DOrseSZmAEQnCEZFgw@mail.gmail.com>
In-Reply-To: <aOBMHqLxNd86vgjH@fruit.crustytoothpaste.net>

On Fri, Oct 3, 2025 at 3:20 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2025-10-03 at 20:48:40, Elijah Newren wrote:
> > Would this mean that you wanted to ban contributions like d12166d3c8bb
> > (Merge branch 'en/docfixes', 2023-10-23), available on the list over
> > at https://lore.kernel.org/git/pull.1595.git.1696747527.gitgitgadget@gmail.com/
> > ? We don't need to go theoretical, I've already contributed such a
> > patch series before -- 2 years ago -- and it was merged. Granted,
> > that was entirely documentation, and I called out the usage of AI in
> > the cover letter, and I manually checked every change (discarding many
> > of them) and split it into commits on my own, could easily explain any
> > change and why it was good, etc. And I was upfront about all of it.
>
> I think the main problem here is that we don't know the copyright
> status of LLM outputs. It is not uncommon for them to produce output
> that reflects their training input and we see evidence of that in, for
> instance, the New York Times lawsuit against OpenAI.
>
> As I said, the situation is very unclear legally, with active litigation
> in multiple countries, and we have to comply with pretty much every
> country's laws in this situation. Whether something is legal in the
> United States, where you're located, is completely irrelevant to whether
> it is legal in Canada, where I'm located, or Germany or the UK, where we
> have other contributors. We also have to consider whether it's legal in
> all of the countries that Git is distributed in, which includes every
> country in which Debian has a mirror[0], even countries under
> international sanctions, such as Iran, Russia, and Belarus.
>
> It doesn't matter if the person using AI has indemnification, either,
> since that only covers civil matters, and at least in the U.S. and
> Canada, knowingly violating copyright is also a criminal offence.
>
> The sign-off process is designed to clearly state that a person has the
> ability to contribute code under the license and I don't think, as
> things stand, it's possible to make that assertion with code or
> documentation generated from an LLM except in very limited
> circumstances. I don't allow LLM-generated code in my personal projects
> that require sign-off for that reason, and neither does QEMU[1]. I
> don't think I could honestly assert either (a) or (b) in the DCO with
> LLM-generated code because it's not clear to me whether "I have the
> right to submit it under the…license."
>
> To quote the QEMU policy:
>
>   To satisfy the DCO, the patch contributor has to fully understand the
>   copyright and license status of content they are contributing to QEMU. With AI
>   content generators, the copyright and license status of the output is
>   ill-defined with no generally accepted, settled legal foundation.
>
>   Where the training material is known, it is common for it to include large
>   volumes of material under restrictive licensing/copyright terms. Even where
>   the training material is all known to be under open source licenses, it is
>   likely to be under a variety of terms, not all of which will be compatible
>   with QEMU's licensing requirements.
>
> I remember the SCO situation with Linux and how it really created a lot
> of uncertainty with Linux because SCO created FUD around Linux licensing
> and how that led to the DCO being created. I am aware of the fact that
> many open source contributors are very unhappy that their code has been
> used to train LLMs without retaining credits and copyright notices or
> honouring the license terms[2]. And I have spent many years working
> with non-profits[3], where I have always been taught that we should
> avoid even the appearance of impropriety.
>
> It may matter less what the situation actually ends up being legally
> (although it could end up being quite bad) and more whether someone can
> imply or suggest that Git is not being distributed in compliance with
> the license or contains infringing code, which could effectively make it
> undistributable because nobody wants to take that risk. And litigation,
> even if Git and its contributors are successful, can be extraordinarily
> expensive.
>
> So I think, given the circumstances, yes, the right thing to do is to
> ban LLM-generated contributions with a policy very similar or identical
> to QEMU's. If, in the future, the legal situation changes and it
> becomes unambiguously legal to use LLMs across the world, then we can
> reconsider that policy then.
>
> [0] https://www.debian.org/mirror/list
> [1] https://github.com/qemu/qemu/commit/3d40db0efc22520fa6c399cf73960dced423b048
> [2] Regardless of the legal concerns, this implicates professional
> ethics concerns, such as §1.5 of the ACM Code of Ethics[4]. Ethics
> requirements usually go well beyond what the law requires.
> [3] Software Freedom Conservancy, which handles legal matters for the
> Git project, is a non-profit.
> [4] https://www.acm.org/code-of-ethics

Thanks for clarifying your position. To me, your preferred wording
for the position statement doesn't quite match the rationale. Consider
uses such as:

  * fixing typos
  * finding wording tweaks to existing documentation
  * tab completion of e.g. the next three lines in an IDE, when limited
    to what most any engineer in the world would write based on the
    comment on the line before (or, if the AI plugin doesn't quite get
    the three lines right, well, I already had them in my head, and if
    it gets close enough, it's easier for me to accept and then edit
    into what I already knew I wanted)
  * assisting with wording in writing a commit message as an editor
    (or maybe even suggesting some initial wording based on the patch I
    already wrote)
  * identifying potential bugs in a patch
  * identifying potential typos in documentation

I think none of these particular uses causes problems for the rationale
you specify, but at least the first four would be disallowed by the
preferred wording you want, and perhaps even the last two wouldn't be
allowed either (though I don't think AI is very good at the second to
last one, so not a big loss on that particular one yet). Perhaps, due
to my incomplete understanding of copyright, all of these would
actually be problematic under the rationale you already gave, for
reasons I don't yet know about or just haven't yet understood; but if
not, I'd rather not disallow these kinds of uses.

The first two from my list have a good example in the form of the
series at d12166d3c8bb (Merge branch 'en/docfixes', 2023-10-23) [or on
the list at https://lore.kernel.org/git/pull.1595.git.1696747527.gitgitgadget@gmail.com/],
which was already merged two years ago. So if we adopt wording
that disallows these kinds of changes, then we also need to talk about
whether we grandfather already-merged series or proactively revert
them.