git@vger.kernel.org mailing list mirror (one of many)
From: "Philip Oakley" <philipoakley@iee.org>
To: "Peter Backes" <rtc@helen.PLASMA.Xg8.DE>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Git Mailing List" <git@vger.kernel.org>
Subject: Re: GDPR compliance best practices?
Date: Sun, 3 Jun 2018 23:28:43 +0100
Message-ID: <5F80881E35F941E88D9C84565C437607@PhilipOakley>
In-Reply-To: 20180603174617.GA10900@helen.PLASMA.Xg8.DE

From: "Peter Backes" <rtc@helen.PLASMA.Xg8.DE>
> On Sun, Jun 03, 2018 at 04:28:31PM +0100, Philip Oakley wrote:
>> In most Git cases that legal/legitimate purpose is the copyright licence,
>> and/or corporate employment. That is, Jane wrote it, hence X has a legal
>> rights of use, and we need to have a record of that (Jane wrote it) as
>> evidence of that (I'm X, I can use it) right. That would mean that Jane
>> cannot just ask to have that record removed and expect it to be removed.
>
> Re corporate employment:
>
> For sure nobody would dare to question that a company has a right to
> keep an internal record that Jane wrote it.
>
> The issue is publishing that information. This is an entirely different
> story.

It is here that Article 6 kicks in as to whether the 'organisation' can 
retain the data and continue to use it.
https://gdpr-info.eu/art-6-gdpr/
https://ico.org.uk/for-organisations/guide-to-the-general-data-protection-regulation-gdpr/lawful-basis-for-processing/
https://www.lawscot.org.uk/news-and-events/news/gdpr-legal-basis-and-why-it-matters/

For an open source project with an open source licence, an implicit DCO
applies to the meta data. It is the legal basis for the release.

If a company has a closed source project, then yes, openly publishing
that personal data within a repo's meta data would be incorrect, even
though the internal repo would be kept.


>
> I already stressed that from the very beginning.
>
> Re copyright license:
>
> No, a copyright license does not provide a legitimization.
>
> - copyright is about distributing the program, not about distributing
> version control metadata.

It is specifically about giving that right to copy by Jane Doe (but git gives
no information other than that supposedly globally unique 'author
email').

>
> - Being named is a right, not an obligation of the author. Hence, if
> the author doesn't want his name published, the company doesn't have
> legitimate grounds based in copyright for doing it anyway, against his
> or her will.

Git for Open Source is about open licensing by name. I'd agree that a closed
corporate licence stays closed, but not forgotten.

>
>> From a personal view, many folk want it to be that corporates (and open
>> source organisations) should hold no personal information without having
>> explicit permission that can then be withdrawn, with deletion to follow.
>> However that 'legal' clause does [generally] win.
>
> Let's be honest: We do not know what legitimization exactly in each
> specific case the git metadata is being distributed under.

We should know, already. A specific licence [or limit] should be in place. 
We don't really want to have to let a court decide ;-)

>
> It may be copyright, it may be employment, but it may also be revocable
> consent. That is, we cannot safely assume that no git user will ever
> have to deal with a legitimate request based on the right to be
> forgotten.
>

The law is never decided by technical means, unfortunately. Regular git
users should have no issues - they just need to point their finger at the
responsible authority. (Beware, though, of the one-way trap door: the
user's mistakes can become the problem of the responsible authority!)


>> In the git.git case (and linux.git) there is the DCO (to back up the GPL2)
>> as an explicit requirement/certification that puts the information into the
>> legal evidence category. IIUC almost all copyright ends up with a similar
>> evidential trail for the meta data.
>
> This makes things more complicated, not less. You have yet more meta
> data to cope with, yet more opportunities to be bitten by the right to
> be forgotten. Since I proposed a list of metadata where each entry can
> be anonymized independently of each other, it would be able to deal
> with this perfectly.

The DCO/GPL2 are the legitimate data record that recipients should have for 
their copy. There is no right to be forgotten at that point.

>
>> The more likely problem is if the content of the repo, rather than the meta
>> data, is subject to GDPR, and that could easily ruin any storage method.
>> Being able to mark an object as <Lost/Deleted> would help here(*).
>
> My proposal supports any part of the commit, including the contents of
> individual files, as eraseable, yet verifiable data.
>
>> Also remember that most EU legislation is 'intent' based, rather than
>> 'letter of', for the style of legal arguments (which is where some of the UK
>> Brexit misunderstandings come from), so it is more than possible to get into
>> the situation where an action is both mandated and illegal at the same time,
>> so plenty of snake oil salesmen continue to sell magic fixes according to the
>> customers' local biases.
>
> This may be true. I am not trying to sell snake oil, however. To have
> erasure and verifiability at the same time is a highly generic feature
> that may be desirable to have for a multitude of reasons, including but
> not limited to legal ones like GDPR and copyright violations.
>
>> I do not believe Git has anything to worry about that wasn't already an
>> issue.
>
> Yes, but it definitely had and still does have something to worry about.
>
> git should provide technical means to deal with this. I provided a
> proposal based on anonymization that does not in any way have any
> drawback compared to the status quo, except a slight increase in
> metadata size and various degrees of backwards incompatibility,
> depending on how it is implemented.
>
> What do you think about my proposal as a solution for the problem?

I see the solution as being elsewhere, and this as in some ways a strawman
discussion: "if someone has the right to be forgotten, how do we delete the
meta data", when that right (to delete the meta data in a properly licensed
repo) does not exist.

That said, the problem of maintaining repo integrity when some objects must
be deleted or re-written (because they had stored personal info that they
should not have) will require a little bit extra on the side.

But this is open source, so ideas, and code, will come forward that allow
things like 'replaced commits' to be formally part of a repo, and its leading
oid (or maybe it's an oid pair) will handle that. I'd guess that the commit
will have an extra line after the parents and tree lines that details (in
some manner) the 'replaced' things, so that fsck still works, the oid is
complete and thus the whole shebang can be verified.

>
> You provide a lot of arguments about why it is not a necessity to have
> this, but let's assume it is; is there any actual problem you see with
> the proposal, except that someone would have to implement it?

It's the strawman problem. If it was a real 'real issue' then it would have
already shown up with companies clamouring to pay folk to fix our (git's)
latest problem. But they haven't, so I think it's a much more balanced issue.
--
Philip





Thread overview: 53+ messages
2018-04-17 19:15 GDPR compliance best practices? Peter Backes
2018-04-17 21:38 ` Ævar Arnfjörð Bjarmason
2018-04-17 23:25   ` Peter Backes
2018-06-03  9:27   ` Peter Backes
2018-06-03 10:45     ` Ævar Arnfjörð Bjarmason
2018-06-03 11:25       ` Peter Backes
2018-06-03 12:59         ` Ævar Arnfjörð Bjarmason
2018-06-03 14:18           ` Peter Backes
2018-06-03 15:28             ` Philip Oakley
2018-06-03 17:46               ` Peter Backes
2018-06-03 18:18                 ` Theodore Y. Ts'o
2018-06-03 19:11                   ` Peter Backes
2018-06-03 19:24                     ` Peter Backes
2018-06-03 20:07                       ` Theodore Y. Ts'o
2018-06-03 20:52                         ` Peter Backes
2018-06-03 21:03                           ` Theodore Y. Ts'o
2018-06-03 22:16                             ` Peter Backes
2018-06-04 13:47                               ` Theodore Y. Ts'o
2018-06-04 18:22                                 ` Peter Backes
2018-06-03 22:28                 ` Philip Oakley [this message]
2018-06-03 23:01                   ` Peter Backes
2018-06-04 12:24                     ` Philip Oakley
2018-06-07  1:38                 ` David Lang
2018-06-07  6:32                   ` Peter Backes
2018-06-07 21:28                     ` Philip Oakley
2018-06-07 22:34                       ` Peter Backes
2018-06-07 22:38                         ` David Lang
2018-06-07 23:21                           ` Peter Backes
2018-06-07 23:53                             ` David Lang
2018-06-08  6:16                               ` Peter Backes
2018-06-08  7:42                                 ` David Lang
2018-06-08 11:58                                   ` Peter Backes
2018-06-08 18:51                                     ` David Lang
2018-06-12 18:56                                       ` David Lang
2018-06-12 19:12                                         ` Peter Backes
2018-06-12 19:16                                           ` Martin Fick
2018-06-13 14:12                                           ` Theodore Y. Ts'o
2018-06-13 14:48                                             ` Peter Backes
2018-06-08  2:53                             ` Theodore Y. Ts'o
2018-06-08  6:26                               ` Peter Backes
2018-06-08  8:13                                 ` Ævar Arnfjörð Bjarmason
2018-06-08 12:03                                   ` Peter Backes
2018-06-08 22:53                                     ` Ævar Arnfjörð Bjarmason
2018-06-08 14:45                                 ` Theodore Y. Ts'o
2018-06-08 16:02                                   ` Peter Backes
2018-06-08 22:09                               ` Johannes Sixt
2018-06-09 22:50                               ` Philip Oakley
2018-06-10  1:41                                 ` Theodore Y. Ts'o
2018-06-03 17:54               ` Philip Oakley
2018-06-03 19:48             ` Ævar Arnfjörð Bjarmason
2018-06-03 20:24               ` Peter Backes
2018-06-08 22:42 ` Jonathan Nieder
2018-06-08 23:00   ` Ævar Arnfjörð Bjarmason
