git@vger.kernel.org mailing list mirror (one of many)
* Is git compliant with GDPR?
@ 2020-07-02 15:58 Jakub Trzebiatowski
  2020-07-02 16:28 ` Jason Pyeron
  0 siblings, 1 reply; 10+ messages in thread
From: Jakub Trzebiatowski @ 2020-07-02 15:58 UTC (permalink / raw)
  To: git

Hello,

I've been using git for years, but I've never before taken part in the
discussion on the mailing list. I have a simple question, which
probably isn't easy to answer.

Is git compliant with GDPR, the EU data protection law?

Before I'm able to commit with git, I'm asked for my first and last
name. That is personal data.

GDPR, Article 4, point (1):
‘personal data’ means any information relating to an identified or
identifiable natural person (‘data subject’); [...]

That data is handled by the git utility. It's sent to other parties
operating remote git servers (as a result of my commands, but as far
as I know that's not relevant). It sounds like it's being processed.

GDPR, Article 4, point (2):
‘processing’ means any operation or set of operations which is
performed on personal data or on sets of personal data, whether or not
by automated means, such as collection, recording, organisation,
structuring, storage, adaptation or alteration, retrieval,
consultation, use, disclosure by transmission, dissemination or
otherwise making available, alignment or combination, restriction,
erasure or destruction;

This data is processed with a compatible computer owned by the end
user for the purpose of identification of git commits. It's sent to
other parties only when specific commands are given. All this was
defined by git authors/contributors (from all around the world).

GDPR, Article 4, point (7):
‘controller’ means the natural or legal person, public authority,
agency or other body which, alone or jointly with others, determines
the purposes and means of the processing of personal data; [...]

Git authors can be considered joint controllers.

If we'd assume the above interpretations, there would be many, many
consequences.

I'm not a lawyer, and I have no idea if this interpretation is
reasonable. I don't even know if I'd like it to be. But here are some
facts: GDPR does focus on protecting the end user. Possibly, it's the
most strict data protection law in the world. It doesn't care how
difficult it is to adjust the organisation for compliance and it
doesn't care where the controller is located, as long as it processes
personal data of EU citizens (if I understand it correctly).

Are there any lawyers in the git community? Could The Linux Foundation
help with legal support? It's a very non-trivial issue. It's non
obvious how local software relates to GDPR, and it's even more
difficult with Free/Open Source software with many, many authors. But
if the aforementioned interpretation was assumed, the git authors
could be held responsible for non-compliance.

Best,
Jakub

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: Is git compliant with GDPR?
  2020-07-02 15:58 Is git compliant with GDPR? Jakub Trzebiatowski
@ 2020-07-02 16:28 ` Jason Pyeron
  2020-07-02 16:40   ` Randall S. Becker
  2020-07-02 17:06   ` Jakub Trzebiatowski
  0 siblings, 2 replies; 10+ messages in thread
From: Jason Pyeron @ 2020-07-02 16:28 UTC (permalink / raw)
  To: git; +Cc: Matthew Horowitz, 'Jakub Trzebiatowski'

> -----Original Message-----
> From: Jakub Trzebiatowski
> Sent: Thursday, July 2, 2020 11:58 AM
> 
> Hello,

First: I am not a lawyer, and even if I were, I (nor anyone else on this list) would not be your lawyer - get a lawyer.

Second: This thread is likely borderline off topic because for Git and GDPR to meet, it would be in the context of SaaS or your internal organization. There is almost nothing pure Git about these issues, see below. Discussion for the sake of it follows.

> 
> I've been using git for years, but I've never before taken part in the
> discussion on the mailing list. I have a simple question, which
> probably isn't easy to answer.
> 
> Is git compliant with GDPR, the EU data protection law?
> 
> Before I'm able to commit with git, I'm asked for my first and last
> name. That is personal data.
> 
> GDPR, Article 4, point (1):
> ‘personal data’ means any information relating to an identified or
> identifiable natural person (‘data subject’); [...]
> 
> That data is handled by the git utility. It's sent to other parties
> operating remote git servers (as a result of my commands, but as far
> as I know that's not relevant). It sounds like it's being processed.

Git is like a hard drive or database in your organization. It does not do anything else than store the information.

Exception 1: IF you configure it to do so.

Exception 2: You are using a SaaS provider (e.g. github.com, gitlab.com, etc.)

Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.).

> 
> GDPR, Article 4, point (2):
> ‘processing’ means any operation or set of operations which is
> performed on personal data or on sets of personal data, whether or not
> by automated means, such as collection, recording, organisation,
> structuring, storage, adaptation or alteration, retrieval,
> consultation, use, disclosure by transmission, dissemination or
> otherwise making available, alignment or combination, restriction,
> erasure or destruction;
> 
> This data is processed with a compatible computer owned by the end
> user for the purpose of identification of git commits. It's sent to
> other parties only when specific commands are given. All this was
> defined by git authors/contributors (from all around the world).
> 

Again, like any database, you can query it for its contents. What you put in it is what it has. If you put personal data in, then it is there.

Where can data reside in Git?

1. The blobs - e.g. your source code

2. The commit messages.

#2 is your most likely candidate of GDPR related activities.

Do you use the developers' names and email addresses in the message? Almost certainly.

Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.).
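
To make the two locations above concrete: in Git's object format, the author and committer identities are header fields of the commit object, stored ahead of the free-text message. The following is a minimal sketch, not Git's own code. The `RAW_COMMIT` text is a made-up example in the layout that `git cat-file -p <commit>` prints, and the `identities` helper is a hypothetical name introduced only for illustration.

```python
import re

# Hypothetical raw commit object (assumption: made-up identity and message),
# in the layout printed by `git cat-file -p <commit>`.
RAW_COMMIT = """\
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
author Jane Doe <jane@example.com> 1593705480 +0200
committer Jane Doe <jane@example.com> 1593705480 +0200

Fix typo in README
"""

# One identity line: role, name, <email>, Unix timestamp, timezone offset.
IDENT = re.compile(r"^(author|committer) (.*) <(.*)> (\d+) ([+-]\d{4})$")


def identities(raw):
    """Return {role: (name, email)} found in the header of a raw commit."""
    # The header ends at the first blank line; the message follows it.
    header, _, _message = raw.partition("\n\n")
    found = {}
    for line in header.splitlines():
        match = IDENT.match(line)
        if match:
            found[match.group(1)] = (match.group(2), match.group(3))
    return found
```

Calling `identities(RAW_COMMIT)` yields one entry for `author` and one for `committer`. Anything typed into the message itself (for example a sign-off line) would be an additional place where a name can appear.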

> GDPR, Article 4, point (7):
> ‘controller’ means the natural or legal person, public authority,
> agency or other body which, alone or jointly with others, determines
> the purposes and means of the processing of personal data; [...]
> 
> Git authors can be considered joint controllers.
> 

The Git distributed model means that COPIES of all of the data are on each Git server and developer environment. You (and I mean your organization) must address this in your IT plans.

Note: this is no different than many other SCMs, although some other SCM technologies only have the most recent version locally.

> If we'd assume the above interpretations, there would be many, many
> consequences.
> 
> I'm not a lawyer, and I have no idea if this interpretation is
> reasonable. I don't even know if I'd like it to be. But here are some
> facts: GDPR does focus on protecting the end user. Possibly, it's the
> most strict data protection law in the world. It doesn't care how
> difficult it is to adjust the organisation for compliance and it
> doesn't care where the controller is located, as long as it processes
> personal data of EU citizens (if I understand it correctly).
> 
> Are there any lawyers in the git community? Could The Linux Foundation
> help with legal support? It's a very non-trivial issue. It's non
> obvious how local software relates to GDPR, and it's even more
> difficult with Free/Open Source software with many, many authors. But
> if the aforementioned interpretation was assumed, the git authors
> could be held responsible for non-compliance.


I have copied our Policy SME, maybe he will have opinions.

-Jason


* RE: Is git compliant with GDPR?
  2020-07-02 16:28 ` Jason Pyeron
@ 2020-07-02 16:40   ` Randall S. Becker
  2020-07-03  6:22     ` demerphq
  2020-07-02 17:06   ` Jakub Trzebiatowski
  1 sibling, 1 reply; 10+ messages in thread
From: Randall S. Becker @ 2020-07-02 16:40 UTC (permalink / raw)
  To: 'Jason Pyeron', git
  Cc: 'Matthew Horowitz', 'Jakub Trzebiatowski'

On July 2, 2020 12:28 PM, Jason Pyeron wrote:
> Subject: RE: Is git compliant with GDPR?
> > -----Original Message-----
> > From: Jakub Trzebiatowski
> > Sent: Thursday, July 2, 2020 11:58 AM
> >
> First: I am not a lawyer, and even if I were, I (nor anyone else on this list)
> would not be your lawyer - get a lawyer.
> 
> Second: This thread is likely borderline off topic because for Git and GDPR to
> meet, it would be in the context of SaaS or your internal organization. There
> is almost nothing pure Git about these issues, see below. Discussion for the
> sake of it follows.
> 
> >
> > I've been using git for years, but I've never before taken part in the
> > discussion on the mailing list. I have a simple question, which
> > probably isn't easy to answer.
> >
> > Is git compliant with GDPR, the EU data protection law?
> >
> > Before I'm able to commit with git, I'm asked for my first and last
> > name. That is personal data.
> >
> > GDPR, Article 4, point (1):
> > ‘personal data’ means any information relating to an identified or
> > identifiable natural person (‘data subject’); [...]
> >
> > That data is handled by the git utility. It's sent to other parties
> > operating remote git servers (as a result of my commands, but as far
> > as I know that's not relevant). It sounds like it's being processed.
> 
> Git is like a hard drive or database in your organization. It does not do
> anything else than store the information.
> 
> Exception 1: IF you configure it to do so.
> 
> Exception 2: You are using a SaaS provider (e.g. github.com, gitlab.com, etc.)
> 
> Note: this is no different than any other SCM (e.g. CVS, Subversion, file
> shares, etc.).
> 
> >
> > GDPR, Article 4, point (2):
> > ‘processing’ means any operation or set of operations which is
> > performed on personal data or on sets of personal data, whether or not
> > by automated means, such as collection, recording, organisation,
> > structuring, storage, adaptation or alteration, retrieval,
> > consultation, use, disclosure by transmission, dissemination or
> > otherwise making available, alignment or combination, restriction,
> > erasure or destruction;
> >
> > This data is processed with a compatible computer owned by the end
> > user for the purpose of identification of git commits. It's sent to
> > other parties only when specific commands are given. All this was
> > defined by git authors/contributors (from all around the world).
> >
> 
> Again, like any database, you can query it for its contents. What you put in it
> is what it has. If you put personal data in, then it is there.
> 
> Where can data reside in Git?
> 
> 1. The blobs - e.g. your source code
> 
> 2. The commit messages.
> 
> #2 is your most likely candidate of GDPR related activities.
> 
> Do you use the developers names and email addresses in the message?
> Almost certainly.
> 
> Note: this is no different than any other SCM (e.g. CVS, Subversion, file
> shares, etc.).
> 
> > GDPR, Article 4, point (7):
> > ‘controller’ means the natural or legal person, public authority,
> > agency or other body which, alone or jointly with others, determines
> > the purposes and means of the processing of personal data; [...]
> >
> > Git authors can be considered joint controllers.
> >
> 
> The Git distributed model means that COPIES of all of the data are on each
> Git server and developer environment. You (and I mean your organization)
> must address this in your IT plans.
> 
> Note: this is no different than many other SCMs, although some other SCM
> technologies only have the most recent version locally.
> 
> > If we'd assume the above interpretations, there would be many, many
> > consequences.
> >
> > I'm not a lawyer, and I have no idea if this interpretation is
> > reasonable. I don't even know if I'd like it to be. But here are some
> > facts: GDPR does focus on protecting the end user. Possibly, it's the
> > most strict data protection law in the world. It doesn't care how
> > difficult it is to adjust the organisation for compliance and it
> > doesn't care where the controller is located, as long as it processes
> > personal data of EU citizens (if I understand it correctly).
> >
> > Are there any lawyers in the git community? Could The Linux Foundation
> > help with legal support? It's a very non-trivial issue. It's non
> > obvious how local software relates to GDPR, and it's even more
> > difficult with Free/Open Source software with many, many authors. But
> > if the aforementioned interpretation was assumed, the git authors
> > could be held responsible for non-compliance.
> 
> 
> I have copied our Policy SME, maybe he will have opinions.

I am not speaking for the Git Foundation here, nor am I a lawyer. However, to draw on some practices from some of my customers who have this concern, the team members are directed to use tokenized names and email addresses that can be resolved by their security teams during an audit. Obviously the team members recognize the tokens, so they know who is making what change. This means that externally, any names/emails that might get pushed upstream are non-identifying.
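
One way such tokens could be derived is sketched below. This is an assumption for illustration only: the thread does not say how those customers generate their tokens, and the `tokenize_identity` name, the `dev-` prefix, and the `tokens.example.invalid` domain are all hypothetical. The sketch uses a keyed hash (HMAC-SHA256), so that only someone holding the secret can recompute which token belongs to which person.

```python
import hashlib
import hmac


def tokenize_identity(name, email, secret, domain="tokens.example.invalid"):
    """Derive a stable, non-identifying (name, email) pair for commits.

    `secret` is a bytes key assumed to be held by the security team.
    The same inputs always give the same token, so history stays
    attributable internally without exposing the real identity.
    """
    identity = f"{name} <{email}>".encode("utf-8")
    digest = hmac.new(secret, identity, hashlib.sha256).hexdigest()
    token = "dev-" + digest[:12]  # shortened for readability
    return token, f"{token}@{domain}"
```

The returned pair would then be used as `user.name` and `user.email`. During an audit, the security team could resolve a token by recomputing it for each person on the staff list and looking for the match.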

The problem with this approach is that it is not global. As a result, if you want to contribute to a public project you have to self-identify, which may imply consent under GDPR. This is for the protection of the project itself as a project cannot take code from anonymous sources. If you are unwilling to share that information, do not contribute to a project.

Randall

-- Brief whoami:
 NonStop developer since approximately 211288444200000000
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.

* Re: Is git compliant with GDPR?
  2020-07-02 16:28 ` Jason Pyeron
  2020-07-02 16:40   ` Randall S. Becker
@ 2020-07-02 17:06   ` Jakub Trzebiatowski
  2020-07-02 18:38     ` Paul Smith
  2020-07-02 18:47     ` Jason Pyeron
  1 sibling, 2 replies; 10+ messages in thread
From: Jakub Trzebiatowski @ 2020-07-02 17:06 UTC (permalink / raw)
  To: Jason Pyeron; +Cc: git, Matthew Horowitz

czw., 2 lip 2020 o 18:27 Jason Pyeron <jpyeron@pdinc.us> napisał(a):
>
> > -----Original Message-----
> > From: Jakub Trzebiatowski
> > Sent: Thursday, July 2, 2020 11:58 AM
> >
> > Hello,
>
> First: I am not a lawyer, and even if I were, I (nor anyone else on this list) would not be your lawyer - get a lawyer.
I don't think I'm in need of a lawyer. I wanted to start a discussion
on a topic that in my opinion deserves being discussed, because I'm a
git user and I believe it's interesting.
>
> Second: This thread is likely borderline off topic because for Git and GDPR to meet, it would be in the context of SaaS or your internal organization. There is almost nothing pure Git about these issues, see below. Discussion for the sake of it follows.

I do agree that that sounds reasonable. But could I ask you why do you
assume that there needs to be a service (or Software as a Service) to
make software fall under GDPR? The GDPR definitions don't seem to
mention that.

> >
> > I've been using git for years, but I've never before taken part in the
> > discussion on the mailing list. I have a simple question, which
> > probably isn't easy to answer.
> >
> > Is git compliant with GDPR, the EU data protection law?
> >
> > Before I'm able to commit with git, I'm asked for my first and last
> > name. That is personal data.
> >
> > GDPR, Article 4, point (1):
> > ‘personal data’ means any information relating to an identified or
> > identifiable natural person (‘data subject’); [...]
> >
> > That data is handled by the git utility. It's sent to other parties
> > operating remote git servers (as a result of my commands, but as far
> > as I know that's not relevant). It sounds like it's being processed.
>
> Git is like a hard drive or database in your organization. It does not do anything else than store the information.

Storing is processing. I'm not saying that git is evil or wrong, I'm
saying that it might be the case that it processes personal data (both
understood as in GDPR).

git is also a software created by people and used by people.

>
> Exception 1: IF you configure it to do so.

Sure, it doesn't change much. Processing data initiated by the user
isn't any kind of distinguished processing, as far as I know.

>
> Exception 2: You are using a SaaS provider (e.g. github.com, gitlab.com, etc.)
>
> Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.).

I'm totally aware. I know how git works, including some of the
internals, and I'm in general aware of standard solutions in the IT
industry. Probably if git would be considered non-compliant, then so
would be other SCMs.

>
> >
> > GDPR, Article 4, point (2):
> > ‘processing’ means any operation or set of operations which is
> > performed on personal data or on sets of personal data, whether or not
> > by automated means, such as collection, recording, organisation,
> > structuring, storage, adaptation or alteration, retrieval,
> > consultation, use, disclosure by transmission, dissemination or
> > otherwise making available, alignment or combination, restriction,
> > erasure or destruction;
> >
> > This data is processed with a compatible computer owned by the end
> > user for the purpose of identification of git commits. It's sent to
> > other parties only when specific commands are given. All this was
> > defined by git authors/contributors (from all around the world).
> >
>
> Again, like any database, you can query it for its contents. What you put in it is what it has. If you put personal data in, then it is there.

It's not a general purpose database, it's a structured database and a
software that operates on that database. That database has a field for
personal data, and that data is processed by the software.

> Where can data reside in Git?
>
> 1. The blobs - e.g. your source code
>
> 2. The commit messages.
>
> #2 is your most likely candidate of GDPR related activities.
>
> Do you use the developers names and email addresses in the message? Almost certainly.
>
> Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.).
>
> > GDPR, Article 4, point (7):
> > ‘controller’ means the natural or legal person, public authority,
> > agency or other body which, alone or jointly with others, determines
> > the purposes and means of the processing of personal data; [...]
> >
> > Git authors can be considered joint controllers.
> >
>
> The Git distributed model means that COPIES of all of the data are on each Git server and developer environment. You (and I mean your organization) must address this in your IT plans.
>
> Note: this is no different than many other SCMs, although some other SCM technologies only have the most recent version locally.
>
> > If we'd assume the above interpretations, there would be many, many
> > consequences.
> >
> > I'm not a lawyer, and I have no idea if this interpretation is
> > reasonable. I don't even know if I'd like it to be. But here are some
> > facts: GDPR does focus on protecting the end user. Possibly, it's the
> > most strict data protection law in the world. It doesn't care how
> > difficult it is to adjust the organisation for compliance and it
> > doesn't care where the controller is located, as long as it processes
> > personal data of EU citizens (if I understand it correctly).
> >
> > Are there any lawyers in the git community? Could The Linux Foundation
> > help with legal support? It's a very non-trivial issue. It's non
> > obvious how local software relates to GDPR, and it's even more
> > difficult with Free/Open Source software with many, many authors. But
> > if the aforementioned interpretation was assumed, the git authors
> > could be held responsible for non-compliance.
>
>
> I have copied our Policy SME, maybe he will have opinions.
>
> -Jason
>

In general, I totally agree with everything you said.

But you said that git itself (as a software) doesn't fall under GDPR,
and that's the only thing I'm not sure about. I was wondering if
someone with a deeper understanding of GDPR would tell me _why_.
Because when interpreting the law literally, it sounds like it does.

Also, to clarify, I'm not seeking legal advice for myself or my organization.

* Re: Is git compliant with GDPR?
  2020-07-02 17:06   ` Jakub Trzebiatowski
@ 2020-07-02 18:38     ` Paul Smith
  2020-07-02 19:25       ` Jason Pyeron
  2020-07-02 18:47     ` Jason Pyeron
  1 sibling, 1 reply; 10+ messages in thread
From: Paul Smith @ 2020-07-02 18:38 UTC (permalink / raw)
  To: Jakub Trzebiatowski, Jason Pyeron; +Cc: git, Matthew Horowitz

On Thu, 2020-07-02 at 19:06 +0200, Jakub Trzebiatowski wrote:
> But you said that git itself (as a software) doesn't fall under GDPR,
> and that's the only thing I'm not sure about. I was wondering if
> someone with a deeper understanding of GDPR would tell me _why_.
> Because when interpreting the law literally, it sounds like it does.

You might be interested in reading the conversation that was had on
this list the last time this subject was raised, in 2018:

https://public-inbox.org/git/5587534.o6tcmYBVvN@mfick-lnx/T/

I can't say whether it will satisfy you or not.


* RE: Is git compliant with GDPR?
  2020-07-02 17:06   ` Jakub Trzebiatowski
  2020-07-02 18:38     ` Paul Smith
@ 2020-07-02 18:47     ` Jason Pyeron
  1 sibling, 0 replies; 10+ messages in thread
From: Jason Pyeron @ 2020-07-02 18:47 UTC (permalink / raw)
  To: git; +Cc: 'Matthew Horowitz', 'Jakub Trzebiatowski'

> -----Original Message-----
> From: Jakub Trzebiatowski
> Sent: Thursday, July 2, 2020 1:06 PM
> 
> czw., 2 lip 2020 o 18:27 Jason Pyeron napisał(a):
> >
> > > -----Original Message-----
> > > From: Jakub Trzebiatowski
> > > Sent: Thursday, July 2, 2020 11:58 AM
> > >
> > > Hello,
> >
> > First: I am not a lawyer, and even if I were, I (nor anyone else on this list) would not be your
> lawyer - get a lawyer.
> I don't think I'm in need of a lawyer. I wanted to start a discussion
> on a topic that in my opinion deserves being discussed, because I'm a
> git user and I believe it's interesting.
> >
> > Second: This thread is likely borderline off topic because for Git and GDPR to meet, it would be in
> the context of SaaS or your internal organization. There is almost nothing pure Git about these
> issues, see below. Discussion for the sake of it follows.
> 
> I do agree that that sounds reasonable. But could I ask you why do you
> assume that there needs to be a service (or Software as a Service) to
> make software fall under GDPR? The GDPR definitions don't seem to
> mention that.

You will need to read the whole GDPR and understand it, which is no small task. I feel it does; the GDPR says:

‘controller’ means the natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data; where the purposes and means of such processing are determined by Union or Member State law, the controller or the specific criteria for its nomination may be provided for by Union or Member State law;

‘processor’ means a natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller;

Here your question seems to extend "legal person" from the organization, to its systems, and further to the software (e.g. Git) running on those systems.

Whereas a SaaS provider is a legal person subject to GDPR or is a "Third Party".

> 
> > >
> > > I've been using git for years, but I've never before taken part in the
> > > discussion on the mailing list. I have a simple question, which
> > > probably isn't easy to answer.
> > >
> > > Is git compliant with GDPR, the EU data protection law?
> > >
> > > Before I'm able to commit with git, I'm asked for my first and last
> > > name. That is personal data.
> > >
> > > GDPR, Article 4, point (1):
> > > ‘personal data’ means any information relating to an identified or
> > > identifiable natural person (‘data subject’); [...]
> > >
> > > That data is handled by the git utility. It's sent to other parties
> > > operating remote git servers (as a result of my commands, but as far
> > > as I know that's not relevant). It sounds like it's being processed.
> >
> > Git is like a hard drive or database in your organization. It does not do anything else than store
> the information.
> 
> Storing is processing. I'm not saying that git is evil or wrong, I'm
> saying that it might be the case that it processes personal data (both
> understood as in GDPR).
> 
> git is also a software created by people and used by people.

Again the relevance is on the organization.

> 
> >
> > Exception 1: IF you configure it to do so.
> 
> Sure, it doesn't change much. Processing data initiated by the user
> isn't any kind of distinguished processing, as far as I know.
> 
> >
> > Exception 2: You are using a SaaS provider (e.g. github.com, gitlab.com, etc.)
> >
> > Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.).
> 
> I'm totally aware. I know how git works, including some of the
> internals, and I'm in general aware of standard solutions in the IT
> industry. Probably if git would be considered non-compliant, then so
> would be other SCMs.

I am referring to configurations that are following organization policies, which in themselves are causing the GDPR concerns. E.g. commit data is tweeted. Or as Randall S. Becker said on Thursday, July 2, 2020 12:41 PM:

> some practices from some of my customers who have this concern, the team members are directed to use
> tokenized names and email addresses that can be resolved by their security teams during an audit. Obviously 
> the team members recognize the tokens so they know who is making what change. This means that externally,
> any names/emails that might get pushed upstream are non-identifying.

The organization explicitly added GDPR covered information (see European Parliament question E-007174/2017).

> 
> >
> > >
> > > GDPR, Article 4, point (2):
> > > ‘processing’ means any operation or set of operations which is
> > > performed on personal data or on sets of personal data, whether or not
> > > by automated means, such as collection, recording, organisation,
> > > structuring, storage, adaptation or alteration, retrieval,
> > > consultation, use, disclosure by transmission, dissemination or
> > > otherwise making available, alignment or combination, restriction,
> > > erasure or destruction;
> > >
> > > This data is processed with a compatible computer owned by the end
> > > user for the purpose of identification of git commits. It's sent to
> > > other parties only when specific commands are given. All this was
> > > defined by git authors/contributors (from all around the world).
> > >
> >
> > Again, like any database, you can query it for its contents. What you put in it is what it has. If
> you put personal data in, then it is there.
> 
> It's not a general purpose database, it's a structured database and a
> software that operates on that database. That database has a field for
> personal data, and that data is processed by the software.
> 

I disagree, but see https://blog.sqlauthority.com/2018/01/19/sql-server-make-sql-server-gdpr-compliance/ . I think we can all agree that if software could be compliant/noncompliant, then a SQL server is a perfect candidate. That article addresses the issues of how to configure it and the business procedures to align with GDPR obligations.

That (and only that) discussion I think is very on topic here.

> > Where can data reside in Git?
> >
> > 1. The blobs - e.g. your source code
> >
> > 2. The commit messages.
> >
> > #2 is your most likely candidate of GDPR related activities.
> >
> > Do you use the developers names and email addresses in the message? Almost certainly.
> >
> > Note: this is no different than any other SCM (e.g. CVS, Subversion, file shares, etc.).
> >
> > > GDPR, Article 4, point (7):
> > > ‘controller’ means the natural or legal person, public authority,
> > > agency or other body which, alone or jointly with others, determines
> > > the purposes and means of the processing of personal data; [...]
> > >
> > > Git authors can be considered joint controllers.
> > >
> >
> > The Git distributed model means that COPIES of all of the data are on each Git server and developer
> environment. You (and I mean your organization) must address this in your IT plans.
> >
> > Note: this is no different than many other SCMs, although some other SCM technologies only have the
> most recent version locally.
> >
> > > If we'd assume the above interpretations, there would be many, many
> > > consequences.
> > >
> > > I'm not a lawyer, and I have no idea if this interpretation is
> > > reasonable. I don't even know if I'd like it to be. But here are some
> > > facts: GDPR does focus on protecting the end user. Possibly, it's the
> > > most strict data protection law in the world. It doesn't care how
> > > difficult it is to adjust the organisation for compliance and it
> > > doesn't care where the controller is located, as long as it processes
> > > personal data of EU citizens (if I understand it correctly).
> > >
> > > Are there any lawyers in the git community? Could The Linux Foundation
> > > help with legal support? It's a very non-trivial issue. It's non
> > > obvious how local software relates to GDPR, and it's even more
> > > difficult with Free/Open Source software with many, many authors. But
> > > if the aforementioned interpretation was assumed, the git authors
> > > could be held responsible for non-compliance.
> >
> >
> > I have copied our Policy SME, maybe he will have opinions.
> >
> > -Jason
> >
> 
> In general, I totally agree with everything you said.
> 
> But you said that git itself (as a software) doesn't fall under GDPR,
> and that's the only thing I'm not sure about. I was wondering if
> someone with a deeper understanding of GDPR would tell me _why_.
> Because when interpreting the law literally, it sounds like it does.
> 
> Also, to clarify, I'm not seeking legal advice for myself or my organization.

-Jason



* RE: Is git compliant with GDPR?
  2020-07-02 18:38     ` Paul Smith
@ 2020-07-02 19:25       ` Jason Pyeron
  2020-07-03  6:29         ` demerphq
  0 siblings, 1 reply; 10+ messages in thread
From: Jason Pyeron @ 2020-07-02 19:25 UTC (permalink / raw)
  To: git; +Cc: 'Matthew Horowitz', 'Jakub Trzebiatowski', paul

> -----Original Message-----
> From: Paul Smith 
> Sent: Thursday, July 2, 2020 2:38 PM
> 
> On Thu, 2020-07-02 at 19:06 +0200, Jakub Trzebiatowski wrote:
> > But you said that git itself (as a software) doesn't fall under GDPR,
> > and that's the only thing I'm not sure about. I was wondering if
> > someone with a deeper understanding of GDPR would tell me _why_.
> > Because when interpreting the law literally, it sounds like it does.
> 
> You might be interested in reading the conversation that was had on
> this list the last time this subject was raised, in 2018:
> 
> https://public-inbox.org/git/5587534.o6tcmYBVvN@mfick-lnx/T/
> 
> I can't say whether it will satisfy you or not.

IMHO the most valuable bits were (I left out the discussion of changes to Git):

1: 

From: David Lang 
Date: Wed, 6 Jun 2018 18:38:55 -0700 (PDT)
Message-ID: <alpine.DEB.2.02.1806061831340.7659@nftneq.ynat.uz>
https://public-inbox.org/git/alpine.DEB.2.02.1806061831340.7659@nftneq.ynat.uz/#t

I'm going to take the risk of inserting actual real-world data into the mix 
rather than just speculation :-)

Here is an example of what the Rsyslog project is doing (main developers based 
in Germany). I'll say as someone whose day job has been very involved with GDPR 
stuff recently, this looks like a very reasonable statement to me. But I am not 
a lawyer. I will also say that I think it would be very reasonable for projects 
to not accept code from someone who doesn't give them any way to contact them 
later in case there is a question about authorship or licensing.

David Lang


https://github.com/rsyslog/rsyslog/pull/2746/files

LEGAL GDPR NOTICE:
According to the European data protection laws (GDPR), we would like to make you
aware that contributing to rsyslog via git will permanently store the
name and email address you provide as well as the actual commit and the
time and date you made it inside git's version history. This is inevitable,
because it is a main feature of git. If you are concerned about your
privacy, we strongly recommend to use

--author "anonymous <gdpr@example.com>"

together with your commit. Also please do NOT sign your commit in this case,
as that potentially could lead back to you. Please note that if you use your
real identity, the GDPR grants you the right to have this information removed
later. However, we have valid reasons why we cannot remove that information
later on. The reasons are:

* this would break git history and make future merges unworkable
* the rsyslog project has a legitimate interest to keep a permanent record of the
   contributor identity, once given, for
   - copyright verification
   - being able to provide proof should a malicious commit be made

Please also note that your commit is public and as such will potentially be
processed by many third-parties. Git's distributed nature makes it impossible
to track where exactly your commit, and thus your personal data, will be stored
and be processed. If you would not like to accept this risk, please do either
commit anonymously or refrain from contributing to the rsyslog project.
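
A note on the notice above: `--author` only replaces the author field, while the committer field is still filled from the local `user.name` / `user.email` configuration. A minimal sketch of one way to set both to the placeholder identity, using git's standard `GIT_AUTHOR_*` and `GIT_COMMITTER_*` environment variables; the helper names `anonymous_identity_env` and `commit_anonymously` are hypothetical and not part of git or of the rsyslog notice:

```python
import os
import subprocess

# Placeholder identity taken from the notice above.
ANON_NAME = "anonymous"
ANON_EMAIL = "gdpr@example.com"


def anonymous_identity_env(base_env=None):
    """Return a copy of the environment with git's identity variables overridden."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "GIT_AUTHOR_NAME": ANON_NAME,
        "GIT_AUTHOR_EMAIL": ANON_EMAIL,
        "GIT_COMMITTER_NAME": ANON_NAME,
        "GIT_COMMITTER_EMAIL": ANON_EMAIL,
    })
    return env


def commit_anonymously(message, repo_dir="."):
    """Run `git commit` with the placeholder identity and signing disabled."""
    subprocess.run(
        ["git", "commit", "--no-gpg-sign", "-m", message],
        cwd=repo_dir,
        env=anonymous_identity_env(),
        check=True,
    )
```

Calling `commit_anonymously("Fix typo")` inside a repository with staged changes would record the commit under the placeholder identity; whether that is sufficient for a given project's policy is a separate question.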

2:

From: "Philip Oakley"
Date: Sun, 3 Jun 2018 23:28:43 +0100
Message-ID: <5F80881E35F941E88D9C84565C437607@PhilipOakley> (raw) https://public-inbox.org/git/5F80881E35F941E88D9C84565C437607@PhilipOakley/#t

> On Sun, Jun 03, 2018 at 04:28:31PM +0100, Philip Oakley wrote:
<snip/>
> You provide a lot of arguments about why it is not a necessity to have
> this, but let's assume it is; is there any actual problem you see with
> the proposal, except that someone would have to implement it?

It's the strawman problem. If it was a real 'real issue' then it would have 
already shown up with companies clamouring to pay folk to fix our (git's) 
latest problem. But they haven't, so I think it's a much more balanced issue.



* Re: Is git compliant with GDPR?
  2020-07-02 16:40   ` Randall S. Becker
@ 2020-07-03  6:22     ` demerphq
  2020-07-03 13:52       ` Randall S. Becker
  0 siblings, 1 reply; 10+ messages in thread
From: demerphq @ 2020-07-03  6:22 UTC (permalink / raw)
  To: Randall S. Becker
  Cc: Jason Pyeron, Git, Matthew Horowitz, Jakub Trzebiatowski

On Thu, 2 Jul 2020 at 18:42, Randall S. Becker <rsbecker@nexbridge.com> wrote:
> I am not speaking for the Git Foundation here, nor am I a lawyer; however, to use some practices from some of my customers who have this concern, the team members are directed to use tokenized names and email addresses that can be resolved by their security teams during an audit. Obviously the team members recognize the tokens so they know who is making what change. This means that externally, any names/emails that might get pushed upstream are non-identifying.

I think this is a really good point. I think git could make itself
much more GDPR friendly by having some support for this type of idea
built in.

Not sure how it could work, maybe some kind of object that can be
deleted after the fact which maps an identifier used for the author
with name and email. If that name and email change the object can be
updated, and if there is a need to "forget" the author, the object can
be deleted. The object would not be shared on clone, so it would stay
private to the repo that held it.
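
To make the idea above concrete, here is a rough sketch of such a mapping kept outside the shared history. Nothing like this exists in git; the class name `IdentityMap`, the JSON file, and the random token format are all assumptions made for illustration. Commits would carry only the token, and deleting the entry "forgets" the person without rewriting history.

```python
import json
import uuid
from pathlib import Path


class IdentityMap:
    """Token -> {name, email} mapping stored in a private, unshared file."""

    def __init__(self, path):
        self.path = Path(path)
        # Load existing entries if the file is already there.
        self.entries = json.loads(self.path.read_text()) if self.path.exists() else {}

    def _save(self):
        self.path.write_text(json.dumps(self.entries, indent=2))

    def register(self, name, email):
        """Create a random token for a new identity and return it."""
        token = uuid.uuid4().hex
        self.entries[token] = {"name": name, "email": email}
        self._save()
        return token

    def update(self, token, name, email):
        """Replace the details behind an existing token."""
        if token not in self.entries:
            raise KeyError(token)
        self.entries[token] = {"name": name, "email": email}
        self._save()

    def resolve(self, token):
        """Return the stored identity, or None if unknown or forgotten."""
        return self.entries.get(token)

    def forget(self, token):
        """Delete the mapping; the token in history no longer resolves."""
        removed = self.entries.pop(token, None) is not None
        self._save()
        return removed
```

Because the token is random rather than derived from the name or email, nothing in the public history points back to the person once `forget` has been called on every copy of the map.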

I guess you can argue that this isn't git's problem. But at a corporate
level, it will be seen as git's fault regardless if it causes a big
disruption. It could/would also be a reason that European companies
might decide not to use git.

cheers,
Yves


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"


* Re: Is git compliant with GDPR?
  2020-07-02 19:25       ` Jason Pyeron
@ 2020-07-03  6:29         ` demerphq
  0 siblings, 0 replies; 10+ messages in thread
From: demerphq @ 2020-07-03  6:29 UTC (permalink / raw)
  To: Jason Pyeron; +Cc: Git, Matthew Horowitz, Jakub Trzebiatowski, paul

On Thu, 2 Jul 2020 at 21:27, Jason Pyeron <jpyeron@pdinc.us> wrote:

> > On Sun, Jun 03, 2018 at 04:28:31PM +0100, Philip Oakley wrote:
> <snip/>
> > You provide a lot of arguments about why it is not a necessity to have
> > this, but let's assume it is; is there any actual problem you see with
> > the proposal, except that someone would have to implement it?
>
> It's the strawman problem. If it was a real 'real issue' then it would have
> already shown up with companies clamouring to pay folk to fix our (git's)
> latest problem. But they haven't, so I think it's a much more balanced issue.
>

I don't agree. These things tend to come in waves. Just because the
first wave hasn't hit yet doesn't mean it won't come. GDPR is still
super new, people are still coming to understand it. Over time this
understanding will lead to more people exercising the right to be
forgotten.

cheers,
Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"


* RE: Is git compliant with GDPR?
  2020-07-03  6:22     ` demerphq
@ 2020-07-03 13:52       ` Randall S. Becker
  0 siblings, 0 replies; 10+ messages in thread
From: Randall S. Becker @ 2020-07-03 13:52 UTC (permalink / raw)
  To: 'demerphq'
  Cc: 'Jason Pyeron', 'Git', 'Matthew Horowitz',
	'Jakub Trzebiatowski'

On July 3, 2020 2:23 AM, demerphq wrote:
> On Thu, 2 Jul 2020 at 18:42, Randall S. Becker <rsbecker@nexbridge.com>
> wrote:
> > I am not speaking for the Git Foundation here, nor am I a lawyer; however,
> to use some practices from some of my customers who have this concern,
> the team members are directed to use tokenized names and email addresses
> that can be resolved by their security teams during an audit. Obviously the
> team members recognize the tokens so they know who is making what
> change. This means that externally, any names/emails that might get pushed
> upstream are non-identifying.
> 
> I think this is a really good point. I think git could make itself much more
> GDPR friendly by having some support for this type of idea built in.
> 
> Not sure how it could work, maybe some kind of object that can be deleted
> after the fact which maps an identifier used for the author with name and
> email. If that name and email change the object can be updated, and if there
> is a need to "forget" the author, the object can be deleted. The object would
> not be shared on clone, so it would stay private to the repo that held it.
> 
> I guess you can argue that this isn't git's problem. But at a corporate level, it
> will be seen as git's fault regardless if it causes a big disruption. It could/would
> also be a reason that European companies might decide not to use git.

How you choose to identify yourself to git is entirely arbitrary. There are SSO solutions used by GitHub that have the personal information stripped out. I contend that this is not git's problem because anyone can use anything to self-identify. Git does not care. Policies can be implemented (commit hooks) to automatically tokenize, but that's up to what the corporation wants to do. In fact, git is less subject to GDPR issues than other version control systems, which use logon credentials that are personally identifying in many locations and could represent a security vulnerability. So while a corporation can choose to find fault with git, the fault is in their own credential management policies.

It might be worth some documentation to explain this.
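
As a starting point for such documentation, here is a minimal sketch of the tokenization step itself, assuming a keyed hash (HMAC-SHA256) over the real email address. The function name `tokenize_identity`, the `dev-` prefix, and the `tokens.invalid` domain are hypothetical, and how the result is wired into a hook or wrapper is left to the organization's own policy.

```python
import hashlib
import hmac


def tokenize_identity(email, secret_key, domain="tokens.invalid"):
    """Derive a stable, non-identifying (name, email) pair from a real address.

    secret_key is a bytes value held by the security team (an assumption of
    this sketch); the same address and key always yield the same token.
    """
    normalized = email.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()[:12]
    return f"dev-{digest}", f"dev-{digest}@{domain}"
```

Team members see a consistent token per colleague, and whoever holds the key can recompute the token for a known address during an audit, while the values pushed upstream carry no name or real address.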


