Coccinelle for automated refactors

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* Coccinelle for automated refactors
@ 2016-06-05 20:55 brian m. carlson
  2016-06-06 17:43 ` Stefan Beller
  2016-06-06 18:55 ` Junio C Hamano
  0 siblings, 2 replies; 5+ messages in thread
From: brian m. carlson @ 2016-06-05 20:55 UTC (permalink / raw
  To: git

[-- Attachment #1: Type: text/plain, Size: 1683 bytes --]

One thing that I've noticed with the struct object_id conversion is that
most of the work is mechanical transformations of a data member from one
type into another.  Doing this by hand is both boring and error-prone,
and it requires a tiresome review of nearly-identical changes.

I've noticed that Coccinelle[0], a tool for automated refactors, has
been used with great success on LKML, because it understands C well and
can perform the transformations precisely and rapidly.  It also does
nice things like indenting the code it modifies if necessary.

An example semantic patch looks like this:

@@
expression E1;
@@
- is_null_sha1(E1.hash)
+ is_null_oid(&E1)

@@
expression E1;
@@
- is_null_sha1(E1->hash)
+ is_null_oid(E1)

This does what you think it does: transforms calls to is_null_sha1 that
use the struct object_id hash member into calls to is_null_oid.

I'd like to use this for some of the struct object_id work if others
think this is a good idea.  I feel it's likely to reduce the reviewing
overhead and allow people to better reason about the quality and
behavior of the sent patches.  Of course, I would still review the
patches manually for errors and improvements, and would still accept
responsibility for the content of the patches.

If there's interest, I can send a patch with a set of basic object_id
transforms to make it easier for others to make those changes when
they're doing work elsewhere in the codebase.

[0] http://coccinelle.lip6.fr/
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Coccinelle for automated refactors
  2016-06-05 20:55 Coccinelle for automated refactors brian m. carlson
@ 2016-06-06 17:43 ` Stefan Beller
  2016-06-06 18:55 ` Junio C Hamano
  1 sibling, 0 replies; 5+ messages in thread
From: Stefan Beller @ 2016-06-06 17:43 UTC (permalink / raw
  To: brian m. carlson, git@vger.kernel.org

On Sun, Jun 5, 2016 at 1:55 PM, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> One thing that I've noticed with the struct object_id conversion is that
> most of the work is mechanical transformations of a data member from one
> type into another.  Doing this by hand is both boring and error-prone,
> and it requires a tiresome review of nearly-identical changes.
>
> I've noticed that Coccinelle[0], a tool for automated refactors, has
> been used with great success on LKML, because it understands C well and
> can perform the transformations precisely and rapidly.  It also does
> nice things like indenting the code it modifies if necessary.
>
> An example semantic patch looks like this:
>
> @@
> expression E1;
> @@
> - is_null_sha1(E1.hash)
> + is_null_oid(&E1)
>
> @@
> expression E1;
> @@
> - is_null_sha1(E1->hash)
> + is_null_oid(E1)

So is this all we have to review then?

>
> This does what you think it does: transforms calls to is_null_sha1 that
> use the struct object_id hash member into calls to is_null_oid.
>
> I'd like to use this for some of the struct object_id work if others
> think this is a good idea.  I feel it's likely to reduce the reviewing
> overhead and allow people to better reason about the quality and
> behavior of the sent patches.  Of course, I would still review the
> patches manually for errors and improvements, and would still accept
> responsibility for the content of the patches.
>
> If there's interest, I can send a patch with a set of basic object_id
> transforms to make it easier for others to make those changes when
> they're doing work elsewhere in the codebase.

I would be interested in reviewing such a patch.

Thanks,
Stefan

>
> [0] http://coccinelle.lip6.fr/
> --
> brian m. carlson / brian with sandals: Houston, Texas, US
> +1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
> OpenPGP: https://keybase.io/bk2204

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Coccinelle for automated refactors
  2016-06-05 20:55 Coccinelle for automated refactors brian m. carlson
  2016-06-06 17:43 ` Stefan Beller
@ 2016-06-06 18:55 ` Junio C Hamano
  2016-06-06 20:38   ` Jacob Keller
  2016-06-06 23:09   ` brian m. carlson
  1 sibling, 2 replies; 5+ messages in thread
From: Junio C Hamano @ 2016-06-06 18:55 UTC (permalink / raw
  To: brian m. carlson; +Cc: git

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> An example semantic patch looks like this:
>
> @@
> expression E1;
> @@
> - is_null_sha1(E1.hash)
> + is_null_oid(&E1)
>
> @@
> expression E1;
> @@
> - is_null_sha1(E1->hash)
> + is_null_oid(E1)
>
> This does what you think it does: transforms calls to is_null_sha1 that
> use the struct object_id hash member into calls to is_null_oid.
>
> I'd like to use this for some of the struct object_id work if others
> think this is a good idea.  I feel it's likely to reduce the reviewing
> overhead and allow people to better reason about the quality and
> behavior of the sent patches.  Of course, I would still review the
> patches manually for errors and improvements, and would still accept
> responsibility for the content of the patches.

Is the plan for such a "refactor" patch to compose such a series as
two patch series:

 [1/2] automatic refactor

which gives the "semantic patch" in the proposed log message as part
of its description, and the automated result (with possible
misconversions that may have come from bugs in the automated tools),
with a separate

 [2/2] manual fixups

that corrects what was misconverted and what was missed?

As long as [2/2] can be kept to the minimum (and an automated tool
that is worth using should make it so), I think that is a good way
forward.  Another possibility would be to send the end-result as a
single patch, with description on the manual fixups in the proposed
log message, but it would be a lot more work to generate and review
such a patch, I would think.

> If there's interest, I can send a patch with a set of basic object_id
> transforms to make it easier for others to make those changes when
> they're doing work elsewhere in the codebase.
>
> [0] http://coccinelle.lip6.fr/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Coccinelle for automated refactors
  2016-06-06 18:55 ` Junio C Hamano
@ 2016-06-06 20:38   ` Jacob Keller
  2016-06-06 23:09   ` brian m. carlson
  1 sibling, 0 replies; 5+ messages in thread
From: Jacob Keller @ 2016-06-06 20:38 UTC (permalink / raw
  To: Junio C Hamano; +Cc: brian m. carlson, Git mailing list

On Mon, Jun 6, 2016 at 11:55 AM, Junio C Hamano <gitster@pobox.com> wrote:
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
>
>> An example semantic patch looks like this:
>>
>> @@
>> expression E1;
>> @@
>> - is_null_sha1(E1.hash)
>> + is_null_oid(&E1)
>>
>> @@
>> expression E1;
>> @@
>> - is_null_sha1(E1->hash)
>> + is_null_oid(E1)
>>
>> This does what you think it does: transforms calls to is_null_sha1 that
>> use the struct object_id hash member into calls to is_null_oid.
>>
>> I'd like to use this for some of the struct object_id work if others
>> think this is a good idea.  I feel it's likely to reduce the reviewing
>> overhead and allow people to better reason about the quality and
>> behavior of the sent patches.  Of course, I would still review the
>> patches manually for errors and improvements, and would still accept
>> responsibility for the content of the patches.
>
> Is the plan for such a "refactor" patch to compose such a series as
> two patch series:
>
>  [1/2] automatic refactor
>
> which gives the "semantic patch" in the proposed log message as part
> of its description, and the automated result (with possible
> misconversions that may have come from bugs in the automated tools),
> with a separate
>
>  [2/2] manual fixups
>
> that corrects what was misconverted and what was missed?
>
> As long as [2/2] can be kept to the minimum (and an automated tool
> that is worth using should make it so), I think that is a good way
> forward.  Another possibility would be to send the end-result as a
> single patch, with description on the manual fixups in the proposed
> log message, but it would be a lot more work to generate and review
> such a patch, I would think.
>
>> If there's interest, I can send a patch with a set of basic object_id
>> transforms to make it easier for others to make those changes when
>> they're doing work elsewhere in the codebase.
>>
>> [0] http://coccinelle.lip6.fr/
> --

I've used coccinelle before on the Linux Kernel, and I've found that
it works quite well without producing any incorrect changes,
especially if you keep the semantic patch small and simple. I'd
definitely go with the 1/2 2/2 approach but I suspect in most useful
cases 2/2 would be very minor if at all.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Coccinelle for automated refactors
  2016-06-06 18:55 ` Junio C Hamano
  2016-06-06 20:38   ` Jacob Keller
@ 2016-06-06 23:09   ` brian m. carlson
  1 sibling, 0 replies; 5+ messages in thread
From: brian m. carlson @ 2016-06-06 23:09 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1654 bytes --]

On Mon, Jun 06, 2016 at 11:55:50AM -0700, Junio C Hamano wrote:
> Is the plan for such a "refactor" patch to compose such a series as
> two patch series:
> 
>  [1/2] automatic refactor
> 
> which gives the "semantic patch" in the proposed log message as part
> of its description, and the automated result (with possible
> misconversions that may have come from bugs in the automated tools),
> with a separate
> 
>  [2/2] manual fixups
> 
> that corrects what was misconverted and what was missed?
> 
> As long as [2/2] can be kept to the minimum (and an automated tool
> that is worth using should make it so), I think that is a good way
> forward.  Another possibility would be to send the end-result as a
> single patch, with description on the manual fixups in the proposed
> log message, but it would be a lot more work to generate and review
> such a patch, I would think.

My hope is that I can make the automated changes such that manual fixups
are more along the lines of cleaning up related functions in the module,
fixing issues noticed during the refactor, and the like: in other words,
things that one might have done incidentally as part of the patch, but
could defensibly be done in a second patch anyway.

My goal is to make the series as much like a human-edited series as
possible, but with less work on all sides.

I can send an RFC series that demonstrates this a little later to see if
it's an acceptable direction for work.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-06-06 23:09 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-05 20:55 Coccinelle for automated refactors brian m. carlson
2016-06-06 17:43 ` Stefan Beller
2016-06-06 18:55 ` Junio C Hamano
2016-06-06 20:38   ` Jacob Keller
2016-06-06 23:09   ` brian m. carlson

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).