git@vger.kernel.org mailing list mirror (one of many)
* Some misspelling errors in the git release 2.24.0
@ 2019-11-04 14:55 Fossies Administrator
  2019-11-04 15:26 ` Elijah Newren
  0 siblings, 1 reply; 7+ messages in thread
From: Fossies Administrator @ 2019-11-04 14:55 UTC (permalink / raw)
  To: git

Hi,

Although misspelling corrections are not the most exciting issues, and 
spelling errors are rarely true code bugs but are mostly contained in 
comments and documentation, their correction may still improve the 
overall quality of a software project a little bit.

With that in mind, I created a code misspelling report for "git" using the 
program "codespell":

  https://fossies.org/linux/misc/git-2.24.0.tar.xz/codespell.html

or, independent of the version:

  https://fossies.org/linux/misc/git/codespell.html

The latter URL always redirects to the report for the latest "git" release 
supported on Fossies (provided such a report has been requested and exists).

In principle, further runs are possible not only on "git" releases but 
also, within a separate test environment, on master or other branches. If 
you find false positives (FPs), please let me know and I will rerun the 
analysis.

Regards

Jens

-- 
FOSSIES - The Fresh Open Source Software archive
mainly for Internet, Engineering and Science
https://fossies.org/


* Re: Some misspelling errors in the git release 2.24.0
  2019-11-04 14:55 Some misspelling errors in the git release 2.24.0 Fossies Administrator
@ 2019-11-04 15:26 ` Elijah Newren
  2019-11-04 16:14   ` Fossies Administrator
  0 siblings, 1 reply; 7+ messages in thread
From: Elijah Newren @ 2019-11-04 15:26 UTC (permalink / raw)
  To: Fossies Administrator; +Cc: Git Mailing List

On Mon, Nov 4, 2019 at 7:07 AM Fossies Administrator
<Jens.Schleusener@fossies.org> wrote:
>
> Hi,
>
> although misspelling corrections are not the most exciting issues and the
> spelling errors are rarely true code bugs but mostly contained in the
> comments and documentation parts their correction may still improve the
> overall quality of a software project a little bit.
>
> In this sense I created a code misspelling report for "git" using the
> program "codespell"
>
>   https://fossies.org/linux/misc/git-2.24.0.tar.xz/codespell.html
>
> or version independent
>
>   https://fossies.org/linux/misc/git/codespell.html

Cool, thanks for sending this report along.  The typos within the
Documentation/ subdirectory have mostly been addressed by the
en/doc-typofix branch (in next, not yet merged to master).  There are
also some false positives in this report (e.g. mmaped should not be
changed to mapped, CREAT should not be changed to CREATE, examples in
format-patch showing how to correct spelling errors need to keep their
spelling errors or it won't make sense, and perhaps some others), but
most of them look like actual spelling errors that should be
corrected.  I'll send in a patch, and mark you as the reporter of the
issues.

Elijah


* Re: Some misspelling errors in the git release 2.24.0
  2019-11-04 15:26 ` Elijah Newren
@ 2019-11-04 16:14   ` Fossies Administrator
       [not found]     ` <20191105171107.27379-1-newren@gmail.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Fossies Administrator @ 2019-11-04 16:14 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

Hi Elijah,

> On Mon, Nov 4, 2019 at 7:07 AM Fossies Administrator
> <Jens.Schleusener@fossies.org> wrote:
>>
>> Hi,
>>
>> although misspelling corrections are not the most exciting issues and the
>> spelling errors are rarely true code bugs but mostly contained in the
>> comments and documentation parts their correction may still improve the
>> overall quality of a software project a little bit.
>>
>> In this sense I created a code misspelling report for "git" using the
>> program "codespell"
>>
>>   https://fossies.org/linux/misc/git-2.24.0.tar.xz/codespell.html
>>
>> or version independent
>>
>>   https://fossies.org/linux/misc/git/codespell.html
>
> Cool, thanks for sending this report along.  The typos within the
> Documentation/ subdirectory have mostly been addressed by the
> en/doc-typofix branch (in next, not yet merged to master).  There are
> also some false positives in this report (e.g. mmaped should not be
> changed to mapped, CREAT should not be changed to CREATE, examples in
> format-patch showing how to correct spelling errors need to keep their
> spelling errors or it won't make sense, and perhaps some others), but
> most of them look like actual spelling errors that should be
> corrected.  I'll send in a patch, and mark you as the reporter of the
> issues.
>
> Elijah

Thanks for your feedback. Yes, I had noticed the words "mmaped" and "CREAT" 
as possible FPs, but I was not really sure. And I simply overlooked the 
nature of the "format-patch" file. So I have done a rerun, and the two 
words mentioned and that file are now excluded.

Jens


* Re: Some misspelling errors in the git release 2.24.0
       [not found]     ` <20191105171107.27379-1-newren@gmail.com>
@ 2019-11-05 18:24       ` Elijah Newren
  2019-11-06 11:08       ` Fossies Administrator
  1 sibling, 0 replies; 7+ messages in thread
From: Elijah Newren @ 2019-11-05 18:24 UTC (permalink / raw)
  To: Fossies Administrator; +Cc: Git Mailing List

On Tue, Nov 5, 2019 at 9:11 AM Elijah Newren <newren@gmail.com> wrote:
> On Mon, Nov 4, 2019 at 8:14 AM Fossies Administrator <Jens.Schleusener@fossies.org> wrote:
> > > On Mon, Nov 4, 2019 at 7:07 AM Fossies Administrator
> > > <Jens.Schleusener@fossies.org> wrote:

> But I thought it might also be worthwhile to you to report what the
> false positives found by that program were; I've included them at the
> end of this email in the form of a patch.  The places where the program
> seemed to struggle were:
>
>   * In dealing with translation files.  It didn't recognize them as
>     such and often tried to translate foreign words to a nearby English
>     one.
>   * In handling variable names: acronyms might be similar to English
>     words (cas, for compare and swap, looks like case), abbreviations
>     might look like alternate words (ans, short for answer, looks like
>     and).
>   * Testcases with intentional spelling errors
>   * Proper names that were similar to English words (Ned -> Need,
>     Claus -> Clause)
>   * miscellaneous tech jargon or package names (e.g. 'filetest' module
>     being replaced with 'file test', 'ith' as in not first or second
>     but the item at position i being replaced with 'with', 'mmaped'
>     being replaced with 'mapped', 'CREAT' changing to 'CREATE',
>     'UserA' (out of a sequence of UserB, UserC, etc.) changing to
>     'users', 'spawnve' function name being replaced with "spawn",
>     'CAs' (certificate authorities) being replaced with 'case', etc.)

Ooh, one more I remembered that I wanted to point out.  It found the
spelling error 'achiving', but it wanted to replace it with
'achieving' rather than the correct 'archiving'.  Given that the
correct word is the same edit distance from the misspelling, it made me
wonder whether the dictionary in use just needed to be expanded a
little.
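
The edit-distance remark above can be checked with a small sketch. This is
only an illustration of Levenshtein distance (single-character insertions,
deletions and substitutions); the function name is made up for this example,
and nothing here is taken from codespell itself, which, as far as I know,
relies on a fixed list of known misspellings rather than computing distances.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]


typo = "achiving"
for candidate in ("achieving", "archiving"):
    print(candidate, edit_distance(typo, candidate))
```

Both candidates are a single insertion away from the misspelling, so distance
alone cannot pick the right one; the choice has to come from the dictionary
entry or from the surrounding context.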


* Re: Some misspelling errors in the git release 2.24.0
       [not found]     ` <20191105171107.27379-1-newren@gmail.com>
  2019-11-05 18:24       ` Elijah Newren
@ 2019-11-06 11:08       ` Fossies Administrator
  2019-11-07  4:46         ` Elijah Newren
  1 sibling, 1 reply; 7+ messages in thread
From: Fossies Administrator @ 2019-11-06 11:08 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

Hi Elijah,

> On Mon, Nov 4, 2019 at 8:14 AM Fossies Administrator <Jens.Schleusener@fossies.org> wrote:
>>
>> Hi Elijah,
>>
>>> On Mon, Nov 4, 2019 at 7:07 AM Fossies Administrator
>>> <Jens.Schleusener@fossies.org> wrote:
>>>>
>>>> Hi,
>>>>
>>>> although misspelling corrections are not the most exciting issues and the
>>>> spelling errors are rarely true code bugs but mostly contained in the
>>>> comments and documentation parts their correction may still improve the
>>>> overall quality of a software project a little bit.
>>>>
>>>> In this sense I created a code misspelling report for "git" using the
>>>> program "codespell"
>>>>
>>>>   https://fossies.org/linux/misc/git-2.24.0.tar.xz/codespell.html
>>>>
>>>> or version independent
>>>>
>>>>   https://fossies.org/linux/misc/git/codespell.html
>>>
>>> Cool, thanks for sending this report along.  The typos within the
>>> Documentation/ subdirectory have mostly been addressed by the
>>> en/doc-typofix branch (in next, not yet merged to master).  There are
>>> also some false positives in this report (e.g. mmaped should not be
>>> changed to mapped, CREAT should not be changed to CREATE, examples in
>>> format-patch showing how to correct spelling errors need to keep their
>>> spelling errors or it won't make sense, and perhaps some others), but
>>> most of them look like actual spelling errors that should be
>>> corrected.  I'll send in a patch, and mark you as the reporter of the
>>> issues.
>
> So, I used your codespell program

That seems to be a misunderstanding: I'm not the author of the codespell 
program; I only use it to detect spelling errors and point out their 
existence, while offering a fast and convenient way to inspect the context 
of the probably misspelled words via a Web page.

> to catch all these and turned
> en/doc-typofix into a series of patches to fix all the errors:
>  https://public-inbox.org/git/pull.418.v2.git.1572973650.gitgitgadget@gmail.com/
>
>
> But I thought it might also be worthwhile to you to report what the
> false positives found by that program were; I've included them at the
> end of this email in the form of a patch.  The places where the program
> seemed to struggle were:
>
>  * In dealing with translation files.  It didn't recognize them as
>    such and often tried to translate foreign words to a nearby English
>    one.
>  * In handling variable names: acronyms might be similar to English
>    words (cas, for compare and swap, looks like case), abbreviations
>    might look like alternate words (ans, short for answer, looks like
>    and).
>  * Testcases with intentional spelling errors
>  * Proper names that were similar to English words (Ned -> Need,
>    Claus -> Clause)
>  * miscellaneous tech jargon or package names (e.g. 'filetest' module
>    being replaced with 'file test', 'ith' as in not first or second
>    but the item at position i being replaced with 'with', 'mmaped'
>    being replaced with 'mapped', 'CREAT' changing to 'CREATE',
>    'UserA' (out of a sequence of UserB, UserC, etc.) changing to
>    'users', 'spawnve' function name being replaced with "spawn",
>    'CAs' (certificate authorities) being replaced with 'case', etc.)

Thanks for the detailed and informed feedback. These are exactly the 
problems that I noticed as well. Some additional ones are, e.g., words at 
the end of a string definition (apostrophe), mail addresses, and 
differences between US and UK English. Sometimes it's difficult to decide 
what to exclude: e.g. the word "ans" you mentioned is often used 
intentionally, but often it's also a typo ("and"), so FPs or FNs seem 
unavoidable. So codespell and Fossies can only give pointers, and an 
individual check always seems to be required.

Some of these FPs are excluded by Fossies globally; other obvious FPs are 
excluded by Fossies specifically for each FOSS project (always see the 
bold item "Codespell configuration" with a link to "Project-specific 
additions" or to "(no project-specific adaptions yet done)", which shows 
all the excluded words and directories/files).
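
The two layers of exclusion described above could be sketched roughly as
follows. Everything in this block is hypothetical: the table, the set names
and the function are invented for illustration and do not reflect how
Fossies or codespell are actually implemented. The sample words are taken
from this thread.

```python
# Hypothetical misspelling -> suggestion table (a tiny stand-in for a real
# dictionary of known misspellings).
KNOWN_MISSPELLINGS = {
    "mmaped": "mapped",
    "creat": "create",
    "ans": "and",
    "hapenning": "happening",
}

# Assumed layering: one exclusion set for all projects, one per project.
GLOBAL_IGNORE = {"creat"}
PROJECT_IGNORE = {"git": {"mmaped"}}


def find_typos(text: str, project: str) -> list[tuple[str, str]]:
    """Return (word, suggestion) pairs that survive both exclusion layers."""
    ignored = GLOBAL_IGNORE | PROJECT_IGNORE.get(project, set())
    hits = []
    for word in text.split():
        key = word.strip(".,;:()'\"").lower()  # crude tokenization
        if key in KNOWN_MISSPELLINGS and key not in ignored:
            hits.append((word, KNOWN_MISSPELLINGS[key]))
    return hits


sample = "CREAT flag on mmaped pages; what is hapenning ans why"
print(find_typos(sample, "git"))
print(find_typos(sample, "other"))
```

Note that "ans" is still reported for both projects: whether it is an
abbreviation or a typo for "and" cannot be decided from a word list, which
is why an individual check remains necessary.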

> Some of these would be difficult for any tool to deal with, but e.g.
> recognizing translation files as such and ignoring them might be
> interesting.

As one can see on the page

  https://fossies.org/linux/misc/git-2.24.0.tar.xz/codespell_conf_info.html

some attempts in that direction have already been made.

Running codespell with an English dictionary over directories like 
"translations" or "langmap" is probably meaningless in most cases 
(although such directories may also contain "English" source code). So if 
one uses codespell manually, one should use options like 
"--ignore-words-list" and "--skip" (directories, files), and can 
optionally look at the values Fossies has used as a starting point.
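
As a rough sketch of such a manual run, the following helper only assembles
a codespell command line with the two options named above; it does not run
anything. The helper name is made up, and the word list and the "*.po"
pattern are illustrative assumptions, not the values Fossies actually uses.
Both options take comma-separated lists.

```python
import shlex


def build_codespell_command(paths, ignore_words=(), skip=()):
    """Assemble an argument list for codespell (hypothetical helper)."""
    cmd = ["codespell"]
    if ignore_words:
        # Words that should never be reported as misspellings.
        cmd.append("--ignore-words-list=" + ",".join(ignore_words))
    if skip:
        # Files, directories or glob patterns that should not be checked.
        cmd.append("--skip=" + ",".join(skip))
    cmd.extend(paths)
    return cmd


cmd = build_codespell_command(
    paths=["."],
    ignore_words=["mmaped", "creat"],  # illustrative, from this thread
    skip=["*.po"],                     # assumed pattern for translation files
)
print(shlex.join(cmd))  # shell-quoted form, suitable for copy and paste
```

The resulting list can be passed to subprocess.run() as-is once codespell is
installed; building it as a list avoids shell quoting problems with glob
patterns.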

> Anyway, thanks for the report and the pointer to the tool.

Although (hopefully) most of the FPs in your list are already excluded 
by Fossies, I will check the list to perhaps make Fossies even smarter.

> -- 8< --
> Subject: [PATCH] DO NOT MERGE: False positives from `codespell -w`

[... the very big list of FPs removed in this reply mail ...]

Regards

Jens



* Re: Some misspelling errors in the git release 2.24.0
  2019-11-06 11:08       ` Fossies Administrator
@ 2019-11-07  4:46         ` Elijah Newren
  2019-11-07  9:18           ` Fossies Administrator
  0 siblings, 1 reply; 7+ messages in thread
From: Elijah Newren @ 2019-11-07  4:46 UTC (permalink / raw)
  To: Fossies Administrator; +Cc: Git Mailing List

On Wed, Nov 6, 2019 at 3:08 AM Fossies Administrator
<Jens.Schleusener@fossies.org> wrote:
>
> Hi Elijah,
>
> > On Mon, Nov 4, 2019 at 8:14 AM Fossies Administrator <Jens.Schleusener@fossies.org> wrote:
> >>
> >> Hi Elijah,
> >>
> >>> On Mon, Nov 4, 2019 at 7:07 AM Fossies Administrator
> >>> <Jens.Schleusener@fossies.org> wrote:

> > So, I used your codespell program
>
> That seems to be a misunderstanding: I'm not the author of the codespell
> program but I only use that program to detect spelling errors and point to
> their existence while offering the option to inspect the context of the
> probably misspelled words in a fast and comfortable way via a Web page.

Oops, sorry for the misunderstanding; thanks for clearing it up.

[...]
> Some of the according FPs are excluded by Fossies generally, some other
> obvious FPs are excluded by Fossies specifically for each FOSS project
> (see always the bold item "Codespell configuration" with a link to
> "Project-specific additions" or to "(no project-specific adaptions yet
> done)" that shows all the excluded words and directories/files).
[...]
> As one can see on the page
>
>   https://fossies.org/linux/misc/git-2.24.0.tar.xz/codespell_conf_info.html
>
> there are already done some according attempts.

Ah, thanks for the pointer.  Could you add t/t9150/svk-merge.dump and
t/t9151/svn-mergeinfo.dump to the list of files to exclude?  Both
have the 'hapenning' typo, but both are a dump of some repository and
editing it means recomputing sha1sums and whatnot for tests that just
isn't worth it.  I thought maybe I could get away with correcting
those spelling errors but backed out once I saw further knock-on
effects.

Thanks for the report and the background and corrections!

Elijah


* Re: Some misspelling errors in the git release 2.24.0
  2019-11-07  4:46         ` Elijah Newren
@ 2019-11-07  9:18           ` Fossies Administrator
  0 siblings, 0 replies; 7+ messages in thread
From: Fossies Administrator @ 2019-11-07  9:18 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

Hi Elijah,

On Wed, 6 Nov 2019, Elijah Newren wrote:

> On Wed, Nov 6, 2019 at 3:08 AM Fossies Administrator
> <Jens.Schleusener@fossies.org> wrote:
>>
>> Hi Elijah,
>>
>>> On Mon, Nov 4, 2019 at 8:14 AM Fossies Administrator <Jens.Schleusener@fossies.org> wrote:
>>>>
>>>> Hi Elijah,
>>>>
>>>>> On Mon, Nov 4, 2019 at 7:07 AM Fossies Administrator
>>>>> <Jens.Schleusener@fossies.org> wrote:
>
>>> So, I used your codespell program
>>
>> That seems to be a misunderstanding: I'm not the author of the codespell
>> program but I only use that program to detect spelling errors and point to
>> their existence while offering the option to inspect the context of the
>> probably misspelled words in a fast and comfortable way via a Web page.
>
> Oops, sorry for the misunderstanding; thanks for clearing it up.
>
> [...]
>> Some of the according FPs are excluded by Fossies generally, some other
>> obvious FPs are excluded by Fossies specifically for each FOSS project
>> (see always the bold item "Codespell configuration" with a link to
>> "Project-specific additions" or to "(no project-specific adaptions yet
>> done)" that shows all the excluded words and directories/files).
> [...]
>> As one can see on the page
>>
>>   https://fossies.org/linux/misc/git-2.24.0.tar.xz/codespell_conf_info.html
>>
>> there are already done some according attempts.
>
> Ah, thanks for the pointer.  Could you add t/t9150/svk-merge.dump and
> t/t9151/svn-mergeinfo.dump to the list of files to exclude?  Both
> have the 'hapenning' typo, but both are a dump of some repository and
> editing it means recomputing sha1sums and whatnot for tests that just
> isn't worth it.

Sure, done.

In principle, test directories and files seem to be a "special" area
for spelling checks as well.

> I thought maybe I could get away with correcting
> those spelling errors but backed out once I saw further knock-on
> effects.

That is one reason why only the developers should make such corrections.

> Thanks for the report and the background and corrections!

Regards

Jens

