git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Minor bug: git config ignores empty sections
@ 2016-08-15  1:29 Eli Barzilay
  2016-08-15 12:09 ` Jeff King
  2016-08-15 17:46 ` Junio C Hamano
  0 siblings, 2 replies; 12+ messages in thread
From: Eli Barzilay @ 2016-08-15  1:29 UTC (permalink / raw)
  To: git

Looks like there's a problem with setting a config with an empty
section, making it create a new section.  The result is:

    $ git --version
    git version 2.9.2
    $ git init
    Initialized empty Git repository in /home/eli/t/.git/
    $ t() { git config x.y x; git config --unset x.y; }
    $ t;t;t
    $ grep -c '\[x\]' .git/config
    3
    $ git config x.z x
    $ t;t;t
    $ git config x.z x     # adds another [x], but leaves it populated
    $ t;t;t;t;t;t;t;t
    $ grep -c '\[x\]' .git/config
    4

-- 
                   ((x=>x(x))(x=>x(x)))                  Eli Barzilay:
                   http://barzilay.org/                  Maze is Life!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-15  1:29 Minor bug: git config ignores empty sections Eli Barzilay
@ 2016-08-15 12:09 ` Jeff King
  2016-08-15 17:34   ` Andreas Schwab
  2016-08-15 17:46 ` Junio C Hamano
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff King @ 2016-08-15 12:09 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: git

On Sun, Aug 14, 2016 at 09:29:27PM -0400, Eli Barzilay wrote:

> Looks like there's a problem with setting a config with an empty
> section, making it create a new section.  The result is:
> 
>     $ git --version
>     git version 2.9.2
>     $ git init
>     Initialized empty Git repository in /home/eli/t/.git/
>     $ t() { git config x.y x; git config --unset x.y; }
>     $ t;t;t
>     $ grep -c '\[x\]' .git/config
>     3
>     $ git config x.z x
>     $ t;t;t
>     $ git config x.z x     # adds another [x], but leaves it populated
>     $ t;t;t;t;t;t;t;t
>     $ grep -c '\[x\]' .git/config
>     4

Yep. This is a very old and well-known bug. IIRC, the problem is that
config updates use the regular callback-parser, so they see only the
config keys themselves. Empty section headers never trigger them.
Fixing it would require rewriting the config-update code.

And implicit in your test is the other bug, which is that deleting the
last key in a section leaves the empty header. I think it's related to
the same issue.

Patches welcome, of course.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-15 12:09 ` Jeff King
@ 2016-08-15 17:34   ` Andreas Schwab
  2016-08-15 18:09     ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Andreas Schwab @ 2016-08-15 17:34 UTC (permalink / raw)
  To: Jeff King; +Cc: Eli Barzilay, git

On Aug 15 2016, Jeff King <peff@peff.net> wrote:

> And implicit in your test is the other bug, which is that deleting the
> last key in a section leaves the empty header. I think it's related to
> the same issue.

Indiscriminately removing empty section headers may break comments that
have been put there on purpose.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-15  1:29 Minor bug: git config ignores empty sections Eli Barzilay
  2016-08-15 12:09 ` Jeff King
@ 2016-08-15 17:46 ` Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2016-08-15 17:46 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: git

Eli Barzilay <eli@barzilay.org> writes:

> Looks like there's a problem with setting a config with an empty
> section, making it create a new section.  The result is:

Thanks; this unfortunately is a well-known issue, that once was even
(supposed to be) a part of GSoC project but hasn't been solved.

The latest mention of the same issue I think is

https://public-inbox.org/git/xmqqeg95aor6.fsf@gitster.mtv.corp.google.com/

which redirects to

    If you are interested, start here:

    http://thread.gmane.org/gmane.comp.version-control.git/219505/focus=219524

but gmane web interface is not working and an equivalent is here:

https://public-inbox.org/git/20130329195155.GA19994@sigill.intra.peff.net/


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-15 17:34   ` Andreas Schwab
@ 2016-08-15 18:09     ` Jeff King
  2016-08-15 18:28       ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2016-08-15 18:09 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Eli Barzilay, git

On Mon, Aug 15, 2016 at 07:34:50PM +0200, Andreas Schwab wrote:

> On Aug 15 2016, Jeff King <peff@peff.net> wrote:
> 
> > And implicit in your test is the other bug, which is that deleting the
> > last key in a section leaves the empty header. I think it's related to
> > the same issue.
> 
> Indiscriminately removing empty section headers may break comments that
> have been put there on purpose.

I know, but we do not even do so discriminately.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-15 18:09     ` Jeff King
@ 2016-08-15 18:28       ` Junio C Hamano
  2016-08-15 18:54         ` Andreas Schwab
  2016-08-15 18:55         ` Jeff King
  0 siblings, 2 replies; 12+ messages in thread
From: Junio C Hamano @ 2016-08-15 18:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Andreas Schwab, Eli Barzilay, git

Jeff King <peff@peff.net> writes:

> On Mon, Aug 15, 2016 at 07:34:50PM +0200, Andreas Schwab wrote:
>
>> On Aug 15 2016, Jeff King <peff@peff.net> wrote:
>> 
>> > And implicit in your test is the other bug, which is that deleting the
>> > last key in a section leaves the empty header. I think it's related to
>> > the same issue.
>> 
>> Indiscriminately removing empty section headers may break comments that
>> have been put there on purpose.
>
> I know, but we do not even do so discriminately.

I notice that we have thought about all the issues when we last
discussed it in 2013.  Refining a message from the earlier thread,
as it illustrates tricky cases in which we have to be careful.

If we were to remove the section header at this point, we should
be removing the comment two out of three cases.

 - if it is about section.key, it should go when section.key goes.
 - if it is about section, it should go when section goes.
 - if it is a more generic comment about this configuration file,
   it should stay.

A configuration file may have something like this:

        # Now, let's list all the remotes I interact with.

        # This is where I push all the separate topics.
        [remote "github"]
                # fetch like everybody else without auth
                url = git://github.com/user/git
                # push with auth
                pushURL = github.com:user/git
                # publish my working area as-is
                mirror

        # Traditional "canonical" one
        [remote "korg"]
                url = k.org:/pub/scm/user/git

If I were to retire the "github" account, removing the entire
[remote "github"] section while keeping the comments before that
section would be confusing, which would end up with:

        # Now, let's list all the remotes I interact with.

        # This is where I push all the separate topics.

        # Traditional "canonical" one
        [remote "korg"]
                url = k.org:/pub/scm/user/git

because I do not push all the separate topics to korg.

Removing the section while removing the comments that pertain to the
section is impossible; "Now, let's list all the remotes" should
stay, "This is where I push" should go.

So a comment outside [section "name"] is tricky; it needs some
mechanism (or convention) to tell us if it is about the particular
section, or it is about the location in the configuration file.

Removing only the keys and keeping the skelton around would give us:

        # Now, let's list all the remotes I interact with.
        # This is where I push all the separate topics.
        [remote "github"]
                # fetch like everybody else without auth
                # push with auth
                # publish my working area as-is

        # Traditional "canonical" one
        [remote "korg"]
                url = k.org:/pub/scm/user/git

which is still confusing, but at least the confusion is not spread
to adjacent sections.  What we could do easily without understanding
natural languages is to remove comments inside [section "name"], but
it still depends on a convention that such a per-key comment comes
before the key, not after, i.e.

        # This is where I push all the separate topics.
        [remote "github"]
                # fetch like everybody else without auth
                ...
                # publish my working area as-is
                mirror
		# note that mirror may push out outside tags/ and heads/

	# Traditional "canonical" one
	[remote "korg"]

We could also take hints from the indentation level, but now we are
deep into the "convention" territory once we start doing that.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-15 18:28       ` Junio C Hamano
@ 2016-08-15 18:54         ` Andreas Schwab
  2016-08-15 18:55         ` Jeff King
  1 sibling, 0 replies; 12+ messages in thread
From: Andreas Schwab @ 2016-08-15 18:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Eli Barzilay, git

On Aug 15 2016, Junio C Hamano <gitster@pobox.com> wrote:

> If we were to remove the section header at this point, we should
> be removing the comment two out of three cases.
>
>  - if it is about section.key, it should go when section.key goes.
>  - if it is about section, it should go when section goes.
>  - if it is a more generic comment about this configuration file,
>    it should stay.

The comment may not be related to the surrounding keys, but just a
commented-out config key that you want to keep around for reference.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-15 18:28       ` Junio C Hamano
  2016-08-15 18:54         ` Andreas Schwab
@ 2016-08-15 18:55         ` Jeff King
  2016-08-15 20:49           ` Junio C Hamano
  2016-08-16  4:06           ` Eli Barzilay
  1 sibling, 2 replies; 12+ messages in thread
From: Jeff King @ 2016-08-15 18:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Andreas Schwab, Eli Barzilay, git

On Mon, Aug 15, 2016 at 11:28:20AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > On Mon, Aug 15, 2016 at 07:34:50PM +0200, Andreas Schwab wrote:
> >
> >> On Aug 15 2016, Jeff King <peff@peff.net> wrote:
> >> 
> >> > And implicit in your test is the other bug, which is that deleting the
> >> > last key in a section leaves the empty header. I think it's related to
> >> > the same issue.
> >> 
> >> Indiscriminately removing empty section headers may break comments that
> >> have been put there on purpose.
> >
> > I know, but we do not even do so discriminately.
> 
> I notice that we have thought about all the issues when we last
> discussed it in 2013.  Refining a message from the earlier thread,
> as it illustrates tricky cases in which we have to be careful.

Thanks for digging up the threads that I was too lazy to find.

I agree with most everything here, though I would be happy if somebody
even wrote a patch to handle the "easy" cases.

> So a comment outside [section "name"] is tricky; it needs some
> mechanism (or convention) to tell us if it is about the particular
> section, or it is about the location in the configuration file.

Keep in mind that even "outside" is hard, because sections do not
explicitly close.

So in:

  [core]
  foo = bar

  # here are my remotes

  [remote "github"]
  url = ...

How do we know that the comment is "outside" and not part of [core]?

We can perhaps guess so because there are no keys after it in the
section, though there are some special cases, like:

  [core]
  foo = bar
  # This isn't necessary anymore because...
  # xyzzy = false

or even:

  [core]
  foo = bar # needed because of xyzzy

You can probably make reasonable cases based on heuristics around
newlines, but that is even further into "convention" territory.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-15 18:55         ` Jeff King
@ 2016-08-15 20:49           ` Junio C Hamano
  2016-08-16  4:06           ` Eli Barzilay
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2016-08-15 20:49 UTC (permalink / raw)
  To: Jeff King; +Cc: Andreas Schwab, Eli Barzilay, git

Jeff King <peff@peff.net> writes:

>> So a comment outside [section "name"] is tricky; it needs some
>> mechanism (or convention) to tell us if it is about the particular
>> section, or it is about the location in the configuration file.
>
> Keep in mind that even "outside" is hard, because sections do not
> explicitly close.

Yes, that was what I wanted to say.  We'd be deep in the
"convention" territory.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-15 18:55         ` Jeff King
  2016-08-15 20:49           ` Junio C Hamano
@ 2016-08-16  4:06           ` Eli Barzilay
  2016-08-16 12:36             ` Jeff King
  1 sibling, 1 reply; 12+ messages in thread
From: Eli Barzilay @ 2016-08-16  4:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Andreas Schwab, git

On Mon, Aug 15, 2016 at 2:55 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Aug 15, 2016 at 11:28:20AM -0700, Junio C Hamano wrote:
>
>> I notice that we have thought about all the issues when we last
>> discussed it in 2013.  Refining a message from the earlier thread,
>> as it illustrates tricky cases in which we have to be careful.
>
> Thanks for digging up the threads that I was too lazy to find.
>
> I agree with most everything here, though I would be happy if somebody
> even wrote a patch to handle the "easy" cases.

So it sounds like removing an empty header is problematic in most cases,
but adding a new variable to an existing empty header should not be...?

I looked at the code, and had a rough sketch that works as follows:

* Make git_parse_source() call the callback in a special way to note
  that a section header is seen (I hacked it by passing a special value,
  a pointer to a global string, as the second argument)

* Add a new store.last_section_offset field

* In store_aux(), if it's getting the special value, and the section
  name matches, save the offset in store.last_section_offset

* In git_config_set_multivar_in_file_gently() right before the "write
  the first part of the config" comment, test that
      (store.seen == 1 && copy_begin == 0 && copy_end == contents_sz
       && store.last_section_offset > 0)
  and if so, write the contents up to that point, and set copy_begin;
  if the condition is false, do the same thing it does now

* A bit after that, add "&& store.last_section_offset == 0" to the
  condition that decides whether to call store_write_section()

It looks to me like something like this can work, but it's very hacky
because I wanted to see if it can work quickly, and because I don't know
if this kind of a solution will be wanted, and because I don't know
enough about that code in general.  Making it be an actual solution
involves a better way to call the callback in a special way (my hack
does the special thing only if `fn==store_aux`, but it shouldn't),
calling it after the last variable in a section is seen so it's added
after that.

So, will something like this be acceptable?  If so, is there anyone who
I can ask questions about the code?

-- 
                   ((x=>x(x))(x=>x(x)))                  Eli Barzilay:
                   http://barzilay.org/                  Maze is Life!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-16  4:06           ` Eli Barzilay
@ 2016-08-16 12:36             ` Jeff King
  2016-08-21 17:09               ` Jakub Narębski
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2016-08-16 12:36 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Junio C Hamano, Andreas Schwab, git

On Tue, Aug 16, 2016 at 12:06:56AM -0400, Eli Barzilay wrote:

> So it sounds like removing an empty header is problematic in most cases,
> but adding a new variable to an existing empty header should not be...?

I think it's less problematic, though there are still some funny cases.
Like:

  [foo]
  # comments for the "foo" section

  # comments introducing the "bar" section
  [bar]

Ideally the new entry goes in between the two comments. But I don't
think it would be unreasonable to add it right after "[foo]". In this
example, using the blank line as a hint might be useful, but that's
going to be a lot harder to coax out of the callback-parser as syntactic
event.

> I looked at the code, and had a rough sketch that works as follows:
> 
> * Make git_parse_source() call the callback in a special way to note
>   that a section header is seen (I hacked it by passing a special value,
>   a pointer to a global string, as the second argument)

We'd want to make sure that didn't kick in for "regular" callers, with
some kind of option (or, blech, a global variable that changes the
behavior of git_parse_source(), though the config code is already full
of them).

> * Add a new store.last_section_offset field
>
> * In store_aux(), if it's getting the special value, and the section
>   name matches, save the offset in store.last_section_offset

Make sense.

> * In git_config_set_multivar_in_file_gently() right before the "write
>   the first part of the config" comment, test that
>       (store.seen == 1 && copy_begin == 0 && copy_end == contents_sz
>        && store.last_section_offset > 0)
>   and if so, write the contents up to that point, and set copy_begin;
>   if the condition is false, do the same thing it does now

This part I'm not sure about. Naively I'd think you could do with less
special-casing here. If we want to add "foo.two" and find:

  [foo]
  one = whatever

then we mark the end of that line as the point we want to copy up to,
and then add our new "two = ..." line.

Conceptually if we see just:

  [foo]

we should be able to kick in the same code. So it seems like the logic
would all be in the detection and setting of the offset.

But I am not sure I understand all of the code here, and it's rather
complicated. So there is probably a good reason that wouldn't work.

> It looks to me like something like this can work, but it's very hacky
> because I wanted to see if it can work quickly, and because I don't know
> if this kind of a solution will be wanted, and because I don't know
> enough about that code in general.  Making it be an actual solution
> involves a better way to call the callback in a special way (my hack
> does the special thing only if `fn==store_aux`, but it shouldn't),
> calling it after the last variable in a section is seen so it's added
> after that.
> 
> So, will something like this be acceptable?  If so, is there anyone who
> I can ask questions about the code?

Yeah, I think this general strategy is the smallest change that could
work.

What I think would be much more sane in general is to parse the whole
thing into a tree (or even a flat list of events), marking each
syntactically as a section header, a key, a comment, a run of
whitespace, and so on. And then do manipulations on that tree (e.g.,
walk down to the section of interest, add a new key at the end), and
then reformat the tree back into a stream of bytes. That lets you
separate the policy logic from the parsing, and do high-level operations
that might involve multiple passes or backtracking in the tree (e.g.,
walk to the section, walk past any existing keys, walk past any comments
but _not_ past any blank lines, then add the new key).

There are other other upsides, too. For example, the current code cannot
write multiple unrelated keys in a single pass.

The downsides is that it's a complete rewrite of the code. So it's a
_lot_ more work, and it carries more risk of regression. So please don't
take this as "no, your solution isn't OK; you have to do the rewrite".
We've been living with band-aids on the config code for a decade, so I
am OK with one more.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Minor bug: git config ignores empty sections
  2016-08-16 12:36             ` Jeff King
@ 2016-08-21 17:09               ` Jakub Narębski
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Narębski @ 2016-08-21 17:09 UTC (permalink / raw)
  To: Jeff King, Eli Barzilay; +Cc: Junio C Hamano, Andreas Schwab, git

W dniu 16.08.2016 o 14:36, Jeff King pisze:

> What I think would be much more sane in general is to parse the whole
> thing into a tree (or even a flat list of events),

List of events is cheaper on memory (though I don't think there is
an issue with size of config file), and can be turned into tree
and DOM easily.

>                                                    marking each
> syntactically as a section header, a key, a comment, a run of
> whitespace, and so on. And then do manipulations on that tree (e.g.,
> walk down to the section of interest, add a new key at the end), and
> then reformat the tree back into a stream of bytes. That lets you
> separate the policy logic from the parsing, and do high-level operations
> that might involve multiple passes or backtracking in the tree (e.g.,
> walk to the section, walk past any existing keys, walk past any comments
> but _not_ past any blank lines, then add the new key).
> 
> There are other other upsides, too. For example, the current code cannot
> write multiple unrelated keys in a single pass.
> 
> The downsides is that it's a complete rewrite of the code. So it's a
> _lot_ more work, and it carries more risk of regression. So please don't
> take this as "no, your solution isn't OK; you have to do the rewrite".
> We've been living with band-aids on the config code for a decade, so I
> am OK with one more.

So how 'git config --show-origin --list' works?  

Ah, I guess it prints as it goes, instead of parsing config file(s)
into stream of events, and printing from those.

-- 
Jakub Narębski


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-08-21 17:10 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-15  1:29 Minor bug: git config ignores empty sections Eli Barzilay
2016-08-15 12:09 ` Jeff King
2016-08-15 17:34   ` Andreas Schwab
2016-08-15 18:09     ` Jeff King
2016-08-15 18:28       ` Junio C Hamano
2016-08-15 18:54         ` Andreas Schwab
2016-08-15 18:55         ` Jeff King
2016-08-15 20:49           ` Junio C Hamano
2016-08-16  4:06           ` Eli Barzilay
2016-08-16 12:36             ` Jeff King
2016-08-21 17:09               ` Jakub Narębski
2016-08-15 17:46 ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).