unofficial mirror of libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* I'm unhappy about twalk_r
@ 2019-05-14 12:08 Florian Weimer
  2019-05-14 18:22 ` DJ Delorie
  2019-05-14 20:40 ` Carlos O'Donell
  0 siblings, 2 replies; 5+ messages in thread
From: Florian Weimer @ 2019-05-14 12:08 UTC (permalink / raw)
  To: libc-alpha

I have second thoughts about twalk_r.  I think the twalk interface is
just broken because the internal tree structure is
implementation-defined (because there is so much flexibility in
balancing a binary tree).  Or put differently, while the overall
iteration order is well-defined by the key ordering, it's mostly
unspecified which nodes are leaf nodes and which are internal nodes.
Exposing this information using the VISIT argument seems wrong to me.

I still think the function is useful, but I think the interface should
look like this instead:

int twalk_r (const void *root,
             int (*action) (const void *nodep, void *closure),
             void *closure);

ACTION is invoked for every node in the tree, in increasing key order,
as long as ACTION returns zero.  Otherwise, no further invocations
happen and twalk_r returns this non-zero value.  If all calls to ACTION
return zero (and the entire tree is traversed), twalk_r returns zero.

I think this is much more useful: It avoids pointless repeated calls to
ACTION, and it allows premature termination of the iteration.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: I'm unhappy about twalk_r
  2019-05-14 12:08 I'm unhappy about twalk_r Florian Weimer
@ 2019-05-14 18:22 ` DJ Delorie
  2019-05-14 19:50   ` Adhemerval Zanella
  2019-05-17 15:43   ` Florian Weimer
  2019-05-14 20:40 ` Carlos O'Donell
  1 sibling, 2 replies; 5+ messages in thread
From: DJ Delorie @ 2019-05-14 18:22 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha


Florian Weimer <fweimer@redhat.com> writes:
> Exposing this information using the VISIT argument seems wrong to me.

I agree.  It would be more useful if the binary-ness of the tree were
*more* exposed, so the user could take advatage of the walk-order, but
we don't expose it.

> I think this is much more useful: It avoids pointless repeated calls to
> ACTION, and it allows premature termination of the iteration.

twalk() is defined by POSIX.  If we have a *_r version of it, it should
conform to the POSIX definition as closely as we can, despite these
inefficiencies.

Perhaps we could add a twalk_find_r() function that does something
between twalk and tfind?  Or twalk_sorted_r()?  Or something else not
called twalk_r()?

Of course, that assumes that we have a need for a walk-no-find function
that isn't already solved by either twalk or tfind.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: I'm unhappy about twalk_r
  2019-05-14 18:22 ` DJ Delorie
@ 2019-05-14 19:50   ` Adhemerval Zanella
  2019-05-17 15:43   ` Florian Weimer
  1 sibling, 0 replies; 5+ messages in thread
From: Adhemerval Zanella @ 2019-05-14 19:50 UTC (permalink / raw)
  To: libc-alpha



On 14/05/2019 15:22, DJ Delorie wrote:
> 
> Florian Weimer <fweimer@redhat.com> writes:
>> Exposing this information using the VISIT argument seems wrong to me.
> 
> I agree.  It would be more useful if the binary-ness of the tree were
> *more* exposed, so the user could take advatage of the walk-order, but
> we don't expose it.
> 
>> I think this is much more useful: It avoids pointless repeated calls to
>> ACTION, and it allows premature termination of the iteration.
> 
> twalk() is defined by POSIX.  If we have a *_r version of it, it should
> conform to the POSIX definition as closely as we can, despite these
> inefficiencies.
> 
> Perhaps we could add a twalk_find_r() function that does something
> between twalk and tfind?  Or twalk_sorted_r()?  Or something else not
> called twalk_r()?
> 
> Of course, that assumes that we have a need for a walk-no-find function
> that isn't already solved by either twalk or tfind.
> 

So should we still provide a twalk_r or aim to just add a more sane 
interface and work with POSIX to promote it?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: I'm unhappy about twalk_r
  2019-05-14 12:08 I'm unhappy about twalk_r Florian Weimer
  2019-05-14 18:22 ` DJ Delorie
@ 2019-05-14 20:40 ` Carlos O'Donell
  1 sibling, 0 replies; 5+ messages in thread
From: Carlos O'Donell @ 2019-05-14 20:40 UTC (permalink / raw)
  To: libc-alpha, Florian Weimer

On 5/14/19 8:08 AM, Florian Weimer wrote:
> I have second thoughts about twalk_r.  I think the twalk interface is
> just broken because the internal tree structure is
> implementation-defined (because there is so much flexibility in
> balancing a binary tree).  Or put differently, while the overall
> iteration order is well-defined by the key ordering, it's mostly
> unspecified which nodes are leaf nodes and which are internal nodes.
> Exposing this information using the VISIT argument seems wrong to me.
> 
> I still think the function is useful, but I think the interface should
> look like this instead:
> 
> int twalk_r (const void *root,
>              int (*action) (const void *nodep, void *closure),
>              void *closure);
> 
> ACTION is invoked for every node in the tree, in increasing key order,
> as long as ACTION returns zero.  Otherwise, no further invocations
> happen and twalk_r returns this non-zero value.  If all calls to ACTION
> return zero (and the entire tree is traversed), twalk_r returns zero.
> 
> I think this is much more useful: It avoids pointless repeated calls to
> ACTION, and it allows premature termination of the iteration.

You identify two distinct issues:

* For a given iteration the twalk API exposes the node placement via the
  VISIT argument, and this information is going to change as the tree is
  rebalanced after modification. It's not clear what use this information
  can have given that it changes dynamically. A quick look at the Java
  TreeMap API shows no "VISIT"-like information for the underlying tree.

* There is no way to terminate the walk early with the exception of
  heavy weight operations like siglongjmp/longjmp/pthread_cancel. The
  alternative interfaces in other languages use iterators and therefore
  you can simply stop iterating if you've found the node of interest.

My own position:

- I agree with DJ, and I think twalk_r should match twalk as closely
  as possible even if we don't like the way twalk or twalk_r behave
  semantically as APIs. Having twalk_r be as straight-forward a
  solution as possible is important. I think twalk_r as you implemented
  it for glibc 2.30 is perfect and we should keep it (unless you want
  to suggest adding back level just for simplicity of conversion so
  users don't have to +/- their own level following preorder/endorder).

- If we are going to discuss a new API we should look at the last decade
  of reference implementations in other languages, and see what would be
  a best fit for C. We should do our due diligence and think about the
  interfaces other languages provide and what we are missing.

- In Java you have TreeMap, in C++ you have std::map, etc. would it be
  useful if glibc had a map structure? With iterators? etc. to replace
  the twalk* API?

In summary:

- Make twalk_r mirror twalk.
- Discuss other APIs.

-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: I'm unhappy about twalk_r
  2019-05-14 18:22 ` DJ Delorie
  2019-05-14 19:50   ` Adhemerval Zanella
@ 2019-05-17 15:43   ` Florian Weimer
  1 sibling, 0 replies; 5+ messages in thread
From: Florian Weimer @ 2019-05-17 15:43 UTC (permalink / raw)
  To: DJ Delorie; +Cc: libc-alpha

* DJ Delorie:

> Florian Weimer <fweimer@redhat.com> writes:
>> Exposing this information using the VISIT argument seems wrong to me.
>
> I agree.  It would be more useful if the binary-ness of the tree were
> *more* exposed, so the user could take advatage of the walk-order, but
> we don't expose it.

But it's really an implementation detail, dependent on how the
implementation chooses to rebalance the tree (if at all).

>> I think this is much more useful: It avoids pointless repeated calls to
>> ACTION, and it allows premature termination of the iteration.
>
> twalk() is defined by POSIX.  If we have a *_r version of it, it should
> conform to the POSIX definition as closely as we can, despite these
> inefficiencies.

Hmm.  Okay.  Carlos said something similar, so I'm going to leave it
as-is.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-05-17 15:43 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-14 12:08 I'm unhappy about twalk_r Florian Weimer
2019-05-14 18:22 ` DJ Delorie
2019-05-14 19:50   ` Adhemerval Zanella
2019-05-17 15:43   ` Florian Weimer
2019-05-14 20:40 ` Carlos O'Donell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).