From: Jim Meyering <jim@meyering.net>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: Bruno Haible <bruno@clisp.org>,
Simon Josefsson <simon@josefsson.org>,
bug-gnulib@gnu.org
Subject: Re: fts: Document this module
Date: Thu, 19 Jan 2023 21:24:01 -0800
Message-ID: <CA+8g5KEBYfqgWEohtbihM5ZcvXUbV-aDsubmXTzZz=J6aRsTkA@mail.gmail.com>
In-Reply-To: <e508a4ec-0e52-c01f-67dd-a7c6f4006ae2@cs.ucla.edu>
On Thu, Jan 19, 2023 at 7:05 PM Paul Eggert <eggert@cs.ucla.edu> wrote:
>
> On 1/19/23 15:41, Bruno Haible wrote:
> > Jim or Paul, what should we state
> > — either in the 'fts' module description, or in the .texi documentation?
>
> The quick thing is to say in both that the description/documentation is
> incomplete, and that people need to read the source code.
>
> Jim may be able to fill in a bit here, since I think he wrote most of
> that stuff. (I haven't checked this though; sorry, I'm a bit crunched
> for time today.)
Thanks for caring/documenting. Here's a quick summary (for more
detail, see the comments in fts_.h).
This started when I found glibc's fts was insufficiently robust to
meet GNU rm's needs (rm was merely the first user; now, many others
use it):
- O(N^2) behavior in the number of file name components due to cycle detection
- max hierarchy depth was 64k due to type of fts_level being a "short"
- subject to O(N^2) effects for directories with many entries, caused by
poor locality of reference; the fix was to process entries in
sorted-inode order (per a heuristic), delaying any "stat" until
operating on the entry
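To illustrate the sorted-inode idea from the last item, here is a minimal sketch, not the code in gnulib's fts.c. It assumes each directory entry has already been read along with the inode number that readdir() reports in d_ino, so no "stat" is needed before sorting. The names struct entry, compare_by_inode and sort_entries_by_inode are hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical record for one directory entry: its name plus the
   inode number taken from the directory listing (no stat() yet).  */
struct entry
{
  const char *name;
  ino_t ino;
};

/* qsort comparator: ascending inode number, written to avoid the
   overflow that a plain subtraction could cause.  */
static int
compare_by_inode (const void *a, const void *b)
{
  const struct entry *ea = a;
  const struct entry *eb = b;
  return (ea->ino > eb->ino) - (ea->ino < eb->ino);
}

/* Reorder ENTRIES so that later per-entry work (stat, unlink, ...)
   visits them in inode order rather than in directory order.  */
void
sort_entries_by_inode (struct entry *entries, size_t n)
{
  qsort (entries, n, sizeof *entries, compare_by_inode);
}
```

A caller would sort once per large directory and only then stat or remove each entry; whether that pays off depends on the file system, which is presumably why the real code applies a heuristic first.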
Re fts's cycle detection:
- contrast glibc's O(depth) time algorithm vs our O(1) implementation
- our cheap-but-lazy O(1)-memory approach is ok for most applications, but
- there's an optional, slightly more costly detect-ASAP approach required for du
(uses O(max-depth-of-hierarchy) memory)
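As a rough illustration of the cheap-but-lazy variant, here is a sketch of one way to get O(1) memory and O(1) work per directory: remember a single (dev, ino) checkpoint, refresh it whenever the step count reaches a power of two, and compare each newly entered directory against it. This is an assumption about the general technique, not a copy of gnulib's implementation; struct dir_id, struct cycle_state, cycle_state_init and cycle_seen are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Identity of a directory: device and inode number.  */
struct dir_id
{
  uint64_t dev;
  uint64_t ino;
};

/* Constant-size state: one remembered directory and a step counter.  */
struct cycle_state
{
  struct dir_id saved;
  uint64_t steps;
};

void
cycle_state_init (struct cycle_state *s)
{
  s->saved.dev = 0;
  s->saved.ino = 0;
  s->steps = 0;
}

static bool
is_power_of_two (uint64_t n)
{
  return n != 0 && (n & (n - 1)) == 0;
}

/* Call once for each directory entered while descending.  Returns true
   when D matches the remembered checkpoint, i.e. a cycle was found.
   Detection is lazy: the walk may go around the loop before the
   checkpoint happens to land inside it.  */
bool
cycle_seen (struct cycle_state *s, struct dir_id d)
{
  if (s->steps > 0 && s->saved.dev == d.dev && s->saved.ino == d.ino)
    return true;

  s->steps++;
  if (is_power_of_two (s->steps))
    s->saved = d;   /* move the checkpoint to steps 1, 2, 4, 8, ...  */
  return false;
}
```

The detect-ASAP alternative would instead keep every ancestor's (dev, ino) in a set, which reports the cycle on first re-entry at the cost of memory proportional to the depth.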
Fixing those things required ABI changes and nontrivial redesign.