git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* fatal: Out of memory, getdelim failed  under NFS mounts
@ 2017-08-09 17:39 Yaroslav Halchenko
  2017-08-10 13:27 ` René Scharfe
  0 siblings, 1 reply; 21+ messages in thread
From: Yaroslav Halchenko @ 2017-08-09 17:39 UTC (permalink / raw)
  To: git@vger.kernel.org

Dear Git gurus,

More context (may be different issue(s)) could be found at 
http://git-annex.branchable.com/forum/git-annex_add_out_of_memory_error/
but currently I am consistently reproducing it while running 
git (1:2.11.0-3 debian stretch build) within debian stretch singularity
environment [1].  

External system is Centos 6.9, and git 1.7.1 (and installed in modules
2.0.4) do not show similar buggy behavior.

NFS mounted partitions are bind mounted inside the sinularity space and
when I try to do some git operations, I get that error inconsistently , e.g.

	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
	fatal: Out of memory, getdelim failed
	error: git://github.com/datalad/datalad did not send all necessary objects

	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
	fatal: Out of memory, getdelim failed
	error: git://github.com/datalad/datalad did not send all necessary objects

	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
	From git://github.com/datalad/datalad
	 * branch              master     -> FETCH_HEAD
	fatal: Out of memory, getdelim failed

and some times it succeeds.  So it smells that some race condition
somewhere...?

any recommendations on how to pin point the "offender"? ;)  Here is the
trailer of one of the straced calls:

...
[pid 19713] getcwd("/ihome/yhalchen/datalad", 129) = 24
[pid 19713] stat(".git", {st_mode=S_IFDIR|0755, st_size=322, ...}) = 0
[pid 19713] lstat(".git/HEAD", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
[pid 19713] open(".git/HEAD", O_RDONLY) = 3
[pid 19713] read(3, "39f80454d31cfb691b006302b1f29dee"..., 255) = 41
[pid 19713] read(3, "", 214)            = 0
[pid 19713] close(3)                    = 0
[pid 19713] lstat(".git/commondir", 0x7ffc1a571190) = -1 ENOENT (No such file or directory)
[pid 19713] access(".git/objects", X_OK) = 0
[pid 19713] access(".git/refs", X_OK)   = 0
[pid 19713] lstat(".git/commondir", 0x7ffc1a571120) = -1 ENOENT (No such file or directory)
[pid 19713] open(".git/config", O_RDONLY) = 3
[pid 19713] fstat(3, {st_mode=S_IFREG|0644, st_size=257, ...}) = 0
[pid 19713] mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ba7894e7000
[pid 19713] read(3, "[core]\n\trepositoryformatversion "..., 524288) = 257
[pid 19713] read(3, "", 524288)         = 0
[pid 19713] close(3)                    = 0
[pid 19713] munmap(0x2ba7894e7000, 528384) = 0
[pid 19713] stat(".", {st_mode=S_IFDIR|0755, st_size=907, ...}) = 0
[pid 19713] getcwd("/ihome/yhalchen/datalad", 129) = 24
[pid 19713] chdir(".")                  = 0
[pid 19713] getcwd("/ihome/yhalchen/datalad", 130) = 24
[pid 19713] lstat("/ihome/yhalchen/datalad", {st_mode=S_IFDIR|0755, st_size=907, ...}) = 0
[pid 19713] chdir("/ihome/yhalchen/datalad") = 0
[pid 19713] stat(".git", {st_mode=S_IFDIR|0755, st_size=322, ...}) = 0
[pid 19713] lstat(".git/commondir", 0x7ffc1a571140) = -1 ENOENT (No such file or directory)
[pid 19713] access("/etc/gitconfig", R_OK) = -1 ENOENT (No such file or directory)
[pid 19713] access("/ihome/yhalchen/.config/git/config", R_OK) = -1 ENOENT (No such file or directory)
[pid 19713] access("/ihome/yhalchen/.gitconfig", R_OK) = -1 ENOENT (No such file or directory)
[pid 19713] access(".git/config", R_OK) = 0
[pid 19713] open(".git/config", O_RDONLY) = 3
[pid 19713] fstat(3, {st_mode=S_IFREG|0644, st_size=257, ...}) = 0
[pid 19713] brk(0x2ba78afea000)         = 0x2ba78afea000
[pid 19713] read(3, "[core]\n\trepositoryformatversion "..., 524288) = 257
[pid 19713] read(3, "", 524288)         = 0
[pid 19713] close(3)                    = 0
[pid 19713] access("/etc/gitconfig", R_OK) = -1 ENOENT (No such file or directory)
[pid 19713] access("/ihome/yhalchen/.config/git/config", R_OK) = -1 ENOENT (No such file or directory)
[pid 19713] access("/ihome/yhalchen/.gitconfig", R_OK) = -1 ENOENT (No such file or directory)
[pid 19713] access(".git/config", R_OK) = 0
[pid 19713] open(".git/config", O_RDONLY) = 3
[pid 19713] fstat(3, {st_mode=S_IFREG|0644, st_size=257, ...}) = 0
[pid 19713] read(3, "[core]\n\trepositoryformatversion "..., 524288) = 257
[pid 19713] read(3, "", 524288)         = 0
[pid 19713] close(3)                    = 0
[pid 19713] open(".git/objects/pack", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
[pid 19713] fstat(3, {st_mode=S_IFDIR|0755, st_size=270, ...}) = 0
[pid 19713] getdents(3, /* 6 entries */, 524288) = 336
[pid 19713] access(".git/objects/pack/pack-fd9e70075570d8ec41f12605852f54f1cb9771a8.keep", F_OK) = -1 ENOENT (No such file or directory)
[pid 19713] stat(".git/objects/pack/pack-fd9e70075570d8ec41f12605852f54f1cb9771a8.pack", {st_mode=S_IFREG|0444, st_size=10019975, ...}) = 0
[pid 19713] access(".git/objects/pack/pack-f1fc124e3aa1619d65a6ba56219f84871a762775.keep", F_OK) = -1 ENOENT (No such file or directory)
[pid 19713] stat(".git/objects/pack/pack-f1fc124e3aa1619d65a6ba56219f84871a762775.pack", {st_mode=S_IFREG|0444, st_size=610330, ...}) = 0
[pid 19713] getdents(3, /* 0 entries */, 524288) = 0
[pid 19713] close(3)                    = 0
[pid 19713] getcwd("/ihome/yhalchen/datalad", 129) = 24
[pid 19713] open(".git/objects/info/alternates", O_RDONLY|O_NOATIME|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 19713] open(".git/objects/17", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 19713] fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 17), ...}) = 0
[pid 19713] close(1)                    = 0
[pid 19713] exit_group(0)               = ?
[pid 19713] +++ exited with 0 +++
[pid 19710] <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 19713
[pid 19710] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=19713, si_uid=14076, si_status=0, si_utime=0, si_stime=0} ---
[pid 19710] exit_group(1)               = ?
[pid 19710] +++ exited with 1 +++
<... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 19710
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=19710, si_uid=14076, si_status=1, si_utime=0, si_stime=2} ---
exit_group(1)                           = ?
+++ exited with 1 +++


PS please CC me in replies! Thanks in advance

[1] http://datasets.datalad.org/singularity/neurodebian-v2.1.img.tgz
it is a tarball to extract/run with , I promise I did no evil in there ;)

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: fatal: Out of memory, getdelim failed under NFS mounts
  2017-08-09 17:39 fatal: Out of memory, getdelim failed under NFS mounts Yaroslav Halchenko
@ 2017-08-10 13:27 ` René Scharfe
  2017-08-10 14:43   ` Yaroslav Halchenko
  2017-08-10 18:56   ` Junio C Hamano
  0 siblings, 2 replies; 21+ messages in thread
From: René Scharfe @ 2017-08-10 13:27 UTC (permalink / raw)
  To: Yaroslav Halchenko; +Cc: git@vger.kernel.org

Am 09.08.2017 um 19:39 schrieb Yaroslav Halchenko:
> More context (may be different issue(s)) could be found at
> http://git-annex.branchable.com/forum/git-annex_add_out_of_memory_error/
> but currently I am consistently reproducing it while running
> git (1:2.11.0-3 debian stretch build) within debian stretch singularity
> environment [1].
> 
> External system is Centos 6.9, and git 1.7.1 (and installed in modules
> 2.0.4) do not show similar buggy behavior.
> 
> NFS mounted partitions are bind mounted inside the sinularity space and
> when I try to do some git operations, I get that error inconsistently , e.g.
> 
> 	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
> 	fatal: Out of memory, getdelim failed
> 	error: git://github.com/datalad/datalad did not send all necessary objects
> 
> 	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
> 	fatal: Out of memory, getdelim failed
> 	error: git://github.com/datalad/datalad did not send all necessary objects
> 
> 	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
> 	From git://github.com/datalad/datalad
> 	 * branch              master     -> FETCH_HEAD
> 	fatal: Out of memory, getdelim failed
> 
> and some times it succeeds.  So it smells that some race condition
> somewhere...?

I doubt the type of file system matters.  The questions are: How much
main memory do you have, what is git trying to cram into it, is there
a way to reduce the memory footprint or do you need to add more RAM?

> any recommendations on how to pin point the "offender"? ;)
Running "GIT_TRACE=1 git pull --ff-only origin master" would be a
good start, I think, to find out which of the different activities
that pull is doing causes the out-of-memory error.

"free" and "ulimit -a" can help you find out how much memory you can
use.

Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report?
getdelim() is used mostly to read lines from files like these and in
the admittedly unlikely case that they are *really* long such an
error would be expected.

René

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: fatal: Out of memory, getdelim failed under NFS mounts
  2017-08-10 13:27 ` René Scharfe
@ 2017-08-10 14:43   ` Yaroslav Halchenko
  2017-08-10 19:44     ` René Scharfe
  2017-08-10 18:56   ` Junio C Hamano
  1 sibling, 1 reply; 21+ messages in thread
From: Yaroslav Halchenko @ 2017-08-10 14:43 UTC (permalink / raw)
  To: git@vger.kernel.org

Thank you René!  comments/answers embedded below

On Thu, 10 Aug 2017, René Scharfe wrote:

> Am 09.08.2017 um 19:39 schrieb Yaroslav Halchenko:
> > More context (may be different issue(s)) could be found at
> > http://git-annex.branchable.com/forum/git-annex_add_out_of_memory_error/
> > but currently I am consistently reproducing it while running
> > git (1:2.11.0-3 debian stretch build) within debian stretch singularity
> > environment [1].

> > External system is Centos 6.9, and git 1.7.1 (and installed in modules
> > 2.0.4) do not show similar buggy behavior.

> > NFS mounted partitions are bind mounted inside the sinularity space and
> > when I try to do some git operations, I get that error inconsistently , e.g.

> > 	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
> > 	fatal: Out of memory, getdelim failed
> > 	error: git://github.com/datalad/datalad did not send all necessary objects

> > 	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
> > 	fatal: Out of memory, getdelim failed
> > 	error: git://github.com/datalad/datalad did not send all necessary objects

> > 	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
> > 	From git://github.com/datalad/datalad
> > 	 * branch              master     -> FETCH_HEAD
> > 	fatal: Out of memory, getdelim failed

> > and some times it succeeds.  So it smells that some race condition
> > somewhere...?

> I doubt the type of file system matters.  

So far it has been a very consistent indicator.  I did not manage to get
this error while performing the same operation under /tmp (bind to local
mounted drive), where it also feels going faster (again suggesting that
original issue is some kind of a race)

> The questions are: How much
> main memory do you have, what is git trying to cram into it, is there
> a way to reduce the memory footprint or do you need to add more RAM?
> ... reordered ...
> "free" and "ulimit -a" can help you find out how much memory you can
> use.

I think those aren't the reason:

yhalchen@discovery:/mnt/scratch/yoh/datalad$ free -h
              total        used        free      shared  buff/cache   available
Mem:           126G        2.5G         90G        652K         33G        123G
Swap:          127G        1.7M        127G
yhalchen@discovery:/mnt/scratch/yoh/datalad$ ulimit
unlimited

> > any recommendations on how to pin point the "offender"? ;)
> Running "GIT_TRACE=1 git pull --ff-only origin master" would be a
> good start, I think, to find out which of the different activities
> that pull is doing causes the out-of-memory error.

samples of bad, and then good runs (from eyeballing -- the same until
error message):

yhalchen@discovery:/mnt/scratch/yoh$ cat git_trace_bad.log
14:05:25.782270 git.c:371               trace: built-in: git 'pull' '--ff-only' 'origin' 'master'
14:05:25.795036 run-command.c:350       trace: run_command: 'fetch' '--update-head-ok' 'origin' 'master'
14:05:25.795332 exec_cmd.c:116          trace: exec: 'git' 'fetch' '--update-head-ok' 'origin' 'master'
14:05:25.797212 git.c:371               trace: built-in: git 'fetch' '--update-head-ok' 'origin' 'master'
14:05:25.904088 run-command.c:350       trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
14:05:26.085954 run-command.c:350       trace: run_command: 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103'
14:05:26.086333 exec_cmd.c:116          trace: exec: 'git' 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103'
14:05:26.088382 git.c:371               trace: built-in: git 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103'
14:05:26.133326 run-command.c:350       trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
14:05:26.133688 exec_cmd.c:116          trace: exec: 'git' 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
14:05:26.135493 git.c:371               trace: built-in: git 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
fatal: Out of memory, getdelim failed
error: git://github.com/datalad/datalad did not send all necessary objects

14:05:26.138838 run-command.c:350       trace: run_command: 'gc' '--auto'
14:05:26.139131 exec_cmd.c:116          trace: exec: 'git' 'gc' '--auto'
14:05:26.141108 git.c:371               trace: built-in: git 'gc' '--auto'


yhalchen@discovery:/mnt/scratch/yoh$ cat git_trace_good.log
14:05:37.851862 git.c:371               trace: built-in: git 'pull' '--ff-only' 'origin' 'master'
14:05:37.854250 run-command.c:350       trace: run_command: 'fetch' '--update-head-ok' 'origin' 'master'
14:05:37.854527 exec_cmd.c:116          trace: exec: 'git' 'fetch' '--update-head-ok' 'origin' 'master'
14:05:37.856389 git.c:371               trace: built-in: git 'fetch' '--update-head-ok' 'origin' 'master'
14:05:37.954099 run-command.c:350       trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
14:05:38.118652 run-command.c:350       trace: run_command: 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11688 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103'
14:05:38.119011 exec_cmd.c:116          trace: exec: 'git' 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11688 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103'
14:05:38.120924 git.c:371               trace: built-in: git 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11688 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103'
14:05:38.167508 run-command.c:350       trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
14:05:38.167851 exec_cmd.c:116          trace: exec: 'git' 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
14:05:38.169726 git.c:371               trace: built-in: git 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
From git://github.com/datalad/datalad
 * branch              master     -> FETCH_HEAD
   39f80454..1f90ef47  master     -> origin/master
14:05:38.306113 run-command.c:1130      run_processes_parallel: preparing to run up to 1 tasks
14:05:38.306148 run-command.c:1162      run_processes_parallel: done
14:05:38.306313 run-command.c:350       trace: run_command: 'gc' '--auto'
14:05:38.306616 exec_cmd.c:116          trace: exec: 'git' 'gc' '--auto'
14:05:38.308598 git.c:371               trace: built-in: git 'gc' '--auto'
14:05:38.311380 run-command.c:350       trace: run_command: 'merge' '--ff-only' 'FETCH_HEAD'
14:05:38.311645 exec_cmd.c:116          trace: exec: 'git' 'merge' '--ff-only' 'FETCH_HEAD'
14:05:38.313384 git.c:371               trace: built-in: git 'merge' '--ff-only' 'FETCH_HEAD'
14:05:45.092247 run-command.c:350       trace: run_command: 'gc' '--auto'
Updating 39f80454..1f90ef47
Fast-forward
14:05:45.092620 exec_cmd.c:116          trace: exec: 'git' 'gc' '--auto'
14:05:45.095862 git.c:371               trace: built-in: git 'gc' '--auto'


> Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report?

"varying" and not consistent with causing an error (first trials, where I
did not cat .git/FETCH_HEAD kinda suggested differently):

yhalchen@discovery:/mnt/scratch/yoh/datalad$ cat .git/FETCH_HEAD; wc -L .git/FETCH_HEAD .git/packed-refs; git pull --ff-only origin master
1f90ef474ee200befea19ba77242fa44f16739f0                branch 'master' of git://github.com/datalad/datalad
 107 .git/FETCH_HEAD
  90 .git/packed-refs
 107 total
From git://github.com/datalad/datalad
 * branch              master     -> FETCH_HEAD
Already up-to-date.
yhalchen@discovery:/mnt/scratch/yoh/datalad$ cat .git/FETCH_HEAD; wc -L .git/FETCH_HEAD .git/packed-refs; git pull --ff-only origin master 
1f90ef474ee200befea19ba77242fa44f16739f0                branch 'master' of git://github.com/datalad/datalad
 107 .git/FETCH_HEAD
  90 .git/packed-refs
 107 total
fatal: Out of memory, getdelim failed
error: git://github.com/datalad/datalad did not send all necessary objects

yhalchen@discovery:/mnt/scratch/yoh/datalad$ cat .git/FETCH_HEAD; wc -L .git/FETCH_HEAD .git/packed-refs; git pull --ff-only origin master 
   0 .git/FETCH_HEAD
  90 .git/packed-refs
  90 total
fatal: Out of memory, getdelim failed
error: git://github.com/datalad/datalad did not send all necessary objects

yhalchen@discovery:/mnt/scratch/yoh/datalad$ cat .git/FETCH_HEAD; wc -L .git/FETCH_HEAD .git/packed-refs; git pull --ff-only origin master 
   0 .git/FETCH_HEAD
  90 .git/packed-refs
  90 total
fatal: Out of memory, getdelim failed
error: git://github.com/datalad/datalad did not send all necessary objects

yhalchen@discovery:/mnt/scratch/yoh/datalad$ cat .git/FETCH_HEAD; wc -L .git/FETCH_HEAD .git/packed-refs; git pull --ff-only origin master 
   0 .git/FETCH_HEAD
  90 .git/packed-refs
  90 total
From git://github.com/datalad/datalad
 * branch              master     -> FETCH_HEAD
Already up-to-date.


> getdelim() is used mostly to read lines from files like these and in
> the admittedly unlikely case that they are *really* long such an
> error would be expected.

packed-ref pointer seems to also relate to the strace output around the
point of message:

[pid 12843] open(".git/packed-refs", O_RDONLY) = 3
[pid 12843] fstat(3, {st_mode=S_IFREG|0644, st_size=5042, ...}) = 0
[pid 12843] fstat(3, {st_mode=S_IFREG|0644, st_size=5042, ...}) = 0
[pid 12843] brk(0x2b5704317000)         = 0x2b5704276000
[pid 12843] mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b570451d000
[pid 12843] read(3, "# pack-refs with: peeled fully-p"..., 524288) = 5042
[pid 12843] read(3, "", 524288)         = 0
[pid 12843] fstat(2, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
[pid 12843] write(2, "fatal: Out of memory, getdelim f"..., 38fatal: Out of memory, getdelim failed
) = 38
[pid 12843] lseek(0, -41, SEEK_CUR)     = -1 ESPIPE (Illegal seek)
[pid 12843] exit_group(128)             = ?


I verified that content of that packed-refs didn't change between good and bad
runs.

NB is there a diff which could be given regexes within a line to ignore
in diffs in so we could still retain original lines, i.e. answer to
https://stackoverflow.com/questions/15841223/diff-while-ignoring-patterns-within-a-line-but-not-the-entire-line
?

FWIW -- here is a diff (from good to bad run) of strace (with pid/address
info changed to stay the same for comparisons):

@@ -3121,6 +3118,7 @@
 [pid YYYYY] close(3)                    = 0
 [pid YYYYY] brk(NULL)                   = 0xXXXXX
 [pid YYYYY] brk(0xXXXXX)         = 0xXXXXX
+[pid YYYYY] mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xXXXXX
 [pid YYYYY] open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
 [pid YYYYY] open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
 [pid YYYYY] open("/usr/lib/locale/en_US/LC_MESSAGES", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
@@ -3144,11 +3142,9 @@
 [pid YYYYY] lstat(".git/commondir", 0xXXXXX) = -1 ENOENT (No such file or directory)
 [pid YYYYY] open(".git/config", O_RDONLY) = 3
 [pid YYYYY] fstat(3, {st_mode=S_IFREG|0644, st_size=257, ...}) = 0
-[pid YYYYY] mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xXXXXX
 [pid YYYYY] read(3, "[core]\n\trepositoryformatversion "..., 524288) = 257
 [pid YYYYY] read(3, "", 524288)         = 0
 [pid YYYYY] close(3)                    = 0
-[pid YYYYY] munmap(0xXXXXX, 528384) = 0
 [pid YYYYY] stat(".", {st_mode=S_IFDIR|0755, st_size=907, ...}) = 0
 [pid YYYYY] getcwd("/mnt/scratch/yoh/datalad", 129) = 25
 [pid YYYYY] chdir(".")                  = 0
@@ -3163,7 +3159,6 @@
 [pid YYYYY] access(".git/config", R_OK) = 0
 [pid YYYYY] open(".git/config", O_RDONLY) = 3
 [pid YYYYY] fstat(3, {st_mode=S_IFREG|0644, st_size=257, ...}) = 0
-[pid YYYYY] brk(0xXXXXX)         = 0xXXXXX
 [pid YYYYY] read(3, "[core]\n\trepositoryformatversion "..., 524288) = 257
 [pid YYYYY] read(3, "", 524288)         = 0
 [pid YYYYY] close(3)                    = 0
@@ -3180,245 +3175,48 @@
 [pid YYYYY] read(0, "1f90ef474ee200befea19ba77242fa44"..., 4096) = 82
 [pid YYYYY] open(".git/refs/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
 [pid YYYYY] fstat(3, {st_mode=S_IFDIR|0755, st_size=70, ...}) = 0
-[pid YYYYY] brk(0xXXXXX)         = 0xXXXXX
+[pid YYYYY] mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xXXXXX
 [pid YYYYY] getdents(3, /* 5 entries */, 524288) = 136
 [pid YYYYY] stat(".git/refs/heads", {st_mode=S_IFDIR|0755, st_size=24, ...}) = 0
 [pid YYYYY] stat(".git/refs/remotes", {st_mode=S_IFDIR|0755, st_size=24, ...}) = 0
 [pid YYYYY] stat(".git/refs/tags", {st_mode=S_IFDIR|0755, st_size=47, ...}) = 0
 [pid YYYYY] getdents(3, /* 0 entries */, 524288) = 0
+[pid YYYYY] munmap(0xXXXXX, 528384) = 0
 [pid YYYYY] close(3)                    = 0
 [pid YYYYY] open(".git/refs/bisect", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
 [pid YYYYY] open(".git/packed-refs", O_RDONLY) = 3
 [pid YYYYY] fstat(3, {st_mode=S_IFREG|0644, st_size=5042, ...}) = 0
 [pid YYYYY] fstat(3, {st_mode=S_IFREG|0644, st_size=5042, ...}) = 0
-[pid YYYYY] read(3, "# pack-refs with: peeled fully-p"..., 524288) = 5042
-[pid YYYYY] read(3, "", 524288)         = 0
-[pid YYYYY] close(3)                    = 0
-[pid YYYYY] open(".git/objects/pack", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
-[pid YYYYY] fstat(3, {st_mode=S_IFDIR|0755, st_size=405, ...}) = 0
-[pid YYYYY] getdents(3, /* 8 entries */, 524288) = 480
-[pid YYYYY] access(".git/objects/pack/pack-a5ed9c83e2a39b21a2ab12dc351d2a7513d76e48.keep", F_OK) = -1 ENOENT (No such file or directory)
-[pid YYYYY] stat(".git/objects/pack/pack-a5ed9c83e2a39b21a2ab12dc351d2a7513d76e48.pack", {st_mode=S_IFREG|0444, st_size=124835, ...}) = 0
-[pid YYYYY] access(".git/objects/pack/pack-fd9e70075570d8ec41f12605852f54f1cb9771a8.keep", F_OK) = -1 ENOENT (No such file or directory)
-[pid YYYYY] stat(".git/objects/pack/pack-fd9e70075570d8ec41f12605852f54f1cb9771a8.pack", {st_mode=S_IFREG|0444, st_size=10019975, ...}) = 0
-[pid YYYYY] access(".git/objects/pack/pack-f1fc124e3aa1619d65a6ba56219f84871a762775.keep", F_OK) = -1 ENOENT (No such file or directory)
-[pid YYYYY] stat(".git/objects/pack/pack-f1fc124e3aa1619d65a6ba56219f84871a762775.pack", {st_mode=S_IFREG|0444, st_size=610330, ...}) = 0




-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: fatal: Out of memory, getdelim failed under NFS mounts
  2017-08-10 13:27 ` René Scharfe
  2017-08-10 14:43   ` Yaroslav Halchenko
@ 2017-08-10 18:56   ` Junio C Hamano
  2017-08-10 19:44     ` Yaroslav Halchenko
  2017-08-10 19:58     ` René Scharfe
  1 sibling, 2 replies; 21+ messages in thread
From: Junio C Hamano @ 2017-08-10 18:56 UTC (permalink / raw)
  To: René Scharfe; +Cc: Yaroslav Halchenko, git@vger.kernel.org

René Scharfe <l.s.r@web.de> writes:

> I doubt the type of file system matters.  The questions are: How much
> main memory do you have, what is git trying to cram into it, is there
> a way to reduce the memory footprint or do you need to add more RAM?
>
>> any recommendations on how to pin point the "offender"? ;)
> Running "GIT_TRACE=1 git pull --ff-only origin master" would be a
> good start, I think, to find out which of the different activities
> that pull is doing causes the out-of-memory error.
>
> "free" and "ulimit -a" can help you find out how much memory you can
> use.
>
> Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report?
> getdelim() is used mostly to read lines from files like these and in
> the admittedly unlikely case that they are *really* long such an
> error would be expected.

There is only one getdelim() call, which was introduced in v2.5.0
timeframe, and it is used like this:

	r = getdelim(&sb->buf, &sb->alloc, term, fp);

	if (r > 0) {
		sb->len = r;
		return 0;
	}
	assert(r == -1);

	/*
	 * Normally we would have called xrealloc, which will try to free
	 * memory and recover. But we have no way to tell getdelim() to do so.
	 * Worse, we cannot try to recover ENOMEM ourselves, because we have
	 * no idea how many bytes were read by getdelim.
	 *
	 * Dying here is reasonable. It mirrors what xrealloc would do on
	 * catastrophic memory failure. We skip the opportunity to free pack
	 * memory and retry, but that's unlikely to help for a malloc small
	 * enough to hold a single line of input, anyway.
	 */
	if (errno == ENOMEM)
		die("Out of memory, getdelim failed");

So the function is returning -1 and leaving ENOMEM in errno on
Yaroslav's system.  

I wonder if we are truly hitting out of memory, though.  The same
symptom could bee seen if getdelim() does not touch errno when it
returns -1, but some other system call earlier set it to ENOMEM,
for example.

If the same version of Git is recompiled there without HAVE_GETDELIM
defined, would it still die with out of memory (presumably inside
the call to strbuf_grow() in the strbuf_getwholeline() function)?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: fatal: Out of memory, getdelim failed under NFS mounts
  2017-08-10 14:43   ` Yaroslav Halchenko
@ 2017-08-10 19:44     ` René Scharfe
  0 siblings, 0 replies; 21+ messages in thread
From: René Scharfe @ 2017-08-10 19:44 UTC (permalink / raw)
  To: Yaroslav Halchenko, git@vger.kernel.org

Am 10.08.2017 um 16:43 schrieb Yaroslav Halchenko:
> On Thu, 10 Aug 2017, René Scharfe wrote:
>> Am 09.08.2017 um 19:39 schrieb Yaroslav Halchenko:
>>> More context (may be different issue(s)) could be found at
>>> http://git-annex.branchable.com/forum/git-annex_add_out_of_memory_error/
>>> but currently I am consistently reproducing it while running
>>> git (1:2.11.0-3 debian stretch build) within debian stretch singularity
>>> environment [1].
> 
>>> External system is Centos 6.9, and git 1.7.1 (and installed in modules
>>> 2.0.4) do not show similar buggy behavior.
> 
>>> NFS mounted partitions are bind mounted inside the sinularity space and
>>> when I try to do some git operations, I get that error inconsistently , e.g.
> 
>>> 	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
>>> 	fatal: Out of memory, getdelim failed
>>> 	error: git://github.com/datalad/datalad did not send all necessary objects
> 
>>> 	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
>>> 	fatal: Out of memory, getdelim failed
>>> 	error: git://github.com/datalad/datalad did not send all necessary objects
> 
>>> 	yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin master
>>> 	From git://github.com/datalad/datalad
>>> 	 * branch              master     -> FETCH_HEAD
>>> 	fatal: Out of memory, getdelim failed
> 
>>> and some times it succeeds.  So it smells that some race condition
>>> somewhere...?
> 
>> I doubt the type of file system matters.
> 
> So far it has been a very consistent indicator.  I did not manage to get
> this error while performing the same operation under /tmp (bind to local
> mounted drive), where it also feels going faster (again suggesting that
> original issue is some kind of a race)

Well, there have been bugs in getdelim() before, e.g.:

  https://bugzilla.redhat.com/show_bug.cgi?id=601071
  https://bugzilla.redhat.com/show_bug.cgi?id=1332917

git v2.5.0 was the first version to use it.   So if all else fails it may
be worth compiling git without HAVE_GETDELIM.

>> The questions are: How much
>> main memory do you have, what is git trying to cram into it, is there
>> a way to reduce the memory footprint or do you need to add more RAM?
>> ... reordered ...
>> "free" and "ulimit -a" can help you find out how much memory you can
>> use.
> 
> I think those aren't the reason:
> 
> yhalchen@discovery:/mnt/scratch/yoh/datalad$ free -h
>                total        used        free      shared  buff/cache   available
> Mem:           126G        2.5G         90G        652K         33G        123G
> Swap:          127G        1.7M        127G

Is all of that available to the git in the Singularity container or
is that the memory size of the host and there's some kind of limit
for the guests?

> yhalchen@discovery:/mnt/scratch/yoh/datalad$ ulimit
> unlimited

That's just the maximum file size; memory-related limits are more
interesting for this case.  "ulimit -a" will show all limits.

>>> any recommendations on how to pin point the "offender"? ;)
>> Running "GIT_TRACE=1 git pull --ff-only origin master" would be a
>> good start, I think, to find out which of the different activities
>> that pull is doing causes the out-of-memory error.
> 
> samples of bad, and then good runs (from eyeballing -- the same until
> error message):
> 
> yhalchen@discovery:/mnt/scratch/yoh$ cat git_trace_bad.log
> 14:05:25.782270 git.c:371               trace: built-in: git 'pull' '--ff-only' 'origin' 'master'
> 14:05:25.795036 run-command.c:350       trace: run_command: 'fetch' '--update-head-ok' 'origin' 'master'
> 14:05:25.795332 exec_cmd.c:116          trace: exec: 'git' 'fetch' '--update-head-ok' 'origin' 'master'
> 14:05:25.797212 git.c:371               trace: built-in: git 'fetch' '--update-head-ok' 'origin' 'master'
> 14:05:25.904088 run-command.c:350       trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
> 14:05:26.085954 run-command.c:350       trace: run_command: 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103'
> 14:05:26.086333 exec_cmd.c:116          trace: exec: 'git' 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103'
> 14:05:26.088382 git.c:371               trace: built-in: git 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103'
> 14:05:26.133326 run-command.c:350       trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
> 14:05:26.133688 exec_cmd.c:116          trace: exec: 'git' 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
> 14:05:26.135493 git.c:371               trace: built-in: git 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
> fatal: Out of memory, getdelim failed
> error: git://github.com/datalad/datalad did not send all necessary objects

That rev-list call comes from connected::check_connected(); the error
message from builtin/fetch.c::store_updated_refs(), which actually calls
check_connected().  So you should be able to reproduce the issue just
with git fetch or with git rev-list.  The latter requires passing the
right objects to the command, but perhaps reproduction is possible with
guessed or arbitrary values.

I don't know which files these commands access with getdelim() except
for the ones mentioned below, though.

>> Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report?
> 
> "varying" and not consistent with causing an error (first trials, where I
> did not cat .git/FETCH_HEAD kinda suggested differently):
> 
> yhalchen@discovery:/mnt/scratch/yoh/datalad$ cat .git/FETCH_HEAD; wc -L .git/FETCH_HEAD .git/packed-refs; git pull --ff-only origin master
> 1f90ef474ee200befea19ba77242fa44f16739f0                branch 'master' of git://github.com/datalad/datalad
>   107 .git/FETCH_HEAD
>    90 .git/packed-refs
>   107 total

These line lengths are unlikely to exhaust the memory.  I was rather
hoping for values in the range of billions due to some kind of freak
accident or unconventional refs usage.

> NB is there a diff which could be given regexes within a line to ignore
> in diffs in so we could still retain original lines, i.e. answer to
> https://stackoverflow.com/questions/15841223/diff-while-ignoring-patterns-within-a-line-but-not-the-entire-line
> ?
> 
> FWIW -- here is a diff (from good to bad run) of strace (with pid/address
> info changed to stay the same for comparisons):
> 
> @@ -3121,6 +3118,7 @@
>   [pid YYYYY] close(3)                    = 0
>   [pid YYYYY] brk(NULL)                   = 0xXXXXX
>   [pid YYYYY] brk(0xXXXXX)         = 0xXXXXX
> +[pid YYYYY] mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xXXXXX

That looks like malloc() increasing its pool by 1 MB...

>   [pid YYYYY] open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>   [pid YYYYY] open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>   [pid YYYYY] open("/usr/lib/locale/en_US/LC_MESSAGES", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> @@ -3144,11 +3142,9 @@
>   [pid YYYYY] lstat(".git/commondir", 0xXXXXX) = -1 ENOENT (No such file or directory)
>   [pid YYYYY] open(".git/config", O_RDONLY) = 3
>   [pid YYYYY] fstat(3, {st_mode=S_IFREG|0644, st_size=257, ...}) = 0
> -[pid YYYYY] mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xXXXXX
>   [pid YYYYY] read(3, "[core]\n\trepositoryformatversion "..., 524288) = 257
>   [pid YYYYY] read(3, "", 524288)         = 0
>   [pid YYYYY] close(3)                    = 0
> -[pid YYYYY] munmap(0xXXXXX, 528384) = 0

... and in the working case it just gets 516 KB and releases it
shortly afterwards...

>   [pid YYYYY] stat(".", {st_mode=S_IFDIR|0755, st_size=907, ...}) = 0
>   [pid YYYYY] getcwd("/mnt/scratch/yoh/datalad", 129) = 25
>   [pid YYYYY] chdir(".")                  = 0
> @@ -3163,7 +3159,6 @@
>   [pid YYYYY] access(".git/config", R_OK) = 0
>   [pid YYYYY] open(".git/config", O_RDONLY) = 3
>   [pid YYYYY] fstat(3, {st_mode=S_IFREG|0644, st_size=257, ...}) = 0
> -[pid YYYYY] brk(0xXXXXX)         = 0xXXXXX
>   [pid YYYYY] read(3, "[core]\n\trepositoryformatversion "..., 524288) = 257
>   [pid YYYYY] read(3, "", 524288)         = 0
>   [pid YYYYY] close(3)                    = 0
> @@ -3180,245 +3175,48 @@
>   [pid YYYYY] read(0, "1f90ef474ee200befea19ba77242fa44"..., 4096) = 82
>   [pid YYYYY] open(".git/refs/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
>   [pid YYYYY] fstat(3, {st_mode=S_IFDIR|0755, st_size=70, ...}) = 0
> -[pid YYYYY] brk(0xXXXXX)         = 0xXXXXX
> +[pid YYYYY] mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xXXXXX
>   [pid YYYYY] getdents(3, /* 5 entries */, 524288) = 136
>   [pid YYYYY] stat(".git/refs/heads", {st_mode=S_IFDIR|0755, st_size=24, ...}) = 0
>   [pid YYYYY] stat(".git/refs/remotes", {st_mode=S_IFDIR|0755, st_size=24, ...}) = 0
>   [pid YYYYY] stat(".git/refs/tags", {st_mode=S_IFDIR|0755, st_size=47, ...}) = 0
>   [pid YYYYY] getdents(3, /* 0 entries */, 524288) = 0
> +[pid YYYYY] munmap(0xXXXXX, 528384) = 0

... which is done in the non-working case as well, just a bit
later...

>   [pid YYYYY] close(3)                    = 0
>   [pid YYYYY] open(".git/refs/bisect", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>   [pid YYYYY] open(".git/packed-refs", O_RDONLY) = 3
>   [pid YYYYY] fstat(3, {st_mode=S_IFREG|0644, st_size=5042, ...}) = 0
>   [pid YYYYY] fstat(3, {st_mode=S_IFREG|0644, st_size=5042, ...}) = 0
> -[pid YYYYY] read(3, "# pack-refs with: peeled fully-p"..., 524288) = 5042

... and it doesn't seem to reach the stage when the packed refs are read.

So the bad case has malloc() grab and never release one more megabyte.
Is there are a limit for Singularity containers?  Can you increase the
one for yours by 1 MB? :)

René

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: fatal: Out of memory, getdelim failed under NFS mounts
  2017-08-10 18:56   ` Junio C Hamano
@ 2017-08-10 19:44     ` Yaroslav Halchenko
  2017-08-10 19:58     ` René Scharfe
  1 sibling, 0 replies; 21+ messages in thread
From: Yaroslav Halchenko @ 2017-08-10 19:44 UTC (permalink / raw)
  To: git@vger.kernel.org

Thank you Junio

On Thu, 10 Aug 2017, Junio C Hamano wrote:
> There is only one getdelim() call, which was introduced in v2.5.0
> timeframe, and it is used like this:

> 	r = getdelim(&sb->buf, &sb->alloc, term, fp);

> 	if (r > 0) {
> 		sb->len = r;
> 		return 0;
> 	}
> 	assert(r == -1);

> 	/*
> 	 * Normally we would have called xrealloc, which will try to free
> 	 * memory and recover. But we have no way to tell getdelim() to do so.
> 	 * Worse, we cannot try to recover ENOMEM ourselves, because we have
> 	 * no idea how many bytes were read by getdelim.
> 	 *
> 	 * Dying here is reasonable. It mirrors what xrealloc would do on
> 	 * catastrophic memory failure. We skip the opportunity to free pack
> 	 * memory and retry, but that's unlikely to help for a malloc small
> 	 * enough to hold a single line of input, anyway.
> 	 */
> 	if (errno == ENOMEM)
> 		die("Out of memory, getdelim failed");

> So the function is returning -1 and leaving ENOMEM in errno on
> Yaroslav's system.  

> I wonder if we are truly hitting out of memory, though.  The same
> symptom could bee seen if getdelim() does not touch errno when it
> returns -1, but some other system call earlier set it to ENOMEM,
> for example.

> If the same version of Git is recompiled there without HAVE_GETDELIM
> defined, would it still die with out of memory (presumably inside
> the call to strbuf_grow() in the strbuf_getwholeline() function)?

will check now...  for my own education (rotten by Python) -- how
do you know which syscall set errno to be analyzed at this specific
point?  may be it was already set to ENOMEM before entry to this
function?

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: fatal: Out of memory, getdelim failed under NFS mounts
  2017-08-10 18:56   ` Junio C Hamano
  2017-08-10 19:44     ` Yaroslav Halchenko
@ 2017-08-10 19:58     ` René Scharfe
  2017-08-10 20:05       ` Jeff King
  1 sibling, 1 reply; 21+ messages in thread
From: René Scharfe @ 2017-08-10 19:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Yaroslav Halchenko, git@vger.kernel.org

Am 10.08.2017 um 20:56 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
> 
>> I doubt the type of file system matters.  The questions are: How much
>> main memory do you have, what is git trying to cram into it, is there
>> a way to reduce the memory footprint or do you need to add more RAM?
>>
>>> any recommendations on how to pin point the "offender"? ;)
>> Running "GIT_TRACE=1 git pull --ff-only origin master" would be a
>> good start, I think, to find out which of the different activities
>> that pull is doing causes the out-of-memory error.
>>
>> "free" and "ulimit -a" can help you find out how much memory you can
>> use.
>>
>> Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report?
>> getdelim() is used mostly to read lines from files like these and in
>> the admittedly unlikely case that they are *really* long such an
>> error would be expected.
> 
> There is only one getdelim() call, which was introduced in v2.5.0
> timeframe, and it is used like this:
> 
> 	r = getdelim(&sb->buf, &sb->alloc, term, fp);
> 
> 	if (r > 0) {
> 		sb->len = r;
> 		return 0;
> 	}
> 	assert(r == -1);
> 
> 	/*
> 	 * Normally we would have called xrealloc, which will try to free
> 	 * memory and recover. But we have no way to tell getdelim() to do so.
> 	 * Worse, we cannot try to recover ENOMEM ourselves, because we have
> 	 * no idea how many bytes were read by getdelim.
> 	 *
> 	 * Dying here is reasonable. It mirrors what xrealloc would do on
> 	 * catastrophic memory failure. We skip the opportunity to free pack
> 	 * memory and retry, but that's unlikely to help for a malloc small
> 	 * enough to hold a single line of input, anyway.
> 	 */
> 	if (errno == ENOMEM)
> 		die("Out of memory, getdelim failed");
> 
> So the function is returning -1 and leaving ENOMEM in errno on
> Yaroslav's system.
> 
> I wonder if we are truly hitting out of memory, though.  The same
> symptom could bee seen if getdelim() does not touch errno when it
> returns -1, but some other system call earlier set it to ENOMEM,
> for example.

That can happen when the end of a file is reached; getdelim() returns
-1 and leaves errno unchanged.  So we need to set errno to 0 just
before that call.

Still *some* function must have run into a memory shortage in that
scenario, so adding/assigning more should help, right?

> If the same version of Git is recompiled there without HAVE_GETDELIM
> defined, would it still die with out of memory (presumably inside
> the call to strbuf_grow() in the strbuf_getwholeline() function)?

It's worth a try, if possible.

René

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: fatal: Out of memory, getdelim failed under NFS mounts
  2017-08-10 19:58     ` René Scharfe
@ 2017-08-10 20:05       ` Jeff King
  2017-08-10 20:29         ` Yaroslav Halchenko
  2017-08-10 20:56         ` [PATCH] strbuf: clear errno before calling getdelim(3) René Scharfe
  0 siblings, 2 replies; 21+ messages in thread
From: Jeff King @ 2017-08-10 20:05 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, Yaroslav Halchenko, git@vger.kernel.org

On Thu, Aug 10, 2017 at 09:58:37PM +0200, René Scharfe wrote:

> > So the function is returning -1 and leaving ENOMEM in errno on
> > Yaroslav's system.
> > 
> > I wonder if we are truly hitting out of memory, though.  The same
> > symptom could bee seen if getdelim() does not touch errno when it
> > returns -1, but some other system call earlier set it to ENOMEM,
> > for example.
> 
> That can happen when the end of a file is reached; getdelim() returns
> -1 and leaves errno unchanged.  So we need to set errno to 0 just
> before that call.

Good catch. That's a bug in my original conversoin of
strbuf_getwholeline().

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: fatal: Out of memory, getdelim failed under NFS mounts
  2017-08-10 20:05       ` Jeff King
@ 2017-08-10 20:29         ` Yaroslav Halchenko
  2017-08-10 20:56         ` [PATCH] strbuf: clear errno before calling getdelim(3) René Scharfe
  1 sibling, 0 replies; 21+ messages in thread
From: Yaroslav Halchenko @ 2017-08-10 20:29 UTC (permalink / raw)
  To: git@vger.kernel.org


On Thu, 10 Aug 2017, Jeff King wrote:

> On Thu, Aug 10, 2017 at 09:58:37PM +0200, René Scharfe wrote:

> > > So the function is returning -1 and leaving ENOMEM in errno on
> > > Yaroslav's system.

> > > I wonder if we are truly hitting out of memory, though.  The same
> > > symptom could bee seen if getdelim() does not touch errno when it
> > > returns -1, but some other system call earlier set it to ENOMEM,
> > > for example.

> > That can happen when the end of a file is reached; getdelim() returns
> > -1 and leaves errno unchanged.  So we need to set errno to 0 just
> > before that call.

> Good catch. That's a bug in my original conversoin of
> strbuf_getwholeline().

I think this did it!  at least on this simple test... yet to test a bit
more in those scenarios we ran into it before while testing git-annex.

commit 36ef5e3ad2c187d3be664c33dbc8c06e59bceaf4 (HEAD -> bf-seterrno0)
Author: Yaroslav O. Halchenko <yhalchen@smaug.cns.dartmouth.edu>
Date:   Thu Aug 10 20:26:47 2017 +0000

    BF: set errno to 0 before getdelim call to get unbiased assesment later

diff --git a/strbuf.c b/strbuf.c
index 89d22e3b0..323c49ceb 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
        /* Translate slopbuf to NULL, as we cannot call realloc on it */
        if (!sb->alloc)
                sb->buf = NULL;
+       errno = 0;
        r = getdelim(&sb->buf, &sb->alloc, term, fp);
 
        if (r > 0) {

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-10 20:05       ` Jeff King
  2017-08-10 20:29         ` Yaroslav Halchenko
@ 2017-08-10 20:56         ` René Scharfe
  2017-08-10 21:02           ` Jeff King
                             ` (2 more replies)
  1 sibling, 3 replies; 21+ messages in thread
From: René Scharfe @ 2017-08-10 20:56 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Jeff King, Junio C Hamano, Yaroslav Halchenko

getdelim(3) returns -1 at the end of the file and if it encounters an
error, but sets errno only in the latter case.  Set errno to zero before
calling it to avoid misdiagnosing an out-of-memory condition due to a
left-over value from some other function call.

Reported-by: Yaroslav Halchenko <yoh@onerussian.com>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
Do we need to save and restore the original value of errno?  I doubt it,
but didn't think deeply about it, yet.

 strbuf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/strbuf.c b/strbuf.c
index 89d22e3b09..323c49ceb3 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 	/* Translate slopbuf to NULL, as we cannot call realloc on it */
 	if (!sb->alloc)
 		sb->buf = NULL;
+	errno = 0;
 	r = getdelim(&sb->buf, &sb->alloc, term, fp);
 
 	if (r > 0) {
-- 
2.14.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-10 20:56         ` [PATCH] strbuf: clear errno before calling getdelim(3) René Scharfe
@ 2017-08-10 21:02           ` Jeff King
  2017-08-10 21:35             ` Yaroslav Halchenko
  2017-08-10 21:41           ` Junio C Hamano
  2017-08-11  7:50           ` Simon Ruderich
  2 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2017-08-10 21:02 UTC (permalink / raw)
  To: René Scharfe; +Cc: git@vger.kernel.org, Junio C Hamano, Yaroslav Halchenko

On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:

> getdelim(3) returns -1 at the end of the file and if it encounters an
> error, but sets errno only in the latter case.  Set errno to zero before
> calling it to avoid misdiagnosing an out-of-memory condition due to a
> left-over value from some other function call.

Looks good to me.

> Do we need to save and restore the original value of errno?  I doubt it,
> but didn't think deeply about it, yet.

I'd say no. Anybody depending on strbuf_getwholeline() is clearly
already wrong in the error case. And in general I think we assume that
syscalls can clear errno on success if they choose to (this isn't a
syscall, but obviously it is calling some).

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-10 21:02           ` Jeff King
@ 2017-08-10 21:35             ` Yaroslav Halchenko
  0 siblings, 0 replies; 21+ messages in thread
From: Yaroslav Halchenko @ 2017-08-10 21:35 UTC (permalink / raw)
  To: git@vger.kernel.org


On Thu, 10 Aug 2017, Jeff King wrote:

> On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:

> > getdelim(3) returns -1 at the end of the file and if it encounters an
> > error, but sets errno only in the latter case.  Set errno to zero before
> > calling it to avoid misdiagnosing an out-of-memory condition due to a
> > left-over value from some other function call.

> Looks good to me.

> > Do we need to save and restore the original value of errno?  I doubt it,
> > but didn't think deeply about it, yet.

> I'd say no. Anybody depending on strbuf_getwholeline() is clearly
> already wrong in the error case. And in general I think we assume that
> syscalls can clear errno on success if they choose to (this isn't a
> syscall, but obviously it is calling some).

Shouldn't ideally errno being reset to 0 upon check of the syscall
successful run right after that syscall?  (I also see some spots within
git code where it sets errno to ENOMEM)


-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-10 20:56         ` [PATCH] strbuf: clear errno before calling getdelim(3) René Scharfe
  2017-08-10 21:02           ` Jeff King
@ 2017-08-10 21:41           ` Junio C Hamano
  2017-08-11  7:50           ` Simon Ruderich
  2 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2017-08-10 21:41 UTC (permalink / raw)
  To: René Scharfe; +Cc: git@vger.kernel.org, Jeff King, Yaroslav Halchenko

René Scharfe <l.s.r@web.de> writes:

> getdelim(3) returns -1 at the end of the file and if it encounters an
> error, but sets errno only in the latter case.  Set errno to zero before
> calling it to avoid misdiagnosing an out-of-memory condition due to a
> left-over value from some other function call.
>
> Reported-by: Yaroslav Halchenko <yoh@onerussian.com>
> Suggested-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Rene Scharfe <l.s.r@web.de>

Heh.  I mumble something vague then people more capable than me jump
in to take it to the conclusion, and still I get the credit.  I wish
all the debugging sessions were this easy ;-)

> ---
> Do we need to save and restore the original value of errno?  I doubt it,
> but didn't think deeply about it, yet.

We probably don't need to---a caller who knows it got an error
before calling this function and wants to use errno after doing so
should be stashing it away; after all, this function will clobber
errno when any of the library calls it makes fails and this is on
the I/O codepath, so anything can go wrong.

Thanks.

>
>  strbuf.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/strbuf.c b/strbuf.c
> index 89d22e3b09..323c49ceb3 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
>  	/* Translate slopbuf to NULL, as we cannot call realloc on it */
>  	if (!sb->alloc)
>  		sb->buf = NULL;
> +	errno = 0;
>  	r = getdelim(&sb->buf, &sb->alloc, term, fp);
>  
>  	if (r > 0) {

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-10 20:56         ` [PATCH] strbuf: clear errno before calling getdelim(3) René Scharfe
  2017-08-10 21:02           ` Jeff King
  2017-08-10 21:41           ` Junio C Hamano
@ 2017-08-11  7:50           ` Simon Ruderich
  2017-08-11  8:52             ` René Scharfe
  2 siblings, 1 reply; 21+ messages in thread
From: Simon Ruderich @ 2017-08-11  7:50 UTC (permalink / raw)
  To: René Scharfe
  Cc: git@vger.kernel.org, Jeff King, Junio C Hamano,
	Yaroslav Halchenko

On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:
> getdelim(3) returns -1 at the end of the file and if it encounters an
> error, but sets errno only in the latter case.  Set errno to zero before
> calling it to avoid misdiagnosing an out-of-memory condition due to a
> left-over value from some other function call.

getdelim(3p) doesn't explicitly forbid changing the errno on EOF:

    If no characters were read, and the end-of-file indicator for
    the stream is set, or if the stream is at end-of-file, the
    end-of-file indicator for the stream shall be set and the
    function shall return −1. If an error occurs, the error
    indicator for the stream shall be set, and the function shall
    return −1 and set errno to indicate the error.

So a valid implementation could still set errno on EOF and also
on another error (where it's required to set errno).

I don't think that it matters in practice, but the "most" correct
way to handle this would be to check if feof(3) is true to check
for the non-errno case.

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-11  7:50           ` Simon Ruderich
@ 2017-08-11  8:52             ` René Scharfe
  2017-08-12 10:02               ` Simon Ruderich
  0 siblings, 1 reply; 21+ messages in thread
From: René Scharfe @ 2017-08-11  8:52 UTC (permalink / raw)
  To: Simon Ruderich
  Cc: git@vger.kernel.org, Jeff King, Junio C Hamano,
	Yaroslav Halchenko

Am 11.08.2017 um 09:50 schrieb Simon Ruderich:
> On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:
>> getdelim(3) returns -1 at the end of the file and if it encounters an
>> error, but sets errno only in the latter case.  Set errno to zero before
>> calling it to avoid misdiagnosing an out-of-memory condition due to a
>> left-over value from some other function call.
> 
> getdelim(3p) doesn't explicitly forbid changing the errno on EOF:
> 
>      If no characters were read, and the end-of-file indicator for
>      the stream is set, or if the stream is at end-of-file, the
>      end-of-file indicator for the stream shall be set and the
>      function shall return −1. If an error occurs, the error
>      indicator for the stream shall be set, and the function shall
>      return −1 and set errno to indicate the error.
> 
> So a valid implementation could still set errno on EOF and also
> on another error (where it's required to set errno).

True, especially the part that other errors are possible.  But we can't
rely on errno being set on EOF because leaving it unchanged is allowed
as well in that
 case.

> I don't think that it matters in practice, but the "most" correct
> way to handle this would be to check if feof(3) is true to check
> for the non-errno case.

Only if errors at EOF are guaranteed to be impossible.  Imagine
getdelim being able to read to the end and then failing to get memory
for the final NUL.  Other scenarios may be possible.

René

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-11  8:52             ` René Scharfe
@ 2017-08-12 10:02               ` Simon Ruderich
  2017-08-12 11:57                 ` René Scharfe
  0 siblings, 1 reply; 21+ messages in thread
From: Simon Ruderich @ 2017-08-12 10:02 UTC (permalink / raw)
  To: René Scharfe
  Cc: git@vger.kernel.org, Jeff King, Junio C Hamano,
	Yaroslav Halchenko

On Fri, Aug 11, 2017 at 10:52:51AM +0200, René Scharfe wrote:
> Am 11.08.2017 um 09:50 schrieb Simon Ruderich:
>> On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:
>>> getdelim(3) returns -1 at the end of the file and if it encounters an
>>> error, but sets errno only in the latter case.  Set errno to zero before
>>> calling it to avoid misdiagnosing an out-of-memory condition due to a
>>> left-over value from some other function call.
>>
>> getdelim(3p) doesn't explicitly forbid changing the errno on EOF:
>>
>>      If no characters were read, and the end-of-file indicator for
>>      the stream is set, or if the stream is at end-of-file, the
>>      end-of-file indicator for the stream shall be set and the
>>      function shall return −1. If an error occurs, the error
>>      indicator for the stream shall be set, and the function shall
>>      return −1 and set errno to indicate the error.
>
> [snip]
>
>> I don't think that it matters in practice, but the "most" correct
>> way to handle this would be to check if feof(3) is true to check
>> for the non-errno case.
>
> Only if errors at EOF are guaranteed to be impossible.  Imagine
> getdelim being able to read to the end and then failing to get memory
> for the final NUL.  Other scenarios may be possible.

Good point. Instead of feof(3), checking ferror(3) should handle
that properly. It's guaranteed to be set by getdelim for any
error.

For example like this (as replacement for the original patch):

diff --git i/strbuf.c w/strbuf.c
index 89d22e3b0..831be21ce 100644
--- i/strbuf.c
+++ w/strbuf.c
@@ -495,7 +495,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 	 * memory and retry, but that's unlikely to help for a malloc small
 	 * enough to hold a single line of input, anyway.
 	 */
-	if (errno == ENOMEM)
+	if (ferror(fp) && errno == ENOMEM)
 		die("Out of memory, getdelim failed");
 
 	/*

An edge case is if the error indicator was already set before
calling getdelim() and the errno was already set to ENOMEM. This
could be fixed by checking ferror() before calling getdelim, but
I'm not sure if that's necessary:

diff --git i/strbuf.c w/strbuf.c
index 89d22e3b0..4d394bb91 100644
--- i/strbuf.c
+++ w/strbuf.c
@@ -468,7 +468,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 {
 	ssize_t r;
 
-	if (feof(fp))
+	if (feof(fp) || ferror(fp))
 		return EOF;
 
 	strbuf_reset(sb);

Regards
Simon
-- 
+ Privatsphäre ist notwendig
+ Ich verwende GnuPG http://gnupg.org
+ Öffentlicher Schlüssel: 0x92FEFDB7E44C32F9

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-12 10:02               ` Simon Ruderich
@ 2017-08-12 11:57                 ` René Scharfe
  2017-08-12 12:21                   ` René Scharfe
  2017-08-13  4:32                   ` Jeff King
  0 siblings, 2 replies; 21+ messages in thread
From: René Scharfe @ 2017-08-12 11:57 UTC (permalink / raw)
  To: Simon Ruderich
  Cc: git@vger.kernel.org, Jeff King, Junio C Hamano,
	Yaroslav Halchenko

Am 12.08.2017 um 12:02 schrieb Simon Ruderich:
> On Fri, Aug 11, 2017 at 10:52:51AM +0200, René Scharfe wrote:
>> Am 11.08.2017 um 09:50 schrieb Simon Ruderich:
>>> On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote:
>>>> getdelim(3) returns -1 at the end of the file and if it encounters an
>>>> error, but sets errno only in the latter case.  Set errno to zero before
>>>> calling it to avoid misdiagnosing an out-of-memory condition due to a
>>>> left-over value from some other function call.
>>>
>>> getdelim(3p) doesn't explicitly forbid changing the errno on EOF:
>>>
>>>       If no characters were read, and the end-of-file indicator for
>>>       the stream is set, or if the stream is at end-of-file, the
>>>       end-of-file indicator for the stream shall be set and the
>>>       function shall return −1. If an error occurs, the error
>>>       indicator for the stream shall be set, and the function shall
>>>       return −1 and set errno to indicate the error.
>>
>> [snip]
>>
>>> I don't think that it matters in practice, but the "most" correct
>>> way to handle this would be to check if feof(3) is true to check
>>> for the non-errno case.
>>
>> Only if errors at EOF are guaranteed to be impossible.  Imagine
>> getdelim being able to read to the end and then failing to get memory
>> for the final NUL.  Other scenarios may be possible.
> 
> Good point. Instead of feof(3), checking ferror(3) should handle
> that properly. It's guaranteed to be set by getdelim for any
> error.

Only if we have a POSIXly correct implementation of getdelim(3). On
Linux its manpage doesn't mention the error indicator, but says that
the function started out as a GNU extension before becoming
standardized, so I'm not sure we can depend on that everywhere.

But we probably want to check for other errors.  They look unlikely
enough that we may get away with something like this:

	-	if (errno == ENOMEM)
	-		die("Out of memory, getdelim failed");
	+	if (errno || ferror(fp))
	+		die_errno(_("getdelim failed"));

NB: The other errors are EINVAL (input pointers are NULL or the
stream is invalid) and EOVERFLOW (read more than fits into ssize_t)
according to POSIX and the Linux manpage.

Update: Just noticed that on the BSDs getdelim(3) doesn't set errno
to ENOMEM on allocation failure, but does set the error indicator.
That might be an oversight on their part, but that means we
certainly *need* to check ferror().  And an unchanged errno can
mean ENOMEM there.  Ugh..

> An edge case is if the error indicator was already set before
> calling getdelim() and the errno was already set to ENOMEM. This
> could be fixed by checking ferror() before calling getdelim, but
> I'm not sure if that's necessary:

Treating an error as end-of-file is not a good idea.  An invalid
stream should not be passed to strbuf_getwholeline() in the first
place.  We could call die() or BUG(), but I think we may leave
that check to getdelim(3), unless we learn about an implementation
that is unprepared to reject streams with the error indicator set.

René

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-12 11:57                 ` René Scharfe
@ 2017-08-12 12:21                   ` René Scharfe
  2017-08-13  4:32                   ` Jeff King
  1 sibling, 0 replies; 21+ messages in thread
From: René Scharfe @ 2017-08-12 12:21 UTC (permalink / raw)
  To: Simon Ruderich
  Cc: git@vger.kernel.org, Jeff King, Junio C Hamano,
	Yaroslav Halchenko

Am 12.08.2017 um 13:57 schrieb René Scharfe:
> Update: Just noticed that on the BSDs getdelim(3) doesn't set errno
> to ENOMEM on allocation failure, but does set the error indicator.
> That might be an oversight on their part, but that means we
> certainly *need* to check ferror().  And an unchanged errno can
> mean ENOMEM there.  Ugh..

I take that back.  The memory allocation function called by getdelim
will of course set errno on failure.  I got fooled by the manpages,
which don't mention ENOMEM, e.g.: https://man.openbsd.org/getdelim.

René

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-12 11:57                 ` René Scharfe
  2017-08-12 12:21                   ` René Scharfe
@ 2017-08-13  4:32                   ` Jeff King
  2017-08-13  5:51                     ` René Scharfe
  1 sibling, 1 reply; 21+ messages in thread
From: Jeff King @ 2017-08-13  4:32 UTC (permalink / raw)
  To: René Scharfe
  Cc: Simon Ruderich, git@vger.kernel.org, Junio C Hamano,
	Yaroslav Halchenko

On Sat, Aug 12, 2017 at 01:57:06PM +0200, René Scharfe wrote:

> But we probably want to check for other errors.  They look unlikely
> enough that we may get away with something like this:
> 
> 	-	if (errno == ENOMEM)
> 	-		die("Out of memory, getdelim failed");
> 	+	if (errno || ferror(fp))
> 	+		die_errno(_("getdelim failed"));
> 
> NB: The other errors are EINVAL (input pointers are NULL or the
> stream is invalid) and EOVERFLOW (read more than fits into ssize_t)
> according to POSIX and the Linux manpage.

Can't we also get any of the errors that fgetc() would return. I.e., any
normal read errors? We should return EOF on those, not die (and the
caller can check ferror()).

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-13  4:32                   ` Jeff King
@ 2017-08-13  5:51                     ` René Scharfe
  2017-08-13  5:58                       ` Jeff King
  0 siblings, 1 reply; 21+ messages in thread
From: René Scharfe @ 2017-08-13  5:51 UTC (permalink / raw)
  To: Jeff King
  Cc: Simon Ruderich, git@vger.kernel.org, Junio C Hamano,
	Yaroslav Halchenko

Am 13.08.2017 um 06:32 schrieb Jeff King:
> On Sat, Aug 12, 2017 at 01:57:06PM +0200, René Scharfe wrote:
> 
>> But we probably want to check for other errors.  They look unlikely
>> enough that we may get away with something like this:
>>
>> 	-	if (errno == ENOMEM)
>> 	-		die("Out of memory, getdelim failed");
>> 	+	if (errno || ferror(fp))
>> 	+		die_errno(_("getdelim failed"));
>>
>> NB: The other errors are EINVAL (input pointers are NULL or the
>> stream is invalid) and EOVERFLOW (read more than fits into ssize_t)
>> according to POSIX and the Linux manpage.
> 
> Can't we also get any of the errors that fgetc() would return. I.e., any
> normal read errors? We should return EOF on those, not die (and the
> caller can check ferror()).

Yes, we can get those as well, and leaving error checking to the caller
is a flexible way to handle them.  

Many of the existing callers don't seem to be bother, though.

René

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] strbuf: clear errno before calling getdelim(3)
  2017-08-13  5:51                     ` René Scharfe
@ 2017-08-13  5:58                       ` Jeff King
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2017-08-13  5:58 UTC (permalink / raw)
  To: René Scharfe
  Cc: Simon Ruderich, git@vger.kernel.org, Junio C Hamano,
	Yaroslav Halchenko

On Sun, Aug 13, 2017 at 07:51:34AM +0200, René Scharfe wrote:

> Am 13.08.2017 um 06:32 schrieb Jeff King:
> > On Sat, Aug 12, 2017 at 01:57:06PM +0200, René Scharfe wrote:
> > 
> >> But we probably want to check for other errors.  They look unlikely
> >> enough that we may get away with something like this:
> >>
> >> 	-	if (errno == ENOMEM)
> >> 	-		die("Out of memory, getdelim failed");
> >> 	+	if (errno || ferror(fp))
> >> 	+		die_errno(_("getdelim failed"));
> >>
> >> NB: The other errors are EINVAL (input pointers are NULL or the
> >> stream is invalid) and EOVERFLOW (read more than fits into ssize_t)
> >> according to POSIX and the Linux manpage.
> > 
> > Can't we also get any of the errors that fgetc() would return. I.e., any
> > normal read errors? We should return EOF on those, not die (and the
> > caller can check ferror()).
> 
> Yes, we can get those as well, and leaving error checking to the caller
> is a flexible way to handle them.  
> 
> Many of the existing callers don't seem to be bother, though.

Yes, but keep in mind that we need to match the non-getdelim()
implementation here (which itself tries to behave like more or less like
a dynamic fgets, including the eof/error handling). Because the callers
don't know which one they're getting.

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2017-08-13  5:58 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-09 17:39 fatal: Out of memory, getdelim failed under NFS mounts Yaroslav Halchenko
2017-08-10 13:27 ` René Scharfe
2017-08-10 14:43   ` Yaroslav Halchenko
2017-08-10 19:44     ` René Scharfe
2017-08-10 18:56   ` Junio C Hamano
2017-08-10 19:44     ` Yaroslav Halchenko
2017-08-10 19:58     ` René Scharfe
2017-08-10 20:05       ` Jeff King
2017-08-10 20:29         ` Yaroslav Halchenko
2017-08-10 20:56         ` [PATCH] strbuf: clear errno before calling getdelim(3) René Scharfe
2017-08-10 21:02           ` Jeff King
2017-08-10 21:35             ` Yaroslav Halchenko
2017-08-10 21:41           ` Junio C Hamano
2017-08-11  7:50           ` Simon Ruderich
2017-08-11  8:52             ` René Scharfe
2017-08-12 10:02               ` Simon Ruderich
2017-08-12 11:57                 ` René Scharfe
2017-08-12 12:21                   ` René Scharfe
2017-08-13  4:32                   ` Jeff King
2017-08-13  5:51                     ` René Scharfe
2017-08-13  5:58                       ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).