[GE users] qrsh segfault with vmemoryuse limit

ohi ohi at hgc.jp
Fri Mar 13 06:32:39 GMT 2009


Hi,

I have set a soft virtual memory limit (s_vmem) on my queue.
> [ohi at gw16 ~]$ qconf -sq all.q |grep s_vmem
> s_vmem                34359738368

When I run qrsh, I get the following error.
> [ohi at gw16 ~]$ qrsh pwd
> Segmentation fault

When I removed the vmemoryuse limit with the shell's unlimit command,
the segfault did not occur.
> [ohi at gw16 ~]$ unlimit vmemoryuse
> [ohi at gw16 ~]$ qrsh pwd
> /home/ohi
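
To check which limit a process actually inherits, a small sketch like the following could be run in the login shell or under qrsh. It assumes Linux, where tcsh's vmemoryuse is taken to correspond to RLIMIT_AS; the helper names describe and address_space_limit are illustrative only and are not part of Grid Engine.

```python
import resource


def describe(value):
    """Render an rlimit value; RLIM_INFINITY means no limit is set."""
    return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"


def address_space_limit():
    """Return the (soft, hard) address-space limits of this process as text.

    Assumption: RLIMIT_AS is the limit that tcsh reports as vmemoryuse.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return describe(soft), describe(hard)


if __name__ == "__main__":
    soft, hard = address_space_limit()
    print(f"soft={soft} hard={hard}")
```

Running it before and after `unlimit vmemoryuse` should show whether the soft limit seen by child processes changes.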

I ran strace on qrsh with the vmemoryuse limit in place
and got the following output.
> mmap(NULL, 34359742464, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) = -1 ENOMEM (Cannot allocate memory)
> mmap(NULL, 34359742464, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> +++ killed by SIGSEGV +++
qrsh called mmap and tried to allocate
34,359,742,464 bytes.
This is 4,096 bytes more than my vmemoryuse limit of
34,359,738,368 bytes.

Why does qrsh try to allocate so much memory?
Is this a bug?

I found this problem when I submitted
an MPI job. The MPI job crashed because of the qrsh
segfault.

I also tried a different s_vmem value.
> [ohi at gw16 ~]$ qconf -sq all.q |grep s_vmem
> s_vmem                68719476736

With this setting, qrsh tried to allocate the amount shown below
and segfaulted again.
> mmap(NULL, 68719480832, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) = -1 ENOMEM (Cannot allocate memory)
> mmap(NULL, 68719480832, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> +++ killed by SIGSEGV +++
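
The relationship between the two s_vmem settings and the two mmap lengths can be checked with a little arithmetic. The sketch below uses only the numbers quoted in this message; the names observations and overshoot are illustrative only.

```python
# (s_vmem from `qconf -sq all.q`, mmap length from strace), both in bytes
observations = [
    (34359738368, 34359742464),
    (68719476736, 68719480832),
]


def overshoot(limit, requested):
    """Bytes by which the requested mapping exceeds the configured limit."""
    return requested - limit


for limit, requested in observations:
    extra = overshoot(limit, requested)
    print(f"limit={limit} ({limit // 2**30} GiB) "
          f"requested={requested} extra={extra} bytes")
```

In both cases the requested size is the configured limit plus 4096 bytes, which suggests that the allocation size is derived from the limit itself and is not a fixed amount.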

My server environment is as follows.
> [ohi at gw16 ~]$ uname -a
> Linux gw16 2.6.18-92.1.13.el5 #1 SMP Wed Sep 24 19:32:05 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
> [ohi at gw16 lx24-amd64]$ ./sge_qmaster -help
> SGE 6.2u2
> usage: sge_qmaster [options]
>    [-help]                                  print this help
I use tcsh.

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=129443
