[GE users] "Output file too large" errors back?

Reuti reuti at staff.uni-marburg.de
Thu May 8 21:27:41 BST 2008


On 08.05.2008 at 22:15, Bevan C. Bennett wrote:

> Any news on this? Am I the only one seeing this issue?
> It affects both my regular and 64-bit systems.

Could this be the cause? Is the fileserver also running CentOS 4?

http://gridengine.info/articles/2008/04/21/rhel5-2-centos5-kernel-update-may-cause-problems
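
Independent of the kernel question, the size shown in the quoted re-enactment below (2307973120 bytes) is past the largest value a signed 32-bit off_t can hold. That is the usual condition under which stat() fails with EOVERFLOW, the errno behind "Value too large for defined data type". A minimal sketch of that arithmetic in Python; the helper name overflows_32bit_off_t is only illustrative and is not part of Grid Engine:

```python
import errno
import os

# Largest file size a signed 32-bit off_t can represent (2 GiB - 1 byte).
MAX_OFF32 = 2**31 - 1


def overflows_32bit_off_t(size_bytes: int) -> bool:
    """Return True if a 32-bit stat() could not report a file of this size."""
    return size_bytes > MAX_OFF32


# Size taken from the `ls -al` output in the quoted re-enactment.
reported_size = 2307973120

print(overflows_32bit_off_t(reported_size))  # True
print(reported_size - MAX_OFF32)             # 160489473 bytes past the limit
# Message the local C library associates with EOVERFLOW.
print(os.strerror(errno.EOVERFLOW))
```

If the helper returns True for the output file, a binary or file service that still goes through a 32-bit stat() path would be expected to fail in this way, whatever the root cause turns out to be.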

-- Reuti


> [bevan@osmium ~]> qrsh -o /home/user/bevan/test-output
> error: 1: can't stat() "/home/user/bevan/test-output" as stdout_path: Value too large for defined data type KRB5CCNAME=none uid=0 gid=0 1004 1008 1009 1030 5126 9072 9076 9085
>
> [bevan@osmium ~]> which qrsh
> /usr/local/grid-6.0/bin/lx24-amd64/qrsh
> [bevan@osmium ~]> /usr/local/grid-6.0/bin/lx24-amd64/qrsh -help
> GE 6.1u2
>
> [bevan@alexander ~]$ qrsh -o /home/user/bevan/test-output
> error: 1: can't stat() "/home/user/bevan/test-output" as stdout_path: Value too large for defined data type KRB5CCNAME=none uid=0 gid=0 1004 1008 1009 1030 5126 9072 9076 9085
>
> [bevan@alexander ~]$ which qrsh
> /usr/local/grid-6.0/bin/lx24-x86/qrsh
>
>
> Bevan C. Bennett wrote:
>> The servers are x86 Linux systems running CentOS 4.
>> The desktops are running Fedora Core 6.
>> The output file was indeed larger than 2GB.
>>
>> Here's an easy re-enactment:
>>
>> [bevan@alexander ~]$ ls -al test*
>> -rw------- 1 bevan bevan 2307973120 Sep 14 16:16 test-output
>> [bevan@alexander ~]$ ls -lh test-output
>> -rw------- 1 bevan bevan 2.2G Sep 14 16:16 test-output
>> [bevan@alexander ~]$ qrsh -o /home/user/bevan/test-output
>> error: 1: can't stat() "/home/user/bevan/test-output" as stdout_path: Value too large for defined data type KRB5CCNAME=none uid=0 gid=0 1004 1008 1009 1030 5126 9072 9076 9085
>> [bevan@alexander ~]$ which qrsh
>> /usr/local/grid/bin/lx24-x86/qrsh
>> [bevan@alexander ~]$ qrsh -help
>> GE 6.1u2
>> ...
>>
>> Joachim Gabler wrote:
>>> Hi Bevan,
>>>
>>> On which OS are you experiencing this problem?
>>> I briefly checked the code; it doesn't differ between V60s2_BRANCH
>>> (6.0u??) and V61_BRANCH (6.1u?).
>>> It uses the SGE_STAT macro, which resolves to stat64 on Solaris,
>>> Linux, and Irix.
>>>
>>>   Joachim
>>>
>>> Bevan C. Bennett wrote:
>>>> I've started seeing errors that appear to be a re-emergence of this
>>>> very old bug in SGE 6.1u2. Is anyone else experiencing anything similar?
>>>>
>>>> Original bug from 2005:
>>>> http://gridengine.sunsource.net/issues/show_bug.cgi?id=1628
>>>>
>>>> An error from this afternoon:
>>>> Shepherd trace:
>>>> 09/10/2007 15:27:25 [5143:23799]: shepherd called with uid = 0, euid = 5143
>>>> 09/10/2007 15:27:25 [5143:23799]: starting up 6.1u2
>>>> 09/10/2007 15:27:25 [5143:23799]: setpgid(23799, 23799) returned 0
>>>> 09/10/2007 15:27:25 [5143:23799]: no prolog script to start
>>>> 09/10/2007 15:27:25 [5143:23800]: pid=23800 pgrp=23800 sid=23800 old pgrp=23799 getlogin()=<no login set>
>>>> 09/10/2007 15:27:25 [5143:23800]: reading passwd information for user 'xsu'
>>>> 09/10/2007 15:27:25 [5143:23800]: setosjobid: uid = 0, euid = 5143
>>>> 09/10/2007 15:27:25 [5143:23799]: forked "job" with pid 23800
>>>> 09/10/2007 15:27:25 [5143:23800]: setting limits
>>>> 09/10/2007 15:27:25 [5143:23800]: RLIMIT_CPU setting: (soft 4294967295 hard 4294967295) resulting: (soft 4294967295 hard 4294967295)
>>>> 09/10/2007 15:27:25 [5143:23800]: RLIMIT_FSIZE setting: (soft 4294967295 hard 4294967295) resulting: (soft 4294967295 hard 4294967295)
>>>> 09/10/2007 15:27:25 [5143:23800]: RLIMIT_DATA setting: (soft 4294967295 hard 4294967295) resulting: (soft 4294967295 hard 4294967295)
>>>> 09/10/2007 15:27:25 [5143:23800]: RLIMIT_STACK setting: (soft 4294967295 hard 4294967295) resulting: (soft 4294967295 hard 4294967295)
>>>> 09/10/2007 15:27:25 [5143:23800]: RLIMIT_CORE setting: (soft 4294967295 hard 4294967295) resulting: (soft 4294967295 hard 4294967295)
>>>> 09/10/2007 15:27:25 [5143:23800]: RLIMIT_VMEM/RLIMIT_AS setting: (soft 4294967295 hard 4294967295) resulting: (soft 4294967295 hard 4294967295)
>>>> 09/10/2007 15:27:25 [5143:23800]: RLIMIT_RSS setting: (soft 4294967295 hard 4294967295) resulting: (soft 4294967295 hard 4294967295)
>>>> 09/10/2007 15:27:25 [5143:23800]: setting environment
>>>> 09/10/2007 15:27:25 [5143:23799]: child: job - pid: 23800
>>>> 09/10/2007 15:27:25 [5143:23800]: Initializing error file
>>>> 09/10/2007 15:27:25 [5143:23800]: switching to intermediate/target user
>>>> 09/10/2007 15:27:25 [9153:23800]: closing all filedescriptors
>>>> 09/10/2007 15:27:25 [9153:23800]: further messages are in "error" and "trace"
>>>> 09/10/2007 15:27:25 [9153:23800]: can't stat() "/home/user/xsu/sim.output" as stdout_path: Value too large for defined data type KRB5CCNAME=none uid=9153 gid=9153 1004 1030 9153 20086
>>>> 09/10/2007 15:27:25 [5143:23799]: wait3 returned 23800 (status: 6656; WIFSIGNALED: 0,  WIFEXITED: 1, WEXITSTATUS: 26)
>>>> 09/10/2007 15:27:25 [5143:23799]: job exited with exit status 26
>>>> 09/10/2007 15:27:25 [5143:23799]: reaped "job" with pid 23800
>>>> 09/10/2007 15:27:25 [5143:23799]: job exited not due to signal
>>>> 09/10/2007 15:27:25 [5143:23799]: job exited with status 26
>>>> 09/10/2007 15:27:25 [5143:23799]: now sending signal KILL to pid -23800
>>>> 09/10/2007 15:27:25 [5143:23799]: no tasker to notify
>>>> 09/10/2007 15:27:25 [5143:23799]: failed starting job
>>>> 09/10/2007 15:27:25 [5143:23799]: no epilog script to start
>>>>
>>>> Shepherd error:
>>>> 09/10/2007 15:27:25 [9153:23800]: can't stat() "/home/user/xsu/sim.output" as stdout_path: Value too large for defined data type KRB5CCNAME=none uid=9153 gid=9153 1004 1030 9153 20086
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net



