[GE users] Re: [GE users] Stuck in pending

markdenni at comcast.net markdenni at comcast.net
Tue Dec 6 22:56:59 GMT 2005



-------------- Original message -------------- 

> Hi, 
> 
> Am 06.12.2005 um 22:44 schrieb markdenni at comcast.net: 
> 
> > The version of SGE is: 6.0u6 
> > The output of qstat -j when sge_execd is not running on rs101 
> > mentions that 
> > queue instance rs101... is temporarily not available. But when I 
> > start it again, 
> > queue instance rs101 is not even mentioned. This is a mystery to me. 
> > 
> > In Host Configuration -> execution host I have set Access to test 
> > (of which I am the only member), XAccess to NONE, PROJECTS to NONE, 
> > and XProjects to all the projects the other users use so their jobs 
> > won't run on this server. Perhaps I am not doing this correctly. 
> > 
> > When rs101 is not running the output of qstat is: 
> > % qstat -j 129791 
> > job_number: 129791 
> > exec_file: job_scripts/129791 
> > submission_time: Thu Dec 1 19:21:33 2005 
> > owner: markd 
> > uid: 1605 
> > group: iosd 
> > gid: 1000 
> > account: sge 
> > mail_options: n 
> > mail_list: markd at rs100.sjs.agilent.com 
> > notify: FALSE 
> > job_name: newtest 
> > priority: 1 
> > jobshare: 0 
> > script_file: newtest 
> > verify_suitable_queues: 3 
> > version: 7 
> > scheduling info: queue instance 
> > "all.q at rsl019.sjs.agilent.com" dropped because it is temporarily 
> > not available 
> > queue instance 
> > "all.q at rs101.sjs.agilent.com" dropped because it is temporarily not 
> > available 
> > queue instance 
> > "all.q at rsl020.sjs.agilent.com" dropped because it is full 
> > queue instance 
> > "all.q at rsl022.sjs.agilent.com" dropped because it is full 
> > has no permission for host "global" 
> > 
> > When sgeexecd is running on rs101, then it isn't even mentioned in 
> > qstat -j: 
> > % qstat -j 129791 
> > job_number: 129791 
> > exec_file: job_scripts/129791 
> > submission_time: Thu Dec 1 19:21:33 2005 
> > owner: markd 
> > uid: 1605 
> > group: iosd 
> > gid: 1000 
> > account: sge 
> > mail_options: n 
> > mail_list: markd at rs100.sjs.agilent.com 
> > notify: FALSE 
> > job_name: newtest 
> > priority: 1 
> > jobshare: 0 
> > script_file: newtest 
> > verify_suitable_queues: 3 
> > version: 7 
> > scheduling info: queue instance 
> > "all.q at rsl019.sjs.agilent.com" dropped because it is temporarily 
> > not available 
> > queue instance 
> > "all.q at rsl020.sjs.agilent.com" dropped because it is full 
> > queue instance 
> > "all.q at rsl022.sjs.agilent.com" dropped because it is full 
> > has no permission for host "global" 
> > 
> > Thanks in advance for any suggestions. 
> 
> in the first case: if I understand you correctly, this is the 
> output without sge_execd running, so the queue instance is reported 
> as temporarily not available. 
> 
> in the second case: the scheduling info comes in two parts. One 
> is the overall scheduler information about certain queue instances on 
> the hosts, the other is specific to the job in question (and is shown 
> only for pending jobs). As the job is already running, you only get 
> the first part. And maybe there are still slots left on rs101 and 
> everything is in order. To me it looks fine. 
> 
> Cheers - Reuti 
> 
> > 
> > - Mark 
> 
> 
Reuti -
The problem is that the job is not running.  It is still pending.
- Mark
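
To make the difference between the two qstat -j outputs quoted above explicit, here is a minimal Python sketch. It is not part of SGE, and the names dropped_instances and no_longer_dropped are made up for illustration. It assumes the "dropped because it is ..." wording shown in the quoted scheduling info and simply lists the queue instances that stop being mentioned between two runs.

```python
import re

# Matches one scheduler message of the form seen in the quoted output:
#   queue instance "<name>" dropped because it is <reason>
# The reason ends where the next message starts (assumed wording).
_DROPPED = re.compile(
    r'queue instance "([^"]+)" dropped because it is '
    r'(.+?)(?= queue instance | has no permission |$)'
)


def dropped_instances(qstat_j_text):
    """Return {queue_instance: reason} parsed from `qstat -j <job>` text."""
    flat = " ".join(qstat_j_text.split())  # undo the line wrapping
    return {name: reason.strip() for name, reason in _DROPPED.findall(flat)}


def no_longer_dropped(before_text, after_text):
    """Queue instances mentioned as dropped in `before` but not in `after`."""
    before = dropped_instances(before_text)
    after = dropped_instances(after_text)
    return sorted(set(before) - set(after))


# Scheduling info copied from the two runs above (archive spelling " at ").
EXECD_DOWN = '''
scheduling info: queue instance
"all.q at rsl019.sjs.agilent.com" dropped because it is temporarily
not available
queue instance
"all.q at rs101.sjs.agilent.com" dropped because it is temporarily not
available
queue instance
"all.q at rsl020.sjs.agilent.com" dropped because it is full
queue instance
"all.q at rsl022.sjs.agilent.com" dropped because it is full
has no permission for host "global"
'''

EXECD_UP = '''
scheduling info: queue instance
"all.q at rsl019.sjs.agilent.com" dropped because it is temporarily
not available
queue instance
"all.q at rsl020.sjs.agilent.com" dropped because it is full
queue instance
"all.q at rsl022.sjs.agilent.com" dropped because it is full
has no permission for host "global"
'''

if __name__ == "__main__":
    print(no_longer_dropped(EXECD_DOWN, EXECD_UP))
```

With the two outputs from this thread, the only instance that disappears is the one on rs101. That leaves the 'has no permission for host "global"' line as the one message that is not tied to a specific queue instance.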


