[GE users] active_jobs directory.

Iwona Sakrejda isakrejda at lbl.gov
Tue Mar 14 03:51:28 GMT 2006


No luck, still the same error.
What I wonder about is why it is not giving the full path.
It says only:
"cant open file active_jobs/5.1/error" and so on.

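One possible reading of the relative path, offered as an assumption and not confirmed anywhere in this thread: a daemon that has changed its working directory to its per-host spool directory would report file names relative to that directory. Under that assumption, the full path can be reconstructed by joining the spool root, the host name, and the relative path from the message. In the sketch below, "/path/to/spool" is a hypothetical placeholder for the configured execd_spool_dir and full_spool_path is an illustrative helper; only pc2203 and active_jobs/5.1/error come from the log.

```python
from pathlib import PurePosixPath


def full_spool_path(execd_spool_dir: str, host: str, relative: str) -> str:
    """Join the spool root, the per-host directory and the relative path
    taken from the log message (assumes execd logs paths relative to
    <execd_spool_dir>/<host>)."""
    return str(PurePosixPath(execd_spool_dir) / host / relative)


# "/path/to/spool" is a hypothetical placeholder, not a value from the thread.
print(full_spool_path("/path/to/spool", "pc2203", "active_jobs/5.1/error"))
```

Checking whether the directory printed this way exists, and is writable by the execd user, on the compute node would be one way to test the assumption.
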
Ron Chen wrote:

> You need to make sure that no jobs are running on the host,
> shut down execd on that host, and then change execd_spool_dir.
> 
> You can see the comments in:
> http://gridengine.sunsource.net/issues/show_bug.cgi?id=103
> 
>  -Ron
> 
> 
> --- Iwona Sakrejda <isakrejda at lbl.gov> wrote:
> 
>>Yes, I did, and I cleared the error.
>>I restarted sgeexecd on the compute node and on the submission
>>node.
>>Does it need to run on the master node? (I don't think it ran
>>before.)
>>
>>I see the execution host creates directories in the default/spool
>>area, but the job fails and there is a message in
>>default/spool/<host>/messages:
>>
>>03/13/2006 18:57:17|execd|pc2203|E|shepherd of job 5.1 exited with exit status = 7
>>03/13/2006 18:57:17|execd|pc2203|W|reaping job "5" ptf complains: Job does not exist
>>03/13/2006 18:57:17|execd|pc2203|E|abnormal termination of shepherd for job 5.1: no "exit_status" file
>>03/13/2006 18:57:17|execd|pc2203|E|cant open file active_jobs/5.1/error: No such file or directory
>>03/13/2006 18:57:17|execd|pc2203|E|can't open pid file "active_jobs/5.1/pid" for job 5.1
>>03/13/2006 18:57:17|execd|pc2203|I|sending admin mail mail to user "sgeadm at nersc.gov"|mailer "/common/sge/util/pdsf_mail"|"SGE 6.0u4: Job 5 failed"
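The messages entries quoted above appear to be pipe-delimited: timestamp, daemon, host, a one-letter level, then the message text. That layout is inferred from these lines only, and reading E/W/I as error/warning/info is an assumption. A minimal Python sketch for pulling out the error entries for one job, with the mail-wrapped lines joined back together first (LogEntry, parse_line and errors_for_job are illustrative names, not part of Grid Engine):

```python
from typing import Iterable, List, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    daemon: str
    host: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one messages line; return None if it lacks the five fields."""
    # Split on the first four "|" only, since the message may contain "|".
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    return LogEntry(*parts)


def errors_for_job(lines: Iterable[str], job_id: str) -> List[LogEntry]:
    """Return level-E entries whose message mentions job_id."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e and e.level == "E" and job_id in e.message]


# Sample lines copied from the log above, unwrapped.
sample = [
    '03/13/2006 18:57:17|execd|pc2203|E|shepherd of job 5.1 exited with exit status = 7',
    '03/13/2006 18:57:17|execd|pc2203|W|reaping job "5" ptf complains: Job does not exist',
    '03/13/2006 18:57:17|execd|pc2203|I|sending admin mail mail to user "sgeadm at nersc.gov"'
    '|mailer "/common/sge/util/pdsf_mail"|"SGE 6.0u4: Job 5 failed"',
]

for entry in errors_for_job(sample, "5.1"):
    print(entry.timestamp, entry.message)
```

Limiting the split to four separators keeps the mailer entry intact even though its message contains further "|" characters.
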
>>
>>
>>Ron Chen wrote:
>>
>>
>>>Did you change var "execd_spool_dir" with cmd "qconf -mconf"?
>>>Then use cmd "qmod -cq <queue>" to clear the error.
>>>
>>> -Ron
>>>
>>>
>>>--- Iwona Sakrejda <isakrejda at lbl.gov> wrote:
>>>
>>>
>>>>Hi,
>>>>
>>>>My job execution is failing because the execution host
>>>>cannot find the job it's supposed to run and the
>>>><exec_host>/active_jobs directory is empty in the spool
>>>>area.
>>>>
>>>>Jobs show up in the queue, but the queue on the execution
>>>>host goes into an error state.
>>>>
>>>>What could have gotten misconfigured?
>>>>
>>>>Suggestions appreciated
>>>>
>>>>Iwona

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net




More information about the gridengine-users mailing list