[GE users] Parallel jobs remain in r state after finishing

Bart Willems b-willems at northwestern.edu
Tue Nov 25 21:23:38 GMT 2008


Hi All,

I have set up tight integration between mpich2 and sge 6.2 following  
Reuti's howto:

http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html

Everything worked fine during my testing period, when a few nodes were
dedicated exclusively to the parallel job queue. I have now reopened
these nodes to other queues as well, and since then parallel jobs are
no longer removed from the queue when they finish: they remain in the
r state.

My PE is set up as follows:

# qconf -sp mpich2_smpd
pe_name            mpich2_smpd
slots              9999
user_lists         parallelusers
xuser_lists        NONE
start_proc_args    /opt/gridengine/mpich2_smpd/startmpich2.sh -catch_rsh \
                    $pe_hostfile /share/apps/mpich2
stop_proc_args     /opt/gridengine/mpich2_smpd/stopmpich2.sh -catch_rsh \
                    /share/apps/mpich2
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
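
Since the change was to open the nodes to other queues, it may also be useful to know which queues currently offer this PE. A rough sketch of how that could be listed follows. The helper name queue_has_pe is made up for illustration, and the script assumes that qconf is on the PATH and that `qconf -sq` prints a pe_list attribute whose value may continue onto the next line after a trailing backslash.

```shell
#!/bin/sh
# Sketch only: print the cluster queues whose pe_list contains a given PE.
# queue_has_pe is a hypothetical helper, not part of Grid Engine.

# Reads `qconf -sq <queue>` output on stdin and succeeds (exit 0) if the
# pe_list attribute contains the PE name passed as $1.
queue_has_pe() {
    awk -v pe="$1" '
        $1 == "pe_list" { in_list = 1; $1 = "" }
        in_list {
            # A trailing backslash means the list continues on the next line.
            cont = ($NF == "\\")
            # Turn separators (commas, brackets, "=", backslash) into spaces.
            gsub(/[^A-Za-z0-9_.@-]/, " ")
            n = split($0, w, " ")
            for (i = 1; i <= n; i++) if (w[i] == pe) found = 1
            if (!cont) in_list = 0
        }
        END { exit (found ? 0 : 1) }
    '
}

# Query the cluster only when a PE name is given and qconf is available.
if [ -n "$1" ] && command -v qconf >/dev/null 2>&1; then
    for q in $(qconf -sql); do
        qconf -sq "$q" | queue_has_pe "$1" && echo "$q"
    done
fi
```

Saved under any name, it would be run with the PE name as its only argument, for example mpich2_smpd.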

Any suggestions would be most appreciated.

Thanks,
Bart

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=89829