[GE users] Reserving resources for the duration of checkpointing so that no other job can use them

sgerns rajansrivastava83 at gmail.com
Tue Aug 17 15:57:07 BST 2010



Hi all,

I have a scenario that I will try to explain in steps.

1. I have some jobs running in the cluster, scheduled according to the priority of the queues (a few queues have higher priority than the others, and so on).

2. I also have a VIP.queue, which has the highest priority of all the queues. So of course jobs submitted through the VIP queue move up in the pending list, as shown below.

100   0.50500   job_name1   usr1   r    normal.q@exehost   32
101   0.50500   job_name2   usr1   r    normal.q@exehost   16
102   0.50500   job_name2   usr1   r    normal.q@exehost   32
103   0.50500   job_name3   usr2   r    normal.q@exehost   128
104   0.50973   job_name4   usr3   qw   VIP.q@exehost      64
105   0.50973   job_name5   usr3   qw   normal.q@exehost   32
106   0.51514   job_name6   usr4   qw   normal.q@exehost   16


3. These VIP jobs are very important and I want them to run ASAP, so I will checkpoint the lower-priority jobs that are currently running.

4. Suppose I select job IDs 100 and 102 for checkpointing and send the checkpoint signal to them, and it takes 2 hours to checkpoint these jobs.
   I do not want any other job to run on these resources during those 2 hours,
   because there is a possibility that small jobs could get the resources and start running.



Kindly help me: how can I reserve the resources for the duration of the checkpointing so that no unimportant job can start?






