[GE users] Putting a job on another server

Reuti reuti at staff.uni-marburg.de
Wed Sep 26 22:35:20 BST 2007



On 26.09.2007 at 22:37, Alexandre Racine wrote:

> But does "reschedule" mean "resume on another server" or "stop
> and restart from the beginning on another server"?

It will start from the beginning, unless the application is
sophisticated enough to detect already computed results in the files
it created (as long as they are in a shared directory).
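
As an illustration of such a restart-aware application, here is a minimal
sketch of a job script that records its progress in a state file and skips
the finished part when it is started again. The function name "run_job",
the state file layout and the "processing item" placeholder are all
hypothetical; a real application would put its own computation there and
keep the state file in a shared directory.

```shell
#!/bin/sh
# Restartable job sketch: process work items 1..TOTAL and record the last
# finished item in a state file, so a rescheduled run can skip ahead.

# Usage: run_job STATE_FILE TOTAL
run_job() {
    state_file=$1
    total=$2

    # Resume after the last completed item if a state file already exists.
    last_done=0
    if [ -f "$state_file" ]; then
        last_done=$(cat "$state_file")
    fi

    i=$((last_done + 1))
    while [ "$i" -le "$total" ]; do
        # Placeholder for the real computation on item $i.
        echo "processing item $i"

        # Write to a temporary file first, then rename it, so an
        # interrupted run never leaves a half-written state file behind.
        echo "$i" > "$state_file.tmp"
        mv "$state_file.tmp" "$state_file"

        i=$((i + 1))
    done
}
```

A job script would then end with a call such as
run_job /shared/myjob.state 100 (the path is only an example). After a
"qmod -rj" the script starts from the top again, but the loop continues
with the first item that was not yet recorded as done.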

Another option is to use a checkpointing interface, where you can
set up checkpointing, e.g. at timed intervals. SGE supports
checkpointing provided this feature is already present in the
application; it doesn't offer checkpointing on its own. To see
whether it's suitable for your application, check the man pages of
"checkpoint" and "sge_ckpt", as well as:

http://gridengine.sunsource.net/howto/checkpointing.html

You could then set up this checkpointing environment to migrate the
job when it gets suspended (either automatically or with a
"qmod -sj").
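
A rough sketch of what such a checkpointing environment might look like
follows. The name "migrate_on_suspend" and the ckpt_dir are placeholders,
and the field names are taken from the checkpoint(5) man page as far as I
recall them, so please verify them against your SGE version:

```text
ckpt_name          migrate_on_suspend
interface          application-level
ckpt_command       none
migr_command       none
restart_command    none
clean_command      none
ckpt_dir           /tmp
signal             none
when               x
```

According to checkpoint(5), the "x" in "when" requests that the job is
checkpointed, aborted and, if possible, migrated as soon as it gets
suspended. Such a definition could be added from a file with
"qconf -Ackpt <file>" and requested at submission time with
"qsub -ckpt migrate_on_suspend".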

-- Reuti


>
>
> Alexandre Racine
> Projets spéciaux
> 514-461-1300 poste 3304
> alexandre.racine at mhicc.org
>
>
>
> -----Original Message-----
> From: Beadles, Jeff [mailto:jeff_beadles at mentor.com]
> Sent: Wed 2007-09-26 16:09
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] Putting a job on another server
>
>
> $ qmod -rj <job_number>
>
>      -rj  If applied  to  running  jobs,  reschedules  the  jobs.
>           Requires root or manager privileges.
>
> 	-Jeff
>
> -- 
> Jeff Beadles, Mentor Graphics Corporation - jeff_beadles at mentor.com
>
>
> -----Original Message-----
> From: Alexandre Racine [mailto:Alexandre.Racine at mhicc.org]
> Sent: Wednesday, September 26, 2007 1:03 PM
> To: users at gridengine.sunsource.net
> Subject: [GE users] Putting a job on another server
>
> Is there a way to take a running job and move it to another
> server?
>
>
> Thanks.
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net