[GE users] Handling DRMAA sessions within a Java web-service

mdondrup michael.dondrup at uni.no
Tue Aug 31 19:37:47 BST 2010


Hi Daniel and all,

Thank you very much. Configuring OGE to make it account-aware would be a nice next step. For now, though,
I have stumbled into a more basic problem: I seem to be unable to retrieve the status of a job after it has finished.
I get an InvalidJobException when sharing a static session between threads. I did not call session.exit() in between,
because I don't think I can in a multithreaded application. I wrote a small program (included below) that illustrates
what I am trying to do. It shows the same problem as my Axis2 service.

The output is as follows:
$ java -cp /usr/share/java/drmaa.jar:. drmaatest5
main: Your job has been submitted with id 171
status: 16
Thread is waiting for the job to finish: Thread[Thread-0,5,main]
status: 16
status: 32
status: 32
got info for 171
Job 171 finished regularly with exit status 0
org.ggf.drmaa.InvalidJobException: The job specified by the 'jobid' does not exist.
	at com.sun.grid.drmaa.SessionImpl.nativeGetJobProgramStatus(Native Method)
	at com.sun.grid.drmaa.SessionImpl.getJobProgramStatus(SessionImpl.java:213)
	at drmaatest5.main(drmaatest5.java:38)
status: 0
^C 

Is there a way to do this right? Please help.

Michael

===========================drmaatest5.java==============================

import org.ggf.drmaa.*;

public class drmaatest5  {

    private static Session session = null;
    private static String contact = null;

    public static void main(String[] args) {		
	JobTemplate jt = null;
	String jobid = null;	
	try {
	    SessionFactory factory = SessionFactory.getFactory();
	    session = factory.getSession();
	    session.init("");
	    contact = session.getContact();
	    jt = session.createJobTemplate();
		
	    jt.setRemoteCommand("/home/michi/sleep.sh");
	    jobid = session.runJob(jt);
	    System.out.println("main: Your job has been submitted with id " + jobid);
	    session.deleteJobTemplate(jt);
	   
	} catch (Exception e) {
	    e.printStackTrace();
	}
	DrmaaJob dj = new DrmaaJob(session);
	if (jobid != null) {
	    dj.setJobid(jobid);
	    dj.setContact(contact);
	    Thread t = new Thread(dj);
	    t.start();
	}
	// watch the job
	while (true) {
	    int status = 0;
	    try {
		status = session.getJobProgramStatus(jobid);	   
	    } catch (DrmaaException e) {
		e.printStackTrace();
	    }
	    System.out.println("status: "+ status);
	    try {
		Thread.sleep(10000);
	    } catch (InterruptedException e1) {}
	}

    }
    
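    // Note: main() never creates a drmaatest5 instance, so this finalizer is never run.
    public void finalize() {
	try {
	    System.out.println("finalize()");
	    session.exit();   // session is a static field, so no "this." is needed
	} catch (Exception e) {
	    e.printStackTrace();
	}
    }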


}

class DrmaaJob implements Runnable {

    private Session session = null;
    private String contact = null;
    private volatile String jobid = null;

    DrmaaJob(Session session) {
	this.session = session;
    }

    public void setContact(String contact) {
	this.contact = contact;
    }

    public synchronized void setJobid(String jobid){
	this.jobid = jobid;
    }

    public synchronized String getJobid() {
	return this.jobid;
    }
    
    public  void run() {
		
	try {
	  
	    System.out.println("Thread is waiting for the job to finish: " + Thread
			       .currentThread().toString());
	    
	    JobInfo info = session.wait(getJobid(),
					Session.TIMEOUT_WAIT_FOREVER);
	    System.out.println("got info for " + info.getJobId());
	
	    if (info.wasAborted()) {
		System.out.println("Job " + info.getJobId() + " never ran ");
	    } else if (info.hasExited()) {
		System.out.println("Job " + info.getJobId()
				   + " finished regularly with exit status "
				   + info.getExitStatus());
	    } else if (info.hasSignaled()) {
		System.out.println("Job " + info.getJobId() + " finished due to signal "
				   + info.getTerminatingSignal());
		
	    } else {
		System.out.println("Job " + info.getJobId()
				   + " finished with unclear conditions");
		
		System.out.println("program status: "
				   + session.getJobProgramStatus(jobid));
		
	    }	 
	} catch (DrmaaException e) {
	    System.out.println("DRMAA exception occurred: " + e);
	}
    }
}
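
Editorial sketch (not part of the original message): the output above suggests that once session.wait() has returned for job 171, the session no longer knows the job, so a later getJobProgramStatus() call on the same id fails with InvalidJobException. One possible workaround, assuming that reading is correct, is to let the waiting thread record the final state and have the polling thread check that record before asking the session. The class below is a minimal, DRMAA-free sketch of that idea. FinalStatusCache, StatusLookup, markFinished and status are hypothetical names, and the lookup callback only stands in for session.getJobProgramStatus(jobid).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Remembers the last known state of jobs that the waiting thread has already collected. */
final class FinalStatusCache {

    /** Hypothetical stand-in for a call such as session.getJobProgramStatus(jobId). */
    interface StatusLookup {
        int lookup(String jobId) throws Exception;
    }

    // Thread-safe map: written by the waiting thread, read by the polling thread.
    private final Map<String, Integer> finalStatus = new ConcurrentHashMap<>();

    /** Called by the waiting thread once its wait call has returned for this job. */
    void markFinished(String jobId, int status) {
        finalStatus.put(jobId, status);
    }

    /** Called by the polling thread; prefers the recorded final state over a live query. */
    int status(String jobId, StatusLookup live) {
        Integer done = finalStatus.get(jobId);
        if (done != null) {
            return done;
        }
        try {
            return live.lookup(jobId);
        } catch (Exception e) {
            // The job may have finished between the first check and the live query,
            // so look at the cache once more before giving up.
            Integer late = finalStatus.get(jobId);
            if (late != null) {
                return late;
            }
            throw new IllegalStateException("No status available for job " + jobId, e);
        }
    }
}
```

In drmaatest5, DrmaaJob.run() could call markFinished(jobid, Session.DONE) or markFinished(jobid, Session.FAILED) after wait() returns, and the main loop could call status(jobid, id -> session.getJobProgramStatus(id)). The second cache check narrows, but does not fully close, the window between wait() returning and markFinished() being called.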



On Aug 31, 2010, at 4:06 PM, Daniel Templeton wrote:

> If a thread tries to use a session that has exited, it will get a NoActiveSessionException, so as long as it handles the exception correctly, you're fine.
> 
> Now, the problem you do have is that jobs are submitted as the process doing the submission.  In the case of a portal, it means that all jobs submitted by the users end up running as the id that is running the portal.  If that's a problem for you, I can explain how to work around it through clever OGE configuration.
> 
> Daniel
> 
> On 08/31/10 06:53 AM, mdondrup wrote:
>> Hi,
>> 
>> First, I hope that this is the right place to ask this question. I would like to know whether this is the right
>> way to do it. It seems to work, but there might be hidden problems, or there may be a more standard way
>> of doing this.
>> 
>> I am using the DRMAA Java API inside an Axis2 SOAP web service to submit jobs to a
>> GE cluster. I understood from the documentation that there can be at most one active DRMAA session
>> per process. I have two classes:
>> - the implementation class of the service has a method 'start' which can be called by the client from the outside
>> - a class 'DrmaaScheduler' which makes the actual calls to the grid-engine via DRMAA.
>> 
>> The following code works so far, but could there be a problem in theory if a thread tries to reuse a session
>> that has already exited but has not yet been set to null?
>> 
>> Thank you very much for any comments and suggestions
>> 
>> Michael
>> 
>> 
>> In class DrmaaScheduler I created a static variable for the session, like this:
>> 
>> public class DrmaaScheduler implements Scheduler {
>> 	// store session and contact
>> 	private static Session session = null;
>> 	private static String contact = null;
>> 
>> then in the constructor:
>> 
>> public DrmaaScheduler() throws SchedulerException {
>>     if (session == null) {
>>         try {
>>             SessionFactory factory = SessionFactory.getFactory();
>>             session = factory.getSession();
>>             if (contact != null)
>>                 session.init(contact);
>>             else
>>                 session.init("");
>>             contact = session.getContact();
>>             log.debug("DRMAA session {} initialized", contact);
>>         } catch (Exception e) {
>>             log.error("something went wrong initializing DRMAA:", e);
>>         }
>> [....]
>> 
>> I only call session.exit() in the finalize method, like this:
>> 
>> public void finalize() {
>>     try {
>>         super.finalize();
>>         session.exit();
>>         // ** this could be a problem, because it's not atomic, right? **
>>         session = null;
>>     } catch (Throwable e) {
>>         log.error("Error during finalization: ", e);
>>     }
>> }
>> 
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=278486
>> 
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
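
Editorial sketch (not part of the original thread): the quoted question asks whether calling session.exit() and then setting the static field to null is a problem because the two steps are not atomic. One way to make the open and close steps mutually exclusive is to guard both with the same lock. The holder below is a minimal sketch under that assumption. SharedSessionHolder and its opener and closer callbacks are hypothetical names; in the service, the opener would wrap SessionFactory.getFactory().getSession() plus init(""), and the closer would wrap session.exit(), each converting checked DRMAA exceptions as needed.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Lazily opens one shared session and closes it at most once per open. */
final class SharedSessionHolder<S> {

    private final Supplier<S> opener;  // creates and initializes a session
    private final Consumer<S> closer;  // shuts a session down
    private S session;                 // guarded by "this"

    SharedSessionHolder(Supplier<S> opener, Consumer<S> closer) {
        this.opener = opener;
        this.closer = closer;
    }

    /** Returns the shared session, opening it on first use. */
    synchronized S get() {
        if (session == null) {
            session = opener.get();
        }
        return session;
    }

    /** Closes the shared session if one is open; later get() calls open a new one. */
    synchronized void close() {
        if (session != null) {
            S toClose = session;
            session = null;          // clear the field under the same lock as the close
            closer.accept(toClose);
        }
    }
}
```

A thread that obtained the session from get() before another thread called close() can still try to use it afterwards, so callers would still need to handle the exception Daniel mentions for exited sessions.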

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=278553
