[GE users] "The Scheduler dies" COMPLETE information

Stephan Grell - Sun Germany - SSG - Software Engineer stephan.grell at sun.com
Mon May 23 16:13:42 BST 2005


Yes, you are right. Removing the files and dirs will remove the job.

Stephan

Viktor Oudovenko wrote:

>Thank you very much, Stephan. Most probably 1416 is my problem; at least
>all the symptoms are the same.
>What a relief! Thanks a lot!
>Just for information, when you have a chance to answer: if I move the
>stuff from the jobs directory to some other place and then restart the
>sgemaster (softstop), will all the running jobs be killed or not? From my
>experiment with one of the jobs, that was exactly the case, i.e. the job
>got killed.
>v
>
>>-----Original Message-----
>>From: Stephan Grell - Sun Germany - SSG - Software Engineer 
>>[mailto:stephan.grell at sun.com] 
>>Sent: Monday, May 23, 2005 4:53
>>To: users at gridengine.sunsource.net
>>Subject: Re: [GE users] "The Scheduler dies" COMPLETE information
>>
>>
>>The workaround is to remove all PE jobs, start the scheduler,
>>and then resubmit the PE jobs...
>>
>>u4 will be available soon. I do not know the date. Sorry.
>>However, you can compile it yourself by checking out the u4 tag.
>>
>>Stephan
>>
>>Viktor Oudovenko wrote:
>>
>>>Hi, Stephan,
>>>
>>>Thank you for the answer.
>>>When will u4 be issued, and where can I read about issue 1416?
>>>
>>>Meanwhile I tried many things, but nothing has helped so far. Why does
>>>my scheduler reregister so often? Because after it dies, I restart it
>>>manually, simply by issuing the command: $SGE_ROOT/bin/lx..../sge_schedd
>>>Then the information about reregistering appears.
>>>
>>>Thank you very much for your help.
>>>v
>>>
>>>>-----Original Message-----
>>>>From: Stephan Grell - Sun Germany - SSG - Software Engineer
>>>>[mailto:stephan.grell at sun.com] 
>>>>Sent: Monday, May 23, 2005 3:45
>>>>To: users at gridengine.sunsource.net
>>>>Subject: Re: [GE users] "The Scheduler dies" COMPLETE information
>>>>
>>>>
>>>>Hi Viktor,
>>>>
>>>>you encounter issue 1416. This is fixed with u4.
>>>>However, the important question is, why your scheduler is
>>>>reregistering so often.
>>>>
>>>>Stephan
>>>>
>>>>Viktor Oudovenko wrote:
>>>>
>>>>>Hi, Stephan and anybody who can help!
>>>>>
>>>>>Could you have a look at the attachment to see what is going on with my
>>>>>scheduler? What I did: I just ran the scheduler daemon in dl 1 mode, as
>>>>>you advised, and waited until it crashed.
>>>>>And it did. It dies even without any events. I mean, you will find two
>>>>>lines in the messages file where the scheduler died without any reason.
>>>>>But the last crash happened because one of the Myrinet jobs finished.
>>>>>Could you give any hint what it could be and what could be done? I am
>>>>>running Linux SuSE 8.2 on the server, and 9.0 and 9.2 on the slaves.
>>>>>I also have a few Opterons (8 machines). I am happy to provide any
>>>>>further information if necessary.
>>>>>Please help.
>>>>>
>>>>>With kind regards,
>>>>>Viktor
>>>>>P.S. In the attachment I put not only the last iteration but a couple
>>>>>of successful ones. Actually, in debug mode the scheduler updates
>>>>>information every 5-10 seconds or so.
>>>>>
>>>>>>-----Original Message-----
>>>>>>From: Stephan Grell - Sun Germany - SSG - Software Engineer 
>>>>>>[mailto:stephan.grell at sun.com]
>>>>>>Sent: Friday, May 20, 2005 3:05
>>>>>>To: users at gridengine.sunsource.net
>>>>>>Subject: Re: [GE users] Scheduler dies like a hell
>>>>>>
>>>>>>
>>>>>>Hi,
>>>>>>
>>>>>>I am not sure that a corrupted file is the problem. The qmaster
>>>>>>does some validation during startup. Could you run the scheduler
>>>>>>in debug mode and post the output just before it dies?
>>>>>>
>>>>>>You can set the debug mode with:
>>>>>>
>>>>>>source $SGE_ROOT/<CELL>/common/settings.csh
>>>>>>source $SGE_ROOT/util/dl.csh
>>>>>>dl 1
>>>>>>
>>>>>>bin/<arch>/sge_schedd
>>>>>>
>>>>>>Or, do you have a stack trace of the scheduler?
>>>>>>
>>>>>>Which version are you running, on which arch?
>>>>>>
>>>>>>Thanks,
>>>>>>Stephan
>>>>>>
>>>>>>Viktor Oudovenko wrote:
>>>>>>
>>>>>>>Ron,
>>>>>>>
>>>>>>>Can I try to cat part of the accounting file? I mean, can I EDIT it
>>>>>>>MANUALLY, despite the warning not to do it?
>>>>>>>Best regards,
>>>>>>>v
>>>>>>>
>>>>>>>>-----Original Message-----
>>>>>>>>From: Ron Chen [mailto:ron_chen_123 at yahoo.com]
>>>>>>>>Sent: Thursday, May 19, 2005 22:02
>>>>>>>>To: users at gridengine.sunsource.net
>>>>>>>>Subject: RE: [GE users] Scheduler dies like a hell
>>>>>>>>
>>>>>>>>
>>>>>>>>It is not easy to find out which file gets corrupted
>>>>>>>>:(
>>>>>>>>
>>>>>>>>One thing you can try is to move spooled job files (in
>>>>>>>>default/spool/qmaster/jobs) to a backup directory.
>>>>>>>>Also, you can use qconf to dump the configuration for
>>>>>>>>the queues/users/hosts, and see if the values "make sense".
>>>>>>>>
>>>>>>>>Of course the best way to fix this is to restore from backup!
>>>>>>>>
>>>>>>>>-Ron
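Ron's suggestion above (park the spooled job files rather than delete them) can be sketched as follows. The `default/spool/qmaster/jobs` layout is the default for a classic-spooling cell; the `mktemp` tree here is only a stand-in so the sketch is self-contained and does not touch a real installation:

```shell
# Stand-in spool tree; on a live cell this would be
# $SGE_ROOT/default/spool/qmaster (stop the qmaster first).
SPOOL=$(mktemp -d)/default/spool/qmaster
mkdir -p "$SPOOL/jobs"
touch "$SPOOL/jobs/00021171"            # fake spooled job file for the demo

mv "$SPOOL/jobs" "$SPOOL/jobs.backup"   # park the job files, don't delete them
mkdir "$SPOOL/jobs"                     # qmaster expects the directory to exist

ls "$SPOOL/jobs.backup"                 # -> 00021171
```

Restoring is the reverse move; as Ron says, a real backup is still the safest fix.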
>>>>>>>>
>>>>>>>>
>>>>>>>>--- Viktor Oudovenko <udo at physics.rutgers.edu> wrote:
>>>>>>>>>Hi, Ron,
>>>>>>>>>
>>>>>>>>>I am using classic spooling.
>>>>>>>>>Which file should I look for corruption? Can I edit
>>>>>>>>>it manually?
>>>>>>>>>Thank you very much in advance.
>>>>>>>>>v
>>>>>>>>>
>>>>>>>>>>-----Original Message-----
>>>>>>>>>>From: Ron Chen [mailto:ron_chen_123 at yahoo.com]
>>>>>>>>>>Sent: Thursday, May 19, 2005 20:38
>>>>>>>>>>To: users at gridengine.sunsource.net
>>>>>>>>>>Subject: RE: [GE users] Scheduler dies like a hell
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Are you using classic spooling or Berkeley DB
>>>>>>>>>>spooling?
>>>>>>>>>>
>>>>>>>>>>With classic spooling, when the machine crashes, the
>>>>>>>>>>files may get corrupted. And when qmaster reads in the
>>>>>>>>>>corrupted files, it may also corrupt the qmaster's
>>>>>>>>>>data structures.
>>>>>>>>>>IIRC, Berkeley DB handles recovery itself, but I have
>>>>>>>>>>never played with it myself :)
>>>>>>>>>>
>>>>>>>>>>-Ron
>>>>>>>>>>
>>>>>>>>>>--- Viktor Oudovenko <udo at physics.rutgers.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>>Hi, Mac,
>>>>>>>>>>>Thank you very much for your advice!
>>>>>>>>>>>I'll try. I think one of the running or finished jobs
>>>>>>>>>>>left a bad record somewhere
>>>>>>>>>>>(like the jobs directory).
>>>>>>>>>>>Best regards,
>>>>>>>>>>>v
>>>>>>>>>>>
>>>>>>>>>>>>-----Original Message-----
>>>>>>>>>>>>From: McCalla, Mac [mailto:macmccalla at hess.com]
>>>>>>>>>>>>Sent: Thursday, May 19, 2005 15:12
>>>>>>>>>>>>To: users at gridengine.sunsource.net
>>>>>>>>>>>>Subject: RE: [GE users] Scheduler dies like a hell
>>>>>>>>>>>>
>>>>>>>>>>>>Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>Some things to look at: any messages in
>>>>>>>>>>>>$SGE_ROOT/......../qmaster/schedd/messages ?
>>>>>>>>>>>>To get more info about what the scheduler is doing while it is
>>>>>>>>>>>>running, see the scheduler params "profile" and "monitor";
>>>>>>>>>>>>you can set them equal to 1 to turn on
>>>>>>>>>>>>some scheduler diagnostics; see man sched_conf.
>>>>>>>>>>>>To extend the timeout value for the scheduler, you can set
>>>>>>>>>>>>qmaster_params SCHEDULER_TIMEOUT to some value greater than
>>>>>>>>>>>>600 (seconds).
>>>>>>>>>>>>You can also use the system command strace to get a trace of
>>>>>>>>>>>>scheduler activity while it is running, to perhaps get a
>>>>>>>>>>>>better idea of what it is spending its time doing.
>>>>>>>>>>>>
>>>>>>>>>>>>Hope this helps,
>>>>>>>>>>>>
>>>>>>>>>>>>mac mccalla
>>>>>>>>>>>>
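mac's SCHEDULER_TIMEOUT suggestion amounts to one `qmaster_params` line in the global configuration, which `qconf -mconf` opens in an editor on a live cell. The sketch below edits a stand-in file instead, since `qconf` is only available where SGE is installed; the value 1200 is just an example above the 600-second default seen in the log:

```shell
# Stand-in for the global configuration that `qconf -mconf` would open.
CONF=$(mktemp)
echo 'qmaster_params               none' > "$CONF"

# Raise the scheduler acknowledge timeout beyond the 600 s default that
# the "acknowledge timeout after 600 seconds" log entries report.
sed -i 's/^qmaster_params.*/qmaster_params               SCHEDULER_TIMEOUT=1200/' "$CONF"

cat "$CONF"   # -> qmaster_params               SCHEDULER_TIMEOUT=1200
```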
>>>>>>>>>>>>-----Original Message-----
>>>>>>>>>>>>From: Viktor Oudovenko [mailto:udo at physics.rutgers.edu]
>>>>>>>>>>>>Sent: Thursday, May 19, 2005 12:00 PM
>>>>>>>>>>>>To: users at gridengine.sunsource.net
>>>>>>>>>>>>Subject: [GE users] Scheduler dies like a hell
>>>>>>>>>>>>
>>>>>>>>>>>>Hi, everybody,
>>>>>>>>>>>>
>>>>>>>>>>>>I am asking for your help and ideas on what could be done
>>>>>>>>>>>>to restore normal operation of the scheduler. First, what
>>>>>>>>>>>>happened: a few times during the last week our main server
>>>>>>>>>>>>died, and I needed to reboot it and even replace it. But
>>>>>>>>>>>>jobs which used automount kept running. Since yesterday or
>>>>>>>>>>>>the day before, the scheduler daemon dies. I tried to
>>>>>>>>>>>>restart sge_master, but it did not help. Now when the
>>>>>>>>>>>>daemon dies I start it manually, simply typing:
>>>>>>>>>>>>
>>>>>>>>>>>>/opt/SGE/bin/lx24-x86/sge_schedd
>>>>>>>>>>>>
>>>>>>>>>>>>but after some time it died again. Please advise,
>>>>>>>>>>>>what could it be?
>>>>>>>>>>>>
>>>>>>>>>>>>Below please find some info from the messages file:
>>>>>>>>>>>>
>>>>>>>>>>>>05/19/2005 01:02:37|qmaster|rupc-cs04b|E|no execd known on host sub04n87 to send conf notification
>>>>>>>>>>>>05/19/2005 01:02:37|qmaster|rupc-cs04b|E|no execd known on host sub04n88 to send conf notification
>>>>>>>>>>>>05/19/2005 01:02:37|qmaster|rupc-cs04b|E|no execd known on host sub04n89 to send conf notification
>>>>>>>>>>>>05/19/2005 01:02:37|qmaster|rupc-cs04b|E|no execd known on host sub04n90 to send conf notification
>>>>>>>>>>>>05/19/2005 01:02:37|qmaster|rupc-cs04b|E|no execd known on host sub04n91 to send conf notification
>>>>>>>>>>>>05/19/2005 01:02:37|qmaster|rupc-cs04b|E|no execd known on host rupc04.rutgers.edu to send conf notification
>>>>>>>>>>>>05/19/2005 01:02:37|qmaster|rupc-cs04b|I|starting up 6.0u3
>>>>>>>>>>>>05/19/2005 01:08:11|qmaster|rupc-cs04b|E|commlib error: got read error (closing connection)
>>>>>>>>>>>>05/19/2005 01:11:06|qmaster|rupc-cs04b|E|event client "scheduler" (rupc-cs04b/schedd/1) reregistered - it will need a total update
>>>>>>>>>>>>05/19/2005 01:24:31|qmaster|rupc-cs04b|W|job 21171.1 failed on host sub04n203 assumedly after job because: job 21171.1 died through signal TERM (15)
>>>>>>>>>>>>05/19/2005 05:17:19|qmaster|rupc-cs04b|E|acknowledge timeout after 600 seconds for event client (schedd:1) on host "rupc-cs04b"
>>>>>>>>>>>>05/19/2005 09:29:03|qmaster|rupc-cs04b|W|job 21060.1 failed on host sub04n74 assumedly after job because: job 21060.1 died through signal TERM (15)
>>>>>>>>>>>>05/19/2005 09:30:37|qmaster|rupc-cs04b|E|event client "scheduler" (rupc-cs04b/schedd/1) reregistered - it will need a total update
>>>>>>>>>>>>05/19/2005 11:04:21|qmaster|rupc-cs04b|W|job 20222.1 failed on host sub04n29 assumedly after job because: job 20222.1 died through signal KILL (9)
>>>>>>>>>>>>05/19/2005 11:05:50|qmaster|rupc-cs04b|W|job 21212.1 failed on host sub04n25 assumedly after job because: job 21212.1 died through signal KILL (9)
>>>>>>>>>>>>05/19/2005 12:04:51|qmaster|rupc-cs04b|E|acknowledge timeout after 600 seconds for event client (schedd:1) on host "rupc-cs04b"
>>>>>>>>=== message truncated ===
>>>>>>>>---------------------------------------------------------------------
>>>>>>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>>>>>For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>>>--------------------------------------------------------------------
>>>>>128133  25368 16384     SENDING 22 ORDERS TO QMASTER
>>>>>128134  25368 16384     RESETTING BUSY STATE OF EVENT CLIENT
>>>>>128135  25368 16384     reresolve port timeout in 340
>>>>>128136  25368 16384     returning cached port value: 536
>>>>>--------------STOP-SCHEDULER-RUN-------------
>>>>>128137  25368 16384     ec_get retrieving events - will do max 20 fetches
>>>>>128138  25368 16384     doing sync fetch for messages, 20 still to do
>>>>>128139  25368 16384     try to get request from qmaster, id 1
>>>>>128140  25368 16384     Checking 55 events (44303-44357) while waiting for #44303
>>>>>128141  25368 16384     check complete, 55 events in list
>>>>>128142  25368 16384     got 55 events till 44357
>>>>>128143  25368 16384     doing async fetch for messages, 19 still to do
>>>>>128144  25368 16384     try to get request from qmaster, id 1
>>>>>128145  25368 16384     reresolve port timeout in 320
>>>>>128146  25368 16384     returning cached port value: 536
>>>>>128147  25368 16384     Sent ack for all events lower or equal 44357
>>>>>128148  25368 16384     ec_get - received 55 events
>>>>>128149  25368 16384     44303. EVENT MOD EXECHOST sub04n147
>>>>>128150  25368 16384     44304. EVENT MOD USER udo
>>>>>128151  25368 16384     44305. EVENT MOD USER iber
>>>>>128152  25368 16384     44306. EVENT MOD USER dieguez
>>>>>128153  25368 16384     44307. EVENT MOD USER karenjoh
>>>>>128154  25368 16384     44308. EVENT MOD USER lorenzo
>>>>>128155  25368 16384     44309. EVENT MOD USER parcolle
>>>>>128156  25368 16384     44310. EVENT MOD USER cfennie
>>>>>128157  25368 16384     44311. EVENT MOD USER civelli
>>>>>128158  25368 16384     44312. EVENT MOD EXECHOST sub04n135
>>>>>128159  25368 16384     44313. EVENT MOD EXECHOST sub04n141
>>>>>128160  25368 16384     44314. EVENT MOD EXECHOST sub04n127
>>>>>128161  25368 16384     44315. EVENT MOD EXECHOST sub04n145
>>>>>128162  25368 16384     44316. EVENT MOD EXECHOST sub04n133
>>>>>128163  25368 16384     44317. EVENT MOD EXECHOST sub04n148
>>>>>128164  25368 16384     44318. EVENT MOD EXECHOST sub04n74
>>>>>128165  25368 16384     44319. EVENT JOB 21542.1 task 2.sub04n74 USAGE
>>>>>128166  25368 16384     44320. EVENT JOB 21542.1 task 1.sub04n74 USAGE
>>>>>128167  25368 16384     44321. EVENT MOD EXECHOST rupc03.rutgers.edu
>>>>>128168  25368 16384     44322. EVENT MOD EXECHOST sub04n139
>>>>>128169  25368 16384     44323. EVENT MOD EXECHOST rupc02.rutgers.edu
>>>>>128170  25368 16384     44324. EVENT MOD EXECHOST sub04n80
>>>>>128171  25368 16384     44325. EVENT MOD EXECHOST sub04n207
>>>>>128172  25368 16384     44326. EVENT MOD EXECHOST sub04n180
>>>>>128173  25368 16384     44327. EVENT MOD EXECHOST sub04n23
>>>>>128174  25368 16384     44328. EVENT MOD EXECHOST sub04n30
>>>>>128175  25368 16384     44329. EVENT MOD EXECHOST sub04n203
>>>>>128176  25368 16384     44330. EVENT MOD EXECHOST sub04n109
>>>>>128177  25368 16384     44331. EVENT MOD EXECHOST rupc04.rutgers.edu
>>>>>128178  25368 16384     44332. EVENT MOD EXECHOST sub04n114
>>>>>128179  25368 16384     44333. EVENT MOD EXECHOST sub04n106
>>>>>128180  25368 16384     44334. EVENT MOD EXECHOST sub04n88
>>>>>128181  25368 16384     44335. EVENT JOB 21507.1 task 6.sub04n88 USAGE
>>>>>128182  25368 16384     44336. EVENT JOB 21507.1 task 5.sub04n88 USAGE
>>>>>128183  25368 16384     44337. EVENT MOD EXECHOST sub04n157
>>>>>128184  25368 16384     44338. EVENT MOD EXECHOST sub04n20
>>>>>128185  25368 16384     44339. EVENT MOD EXECHOST sub04n156
>>>>>128186  25368 16384     44340. EVENT MOD EXECHOST sub04n26
>>>>>128187  25368 16384     44341. EVENT JOB 21213.1 USAGE
>>>>>128188  25368 16384     44342. EVENT MOD EXECHOST sub04n05
>>>>>128189  25368 16384     44343. EVENT MOD EXECHOST sub04n103
>>>>>128190  25368 16384     44344. EVENT MOD EXECHOST sub04n164
>>>>>128191  25368 16384     44345. EVENT MOD EXECHOST sub04n09
>>>>>128192  25368 16384     44346. EVENT MOD EXECHOST sub04n105
>>>>>128193  25368 16384     44347. EVENT MOD EXECHOST sub04n113
>>>>>128194  25368 16384     44348. EVENT MOD EXECHOST sub04n28
>>>>>128195  25368 16384     44349. EVENT MOD EXECHOST sub04n76
>>>>>128196  25368 16384     44350. EVENT MOD EXECHOST sub04n162
>>>>>128197  25368 16384     44351. EVENT MOD EXECHOST sub04n108
>>>>>128198  25368 16384     44352. EVENT MOD EXECHOST sub04n38
>>>>>128199  25368 16384     44353. EVENT MOD EXECHOST sub04n04
>>>>>128200  25368 16384     44354. EVENT MOD EXECHOST sub04n116
>>>>>128201  25368 16384     44355. EVENT MOD EXECHOST sub04n179
>>>>>128202  25368 16384     44356. EVENT MOD EXECHOST sub04n160
>>>>>128203  25368 16384     44357. EVENT MOD EXECHOST sub04n107
>>>>>Q:169, AQ:343 J:19(19), H:169(170), C:49, A:4, D:3, P:7, CKPT:0 US:15 PR:4 S:nd:12/lf:7
>>>>>128204  25368 16384     ================[SCHEDULING-EPOCH]==================
>>>>>128205  25368 16384     JOB 20937.1 start_time = 1116447112 running_time 338079 decay_time = 450
>>>>>128206  25368 16384     JOB 20938.1 start_time = 1116374344 running_time 410847 decay_time = 450
>>>>>128207  25368 16384     JOB 21040.1 start_time = 1116443073 running_time 342118 decay_time = 450
>>>>>128208  25368 16384     JOB 21076.1 start_time = 1116451351 running_time 333840 decay_time = 450
>>>>>128209  25368 16384     JOB 21210.1 start_time = 1116514970 running_time 270221 decay_time = 450
>>>>>128210  25368 16384     JOB 21213.1 start_time = 1116515250 running_time 269941 decay_time = 450
>>>>>128211  25368 16384     JOB 21338.1 start_time = 1116543252 running_time 241939 decay_time = 450
>>>>>128212  25368 16384     JOB 21423.1 start_time = 1116629274 running_time 155917 decay_time = 450
>>>>>128213  25368 16384     JOB 21424.1 start_time = 1116631365 running_time 153826 decay_time = 450
>>>>>128214  25368 16384     JOB 21440.1 start_time = 1116632934 running_time 152257 decay_time = 450
>>>>>128215  25368 16384     JOB 21441.1 start_time = 1116632994 running_time 152197 decay_time = 450
>>>>>128216  25368 16384     JOB 21443.1 start_time = 1116633602 running_time 151589 decay_time = 450
>>>>>128217  25368 16384     JOB 21474.1 start_time = 1116655118 running_time 130073 decay_time = 450
>>>>>128218  25368 16384     JOB 21503.1 start_time = 1116707395 running_time 77796 decay_time = 450
>>>>>128219  25368 16384     JOB 21507.1 start_time = 1116714061 running_time 71130 decay_time = 450
>>>>>128220  25368 16384     JOB 21528.1 start_time = 1116707641 running_time 77550 decay_time = 450
>>>>>128221  25368 16384     JOB 21530.1 start_time = 1116714453 running_time 70738 decay_time = 450
>>>>>128222  25368 16384     JOB 21537.1 start_time = 1116724845 running_time 60346 decay_time = 450
>>>>>128223  25368 16384     JOB 21542.1 start_time = 1116782511 running_time 2680 decay_time = 450
>>>>>128224  25368 16384     verified threshold of 169 queues
>>>>>128225  25368 16384     queue myrinet at sub04n61 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128226  25368 16384     queue myrinet at sub04n62 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128227  25368 16384     queue myrinet at sub04n65 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128228  25368 16384     queue myrinet at sub04n66 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128229  25368 16384     queue myrinet at sub04n67 tagged to be overloaded: load_avg=2.020000 (no load adjustment) >= 1.4
>>>>>128230  25368 16384     queue myrinet at sub04n68 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128231  25368 16384     queue myrinet at sub04n69 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128232  25368 16384     queue myrinet at sub04n70 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128233  25368 16384     queue myrinet at sub04n71 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128234  25368 16384     queue myrinet at sub04n72 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128235  25368 16384     queue myrinet at sub04n75 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128236  25368 16384     queue myrinet at sub04n77 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128237  25368 16384     queue myrinet at sub04n78 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128238  25368 16384     queue myrinet at sub04n79 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128239  25368 16384     queue myrinet at sub04n81 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128240  25368 16384     queue myrinet at sub04n84 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128241  25368 16384     queue myrinet at sub04n85 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128242  25368 16384     queue myrinet at sub04n86 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128243  25368 16384     queue myrinet at sub04n87 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128244  25368 16384     queue myrinet at sub04n88 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128245  25368 16384     queue myrinet at sub04n89 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128246  25368 16384     queue myrinet at sub04n90 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128247  25368 16384     queue myrinet at sub04n91 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128248  25368 16384     queue myrinet at sub04n63 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128249  25368 16384     queue myrinet at sub04n64 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128250  25368 16384     queue myrinet at sub04n73 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128251  25368 16384     queue myrinet at sub04n74 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128252  25368 16384     queue opteronp at sub04n202 tagged to be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>>128253  25368 16384     queue opteronp at sub04n205 tagged to be overloaded: load_medium=1.010000 (no load adjustment) >= 1.0
>>>>>128254  25368 16384     queue opteronp at sub04n206 tagged to be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>>128255  25368 16384     queue opteronp at sub04n208 tagged to be overloaded: load_medium=1.010000 (no load adjustment) >= 1.0
>>>>>128256  25368 16384     queue parallel at sub04n121 tagged to be overloaded: load_avg=2.020000 (no load adjustment) >= 1.4
>>>>>128257  25368 16384     queue parallel at sub04n139 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128258  25368 16384     queue parallel at sub04n140 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128259  25368 16384     queue parallel at sub04n141 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128260  25368 16384     queue parallel at sub04n142 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128261  25368 16384     queue parallel at sub04n143 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128262  25368 16384     queue parallel at sub04n144 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128263  25368 16384     queue parallel at sub04n146 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128264  25368 16384     queue parallel at sub04n02 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128265  25368 16384     queue parallel at sub04n03 tagged to be overloaded: load_avg=2.020000 (no load adjustment) >= 1.4
>>>>>128266  25368 16384     queue parallel at sub04n04 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128267  25368 16384     queue parallel at sub04n05 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128268  25368 16384     queue parallel at sub04n06 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128269  25368 16384     queue parallel at sub04n07 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128270  25368 16384     queue parallel at sub04n08 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128271  25368 16384     queue parallel at sub04n09 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128272  25368 16384     queue parallel at sub04n10 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128273  25368 16384     queue parallel at sub04n11 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128274  25368 16384     verified threshold of 169 queues
>>>>>128275  25368 16384     STARTING PASS 1 WITH 0 PENDING JOBS
>>>>>128276  25368 16384     Not enrolled ja_tasks: 0
>>>>>128277  25368 16384     Enrolled ja_tasks: 1
>>>>>128278  25368 16384     Not enrolled ja_tasks: 0
>>>>>128279  25368 16384     Enrolled ja_tasks: 1
>>>>>128280  25368 16384     Not enrolled ja_tasks: 0
>>>>>128281  25368 16384     Enrolled ja_tasks: 1
>>>>>128282  25368 16384     Not enrolled ja_tasks: 0
>>>>>128283  25368 16384     Enrolled ja_tasks: 1
>>>>>128284  25368 16384     Not enrolled ja_tasks: 0
>>>>>128285  25368 16384     Enrolled ja_tasks: 1
>>>>>128286  25368 16384     Not enrolled ja_tasks: 0
>>>>>128287  25368 16384     Enrolled ja_tasks: 1
>>>>>128288  25368 16384     Not enrolled ja_tasks: 0
>>>>>128289  25368 16384     Enrolled ja_tasks: 1
>>>>>128290  25368 16384     Not enrolled ja_tasks: 0
>>>>>128291  25368 16384     Enrolled ja_tasks: 1
>>>>>128292  25368 16384     Not enrolled ja_tasks: 0
>>>>>128293  25368 16384     Enrolled ja_tasks: 1
>>>>>128294  25368 16384     Not enrolled ja_tasks: 0
>>>>>128295  25368 16384     Enrolled ja_tasks: 1
>>>>>128296  25368 16384     Not enrolled ja_tasks: 0
>>>>>128297  25368 16384     Enrolled ja_tasks: 1
>>>>>128298  25368 16384     Not enrolled ja_tasks: 0
>>>>>128299  25368 16384     Enrolled ja_tasks: 1
>>>>>128300  25368 16384     Not enrolled ja_tasks: 0
>>>>>128301  25368 16384     Enrolled ja_tasks: 1
>>>>>128302  25368 16384     Not enrolled ja_tasks: 0
>>>>>128303  25368 16384     Enrolled ja_tasks: 1
>>>>>128304  25368 16384     Not enrolled ja_tasks: 0
>>>>>128305  25368 16384     Enrolled ja_tasks: 1
>>>>>128306  25368 16384     Not enrolled ja_tasks: 0
>>>>>128307  25368 16384     Enrolled ja_tasks: 1
>>>>>128308  25368 16384     Not enrolled ja_tasks: 0
>>>>>128309  25368 16384     Enrolled ja_tasks: 1
>>>>>128310  25368 16384     Not enrolled ja_tasks: 0
>>>>>128311  25368 16384     Enrolled ja_tasks: 1
>>>>>128312  25368 16384     Not enrolled ja_tasks: 0
>>>>>128313  25368 16384     Enrolled ja_tasks: 1
>>>>>128314  25368 16384     STARTING PASS 2 WITH 0 PENDING JOBS
>>>>>128315  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>128316  25368 16384        slots: 1.000000 * 1000.000000 * 6   ---> 6000.000000
>>>>>128317  25368 16384     slot request assumed for static urgency is 20 for ,20-64 PE range due to PE's "mpi" setting "min"
>>>>>128318  25368 16384        slots: 1.000000 * 1000.000000 * 20    ---> 20000.000000
>>>>>128319  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>128320  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>128321  25368 16384        slots: 1.000000 * 1000.000000 * 6   ---> 6000.000000
>>>>>128322  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>128323  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>128324  25368 16384     slot request assumed for static urgency is 2 for ,2-8 PE range due to PE's "mpich_myri" setting "min"
>>>>>128325  25368 16384        slots: 1.000000 * 1000.000000 * 2   ---> 2000.000000
>>>>>128326  25368 16384        slots: 1.000000 * 1000.000000 * 8   ---> 8000.000000
>>>>>128327  25368 16384     ASU min = 1000.00000000000, ASU max = 20000.00000000000
>>>>>128328  25368 16384     
>>>>>128329  25368 16384     no DDJU: do_usage: 1 finished_jobs 0
>>>>>128330  25368 16384     
>>>>>128331  25368 16384     =====================[Pass 0]======================
>>>>>128332  25368 16384     =====================[Pass 1]======================
>>>>>128333  25368 16384     =====================[Pass 2]======================
>>>>>128334  25368 16384     
>>>>>128335  25368 16384     no DDJU: do_usage: 0 finished_jobs 0
>>>>>128336  25368 16384     
>>>>>128337  25368 16384     =====================[Pass 0]======================
>>>>>128338  25368 16384     =====================[Pass 1]======================
>>>>>128339  25368 16384     =====================[Pass 2]======================
>>>>>128340  25368 16384     Normalizing tickets using 0.000000/18.333333 as min_tix/max_tix
>>>>>128341  25368 16384        got 19 running jobs
>>>>>128342  25368 16384        added 19 ticket orders for running jobs
>>>>>128343  25368 16384        added 1 orders for updating usage of user
>>>>>128344  25368 16384        added 0 orders for updating usage of project
>>>>>128345  25368 16384        added 0 orders for updating share tree
>>>>>128346  25368 16384        added 1 orders for scheduler configuration
>>>>>128347  25368 16384     SENDING 22 ORDERS TO QMASTER
>>>>>128348  25368 16384     RESETTING BUSY STATE OF EVENT CLIENT
>>>>>128349  25368 16384     reresolve port timeout in 320
>>>>>128350  25368 16384     returning cached port value: 536
>>>>>--------------STOP-SCHEDULER-RUN-------------
>>>>>128351  25368 16384     ec_get retrieving events - will do max 20 fetches
>>>>>128352  25368 16384     doing sync fetch for messages, 20 still to do
>>>>>128353  25368 16384     try to get request from qmaster, id 1
>>>>>128354  25368 16384     Checking 120 events (44358-44477) while waiting for #44358
>>>>>128355  25368 16384     check complete, 120 events in list
>>>>>128356  25368 16384     got 120 events till 44477
>>>>>128357  25368 16384     doing async fetch for messages, 19 still to do
>>>>>128358  25368 16384     try to get request from qmaster, id 1
>>>>>128359  25368 16384     reresolve port timeout in 300
>>>>>128360  25368 16384     returning cached port value: 536
>>>>>128361  25368 16384     Sent ack for all events lower or equal 44477
>>>>>128362  25368 16384     ec_get - received 120 events
>>>>>128363  25368 16384     44358. EVENT MOD EXECHOST sub04n166
>>>>>128364  25368 16384     44359. EVENT MOD EXECHOST sub04n90
>>>>>128365  25368 16384     44360. EVENT JOB 21503.1 task 2.sub04n90 USAGE
>>>>>128366  25368 16384     44361. EVENT JOB 21503.1 task 1.sub04n90 USAGE
>>>>>128367  25368 16384     44362. EVENT MOD EXECHOST sub04n168
>>>>>128368  25368 16384     44363. EVENT MOD EXECHOST sub04n112
>>>>>128369  25368 16384     44364. EVENT MOD EXECHOST sub04n08
>>>>>128370  25368 16384     44365. EVENT MOD EXECHOST sub04n75
>>>>>128371  25368 16384     44366. EVENT JOB 21040.1 task 6.sub04n75 USAGE
>>>>>128372  25368 16384     44367. EVENT JOB 21040.1 task 5.sub04n75 USAGE
>>>>>128373  25368 16384     44368. EVENT MOD USER udo
>>>>>128374  25368 16384     44369. EVENT MOD USER iber
>>>>>128375  25368 16384     44370. EVENT MOD USER dieguez
>>>>>128376  25368 16384     44371. EVENT MOD USER karenjoh
>>>>>128377  25368 16384     44372. EVENT MOD USER lorenzo
>>>>>128378  25368 16384     44373. EVENT MOD USER parcolle
>>>>>128379  25368 16384     44374. EVENT MOD USER cfennie
>>>>>128380  25368 16384     44375. EVENT MOD USER civelli
>>>>>128381  25368 16384     44376. EVENT MOD EXECHOST sub04n14
>>>>>128382  25368 16384     44377. EVENT MOD EXECHOST sub04n150
>>>>>128383  25368 16384     44378. EVENT MOD EXECHOST sub04n169
>>>>>128384  25368 16384     44379. EVENT MOD EXECHOST sub04n165
>>>>>128385  25368 16384     44380. EVENT MOD EXECHOST sub04n136
>>>>>128386  25368 16384     44381. EVENT MOD EXECHOST sub04n81
>>>>>128387  25368 16384     44382. EVENT JOB 21507.1 task 6.sub04n81 USAGE
>>>>>128388  25368 16384     44383. EVENT JOB 21507.1 task 5.sub04n81 USAGE
>>>>>128389  25368 16384     44384. EVENT MOD EXECHOST sub04n176
>>>>>128390  25368 16384     44385. EVENT MOD EXECHOST sub04n161
>>>>>128391  25368 16384     44386. EVENT MOD EXECHOST sub04n124
>>>>>128392  25368 16384     44387. EVENT MOD EXECHOST sub04n01
>>>>>128393  25368 16384     44388. EVENT MOD EXECHOST sub04n158
>>>>>128394  25368 16384     44389. EVENT MOD EXECHOST sub04n159
>>>>>128395  25368 16384     44390. EVENT MOD EXECHOST sub04n134
>>>>>128396  25368 16384     44391. EVENT MOD EXECHOST sub04n143
>>>>>128397  25368 16384     44392. EVENT MOD EXECHOST sub04n121
>>>>>128398  25368 16384     44393. EVENT MOD EXECHOST sub04n15
>>>>>128399  25368 16384     44394. EVENT MOD EXECHOST sub04n13
>>>>>128400  25368 16384     44395. EVENT MOD EXECHOST sub04n118
>>>>>128401  25368 16384     44396. EVENT MOD EXECHOST sub04n64
>>>>>128402  25368 16384     44397. EVENT JOB 21542.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>2.sub04n64 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128403  25368 16384     44398. EVENT JOB 21542.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>1.sub04n64 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128404  25368 16384     44399. EVENT MOD EXECHOST sub04n151
>>>>>128405  25368 16384     44400. EVENT MOD EXECHOST sub04n154
>>>>>128406  25368 16384     44401. EVENT MOD EXECHOST sub04n149
>>>>>128407  25368 16384     44402. EVENT MOD EXECHOST sub04n16
>>>>>128408  25368 16384     44403. EVENT MOD EXECHOST sub04n155
>>>>>128409  25368 16384     44404. EVENT MOD EXECHOST sub04n152
>>>>>128410  25368 16384     44405. EVENT MOD EXECHOST sub04n163
>>>>>128411  25368 16384     44406. EVENT MOD EXECHOST sub04n86
>>>>>128412  25368 16384     44407. EVENT JOB 21423.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>2.sub04n86 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128413  25368 16384     44408. EVENT JOB 21423.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>1.sub04n86 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128414  25368 16384     44409. EVENT MOD EXECHOST sub04n43
>>>>>128415  25368 16384     44410. EVENT MOD EXECHOST sub04n204
>>>>>128416  25368 16384     44411. EVENT MOD EXECHOST 
>>>>>          
>>>>>
>>rupc01.rutgers.edu
>>    
>>
>>>>>128417  25368 16384     44412. EVENT MOD EXECHOST sub04n125
>>>>>128418  25368 16384     44413. EVENT MOD EXECHOST sub04n03
>>>>>128419  25368 16384     44414. EVENT JOB 21076.1 USAGE
>>>>>128420  25368 16384     44415. EVENT MOD EXECHOST sub04n44
>>>>>128421  25368 16384     44416. EVENT MOD EXECHOST sub04n32
>>>>>128422  25368 16384     44417. EVENT MOD EXECHOST sub04n21
>>>>>128423  25368 16384     44418. EVENT MOD EXECHOST sub04n22
>>>>>128424  25368 16384     44419. EVENT MOD EXECHOST sub04n35
>>>>>128425  25368 16384     44420. EVENT MOD EXECHOST sub04n201
>>>>>128426  25368 16384     44421. EVENT MOD EXECHOST sub04n146
>>>>>128427  25368 16384     44422. EVENT MOD EXECHOST sub04n111
>>>>>128428  25368 16384     44423. EVENT MOD EXECHOST sub04n177
>>>>>128429  25368 16384     44424. EVENT MOD EXECHOST sub04n89
>>>>>128430  25368 16384     44425. EVENT JOB 21530.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>2.sub04n89 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128431  25368 16384     44426. EVENT JOB 21530.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>1.sub04n89 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128432  25368 16384     44427. EVENT JOB 21530.1 USAGE
>>>>>128433  25368 16384     44428. EVENT MOD EXECHOST sub04n205
>>>>>128434  25368 16384     44429. EVENT JOB 21440.1 USAGE
>>>>>128435  25368 16384     44430. EVENT MOD EXECHOST sub04n208
>>>>>128436  25368 16384     44431. EVENT JOB 21528.1 USAGE
>>>>>128437  25368 16384     44432. EVENT MOD EXECHOST sub04n104
>>>>>128438  25368 16384     44433. EVENT MOD EXECHOST sub04n24
>>>>>128439  25368 16384     44434. EVENT JOB 21210.1 USAGE
>>>>>128440  25368 16384     44435. EVENT MOD EXECHOST sub04n18
>>>>>128441  25368 16384     44436. EVENT MOD EXECHOST sub04n31
>>>>>128442  25368 16384     44437. EVENT JOB 20937.1 USAGE
>>>>>128443  25368 16384     44438. EVENT MOD EXECHOST sub04n202
>>>>>128444  25368 16384     44439. EVENT JOB 21443.1 USAGE
>>>>>128445  25368 16384     44440. EVENT MOD EXECHOST sub04n171
>>>>>128446  25368 16384     44441. EVENT MOD EXECHOST sub04n37
>>>>>128447  25368 16384     44442. EVENT MOD EXECHOST sub04n36
>>>>>128448  25368 16384     44443. EVENT MOD EXECHOST sub04n40
>>>>>128449  25368 16384     44444. EVENT MOD EXECHOST sub04n12
>>>>>128450  25368 16384     44445. EVENT MOD EXECHOST sub04n172
>>>>>128451  25368 16384     44446. EVENT MOD EXECHOST sub04n79
>>>>>128452  25368 16384     44447. EVENT JOB 21040.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>6.sub04n79 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128453  25368 16384     44448. EVENT JOB 21040.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>5.sub04n79 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128454  25368 16384     44449. EVENT JOB 21040.1 USAGE
>>>>>128455  25368 16384     44450. EVENT MOD EXECHOST sub04n61
>>>>>128456  25368 16384     44451. EVENT JOB 21040.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>6.sub04n61 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128457  25368 16384     44452. EVENT JOB 21040.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>5.sub04n61 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128458  25368 16384     44453. EVENT MOD EXECHOST sub04n170
>>>>>128459  25368 16384     44454. EVENT MOD EXECHOST sub04n41
>>>>>128460  25368 16384     44455. EVENT JOB 20938.1 USAGE
>>>>>128461  25368 16384     44456. EVENT MOD EXECHOST sub04n153
>>>>>128462  25368 16384     44457. EVENT MOD EXECHOST sub04n39
>>>>>128463  25368 16384     44458. EVENT MOD EXECHOST sub04n83
>>>>>128464  25368 16384     44459. EVENT MOD EXECHOST sub04n82
>>>>>128465  25368 16384     44460. EVENT MOD EXECHOST sub04n174
>>>>>128466  25368 16384     44461. EVENT MOD EXECHOST sub04n173
>>>>>128467  25368 16384     44462. EVENT MOD EXECHOST sub04n85
>>>>>128468  25368 16384     44463. EVENT JOB 21423.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>2.sub04n85 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128469  25368 16384     44464. EVENT JOB 21423.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>1.sub04n85 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128470  25368 16384     44465. EVENT MOD EXECHOST sub04n68
>>>>>128471  25368 16384     44466. EVENT JOB 21474.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>14.sub04n68 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128472  25368 16384     44467. EVENT JOB 21474.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>13.sub04n68 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128473  25368 16384     44468. EVENT MOD EXECHOST 
>>>>>          
>>>>>
>>beowulf.rutgers.edu
>>    
>>
>>>>>128474  25368 16384     44469. EVENT MOD EXECHOST sub04n91
>>>>>128475  25368 16384     44470. EVENT JOB 21423.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>2.sub04n91 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128476  25368 16384     44471. EVENT JOB 21423.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>1.sub04n91 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128477  25368 16384     44472. EVENT JOB 21423.1 USAGE
>>>>>128478  25368 16384     44473. EVENT MOD EXECHOST sub04n29
>>>>>128479  25368 16384     44474. EVENT MOD EXECHOST sub04n69
>>>>>128480  25368 16384     44475. EVENT JOB 21474.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>14.sub04n69 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128481  25368 16384     44476. EVENT JOB 21474.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>13.sub04n69 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128482  25368 16384     44477. EVENT MOD EXECHOST sub04n175
>>>>>Q:169, AQ:343 J:19(19), H:169(170), C:49, A:4, D:3, P:7,
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>CKPT:0 US:15 PR:4 S:nd:12/lf:7
>>>>   
>>>>
>>>>        
>>>>
>>>>>128483  25368 16384     
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>================[SCHEDULING-EPOCH]==================
>>>>   
>>>>
>>>>        
>>>>
>>>>>128484  25368 16384     JOB 20937.1 start_time = 1116447112 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 338099 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128485  25368 16384     JOB 20938.1 start_time = 1116374344 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 410867 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128486  25368 16384     JOB 21040.1 start_time = 1116443073 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 342138 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128487  25368 16384     JOB 21076.1 start_time = 1116451351 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 333860 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128488  25368 16384     JOB 21210.1 start_time = 1116514970 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 270241 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128489  25368 16384     JOB 21213.1 start_time = 1116515250 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 269961 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128490  25368 16384     JOB 21338.1 start_time = 1116543252 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 241959 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128491  25368 16384     JOB 21423.1 start_time = 1116629274 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 155937 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128492  25368 16384     JOB 21424.1 start_time = 1116631365 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 153846 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128493  25368 16384     JOB 21440.1 start_time = 1116632934 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 152277 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128494  25368 16384     JOB 21441.1 start_time = 1116632994 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 152217 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128495  25368 16384     JOB 21443.1 start_time = 1116633602 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 151609 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128496  25368 16384     JOB 21474.1 start_time = 1116655118 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 130093 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128497  25368 16384     JOB 21503.1 start_time = 1116707395 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 77816 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128498  25368 16384     JOB 21507.1 start_time = 1116714061 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 71150 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128499  25368 16384     JOB 21528.1 start_time = 1116707641 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 77570 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128500  25368 16384     JOB 21530.1 start_time = 1116714453 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 70758 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128501  25368 16384     JOB 21537.1 start_time = 1116724845 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 60366 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128502  25368 16384     JOB 21542.1 start_time = 1116782511 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>running_time 2700 decay_time = 450
>>>>   
>>>>
>>>>        
>>>>
>>>>>128503  25368 16384     verified threshold of 169 queues
>>>>>128504  25368 16384     queue myrinet at sub04n61 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128505  25368 16384     queue myrinet at sub04n62 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128506  25368 16384     queue myrinet at sub04n65 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128507  25368 16384     queue myrinet at sub04n66 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128508  25368 16384     queue myrinet at sub04n67 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.020000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128509  25368 16384     queue myrinet at sub04n68 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128510  25368 16384     queue myrinet at sub04n69 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128511  25368 16384     queue myrinet at sub04n70 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128512  25368 16384     queue myrinet at sub04n71 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128513  25368 16384     queue myrinet at sub04n72 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128514  25368 16384     queue myrinet at sub04n75 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128515  25368 16384     queue myrinet at sub04n77 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128516  25368 16384     queue myrinet at sub04n78 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128517  25368 16384     queue myrinet at sub04n79 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128518  25368 16384     queue myrinet at sub04n81 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128519  25368 16384     queue myrinet at sub04n84 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128520  25368 16384     queue myrinet at sub04n85 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128521  25368 16384     queue myrinet at sub04n86 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128522  25368 16384     queue myrinet at sub04n87 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128523  25368 16384     queue myrinet at sub04n88 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128524  25368 16384     queue myrinet at sub04n89 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128525  25368 16384     queue myrinet at sub04n90 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128526  25368 16384     queue myrinet at sub04n91 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128527  25368 16384     queue myrinet at sub04n63 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128528  25368 16384     queue myrinet at sub04n64 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128529  25368 16384     queue myrinet at sub04n73 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128530  25368 16384     queue myrinet at sub04n74 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128531  25368 16384     queue opteronp at sub04n202 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>   
>>>>
>>>>        
>>>>
>>>>>128532  25368 16384     queue opteronp at sub04n205 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>   
>>>>
>>>>        
>>>>
>>>>>128533  25368 16384     queue opteronp at sub04n206 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>   
>>>>
>>>>        
>>>>
>>>>>128534  25368 16384     queue opteronp at sub04n208 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>   
>>>>
>>>>        
>>>>
>>>>>128535  25368 16384     queue parallel at sub04n121 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128536  25368 16384     queue parallel at sub04n139 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128537  25368 16384     queue parallel at sub04n140 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128538  25368 16384     queue parallel at sub04n141 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128539  25368 16384     queue parallel at sub04n142 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128540  25368 16384     queue parallel at sub04n143 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128541  25368 16384     queue parallel at sub04n144 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128542  25368 16384     queue parallel at sub04n146 tagged to 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128543  25368 16384     queue parallel at sub04n02 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128544  25368 16384     queue parallel at sub04n03 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.020000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128545  25368 16384     queue parallel at sub04n04 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128546  25368 16384     queue parallel at sub04n05 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128547  25368 16384     queue parallel at sub04n06 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128548  25368 16384     queue parallel at sub04n07 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128549  25368 16384     queue parallel at sub04n08 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128550  25368 16384     queue parallel at sub04n09 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128551  25368 16384     queue parallel at sub04n10 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128552  25368 16384     queue parallel at sub04n11 tagged to be 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>   
>>>>
>>>>        
>>>>
>>>>>128553  25368 16384     verified threshold of 169 queues
>>>>>128554  25368 16384     STARTING PASS 1 WITH 0 PENDING JOBS
>>>>>128555  25368 16384     Not enrolled ja_tasks: 0
>>>>>128556  25368 16384     Enrolled ja_tasks: 1
>>>>>128557  25368 16384     Not enrolled ja_tasks: 0
>>>>>128558  25368 16384     Enrolled ja_tasks: 1
>>>>>128559  25368 16384     Not enrolled ja_tasks: 0
>>>>>128560  25368 16384     Enrolled ja_tasks: 1
>>>>>128561  25368 16384     Not enrolled ja_tasks: 0
>>>>>128562  25368 16384     Enrolled ja_tasks: 1
>>>>>128563  25368 16384     Not enrolled ja_tasks: 0
>>>>>128564  25368 16384     Enrolled ja_tasks: 1
>>>>>128565  25368 16384     Not enrolled ja_tasks: 0
>>>>>128566  25368 16384     Enrolled ja_tasks: 1
>>>>>128567  25368 16384     Not enrolled ja_tasks: 0
>>>>>128568  25368 16384     Enrolled ja_tasks: 1
>>>>>128569  25368 16384     Not enrolled ja_tasks: 0
>>>>>128570  25368 16384     Enrolled ja_tasks: 1
>>>>>128571  25368 16384     Not enrolled ja_tasks: 0
>>>>>128572  25368 16384     Enrolled ja_tasks: 1
>>>>>128573  25368 16384     Not enrolled ja_tasks: 0
>>>>>128574  25368 16384     Enrolled ja_tasks: 1
>>>>>128575  25368 16384     Not enrolled ja_tasks: 0
>>>>>128576  25368 16384     Enrolled ja_tasks: 1
>>>>>128577  25368 16384     Not enrolled ja_tasks: 0
>>>>>128578  25368 16384     Enrolled ja_tasks: 1
>>>>>128579  25368 16384     Not enrolled ja_tasks: 0
>>>>>128580  25368 16384     Enrolled ja_tasks: 1
>>>>>128581  25368 16384     Not enrolled ja_tasks: 0
>>>>>128582  25368 16384     Enrolled ja_tasks: 1
>>>>>128583  25368 16384     Not enrolled ja_tasks: 0
>>>>>128584  25368 16384     Enrolled ja_tasks: 1
>>>>>128585  25368 16384     Not enrolled ja_tasks: 0
>>>>>128586  25368 16384     Enrolled ja_tasks: 1
>>>>>128587  25368 16384     Not enrolled ja_tasks: 0
>>>>>128588  25368 16384     Enrolled ja_tasks: 1
>>>>>128589  25368 16384     Not enrolled ja_tasks: 0
>>>>>128590  25368 16384     Enrolled ja_tasks: 1
>>>>>128591  25368 16384     Not enrolled ja_tasks: 0
>>>>>128592  25368 16384     Enrolled ja_tasks: 1
>>>>>128593  25368 16384     STARTING PASS 2 WITH 0 PENDING JOBS
>>>>>128594  25368 16384        slots: 1.000000 * 1000.000000 * 1 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>  ---> 1000.000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128595  25368 16384        slots: 1.000000 * 1000.000000 * 6 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>  ---> 6000.000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128596  25368 16384     slot request assumed for static 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>urgency is 20 for ,20-64 PE range due to PE's "mpi" setting "min"
>>>>   
>>>>
>>>>        
>>>>
>>>>>128597  25368 16384        slots: 1.000000 * 1000.000000 * 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>20    ---> 20000.000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128598  25368 16384        slots: 1.000000 * 1000.000000 * 1 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>  ---> 1000.000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128599  25368 16384        slots: 1.000000 * 1000.000000 * 1 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>  ---> 1000.000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128600  25368 16384        slots: 1.000000 * 1000.000000 * 6 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>  ---> 6000.000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128601  25368 16384        slots: 1.000000 * 1000.000000 * 1 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>  ---> 1000.000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128602  25368 16384        slots: 1.000000 * 1000.000000 * 1 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>  ---> 1000.000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128603  25368 16384     slot request assumed for static 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>urgency is 2 for ,2-8 PE range due to PE's "mpich_myri" 
>>>>        
>>>>
>>setting "min"
>>    
>>
>>>>   
>>>>
>>>>        
>>>>
>>>>>128604  25368 16384        slots: 1.000000 * 1000.000000 * 2 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>  ---> 2000.000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128605  25368 16384        slots: 1.000000 * 1000.000000 * 8 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>  ---> 8000.000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128606  25368 16384     ASU min = 1000.00000000000, ASU max 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>= 20000.00000000000
>>>>   
>>>>
>>>>        
>>>>
>>>>>128607  25368 16384     
>>>>>128608  25368 16384     no DDJU: do_usage: 1 finished_jobs 0
>>>>>128609  25368 16384     
>>>>>128610  25368 16384     =====================[Pass 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>0]======================
>>>>   
>>>>
>>>>        
>>>>
>>>>>128611  25368 16384     =====================[Pass 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>1]======================
>>>>   
>>>>
>>>>        
>>>>
>>>>>128612  25368 16384     =====================[Pass 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>2]======================
>>>>   
>>>>
>>>>        
>>>>
>>>>>128613  25368 16384     
>>>>>128614  25368 16384     no DDJU: do_usage: 0 finished_jobs 0
>>>>>128615  25368 16384     
>>>>>128616  25368 16384     =====================[Pass 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>0]======================
>>>>   
>>>>
>>>>        
>>>>
>>>>>128617  25368 16384     =====================[Pass 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>1]======================
>>>>   
>>>>
>>>>        
>>>>
>>>>>128618  25368 16384     =====================[Pass 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>2]======================
>>>>   
>>>>
>>>>        
>>>>
>>>>>128619  25368 16384     Normalizing tickets using 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>0.000000/18.333333 as min_tix/max_tix
>>>>   
>>>>
>>>>        
>>>>
>>>>>128620  25368 16384        got 19 running jobs
>>>>>128621  25368 16384        added 19 ticket orders for running jobs
>>>>>128622  25368 16384        added 1 orders for updating 
>>>>>          
>>>>>
>>usage of user
>>    
>>
>>>>>128623  25368 16384        added 0 orders for updating usage 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>of project
>>>>   
>>>>
>>>>        
>>>>
>>>>>128624  25368 16384        added 0 orders for updating share tree
>>>>>128625  25368 16384        added 1 orders for scheduler 
>>>>>          
>>>>>
>>configuration
>>    
>>
>>>>>128626  25368 16384     SENDING 22 ORDERS TO QMASTER
>>>>>128627  25368 16384     RESETTING BUSY STATE OF EVENT CLIENT
>>>>>128628  25368 16384     reresolve port timeout in 300
>>>>>128629  25368 16384     returning cached port value: 536
>>>>>--------------STOP-SCHEDULER-RUN-------------
>>>>>128630  25368 16384     ec_get retrieving events - will do 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>max 20 fetches
>>>>   
>>>>
>>>>        
>>>>
>>>>>128631  25368 16384     doing sync fetch for messages, 20 
>>>>>          
>>>>>
>>still to do
>>    
>>
>>>>>128632  25368 16384     try to get request from qmaster, id 1
>>>>>128633  25368 16384     Checking 84 events (44478-44561) 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>while waiting for #44478
>>>>   
>>>>
>>>>        
>>>>
>>>>>128634  25368 16384     check complete, 84 events in list
>>>>>128635  25368 16384     got 84 events till 44561
>>>>>128636  25368 16384     doing async fetch for messages, 19 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>still to do
>>>>   
>>>>
>>>>        
>>>>
>>>>>128637  25368 16384     try to get request from qmaster, id 1
>>>>>128638  25368 16384     reresolve port timeout in 280
>>>>>128639  25368 16384     returning cached port value: 536
>>>>>128640  25368 16384     Getting host by name - Linux
>>>>>128641  25368 16384     1 names in h_addr_list
>>>>>128642  25368 16384     0 names in h_aliases
>>>>>128643  25368 16384     Sent ack for all events lower or 
>>>>>          
>>>>>
>>equal 44561
>>    
>>
>>>>>128644  25368 16384     ec_get - received 84 events
>>>>>128645  25368 16384     44478. EVENT MOD EXECHOST sub04n167
>>>>>128646  25368 16384     44479. EVENT MOD EXECHOST sub04n63
>>>>>128647  25368 16384     44480. EVENT JOB 21542.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>2.sub04n63 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128648  25368 16384     44481. EVENT JOB 21542.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>1.sub04n63 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128649  25368 16384     44482. EVENT JOB 21542.1 USAGE
>>>>>128650  25368 16384     44483. EVENT MOD EXECHOST sub04n71
>>>>>128651  25368 16384     44484. EVENT JOB 21537.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>2.sub04n71 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128652  25368 16384     44485. EVENT JOB 21537.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>1.sub04n71 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128653  25368 16384     44486. EVENT MOD EXECHOST sub04n65
>>>>>128654  25368 16384     44487. EVENT JOB 21424.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>2.sub04n65 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128655  25368 16384     44488. EVENT JOB 21424.1 task 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>1.sub04n65 USAGE
>>>>   
>>>>
>>>>        
>>>>
>>>>>128656  25368 16384     44489. EVENT MOD USER udo
>>>>>128657  25368 16384     44490. EVENT MOD USER iber
>>>>>128658  25368 16384     44491. EVENT MOD USER dieguez
>>>>>128659  25368 16384     44492. EVENT MOD USER karenjoh
>>>>>128660  25368 16384     44493. EVENT MOD USER lorenzo
>>>>>128661  25368 16384     44494. EVENT MOD USER parcolle
>>>>>128662  25368 16384     44495. EVENT MOD USER cfennie
>>>>>128663  25368 16384     44496. EVENT MOD USER civelli
>>>>>128664  25368 16384     44497. EVENT MOD EXECHOST sub04n25
>>>>>128665  25368 16384     44498. EVENT MOD EXECHOST sub04n144
>>>>>128666  25368 16384     44499. EVENT MOD EXECHOST sub04n206
>>>>>128667  25368 16384     44500. EVENT JOB 21441.1 USAGE
>>>>>128668  25368 16384     44501. EVENT MOD EXECHOST sub04n87
>>>>>128669  25368 16384     44502. EVENT JOB 21503.1 task 2.sub04n87 USAGE
>>>>>128670  25368 16384     44503. EVENT JOB 21503.1 task 1.sub04n87 USAGE
>>>>>128671  25368 16384     44504. EVENT MOD EXECHOST sub04n70
>>>>>128672  25368 16384     44505. EVENT JOB 21503.1 task 2.sub04n70 USAGE
>>>>>128673  25368 16384     44506. EVENT JOB 21503.1 task 1.sub04n70 USAGE
>>>>>128674  25368 16384     44507. EVENT JOB 21503.1 USAGE
>>>>>128675  25368 16384     44508. EVENT MOD EXECHOST sub04n19
>>>>>128676  25368 16384     44509. EVENT JOB 21338.1 USAGE
>>>>>128677  25368 16384     44510. EVENT MOD EXECHOST sub04n84
>>>>>128678  25368 16384     44511. EVENT JOB 21424.1 task 2.sub04n84 USAGE
>>>>>128679  25368 16384     44512. EVENT JOB 21424.1 task 1.sub04n84 USAGE
>>>>>128680  25368 16384     44513. EVENT MOD EXECHOST sub04n178
>>>>>128681  25368 16384     44514. EVENT MOD EXECHOST sub04n67
>>>>>128682  25368 16384     44515. EVENT JOB 21474.1 task 14.sub04n67 USAGE
>>>>>128683  25368 16384     44516. EVENT JOB 21474.1 task 13.sub04n67 USAGE
>>>>>128684  25368 16384     44517. EVENT JOB 21474.1 USAGE
>>>>>128685  25368 16384     44518. EVENT MOD EXECHOST sub04n27
>>>>>128686  25368 16384     44519. EVENT MOD EXECHOST sub04n34
>>>>>128687  25368 16384     44520. EVENT MOD EXECHOST sub04n72
>>>>>128688  25368 16384     44521. EVENT JOB 21537.1 task 2.sub04n72 USAGE
>>>>>128689  25368 16384     44522. EVENT JOB 21537.1 task 1.sub04n72 USAGE
>>>>>128690  25368 16384     44523. EVENT MOD EXECHOST sub04n78
>>>>>128691  25368 16384     44524. EVENT JOB 21507.1 task 6.sub04n78 USAGE
>>>>>128692  25368 16384     44525. EVENT JOB 21507.1 task 5.sub04n78 USAGE
>>>>>128693  25368 16384     44526. EVENT JOB 21507.1 USAGE
>>>>>128694  25368 16384     44527. EVENT MOD EXECHOST sub04n17
>>>>>128695  25368 16384     44528. EVENT MOD EXECHOST sub04n07
>>>>>128696  25368 16384     44529. EVENT MOD EXECHOST sub04n128
>>>>>128697  25368 16384     44530. EVENT MOD EXECHOST sub04n42
>>>>>128698  25368 16384     44531. EVENT MOD EXECHOST sub04n62
>>>>>128699  25368 16384     44532. EVENT JOB 21424.1 task 2.sub04n62 USAGE
>>>>>128700  25368 16384     44533. EVENT JOB 21424.1 task 1.sub04n62 USAGE
>>>>>128701  25368 16384     44534. EVENT JOB 21424.1 USAGE
>>>>>128702  25368 16384     44535. EVENT MOD EXECHOST sub04n10
>>>>>128703  25368 16384     44536. EVENT MOD EXECHOST sub04n77
>>>>>128704  25368 16384     44537. EVENT JOB 21537.1 task 2.sub04n77 USAGE
>>>>>128705  25368 16384     44538. EVENT JOB 21537.1 task 1.sub04n77 USAGE
>>>>>128706  25368 16384     44539. EVENT MOD EXECHOST sub04n11
>>>>>128707  25368 16384     44540. EVENT MOD EXECHOST sub04n02
>>>>>128708  25368 16384     44541. EVENT MOD EXECHOST sub04n120
>>>>>128709  25368 16384     44542. EVENT MOD EXECHOST sub04n115
>>>>>128710  25368 16384     44543. EVENT MOD EXECHOST sub04n101
>>>>>128711  25368 16384     44544. EVENT MOD EXECHOST sub04n66
>>>>>128712  25368 16384     44545. EVENT JOB 21537.1 task 2.sub04n66 USAGE
>>>>>128713  25368 16384     44546. EVENT JOB 21537.1 task 1.sub04n66 USAGE
>>>>>128714  25368 16384     44547. EVENT JOB 21537.1 USAGE
>>>>>128715  25368 16384     44548. EVENT MOD EXECHOST sub04n142
>>>>>128716  25368 16384     44549. EVENT MOD EXECHOST sub04n123
>>>>>128717  25368 16384     44550. EVENT MOD EXECHOST sub04n33
>>>>>128718  25368 16384     44551. EVENT MOD EXECHOST sub04n126
>>>>>128719  25368 16384     44552. EVENT MOD EXECHOST sub04n140
>>>>>128720  25368 16384     44553. EVENT MOD EXECHOST sub04n119
>>>>>128721  25368 16384     44554. EVENT MOD EXECHOST sub04n102
>>>>>128722  25368 16384     44555. EVENT MOD EXECHOST sub04n110
>>>>>128723  25368 16384     44556. EVENT MOD EXECHOST sub04n117
>>>>>128724  25368 16384     44557. EVENT MOD EXECHOST sub04n06
>>>>>128725  25368 16384     44558. EVENT MOD EXECHOST sub04n73
>>>>>128726  25368 16384     44559. EVENT JOB 21542.1 task 2.sub04n73 USAGE
>>>>>128727  25368 16384     44560. EVENT JOB 21542.1 task 1.sub04n73 USAGE
>>>>>128728  25368 16384     44561. EVENT MOD EXECHOST sub04n122
>>>>>Q:169, AQ:343 J:19(19), H:169(170), C:49, A:4, D:3, P:7, CKPT:0 US:15 PR:4 S:nd:12/lf:7
>>>>>128729  25368 16384     ================[SCHEDULING-EPOCH]==================
>>>>>128730  25368 16384     JOB 20937.1 start_time = 1116447112 running_time 338119 decay_time = 450
>>>>>128731  25368 16384     JOB 20938.1 start_time = 1116374344 running_time 410887 decay_time = 450
>>>>>128732  25368 16384     JOB 21040.1 start_time = 1116443073 running_time 342158 decay_time = 450
>>>>>128733  25368 16384     JOB 21076.1 start_time = 1116451351 running_time 333880 decay_time = 450
>>>>>128734  25368 16384     JOB 21210.1 start_time = 1116514970 running_time 270261 decay_time = 450
>>>>>128735  25368 16384     JOB 21213.1 start_time = 1116515250 running_time 269981 decay_time = 450
>>>>>128736  25368 16384     JOB 21338.1 start_time = 1116543252 running_time 241979 decay_time = 450
>>>>>128737  25368 16384     JOB 21423.1 start_time = 1116629274 running_time 155957 decay_time = 450
>>>>>128738  25368 16384     JOB 21424.1 start_time = 1116631365 running_time 153866 decay_time = 450
>>>>>128739  25368 16384     JOB 21440.1 start_time = 1116632934 running_time 152297 decay_time = 450
>>>>>128740  25368 16384     JOB 21441.1 start_time = 1116632994 running_time 152237 decay_time = 450
>>>>>128741  25368 16384     JOB 21443.1 start_time = 1116633602 running_time 151629 decay_time = 450
>>>>>128742  25368 16384     JOB 21474.1 start_time = 1116655118 running_time 130113 decay_time = 450
>>>>>128743  25368 16384     JOB 21503.1 start_time = 1116707395 running_time 77836 decay_time = 450
>>>>>128744  25368 16384     JOB 21507.1 start_time = 1116714061 running_time 71170 decay_time = 450
>>>>>128745  25368 16384     JOB 21528.1 start_time = 1116707641 running_time 77590 decay_time = 450
>>>>>128746  25368 16384     JOB 21530.1 start_time = 1116714453 running_time 70778 decay_time = 450
>>>>>128747  25368 16384     JOB 21537.1 start_time = 1116724845 running_time 60386 decay_time = 450
>>>>>128748  25368 16384     JOB 21542.1 start_time = 1116782511 running_time 2720 decay_time = 450
>>>>>128749  25368 16384     verified threshold of 169 queues
>>>>>128750  25368 16384     queue myrinet at sub04n61 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128751  25368 16384     queue myrinet at sub04n62 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128752  25368 16384     queue myrinet at sub04n65 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128753  25368 16384     queue myrinet at sub04n66 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128754  25368 16384     queue myrinet at sub04n67 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128755  25368 16384     queue myrinet at sub04n68 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128756  25368 16384     queue myrinet at sub04n69 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128757  25368 16384     queue myrinet at sub04n70 tagged to be overloaded: load_avg=2.020000 (no load adjustment) >= 1.4
>>>>>128758  25368 16384     queue myrinet at sub04n71 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128759  25368 16384     queue myrinet at sub04n72 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128760  25368 16384     queue myrinet at sub04n75 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128761  25368 16384     queue myrinet at sub04n77 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128762  25368 16384     queue myrinet at sub04n78 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128763  25368 16384     queue myrinet at sub04n79 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128764  25368 16384     queue myrinet at sub04n81 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128765  25368 16384     queue myrinet at sub04n84 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128766  25368 16384     queue myrinet at sub04n85 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128767  25368 16384     queue myrinet at sub04n86 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128768  25368 16384     queue myrinet at sub04n87 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128769  25368 16384     queue myrinet at sub04n88 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128770  25368 16384     queue myrinet at sub04n89 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128771  25368 16384     queue myrinet at sub04n90 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128772  25368 16384     queue myrinet at sub04n91 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128773  25368 16384     queue myrinet at sub04n63 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128774  25368 16384     queue myrinet at sub04n64 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128775  25368 16384     queue myrinet at sub04n73 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128776  25368 16384     queue myrinet at sub04n74 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128777  25368 16384     queue opteronp at sub04n202 tagged to be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>>128778  25368 16384     queue opteronp at sub04n205 tagged to be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>>128779  25368 16384     queue opteronp at sub04n206 tagged to be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>>128780  25368 16384     queue opteronp at sub04n208 tagged to be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>>128781  25368 16384     queue parallel at sub04n121 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128782  25368 16384     queue parallel at sub04n139 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128783  25368 16384     queue parallel at sub04n140 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128784  25368 16384     queue parallel at sub04n141 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128785  25368 16384     queue parallel at sub04n142 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128786  25368 16384     queue parallel at sub04n143 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128787  25368 16384     queue parallel at sub04n144 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128788  25368 16384     queue parallel at sub04n146 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128789  25368 16384     queue parallel at sub04n02 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128790  25368 16384     queue parallel at sub04n03 tagged to be overloaded: load_avg=2.020000 (no load adjustment) >= 1.4
>>>>>128791  25368 16384     queue parallel at sub04n04 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128792  25368 16384     queue parallel at sub04n05 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128793  25368 16384     queue parallel at sub04n06 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128794  25368 16384     queue parallel at sub04n07 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128795  25368 16384     queue parallel at sub04n08 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128796  25368 16384     queue parallel at sub04n09 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128797  25368 16384     queue parallel at sub04n10 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128798  25368 16384     queue parallel at sub04n11 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128799  25368 16384     verified threshold of 169 queues
>>>>>128800  25368 16384     STARTING PASS 1 WITH 0 PENDING JOBS
>>>>>128801  25368 16384     Not enrolled ja_tasks: 0
>>>>>128802  25368 16384     Enrolled ja_tasks: 1
>>>>>128803  25368 16384     Not enrolled ja_tasks: 0
>>>>>128804  25368 16384     Enrolled ja_tasks: 1
>>>>>128805  25368 16384     Not enrolled ja_tasks: 0
>>>>>128806  25368 16384     Enrolled ja_tasks: 1
>>>>>128807  25368 16384     Not enrolled ja_tasks: 0
>>>>>128808  25368 16384     Enrolled ja_tasks: 1
>>>>>128809  25368 16384     Not enrolled ja_tasks: 0
>>>>>128810  25368 16384     Enrolled ja_tasks: 1
>>>>>128811  25368 16384     Not enrolled ja_tasks: 0
>>>>>128812  25368 16384     Enrolled ja_tasks: 1
>>>>>128813  25368 16384     Not enrolled ja_tasks: 0
>>>>>128814  25368 16384     Enrolled ja_tasks: 1
>>>>>128815  25368 16384     Not enrolled ja_tasks: 0
>>>>>128816  25368 16384     Enrolled ja_tasks: 1
>>>>>128817  25368 16384     Not enrolled ja_tasks: 0
>>>>>128818  25368 16384     Enrolled ja_tasks: 1
>>>>>128819  25368 16384     Not enrolled ja_tasks: 0
>>>>>128820  25368 16384     Enrolled ja_tasks: 1
>>>>>128821  25368 16384     Not enrolled ja_tasks: 0
>>>>>128822  25368 16384     Enrolled ja_tasks: 1
>>>>>128823  25368 16384     Not enrolled ja_tasks: 0
>>>>>128824  25368 16384     Enrolled ja_tasks: 1
>>>>>128825  25368 16384     Not enrolled ja_tasks: 0
>>>>>128826  25368 16384     Enrolled ja_tasks: 1
>>>>>128827  25368 16384     Not enrolled ja_tasks: 0
>>>>>128828  25368 16384     Enrolled ja_tasks: 1
>>>>>128829  25368 16384     Not enrolled ja_tasks: 0
>>>>>128830  25368 16384     Enrolled ja_tasks: 1
>>>>>128831  25368 16384     Not enrolled ja_tasks: 0
>>>>>128832  25368 16384     Enrolled ja_tasks: 1
>>>>>128833  25368 16384     Not enrolled ja_tasks: 0
>>>>>128834  25368 16384     Enrolled ja_tasks: 1
>>>>>128835  25368 16384     Not enrolled ja_tasks: 0
>>>>>128836  25368 16384     Enrolled ja_tasks: 1
>>>>>128837  25368 16384     Not enrolled ja_tasks: 0
>>>>>128838  25368 16384     Enrolled ja_tasks: 1
>>>>>128839  25368 16384     STARTING PASS 2 WITH 0 PENDING JOBS
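[Aside: the "slots: 1.000000 * 1000.000000 * N   ---> N000" lines in the trace below are just the product of the urgency weight, the per-slot urgency value, and the assumed slot count; a minimal sketch under that reading, with hypothetical names that are not taken from the SGE source:]

```python
# Hedged sketch of the "slots: w * u * n ---> v" arithmetic in the trace,
# e.g. "slots: 1.000000 * 1000.000000 * 20    ---> 20000.000000".
# slot_urgency is a hypothetical name, not an SGE function.
def slot_urgency(weight: float, per_slot: float, slots: int) -> float:
    """Static urgency contribution of a slot request: weight * per-slot urgency * slots."""
    return weight * per_slot * slots

# A PE urgency setting of "min" assumes the minimum of the requested
# range, e.g. 20 slots for a 20-64 range as logged for PE "mpi":
assumed_slots = 20
print(slot_urgency(1.0, 1000.0, assumed_slots))  # 20000.0
```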
>>>>>128840  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>128841  25368 16384        slots: 1.000000 * 1000.000000 * 6   ---> 6000.000000
>>>>>128842  25368 16384     slot request assumed for static urgency is 20 for ,20-64 PE range due to PE's "mpi" setting "min"
>>>>>128843  25368 16384        slots: 1.000000 * 1000.000000 * 20    ---> 20000.000000
>>>>>128844  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>128845  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>128846  25368 16384        slots: 1.000000 * 1000.000000 * 6   ---> 6000.000000
>>>>>128847  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>128848  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>128849  25368 16384     slot request assumed for static urgency is 2 for ,2-8 PE range due to PE's "mpich_myri" setting "min"
>>>>>128850  25368 16384        slots: 1.000000 * 1000.000000 * 2   ---> 2000.000000
>>>>>128851  25368 16384        slots: 1.000000 * 1000.000000 * 8   ---> 8000.000000
>>>>>128852  25368 16384     ASU min = 1000.00000000000, ASU max = 20000.00000000000
>>>>>128853  25368 16384     
>>>>>128854  25368 16384     no DDJU: do_usage: 1 finished_jobs 0
>>>>>128855  25368 16384     
>>>>>128856  25368 16384     =====================[Pass 0]======================
>>>>>128857  25368 16384     =====================[Pass 1]======================
>>>>>128858  25368 16384     =====================[Pass 2]======================
>>>>>128859  25368 16384     
>>>>>128860  25368 16384     no DDJU: do_usage: 0 finished_jobs 0
>>>>>128861  25368 16384     
>>>>>128862  25368 16384     =====================[Pass 0]======================
>>>>>128863  25368 16384     =====================[Pass 1]======================
>>>>>128864  25368 16384     =====================[Pass 2]======================
>>>>>128865  25368 16384     Normalizing tickets using 0.000000/18.333333 as min_tix/max_tix
>>>>>128866  25368 16384        got 19 running jobs
>>>>>128867  25368 16384        added 19 ticket orders for running jobs
>>>>>128868  25368 16384        added 1 orders for updating usage of user
>>>>>128869  25368 16384        added 0 orders for updating usage of project
>>>>>128870  25368 16384        added 0 orders for updating share tree
>>>>>128871  25368 16384        added 1 orders for scheduler configuration
>>>>>128872  25368 16384     SENDING 22 ORDERS TO QMASTER
>>>>>128873  25368 16384     RESETTING BUSY STATE OF EVENT CLIENT
>>>>>128874  25368 16384     reresolve port timeout in 280
>>>>>128875  25368 16384     returning cached port value: 536
>>>>>--------------STOP-SCHEDULER-RUN-------------
>>>>>128876  25368 16384     ec_get retrieving events - will do max 20 fetches
>>>>>128877  25368 16384     doing sync fetch for messages, 20 still to do
>>>>>128878  25368 16384     try to get request from qmaster, id 1
>>>>>128879  25368 16384     Checking 55 events (44562-44616) while waiting for #44562
>>>>>128880  25368 16384     check complete, 55 events in list
>>>>>128881  25368 16384     got 55 events till 44616
>>>>>128882  25368 16384     doing async fetch for messages, 19 still to do
>>>>>128883  25368 16384     try to get request from qmaster, id 1
>>>>>128884  25368 16384     reresolve port timeout in 260
>>>>>128885  25368 16384     returning cached port value: 536
>>>>>128886  25368 16384     Sent ack for all events lower or equal 44616
>>>>>128887  25368 16384     ec_get - received 55 events
>>>>>128888  25368 16384     44562. EVENT MOD EXECHOST sub04n147
>>>>>128889  25368 16384     44563. EVENT MOD USER udo
>>>>>128890  25368 16384     44564. EVENT MOD USER iber
>>>>>128891  25368 16384     44565. EVENT MOD USER dieguez
>>>>>128892  25368 16384     44566. EVENT MOD USER karenjoh
>>>>>128893  25368 16384     44567. EVENT MOD USER lorenzo
>>>>>128894  25368 16384     44568. EVENT MOD USER parcolle
>>>>>128895  25368 16384     44569. EVENT MOD USER cfennie
>>>>>128896  25368 16384     44570. EVENT MOD USER civelli
>>>>>128897  25368 16384     44571. EVENT MOD EXECHOST sub04n135
>>>>>128898  25368 16384     44572. EVENT MOD EXECHOST sub04n141
>>>>>128899  25368 16384     44573. EVENT MOD EXECHOST sub04n127
>>>>>128900  25368 16384     44574. EVENT MOD EXECHOST sub04n145
>>>>>128901  25368 16384     44575. EVENT MOD EXECHOST sub04n133
>>>>>128902  25368 16384     44576. EVENT MOD EXECHOST sub04n148
>>>>>128903  25368 16384     44577. EVENT MOD EXECHOST sub04n74
>>>>>128904  25368 16384     44578. EVENT JOB 21542.1 task 2.sub04n74 USAGE
>>>>>128905  25368 16384     44579. EVENT JOB 21542.1 task 1.sub04n74 USAGE
>>>>>128906  25368 16384     44580. EVENT MOD EXECHOST rupc03.rutgers.edu
>>>>>128907  25368 16384     44581. EVENT MOD EXECHOST sub04n139
>>>>>128908  25368 16384     44582. EVENT MOD EXECHOST rupc02.rutgers.edu
>>>>>128909  25368 16384     44583. EVENT MOD EXECHOST sub04n80
>>>>>128910  25368 16384     44584. EVENT MOD EXECHOST sub04n207
>>>>>128911  25368 16384     44585. EVENT MOD EXECHOST sub04n180
>>>>>128912  25368 16384     44586. EVENT MOD EXECHOST sub04n23
>>>>>128913  25368 16384     44587. EVENT MOD EXECHOST sub04n30
>>>>>128914  25368 16384     44588. EVENT MOD EXECHOST sub04n203
>>>>>128915  25368 16384     44589. EVENT MOD EXECHOST sub04n109
>>>>>128916  25368 16384     44590. EVENT MOD EXECHOST rupc04.rutgers.edu
>>>>>128917  25368 16384     44591. EVENT MOD EXECHOST sub04n114
>>>>>128918  25368 16384     44592. EVENT MOD EXECHOST sub04n106
>>>>>128919  25368 16384     44593. EVENT MOD EXECHOST sub04n88
>>>>>128920  25368 16384     44594. EVENT JOB 21507.1 task 6.sub04n88 USAGE
>>>>>128921  25368 16384     44595. EVENT JOB 21507.1 task 5.sub04n88 USAGE
>>>>>128922  25368 16384     44596. EVENT MOD EXECHOST sub04n157
>>>>>128923  25368 16384     44597. EVENT MOD EXECHOST sub04n20
>>>>>128924  25368 16384     44598. EVENT MOD EXECHOST sub04n156
>>>>>128925  25368 16384     44599. EVENT MOD EXECHOST sub04n26
>>>>>128926  25368 16384     44600. EVENT JOB 21213.1 USAGE
>>>>>128927  25368 16384     44601. EVENT MOD EXECHOST sub04n09
>>>>>128928  25368 16384     44602. EVENT MOD EXECHOST sub04n05
>>>>>128929  25368 16384     44603. EVENT MOD EXECHOST sub04n103
>>>>>128930  25368 16384     44604. EVENT MOD EXECHOST sub04n164
>>>>>128931  25368 16384     44605. EVENT MOD EXECHOST sub04n105
>>>>>128932  25368 16384     44606. EVENT MOD EXECHOST sub04n113
>>>>>128933  25368 16384     44607. EVENT MOD EXECHOST sub04n28
>>>>>128934  25368 16384     44608. EVENT MOD EXECHOST sub04n76
>>>>>128935  25368 16384     44609. EVENT MOD EXECHOST sub04n162
>>>>>128936  25368 16384     44610. EVENT MOD EXECHOST sub04n108
>>>>>128937  25368 16384     44611. EVENT MOD EXECHOST sub04n38
>>>>>128938  25368 16384     44612. EVENT MOD EXECHOST sub04n116
>>>>>128939  25368 16384     44613. EVENT MOD EXECHOST sub04n179
>>>>>128940  25368 16384     44614. EVENT MOD EXECHOST sub04n04
>>>>>128941  25368 16384     44615. EVENT MOD EXECHOST sub04n160
>>>>>128942  25368 16384     44616. EVENT MOD EXECHOST sub04n107
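[Aside: the "tagged to be overloaded" lines throughout this trace are a plain threshold comparison, reported load value against the queue's configured load threshold; a hedged sketch with a hypothetical helper name, not the actual SGE implementation:]

```python
# Hedged sketch of the overload tagging seen in the trace, e.g.
# "queue myrinet at sub04n61 tagged to be overloaded:
#  load_avg=2.000000 (no load adjustment) >= 1.4".
# is_overloaded is a hypothetical name, not an SGE function.
def is_overloaded(load_value: float, threshold: float) -> bool:
    """Tag a queue overloaded once its (possibly adjusted) load reaches the threshold."""
    return load_value >= threshold

print(is_overloaded(2.00, 1.4))  # True: the myrinet queues over load_avg 1.4
print(is_overloaded(1.00, 1.0))  # True: the opteronp queues at load_medium 1.0
```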
>>>>>Q:169, AQ:343 J:19(19), H:169(170), C:49, A:4, D:3, P:7,
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>CKPT:0 US:15 PR:4 S:nd:12/lf:7
>>>>   
>>>>
>>>>        
>>>>
>>>>>128943  25368 16384     
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>================[SCHEDULING-EPOCH]==================
>>>>   
>>>>
>>>>        
>>>>
>>>>>128944  25368 16384     JOB 20937.1 start_time = 1116447112 
>>>>>     
>>>>>
>>>>running_time 338139 decay_time = 450
>>>>>128945  25368 16384     JOB 20938.1 start_time = 1116374344 running_time 410907 decay_time = 450
>>>>>128946  25368 16384     JOB 21040.1 start_time = 1116443073 running_time 342178 decay_time = 450
>>>>>128947  25368 16384     JOB 21076.1 start_time = 1116451351 running_time 333900 decay_time = 450
>>>>>128948  25368 16384     JOB 21210.1 start_time = 1116514970 running_time 270281 decay_time = 450
>>>>>128949  25368 16384     JOB 21213.1 start_time = 1116515250 running_time 270001 decay_time = 450
>>>>>128950  25368 16384     JOB 21338.1 start_time = 1116543252 running_time 241999 decay_time = 450
>>>>>128951  25368 16384     JOB 21423.1 start_time = 1116629274 running_time 155977 decay_time = 450
>>>>>128952  25368 16384     JOB 21424.1 start_time = 1116631365 running_time 153886 decay_time = 450
>>>>>128953  25368 16384     JOB 21440.1 start_time = 1116632934 running_time 152317 decay_time = 450
>>>>>128954  25368 16384     JOB 21441.1 start_time = 1116632994 running_time 152257 decay_time = 450
>>>>>128955  25368 16384     JOB 21443.1 start_time = 1116633602 running_time 151649 decay_time = 450
>>>>>128956  25368 16384     JOB 21474.1 start_time = 1116655118 running_time 130133 decay_time = 450
>>>>>128957  25368 16384     JOB 21503.1 start_time = 1116707395 running_time 77856 decay_time = 450
>>>>>128958  25368 16384     JOB 21507.1 start_time = 1116714061 running_time 71190 decay_time = 450
>>>>>128959  25368 16384     JOB 21528.1 start_time = 1116707641 running_time 77610 decay_time = 450
>>>>>128960  25368 16384     JOB 21530.1 start_time = 1116714453 running_time 70798 decay_time = 450
>>>>>128961  25368 16384     JOB 21537.1 start_time = 1116724845 running_time 60406 decay_time = 450
>>>>>128962  25368 16384     JOB 21542.1 start_time = 1116782511 running_time 2740 decay_time = 450
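An aside on reading these usage-decay entries: each `running_time` is just the scheduler's snapshot time minus the job's `start_time` (epoch seconds), so every entry resolves to the same moment, 1116785251 here. A throwaway check on two lines copied from the trace above, assuming nothing beyond the columns shown:

```python
import re

# Two sample lines copied verbatim from the schedd trace above.
trace = """\
128945  25368 16384     JOB 20938.1 start_time = 1116374344 running_time 410907 decay_time = 450
128962  25368 16384     JOB 21542.1 start_time = 1116782511 running_time 2740 decay_time = 450
"""

entry = re.compile(r"JOB (\S+) start_time = (\d+) running_time (\d+)")

# start_time + running_time reconstructs the moment the trace was taken;
# it is identical for every job in one scheduling run.
snapshots = {int(s) + int(r) for _, s, r in entry.findall(trace)}
print(snapshots)  # {1116785251}
```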
>>>>>128963  25368 16384     verified threshold of 169 queues
>>>>>128964  25368 16384     queue myrinet@sub04n61 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128965  25368 16384     queue myrinet@sub04n62 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128966  25368 16384     queue myrinet@sub04n65 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128967  25368 16384     queue myrinet@sub04n66 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128968  25368 16384     queue myrinet@sub04n67 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128969  25368 16384     queue myrinet@sub04n68 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128970  25368 16384     queue myrinet@sub04n69 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128971  25368 16384     queue myrinet@sub04n70 tagged to be overloaded: load_avg=2.020000 (no load adjustment) >= 1.4
>>>>>128972  25368 16384     queue myrinet@sub04n71 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128973  25368 16384     queue myrinet@sub04n72 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128974  25368 16384     queue myrinet@sub04n75 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128975  25368 16384     queue myrinet@sub04n77 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128976  25368 16384     queue myrinet@sub04n78 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128977  25368 16384     queue myrinet@sub04n79 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128978  25368 16384     queue myrinet@sub04n81 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128979  25368 16384     queue myrinet@sub04n84 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128980  25368 16384     queue myrinet@sub04n85 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128981  25368 16384     queue myrinet@sub04n86 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128982  25368 16384     queue myrinet@sub04n87 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128983  25368 16384     queue myrinet@sub04n88 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128984  25368 16384     queue myrinet@sub04n89 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128985  25368 16384     queue myrinet@sub04n90 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128986  25368 16384     queue myrinet@sub04n91 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128987  25368 16384     queue myrinet@sub04n63 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128988  25368 16384     queue myrinet@sub04n64 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128989  25368 16384     queue myrinet@sub04n73 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128990  25368 16384     queue myrinet@sub04n74 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128991  25368 16384     queue opteronp@sub04n202 tagged to be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>>128992  25368 16384     queue opteronp@sub04n205 tagged to be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>>128993  25368 16384     queue opteronp@sub04n206 tagged to be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>>128994  25368 16384     queue opteronp@sub04n208 tagged to be overloaded: load_medium=1.000000 (no load adjustment) >= 1.0
>>>>>128995  25368 16384     queue parallel@sub04n121 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>128996  25368 16384     queue parallel@sub04n139 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128997  25368 16384     queue parallel@sub04n140 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128998  25368 16384     queue parallel@sub04n141 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>128999  25368 16384     queue parallel@sub04n142 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129000  25368 16384     queue parallel@sub04n143 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129001  25368 16384     queue parallel@sub04n144 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129002  25368 16384     queue parallel@sub04n146 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129003  25368 16384     queue parallel@sub04n02 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129004  25368 16384     queue parallel@sub04n03 tagged to be overloaded: load_avg=2.020000 (no load adjustment) >= 1.4
>>>>>129005  25368 16384     queue parallel@sub04n04 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129006  25368 16384     queue parallel@sub04n05 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129007  25368 16384     queue parallel@sub04n06 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129008  25368 16384     queue parallel@sub04n07 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>129009  25368 16384     queue parallel@sub04n08 tagged to be overloaded: load_avg=2.010000 (no load adjustment) >= 1.4
>>>>>129010  25368 16384     queue parallel@sub04n09 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129011  25368 16384     queue parallel@sub04n10 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129012  25368 16384     queue parallel@sub04n11 tagged to be overloaded: load_avg=2.000000 (no load adjustment) >= 1.4
>>>>>129013  25368 16384     verified threshold of 169 queues
>>>>>129014  25368 16384     STARTING PASS 1 WITH 0 PENDING JOBS
>>>>>129015  25368 16384     Not enrolled ja_tasks: 0
>>>>>129016  25368 16384     Enrolled ja_tasks: 1
>>>>>129017  25368 16384     Not enrolled ja_tasks: 0
>>>>>129018  25368 16384     Enrolled ja_tasks: 1
>>>>>129019  25368 16384     Not enrolled ja_tasks: 0
>>>>>129020  25368 16384     Enrolled ja_tasks: 1
>>>>>129021  25368 16384     Not enrolled ja_tasks: 0
>>>>>129022  25368 16384     Enrolled ja_tasks: 1
>>>>>129023  25368 16384     Not enrolled ja_tasks: 0
>>>>>129024  25368 16384     Enrolled ja_tasks: 1
>>>>>129025  25368 16384     Not enrolled ja_tasks: 0
>>>>>129026  25368 16384     Enrolled ja_tasks: 1
>>>>>129027  25368 16384     Not enrolled ja_tasks: 0
>>>>>129028  25368 16384     Enrolled ja_tasks: 1
>>>>>129029  25368 16384     Not enrolled ja_tasks: 0
>>>>>129030  25368 16384     Enrolled ja_tasks: 1
>>>>>129031  25368 16384     Not enrolled ja_tasks: 0
>>>>>129032  25368 16384     Enrolled ja_tasks: 1
>>>>>129033  25368 16384     Not enrolled ja_tasks: 0
>>>>>129034  25368 16384     Enrolled ja_tasks: 1
>>>>>129035  25368 16384     Not enrolled ja_tasks: 0
>>>>>129036  25368 16384     Enrolled ja_tasks: 1
>>>>>129037  25368 16384     Not enrolled ja_tasks: 0
>>>>>129038  25368 16384     Enrolled ja_tasks: 1
>>>>>129039  25368 16384     Not enrolled ja_tasks: 0
>>>>>129040  25368 16384     Enrolled ja_tasks: 1
>>>>>129041  25368 16384     Not enrolled ja_tasks: 0
>>>>>129042  25368 16384     Enrolled ja_tasks: 1
>>>>>129043  25368 16384     Not enrolled ja_tasks: 0
>>>>>129044  25368 16384     Enrolled ja_tasks: 1
>>>>>129045  25368 16384     Not enrolled ja_tasks: 0
>>>>>129046  25368 16384     Enrolled ja_tasks: 1
>>>>>129047  25368 16384     Not enrolled ja_tasks: 0
>>>>>129048  25368 16384     Enrolled ja_tasks: 1
>>>>>129049  25368 16384     Not enrolled ja_tasks: 0
>>>>>129050  25368 16384     Enrolled ja_tasks: 1
>>>>>129051  25368 16384     Not enrolled ja_tasks: 0
>>>>>129052  25368 16384     Enrolled ja_tasks: 1
>>>>>129053  25368 16384     STARTING PASS 2 WITH 0 PENDING JOBS
>>>>>129054  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>129055  25368 16384        slots: 1.000000 * 1000.000000 * 6   ---> 6000.000000
>>>>>129056  25368 16384     slot request assumed for static urgency is 20 for ,20-64 PE range due to PE's "mpi" setting "min"
>>>>>129057  25368 16384        slots: 1.000000 * 1000.000000 * 20    ---> 20000.000000
>>>>>129058  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>129059  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>129060  25368 16384        slots: 1.000000 * 1000.000000 * 6   ---> 6000.000000
>>>>>129061  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>129062  25368 16384        slots: 1.000000 * 1000.000000 * 1   ---> 1000.000000
>>>>>129063  25368 16384     slot request assumed for static urgency is 2 for ,2-8 PE range due to PE's "mpich_myri" setting "min"
>>>>>129064  25368 16384        slots: 1.000000 * 1000.000000 * 2   ---> 2000.000000
>>>>>129065  25368 16384        slots: 1.000000 * 1000.000000 * 8   ---> 8000.000000
>>>>>129066  25368 16384     ASU min = 1000.00000000000, ASU max = 20000.00000000000
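For orientation: each "slots:" trace line above is the static-urgency slot product, resource weight × urgency weight × assumed slot count, and the ASU (assumed slot urgency) min/max are taken over those products. A hedged sketch of that arithmetic (the function name and structure are mine, not SGE source):

```python
def slot_urgency(resource_weight: float, urgency_weight: float, slots: int) -> float:
    # Mirrors the "slots: 1.000000 * 1000.000000 * N ---> ..." trace lines.
    return resource_weight * urgency_weight * slots

# The "mpi" job requested a 20-64 slot PE range; with urgency_slots "min",
# 20 slots are assumed. Serial jobs contribute with 1 slot.
asu_candidates = [slot_urgency(1.0, 1000.0, n) for n in (1, 6, 20, 2, 8)]
print(min(asu_candidates), max(asu_candidates))  # 1000.0 20000.0
```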
>>>>>129067  25368 16384     
>>>>>129068  25368 16384     no DDJU: do_usage: 1 finished_jobs 0
>>>>>129069  25368 16384     
>>>>>129070  25368 16384     =====================[Pass 0]======================
>>>>>129071  25368 16384     =====================[Pass 1]======================
>>>>>129072  25368 16384     =====================[Pass 2]======================
>>>>>129073  25368 16384     
>>>>>129074  25368 16384     no DDJU: do_usage: 0 finished_jobs 0
>>>>>129075  25368 16384     
>>>>>129076  25368 16384     =====================[Pass 0]======================
>>>>>129077  25368 16384     =====================[Pass 1]======================
>>>>>129078  25368 16384     =====================[Pass 2]======================
>>>>>129079  25368 16384     Normalizing tickets using 0.000000/18.333333 as min_tix/max_tix
>>>>>129080  25368 16384        got 19 running jobs
>>>>>129081  25368 16384        added 19 ticket orders for running jobs
>>>>>129082  25368 16384        added 1 orders for updating usage of user
>>>>>129083  25368 16384        added 0 orders for updating usage of project
>>>>>129084  25368 16384        added 0 orders for updating share tree
>>>>>129085  25368 16384        added 1 orders for scheduler configuration
>>>>>129086  25368 16384     SENDING 22 ORDERS TO QMASTER
>>>>>129087  25368 16384     RESETTING BUSY STATE OF EVENT CLIENT
>>>>>129088  25368 16384     reresolve port timeout in 260
>>>>>129089  25368 16384     returning cached port value: 536
>>>>>--------------STOP-SCHEDULER-RUN-------------
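The "Normalizing tickets using 0.000000/18.333333 as min_tix/max_tix" line appears to be a plain min-max rescale of each job's ticket count into [0, 1] so tickets can be combined with the other priority weights. Roughly (a sketch of the formula the trace suggests, not SGE source):

```python
def normalize_tickets(tickets: float, min_tix: float, max_tix: float) -> float:
    # Min-max normalization implied by the min_tix/max_tix trace line;
    # a degenerate range (all jobs equal) maps everything to 0.
    if max_tix == min_tix:
        return 0.0
    return (tickets - min_tix) / (max_tix - min_tix)

print(normalize_tickets(18.333333, 0.0, 18.333333))  # 1.0
print(normalize_tickets(0.0, 0.0, 18.333333))        # 0.0
```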
>>>>>129090  25368 16384     ec_get retrieving events - will do max 20 fetches
>>>>>129091  25368 16384     doing sync fetch for messages, 20 still to do
>>>>>129092  25368 16384     try to get request from qmaster, id 1
>>>>>129093  25368 16384     Checking 154 events (44617-44770) while waiting for #44617
>>>>>129094  25368 16384     check complete, 154 events in list
>>>>>129095  25368 16384     got 154 events till 44770
>>>>>129096  25368 16384     doing async fetch for messages, 19 still to do
>>>>>129097  25368 16384     try to get request from qmaster, id 1
>>>>>129098  25368 16384     reresolve port timeout in 240
>>>>>129099  25368 16384     returning cached port value: 536
>>>>>129100  25368 16384     Sent ack for all events lower or equal 44770
>>>>>129101  25368 16384     ec_get - received 154 events
>>>>>129102  25368 16384     44617. EVENT MOD EXECHOST sub04n08
>>>>>129103  25368 16384     44618. EVENT MOD EXECHOST sub04n166
>>>>>129104  25368 16384     44619. EVENT MOD EXECHOST sub04n168
>>>>>129105  25368 16384     44620. EVENT MOD EXECHOST sub04n112
>>>>>129106  25368 16384     44621. EVENT MOD EXECHOST sub04n90
>>>>>129107  25368 16384     44622. EVENT JOB 21503.1 task 2.sub04n90 USAGE
>>>>>129108  25368 16384     44623. EVENT JOB 21503.1 task 1.sub04n90 USAGE
>>>>>129109  25368 16384     44624. EVENT MOD USER udo
>>>>>129110  25368 16384     44625. EVENT MOD USER iber
>>>>>129111  25368 16384     44626. EVENT MOD USER dieguez
>>>>>129112  25368 16384     44627. EVENT MOD USER karenjoh
>>>>>129113  25368 16384     44628. EVENT MOD USER lorenzo
>>>>>129114  25368 16384     44629. EVENT MOD USER parcolle
>>>>>129115  25368 16384     44630. EVENT MOD USER cfennie
>>>>>129116  25368 16384     44631. EVENT MOD USER civelli
>>>>>129117  25368 16384     44632. EVENT MOD EXECHOST sub04n14
>>>>>129118  25368 16384     44633. EVENT MOD EXECHOST sub04n75
>>>>>129119  25368 16384     44634. EVENT JOB 21040.1 task 6.sub04n75 USAGE
>>>>>129120  25368 16384     44635. EVENT JOB 21040.1 task 5.sub04n75 USAGE
>>>>>129121  25368 16384     44636. EVENT MOD EXECHOST sub04n150
>>>>>129122  25368 16384     44637. EVENT MOD EXECHOST sub04n169
>>>>>129123  25368 16384     44638. EVENT MOD EXECHOST sub04n165
>>>>>129124  25368 16384     44639. EVENT MOD EXECHOST sub04n136
>>>>>129125  25368 16384     44640. EVENT MOD EXECHOST sub04n176
>>>>>129126  25368 16384     44641. EVENT MOD EXECHOST sub04n81
>>>>>129127  25368 16384     44642. EVENT JOB 21507.1 task 6.sub04n81 USAGE
>>>>>129128  25368 16384     44643. EVENT JOB 21507.1 task 5.sub04n81 USAGE
>>>>>129129  25368 16384     44644. EVENT JOB 21507.1 task past_usage USAGE
>>>>>129130  25368 16384     44645. EVENT DEL PETASK 21507.1 task 6.sub04n88
>>>>>129131  25368 16384     44646. EVENT JOB 21507.1 task past_usage USAGE
>>>>>129132  25368 16384     44647. EVENT DEL PETASK 21507.1 task 6.sub04n78
>>>>>129133  25368 16384     44648. EVENT JOB 21507.1 task past_usage USAGE
>>>>>129134  25368 16384     44649. EVENT DEL PETASK 21507.1 task 6.sub04n81
>>>>>129135  25368 16384     44650. EVENT JOB 21507.1 task past_usage USAGE
>>>>>129136  25368 16384     44651. EVENT DEL PETASK 21507.1 task 5.sub04n81
>>>>>129137  25368 16384     44652. EVENT JOB 21507.1 task past_usage USAGE
>>>>>129138  25368 16384     44653. EVENT DEL PETASK 21507.1 task 5.sub04n88
>>>>>129139  25368 16384     44654. EVENT JOB 21507.1 task past_usage USAGE
>>>>>129140  25368 16384     44655. EVENT DEL PETASK 21507.1 task 5.sub04n78
>>>>>129141  25368 16384     44656. EVENT MOD EXECHOST sub04n161
>>>>>129142  25368 16384     44657. EVENT MOD EXECHOST sub04n124
>>>>>129143  25368 16384     44658. EVENT ADD PETASK 21507.1 task 7.sub04n88
>>>>>129144  25368 16384     44659. EVENT ADD PETASK 21507.1 task 7.sub04n78
>>>>>129145  25368 16384     44660. EVENT MOD EXECHOST sub04n158
>>>>>129146  25368 16384     44661. EVENT MOD EXECHOST sub04n01
>>>>>129147  25368 16384     44662. EVENT MOD EXECHOST sub04n159
>>>>>129148  25368 16384     44663. EVENT ADD PETASK 21507.1 task 7.sub04n81
>>>>>129149  25368 16384     44664. EVENT MOD EXECHOST sub04n134
>>>>>129150  25368 16384     44665. EVENT ADD PETASK 21507.1 task 8.sub04n88
>>>>>129151  25368 16384     44666. EVENT ADD PETASK 21507.1 task 8.sub04n78
>>>>>129152  25368 16384     44667. EVENT ADD PETASK 21507.1 task 8.sub04n81
>>>>>129153  25368 16384     44668. EVENT MOD EXECHOST sub04n121
>>>>>129154  25368 16384     44669. EVENT MOD EXECHOST sub04n143
>>>>>129155  25368 16384     44670. EVENT MOD EXECHOST sub04n15
>>>>>129156  25368 16384     44671. EVENT MOD EXECHOST sub04n13
>>>>>129157  25368 16384     44672. EVENT MOD EXECHOST sub04n64
>>>>>129158  25368 16384     44673. EVENT JOB 21542.1 task 2.sub04n64 USAGE
>>>>>129159  25368 16384     44674. EVENT JOB 21542.1 task 1.sub04n64 USAGE
>>>>>129160  25368 16384     44675. EVENT MOD EXECHOST sub04n118
>>>>>129161  25368 16384     44676. EVENT MOD EXECHOST sub04n151
>>>>>129162  25368 16384     44677. EVENT MOD EXECHOST sub04n154
>>>>>129163  25368 16384     44678. EVENT MOD EXECHOST sub04n149
>>>>>129164  25368 16384     44679. EVENT MOD EXECHOST sub04n16
>>>>>129165  25368 16384     44680. EVENT MOD EXECHOST sub04n155
>>>>>129166  25368 16384     44681. EVENT MOD EXECHOST sub04n152
>>>>>129167  25368 16384     44682. EVENT MOD EXECHOST sub04n163
>>>>>129168  25368 16384     44683. EVENT MOD EXECHOST sub04n43
>>>>>129169  25368 16384     44684. EVENT MOD EXECHOST sub04n86
>>>>>129170  25368 16384     44685. EVENT JOB 21423.1 task 2.sub04n86 USAGE
>>>>>129171  25368 16384     44686. EVENT JOB 21423.1 task 1.sub04n86 USAGE
>>>>>129172  25368 16384     44687. EVENT MOD EXECHOST sub04n03
>>>>>129173  25368 16384     44688. EVENT JOB 21076.1 USAGE
>>>>>129174  25368 16384     44689. EVENT MOD EXECHOST sub04n204
>>>>>129175  25368 16384     44690. EVENT MOD EXECHOST rupc01.rutgers.edu
>>>>>129176  25368 16384     44691. EVENT MOD EXECHOST sub04n125
>>>>>129177  25368 16384     44692. EVENT MOD EXECHOST sub04n44
>>>>>129178  25368 16384     44693. EVENT MOD EXECHOST sub04n32
>>>>>129179  25368 16384     44694. EVENT MOD EXECHOST sub04n21
>>>>>129180  25368 16384     44695. EVENT MOD EXECHOST sub04n22
>>>>>129181  25368 16384     44696. EVENT MOD EXECHOST sub04n35
>>>>>129182  25368 16384     44697. EVENT MOD EXECHOST sub04n201
>>>>>129183  25368 16384     44698. EVENT MOD EXECHOST sub04n205
>>>>>129184  25368 16384     44699. EVENT JOB 21440.1 USAGE
>>>>>129185  25368 16384     44700. EVENT MOD EXECHOST sub04n111
>>>>>129186  25368 16384     44701. EVENT MOD EXECHOST sub04n89
>>>>>129187  25368 16384     44702. EVENT JOB 21530.1 task 2.sub04n89 USAGE
>>>>>129188  25368 16384     44703. EVENT JOB 21530.1 task 1.sub04n89 USAGE
>>>>>129189  25368 16384     44704. EVENT JOB 21530.1 USAGE
>>>>>129190  25368 16384     44705. EVENT MOD EXECHOST sub04n177
>>>>>129191  25368 16384     44706. EVENT MOD EXECHOST sub04n146
>>>>>129192  25368 16384     44707. EVENT ADD PETASK 21507.1 task 9.sub04n88
>>>>>129193  25368 16384     44708. EVENT JOB 21507.1 task past_usage USAGE
>>>>>129194  25368 16384     44709. EVENT DEL PETASK 21507.1 task 7.sub04n88
>>>>>Segmentation fault
>>>>>You have new mail in /var/spool/mail/root
>>>>>rupc-cs04b:/opt/SGE/util #
>>>>>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>+++++++++++++++++++++
>>>>   
>>>>
>>>>        
>>>>
>>>>>/opt/SGE/default/spool/qmaster
>>>>>
>>>>>Sun May 22 14:25:16 EDT 2005
>>>>>05/22/2005 00:20:01|qmaster|rupc-cs04b|E|event client "scheduler"
>>>>>(rupc-cs04b/schedd/1) reregistered - it will need a total update 
>>>>>05/22/2005 00:32:40|qmaster|rupc-cs04b|W|job 21538.1 
>>>>>          
>>>>>
>>failed on host 
>>    
>>
>>>>>sub04n63 in recognising job because: execd doesn't know this job 
>>>>>05/22/2005 00:32:49|qmaster|rupc-cs04b|E|execd sub04n63 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>reports running
>>>>   
>>>>
>>>>        
>>>>
>>>>>state for job (21538.1/master) in queue "myrinet at sub04n63"
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>while job is
>>>>   
>>>>
>>>>        
>>>>
>>>>>in state 65536 05/22/2005
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>00:33:49|qmaster|rupc-cs04b|E|execd at sub04n63
>>>>   
>>>>
>>>>        
>>>>
>>>>>reports running job (21538.1/master) in queue
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>"myrinet at sub04n63" that
>>>>   
>>>>
>>>>        
>>>>
>>>>>was not supposed to be there - killing 05/22/2005
>>>>>02:10:01|qmaster|rupc-cs04b|E|event client "scheduler" 
>>>>>(rupc-cs04b/schedd/1) reregistered - it will need a total update 
>>>>>05/22/2005 02:30:26|qmaster|rupc-cs04b|E|orders 
>>>>>          
>>>>>
>>user/project version 
>>    
>>
>>>>>(1035) is not uptodate (1036) for user/project "udo" 05/22/2005 
>>>>>02:30:26|qmaster|rupc-cs04b|E|orders user/project version 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>(1035) is not
>>>>   
>>>>
>>>>        
>>>>
>>>>>uptodate (1036) for user/project "iber" 05/22/2005
>>>>>02:30:26|qmaster|rupc-cs04b|E|orders user/project version 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>(1035) is not
>>>>   
>>>>
>>>>        
>>>>
>>>>>uptodate (1036) for user/project "dieguez" 05/22/2005
>>>>>02:30:26|qmaster|rupc-cs04b|E|orders user/project version 
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>(1035) is not uptodate (1036) for user/project "zayak"
>>>>>05/22/2005 02:30:26|qmaster|rupc-cs04b|E|orders user/project version (1035) is not uptodate (1036) for user/project "karenjoh"
>>>>>05/22/2005 02:30:26|qmaster|rupc-cs04b|E|orders user/project version (1035) is not uptodate (1036) for user/project "lorenzo"
>>>>>05/22/2005 02:30:26|qmaster|rupc-cs04b|E|orders user/project version (1035) is not uptodate (1036) for user/project "parcolle"
>>>>>05/22/2005 02:30:26|qmaster|rupc-cs04b|E|orders user/project version (1035) is not uptodate (1036) for user/project "cfennie"
>>>>>05/22/2005 02:30:26|qmaster|rupc-cs04b|E|orders user/project version (1035) is not uptodate (1036) for user/project "civelli"
>>>>>05/22/2005 02:34:06|qmaster|rupc-cs04b|E|orders user/project version (1044) is not uptodate (1045) for user/project "udo"
>>>>>05/22/2005 02:34:06|qmaster|rupc-cs04b|E|orders user/project version (1044) is not uptodate (1045) for user/project "iber"
>>>>>05/22/2005 02:34:06|qmaster|rupc-cs04b|E|orders user/project version (1044) is not uptodate (1045) for user/project "dieguez"
>>>>>05/22/2005 02:34:06|qmaster|rupc-cs04b|E|orders user/project version (1044) is not uptodate (1045) for user/project "zayak"
>>>>>05/22/2005 02:34:06|qmaster|rupc-cs04b|E|orders user/project version (1044) is not uptodate (1045) for user/project "karenjoh"
>>>>>05/22/2005 02:34:06|qmaster|rupc-cs04b|E|orders user/project version (1044) is not uptodate (1045) for user/project "lorenzo"
>>>>>05/22/2005 02:34:06|qmaster|rupc-cs04b|E|orders user/project version (1044) is not uptodate (1045) for user/project "parcolle"
>>>>>05/22/2005 02:34:06|qmaster|rupc-cs04b|E|orders user/project version (1044) is not uptodate (1045) for user/project "cfennie"
>>>>>05/22/2005 02:34:06|qmaster|rupc-cs04b|E|orders user/project version (1044) is not uptodate (1045) for user/project "civelli"
>>>>>05/22/2005 03:02:47|qmaster|rupc-cs04b|E|tightly integrated parallel task 21539.1 task 3.sub04n83 failed - killing job
>>>>>05/22/2005 03:10:01|qmaster|rupc-cs04b|E|event client "scheduler" (rupc-cs04b/schedd/1) reregistered - it will need a total update    <-- YOU SEE THESE 2 LINES: THE SCHEDULER DIED EVEN WITHOUT ANY EVENTS, JUST BY ITSELF!!!
>>>>>05/22/2005 07:30:01|qmaster|rupc-cs04b|E|event client "scheduler" (rupc-cs04b/schedd/1) reregistered - it will need a total update
>>>>>05/22/2005 11:11:39|qmaster|rupc-cs04b|E|event client "scheduler" (rupc-cs04b/schedd/1) reregistered - it will need a total update    <-- BEFORE THE LAST CRASH
>>>>>05/22/2005 14:07:53|qmaster|rupc-cs04b|E|tightly integrated parallel task 21507.1 task 10.sub04n88 failed - killing job    <-- THIS IS WHAT TRIGGERED THE CRASH
>>>>>05/22/2005 14:09:14|qmaster|rupc-cs04b|W|job 21507.1 failed on host sub04n78 assumedly after job because: job 21507.1 died through signal TERM (15)
>>>>>05/22/2005 14:10:00|qmaster|rupc-cs04b|E|event client "scheduler" (rupc-cs04b/schedd/1) reregistered - it will need a total update    <-- SCHEDULER START AFTER THE CRASH
>>>>>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>
>>>>>SCHEDULER messages BELOW
>>>>>
>>>>>05/22/2005 00:20:01|schedd|rupc-cs04b|I|starting up 6.0u3
>>>>>05/22/2005 02:10:01|schedd|rupc-cs04b|I|starting up 6.0u3
>>>>>05/22/2005 02:30:26|schedd|rupc-cs04b|I|controlled shutdown 6.0u3
>>>>>05/22/2005 02:31:10|schedd|rupc-cs04b|I|starting up 6.0u3
>>>>>05/22/2005 02:34:06|schedd|rupc-cs04b|I|controlled shutdown 6.0u3
>>>>>05/22/2005 02:40:00|schedd|rupc-cs04b|I|starting up 6.0u3
>>>>>05/22/2005 03:10:01|schedd|rupc-cs04b|I|starting up 6.0u3
>>>>>05/22/2005 07:30:01|schedd|rupc-cs04b|I|starting up 6.0u3
>>>>>05/22/2005 11:11:39|schedd|rupc-cs04b|I|starting up 6.0u3    <--- before the last crash (I started debug mode)
>>>>>05/22/2005 14:10:00|schedd|rupc-cs04b|I|starting up 6.0u3    <--- AFTER the last crash
>>>>>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
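Each scheduler death in the qmaster log above shows up as an `event client "scheduler" ... reregistered` entry. A minimal sketch for counting those restarts by grepping the qmaster messages file; the spool path in the comment and the inline sample file are illustrative assumptions, not taken from this thread:

```shell
#!/bin/sh
# Sketch: count scheduler death/restart cycles in a qmaster messages file.
# The sample file below stands in for something like
# $SGE_ROOT/<cell>/spool/qmaster/messages (hypothetical path --
# adjust for your installation).
sample=$(mktemp)
cat > "$sample" <<'EOF'
05/22/2005 03:10:01|qmaster|rupc-cs04b|E|event client "scheduler" (rupc-cs04b/schedd/1) reregistered - it will need a total update
05/22/2005 07:30:01|qmaster|rupc-cs04b|E|event client "scheduler" (rupc-cs04b/schedd/1) reregistered - it will need a total update
05/22/2005 14:07:53|qmaster|rupc-cs04b|E|tightly integrated parallel task 21507.1 task 10.sub04n88 failed - killing job
EOF

# Each "reregistered" line marks one scheduler restart.
restarts=$(grep -c 'event client "scheduler".*reregistered' "$sample")
echo "scheduler restarts: $restarts"
rm -f "$sample"
```

Run against the real messages file instead of the sample to see how often the scheduler is dying between the timestamps reported in this thread.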
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net




More information about the gridengine-users mailing list