[GE users] "messages" log of qmaster

Stephan Grell - Sun Germany - SSG - Software Engineer stephan.grell at sun.com
Tue Aug 17 10:42:19 BST 2004


Hi,

We had a bug that caused the scheduler and master to get out of sync. It
looks like you are having the same issue. For more details, please have a
look at:

Issuezilla:   1154
Bugtraq:      5074788

It should be solved with the next update.
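
Until the update is installed, it may help to measure how often the two
messages actually repeat. Below is a minimal Python sketch, not part of SGE,
that counts identical entries. It assumes the
"date time|daemon|host|level|message" layout seen in the two log lines quoted
further down; the function name count_messages and the third sample line (a
made-up repeat one second later) are for illustration only.

```python
from collections import Counter


def count_messages(lines):
    """Count how often each (level, message) pair occurs in the given lines."""
    counts = Counter()
    for line in lines:
        # Assumed layout: date time|daemon|host|level|message
        fields = line.rstrip("\n").split("|", 4)
        if len(fields) != 5:
            continue  # skip lines that do not fit the assumed layout
        _timestamp, _daemon, _host, level, message = fields
        counts[(level, message)] += 1
    return counts


# The first two lines are the ones from the report; the third is a
# hypothetical repeat added only to show the counting.
sample = [
    "08/16/2004 11:24:23|qmaster|catleya|E|can't get task id",
    '08/16/2004 11:24:23|qmaster|catleya|E|reinitialization of "scheduler"',
    "08/16/2004 11:24:24|qmaster|catleya|E|can't get task id",
]

for (level, message), n in count_messages(sample).most_common():
    print(n, level, message)
```

To run it on a real log, pass an open file object for the qmaster "messages"
file instead of the sample list; where that file lives depends on the
installation.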

Cheers,
Stephan

Alexandre Barras wrote:

> Hi,
>
> I use SGE 6.0 and it happens after restarting the scheduler.
> I set the log level to "log_info" but it doesn't give more information.
>
> The master host runs RH 8.0, but the same issue occurred with a Sun
> machine as master.
>
> Here are my scheduler and cluster configurations. They are very
> close to the defaults, but maybe they will be helpful.
>
> [root@catleya qmaster]# qconf -ssconf
> algorithm                         default
> schedule_interval                 0:0:30
> maxujobs                          8
> queue_sort_method                 load
> job_load_adjustments              np_load_avg=0.50
> load_adjustment_decay_time        0:7:30
> load_formula                      np_load_avg
> schedd_job_info                   true
> flush_submit_sec                  0
> flush_finish_sec                  0
> params                            none
> reprioritize_interval             0:2:0
> halftime                          168
> usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
> compensation_factor               5.000000
> weight_user                       0.250000
> weight_project                    0.250000
> weight_department                 0.250000
> weight_job                        0.250000
> weight_tickets_functional         0
> weight_tickets_share              0
> share_override_tickets            TRUE
> share_functional_shares           TRUE
> max_functional_jobs_to_schedule   200
> report_pjob_tickets               TRUE
> max_pending_tasks_per_job         50
> halflife_decay_list               none
> policy_hierarchy                  OFS
> weight_ticket                     0.010000
> weight_waiting_time               0.000000
> weight_deadline                   3600000.000000
> weight_urgency                    0.100000
> weight_priority                   1.000000
> max_reservation                   0
> default_duration                  0:10:0
>
> [root@catleya qmaster]# qconf -sconf global
> global:
> execd_spool_dir              /opt2/sge/default/spool
> mailer                       /bin/mailx
> xterm                        /usr/openwin/bin/xterm
> load_sensor                  none
> prolog                       none
> epilog                       none
> shell_start_mode             posix_compliant
> login_shells                 sh,ksh,csh,tcsh
> min_uid                      0
> min_gid                      0
> user_lists                   none
> xuser_lists                  none
> projects                     none
> xprojects                    none
> enforce_project              false
> enforce_user                 auto
> load_report_time             00:00:30
> stat_log_time                48:00:00
> max_unheard                  00:00:32
> reschedule_unknown           00:00:15
> loglevel                     log_info
> administrator_mail           barras at cerfacs.fr
> set_token_cmd                none
> pag_cmd                      none
> token_extend_time            none
> shepherd_cmd                 none
> qmaster_params               none
> execd_params                 none
> reporting_params             accounting=true reporting=false \
>                             flush_time=00:00:15 joblog=false \
>                             sharelog=00:00:00
> finished_jobs                100
> gid_range                    20000-20100
> qlogin_command               telnet
> qlogin_daemon                /usr/sbin/in.telnetd
> rlogin_daemon                /usr/sbin/in.rlogind
> max_aj_instances             2000
> max_aj_tasks                 75000
> max_u_jobs                   0
> max_jobs                     0
> auto_user_oticket            0
> auto_user_fshare             0
> auto_user_default_project    none
> auto_user_delete_time        100
> delegated_file_staging       false
> reprioritize                 1
>
>
>
> Andy Schwierskott wrote:
>
>> Alexandre,
>>
>> that's certainly a bug. To fix it, it would be helpful to get some idea
>> of how to reproduce it.
>>
>> Which SGE version are you using?
>>
>> Does it happen after restarting the scheduler?
>>
>> Can you set the loglevel to "log_info"? Maybe this gives more insight
>> into what's going on.
>>
>> Andy
>>
>>> hello,
>>>
>>> Every second, the master writes these two lines to the "messages" log:
>>> ----------------------------------------------------
>>> 08/16/2004 11:24:23|qmaster|catleya|E|can't get task id
>>> 08/16/2004 11:24:23|qmaster|catleya|E|reinitialization of "scheduler"
>>> ----------------------------------------------------
>>>
>>> I have two questions:
>>> 1. What problem do these log entries indicate? (I should point out
>>> that everything is working fine with SGE on my cluster.)
>>> 2. How can I prevent SGE from writing to the log file so often?
>>>
>>> Reuti wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>>> our dual Pentium 4 Xeon 40 node cluster. The cluster is working on
>>>>> bioinformatics applications that are developed using Perl scripts
>>>>> that call each other and wait for each other to finish. Hence at
>>>>> any given point of time there are many executing scripts that are
>>>>> actually waiting and this increases the load average artificially.
>>>>> If I increase my load_threshold on
>>>>>
>>>>
>>>> can you provide some more details about your scripts? They start up
>>>> as serial jobs, and then they start something in the
>>>> background and poll for the results?
>>>>
>>>> Reuti
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>>
>>
>






