[GE users] Problem setting up SGE6.2

Lai, Sum Yee sum-yee.lai at hp.com
Wed Sep 17 22:36:47 BST 2008


[sumyee@cviant32 qmaster]$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@cviant39.cv.hp.com       BP    0/0/4          0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@cviant32.cv.hp.com       BP    0/0/4          0.01     lx24-amd64

############################################################################
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
      6 0.00000 simple.sh  sumyee       qw    09/17/2008 13:39:03     1


-Sum Yee
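
The two qmaster errors quoted further down in this thread appear to be pipe-separated records of the form "timestamp|component|host|level|message". As a rough sketch only, the Python below tallies repeated error lines from such a file. The five-field layout is inferred from the two lines in this thread rather than taken from the Grid Engine documentation, and the names parse_line and count_errors are made up for illustration.

```python
from collections import Counter


def parse_line(line):
    """Split one messages-file line into five pipe-separated fields.

    Assumed layout (inferred from this thread, not from SGE docs):
    timestamp|component|host|level|message
    """
    # maxsplit=4 keeps any '|' inside the message text intact.
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None  # not a record in the assumed format
    timestamp, component, host, level, message = parts
    return {
        "timestamp": timestamp,
        "component": component,
        "host": host,
        "level": level,
        "message": message,
    }


def count_errors(lines):
    """Count how often each distinct message with level 'E' occurs."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None and record["level"] == "E":
            counts[record["message"]] += 1
    return counts


# The two error lines reported in this thread.
sample = [
    "09/15/2008 20:06:18|event_|cviant41|E|no event client known with id 1 to modify",
    "09/15/2008 20:06:28|event_|cviant41|E|no event client known with id 1 to process acknowledgements",
]

for message, n in count_errors(sample).most_common():
    print(n, message)
```

To run it against a real file, pass an open file object to count_errors instead of the sample list. A message that recurs at a steady interval, as described below, would show up with a high count.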

-----Original Message-----
From: Chris Dagdigian [mailto:dag at sonsorol.org]
Sent: Wednesday, September 17, 2008 2:32 PM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] Problem setting up SGE6.2


Is there any output from the "qstat -f" command?

-Chris



On Sep 17, 2008, at 5:14 PM, Lai, Sum Yee wrote:

> Yes.  I have 2 execution hosts set up.  I have verified the daemons
> are running.
>
> Sum Yee
>
> -----Original Message-----
> From: Darin Perusich [mailto:Darin.Perusich at cognigencorp.com]
> Sent: Wednesday, September 17, 2008 2:10 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Problem setting up SGE6.2
>
> Have you set up any execution hosts?
>
> Lai, Sum Yee wrote:
>> Hello!
>>
>> We have just set up SGE6.2 in a test environment.  When I submit a
>> test job, the job doesn't get dispatched.  The message I get from
>> qstat is:
>>
>> Can not get job info messages, scheduler is not available.
>> ==============================================================
>> job_number:                 6
>> exec_file:                  job_scripts/6
>> submission_time:            Wed Sep 17 13:39:03 2008
>> owner:                      sumyee
>> uid:                        10771
>> group:                      users
>> gid:                        100
>> sge_o_home:                 /home/sumyee
>> sge_o_log_name:             sumyee
>> sge_o_path:                 /usr/local/GridEngine/bin/lx24-amd64
>> sge_o_shell:                /bin/bash
>> sge_o_workdir:              /home/sumyee/sge/test
>> sge_o_host:                 cviant32
>> account:                    sge
>> mail_list:                  sumyee at cviant32.cv.hp.com
>> notify:                     FALSE
>> job_name:                   simple.sh
>> jobshare:                   0
>> shell_list:                 NONE:/bin/sh
>> env_list:
>> script_file:                simple.sh
>>
>> I have verified that sge_qmaster is running on the master host.  My
>> understanding is that sge_schedd is now incorporated into qmaster so
>> that it doesn't run separately.  If sge_qmaster is running, why isn't
>> the scheduler available?
>>
>> In the message file for qmaster, I get these two errors every 10
>> seconds:
>>
>> 09/15/2008 20:06:18|event_|cviant41|E|no event client known with id 1 to modify
>> 09/15/2008 20:06:28|event_|cviant41|E|no event client known with id 1 to process acknowledgements
>>
>> I am not sure if the two problems are related.  Can anyone give me
>> any suggestions on what may be causing these?
>>
>> My configuration is pretty much default at this point.  Here it is
>> anyway:
>>
>> [sumyee@cviant32 qmaster]$ qconf -sconf
>> #global:
>> execd_spool_dir              /usr/local/GridEngine/default/spool
>> mailer                       /bin/mail
>> xterm                        /usr/bin/X11/xterm
>> load_sensor                  none
>> prolog                       none
>> epilog                       none
>> shell_start_mode             unix_behavior
>> login_shells                 sh,ksh,csh,tcsh
>> min_uid                      0
>> min_gid                      0
>> user_lists                   none
>> xuser_lists                  none
>> projects                     none
>> xprojects                    none
>> enforce_project              false
>> enforce_user                 auto
>> load_report_time             00:00:40
>> max_unheard                  00:05:00
>> reschedule_unknown           00:00:00
>> loglevel                     log_warning
>> administrator_mail           sum-yee.lai at hp.com
>> set_token_cmd                none
>> pag_cmd                      none
>> token_extend_time            none
>> shepherd_cmd                 none
>> qmaster_params               none
>> execd_params                 none
>> reporting_params             accounting=true reporting=true \
>>                              flush_time=00:00:15 joblog=true sharelog=00:00:00
>> finished_jobs                100
>> gid_range                    20000-30000
>> qlogin_command               builtin
>> qlogin_daemon                builtin
>> rlogin_command               builtin
>> rlogin_daemon                builtin
>> rsh_command                  builtin
>> rsh_daemon                   builtin
>> max_aj_instances             2000
>> max_aj_tasks                 75000
>> max_u_jobs                   0
>> max_jobs                     0
>> max_advance_reservations     0
>> auto_user_oticket            0
>> auto_user_fshare             0
>> auto_user_default_project    none
>> auto_user_delete_time        86400
>> delegated_file_staging       false
>> reprioritize                 false
>>
>>
>> [sumyee@cviant32 qmaster]$ qconf -ssconf
>> algorithm                         default
>> schedule_interval                 0:0:15
>> maxujobs                          0
>> queue_sort_method                 load
>> job_load_adjustments              NONE
>> load_adjustment_decay_time        00:15:00
>> load_formula                      np_load_avg
>> schedd_job_info                   true
>> flush_submit_sec                  5
>> flush_finish_sec                  0
>> params                            none
>> reprioritize_interval             0:0:0
>> halftime                          168
>> usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
>> compensation_factor               5.000000
>> weight_user                       0.250000
>> weight_project                    0.250000
>> weight_department                 0.250000
>> weight_job                        0.250000
>> weight_tickets_functional         0
>> weight_tickets_share              0
>> share_override_tickets            TRUE
>> share_functional_shares           TRUE
>> max_functional_jobs_to_schedule   200
>> report_pjob_tickets               TRUE
>> max_pending_tasks_per_job         50
>> halflife_decay_list               none
>> policy_hierarchy                  OFS
>> weight_ticket                     0.010000
>> weight_waiting_time               0.000000
>> weight_deadline                   3600000.000000
>> weight_urgency                    0.100000
>> weight_priority                   1.000000
>> max_reservation                   0
>> default_duration                  INFINITY
>>
>> Thanks!
>>
>> Sum Yee
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>
> --
> Darin Perusich
> Unix Systems Administrator
> Cognigen Corporation
> 395 Youngs Rd.
> Williamsville, NY 14221
> Phone: 716-633-3463
> Email: darinper at cognigencorp.com
>





