Notable changes in Son of Grid Engine
-------------------------------------

For earlier changes (comprising Sun Grid Engine), see Changelog in the
source distribution, and the bug lists at
http://arc.liv.ac.uk/SGE/howto/Installation,%20Upgrade,%20Patches.
For detailed recent change information, including credits for
patches, see https://arc.liv.ac.uk/trac/SGE/log/sge/

"[#N]" below refers to ticket N under
https://arc.liv.ac.uk/trac/SGE/ticket/.  (The issues fixed by Univa
changes aren't publicly available as such, and many other changes have
been made without raising Trac tickets.)  Changes due to Univa (or
Sun) from the Univa repo are tagged as "[U]", and items partially due
to them or appearing first in this version are tagged as "[(U)]".

Version 8.0.0e
--------------

Mostly build and security issues

* Bug fixes

  * Fix linker-dependent hwloc build failure.
  * Fix Java build with -no-hwloc
  * Fix spurious messages from deleting job spool directories.
  * Fix build error on Solaris 11
  * Fix spec file for systems that use mandb [#1407]
  * Fix #777 (8.0.0d) properly
  * Allow building against berkeleydb 5 and with GNU ld --as-needed
    (e.g. Ubuntu 12.04)
  * Rename status(1) to qstatus(1) to avoid name clash with upstart
  * Update LICENCES with some missing items

* Security fixes

  The first fix is for a trivial remote root exploit by a valid user.
  The others, including fixes for potential buffer overruns in daemon
  and setuid programs, may or may not be exploitable.

  * Sanitize the environment before executing remote startup programs
    etc.  Somewhat incompatible:  LD_LIBRARY_PATH etc. may need to be
    set differently.  See the security notes in remote_startup(5) and
    sge_conf(5).  (CVE-2012-0208, thanks to William Hay)
  * Don't write initial log messages in /tmp [#508].  Somewhat
    incompatible:  initial messages now in syslog.
  * Avoid using mktemp.  (Probably not a significant problem.)
  * Control core dumps under setuid etc. with SGE_ENABLE_COREDUMP.
    (Not normally a security issue.)
  * Bounds checking in replace_params [#215]
  * Avoid execd crash and possible overruns [#1328]
  * Fixes for buffer overrun and other improvements for (setuid
    program) sgepasswd [including #386]

* Enhancements

  * Logging can be configured to use syslog [#808] from fixing #508.

Version 8.0.0d
--------------

* Bug fixes

  * Man and fixes
  * Fix building with older gcc versions
  * Provide load average in qstat XML output [#446, #454]
  * Partially back out Univa change which broke classic spooling
  * Fix -terse in sge_request [#777]
  * Replace 3rd_party_licscopyrights with updated LICENCES directory
    to fix some missing items

* Other changes (possibly-incompatible)

  * Message fixes

Version 8.0.0c
--------------

* Bug fixes

  * Man and other documentation fixes
  * Build/installation fixes (particularly for Red Hat 6 and Linux 3)
  * Fix group ids for submitted jobs [U]
  * Fix default value of boolean with JSV [U]
  * Windows fixes for helper crashes and Vista GUI jobs [U]
  * Ensure parallel jobs are dispatched to the least loaded host [U]
  * Correct ownership of qsub -pty output file; was owned by admin
    user [U]
  * Fix format of Windows loadcheck.exe output [U]
  * Read from stderr even if stdout is already closed in IJS [U]
  * Fix PDC_INTERVAL=NEVER execd parameter [U]
  * Fix accounting information for Windows GUI jobs [U]
  * Increase default MAX_DYN_EC qmaster param [U]
  * Fix qsub -sync y error message and enforce MAX_DYN_EC correctly [U]
  * Fix job validation (-w e) behaviour [#716] [U]
  * Fix qrsh input redirection [U]
  * Avoid warning when submitting a qrsh job [U]
  * Print start time in qstat -j -xml output [U]
  * Don't raise an error changing resource request on waiting job
    [#806]
  * Don't exit 0 on error with qconf -secl or -sep
  * Include string.h in drmaa.h [#712]
  * Fix process-scheduler-log with host aliases

* Enhancements

  * Base qmake and qtcsh on the current gmake and tcsh source
    [#289, #504, #832]
  * Support "-binding linear" and "-binding linear:slots"
  * Use the hwloc library for all topology
    information and core binding, supporting more operating systems
    (now: AIX, Darwin, FreeBSD, GNU/Linux, HPUX, MS Windows, OSF/1,
    Solaris), and more hardware types (specifically AMD Magny Cours
    and similar)
  * Add task number to execd "exceeds job ... limit"

* Other changes (possibly-incompatible)

  * Modify default paths in build files and elsewhere [U]
  * Assorted message fixes
  * In RPMs, move qsched to qmaster package, and separate drmaa4ruby
  * Default to newijs in load_sge_config.sh
  * Default to sh, not csh for configured shell

Version 8.0.0b
--------------

* Bug fixes

  * Build/installation fixes [including #424, #1349] [(U)]
  * Fix execd init script [#1348]
  * Man and other documentation fixes [including #614, #764] [(U)]
  * Fix contents of admin mail properly [#1307, #1345]
  * Fix qalter messages for -tc
  * Fix build with -DSGE_PQS_API
  * Fix group ids for submitted jobs [U]

* Enhancements

  * Update qsched and add man page

* Other changes (possibly-incompatible)

  * Avoid the use of /bin/ksh [#1306]
  * Change installation defaults to classic spooling, not adding
    shadow hosts, and not JMX. [(U)]

Version 8.0.0a
--------------

This is roughly a superset of Univa's 8.0.0 (the V800_TAG from
https://github.com/gridengine/gridengine), with thanks for that.
Changes made there which haven't been included in this version:  PLPA
source not removed; some different build/installation defaults (e.g.
for JMX); Univa/UGE "branding" (partly because trademark status is
unknown); authuser not removed (for SDM and testing use).

* Bug fixes

  * Many man and other documentation fixes [including #790, #776,
    #769, #733, #610, #587, #581, #459, #456, #439, #255, #1288, #797,
    #1271, #773] [(U)]
  * Some program message fixes [(U)]
  * Various build and installation fixes [including #761, #709, #656,
    #616, #546, #536, #521, #491, #438, #414, #411, #383, #381, #138,
    #455, #344, #438, #1311, #1272, #1273] [(U)]
  * Ask for keystore password twice on installation
  * Fix qmaster crashes with tightly integrated parallel jobs or
    un-discoverable qinstance [#789] [U]
  * Report 0 cores and sockets on unsupported Solaris hosts [U]
  * Fix malloc hooks which caused crashes, particularly with SuSE 11
    [#792, #748, #749] [U]
  * Verify the pe task start user in execd in non-CSP mode [U]
  * Fix binding parameters parsing [U]
  * Fix JSV logging with multiple users submitting jobs on same submit
    host [U]
  * Fix unresponsive qmaster when modifying the global configuration
    in a huge cluster [U]
  * Speed up finishing tightly integrated jobs [U]
  * Check consistency of JSV binding information properly [U]
  * Fix broken project spooling, which caused loss of project when
    restarting master when using core binding [U]
  * Fix slotwise preemption failure to unsuspend one job per host
    [#775] [U]
  * Fix problems retrieving passwd and group information with large
    responses [#1295] [(U)]
  * Fix JSV changing default of boolean [U]
  * Fix ENABLE_RESCHEDULE_SLAVE=1 [U]
  * Allow comma in CMDNAME with Perl JSV scripts [#803]
  * Don't put queue into error state when supplementary group id
    cannot be set [#185] [U]
  * Don't convert LF to CRLF with qrsh -pty [U]
  * Fix qconf segfault on bad subordination string [U]
  * Fix group ids of submitted jobs [U]
  * Disallow -masterq with serial jobs [#155] [U]
  * Fix 100% CPU use by shepherd of qsh [U]
  * Removed unnecessary binding warning on job starts [U]
  * Fix qconf error reports when tmp
    directory has 755 permissions [U]
  * Fix suspending of remote process on qrsh -pty yes on Solaris [U]
  * Fix starting jobs after global host changed [U]
  * Reject invalid load_formula value [U]
  * Fix handling of implicitly-requested exclusive resources [U]
  * Fix execd vmem reporting on 64-bit Linux [U]
  * Fix startup of execd on Windows Vista [U]
  * Set xterm's path more appropriately on GNU/Linux [#557]
  * Fix generation of admin email from failed jobs [#1307]
  * Fix some ill-formed output from qstat -xml [#314]
  * Fix handling of multi-line environment variables propagated to
    shepherd [#395]
  * Fix example MPI PE templates
  * Fix bad quoting in JSV sh library
  * Fix checking of consumables for parallel jobs across multiple
    hosts [U]

* Enhancements

  * Additional and clarified documentation
  * PAM modules for ssh tight integration and access control for
    interactive jobs
  * Initial core binding support for Solaris/SPARC64 [U]
  * Some efficiency improvements and memory leaks fixed [U]
  * Ports to S/390 and PARISC GNU/Linux [U]
  * New complex m_thread [U]
  * Show topology by default in qhost [U]
  * qsub -pty switch [#704] [U]
  * Improved qmon graphics [#530] [(U)]
  * Include bash in default shell list [U]
  * A JSV that rejects all jobs [U]
  * Files for Scali-MPI
  * Ruby DRMAA implementation
  * Enable easy building against shared system libraries and use
    system openssl and bdb binaries
  * New scripts:  "qsched" reports resource reservations; "status"
    wraps qstat; enable/disable submission; node-selection (idle etc.)
  * Restart argument for daemon init scripts
  * Improved efficiency of shell JSV if used with bash
  * Core dumps from crashing daemons enabled under Linux [U]
  * Example host_aliases file [#154]
  * Spec file for RPM packaging [#820]

* Other changes (possibly-incompatible)

  * Show core binding by default in qstat, qhost (use -ncb for
    compatibility) [U]
  * Removed Berkeley DB RPC support (recently dropped by BDB) [U]
  * Changed position in pending job list for user-rescheduled jobs
    (exit99, qmod -rj) and OLD_RESCHEDULE_BEHAVIOR,
    OLD_RESCHEDULE_BEHAVIOR_ARRAY_JOB parameters [U]
  * Unified GNU/Linux arch strings (lx-*, from lx24-* and lx26-*) [U]
  * Default to enabling core binding on GNU/Linux [U]
  * Removed Sun service tags support [U]
  * Removed obsolete SunHPCT5 files