[GE users] Many jobs seem to lead to dropped callbacks
elauzier2 at perlstar.com
Sat Dec 12 22:13:13 GMT 2009
You may have to raise the maximum number of file descriptors, both system-wide and for the login shell.
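Before changing anything, it may help to see what the limits currently are. This is only a rough sketch, assuming a Linux host and a Bourne-style shell; the variable names are my own and /proc/sys/fs/file-max does not exist on other operating systems:

```shell
# Per-process open-file limits for the current shell (soft, then hard).
soft_limit=$(ulimit -S -n)
hard_limit=$(ulimit -H -n)
echo "soft open-file limit: $soft_limit"
echo "hard open-file limit: $hard_limit"

# System-wide ceiling on open file handles (Linux only, assumption).
if [ -r /proc/sys/fs/file-max ]; then
    system_max=$(cat /proc/sys/fs/file-max)
    echo "system-wide file-max: $system_max"
fi
```

Run it in the same kind of session that starts the daemons, since limits are inherited per process and can differ between login shells and init scripts.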
I know this matters for LSF. Here is a snippet from a Google search that describes how to set the limit:
To meet the performance requirements of a large cluster, increase the file descriptor limit of the operating system.
The file descriptor limit of most operating systems used to be fixed, with a limit of 1024 open files. Some operating systems, such as Linux and AIX, have removed this limit, allowing you to increase the number of file descriptors.
Increase the file descriptor limit
1. To achieve efficiency of performance in LSF, follow the instructions in your operating system documentation to increase the number of file descriptors on the LSF master host.
To optimize your configuration, set your file descriptor limit to a value at least as high as the number of hosts in your cluster.
The following is an example configuration. The instructions for different operating systems, kernels, and shells are varied. You may have already configured the host to use the maximum number of file descriptors that are allowed by the operating system. On some operating systems, the limit is configured dynamically.
Your cluster size is 5000 hosts. Your master host is on Linux, kernel version 2.4:
2. Log in to the LSF master host as the root user.
3. Add the following line to your /etc/rc.d/rc.local startup script:
echo -n "5120" > /proc/sys/fs/file-max
4. Restart the operating system to apply the changes.
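5. In the bash shell, instruct the operating system to use the new file limits (Linux often rejects "unlimited" for open files; if it does, pass an explicit number such as 5120 instead):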
# ulimit -n unlimited
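Once the steps above are done, the result can be checked against the rule that the limit should be at least the number of hosts. The following is a minimal sketch, not part of the LSF documentation: limit_covers_hosts is a hypothetical helper name, the host count of 5000 is taken from the example, and the /proc path assumes Linux.

```shell
# Hypothetical helper: succeeds when a limit covers the given host count.
# $1 = number of hosts, $2 = limit (a number or the word "unlimited")
limit_covers_hosts() {
    if [ "$2" = "unlimited" ]; then
        return 0
    fi
    [ "$2" -ge "$1" ]
}

cluster_hosts=5000            # cluster size from the example
shell_limit=$(ulimit -n)      # soft per-process limit of this shell

if limit_covers_hosts "$cluster_hosts" "$shell_limit"; then
    echo "shell limit $shell_limit covers $cluster_hosts hosts"
else
    echo "shell limit $shell_limit is below $cluster_hosts hosts"
fi

# System-wide limit (Linux only, assumption).
if [ -r /proc/sys/fs/file-max ]; then
    system_max=$(cat /proc/sys/fs/file-max)
    if limit_covers_hosts "$cluster_hosts" "$system_max"; then
        echo "file-max $system_max covers $cluster_hosts hosts"
    else
        echo "file-max $system_max is below $cluster_hosts hosts"
    fi
fi
```

Both checks matter, because the per-shell limit set with ulimit and the system-wide value in file-max are separate settings.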
Hope this helps...