[GE issues] [Issue 3086] New - add ability to commlib to wait for data on foreign file descriptors

pollinger harald.pollinger at sun.com
Wed Jul 15 15:45:14 BST 2009

                 Issue #|3086
                 Summary|add ability to commlib to wait for data on foreign file
                        |descriptors
       Status whiteboard|
              Issue type|ENHANCEMENT
             Assigned to|crei
             Reported by|pollinger

------- Additional comments from pollinger at sunsource.net Wed Jul 15 07:45:10 -0700 2009 -------
With the current implementation of the commlib there is this problem:

If the application has to wait for data from the commlib and on some other file descriptors (fd's), it has to create two threads: one blocks
in the cl_commlib_receive_message() call waiting for commlib data, the other blocks in the select() call waiting for data on the other fd's.
This adds unnecessary complexity to the application.

A better solution would be to let the commlib also wait for data on foreign fd's, so that only one thread is needed, which gets
awakened whenever data arrives for the commlib or on the other fd's.

This functionality could be implemented this way:
The application registers both an fd and a callback function with the commlib. This causes the commlib to select() or poll() on this fd, too,
in addition to its own socket fd's. The application then calls cl_commlib_receive_message(), where the commlib blocks until an fd is
ready. When the select() returns, the commlib checks whether one of the registered fd's is ready to read (or write). If it is, the commlib
calls the associated callback function to let the application work with this fd.
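
The flow described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not commlib code: the
names wait_registration_t, wait_fd_func_t, wait_once() and wait_demo() are hypothetical, a pipe stands in for the commlib socket, and
error handling is reduced to the minimum.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical callback type; the proposal defines its own cl_fd_func_t. */
typedef int (*wait_fd_func_t)(int fd, void *user_data);

/* Hypothetical registration record: one foreign fd plus its callback. */
typedef struct {
   int            fd;
   wait_fd_func_t callback;
   void          *user_data;
} wait_registration_t;

/* Example callback: read one byte and count the invocation. */
static int wait_count_callback(int fd, void *user_data) {
   char byte;
   if (read(fd, &byte, 1) != 1) {
      return -1;
   }
   (*(int *)user_data)++;
   return 0;
}

/*
 * Block once on the library's own fd and on the registered foreign fd.
 * Returns 1 if the library fd is readable, 0 if only the foreign fd was
 * served, -1 on select() error.
 */
static int wait_once(int lib_fd, wait_registration_t *reg) {
   fd_set read_fds;
   int max_fd = (lib_fd > reg->fd) ? lib_fd : reg->fd;

   FD_ZERO(&read_fds);
   FD_SET(lib_fd, &read_fds);
   FD_SET(reg->fd, &read_fds);

   if (select(max_fd + 1, &read_fds, NULL, NULL, NULL) < 0) {
      return -1;
   }
   if (FD_ISSET(reg->fd, &read_fds)) {
      reg->callback(reg->fd, reg->user_data);   /* application handles its fd */
   }
   return FD_ISSET(lib_fd, &read_fds) ? 1 : 0;
}

/* Self-check: make the foreign pipe readable; returns the callback count. */
static int wait_demo(void) {
   int lib_pipe[2], foreign_pipe[2], calls = 0, rc;
   wait_registration_t reg;

   if (pipe(lib_pipe) != 0) return -1;
   if (pipe(foreign_pipe) != 0) return -1;
   reg.fd = foreign_pipe[0];
   reg.callback = wait_count_callback;
   reg.user_data = &calls;

   if (write(foreign_pipe[1], "x", 1) != 1) return -1;
   rc = wait_once(lib_pipe[0], &reg);

   close(lib_pipe[0]); close(lib_pipe[1]);
   close(foreign_pipe[0]); close(foreign_pipe[1]);
   return (rc == 0) ? calls : -1;
}
```

A single thread calling wait_once() in a loop is woken for either source, which is the effect the proposal wants from
cl_commlib_receive_message().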

The following must be changed to implement this functionality:

1. The fd's and the callback function pointers have to be stored in the commlib in a separate list. This list should be similar to the
"cl_message_list" list and could be called "cl_fd_list_t" (if it needs a type). It should consist of "cl_fd_list_elem_t" elements, each of
which holds a "raw_elem" pointer and a pointer to a "cl_fd_data_t". They should be defined like this:

typedef struct {
   int          fd;
   cl_bool      read;       /* shall the fd be added to the read fd_set? */
   cl_bool      write;      /* shall the fd be added to the write fd_set? */
   cl_fd_func_t callback;   /* called when the fd is ready, see item 7 */
   void         *user_data; /* opaque pointer for the application's own use */
} cl_fd_data_t;

typedef struct {
   cl_fd_data_t  *data;       /* the actual data */
   cl_raw_list_t *raw_elem;   /* needed for list chaining */
} cl_fd_list_elem_t;

2. The "cl_fd_list" list has to be stored in the "cl_com_handle". When the "cl_com_handle" gets created by "cl_com_create_handle()", the
"cl_fd_list" has to be created, too.

3. When the "cl_com_handle" gets destroyed by "cl_commlib_shutdown_handle()", the "cl_fd_list" has to be destroyed, too.

4. There must be a "cl_fd_register()" function which
- must lock the "cl_fd_list" before any operation!
- takes the "cl_com_handle" and a pointer to an allocated and filled "cl_fd_data_t" struct.
- checks if the fd is already in the list. If yes, then it deletes the old element.
- creates and appends the list element. The new list element just points to the provided "cl_fd_data_t".
- releases the lock after all operations.

5. There must be a "cl_fd_unregister()" function which
- must lock the "cl_fd_list" before any operation!
- takes the "cl_com_handle" and a fd.
- free()s the "cl_fd_data_t" struct
- removes and deletes the "cl_fd_list_elem_t" from the list
- releases the lock after all operations.
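
Items 4 and 5 could look roughly like the following sketch. It is not the commlib implementation: the fdreg_* names are hypothetical, a
plain singly linked list with a pthread mutex stands in for the commlib raw list, and it assumes that the list takes ownership of the
registered data struct (so replacing or unregistering an entry frees it), which the text above does not spell out for the replace case.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for cl_fd_data_t and the list types of item 1. */
typedef struct {
   int   fd;
   int   read;
   int   write;
   void *user_data;
} fdreg_data_t;

typedef struct fdreg_elem {
   fdreg_data_t      *data;
   struct fdreg_elem *next;
} fdreg_elem_t;

typedef struct {
   pthread_mutex_t lock;
   fdreg_elem_t   *head;
   int             count;
} fdreg_list_t;

static void fdreg_list_init(fdreg_list_t *list) {
   pthread_mutex_init(&list->lock, NULL);
   list->head = NULL;
   list->count = 0;
}

/* Convenience allocator for an "allocated and filled" data struct. */
static fdreg_data_t *fdreg_new_data(int fd, int want_read, int want_write) {
   fdreg_data_t *data = malloc(sizeof(*data));
   if (data != NULL) {
      data->fd = fd;
      data->read = want_read;
      data->write = want_write;
      data->user_data = NULL;
   }
   return data;
}

/* Caller holds the lock. Unlinks the element for fd; frees its data unless
 * it is the very struct being re-registered (keep). Returns 1 if removed. */
static int fdreg_remove_locked(fdreg_list_t *list, int fd, const fdreg_data_t *keep) {
   fdreg_elem_t **link = &list->head;
   while (*link != NULL) {
      fdreg_elem_t *elem = *link;
      if (elem->data->fd == fd) {
         *link = elem->next;
         if (elem->data != keep) free(elem->data);
         free(elem);
         list->count--;
         return 1;
      }
      link = &elem->next;
   }
   return 0;
}

/* Register: drop any old element for the same fd, then append the new one. */
static int fdreg_register(fdreg_list_t *list, fdreg_data_t *data) {
   fdreg_elem_t *elem, **link;
   if (data == NULL) return -1;
   elem = malloc(sizeof(*elem));   /* allocation does not touch the list */
   if (elem == NULL) return -1;
   elem->data = data;
   elem->next = NULL;

   pthread_mutex_lock(&list->lock);
   fdreg_remove_locked(list, data->fd, data);
   for (link = &list->head; *link != NULL; link = &(*link)->next) { }
   *link = elem;                   /* append at the tail */
   list->count++;
   pthread_mutex_unlock(&list->lock);
   return 0;
}

/* Unregister: returns 0 if the fd was registered, -1 otherwise. */
static int fdreg_unregister(fdreg_list_t *list, int fd) {
   int removed;
   pthread_mutex_lock(&list->lock);
   removed = fdreg_remove_locked(list, fd, NULL);
   pthread_mutex_unlock(&list->lock);
   return removed ? 0 : -1;
}
```

Registering the same fd twice leaves a single entry, so the fd can never end up in the fd_sets more than once.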

6. In the function "cl_com_tcp_open_connection_request_handler()" or "cl_com_ssl_open_connection_request_handler()", respectively:
- where the FD_SETs are built, the list has to be locked, and for every element in the "cl_fd_list" the fd has to be added to the read_fds
or the write_fds, as requested by its read and write flags.
- unlock the list after constructing the FD_SETs
- ... select() or poll()...
- for every fd that is set in the FD_SETs, check if it is a registered fd:
  - if no, let the commlib handle it
  - if yes, lock the list, call the callback function, unlock the list

7. The callback function is to be defined this way:
typedef int (*cl_fd_func_t) (cl_com_handle *handle, int select_return_value, cl_fd_data_t *p_fd_data);

The callback function takes the "cl_com_handle" it is associated with, the return value of the select() (in case the fd caused a select() or
poll() error) and the "cl_fd_data_t" with all the necessary information.
The callback function must return CL_RETVAL_OK to continue with normal operation. If it returns any other value, the associated
"cl_fd_list_elem_t" gets deleted from the "cl_fd_list".
Add a comment that the "cl_fd_list" is already locked when the callback function is called!
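
The return-value contract of item 7 can be illustrated with a small sketch. All cbk_* names are hypothetical, CBK_RETVAL_OK is only a
stand-in for CL_RETVAL_OK (its real value is not assumed here), an opaque struct replaces "cl_com_handle", and an array replaces the
"cl_fd_list".

```c
#include <assert.h>
#include <stddef.h>

#define CBK_RETVAL_OK 0   /* stand-in for CL_RETVAL_OK, value chosen for the sketch */

typedef struct cbk_handle  cbk_handle_t;    /* opaque stand-in for cl_com_handle */
typedef struct cbk_fd_data cbk_fd_data_t;

/* Same parameter order as the proposed cl_fd_func_t. */
typedef int (*cbk_fd_func_t)(cbk_handle_t *handle, int select_return_value,
                             cbk_fd_data_t *p_fd_data);

struct cbk_fd_data {
   int           fd;
   cbk_fd_func_t callback;
   void         *user_data;
};

/* Example callback: works until the budget in user_data is used up, then
 * returns a non-OK value to ask for its own removal. */
static int cbk_limited(cbk_handle_t *handle, int select_return_value,
                       cbk_fd_data_t *p_fd_data) {
   int *budget = p_fd_data->user_data;
   (void)handle;
   if (select_return_value < 0 || *budget <= 0) {
      return -1;
   }
   (*budget)--;
   return CBK_RETVAL_OK;
}

/* Calls every callback and keeps only the entries that returned OK.
 * Returns the number of entries that remain registered. */
static size_t cbk_dispatch_all(cbk_handle_t *handle, int select_return_value,
                               cbk_fd_data_t *entries, size_t n) {
   size_t kept = 0;
   size_t i;
   for (i = 0; i < n; i++) {
      if (entries[i].callback(handle, select_return_value, &entries[i]) == CBK_RETVAL_OK) {
         entries[kept++] = entries[i];   /* compact the surviving entries */
      }
   }
   return kept;
}
```

Signalling removal through the return value lets a callback drop itself without calling "cl_fd_unregister()"; with a non-recursive lock
such a call from inside the callback would presumably deadlock, since the list is already locked at that point.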

