[GE users] Data transfer between submit and compute host

mdondrup michael.dondrup at uni.no
Fri Oct 1 14:44:23 BST 2010


We are adopting GE in a data-intensive web-services environment.
I would appreciate your opinions and experience on how best to transfer data between submit and exec hosts. Data volume will be in the range of
100 MB to 1 GB, and the data will vary a lot from job to job, so it is not well suited to caching. There will be fewer than 10 compute hosts. Jobs
will be submitted via DRMAA from several different programming languages.
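
To put those volumes in perspective, here is a rough back-of-envelope sketch. It assumes a 1 Gbit/s network link shared evenly between the hosts that are reading at the same time; that link speed is an assumption for illustration, not a measured figure for our setup, and real throughput will be lower because of protocol and disk overhead.

```python
def transfer_seconds(size_bytes, link_bits_per_s=1e9, concurrent_hosts=1):
    """Ideal-case time to move size_bytes over a link shared evenly by concurrent_hosts.

    link_bits_per_s defaults to 1 Gbit/s, which is an assumed value.
    """
    per_host_bits_per_s = link_bits_per_s / concurrent_hosts
    return size_bytes * 8 / per_host_bits_per_s


MB = 10**6
GB = 10**9

if __name__ == "__main__":
    # Smallest and largest expected inputs, with one reader and with nine readers.
    for size in (100 * MB, 1 * GB):
        for hosts in (1, 9):
            secs = transfer_seconds(size, concurrent_hosts=hosts)
            print(f"{size // MB:5d} MB, {hosts} host(s) reading: {secs:.1f} s")
```

Under these assumptions the worst case (1 GB with nine hosts reading at once) works out to roughly 72 seconds per transfer, so whether that is acceptable depends on how long the jobs themselves run.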

The following solutions came up:
 - NFS shared directory (seems feasible as long as not too many compute hosts access it at the same time)
 - Storing files in BLOB fields in a relational database (MySQL/PostgreSQL) and having the job script fetch them.
   I guess this is rather inefficient, but I need some arguments as to why.
 - I think I heard somewhere that GE has some built-in functionality for data transfer, but I cannot recall where that was documented.
 - CacheFS? Is it even worth bothering with for variable data?
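
For the NFS option, what I have in mind is roughly the following minimal sketch. It assumes a directory that is mounted at the same path on the submit host and on all exec hosts; the function names `stage_input` and `cleanup` are hypothetical helpers for illustration, not part of GE or DRMAA. The submit side copies the input into a unique per-job subdirectory and would then pass the staged path to the job as an argument.

```python
import shutil
import tempfile
import uuid
from pathlib import Path


def stage_input(local_file, shared_root):
    """Copy local_file into a fresh per-job directory under shared_root.

    shared_root is assumed to be an NFS export visible on every exec host.
    Returns the path of the staged copy.
    """
    job_dir = Path(shared_root) / f"job-{uuid.uuid4().hex}"
    job_dir.mkdir(parents=True)
    staged = job_dir / Path(local_file).name
    shutil.copy2(local_file, staged)
    return staged


def cleanup(staged_path):
    """Remove the per-job directory once the results have been collected."""
    shutil.rmtree(Path(staged_path).parent)


if __name__ == "__main__":
    # Demo with a temporary directory standing in for the NFS mount.
    with tempfile.TemporaryDirectory() as root:
        src = Path(root) / "input.dat"
        src.write_bytes(b"example payload")
        staged = stage_input(src, Path(root) / "shared")
        print("job would be submitted with argument:", staged)
        cleanup(staged)
```

The per-job directory keeps concurrent jobs from overwriting each other's files and makes cleanup a single directory removal.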

Any input on pros and cons is welcome.


