[GE users] Followup project

joelandman landman at scalableinformatics.com
Fri Aug 20 18:07:24 BST 2010

On 08/20/2010 03:38 AM, Andy Schwierskott wrote:
> Joe,
>  > Let me ask a (somewhat obvious) set of questions.
>  >
>  > Is there any possibility that this project could be shut down due to
>  > Oracle flexing legal muscle? That is, DanT looks like his group has
>  > funding and a roadmap, and they might not take too kindly to an external
>  > group building a relicensed version. Nor give their permission for
>  > relicensure.

Hi Andy

   As always, I appreciate your reply, and your and the team's work on SGE.
We've been using/recommending SGE for the better part of the last 
decade.  We've developed a number of tools in use by numerous groups for 
running certain calculations with SGE hidden beneath, as well as 
accounting, usage reporting, etc.  These tools have been available on 
our sites for years, as GPL.

> your question raises a fair concern and needs to be asked here. No one,
> neither contributing as an individual nor by representing a company
> wants to get involved in any legal troubles when kicking off a
> community-led fork of the Grid Engine project.

   I think a few of us were surprised by the change ... I expected 
something, though I wasn't sure what.

> Though I'm now an Oracle employee and have been on the Grid Engine
> engineering team for 16 years, I can't speak for Oracle here. So I just
> can share my personal view: As I always understood the SISSL license
> gives the freedom to anyone to take the code and create a fork - even
> commercial versions seem to be allowed if the requirements of the SISSL
> are met. I agree that a statement or clarification explaining what can
> and cannot be done would be helpful.
> And the legal question is not only about the code: I could not give an
> answer under which license the Issuezilla database and mailing list
> archives had been made available. That needs to be answered as well.

I think there are non-Oracle resources that have mirrored the lists. 
The bug/patch bits are another thing entirely; there is only one host for those.
The archives represent a huge investment on the part of the community,
in terms of providing effectively free support to others (free and paid 
customers of Sun/Oracle).  Chris Dag's gridinfo.net wiki represents an 
incredibly useful knowledge store.  These community resources are 
available freely to all.

> Undoubtedly the vast effort of all contributions with new feature
> development and bug fixing was contributed by Sun in the past 9 years
> since we went open source in June 2001. There were a few code
> contributions from our community (think about the array job
> interdependencies) and there was help with porting to new OS platforms
> (the most important certainly was the initial port to Mac OS X). In my
> view the greatest asset which had been created was the incredible
> adoption of Grid Engine and this tremendously active, helpful and
> positive mailing lists in the Grid Engine project. Also the many issues
> and bugs which had been reported by our community helped us to improve
> the quality of the Grid Engine code and helped others to get fixes
> before they ran into it as well. No doubt, that helped our paying
> customers as well.

Yes I agree.  The community has been one of the best selling points to 
have customers adopt SGE.  Support is an email away.  With many eyes, 
many problems are shallow.

> All of you who are in the software business know what's the price tag of
> developing new features and fixing bugs. There are numbers which say a
> single bug on average costs more than $10k to fix. What I'm wondering is

Well, there are numbers and ... er ... then there are numbers.  I don't 
put much faith in these generalized metrics.

This said, bugs do cost real money to fix, and your point basically
speaks to the need of the OGE group to show positive revenue,
hopefully in (significant) excess of the costs to deliver this product.

Bugs cost people time (salary/benefits) and machine time (power,
acquisition costs, etc.).  These are non-zero.  Of this there is no doubt.

> if the community will be able to sustain the code quality of the current
> feature set - and what about new development or completely new modules
> which address new needs, like the Service Domain Manager? This thread

Look carefully at Torque and some of the others (Slurm etc).  Open 
source, contributing community, and quite vibrant, without the need for 
$10k/bug metrics and revenue offsets of the same.

> originally has started with a memory reporting regression on Linux. That
> bug was easy to fix. Others we fixed in 6.2u6 (see here
> http://gridengine.sunsource.net/project/gridengine/62patches.txt) took
> even us many, many person weeks to analyze and solve them, not talking
> about the QA team and lab and the continuous investment in the testsuite
> to ensure and improve the quality of the released code on all of the
> supported platforms.
> There's certainly quite some risk for anyone who is responsible for
> running clusters where every hour hundreds, thousands or tens of
> thousands of CPU hours are waiting to be kept busy and keep the company behind
> running. So is the assumed zero price tag worth the money you will lose
> if the DRM system of your choice does not work? Our customers are giving
> us figures which say that hardware, electricity, cooling,
> administration, OS licenses, DRM system licenses only make up one third
> of their costs. The remaining money goes into the licenses of the
> software which runs in the grid. Not talking about the labor costs for
> the engineers using the Grid. So can the license and support costs of a
> DRM system have any cost saving potential when you compare it with the
> value it adds to the bulk of hardware in a data center?

My concern is how the license changes impact the community.  Others have 
answered the costs issues.  You know as well as I that charging per core 
in an HPC scenario is a good way to kill sales.  We've seen folks 
charging $1k/core for clusters, and not understanding why they aren't 
rolling in the money.

In particular spaces, we are seeing customers actively investing in open 
source projects which allow them to scale their computing without 
scaling their software costs, which directly compete with the closed 
source systems at several thousands of dollars per socket.  The HPC user
group run by IDC keeps harping on these pricing models, and points out
that this is what customers complain the most about.  Changing to that
model isn't likely to increase the number of customers in this space,
nor endear the product to those who would be most likely to buy it.  I
am not trying to be critical ... just pointing out what we hear others say.

This said, I am still hopeful that some sort of method will exist to 
enable users to keep using OGE in an open source manner (think of
the CentOS-Red Hat relationship).  Customers who don't want/need support
don't pay for it.  Those who want/need it, do.  Red Hat has a very
similar issue to the OGE team on this.  Similar business model.  It's
workable.  CentOS can kick bugfixes back upstairs to RHEL, and they have
access to the database.  This is something of what I'd hope for here.
There are customers who'd pay for it, but it's a smaller number than
those who would use it.



Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

