Opened 7 years ago

Closed 7 years ago

#1423 closed defect (fixed)

renew_all_certs creates CRL which expires after one month

Reported by: aylee
Owned by: Dave Love <d.love@…>
Priority: normal
Milestone:
Component: sge
Version: 8.0.0b
Severity: major
Keywords:
Cc:

Description

I've stumbled over an effect which seems to be the same as described in:

http://arc.liv.ac.uk/pipermail/gridengine-users/2007-September/015678.html
http://www.mail-archive.com/users@gridengine.org/msg03479.html

One year after I set up SGE we got SSL errors, which was no surprise because the certificates had expired. I ran renew_all_certs, distributed the files to all nodes, and everything ran fine again.

After about a month we had certificate errors again! I checked the certificate files and they all seemed correct, e.g.:

[root@tsqm sgeCA]# openssl x509 -in /gridware/cst-gridengine/default/common/sgeCA/cacert.pem -noout -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            cd:4e:f8:ac:e0:53:85:7b
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: C=DE, ST=Hessen, L=Darmstadt, O=CST AG, OU=Research and Development, CN=SGE Certificate Authority/UID=CA/emailAddress=thimo.neubauer@cst.com
        Validity
            Not Before: Jun  7 18:14:59 2012 GMT
            Not After : Jun  7 18:14:59 2013 GMT

I dug further and found that the CRL seems to be the problem! It claims that a new version has to be available every month:

[root@tsqm sgeCA]# openssl crl -in /gridware/cst-gridengine/default/common/sgeCA/ca-crl.pem -noout -text
Certificate Revocation List (CRL):
        Version 1 (0x0)
        Signature Algorithm: md5WithRSAEncryption
        Issuer: /C=DE/ST=Hessen/L=Darmstadt/O=CST AG/OU=Research and Development/CN=SGE Certificate Authority/UID=CA/emailAddress=thimo.neubauer@cst.com
        Last Update: Jun  7 18:15:00 2012 GMT
        Next Update: Jul  7 18:15:00 2012 GMT

I propose giving the CRL the same expiry period as the CA certificates.

IMHO this is a subtle pitfall that can easily break a CSP-enabled installation completely, and others have already been bitten by it. That is why I set the severity to "major".

Attachments (1)

sge_ca.diff (735 bytes) - added by aylee 7 years ago.
quick fix

Change History (7)

comment:1 Changed 7 years ago by aylee

I've found a quick fix (diff follows)

Changed 7 years ago by aylee

quick fix

comment:3 Changed 7 years ago by wish

On 10 July 2012 13:35, SGE <sge-bugs@…> wrote:

> I'd propose to set the same expiry period for the CRL as for the CA
> certificates.

Not sure this is a good idea. I don't think it would matter if CSP is used
wholly within a cluster controlled by a single admin or admin team that can
update all copies of the CRL whenever they need to. But if CSP is used to
allow trusted submission from outside the cluster, then you want to be certain
your CRL is reasonably up to date, and you should set up (possibly automated)
procedures to ensure that it is, rather than just extending the CRL lifetime.

I suspect this is the sort of problem that should be "solved" with better
documentation.
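
An automated check of the kind suggested above could look roughly like the following. This is only a sketch, not part of SGE: the function names days_left and check_crl are made up for illustration, and it assumes GNU date (for -d) and an openssl binary on the PATH.

```shell
#!/bin/sh
# Whole days between a reference time (epoch seconds, $2) and an
# OpenSSL-style timestamp ($1) such as "Jul  7 18:15:00 2012 GMT".
# Relies on GNU date's -d option to parse the timestamp.
days_left() {
    expiry=$(date -u -d "$1" +%s) || return 2
    echo $(( (expiry - $2) / 86400 ))
}

# Warn when the CRL's Next Update is fewer than $2 days away (default 7).
# Returns 0 if the CRL is fine, 1 if it needs replacing soon, 2 on error.
check_crl() {
    next=$(openssl crl -in "$1" -noout -nextupdate) || return 2
    left=$(days_left "${next#nextUpdate=}" "$(date +%s)") || return 2
    if [ "$left" -lt "${2:-7}" ]; then
        echo "WARNING: $1 must be replaced within $left day(s)" >&2
        return 1
    fi
    echo "OK: $1 is valid for another $left day(s)"
}
```

Run from cron on each host that keeps its own copy, for example as `check_crl /gridware/cst-gridengine/default/common/sgeCA/ca-crl.pem 7`, this would flag a stale CRL before the SSL errors start, whatever lifetime the CRL was issued with.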

comment:4 Changed 7 years ago by dlove

SGE <sge-bugs@…> writes:

> I dug further and found that the CRL seems to be the problem! It claims
> that a new version has to be available every month:

That's what I assumed when people complained on the list a while ago,
but no-one sent further information. (The error message is at least
unhelpful, but the code is obscure.)

> I'd propose to set the same expiry period for the CRL as for the CA
> certificates.

Thanks for the patch. Actually I'd already done that, and it apparently
didn't override the .cnf file, but I've just re-checked and it does
work. I don't know what was wrong before. I'll also make the template
values consistent, as that change got lost.
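
For reference, the two places where stock OpenSSL takes the CRL lifetime can be tried out in isolation. The script below is a sketch under assumptions: it builds a throwaway CA in a temporary directory, the file names (ca.cnf, demoCA, cakey.pem) are invented for the demo, and it does not reproduce the contents of sge_ca.diff or the SGE CA scripts. It sets default_crl_days = 30 in the config file and then passes -crldays 365 on the command line, so the printed Next Update shows which value was used.

```shell
#!/bin/sh
# Throwaway demo CA in a temp dir; nothing here touches a real sgeCA tree.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir demoCA
: > demoCA/index.txt            # empty certificate database

cat > ca.cnf <<'EOF'
[ req ]
distinguished_name = req_dn

[ req_dn ]
CN = Common Name

[ ca ]
default_ca = CA_default

[ CA_default ]
dir              = ./demoCA
database         = $dir/index.txt
default_md       = sha256
default_crl_days = 30
EOF

# Self-signed CA certificate valid for one year.
openssl req -config ca.cnf -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=Demo CA" -keyout cakey.pem -out cacert.pem

# Generate the CRL; -crldays is given explicitly instead of relying on
# default_crl_days from ca.cnf.
openssl ca -config ca.cnf -gencrl -crldays 365 \
    -keyfile cakey.pem -cert cacert.pem -out ca-crl.pem

# Print only the validity window of the new CRL.
openssl crl -in ca-crl.pem -noout -lastupdate -nextupdate
```

Dropping -crldays from the openssl ca call makes it fall back to default_crl_days, which is a quick way to compare the two settings.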

comment:5 Changed 7 years ago by dlove

William Hay <w.hay@…> writes:

> Not sure this is a good idea. I don't think it would matter if CSP is used
> wholly within a cluster controlled by a single admin or admin team that can
> update all copies of the CRL whenever they need to. But if CSP is used to
> allow trusted submission from outside the cluster, then you want to be certain
> your CRL is reasonably up to date, and you should set up (possibly automated)
> procedures to ensure that it is, rather than just extending the CRL lifetime.

I don't understand how a shorter lifetime helps. I wouldn't expect to
worry much about mutual authentication, but doesn't the CRL get
distributed with the new certificates (with a non-shared SGE_ROOT) and
what's the advantage to doing it separately?

comment:6 Changed 7 years ago by wish

On 12 July 2012 11:34, SGE <sge-bugs@…> wrote:

Comment (by dlove):

> William Hay <w.hay@…> writes:
>
>> Not sure this is a good idea. I don't think it would matter if used wholly
>> within a cluster controlled by a single admin or admin team that can update all
>> copies of the CRL whenever they need to but if CSP is used to allow trusted
>> submission from outside the cluster then you want to be certain your CRL is
>> reasonably up to date and you should set up (possibly automated) procedures
>> to ensure that it is rather than just extending the CRL lifetime.
>
> I don't understand how a shorter lifetime helps. I wouldn't expect to
> worry much about mutual authentication, but doesn't the CRL get
> distributed with the new certificates (with a non-shared SGE_ROOT) and
> what's the advantage to doing it separately?

Not being up to date with the CRL could lead to machines being
trusted that shouldn't be if a certificate is compromised. Having a
short lifetime ensures that you notice if it isn't up to date because
things break.

William

If someone hasn't updated the CRL recently then they are more likely
to get their session hijacked
if

comment:7 Changed 7 years ago by Dave Love <d.love@…>

  • Owner set to Dave Love <d.love@…>
  • Resolution set to fixed
  • Status changed from new to closed

In [4291/sge]:

(The changeset message doesn't reference this ticket)
