Is there a way to renew TLS certificates on a network / HTTP server without any downtime?

I have a simple HTTPS server that serves a single page (error handling omitted for brevity):

    package main

    import (
        "crypto/tls"
        "fmt"
        "net/http"
    )

    func main() {
        mux := http.NewServeMux()
        mux.HandleFunc("/", func(w http.ResponseWriter, req *http.Request) {
            fmt.Fprintf(w, "hello!")
        })
        xcert, _ := tls.LoadX509KeyPair("cert1.crt", "key1.pem")
        tlsConf := &tls.Config{
            Certificates: []tls.Certificate{xcert},
        }
        srv := &http.Server{
            Addr:      ":https",
            Handler:   mux,
            TLSConfig: tlsConf,
        }
        srv.ListenAndServeTLS("", "")
    }

I want to use a Let's Encrypt TLS certificate to serve content over HTTPS. I would like to be able to renew the certificate and swap it in on the running server without any downtime.

I tried running a goroutine to update tlsConf:

    go func(c *tls.Config) {
        xcert, _ := tls.LoadX509KeyPair("cert2.crt", "key2.pem")
        select {
        case <-time.After(3 * time.Minute):
            c.Certificates = []tls.Certificate{xcert}
            c.BuildNameToCertificate()
            fmt.Println("cert switched!")
        }
    }(tlsConf)

However, this does not work, because the server does not "read" the modified config. Is there any way to ask the server to reload its TLSConfig?

+5
2 answers

Yes, there is: you can use the tls.Config's GetCertificate callback instead of populating Certificates. First, define a data structure that encapsulates the certificate and the logic to reload it (on receiving a SIGHUP signal in this example):

    // Requires imports: crypto/tls, log, os, os/signal, sync, syscall.

    // keypairReloader holds the current certificate and knows how to reload it.
    type keypairReloader struct {
        certMu   sync.RWMutex
        cert     *tls.Certificate
        certPath string
        keyPath  string
    }

    func NewKeypairReloader(certPath, keyPath string) (*keypairReloader, error) {
        result := &keypairReloader{
            certPath: certPath,
            keyPath:  keyPath,
        }
        cert, err := tls.LoadX509KeyPair(certPath, keyPath)
        if err != nil {
            return nil, err
        }
        result.cert = &cert
        // Reload the key pair from disk whenever the process receives SIGHUP.
        go func() {
            c := make(chan os.Signal, 1)
            signal.Notify(c, syscall.SIGHUP)
            for range c {
                log.Printf("Received SIGHUP, reloading TLS certificate and key from %q and %q", certPath, keyPath)
                if err := result.maybeReload(); err != nil {
                    log.Printf("Keeping old TLS certificate because the new one could not be loaded: %v", err)
                }
            }
        }()
        return result, nil
    }

    // maybeReload swaps in the new key pair only if it loads successfully.
    func (kpr *keypairReloader) maybeReload() error {
        newCert, err := tls.LoadX509KeyPair(kpr.certPath, kpr.keyPath)
        if err != nil {
            return err
        }
        kpr.certMu.Lock()
        defer kpr.certMu.Unlock()
        kpr.cert = &newCert
        return nil
    }

    // GetCertificateFunc returns a callback suitable for tls.Config.GetCertificate.
    func (kpr *keypairReloader) GetCertificateFunc() func(*tls.ClientHelloInfo) (*tls.Certificate, error) {
        return func(clientHello *tls.ClientHelloInfo) (*tls.Certificate, error) {
            kpr.certMu.RLock()
            defer kpr.certMu.RUnlock()
            return kpr.cert, nil
        }
    }

Then in the server code use:

    kpr, err := NewKeypairReloader(*tlsCertPath, *tlsKeyPath)
    if err != nil {
        log.Fatal(err)
    }
    srv.TLSConfig.GetCertificate = kpr.GetCertificateFunc()

I recently implemented this pattern in RobustIRC.

+11

You will need to stop and restart the listener, which in itself counts as downtime.

If "no downtime" is a hard requirement, one option is to build a graceful restart by spawning a child instance:

http://grisha.org/blog/2014/06/03/graceful-restart-in-golang/

But in reality, this gives a false sense of security. If you run only one instance and try to keep that instance up, it is a single point of failure, and you cannot guarantee its uptime: the server may reboot, the application may panic, the connection may drop.

Instead, consider setting up a web farm of at least 2 or 3 nodes to distribute traffic.

Hear me out for a moment...

Amazon AWS has Elastic Beanstalk (among other similar offerings). Windows Azure has Websites. Both of these managed options let you perform rolling updates. Give up SSH access, sit back, and let the platform manage it.

What are rolling updates? Say you have two instances running version 1. You want to deploy version 2.

  • You upload the package and AWS starts the deployment by launching a third instance.
  • Once that virtual machine is in a "ready" state, with your version 2 code deployed and running, AWS starts sending TCP traffic to it by updating the ELB (load balancer).
  • AWS stops sending traffic to one of the old version 1 nodes. It does not shut it down yet; it just stops sending it new connections.
  • After all TCP connections to that old version 1 instance have drained, AWS terminates the instance.
  • AWS now launches a 4th instance running version 2 and, once it is ready, starts routing traffic to it.
  • AWS stops sending traffic to the last old version 1 instance and waits for its existing connections to complete.
  • Once they have drained, AWS terminates the last instance of the old version.
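The sequence above can be sketched as a small simulation. This is only a toy model of the ordering of the steps, not how AWS implements them: the `instance` type and the `rollingUpdate`, `serving` and `versions` functions are hypothetical names made up for the illustration, and readiness checks and connection draining are reduced to single assignments.

```go
package main

import "fmt"

// instance is a hypothetical model of one node behind the load balancer.
type instance struct {
	version int
	inPool  bool // true while the load balancer sends it new connections
}

// serving counts the instances currently receiving new connections.
func serving(fleet []instance) int {
	n := 0
	for _, in := range fleet {
		if in.inPool {
			n++
		}
	}
	return n
}

// versions lists the version of every instance in the fleet.
func versions(fleet []instance) []int {
	out := make([]int, 0, len(fleet))
	for _, in := range fleet {
		out = append(out, in.version)
	}
	return out
}

// rollingUpdate replaces the old instances one at a time and returns the new
// fleet plus the lowest number of in-service instances seen along the way.
func rollingUpdate(old []instance, newVersion int) ([]instance, int) {
	fleet := append([]instance(nil), old...) // work on a copy
	minServing := serving(fleet)
	for i := 0; i < len(old); i++ {
		// Launch a replacement; once ready, the load balancer routes to it.
		fleet = append(fleet, instance{version: newVersion, inPool: true})
		// Stop sending new connections to the oldest instance (drain it).
		fleet[0].inPool = false
		if n := serving(fleet); n < minServing {
			minServing = n
		}
		// Its existing connections have finished: terminate it.
		fleet = fleet[1:]
	}
	return fleet, minServing
}

func main() {
	old := []instance{{version: 1, inPool: true}, {version: 1, inPool: true}}
	fleet, minServing := rollingUpdate(old, 2)
	fmt.Println("versions:", versions(fleet), "min in service:", minServing)
}
```

Because each replacement joins the pool before an old instance leaves it, the number of instances accepting new connections never falls below the starting count of two in this model.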

Zero downtime. Zero dropped connections. Zero lost TCP packets. Fully automated. Roll out updated SSL certificates whenever you want.

This is, of course, fully customizable. A blue/green deployment, for example, first launches a full set of new instances and then directs all new traffic to the new environment, which suits database schema changes best. You can also run canary tests with a small share of traffic, etc.

+2
