-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
feat: graceful shutdown to prevent 502 errors when oauth2-proxy is load balanced #3068
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
c014b45 to
6ec9c0d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation wise looks good. Please add an entry to the changelog. Run make generate and commit the update to the alpha config documentation. Add the newly introduced flag / option to the configuration.
https://github.com/oauth2-proxy/oauth2-proxy/blob/master/docs/docs/configuration/overview.md
f611ac0 to
c1edeab
Compare
| // Duration of time to continue serving traffic after receiving an exit signal. | ||
| // During this time the readiness endpoint will be returning HTTP 503 errors. | ||
| // Leave blank to disable. | ||
| ShutdownDuration time.Duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to check what impact this will have on the AlphaConfig 🤔
c1edeab to
4695366
Compare
|
|
||
| - [#3072](https://github.com/oauth2-proxy/oauth2-proxy/pull/3072) feat: support for multiple github orgs #3072 (@daniel-mersch) | ||
| - [#3068](https://github.com/oauth2-proxy/oauth2-proxy/pull/3068) feat: graceful shutdown to prevent errors when oauth2-proxy is load balanced (@adelsz) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The breaking change in the API surface of the oauthproxy package should probably be also mentioned.
Also, signal handling is now managed in the main package (which is better in my opinion), not anymore in oauthproxy.
@tuunit What do you think?
| handler: opts.Handler, | ||
| handler: opts.Handler, | ||
| shutdownDuration: opts.ShutdownDuration, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a check about negative Shutdown duration to fail early with a meaningful error.
| if err := srv.Shutdown(context.Background()); err != nil { | ||
| return fmt.Errorf("error shutting down server: %v", err) | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should handle differently if the server couldn't start...
| logger.Printf("Context canceled. Waiting %s before shutting down the listeners.", s.shutdownDuration) | ||
| time.Sleep(s.shutdownDuration) | ||
| logger.Printf("Shutting down listener.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add an if s.shutdownDuration > 0 check around those lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need this time.Sleep, it's going to delay things without any help.
srv.Shutdown is a blocking call, it will ensure inflight request drained before it finish.
All we need to do is make sure process won't exit before srv.shutdown finish, quotes from https://pkg.go.dev/net/http#Server.Shutdown
Shutdown gracefully shuts down the server without interrupting any active connections. Shutdown works by first closing all open listeners, then closing all idle connections, and then waiting indefinitely for connections to return to idle and then shut down. If the provided context expires before the shutdown is complete, Shutdown returns the context's error, otherwise it returns any error returned from closing the Server's underlying Listener(s).
When Shutdown is called, Serve, ServeTLS, ListenAndServe, and ListenAndServeTLS immediately return ErrServerClosed. Make sure the program doesn't exit and waits instead for Shutdown to return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@haitch This PR isn't about draining inflight requests, that works correctly already.
Please see #3066 for the motivation.
Long story short:
Load balancer health checks are periodical, so you must keep the service running and actively serving traffic while reporting a bad health check, otherwise requests will be lost by the load balancer.
|
|
||
| // NewOAuthProxy creates a new instance of OAuthProxy from the options provided | ||
| func NewOAuthProxy(opts *options.Options, validator func(string) bool) (*OAuthProxy, error) { | ||
| func NewOAuthProxy(ctx context.Context, opts *options.Options, validator func(string) bool) (*OAuthProxy, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this API change shouldn't be necessary.
Giving the server context should stay in the Start method.
What about moving the call to buildPreAuthChain in Start?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, passing ctx to a constructor is weird to me.
readynesscheck should better be a background loop than a middleware, middleware should be processed on every request.
I think a proper way is to construct readynesscheck in main with root ctx, pass into this newOAuthProxy and to buildPreAuthChain.
Signed-off-by: Jan Larwig <jan@larwig.com>
4695366 to
5630b4d
Compare
This PR introduces a new
shutdownDurationconfiguration variable which allows the proxy to shutdown gracefully.After receiving a termination signal, proxy will keep serving traffic for
shutdownDurationbut will start returning a 503 error for its readiness/readyendpoint signaling to the load balancer to not send new traffic to it.After
shutdownDuration, proxy shuts down as usual.Motivation and Context
Fixes #3066
How Has This Been Tested?
shutdownDurationseconds/readyendpoint returns an errorChecklist: