Skip to content

Update Gorouter Back End Keep-Alive Connections Docs with Caveat about Potential Race Condition#199

Merged
paigecalvert merged 1 commit intocloudfoundry:masterfrom
peterellisjones:keep-alive-race-conditions
Jun 1, 2021
Merged

Update Gorouter Back End Keep-Alive Connections Docs with Caveat about Potential Race Condition#199
paigecalvert merged 1 commit intocloudfoundry:masterfrom
peterellisjones:keep-alive-race-conditions

Conversation

@peterellisjones
Copy link
Copy Markdown
Contributor

@peterellisjones peterellisjones commented May 20, 2021

Hi folks,

When Gorouter back end keep-alive connections is enabled, users are exposed to a potential race condition if their application keep-alive idle timeouts are not correctly configured. This PR updates the docs to inform users about this.

The Gorouter backend idle timeout value of 90 seconds comes from this hardcoded value.

Supporting information about why it's important for senders to close connections before receivers do:

https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/config-idle-timeout.html

To ensure that the load balancer is responsible for closing the connections to your instance, make sure that the value you set for the HTTP keep-alive time is greater than the idle timeout setting configured for your load balancer.

https://docs.pivotal.io/application-service/2-8/operating/frontend-idle-timeout.html

In general, set the value higher than your load balancer’s back end idle timeout to avoid the race condition where the load balancer sends a request before it discovers that the Gorouter or HAProxy has closed the connection.

https://blog.percy.io/tuning-nginx-behind-google-cloud-platform-http-s-load-balancer-305982ddb340

This causes the load balancer to be the side that closes idle connections, rather than nginx, which fixes the race condition!

https://tanzu.vmware.com/content/pivotal-engineering-journal/understanding-keep-alive-timeouts-in-the-cloud-foundry-networking-stack-2

Let’s look from the “outside” (the client) to the “inside” (the server in the deployment). As you travel further in, the keep-alive timeouts should be configured to be longer. That is, the outermost layer (in this case, the client) should have the shortest backend keep-alive timeout, and as you go in, the keep-alive timeouts should get progressively longer in relation to the corresponding backend or frontend idle timeout

Consider Golang’s HTTP client, which we believe to be somewhat robust to this type of client-side problem. The low level http transport package used by the default client implements a background loop that checks if a connection in its idle pool has been closed by the server before reusing it to make a request. This leaves a small window for a race condition where the server could close the connection between the check and the new request; in the event that the client loses this race, it retries this request on a new connection.

kind regards,

Pete

@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla bot commented May 20, 2021

CLA Signed

The committers are authorized under a signed CLA.

@cf-gitbot
Copy link
Copy Markdown

We have created an issue in Pivotal Tracker to manage this. Unfortunately, the Pivotal Tracker project is private so you may be unable to view the contents of the story.

The labels on this github issue will be updated when the story is started.

@peterellisjones peterellisjones changed the title Update Gorouter Back End Keep-Alive Connections with Caveat about Potential Race Condition Update Gorouter Back End Keep-Alive Connections Docs with Caveat about Potential Race Condition May 20, 2021
@peterellisjones peterellisjones force-pushed the keep-alive-race-conditions branch 2 times, most recently from e8fd2e9 to bb172f0 Compare May 20, 2021 09:26
@peterellisjones peterellisjones force-pushed the keep-alive-race-conditions branch from bb172f0 to eff7e45 Compare May 20, 2021 09:36
@paigecalvert paigecalvert merged commit 5d05c34 into cloudfoundry:master Jun 1, 2021
@paigecalvert
Copy link
Copy Markdown

Thank you @peterellisjones!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants