504 HTTP Errors Returned by ELB Even When Hosts Are Healthy and Can Serve a Request

I have a service deployed to Amazon Web Services (AWS), specifically two instances behind an Elastic Load Balancer (ELB). All three availability zones (us-west-2a, b, and c) are enabled on the ELB, but only two of the three zones have instances running in them.

The problem is that even though the traffic/load is not particularly high, I still frequently get HTTP 504 errors from the ELB.

The ELB access log lines read as follows:

-1 -1 -1 504 0 0 0

In order, the fields are request_processing_time, backend_processing_time, response_processing_time, elb_status_code, backend_status_code, received_bytes, and sent_bytes. A description of each field can be found here.
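To make the ordering explicit, here is my own annotation of the sample line above, produced with a quick one-liner (the field names are the ones listed, not read from the log itself):

    printf '%s\n' '-1 -1 -1 504 0 0 0' | awk '{
        # pair each column of the sample line with its field name
        split("request_processing_time backend_processing_time response_processing_time elb_status_code backend_status_code received_bytes sent_bytes", names, " ")
        for (i = 1; i <= NF; i++) print names[i], $i
    }'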

The ELB idle timeout is 60 seconds. KeepAlive is set to 'On' on the backend instances. Request latency behind the ELB is under control. I tried increasing KeepAliveTimeout, but to no avail.
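For reference, these are the Apache keep-alive directives in play; the values below are only illustrative, and the usual guidance is to keep the backend keep-alive timeout above the ELB idle timeout (60 seconds here) so the backend does not close connections that the ELB still considers open:

    # httpd.conf (illustrative values, not my exact configuration)
    KeepAlive On
    # keep this higher than the ELB idle timeout of 60 seconds
    KeepAliveTimeout 120
    MaxKeepAliveRequests 100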

Does anyone have an idea of how to proceed? I can't even pin down the cause of this problem.

PS: As more of a secondary question: there are a few cases (far fewer than the 504s returned by the ELB when the backend never even receives the request) where the backend itself returns a 504, and the ELB then passes it on to the client. As far as I know, HTTP 504 should only be returned by a proxy when its upstream fails to respond. How can the backend server itself return a 504?

2 answers

In case it helps others in the future, I am posting my findings here:

1) The 504 errors with a backend status code of 0 were mainly caused by logrotate triggering a full Apache reload instead of a graceful restart. The default AWS (Elastic Beanstalk) configuration does the following:

 /sbin/service httpd reload > /dev/null 2>/dev/null || true 

Replace that service command with either apachectl -k graceful or /sbin/service httpd graceful (a sketch of the resulting configuration is below).

File location on my EC2 instance: /etc/logrotate.elasticbeanstalk.hourly/logrotate.elasticbeanstalk.httpd.conf
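As a rough sketch of the result (the log path and surrounding directives are illustrative, not copied from the Elastic Beanstalk file), the postrotate block ends up looking like this:

    /var/log/httpd/*log {
        missingok
        notifempty
        sharedscripts
        postrotate
            # graceful restart: finish in-flight requests instead of dropping them
            /sbin/service httpd graceful > /dev/null 2>/dev/null || true
        endscript
    }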

2) The default logrotate frequency in AWS was too high (once per hour), at least for my use case, and this in turn reloaded Apache every hour, so I also reduced that.
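How you reduce it depends on how the hourly rotation is wired up. Assuming it is driven by an entry under /etc/cron.hourly (the filename below is hypothetical; check what actually exists on your instance), one option is to move it to /etc/cron.daily:

    # hypothetical filename; verify the real one under /etc/cron.hourly first
    sudo mv /etc/cron.hourly/cron.logrotate.elasticbeanstalk.httpd.conf \
            /etc/cron.daily/cron.logrotate.elasticbeanstalk.httpd.conf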


When the backend connection times out, the ELB puts -1 in the backend_processing_time column of its access log. Think about what happens when some of your requests take your backend more than 60 seconds to process. To confirm this, can you check your latency metrics? When viewing that metric, look at the Maximum statistic. If you see latency frequently hitting 60 seconds, that will confirm my hunch.
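For example, with the AWS CLI (the load balancer name and time window are placeholders):

    # Maximum Latency for a classic ELB, in 5-minute buckets
    aws cloudwatch get-metric-statistics \
        --namespace AWS/ELB \
        --metric-name Latency \
        --dimensions Name=LoadBalancerName,Value=my-elb \
        --statistics Maximum \
        --start-time 2015-01-01T00:00:00Z \
        --end-time 2015-01-02T00:00:00Z \
        --period 300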

Once that is confirmed, you may need to increase the idle timeout on your ELB and the request timeout on your backend.
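For a classic ELB, the idle timeout can be raised with something like the following (the load balancer name and the value are placeholders); raise the corresponding timeout on the backend as well, which for Apache means the Timeout and KeepAliveTimeout directives:

    # raise the ELB idle timeout to 120 seconds (placeholder value)
    aws elb modify-load-balancer-attributes \
        --load-balancer-name my-elb \
        --load-balancer-attributes "{\"ConnectionSettings\":{\"IdleTimeout\":120}}"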

