Spray servlet on Tomcat 7 vs Spray-can jar on JVM

Has anyone compared the performance of their application in the following two setups?

  • built with spray-servlet and deployed as a WAR on Tomcat 7 running on JVM 7
  • built with spray-can and deployed as a standalone jar directly on JVM 7

I would expect 2) to perform better than 1) in most cases, even if 1) uses the Servlet 3.0 features.
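
To make option 2) concrete: a standalone spray-can service is just a plain JVM process that binds its own HTTP listener, roughly like the sketch below (assuming spray-can 1.2 on Akka 2.2; the class, actor and path names are made up for illustration). For option 1), the same kind of service actor would instead sit behind spray-servlet inside the Tomcat-deployed WAR.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.io.IO
import spray.can.Http
import spray.http._
import spray.http.HttpMethods._

// Minimal request handler: replies to GET /ping with "pong"
class PingService extends Actor {
  def receive = {
    case HttpRequest(GET, Uri.Path("/ping"), _, _, _) =>
      sender ! HttpResponse(entity = "pong")
    case _: HttpRequest =>
      sender ! HttpResponse(status = StatusCodes.NotFound)
  }
}

object Boot extends App {
  implicit val system = ActorSystem("spray-can-server")
  val handler = system.actorOf(Props[PingService], "ping-service")
  // Bind the spray-can HTTP server directly; no servlet container involved
  IO(Http) ! Http.Bind(handler, interface = "0.0.0.0", port = 8080)
}
```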

The reason I ask is that my team needs to trade off performance against ease of deployment and application management (auto scaling, monitoring, etc.), since AWS Elastic Beanstalk's default Java webapp environment is Linux running Tomcat.

Any data on this would be greatly appreciated. Thanks

performance tomcat amazon-web-services deployment spray
1 answer

You should look here: http://spray.io/blog/2013-05-24-benchmarking-spray/

A few days ago the folks at techempower published round 5 of their well-received ongoing series of web framework benchmarks, the first one to include a spray-based implementation. The techempower benchmark consists of several different test scenarios exercising different parts of a web framework/stack, of which we contributed a spray-based implementation for only one: the JSON serialization test. The other parts of the benchmark target framework layers (such as database access) that spray intentionally does not provide.

Here are the published round 5 results for the JSON test, presented in an alternative visualization (but showing the same data):

[Chart: round 5 JSON test results, throughput on dedicated i7 hardware plotted against throughput on EC2 m1.large, per benchmarkee]

The benchmark was run between two identical machines connected via GB Ethernet: a client machine generating HTTP requests with wrk as the load generator, and a server machine running the respective "benchmarkee". To give an idea of how performance depends on the underlying hardware platform, all tests were run twice, once between two EC2 "m1.large" instances and once between two dedicated i7-2600K workstations.

Analysis

In the graph above we plot the results on dedicated hardware against the ones on EC2. We would expect a strong correlation between the two, with most data points gathered around a trend line. Benchmarkees that sit far off the trend line either do not scale up or down as well as the "pack", or suffer from some configuration problem on their "weak" side (for example, cpoll_cppsp and onion on the i7, or gemini/servlet and spark on EC2). In either case, some investigation into the cause of the problem might be advisable.

In addition to plotting the average requests/sec numbers reported by wrk at the end of a 30-second run, we included an alternative projection of the request counts based on the average request latencies that wrk also reports (for example, a 1 ms avg. latency across 64 connections should result in roughly 64K avg. req/s). Ideally, these projected results should roughly match the actually reported ones (barring any rounding issues).
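
That projection is simply Little's law applied to the reported averages. A tiny sketch of the arithmetic (the helper is hypothetical, not part of wrk or spray):

```scala
// With N concurrent connections and an average latency L, steady-state
// throughput is roughly N / L (Little's law: L = lambda * W).
object ThroughputProjection extends App {
  def projectedReqPerSec(connections: Int, avgLatencyMillis: Double): Double =
    connections / (avgLatencyMillis / 1000.0)

  // The example from the text: 64 connections at 1 ms avg. latency ≈ 64,000 req/s
  println(projectedReqPerSec(connections = 64, avgLatencyMillis = 1.0))
}
```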

However, as you can see in the chart, the two results differ substantially for some benchmarkees. To us this indicates that something was not quite right during the respective benchmark run. Perhaps the client machine running wrk experienced some other load that affected its ability to either generate requests or measure latencies properly. Or the benchmarkee responded with a somewhat "unorthodox" latency distribution that skewed the averages. Whatever the cause, our confidence in the validity of the avg. request count and latency figures would be higher if the two results matched more closely.

Takeaways

The special value of this benchmark lies in the sheer number of different frameworks/libraries/toolkits that the techempower team has managed to include. Round 5 presents results for a very heterogeneous group of close to 70 (!) benchmarkees written in 17 different languages. As such, it gives a good idea of the rough performance characteristics that can be expected from the various solutions. For example, would you have expected a Ruby on Rails application to be about 10-20 times slower than a good JVM-based alternative? Most people would probably have predicted a difference in performance, but its actual magnitude may come as a surprise, and it is certainly interesting not only to those currently facing a technology decision.

For us, as the authors of an HTTP stack, we look at benchmarks like this one from a slightly different angle. The main question for us is: how does our solution perform compared to the alternatives on the same platform? What can we learn from them? Where do we still have optimization potential that has apparently been left on the table? And what is the performance impact of the various architectural decisions that have to be made when writing a library like spray?

As you can see from the graph above, we can be quite satisfied with spray's performance in this particular benchmark. It outperforms all other JVM-based HTTP stacks on EC2 and, if you look at the throughput projected from the latency data, even on dedicated hardware.

This shows us that our work on optimizing spray's HTTP implementation is paying off. The version used in this benchmark is a recent spray 1.1 nightly build, which includes most (but not all) of the performance optimizations planned for the upcoming 1.0/1.1/1.2 triple release (1.0 for Akka 2.0, 1.1 for Akka 2.1 and 1.2 for Akka 2.2).

But does this benchmark prove that spray is the fastest HTTP stack on the JVM?

Unfortunately, it does not. This single benchmark exercises far too small a portion of the overall logic of the various HTTP implementations to establish a proper ranking among them. It gives an indication, but not much more.

What is missing?

Wish List

Let's take a closer look at what the "JSON serialization test" in the techempower benchmark actually exercises. The client creates between 8 and 256 long-lived concurrent TCP connections to the server and fires as many test requests across these connections as possible. Each request hits the server's NIC, bubbles up through the Linux kernel's network stack, gets picked up by the benchmarkee's IO abstraction and is handed over to its HTTP layer (where it is parsed and possibly routed) before actually being acted upon by the "application logic". In the case of this benchmark, the application merely creates a small JSON object, puts it into an HTTP response and sends it back down the stack, where it passes through all the layers again in the opposite direction.
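
For reference, here is a sketch of what such a JSON endpoint could look like with spray-routing and spray-json. This is not the actual benchmark implementation; the object names, port and payload are illustrative:

```scala
import akka.actor.ActorSystem
import spray.routing.SimpleRoutingApp
import spray.httpx.SprayJsonSupport._
import spray.json.DefaultJsonProtocol

// A small single-field object, serialized to JSON for every request
case class Message(message: String)
object MessageJsonProtocol extends DefaultJsonProtocol {
  implicit val messageFormat = jsonFormat1(Message)
}

object JsonBenchApp extends App with SimpleRoutingApp {
  import MessageJsonProtocol._
  implicit val system = ActorSystem("json-bench")

  // GET /json builds the object, marshals it to JSON and completes the response
  startServer(interface = "0.0.0.0", port = 8080) {
    path("json") {
      get {
        complete(Message("Hello, World!"))
      }
    }
  }
}
```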

As such, this benchmark measures how well a benchmarkee:

  • talks to the kernel to move raw request and response data across the network sockets
  • manages the internal communication between its layers (e.g. IO ↔ HTTP ↔ Application)
  • parses HTTP requests and renders HTTP responses
  • serializes small JSON objects

It does all this with small requests carrying a fixed set of HTTP headers, over a fairly small number of long-lived connections. Moreover, it does all of this at once, without giving us any clues about the potential strengths and weaknesses of the individual parts of the stack.

If we wanted to learn in more depth how spray performs compared to its JVM-based competitors, and where its strengths and weaknesses lie, we would need to set up a number of benchmarks that measure, among other things:

  • raw I/O performance: from 1 to, say, 50K long-lived concurrent connections, minimal request and response sizes
  • connection overhead: varying number of requests per connection, minimal request and response sizes
  • HTTP request parser performance: varying number of request headers and header value sizes, varying entity sizes
  • HTTP response rendering performance: varying number of response headers and header value sizes, varying entity sizes
  • HTTP chunking performance: chunked requests and responses with varying number and size of message chunks (see the sketch after this list)
  • HTTP pipelining performance: varying request batch sizes
  • SSL performance: from 1 to, say, 50K long-lived connections, minimal request and response sizes
  • WebSocket performance
  • system-level and JVM-level metrics (CPU utilization, GC activity, etc.)
If we had a benchmark suite producing numbers like these, we would feel much more comfortable establishing a proper ranking of spray and its alternatives. And wouldn't it be great if there were some kind of "continuous benchmarking" infrastructure that automatically produced these benchmark results upon a simple git push to the repository?

Oh well... I guess our ever-growing to-do list just received another item flagged as important... :)
