Stress testing your FeathersJS application like in production

Luc Claustres
The Feathers Flightpath
6 min read · Mar 15, 2018


When it comes time to simulate a real workload on your application, you need to go beyond purely technical benchmarks and stay close to your application logic.

Introduction

You will find extensive literature and tools around benchmarking (e.g. autocannon, siege, JMeter, Artillery, etc.), and more specifically this great article focused on FeathersJS protocols. However, the following limitations are often raised with such tools:

  • You need to work at the websocket or socket library level (e.g. socket.io), while FeathersJS already provides a client layer that hides this complexity and allows you to change the protocol implementation on demand.
  • These are low-level tools, not really designed to create dynamic business scenarios, as they are often based on static values stored in configuration files. In a real application, requests at time t+1 heavily depend on the results of requests at time t.
  • Some are oriented toward Quality Assurance (QA) end users, with no need for coding but usually a steep learning curve. While probably well-suited for people focused on testing only, they become really over-complicated when testing is done by developers during short periods of time.
  • Few are able to cover both HTTP and websockets in a satisfactory way.

As a consequence, the main purpose of this article is to show how we simulate realistic load on our complex FeathersJS applications at Kalisio, taking advantage of its isomorphic client API. You will be able to judge whether it is worth the 150 or so lines of code.

Concepts

We will work with the concept of virtual clients that use the application in phases. A ramp-up phase defines how many new virtual clients will be generated and connected to your application in a given time period. Indeed, it is neither advisable nor realistic to have all clients arrive simultaneously during the initialization phase; a sudden load may clog the server resources and lead to internal failures. Similarly, a ramp-down phase defines how many virtual clients will be disconnected from your application in a given time period. During the steady phase, the total number of typical concurrent users during peak business hours (e.g. from 11:00 to 13:00) is engaged. Below is a pictorial representation of this workload model.

© http://qainsights.com/how-to-design-workload-model-for-load-testing/

Each client picks and runs some of the pre-defined scenarios, i.e. sequences of requests/messages describing actions performed by a user that exercise a particular part of the application or simulate a common flow through it. However, we need the probability of a scenario being picked by a new virtual client to be weighted relative to the other scenarios. Indeed, some features of an application are always used more often than others (typically consultation vs edition, but it really depends on your application). Weights allow you to specify that some scenarios should be picked more often than others:

© http://codetheory.in/weighted-biased-random-number-generation-with-javascript-based-on-probability/

Implementation

We will not detail here the server-side setup, for this you can read this article or look for useful information in the great FeathersJS docs.

Virtual clients

We create our clients as usual with the FeathersJS isomorphic API taking into account API prefixing and authentication:
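A minimal sketch of what such a client setup might look like, assuming a Socket.io transport; the timeout value and the `data` structure holding per-client state are illustrative, not the exact Kalisio code:

```javascript
const feathers = require('@feathersjs/client')
const io = require('socket.io-client')

function createClient (url) {
  const socket = io(url, { transports: ['websocket'] })
  const client = feathers()
  // Configure the transport and authentication just like a real browser client
  client.configure(feathers.socketio(socket, { timeout: 10000 }))
  client.configure(feathers.authentication())
  // Per-client state used by the scenarios: authenticated user, measured durations
  client.data = { durations: {} }
  return client
}
```

Because the FeathersJS client API is isomorphic, swapping the Socket.io transport for REST would only change the `configure` calls, not the scenarios.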

To make things more realistic we authenticate the client using either

  • a standard password-based login (i.e. local authentication strategy) to simulate users without a valid token
  • an access token (i.e. jwt authentication strategy) to simulate users with a valid token

Indeed, in a typical application tokens are valid for e.g. one day, so that most users reuse their existing token to authenticate. This behavior is driven by a jwt ratio provided to the workload test, which gives the percentage of clients that will authenticate using an access token (e.g. 0.9 means 90% token-based and 10% password-based logins).
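The strategy choice driven by the jwt ratio can be sketched as a small pure function; the credential field names here are illustrative:

```javascript
// Pick the authentication payload for a virtual client.
// jwtRatio is the fraction of clients reusing a valid access token,
// e.g. 0.9 means 90% JWT logins and 10% password-based logins.
function buildCredentials (user, jwtRatio, random = Math.random()) {
  if (random < jwtRatio) {
    // Simulate a user coming back with a still-valid token
    return { strategy: 'jwt', accessToken: user.accessToken }
  }
  // Simulate a user whose token has expired and who logs in again
  return { strategy: 'local', email: user.email, password: user.password }
}
```

The client then simply calls something like `client.authenticate(buildCredentials(user, jwtRatio))`.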

As you can see, we also use the high-resolution timing feature available in Node.js to get accurate results. Each operation you want to measure is simply stored as a key in the durations structure of the client data. If you'd like to store a result for a subsequent operation, simply put it into the client data, as we do with the user in the authentication phase. As a logger we use winston, but you can use anything else (please just consider not using console.log ;-).
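The timing part can be sketched like this, assuming the `durations` map on the client data described above; the helper names are illustrative:

```javascript
// Start a measurement using Node's high-resolution timer
function startTimer () {
  return process.hrtime.bigint()
}

// Store the elapsed time in milliseconds under the given key
function recordDuration (client, key, start) {
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6
  client.data.durations[key] = elapsedMs
  return elapsedMs
}
```

A scenario would typically wrap each service call: `const start = startTimer(); await service.find(query); recordDuration(client, 'find-events', start)`.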

Workload

To generate our workload we need a tool to fork our virtual clients in a controlled way. We found node-worker-farm to be the module achieving this in the simplest way: a single line of code! However, we need to make some computations to find the best execution options, because this module controls the number of workers to be used and the number of concurrent calls per worker, while we want a global number of concurrent clients. We actually create a worker per available CPU and derive the best number of clients to be handled by a single worker.

To best match your input the number of concurrent clients should be a multiple of the number of CPUs

When forking each client, node-worker-farm passes arguments to the child process; we use them e.g. for the target application URL, the client ID and the number of scenarios to be executed. The child process is simply a node module that will be loaded (in this case our virtual client code). When it has finished its work, a callback is executed where we process all measured durations and update our client count to check whether the test has to be stopped, in which case we display the final measurements:
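The final processing in that callback can be sketched as a simple aggregation over the durations returned by each client; the field names are illustrative:

```javascript
// Aggregate the per-client durations into per-operation statistics.
// results: array of { durations: { key: ms } } objects, one per finished client.
function aggregateDurations (results) {
  const stats = {}
  for (const result of results) {
    for (const [key, ms] of Object.entries(result.durations)) {
      if (!stats[key]) stats[key] = { count: 0, total: 0, min: Infinity, max: -Infinity }
      const s = stats[key]
      s.count++
      s.total += ms
      s.min = Math.min(s.min, ms)
      s.max = Math.max(s.max, ms)
    }
  }
  // Derive the mean once all clients have been accounted for
  for (const s of Object.values(stats)) s.mean = s.total / s.count
  return stats
}
```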

So in the child process we simply create our virtual client, authenticate it, make it run randomly chosen scenarios, then disconnect it. There are also some tricks to manage the ramp-up/down phases:
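One such trick can be sketched as follows: each client waits for its own slice of the ramp-up period before connecting, so that logins are spread evenly over time (the same idea applies to the ramp-down phase). Here `clientId` is assumed to run from 0 to nbClients - 1:

```javascript
// Delay (in ms) before a given client connects, spreading connections
// evenly over the ramp-up period.
function startDelay (clientId, nbClients, rampUpSeconds) {
  // With 1000 clients and a 1000 s ramp-up this yields one login per second
  return (clientId / nbClients) * rampUpSeconds * 1000
}

// Promisified timeout used to pause the client before it authenticates
const delay = ms => new Promise(resolve => setTimeout(resolve, ms))
```

The child process then simply does `await delay(startDelay(id, nbClients, rampUp))` before authenticating.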

Scenarios

A scenario is as simple as a node module exporting a function performing the required operations on the client services. For example, in one of our applications the users land on a home page displaying all current events of the organisations they belong to:
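A hypothetical version of such a scenario module could look like this; the service and field names are illustrative, not the actual Kalisio API:

```javascript
// Scenario: a user lands on the home page and fetches the current
// events of the organisations he belongs to.
async function homePageScenario (client) {
  const user = client.data.user
  const start = process.hrtime.bigint()
  // Query the events service for the user's organisations
  const events = await client.service('events').find({
    query: { organisation: { $in: user.organisations } }
  })
  // Record the measured duration under a dedicated key
  client.data.durations['home-page'] = Number(process.hrtime.bigint() - start) / 1e6
  return events
}

module.exports = homePageScenario
```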

Once you have created all your scenario files in a dedicated scenarios folder, the following function is used by the child process to select one randomly according to its probability of occurrence:

Finishing touch

To make the benchmark easy to run with different parameters as a CLI, we have integrated it with commander; it's as simple as:
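A sketch of what that CLI wiring might look like; the option names and defaults are illustrative, not the exact Kalisio benchmark options:

```javascript
const program = require('commander')

program
  .option('-u, --url <url>', 'Target application URL', 'http://localhost:8081')
  .option('-c, --clients <n>', 'Number of concurrent clients', parseInt, 100)
  .option('-j, --jwt-ratio <ratio>', 'Fraction of token-based logins', parseFloat, 0.9)
  .option('-r, --ramp-up <seconds>', 'Ramp-up period in seconds', parseInt, 60)
  .option('-s, --scenarios <n>', 'Number of scenarios run by each client', parseInt, 1)
  .parse(process.argv)
```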

Example

To run a realistic workload test on our staging infrastructure with 1000 concurrent connections, we use a ratio of 90% token-based connections and a ramp-up phase of 1000 seconds, so that we have one login per second on average. Here is the result with two replicas (using feathers-sync):

As you can see we have a low CPU usage, which is reflected by the test output reporting no errors. Here is the result with 2000 concurrent connections:

As you can see, we start reaching the maximum of what the infrastructure can support, which is reflected by the test output reporting that around 3% of the clients encounter timeouts.

If you’d like to see a more complete work-in-progress skeleton, have a look at our application template at https://github.com/kalisio/kApp/tree/master/benchmark. And if this article was helpful, or if you foresee any improvement, please let us know!

If you liked this article, feel free to have a look at our Open Source solutions and enjoy our other articles on Feathers, from the Kalisio team!

