High performance mocking for load testing
Today I wanted to share one of the simplest solutions to one of the most complex parts of software performance engineering: mocking.
What is mocking?
The concept of mocking is simple. Let’s assume we have a system called “Foo”. System Foo depends on system “Bar”.
Mocking is a concept of simulating the behavior of a real system in a controlled way. We can achieve mocking by building a simple imitated dummy system that behaves like the real thing.
So let’s say we want to test system Foo in isolation without being dependent on system Bar. In this case we’d build a mock system that simulates system Bar.
Mocking Scenarios
There are a few scenarios under which a team would require a mock.
- If system Foo is under development and system Bar is not available yet, the developer of system Foo might choose to build a mock to simulate system Bar’s existence.
- Let’s say there is a high cost to using system Bar: the company selling system Bar charges per API call. Developers are rapidly developing system Foo and have hundreds of automated tests that may call system Bar, which could incur high costs. In such cases it may be wise to use a mock.
- Performance load testing. When testing how system Foo behaves under load, we want to ensure we are not affected by system Bar. We also want to ensure we are not testing performance of system Bar. In this case we may introduce a mock server for system Bar.
Common mistakes
In my experience, mocking is one of the simplest parts of performance engineering when done right, yet it’s often the first place teams stumble when trying to understand the performance of their system.
- Teams assume a simple, hello-world-style minimalist mock can handle load. 99% of the time, these types of mocks fail.
- Similarly, teams assume a default runtime is capable of handling load. Web server runtimes are not always tuned for best performance; they favor security and stability and generally ship with balanced trade-offs. Performance is always a trade-off. When building a load-test mock, you would rather sacrifice security, stability, memory or CPU usage in favor of high throughput and low latency. As long as the mock server responds well under load and can handle more than the system we are testing, we’ll be fine (unless you are testing a scenario in which a dependent external system is misbehaving). Therefore, when using runtimes like Java, .NET or NodeJS, you have to make sure you configure them to accept many inbound or outbound connections. These runtimes have configuration options for high connection counts and timeout values.
- Teams do not test their mock servers to ensure they can handle enough load for the test; they assume the mock is perfect because the implementation seems simple at a glance.
The Solution
With all of the above said and out of the way, let’s build a simple mock server capable of reaching 30,000+ requests/sec.
You will need 4 things:
- Docker (For Windows, MacOS or Linux)
- My Dockerfile
- My Golang file
- a JSON response (You can change this to XML or anything really)
All the files are down below in a GitHub gist.
Let’s start with a simple Dockerfile.
The Docker File
FROM golang:1.11.10-alpine3.9 as builder

# installing git
RUN apk update && apk upgrade && \
    apk add --no-cache git

# setting working directory
WORKDIR /go/src/app

# installing dependencies
RUN go get github.com/sirupsen/logrus
RUN go get github.com/buaazp/fasthttprouter
RUN go get github.com/valyala/fasthttp

COPY / /go/src/app/
RUN go build -o mock
The above simply builds our code. We use a tiny Go SDK image, install git, and run go get to install our dependencies. We use the COPY directive to place the source code for the mock into the image, and run go build to produce a small, statically compiled binary called “mock”. That tiny binary is our mock server.
FROM alpine:3.9
WORKDIR /go/src/app
COPY --from=builder /go/src/app/mock /go/src/app/mock
COPY *.json /app/mocks/
EXPOSE 80
CMD ["./mock"]
The rest of the image starts from a small Alpine OS and sets the working directory where the mock server will be placed. We use the popular Docker multi-stage COPY --from=builder command to copy the mock server binary into the final image. The multi-stage build keeps the image small, since we don’t need the Go SDK in the final image. After that we COPY the JSON files into our mocks folder. Note that you could change this to XML or any type of response you want to mock out. Then we expose port 80 for traffic and start our application with the CMD directive.
Now that we have a Dockerfile, we cannot build it just yet. We still need the code and the response.json.
The Code
You may ask: why Go? There are a couple of reasons why I used Go for my mocking solutions.
- It’s extremely simple: easy to read, learn and modify.
- It has low overhead and performs extremely well.
- I have used proxies like NGINX before for this same purpose, but adding logic like a sleep (to simulate latency) is far simpler in Go.
Let’s break down the code (The full code is in the gist below which you can copy)
Let’s bring in all our dependencies.
package main

import (
	"fmt"       // to print messages to stdout
	"os"        // used to exit the app if an error occurs
	"time"      // used for the sleep function to simulate latency
	"io/ioutil" // used to read the mock file from disk
	"log"       // logging :)

	// our web server that will host the mock
	"github.com/buaazp/fasthttprouter"
	"github.com/valyala/fasthttp"
)
I’ve added comments above describing what each import line is used for
var searchMock []byte

func main() {
	m, e := ioutil.ReadFile("/app/mocks/response.json")
	if e != nil {
		fmt.Printf("Error reading mock file: %v\n", e)
		os.Exit(1)
	}
	searchMock = m
	fmt.Println("starting...")
	router := fasthttprouter.New()
	router.POST("/", Mock)
	log.Fatal(fasthttp.ListenAndServe(":80", router.Handler))
}
Above we have our main function, which starts by reading the mock file from disk. If there is a problem reading the mock, the application exits. We assign the content of the mock, m, to a global variable, searchMock. Our web router will then host the mock on route / and serve POST traffic. This can be tweaked for GET and different paths.
The router will call our Mock() handler, which just returns the in-memory mock bytes as an HTTP response:
func Mock(ctx *fasthttp.RequestCtx) {
	ctx.Response.Header.Set("Content-Type", "application/json")
	time.Sleep(700 * time.Millisecond)
	ctx.Write(searchMock)
}
We use time.Sleep() to simulate the latency of a real service. This is very important during a load test. If you return a mock response instantly, you are testing a very different scenario to the real world, where high latency causes many open TCP socket connections. Most systems struggle under a high volume of traffic because of the sheer number of connections held open for an extended period of time. Keep-Alive is another aspect you may want to disable in this mock server if the system you are trying to mock has it disabled; behavior is very different when Keep-Alive is not used. In our mock, Keep-Alive is on by default.
Now to build and run our mock, we do:
docker build . -t mymock
docker run -d --rm -p 80:80 mymock
Let’s run a load test
To run a load test, I like to use wrk, a modern HTTP benchmarking tool. I have a container for that!
Let’s create a quick post.lua file that describes what wrk should do:
wrk.method = "POST"
wrk.body = '{\"Hello\": \"World\"}'
wrk.headers["Content-Type"] = "application/json"
Above we tell wrk to make a POST request and give it some sample JSON to use. Now we can run our wrk container and mount in the post.lua file:
docker run -it --rm -v $PWD:/work -w /work --net host aimvector/wrk -c 3000 -t 10 -s post.lua -d 30 http://localhost
In this test we run with -c 3000 connections and -t 10 threads for a duration of -d 30 seconds.
With that we are serving about 3,900 requests/sec with a forced response time of 700ms. We can see that we respond in roughly 13ms + 700ms of forced delay:
Running 30s test @ http://localhost
10 threads and 3000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 713.63ms 38.33ms 1.76s 94.19%
Req/Sec 432.22 374.86 2.56k 72.21%
117685 requests in 30.09s, 79.69MB read
Requests/sec: 3911.28
Transfer/sec: 2.65MB
You can bump up the heat a bit by adding more threads and connections:
Running 30s test @ http://localhost
100 threads and 8000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 764.18ms 67.33ms 1.80s 86.67%
Req/Sec 120.21 117.02 797.00 84.13%
310697 requests in 30.10s, 210.38MB read
Requests/sec: 10321.88
Transfer/sec: 6.99MB
Here we are responding at 64ms + 700ms delay.
Remember that the placement of the mock server and the load-test tool is key as well when running increased load. I usually separate them onto their own machines.
I was able to push this on a basic Azure machine to 15,000 req/sec with a 700ms sleep latency and 35,000 req/sec with no added latency.
Check out the video too: https://youtu.be/PIxD6f_pGQM
The gist
Anyways! I hope this was helpful. Come say hi on YouTube and Twitter
Peace!