Introduction to Load Testing
What is Load Testing?
Load testing is the technique of simulating many users on a website or web application, to measure its performance under heavy load.
This type of software performance testing is concerned with scalability, which is the ability of the system to maintain fast response times with lots of concurrent users or many simultaneous requests.
Conceptually, load testing web applications is similar to load testing in the physical world. Bridges, roofs, and other physical structures can be load tested by weighing them down with heavy objects to make sure they meet their load bearing requirements. Of course, a site or web application crashing isn’t as dangerous as a bridge falling down, but load testing is nonetheless important if you want to ensure scalability and reliability of your software systems.
A website under minimal load, when just a few users are using it and a trickle of requests is coming in, often performs fine and responds quickly. But the same site under a huge flood of traffic responds slowly or might not respond at all.
Load testing (including stress testing, spike testing, and stability testing) is the best way to measure your site’s scalability and determine how it will perform under heavy load.
Why is Load Testing Important?
The reason load testing is so important is that predicting how a software system will behave under heavy load is almost impossible without actually observing it under load.
We’ve all seen websites crash or slow down under heavy traffic. This famously happens around shopping frenzies like Black Friday or entertainment frenzies like Taylor Swift concerts, but it can occur anytime your site receives extra traffic for any reason, like a major feature launch or marketing campaign, or for no apparent reason at all.
Site crashes are harmful and costly. Downtime results in lost business and wasted exposure, and often brings bad publicity that damages your brand when frustrated users give up and look elsewhere.
Even if you run a web application with a captive audience, instability and sluggish performance are annoying and unproductive. Nobody likes them.
Just because your site performs fine with current levels of traffic doesn’t mean it will perform fine with higher levels of traffic in the future.
Website performance under load is unpredictable
Every site or web application has its breaking point, beyond which it can’t handle any additional throughput. When that breaking point is reached, the site’s performance degrades in a very non-linear fashion.
What do we mean by non-linear?
Consider the following table of average response times at different levels of load from an example site. The first column is the number of concurrent users active on the site, and the second column is the average page load time those users are experiencing.
| Concurrent Users | Average Response Time |
|---|---|
| 100 | 0.251s |
| 200 | 0.253s |
| 300 | 0.254s |
| 400 | 1.267s |
| 500 | 10.35s |
Notice how the average response time barely increases as the load increases from 100 to 200 to 300 concurrent users. It jumps roughly fivefold at 400 concurrent users, and then roughly eightfold again at 500 concurrent users, to the point of being almost unusable!
If you were to graph these data points you’d see a classic “hockey stick” graph, which is decidedly non-linear.
This “hockey stick” pattern, in which load increases gradually without much performance degradation and then performance gets dramatically worse once a breaking point is reached, is typical of websites and web applications.
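To make the non-linearity concrete, here is a small Python sketch that takes the numbers from the table above and computes how much slower each load level is than the previous one. The function names and the 2x threshold used to flag the "knee" are assumptions chosen for illustration, not a standard definition of a breaking point.

```python
# Data from the example table: (concurrent users, average response time in seconds).
measurements = [
    (100, 0.251),
    (200, 0.253),
    (300, 0.254),
    (400, 1.267),
    (500, 10.35),
]

def slowdown_factors(points):
    """Return (users, factor) pairs, where factor is the response time at
    that load level divided by the response time at the previous level."""
    factors = []
    for (_, prev_time), (users, time) in zip(points, points[1:]):
        factors.append((users, time / prev_time))
    return factors

def first_knee(points, threshold=2.0):
    """Return the first load level whose response time is at least
    `threshold` times the previous level's, or None if there is none.
    The default threshold of 2.0 is an arbitrary illustrative choice."""
    for users, factor in slowdown_factors(points):
        if factor >= threshold:
            return users
    return None

for users, factor in slowdown_factors(measurements):
    print(f"{users} users: {factor:.2f}x the previous level")
print("first knee at:", first_knee(measurements), "users")
```

A linear system would show roughly constant factors from one row to the next. Here the factors stay close to 1 and then jump, which is the hockey stick.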
Scalability bottlenecks cause non-linear performance
Response times degrade when the amount of load saturates a “bottleneck” in the system. Like trying to force too much liquid through the narrow neck of a bottle, a software system’s throughput is limited by its slowest component.
Once a server’s concurrency bottleneck has been saturated, additional incoming requests have nowhere to go and get piled up. The result is that the clients have to wait longer for a response, since their requests are piled up in the network stack or some other part of the system.
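As a rough illustration of saturation, the following Python sketch models a deliberately idealized server. It assumes a fixed number of worker slots, a constant service time per request, and users who send their next request the moment the previous one finishes. All of the names and numbers are hypothetical, and real systems often degrade more sharply than this model does because of timeouts, retries, and contention.

```python
def estimated_response_time(concurrent_users, worker_slots, service_time):
    """Estimate response time in seconds for an idealized closed-loop system.

    The server handles at most `worker_slots` requests at once, and each
    request takes `service_time` seconds of actual work.
    """
    if concurrent_users <= worker_slots:
        # Below saturation, every request gets a worker immediately.
        return service_time
    # Above saturation, throughput is capped at worker_slots / service_time.
    # By Little's Law (users = throughput * response time), the extra
    # requests wait in a queue and the response time grows with user count.
    max_throughput = worker_slots / service_time
    return concurrent_users / max_throughput

# Hypothetical server: 300 worker slots, 0.25 seconds of work per request.
for users in (100, 300, 600, 900):
    print(users, "users ->", estimated_response_time(users, 300, 0.25), "s")
```

Below 300 users the response time stays flat, because no request ever waits. Past that point every additional user only adds waiting time.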
Not every website or web application will have the same bottlenecks. A site’s network, infrastructure, and software stack have many moving parts. For one site the primary bottleneck might be the web server’s connection pool, while for another the bottleneck might be the database. Web applications tend to be even more complicated and have more potential bottlenecks than simple websites, since now you have your own application code to worry about too!
Guessing the performance bottleneck without testing is usually a fool’s errand. It’s safer to run a series of load tests to find the breaking point and locate the performance bottlenecks in your system.
Types of Load Testing
Up until now we’ve been using the term “load testing” to refer to the technique of generating load on a system to measure its performance and scalability, but the technique can be applied in different ways and for different reasons.
Here are a few subtypes within the “load testing” category, with a brief summary of why you might want to do each one.
Acceptance Load Testing
A common goal of many load testing efforts is to answer some form of the question: Will my site deliver acceptable performance under peak load?
To answer that clearly, you’ll first need specific requirements defining:
- The step-by-step user behavior or behaviors to be tested.
- The worst-case acceptable performance threshold for an individual user.
- The peak number of concurrent users or transactions per second.
There are different ways you could state your specific requirements, but it’s important to address all three dimensions: the user behavior to be tested, acceptable performance thresholds, and peak load targets.
If you’re working with colleagues or stakeholders, the process of gathering these requirements might not sound fun, but it’s an opportunity to align everyone’s expectations around the testing effort.
Having requirements that describe user behavior, acceptable performance, and peak load makes it easy for everyone to agree whether a load test passed or failed. That way, everyone on your team arrives at the same conclusion about whether the site is ready for heavy traffic.
When it comes to acceptance load testing, having requirements that can clearly be agreed upon as passing or failing is crucial.
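One way to make the three dimensions unambiguous is to write them down as data and derive the pass/fail verdict mechanically. The Python sketch below is only an illustration: the class names, field names, and example values are all hypothetical, and your own requirements may be stated differently (for example, in transactions per second instead of concurrent users).

```python
from dataclasses import dataclass

@dataclass
class AcceptanceRequirement:
    behavior: str               # the user behavior being tested
    max_response_time: float    # worst acceptable response time, in seconds
    peak_concurrent_users: int  # peak load the site must sustain

@dataclass
class LoadTestResult:
    behavior: str
    concurrent_users: int       # load the test actually reached
    worst_response_time: float  # slowest response observed, in seconds

def passed(requirement, result):
    """A test passes only if it exercised the required behavior, reached
    the required peak load, and stayed within the response time threshold."""
    return (
        result.behavior == requirement.behavior
        and result.concurrent_users >= requirement.peak_concurrent_users
        and result.worst_response_time <= requirement.max_response_time
    )

# Hypothetical requirement and result, for illustration only.
checkout = AcceptanceRequirement("browse and check out", 2.0, 500)
run = LoadTestResult("browse and check out", 500, 1.4)
print("passed" if passed(checkout, run) else "failed")
```

Because the verdict comes from the agreed numbers and not from anyone's impression of the graphs, there is less room for disagreement afterwards.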
Exploratory Load Testing
Load testing techniques can be useful even if you don’t yet have specific performance or scalability requirements.
In the early stages of a project, you may want to run some exploratory load tests just to see what performance and scalability bottlenecks you encounter.
Creating some realistic test scripts and gradually ramping up the load (with more and more concurrent bots to simulate increasing numbers of users) will reveal performance problems and scalability bottlenecks that you can resolve through tuning or code changes.
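A gradual ramp is usually described as a series of steps. The following Python sketch generates such a step schedule. The function and its parameters are hypothetical and do not correspond to any particular tool's configuration format.

```python
def ramp_schedule(start_bots, peak_bots, step_bots, step_seconds):
    """Return a list of (start_time_in_seconds, bot_count) steps that ramp
    from `start_bots` up to `peak_bots`, adding `step_bots` every
    `step_seconds` and never exceeding the peak."""
    if step_bots <= 0 or step_seconds <= 0:
        raise ValueError("step_bots and step_seconds must be positive")
    if start_bots > peak_bots:
        raise ValueError("start_bots must not exceed peak_bots")
    schedule = []
    bots = start_bots
    elapsed = 0
    while bots < peak_bots:
        schedule.append((elapsed, bots))
        bots += step_bots
        elapsed += step_seconds
    # The final step always lands exactly on the peak.
    schedule.append((elapsed, peak_bots))
    return schedule

# Ramp from 100 to 500 bots, adding 100 bots every 60 seconds.
for start_time, bots in ramp_schedule(100, 500, 100, 60):
    print(f"t={start_time}s: {bots} bots")
```

Holding each step for a while before adding more bots makes it easier to see which load level first caused response times or error rates to change.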
Tuning a site for better performance should be a methodical process. Without a repeatable load test to measure each change’s impact, tuning for scalability is risky and haphazard.
Running an identical load test between each change to your site and environment tells you whether the change made things better, made things worse, or had no effect.
Stress Testing
The term “stress testing” is often used interchangeably with load testing, but there’s a slight distinction.
Stress testing is less concerned with determining the breaking point or validating requirements, and more concerned with what happens when the site is pushed beyond the point of failure.
Some web applications recover cleanly on their own once the excess traffic subsides. Others require a restart. In some cases, heavy traffic can leave a web application in a broken state, with stuck database transactions or exhausted resource pools. Even worse, excessive load can sometimes leave half-baked, damaged, or incomplete data in a database, resulting in permanent damage to your customer data!
Stress testing a web application with excessive amounts of load can tell you if it breaks down cleanly when pushed to the limit, or suffers bad side effects like corrupted data or other permanent damage.
Stability Testing (or Soak Testing)
Web applications sometimes run smoothly for a while and then bog down as a resource is gradually exhausted. Eventually, when the resource is fully exhausted, the application crashes.
A stability test simulates a prolonged period of moderate-to-high load, to detect possible memory leaks and resource exhaustion.
If you run an extended load test and see performance get gradually or suddenly worse after a while, it’s an indication that some resource is being leaked or exhausted.
It might be a hard limit like physical memory or process heap memory, or a soft limit like an internal cache filling up or database connections leaking.
A stability load test makes it easier to catch resource leaks and fix them before they cause problems in production.
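One simple way to spot gradual degradation in a long test is to fit a straight line through the average response times and look at its slope. The Python sketch below does this with an ordinary least-squares fit. The function names, the sample data, and the 0.01 seconds-per-hour threshold are assumptions for illustration. A sudden collapse late in the test would need a different check.

```python
def trend_per_hour(samples):
    """Least-squares slope of response time over elapsed time.

    `samples` is a list of (elapsed_hours, avg_response_time_seconds).
    The result is in seconds of added response time per hour of testing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    return sxy / sxx

def looks_like_leak(samples, max_slope=0.01):
    """Flag runs whose response time climbs faster than `max_slope`
    seconds per hour. The default is an arbitrary illustrative value."""
    return trend_per_hour(samples) > max_slope

# Hypothetical hourly averages from two soak tests.
stable = [(0, 0.25), (1, 0.26), (2, 0.25), (3, 0.25)]
leaking = [(0, 0.25), (1, 0.35), (2, 0.45), (3, 0.55)]
print(looks_like_leak(stable), looks_like_leak(leaking))
```

A rising slope only tells you that something is degrading. Finding out which resource is leaking still requires looking at server-side metrics such as memory or connection counts.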
Spike Testing
Spike testing helps you determine how your site responds to a sudden spike in traffic: will it handle the spike gracefully and recover quickly afterwards, or will it break down and remain broken?
It’s fairly common for web applications to suffer bad performance for a while after a large spike in traffic, sometimes requiring a restart. This might happen because an internal resource is exhausted, because old requests are queued up and causing backpressure, or because your autoscaling configuration starts to snowball.
Since the aftermath of a traffic spike varies so much from one application to another, running a spike test that simulates a short but severe traffic spike is a good way to find out how yours behaves.
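A spike test has three phases: a baseline, a short burst, and a recovery period at the baseline load again. The Python sketch below describes that load profile and adds a simple recovery check. Everything here is hypothetical, including the function names, the example durations, and the 20% tolerance.

```python
def spike_profile(baseline_bots, spike_bots, warmup_s, spike_s, recovery_s):
    """Return (start_time_in_seconds, bot_count) steps for a spike test:
    baseline load, a short spike, baseline again, then the end of the test."""
    spike_start = warmup_s
    spike_end = spike_start + spike_s
    test_end = spike_end + recovery_s
    return [
        (0, baseline_bots),
        (spike_start, spike_bots),
        (spike_end, baseline_bots),
        (test_end, 0),
    ]

def recovered(time_before_spike, time_after_spike, tolerance=1.2):
    """True if the response time after the spike is within `tolerance`
    times the response time measured before the spike."""
    return time_after_spike <= time_before_spike * tolerance

# 50 bots for 5 minutes, 1000 bots for 1 minute, then 10 minutes of recovery.
print(spike_profile(50, 1000, 300, 60, 600))
print(recovered(0.25, 0.27), recovered(0.25, 2.0))
```

Comparing the same baseline load before and after the spike is what separates a site that recovers from one that stays degraded.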
Continuous Performance Testing
If your team frequently releases changes to your web application, it’s good to measure the performance and scalability impact of each revision.
By running the same load test against each build or after each deployment, you’ll be able to compare high-level metrics to see if performance has improved or degraded. If the test reveals a change in average response times, nth-percentile response times, or error rate, you can investigate and take action accordingly.
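A comparison like this can be automated so that a build is flagged when a metric moves beyond some tolerance. The Python sketch below compares two summaries of the same load test. The metric names, the example numbers, and the 10% tolerance are assumptions for illustration. Some run-to-run noise is normal, so the tolerance should reflect how much your results vary between identical runs.

```python
METRICS = ("avg_response_time", "p95_response_time", "error_rate")

def compare_runs(baseline, candidate, tolerance=0.10):
    """Compare two result summaries metric by metric.

    For every metric, lower is better. A metric is "degraded" if it rose by
    more than `tolerance` (as a fraction of the baseline), "improved" if it
    fell by more than that, and "unchanged" otherwise.
    """
    verdicts = {}
    for metric in METRICS:
        before = baseline[metric]
        after = candidate[metric]
        if after > before * (1 + tolerance):
            verdicts[metric] = "degraded"
        elif after < before * (1 - tolerance):
            verdicts[metric] = "improved"
        else:
            verdicts[metric] = "unchanged"
    return verdicts

# Hypothetical summaries from the previous build and the new build.
previous = {"avg_response_time": 0.25, "p95_response_time": 0.60, "error_rate": 0.0}
current = {"avg_response_time": 0.26, "p95_response_time": 0.90, "error_rate": 0.0}
print(compare_runs(previous, current))
```

In this example the average barely moves while the 95th percentile degrades, which is why it helps to track more than one metric.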
Many load testing tools, including ours, can be wired into a continuous integration or deployment pipeline for continuous load testing.
Load Testing Tools
Hiring thousands of human testers to use your site simultaneously to see how it performs under load would be a herculean task and prohibitively expensive.
That’s why load testing requires tools to simulate the users and generate traffic as though real users were actually using it concurrently. If the tool does its job right, your website won’t know the difference between traffic coming from the tool and traffic coming from real users.
Virtual users and test scripts
Most load testing tools have a concept of a virtual user or v-user, which simulates a real human user. In Loadster we call them bots but other tools have other names for them.
To load test a website or web application, you’ll typically need to create some sort of script to tell each bot (virtual user) what to do. Your load testing tool will make each bot run this script to generate traffic on your site.
The specifics of creating a script vary from one tool to another, but in general your scripts should represent real user behavior as closely as possible, so your site doesn’t know the difference. A load test is only as accurate as the user behavior being tested!
Protocol and real browser load testing
Different load testing tools have different ways of generating load.
Some tools are fairly basic and operate only at the HTTP/S protocol layer, or some other wire protocol. Such tools require that you string together individual HTTP requests, which works fine for testing HTTP APIs and simple websites, but can be challenging and time consuming for testing modern websites and web applications.
Other tools (like ours) automate real headless web browsers, running a separate browser for each bot or simulated user. Controlling real browsers in your test script is often easier than testing at the protocol layer, and the results of the load test are often more realistic. Real browser load test scripts include steps like navigating to a URL, clicking on a button, typing text into a form field, etc.
Specific details on every tool are outside the scope of this introduction, but your choice of load testing tool will naturally depend on what kind of system and user behavior you plan to load test, as well as practical constraints like your time, familiarity, and budget.
Cloud and on-premises load testing
Some load testing tools generate the load from a single process running locally on your system (“on-premises”), while others launch distributed cloud engines in your choice of cloud regions and control them to generate the load.
If you need to test with a fairly small amount of load, have no budget for tools, or need to test applications on localhost or a private network, a single process CLI tool might serve you best. Many of these are free and open source.
If you need to run bigger load tests, especially resource-intensive real browser load tests, a tool that launches tests from the cloud will make your life easier. Most of the cloud-based load testing tools allow you to select a region or regions, and the tool does the work of provisioning cloud instances, spinning up the virtual users, and correlating the results.
The Load Testing Process
The exact process of load testing depends on the tools you use, but at a high level it goes something like this:
- Record or create a script to simulate user behavior
- Run the script with a single bot to verify it works
- Run a load test with many concurrent bots
- Analyze and interpret the load test results
Most likely you’ll repeat the test quite a few times with the same or different test configuration, as you iterate to resolve performance bottlenecks in your site or test different criteria.
Creating load test scripts
Every load testing tool is different, but in general you’ll start by creating a script that tells the tool how to simulate a real user visit. Usually, scripts should imitate a real user as closely as possible, with realistic wait times or pauses between steps and following a typical user’s path through the site.
Some tools (including Loadster) support testing with real web browsers, while others require scripting to be done at the protocol layer. Your choice of tool might vary depending on whether you’re testing a static website, dynamic web application, or API.
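To give a feel for what a protocol-layer script involves, here is a minimal Python sketch using only the standard library. It is not how Loadster or any other tool defines scripts. The `Step` class, the paths, and the pause lengths are all hypothetical, and a real tool would also handle cookies, errors, and page resources.

```python
import time
import urllib.request
from dataclasses import dataclass

@dataclass
class Step:
    method: str           # HTTP method, such as "GET"
    path: str             # path relative to the site's base URL
    think_seconds: float  # pause afterwards, like a user reading the page

# A hypothetical user journey: home page, product list, cart.
SCRIPT = [
    Step("GET", "/", 3.0),
    Step("GET", "/products", 5.0),
    Step("GET", "/cart", 2.0),
]

def http_send(base_url, step, timeout=10):
    """Send one request and return the HTTP status code. Error statuses
    (4xx and 5xx) raise urllib.error.HTTPError, which is not handled here."""
    request = urllib.request.Request(base_url + step.path, method=step.method)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status

def run_script(base_url, steps, send=http_send, sleep=time.sleep):
    """Run each step in order, timing the request and then pausing.
    Returns a list of (path, status, elapsed_seconds) tuples."""
    timings = []
    for step in steps:
        started = time.perf_counter()
        status = send(base_url, step)
        elapsed = time.perf_counter() - started
        timings.append((step.path, status, elapsed))
        sleep(step.think_seconds)
    return timings
```

Calling `run_script("https://example.com", SCRIPT)` would play the journey once against a placeholder URL. The `send` and `sleep` parameters are replaceable so the script can be tried out without generating real traffic.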
Running load tests
For a test to be considered a load test it needs to simulate multiple users hitting your site at the same time. Load testing tools typically run hundreds or thousands of bots (virtual users) to execute your scripts in parallel.
Many of the free tools generate the load from a single process that you run on your own machine, and this often works fine for quick load tests with smaller amounts of load. For larger distributed load tests, it can be helpful to have a service that spins up cloud instances on demand from your choice of cloud regions, and takes care of the deployment and test infrastructure for you.
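The core of a single-process load generator is simply running the same script on many bots at once. The Python sketch below uses a thread pool for this. The function name and parameters are hypothetical, and the `run_once` callable stands in for whatever your script does on each iteration. A single process like this is limited by one machine's CPU, memory, and network, which is one reason larger tests are distributed.

```python
from concurrent.futures import ThreadPoolExecutor

def run_load_test(bot_count, iterations, run_once):
    """Run `run_once(bot_id)` `iterations` times on each of `bot_count`
    bots, with every bot in its own thread, and return all of the results
    in one flat list."""
    def bot(bot_id):
        # Each bot repeats the script sequentially, like one user would.
        return [run_once(bot_id) for _ in range(iterations)]

    with ThreadPoolExecutor(max_workers=bot_count) as pool:
        per_bot_results = list(pool.map(bot, range(bot_count)))
    return [result for results in per_bot_results for result in results]

# Stand-in for a real script: just report which bot ran.
results = run_load_test(bot_count=5, iterations=3, run_once=lambda bot_id: bot_id)
print(len(results), "script runs completed")
```

In a real test, `run_once` would send requests and return timing measurements instead of the bot's number.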
As the test runs, it’s a good idea to keep an eye on the tool to see how your site is responding and if any errors are detected. You might need to stop the test early if the load test has already crashed your site.
Speaking of site crashes, it’s often a good idea to load test against a non-production instance of your site, at least initially when you have low confidence in your site’s ability to handle the load. There are tradeoffs, however. Testing against a non-production instance introduces the caveat that something might differ between the production and non-production environments, and you can’t always extrapolate performance or assume scalability between different environments. If you load test in a non-production environment, try to make it as close a replica of production as possible, and then repeat the test in production once you have a high degree of confidence it can handle the load.
Analyzing load test results
Nearly every load testing tool generates some kind of output, with the tool’s measurements of how well your site performed and what kind of errors it detected. You’ll want to review these metrics during and after a load test to tell if the test was a success and what kinds of changes might be needed to the scripts, test configuration, or the site itself.
Data analysis and reporting is an important and sometimes time-consuming part of load testing. Some tools do a lot of the reporting and data analysis for you with automatic graphs and reports, while others take the philosophy of emitting raw data that you parse and analyze yourself.
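If your tool emits raw data, the first step is usually to boil it down to a few summary metrics. The Python sketch below computes the average response time, a nearest-rank 95th percentile, and the error rate. The input format (a list of response time and success pairs) is an assumption for illustration, since every tool has its own output format.

```python
import math

def percentile(sorted_values, pct):
    """Nearest-rank percentile of a list sorted in ascending order."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]

def summarize(samples):
    """Summarize raw samples given as (response_time_seconds, succeeded)."""
    if not samples:
        raise ValueError("no samples to summarize")
    times = sorted(time for time, _ in samples)
    errors = sum(1 for _, succeeded in samples if not succeeded)
    return {
        "count": len(samples),
        "avg_response_time": sum(times) / len(times),
        "p95_response_time": percentile(times, 95),
        "error_rate": errors / len(samples),
    }

# Hypothetical raw data: 20 requests taking 1 to 20 seconds, the slowest failing.
raw = [(float(seconds), seconds != 20) for seconds in range(1, 21)]
print(summarize(raw))
```

Percentiles are worth reporting alongside the average, because a handful of very slow responses can hide behind an average that looks acceptable.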
Start Your Load Testing
At this point you might be a bit overwhelmed by all the things there are to test! Load and stress testing can help you answer a lot of questions, including but not limited to…
- Will my site perform well on our busiest day of the year?
- Do I have enough cloud infrastructure or hardware to run this application at scale?
- Is autoscaling working properly?
- Does my app deliver fast response times and a good user experience even under peak load?
- When pushed to the breaking point, does my application recover gracefully, or crash hard and lose data?
- Are there concurrency issues or bugs (“heisenbugs”) in my application that only surface under heavy load?
- Do I have memory leaks or resource exhaustion that appear after extended usage?
- Are our redundancy and failover systems in place and working properly?
Of course, load testing “all the things” could take a substantial effort and might not even be worth it.
But that’s probably not necessary.
Reducing the risk of crashing from high traffic events is one of those things where something like 20% of the effort yields 80% of the results.
You can reduce your crash risk substantially with just a few quick rounds of load tests.
Once you’ve fixed the obvious bottlenecks and gained confidence that your site can perform well under heavy load, you can iterate gradually to further tune the site and improve scalability.
Load testing is actually quite fun, so we recommend selecting a tool and jumping right in. Soon you’ll be running load tests and improving your site’s performance and scalability.
Your future self will thank you when your site doesn’t crash.
If you’d like to share feedback about this guide or run into challenges with your load testing, we’d like to hear from you at help@loadster.app.
Next Steps
- Want a little more background? Read our guide on Website Load Testing for a longer, more in-depth introduction to load testing as it specifically relates to websites.
- Need to load test an API, such as a REST or GraphQL API? Check out our guide on API Performance Testing.
- If you’d like to try Loadster for your load testing, you can Start Testing for Free and get 50 units of Loadster Fuel to power your first load tests.