This post complements a previous post we wrote about Microservices Pre-Production Testing. Here, we introduce post-production testing techniques for microservices architectures.
The first technique is fault injection: introducing errors into production in a controlled manner to see whether your system can hold up to them.
We always aim to build resilient systems. This means designing them carefully so that they can sustain their operations in the face of failure.
To be confident that our system is resilient and behaves as expected, we have to see failures being tolerated in production.
Why not simulate this in a QA or staging environment?
The existence of any differences in those environments brings uncertainty to the exercise.
In other words, while testing outside of production is a perfectly valid approach, it is incomplete: some behaviors can be observed only in production, no matter how closely a staging environment mirrors it.
Constructing a Fault Injection Process:
- Imagine a possible untoward event in your infrastructure.
- Figure out what is needed to prevent that event from affecting your business, and implement that.
- Cause the event to happen in production, ultimately to prove that the event has no impact and to gain confidence around it.
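The process above can be sketched in code. The following is a minimal, illustrative fault injector (the class name, `failure_rate` parameter, and the retry-with-fallback strategy are assumptions for this sketch, not part of any specific chaos-engineering library): we wrap a dependency so that a fraction of calls fail, then verify that the caller tolerates those failures.

```python
import random

class FaultInjector:
    """Wraps a callable and fails a configurable fraction of calls.

    Illustrative sketch: `failure_rate` controls how often a fault
    is injected; `rng` can be seeded for reproducible experiments.
    """

    def __init__(self, func, failure_rate=0.1, rng=None):
        self.func = func
        self.failure_rate = failure_rate
        self.rng = rng or random.Random()

    def __call__(self, *args, **kwargs):
        # Inject a fault instead of calling the real dependency.
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected fault")
        return self.func(*args, **kwargs)

def fetch_price(item_id):
    # Stand-in for a downstream service call.
    return {"item": item_id, "price": 9.99}

# Wrap the dependency; callers must now tolerate injected failures.
flaky_fetch = FaultInjector(fetch_price, failure_rate=0.2, rng=random.Random(42))

def fetch_price_resilient(item_id, retries=3):
    """The mitigation we implemented in step 2: retry, then degrade."""
    for _ in range(retries):
        try:
            return flaky_fetch(item_id)
        except ConnectionError:
            continue
    return {"item": item_id, "price": None}  # degraded fallback
```

If the fallback path is exercised in production while the team is watching, the exercise has proven exactly what step 3 asks for.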
Best Practices in Constructing Fault Injections:
- Get a group of engineers together to brainstorm the various failure scenarios that a particular application, service, or infrastructure could experience.
In fact, the idea of fault injection may not be appealing: it brings risk to the forefront, and without context, causing failures on purpose may seem crazy. What if something goes wrong?
It’s better to prepare for failures in production and cause them to happen while we are watching, instead of relying on a strategy of hoping the system will behave correctly when we aren’t watching.
The worst-case scenario is that something will go wrong during the test. In that case, an entire team of engineers is ready to respond to the surprises, and the system will become stronger as a result.
Performance testing is a form of software testing that focuses on how a running system performs under a particular load.
Workload: can mean either concurrent users or transactions.
Concurrent users: virtual users that satisfy two conditions:
- Each user has its own unique cookies, session data, internal variables, and so on.
- All users can make a request at the same time (concurrency).
We need to make an important differentiation. If you’re testing a website, then the concept of concurrent users, each with their own set of cookies and session data, is indeed applicable. On the other hand, if you’re testing a stateless REST API, the concept of concurrent users might not be applicable, and all you really care about is requests per second.
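For the stateless-API case, the requests-per-second idea can be illustrated with a small sketch. The handler below is a local stand-in (an assumption for this example; a real load test would issue HTTP requests with a tool built for the job), but it shows how concurrency raises throughput for an I/O-bound endpoint:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(payload):
    """Stand-in for a stateless API endpoint; a real test would
    issue HTTP requests instead of calling a local function."""
    time.sleep(0.01)  # simulated service latency
    return {"ok": True, "echo": payload}

def measure_rps(workers, total_requests):
    """Fire `total_requests` requests with `workers` concurrent
    callers and return the achieved requests per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(handle_request, range(total_requests)))
    elapsed = time.perf_counter() - start
    return len(results) / elapsed
```

With one worker, throughput is capped by the per-request latency; with ten workers, the same endpoint serves roughly ten times as many requests per second.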
Types of Performance Testing
|Test Type|What to Test?|How to Test?|
|---|---|---|
|Load testing|Measure response time and system staying power as workload increases.|Increase the workload within the parameters of normal working conditions.|
|Stress testing|Measure software stability. At what point does software fail, and how does it recover from failure?|Give the software more workload than it can handle.|
|Spike testing|Evaluate software performance when workloads are substantially increased quickly and repeatedly.|Push the workload beyond normal expectations for short amounts of time.|
|Endurance testing|Check for system problems such as memory leaks.|Generate a normal workload over an extended amount of time.|
|Scalability testing|Determine whether the software effectively handles increasing workloads.|Gradually add to the workload while monitoring system performance.|
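The "How to Test?" column boils down to the shape of the workload over time. The sketch below (the function name and shapes are illustrative assumptions; real load-testing tools let you script such ramps) generates a user-count schedule per test type:

```python
def workload_profile(test_type, normal_load=100, steps=10):
    """Return a list of user counts, one per time step, whose shape
    matches each test type. Shapes are illustrative only."""
    if test_type == "load":
        # Ramp up gradually, staying within normal conditions.
        return [normal_load * (i + 1) // steps for i in range(steps)]
    if test_type == "stress":
        # Keep ramping well past the normal maximum.
        return [normal_load * (i + 1) // 2 for i in range(steps)]
    if test_type == "spike":
        # Short, repeated bursts far above the normal level.
        return [normal_load * 5 if i % 3 == 0 else normal_load
                for i in range(steps)]
    if test_type == "endurance":
        # Steady, normal load over an extended run.
        return [normal_load] * steps
    raise ValueError(f"unknown test type: {test_type}")
```

A scalability test would reuse the "load" ramp while watching how metrics degrade as each step is added.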
What Do Performance Testing Metrics Measure?
|Metric|What It Measures|
|---|---|
|Response time|Total time to send a request and get a response.|
|Wait time|How long it takes to receive the first byte after a request is sent.|
|Average load time|The average amount of time it takes to deliver each request.|
|Peak response time|The longest amount of time it takes to fulfill a request.|
|Error rate|The percentage of requests resulting in errors, out of all requests.|
|Concurrent users (load size)|How many users are active at any point.|
|Requests per second|How many requests are handled per second.|
|Transactions passed/failed|The total number of successful or unsuccessful requests.|
|Throughput|The amount of bandwidth used during the test, in kilobytes per second.|
|CPU utilization|How much time the CPU needs to process requests.|
|Memory utilization|How much memory is needed to process the requests.|
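Several of these metrics fall straight out of the raw request log. As a minimal sketch (the record shape `(response_time_s, ok, bytes)` and the serial-time approximation for throughput are assumptions of this example):

```python
def summarize(samples):
    """Compute a few of the metrics above from a list of
    (response_time_s, ok, bytes) request records."""
    times = [t for t, ok, b in samples]
    total_bytes = sum(b for t, ok, b in samples)
    wall_time = sum(times)  # serial approximation of test duration
    return {
        "average_load_time_s": sum(times) / len(times),
        "peak_response_time_s": max(times),
        "error_rate": sum(1 for t, ok, b in samples if not ok) / len(samples),
        "throughput_kb_s": (total_bytes / 1024) / wall_time,
    }

report = summarize([(0.1, True, 2048), (0.3, True, 4096), (0.2, False, 512)])
```

Real tools track wall-clock duration and concurrency directly rather than summing per-request times, but the per-metric arithmetic is the same.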
Performance Testing Best Practices
- Test as early as possible in development. Do not wait and rush performance testing as the project winds down.
- If you can’t test in production, your testing environment should be as close as possible to the production one. Differences between the test and production environments can significantly affect system performance. Try to match:
- Hardware components
- Operating system and settings
- Other applications used on the system
- Conduct multiple performance tests to ensure consistent findings and determine metrics averages.
- Applications often involve multiple systems such as databases, servers, and services. Test the individual units separately as well as together.
How many concurrent virtual users do I need?
- You should be able to ask your dev or web analytics team how many concurrent visitors you’re really getting.
- For new projects that haven’t yet launched, anticipating real user traffic can be difficult. In this case, looking at comparable projects can be helpful.
In reality, your application may need to scale to millions of users. But how can you simulate this reality perfectly? Even trying can be hugely expensive and time-consuming.
How to solve this?
The Pareto Principle, or the 80/20 rule, states that 80% of the effects derive from 20% of the causes. In simple terms: don’t try to simulate reality perfectly; focus on the small set of scenarios that produce most of the load, and your configuration will be a lot simpler.
A variant of in-production testing __with less drama__ comes in the form of specific deployment strategies:
Blue/green is a technique for deployments where the existing running deployment is left in place. A new version of the application is installed in parallel with the existing version. When the new version is ready, cut over to the new version by changing the load balancer configuration.
The basic idea behind this technique is to be able to route users to either the green set of servers or the blue set of servers.
Hence, this strategy lets you test the performance of the new version before routing the users to it.
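In practice the cut-over happens in the load balancer (nginx, HAProxy, or a cloud LB), but the mechanism can be sketched as a tiny router; the class and pool names below are illustrative assumptions:

```python
class BlueGreenRouter:
    """Minimal stand-in for a load balancer fronting two server pools."""

    def __init__(self, blue, green):
        self.pools = {"blue": blue, "green": green}
        self.active = "blue"  # existing version keeps serving traffic

    def route(self, request):
        # All traffic goes to whichever pool is currently active.
        return f"{self.pools[self.active]} handled {request}"

    def cut_over(self):
        # Instant switch; rolling back is just switching again.
        self.active = "green" if self.active == "blue" else "blue"

lb = BlueGreenRouter(blue="app-v1.example", green="app-v2.example")
```

Because both pools stay installed, the switch is atomic in either direction, which is what makes rollback cheap.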
Canary deployment is similar to blue/green, except that only a small subset of the servers is upgraded. Then, a fraction of users is directed to the new version.
You would start by deploying the application to a small subset of your servers. Once the subset of servers is deployed to, you may then route requests for a few users to the new set of servers.
This strategy lets you do a lot of interesting things, like:
- Test the performance of the application.
- Perform A/B tests based on demographics and user profiles, for example, “users between the ages of 20-25 living somewhere”.
Consider reading our previous article about A/B Testing.
- Get feedback from a subset of beta testers or users from your company.
As your confidence in the deployment improves, you can deploy your application to more and more servers, and route more users to the new servers.
If you face issues, you can start rolling back the new version of your service.
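The "fraction of users" routing is commonly done by hashing a stable user identifier, so each user always lands on the same version across requests. A minimal sketch (function names and the SHA-256 bucketing scheme are assumptions of this example):

```python
import hashlib

def in_canary(user_id, percent=5):
    """Deterministically assign a stable percentage of users to the
    canary. Hashing keeps each user on the same version."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # uniform bucket in 0..99
    return bucket < percent

def pick_version(user_id, percent=5):
    return "canary" if in_canary(user_id, percent) else "stable"
```

Widening the rollout is then just raising `percent`, and rolling back is dropping it to zero.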
In this post, we covered two kinds of in-production testing in detail, namely failure injection and performance testing. To reduce the risk of testing directly in production, we introduced two deployment strategies dedicated to this goal: Blue/Green Deployment and Canary Deployment.
Did you know that we use all this and other AI technologies in our app? Look at what you’re reading now applied in action. Try our Almeta News app. You can download it from Google Play or Apple’s App Store.