
What is Testing In Production?

by Iwan Price-Evans on DevOps • May 27, 2022

Testing in production is a model for deploying new applications and updates that allows you to test the new release on a segment of live users in a production environment. This helps you to shorten production cycles, deploy without downtime, identify real-world problems and roll back fast, and limit your exposure to risk.

Why Test In Production?

The goal of most businesses, particularly those with a DevOps culture and an established CI/CD pipeline, is to deploy new features and enhancements without negatively affecting the user experience.

You must be able to quickly and securely deploy changes into active production environments without interrupting users engaged in critical activities on your applications.

Modern deployment models, such as blue-green and canary deployment, help you to automate and de-risk the process of testing in production. They do this by rolling out new releases initially to small segments of users and gradually to more users as the new release proves stable. This has the following benefits:

  • "Real-world" testing on dynamic client traffic is the most realistic test available, and it is only feasible on live production servers.
  • Low risk of exposing a defective feature to all public traffic.
  • Rollback is easy.
  • Deployment is fast and straightforward.

How Do I Test In Production?

Generally, your environment should meet the following requirements for testing in production:

  • A deployment pipeline that can create, test, and deploy to specific environments (Test, Staging, Production).
  • A stateless application, so that any node in the cluster can serve any request at any time.
  • A load balancer to route requests to a high availability cluster of server nodes (e.g., virtual machines or containers).
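
As a rough illustration of the last two requirements, the sketch below shows a round-robin balancer in Python. Because the application is assumed to be stateless, any healthy node can take the next request. The class name RoundRobinBalancer, its methods, and the node names are hypothetical and do not describe any particular load balancer product.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out nodes in turn, skipping any node marked unhealthy."""

    def __init__(self, nodes: list[str]) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        self._healthy = set(nodes)
        self._ring = cycle(self._nodes)

    def mark(self, node: str, healthy: bool) -> None:
        """Record the result of a health check for one node."""
        if healthy:
            self._healthy.add(node)
        else:
            self._healthy.discard(node)

    def next_node(self) -> str:
        """Return the next healthy node, trying each node at most once."""
        for _ in range(len(self._nodes)):
            node = next(self._ring)
            if node in self._healthy:
                return node
        raise RuntimeError("no healthy nodes available")
```

Since no request depends on state held by a specific node, taking a node out of rotation (for example, while it is being upgraded) only changes which nodes the balancer hands out.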

Types Of Testing In Production

Deploying new releases without downtime can be achieved by testing in production with the Blue-Green, Rolling, and Canary deployment methods.

Blue-Green Deployment

One fail-safe method of releasing new application code is to use Blue-Green deployments. In this approach, two production environments operate in tandem. Blue represents the existing "old" code, while Green represents the "new" code.

Blue and Green both run in Production, but initially only Blue receives public traffic.

The new application code is deployed in the Green environment, a replica production environment without any users. Green is thoroughly tested (integration, end-to-end, benchmark tests, etc.) for functionality, performance, and key health indicators in an actual production-ready setting.

When we are happy with the new version, we can route all traffic from the Blue environment to the Green environment. We can do this gracefully by letting existing sessions on Blue run to completion (draining) while starting new sessions on Green, until the Green environment is handling all traffic.
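
The graceful switch from Blue to Green can be sketched in Python as a schedule of traffic weights. This is a minimal illustration, not a real load balancer configuration: the function names, the number of steps, and the choice of evenly spaced weights are assumptions made for the example.

```python
def cutover_schedule(steps: int) -> list[float]:
    """Evenly spaced Green weights, ending at 1.0 (all new sessions on Green)."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [i / steps for i in range(1, steps + 1)]


def route_new_session(green_weight: float, sample: float) -> str:
    """Choose the environment for a NEW session.

    green_weight is the fraction of new sessions sent to Green (0.0 to 1.0).
    sample is a number in [0, 1), for example from random.random().
    """
    return "green" if sample < green_weight else "blue"


def drain(blue_sessions: int, finished: int) -> int:
    """Blue only loses sessions as they finish; it never goes below zero."""
    return max(0, blue_sessions - finished)
```

With cutover_schedule(4), Green receives 25%, 50%, 75%, and then 100% of new sessions, while Blue only finishes the sessions it already holds.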


Rolling Deployment

Rolling (or phased) deployment is often better suited to online web applications than Blue-Green deployment. It reduces risks such as user-facing downtime with no easy way to roll back. In a rolling deployment, an application's new version gradually takes the place of its old one. The deployment happens over time, and new and existing versions coexist without affecting functionality or user experience.

In a rolling deployment strategy, you start by deploying one additional node containing the updated software. For example, this could be a new web application server instance containing code for the new user interface or a new product feature. A load balancer proxying requests to the application is required to route traffic to the new nodes. Over time, you deploy the new version on more nodes. Effectively, the latest version of the application runs alongside the old one until the new nodes have replaced every instance of the old code.
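
The node-by-node progression can be sketched in Python as follows. This is a simplified illustration under stated assumptions: nodes are upgraded in place rather than added as new instances, the is_healthy callback stands in for whatever health check your platform provides, and restoring a failed batch is one possible policy rather than a required part of rolling deployment. All names are hypothetical.

```python
from typing import Callable


def rolling_deploy(
    nodes: dict[str, str],
    new_version: str,
    is_healthy: Callable[[str], bool],
    batch_size: int = 1,
) -> bool:
    """Upgrade nodes in batches and stop at the first unhealthy batch.

    nodes maps node name -> running version and is updated in place.
    Returns True if every node ends up on new_version.
    """
    pending = [name for name, version in nodes.items() if version != new_version]
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        previous = {name: nodes[name] for name in batch}
        for name in batch:
            nodes[name] = new_version  # stand-in for the real deploy step
        if not all(is_healthy(name) for name in batch):
            nodes.update(previous)  # restore only the failing batch
            return False
    return True
```

Between batches, old and new versions serve traffic side by side, which is why the application must tolerate both versions running at once.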


Canary Deployment

Canary Deployments, like Rolling Deployments, are a technique for gradually releasing new code. Unlike a Rolling Deployment, which releases new code to a limited number of nodes at a time, a Canary Deployment controls risk by releasing new code to a segment of users first, before releasing it to the entire infrastructure and making it available to everyone. This limits the impact of any defect in the new code.

The subset of users that get the new code is named the "Canary", after the canary in the coal mine. The Canary is closely monitored and provides indicators of the health of the new code. This informs decisions on whether to roll back or release the new code to everyone. With Canary deployments, we can isolate key metrics such as errors, latency, and interesting user feedback on the new version.

There is no specific formula for dividing traffic, but you may generally expect a Canary release to distribute between 5% and 20% of traffic to the new "Canary" version. A Canary Deployment would never divide traffic 50-50 because if something goes wrong half of the users would have a bad experience. 
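
A canary split and the decision that follows it can be sketched in Python. This is one possible approach under stated assumptions, not a prescribed method: users are assigned by hashing a user ID so that each user stays on the same version, and the decision compares the canary's error rate with a baseline plus a tolerance. The function names, the tolerance value, and the use of error rate alone (ignoring latency and user feedback) are choices made for the example.

```python
import hashlib


def in_canary(user_id: str, canary_percent: int) -> bool:
    """Deterministically place a user in the canary group.

    Hashing the user ID keeps each user on the same version across requests.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < canary_percent


def canary_decision(
    canary_errors: int,
    canary_requests: int,
    baseline_error_rate: float,
    tolerance: float = 0.01,
) -> str:
    """Return "wait", "rollback", or "promote" based on the canary error rate."""
    if canary_requests == 0:
        return "wait"  # no data yet
    error_rate = canary_errors / canary_requests
    if error_rate > baseline_error_rate + tolerance:
        return "rollback"
    return "promote"
```

Because a user's bucket is fixed, raising canary_percent from 5 to 20 keeps every existing canary user in the group and only adds new ones.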


Should I Test In Production?

Testing in production is not always appropriate or possible. Furthermore, each deployment method has pros and cons. Here is a summary.

|  | Big Bang | Blue-Green | Rolling Release | Canary |
| --- | --- | --- | --- | --- |
| Description | An upgrade is done in place of existing code in one go. | New code is released alongside existing code, then traffic is switched to the new version. | New code is released in an incremental roll-out. Old code is phased out as new code takes over. | New code is released to a subset of live users under specific conditions, then an incremental roll-out. |
| Best Suited For | Appliances where no redundancy is available. Monolithic code base tightly coupled with underlying proprietary hardware. Offline service during the maintenance window is acceptable. | Double resource capacity is a non-issue. Live user testing is not critical. Fewer challenges with stateless applications. Instant rollback capability. Fast roll-out preferred. | Convenient for stateful applications. Apps deployed on a container or cloud platform can be easily orchestrated. | Slow roll-out preferred. Live user testing is essential. Apps deployed on a container or cloud platform can be easily orchestrated. |
| Zero Downtime | No | Yes | Yes | Yes |
| Live User Testing | No | No | Yes | Yes |
| User Impact On Failures | High | Medium | Low | Low |
| Infrastructure Cost | Low | High | Low | Low |
| Rollback Duration Impact | High | Low | Low | Low |

Does Snapt Help With Testing In Production?

Yes. Snapt Nova is a modern application services platform capable of automating traffic distribution across dev, test, and production environments following blue-green, canary, or other deployment models. Snapt Nova can direct traffic based on a variety of information available from the client, the network, and the environment in which the client is operating.