STATEFUL Application Remediation — to be Cloud Native

Shankhabrata Chowdhury
May 22, 2022

Making a stateful application cloud native as part of application modernization

Context:

During application modernization, if you plan to make your application truly cloud native, a plain lift and shift is not enough. You should refactor or re-architect the application so that it follows the Twelve-Factor App principles.

One of the many challenges in making an application truly cloud native arises when the application is stateful. The “Processes” principle, one of the twelve factors, says “Execute the app as one or more stateless processes”. Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database.

Building a stateless application is the objective, but this article focuses on how to make a stateful application cloud native by externalizing the session (state) to a common, distributed, stateful backing service.

Stateful vs Stateless application — quick refresher:

Here is a quick refresher.

A stateful app is a program that saves client data from the activities of one session for use in the next session. The data that is saved is called the application’s state.
Stateful applications save client session data somewhere (mostly on the server where the application runs) for use by the server, by clients, and by other applications.

In a stateless application, by contrast, client data or data related to past transactions is not saved on the server between sessions. The application treats each transaction or user interaction as a blank slate, without knowledge of previous interactions.

In some cases, statelessness doesn’t mean that there isn’t a state. It just means that the state is held somewhere else.

If you are new to the concept or need to explore in detail then below two articles will be helpful.

Challenge / Problem Statement:

Elastic cloud environments differ from traditional server configurations in that the number of servers varies with traffic load, whereas traditional configurations have a fixed number of servers. When traffic volumes decline, servers have to be terminated. In doing so, we would lose user sessions (essentially forcing a logout) unless we come up with a new strategy for session management.

The main challenge with a stateful application is its incompatibility with scaling. To use an in-process session, you must configure sticky sessions at the load balancer to bind each client to the same server, so that the server can process subsequent requests in the context of previous ones. If traffic grows, you can’t simply replicate a stateful app and redirect client requests to the new instance, because users would need to start from scratch. Similarly, when traffic drops you can’t remove a server until all users stuck to that server have logged out. This makes stateful apps difficult to scale and prone to unavailability when client traffic increases.
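To make the scale-down problem concrete, here is a minimal Python sketch that simulates it inside one process. The names `AppServer` and `StickyLoadBalancer` are hypothetical and only model the behavior described above; they do not correspond to any real load balancer API.

```python
class AppServer:
    """One app instance that keeps session state in its own memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # user -> in-process session state

    def handle(self, user):
        # Count requests per user in local session state.
        state = self.sessions.setdefault(user, {"requests": 0})
        state["requests"] += 1
        return state["requests"]


class StickyLoadBalancer:
    """Pins each user to the first server that served them."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.pinned = {}  # user -> server
        self._next = 0

    def route(self, user):
        server = self.pinned.get(user)
        if server is None or server not in self.servers:
            # New user, or the pinned server is gone: pick another server.
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self.pinned[user] = server
        return server

    def scale_in(self, server):
        # Removing a server discards every session held in its memory.
        self.servers.remove(server)


a, b = AppServer("a"), AppServer("b")
lb = StickyLoadBalancer([a, b])

before = [lb.route("alice").handle("alice") for _ in range(3)]
lb.scale_in(lb.route("alice"))             # traffic dropped: remove alice's server
after = lb.route("alice").handle("alice")  # alice lands on the remaining server

print(before, after)
```

In this simulation the request counter restarts after the scale-in, which is the single-process equivalent of a forced logout.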

Typically, monolithic stateful web applications depend on the session to maintain their state between requests, and the session state is persisted in-process (in the memory of the server). In a volatile container environment where an application is deployed on multiple containers, this can lead to session data loss whenever a container is restarted.

As part of this scaling problem, the users of stateful apps need to continue sending requests to the same system that maintains their state data — or they lose historical context. That means you cannot scale by redirecting new client requests to other systems running the same application. Clients will need to reauthenticate, and they will lose the historical context of previous transactions. When you need to provide the same app experience to a rapidly growing user-base — or to an unpredictable amount of client traffic — stateful apps can experience delays and shutdowns as traffic rises to levels that the server can’t handle.

To solve this scaling issue, the ideal target architecture is a stateless application, provided it supports the business needs. Otherwise, we should externalize the session state of the stateful application, as discussed in the following sections.

Ideal target state:

In modern architecture, if the business use case allows it, the ideal target state for any application is to be stateless, so that it is truly cloud native and can run on any container platform or even on auto-scaled EC2/VM platforms.

There are many architectures and patterns for stateless applications, but the most commonly used one is a decoupled front end and back end. The front end is a Single Page Application (SPA) built with Angular or React and served from a CDN (on AWS, CloudFront + S3). The back end is exposed as an API (written in any language: .NET Core, Java, Node.js, Python, PHP, etc.) that the front end calls. Authentication and identification can be handled with, for example, Azure AD and user access tokens.

An ideal stateless application does not store the session on the server side at all. At the same time, you need to keep the user authentication and user information somewhere, because you can’t ask your user to reauthenticate for every new interaction. The minimal user session and state are kept on the client side and passed to the server on every call in the form of a signed (and optionally encrypted) token, e.g. a JWT issued by an identity provider such as Auth0.

In the example above, authentication is handled by the Angular SPA, which passes a bearer token to the API application; the API validates that token against Azure AD.

For details, please refer to the link below, a very good article on how to build a stateless session for a stateful app.

This pattern of keeping state on the client side in the form of signed tokens is sometimes called a “stateless session of a stateful app”.

Sample reference diagram of using a token and cookie to make the session stateless. Note: picture taken from https://auth0.com

Reference deployment diagram of a typical stateless application.

Advantages of Token-Based Authentication:

Stateless, easier to scale: The token contains all the information to identify the user, eliminating the need for the session state. If we use a load balancer, we can pass the user to any server, instead of being bound to the same server we logged in on.

The approach above is feasible if you are building an application from scratch or rewriting it while modernizing. It is not always possible if you are re-platforming, re-architecting, or refactoring the app. In that case, externalizing the state can be an acceptable solution that does not require a full rewrite of the application.

Acceptable Target State: “Session/State Externalization”

During cloud modernization it is often not possible to rewrite the whole application, yet the desired target state is still a pure cloud native app. In those cases we can follow a pattern called “Session/State Externalization”, a.k.a. “Distributed Session Management”.

If your app needs to store session data to process transactions in context, and if the server can handle the expected processing load, a stateful system is probably needed and you can’t go for a stateless application.

So, to make a stateful application scalable on a cloud native platform, you externalize the session/state management to a distributed caching system rather than trying to make the application stateless.

The idea of the Session/State Externalization pattern is to store the session or state data in a central external system rather than inside the container/pod/node where the application runs. That central system is shared by all containers/pods/nodes running the app. This requires a small rewrite of the application so that it communicates with the external session store. Preferably, the session store should sit close to the app so that session data can be retrieved quickly; ideally it is a caching system.
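The pattern can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical `SessionStore` class that stands in for the external system (a plain dictionary here, where a real deployment would use a cache service) and a hypothetical `AppInstance` class representing one container or pod.

```python
class SessionStore:
    """Stand-in for the external store; it lives outside the app instances."""

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        return dict(self._data.get(session_id, {}))

    def save(self, session_id, state):
        self._data[session_id] = dict(state)


class AppInstance:
    """A container/pod that holds no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        state = self.store.load(session_id)
        cart = state.get("cart", []) + [item]
        state["cart"] = cart
        self.store.save(session_id, state)
        return cart


store = SessionStore()
pods = [AppInstance("pod-1", store), AppInstance("pod-2", store)]

# Requests for the same session hit different pods: no sticky sessions needed.
pods[0].add_to_cart("sess-42", "book")
pods[1].add_to_cart("sess-42", "pen")

# Replace pod-1 (restart or scale event); the session survives in the store.
pods[0] = AppInstance("pod-1-restarted", store)
cart = pods[0].add_to_cart("sess-42", "lamp")
print(cart)
```

The only change inside the application is that session reads and writes go through the store object instead of local memory, which is the "small rewrite" mentioned above.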

A distributed session is a way to store your session state outside of your ASP.NET Core, Java, or any other application. Depending on your performance, cost, and caching requirements, using Couchbase, Memcached, Redis, DynamoDB, or a SQL database to store session state can help when you need to scale your web site.

A distributed cache is a cache shared by multiple app servers, typically maintained as an external service to the app servers that access it. A distributed cache can improve the performance and scalability of an ASP.NET Core or any app written in any language, especially when the app is hosted by a cloud service or a server farm.

A distributed cache has several advantages over other caching scenarios where cached data is stored on individual app servers.

When cached data is distributed, the data:

  • Is coherent (consistent) across requests to multiple servers.
  • Survives server restarts and app deployments.
  • Doesn’t use local memory.

Sample deployment diagram of a monolithic stateful application with session externalization

Though S3 and EFS can be used as file storage to store session data, it is not recommended due to performance, locking and other issues.

When assessing the stateful nature of both the web and API layers by exploring the code, we should analyze how deeply the session is used to store state data. Next, analyze the effort and timeline impact of refactoring the application from stateful to stateless. Sometimes an application uses the session only to store authorization data. If the use of the session is minimal, we can refactor the code to remove the dependency on the session.
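A first pass at this assessment can be automated. The sketch below is a rough Python helper, assuming the code base is ASP.NET or Java servlet based; the function name `count_session_usage`, the regular expressions, and the file suffix list are illustrative assumptions that you would extend for your own framework.

```python
import re
from collections import Counter
from pathlib import Path

# Illustrative patterns only; extend them for your framework.
SESSION_PATTERNS = {
    "aspnet": re.compile(r"\bHttpContext\.Session\b|\bSession\["),
    "java-servlet": re.compile(r"\bHttpSession\b|\.getSession\("),
}
SOURCE_SUFFIXES = {".cs", ".cshtml", ".java", ".jsp"}


def count_session_usage(root: str) -> Counter:
    """Count lines that touch the session, keyed by source file path."""
    hits = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line in text.splitlines():
            if any(p.search(line) for p in SESSION_PATTERNS.values()):
                hits[str(path)] += 1
    return hits
```

Calling `count_session_usage("path/to/source").most_common(10)` lists the files with the heaviest session usage. A handful of hits in an authentication filter suggests a cheap refactor, while hits spread across many controllers point toward session externalization instead.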

Sample Code Snippet :

Code guide for AWS ElastiCache (Redis):

Below is a sample code snippet for .NET Core that uses AWS ElastiCache (Redis) to store session data. Similar code for other languages is easy to find online.
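For other languages the same idea carries over. The following is a minimal Python sketch, assuming a client object that offers `get`, `setex`, and `delete` methods shaped like the Redis commands of the same names. `InMemoryRedisLike` is a hypothetical test double so the example runs without a server, and `SessionManager`, the `session:` key prefix, and the 20-minute default TTL are illustrative choices rather than anything prescribed by ElastiCache.

```python
import json
import time
import uuid


class InMemoryRedisLike:
    """Test double exposing only get/setex/delete, named after Redis commands."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def setex(self, name, ttl, value):
        self._items[name] = (self._clock() + ttl, value)

    def get(self, name):
        entry = self._items.get(name)
        if entry is None or entry[0] <= self._clock():
            self._items.pop(name, None)  # expired or missing
            return None
        return entry[1]

    def delete(self, name):
        self._items.pop(name, None)


class SessionManager:
    """Serializes sessions as JSON and stores them with an expiry."""

    def __init__(self, client, ttl_seconds=1200, prefix="session:"):
        self.client = client
        self.ttl = ttl_seconds
        self.prefix = prefix

    def create(self, data):
        session_id = uuid.uuid4().hex
        self.save(session_id, data)
        return session_id

    def save(self, session_id, data):
        self.client.setex(self.prefix + session_id, self.ttl, json.dumps(data))

    def load(self, session_id):
        raw = self.client.get(self.prefix + session_id)
        if raw is None:
            return None  # unknown or expired session
        data = json.loads(raw)
        self.save(session_id, data)  # sliding expiration: each read renews the TTL
        return data

    def destroy(self, session_id):
        self.client.delete(self.prefix + session_id)
```

In a real deployment the test double would be replaced by a Redis client connected to the cache endpoint. The expiry handles cleanup of abandoned sessions, so the application does not need its own sweeper.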

Code guide for Azure Redis Cache:

We can do the same with Azure Redis Cache, where the app interacts with the cache through the IDistributedCache interface. Below are sample steps for .NET Core that use Azure Redis Cache to store session data. Similar code for other languages is easy to find online.

For a detailed implementation of Azure Redis Cache with .NET Core, please follow the links below.

https://docs.microsoft.com/en-us/aspnet/core/performance/caching/distributed?view=aspnetcore-3.1
https://docs.microsoft.com/en-us/aspnet/core/fundamentals/app-state?view=aspnetcore-3.1

Short term workaround (but not recommended):

There may be scenarios where, due to complexity, timeline constraints, or gaps in application knowledge, you are not able to externalize the session/state management. In that case the options below can serve as short-term workarounds, though none of them is recommended as a first choice.

In this approach you are neither making the stateful application stateless nor externalizing the session to a distributed cache. Instead, you are finding a way to run the stateful application on a container platform as it is.

Originally, containers were built to be stateless, as this suited their portable, flexible nature. But as containers have come into more widespread use, people began containerizing (redesigning and repackaging for the purposes of running from containers) existing stateful apps. This gave them the flexibility and speed of using containers, but with the storage and context of statefulness.

With the growth in popularity of containers, companies began to provide ways to manage both stateless and stateful containers using data storage, Kubernetes, and StatefulSets. Statefulness is now a major part of container storage and the question has become not if to use stateful containers, but when.

These approaches allow you to run the stateful application in Kubernetes with high availability, but you will still sacrifice the user experience (session) when the platform auto-scales.

As these are not recommended solutions, we will touch on them only briefly, without explaining detailed implementation techniques.

Way 1 :

Using GlusterFS, a container-native storage (CNS) offering from Red Hat. This option is possible only if you are using the Red Hat OpenShift Kubernetes container platform, hosted on-premises or in AWS. GlusterFS CNS volumes are essentially distributed file storage on top of EBS volumes, managed by Heketi. The volumes are built from local block devices of the OpenShift nodes, backed by EBS. These volumes provide shared storage and are mounted on the OpenShift nodes with the GlusterFS FUSE client. For details, please refer to https://www.redhat.com/en/blog/struggling-containerize-stateful-applications-cloud-heres-how

Way 2:

Using Amazon Elastic File System (Amazon EFS) as the storage for containers running on Amazon Elastic Kubernetes Service (Amazon EKS), with AWS Fargate provisioning the compute resources. The Amazon EFS CSI driver allows multiple pods to write to a volume at the same time through the ReadWriteMany access mode.

AWS EFS provisioner architecture for k8s. Note: picture taken from https://www.padok.fr/

Sample architecture using shared EFS for state management across multiple pods

Your application code then serializes the session and stores it in a file, using a persistent session manager (persistentmanager) whose file path points to the mounted EFS path.
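A file-backed session store of this kind can be sketched in Python as below. The class name `FileSessionStore` and the one-JSON-file-per-session layout are assumptions for illustration. The directory passed in would be the path where the shared volume is mounted in every pod. The write-then-rename step gives all-or-nothing updates on a local POSIX filesystem, but behavior on a network filesystem such as EFS may differ, which is one reason this approach is only a workaround.

```python
import json
import os
import tempfile
from pathlib import Path


class FileSessionStore:
    """Stores each session as one JSON file under a directory shared by all pods."""

    def __init__(self, shared_dir: str):
        self.root = Path(shared_dir)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, session_id: str) -> Path:
        # Reject ids that could escape the shared directory.
        if not session_id.isalnum():
            raise ValueError("session id must be alphanumeric")
        return self.root / f"{session_id}.json"

    def save(self, session_id: str, state: dict) -> None:
        target = self._path(session_id)
        # Write to a temp file in the same directory, then rename it over the
        # target, so a reader sees either the old or the new file (on a local
        # POSIX filesystem; network filesystems may behave differently).
        fd, tmp_name = tempfile.mkstemp(dir=self.root, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        os.replace(tmp_name, target)

    def load(self, session_id: str):
        try:
            return json.loads(self._path(session_id).read_text(encoding="utf-8"))
        except FileNotFoundError:
            return None  # unknown session
```

Two pods that construct `FileSessionStore` with the same mounted path see each other's sessions, at the cost of file I/O latency on every request.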

For details, please refer to the links below.

https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/run-stateful-workloads-with-persistent-data-storage-by-using-amazon-efs-on-amazon-eks-with-aws-fargate.html

https://aws.amazon.com/blogs/containers/running-stateful-workloads-with-amazon-eks-on-aws-fargate-using-amazon-efs/

https://docs.giantswarm.io/advanced/storage/efs/

Way 3 :

Using a combination of StatefulSet and session affinity at the Kubernetes cluster level, so that all requests from a client are always directed to the same pod. Session affinity can be set up using the Kubernetes internal load balancing and by configuring Ingress into the cluster. However, this doesn’t work well when pods are auto-scaled (HPA). The article below gives a brief idea of how to run a stateful application in Kubernetes without refactoring the code or architecture of the application.

Have you noticed that all the short-term workaround options mentioned above for running a stateful application on a container platform (without making it stateless or externalizing the session) are on Kubernetes?

Conclusion :

To run a stateful application on a container platform, the sections above discussed three options:

1 — “Ideal -Making the app stateless- rewrite the app”,
2 — “Acceptable -Session Externalization-rearchitect the app”,
3 — “Short Term(not recommended)- no change in app — tweak the platform”.

If you have the time to rework your apps, do it! On a long-term basis, Option 1 will be to your advantage. However, if you don’t have time, choose the 2nd option.

Note: Before going stateless, make sure this is appropriate for your architecture. This article does not promote the stateless pattern over the stateful pattern; there can be valid use cases where stateful applications have merit. It offers guidance on techniques for moving a stateful app toward a stateless one, along with a comparison of the advantages and disadvantages of both patterns.

Note: Opinions and approaches expressed in this article are solely my own and do not express the views or opinions of my employer, AWS, Microsoft, Oracle, or any other organization.

Some of the product names, logos, brands, diagram are property of their respective owners.

Please: Post your comments to express your view where you agree or disagree, and to provide suggestions.

Written by Shankhabrata Chowdhury

Sr. Architect — Cloud, Security, Digital Systems & Technology