Pega PRPC — Is It Cloud Native or Cloud Aware?
In this article I share my perspective on whether, and to what extent, the Pega PRPC platform, or any application built on Pega PRPC, can be considered cloud native.
Let's start with my understanding of what makes a platform/application cloud native. In my opinion, cloud-native architectures and applications should support the speed of a startup at the scale of an enterprise, with zero disruption to the application user's experience.
Some people strongly believe that a cloud-native application means ONLY one built from services packaged in containers, deployed as microservices and managed on elastic infrastructure. I think, however, that a standalone, monolithic application can also come very close to cloud native, even though it cannot take full advantage of distributed, elastic cloud infrastructure.
Others consider the following features necessary for an application/platform to be considered cloud native:
1. Containerized
2. Microservices
3. Horizontal Scaling
4. Service Mesh
5. API Driven
6. Service Discovery
7. Delivery Pipeline
8. Policy-Driven Resource Provisioning
9. Zero Downtime Deployment
and many more …
As you can see, different people apply different standards when evaluating cloud-native applications. I find the Twelve-Factor App principles the best measuring scale for whether a platform/application is cloud native (though, again, some people believe the twelve-factor principles apply only to microservices).
So let's jump into our main objective: examining how the Pega platform, and applications built on it, stand on each of the twelve factors.
While evaluating the Pega PRPC platform itself, I will also point out what you, as a developer/architect, should consider while building your own Pega application to make it cloud native.
Note: I will not go into the details of the Twelve-Factor App methodology; please refer to this link for details.
I. Codebase
One codebase tracked in revision control, many deploys
Pega has its own mechanism to maintain its codebase (rules) within itself, along with versioning, and does not support an external codebase. Here your hands are tied, which is both bad and good.
But you can still follow this first factor by maintaining a Pega repository on a System of Record (SOR), where a separate Pega instance acts as the SOR repository hosting ruleset versions centrally.
Or by keeping the product JAR in an external artifact repository (JFrog, AWS S3, etc.), which is not exactly a version control system for the codebase, but a repository of code deployable to any other environment at any point.
Below are some good practices for maintaining a codebase in Pega.
When you keep the product extract JAR/ZIP in a central repository (JFrog, S3), keep it in a separate folder for each application (perhaps named with the Pega application name and version, which then serves as versioning).
Even for a reusable or component application, create a separate application product in Pega and keep its product extract separately in the repository/artifactory.
So for Pega, it is not the code/rule base but the code extract package that is maintained in the central repository (preferably after full unit or system testing), and that same repository is used as the source of code to deploy to any higher environment.
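As an illustration of the folder convention above, a deployment script might derive a versioned repository path for each product extract like this (the `<app>/<version>/<file>` layout and the names here are assumptions, not a Pega standard):

```python
def artifact_path(app_name: str, app_version: str, extract_file: str) -> str:
    """Build a versioned repository key for a Pega product extract.

    The <app>/<version>/<file> layout is a hypothetical convention for
    keeping one folder per application and version in JFrog/S3.
    """
    return f"{app_name}/{app_version}/{extract_file}"

# A CD pipeline would upload the extract to, and later fetch it from, this key.
key = artifact_path("ClaimsApp", "01.02.03", "ClaimsApp_01.02.03.jar")
```

Because the version is part of the path, every extract remains immutable and addressable, which is what lets the same artifact be deployed to any higher environment.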
II. Dependencies
Explicitly declare and isolate dependencies
A twelve-factor app never relies on the implicit existence of system-wide packages or system tools. Never depend on the host to have your dependencies; application deployments should carry all their dependencies with them.
While these tools/packages/binaries may exist on many or even most systems, there is no guarantee that they will exist on all systems where the app may run in the future, or that the version found on a future system will be compatible with the app.
So if your application has any dependencies, those dependencies should be isolated from your code and explicitly installed/deployed while you are deploying your application code.
The Pega core platform already follows this approach in its container image by explicitly fetching/downloading the required packages/tools/binaries (e.g. log4j, the Tomcat binary, Elasticsearch, etc.).
If you are planning to create your own image on top of the Pega OOTB image, then you should explicitly fetch/include (using FROM, curl, pip, etc.) the system packages/tools you need (e.g. an EFK/ELK plugin, a JDBC driver, etc.) in the Dockerfile.
Now for the Pega application you are building to run on Pega PRPC, you also have to consider a few points on dependency management to make it cloud native. Below are some examples of what you should treat as dependencies during every deployment:
> The base application product, which includes the base application code and all required data instances — the golden copy.
> Any external JARs/libraries.
> Any configuration settings (maintained in DSS, data tables or RSS) that hold environment-specific configuration, external systems' service endpoints, etc.
> Any component rulesets or applications.
> etc., if any.
All these dependencies should be in an external repository (Git, JFrog, S3), and your deployment (most preferably via a CD pipeline) should:
> First deploy the golden version/copy product and then all delta products.
> Fetch and deploy the product extract of any component ruleset or application.
> Fetch and deploy your release-specific delta code as a product extract.
> Fetch and deploy external dependencies from the external repository.
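The ordering above can be sketched in a few lines (the artifact names are hypothetical; a real pipeline would invoke your deployment tool for each item in turn):

```python
def deployment_order(golden, components, deltas, externals):
    """Return artifacts in the order a CD pipeline should apply them:
    golden copy first, then component rulesets/applications, then
    release-specific deltas, then external dependencies (jars, config)."""
    return [golden, *components, *deltas, *externals]

# Illustrative artifact names only; not Pega-defined files.
plan = deployment_order(
    "ClaimsApp_golden.jar",
    ["AddressComponent.jar"],
    ["ClaimsApp_delta_R5.jar"],
    ["ojdbc8.jar"],
)
```

Making the order explicit in one place keeps every environment's deployment identical and repeatable.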
III. Config
Store config in the environment
Configuration is anything that changes between deployment environments, including the configuration of the dependencies we discussed in Factor 2.
The principle is to store config in the environment, not within your codebase.
prconfig.xml, DB connection details, credentials, SSL certificates, external server endpoint URLs, Pega application-specific configuration kept in data tables or DSS, service account information, credentials for external services: these are all examples of configuration and should be strictly separated from your code.
These configurations, which change over time and across target environments, should not be placed/maintained in the application code or in any config file inside the package bundle; rather, they should live in an external, central config server/repository.
The Pega PRPC base platform container image exposes all necessary configuration as environment variables, which should be mapped from Kubernetes ConfigMap or Secret objects. Those ConfigMaps/Secrets should in turn be fetched/synced from GitHub/Vault using the mechanisms available in Kubernetes (gitRepo volumes, git-sync, cloning into a local volume, etc.). The same concept applies if you create your own image on top of the Pega OOTB image.
For a Pega application, keep the configuration in Git or any central repository in XML/JSON format, mapped per environment (don't keep passwords or secrets in Git; keep them in a secret/key management tool), load it via a Pega data page, and use that data page wherever your code needs to refer to those configuration values. Many other implementation approaches are possible as well.
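To illustrate the idea outside of Pega specifics, here is a minimal sketch of a process reading its settings from environment variables, as they would be mapped in from a Kubernetes ConfigMap/Secret. The variable names are my assumptions, not Pega-defined:

```python
import os

def load_config() -> dict:
    """Read environment-specific settings from environment variables.

    In a Kubernetes deployment, non-secret values would come from a
    ConfigMap and secrets from a Secret object; the code itself never
    hard-codes either. Variable names are illustrative only.
    """
    return {
        "db_url": os.environ["APP_DB_URL"],            # from a ConfigMap
        "db_password": os.environ["APP_DB_PASSWORD"],  # from a Secret
        "smtp_host": os.environ.get("APP_SMTP_HOST", "localhost"),
    }
```

The same build then runs unchanged in every environment; only the injected variables differ.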
IV. Backing services
Treat backing services as attached resources
A backing service is any service the app consumes over the network as part of its normal operation. Examples include datastores (such as Oracle, PostgreSQL or Cassandra), messaging/queueing systems (such as IBM MQ or Kafka), SMTP services for outbound email (such as Exchange), and caching systems (such as Memcached). Even the endpoint of an external service (an API-accessible consumer service) can be treated as a backing service for your Pega application.
For a twelve-factor app, these backing services should be treated as distinct attached resources, with their connection information stored in config, so that a deploy of the app can swap those resources between environments without any changes to the app's code.
The Pega PRPC platform mostly supports this concept: you can provide the URLs/config of the services below as environment variables or JVM parameters, and change the values for deployments to different environments.
> External Elasticsearch (Search)
> Cassandra / DDS (Datastore)
> Stream (Kafka) — customer’s data intake
But in the case of the database, I think Pega does not fully comply with this factor. Due to its core architectural design, Pega keeps both code/rules and transactional data in the database.
So if one day I want to swap out a local PostgreSQL database for one managed by a third party (such as Amazon RDS) without any changes to Pega platform-level code, it will not be very smooth, as the rules/code also need to be migrated to the new database.
If you are building a Pega application that interfaces with any external system, treat those external systems/services (e.g. MQ, an external REST service, SMTP, etc.) as attached resources and maintain them as external configuration, as explained in Factor 3 (Config).
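As a sketch of what "attached resource" means in practice, the connection string below is assembled entirely from config, so swapping a local PostgreSQL for a managed one (e.g. Amazon RDS) is only a config change. The variable names are illustrative assumptions:

```python
import os

def database_url() -> str:
    """Resolve the backing database purely from environment config.

    Pointing the app at a different database (local vs. RDS) means
    changing DB_HOST/DB_PORT/DB_NAME, never the code. Names here are
    illustrative, not Pega settings.
    """
    host = os.environ.get("DB_HOST", "localhost")
    port = os.environ.get("DB_PORT", "5432")
    name = os.environ.get("DB_NAME", "pegadata")
    return f"jdbc:postgresql://{host}:{port}/{name}"
```

This is exactly the swap that Pega's rules-in-database design makes harder: the config can change, but the rules must also travel with the data.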
V. Build, release, run
Strictly separate build and run stages
This principle is about separating the three stages of building, releasing and running. The build stage starts from the app in source control and builds out its dependencies. By keeping config separate, you can combine it with the build at the release stage, after which it is ready for the run stage.
Let’s discuss this factor in 2 parts. One for Pega Platform deployed as container and second the application build on Pega Platform.
For the Pega core platform …
> The build stage — if you take the Pega platform OOTB Docker image and use it as is, you don't need a build stage. But if you are planning to create your own container image on top of the Pega-provided one, then maintain a code repo (for the Dockerfile, dependencies, etc.) and convert that repo into an executable bundle known as a build, by fetching vendored dependencies and compiling binaries and assets. Push/keep the result in Docker Hub or your enterprise image repository.
> The release stage — takes the build produced by the build stage (the container image from Docker Hub) and combines it with the environment-specific config (e.g. stored in Kubernetes objects like ConfigMaps and Secrets); the result is ready for immediate execution in the execution environment (the Kubernetes platform).
> The run stage (also known as "runtime") — runs the released Pega container image in the execution environment (a Kubernetes-based platform like OpenShift, EKS, etc.).
For a Pega application, you have to consider the points below to be compliant with this factor …
> The build stage — you should be able to build a JAR, ZIP or extract of any format as a releasable component that is independent of the environment. That build extract/package should not change from one environment to another. Here you, as a Pega application developer, get great relief, because the Pega platform does this for you in the form of the Product rule, which creates a JAR/ZIP extract of your Pega application code.
But you have to consider the points below.
If you are creating a DevOps pipeline for CI and CD, the CI pipeline should be independent of the CD pipeline.
Think of build and release (deploy/run) as two independent activities. Branching, merging, extracting the product JAR and storing it in the repository/artifactory should be part of CI and should not include deployment; the CD part should include only deployment.
Follow Factor 2 (Dependencies) and Factor 3 (Config) to externalize any environment-specific dependencies and configuration.
> The release stage — take the build (the Pega application product extract) from a single common repository/artifactory and release it to the different environments, preferably in an automated manner via a CD pipeline.
You can choose an external deployment tool that offers release management and the ability to roll back to a previous release (use PRPCUtil instead of the Pega OOTB REST API for deployments).
Ensure that every release has a unique release ID (use Jenkins or another tool).
> The run stage — here you, as an application developer, do not need to consider anything specific, because the runtime environment is managed by the Pega platform itself.
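One way to picture "release = build + config" with a unique release ID is the sketch below. The scheme is my own illustration, not a Pega or Jenkins convention:

```python
import hashlib

def release_id(build_artifact: str, config_version: str) -> str:
    """Derive a unique, reproducible release ID from an immutable build
    plus the environment's config version. The same build with different
    config is a different release, so rollback can target either part."""
    digest = hashlib.sha256(f"{build_artifact}|{config_version}".encode()).hexdigest()
    return f"{build_artifact}-{digest[:8]}"
```

Because the ID is derived from both inputs, re-releasing the same build against updated config yields a new, traceable release.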
VI. Processes
Execute the app as one or more stateless processes
The idea is that processes should be stateless and share absolutely nothing: if a process crashes, it should not take anything important with it (e.g. session data, or transactional data held in local memory or on the filesystem).
Two application processes should not share anything between themselves; anything that needs to be shared and persisted should go through the endpoint of an external stateful backing service or attached resource, such as a database or caching system.
The Pega platform uses "sticky sessions": storing information in the session and expecting the next request to come from the same user/service contradicts this methodology.
Because Pega as a platform relies on session affinity, it does not support this principle.
When you are building your Pega application, you can't do much about this principle, as session/state management is done by the Pega platform and your application has no control over it.
VII. Port binding
Export services via port binding
Each service must expose itself on a port number specified by an environment variable. This factor is most applicable to microservices, where each service should bind to a different port, by itself.
When the Pega platform runs on Kubernetes, the image has its own app server embedded within the Docker container/pod and exposed via port binding, which you can control through an environment variable in config. In a similar way, the other supporting services in the Pega platform (Cassandra, Kafka, Elasticsearch) are also attached to ports configurable via config variables.
As a Pega application developer, you can't do much here, because when you create an application or REST/SOAP service, you cannot define or bind it to a specific port of your choosing. It gets exposed on the default port to which the underlying Pega platform service is bound.
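For contrast, here is what environment-driven port binding typically looks like for a self-contained service. Using a `PORT` variable is a common PaaS convention, not a Pega setting:

```python
import os
import socket

def bind_from_env(default: int = 8080) -> socket.socket:
    """Bind a listening socket to the port named by the PORT environment
    variable, falling back to a default. The service exports itself via
    port binding instead of relying on an external web server."""
    port = int(os.environ.get("PORT", default))
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("0.0.0.0", port))
    sock.listen()
    return sock
```

In Pega's case, this binding is done once by the embedded app server in the image; individual REST/SOAP services inherit it rather than choosing their own ports.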
VIII. Concurrency
Scale out via the process model
A true twelve-factor app is designed for scaling. Build your application so that scaling it in the cloud is seamless. When you develop the app to be concurrent, you can spin up new instances in the cloud effortlessly. To add more capacity, your app should be able to add more instances (horizontal scaling) instead of more memory or CPU on the local machine (vertical scaling).
The process model truly shines when it comes time to scale out. The share-nothing, horizontally partitionable nature of twelve-factor app processes means that adding more concurrency is a simple and reliable operation. The application must also be able to span multiple processes running on multiple physical machines.
Each container/pod/EC2/JVM instance should be dedicated to a single application and, if possible, to a specific dedicated job/process (frontend, backend, batch jobs/agents, streaming, search indexing, etc.), so that proper concurrency and scaling can be achieved independently. The new Pega Platform (v8.x) supports this model.
The Pega platform provides the option of deploying separate containers for search, background processing, web/frontend load, etc., as shown in the image. Each of these components supports scaling, concurrency and disposability mechanisms.
A few sample considerations while building your Pega application:
> Keep a separate service package and URI path for each separate group of services (REST/SOAP) you are building. Don't put functionally unrelated services under the same URI path/service package.
> If you are planning to host two different Pega applications on the same Pega platform/instance, plan separate access URLs for those applications (via DNS records at the load balancer level, a Kubernetes Service, or Pega servlet overriding), so that you can scale the two Pega applications independently at the pod/JVM/node level.
IX. Disposability
Maximize robustness with fast startup and graceful shutdown
Servers/hosts/nodes/JVMs should be treated not as pets but as cattle.
The concept of disposable processes means that an application instance can die at any time without affecting the user: the app can be replaced by other instances of the same app, or it can start right up again. Building disposability into your app ensures that it shuts down gracefully: it should clean up all utilized resources and shut down smoothly.
An app/platform built on this principle supports quick startup, resilience to failure, graceful shutdown, and compensating actions in case of any failure.
As the Pega platform is stateful in nature (it demands sticky sessions/session affinity), by its architectural nature it does not support disposability (automatic scale-down) without impacting the user's experience.
After a node is disposed of, Pega does not transfer session/in-memory data from the disposed node/pod to a newly activated node/pod. Through the load balancer you can avoid sending new user load to the disposed node, but the Pega platform has no native mechanism to quickly transfer those sessions, or to externalize sessions to an external caching system (Redis/Memcached). Pega has an on-demand quiescing mechanism for passivation and activation (via an external DB), but that is not quick/spontaneous.
So in principle, Pega as a platform does not follow the cloud-native disposability function.
It would be great if Pega maintained sessions in an external caching system (Redis/Memcached) shared among all nodes/pods.
So while the Pega platform can scale out as per Factor 8 (Concurrency), when the time comes to scale down, its stateful nature means it does not support Factor 6 (stateless processes) or this Factor 9 (disposability) without impacting the user's experience.
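To make the wish above concrete, here is a purely illustrative sketch of externalized session state. A plain dict stands in for a shared cache like Redis; the point is only that when state lives outside any single node, a disposed node loses nothing:

```python
class ExternalSessionStore:
    """Hypothetical externalized session store (not a Pega feature).

    A dict stands in for a networked cache such as Redis. Because the
    state lives outside the serving node, any node can pick up any
    session, making the nodes themselves disposable."""

    def __init__(self):
        self._store = {}  # in Redis this would be a shared keyspace

    def save(self, session_id: str, state: dict) -> None:
        self._store[session_id] = state

    def load(self, session_id: str) -> dict:
        return self._store.get(session_id, {})

# Node A writes; node B (brought up after A is disposed) reads the same state.
store = ExternalSessionStore()
store.save("user-42", {"case_id": "C-1001"})
```

This is the pattern that sticky sessions prevent: with affinity, the state above would instead live only in node A's memory.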
X. Dev/prod parity
Keep development, staging, and production as similar as possible
This factor ensures that you don't face new problems in production: any problem in your application or platform should surface earlier, while you are still in dev.
Pega supports this: you should use containers and an external, centralized codebase to keep all environments in sync and ready on demand, so that your Pega application, along with the Pega platform, can achieve the state Dev = Stage = Prod at any time, instantly.
If anything differs across environments, it should be declared as a dependency, an attached resource, or configuration.
If we follow the previously discussed factors 1 (Codebase), 2 (Dependencies), 3 (Config) and 4 (Backing services), then we achieve this dev/prod parity factor in Pega almost automatically.
XI. Logs
Treat logs as event streams
A twelve-factor app never concerns itself with the routing or storage of its output stream. Archival destinations should not be visible to or configurable by the app; instead, they are completely managed by the execution environment.
Writing your application's output to stdout as an event stream lets the environment control what to do with the logs, because the application doesn't necessarily know the environment it will eventually run in. By treating all logs as an event stream, you leave the choice of what to do with that stream up to the environment.
This gives visibility and traceability in logs even when an environment/pod or its ephemeral storage in the cloud is disposed of, or when many instances are running the same service. It also helps with real-time monitoring and event automation in aggregate via external tools (e.g. Splunk).
The Pega platform streams, stores and rotates logs in log files on the respective nodes; when you run Pega as a container, use the sidecar or adapter pattern to pass the Pega logs to external log aggregation/indexing/search/analytics tools (e.g. Fluentd, Logstash, Splunk, etc.).
XII. Admin processes
Run admin/management tasks as one-off processes
This factor is about separating administrative tasks from the rest of your application, so that you can run an admin process just like any other process. These tasks might include migrating a database or inspecting records. Though the admin processes are separate, you must run them in the same environment and against the same codebase and config as the app itself. Shipping the admin task code alongside the application prevents drift.
Some examples of such one-off (but repetitive/regular in nature) administrative or maintenance tasks:
> Running database migrations.
> Repairing broken items.
> Once a week, moving database records older than N days to cold storage.
> Emailing a report to a manager every day.
Pega as a platform supports this factor for some of its admin-type tasks.
> Code build & deployment (application import/export): the Pega platform provides tools/wizards in the developer portal, which are platform code within the application/platform itself.
But alongside that, Pega also provides PRPCUtil, an external batch/shell script tool outside of the Pega code/platform. This is a good example of admin/management task code living outside the core platform app code instead of being incorporated into the platform code.
> BIX and SMA can also be treated as admin/management tasks run as one-off processes, available outside the core platform code.
Still, there are some areas where you have to stay within the Pega platform to perform the admin tasks below, meaning the admin task/process is written in the platform code itself:
> Repairing/requeueing broken items/queues
> Modification of business delegated rules
> The OOTB case purge/archive module
> The report scheduler
Similarly, if your Pega application has any requirement for admin/management tasks, build them in such a way that they are repeatable and outside of your application. A few examples:
> If you have to truncate/purge/move some records from a Pega table weekly/monthly, rather than creating a Pega agent, try to utilize an Oracle stored procedure and scheduler.
> Even if you are forced to run a Pega agent within your application, plan to run it on a separate, dedicated background node as a separate, independent task.
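As a sketch of such a one-off purge task run outside the application (e.g. as a cron job or Kubernetes Job) but against the same environment config, consider the following. The schema variable, table name and retention period are illustrative assumptions:

```python
import os

def purge_sql(table: str, days: int) -> str:
    """Build the SQL for a one-off purge job.

    The script is shipped alongside the app and reads the same
    environment config (here a hypothetical DATA_SCHEMA variable), so it
    never drifts from the deployment it administers."""
    schema = os.environ.get("DATA_SCHEMA", "pegadata")
    return (
        f"DELETE FROM {schema}.{table} "
        f"WHERE pxCreateDateTime < CURRENT_DATE - INTERVAL '{days}' DAY"
    )
```

Scheduling this externally (database scheduler, cron, Kubernetes CronJob) keeps the recurring maintenance load off the application nodes, which is the point of the agent advice above.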
Wrapping Up
So, in my opinion, the Pega platform and applications built on Pega are not cloud native, but they are very much cloud aware (cloud compatible).
While a microservice-based application is best positioned to be cloud native and a monolith is worst (as per the latest perception trend), you don't necessarily have to take either extreme while building a Pega application. You can build a Pega application as a distributed minilith, sticking to most of the twelve-factor principles for being cloud native (or at least cloud aware). [I will publish another article soon on "Distributed Minilith".]
Also, to be a truly cloud-native platform, not only are the twelve-factor principles at the application/platform level a must; you also need a cloud-native runtime layer, cloud-native storage and cloud-native networking, which are out of scope of this article.
Note: Opinions expressed in this article are solely my own and do not express the views or opinions of my employer, Pegasystems or any other organization.
Please post your comments to express your views, whether you agree or disagree, along with any suggestions.