How Much IT Infrastructure Do I Need? Capacity Planning for Workflow Automation

Last published at: May 2nd, 2023

Capacity planning is the process of determining the resource capacity needed to meet anticipated production requirements.  In most organizations, capacity planning starts from a baseline capacity.  Use cases covering possible scenarios of increased production are then considered, and the most likely ones are accounted for in expansion planning.

A simple manufacturing example: in the baseline case, a single machine makes 100 items in 1 hour.  If demand increases to 300 items per hour, how many machines will you need (plus the space, manpower, utilities, storage, packaging, and everything else that entails)?  In this case, the simple answer is 3 machines in total, or 2 more than the baseline.  Workflow automation infrastructure planning similarly requires you to define:

  • how many processes you need to run
  • how fast you need to run them
  • how many processors, how much memory, etc. (# machines)

If you need to run more processes in shorter time periods than the infrastructure you designed for your baseline estimates can handle, then more (or better, faster, etc.) infrastructure capacity must be added.
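The manufacturing arithmetic above can be sketched in a few lines of Python.  This is only an illustration; the function name `machines_needed` is made up for this example, and real planning would also account for downtime and headroom.

```python
import math

def machines_needed(demand_per_hour: float, rate_per_machine_per_hour: float) -> int:
    """Smallest whole number of machines whose combined rate meets demand."""
    if rate_per_machine_per_hour <= 0:
        raise ValueError("rate per machine must be positive")
    # Round up: a fraction of a machine still means buying a whole one.
    return math.ceil(demand_per_hour / rate_per_machine_per_hour)

# Figures from the example: 100 items/hour per machine, 300 items/hour demand.
total = machines_needed(300, 100)
additional = total - 1  # one baseline machine is already in place
print(total, additional)
```

The same rounding-up logic applies to servers: capacity is bought in whole units, so any shortfall rounds to the next unit.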

Baseline infrastructure for FlowWright includes two servers: 1) an application server and 2) a database server.  In most organizations, databases are managed and shared within a single database server.  If you anticipate your primary database server will be overtaxed due to processing requirements, then move the FlowWright database to a database server with enough resources.

The baseline application server for FlowWright runs on 8 GB of memory, a quad-core CPU, and 100 GB of storage.  This is the minimum configuration for the FlowWright application server.  When it comes to workflow capacity planning, the number of workflow instances being processed and their complexity, as well as throughput requirements, will dictate whether you need to scale up application server capacity and speed.

In some cases, a given amount of work needs to be processed within a fixed time period.  Let’s say 10,000 instances need to be processed in 1 hour, and you determine that the application server does not have enough resources to do so.  You can scale up your processing infrastructure in one of two ways: 1) in a virtual or cloud environment, resources such as the number of CPUs and the amount of memory can be increased; or 2) if the application server is running on an on-premises or hosted physical server, FlowWright can be configured with another application server to perform distributed processing and handle the increased load.
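As a rough sketch of the 10,000-instances-per-hour case, the required throughput and the number of application servers can be estimated as follows.  The per-server figure of 1.5 instances per second is a hypothetical measurement chosen for illustration, not a FlowWright benchmark; you would substitute a number measured in your own environment.

```python
import math

def required_throughput(instances: int, window_seconds: float) -> float:
    """Instances per second needed to finish all instances within the window."""
    return instances / window_seconds

def servers_needed(instances: int, window_seconds: float,
                   measured_per_server_per_second: float) -> int:
    """Whole application servers needed, given a measured per-server rate."""
    required = required_throughput(instances, window_seconds)
    return math.ceil(required / measured_per_server_per_second)

# 10,000 instances in 1 hour (3,600 seconds).
required = required_throughput(10_000, 3600)
# Hypothetical measurement: one server sustains 1.5 instances per second.
servers = servers_needed(10_000, 3600, measured_per_server_per_second=1.5)
print(f"{required:.2f} instances/second, {servers} server(s)")
```

If the result is more than one server, that points toward either a larger virtual machine or an additional application server for distributed processing.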

Sometimes capacity is tricky to plan because the amount of work and the way it is executed may be complex and/or unpredictable.  For example, let’s say you have a process with 87 steps, and after processing the 3rd step, the workflow often goes to sleep - sometimes for weeks or months.  In this case, you might launch 1 million instances of a process, but the engine has very little work to do for each workflow instance.  The resources required to process 87 complex steps vs. 3 simple steps are very different.
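To put numbers on the difference, the sketch below counts step executions for the 1 million instances in the example.  It deliberately ignores step complexity, so it understates the gap between 87 complex steps and 3 simple ones; the function name is made up for this illustration.

```python
def step_executions(instances: int, active_steps_per_instance: int) -> int:
    """Total steps the engine actually executes before instances finish or sleep."""
    return instances * active_steps_per_instance

# All 87 steps run for every instance.
full = step_executions(1_000_000, 87)
# Every instance goes to sleep after its 3rd step.
sleeping = step_executions(1_000_000, 3)

print(full, sleeping, full / sleeping)
```

The instance count is identical in both cases, which is why the number of launched instances alone is a poor sizing input.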

In some cases, the processing load is primarily based on integration, and the most resource-intensive computing is actually done by other systems.  For example, one FlowWright customer processes millions of prescriptions each day - but most of the work performed on each prescription is done on the client’s application server rather than FlowWright's workflow server.  One of the operations performed on each prescription is optical character recognition (OCR), which is very CPU- and memory-intensive.  OCR operations are handed off to an OCR server using FlowWright asynchronous steps.  These steps make REST API calls to other systems to perform the work and then go to sleep, consuming no resources from FlowWright infrastructure.  Once the OCR server completes the operation, it calls FlowWright through the FlowWright API to continue processing the workflow instance.  At the end of the day, FlowWright drives the process and performs orchestration across systems - but consumes few resources directly.
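The hand-off-then-sleep pattern can be illustrated with a toy, in-memory simulation.  Nothing here is FlowWright's API: the `Engine` class, its methods, and the `outbox` list standing in for outgoing REST calls are all hypothetical names used only to show the sequence of states.

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowInstance:
    instance_id: str
    state: str = "running"  # "running", "sleeping", or "completed"
    results: dict = field(default_factory=dict)

class Engine:
    """Toy stand-in for a workflow engine (hypothetical, not a real API)."""

    def __init__(self):
        self.instances = {}
        self.outbox = []  # requests that would be sent to the external system

    def start(self, instance_id: str) -> WorkflowInstance:
        inst = WorkflowInstance(instance_id)
        self.instances[instance_id] = inst
        # Asynchronous step: hand the heavy work off, then go to sleep.
        self.outbox.append({"instance_id": instance_id, "task": "ocr"})
        inst.state = "sleeping"
        return inst

    def resume(self, instance_id: str, result: str) -> WorkflowInstance:
        """Callback the external system invokes once its work is done."""
        inst = self.instances[instance_id]
        if inst.state != "sleeping":
            raise RuntimeError(f"instance {instance_id} is not sleeping")
        inst.results["ocr"] = result
        inst.state = "completed"
        return inst

engine = Engine()
inst = engine.start("rx-001")
state_while_waiting = inst.state  # the engine does no work in this state

# The external OCR server picks up the request and later calls back.
request = engine.outbox.pop()
engine.resume(request["instance_id"], result="recognized text")
print(state_while_waiting, inst.state)
```

Between `start` and `resume` the engine holds only a small record for the instance, which mirrors why orchestration-heavy workloads can need far less engine capacity than their volume suggests.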

Capacity planning, like data science, can be aided by one of the many software solutions/tools built specifically for this purpose.  When capacity planning for workflow processes, the following variables must be included in the computations:

  • # of workflow instances processed
  • # of steps continuously processed
  • How much processing is performed by certain complex steps
  • Number of processes that need to be processed within a time period
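One way to combine these variables is a back-of-the-envelope estimate of CPU cores.  The sketch below assumes an average CPU cost per step and a target utilization of 70%; both are illustrative assumptions, as are the sample inputs, and memory, I/O, and database load are left out entirely.

```python
import math

def cores_needed(instances: int, steps_per_instance: int,
                 cpu_seconds_per_step: float, window_seconds: float,
                 target_utilization: float = 0.7) -> int:
    """Estimate CPU cores needed to finish the work within the time window."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Total CPU time demanded by the workload.
    total_cpu_seconds = instances * steps_per_instance * cpu_seconds_per_step
    # CPU time one core can usefully supply in the window.
    usable_seconds_per_core = window_seconds * target_utilization
    return math.ceil(total_cpu_seconds / usable_seconds_per_core)

# Hypothetical workload: 10,000 instances, 20 continuously processed steps
# each, 0.05 CPU-seconds per step on average, all within 1 hour.
cores = cores_needed(10_000, 20, 0.05, 3600)
print(cores)
```

Complex steps can be handled by raising the average cost per step, or by estimating them separately and adding the results.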

In some cases, sizing processing infrastructure is also influenced by the number and complexity of the decisions that processes make.  Some processes may involve complex calculations that determine which path a workflow takes.  Even simple decisions can have big processing implications.  For example, a process might have a decision that determines whether or not to OCR the incoming prescription: if it does, far more processing is required than if it does not.
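The effect of such a decision on sizing can be approximated as an expected cost per instance.  The probability and the CPU costs below are invented for illustration; in practice they would come from measurements of your own processes.

```python
def expected_cpu_seconds(p_ocr: float, cost_with_ocr: float,
                         cost_without_ocr: float) -> float:
    """Average CPU-seconds per instance when a decision sends a share to OCR."""
    if not 0 <= p_ocr <= 1:
        raise ValueError("p_ocr must be between 0 and 1")
    return p_ocr * cost_with_ocr + (1 - p_ocr) * cost_without_ocr

# Hypothetical inputs: 25% of prescriptions take the OCR path, which costs
# 4.0 CPU-seconds; the other path costs 0.5 CPU-seconds.
per_instance = expected_cpu_seconds(0.25, 4.0, 0.5)
daily_total = per_instance * 1_000_000  # for 1 million instances per day
print(per_instance, daily_total)
```

A small shift in the share of instances taking the expensive branch moves the total noticeably, so the branch probability is worth measuring rather than guessing.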

Capacity planning is an art.  Some customers who automated their back-end server processes and went through this exercise have since seen their business soar and had to increase resources on their physical servers and virtual environments.  If you need help with resource planning, we are here for you.  Our team can analyze your environment, processes, and steps, and recommend appropriate infrastructure.

Have questions about how to determine your capacity needs for workflow automation?  Let's talk!