From Consumption to Forecasting: How Proven IT Finance Helped Improve the TCO View of Infrastructure for a Large Resources Sector Client

Prior to engaging Proven IT Finance, our client relied on a manual Excel sheet to forecast and budget IT. Not only was this a time-consuming and complex process; it also limited visibility into the relationships between their IT operations, services, and costs across all environments in a growing cloud infrastructure.

Proven IT Finance stepped in to develop a comprehensive model to better manage the supply/demand equation.

Translating Service Design into Capacity & Utilization

This meant defining their virtual machine (VM) sizes in terms of CPU and memory, storage tiers, affinity and anti-affinity rules, and backup and recovery, and then translating that service design into capacity and utilization.

For example, if there was demand for 50 small VMs, our model was able to determine how much resource capacity was consumed.

Resource capacity was calculated in terms of labor: 

  • Number of hours to commission 50 small VMs
  • Number of hours to commission new hardware and software to support the 50 small VMs

And also in terms of non-labor:

  • Hardware: Number of physical servers (individual blades in the FlexPod), chassis, switches, storage trays, controllers to purchase to support the 50 small VMs
  • Software: Number of licenses for 50 small VMs and number of licenses for any net new physical hardware to be purchased

The resource capacity was converted to costs based on labor rates and purchase agreements for hardware and software. These costs were then used to plan and understand the budget for the next 36 months.
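As an illustration only, the demand-to-cost translation might be sketched as below. The T-shirt size, blade capacity, labor hours, and rates are all invented figures, not the client's actual service design or purchase agreements.

```python
import math

# Hypothetical T-shirt sizing and rates -- illustrative, not the client's data.
SMALL_VM = {"vcpu": 2, "ram_gb": 4}   # assumed "small" VM definition
BLADE = {"vcpu": 32, "ram_gb": 256}   # assumed capacity of one physical blade
HOURS_PER_VM = 0.5                    # assumed labor to commission one VM
HOURS_PER_BLADE = 8.0                 # assumed labor to commission one blade
LABOR_RATE = 85.0                     # assumed blended hourly labor rate ($)
BLADE_COST = 6000.0                   # assumed hardware cost per blade ($)
LICENSE_PER_VM = 120.0                # assumed software license per VM ($)

def cost_of_demand(vm_count: int) -> dict:
    """Translate VM demand into blades needed and labor/hardware/software cost."""
    # Size hardware on whichever resource (CPU or memory) runs out first.
    blades = math.ceil(max(vm_count * SMALL_VM["vcpu"] / BLADE["vcpu"],
                           vm_count * SMALL_VM["ram_gb"] / BLADE["ram_gb"]))
    labor_hours = vm_count * HOURS_PER_VM + blades * HOURS_PER_BLADE
    return {"blades": blades,
            "labor": labor_hours * LABOR_RATE,
            "hardware": blades * BLADE_COST,
            "software": vm_count * LICENSE_PER_VM}
```

Under these assumed figures, a demand for 50 small VMs sizes out to 4 blades; repeating the calculation for each forecast month is what produces the 36-month budget view.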

For example, if there was demand for 50 small VMs, but the environment could only hold 5 more small VMs, then the model automatically calculated how to expand the infrastructure in terms of labor, hardware and software—broken down between compute, storage, and network. The model worked with any combination of services and forecast dates to help the client understand the expected time and costs to grow the environment.
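A minimal sketch of that netting logic, with an assumed VMs-per-blade density standing in for the model's real capacity rules:

```python
import math

def expansion_plan(demand: int, headroom: int, vms_per_blade: int = 16) -> dict:
    """Place demand on existing capacity first; size new hardware for the rest.

    vms_per_blade is an assumed density, not the client's actual figure.
    """
    overflow = max(0, demand - headroom)
    return {"vms_on_existing": min(demand, headroom),
            "vms_on_new_hardware": overflow,
            "new_blades": math.ceil(overflow / vms_per_blade)}
```

For a demand of 50 small VMs against headroom of only 5, the sketch places 5 VMs on existing capacity and sizes new hardware for the remaining 45.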

This gave the client lead time and a clear justification for additional hardware purchases. If demand changed, the client could also pinpoint which projects and services were no longer needed, and explain any excess capacity or budget variance.

The Need for a More Sophisticated Approach

Though the initial implementation provided visibility into predicted growth and its associated costs, the model had the following drawbacks:

  • High monthly maintenance required as relationships between services and the environment had to be manually defined
  • Lack of scalability beyond the FlexPod environment
  • Costs for growth did not account for spares
  • Limited visibility into the connection between infrastructure costs and each Platform as a Service

The recognition of the above limitations formed the basis of the requirements for a more sophisticated implementation.

Establishing the Relationship Between Services, Infrastructure, Spares & Capacity

The client was in the process of defining a CMDB, and they wanted the model to leverage the relationships between services and infrastructure that it captured. This properly defined the mapping of services to clusters and/or physical infrastructure, removing the excess monthly maintenance: all relationships would be updated through the CMDB and automatically loaded into the model. Relationships were also kept consistent throughout the legacy environment.

For example, each Platform as a Service belonged to a cluster, which in turn belonged to a FlexPod and a data center. Consistencies in the logical design enabled the model to be scalable to eventually encompass capacity and demand for all of the client’s data centers. Spares were also accounted for by standardizing inventory parts and components.
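The service-to-cluster-to-FlexPod-to-data-center hierarchy can be sketched as a simple parent-link walk. The record names below are invented; in practice the relationships were loaded from the client's CMDB.

```python
# Invented example records -- the real relationships came from the CMDB.
CMDB = {
    "paas-analytics": {"parent": "cluster-01"},
    "cluster-01": {"parent": "flexpod-ca-01"},
    "flexpod-ca-01": {"parent": "dc-canada"},
}

def lineage(ci: str) -> list:
    """Follow parent links from a configuration item up to its data center."""
    chain = [ci]
    while ci in CMDB:
        ci = CMDB[ci]["parent"]
        chain.append(ci)
    return chain
```

Because every service resolves to physical infrastructure the same way, capacity and cost roll-ups stay consistent across data centers.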

To standardize inventory, for example, each environment was given one type of storage tray option when it came time for growth. This allowed future growth in terms of cost and infrastructure to be properly predicted even if environments currently contained multiple types of storage trays.

The model also took spares into account before calculating the budget.

For example, if a project required a UCS B200 M3 blade, the spares allocation rule would check whether a UCS B200 M3 blade was available in the storeroom, along with its associated software, before the purchase hit the budget. This removed the burden of having the client search the storeroom and manually adjust the budget, and it eliminated the need to buy unnecessary excess capacity.
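The spares rule amounts to netting required parts against the storeroom before anything reaches the budget. A minimal sketch, with example part names and quantities:

```python
def net_against_spares(required: dict, storeroom: dict) -> dict:
    """Consume storeroom spares first; return only the shortfall to purchase.

    required and storeroom map part name -> quantity. storeroom is
    decremented in place as spares are allocated.
    """
    to_buy = {}
    for part, qty in required.items():
        from_spares = min(qty, storeroom.get(part, 0))
        storeroom[part] = storeroom.get(part, 0) - from_spares
        if qty > from_spares:
            to_buy[part] = qty - from_spares
    return to_buy
```

If a project needs two blades and one is already in the storeroom, only one purchase hits the budget.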

Whether or not spares were available, all resources were also accounted for in terms of capacity.

For example, even if the physical asset and license were available in the storeroom and thus generated no additional procurement cost, the asset still consumed space and capacity in the data center, which might require an additional FlexPod to be built. By leveraging the logical design relationships built from the CMDB, the model automatically calculated this Infrastructure as a Service growth and inserted it into the budget while netting out any spares.
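One way to express this rule in a sketch (the slot count is assumed): capacity is sized on every blade deployed, whether purchased or pulled from the storeroom.

```python
import math

SLOTS_PER_FLEXPOD = 8   # assumed blade slots per FlexPod, for illustration

def additional_flexpods(blades_in_use: int, blades_added: int,
                        flexpods_built: int) -> int:
    """How many new FlexPods must be built to house the added blades.

    blades_added counts every deployed blade, including spares from the
    storeroom -- they cost nothing to procure but still consume slots.
    """
    needed = math.ceil((blades_in_use + blades_added) / SLOTS_PER_FLEXPOD)
    return max(0, needed - flexpods_built)
```

So a spare blade that fills the last open slot can still trigger an Infrastructure as a Service build in the budget, even though its own procurement cost is zero.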

Developing a Showback Model

Shared Infrastructure as a Service, such as a new FlexPod, is not billed against projects directly, so its costs needed to be shown back to the project-based Platform as a Service demand that drove them.

To do this, another model was built that leveraged the same service design relationships and capacity assumptions to allocate a portion of the Infrastructure as a Service costs to each Platform as a Service based on its consumption of compute and storage.
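A simplified version of that allocation might look like the following. The even compute/storage weighting and the consumption figures are assumptions for illustration, not the client's actual showback rules.

```python
def showback(iaas_cost: float, usage: dict, w_compute: float = 0.5) -> dict:
    """Split a shared IaaS cost across PaaS consumers by their share of
    compute and storage consumption. w_compute is an assumed weighting."""
    w_storage = 1.0 - w_compute
    total_cpu = sum(u["vcpu"] for u in usage.values())
    total_gb = sum(u["storage_gb"] for u in usage.values())
    return {name: round(iaas_cost * (w_compute * u["vcpu"] / total_cpu
                                     + w_storage * u["storage_gb"] / total_gb), 2)
            for name, u in usage.items()}
```

Because every consumer's share is a fraction of the same totals, the allocated amounts always sum back to the shared infrastructure cost.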

The Result: The Power to Estimate Consumption & Forecast Growth

A key factor in the project’s success was the client’s commitment to ensuring everyone in operations worked from well-defined, standardized services.

For example, a small tier 1 gold VM in the Canadian FlexPod had the same configuration as every other small tier 1 gold VM in any data center. If this wasn’t the case, we labeled the outlier under a different service name and excluded it from the model rather than making the model adjust for minute variances. This streamlined the capacity assumptions and enabled a consistent data collection process: “T-shirt sizing” the infrastructure offerings.

The client also had great transparency into the various rules in the environment that impacted capacity, licensing, and purchasing agreements. We assisted the client in mapping out the logical service design, as well as in determining the templates for each table and how they related to each other.

In previous cases, clients had many tables, each with different rules, which made it difficult for users to remember how to update them. We created a data structure that repeated from table to table and was easy to explain and fill out. About 80% of the files that previously required monthly updates were moved to an “as-needed” basis.

Going forward, the model will help the client’s Operations team monitor the performance of their IT against its cost. By starting this analysis, they can work to understand and improve the TCO of their environment.