
PART 3:  Case study:  System rationalization in controls readiness context

  • Discussion about: 

    • [x] "EA as a service" based approach
    • [x] People
    • [x] Tools
  • For:
    • EA program manager
    • EA Architects
    • EA Analysts
    • IT DevOps managers

In part 1, we looked at techniques to define and answer questions using a service-based approach and people's roles.  In part 2, we took a closer look at the people and the setup of the EA toolchain needed to implement a service-based approach.

Part 3 applies the concepts from parts 1 and 2 in the context of an audit-readiness case study, where we attempt to identify the systems that pose the highest risk to passing an audit.

In our case study, we present an application portfolio rationalization in the context of an "audit readiness" engagement.  The organization has the goal to reduce overall risk of system ownership and operation, which will result in fewer system-related audit findings.  Many of those findings in the past were the result of expensive-to-operate systems that are out of compliance, for various reasons. 

The "key question" (or "goal") here is: Which systems have the highest risk?

  • WIRE/CREATE - First, we collect, normalize, and validate metrics, and align to KPIs in our architecture.  This will form the rational basis for the analysis. 

Then, to perform the analysis needed to answer the key question, we follow these steps:

  • CRUNCH - Score metrics. 
  • FLATTEN - Summarize complex metrics/KPIs into a spreadsheet.
  • CRUNCH - Transform the data to be suitable for a dashboard.
  • PRESENT - Create dashboard widgets to consume the transformed data.

Analyze the problem, model the KPIs, and collect data.

While our focus is on audit risks, there are also other risks to owning and operating systems from a business and operational perspective.  To determine the system risk, we construct a visual KPI tree that allows us to decompose the goal into various levels of detail.  We use the Model Layout Engine to assemble the tree automatically.  Each level of detail compiles a score based on weighting criteria (which must sum to 1.0), shown as the small gray numbers next to the arrowheads.
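To make the weighted roll-up concrete, here is a minimal sketch in Python.  The node names, weights, and scores are illustrative stand-ins, not values from the actual model; the only behavior taken from the text is that each level's weights must sum to 1.0:

```python
# Hypothetical sketch of a weighted KPI roll-up.  Node names, weights,
# and leaf scores are illustrative, not from the actual model.

def rollup(node):
    """Return the score of a KPI node.

    Leaf nodes carry a 'score' (0.0-1.0); branch nodes carry weighted
    children, whose weights must sum to 1.0.
    """
    if "score" in node:
        return node["score"]
    weights = [w for w, _ in node["children"]]
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1.0"
    return sum(w * rollup(child) for w, child in node["children"])

kpi_tree = {
    "children": [
        (0.4, {"children": [(1.0, {"score": 0.7})]}),  # e.g. operational risk
        (0.6, {"children": [(1.0, {"score": 0.9})]}),  # e.g. business risk
    ],
}

print(round(rollup(kpi_tree), 2))  # 0.4*0.7 + 0.6*0.9 = 0.82
```

Each branch contributes its weighted child scores, so the root score is a single 0-1 number that can be traced back down the tree.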

At the lowest level (all the way on the right), additional details are attached (metrics), which use the architecture in a flexible way to produce meaningful scores.  The "checkbox" and "blue exclamation point" symbols are data-quality trackers that visually indicate how far along we are in collecting the data needed to create those scores.

Shown below is a way to read the KPI tree "aloud".  While Q1 states the main question we want answered, notice that Q1.2.1.1 is the leaf-node question that collects metrics from the enterprise architecture repository data.

Q1:  Which systems are the highest risk to operate?

            Q1.1:  Which systems have the highest operational risk?

                        Q1.1.1:  Which systems have the highest staff risk?

                                    Q1.1.1.1:  Which systems' key maintenance staff will retire soon?

            Q1.2:  Which systems have the highest business risk?

                        Q1.2.1:  Which systems have the greatest security risk?

                                    Q1.2.1.1:  Which systems are the most vulnerable to security breach?

Continuing to the next level of detail, let's look at the metrics needed to assess the "Vulnerable to security breach" leaf-node KPI.

In this case, we have determined that we will collect three out of four proposed metrics (at this time).  Those metrics are:

  • System data encryption enabled?  [true/false]
  • System internal security enabled?  [true/false]
  • Personal Identifiable Information (PII) stored in system? [true/false]
  • No Authority To Operate (ATO) accreditation?  [true/false].  Notice that this metric is set to "0" probability, so it will be ignored (but remains a placeholder for later).  This helps us prioritize what detailed information to collect for our analysis – because collecting quality information usually takes a long time, and in the spirit of "agile EA", we may want to provide a preliminary answer faster.
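A simple way to picture the "0 probability = ignore for now" convention is a weighted metric list.  The field names below are assumptions for illustration, not the tool's actual schema:

```python
# Illustrative metric definitions; "name" and "weight" are assumed
# field names, not the tool's actual schema.  A weight of 0 marks a
# placeholder metric that is skipped for now but kept for later.
metrics = [
    {"name": "data_encryption_enabled",  "weight": 1.0},
    {"name": "internal_security_enabled", "weight": 1.0},
    {"name": "pii_stored",               "weight": 1.0},
    {"name": "no_ato_accreditation",     "weight": 0.0},  # placeholder: collect later
]

# Only metrics with a non-zero weight participate in the analysis.
active = [m["name"] for m in metrics if m["weight"] > 0]
print(active)  # three of the four proposed metrics
```

Keeping the zero-weight entry in the model means the ATO metric can be "switched on" later simply by assigning it a weight, without restructuring the KPI tree.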

Score the metric data and create normalized scores. 

The metrics (e.g., "PII stored in system") are evaluated against the application system (e.g., the Procurement system), and a score is assessed according to a formula:

"scoreType": "singleAttr", "evalType" : "boolean", "NOT_MAINTAINED" : 3, "payload": { "false": 3, "true": 1 }

Using the Metric Scoring Engine, the score is normalized to a 0-1 range and contributes to the "Impact" portion of a risk assessment ("Vulnerable to security breach").  This risk object (shown in red) comes from the Risk and Compliance view, and the same value is assigned to the "Vulnerable to security breach" leaf-node KPI.
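The scoring rule above can be sketched in a few lines of Python.  This is a hypothetical re-implementation for illustration: the raw scores (true → 1, false → 3, missing → 3) come from the formula shown, but the linear mapping onto 0-1 is an assumption about how the Metric Scoring Engine normalizes, not its documented behavior:

```python
# Hypothetical re-implementation of the boolean scoring rule; the
# normalization to 0-1 is an assumption, not the engine's documented
# behavior.
RULE = {
    "scoreType": "singleAttr",
    "evalType": "boolean",
    "NOT_MAINTAINED": 3,                 # raw score when the attribute is missing
    "payload": {"false": 3, "true": 1},
}

def raw_score(attr_value):
    """Look up the raw score for a boolean attribute value (or None)."""
    if attr_value is None:
        return RULE["NOT_MAINTAINED"]
    return RULE["payload"][str(attr_value).lower()]

def normalized(attr_value, lo=1, hi=3):
    """Map the raw score linearly onto 0-1 (higher = riskier)."""
    return (raw_score(attr_value) - lo) / (hi - lo)

print(normalized(True))   # 0.0 - e.g. encryption enabled, lowest risk
print(normalized(False))  # 1.0
print(normalized(None))   # 1.0 - an unmaintained attribute scores as worst case
```

Note that a missing value scores the same as the "bad" answer, which nudges system owners to actually maintain their data.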

Flatten the complex metric scores and KPIs into a spreadsheet. 

The scores from all leaf-node KPIs are collected and summarized "up the tree" using a generic query defined in the Query Generator app.  This results in a flattened CSV file, generated automatically on a daily report schedule:
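A minimal sketch of what "flattening" produces, using Python's standard `csv` module.  The column names, systems, and scores are illustrative, not the actual report layout:

```python
# Sketch of flattening leaf-node KPI scores into CSV.  Column names,
# systems, and scores are illustrative, not the actual report layout.
import csv
import io

leaf_scores = [
    {"system": "Procurement system", "kpi": "Q1.2.1.1", "score": 0.67},
    {"system": "HR system",          "kpi": "Q1.2.1.1", "score": 0.33},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["system", "kpi", "score"],
                        lineterminator="\n")
writer.writeheader()
writer.writerows(leaf_scores)
flat_csv = buf.getvalue()
print(flat_csv)
```

In practice the EA tool writes this file on a schedule; the point is that the complex repository model is reduced to a simple tabular hand-off format.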

Crunch/transform the data (for dashboards)

Now that the data has been extracted from the architecture in a "flattened" format, it is ready to be transformed into a form suitable for dashboard consumption (e.g., pie charts, bar charts).  In our scenario, this is done using the "Mashzone NG" product, but it can also be done using ARIS Aware or any dashboard system that can ingest CSV (comma-separated value) input.

Using Mashzone, datafeeds can be constructed in a visual editor, without coding:

In this example, values from the summarized ("flattened") architecture – CSV files – are ingested into Mashzone, and the data is prepared for presentation as a Bubble Chart.

In this step, it is important that analysts (who understand the business problem) are technical enough to work with the raw data input, and transform it in such a way that it can "tell a story" with a visual output.
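The datafeed transformation itself is done visually in Mashzone, but the logic is roughly this (a hand-written Python equivalent; the field names, figures, and the bubble-size rule are assumptions for illustration):

```python
# Sketch of a datafeed-style transformation from the flattened CSV
# into bubble-chart records.  Field names, figures, and the size rule
# are illustrative assumptions, not the actual Mashzone feed.
import csv
import io

flat_csv = """system,risk,value,annual_cost
Procurement system,0.82,0.40,1200000
HR system,0.35,0.70,300000
"""

bubbles = []
for row in csv.DictReader(io.StringIO(flat_csv)):
    bubbles.append({
        "label": row["system"],
        "x": float(row["risk"]),                   # X-axis: overall system risk
        "y": float(row["value"]),                  # Y-axis: business value
        "size": float(row["annual_cost"]) / 1e6,   # bubble size in $M
    })

print(bubbles[0]["label"], bubbles[0]["size"])  # Procurement system 1.2
```

This is the "telling a story" step: each raw column is mapped onto a visual channel (position, size) that a stakeholder can read at a glance.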

Present the result in a dashboard.

In our case, the visual presentation is a Bubble Chart (using the TIME[1] method).  The four quadrants are typical of a TIME chart, but the audit-risk emphasis in this case aligns the X-axis to overall system risk.  The Y-axis represents the value aspect of the system, derived from a separate set of measures related to cost and benefits.

As part of the analysis, it is important for stakeholders to understand that they can trust the data.  This is done by analyzing the quality attributes of the metrics that are the basis for the analysis.  For example: "When was the data last updated?" or "What is the source of the data?"  In this case, it is sufficient to indicate how many of the objects in the Bubble Chart have been validated (by a knowledgeable person) in the last 6 months.
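That freshness check is simple to express in code.  A sketch, with made-up object names and dates and a fixed "today" so the example is reproducible:

```python
# Illustrative data-quality check: how many objects were validated by
# a person in the last 6 months.  Names, dates, and the fixed "today"
# are made up for the example.
from datetime import date, timedelta

today = date(2020, 6, 1)  # fixed so the example is reproducible
objects = [
    {"name": "Procurement system", "last_validated": date(2020, 3, 15)},
    {"name": "HR system",          "last_validated": date(2019, 9, 1)},
]

cutoff = today - timedelta(days=182)  # roughly 6 months
fresh = sum(1 for o in objects if o["last_validated"] >= cutoff)
print(f"{fresh}/{len(objects)} objects validated in the last 6 months")
```

Surfacing this count next to the chart lets stakeholders judge, per result, how much of the underlying data is quality-assured.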

Value summary

When we opened the article, we promised several benefits of using an EA service-based approach to analysis:  better, faster, and cheaper.  Let's review and see how we did:

  • Better.  Using an EA service-based approach to answer "key business questions" can be justified as "better" than manual techniques (such as merging spreadsheets every time a new/updated answer is needed).  The EA repository stores data from several sources (in our case study, from an audit findings team, combined with input from the system owners).  Information can be tagged with quality-control attributes (e.g., the last time it was updated), and results can be based on "quality assured" information.
  • Faster.  Using the EA service-based approach, we don't re-collect information that has already been collected and previously validated.  Using a combination of out-of-the-box analysis tools from professional EA products, pre-built toolchains, and technical/non-technical people, we can produce answers faster, instead of having to launch recurring mini-projects to answer important business questions.  Borrowing from the spirit of agile software development, we showed how to provide preliminary answers faster and to add placeholders for higher-quality metrics that need more time to collect.

  • Cheaper.  Combined with an investment in professional EA tools, people, and role-specific training, day-to-day operations will scale and cost less in the long run.  For example, the right mix of outsourcing "L4-developers" vs. training in-house "L2-analysts" will optimize value, bringing stronger analysis capabilities to existing full-time staff.

For questions, or to discuss how you can more effectively leverage your Enterprise Architecture capability, please comment below or e-mail the author.

Michael Idengren, Agile Architect

KPMG CIO Advisory helps your organization deliver EA value in the context of IT/operations efficiency, process improvements, technical solution assessments, and "strategy-to-implementation" digital transformation program governance. 

[1] Tolerate, Invest, Maintain, Eliminate.  Source: "Gartner Application Portfolio Triage:  TIME for APM"