...rants by Asheesh Mehdiratta on Coaching, Transformation and Change

Tag: devops

Why is MTTR my favourite metric?

As you walk the DevOps transformation journey, you build out success stories, define metrics and energise teams towards continuous improvement. But to quantify the end user experience, I always look to the MTTR (Mean Time To Recover) metric.

MTTR is defined as the average time required to repair a failed component or device. ITIL definitions can be more expressive.
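To make the definition concrete, here is a minimal Python sketch of the calculation. The incident timestamps are made up for illustration, and the function name `mean_time_to_recover` is my own; in practice the down/restored times would come from your monitoring or ticketing system.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (service went down, service restored).
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 10, 30)),     # 90 minutes
    (datetime(2024, 1, 10, 14, 0), datetime(2024, 1, 10, 14, 45)),  # 45 minutes
    (datetime(2024, 1, 21, 22, 15), datetime(2024, 1, 21, 23, 0)),  # 45 minutes
]


def mean_time_to_recover(records):
    """Return the average outage duration as a timedelta."""
    if not records:
        raise ValueError("need at least one incident to compute MTTR")
    # Sum the individual outage durations, starting from a zero timedelta.
    total = sum((restored - down for down, restored in records), timedelta())
    return total / len(records)


mttr = mean_time_to_recover(incidents)
print(f"MTTR: {mttr.total_seconds() / 60:.0f} minutes")  # MTTR: 60 minutes
```

Plotting this value per week or per month gives the kind of trend a dashboard would show.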

Why is MTTR so useful, and why is it my favourite metric?

Here are a few of my reasons:

  1. MTTR captures the end user EXPERIENCE, by recording when a service goes down and when it is restored.
  2. It shows the SPEED at which your team/organisation works, including how quickly the team –
    1. Acknowledges the problem
    2. Solves the problem
    3. Communicates the resolution to the end user.
  3. MTTR encapsulates the internal dynamics of the teams/organisation.
  4. It is a simple, easy-to-understand metric, without any ambiguity.
  5. It can be measured in any unit of time (hours/days), which everyone can understand, including Dev and Ops.
  6. MTTR can be captured easily, automated and put on a dashboard showing trends.
  7. It is applicable across all systems, of varying complexity and size.
  8. MTTR is technology agnostic, and can be understood by everyone – management, executives, support, operations and developers.

Conclusion

You do not want to measure anything unless it helps the teams and stakeholders, yet it is easy to get carried away to the other extreme and measure everything. MTTR is a simple, easy-to-understand, easy-to-capture metric, which exposes inefficiencies and reminds the teams of the end user experience every time!

So what has been your favourite metric? Feel free to share your feedback in the comments below.

If you like what you read here, then do share this article, and subscribe to my future articles. 

5 Tips to integrate Change management in DevOps

 

In order to reduce operational risks, organizations put in CONTROLS, typically via Change Management processes, which satisfy audit and compliance requirements. These CONTROLS create friction among the teams. To minimize this friction, let us look at 5 tips to integrate Change management in your DevOps journey.

TIP #1 – TALK TO YOUR AUDIT/COMPLIANCE TEAM

START a conversation with your Audit/Compliance team members now, and try to understand their needs. These conversations will help your team to empathize and see the world through the ‘audit’/‘security’ lens. You can then move forward to provide the ‘solution’ instead of jumping in with preconceived notions. Read more here on how to start these conversations and ASK the right questions.

TIP #2 – CREATE TRACEABILITY AND CONTEXT FOR YOUR CHANGE SET

START providing the traceability and context for your change set to the operations teams. The goal for your team should be to provide evidence of quality test results for the proposed change set, which will provide the required CONFIDENCE for the Operations team. Read more here on how to start providing this traceability and have a deeper engagement with your Operations teams.

TIP #3 – RE-CLASSIFY YOUR CHANGE SETS

Start to reclassify your change sets, and build agreements which allow you to auto-deploy to production. By building “standard” change sets with a pre-defined risk profile (low risk first!), you can move towards a culture of trust with the change and operations teams. Read more here on how to reclassify your change sets and increase transparency across the team.

TIP #4 – USE TELEMETRY TO SHOW EVIDENCE

Start to build out your telemetry systems. These systems capture errors, warnings, events and trigger points, and log this data to central or distributed stores. Use this evidence to demonstrate the CONTROLS required for the audit and change management processes. Read more here on how to build these telemetry systems in your DevOps journey.

TIP #5 – AUTOMATE, AUTOMATE, AND AGAIN AUTOMATE !

Stop doing manual steps in your change management teams, STOP! Start to automate your workflows: builds, automated tests, deployments and reports. Read more here on how to increase the automation across the life cycle and increase transparency across the team.

So go ahead, kick start and integrate the change management in your DevOps with these tips. Subscribe to my blog for more, and feel free to share your feedback here.

Tip #5: Effective Change management in DevOps

In order to reduce operational risks, organizations put in CONTROLS, typically via Change Management processes. To minimize the friction in your DevOps journey, and building on my previous Tip #4, let us look at Tip #5 for effective change management.

TIP #5 – Automate, Automate, and AGAIN Automate !

Change management typically includes a CAB (Change Advisory Board) meeting. This CAB meeting reviews the list of change sets, which the operations team filters, and either accepts or rejects each change-set for the move into production.

The "Advisory" Board just became the "Gate Keeper", if you noticed!

Large enterprises often work with a traditional mindset. This mindset assumes a large batch of changes, which may have suited a BIG CAB meeting in the past. But now, with smaller change-sets (read: microservices) becoming the norm, imagine going through the rigor of a 2-hour CAB meeting for each of them. It will not be a very pleasant experience!

Therefore we need to re-imagine the CAB meeting and ask the simple but difficult question – WHY DO I NEED A CAB?

Purpose of the CAB

Typically the purpose of the CAB meeting is to verify all the artifacts, by asking questions such as:

  • Have you tested your changes?
  • Have you integrated the security practices?
  • Have you tested the migration, rollback and can provide evidence?
  • Is the change-set linked to the business need and do you have approvals?
  • and the list can go on and on…

Good news?

But there is good news now! All these questions can be answered easily by automating your change management workflows. This is supported by the convergence of tooling across the application development life cycle: the evidence required, along with the artifacts, can be built into your automated build and deployment pipelines and integrated with the change management workflows.
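As a rough illustration, the CAB questions listed above could become an automated gate in the pipeline. This is only a sketch: the `ChangeEvidence` fields, the `open_cab_questions` function and the sample values are all hypothetical, and a real pipeline would fill them in from its test, security and ticketing tools.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChangeEvidence:
    """Evidence a pipeline could collect for one change set (all fields hypothetical)."""
    tests_passed: bool
    security_checks_passed: bool
    rollback_tested: bool
    business_ticket: str  # link to the business need, empty if missing
    approved_by: str      # approver recorded in the workflow, empty if missing


def open_cab_questions(evidence: ChangeEvidence) -> List[str]:
    """Return the CAB questions that the collected evidence does not yet answer."""
    open_questions = []
    if not evidence.tests_passed:
        open_questions.append("Have you tested your changes?")
    if not evidence.security_checks_passed:
        open_questions.append("Have you integrated the security practices?")
    if not evidence.rollback_tested:
        open_questions.append("Have you tested the migration and rollback?")
    if not (evidence.business_ticket and evidence.approved_by):
        open_questions.append("Is the change-set linked to a business need with approvals?")
    return open_questions


# Sample change set where the rollback has not been exercised yet.
evidence = ChangeEvidence(True, True, False, "PROJ-123", "product.owner")
questions = open_cab_questions(evidence)
if questions:
    print("Blocked:", questions)
else:
    print("Ready for automated deployment")
```

An empty list means every question has evidence behind it, so the change set can move on without waiting for a meeting.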

Thus the CAB is eliminated by automating our workflows!

These automated workflows make it possible for the Operations teams to trace the requirements as they are implemented, and improve their ability to see changes, the effects of changes and approvals, and to gather the evidence in a self-service model using telemetry.

Many teams start with an intermediate manual approval step, till they start to trust the teams, and their change sets. But this manual approval step also goes away, over a period of time depending on the maturity of your teams.

As the teams implement a pre-approved change strategy, and look for future opportunities to keep improving and widening this definition, it is a win-win for both sides.

So if you are still doing manual reviews in your change management teams, STOP! Start to automate your workflows, start to automate your build process, start to add automated tests for security into your coding, start to automate your deployments to production, start to add telemetry and automate the production reports, start to complete the feedback loop, and in the end increase the transparency across the team.

So go ahead – Automate, Automate and again Automate !

Subscribe to my blog for more learnings, and feel free to share your feedback here.

Tip #4: Effective Change management in DevOps

In order to reduce operational risks, organizations put in CONTROLS, typically via Change Management processes. To minimize the friction in your DevOps journey, and building on my previous Tip #3, let us look at Tip #4 for effective change management.

TIP #4 – Use TELEMETRY to show evidence

Traditionally minded auditors look for evidence, and will typically ask for screenshots, configuration logs, settings etc. If you manage thousands of servers, this in itself is a cumbersome activity, especially if you are launching and shutting down servers in the cloud all day.

Imagine the activity needed to manage these expectations: you would simply need an army to satisfy the audit needs!

But auditors and compliance personnel often cannot read code, and hence need all the help they can get to satisfy the regulatory bodies. So you can help them by providing evidence using the following options.

 1. Create alternate data sources to present evidence 

Applications which are ‘operationally-aware’ include telemetry: they capture errors, warnings, events and trigger points, and log this data to central or distributed stores. Typical telemetry systems (MS Insights, the ELK stack of Elasticsearch, Logstash and Kibana, Splunk etc.) can capture all this information and present it in visual dashboards. These dashboards can be customized, based on the needs of the users, and present the data at multiple levels of detail.

Auditors can slice and dice this data, and ‘self serve’ their audit needs !
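A minimal sketch of what ‘operationally-aware’ logging might look like, using only the Python standard library. The in-memory stream is a stand-in for a central store, and the logger name, the `change_set` field and the messages are assumptions for illustration, not the format of any particular telemetry product.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON line so a central store can index it."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # 'change_set' is a hypothetical field passed in via `extra`.
            "change_set": getattr(record, "change_set", None),
        }
        return json.dumps(payload)


stream = io.StringIO()  # stand-in for a central or distributed log store
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("deploy.telemetry")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example output out of the root logger

logger.info("deployment started", extra={"change_set": "PROJ-123"})
logger.warning("rollback script not executed in staging", extra={"change_set": "PROJ-123"})

# An auditor-style query: every warning recorded against a change set.
events = [json.loads(line) for line in stream.getvalue().splitlines()]
warnings = [e for e in events if e["level"] == "WARNING"]
print(warnings)
```

Because every event carries the change set identifier, the same data can be filtered per change, per level or per time window in a dashboard.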

2. Use an iterative approach to building CONTROLS evidence

As part of early engagement with the auditors, successful teams invite audit teams to their sprint planning and sprint reviews. This conversation can kick start rich discussions on how to build controls evidence in every sprint, instead of the end stage.

Teams can start to build controls right from the beginning! 

Sometimes the solutions to meeting the audit controls could be as simple as maintaining version control for all the artifacts. Other solutions could be simply linking all the artifacts across the complete application development life cycle. This allows traceability for each change set put in production.

To help you explore further, read the DevOps Audit Defense Toolkit, which is written as a fictitious narrative. It provides realistic examples and links it all together.

So go ahead and start to build out your telemetry systems. Start doing early engagements with the change teams. Start to build out the controls iteratively, thereby building Trust and Transparency across the team.

Subscribe for more tips in my next post, and feel free to share your feedback here.

Tip #3: Effective Change management in DevOps

In order to reduce operational risks, organizations put in CONTROLS, typically via Change Management processes. To minimize the friction in your DevOps journey, and building on my previous Tip #2, let us look at Tip #3 for effective change management.

TIP #3 – re-classify your change sets

Enterprises will have multiple change requests being pushed to production, of varying size and complexity and with different risk profiles. But existing change management processes today do not distinguish between these variations!

In reality, different change sets allow us to build different risk profiles.

So let us try to understand these variations in the Change sets, which can typically be classified into one of these 3 categories –

 1. STANDARD Change sets

These change sets are very low risk, and operations are familiar with these. These change sets have an established approval process in place. Examples – web style changes / data table updates etc.

2. NORMAL Change sets

These change sets are higher risk, and operations are not familiar with them. These change sets typically use a CAB (Change Advisory Board) process to approve/reject the changes. This process requires submitting change forms, with schedule, impacts, risks etc. Examples – new feature/product etc.

3. HIGH Urgency Change sets

These change sets are emergency changes, with potentially high risk, and may need approvals from senior management. Examples – security patch, service fix patch etc.

Now with the above classification, we can aim to align with the operations teams and change management and ask for an agreement.

Agreement: Can the STANDARD Change set be Pre-APPROVED?

As the standard change sets are low risk, the Operations teams do not need to approve each one individually. This agreement immediately gives us the ability to define a pre-approval process, which allows us to deploy these change sets automatically (using our automated deployment pipelines).
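The three categories and the pre-approval agreement could be encoded roughly as follows. This is a sketch under my own assumptions: the `ChangeType` and `ChangeSet` names, the ticket ids and the route strings are hypothetical, and the real routing rules are whatever you agree with your operations and change management teams.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeType(Enum):
    STANDARD = "standard"          # low risk, familiar to operations
    NORMAL = "normal"              # reviewed through the CAB process
    HIGH_URGENCY = "high_urgency"  # emergency change, may need senior approval


@dataclass
class ChangeSet:
    ticket: str        # hypothetical work item id
    description: str
    change_type: ChangeType


def approval_route(change: ChangeSet) -> str:
    """Map a change set to the approval path agreed with operations."""
    if change.change_type is ChangeType.STANDARD:
        return "pre-approved: auto-deploy"
    if change.change_type is ChangeType.NORMAL:
        return "CAB review"
    return "senior management approval"


changes = [
    ChangeSet("PROJ-101", "web style change", ChangeType.STANDARD),
    ChangeSet("PROJ-102", "new product feature", ChangeType.NORMAL),
    ChangeSet("PROJ-103", "security patch", ChangeType.HIGH_URGENCY),
]
routes = {c.ticket: approval_route(c) for c in changes}
for ticket, route in routes.items():
    print(ticket, "->", route)
```

Only the STANDARD branch skips a human decision, which keeps the first agreement small and low risk.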

I am sure that this agreement itself will allow you to breathe more freely !

So go ahead and start working with your change management teams. Start to build these agreements, which allow you to auto-deploy to production, with complete Trust and Transparency across the team.

Subscribe for more tips in my next post, and feel free to share your feedback here.

Tip #2: Effective Change management in DevOps

In order to reduce operational risks, organizations put in CONTROLS, typically via Change Management processes. To minimize the friction in your DevOps journey, and building on my previous Tip #1, let us look at Tip #2 for effective change management.

TIP #2 – create traceability and context for your change set

Operations do not want to be surprised by any change! When you are working in operations every weekend and having multiple late nights, supporting servers going down or applications crashing, you really want to know what the next patch upgrade is going to do and how well it will work on the production box. Operations team members are also human and need to have the same LIFE as development team members. ASK your Operations team for their OPS PAIN INDEX:

OPS PAIN INDEX = #EXTRA HOURS WORKED x #EXTRA NIGHTS

A higher value for this Ops Pain Index will give you a better understanding of why Ops need to learn more about every new change being proposed, and its impact on the existing, stable running system. Talk about building TRUST in the development change set!
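The index is simple enough to compute by hand, but a tiny sketch shows how it could be tracked over time. The function name and the sample numbers are made up for illustration.

```python
def ops_pain_index(extra_hours_worked: float, extra_nights: int) -> float:
    """OPS PAIN INDEX = extra hours worked x extra nights."""
    if extra_hours_worked < 0 or extra_nights < 0:
        raise ValueError("inputs must be non-negative")
    return extra_hours_worked * extra_nights


# Hypothetical month: 12 extra hours spread over 4 late nights.
print(ops_pain_index(12, 4))  # 48
```

Comparing the value before and after a change in process gives a rough signal of whether life in Ops is getting better.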

Thus, the key is to INCREASE THE TRUST IN YOUR CHANGE SET, by creating traceability and providing context.

With today’s tools and deployment pipelines, it is easy to link your work items in the planning tools (say JIRA, TFS, Rally…etc.). These typically include features/stories/defects, along with the ticket number, version control check-ins, comments and release notes. This input can easily be fed into the deployment pipeline tools (Jenkins, Chef, Puppet etc.). This integrated view provides complete traceability across all stages, from requirements to deployment.

The linkage of the work items to the deployment artifacts describes the CHANGE SET and provides the CONTEXT for the Operations team.
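One way to picture this linkage is as a single record per change set that the pipeline assembles and publishes. The sketch below is an assumption about what such a record might contain; the `ChangeSetRecord` fields and sample values are hypothetical and do not reflect the data model of JIRA, Jenkins or any other tool.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ChangeSetRecord:
    """Links a planning work item to the commits and build that implement it."""
    ticket: str                                        # work item id from the planning tool
    summary: str
    commits: List[str] = field(default_factory=list)   # version control check-ins
    build_id: str = ""                                 # identifier of the pipeline build
    release_notes: str = ""

    def is_traceable(self) -> bool:
        """A change set is traceable once ticket, commits and build are all linked."""
        return bool(self.ticket and self.commits and self.build_id)


record = ChangeSetRecord(
    ticket="PROJ-123",
    summary="Update pricing table",
    commits=["a1b2c3d"],
    build_id="build-42",
    release_notes="Adds the new pricing rows.",
)
print(json.dumps(asdict(record), indent=2))
print("traceable:", record.is_traceable())
```

Publishing this record alongside the deployment gives the Operations team the context without them having to chase it.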

The additional evidence from a quality standpoint is typically available from various channels. This includes automated build results and automated testing results, showing the test cases executed, pass/fail ratio etc. across the various test stages (unit, integration, regression, performance, security tests). All these give the operations teams additional confidence that the development team has really tested the application.
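A small sketch of how the pass/fail evidence could be summarized per test stage. The numbers are invented, and the 95% threshold is purely an assumption for the example; the acceptable ratio is something to agree with your Operations team.

```python
# Hypothetical results per test stage: (tests passed, tests executed).
stage_results = {
    "unit": (480, 480),
    "integration": (118, 120),
    "regression": (75, 75),
    "performance": (10, 10),
    "security": (22, 25),
}


def evidence_summary(results, minimum_pass_ratio=0.95):
    """Compute the pass ratio per stage and flag stages below the agreed threshold."""
    summary = {}
    for stage, (passed, executed) in results.items():
        ratio = passed / executed if executed else 0.0
        summary[stage] = {
            "pass_ratio": round(ratio, 3),
            "meets_threshold": ratio >= minimum_pass_ratio,
        }
    return summary


summary = evidence_summary(stage_results)
for stage, result in summary.items():
    print(stage, result)
```

In this sample the security stage falls below the threshold, which is exactly the kind of detail Operations would want to see before the change set reaches production.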

When the development team says it works, they really mean it!

Aim to provide evidence of quality test results for the proposed change set, which will provide the required CONFIDENCE for the Operations team.

So go ahead and start providing the traceability and context for your change set to the operations teams and you will be on your way to building some new OPS friends 🙂

Subscribe for more tips in my next post, and feel free to share your feedback here.

Tip #1: Effective Change management in DevOps

In order to reduce operational risks, organizations put in CONTROLS, typically via Change Management processes. The outputs typically feed into the needs of the compliance/audit personnel, and satisfy them, but the legacy audit mindset CONFLICTS with the DevOps team mindset.

Therefore to minimize this friction, see my #1 tip in this post on how to work with the change management processes, and teams. I will be sharing more tips in my next posts.

TIP #1 – TALK to your Audit/Compliance team

  1. ASK – Why does your audit team need the Change information? 
  2. ASK – What will they do with the Change information? 
  3. ASK – What level of granularity of data about the Change is required? 
  4. ASK – Are there alternate sources of the same Change data?
  5. ASK – When do they need this Change information?

Speaking the same language (audit-speak) and asking these questions will give you, as an IT team, a better understanding of the Audit/Compliance process. You may be surprised by the technical nature of the various ACTS (financial, healthcare etc.) and even start to appreciate them.

So just go ahead and START a conversation with your Audit/Compliance team members now, and you might be pleasantly surprised.

Subscribe for more tips in my next post, and feel free to share your feedback here.

© 2025 agile journeys
