Q&A with Scott Prugh of CSG on Patterns and Anti-patterns for DevOps Team Topologies

We recently talked with Scott Prugh, Chief Architect & SVP Software Engineering at CSG, about some of the patterns and anti-patterns in the DevOps Topologies online catalog (predecessor of the Team Topologies book). We explore how the topologies have helped shape team structures and interactions at CSG.

We were excited to chat with Scott because what CSG has achieved is rather impressive for a 35+ year old provider of software and services operating across more than 120 countries worldwide, with over 3300 employees (including 1000 IT practitioners and 40 development teams).


Scott Prugh is a regular speaker, notably at the DevOps Enterprise Summit in the US, where he has shared both CSG’s and their clients’ journeys into DevOps year after year.

Interview by Manuel Pais, co-author of Team Topologies.

Manuel: Hi Scott. Over the last few years, you and Erica Morrison have actively shared your journey at the DevOps Enterprise Summit. Why do you find it important to get these stories out and what have been some of your key takeaways?

Scott: I believe it is incredibly important to nurture learning and sharing as a community. Personally, I view it as my responsibility to set the tone as a leader on how sharing and learning from each other is so vitally important. At CSG we have a heritage of being a great software & service provider. We have also shown that companies with a traditional heritage in technology, people and processes can radically transform to continue to be a leader in the digital marketplace.

As far as takeaways, I have learned that:

1) Many companies are struggling with some of the same things: legacy technologies, operations pain, technical debt, employee burnout.

2) The patterns to solve for these problems are similar but each company will have a different path to get there. There is no silver bullet. It takes courage, trial & error, and persistence.

3) The community as a whole is incredibly supportive and committed to helping each other evolve.

How the DevOps Topologies catalog helped

Manuel: You recently provided some great insights around our DevOps Topologies catalog. To what extent have the patterns been useful for you to trigger discussions around team responsibilities and interactions within CSG and with your clients?

Scott: The patterns and pictures of them are a great place to:

1) Discuss alternative structures that have seen success

2) Discuss the positives and negatives of said structures and the ability to adopt them

3) Discuss anti-patterns that are seen to lead to low performance

Manuel: In your comments on DevOps Topologies, you mentioned how there’s a thin line between “Type 2: Fully Shared Ops Responsibilities” where a cross-functional team takes on ops responsibilities and “Anti-Type F: Ops Embedded in Dev Team” where ops responsibilities are “dumped” on a development team. What are some of the anti-patterns you’ve seen in the field around that?

Scott: I am a full subscriber to trying to manage to Type2 as much as possible. FWIW, I think Type2 is better coined as either: Build/Run Teams or Cross Functional Build and Run Teams. We actually use the term DevOps Team to mean Build & Run Teams at CSG but I would caution against using the term DevOps as it is so heavily overloaded. The problem with the term “Fully Shared Ops Responsibilities” is that it implies that all team members share ops responsibility. Although that has been seen to work, it can cause context switching and burnout (as you mention). Type2 should signify a team or set of teams composed of most of the resources needed to build and operate their service from a shared backlog. The correct Type2 team will be slightly larger (+/- 10 people) compared to traditional agile teams. If done effectively, with the correct number and type of resources, the context switching can be managed and you create an operational feedback loop that you do not get in other topologies. We have also seen this model with variants at the service or product level. A larger service might have 3 teams under one leader: one focused on operations (run and support) and two on development. All teams on this service should share a backlog for visibility and rotate team members to create continued improvement.

The key antipatterns we have seen are:

1) Developer as Operations: This overloads the developer because they are working two jobs.

2) Siloed Backlog: Not exposing the entire backlog to the team as well as product management, and not having positive discussions around improvements as well as tough discussions around operational pain and the priority of work.

Manuel: The DevOps Topologies patterns identify different perspectives for groups building and operating software systems. These are two very different but highly connected activities obviously. In your experience, how can teams acquire the necessary skills and mindset to perform both activities effectively?

Scott: Hire both skill sets on the team and cross train with pairing and a shared backlog. Start treating operations as an engineering problem.

Moving from 3-tier support to Swarming

Manuel: You’ve also talked recently about the need to move from the classic 3-tier support model to a swarming model. Could you expand on that and what’s the impact for organizations that remain attached to the old model?

Scott: Jon Hall is really the expert on this: https://www.rundeck.com/blog/twl-jon-hall-on-swarming-to-avoid-the-painful-3-tier-support-model

We did develop a bit of this model independently (called our MIM process) but once he published his work I began to refer to it as swarming. For the teams, this can mean more context switching because when an escalated incident comes in you will pull your ops folks and possibly developers onto a bridge to troubleshoot. Although disruptive, it has remarkable efficacy in driving down MTTR as well as exposing the team to the realities of their service in production. This has a great side-effect of improving robustness in the service as well as resilience in the people that support it.

For folks stuck in the old model: The reality is that all the handoffs between L1-L3 elongate your response and recovery to an issue. There seems to be no better way to annoy your customers than putting their resolution as far away as possible from the people that know how to fix it (the team that built the service).

Manuel: We can’t wait for the DevOps Enterprise Summit in Las Vegas next week (October 28-30)! Could you give us a sneak peek on the topic of your talk?

Scott: In the past, we covered both the radical people and process changes including the way we have changed how we work. This really started with our Lean/Agile transformation in 2012, followed by our DevOps transformation in 2016 and even integrating Product Management into DevOps thinking in 2018.

This year’s topic will take a different path than the previous years. This year we will be very technically focused and cover the great modernization work we have done at CSG. As a company with a great tradition, leadership and engineering, we saw it as vital not to be viewed as, or act like, a legacy company. We wanted to be a heritage company, not a legacy one.

Many years ago we set out to completely modernize and transform our technology and application stack. This included putting in end-to-end version control & CI/CD, automated testing and test data management, as well as modernizing our infrastructure and our application stack which includes mainframe technology. We have also had a strong focus on adopting commodity and OSS as well as moving away from painful, costly and dangerous vendors.

Manuel: Thank you for sharing your experience with us, Scott!




If you want to know how to apply Team Topologies not only to DevOps but more widely to the intersection of technology and business, check out our book “Team Topologies: Organizing Business and Technology Teams for Fast Flow”.
