Virtual Worlds: using Team Topologies at Improbable to transform teams, technology, reliability, and customer satisfaction

 

By Yan Collendavelloo, Director of Engineering at Improbable


The Team Topologies book was a hugely valuable resource, not just for providing compelling information about Conway’s law, fracture planes, team types and so on, but also for providing war stories from other companies.

Using some of the key principles and practices defined in Team Topologies, Improbable were able to identify and execute on a plan to evolve the team structure in a way that would align with their future growth goals.

Taking a stream-aligned approach utilizing platform teams to reduce cognitive load has resulted in massive improvements in the inherited platform.

Founded in 2012, Improbable is a British technology company, dedicated to solving the challenges of building rich virtual worlds and pioneering the path to the metaverse. The Improbable vision is to bring about rich, interactive environments that will transform economies and industries, to make people happier, safer and more connected. These can take the form of multiplayer games or synthetic environments that simulate the real world, helping governments model, plan and train.

Improbable’s unique technology enables thousands to assemble in a single virtual game environment or simulate real-world challenges at scale. The critical infrastructure Improbable creates will power the coming age of virtual worlds.

In 2020 Improbable acquired the Munich-based video games company Zeuz, a managed hosting service used by the makers of hit online games such as SCUM and Conan Exiles, to further the company’s mission to make online games development more efficient, effective and accessible for developers. The two companies differed significantly in size and culture: Improbable had over 900 employees and Zeuz only around 50. Even so, Improbable were able to apply the principles and practices of Team Topologies to re-imagine how the two companies could merge effectively.

Challenges of the merger

Historically, Improbable’s focus had been on enabling video games on a massive scale, with tens of thousands of players globally. Achieving such scale required a focus on hiring top tech talent with strong engineering maturity and high-quality development practices, including continuous integration and delivery, automated testing and monitoring.

Zeuz, on the other hand, had a strong customer focus, with in-depth games industry knowledge and a clear idea of their customer base. Zeuz had developed a fantastic multi-cloud and hybrid-cloud platform to run multiplayer video games at planet scale in a cost-effective way, combining the low cost of bare metal servers across 250 data centers with the elasticity of the 4 major cloud providers. On top of that, it was easy to use and allowed game studios to focus on what they enjoy most: making fun games. However, as a start-up with a relentless focus on the customer, Zeuz paid more attention to getting the platform into the hands of users and obtaining real customer feedback in the shortest time possible than to employing development practices such as those used at Improbable. The two companies’ software delivery approaches could not have been more different. The development approach at Zeuz had also resulted in a monolithic application which served its original purpose but needed to evolve fundamentally in order to meet the future needs of Improbable.

The acquisition of Zeuz gave Improbable an opportunity to re-think their mission and how to execute on it through their commercial and product strategies, both to achieve the product-market fit they had been looking for and to improve the general development practices used by Zeuz. With the teams from each company working in such significantly different ways, Improbable needed a way to organize their engineering teams that would help obtain the best possible results.

First attempts at merging the teams

Improbable’s vision was the creation of a decoupled product that provided both ease of use for standard session-based games (à la Fortnite) and optionality for more complex games (MMOs like World of Warcraft, or massive interactive live events such as a gigantic concert), allowing different game studios to consume whichever multiplayer services they required. After the merger, Improbable had inherited two products, one from each company. Based upon their future growth goals, they also needed to create a third product that was an evolution of the previous two platforms, whilst still supporting existing live customers.

Differing team structures between Zeuz and Improbable 


The way in which the teams were structured differed between the two organizations. Zeuz, based in Munich, had 3 major teams: a front-end team, a back-end team and a sysops team, each working on a monolithic stack. The Improbable teams, based in London, were organized around components of their existing product.

Whilst developing the new product, Improbable initially used the London-based teams to design and build new product features whilst the Munich teams took care of business-as-usual (BAU) tasks on the original Zeuz platform (shown in the diagram below).

First team organization attempt, new features in London, BAU in Munich


However, this did not work for very long: constraints were local, and the localized solutions were suboptimal. The feature team approach was efficient for fixing specific platform issues, but it soon became apparent that the lack of code ownership in the monolithic codebase led to difficulties in prioritization, planning and coordination across teams.

As the new product evolved, it became challenging for each team to fully understand the long-term vision for the product. There was a lot of context switching, and centralized decision-making caused bottlenecks in the flow of value. It was also difficult to find the right balance of people in each team, because some teams lacked the strong customer focus that was so evident in the teams at Zeuz.

The culture at Improbable is very much one of collaboration and help. Each team cared about helping the others, but there was a general feeling that they were not tapping into the potential of combining the capabilities of the teams from both Improbable and Zeuz. Something needed to change.

Evolving the team structure via Team-First thinking

Driven by the need for change, Yan Collendavelloo, Director of Engineering at Improbable, was thinking about what the organization design should look like and had recently been introduced to Team Topologies via a book club run by a colleague. He found that a number of key concepts from the book resonated and provided insights into how the team at Improbable might look at the problem from a different perspective.

“Team assignment is the first draft of the architecture”

Michael Nygard, author of Release It!

Yan decided to start thinking about things from a team-first perspective. Using the principles of Conway’s law and fracture planes along business domain boundaries, Yan was able to uncover potentially independent streams of value within the organization. These would enable the creation of stream-aligned teams with a “You build it, you run it” mentality: fully cross-functional teams that combined the engineering capability of the people at Improbable with the customer focus of the people at Zeuz. These teams were able to obtain immediate feedback from customers, enabling them to learn quickly from any mistakes.

Improbable were building a platform that served different types of game studio. Each of these game studios had differing needs based upon the type of game they were building and those needs would be met by the different platform services provided by the Improbable Multiplayer Services platform. 

Team interaction model depicting the platform relationship between Improbable and game studios


Whilst exploring ways to break up the Improbable Multiplayer Services platform, it became apparent that there were several independent streams of value that served the requirements of each customer based upon the types of game they were creating. These included streams such as orchestration, infrastructure hosting and compute services. The team aligned to each stream created and curated its own Team API, which was captured on the internal wiki and described which areas of the domain the team worked on, what its Slack channels were, how the team should be contacted and so on.
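As an illustration only, a Team API entry of the kind described above could be captured as a small structured record and rendered as text for a wiki page. The following is a minimal Python sketch: the `TeamAPI` class, its field names, and the example team, channel and contact values are all hypothetical and are not taken from Improbable's actual wiki.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TeamAPI:
    """Hypothetical Team API record; the field names are assumptions."""
    name: str
    team_type: str                   # e.g. "stream-aligned" or "platform"
    domain_areas: Tuple[str, ...]    # areas of the domain the team works on
    slack_channels: Tuple[str, ...]  # where to find the team on Slack
    contact: str                     # how the team prefers to be contacted

    def to_wiki(self) -> str:
        """Render the record as plain text suitable for a wiki page."""
        lines = [
            f"Team: {self.name} ({self.team_type})",
            "Domain areas: " + ", ".join(self.domain_areas),
            "Slack channels: " + ", ".join(self.slack_channels),
            f"How to contact us: {self.contact}",
        ]
        return "\n".join(lines)


# Illustrative values only; not real team or channel names.
orchestration = TeamAPI(
    name="Orchestration",
    team_type="stream-aligned",
    domain_areas=("game server orchestration",),
    slack_channels=("#orchestration-help",),
    contact="Post in #orchestration-help",
)
print(orchestration.to_wiki())
```

Keeping the record small and uniform across teams is one way to make it easy for other teams to find out who owns what and how to reach them, which is the purpose the Team API serves in the text above.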

It was then necessary for the teams to go through short periods of collaboration in order to develop X-as-a-service capabilities between each of the stream-aligned teams. These APIs would be provided by platforms curated by each of the stream-aligned teams.

Team interaction model depicting use of collaboration and facilitation interaction modes between teams


During the evolution of the new platform, Improbable also engaged with some external providers for a short period of time to help facilitate the upskilling of their teams on the Kubernetes platform, enabling them to migrate away from Docker Swarm. The internal platform teams are responsible for reducing the cognitive load of stream-aligned teams by providing a mixture of internal services and expertise in public cloud providers. These teams collaborate with the stream-aligned teams for short periods when necessary to evolve their stack.

Team interaction model depicting collaboration between the internal platform and stream-aligned teams


Eventually, the teams had evolved all of the X-as-a-service capabilities required to meet both their internal and external customer needs. Each of the stream-aligned teams was able to manage its own backlog and continuously deliver value to its customers, with short feedback loops ensuring that the product benefited from anything learned.

Team interaction model depicting X-as-a-service interactions between teams and customers


Throughout the evolution of the team organization, Improbable tracked their teams using the four key metrics as defined in Accelerate by Forsgren et al.:

  • Deploy Frequency

  • Mean Time To Recover (MTTR)

  • Lead Time for Changes

  • Change Failure Rate

By tracking and using these metrics, Improbable were able to determine the impact of the team changes. They observed that the teams evolved from low-performing DevOps teams to high-performing DevOps teams, with an ambition of becoming elite-performing DevOps teams.
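To make the four metrics concrete, here is a minimal Python sketch of how they might be computed from a list of deployment records. It is not Improbable's tooling: the `Deployment` record, its fields, the `four_key_metrics` function and the sample data are all assumptions made for illustration, and the sketch simplifies by treating recovery time as the gap between a failed deployment and the restoration of service.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def four_key_metrics(deployments: List[Deployment], period_days: int) -> Dict[str, float]:
    """Compute the four key metrics over a reporting period of `period_days`."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deploy_frequency_per_day": len(deployments) / period_days,
        "lead_time_hours": mean(t.total_seconds() for t in lead_times) / 3600,
        "change_failure_rate": len(failures) / len(deployments),
        # Reported as 0.0 when no failed deployment has been restored yet.
        "mttr_hours": (mean(r.total_seconds() for r in recoveries) / 3600
                       if recoveries else 0.0),
    }


# Illustrative data only: two deployments in one day, one of which failed.
start = datetime(2021, 1, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=2)),
    Deployment(start, start + timedelta(hours=4), failed=True,
               restored_at=start + timedelta(hours=5)),
]
print(four_key_metrics(sample, period_days=1))
```

Computing the same figures per team and per period is what allows a before-and-after comparison of the kind described above.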

Summary: Great outcomes in terms of DevOps metrics and team performance

Merging two companies that had significantly different approaches to software development was a daunting task. Using some of the key principles and practices defined in Team Topologies, Improbable were able to identify and execute on a plan to evolve the team structure in a way that would align with their future growth goals.

The Team Topologies book was a hugely valuable resource, not just for providing compelling information about Conway’s law, fracture planes, team types and so on, but also for providing war stories from other companies that detailed how stream-aligned teams compared with product teams and feature teams. It is important, however, to share and discuss the book openly within the organization, via book clubs or lunchtime learning sessions for example, and to help the wider community understand the reasons for adopting this approach. The online resources at https://github.com/teamTopologies/ can help share some of the key concepts, and other resources such as infographics, the online academy and workshops can also be used to educate others within the organization.

Making the leap to such a different approach took a lot of internal discussion, since some team members were in favour of isolating the new and the old, arguing that isolation would yield greater productivity. However, the decision was taken to restructure using stream-aligned and platform teams, as this offered the vision of a far more compelling future. That decision has since been justified by strong results such as a 30x reduction in Mean Time To Recover (MTTR) and a 5x reduction in major incidents.

Taking a stream-aligned approach utilizing platform teams to reduce cognitive load has resulted in massive improvements in the inherited platform. Not only did the reliability and performance of teams become far better than they had ever been, the approach also contributed to Improbable being able to smoothly ship new products, which to date have achieved a 100% retention rate and improved customer satisfaction. It is important to note, however, that the strong culture of trust and collaboration at Improbable was foundational in achieving the above outcomes.

About the Author

Yan Collendavelloo is Director of Engineering at Improbable


 