June 5, 2024
As more enterprises look to adopt and scale generative AI capabilities, important questions are emerging around architecture, organizational design, and avoiding unintentional lock-in to specific large language models (LLMs). In a recent presentation, Dr. Mik Kersten, CTO of Planview and author of Project to Product, shared insights from his company’s experience building an AI co-pilot product and the lessons they learned in decoupling from foundational LLMs like GPT-4.
Like many companies, Planview’s AI journey began with a successful demo using GPT-3.5 to provide an AI co-pilot capability. The demo impressed, and the team set out to scale the product across a dozen different Planview offerings. This is where the challenges began to emerge.
“In the demos, no one really has to care much about the architecture,” explained Kersten. “But what happens very quickly—even more quickly in my experience than with our first experiences with cloud—is you realize this is going to get extremely expensive if you don’t think about how you’re going to make those LLM prompts.”
What makes LLMs powerful in the enterprise context is the data—how it’s curated and fed to the model. This requires serious work on the data architecture to make the solution unique and valuable beyond the out-of-the-box functionality. And it’s here where unintentional coupling can emerge.
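To make this concrete, here is a minimal sketch, not drawn from the presentation, of what "curating and feeding data to the model" can look like in practice: structured records are filtered and normalized, then injected into the prompt as context. All names (the portfolio domain, field choices, prompt wording) are illustrative assumptions.

```python
import json

def curate_portfolio_context(records: list[dict]) -> str:
    """Select and normalize the structured fields worth showing the model."""
    curated = [
        {"project": r["name"], "status": r["status"], "spend": r["spend"]}
        for r in records
        if r.get("status") is not None  # drop rows the model can't reason over
    ]
    return json.dumps(curated, indent=2)

def build_prompt(question: str, records: list[dict]) -> str:
    """Combine the user question with curated data into a single prompt."""
    return (
        "You are an assistant for portfolio planning.\n"
        "Answer using only the data below.\n\n"
        f"DATA:\n{curate_portfolio_context(records)}\n\n"
        f"QUESTION: {question}"
    )

records = [
    {"name": "Apollo", "status": "at risk", "spend": 120000},
    {"name": "Hermes", "status": None, "spend": 80000},
]
prompt = build_prompt("Which projects are at risk?", records)
```

The curation step is where the data-architecture work lives; the prompt itself is thin by comparison, which is exactly why the value (and the coupling risk) sits in how the data pipeline and templates are owned.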
As Kersten’s team began building out the co-pilot across products, they found the prompt engineering work dispersing into the individual product teams. The road-mapping tool team was writing their own prompts, for example. This spread of prompt know-how seemed harmless at first, but issues quickly became apparent.
“We realized we had an interesting use case, because our use case was really more around quantitative and structured data, less around images or natural language data,” said Kersten. “And it turns out the chain of thought process—how you get these LLMs to reason over data iteratively—was very different between GPT-4 and Claude 3.”
By decentralizing prompt development, Kersten realized they were setting themselves up to be tightly coupled to GPT-4. If a decision was made to switch LLMs down the line, it could take months as every team would need to rewrite their prompts. They had inadvertently created a form of architectural coupling that would make change difficult and expensive.
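A hypothetical sketch of this anti-pattern, assuming nothing about Planview’s actual code: a product team embeds a prompt tuned to one model’s chain-of-thought style directly in feature code. Multiply this across a dozen teams and a model switch means rewriting every call site.

```python
# Anti-pattern sketch: a model-specific prompt hardcoded inside a product
# feature. The "think step by step" phrasing here is an implicit GPT-4
# dependency; as the article notes, Claude 3 needed a different reasoning
# structure, so every such call site would have to change on a switch.

def roadmap_summary_prompt(milestones: list[str]) -> str:
    steps = "\n".join(f"- {m}" for m in milestones)
    return (
        "Let's think step by step about the roadmap below.\n"
        "First list the dependencies, then summarize the risks.\n\n"
        f"{steps}"
    )

prompt = roadmap_summary_prompt(["Design complete", "Beta launch"])
```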
Accompanying this architectural challenge was an organizational one. Who should be responsible for developing the prompts and “LLMOps” work? The product teams? The AI/data science team? The co-pilot platform team?
Initially, Kersten pushed back on having the data science team take on operational responsibilities. “No way in hell the data science team is going to do ‘you build it, you run it’,” he recalled saying. “Who wants data scientists on call?”
But as the coupling risks became clear, the team changed course. The principal data scientist came back and said they needed to centralize this work and take on operational support, even putting data engineers on-call, in order to maintain development velocity. They couldn’t afford the coordination overhead of leaving it federated.
The solution Planview landed on was to centralize the critical integration points while maintaining loose coupling elsewhere. All prompt engineering work was pulled into a single repository owned by a central team. This ensured tight cohesion on the key integration surface to the LLMs while enabling the switch to different models or providers in the future.
Two different organizational “wirings” were set up between this central team and the product teams, which Kersten says they will evaluate against each other to see which approach works best. The key is maintaining regular communication between the architects and leads of the AI platform and product teams to evolve the structure as they learn.
This centralized-decentralized hybrid model allows Planview to take an options-based approach to their architecture. They currently support both GPT-4 and Claude 3, giving them leverage to switch between models as new capabilities emerge and cost/performance tradeoffs change over time. Had they left the prompt engineering decentralized, they would be facing months of rework to make such a transition.
For Kersten, this experience really drove home the importance of organizational design in managing architectural coupling. “The organizational debt that we would have created by decentralizing is completely shocking to me,” he said. “If we got the wiring wrong, that would have created the tech debt.”
His key takeaway is that in a world where the tech is changing so rapidly, leaders need to put in place the conditions to reorganize and rewire the architecture on a frequent basis—perhaps as often as monthly. Most companies only take on such changes annually or quarterly at best.
When leaders constrain how often this organizational rewiring can happen, they prevent their teams from making the architectural changes needed to avoid lock-in and accidental coupling. And the scale of this challenge is only increasing as enterprises look to adopt LLMs and generative AI more broadly. Getting the wiring right between platform teams, data science, product teams, and the LLM platforms themselves is critical.
Kersten acknowledges this is still an ongoing journey of learning for Planview. He expressed eagerness to learn from other organizations about their approaches to structuring data science teams and their architectural choices in designing LLM-powered products.
The Planview story illustrates that foundational LLMs and generative AI, while powerful, require careful thought about coupling at both the technical architecture and organizational levels. Unintentional lock-in and excessive switching costs can quickly emerge if deliberate steps aren’t taken early to centralize key integration points and maintain loose coupling to external services.
As the technology continues to rapidly evolve, enabling organizational agility to refactor these architectures frequently will become a key leadership imperative. Those who get the wiring right will be positioned to rapidly embrace new capabilities while avoiding crippling tech and organizational debt. The lessons Planview learned provide a valuable set of guideposts for other executives embarking on this journey.
Watch the full presentation in our video library here.
Articles created by summarizing a piece of original content from the author (with the help of AI).