
June 5, 2024

Lessons Learned in Decoupling from Large Language Models

By Summary by IT Revolution

As more enterprises look to adopt and scale generative AI capabilities, important questions are emerging around architecture, organizational design, and avoiding unintentional lock-in to specific large language models (LLMs). In a recent presentation, Dr. Mik Kersten, CTO of Planview and author of Project to Product, shared insights from his company’s experience building an AI co-pilot product and the lessons they learned in decoupling from foundational LLMs like GPT-4.

The Journey Begins with a Demo

Like many companies, Planview began its AI journey with a successful demo using GPT-3.5 to provide an AI co-pilot capability. The demo impressed, and the team set out to scale the product across a dozen different Planview offerings. This is where the challenges began to emerge.

“In the demos, no one really has to care much about the architecture,” explained Kersten. “But what happens very quickly—even more quickly in my experience than with our first experiences with cloud—is you realize this is going to get extremely expensive if you don’t think about how you’re going to make those LLM prompts.”

The Architectural Challenge

What makes LLMs powerful in the enterprise context is the data—how it’s curated and fed to the model. This requires serious work on the data architecture to make the solution unique and valuable beyond the out-of-the-box functionality. And it is here that unintentional coupling can emerge.
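
To make that concrete, here is a minimal sketch of what feeding curated, structured data into a prompt can look like. It is purely illustrative; the names `PromptContext` and `build_prompt` are hypothetical and not drawn from Planview’s actual design.

```python
# Illustrative sketch only: pairing a user question with curated,
# structured records so the model answers from vetted data rather
# than from its general training. All names here are hypothetical.
from dataclasses import dataclass


@dataclass
class PromptContext:
    """Curated, structured data that grounds the model's answer."""
    records: list[dict]   # vetted rows from the product's data store
    schema_notes: str     # human-written notes describing the fields


def build_prompt(question: str, ctx: PromptContext) -> str:
    rows = "\n".join(str(r) for r in ctx.records)
    return (
        f"Schema notes: {ctx.schema_notes}\n"
        f"Data:\n{rows}\n\n"
        f"Question: {question}\n"
        "Answer using only the data above."
    )
```

The durable value sits in the curation pipeline that fills `PromptContext`, not in the model itself; that is the part of the architecture worth protecting.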

As Kersten’s team began building out the co-pilot across products, they found the prompt engineering work was getting sucked into the various product teams. The road-mapping tool team was writing their own prompts, for example. This spread of prompt know-how seemed okay at first, but issues quickly became apparent.

“We realized we had an interesting use case, because our use case was really more around quantitative and structured data, less around images or natural language data,” said Kersten. “And it turns out the chain of thought process—how you get these LLMs to reason over data iteratively—was very different between GPT-4 and Claude 3.”

Kersten realized that by decentralizing prompt development, they were setting themselves up to be tightly coupled to GPT-4. If a decision were made to switch LLMs down the line, it could take months, as every team would need to rewrite its prompts. They had inadvertently created a form of architectural coupling that would make change difficult and expensive.
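
As a hypothetical illustration of that coupling (not Planview’s code), imagine a product team embedding a chain-of-thought prompt, tuned to one model’s reasoning style, directly in its feature code:

```python
# Hypothetical anti-pattern: a prompt tuned to one model's
# chain-of-thought behavior, hard-wired into product code.
# Every call site like this must be rewritten if the model changes.
def roadmap_summary_prompt(milestones: list[str]) -> str:
    steps = "\n".join(f"- {m}" for m in milestones)
    return (
        "Let's think step by step.\n"          # phrasing tuned to one LLM
        f"Given these milestones:\n{steps}\n"
        "First list the risks, then the dependencies, then a summary."
    )
```

Multiply this across a dozen product teams and the months-long switching cost Kersten describes follows directly.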

Rethinking the Organizational Wiring

Accompanying this architectural challenge was an organizational one: who should be responsible for the prompt engineering and “LLMOps” work? The product teams? The AI/data science team? The co-pilot platform team?

Initially, Kersten pushed back on having the data science team take on operational responsibilities. “No way in hell the data science team is going to do ‘you build it, you run it’,” he recalled saying. “Who wants data scientists on call?”

But as the coupling risks became clear, the team changed course. The principal data scientist came back and said they needed to centralize this work and take on operational support, even putting data engineers on call, in order to maintain development velocity. They couldn’t afford the coordination overhead of leaving it federated.

Centralizing for Loose Coupling

The solution Planview landed on was to centralize the critical integration points while maintaining loose coupling elsewhere. All prompt engineering work was pulled into a single repository owned by a central team. This ensured tight cohesion at the key integration surface to the LLMs while preserving the ability to switch to different models or providers in the future.
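
A minimal sketch of what such a centralized layer might look like, assuming a simple prompt-provider interface; the class and task names are hypothetical, not Planview’s implementation:

```python
# Sketch of a centralized prompt layer: model-specific templates live
# in one repository behind a stable interface. Names are hypothetical.
from typing import Protocol


class PromptProvider(Protocol):
    def render(self, task: str, **kwargs: str) -> str: ...


class Gpt4Prompts:
    def render(self, task: str, **kwargs: str) -> str:
        if task == "summarize_roadmap":
            # step-by-step phrasing that suits this model
            return "Let's think step by step.\n" + kwargs["data"]
        raise KeyError(task)


class Claude3Prompts:
    def render(self, task: str, **kwargs: str) -> str:
        if task == "summarize_roadmap":
            # a different prompt structure for a different model
            return f"<data>\n{kwargs['data']}\n</data>\nSummarize the roadmap."
        raise KeyError(task)


PROVIDERS: dict[str, PromptProvider] = {
    "gpt-4": Gpt4Prompts(),
    "claude-3": Claude3Prompts(),
}


def render_prompt(model: str, task: str, **kwargs: str) -> str:
    # Product teams call this one function; which model answers is
    # a configuration decision made in one place.
    return PROVIDERS[model].render(task, **kwargs)
```

Under this shape, switching models becomes a configuration change at a single integration surface rather than a rewrite across a dozen product codebases.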

The team set up two different organizational “wirings” between this central team and the product teams, which Kersten says they will evaluate against each other to see which approach works best. The key is maintaining regular communication between the architects and leads of the AI platform and product teams so the structure can evolve as they learn.

This centralized-decentralized hybrid model allows Planview to take an options-based approach to their architecture. They currently support both GPT-4 and Claude 3, giving them leverage to switch between models as new capabilities emerge and cost/performance tradeoffs change over time. Had they left the prompt engineering decentralized, they would be facing months of rework to make such a transition.

Learning to Rewire Frequently

For Kersten, this experience really drove home the importance of organizational design in managing architectural coupling. “The organizational debt that we would have created by decentralizing is completely shocking to me,” he said. “If we got the wiring wrong, that would have created the tech debt.”

His key takeaway is that in a world where the tech is changing so rapidly, leaders need to put in place the conditions to reorganize and rewire the architecture on a frequent basis—perhaps as often as monthly. Most companies only take on such changes annually or quarterly at best.

By constraining this organizational rewiring, leaders inhibit their teams from making the architectural changes necessary to avoid lock-in and accidental coupling. And the scale of this challenge is only increasing as enterprises look to adopt LLMs and generative AI more broadly. Getting the wiring right between platform teams, data science, product teams, and the LLM platforms themselves is critical.

An Ongoing Journey

Kersten acknowledges this is still an ongoing journey of learning for Planview. He expressed eagerness to learn from other organizations about their approaches to structuring data science teams and their architectural choices in designing LLM-powered products.

The Planview story illustrates that, powerful as they are, foundational LLMs and generative AI demand careful thought about coupling at both the technical architecture and organizational levels. Unintentional lock-in and excessive switching costs can quickly emerge if deliberate steps aren’t taken early to centralize key integration points and maintain loose coupling to external services.

As the technology continues to rapidly evolve, enabling organizational agility to refactor these architectures frequently will become a key leadership imperative. Those who get the wiring right will be positioned to rapidly embrace new capabilities while avoiding crippling tech and organizational debt. The lessons Planview learned provide a valuable set of guideposts for other executives embarking on this journey.

Watch the full presentation in our video library here.

Sign up for the next Enterprise Technology Leadership Summit here.

About The Authors

Summary by IT Revolution

Articles created by summarizing a piece of original content from the author (with the help of AI).
