January 17, 2024

Running Large Language Models in Production: An Adventure into the Frontier

By Summary by IT Revolution

In his opening remarks at the 2023 DevOps Enterprise Summit, John Rauser, Director of Engineering at Cisco Systems, made an apt analogy. Building production systems for large language models (LLMs), he explained, is akin to “venturing into an area of the world where we don’t have a map yet.” It’s an exciting adventure into the frontier, with risk and reward in equal measure.

Rauser anchored his talk around this adventurous theme. He emphasized that we currently sit at the “peak of inflated expectations” when it comes to AI. Yet while caution is warranted, real businesses like Rauser’s are already seeing material impacts from integrating LLMs into products and services. ChatGPT reportedly passed 100 million users within roughly two months of launch, a sign of this technology’s rapid uptake and outsized potential.

Ops is the Biggest Barrier 

A core theme woven throughout Rauser’s presentation is that the field of AI operations (AIOps) represents the most significant barrier to unlocking the potential value in LLMs. Developers can imagine endless creative applications powered by models like GPT-3. But those use cases mean little without the ability to build accurate, reliable systems around such models.

Rauser explained that within Cisco, LLM initiatives fall into three high-level categories:

  1. Viziers: Helpful advisors who answer questions and provide expertise through conversational interfaces.
  2. Judges: LLMs that summarize information and provide reasoned analysis to support decisions.  
  3. Generals: Autonomous LLMs that independently carry out critical business tasks.

Of these three use cases, the autonomous General poses the greatest AIOps challenge. Unlike the outputs of Viziers and Judges, which a person reviews before acting, errant outputs from Generals can directly affect everything from revenues to regulatory compliance. Success requires exceptional accuracy and reliability at scale.

Castles, Councils, and Keeps

Rauser introduced the metaphor of building a strong castle to house the LLM “council” of Viziers, Judges, and Generals. This castle consists of three critical elements: models, data, and interfaces.

  • Models: Foundationally, production success relies on choosing (or building) the right LLM. Rauser surveyed key model options, from very large models like GPT-3 to compact open-source options like Llama 2. Picking the right model requires balancing accuracy, performance, and infrastructure constraints.
  • Data: No castle is secure without ample provisions to withstand a siege. Similarly, LLMs need rich, clean data and context to produce reliable and targeted outputs across diverse applications. Rauser suggests that rather than raw data, the emphasis should be on integrating meaningful “knowledge” through techniques like retrieval-augmented generation (RAG), in which relevant documents are fetched and supplied to the model alongside the user’s question.
  • Interfaces: Finally, the castle gates control how users interact with the LLM council. Rauser stresses that interfaces must embed guardrails against harmful generative content. Moreover, prompting mechanisms greatly impact outputs. Thus, interfaces should prompt responsibly while allowing for dynamic regeneration.
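
To make the data and interfaces ideas above a little more concrete, here is a minimal Python sketch. It is not taken from Rauser's talk and does not reflect Cisco's implementation. Every name in it is hypothetical: the `KNOWLEDGE` list, the toy word-overlap `retrieve` function, the `BLOCKED_TERMS` guardrail, and the `fake_llm` stand-in that takes the place of a real model call. It only illustrates the general shape of the pattern: fetch relevant knowledge, build a prompt from it, check the output against a guardrail, and regenerate if the check fails.

```python
from typing import Callable, List, Set

# Hypothetical in-memory "knowledge"; a real system would likely use a
# search index or vector database instead.
KNOWLEDGE = [
    "The VPN gateway is restarted from the admin console.",
    "Password resets are handled by the identity team.",
    "Firmware updates ship on the first Tuesday of each month.",
]

# Toy guardrail list, for illustration only.
BLOCKED_TERMS = {"password123"}


def _words(text: str) -> Set[str]:
    """Lowercase a text and split it into a set of punctuation-free words."""
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(question: str, docs: List[str], k: int = 1) -> List[str]:
    """Return the k documents sharing the most words with the question."""
    q = _words(question)
    ranked = sorted(docs, key=lambda d: len(q & _words(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: List[str]) -> str:
    """Place the retrieved knowledge in front of the user's question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


def passes_guardrail(output: str) -> bool:
    """Reject any output that contains a blocked term."""
    lowered = output.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def answer(question: str, llm: Callable[[str, int], str], max_attempts: int = 3) -> str:
    """Retrieve context, call the model, and regenerate on guardrail failure."""
    prompt = build_prompt(question, retrieve(question, KNOWLEDGE))
    for attempt in range(max_attempts):
        output = llm(prompt, attempt)
        if passes_guardrail(output):
            return output
    return "Sorry, I can't answer that safely."


def fake_llm(prompt: str, attempt: int) -> str:
    """Stand-in for a real model: the first attempt is bad, later ones are clean."""
    if attempt == 0:
        return "Try password123."
    return "Firmware updates ship on the first Tuesday of each month."


if __name__ == "__main__":
    print(answer("When do firmware updates ship?", fake_llm))
```

In this sketch the first generated answer trips the guardrail, so the loop asks the stand-in model again and returns the second, clean answer. A production interface would need far richer retrieval and safety checks; the point here is only that the guardrail and regeneration logic live in the interface layer, outside the model itself.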

Constructing the Moat 

Unfortunately, warring factions threaten any newly constructed castles. Rauser argues the greatest competitive moat for enterprise AI isn’t proprietary data or models. Rather, it lies in building a robust platform that allows development teams to rapidly deploy innovative LLM-powered features. 

Rauser concludes by calling for collaboration among industry leaders to map out this uncharted frontier. For now, AI pioneers must be content with using LLMs like ChatGPT to guide understanding. But persistent progress relies on an expanding community pushing the boundaries. The adventure continues…

To watch the full presentation, please visit the IT Revolution Video Library here:

- About The Authors
Summary by IT Revolution

Articles created by summarizing a piece of original content from the author (with the help of AI).
