October 15, 2024

Navigating the Risks of AI/ML Systems: A Guide to Effective Controls

By Leah Brown

As artificial intelligence (AI) and machine learning (ML) systems become increasingly prevalent across industries, organizations face new challenges in managing the associated risks. A recent paper titled Systemic Controls for Managing Risk in AI/ML Systems offers valuable insights for auditors, internal risk directors, and software leaders on how to effectively control and mitigate these risks throughout the AI/ML development life cycle.

The authors highlight the unique characteristics of AI/ML systems that necessitate new approaches to risk management and provide practical guidance on implementing effective controls.

The Need for New Controls

The paper begins by emphasizing that AI/ML systems introduce new types of software assets and operational practices that differ significantly from traditional IT systems. These differences create new risks that organizations must manage, particularly in regulated industries. The authors cite real-world failures that occurred when AI/ML assets were not properly managed, such as Meta’s Galactica and Microsoft’s Tay chatbot, both of which were taken offline shortly after launch because of unexpected and problematic outputs.

Key Components of AI/ML Systems

The paper identifies several key components of AI/ML systems that require specific attention and controls:

  1. Model Weights: Quantifiable values that determine the functionality of an AI/ML model, similar to runtime configuration in traditional software.
  2. Data Management: Including the use of feature stores to manage and serve data accurately and safely.
  3. Model Development: Encompassing the various subcomponents that together achieve the business functionality of a model.
  4. Hardware Configuration: The specific arrangement of computing resources required to support AI/ML applications, which can significantly impact model performance.
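To make the idea of weights-as-assets concrete, here is a minimal sketch (our illustration, not code from the paper) that fingerprints a weights file so it can be tracked and audited like any other controlled artifact. The file name and manifest fields are hypothetical:

```python
import hashlib
import json
from pathlib import Path

def fingerprint_weights(weights_path: Path) -> dict:
    """Record exactly which bytes are deployed, treating model weights
    as a first-class, version-controlled asset."""
    digest = hashlib.sha256(weights_path.read_bytes()).hexdigest()
    return {
        "artifact": weights_path.name,
        "sha256": digest,
        "size_bytes": weights_path.stat().st_size,
    }

# Demo with a stand-in weights file (hypothetical name and contents).
weights = Path("model_v3.weights")
weights.write_bytes(b"\x00" * 1024)
print(json.dumps(fingerprint_weights(weights), indent=2))
```

Recording a cryptographic hash alongside the name and size gives auditors a simple, verifiable answer to the question of which weights are actually in use.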

Life Cycle Stages and Associated Controls

The authors outline six stages in the AI/ML development life cycle and provide sample controls for each stage:

1. Data Preparation

  • Data Source Validation
  • Data Transformation Attestation
  • Outlier Management and Data Enrichment
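As a sketch of how the first and third of these controls might look in code (our illustration; the schema, column names, and threshold are hypothetical):

```python
import statistics

EXPECTED_COLUMNS = {"customer_id", "age", "balance"}  # hypothetical schema

def validate_source(rows):
    """Data Source Validation: fail fast if the feed drifts from the
    agreed schema, and record the check as auditable evidence."""
    for row in rows:
        missing = EXPECTED_COLUMNS - row.keys()
        if missing:
            raise ValueError(f"source row missing columns: {missing}")

def flag_outliers(values, z_threshold=3.0):
    """Outlier Management: flag points far from the mean for human
    review rather than silently deleting them."""
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)
    if stdev == 0:
        return []
    return [i for i, v in enumerate(values)
            if abs(v - mean) / stdev > z_threshold]

rows = [{"customer_id": i, "age": 30, "balance": 1_000.0 + i} for i in range(20)]
rows.append({"customer_id": 20, "age": 31, "balance": 250_000.0})
validate_source(rows)
print(flag_outliers([r["balance"] for r in rows]))  # -> [20]
```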

2. Data Management

  • Feature Versioning Control
  • Feature Access Control
  • Feature Quality Assurance
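A production system would typically rely on a dedicated feature store platform, but a toy version makes the control points visible. The sketch below (our illustration, not from the paper) shows feature versioning and role-based feature access:

```python
from dataclasses import dataclass, field

@dataclass
class ToyFeatureStore:
    """Illustrates Feature Versioning Control and Feature Access Control.
    This in-memory stand-in only shows where the controls attach."""
    versions: dict = field(default_factory=dict)  # feature name -> list of payloads
    acl: dict = field(default_factory=dict)       # feature name -> allowed roles

    def publish(self, name, payload, allowed_roles):
        self.versions.setdefault(name, []).append(payload)
        self.acl[name] = set(allowed_roles)
        return len(self.versions[name])  # 1-based version number

    def read(self, name, role, version=None):
        # Feature Access Control: deny reads outside the approved roles.
        if role not in self.acl.get(name, set()):
            raise PermissionError(f"role {role!r} may not read feature {name!r}")
        history = self.versions[name]
        return history[-1] if version is None else history[version - 1]

store = ToyFeatureStore()
v1 = store.publish("avg_txn_amount", {"cust_1": 42.0}, {"fraud-model"})
print(v1, store.read("avg_txn_amount", role="fraud-model"))
```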

3. Model Development

  • Model Weight Version Control
  • Non-Deterministic Model Governance
  • Model Component Inventory
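Because training runs are often non-deterministic, governance here usually means pinning and recording every source of randomness. A minimal sketch of such a training manifest, assuming a single seed stands in for all randomness sources (the field names are hypothetical):

```python
import hashlib
import json
import random

def run_training(seed, hyperparams):
    """Non-Deterministic Model Governance: pin and record the randomness
    so the run can be reproduced and audited later."""
    random.seed(seed)  # real code would also seed numpy and the ML framework
    weights = bytes(random.getrandbits(8) for _ in range(32))  # training stand-in
    return {
        "seed": seed,
        "hyperparams": hyperparams,
        "weights_sha256": hashlib.sha256(weights).hexdigest(),  # version-control key
    }

manifest = run_training(seed=1234, hyperparams={"lr": 3e-4, "epochs": 5})
print(json.dumps(manifest, indent=2))
```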

4. Model Evaluation

  • Critical Safety Constraint Verification
  • Enhanced Red Team Assessment
  • Functional and Non-Functional Requirements Compliance
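Critical safety constraints can often be expressed as executable checks that gate a release. A minimal sketch, with a hypothetical scoring model and constraints:

```python
def predict_risk_score(applicant):
    """Stand-in for the model under evaluation (hypothetical)."""
    return min(1.0, 0.001 * applicant["debt"] / max(applicant["income"], 1))

# Critical Safety Constraint Verification: scores must stay in [0, 1]
# even on degenerate inputs, and the run itself is retained as evidence.
SAFETY_CASES = [
    {"debt": 0, "income": 0},
    {"debt": 10**9, "income": 1},
]

def verify_safety_constraints():
    failures = []
    for case in SAFETY_CASES:
        score = predict_risk_score(case)
        if not 0.0 <= score <= 1.0:
            failures.append(f"constraint violated for {case}: {score}")
    return failures

assert not verify_safety_constraints(), "block release until fixed"
```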

5. Model Deployment

  • Hardware-Specific Performance Validation
  • Floating-Point Precision Consistency
  • Hardware Configuration Management
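Floating-point behavior can differ across GPUs, CPUs, and numeric precisions, so one practical check is to run identical inputs at each target precision and compare the results against a pre-agreed tolerance. A sketch using NumPy (the layer, shapes, and tolerance are hypothetical):

```python
import numpy as np

def relu_layer(x, w):
    """A toy model layer standing in for the deployed network."""
    return np.maximum(x @ w, 0.0)

rng = np.random.default_rng(seed=0)
x64 = rng.standard_normal((4, 8))
w64 = rng.standard_normal((8, 3))

# Floating-Point Precision Consistency: the same inputs, run at the
# precision of each target environment, must agree within a tolerance
# chosen in advance and recorded as release evidence.
out64 = relu_layer(x64, w64)
out32 = relu_layer(x64.astype(np.float32), w64.astype(np.float32))

tolerance = 1e-5  # hypothetical acceptance threshold
max_diff = float(np.max(np.abs(out64 - out32.astype(np.float64))))
assert max_diff < tolerance, f"precision drift {max_diff} exceeds {tolerance}"
```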

6. Continuous Monitoring

  • Performance Threshold Alerts
  • Data Quality Monitoring
  • Model Drift Detection
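Drift detection is commonly implemented with a distribution-distance statistic such as the population stability index (PSI). The sketch below is our illustration, not the paper's prescribed method; the 0.2 alert threshold is a common rule of thumb:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Model Drift Detection: PSI compares the score distribution seen
    in production against the one the model was validated on."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)  # avoid log(0) in empty bins
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(seed=1)
baseline = rng.normal(0.0, 1.0, 10_000)   # validation-time scores
live = rng.normal(0.5, 1.0, 10_000)       # simulated shift in production
psi = population_stability_index(baseline, live)
if psi > 0.2:
    print(f"drift alert: PSI={psi:.3f}")  # Performance Threshold Alert
```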

For each control, the paper provides a description, explains the risk it mitigates, and suggests types of auditable evidence that organizations should maintain.

Key Considerations for AI/ML Risk Management

Throughout the paper, several important themes emerge:

  1. Unique Asset Types: AI/ML systems introduce new types of assets, such as model weights and training datasets, that require specific controls and governance.
  2. Data Quality and Management: The performance of AI/ML models heavily depends on the quality and integrity of the data used for training and operation.
  3. Hardware Dependencies: Unlike traditional software, AI/ML models can be significantly impacted by the specific hardware configurations used for training and deployment.
  4. Continuous Monitoring and Adaptation: AI/ML systems require ongoing monitoring and adjustment to maintain performance and mitigate risks over time.
  5. Regulatory Compliance: As regulatory bodies develop new guidance for AI/ML systems, organizations need to proactively interpret and implement these requirements.

Practical Implementation of Controls

The paper provides practical guidance on implementing controls, emphasizing the importance of:

  1. Clear Documentation: Maintaining comprehensive records of model components, data sources, and hardware configurations.
  2. Version Control: Applying version control principles not just to code, but also to model weights, datasets, and other AI/ML-specific assets.
  3. Performance Validation: Regularly testing and validating model performance across different hardware configurations and scenarios.
  4. Access Controls: Implementing role-based access controls for features and model components.
  5. Automated Monitoring: Setting up automated alerts and monitoring systems to detect issues in real-time.
  6. Regular Audits: Conducting periodic reviews and assessments of AI/ML systems, including red team exercises and compliance checks.
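Several of these practices compose naturally. As a final sketch tying documentation, version control, and regular audits together (our illustration; the file names and manifest fields are hypothetical):

```python
import hashlib
import json
from pathlib import Path

def audit_deployed_model(manifest_path, weights_path):
    """Periodic audit check: does the artifact running in production match
    the version-controlled record of what was approved for release?"""
    manifest = json.loads(manifest_path.read_text())
    deployed_hash = hashlib.sha256(weights_path.read_bytes()).hexdigest()
    passed = deployed_hash == manifest["weights_sha256"]
    # In practice, append the result to an immutable audit log as evidence.
    print(json.dumps({"check": "weights_match_manifest", "passed": passed}))
    return passed

# Demo with stand-in files (hypothetical names and contents).
weights = Path("deployed.weights")
weights.write_bytes(b"\x01" * 512)
manifest = Path("release_manifest.json")
manifest.write_text(json.dumps(
    {"weights_sha256": hashlib.sha256(weights.read_bytes()).hexdigest()}))
audit_deployed_model(manifest, weights)
```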

Conclusion

The authors conclude by emphasizing that while AI/ML systems offer tremendous potential, they also introduce new risks that must be carefully managed. By implementing appropriate controls throughout the AI/ML development life cycle, organizations can harness the power of these technologies while ensuring responsible and ethical use.

The paper serves as a valuable starting point for organizations looking to develop or enhance their risk management practices for AI/ML systems. It provides a framework for thinking about the unique challenges posed by these technologies and offers practical guidance on how to address them.

For auditors, risk managers, and technology leaders involved in AI/ML initiatives, this paper offers crucial insights into the types of controls and evidence they should be looking for to ensure the responsible development and deployment of AI/ML systems. As these technologies continue to evolve and become more prevalent, the guidance provided in this paper will help organizations stay ahead of the curve in managing associated risks and maintaining compliance with emerging regulations.

To gain a deeper understanding of these concepts and how they might apply to your specific organizational context, we encourage you to read the full paper and consider how these controls can be integrated into your AI/ML development processes and risk management frameworks.

About the Author

Leah Brown

Managing Editor at IT Revolution working on publishing books and guidance papers for the modern business leader. I also oversee the production of the IT Revolution blog, combining the best of responsible, human-centered content with the assistance of AI tools.
