
October 15, 2024

Navigating the Risks of AI/ML Systems: A Guide to Effective Controls

By Summary by IT Revolution

As artificial intelligence (AI) and machine learning (ML) systems become increasingly prevalent across industries, organizations face new challenges in managing the associated risks. A recent paper titled Systemic Controls for Managing Risk in AI/ML Systems offers valuable insights for auditors, internal risk directors, and software leaders on how to effectively control and mitigate these risks throughout the AI/ML development life cycle.

The authors highlight the unique characteristics of AI/ML systems that necessitate new approaches to risk management and provide practical guidance on implementing effective controls.

The Need for New Controls

The paper begins by emphasizing that AI/ML systems introduce new types of software assets and operational practices that differ significantly from traditional IT systems. These differences create new risks that organizations must manage, particularly in regulated industries. The authors cite examples of real-world problems that have occurred when AI/ML assets were not properly managed, such as Meta’s Galactica and Microsoft’s Tay chatbot, which both had to be shut down shortly after launch due to unexpected and problematic outputs.

Key Components of AI/ML Systems

The paper identifies several key components of AI/ML systems that require specific attention and controls:

  1. Model Weights: Quantifiable values that determine the functionality of an AI/ML model, similar to runtime configuration in traditional software.
  2. Data Management: Including the use of feature stores to manage and serve data accurately and safely.
  3. Model Development: Encompassing the various subcomponents that together achieve the business functionality of a model.
  4. Hardware Configuration: The specific arrangement of computing resources required to support AI/ML applications, which can significantly impact model performance.

Life Cycle Stages and Associated Controls

The authors outline six stages in the AI/ML development life cycle and provide sample controls for each stage:

1. Data Preparation

  • Data Source Validation
  • Data Transformation Attestation
  • Outlier Management and Data Enrichment
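To make these data-preparation controls concrete, here is a minimal sketch (not from the paper) of what data source validation and outlier management might look like in practice: a content hash fingerprints the raw source so silent upstream changes are detectable, and a robust median-based rule flags outliers. All names and thresholds are illustrative.

```python
import hashlib
import statistics

def source_fingerprint(raw: bytes) -> str:
    """Fingerprint raw source data so any silent upstream change is detectable."""
    return hashlib.sha256(raw).hexdigest()

def flag_outliers(values, k=3.0):
    """Flag points whose distance from the median exceeds k times the median
    absolute deviation (robust: the outliers don't inflate the threshold)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []
    return [v for v in values if abs(v - med) > k * mad]

# Auditable evidence: a record tying the data snapshot to its fingerprint.
raw = b"age,income\n34,52000\n29,48000\n"
evidence = {"source_sha256": source_fingerprint(raw)}
outliers = flag_outliers([48, 51, 49, 50, 300])
```

Storing the fingerprint alongside a transformation log is one way to produce the kind of auditable evidence the paper calls for.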

2. Data Management

  • Feature Versioning Control
  • Feature Access Control
  • Feature Quality Assurance
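As a rough illustration of how feature versioning and feature access control could fit together, the following toy in-memory feature store keys each feature by (name, version) and gates reads by role. This is a sketch of the concept, not the paper's implementation; real deployments would use a dedicated feature store product.

```python
from dataclasses import dataclass, field

@dataclass
class FeatureStore:
    """Toy in-memory feature store: versioned features with role-based reads."""
    _features: dict = field(default_factory=dict)   # (name, version) -> values
    _acl: dict = field(default_factory=dict)        # name -> allowed roles

    def publish(self, name, version, values, allowed_roles):
        self._features[(name, version)] = values
        self._acl[name] = set(allowed_roles)

    def read(self, name, version, role):
        if role not in self._acl.get(name, set()):
            raise PermissionError(f"role {role!r} may not read feature {name!r}")
        return self._features[(name, version)]

store = FeatureStore()
store.publish("customer_tenure_days", version=2,
              values=[120, 365], allowed_roles={"model-training"})
tenure = store.read("customer_tenure_days", version=2, role="model-training")
```

Pinning a model to an explicit feature version is what makes training reproducible and the access trail auditable.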

3. Model Development

  • Model Weight Version Control
  • Non-Deterministic Model Governance
  • Model Component Inventory
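One plausible way to realize model weight version control and a model component inventory together is to content-hash the serialized weights and record that hash alongside the data and code versions that produced them. The field names below are illustrative assumptions, not the paper's schema.

```python
import hashlib
import json

def weight_fingerprint(weights) -> str:
    """Content-hash serialized weights so every trained artifact has a stable,
    verifiable identity (a real system would hash the serialized tensor file)."""
    blob = json.dumps(weights).encode()
    return hashlib.sha256(blob).hexdigest()

# Inventory entry linking weights, data, and code versions (values illustrative).
inventory_entry = {
    "model": "churn-classifier",
    "weights_sha256": weight_fingerprint([0.12, -0.7, 1.03]),
    "training_data_version": "features-v2",
    "code_revision": "abc1234",
}
```

Because the hash changes whenever any weight changes, an auditor can verify that the deployed artifact matches the inventory record.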

4. Model Evaluation

  • Critical Safety Constraint Verification
  • Enhanced Red Team Assessment
  • Functional and Non-Functional Requirements Compliance
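Critical safety constraint verification can be imagined as running the candidate model over a challenge set and checking hard constraints that must never be violated before release. The model, constraints, and field names below are toy assumptions for illustration only.

```python
def verify_safety_constraints(predict, challenge_set, constraints):
    """Run the model over a challenge set and check every hard constraint.
    An empty violation list is release evidence; any entry blocks deployment."""
    violations = []
    for example in challenge_set:
        output = predict(example)
        for name, check in constraints.items():
            if not check(example, output):
                violations.append({"example": example, "constraint": name})
    return violations

# Toy model: scores must be valid probabilities, and inputs flagged for
# review must never be auto-approved (all names are illustrative).
predict = lambda x: {"score": min(max(x["risk"], 0.0), 1.0),
                     "approve": x["risk"] < 0.5}
constraints = {
    "score_is_probability": lambda x, out: 0.0 <= out["score"] <= 1.0,
    "no_auto_approve_when_flagged":
        lambda x, out: not (x.get("flagged") and out["approve"]),
}
violations = verify_safety_constraints(
    predict, [{"risk": 0.2}, {"risk": 0.9, "flagged": True}], constraints)
```

The violation log itself is the sort of auditable artifact an evaluation-stage control would retain.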

5. Model Deployment

  • Hardware-Specific Performance Validation
  • Floating-Point Precision Consistency
  • Hardware Configuration Management
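Floating-point precision consistency can be checked by comparing a model's output under full double precision against the same computation with values rounded to the deployment target's precision. The sketch below simulates a float32 target by round-tripping through IEEE-754 single precision; weights and tolerance are illustrative.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Round-trip through IEEE-754 single precision to simulate a deployment
    target that runs the model in float32."""
    return struct.unpack("f", struct.pack("f", x))[0]

def dot(weights, features):
    return sum(w * f for w, f in zip(weights, features))

weights = [0.1234567, -2.5, 0.0078125]
features = [1.5, 0.25, 64.0]

reference = dot(weights, features)                    # full double precision
deployed = dot([to_float32(w) for w in weights],
               [to_float32(f) for f in features])     # simulated float32 inputs

# Control: the deployed output must agree with the reference within tolerance.
consistent = math.isclose(reference, deployed, rel_tol=1e-5)
```

Running this comparison per target hardware configuration, and archiving the results, addresses both the precision-consistency and hardware-validation controls.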

6. Continuous Monitoring

  • Performance Threshold Alerts
  • Data Quality Monitoring
  • Model Drift Detection
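Model drift detection is often implemented by comparing the distribution of live inputs or scores against the distribution seen at training time. One common metric, used here as an illustrative example rather than the paper's prescribed method, is the population stability index (PSI), with a rule-of-thumb alert threshold of 0.2.

```python
import math

def population_stability_index(expected, actual):
    """PSI between two binned distributions (fractions summing to 1).
    A common rule of thumb: PSI > 0.2 signals meaningful drift."""
    psi = 0.0
    for e, a in zip(expected, actual):
        e = max(e, 1e-6)   # avoid log(0) on empty bins
        a = max(a, 1e-6)
        psi += (a - e) * math.log(a / e)
    return psi

training_dist = [0.25, 0.25, 0.25, 0.25]   # score distribution at training time
live_dist = [0.10, 0.20, 0.30, 0.40]       # distribution observed in production

psi = population_stability_index(training_dist, live_dist)
alert = psi > 0.2   # drift alert fires when the index exceeds the threshold
```

Wiring such a check into an automated scheduler, with each PSI reading logged, gives both the performance-threshold alerting and the drift-detection evidence trail.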

For each control, the paper provides a description, explains the risk it mitigates, and suggests types of auditable evidence that organizations should maintain.

Key Considerations for AI/ML Risk Management

Throughout the paper, several important themes emerge:

  1. Unique Asset Types: AI/ML systems introduce new types of assets, such as model weights and training datasets, that require specific controls and governance.
  2. Data Quality and Management: The performance of AI/ML models heavily depends on the quality and integrity of the data used for training and operation.
  3. Hardware Dependencies: Unlike traditional software, AI/ML models can be significantly impacted by the specific hardware configurations used for training and deployment.
  4. Continuous Monitoring and Adaptation: AI/ML systems require ongoing monitoring and adjustment to maintain performance and mitigate risks over time.
  5. Regulatory Compliance: As regulatory bodies develop new guidance for AI/ML systems, organizations need to proactively interpret and implement these requirements.

Practical Implementation of Controls

The paper provides practical guidance on implementing controls, emphasizing the importance of:

  1. Clear Documentation: Maintaining comprehensive records of model components, data sources, and hardware configurations.
  2. Version Control: Applying version control principles not just to code, but also to model weights, datasets, and other AI/ML-specific assets.
  3. Performance Validation: Regularly testing and validating model performance across different hardware configurations and scenarios.
  4. Access Controls: Implementing role-based access controls for features and model components.
  5. Automated Monitoring: Setting up automated alerts and monitoring systems to detect issues in real-time.
  6. Regular Audits: Conducting periodic reviews and assessments of AI/ML systems, including red team exercises and compliance checks.

Conclusion

The authors conclude by emphasizing that while AI/ML systems offer tremendous potential, they also introduce new risks that must be carefully managed. By implementing appropriate controls throughout the AI/ML development life cycle, organizations can harness the power of these technologies while ensuring responsible and ethical use.

The paper serves as a valuable starting point for organizations looking to develop or enhance their risk management practices for AI/ML systems. It provides a framework for thinking about the unique challenges posed by these technologies and offers practical guidance on how to address them.

For auditors, risk managers, and technology leaders involved in AI/ML initiatives, the paper identifies the controls and auditable evidence to look for in ensuring responsible development and deployment of AI/ML systems. As these technologies continue to evolve and spread, this guidance will help organizations stay ahead in managing the associated risks and maintaining compliance with emerging regulations.

To gain a deeper understanding of these concepts and how they might apply to your specific organizational context, we encourage you to read the full paper and consider how these controls can be integrated into your AI/ML development processes and risk management frameworks.

About The Authors

Summary by IT Revolution

Articles created by summarizing a piece of original content from the author (with the help of AI).
