October 15, 2024

Navigating the Risks of AI/ML Systems: A Guide to Effective Controls

By Leah Brown

As artificial intelligence (AI) and machine learning (ML) systems become increasingly prevalent across industries, organizations face new challenges in managing the associated risks. A recent paper titled Systemic Controls for Managing Risk in AI/ML Systems offers valuable insights for auditors, internal risk directors, and software leaders on how to effectively control and mitigate these risks throughout the AI/ML development life cycle.

The authors highlight the unique characteristics of AI/ML systems that necessitate new approaches to risk management and provide practical guidance on implementing effective controls.

The Need for New Controls

The paper begins by emphasizing that AI/ML systems introduce new types of software assets and operational practices that differ significantly from traditional IT systems. These differences create new risks that organizations must manage, particularly in regulated industries. The authors cite real-world failures that occurred when AI/ML assets were not properly managed, such as Meta’s Galactica and Microsoft’s Tay chatbot, both of which were taken offline shortly after launch because of unexpected and problematic outputs.

Key Components of AI/ML Systems

The paper identifies several key components of AI/ML systems that require specific attention and controls:

  1. Model Weights: Quantifiable values that determine the functionality of an AI/ML model, similar to runtime configuration in traditional software.
  2. Data Management: Including the use of feature stores to manage and serve data accurately and safely.
  3. Model Development: Encompassing the various subcomponents that together achieve the business functionality of a model.
  4. Hardware Configuration: The specific arrangement of computing resources required to support AI/ML applications, which can significantly impact model performance.
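To make the idea of weights-as-assets concrete, here is a minimal sketch (our illustration, not code from the paper) that fingerprints a weights file so it can be tracked and audited like any other controlled artifact. The file name and manifest fields are hypothetical:

```python
import hashlib
import json
from pathlib import Path

def fingerprint_weights(weights_path: Path) -> dict:
    """Record exactly which bytes are deployed, treating model weights
    as a first-class, version-controlled asset."""
    digest = hashlib.sha256(weights_path.read_bytes()).hexdigest()
    return {
        "artifact": weights_path.name,
        "sha256": digest,
        "size_bytes": weights_path.stat().st_size,
    }

# Demo with a stand-in weights file (hypothetical name and contents).
weights = Path("model_v3.weights")
weights.write_bytes(b"\x00" * 1024)
print(json.dumps(fingerprint_weights(weights), indent=2))
```

Recording a cryptographic hash alongside the name and size gives auditors a simple, verifiable answer to the question of which weights are actually in use.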

Life Cycle Stages and Associated Controls

The authors outline six stages in the AI/ML development life cycle and provide sample controls for each stage:

1. Data Preparation

  • Data Source Validation
  • Data Transformation Attestation
  • Outlier Management and Data Enrichment
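As a sketch of how the first and third of these controls might look in code (our illustration; the schema, column names, and threshold are hypothetical):

```python
import statistics

EXPECTED_COLUMNS = {"customer_id", "age", "balance"}  # hypothetical schema

def validate_source(rows):
    """Data Source Validation: fail fast if the feed drifts from the
    agreed schema, and record the check as auditable evidence."""
    for row in rows:
        missing = EXPECTED_COLUMNS - row.keys()
        if missing:
            raise ValueError(f"source row missing columns: {missing}")

def flag_outliers(values, z_threshold=3.0):
    """Outlier Management: flag points far from the mean for human
    review rather than silently deleting them."""
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)
    if stdev == 0:
        return []
    return [i for i, v in enumerate(values)
            if abs(v - mean) / stdev > z_threshold]

rows = [{"customer_id": i, "age": 30, "balance": 1_000.0 + i} for i in range(20)]
rows.append({"customer_id": 20, "age": 31, "balance": 250_000.0})
validate_source(rows)
print(flag_outliers([r["balance"] for r in rows]))  # -> [20]
```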

2. Data Management

  • Feature Versioning Control
  • Feature Access Control
  • Feature Quality Assurance
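A production system would typically rely on a dedicated feature store platform, but a toy version makes the control points visible. The sketch below (our illustration, not from the paper) shows feature versioning and role-based feature access:

```python
from dataclasses import dataclass, field

@dataclass
class ToyFeatureStore:
    """Illustrates Feature Versioning Control and Feature Access Control.
    This in-memory stand-in only shows where the controls attach."""
    versions: dict = field(default_factory=dict)  # feature name -> list of payloads
    acl: dict = field(default_factory=dict)       # feature name -> allowed roles

    def publish(self, name, payload, allowed_roles):
        self.versions.setdefault(name, []).append(payload)
        self.acl[name] = set(allowed_roles)
        return len(self.versions[name])  # 1-based version number

    def read(self, name, role, version=None):
        # Feature Access Control: deny reads outside the approved roles.
        if role not in self.acl.get(name, set()):
            raise PermissionError(f"role {role!r} may not read feature {name!r}")
        history = self.versions[name]
        return history[-1] if version is None else history[version - 1]

store = ToyFeatureStore()
v1 = store.publish("avg_txn_amount", {"cust_1": 42.0}, {"fraud-model"})
print(v1, store.read("avg_txn_amount", role="fraud-model"))
```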

3. Model Development

  • Model Weight Version Control
  • Non-Deterministic Model Governance
  • Model Component Inventory
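Because training runs are often non-deterministic, governance here usually means pinning and recording every source of randomness. A minimal sketch of such a training manifest, assuming a single seed stands in for all randomness sources (the field names are hypothetical):

```python
import hashlib
import json
import random

def run_training(seed, hyperparams):
    """Non-Deterministic Model Governance: pin and record the randomness
    so the run can be reproduced and audited later."""
    random.seed(seed)  # real code would also seed numpy and the ML framework
    weights = bytes(random.getrandbits(8) for _ in range(32))  # training stand-in
    return {
        "seed": seed,
        "hyperparams": hyperparams,
        "weights_sha256": hashlib.sha256(weights).hexdigest(),  # version-control key
    }

manifest = run_training(seed=1234, hyperparams={"lr": 3e-4, "epochs": 5})
print(json.dumps(manifest, indent=2))
```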

4. Model Evaluation

  • Critical Safety Constraint Verification
  • Enhanced Red Team Assessment
  • Functional and Non-Functional Requirements Compliance
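Critical safety constraints can often be expressed as executable checks that gate a release. A minimal sketch, with a hypothetical scoring model and constraints:

```python
def predict_risk_score(applicant):
    """Stand-in for the model under evaluation (hypothetical)."""
    return min(1.0, 0.001 * applicant["debt"] / max(applicant["income"], 1))

# Critical Safety Constraint Verification: scores must stay in [0, 1]
# even on degenerate inputs, and the run itself is retained as evidence.
SAFETY_CASES = [
    {"debt": 0, "income": 0},
    {"debt": 10**9, "income": 1},
]

def verify_safety_constraints():
    failures = []
    for case in SAFETY_CASES:
        score = predict_risk_score(case)
        if not 0.0 <= score <= 1.0:
            failures.append(f"constraint violated for {case}: {score}")
    return failures

assert not verify_safety_constraints(), "block release until fixed"
```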

5. Model Deployment

  • Hardware-Specific Performance Validation
  • Floating-Point Precision Consistency
  • Hardware Configuration Management
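Floating-point behavior can differ across GPUs, CPUs, and numeric precisions, so one practical check is to run identical inputs at each target precision and compare the results against a pre-agreed tolerance. A sketch using NumPy (the layer, shapes, and tolerance are hypothetical):

```python
import numpy as np

def relu_layer(x, w):
    """A toy model layer standing in for the deployed network."""
    return np.maximum(x @ w, 0.0)

rng = np.random.default_rng(seed=0)
x64 = rng.standard_normal((4, 8))
w64 = rng.standard_normal((8, 3))

# Floating-Point Precision Consistency: the same inputs, run at the
# precision of each target environment, must agree within a tolerance
# chosen in advance and recorded as release evidence.
out64 = relu_layer(x64, w64)
out32 = relu_layer(x64.astype(np.float32), w64.astype(np.float32))

tolerance = 1e-5  # hypothetical acceptance threshold
max_diff = float(np.max(np.abs(out64 - out32.astype(np.float64))))
assert max_diff < tolerance, f"precision drift {max_diff} exceeds {tolerance}"
```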

6. Continuous Monitoring

  • Performance Threshold Alerts
  • Data Quality Monitoring
  • Model Drift Detection
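Drift detection is commonly implemented with a distribution-distance statistic such as the population stability index (PSI). The sketch below is our illustration, not the paper's prescribed method; the 0.2 alert threshold is a common rule of thumb:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Model Drift Detection: PSI compares the score distribution seen
    in production against the one the model was validated on."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)  # avoid log(0) in empty bins
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(seed=1)
baseline = rng.normal(0.0, 1.0, 10_000)   # validation-time scores
live = rng.normal(0.5, 1.0, 10_000)       # simulated shift in production
psi = population_stability_index(baseline, live)
if psi > 0.2:
    print(f"drift alert: PSI={psi:.3f}")  # Performance Threshold Alert
```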

For each control, the paper provides a description, explains the risk it mitigates, and suggests types of auditable evidence that organizations should maintain.

Key Considerations for AI/ML Risk Management

Throughout the paper, several important themes emerge:

  1. Unique Asset Types: AI/ML systems introduce new types of assets, such as model weights and training datasets, that require specific controls and governance.
  2. Data Quality and Management: The performance of AI/ML models heavily depends on the quality and integrity of the data used for training and operation.
  3. Hardware Dependencies: Unlike traditional software, AI/ML models can be significantly impacted by the specific hardware configurations used for training and deployment.
  4. Continuous Monitoring and Adaptation: AI/ML systems require ongoing monitoring and adjustment to maintain performance and mitigate risks over time.
  5. Regulatory Compliance: As regulatory bodies develop new guidance for AI/ML systems, organizations need to proactively interpret and implement these requirements.

Practical Implementation of Controls

The paper provides practical guidance on implementing controls, emphasizing the importance of:

  1. Clear Documentation: Maintaining comprehensive records of model components, data sources, and hardware configurations.
  2. Version Control: Applying version control principles not just to code, but also to model weights, datasets, and other AI/ML-specific assets.
  3. Performance Validation: Regularly testing and validating model performance across different hardware configurations and scenarios.
  4. Access Controls: Implementing role-based access controls for features and model components.
  5. Automated Monitoring: Setting up automated alerts and monitoring systems to detect issues in real-time.
  6. Regular Audits: Conducting periodic reviews and assessments of AI/ML systems, including red team exercises and compliance checks.
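Several of these practices compose naturally. As a final sketch tying documentation, version control, and regular audits together (our illustration; the file names and manifest fields are hypothetical):

```python
import hashlib
import json
from pathlib import Path

def audit_deployed_model(manifest_path, weights_path):
    """Periodic audit check: does the artifact running in production match
    the version-controlled record of what was approved for release?"""
    manifest = json.loads(manifest_path.read_text())
    deployed_hash = hashlib.sha256(weights_path.read_bytes()).hexdigest()
    passed = deployed_hash == manifest["weights_sha256"]
    # In practice, append the result to an immutable audit log as evidence.
    print(json.dumps({"check": "weights_match_manifest", "passed": passed}))
    return passed

# Demo with stand-in files (hypothetical names and contents).
weights = Path("deployed.weights")
weights.write_bytes(b"\x01" * 512)
manifest = Path("release_manifest.json")
manifest.write_text(json.dumps(
    {"weights_sha256": hashlib.sha256(weights.read_bytes()).hexdigest()}))
audit_deployed_model(manifest, weights)
```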

Conclusion

The authors conclude by emphasizing that while AI/ML systems offer tremendous potential, they also introduce new risks that must be carefully managed. By implementing appropriate controls throughout the AI/ML development life cycle, organizations can harness the power of these technologies while ensuring responsible and ethical use.

The paper serves as a valuable starting point for organizations looking to develop or enhance their risk management practices for AI/ML systems. It provides a framework for thinking about the unique challenges posed by these technologies and offers practical guidance on how to address them.

For auditors, risk managers, and technology leaders involved in AI/ML initiatives, this paper offers crucial insights into the types of controls and evidence they should be looking for to ensure the responsible development and deployment of AI/ML systems. As these technologies continue to evolve and become more prevalent, the guidance provided in this paper will help organizations stay ahead of the curve in managing associated risks and maintaining compliance with emerging regulations.

To gain a deeper understanding of these concepts and how they might apply to your specific organizational context, we encourage you to read the full paper and consider how these controls can be integrated into your AI/ML development processes and risk management frameworks.

About the Author

Leah Brown

Managing Editor at IT Revolution working on publishing books and guidance papers for the modern business leader. I also oversee the production of the IT Revolution blog, combining the best of responsible, human-centered content with the assistance of AI tools.
