As more organizations embrace Artificial Intelligence (AI) and Machine Learning (ML) to optimize their operations and gain a competitive edge, there is increasing focus on how to best secure this powerful technology. Central to this is the data used to train ML models, which fundamentally impacts how they behave and perform over time. As a result, organizations must pay close attention to what goes into their models and be constantly vigilant for signs of something sinister, such as data corruption.
Unfortunately, as ML models have grown in popularity, so has the risk of malicious backdoor attacks, where criminals use data poisoning techniques to feed ML models compromised data, causing them to behave in unforeseen or malicious ways when triggered by specific inputs. While such attacks can be time-consuming to execute (often requiring large amounts of poisoned data over many months), they can be incredibly damaging if successful. For this reason, organizations need to protect against them, especially in the foundational stages of any new ML model.
A great example of this threat landscape is the Sleepy Pickle technique. As the Trail of Bits blog explains, this technique leverages the ubiquitous and notoriously insecure Pickle file format used to package and distribute ML models. Sleepy Pickle goes beyond previous exploit techniques that target an organization’s systems as they deploy ML models, instead stealthily compromising the ML model itself. Over time, this allows attackers to target the organization’s end users of the model, potentially causing major security issues if successful.
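To see why this matters, it helps to look at how little effort it takes to hide executable behavior in a pickled object. The snippet below is a minimal sketch in Python, using a hypothetical class and a harmless payload rather than the real Sleepy Pickle exploit: simply loading the bytes is enough to run attacker-controlled code, which is why safer formats such as safetensors are generally preferred for distributing model weights.

```python
import os
import pickle

# Minimal illustration of why the Pickle format is unsafe for model
# distribution: unpickling can execute arbitrary code. The class and
# payload below are purely illustrative, not the actual Sleepy Pickle payload.
class MaliciousPayload:
    def __reduce__(self):
        # __reduce__ tells pickle how to rebuild the object; an attacker
        # can make it call any callable, such as os.system.
        return (os.system, ("echo 'arbitrary code ran during unpickling'",))

poisoned_bytes = pickle.dumps(MaliciousPayload())

# Anyone who loads this "model file" runs the attacker's code immediately.
pickle.loads(poisoned_bytes)
```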
The Rise of MLSecOps
To combat these types of threats, more and more organizations have started implementing MLSecOps as part of their development cycles.
At its core, MLSecOps integrates security practices and considerations into the ML development and deployment process. This includes ensuring the privacy and security of data used to train and test models, and protecting models that have already been deployed from malicious attacks, along with the infrastructure they run on.
Some examples of MLSecOps activities include performing threat modeling, implementing secure coding practices, conducting security audits, responding to incidents for ML systems and models, and ensuring transparency and explainability to prevent unintended bias in decision-making.
The core pillars of MLSecOps
What sets MLSecOps apart from other disciplines such as DevOps is that it is exclusively concerned with security issues within ML systems. With this in mind, there are five core pillars of MLSecOps, popularized by the MLSecOps community, that together form an effective risk framework:
Supply chain vulnerability
ML supply chain vulnerability can be defined as the potential for security breaches or attacks on the systems and components that make up the supply chain for ML technology. This can include issues with software and hardware components, communication networks, and data storage and management. Unfortunately, all of these vulnerabilities can be exploited by cybercriminals to gain access to valuable information, steal sensitive data, and disrupt business operations. To mitigate these risks, organizations must implement robust security measures, including continuously monitoring and updating their systems to stay ahead of emerging threats.
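One basic control against tampered model artifacts in the supply chain is integrity verification before loading. The sketch below is a hypothetical example: the file name and pinned digest are placeholders, and in practice the expected hash would come from a trusted, signed source such as the model publisher's manifest.

```python
import hashlib
from pathlib import Path

# Hypothetical example: pin the SHA-256 digest published by the model's
# trusted source and refuse to load any artifact that does not match it.
EXPECTED_SHA256 = "replace-with-the-digest-from-a-trusted-manifest"

def artifact_is_intact(path: str, expected_digest: str) -> bool:
    """Return True if the file's SHA-256 digest matches the pinned value."""
    actual_digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return actual_digest == expected_digest

if not artifact_is_intact("model.safetensors", EXPECTED_SHA256):
    raise RuntimeError("Model artifact failed integrity check; refusing to load it")
```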
Governance, risk and compliance
Complying with a wide range of laws and regulations, such as the General Data Protection Regulation (GDPR), has become an essential part of modern business, helping to avoid far-reaching legal and financial implications, as well as potential reputational damage. However, as AI adoption grows and businesses rely ever more heavily on ML models, it is becoming harder for them to keep track of data and ensure compliance.
MLSecOps can quickly identify modified code and components and situations that may call into question the underlying integrity and compliance of an AI framework. This helps organizations ensure compliance requirements are met and the integrity of sensitive data is maintained.
Model provenance
Model provenance means tracking data and ML models as they move through the pipeline. Logging should be secure, integrity-protected, and traceable. Access controls and versioning for data, ML models and pipeline parameters, together with logging and monitoring, are all critical controls that MLSecOps can effectively help with.
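As a concrete illustration of what such lineage tracking might look like, the sketch below records a minimal, append-only log entry that ties a trained model to the exact dataset and parameters that produced it. The function name and log format are assumptions for illustration; a production system would also sign or otherwise integrity-protect the log.

```python
import hashlib
import json
import time
from pathlib import Path

# Hypothetical sketch of a minimal lineage record linking a model artifact
# to the data and parameters that produced it.
def record_lineage(model_path: str, data_path: str, params: dict,
                   log_path: str = "lineage_log.jsonl") -> None:
    entry = {
        "timestamp": time.time(),
        "model_sha256": hashlib.sha256(Path(model_path).read_bytes()).hexdigest(),
        "data_sha256": hashlib.sha256(Path(data_path).read_bytes()).hexdigest(),
        "params": params,
    }
    # Append-only JSON Lines log; in practice it would also be
    # integrity-protected, e.g. signed or written to immutable storage.
    with open(log_path, "a") as log_file:
        log_file.write(json.dumps(entry) + "\n")
```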
Trustworthy AI
Trustworthy AI describes AI systems that are designed to be fair, unbiased, and explainable. To achieve this, such systems must be transparent and able to explain the decisions they make in a clear and concise manner. If an AI system's decision-making process cannot be understood, it cannot be trusted; making it explainable makes it accountable, and therefore trustworthy.
Adversarial ML
Defending against malicious attacks on ML models is crucial. However, as discussed above, these attacks can take many forms, making them extremely challenging to identify and prevent. The goal of adversarial ML is to develop techniques and strategies to defend against such attacks, thereby improving the robustness and security of machine learning models and systems.
To achieve this, researchers have developed techniques that can detect and mitigate attacks in real time. Some of the most common techniques include using generative models to create synthetic training data, incorporating adversarial examples into the training process, and developing robust classifiers that can handle noisy inputs.
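As a rough illustration of the second of those ideas, the PyTorch-style sketch below perturbs each batch with a simple FGSM-style step and trains on clean and perturbed inputs together. The function names, the epsilon value, and the training-step structure are assumptions for illustration rather than a prescribed recipe.

```python
import torch

# Hypothetical sketch of adversarial training: craft FGSM-style perturbed
# inputs and train on them alongside the clean batch.
def fgsm_perturb(model, loss_fn, inputs, labels, epsilon=0.03):
    adv = inputs.clone().detach().requires_grad_(True)
    loss_fn(model(adv), labels).backward()
    # Step in the direction that increases the loss, bounded by epsilon.
    return (adv + epsilon * adv.grad.sign()).detach()

def adversarial_training_step(model, loss_fn, optimizer, inputs, labels):
    adv_inputs = fgsm_perturb(model, loss_fn, inputs, labels)
    optimizer.zero_grad()
    # Mix clean and adversarial examples so the model learns to resist both.
    loss = loss_fn(model(inputs), labels) + loss_fn(model(adv_inputs), labels)
    loss.backward()
    optimizer.step()
```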
To quickly capitalize on the benefits that AI and ML can bring, too many organizations are compromising their data security by not addressing the increased cyber threats that come with them. MLSecOps provides a powerful framework that can help ensure the right level of protection is in place as developers and software engineers become more comfortable with these emerging technologies and the risks that accompany them. While it may not seem necessary right now, it will prove invaluable in the years to come, making it a worthwhile investment for organizations that take data security seriously.
This article was produced as part of TechRadarPro’s Expert Insights channel, where we showcase the best and brightest minds in the technology sector today. The views expressed here are those of the author and do not necessarily represent those of TechRadarPro or Future plc.