The Prompt: Findings from our AI Red Team’s first report (Q&A)
Phil Venables, VP/CISO, Google Cloud
What we can learn when responsible hacking meets responsible AI
Business leaders are buzzing about generative AI. To help you keep up with this fast-moving, transformative topic, each week in “The Prompt,” we’ll bring you observations from our work with customers and partners, as well as the newest AI happenings at Google. In this edition, Phil Venables, vice president and CISO at Google Cloud, looks at the importance of security-testing AI systems. This blog originally appeared in Phil’s Cloud CISO Perspectives newsletter published on July 20, 2023. A slightly modified version follows.
This has been a banner year for artificial intelligence (AI). There’s been a surge of interest in AI and how it can be applied to many fields, especially security. However, as we pursue progress on these new frontiers of innovation, the industry needs clear security standards for building and deploying this technology responsibly.
At Google, we believe that part of building AI responsibly means testing it for security weaknesses, including using red teams to evaluate how AI technology can stand up to realistic threats – which is why we published our first AI Red Team report last week at the Aspen Security Forum.
I spoke about AI, security, and risk topics with my colleague Royal Hansen, vice president of Privacy, Safety, and Security Engineering at Google. We also discussed the report’s findings and Google’s overall progress in this area. I hope you all will find Royal’s answers to my questions informative.
Royal Hansen: I’m really excited about this. At Google, we believe that red teaming — friendly hackers tasked with looking for security weaknesses in technology — will play a decisive role in preparing every organization for attacks on AI systems. Google has been an AI-first company for many years now, and this paper shows how red teaming is a core component of securing AI technologies.
It focuses on three important areas: 1) what red teaming in the context of AI systems is and why it is important; 2) what types of attacks AI red teams simulate; and 3) lessons we have learned that we can share with others.
PV: Our team has a singular mission: to simulate threat actors targeting AI deployments. What kinds of attacks is the red team simulating?
RH: The AI Red Team is focusing squarely on attacks on AI systems. In the report, we detail six tactics, techniques, and procedures (TTPs) that attackers are likely to use against AI: prompt attacks, extraction of training data, backdooring the AI model, adversarial examples to trick the model, data poisoning, and exfiltration.
Since AI systems often exist as part of a larger whole, we do stress that AI Red Team TTPs should be used along with traditional red team exercises. A good example of this is how our AI Red Team has worked with our Trust and Safety team to help prevent content abuse.
PV: Can you talk about what we learned from the report?
RH: Sure. I’ll start with some tactical lessons.
We know that to protect against many kinds of attacks, traditional security controls, such as ensuring the systems and models are properly locked down, can significantly mitigate risk. This is true in particular for protecting the integrity of AI models throughout their lifecycle, which can help prevent data poisoning and backdoor attacks.
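One traditional control of this kind can be sketched in a few lines: check that a model file still matches a digest recorded when it was released before loading it. This is a minimal illustration only, not something the report prescribes. The function names and the idea of a stored release digest are assumptions made for the example.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file matches the digest recorded at release.

    `expected_sha256` is a hypothetical value that would come from a
    trusted source, such as a signed release manifest.
    """
    actual = sha256_of_file(path)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A loader would call `verify_model_artifact` and refuse to continue on `False`. A digest check only detects tampering after the digest was recorded; it says nothing about whether the training data behind the model was clean.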
It was helpful to learn that many attacks on AI systems can be detected in the same way as traditional attacks. But others — including prompt attacks and content issues — may require layering multiple safety models. Traditional security philosophies, such as validating and sanitizing both input and output to the models, still apply in the AI space.
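As a rough sketch of what validating and sanitizing both input and output can look like around a model call, the example below wraps any text-in, text-out function. Everything in it is an assumption made for illustration: the length limit, the single deny-list pattern, the email redaction rule, and the names `validate_input`, `sanitize_output`, and `guarded_call`. A pattern check like this is easy to bypass on its own, which is consistent with the point above that prompt attacks may require layering multiple safety models.

```python
import re
from typing import Callable

# Hypothetical limits and patterns, chosen only for this example.
MAX_PROMPT_CHARS = 4000
SUSPICIOUS_INPUT = re.compile(
    r"ignore (all )?(previous|prior) instructions", re.IGNORECASE
)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def validate_input(prompt: str) -> str:
    """Strip control characters and reject empty, oversized, or flagged prompts."""
    cleaned = CONTROL_CHARS.sub("", prompt).strip()
    if not cleaned:
        raise ValueError("empty prompt")
    if len(cleaned) > MAX_PROMPT_CHARS:
        raise ValueError("prompt too long")
    if SUSPICIOUS_INPUT.search(cleaned):
        raise ValueError("prompt matches a known injection pattern")
    return cleaned


def sanitize_output(text: str) -> str:
    """Strip control characters and redact email-like strings from model output."""
    return EMAIL.sub("[redacted email]", CONTROL_CHARS.sub("", text))


def guarded_call(model: Callable[[str], str], prompt: str) -> str:
    """Validate the prompt, call the model, then sanitize what it returns."""
    return sanitize_output(model(validate_input(prompt)))
```

Here `model` stands in for whatever function actually queries the AI system; the wrapper treats it as untrusted in both directions.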
From a higher-level point of view, addressing red team findings can be challenging, and some attacks may not have simple fixes. We encourage red teams to partner with security and AI subject matter experts for realistic end-to-end adversarial simulations.
PV: Let’s pull back the lens a bit and look at how we got here with AI, which has really dominated the technology conversation this year. How would you describe its evolution — particularly during your time at Google?
RH: AI isn’t new — we’ve been incorporating AI into our products for more than a decade. If you’ve been using Google Search, or Translate, or Maps, or Gmail, or the Play Store for apps, you’ve been using and benefiting from AI for years.
One of our AI milestones includes using machine learning to help detect anomalies on our internal networks back in 2011. Today, those capabilities have evolved and regularly help our red teams discover and test sophisticated hacking techniques against Google’s own systems.
In 2014, we started a Machine Learning Fairness team. In 2018, we adopted our AI principles, which helped spearhead the movement toward responsible AI: mitigating complexities and risks while improving people's lives and addressing social challenges.
This year, we built on our collaborative approach to cybersecurity by launching our Secure AI Framework (SAIF). SAIF is inspired by best practices for security that we’ve applied to software development, while incorporating our understanding of security megatrends and risks specific to AI systems.
SAIF is designed to help mitigate risks specific to AI systems like stealing the model, poisoning the training data, injecting malicious inputs through prompt injection, and extracting confidential information in the training data.
So, while there’s a lot of discussion about generative AI in cybersecurity – and beyond – right now, we’ve been using and learning from AI more broadly in our day-to-day work for years.
PV: How can we ensure a higher quality of online information, particularly in critical situations such as moments of crisis and war, or elections? How are you thinking about security and protections in these moments that matter, particularly in the age of AI?
RH: Technology can create new threats, but it can also help us fight them. AI can often help counter the issues created by AI. It could even give security defenders the upper hand over attackers for the first time since the creation of the internet.
For example, Gmail uses AI right now to automatically block more than 99.9% of malware, phishing, and spam, and protects more than 1.5 billion inboxes. AI can help identify and track misinformation, disinformation, and manipulated media. One notable example of that happened last year, when Mandiant discovered and sounded the alarm about the AI-generated “deepfake” video impersonating Ukrainian President Volodymyr Zelensky surrendering to Russia.
We already use machine learning to identify toxic comments and problematic videos. More technical AI innovations we’re working on include watermarking AI-generated images, and creating tools to evaluate online information — like the upcoming “About this Image” feature in Google Search. We've also joined the Partnership on AI’s Responsible Practices for Synthetic Media, which promotes responsible practices in the development, creation, and sharing of media created with generative AI.
Looking ahead, our challenge is to put appropriate controls in place to prevent malicious use of AI and to work collectively to address bad actors, while maximizing the potential benefits of AI to stay at the front of the global competitiveness race.
PV: What work is Google doing to manage risks we might face from AI?
RH: We think about AI and security primarily through two lenses. First, using AI to enhance safety and security, and second, securing AI from attack.
While frontier AI models offer tremendous promise to improve the world, governments and industry agree that appropriate guardrails are required on the policy level, on the business level, and on the technology level.
Their development and deployment will require significant care — including potential new regulatory requirements. We’ve already seen important contributions to these efforts by the U.S. and U.K. governments, the European Union, the G7 through the Hiroshima AI Process, and others. To build on these efforts, further work is needed on safety standards and evaluations to ensure advanced AI systems are developed and deployed responsibly.
With the stakes so high, we’re calling on governments, the private sector, academia, and civil society to work together on a responsible AI policy agenda. And to enable progress in AI, we must focus on three key areas: opportunity, responsibility, and security.
PV: How have you seen Google’s approach to monitoring and responding to cyberattacks change? Where do we stand now? How has user protection evolved?
RH: Keeping users safe online is more complex and urgent than ever before. We’re seeing an increasing number of new malware families, financially motivated attacks such as ransomware, and supply chain attacks, as well as rising cyberattacks from nation-state-backed actors against critical infrastructure. This has brought a decades-long problem into focus for policymakers and enterprise leadership, as it has disrupted our way of life and made the stakes higher than ever.
We need to expand our thinking about the threat landscape to holistically secure users, governments, and enterprises from ever-changing future attacks.
PV: How has your experience been at the Aspen Security Forum this week? What were some of the key takeaways?
RH: Events like this are a great way to hear from some of the best minds in security. I came eager to listen, learn, and return to Google with lessons that will make us a better partner in privacy, safety, and security. It was a jam-packed week connecting with new and old colleagues across the private and public sector.
I'm struck by the many different dimensions of AI and security here. The lines are blurring between safety and security in ways that require us to collaborate across cyber and trust and safety, across consumer and enterprise, across public and private sector, national and international. SAIF has been a great way to help organizations begin building this approach into their AI plans.
You can learn more about Google Cloud’s security and AI innovations here.