Red Teaming Large Language Models

As machine learning models become increasingly integrated into our digital infrastructure, evaluating their vulnerabilities is essential for both security and ethical reasons. Large language models (LLMs) are no exception. While they represent a revolutionary leap in natural language tasks, LLMs pose unique security and ethical challenges, including the potential to generate misleading, harmful, or biased content as well as leak confidential data, denial of service, or even cause remote code execution.

Machine Learning

Testing

This talk provides an in-depth look into red-teaming LLMs as an evaluation methodology to expose these vulnerabilities. By focusing on case studies and practical examples, we will differentiate between structured red team exercises and isolated adversarial attacks, such as model jailbreaks. Attendees will gain insights into the types of vulnerabilities that red teaming can reveal in LLMs, as well as potential strategies for mitigating these risks. The session aims to equip professionals with the knowledge to better evaluate the security and ethical dimensions of deploying Large Language Models in their organizations.

NDC { Security }

Red Teaming Large Language Models

Armin Buescher