Abstract:
Deep learning has shown promising results on hard perceptual problems in recent years. However, deep learning systems have been found vulnerable to small adversarial perturbations that are nearly imperceptible to humans. Such specially crafted perturbations cause deep learning systems to output incorrect decisions, with potentially disastrous consequences. These vulnerabilities hinder the deployment of deep learning systems where safety or security is important. Previous attempts to secure deep learning systems either target specific attacks or have been shown to be ineffective. We propose MagNet, a framework for defending neural network classifiers against adversarial examples. MagNet neither modifies the protected classifier nor requires knowledge of the process for generating adversarial examples. MagNet includes one or more separate detector networks and a reformer network. In contrast to previous work, MagNet learns to differentiate between normal and adversarial examples by approximating the manifold of normal examples. Since it does not rely on any particular process for generating adversarial examples, it has substantial generalization power. We discuss the intrinsic difficulty of defending against whitebox attacks and propose a mechanism to defend against graybox attacks. We show empirically that MagNet is effective against the most advanced state-of-the-art attacks in blackbox and graybox scenarios, while keeping the false positive rate on normal examples very low. This is joint work with Dongyu Meng.
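To make the detector-plus-reformer idea concrete, the sketch below shows one possible way such a defense could be realized. It assumes a small autoencoder trained only on normal examples: the detector flags inputs whose reconstruction error exceeds a threshold fit on clean data, and the reformer replaces an input with its reconstruction, pulling it toward the learned manifold. The class names, network sizes, and threshold value are illustrative assumptions, not details from the talk.

```python
# Hypothetical sketch (not the authors' code) of an autoencoder-based
# detector and reformer for a MagNet-style defense.
import torch
import torch.nn as nn


class Autoencoder(nn.Module):
    """Small convolutional autoencoder trained only on normal examples."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


class MagNetStyleDefense:
    """Detector + reformer built around the autoencoder's reconstruction."""

    def __init__(self, autoencoder, threshold):
        self.ae = autoencoder
        self.threshold = threshold  # fit so few clean examples are rejected

    @torch.no_grad()
    def reconstruction_error(self, x):
        # Mean absolute reconstruction error per example.
        return (self.ae(x) - x).abs().flatten(1).mean(dim=1)

    @torch.no_grad()
    def detect(self, x):
        # True where the input looks adversarial (far from the manifold).
        return self.reconstruction_error(x) > self.threshold

    @torch.no_grad()
    def reform(self, x):
        # Project the input toward the learned manifold before classifying.
        return self.ae(x)


if __name__ == "__main__":
    ae = Autoencoder()  # assumed to be pretrained on normal data
    defense = MagNetStyleDefense(ae, threshold=0.1)  # threshold is illustrative
    batch = torch.rand(4, 1, 28, 28)        # stand-in for normal inputs
    print(defense.detect(batch))            # which inputs get rejected
    print(defense.reform(batch).shape)      # reformed inputs for the classifier
```

In practice the threshold would be chosen on clean validation data so that only a small fraction of normal examples are rejected, keeping the false positive rate low while inputs far from the manifold are either detected or reformed before reaching the classifier.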
Speaker Bio:
Hao Chen is a professor in the Department of Computer Science at the University of California, Davis. He received his PhD from the Computer Science Division at the University of California, Berkeley, and his BS and MS from Southeast University. His current research interests are computer security, machine learning, and program analysis. He won the National Science Foundation CAREER Award in 2007 and the UC Davis College of Engineering Faculty Award in 2010.