Automatic Fairness Testing of Neural Classifiers through Adversarial Sampling
File version
Accepted Manuscript (AM)
Author(s)
Zhang, P
Wang, J
Sun, J
Wang, X
Dong, G
Wang, X
Dai, T
Dong, JS
Abstract
Although deep learning has demonstrated astonishing performance in many applications, there are still concerns about its dependability. One desirable property of deep learning applications with societal impact is fairness (i.e., non-discrimination). Unfortunately, discrimination may be intrinsically embedded in a model due to discrimination in the training data. As a countermeasure, fairness testing systematically identifies discriminatory samples, which can be used to retrain the model and improve its fairness. Existing fairness testing approaches, however, have two major limitations. First, they only work well on traditional machine learning models and perform poorly (e.g., in effectiveness and efficiency) on deep learning models. Second, they only work on simple tabular data and are not applicable to domains such as text. In this work, we bridge the gap by proposing a scalable and effective approach for systematically searching for discriminatory samples, while extending fairness testing to a challenging domain, i.e., text classification. Compared with state-of-the-art methods, our approach employs only lightweight procedures such as gradient computation and clustering, which makes it significantly more scalable. Experimental results show that, on average, our approach explores the search space more effectively (9.62 and 2.38 times more than the state-of-the-art methods on tabular and text datasets, respectively) and generates many more individual discriminatory instances (24.95 and 2.68 times) within reasonable time. The retrained models reduce discrimination by 57.2% and 60.2% on average, respectively.
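The abstract's central notion, an "individual discriminatory instance", is an input whose predicted label changes when only a protected attribute (e.g., gender) is altered. The minimal sketch below illustrates that check; it is not the paper's implementation, and the `predict`, `protected_idx`, and toy classifiers are illustrative assumptions.

```python
# Illustrative sketch (not the paper's method): an input x is an individual
# discriminatory instance for classifier `predict` if changing only the
# protected attribute flips the predicted label.

def is_discriminatory(predict, x, protected_idx, protected_values):
    """Return True if varying only x[protected_idx] changes the prediction."""
    base = predict(x)
    for v in protected_values:
        if v == x[protected_idx]:
            continue
        x_variant = list(x)
        x_variant[protected_idx] = v  # change only the protected attribute
        if predict(x_variant) != base:
            return True
    return False

# Toy stand-in classifiers: feature 0 is the protected attribute.
biased = lambda x: int(x[1] > 5 or x[0] == 1)  # unfairly uses feature 0
fair = lambda x: int(x[1] > 5)                 # ignores feature 0

print(is_discriminatory(biased, [0, 3], 0, [0, 1]))  # True
print(is_discriminatory(fair, [0, 3], 0, [0, 1]))    # False
```

The paper's contribution is searching for such instances at scale (via gradients and clustering) rather than the per-input check itself, which is cheap.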
Journal Title
IEEE Transactions on Software Engineering
Volume
48
Issue
9
Rights Statement
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Subject
Artificial intelligence
Software engineering
Information systems
Electrical engineering
Distributed computing and systems software
Citation
Zhang, P; Wang, J; Sun, J; Wang, X; Dong, G; Wang, X; Dai, T; Dong, JS, Automatic Fairness Testing of Neural Classifiers through Adversarial Sampling, IEEE Transactions on Software Engineering, 2022, 48 (9), pp. 3593-3612