Comprehensive Evaluation of Clustering-based Defensive Technique against Label Modification Attack


Dr. Jyoti Yadav
Nilambari S. Mate
Dr. A. D. Shaligram


As Machine Learning (ML) models are widely used to make security decisions, attackers are highly motivated to tamper with the models and the outputs produced by machine learning algorithms. An attack in which training data is manipulated to compromise the performance of an ML model is known as a data poisoning attack. As a result of a data poisoning attack, the false positive rate (FPR) of the model increases and its accuracy decreases. The effects of data poisoning attacks can be detected and prevented using various methods. This paper evaluates the performance of an ML model by modifying the labels of the dataset used to build a Support Vector Machine (SVM) classifier. A clustering-based data filtering method is used to detect poisonous samples in the training data. The clustering method is evaluated using the Euclidean and Jaccard distance measures on the MNIST and WINE datasets. Results show that, with the clustering-based filtering technique, the Jaccard distance measure is more accurate and consistent than the Euclidean distance measure on the MNIST dataset. However, the same defensive technique is not effective for the WINE dataset.
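The abstract describes a label-flipping attack followed by clustering-based filtering of the training set. A minimal sketch of that pipeline is given below, under stated assumptions: the paper does not publish code, so the synthetic two-class data, the 10% flip rate, and the use of k-means (which implies the Euclidean distance variant) are illustrative choices, not the authors' exact setup. The filter flags any training sample whose (possibly flipped) label disagrees with the majority label of the cluster it falls in.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

rng = np.random.default_rng(0)

# Synthetic well-separated two-class data standing in for MNIST/WINE features.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=1.0, random_state=0)

# Label-modification (label-flipping) attack: flip 10% of the training labels.
n_poison = 20
poison_idx = rng.choice(len(y), size=n_poison, replace=False)
y_poisoned = y.copy()
y_poisoned[poison_idx] = 1 - y_poisoned[poison_idx]

# Clustering-based filtering: cluster the features only (labels are ignored
# by k-means), then flag every sample whose label disagrees with the
# majority label of its cluster. k-means uses Euclidean distance; the
# paper's Jaccard variant would instead cluster binarized features with a
# Jaccard dissimilarity.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
flagged = np.zeros(len(y), dtype=bool)
for c in np.unique(clusters):
    members = clusters == c
    majority = np.bincount(y_poisoned[members]).argmax()
    flagged[members] = y_poisoned[members] != majority

# Fraction of flipped samples caught by the filter.
detection_rate = flagged[poison_idx].mean()
# Fraction of clean samples wrongly flagged (false positives).
clean_mask = np.ones(len(y), dtype=bool)
clean_mask[poison_idx] = False
false_positive_rate = flagged[clean_mask].mean()
```

On data this cleanly separated the filter recovers essentially all flipped labels with few false positives; the paper's finding that the method fails on WINE is consistent with this sketch, since the approach degrades whenever class structure and cluster structure disagree.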
