Comparative study on the classification methods for breast cancer diagnosis

Bull. Pol. Ac.: Tech. 66(6) 2018

Abstract. Digital mammography is one of the most widely used approaches for breast cancer diagnosis. Many researchers have demonstrated the superiority of machine learning methods in breast cancer diagnosis using different mammography databases. Because these methods have different strengths and weaknesses, which may confuse doctors and researchers, a careful comparison of them is urgently needed for practical breast cancer diagnosis. In this study, we conduct a comprehensive comparison of state-of-the-art machine learning methods that are promising for breast cancer diagnosis. For this purpose, we analyze the largest mammography diagnosis database, the Digital Database for Screening Mammography (DDSM). We consider feature extraction approaches, including principal component analysis (PCA), nonnegative matrix factorization (NMF), and spatial-temporal discriminant analysis (STDA); classification approaches, including linear discriminant analysis (LDA), random forests (RaF), and k-nearest neighbors (kNN); and deep learning methods, including convolutional neural networks (CNN) and stacked sparse autoencoders (SSAE). This paper can serve as a guideline for doctors selecting machine learning methods for their breast cancer computer-aided diagnosis (CAD) systems, as well as for researchers interested in developing more reliable and efficient methods for breast cancer diagnosis.
