Microsystem Technologies
https://doi.org/10.1007/s00542-020-04888-5

TECHNICAL PAPER

Machine learning technique for early detection of Alzheimer’s disease

Rashmi Kumari¹ • Akriti Nigam¹ • Shashank Pushkar¹
✉ Rashmi Kumari

Received: 3 May 2020 / Accepted: 14 May 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

Alzheimer’s disease (AD) is an irreversible brain disorder that impairs a person’s thinking and shrinks the brain, ultimately resulting in the death of the patient. Treatment must begin in the initial stages of AD so that further degeneration can be delayed. Such early diagnosis can be achieved with machine learning techniques that employ various optimization and probabilistic methods. Hence, with the objective of distinguishing people with normal brain ageing from those who would develop Alzheimer’s disease, this paper presents a machine learning model that diagnoses AD, cMCI, ncMCI and CN at the early stages.

1 Introduction

Alzheimer’s disease, one of the most prevalent kinds of dementia, is a progressive neurological brain disorder that usually occurs in later life. The disease has affected 30 million people worldwide, and this number is expected to triple over the next five decades as the elderly population grows. As the condition advances, the patient’s memory and intellectual functions decline (Liu et al. 2014). Mild cognitive impairment (MCI) is considered the clinical precursor of AD; it is a transitional phase between healthy ageing and dementia. No complete cure exists so far, but some medications delay the progression of AD, particularly when it is diagnosed in the initial stages. Early diagnosis is therefore significant for treating AD patients, yet a precise and early diagnosis of AD remains difficult for doctors. Magnetic resonance imaging (MRI), including functional MRI (fMRI) and structural MRI (sMRI), is a useful imaging tool that aids in understanding and evaluating the anatomical and neural variations of AD (Jack et al. 2011). Over the past decade, numerous efforts have gone into developing computer-aided models that use machine learning techniques for decoding the states of the disease from MRI images (Suk et al. 2015).

Machine learning is used to interpret and analyze the MRI images, and it can also classify data and patterns. Various machine learning techniques that extract high-dimensional features from image modalities such as positron emission tomography (PET) and MRI have been developed for diagnosing AD (Hinrichs et al. 2009; Zhang et al. 2011). Besides identifying subjects in the AD stage, these methods predict the risk of subjects with mild cognitive impairment converting to Alzheimer’s disease. Hence, according to the risk of disease progression, MCI cases can be labelled as MCI converters (cMCI) or MCI non-converters (ncMCI), and diagnosing AD at the initial stages can be modelled as a multi-class classification problem (Suk et al. 2014). Many of these approaches achieved promising accuracies, but they were not assessed on pathologically proven datasets obtained from distinct imaging modalities, which makes it challenging to compare them fairly. Additionally, factors such as preprocessing, the selection of significant features, and class imbalance considerably affect the prediction accuracy.

To overcome the constraints of previous studies, this paper proposes a generalized machine learning technique for the early diagnosis of Alzheimer’s disease using magnetic resonance imaging (MRI) images and a convolutional neural network (CNN) classifier. The images are


preprocessed with a Gaussian filter and segmented with the Otsu thresholding algorithm. The image features are obtained using the gray level co-occurrence matrix (GLCM) technique, and these features are then classified by the CNN classifier.

¹ Birla Institute of Technology, Mesra, Ranchi, India

2 Literature review

With the development of machine learning models, features can be extracted and classified without engaging experts. Researchers have therefore focused on developing models that detect and classify the images precisely. Liu et al. (2012) contributed to the early diagnosis of AD through computer-aided diagnosis (CAD), implementing an ensemble sparse technique for classifying the images. Because high feature dimensionality reduced the classification ability of standard classification models, the authors developed a sparse representation-based classifier (SRC) that generates localized patches, which are fused at a later stage to provide precise classification. The method was evaluated on 652 subjects (198 AD patients, 229 normal and 225 MCI) from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database of MRI images. The experiments gave a classification accuracy of 90.8% for AD and 92.90% for MCI. The study suggests that the classification accuracy is high when the generated patches come from AD and low otherwise. Further, the pathologically unproven dataset and the class imbalance leave the results uncertain.

Ramírez et al. (2013) developed a CAD model for enhancing the early detection of AD. The model was based on selecting image parameters and classifying them with a support vector machine (SVM). A study was conducted to determine the regions of interest (ROIs) and the most discriminant metrics of the image. The primary objective was to decrease the dimensionality of the input space and diagnose AD with higher precision with the aid of a radial basis function (RBF) SVM. The technique achieved a sensitivity of 93.10%, an accuracy of 90.38% and a specificity of 86.96%. However, it is not easy to classify images into more than two classes in a single setting using an SVM classifier.

Suk and Shen (2013) developed a deep learning-based feature representation technique using a stacked auto-encoder (SAE). The study assumes that complex latent patterns, such as non-linear relations, are implicit in the low-level features. The authors combined the initial low-level features with the latent information to develop an effective model for MCI/AD classification with high diagnostic accuracy. The model was tested on the ADNI dataset, and the accuracies obtained were 95.9%, 75.8%, and 85% for AD, MCI-converter, and MCI, respectively. Even though the SAE model successfully classified the images, there is still scope for developing a deep multimodal network for shared representation.

Payan and Montana (2015) developed and tested a pattern classification model that combined sparse autoencoders and a CNN. The primary objective was to evaluate the accuracy of the model on a comparatively larger population of patients and to compare the performance of 2D and 3D CNN architectures. The model was tested using 2265 brain scans obtained from the ADNI dataset (O’Shea and Nash 2015). The results indicated that the 3D model can capture 3D patterns, which should improve the performance of classifiers by a small margin. The technique has some drawbacks; for example, the convolutional layer is trained initially but not fine-tuned. Fine-tuning can enhance the system performance and thus provide a higher classification accuracy.

Ortiz et al. (2016) presented an approach for diagnosing early AD and AD by fusing structural and functional image data with deep learning, in particular deep belief networks (DBN). The brain regions are defined by automated anatomical labeling (AAL). The grey matter (GM) images of each brain area are split into three-dimensional patches according to the AAL regions, and these patches are used to train the DBN. Two deep learning structures and four distinct voting schemes were implemented. The resulting technique was evaluated using the ADNI database. The method was found not only to classify the images well but also to perform better for MCI subjects. The architecture provided an accuracy of 90% for NC/AD and 84% for MCI/AD classification (Cuingnet et al. 2011).

Huang et al. (2017) developed DenseNet, a deep CNN structure that connects each layer to every other layer in a feed-forward manner so that features of various layers are captured and reused, performing better than a conventional CNN. The model introduced direct connections between any two layers with feature maps of identical size. The study showed that DenseNets scale naturally to hundreds of layers without optimization difficulties, and that their accuracy improves consistently as the number of parameters grows. Furthermore, DenseNets required comparatively fewer parameters and computations to achieve results comparable to state-of-the-art techniques.

Motivated by this, Li et al. (2018) developed a novel technique for classifying MR images based on multiple cluster dense CNNs. The complete brain was first split into distinct local regions, from which 3D patches were extracted. The patches from each region were then clustered with the k-means clustering technique. Finally, a DenseNet was constructed to learn the features of each cluster, and these features were ensembled for classification. The technique was evaluated on 831 subjects from the ADNI database (403 MCI, 199 AD, and 229 normal controls) and achieved an accuracy of 89.5% for AD and 77.5% for MCI. Compared to existing approaches, the method had two advantages: (1) it alleviated the problem of training DenseNets on a small set of images, and (2) it required no ROI segmentation, which simplified the diagnosis process and reduced the computational cost. However, the technique was limited to ADNI data and was not evaluated on other datasets for multimodal analysis of brain images.

These studies exemplify how results ought to be validated and presented, particularly in the visualization and prediction of AD. However, the ability to recognize potential issues in the input data, the experimental design, and the validation or implementation is critical, particularly for those who assess such studies and for those intending to use machine learning.

3 Proposed methodology

This study proposes a novel technique for detecting Alzheimer’s disease, shown in Fig. 1. The dataset consists of MRI images collected from the open access OASIS-3 database (http://www.oasis-brains.org). OASIS is a compilation of brain images of more than 1000 patients (Fotenos et al. 2005). The images are preprocessed by applying a Gaussian filter to remove unwanted noise. The images are then segmented using the Otsu thresholding algorithm, and the edges are detected using the Prewitt edge detection technique. The images are clustered using the fuzzy clustering technique, and the GLCM feature extraction technique is applied to extract the features. Finally, the dataset is classified using convolutional neural networks (CNN).

Fig. 1 The flow of the proposed methodology

3.1 Image preprocessing

The most commonly used approach for preprocessing the images is Gaussian filtering, an effective local filtering technique popular for smoothing images. The Gaussian filter is a low-pass filter that suppresses high-frequency details, including edges and noise, while preserving the low-frequency components of the image. In simpler terms, it blurs out anything smaller than the image feature (Perumal and Velmurugan 2018). Blurring the image and removing the noise is done by a 2-D convolution operator referred to as the Gaussian smoothing operator, whose kernel represents the Gaussian (bell-shaped) hump. The Gaussian filter is expressed mathematically as

\[ g(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{1} \]

where \(x\) and \(y\) are the distances from the origin along the horizontal and vertical axes, and \(\sigma\) is the standard deviation of the Gaussian function. Through convolution, this 2D distribution is used as the point-spread function. Because the image is stored as discrete pixels, a discrete approximation of the Gaussian function is required to perform the convolution.

3.2 Image segmentation

The developed model uses the Otsu method, which maximizes the variance between the classes, to segment the image. The Otsu method is chosen because it is a non-parametric approach popular for its simplicity and effectiveness. It involves an exhaustive search for the threshold that minimizes the intra-class variance, defined as the weighted sum of the variances of the two classes:

\[ \sigma_w^2(t) = \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t) \tag{2} \]

where the weights \(\omega_i\) are the probabilities of the two classes separated by a threshold \(t\), and \(\sigma_i^2\) are the variances of the classes. The class probabilities are computed from the probabilities \(p(i)\) of the \(L\) intensity levels:

\[ \omega_0(t) = \sum_{i=0}^{t-1} p(i), \qquad \omega_1(t) = \sum_{i=t}^{L-1} p(i) \tag{3} \]

Otsu demonstrated that minimizing the intra-class variance is equivalent to maximizing the inter-class variance:

\[ \sigma_b^2(t) = \sigma^2 - \sigma_w^2(t) = \omega_0(t)\,[\mu_0(t) - \mu_T]^2 + \omega_1(t)\,[\mu_1(t) - \mu_T]^2 = \omega_0(t)\,\omega_1(t)\,[\mu_0(t) - \mu_1(t)]^2 \tag{4} \]

which is expressed in terms of the class probabilities \(\omega\) and the class means \(\mu\). The class means are given by

\[ \mu_0(t) = \frac{\sum_{i=0}^{t-1} i\,p(i)}{\omega_0(t)}, \qquad \mu_1(t) = \frac{\sum_{i=t}^{L-1} i\,p(i)}{\omega_1(t)}, \qquad \mu_T = \sum_{i=0}^{L-1} i\,p(i) \tag{5} \]

The steps of the Otsu thresholding algorithm are given below (Makkar and Pundir 2014).

• The histogram and the probabilities of every intensity level are computed.
• The starting values \(\omega_i(0)\) and \(\mu_i(0)\) are set up.
• All possible thresholds t = 1 … maximum intensity are stepped through.
• \(\omega_i\) and \(\mu_i\) are updated.
• \(\sigma_b^2(t)\) is computed.
• The desired threshold corresponds to the maximum value of \(\sigma_b^2(t)\).

3.3 Edge detection

The developed model uses the Prewitt edge detection method. The conventional Prewitt operator comprises two 3 × 3 templates and can detect edges only in the vertical and horizontal directions (Yang et al. 2011). Since edges occur in more than two directions, an eight-direction Prewitt template is used. The template of eight directions is illustrated in Fig. 2.
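As an illustration of the segmentation and edge-detection steps of Sections 3.2 and 3.3, the following pure-Python sketch searches for the Otsu threshold by maximizing the between-class variance of Eq. (4) and builds an eight-direction Prewitt edge map. It is a minimal sketch under stated assumptions, not the implementation used in this study: the function names (`otsu_threshold`, `prewitt_kernels`, `prewitt_edges`) are hypothetical, the histogram is assumed to be non-empty, and the eight templates are assumed to be the 45° rotations of the conventional 3 × 3 Prewitt kernel, since the template of Fig. 2 is not reproduced in the text.

```python
# Illustrative sketch only. Names and the rotated templates are assumptions.

def otsu_threshold(hist):
    """Return the threshold t that maximizes the between-class variance.

    hist[i] is the pixel count at intensity i (assumed to sum to > 0).
    Class 0 holds levels 0..t-1 and class 1 holds levels t..L-1.
    """
    total = float(sum(hist))
    p = [h / total for h in hist]                      # p(i)
    mu_total = sum(i * pi for i, pi in enumerate(p))   # overall mean
    w0 = 0.0   # running class-0 probability
    s0 = 0.0   # running sum of i * p(i) over class 0
    best_t, best_var = 0, -1.0
    for t in range(1, len(hist)):
        w0 += p[t - 1]
        s0 += (t - 1) * p[t - 1]
        w1 = 1.0 - w0
        if w0 < 1e-12 or w1 < 1e-12:
            continue                                   # one class is empty
        mu0 = s0 / w0
        mu1 = (mu_total - s0) / w1
        var_b = w0 * w1 * (mu0 - mu1) ** 2             # between-class variance
        if var_b > best_var:
            best_t, best_var = t, var_b
    return best_t


# Outer ring of a 3 x 3 kernel, listed clockwise from the top-left corner.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
BASE = [1, 1, 1, 0, -1, -1, -1, 0]   # conventional Prewitt kernel on the ring


def prewitt_kernels():
    """Build eight kernels by rotating the Prewitt kernel in 45-degree steps."""
    kernels = []
    for k in range(8):
        kern = [[0, 0, 0] for _ in range(3)]
        for idx, (r, c) in enumerate(RING):
            kern[r][c] = BASE[(idx - k) % 8]
        kernels.append(kern)
    return kernels


def prewitt_edges(image):
    """Return the maximal directional response at every interior pixel.

    image is a list of equal-length rows of numbers. Border pixels are
    left at 0 because the 3 x 3 window does not fit there.
    """
    rows, cols = len(image), len(image[0])
    kernels = prewitt_kernels()
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Opposite directions give responses of opposite sign, so the
            # maximum over all eight kernels is never negative.
            out[r][c] = max(
                sum(k[i][j] * image[r - 1 + i][c - 1 + j]
                    for i in range(3) for j in range(3))
                for k in kernels
            )
    return out
```

A typical use would threshold the histogram of the filtered image with `otsu_threshold` and pass the image (as a list of rows) to `prewitt_edges`. Keeping only the maximum over the eight directional responses is one common reading of the eight-direction scheme.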
Fig. 2 Template of eight directions

The Prewitt algorithm computes the gradient in all directions for every point of the image and then takes the maximal gradient magnitude as the pixel value of the corresponding point in the edge image.

3.4 Fuzzy clustering

The images are clustered using the fuzzy C-means (FCM) clustering technique, since it performs well on images without noise. Instead of allocating each training vector to just one cluster, fuzzy clustering allocates to every training vector a set of membership values, one for each cluster (Zheng et al. 2015). The fuzzy objective function to be minimized is

\[ J_m = \sum_{i=1}^{N} \sum_{j=1}^{J} u_{ij}^{m}\, d_{ij} \tag{6} \]

where \(y_i\), i = 1, 2, …, N, are the data points in a D-dimensional vector space, \(N\) is the total number of data points, \(J\) is the number of clusters, \(u_{ij}\) is the degree of membership of \(y_i\) in cluster \(j\), \(m\) is the weighting exponent, and \(d_{ij}\) is the distance between \(y_i\) and the cluster centre \(\mu_j\), also referred to as the distance function. In FCM, the squared Euclidean distance is generally applied:

\[ d_{ij} = \lVert y_i - \mu_j \rVert^2 \tag{7} \]

Using the distance in Eq. (7), the FCM algorithm is iterated under the required conditions to minimize \(J_m\) with the help of the following equations:

\[ \mu_j = \frac{\sum_{i=1}^{N} u_{ij}^{m}\, y_i}{\sum_{i=1}^{N} u_{ij}^{m}} \tag{8} \]

\[ u_{ij} = \frac{(d_{ij})^{1/(1-m)}}{\sum_{h=1}^{J} (d_{ih})^{1/(1-m)}} \tag{9} \]

with the constraint \(\sum_{j=1}^{J} u_{ij} = 1\).

3.5 Feature extraction

The proposed model uses the GLCM for extracting the features. The GLCM is the most popular second-order statistical technique for measuring the textural information of images. It captures texture information from pairs of pixels: it was introduced to describe textures by statistically sampling how often certain grey levels occur in relation to other grey levels (Haralick and Shanmugam 1973). There are two steps: the first computes the co-occurrence matrix, and the second calculates the texture-based features from the co-occurrence matrix obtained in the first step. The image features extracted in this study are contrast, correlation, energy, entropy, homogeneity, standard deviation, skewness, kurtosis and variance. The features are given by the following expressions.

Contrast:

\[ \sum_{i,j=0}^{N-1} P_{i,j}\,(i - j)^2 \tag{10} \]

Correlation:

\[ \sum_{i,j=0}^{N-1} P_{i,j} \left[ \frac{(i - \mu_i)(j - \mu_j)}{\sqrt{\sigma_i^2\,\sigma_j^2}} \right] \tag{11} \]

Energy of image:

\[ \sum_{i,j=0}^{N-1} P_{i,j}^2 \tag{12} \]

Entropy of image:

\[ -\sum_{b=0}^{L-1} P(b)\,\log_2 P(b), \qquad \text{where } P(b) = \frac{N(b)}{M} \tag{13} \]

Homogeneity:

\[ \sum_{i,j=0}^{N-1} \frac{P_{i,j}}{1 + (i - j)^2} \tag{14} \]

Standard deviation:

\[ S_X = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}} \tag{15} \]

where \(x_i\) are the observed values of a sample item, \(\bar{x}\) is the mean value of the observations, and \(N\) is the number of observations.

Skewness:

\[ \text{Skewness} = \frac{3\,(\text{Mean} - \text{Median})}{\text{Standard deviation}} \tag{16} \]

Kurtosis, defined as the fourth standardised moment:

\[ \operatorname{Kurt}[X] = E\!\left[\left(\frac{X - \mu}{\sigma}\right)^{4}\right] = \frac{E[(X - \mu)^4]}{\left(E[(X - \mu)^2]\right)^2} = \frac{\mu_4}{\sigma^4} \tag{17} \]

Variance:

\[ \operatorname{Var}(X) = E\!\left[(X - \mu)^2\right] \tag{18} \]

3.6 Image classification

The images are next classified using convolutional neural networks. A CNN convolves the input images with kernels to obtain feature maps. The kernel weights connect every unit of a feature map to the previous layer. During training, the kernel weights are adapted to enhance the relevant attributes of the input. The number of weights to be trained in the convolutional layers is smaller than in fully connected (FC) layers, because the kernels are shared by all the units of a feature map. The architecture of the CNN is illustrated in Fig. 3.

Fig. 3 Architecture of CNN

The functionality of a CNN can be divided into four key areas.

• The input layer holds the pixel values of the image.
• The convolutional layer determines the output of the neurons connected to local regions of the input by computing the scalar product between their weights and the region of the input volume they are connected to.
• The pooling layer then downsamples the input, reducing the number of parameters in that activation.
• The fully connected layer then generates class scores from the activations, which are used for classification.

4 Results and discussion

The present work considers a total of 200 MRI images of the brain, 100 for training and 100 for testing. Figure 4 shows a sample input image and Fig. 5 shows the resized image. The image is then filtered with a Gaussian filter to remove the noise in it (Fig. 6). The contrast of the filtered image is then increased (Fig. 7).

Fig. 4 Input image
Fig. 5 Resized image
Fig. 6 Filtered image
Fig. 7 Contrast-enhanced image

The filtered image is then segmented by applying the Otsu thresholding technique. The segmented image is presented in Fig. 8, and the performance of the segmentation technique is tabulated in Table 1. The Prewitt edge detection method is then applied to this image to detect its edges. The resulting edge-detected image is shown in Fig. 9.

Fig. 8 Segmented image

Table 1 Segmentation results

Performance metric    Result obtained
Accuracy              0.5201
Sensitivity           0.3351
F-measure             0.3831
Precision             0.4471
MCC                   0.0035

Fig. 9 Edge detected image

From the edge-detected image, features are extracted using the grey level co-occurrence matrix technique. The extracted features and their values are tabulated in Table 2. The fuzzy C-means clustering technique is then applied to cluster them into four different groups by reducing the threshold level. The clustered images are shown in Fig. 10. Finally, these clustered images are fed to the CNN classifier for classification for the early diagnosis of AD. The classifier classified the images with an accuracy of 90.25% and a sensitivity of 85.53%. The classifier output is shown in Fig. 11. The ROC curve obtained from the proposed technique is shown in Fig. 12, and Fig. 13 provides the parameters of the ROC curve.

Table 2 Extracted features

Feature extracted     Value
Contrast              1.634
Correlation           0.9521
Energy                0.4816
Homogeneity           0.9792
Mean                  0.3333
Standard deviation    0.4376
Entropy               4.2993
RMS                   0.4765
Variance              0.1215
Smoothness            1.000
Kurtosis              1.6140
Skewness              0.7163
IDM                   220.4068

The proposed method using the CNN classifier is compared with results obtained using a KNN classifier implemented on the same dataset. The KNN classifier gave different values of the performance metrics for K = 5, 9 and 15. The comparison of the KNN and CNN classifiers with respect to accuracy, sensitivity and specificity is tabulated in Table 3. From Table 3, the KNN classifier provides an average accuracy of 59.3%, while the proposed CNN classifier provides an accuracy of 90.25%.

We compare the results of the proposed method with various techniques available in the existing literature to validate the proposed algorithm. Table 4 compares the accuracy and sensitivity values achieved by a few state-of-the-art methods discussed in the literature with those of the proposed method. A graphical representation of this comparison is shown in Fig. 14.
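For readers who wish to reproduce metrics of the kind reported in Tables 1 and 3, the following Python sketch computes accuracy, sensitivity, specificity, precision, F-measure and the Matthews correlation coefficient (MCC) from the four counts of a binary confusion matrix. It is a minimal sketch, not the evaluation code used in this study: the function name `binary_metrics` is hypothetical, the counts in the usage lines are made up for illustration, and returning NaN for a zero denominator is a design choice of the sketch. The paper does not state how the NaN entries in Table 3 arose.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives. Any ratio whose denominator is
    zero is returned as NaN instead of raising an error.
    """
    def ratio(num, den):
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),       # true positive rate
        "specificity": ratio(tn, tn + fp),       # true negative rate
        "precision": ratio(tp, tp + fp),
        "f_measure": ratio(2 * tp, 2 * tp + fp + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Made-up counts for illustration only (not taken from the paper).
example = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters on imbalanced data, where a high accuracy can hide poor detection of the minority class.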
From the inference of the graph, 123 Microsystem Technologies Fig. 10 Clustered image Fig. 12 ROC curve Fig. 11 Classifier output 123 Microsystem Technologies proposed technique delivers better accuracy which is 90.25% when compared to other existing algorithms which has the next best value of 88.58%. 5 Conclusion The present paper proposes an effective machine learning model for detecting the AD in its initial stages. The developed model applied a Gaussian filter for removal of unwanted noise, Otsu thresholding for image segmentation, Prewitt edge detection approach for detecting the edges, GLCM for extracting the features and FCM for clustering and CNN for the final classification of the images. The classifier gave an accuracy of 90.25% and sensitivity of 85.53% in comparison with the KNN classifier that pro- vided an accuracy of just 59.3% and sensitivity of 45.2%. The same results are then compared with various previous works in literature in order to prove the efficiency of our proposed algorithm Fig. 13 Parameters of ROC curve Table 3 Comparison of existing Technique Accuracy Sensitivity Specificity (KNN) and proposed (CNN) techniques KNN with K = 5 58.63 43.22 64.88 KNN with K = 9 60.64 58.47 NaN KNN with K = 15 58.63 33.89 59.54 Proposed (CNN) 90.25 85.53 NaN Table 4 Comparison of Methods Modality Accuracy Sensitivity proposed approach with state of the art techniques SK SVM MRI/PET 84.40 84.64 MK SVM MRI/PET 86.42 84.98 Liu et al. MRI/PET 87.76 88.57 Cuingnet et al. ADNI 88.58 81.00 Hinrichs et al. MRI/PET 82.00 85.00 Proposed method MRI 90.25 85.53 Fig. 14 Graphical representation of proposed approach compared with existing literature 123 Microsystem Technologies References O’Shea K, Nash R (2015) An introduction to convolutional neural networks. 
arXiv preprint arXiv:1511.08458 Payan A, Montana G (2015) Predicting Alzheimer’s disease: a Cuingnet R, Gerardin E, Tessieras J, Auzias G, Lehericy S, Habert neuroimaging study with 3D convolutional neural networks. MO, Chupin M, Benali H, Colliot O (2011) Automatic arXiv preprint arXiv:1502.02506 classification of patients with Alzheimer’s disease from struc- Perumal S, Velmurugan T (2018) Preprocessing by contrast enhance- tural MRI: a comparison of ten methods using the ADNI ment techniques for medical images. Int J Pure Appl Math database. NeuroImage 56:766–781 118(18):3681–3688 Fotenos AF, Snyder AZ, Girton LE, Morris JC, Buckner RL (2005) Ramı́rez J, Górriz JM, Salas-Gonzalez D, Romero A, López M, Normative estimates of cross-sectional and longitudinal brain Álvarez I, Gómez-Rı́o M (2013) Computer-aided diagnosis of volume decline in aging and AD. Neurology 64(6):1032–1039 Alzheimer’s type dementia combining support vector machines Haralick RM, Shanmugam K (1973) Textural features for image and discriminant set of features. Inf Sci 237:59–72 classification. IEEE Trans Syst Man Cybern 6:610–621 Suk HI, Shen D (2013) Deep learning-based feature representation for Hinrichs C, Singh V, Mukherjee L, Xu G, Chung MK, Johnson SC AD/MCI classification. In: International conference on medical (2009) Spatially augmented LPboosting for AD classification image computing and computer-assisted intervention. Springer, with evaluations on the ADNI dataset. NeuroImage 48:138–149 Berlin, pp 583–590 Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely Suk HI, Lee SW, Shen D, Alzheimer’s Disease Neuroimaging connected convolutional networks. In: Proceedings of the IEEE Initiative (2014) Hierarchical feature representation and multi- conference on computer vision and pattern recognition, modal fusion with deep learning for AD/MCI diagnosis. 
pp 4700–4708 NeuroImage 101:569–582 Jack CR Jr, Albert MS, Knopman DS, McKhann GM, Sperling RA, Suk HI, Lee SW, Shen D, Alzheimer’s Disease Neuroimaging Carrillo MC, Thies B, Phelps CH (2011) Introduction to the Initiative (2015) Latent feature representation with stacked auto- recommendations from the National Institute on Aging-Alzhei- encoder for AD/MCI diagnosis. Brain Struct Funct mer’s Association workgroups on diagnostic guidelines for 220(2):841–859 Alzheimer’s disease. Alzheimer’s Dement 7(3):257–262 Yang L, Wu X, Zhao D, Li H, Zhai J (2011). An improved Prewitt Li F, Liu M, Initiative Alzheimer’s Disease Neuroimaging (2018) algorithm for edge detection based on noised image. In: 2011 4th Alzheimer’s disease diagnosis based on multiple cluster dense International congress on image and signal processing, vol 3. convolutional networks. Comput Med Imaging IEEE, pp 1197–1200 Graph 70:101–110 Zhang D, Wang Y, Zhou L, Yuan H, Shen D (2011) Multimodal Liu M, Zhang D, Shen D, Alzheimer’s Disease Neuroimaging classification of Alzheimer’s disease and mild cognitive impair- Initiative (2012) Ensemble sparse classification of Alzheimer’s ment. Neuroimage 55:856–867 disease. NeuroImage 60(2):1106–1116 Zheng Y, Jeon B, Xu D, Wu QM, Zhang H (2015) Image Liu S, Liu S, Cai W, Pujol S, Kikinis R, Feng D (2014) Early segmentation by generalized hierarchical fuzzy C-means algo- diagnosis of Alzheimer’s disease with deep learning. In: 2014 rithm. J Intell Fuzzy Syst 28(2):961–973 IEEE 11th international symposium on biomedical imaging (ISBI). IEEE, pp 1015–1018 Makkar H, Pundir A (2014) Image analysis using improved Otsu’s Publisher’s Note Springer Nature remains neutral with regard to thresholding method. Int J Recent Innov Trends Comput jurisdictional claims in published maps and institutional affiliations. Commun 2(8):2122–2126 Ortiz A, Munilla J, Gorriz JM, Ramirez J (2016) Ensembles of deep learning architectures for the early diagnosis of the Alzheimer’s disease. 
Int J Neural Syst 26(07):1650025 123