Papers by Merve Ayyüce Kızrak, Ph.D.

IEEE Access, 2021
We propose a strategy for estimating the number of people in a crowd, one of the aims of crowd analysis, from static images or video frames. While manual feature extraction was not performed with pixel and regression-based methods in the first studies on crowd analysis, recent studies use Convolutional Neural Networks (CNN) based models. However, it is still difficult to extract spatial information such as position, orientation, posture, and angular value for crowd estimation from a density map. This study uses capsule networks and the routing-by-agreement algorithm as an attention module. Our proposed approach consists of both CNN and capsule network-based attention modules in a two-column deep neural network architecture. We evaluate our approach against other state-of-the-art methods on five well-known datasets: UCF-QNRF, UCF_CC_50, UCSD, ShanghaiTech Part A, and WorldExpo’10.
RecycleNet: Intelligent Waste Sorting Using Deep Neural Networks
2018 Innovations in Intelligent Systems and Applications (INISTA)

SAGE Journals, 2017
Musical information retrieval (MIR) applications have become an interesting topic for both researchers and commercial applications. The majority of the current knowledge on MIR is based on Western music. However, traditional genres, such as Classical Turkish Music (CTM), differ greatly in structure from Western music. The validity of the current knowledge on this subject must therefore be checked on such genres. In this work, a MIR application that simulates the human music processing system based on CTM is proposed. To achieve this goal, mel-frequency cepstral coefficients (MFCCs) and delta-MFCCs, the most frequently used features in audio applications, were used as features. In the last few years, deep belief networks (DBNs) have become promising classifiers for sound classification problems. To test this claim, the classification accuracies of four probability theory-based neural networks, namely radial basis function networks, generalized regression neural networks, probabilistic neural networks, and support vector machines, were compared with that of the DBN. Our results show that the DBN outperforms the others.
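The delta-MFCC features mentioned above are time derivatives of the MFCC frames. The abstract does not say how they were computed; the sketch below assumes the commonly used regression formula over a window of `width` frames on each side, with the first and last frames repeated at the edges. The function name `delta_features` and the default `width=2` are illustrative choices, not taken from the paper.

```python
from typing import List


def delta_features(frames: List[List[float]], width: int = 2) -> List[List[float]]:
    """Regression-based deltas of a sequence of feature frames.

    frames[t][k] is coefficient k at frame t. Edges are padded by
    repeating the first and last frame (an assumption of this sketch).
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    num_frames = len(frames)
    # Normalizer of the regression formula: 2 * sum(n^2) for n = 1..width.
    denom = 2 * sum(n * n for n in range(1, width + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas
```

For a coefficient that rises by one unit per frame, the interior deltas come out as 1.0, which is the slope; a constant coefficient gives deltas of zero.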
INnovations in Intelligent SysTems and Applications (INISTA), 2017 IEEE International Conference on, 2017
Crowd analysis on video recordings is currently an important research area. In this work, a combined crowd density estimation method is presented. To improve the accuracy of the system, two different estimators run simultaneously, and a blob is marked as a person only if both estimators mark it as a person. One of the main problems in crowd density estimation is occlusion. To overcome this problem, the trajectories of blobs are tracked using a Kalman filter. The method was applied to three common benchmark datasets: PETS2009, UCSD, and Grand Central. The results confirm the proposed method's success.
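As an illustration of how a Kalman filter can carry a blob's trajectory through an occlusion, here is a minimal sketch for one coordinate of a blob centroid. It assumes a constant-velocity motion model and handles an occluded frame by running the prediction step without a measurement update. The class name `Kalman1D`, the noise values `q` and `r`, and the predict-only treatment of occlusion are assumptions of this sketch; the abstract does not describe the filter design.

```python
from typing import Optional


class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate of a blob centroid."""

    def __init__(self, pos: float, vel: float = 0.0, var: float = 1.0,
                 q: float = 0.01, r: float = 1.0) -> None:
        self.pos, self.vel = pos, vel
        # Covariance entries of the (position, velocity) state.
        self.p00, self.p01, self.p10, self.p11 = var, 0.0, 0.0, var
        # Process and measurement noise variances (illustrative values).
        self.q, self.r = q, r

    def step(self, z: Optional[float], dt: float = 1.0) -> float:
        """Advance one frame; pass z=None when the blob is occluded."""
        # Predict: x = F x, P = F P F^T + Q with F = [[1, dt], [0, 1]].
        self.pos += self.vel * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        if z is not None:
            # Update with a position measurement (H = [1, 0]).
            s = p00 + self.r
            k0, k1 = p00 / s, p10 / s
            residual = z - self.pos
            self.pos += k0 * residual
            self.vel += k1 * residual
            p00, p01, p10, p11 = ((1 - k0) * p00, (1 - k0) * p01,
                                  p10 - k1 * p00, p11 - k1 * p01)
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.pos
```

Calling `step(measurement)` on visible frames and `step(None)` on occluded ones keeps the track moving at the last estimated velocity until the blob reappears. A 2-D tracker could run one such filter per image axis.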

Voting-Based Multiple Classification Approach for Turkish News Texts
IEEE Xplore, 2019 Innovations in Intelligent Systems and Applications Conference, 2019
Nowadays, numerous sources on the internet produce news on a daily basis. This growing knowledge base makes it difficult for users to access the information and news they are looking for, so classifying the information is important for fast and efficient search and access. In this study, Kemik, a dataset of Turkish news content prepared by the Yıldız Technical University Natural Language Processing Group, is used. A hierarchical approach based on a voting structure is adopted using machine learning methods. First, the Tf-Idf method is applied to word 1-3-grams and character 2-6-grams, producing a 2000-dimensional feature vector. FastText is used to obtain 300-dimensional feature vectors, and the two feature vectors are combined to produce 2300-dimensional feature vectors. To determine which of these vectors gives the highest classification accuracy, the Support Vector Machines method is applied, and Tf-Idf, which has the most robust accuracy, is selected as the main feature extraction method. Next, Support Vector Machines, K-Nearest Neighbors, Random Forest, Logistic Regression, and XGBoost are used to classify the news texts. The labels estimated by all classifiers are voted on for each sample, and the label with the highest share of votes is taken as the final estimate. The aim of the study is to open the way to reaching the right information quickly by classifying news topics. Finally, the feature vector size is reduced using Principal Component Analysis, which makes it possible to gain processing speed without reducing performance. In both approaches, the performance achieved by voting is higher than the individual performance of the classifiers.
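The voting step described above can be sketched as a plain majority vote over the labels that each classifier predicts for a sample. This is a minimal illustration, not the authors' implementation: the function name `majority_vote`, the classifier keys, and the tie-breaking rule (a tie goes to the label that appears first in classifier order) are assumptions of this sketch.

```python
from collections import Counter
from typing import Dict, List, Sequence


def majority_vote(predictions: Dict[str, Sequence[str]]) -> List[str]:
    """Combine per-classifier label predictions by majority vote.

    predictions maps a classifier name to its predicted labels, one per
    sample. All classifiers must predict for the same samples.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    names = list(predictions)
    num_samples = len(predictions[names[0]])
    if any(len(predictions[name]) != num_samples for name in names):
        raise ValueError("all classifiers must label the same samples")
    final = []
    for i in range(num_samples):
        votes = Counter(predictions[name][i] for name in names)
        # most_common keeps first-seen order among equal counts, so a tie
        # goes to the label voted by the earliest-listed classifier.
        label, _ = votes.most_common(1)[0]
        final.append(label)
    return final


# Hypothetical predictions from three classifiers on three news items.
example = {
    "svm": ["sports", "economy", "politics"],
    "knn": ["sports", "sports", "health"],
    "random_forest": ["economy", "economy", "politics"],
}
combined = majority_vote(example)
```

Here `combined` holds one final label per news item, chosen by the classifiers' votes.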

Artificial neural networks and machine learning have been used to solve many problems for decades. The growing complexity of problems and models and the increase in the amount of data have also brought a computational burden. In this study, the whole transition from artificial neural networks to deep learning, along with models and practical applications, is briefly presented. Information about hardware, software, and the libraries used is also provided. In particular, traditional methods for crowd analysis are summarized. Deep learning approaches to crowd analysis in the literature are described in depth and datasets are introduced. Furthermore, studies from recent years are analyzed and compared. As a result, crowd analysis is both an academic and a practical field where successful results are achieved with the help of deep learning. Keywords: deep learning, artificial neural networks, convolutional neural networks, recurrent neural networks, crowd analysis.

Cluster-Based Monitoring and Location Estimation for Crowd Counting
Springer, 2021
Crowd management and monitoring are important research topics for ensuring personal and community safety by making use of video images. A crowd tracking system (CMS) has to handle density variation in images, irregular distribution of people and objects, overcrowding, and exposure estimation. People come together for various purposes in areas designed for socializing, such as parks, stadiums, airports, hospitals, and shopping malls. Generally, these areas are monitored by closed-circuit television (CCTV). However, this type of system has drawbacks, chiefly limited portability, inflexible access, limited coverage area, and high power consumption. Crowd density is often examined by human operators for behavioral analysis or to identify suspicious people. Computer vision and deep learning approaches are used to prevent mistakes caused by human error and to speed up evaluation.
In this study, a method for detecting event changes with cluster-based inspection in crowd images is proposed. The Gaussian YOLOv3 model is used for object recognition. The proposed clustering approach tracks changes in the number, coordinates, and direction of each cluster. The time, location, and class of a behavior change are obtained from this information. Walking and running are the two events considered, because sudden changes of state occur in these behaviors. Six video sequences from the PETS2009 dataset are used. Accuracy ranges from 83.2% to 96.4%, which is comparable with similar studies in the literature.

IEEE Xplore, 2020 International Conference on INnovations in Intelligent SysTems and Applications, 2020
Medical sciences are an important application area of artificial intelligence. Healthcare requires meticulousness in the whole process from collecting data to processing it, and data quality, data size, and data privacy must all be handled. Various kinds of data are used in the fight against the COVID-19 outbreak. Medical and location data collected from mobile phones and wearable devices are used to prevent the spread of the epidemic. In addition, artificial intelligence approaches that use medical images to identify COVID-19 infected people have been presented. However, such studies should take care not to endanger the security of the data, people, and countries that these useful applications depend on. Therefore, differential privacy (DP), an active research subject, is included in this study. Chest X-ray (CXR) images of 139 COVID-19 infected cases, and 373 images in total, were collected from public data sources for a diagnostic proof of concept. EfficientNet-B0, a recent and robust deep learning model, was trained on them and predicted the possibility of infection with an accuracy of 94.7%. Other evaluation parameters are also discussed in detail. Despite the limited data, this performance suggests that results can be improved by augmenting the dataset. The most important aspect of the study is the proposal of differential privacy practice so that such applications can be reliable in real-life use cases. With this view, the experiments were repeated with DP-applied images and the results are presented. The Private Aggregation of Teacher Ensembles (PATE) approach was used to provide the privacy guarantee.
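The core of PATE is a noisy vote: each teacher model predicts a label, Laplace noise is added to the per-class vote counts, and the class with the largest noisy count is released. The sketch below illustrates only that aggregation step under stated assumptions; the function names, the binary labels (1 for COVID-19, 0 for other), and the value of `gamma` are hypothetical and not taken from the study.

```python
import random
from typing import List, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_aggregate(teacher_votes: List[int], num_classes: int, gamma: float,
                    rng: Optional[random.Random] = None) -> int:
    """Return the class with the highest Laplace-perturbed vote count.

    gamma is the inverse noise scale: a smaller gamma means more noise
    and stronger privacy, at the cost of more label errors.
    """
    rng = rng or random.Random()
    counts = [0] * num_classes
    for vote in teacher_votes:
        counts[vote] += 1
    noisy = [count + laplace_noise(1.0 / gamma, rng) for count in counts]
    return max(range(num_classes), key=noisy.__getitem__)


# Hypothetical ensemble: 90 teachers vote class 1, 10 vote class 0.
votes = [1] * 90 + [0] * 10
label = noisy_aggregate(votes, num_classes=2, gamma=1.0, rng=random.Random(0))
```

When the teachers agree strongly, as in this example, the noise almost never changes the outcome; when the vote is close, the released label becomes uncertain, which is what limits how much any single teacher's training data can influence it.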
Recognition of sign language using capsule networks
2018 26th Signal Processing and Communications Applications Conference (SIU)
Hearing and speech impaired persons communicate with the help of lip reading or hand and face movements, also known as sign language. Ensuring disabled persons' participation in life and increasing their quality of life are achievable through healthy and effective communication with other people. In this work, digits of the sign language were recognized with 94.2% validation accuracy by Capsule Networks.

IEEE Xplore, 2019 Innovations in Intelligent Systems and Applications Conference, 2019
In this study, a novel and efficient deep learning model is proposed to estimate the number of people in highly dense crowd images. We present a convolutional neural network model consisting of two parallel modules that focus on different specific features of the images. The first module derives the general density map from lower-level features, while the higher-level features in the deeper second module make it possible to identify regions of the human body, such as the head and upper body. The two modules are then concatenated with a fully connected neural network. The proposed model was tested on the ShanghaiTech Part-A dataset. Mean square error and mean absolute error are used as performance metrics. Compared on these metrics with recent studies, the proposed method obtained more successful results.
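The two metrics named above can be computed from per-image predicted and ground-truth counts as follows. This is a minimal sketch with made-up counts; note that some crowd-counting papers report the square root of the mean squared error under the name MSE, and the abstract does not say which convention was used, so the plain definition is assumed here.

```python
from typing import Sequence, Tuple


def count_errors(predicted: Sequence[float],
                 actual: Sequence[float]) -> Tuple[float, float]:
    """Return (mean absolute error, mean squared error) over all images."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    mae = sum(abs(p - a) for p, a in zip(predicted, actual)) / n
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n
    return mae, mse


# Hypothetical counts for three images (not results from the paper).
mae, mse = count_errors([100, 210, 290], [110, 200, 300])
```

Each prediction in this example is off by 10 people, so the mean absolute error is 10 and the mean squared error is 100.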

The rooftop antennas in use today receive analog broadcasts. The disadvantage of analog broadcasting is that only one television channel can broadcast on a given frequency. DVB-T (digital terrestrial TV broadcasting) offers higher picture and sound quality than analog broadcasting. With DVB-T, four different channels can be broadcast on a single frequency. Channel interference and ghosting are eliminated with DVB-T. DVB-T also makes interactive services possible. Its key feature is the use of COFDM (coded orthogonal frequency-division multiplexing) for transmitting digital signals. The problems encountered are multipath effects and inter-channel interference, and COFDM was chosen to prevent them. In this study, the performance of OFDM in the DVB-T system was examined using MATLAB. To improve performance, OFDM carrier counts and frequencies were compared over the frequencies specified for DVB-T with the aim of designing the optimum system, and simulations were carried out.
Akut Lenfosit Lösemi Hücrelerinin Sağlıklı Hücrelerden Ayrılması için Yeni Yöntemler (New Methods for Separating Acute Lymphocytic Leukemia Cells from Healthy Cells)
A review of the methods used in the literature for leukemia diagnosis shows that many different methods have been developed and used to detect leukemia cells with imaging techniques. Research and development on these methods continues. The aim of these studies is to enable earlier and more accurate diagnosis, by means of a pattern recognition algorithm, of Acute Lymphocytic Leukemia (ALL) cells. The disease is seen mostly in children, is treatable when diagnosed, and can be fatal when left untreated. This work reviews studies on separating ALL-type cells from healthy cells. Newly developed methods are presented in detail.
Microwave Symposium, Mediterrannean, 2009
In this paper, we achieved a very important gain improvement for a circularly polarized (CP) patch antenna using a new left handed medium (LHM) structure. To this end, a homogenization procedure of the LHM is carried out in order to design the patch antenna at a suitable frequency where the losses of the LHM are low. From the results we

Classification of Recyclable Materials Using Efficient Deep Learning Models and Benchmarking of GPU Performance
Springer, 2021
Climate change and global warming are partly a consequence of excessive consumption of resources. To slow down global warming and increase energy savings, recycling needs to be widely implemented within the framework of waste management. Waste management and recycling are not only environmentally advantageous but also of great importance for a sustainable economy. Preferring smart systems to human workers is a socially important step that allows people to work in better conditions. Intelligent waste management approaches open up a major research area.
The main objective of the study is to contribute to the efficient collection of recyclable materials from the end consumer, to optimize the collection process, and to reduce the workload at the waste institution. In addition to the TrashNet dataset used in previous work on classifying recyclable materials, an expanded dataset is collected, yielding a more advanced version. Data from three classes (glass, plastic, and metal waste) were collected, and the dataset was expanded from 2527 to 6136 images. The new extended dataset is called TrashX. Thus, not only are the methods used in the literature improved, but the convolutional neural network-based models used are also tested. All results are evaluated according to performance criteria. In this research, recyclable waste is classified into 6 classes on the extended dataset of 6136 RGB images. Within the scope of this study, the largest dataset in the literature was created. High-performance, robust models such as MobileNet, RecycleNet, and EfficientNet are used. One of the most important aspects of the study is that the models are also evaluated in terms of time on different hardware. This benchmarking sheds light for researchers working to improve intelligent recycling and waste management systems.
Finally, experiments are run to compare the performance of the methods on both the TrashNet and TrashX datasets. The experimental results demonstrate that EfficientNet-B3 achieves 93.8% and 97.3% accuracy on the TrashNet and TrashX datasets, respectively, and thus outperforms many recent approaches for trash classification on both datasets.
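Evaluating models "in terms of time on different hardware" comes down to timing the same prediction call on each device. The abstract does not describe the measurement procedure, so the following is only a generic sketch: `time_per_item`, the warm-up run, and the number of repeats are assumptions, and `predict` stands in for whatever callable wraps a model's inference on a given device.

```python
import time
from statistics import mean
from typing import Callable, Sequence


def time_per_item(predict: Callable[[Sequence], object], batch: Sequence,
                  repeats: int = 5, warmup: int = 1) -> float:
    """Average wall-clock seconds per item for predict(batch)."""
    if not batch or repeats < 1:
        raise ValueError("need a non-empty batch and at least one repeat")
    # Warm-up runs are excluded so one-time setup costs do not skew results.
    for _ in range(warmup):
        predict(batch)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(batch)
        timings.append(time.perf_counter() - start)
    return mean(timings) / len(batch)


# Stand-in "model" for illustration: doubles each input value.
seconds = time_per_item(lambda xs: [x * 2 for x in xs], list(range(100)))
```

Running the same function with each model's prediction call on each machine gives directly comparable per-image latencies. For GPU inference, the callable would also need to wait for the device to finish before returning, otherwise the timer stops too early.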

2018 26th Signal Processing and Communications Applications Conference (SIU), 2018
Hearing and speech impaired persons communicate with the help of lip reading or hand and face movements, also known as sign language. Ensuring disabled persons' participation in life and increasing their quality of life are achievable through healthy and effective communication with other people. In this work, digits of the sign language were recognized with 94.2% validation accuracy by Capsule Networks, a deep artificial neural network model. Keywords: deep learning; deep neural networks; capsule networks; convolutional neural networks; sign language; sign language recognition.
INnovations in Intelligent SysTems and Applications (INISTA), 2016 International Symposium on, 2016
Automatic classification of makams from sound data is a challenging yet rarely studied topic. This work aims to develop an MIR system that determines a song's makam. To this end, mel frequency cepstral coefficients were used as features. Five classifiers were considered. The best result, 93.10, was obtained with a deep belief network, which is comparable to recent works.

Klasik Türk Müziği Makamlarının Derin Anlama Ağları ile Sınıflandırılması / Classification of Classic Turkish Music Makams by Using Deep Belief Networks, 2015
Through this work, the six best-known Classic Turkish Music (CTM) makams were classified using Deep Belief Networks (DBN). Mel and delta-Mel Frequency Cepstral Coefficients (MFCC) were used as features. The results were compared with those of earlier studies. The best correct recognition ratio, 92.70%, was obtained with Deep Belief Networks and Mel frequency cepstral coefficients. This result is better than those of recent works reported in the literature. Keywords: Classic Turkish Music; Mel Frequency Cepstral Coefficients; Deep Belief Networks.
In this study, recognition of the six most common Classical Turkish Music makams with artificial neural networks was attempted. Mel frequency cepstral coefficients, delta-Mel frequency cepstral coefficients, and linear prediction coefficients were used as features, and radial basis function networks, generalized regression neural networks, and probabilistic neural networks were used as classifiers, in order to identify the most successful features and network. The effect of the length of the audio segments used in computing the features on performance was also examined. The highest performance, 89.60%, was obtained with delta-Mel frequency cepstral coefficients and a probabilistic neural network.
In this work, Classical Turkish Music songs are classified into six makams. A makam is a modal framework for melodic development in Classical Turkish Music. The effect of the sound clip length on system performance was also evaluated. Mel Frequency Cepstral Coefficients (MFCC) were used as features. The data were classified using a Probabilistic Neural Network. The best correct recognition ratio, 89.4%, was obtained with a clip length of 6 s.