Author: Hend Galal Elsayed Gaber / Title: Clustering with missing data under different mechanisms: a mathematical programming approach

Title

Clustering with missing data under different mechanisms: a mathematical programming approach

Publisher

Hend Galal Elsayed Gaber

Author

Hend Galal Elsayed Gaber

Preparation committee

Researcher / Hend Galal Elsayed Gaber

Supervisor / Ahmed Mahmoud Gad

Supervisor / Mahmoud Mostafa Rashwan

Publication date

2019

Number of pages

66 p.

Language

English

Degree

Master's

Specialization

Statistics and Probability

Date of approval

21/8/2019

Place of approval

Cairo University - Faculty of Economics and Political Science - Statistics

Index

Only 14 of 103 pages are available for public view

Abstract

Cluster analysis is a convenient method for identifying homogeneous groups of observations, called clusters. The goal of clustering is to discover a natural grouping in a set of points or objects without knowledge of any class labels. In real applications, data sets often contain missing data and outliers, which make the clustering problem more challenging. Missing data can affect the results and make accurate estimates difficult to obtain. When missing data are ignored, the results can be biased, unrealistic and insignificant. In real-life situations, missing values arise for many reasons. For example, a respondent in a household survey may refuse to report income. Sometimes missing values are caused by the researcher, for example when data collection is done improperly, mistakes are made in data entry, or the questionnaire is poorly designed.

This thesis treats the problem of clustering with missing data under different mechanisms. First, the thesis defines cluster analysis, covers general procedures for handling missing data, and reviews the literature on treating missing values in cluster analysis. Finally, it presents a new approach based on formulating a mathematical programming model for cluster analysis when missing data are present, without the need to pre-process the data.

Supervised by Prof. Ahmed Mahmoud Gad, Professor of Statistics, and Dr. Mahmoud Mostafa Rashwan, Assistant Professor of Statistics, both of the Department of Statistics, Faculty of Economics and Political Science.