Author: Soliman, Sarah Ahmed./ Title: MACHINE LEARNING APPROACHES FOR INTELLIGENT MEDICAL DIAGNOSTIC SYSTEM /

Title

MACHINE LEARNING APPROACHES FOR
INTELLIGENT MEDICAL DIAGNOSTIC SYSTEM /

Author

Soliman, Sarah Ahmed.

Committee

Researcher / Sarah Ahmed Soliman

Supervisor / Abd El-Badeeh Mohamed Salem

Supervisor / Safia Abbas

Examiner / Atef Zaki Ghalwash

Publication date

2016.

Number of pages

124 p.

Language

English

Degree

Master's

Specialization

Computer Science Applications

Date of approval

1/1/2016

Place of approval

Egyptian Universities Libraries Consortium - Faculty of Computers / Computer Science

Abstract

Healthcare has a powerful impact on all activities. Physicians must be able to decide on
and diagnose a disease under any circumstances: what the patient's illness level is, which
treatment is appropriate, and how the patient will progress during treatment. With the
rapid growth of medical databases and the increasing need to investigate such large
volumes of data, establishing intelligent systems capable of manipulating and analyzing
these data repositories is mandatory. In the past two decades, medical diagnostic systems
based on data mining approaches have been widely used to tackle and exploit the vast
growth of medical databases.
Such growth in medical data records overwhelms and undermines the human information
retrieval process used to extract and examine the critical cases contained in such data
repositories. As a result, knowledge discovery and data mining processes play a major
role in uncovering, depicting, and sustaining reliable, fast, and more adequate medical
diagnosis decisions for critical cases.
Recently, research efforts have focused on medical expert systems as an integral solution
alongside conventional techniques for solving medical problems. Accurate and precise
diagnosis of any disease is of key importance in the medical field. However, there are
hundreds of deaths around the world as a result of factors such as poor diagnosis,
self-medication, a shortage of medical experts, time-consuming diagnosis, and medical
negligence. Thrombosis is one of the most important and severe complications in collagen
diseases, and one of the major causes of death. Yet the main factors and symptoms for
thrombosis diagnosis and its classification are still unclear.
Collagen diseases have spread owing to many factors, such as pressure and pollution.
Thrombosis is one of the best-known collagen diseases; it obstructs the blood flow,
causing serious complications in crucial parts of the circulatory system. Such diseases
pose a high risk for doctors because of the large number of laboratory examinations and
the effort required for diagnosis.
The work in this thesis analyzes thrombosis disease and offers the following contributions:
Firstly, it presents a study of several data mining techniques that have been used in
medical diagnosis for different diseases. This study reports the type of each dataset,
the data mining technique used, and the performance of the results. A brief comparison
then identifies the most efficient mining technique based on the medical domain and the
nature of the dataset.
Secondly, the C4.5 algorithm, one of the best-known data mining techniques, was
implemented on a real thrombosis dataset. The dataset was collected from Chiba University
as a challenging dataset for thrombosis diagnosis. The C4.5 algorithm makes it possible to
construct a decision tree (DT) and extract more knowledge for diagnosing the disease by
calculating the entropy and the gain ratio based on both the degree of the disease and
the symptoms.
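As an illustration of the entropy and gain ratio calculation that C4.5 uses to choose a
split, the following is a minimal Python sketch. It is not the thesis implementation: the
function names, the attribute names (`test_result`, `noise`, `degree`), and the four toy
records are hypothetical and are not taken from the Chiba University dataset. The sketch
handles categorical attributes only and omits the continuous-attribute thresholds,
missing-value handling, and pruning that a full C4.5 implementation includes.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` (a list of dicts) on a categorical attribute."""
    base = entropy([r[target] for r in rows])
    # Group the class labels by the value of the candidate attribute.
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    total = len(rows)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = base - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy([r[attribute] for r in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records (assumption): "degree" is the class to predict.
rows = [
    {"test_result": "high", "noise": "a", "degree": "severe"},
    {"test_result": "high", "noise": "b", "degree": "severe"},
    {"test_result": "low", "noise": "a", "degree": "none"},
    {"test_result": "low", "noise": "b", "degree": "none"},
]

for attr in ("test_result", "noise"):
    print(attr, gain_ratio(rows, attr, "degree"))
```

In this toy data `test_result` separates the two classes perfectly while `noise` carries
no information about the class, so a C4.5-style learner would split on `test_result` first.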
Thirdly, 10-fold cross-validation is used to test and evaluate the model on the dataset.
The accuracy of the created predictive model was increased by applying the KDD (knowledge
discovery in databases) methodology: the accuracy reached 90.89%, and the error
percentage was reduced accordingly.
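The 10-fold cross-validation procedure can be sketched in a few lines of Python. This is
an illustrative sketch under stated assumptions, not the evaluation code of the thesis:
the helper names (`k_fold_indices`, `cross_validate`, `train_majority`) are hypothetical,
and a majority-class baseline stands in for the C4.5 decision tree so that the example
stays self-contained.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k nearly equal folds."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, train_fn, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out once for testing."""
    accuracies = []
    for test_idx in k_fold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        # Train on the other k-1 folds only, then score on the held-out fold.
        model = train_fn([features[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(model(features[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


def train_majority(train_features, train_labels):
    """Stand-in learner (assumption): always predicts the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda x: majority
```

Passing a function that trains a decision tree in place of `train_majority` would give the
kind of accuracy estimate described above; the error percentage is then one minus the
mean accuracy.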
Finally, the main aims of the thesis are: to predict thrombosis by studying the most
effective factors using data mining and the knowledge discovery process; to help and
support novice physicians in detecting the presence of such collagen diseases and making
the right diagnosis; and to accelerate the diagnostic process and improve its accuracy.