Abstract

Healthcare has a powerful impact on all human activities. Physicians must be able to diagnose a disease under any circumstances: to determine how severe a patient's illness is, which treatment is appropriate, and how the patient is likely to progress during treatment. With the rapid growth of medical databases and the increasing need to investigate such large volumes of data, intelligent systems capable of manipulating and analyzing these data repositories have become mandatory. Over the past two decades, medical diagnostic systems based on data mining approaches have been widely used to handle and exploit this growth. The volume of medical records overwhelms the manual information retrieval process used to identify the critical cases contained in these repositories. As a result, knowledge discovery and data mining play a major role in extracting, describing, and supporting reliable, fast, and adequate diagnostic decisions for critical cases.

Recently, research efforts have focused on medical expert systems as an integral complement to conventional techniques for solving medical problems. Accurate and precise diagnosis of any disease is a key requirement in the medical field. However, hundreds of deaths occur worldwide as a result of factors such as poor diagnosis, self-medication, a shortage of medical experts, time-consuming diagnosis, and medical negligence. Thrombosis is one of the most important and severe complications of collagen diseases and one of the major causes of death. Nevertheless, the main factors and symptoms used to diagnose and classify thrombosis remain unclear. Collagen diseases spread due to many factors, such as pressure and pollution.
Thrombosis is one of the best-known collagen diseases; it obstructs blood flow, causing serious complications in crucial parts of the circulatory system. Such diseases are challenging for doctors because of the large number of laboratory examinations and the effort required for diagnosis. This thesis analyzes thrombosis and offers the following contributions.

First, it surveys data mining techniques that have been used in the medical diagnosis of different diseases. For each study, the survey reports the type of dataset, the data mining technique used, and the performance obtained. A brief comparison then identifies the most efficient mining technique according to the medical domain and the nature of the dataset.

Second, the C4.5 algorithm, one of the best-known data mining techniques, was implemented on a real thrombosis dataset. The dataset was collected from Chiba University and is regarded as a challenging dataset for thrombosis diagnosis. C4.5 constructs a decision tree (DT) and extracts further knowledge for diagnosing the disease by calculating the entropy and the gain ratio based on both the degree of the disease and the symptoms.

Third, 10-fold cross-validation was used to test and evaluate the model on the dataset. Applying the KDD methodology increased the accuracy of the predictive model: the accuracy improved to 90.89%, with a corresponding reduction in the error rate.

Finally, the main aims of the thesis are: to predict thrombosis by studying the most influential factors using data mining and the knowledge discovery process; to help novice physicians detect the presence of such collagen diseases and reach the right diagnosis; and to accelerate the diagnostic process and improve its accuracy.
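
The entropy and gain ratio calculations that C4.5 relies on can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `entropy` and `gain_ratio`, the attribute values, and the class labels below are hypothetical toy examples, and only categorical attributes are handled (C4.5 also supports continuous attributes and missing values, which are omitted here).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` on one categorical attribute.

    `values[i]` is the attribute value of record i and `labels[i]` is its
    class. Both lists are hypothetical inputs for illustration only.
    """
    total = len(labels)

    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)

    # Expected entropy remaining after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder

    # Split information is the entropy of the attribute values themselves.
    split_info = entropy(values)
    if split_info == 0:
        # The attribute takes a single value, so it cannot split the data.
        return 0.0
    return info_gain / split_info


# Toy example (not data from the thesis): a symptom level vs. a diagnosis.
symptom = ["high", "high", "low", "low"]
diagnosis = ["positive", "positive", "negative", "negative"]
ratio = gain_ratio(symptom, diagnosis)
```

In a tree builder following this scheme, the attribute with the highest gain ratio would typically be chosen as the split at each node, and the procedure would then recurse on each resulting subset.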