Abstract Recognition of dialogue acts is essential to a dialogue understanding system because dialogue acts are closely tied to the speaker's intention. However, it has been difficult to infer a dialogue act from a surface utterance because dialogue acts depend heavily on the context of the utterance and on the speaker's linguistic background. Building interactive dialogue systems has recently gained considerable attention, but most of the resources and systems built so far have been tailored to English and other Indo-European languages. The need to design systems for other languages, such as Arabic, is increasing. For these reasons, interest in Arabic dialogue act classification is growing, since it plays a key role in the Arabic language understanding needed to build such systems. This work presents "BASEER", a novel utterance-level language understanding component for spontaneous Arabic dialogues and instant messages, designed to work on inquiry-answer dialogues. BASEER was tested on an Egyptian Arabic corpus collected manually from Egyptian call centers. The BASEER component is composed of three parts: a pre-treatment module, a model that segments turns into utterances (USeg), and an utterance labeling model (YOSR). USeg is a context-based machine learning approach that does not rely on punctuation, text diacritization, or lexical cues. Instead, USeg relies on a set of features drawn from the annotated data, including morphological features. The results show that the USeg classifier achieved 0.907 without using cues and 0.933 with cues, which is very promising; to the best of my knowledge, these are the first results reported for the task of segmenting turns into utterances for the Egyptian dialect.