Title
Using Artificial Intelligent techniques for Source Code Generation
Author
Alokla, Anas Hamid.
Preparation Committee
Researcher / Anas Hamid Alokla
Supervisor / Abdel-Badeeh Mohamed Salem
Supervisor / Mostafa Mahmoud Aref
Supervisor / Walaa Khaled Gad
Publication Date
2022.
Number of Pages
106 p.
Language
English
Degree
Master's
Specialization
Computer Science
Approval Date
1/1/2022
Approval Location
Ain Shams University - Faculty of Computers and Information - Computer Science
Table of Contents
Only 14 pages (out of 106) are available for public view.

Abstract

The rapid growth of software development and maintenance has recently driven the development of mechanisms for generating source code from natural language or from database schemas, because such mechanisms reduce the time and material cost of producing programs. Natural language analysis is one of the most important ways to extract the required functions from natural language requirements. Knowledge bases, moreover, are a key means of storing solutions to complex and common problems in fields such as medicine and engineering. Deep learning is also applied to a variety of tasks, including machine translation, which allows translating from one language to another.
Therefore, this thesis discusses methods for generating source code through natural language analysis of user stories and database schemas. It also discusses methods for converting source code to pseudo-code, which help developers understand source code and maintain programs, since writing pseudo-code manually is cumbersome work that most developers avoid.
Three layers are built to generate the source code: (1) The analyzer layer analyzes a model from the user stories, extracts users' roles and functions, and finds the relationships between database tables. (2) The reasoner layer searches the ontology for solutions to the problems identified by the previous layer and builds a solution model. (3) The convertor layer converts the solution model into source code by extracting the relevant templates, configuring them, and linking them together.
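The three-layer pipeline above can be sketched in miniature as follows. This is a hedged illustration only: the class names, the toy user-story parsing, and the single-entry ontology and template dictionaries are all hypothetical stand-ins, not the thesis implementation.

```python
# Hypothetical sketch of the analyzer -> reasoner -> convertor pipeline.
# All names, the parsing rule, and the ontology/template data are toy examples.

class AnalyzerLayer:
    """Extracts a role and a function from a 'As a <role>, I want to <function>' story."""
    def analyze(self, user_story: str) -> dict:
        role = user_story.split("As a ")[1].split(",")[0]
        function = user_story.split("I want to ")[1].rstrip(".")
        return {"role": role, "function": function}

class ReasonerLayer:
    """Looks up a solution model for the analyzed problem in a (toy) ontology."""
    ONTOLOGY = {"register an account": "crud_create_user"}
    def reason(self, model: dict) -> dict:
        return {**model, "solution": self.ONTOLOGY.get(model["function"], "unknown")}

class ConvertorLayer:
    """Converts the solution model into source code by filling a template."""
    TEMPLATES = {"crud_create_user": "def create_user(name):\n    db.insert('users', name)"}
    def convert(self, solution_model: dict) -> str:
        return self.TEMPLATES.get(solution_model["solution"], "# no template found")

story = "As a visitor, I want to register an account."
model = AnalyzerLayer().analyze(story)
solution = ReasonerLayer().reason(model)
code = ConvertorLayer().convert(solution)
print(code.splitlines()[0])  # def create_user(name):
```

Each layer consumes the previous layer's output, so the layers can be developed and tested independently, matching the separation described in the text.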
To convert source code to pseudo-code, machine translation based on deep learning was used, specifically the Transformer model developed by Google, which is based on self-attention layers. Unlike recurrent neural networks, the Transformer processes the entire input sentence at once during training. It also avoids the vanishing-gradient problem that occurs in recurrent neural networks, which otherwise require long short-term memory (LSTM) units to mitigate it. The Transformer model achieved higher results than the other models (code2NL, code2pseudocode, T2SMT, code-GRU, DeepPseudo, Code-NN, and convS2S), reaching a BLEU score of 59.
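The key property mentioned above, that the Transformer processes the whole input sentence in parallel rather than token by token, comes from scaled dot-product self-attention. A minimal pure-Python sketch (toy 2-dimensional embeddings, single head, no learned projections) shows how every token attends to all tokens at once:

```python
# Minimal scaled dot-product self-attention sketch.
# Every query attends to ALL keys in one pass: there is no step-by-step
# recurrence, which is why there is no vanishing gradient through time.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, over a whole sequence."""
    d_k = len(K[0])
    out = []
    for q in Q:  # each output is a weighted mix of ALL value vectors
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out

# Three toy 2-d token embeddings; Q = K = V in self-attention
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = self_attention(X, X, X)
print(len(Y), len(Y[0]))  # 3 2 -- one output vector per input token
```

A real Transformer stacks many such layers with multiple heads and learned query/key/value projections, but the parallel all-to-all attention is the same.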
Observing the results showed that inputs containing tokens or words not present in the training data lead to errors in the output. To solve this problem, a retrieval mechanism was applied. This mechanism retrieves from the training set a certain number of sentences most similar to the input sentence, and these retrieved sentences are then passed to the Transformer model. The differences between the input and the retrieved sentence are substituted into the retrieved translation to reach the target translation. This mechanism improved the results by 2.7 BLEU.
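The retrieval step described above can be sketched as follows. This is an illustrative assumption, not the thesis code: token-overlap (Jaccard) similarity is used here as a stand-in for whatever similarity metric the thesis actually employs, and the training pairs are toy data.

```python
# Hedged sketch of retrieving the training sentence most similar to the input.
# Jaccard similarity over whitespace tokens is an illustrative choice only.

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two tokenized sentences."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def retrieve(query: str, training_set: list, k: int = 1) -> list:
    """Return the k (source, pseudo-code) pairs most similar to the query."""
    return sorted(training_set, key=lambda p: jaccard(query, p[0]), reverse=True)[:k]

train = [
    ("for i in range ( 10 ) :", "repeat 10 times"),
    ("if x > 0 :", "if x is positive"),
]
best = retrieve("for j in range ( 5 ) :", train)[0]
print(best[1])  # repeat 10 times
```

The retrieved translation then serves as a scaffold: the tokens that differ between the input and the retrieved source (here `j`/`i` and `5`/`10`) would be substituted into the retrieved pseudo-code to produce the final translation, which is how out-of-vocabulary inputs can still be handled.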