Abstract
Automated creation of a conceptual data model based on user requirements expressed in the textual form of a natural language is a challenging research area. The complexity of natural language requires deep insight into the semantics buried in words, expressions, and string patterns. For the purpose of natural language processing, we created a corpus of business descriptions and an adherent lexicon containing all the words in the corpus. Thus, it was possible to define rules for the automatic translation of business descriptions into the entity–relationship (ER) data model. However, since the translation rules could not always lead to accurate translations, we created an additional classification process layer—a classifier which assigns to each input sentence some of the defined ER method classes. The classifier represents a formalized knowledge of the four data modelling experts. This rule-based classification process is based on the extraction of ER information from a given sentence. After the detailed description, the classification process itself was evaluated and tested using the standard multiclass performance measures: recall, precision and accuracy. The accuracy in the learning phase was 96.77% and in the testing phase 95.79%.
Subject
Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering
Reference48 articles.
1. Knowledge-Based Systems for Data Modelling
2. Knowledge-Based Systems for Data Modelling: Review and Challenges;Šuman,2017
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献