Affiliation:
1. Department of Electrical and Computer Engineering, University of Patras, 26500 Rion, Patras, Greece
Abstract
This paper introduces a statistical framework for extracting medical domain knowledge from heterogeneous corpora. The acquired information is incorporated into a natural language understanding agent and applied to DIKTIS, an existing web-based educational dialogue system for the chemotherapy of nosocomial and community-acquired pneumonia, with the aim of providing more intelligent natural language interaction. Unlike the majority of existing dialogue understanding engines, the presented system automatically encodes the semantic representation of a user's query using Bayesian networks. The structure of the networks is determined from annotated dialogue corpora using the Bayesian scoring method, thus eliminating the tedious and costly process of manually coding domain knowledge. The conditional probability distributions are estimated during a training phase using data obtained from the same set of dialogue acts. To cope with words absent from our restricted dialogue corpus, a separate offline module was incorporated, which estimates their semantic role from both medical and general raw text corpora by correlating them with known, lexically and semantically similar words or with predefined topics. Lexical similarity is identified on the basis of both contextual similarity and co-occurrence in conjunctive expressions. The platform was evaluated against the existing natural language understanding module of DIKTIS, whose architecture is based on manually embedded domain knowledge.
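The two lexical similarity cues named in the abstract (contextual similarity and co-occurrence in conjunctive expressions) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the window size, the conjunction list, the linear weighting `conj_weight`, and all function names are assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions (assumed window)."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            ctx = vectors.setdefault(word, Counter())
            for j in range(lo, hi):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def conjunction_pairs(sentences, conjunctions=("and", "or")):
    """Count unordered word pairs joined directly by a conjunction (assumed list)."""
    pairs = Counter()
    for tokens in sentences:
        for i in range(1, len(tokens) - 1):
            if tokens[i] in conjunctions:
                pairs[frozenset((tokens[i - 1], tokens[i + 1]))] += 1
    return pairs


def most_similar(unknown, known_words, vectors, pairs, conj_weight=0.5):
    """Pick the known word closest to `unknown`; the linear mix is a hypothetical choice."""
    def score(known):
        ctx = cosine(vectors.get(unknown, Counter()), vectors.get(known, Counter()))
        conj = 1.0 if pairs[frozenset((unknown, known))] > 0 else 0.0
        return (1 - conj_weight) * ctx + conj_weight * conj
    return max(known_words, key=score)
```

Under these assumptions, an out-of-vocabulary word would inherit the semantic role of the known word returned by `most_similar`, with a direct "X and Y" co-occurrence acting as strong evidence that the two words belong to the same semantic class.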
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence