2013 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Abstract

Text mining of clinical findings has been employed to extract clinical information contained in “electronic medical records” without the need for labor-intensive work by medical experts. However, the automated building of a disease ontology necessitates knowledge acquisition of clinical findings documented in “medical literature,” a task that requires an independent strategy. This study performs a preliminary analysis of clinical finding expressions in medical literature to enable the automated acquisition of disease knowledge. To this end, we selected descriptions of 20 diseases in a free-text format and annotated the texts to extract expressions of clinical findings. This resulted in 1368 expressions with varying lengths and syntactic features, and 161 annotator comments. The comments pointed to certain types of expressions, which were further classified into 10 categories. In-depth analyses of their syntactic and semantic characteristics were also performed, resulting in the following observations. First, expressions of clinical findings have certain syntactic and semantic patterns, which can be exploited for appropriate knowledge acquisition. Second, clinical knowledge may guide the knowledge acquisition process in a top-down manner. Third, natural language processing of medical literature requires specific considerations compared with the processing of health records, namely, i) distinction of subjects, ii) handling of generalized knowledge, and iii) processing of expressions for examination results. This preliminary survey of the expressions in medical literature provides helpful insights for future corpus design.