MANUSCRIPT: Automated Assessment of Medical Training Evaluation Text
Medical post-graduate residency training and medical student training increasingly utilize electronic systems to evaluate trainee performance based on defined training competencies with quantitative and qualitative data, the later of which typically consists of text comments. Medical education is concomitantly becoming a growing area of clinical research. While electronic systems have proliferated in number, little work has been done to help manage and analyze qualitative data from these evaluations. We explored the use of text-mining techniques to assist medical education researchers in sentiment analysis and topic analysis of residency evaluations with a sample of 812 evaluation statements. While comments were predominantly positive, sentiment analysis improved the ability to discriminate statements with 93% accuracy. Similar to other domains, Latent Dirichlet Analysis and Information Gain revealed groups of core subjects and appear to be useful for identifying topics from this data.