Biomedical Informatics Insights 2013:6 Suppl. 1 51-62
Original Research
Published on 01 Aug 2013
DOI: 10.4137/BII.S11770
Medical entity recognition is now generally performed by data-driven methods based on supervised machine learning. Expert-based systems, in which linguistic and domain expertise are provided directly to the system, are often combined with data-driven systems. We present here a case study in which an existing expert-based medical entity recognition system, Ogmios, is combined with a data-driven system, Caramba, based on a linear-chain Conditional Random Field (CRF) classifier. Our case study specifically highlights the risk of overfitting incurred by an expert-based system. We observe that this overfitting prevents the combination of the two systems from obtaining improvements in precision, recall, or F-measure, and we analyze the underlying mechanisms through a post-hoc feature-level analysis. Wrapping the expert-based system alone as attributes input to a CRF classifier does boost its F-measure from 0.603 to 0.710, bringing it on par with the data-driven system. How well this method generalizes remains to be investigated.
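To make the idea of "wrapping the expert-based system as attributes input to a CRF classifier" concrete, here is a minimal sketch of the feature-construction step. It is not the authors' implementation: the function names, the feature keys, the BIO-style tag strings, and the example sentence are all assumptions made for illustration. Each token receives a few surface features plus the tag proposed by the expert-based system for that token and its neighbours, so that a sequence classifier can learn when to trust the expert output.

```python
from typing import Dict, List, Union

Feature = Union[str, bool, float]


def token_features(tokens: List[str], expert_tags: List[str], i: int) -> Dict[str, Feature]:
    """Build a feature dict for token i (hypothetical feature set).

    The expert system's tag is included as an ordinary attribute,
    alongside simple surface features of the token itself.
    """
    word = tokens[i]
    feats: Dict[str, Feature] = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "word.is_title": word.istitle(),
        "word.is_digit": word.isdigit(),
        "suffix3": word[-3:].lower(),
        "expert_tag": expert_tags[i],  # output of the expert-based system
    }
    if i > 0:
        feats["-1:word.lower"] = tokens[i - 1].lower()
        feats["-1:expert_tag"] = expert_tags[i - 1]
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["+1:word.lower"] = tokens[i + 1].lower()
        feats["+1:expert_tag"] = expert_tags[i + 1]
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(tokens: List[str], expert_tags: List[str]) -> List[Dict[str, Feature]]:
    """Return one feature dict per token for a whole sentence."""
    if len(tokens) != len(expert_tags):
        raise ValueError("tokens and expert_tags must have the same length")
    return [token_features(tokens, expert_tags, i) for i in range(len(tokens))]


if __name__ == "__main__":
    # Invented example: the tags stand in for an expert system's output.
    tokens = ["Patient", "denies", "chest", "pain", "."]
    expert_tags = ["O", "O", "B-problem", "I-problem", "O"]
    for feats in sentence_features(tokens, expert_tags):
        print(feats)
```

Lists of per-token dictionaries of this kind are the input shape expected by common linear-chain CRF toolkits (for example sklearn-crfsuite), so the output of `sentence_features` could be passed to such a trainer together with gold-standard labels. Which features the Caramba system actually uses is not stated in the abstract.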