Posted Tue, Jun 25, 2013
A new original research article by Siddhartha Jonnalagadda, Trevor Cohen, Stephen Wu, Hongfang Liu and Graciela Gonzalez was published today in Biomedical Informatics Insights (BII) as part of the supplement on Computational Semantics in Clinical Text. Read more about this paper below:
Title
Using Empirically Constructed Lexical Resources for Named Entity Recognition
Abstract
Because of privacy concerns and the expense of creating an annotated corpus, the existing small annotated corpora might not have enough examples for learning to statistically extract all named entities precisely. In this work, we evaluate the value of automatically generated features based on distributional semantics in machine-learning named entity recognition (NER). The features we generated and experimented with include n-nearest words, support vector machine (SVM)-regions, and term clustering, all of which are distributional semantic features. Adding the n-nearest words feature to a baseline system produced a greater increase in F-score than adding a manually constructed lexicon. Although the need for relatively small annotated corpora for retraining is not obviated, lexicons empirically derived from unannotated text can not only supplement manually created lexicons but also replace them. This phenomenon is observed in extracting concepts from both biomedical literature and clinical notes.
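To make the "n-nearest words" idea concrete, here is a minimal sketch of how such a feature could be derived from unannotated text. It is an illustration under stated assumptions, not the authors' implementation: the toy corpus, the window size, the use of raw co-occurrence counts with cosine similarity, and the function names (cooccurrence_vectors, n_nearest_words, token_features) are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical unannotated corpus; real systems would use far more text.
UNLABELED = [
    "patient was given aspirin for pain".split(),
    "patient was given ibuprofen for pain".split(),
    "patient denies fever and cough".split(),
]


def cooccurrence_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def n_nearest_words(word, vectors, n=3):
    """Return the n words whose context vectors are most similar to `word`."""
    if word not in vectors:
        return []
    target = vectors[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:n]]


def token_features(tokens, i, vectors, n=3):
    """Baseline word feature plus one feature per distributionally near word."""
    word = tokens[i]
    features = {"word=" + word: 1}
    for neighbor in n_nearest_words(word, vectors, n):
        features["near=" + neighbor] = 1
    return features


if __name__ == "__main__":
    vectors = cooccurrence_vectors(UNLABELED)
    print(n_nearest_words("aspirin", vectors, n=1))
    print(token_features("given aspirin today".split(), 1, vectors, n=1))
```

In this sketch, a word that never appears in the small annotated corpus can still share "near=" features with words that do, which is one plausible way an empirically derived resource could stand in for a hand-built lexicon in an NER feature set.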