Posted Fri, Apr 26, 2013
Published today in Biomedical Informatics Insights is a new original research article by Sylvia Halász, Philip Brown, Cem Oktay, Arif Alper Çevik, Isa Kiliçaslan, Colin Goodall, Dennis G Cochrane, Thomas R Fowler, Guy Jacobson, Simon Tse and John R Allegra. Read more about this paper below:
Title
Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study
Abstract
Introduction: Syndromic surveillance is designed for early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which are generally recorded in the local language. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories. The n-gram classifier is created by using text fragments to measure associations between CCs and a syndromic grouping of ICD codes.
Objectives: The objective was to create a Turkish n-gram CC classifier for the respiratory syndrome and then compare daily volumes between the n-gram CC classifier and a respiratory ICD-10 code grouping on a test set of data.
Methods: The design was a feasibility study based on retrospective cohort data. The setting was a university hospital emergency department (ED) in Turkey. Included were all ED visits in the 2002 database of this hospital. Two of the authors created a respiratory grouping of International Classification of Diseases, 10th Revision (ICD-10-CM) codes by consensus, chosen to be similar to a standard respiratory (RESP) grouping of ICD codes created by the Electronic Surveillance System for Early Notification of Community-based Epidemics (ESSENCE), a project of the Centers for Disease Control and Prevention. An n-gram method adapted from AT&T Labs' technologies was applied to the first 10 months of data as a training set to create a Turkish CC RESP classifier. The classifier was then tested on the subsequent 2 months of visits to generate a time series graph and determine the correlation between the daily volumes measured by the CC classifier and those measured by the RESP ICD-10 grouping.
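To make the idea of an n-gram CC classifier concrete, here is a minimal Python sketch. It is not the AT&T Labs method used in the paper, whose details the abstract does not give. It assumes character 4-grams, a simple "fraction of training visits containing this n-gram that fall in the RESP grouping" weight, and a 0.5 threshold. The function names, parameters, and the toy Turkish complaints are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def char_ngrams(text, n=4):
    """Return the set of character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower().strip()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def train(visits, n=4, min_count=2):
    """visits: iterable of (chief_complaint, is_resp) pairs, where is_resp says
    whether the visit's ICD code falls in the respiratory grouping.
    Returns {ngram: fraction of visits containing it that are RESP},
    keeping only n-grams seen in at least min_count visits."""
    total, resp = Counter(), Counter()
    for cc, is_resp in visits:
        for gram in char_ngrams(cc, n):
            total[gram] += 1
            if is_resp:
                resp[gram] += 1
    return {g: resp[g] / c for g, c in total.items() if c >= min_count}


def score(cc, weights, n=4):
    """Score a complaint by the strongest RESP association among its known n-grams."""
    return max((weights.get(g, 0.0) for g in char_ngrams(cc, n)), default=0.0)


def classify(cc, weights, n=4, threshold=0.5):
    """True if the complaint is assigned to the respiratory syndrome."""
    return score(cc, weights, n) >= threshold


# Toy training data (invented for illustration, not from the study):
# cough, cough + fever, shortness of breath vs. abdominal pain, headache.
toy_visits = [
    ("öksürük", True),
    ("öksürük ateş", True),
    ("nefes darlığı", True),
    ("karın ağrısı", False),
    ("baş ağrısı", False),
    ("karın ağrısı kusma", False),
]
toy_weights = train(toy_visits)
print(classify("kuru öksürük", toy_weights))   # "dry cough"
print(classify("karın ağrısı", toy_weights))   # "abdominal pain"
```

Because the sketch works on raw character fragments, nothing in it depends on English vocabulary or on translating the complaints, which is the property the study relies on.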
Results: The Turkish ED database contained 30,157 visits. The correlation (R²) of n-gram versus ICD-10 for the test set was 0.78.
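As a rough illustration of how such a figure can be computed, the sketch below counts visits per day for each classification and takes the squared Pearson correlation of the two daily series. Whether the paper's R² is exactly this quantity is an assumption, and the helper names `daily_counts` and `r_squared` are hypothetical.

```python
from collections import Counter


def daily_counts(visit_days, days):
    """Count visits per day, returning one count per entry in `days` (0 if none)."""
    counts = Counter(visit_days)
    return [counts.get(day, 0) for day in days]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)
```

In use, `x` would be the daily RESP volumes from the CC classifier and `y` the daily volumes from the ICD-10 grouping over the same test days. The function assumes neither series is constant, since a constant series would make the denominator zero.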
Conclusion: The n-gram method automatically created a CC RESP classifier of the Turkish CCs that performed similarly to the ICD-10 RESP grouping. The n-gram technique has the advantage of systematic, consistent, and rapid deployment as well as language independence.