Conference Paper: Learning from various labeling strategies for suicide-related messages on social media: An experimental study

Title: Learning from various labeling strategies for suicide-related messages on social media: An experimental study
Authors: Liu, T; Cheng, Q; Homan, CH; Silenzio, VMB
Issue Date: 2017
Publisher: ACM.
Citation: ACM International Conference on Web Search and Data Mining: Workshop on Mining Online Health Reports, 2017
Abstract: Suicide is an important but often misunderstood problem, one that researchers are now seeking to better understand through social media. Due in large part to the fuzzy nature of what constitutes suicidal risks, most supervised approaches for learning to automatically detect suicide-related activity in social media require a great deal of human labor to train. However, humans themselves have diverse or conflicting views on what constitutes suicidal thoughts. So how to obtain reliable gold standard labels is fundamentally challenging and, we hypothesize, depends largely on what is asked of the annotators and what slice of the data they label. We conducted multiple rounds of data labelling and collected annotations from crowdsourcing workers and domain experts. We aggregated the resulting labels in various ways to train a series of supervised models. Our preliminary evaluations show that using unanimously agreed labels from multiple annotators is helpful to achieve robust machine models.
Persistent Identifier: http://hdl.handle.net/10722/248259

 

DC Field: Value
dc.contributor.author: Liu, T
dc.contributor.author: Cheng, Q
dc.contributor.author: Homan, CH
dc.contributor.author: Silenzio, VMB
dc.date.accessioned: 2017-10-18T08:40:23Z
dc.date.available: 2017-10-18T08:40:23Z
dc.date.issued: 2017
dc.identifier.citation: ACM International Conference on Web Search and Data Mining: Workshop on Mining Online Health Reports, 2017
dc.identifier.uri: http://hdl.handle.net/10722/248259
dc.language: eng
dc.publisher: ACM.
dc.relation.ispartof: ACM International Conference on Web Search and Data Mining: Workshop on Mining Online Health Reports
dc.title: Learning from various labeling strategies for suicide-related messages on social media: An experimental study
dc.type: Conference_Paper
dc.identifier.email: Cheng, Q: chengqj@connect.hku.hk
dc.identifier.authority: Cheng, Q=rp02018
dc.description.nature: postprint
dc.identifier.hkuros: 280201
