Learning from various labeling strategies for suicide-related messages on social media: An experimental study

Liu, T; Cheng, Q; Homan, CH; Silenzio, VMB

Appears in Collections:
- Hong Kong Jockey Club Centre for Suicide Research and Prevention: Conference papers

Conference Paper: Learning from various labeling strategies for suicide-related messages on social media: An experimental study

Title	Learning from various labeling strategies for suicide-related messages on social media: An experimental study
Authors	Liu, T; Cheng, Q; Homan, CH; Silenzio, VMB
Issue Date	2017
Publisher	ACM.
Citation	ACM International Conference on Web Search and Data Mining: Workshop on Mining Online Health Reports, 2017
Abstract	Suicide is an important but often misunderstood problem, one that researchers are now seeking to better understand through social media. Due in large part to the fuzzy nature of what constitutes suicidal risks, most supervised approaches for learning to automatically detect suicide-related activity in social media require a great deal of human labor to train. However, humans themselves have diverse or conflicting views on what constitutes suicidal thoughts. So how to obtain reliable gold standard labels is fundamentally challenging and, we hypothesize, depends largely on what is asked of the annotators and what slice of the data they label. We conducted multiple rounds of data labelling and collected annotations from crowdsourcing workers and domain experts. We aggregated the resulting labels in various ways to train a series of supervised models. Our preliminary evaluations show that using unanimously agreed labels from multiple annotators is helpful to achieve robust machine models.
Persistent Identifier	http://hdl.handle.net/10722/248259

DC Field	Value	Language
dc.contributor.author	Liu, T	-
dc.contributor.author	Cheng, Q	-
dc.contributor.author	Homan, CH	-
dc.contributor.author	Silenzio, VMB	-
dc.date.accessioned	2017-10-18T08:40:23Z	-
dc.date.available	2017-10-18T08:40:23Z	-
dc.date.issued	2017	-
dc.identifier.citation	ACM International Conference on Web Search and Data Mining: Workshop on Mining Online Health Reports, 2017	-
dc.identifier.uri	http://hdl.handle.net/10722/248259	-
dc.description.abstract	Suicide is an important but often misunderstood problem, one that researchers are now seeking to better understand through social media. Due in large part to the fuzzy nature of what constitutes suicidal risks, most supervised approaches for learning to automatically detect suicide-related activity in social media require a great deal of human labor to train. However, humans themselves have diverse or conflicting views on what constitutes suicidal thoughts. So how to obtain reliable gold standard labels is fundamentally challenging and, we hypothesize, depends largely on what is asked of the annotators and what slice of the data they label. We conducted multiple rounds of data labelling and collected annotations from crowdsourcing workers and domain experts. We aggregated the resulting labels in various ways to train a series of supervised models. Our preliminary evaluations show that using unanimously agreed labels from multiple annotators is helpful to achieve robust machine models.	-
dc.language	eng	-
dc.publisher	ACM.	-
dc.relation.ispartof	ACM International Conference on Web Search and Data Mining: Workshop on Mining Online Health Reports	-
dc.title	Learning from various labeling strategies for suicide-related messages on social media: An experimental study	-
dc.type	Conference_Paper	-
dc.identifier.email	Cheng, Q: chengqj@connect.hku.hk	-
dc.identifier.authority	Cheng, Q=rp02018	-
dc.description.nature	postprint	-
dc.identifier.hkuros	280201	-
