Conference Paper: Learning from various labeling strategies for suicide-related messages on social media: An experimental study
| Field | Value |
| --- | --- |
| Title | Learning from various labeling strategies for suicide-related messages on social media: An experimental study |
| Authors | Liu, T; Cheng, Q; Homan, CH; Silenzio, VMB |
| Issue Date | 2017 |
| Publisher | ACM. |
| Citation | ACM International Conference on Web Search and Data Mining: Workshop on Mining Online Health Reports, 2017 |
| Abstract | Suicide is an important but often misunderstood problem, one that researchers are now seeking to better understand through social media. Due in large part to the fuzzy nature of what constitutes suicide risk, most supervised approaches for learning to automatically detect suicide-related activity in social media require a great deal of human labor to train. However, humans themselves hold diverse or conflicting views on what constitutes suicidal thoughts, so obtaining reliable gold-standard labels is fundamentally challenging and, we hypothesize, depends largely on what is asked of the annotators and what slice of the data they label. We conducted multiple rounds of data labeling and collected annotations from crowdsourcing workers and domain experts. We aggregated the resulting labels in various ways to train a series of supervised models. Our preliminary evaluations show that using labels unanimously agreed upon by multiple annotators helps to train robust models. |
| Persistent Identifier | http://hdl.handle.net/10722/248259 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Liu, T | - |
dc.contributor.author | Cheng, Q | - |
dc.contributor.author | Homan, CH | - |
dc.contributor.author | Silenzio, VMB | - |
dc.date.accessioned | 2017-10-18T08:40:23Z | - |
dc.date.available | 2017-10-18T08:40:23Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | ACM International Conference on Web Search and Data Mining: Workshop on Mining Online Health Reports, 2017 | - |
dc.identifier.uri | http://hdl.handle.net/10722/248259 | - |
dc.description.abstract | Suicide is an important but often misunderstood problem, one that researchers are now seeking to better understand through social media. Due in large part to the fuzzy nature of what constitutes suicide risk, most supervised approaches for learning to automatically detect suicide-related activity in social media require a great deal of human labor to train. However, humans themselves hold diverse or conflicting views on what constitutes suicidal thoughts, so obtaining reliable gold-standard labels is fundamentally challenging and, we hypothesize, depends largely on what is asked of the annotators and what slice of the data they label. We conducted multiple rounds of data labeling and collected annotations from crowdsourcing workers and domain experts. We aggregated the resulting labels in various ways to train a series of supervised models. Our preliminary evaluations show that using labels unanimously agreed upon by multiple annotators helps to train robust models. | - |
dc.language | eng | - |
dc.publisher | ACM. | - |
dc.relation.ispartof | ACM International Conference on Web Search and Data Mining: Workshop on Mining Online Health Reports | - |
dc.title | Learning from various labeling strategies for suicide-related messages on social media: An experimental study | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Cheng, Q: chengqj@connect.hku.hk | - |
dc.identifier.authority | Cheng, Q=rp02018 | - |
dc.description.nature | postprint | - |
dc.identifier.hkuros | 280201 | - |