Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1016/j.eswa.2023.121031
- Scopus: eid_2-s2.0-85166967683
- WOS: WOS:001059475500001
Article: A cross-lingual transfer learning method for online COVID-19-related hate speech detection
Title | A cross-lingual transfer learning method for online COVID-19-related hate speech detection |
---|---|
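Authors | Liu, Lin; Xu, Duo; Zhao, Pengfei; Zeng, Daniel Dajun; Hu, Paul Jen Hwa; Zhang, Qingpeng; Luo, Yin; Cao, Zhidong |
Keywords | COVID-19; Cross-lingual; Deep learning; Hate speech detection; Natural language processing |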
Issue Date | 2023 |
Citation | Expert Systems with Applications, 2023, v. 234, article no. 121031 |
Abstract | During the COVID-19 pandemic, online social media platforms such as Twitter have facilitated the exchange of information among people. However, the prevalence of an “infodemic”, including online hate speech, has exacerbated social rifts, discrimination, prejudice and even hate crimes. Timely and effective detection of hate speech will help create a healthy public opinion environment. Most current COVID-19-related hate speech research focuses on a single language, such as English. In this paper, we introduce a cross-lingual transfer learning method that aims to contribute to hate speech detection in low-resource languages. We propose a deep-learning-based model that classifies hate speech using a pre-trained language model for multilingual text embedding. Data augmentation and cross-lingual contrastive learning are then used to further improve the performance of cross-lingual knowledge transfer. To evaluate our method, we collected three publicly available annotated COVID-19-related hate speech datasets from Twitter, i.e., two in English and one in German. Furthermore, a Chinese dataset based on Weibo was constructed to expand the multilingual data. The experimental results across three languages illustrate the effectiveness of our method for cross-lingual hate speech detection. Test F1-scores of our method with English, Chinese, and German as transfer target languages reach up to 0.728, 0.799 and 0.612, respectively, which are on average better than those of the other baselines. |
Persistent Identifier | http://hdl.handle.net/10722/330484 |
ISSN | 0957-4174 (2023 Impact Factor: 7.5; 2023 SCImago Journal Rankings: 1.875) |
ISI Accession Number ID | WOS:001059475500001 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Liu, Lin | - |
dc.contributor.author | Xu, Duo | - |
dc.contributor.author | Zhao, Pengfei | - |
dc.contributor.author | Zeng, Daniel Dajun | - |
dc.contributor.author | Hu, Paul Jen Hwa | - |
dc.contributor.author | Zhang, Qingpeng | - |
dc.contributor.author | Luo, Yin | - |
dc.contributor.author | Cao, Zhidong | - |
dc.date.accessioned | 2023-09-05T12:11:06Z | - |
dc.date.available | 2023-09-05T12:11:06Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | Expert Systems with Applications, 2023, v. 234, article no. 121031 | - |
dc.identifier.issn | 0957-4174 | - |
dc.identifier.uri | http://hdl.handle.net/10722/330484 | - |
dc.language | eng | - |
dc.relation.ispartof | Expert Systems with Applications | - |
dc.subject | COVID-19 | - |
dc.subject | Cross-lingual | - |
dc.subject | Deep learning | - |
dc.subject | Hate speech detection | - |
dc.subject | Natural language processing | - |
dc.title | A cross-lingual transfer learning method for online COVID-19-related hate speech detection | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1016/j.eswa.2023.121031 | - |
dc.identifier.scopus | eid_2-s2.0-85166967683 | - |
dc.identifier.volume | 234 | - |
dc.identifier.spage | article no. 121031 | - |
dc.identifier.epage | article no. 121031 | - |
dc.identifier.isi | WOS:001059475500001 | - |