Article: Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier?
| Title | Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier? |
|---|---|
| Authors | Chen, Haohan; Bisbee, James; Tucker, Joshua A.; Nagler, Jonathan |
| Issue Date | 2-Jul-2025 |
| Publisher | Cambridge University Press |
| Citation | Political Science Research and Methods, 2025 |
| Abstract | The increasing multimodality (e.g., images, videos, links) of social media data presents opportunities and challenges. But text-as-data methods continue to dominate as modes of classification, as multimodal social media data are costly to collect and label. Researchers who face a budget constraint may need to make informed decisions regarding whether to collect and label only the textual content of social media data or their full multimodal content. In this article, we develop five measures and an experimental framework to assist with these decisions. We propose five performance metrics to measure the costs and benefits of multimodal labeling: average time per post, average time per valid response, valid response rate, intercoder agreement, and classifier's predictive power. To estimate these measures, we introduce an experimental framework to evaluate coders' performance under text-only and multimodal labeling conditions. We illustrate the method with a tweet labeling experiment. |
| Persistent Identifier | http://hdl.handle.net/10722/366620 |
| ISSN | 2049-8470 (2023 Impact Factor: 2.5; 2023 SCImago Journal Rankings: 2.431) |
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Chen, Haohan | - |
| dc.contributor.author | Bisbee, James | - |
| dc.contributor.author | Tucker, Joshua A. | - |
| dc.contributor.author | Nagler, Jonathan | - |
| dc.date.accessioned | 2025-11-25T04:20:35Z | - |
| dc.date.available | 2025-11-25T04:20:35Z | - |
| dc.date.issued | 2025-07-02 | - |
| dc.identifier.citation | Political Science Research and Methods, 2025 | - |
| dc.identifier.issn | 2049-8470 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/366620 | - |
| dc.description.abstract | The increasing multimodality (e.g., images, videos, links) of social media data presents opportunities and challenges. But text-as-data methods continue to dominate as modes of classification, as multimodal social media data are costly to collect and label. Researchers who face a budget constraint may need to make informed decisions regarding whether to collect and label only the textual content of social media data or their full multimodal content. In this article, we develop five measures and an experimental framework to assist with these decisions. We propose five performance metrics to measure the costs and benefits of multimodal labeling: average time per post, average time per valid response, valid response rate, intercoder agreement, and classifier's predictive power. To estimate these measures, we introduce an experimental framework to evaluate coders' performance under text-only and multimodal labeling conditions. We illustrate the method with a tweet labeling experiment. | - |
| dc.language | eng | - |
| dc.publisher | Cambridge University Press | - |
| dc.relation.ispartof | Political Science Research and Methods | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.title | Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier? | - |
| dc.type | Article | - |
| dc.description.nature | published_or_final_version | - |
| dc.identifier.doi | 10.1017/psrm.2025.10010 | - |
| dc.identifier.eissn | 2049-8489 | - |
| dc.identifier.issnl | 2049-8470 | - |
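
The abstract lists five performance metrics (average time per post, average time per valid response, valid response rate, intercoder agreement, and classifier's predictive power) without giving formulas. The sketch below is a minimal illustration of how such metrics could be computed, not the authors' replication code: the `responses` records, their field layout, and the simple percent-agreement and accuracy helpers are hypothetical assumptions chosen only to make the definitions concrete (the paper may use other agreement statistics, such as Krippendorff's alpha, and other classifier metrics).

```python
"""Illustrative computation of the five labeling-performance metrics,
assuming a hypothetical list of coder responses."""
from collections import defaultdict

# Hypothetical records: one row per (coder, post) labeling attempt.
# (coder_id, post_id, seconds_spent, label or None if invalid/skipped)
responses = [
    ("c1", "p1", 12.0, "political"),
    ("c2", "p1", 15.0, "political"),
    ("c1", "p2", 9.0, None),          # invalid / skipped response
    ("c2", "p2", 20.0, "not_political"),
]

# 1. Average time per post: total labeling time over all attempts.
avg_time_per_post = sum(r[2] for r in responses) / len(responses)

# 2. Valid response rate: share of attempts yielding a usable label.
valid = [r for r in responses if r[3] is not None]
valid_response_rate = len(valid) / len(responses)

# 3. Average time per valid response: total time divided by the
#    number of valid labels actually obtained.
avg_time_per_valid = sum(r[2] for r in responses) / len(valid)

# 4. Intercoder agreement: here, simple pairwise percent agreement on
#    posts labeled by more than one coder (a stand-in for kappa/alpha).
labels_by_post = defaultdict(list)
for _, post_id, _, label in valid:
    labels_by_post[post_id].append(label)
pairs = agree = 0
for labels in labels_by_post.values():
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            pairs += 1
            agree += labels[i] == labels[j]
intercoder_agreement = agree / pairs if pairs else float("nan")

# 5. Classifier's predictive power: e.g., accuracy of a classifier
#    trained on the human labels and evaluated on held-out posts.
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(avg_time_per_post, avg_time_per_valid,
      valid_response_rate, intercoder_agreement)
```

Comparing these quantities between the text-only and multimodal labeling conditions is what the proposed experimental framework is designed to support; the classifier metric would additionally require training separate models on the labels produced under each condition.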

