
Conference Paper: How much is said in a microblog? A multilingual inquiry based on Weibo and Twitter

Title: How much is said in a microblog? A multilingual inquiry based on Weibo and Twitter
Authors: Liao, HT; Fu, KW; Hale, SA
Keywords: Microblogs; Language; Design; Social Networking
Issue Date: 2015
Publisher: Association for Computing Machinery (ACM). The Conference abstracts' website is located at http://websci15.org/accepted-submissions
Citation: The 2015 ACM Web Science Conference (WebSci'15), Oxford, UK, 28 June-1 July 2015.
Abstract: This paper presents a multilingual study on, per single post of microblog text, (a) how much can be said, (b) how much is written in terms of characters and bytes, and (c) how much is said in terms of information content in posts by different organizations in different languages. Focusing on three different languages (English, Chinese, and Japanese), this research analyses Weibo and Twitter accounts of major embassies and news agencies. We first establish our criterion for quantifying 'how much can be said' in a digital text based on the openly available Universal Declaration of Human Rights and the translated subtitles from TED talks. These parallel corpora allow us to determine the number of characters and bits needed to represent the same content in different languages and character encodings. We then derive the amount of information that is actually contained in microblog posts authored by selected accounts on Weibo and Twitter. Our results confirm that languages with larger character sets such as Chinese and Japanese contain more information per character than English, but the actual information content contained within a microblog text varies depending on both the type of organization and the language of the post. We conclude with a discussion on the design implications of microblog text limits for different languages.
Description: Paper Session - Digital Narratives 1: no. 25
Persistent Identifier: http://hdl.handle.net/10722/211058
Award: link_to_OA_fulltext
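
The abstract mentions measuring how many characters and bytes the same content needs in different languages and character encodings. The following is a minimal sketch of that kind of measurement, not the authors' actual procedure: the function name `size_profile` and the sample sentences are hypothetical, and the choice of UTF-8 and UTF-16 as the encodings to compare is an assumption.

```python
def size_profile(text: str) -> dict:
    """Count code points and encoded bytes for one piece of text."""
    return {
        "chars": len(text),  # Unicode code points, not grapheme clusters
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_bytes": len(text.encode("utf-16-le")),  # little-endian, no BOM
    }


# Illustrative, roughly parallel sentences (hypothetical samples,
# not taken from the paper's corpora).
samples = {
    "en": "All human beings are born free and equal.",
    "zh": "人人生而自由平等。",
    "ja": "すべての人間は生まれながらに自由で平等である。",
}

if __name__ == "__main__":
    for lang, text in samples.items():
        print(lang, size_profile(text))
```

Running the script prints one profile per language, which makes it easy to compare character counts against byte counts for text that carries roughly the same meaning. It says nothing about information content, which the paper estimates separately.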

 

DC Field | Value | Language
dc.contributor.author | Liao, HT | -
dc.contributor.author | Fu, KW | -
dc.contributor.author | Hale, SA | -
dc.date.accessioned | 2015-07-06T06:09:21Z | -
dc.date.available | 2015-07-06T06:09:21Z | -
dc.date.issued | 2015 | -
dc.identifier.citation | The 2015 ACM Web Science Conference (WebSci'15), Oxford, UK, 28 June-1 July 2015. | -
dc.identifier.uri | http://hdl.handle.net/10722/211058 | -
dc.description | Paper Session - Digital Narratives 1: no. 25 | -
dc.language | eng | -
dc.publisher | Association for Computing Machinery (ACM). The Conference abstracts' website is located at http://websci15.org/accepted-submissions | -
dc.relation.ispartof | ACM Web Science Conference, WebSci'15 | -
dc.subject | Microblogs | -
dc.subject | Language | -
dc.subject | Design | -
dc.subject | Social Networking | -
dc.title | How much is said in a microblog? A multilingual inquiry based on Weibo and Twitter | -
dc.type | Conference_Paper | -
dc.identifier.email | Fu, KW: kwfu@hkucc.hku.hk | -
dc.identifier.authority | Fu, KW=rp00552 | -
dc.identifier.doi | 10.1145/2786451.2786486 | -
dc.identifier.hkuros | 244299 | -
dc.identifier.spage | 1 | -
dc.identifier.epage | 9 | -
dc.publisher.place | Oxford, United Kingdom | -
dc.description.award | link_to_OA_fulltext | -
