Conference Paper: Generalizing Word Embeddings using Bag of Subwords
Field | Value |
---|---|
Title | Generalizing Word Embeddings using Bag of Subwords |
Authors | Zhao, Jinman; Mudgal, Sidharth; Liang, Yingyu |
Issue Date | 2018 |
Citation | Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018, 2018, p. 601-606 |
Abstract | We approach the problem of generalizing pre-trained word embeddings beyond fixed-size vocabularies without using additional contextual information. We propose a subword-level word vector generation model that views words as bags of character n-grams. The model is simple, fast to train and provides good vectors for rare or unseen words. Experiments show that our model achieves state-of-the-art performances in English word similarity task and in joint prediction of part-of-speech tag and morphosyntactic attributes in 23 languages, suggesting our model's ability in capturing the relationship between words' textual representations and their embeddings. |
Persistent Identifier | http://hdl.handle.net/10722/341493 |
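
The abstract describes a model that views words as bags of character n-grams. The following is a minimal sketch of that idea, not the authors' implementation. The boundary markers `<` and `>`, the n-gram length range of 3 to 6, the plain averaging of n-gram vectors, and the names `char_ngrams` and `compose` are all illustrative assumptions that this record does not confirm.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the bag of character n-grams of a word.

    Assumption: the word is wrapped in '<' and '>' so that prefixes and
    suffixes yield n-grams distinct from word-internal ones.
    """
    marked = f"<{word}>"
    return [
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    ]


def compose(word, subword_vecs, dim):
    """Build a word vector by averaging the vectors of its known n-grams.

    Assumption: unknown n-grams are skipped, and a word with no known
    n-grams maps to the zero vector.
    """
    known = [g for g in char_ngrams(word) if g in subword_vecs]
    if not known:
        return [0.0] * dim
    total = [0.0] * dim
    for gram in known:
        for k, value in enumerate(subword_vecs[gram]):
            total[k] += value
    return [value / len(known) for value in total]


# Hypothetical toy table of n-gram vectors (in practice these would be learned).
toy_vecs = {"<ca": [1.0, 0.0], "at>": [0.0, 1.0]}
vector = compose("cat", toy_vecs, dim=2)
```

Because the vector depends only on the spelling of the word, a rare or unseen word still receives a vector as long as it shares n-grams with words seen during training. This matches the abstract's claim that no additional contextual information is needed.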
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhao, Jinman | - |
dc.contributor.author | Mudgal, Sidharth | - |
dc.contributor.author | Liang, Yingyu | - |
dc.date.accessioned | 2024-03-13T08:43:14Z | - |
dc.date.available | 2024-03-13T08:43:14Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018, 2018, p. 601-606 | - |
dc.identifier.uri | http://hdl.handle.net/10722/341493 | - |
dc.description.abstract | We approach the problem of generalizing pre-trained word embeddings beyond fixed-size vocabularies without using additional contextual information. We propose a subword-level word vector generation model that views words as bags of character n-grams. The model is simple, fast to train and provides good vectors for rare or unseen words. Experiments show that our model achieves state-of-the-art performances in English word similarity task and in joint prediction of part-of-speech tag and morphosyntactic attributes in 23 languages, suggesting our model's ability in capturing the relationship between words' textual representations and their embeddings. | - |
dc.language | eng | - |
dc.relation.ispartof | Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 | - |
dc.title | Generalizing Word Embeddings using Bag of Subwords | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.scopus | eid_2-s2.0-85077936079 | - |
dc.identifier.spage | 601 | - |
dc.identifier.epage | 606 | - |