- Publisher Website: 10.3758/s13428-023-02198-y
- Scopus: eid_2-s2.0-85167838404
Article: The Children’s Picture Books Lexicon (CPB-Lex): A large-scale lexical database from children’s picture books
Title | The Children’s Picture Books Lexicon (CPB-Lex): A large-scale lexical database from children’s picture books |
---|---|
Authors | Green, Clarence; Keogh, Kathleen; Sun, He; O’Brien, Beth |
Keywords | Age of acquisition; Child input norms; Early print exposure; Lexical database; Picture books |
Issue Date | 1-Aug-2023 |
Publisher | Springer |
Citation | Behavior Research Methods, 2023, v. 56, n. 5, p. 4504-4521 |
Abstract | This article presents cpb-lex, a large-scale database of lexical statistics derived from children’s picture books (age range 0–8 years). Such a database is essential for research in psychology, education and computational modelling, where rich details on the vocabulary of early print exposure are required. Cpb-lex was built through an innovative method of computationally extracting lexical information from automatic speech-to-text captions and subtitle tracks generated from social media channels dedicated to reading picture books aloud. It consists of approximately 25,585 types (wordforms) and their frequency norms (raw and Zipf-transformed), a lexicon of bigrams (two-word sequences and their transitional probabilities) and a document-term matrix (which shows the importance of each word in the corpus in each book). Several immediate contributions of cpb-lex to behavioural science research are reported, including that the new cpb-lex frequency norms strongly predict age of acquisition and outperform comparable child-input lexical databases. The database allows researchers and practitioners to extract lexical statistics for high-frequency words which can be used to develop word lists. The paper concludes with an investigation of how cpb-lex can be used to extend recent modelling research on the lexical diversity children receive from picture books in addition to child-directed speech. Our model shows that the vocabulary input from a relatively small number of picture books can dramatically enrich vocabulary exposure from child-directed speech and potentially assist children with vocabulary input deficits. |
Persistent Identifier | http://hdl.handle.net/10722/347222 |
ISSN | 1554-351X (2023 Impact Factor: 4.6; 2023 SCImago Journal Rankings: 2.396) |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Green, Clarence | - |
dc.contributor.author | Keogh, Kathleen | - |
dc.contributor.author | Sun, He | - |
dc.contributor.author | O’Brien, Beth | - |
dc.date.accessioned | 2024-09-20T00:30:45Z | - |
dc.date.available | 2024-09-20T00:30:45Z | - |
dc.date.issued | 2023-08-01 | - |
dc.identifier.citation | Behavior Research Methods, 2023, v. 56, n. 5, p. 4504-4521 | - |
dc.identifier.issn | 1554-351X | - |
dc.identifier.uri | http://hdl.handle.net/10722/347222 | - |
dc.description.abstract | This article presents cpb-lex, a large-scale database of lexical statistics derived from children’s picture books (age range 0–8 years). Such a database is essential for research in psychology, education and computational modelling, where rich details on the vocabulary of early print exposure are required. Cpb-lex was built through an innovative method of computationally extracting lexical information from automatic speech-to-text captions and subtitle tracks generated from social media channels dedicated to reading picture books aloud. It consists of approximately 25,585 types (wordforms) and their frequency norms (raw and Zipf-transformed), a lexicon of bigrams (two-word sequences and their transitional probabilities) and a document-term matrix (which shows the importance of each word in the corpus in each book). Several immediate contributions of cpb-lex to behavioural science research are reported, including that the new cpb-lex frequency norms strongly predict age of acquisition and outperform comparable child-input lexical databases. The database allows researchers and practitioners to extract lexical statistics for high-frequency words which can be used to develop word lists. The paper concludes with an investigation of how cpb-lex can be used to extend recent modelling research on the lexical diversity children receive from picture books in addition to child-directed speech. Our model shows that the vocabulary input from a relatively small number of picture books can dramatically enrich vocabulary exposure from child-directed speech and potentially assist children with vocabulary input deficits. | - |
dc.language | eng | - |
dc.publisher | Springer | - |
dc.relation.ispartof | Behavior Research Methods | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject | Age of acquisition | - |
dc.subject | Child input norms | - |
dc.subject | Early print exposure | - |
dc.subject | Lexical database | - |
dc.subject | Picture books | - |
dc.title | The Children’s Picture Books Lexicon (CPB-Lex): A large-scale lexical database from children’s picture books | - |
dc.type | Article | - |
dc.identifier.doi | 10.3758/s13428-023-02198-y | - |
dc.identifier.scopus | eid_2-s2.0-85167838404 | - |
dc.identifier.volume | 56 | - |
dc.identifier.issue | 5 | - |
dc.identifier.spage | 4504 | - |
dc.identifier.epage | 4521 | - |
dc.identifier.eissn | 1554-3528 | - |
dc.identifier.issnl | 1554-351X | - |