Article: The Children’s Picture Books Lexicon (CPB-Lex): A large-scale lexical database from children’s picture books

Title: The Children’s Picture Books Lexicon (CPB-Lex): A large-scale lexical database from children’s picture books
Authors: Green, Clarence; Keogh, Kathleen; Sun, He; O’Brien, Beth
Keywords: Age of acquisition; Child input norms; Early print exposure; Lexical database; Picture books
Issue Date: 1-Aug-2023
Publisher: Springer
Citation: Behavior Research Methods, 2023, v. 56, n. 5, p. 4504-4521
Abstract

This article presents cpb-lex, a large-scale database of lexical statistics derived from children’s picture books (age range 0–8 years). Such a database is essential for research in psychology, education and computational modelling, where rich details on the vocabulary of early print exposure are required. Cpb-lex was built through an innovative method of computationally extracting lexical information from automatic speech-to-text captions and subtitle tracks generated from social media channels dedicated to reading picture books aloud. It consists of approximately 25,585 types (wordforms) and their frequency norms (raw and Zipf-transformed), a lexicon of bigrams (two-word sequences and their transitional probabilities) and a document-term matrix (which shows the importance of each word in the corpus in each book). Several immediate contributions of cpb-lex to behavioural science research are reported, including that the new cpb-lex frequency norms strongly predict age of acquisition and outperform comparable child-input lexical databases. The database allows researchers and practitioners to extract lexical statistics for high-frequency words which can be used to develop word lists. The paper concludes with an investigation of how cpb-lex can be used to extend recent modelling research on the lexical diversity children receive from picture books in addition to child-directed speech. Our model shows that the vocabulary input from a relatively small number of picture books can dramatically enrich vocabulary exposure from child-directed speech and potentially assist children with vocabulary input deficits.
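
As an illustration of two of the statistics named in the abstract, the following minimal Python sketch computes Zipf-transformed frequencies and bigram transitional probabilities for a toy token list. It is not the authors' pipeline. It assumes the common Zipf-scale definition (log10 of frequency per million words, plus 3) and an unsmoothed transitional probability P(w2 | w1) = count(w1 w2) / count(w1 as a first word). The function names and the sample sentence are hypothetical.

```python
import math
from collections import Counter


def zipf_scores(tokens):
    """Zipf value per wordform: log10(frequency per million words) + 3.

    Assumption: the standard Zipf-scale formula, with no smoothing.
    """
    counts = Counter(tokens)
    total = len(tokens)
    return {w: math.log10(c / total * 1_000_000) + 3 for w, c in counts.items()}


def transitional_probabilities(tokens):
    """Forward transitional probability P(w2 | w1) for each observed bigram."""
    # Count each word only where it can start a bigram (all but the last token).
    first_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    return {
        (w1, w2): c / first_counts[w1]
        for (w1, w2), c in bigram_counts.items()
    }


# Hypothetical toy corpus standing in for picture-book text.
tokens = "the cat sat on the mat and the cat ran".split()
zipf = zipf_scores(tokens)
tp = transitional_probabilities(tokens)
```

In this toy corpus "the" is followed by "cat" in two of its three occurrences, so `tp[("the", "cat")]` is 2/3. On a real corpus the Zipf values would be far lower than in this ten-token example, because each word accounts for a much smaller share of the total.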


Persistent Identifier: http://hdl.handle.net/10722/347222
ISSN: 1554-351X
2023 Impact Factor: 4.6
2023 SCImago Journal Rankings: 2.396

 

DC Field | Value | Language
dc.contributor.author | Green, Clarence | -
dc.contributor.author | Keogh, Kathleen | -
dc.contributor.author | Sun, He | -
dc.contributor.author | O’Brien, Beth | -
dc.date.accessioned | 2024-09-20T00:30:45Z | -
dc.date.available | 2024-09-20T00:30:45Z | -
dc.date.issued | 2023-08-01 | -
dc.identifier.citation | Behavior Research Methods, 2023, v. 56, n. 5, p. 4504-4521 | -
dc.identifier.issn | 1554-351X | -
dc.identifier.uri | http://hdl.handle.net/10722/347222 | -
dc.description.abstract | This article presents cpb-lex, a large-scale database of lexical statistics derived from children’s picture books (age range 0–8 years). Such a database is essential for research in psychology, education and computational modelling, where rich details on the vocabulary of early print exposure are required. Cpb-lex was built through an innovative method of computationally extracting lexical information from automatic speech-to-text captions and subtitle tracks generated from social media channels dedicated to reading picture books aloud. It consists of approximately 25,585 types (wordforms) and their frequency norms (raw and Zipf-transformed), a lexicon of bigrams (two-word sequences and their transitional probabilities) and a document-term matrix (which shows the importance of each word in the corpus in each book). Several immediate contributions of cpb-lex to behavioural science research are reported, including that the new cpb-lex frequency norms strongly predict age of acquisition and outperform comparable child-input lexical databases. The database allows researchers and practitioners to extract lexical statistics for high-frequency words which can be used to develop word lists. The paper concludes with an investigation of how cpb-lex can be used to extend recent modelling research on the lexical diversity children receive from picture books in addition to child-directed speech. Our model shows that the vocabulary input from a relatively small number of picture books can dramatically enrich vocabulary exposure from child-directed speech and potentially assist children with vocabulary input deficits. | -
dc.language | eng | -
dc.publisher | Springer | -
dc.relation.ispartof | Behavior Research Methods | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject | Age of acquisition | -
dc.subject | Child input norms | -
dc.subject | Early print exposure | -
dc.subject | Lexical database | -
dc.subject | Picture books | -
dc.title | The Children’s Picture Books Lexicon (CPB-Lex): A large-scale lexical database from children’s picture books | -
dc.type | Article | -
dc.identifier.doi | 10.3758/s13428-023-02198-y | -
dc.identifier.scopus | eid_2-s2.0-85167838404 | -
dc.identifier.volume | 56 | -
dc.identifier.issue | 5 | -
dc.identifier.spage | 4504 | -
dc.identifier.epage | 4521 | -
dc.identifier.eissn | 1554-3528 | -
dc.identifier.issnl | 1554-351X | -
