Article: A globally synthesised and flagged bee occurrence dataset and cleaning workflow

Title: A globally synthesised and flagged bee occurrence dataset and cleaning workflow
Issue Date: 2-Nov-2023
Publisher: Nature Research
Citation: Scientific Data, 2023, v. 10, n. 1
Abstract: Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represents a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >18.3 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, deduplicated, and cleaned the data using the reproducible BeeBDC R-workflow. Specifically, we harmonised species names (following established global taxonomy), country names, and collection dates, and we added record-level flags for a series of potential quality issues. These data are provided in two formats, "cleaned" and "flagged-but-uncleaned". The BeeBDC package with online documentation provides end users the ability to modify filtering parameters to address their research questions. By publishing reproducible R workflows and globally cleaned datasets, we can increase the accessibility and reliability of downstream analyses. This workflow can be implemented for other taxa to support research and conservation.
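The flag-then-filter idea described in the abstract can be illustrated with a minimal sketch. This is not the BeeBDC API (BeeBDC is an R package); the function names, flag names, and sample rows below are hypothetical and chosen only to show how record-level flags allow one "flagged-but-uncleaned" table to yield different "cleaned" tables depending on which flags a user chooses to apply.

```python
def flag_records(rows, accepted_names):
    """Return copies of rows with boolean flag fields added (True = passes the check).

    Hypothetical flags, for illustration only:
      flag_name_ok       - species name is in the accepted-name set
      flag_has_country   - country field is not empty
      flag_not_duplicate - first occurrence of (species, country, date)
    """
    seen = set()
    flagged = []
    for row in rows:
        key = (row["species"], row["country"], row["date"])
        out = dict(row)  # copy so the input rows are left untouched
        out["flag_name_ok"] = row["species"] in accepted_names
        out["flag_has_country"] = bool(row["country"])
        out["flag_not_duplicate"] = key not in seen
        seen.add(key)
        flagged.append(out)
    return flagged


def clean_records(flagged, flags_to_apply):
    """Keep only rows that pass every flag the user chose to apply."""
    return [r for r in flagged if all(r[f] for f in flags_to_apply)]


# Made-up sample rows (not taken from the dataset).
rows = [
    {"species": "Apis mellifera", "country": "Australia", "date": "2020-01-05"},
    {"species": "Apis mellifera", "country": "Australia", "date": "2020-01-05"},  # repeat of row 1
    {"species": "Apis melifera", "country": "Fiji", "date": "2019-03-02"},        # misspelt name
    {"species": "Bombus terrestris", "country": "", "date": "2021-07-11"},        # missing country
]
accepted = {"Apis mellifera", "Bombus terrestris"}

flagged = flag_records(rows, accepted)  # "flagged-but-uncleaned": all rows kept
strict = clean_records(
    flagged, ["flag_name_ok", "flag_has_country", "flag_not_duplicate"]
)
lenient = clean_records(flagged, ["flag_name_ok"])  # apply only the name check
```

Keeping the flags as separate fields, rather than deleting rows as each check runs, is what lets the filtering choice be deferred to the end user.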
Persistent Identifier: http://hdl.handle.net/10722/344874
ISSN: 2052-4463
2023 Impact Factor: 5.8
2023 SCImago Journal Rankings: 1.937
ISI Accession Number ID: WOS:001098042400004

 

DC Field | Value | Language
dc.contributor.author | Dorey, JB | -
dc.contributor.author | Fischer, EE | -
dc.contributor.author | Chesshire, PR | -
dc.contributor.author | Nava-Bolaños, A | -
dc.contributor.author | O'Reilly, RL | -
dc.contributor.author | Bossert, S | -
dc.contributor.author | Collins, SM | -
dc.contributor.author | Lichtenberg, EM | -
dc.contributor.author | Tucker, EM | -
dc.contributor.author | Smith-Pardo, A | -
dc.contributor.author | Falcon-Brindis, A | -
dc.contributor.author | Guevara, DA | -
dc.contributor.author | Ribeiro, B | -
dc.contributor.author | de Pedro, D | -
dc.contributor.author | Pickering, J | -
dc.contributor.author | Hung, KLJ | -
dc.contributor.author | Parys, KA | -
dc.contributor.author | McCabe, LM | -
dc.contributor.author | Rogan, MS | -
dc.contributor.author | Minckley, RL | -
dc.contributor.author | Velazco, SJE | -
dc.contributor.author | Griswold, T | -
dc.contributor.author | Zarrillo, TA | -
dc.contributor.author | Jetz, W | -
dc.contributor.author | Sica, YV | -
dc.contributor.author | Orr, MC | -
dc.contributor.author | Guzman, LM | -
dc.contributor.author | Ascher, JS | -
dc.contributor.author | Hughes, AC | -
dc.contributor.author | Cobb, NS | -
dc.date.accessioned | 2024-08-12T04:08:03Z | -
dc.date.available | 2024-08-12T04:08:03Z | -
dc.date.issued | 2023-11-02 | -
dc.identifier.citation | Scientific Data, 2023, v. 10, n. 1 | -
dc.identifier.issn | 2052-4463 | -
dc.identifier.uri | http://hdl.handle.net/10722/344874 | -
dc.language | eng | -
dc.publisher | Nature Research | -
dc.relation.ispartof | Scientific Data | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.title | A globally synthesised and flagged bee occurrence dataset and cleaning workflow | -
dc.type | Article | -
dc.identifier.doi | 10.1038/s41597-023-02626-w | -
dc.identifier.pmid | 37919303 | -
dc.identifier.scopus | eid_2-s2.0-85175688705 | -
dc.identifier.volume | 10 | -
dc.identifier.issue | 1 | -
dc.identifier.eissn | 2052-4463 | -
dc.identifier.isi | WOS:001098042400004 | -
dc.publisher.place | BERLIN | -
dc.identifier.issnl | 2052-4463 | -
