Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1162/dint_a_00053
- Scopus: eid_2-s2.0-85087408658
- WOS: WOS:000691823600005
Article: Toward Training and Assessing Reproducible Data Analysis in Data Science Education
Title | Toward Training and Assessing Reproducible Data Analysis in Data Science Education |
---|---|
Authors | Yu, B; Hu, X |
Keywords | Data science education; Reproducibility; Reproducible data analysis; Communication; Action research |
Issue Date | 2019 |
Publisher | MIT Press: Data Intelligence. The Journal's web site is located at https://www.mitpressjournals.org/loi/dint |
Citation | Data Intelligence, 2019, v. 1 n. 4, p. 381-392 |
Abstract | Reproducibility is a cornerstone of scientific research, and data science is no exception. In recent years, scientists have raised concerns about the large number of irreproducible studies. Such a reproducibility crisis could severely undermine public trust in science and science-based public policy. Recent efforts to promote reproducible research have focused mainly on established scientists and much less on student training. In this study, we conducted action research with data science students to evaluate to what extent they are ready to communicate reproducible data analysis. The results show that although two-thirds of the students claimed they were able to reproduce the results in peer reports, only one-third of the reports provided all the information necessary for replication. The actual replication results also included conflicting claims, and some lacked comparisons of original and replication results, indicating that some students did not share a consistent understanding of what reproducibility means and how to report replication results. The findings suggest that more training is needed to help data science students communicate reproducible data analysis. |
Persistent Identifier | http://hdl.handle.net/10722/294172 |
ISI Accession Number ID | WOS:000691823600005 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yu, B | - |
dc.contributor.author | Hu, X | - |
dc.date.accessioned | 2020-11-23T08:27:24Z | - |
dc.date.available | 2020-11-23T08:27:24Z | - |
dc.date.issued | 2019 | - |
dc.identifier.citation | Data Intelligence, 2019, v. 1 n. 4, p. 381-392 | - |
dc.identifier.uri | http://hdl.handle.net/10722/294172 | - |
dc.language | eng | - |
dc.publisher | MIT Press: Data Intelligence. The Journal's web site is located at https://www.mitpressjournals.org/loi/dint | - |
dc.relation.ispartof | Data Intelligence | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject | Data science education | - |
dc.subject | Reproducibility | - |
dc.subject | Reproducible data analysis | - |
dc.subject | Communication | - |
dc.subject | Action research | - |
dc.title | Toward Training and Assessing Reproducible Data Analysis in Data Science Education | - |
dc.type | Article | - |
dc.identifier.email | Hu, X: xiaoxhu@hku.hk | - |
dc.identifier.authority | Hu, X=rp01711 | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.1162/dint_a_00053 | - |
dc.identifier.scopus | eid_2-s2.0-85087408658 | - |
dc.identifier.hkuros | 318960 | - |
dc.identifier.volume | 1 | - |
dc.identifier.issue | 4 | - |
dc.identifier.spage | 381 | - |
dc.identifier.epage | 392 | - |
dc.identifier.eissn | 2641-435X | - |
dc.identifier.isi | WOS:000691823600005 | - |
dc.publisher.place | United States | - |
dc.identifier.issnl | 2641-435X | - |