Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1093/bioinformatics/btab338
- PMID: 33964132
- WOS: WOS:000733827400032
Article: Alignment free sequence comparison methods and reservoir host prediction
Title | Alignment free sequence comparison methods and reservoir host prediction |
---|---|
Authors | Lee, B; Smith, DK; Guan, Y |
Issue Date | 2021 |
Publisher | Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/ |
Citation | Bioinformatics, 2021, v. 37 n. 19, p. 3337-3342 |
Abstract | Motivation:
The emergence and subsequent pandemic of the SARS-CoV-2 virus raised urgent questions about its origin and, particularly, its reservoir host. These types of questions are long-standing problems in the management of emerging infectious diseases and are linked to virus discovery programs and the prediction of viruses that are likely to become zoonotic. Conventional means to identify reservoir hosts have relied on surveillance, experimental studies and phylogenetics. More recently, machine learning approaches have been applied to generate tools to swiftly predict reservoir hosts from sequence data.
Results:
Here, we extend recent work that combined sequence alignment with a mixture of alignment-free approaches in a gradient boosting machine model, which integrates genomic traits and phylogenetic neighbourhood signatures to predict reservoir hosts. We add a more uniform approach by applying Machine Learning with Digital Signal Processing-based structural patterns. The extended model was applied to an existing virus/reservoir host dataset and to SARS-CoV-2 and related viruses, and it improved prediction accuracy.
Availability and implementation:
The source code used in this work is freely available at https://github.com/bill1167/hostgbms.
Supplementary information:
Supplementary data are available at Bioinformatics online. |
Persistent Identifier | http://hdl.handle.net/10722/304474 |
ISSN | 1367-4803 (2023 Impact Factor: 4.4; 2023 SCImago Journal Rankings: 2.574) |
PubMed Central ID | PMC8135978 |
ISI Accession Number ID | WOS:000733827400032 |
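
The abstract mentions alignment-free comparison based on digital signal processing. As a rough illustration of that general idea only, and not the authors' implementation (see the linked repository for that), the sketch below encodes each sequence as a numeric signal, takes the magnitude of its discrete Fourier transform, and compares spectra with a Pearson-correlation distance. The purine/pyrimidine encoding, the function names and the toy sequences are all assumptions made for this example.

```python
import cmath
import math

# Assumed encoding for illustration: purines (A, G) -> -1, pyrimidines (C, T) -> +1.
# Other numeric encodings are possible; unknown bases map to 0.
PURINE_PYRIMIDINE = {"A": -1.0, "G": -1.0, "C": 1.0, "T": 1.0}


def encode(seq, length):
    """Turn a DNA string into a numeric signal, truncated or zero-padded to `length`."""
    signal = [PURINE_PYRIMIDINE.get(base, 0.0) for base in seq.upper()[:length]]
    return signal + [0.0] * (length - len(signal))


def magnitude_spectrum(signal):
    """Magnitudes of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n)
    ]


def pearson_distance(u, v):
    """Distance in [0, 1] derived from the Pearson correlation of two spectra."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("correlation is undefined for a constant spectrum")
    return (1.0 - cov / (norm_u * norm_v)) / 2.0


# Toy sequences (hypothetical, not taken from the paper's dataset).
seq_a = "AAGCTTGACC"
seq_b = seq_a[3:] + seq_a[:3]  # circular rotation of seq_a
seq_c = "ACACACACAC"

spectra = [magnitude_spectrum(encode(s, 10)) for s in (seq_a, seq_b, seq_c)]
print(pearson_distance(spectra[0], spectra[1]))
print(pearson_distance(spectra[0], spectra[2]))
```

Because the magnitude spectrum ignores circular shifts of the signal, the rotated sequence lands essentially on top of the original without any alignment step, while the unrelated toy sequence does not. In a pipeline of the kind the abstract describes, such pairwise distances or the spectra themselves would be among the features passed to the classifier.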
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, B | - |
dc.contributor.author | Smith, DK | - |
dc.contributor.author | Guan, Y | - |
dc.date.accessioned | 2021-09-23T09:00:32Z | - |
dc.date.available | 2021-09-23T09:00:32Z | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | Bioinformatics, 2021, v. 37 n. 19, p. 3337-3342 | - |
dc.identifier.issn | 1367-4803 | - |
dc.identifier.uri | http://hdl.handle.net/10722/304474 | - |
dc.language | eng | - |
dc.publisher | Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/ | - |
dc.relation.ispartof | Bioinformatics | - |
dc.title | Alignment free sequence comparison methods and reservoir host prediction | - |
dc.type | Article | - |
dc.identifier.email | Smith, DK: dsmith@hku.hk | - |
dc.identifier.email | Guan, Y: yguan@hkucc.hku.hk | - |
dc.identifier.authority | Guan, Y=rp00397 | - |
dc.description.nature | link_to_OA_fulltext | - |
dc.identifier.doi | 10.1093/bioinformatics/btab338 | - |
dc.identifier.pmid | 33964132 | - |
dc.identifier.pmcid | PMC8135978 | - |
dc.identifier.hkuros | 325414 | - |
dc.identifier.volume | 37 | - |
dc.identifier.issue | 19 | - |
dc.identifier.spage | 3337 | - |
dc.identifier.epage | 3342 | - |
dc.identifier.isi | WOS:000733827400032 | - |
dc.publisher.place | United Kingdom | - |