Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1186/gb-2003-4-4-r25
- Scopus: eid_2-s2.0-0037837485
- PMID: 12702206
- WOS: WOS:000182696200007
Article: Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence.
Title | Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence. |
---|---|
Authors | Cheung, J; Estivill, X; Khaja, R; MacDonald, JR; Lau, K; Tsui, LC; Scherer, SW |
Issue Date | 2003 |
Publisher | BioMed Central Ltd. |
Citation | Genome Biology, 2003, v. 4 n. 4, p. R25 |
Abstract | BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. RESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each at least 5 kb in size and with at least 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. CONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve. |
Persistent Identifier | http://hdl.handle.net/10722/43556 |
ISSN | 1465-6914 |
PubMed Central ID | PMC154576 |
ISI Accession Number ID | WOS:000182696200007 |
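
The abstract states two numeric thresholds for a segmental duplication (at least 5 kb in size, at least 90% identity) and reports several percentages. The sketch below illustrates how such thresholds could be applied to BLAST hits and how the reported percentages follow from the raw counts. It is not the authors' pipeline: it assumes hits are available in the standard 12-column BLAST tabular layout, and the names `Hit`, `parse_hits`, `is_self_hit`, `candidate_duplications`, and `percent` are hypothetical, introduced only for illustration.

```python
from typing import Iterable, Iterator, List, NamedTuple

# Thresholds quoted in the abstract; everything else here is an assumption.
MIN_LENGTH_BP = 5_000      # "at least 5 kb"
MIN_IDENTITY_PCT = 90.0    # "at least 90% identity"


class Hit(NamedTuple):
    """One alignment from BLAST tabular output (12 standard columns)."""
    query: str
    subject: str
    identity_pct: float
    length_bp: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int


def parse_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse tab-separated hits, skipping blank lines and '#' comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        # Columns: qseqid sseqid pident length mismatch gapopen
        #          qstart qend sstart send evalue bitscore
        yield Hit(f[0], f[1], float(f[2]), int(f[3]),
                  int(f[6]), int(f[7]), int(f[8]), int(f[9]))


def is_self_hit(hit: Hit) -> bool:
    """True when a sequence is aligned to the same interval of itself."""
    same_interval = (
        sorted((hit.q_start, hit.q_end)) == sorted((hit.s_start, hit.s_end))
    )
    return hit.query == hit.subject and same_interval


def candidate_duplications(hits: Iterable[Hit]) -> List[Hit]:
    """Keep non-trivial hits that meet both thresholds."""
    return [
        h for h in hits
        if not is_self_hit(h)
        and h.length_bp >= MIN_LENGTH_BP
        and h.identity_pct >= MIN_IDENTITY_PCT
    ]


def percent(part: float, whole: float) -> float:
    """Share of `whole` taken by `part`, in percent."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # Figures taken from the abstract.
    print(round(percent(107.4, 3043.1), 2))      # duplicated sequence, Mb
    print(round(percent(38.9, 3043.1), 2))       # possible misassignments, Mb
    print(round(percent(199_965, 2_327_473), 1)) # suspect SNPs
```

A real analysis would also need to merge overlapping hits and mask common repeats before summing duplicated bases; those steps are outside what the abstract describes and are not attempted here.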
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Cheung, J | en_HK |
dc.contributor.author | Estivill, X | en_HK |
dc.contributor.author | Khaja, R | en_HK |
dc.contributor.author | MacDonald, JR | en_HK |
dc.contributor.author | Lau, K | en_HK |
dc.contributor.author | Tsui, LC | en_HK |
dc.contributor.author | Scherer, SW | en_HK |
dc.date.accessioned | 2007-03-23T04:48:55Z | - |
dc.date.available | 2007-03-23T04:48:55Z | - |
dc.date.issued | 2003 | en_HK |
dc.identifier.citation | Genome Biology, 2003, v. 4 n. 4, p. R25 | en_HK |
dc.identifier.issn | 1465-6914 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/43556 | - |
dc.description.abstract | BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. RESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each at least 5 kb in size and with at least 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. CONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve. | en_HK |
dc.format.extent | 856152 bytes | - |
dc.format.extent | 25088 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/msword | - |
dc.language | eng | en_HK |
dc.publisher | BioMed Central Ltd. | en_HK |
dc.relation.ispartof | Genome Biology | en_HK |
dc.subject.mesh | Artifacts | en_HK |
dc.subject.mesh | Chromosomes, human | en_HK |
dc.subject.mesh | Computational biology | en_HK |
dc.subject.mesh | Gene duplication | en_HK |
dc.subject.mesh | Genetic diseases, inborn - genetics | en_HK |
dc.title | Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence. | en_HK |
dc.type | Article | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1465-6906&volume=4&issue=4&spage=R25:1&epage=10&date=2003&atitle=Genome-wide+detection+of+segmental+duplications+and+potential+assembly+errors+in+the+human+genome+sequence | en_HK |
dc.identifier.email | Tsui, LC: tsuilc@hkucc.hku.hk | en_HK |
dc.identifier.authority | Tsui, LC=rp00058 | en_HK |
dc.description.nature | published_or_final_version | en_HK |
dc.identifier.doi | 10.1186/gb-2003-4-4-r25 | en_HK |
dc.identifier.pmid | 12702206 | - |
dc.identifier.pmcid | PMC154576 | - |
dc.identifier.scopus | eid_2-s2.0-0037837485 | en_HK |
dc.identifier.volume | 4 | en_HK |
dc.identifier.issue | 4 | en_HK |
dc.identifier.spage | R25 | en_HK |
dc.identifier.epage | R25 | en_HK |
dc.identifier.isi | WOS:000182696200007 | - |
dc.identifier.scopusauthorid | Cheung, J=7202072292 | en_HK |
dc.identifier.scopusauthorid | Estivill, X=36047834200 | en_HK |
dc.identifier.scopusauthorid | Khaja, R=7801610375 | en_HK |
dc.identifier.scopusauthorid | MacDonald, JR=7401439417 | en_HK |
dc.identifier.scopusauthorid | Lau, K=36722697000 | en_HK |
dc.identifier.scopusauthorid | Tsui, LC=7102754167 | en_HK |
dc.identifier.scopusauthorid | Scherer, SW=35374654500 | en_HK |
dc.identifier.citeulike | 838938 | - |
dc.identifier.issnl | 1465-6906 | - |