Article: Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence.
Title: Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence.
 
Authors: Cheung, J (1); Estivill, X (1); Khaja, R (1); MacDonald, JR (1); Lau, K (1); Tsui, LC (1); Scherer, SW (1)
 
Issue Date: 2003

Publisher: BioMed Central Ltd.

Citation: Genome Biology, 2003, v. 4 n. 4, p. R25
DOI: http://dx.doi.org/10.1186/gb-2003-4-4-r25
 
Abstract:
BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies.
RESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each with size equal or more than 5 kb and 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants.
CONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.
 
ISSN: 1465-6914

DOI: http://dx.doi.org/10.1186/gb-2003-4-4-r25

PubMed Central ID: PMC154576

ISI Accession Number ID: WOS:000182696200007
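
The percentages quoted in the abstract can be re-derived from the raw counts it gives. The short Python sketch below does only that arithmetic; the helper name `percent` and the variable names are illustrative and not part of the record or the article.

```python
def percent(part, whole, digits):
    """Return part/whole as a percentage rounded to the given number of decimals."""
    return round(100.0 * part / whole, digits)


# Counts as stated in the abstract (megabases and SNP entries).
duplicated_pct = percent(107.4, 3043.1, 2)        # segmental duplications
misassigned_pct = percent(38.9, 3043.1, 2)        # potential misassignments
paralogous_pct = percent(199_965, 2_327_473, 1)   # SNP entries flagged as potential PSVs

print(duplicated_pct, misassigned_pct, paralogous_pct)
```

The three values agree with the 3.53%, 1.28% and 8.6% figures reported in the abstract.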
 
DC Field: Value
dc.contributor.author: Cheung, J
dc.contributor.author: Estivill, X
dc.contributor.author: Khaja, R
dc.contributor.author: MacDonald, JR
dc.contributor.author: Lau, K
dc.contributor.author: Tsui, LC
dc.contributor.author: Scherer, SW
dc.date.accessioned: 2007-03-23T04:48:55Z
dc.date.available: 2007-03-23T04:48:55Z
dc.date.issued: 2003
dc.description.abstract: BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. RESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each with size equal or more than 5 kb and 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. CONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.
dc.description.nature: published_or_final_version
dc.format.extent: 856152 bytes
dc.format.extent: 25088 bytes
dc.format.mimetype: application/pdf
dc.format.mimetype: application/msword
dc.identifier.citation: Genome Biology, 2003, v. 4 n. 4, p. R25
dc.identifier.citeulike: 838938
dc.identifier.doi: http://dx.doi.org/10.1186/gb-2003-4-4-r25
dc.identifier.epage: R25
dc.identifier.isi: WOS:000182696200007
dc.identifier.issn: 1465-6914
dc.identifier.issue: 4
dc.identifier.openurl:
dc.identifier.pmcid: PMC154576
dc.identifier.pmid: 12702206
dc.identifier.scopus: eid_2-s2.0-0037837485
dc.identifier.spage: R25
dc.identifier.uri: http://hdl.handle.net/10722/43556
dc.identifier.volume: 4
dc.language: eng
dc.publisher: BioMed Central Ltd.
dc.relation.ispartof: Genome biology
dc.rights: Creative Commons: Attribution 3.0 Hong Kong License
dc.subject.mesh: Artifacts
dc.subject.mesh: Chromosomes, human
dc.subject.mesh: Computational biology
dc.subject.mesh: Gene duplication
dc.subject.mesh: Genetic diseases, inborn - genetics
dc.title: Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence.
dc.type: Article
 
<?xml version="1.0" encoding="utf-8"?>
<item><contributor.author>Cheung, J</contributor.author>
<contributor.author>Estivill, X</contributor.author>
<contributor.author>Khaja, R</contributor.author>
<contributor.author>MacDonald, JR</contributor.author>
<contributor.author>Lau, K</contributor.author>
<contributor.author>Tsui, LC</contributor.author>
<contributor.author>Scherer, SW</contributor.author>
<date.accessioned>2007-03-23T04:48:55Z</date.accessioned>
<date.available>2007-03-23T04:48:55Z</date.available>
<date.issued>2003</date.issued>
<identifier.citation>Genome Biology, 2003, v. 4 n. 4, p. R25</identifier.citation>
<identifier.issn>1465-6914</identifier.issn>
<identifier.uri>http://hdl.handle.net/10722/43556</identifier.uri>
<description.abstract>BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. RESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each with size equal or more than 5 kb and 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. CONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.</description.abstract>
<format.extent>856152 bytes</format.extent>
<format.extent>25088 bytes</format.extent>
<format.mimetype>application/pdf</format.mimetype>
<format.mimetype>application/msword</format.mimetype>
<language>eng</language>
<publisher>BioMed Central Ltd.</publisher>
<relation.ispartof>Genome biology</relation.ispartof>
<rights>Creative Commons: Attribution 3.0 Hong Kong License</rights>
<subject.mesh>Artifacts</subject.mesh>
<subject.mesh>Chromosomes, human</subject.mesh>
<subject.mesh>Computational biology</subject.mesh>
<subject.mesh>Gene duplication</subject.mesh>
<subject.mesh>Genetic diseases, inborn - genetics</subject.mesh>
<title>Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence.</title>
<type>Article</type>
<identifier.openurl>http://library.hku.hk:4550/resserv?sid=HKU:IR&amp;issn=1465-6906&amp;volume=4&amp;issue=4&amp;spage=R25:1&amp;epage=10&amp;date=2003&amp;atitle=Genome-wide+detection+of+segmental+duplications+and+potential+assembly+errors+in+the+human+genome+sequence</identifier.openurl>
<description.nature>published_or_final_version</description.nature>
<identifier.doi>10.1186/gb-2003-4-4-r25</identifier.doi>
<identifier.pmid>12702206</identifier.pmid>
<identifier.pmcid>PMC154576</identifier.pmcid>
<identifier.scopus>eid_2-s2.0-0037837485</identifier.scopus>
<identifier.volume>4</identifier.volume>
<identifier.issue>4</identifier.issue>
<identifier.spage>R25</identifier.spage>
<identifier.epage>R25</identifier.epage>
<identifier.isi>WOS:000182696200007</identifier.isi>
<identifier.citeulike>838938</identifier.citeulike>
<bitstream.url>http://hub.hku.hk/bitstream/10722/43556/3/gb-2003-4-4-r25.pdf</bitstream.url>
</item>
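
The XML view above is a flat list of child elements under a single `item` root, with repeated elements (authors, MeSH subjects) for multi-valued fields. As a minimal sketch of how such a record could be read, the Python snippet below uses the standard library's `xml.etree.ElementTree` on a trimmed excerpt of the record. The helper name `parse_record`, the constant `RECORD_XML` and the choice of excerpted fields are assumptions made for illustration, not part of the repository's own tooling.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed excerpt of the record's XML view (illustrative subset of fields).
RECORD_XML = """<item><contributor.author>Cheung, J</contributor.author>
<contributor.author>Estivill, X</contributor.author>
<date.issued>2003</date.issued>
<identifier.doi>10.1186/gb-2003-4-4-r25</identifier.doi>
<identifier.pmid>12702206</identifier.pmid>
<subject.mesh>Gene duplication</subject.mesh>
</item>"""


def parse_record(xml_text):
    """Group element text by tag name; repeated tags keep document order."""
    root = ET.fromstring(xml_text)
    fields = defaultdict(list)
    # Iterate over direct children rather than using a path query,
    # since the tag names themselves contain dots.
    for child in root:
        fields[child.tag].append((child.text or "").strip())
    return dict(fields)


record = parse_record(RECORD_XML)
print(record["contributor.author"])
print(record["identifier.doi"][0])
```

Every field is stored as a list, so single-valued fields such as `identifier.doi` and multi-valued ones such as `contributor.author` are handled the same way.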
Author Affiliations
  1. Hospital for Sick Children, University of Toronto