Can transcriptome size be estimated from SAGE catalogs?

Stern, MD; Anisimov, SV; Boheler, KR

Publisher Website: 10.1093/bioinformatics/btg018
Scopus: eid_2-s2.0-0037340843
PMID: 12611798
WOS: WOS:000181410700001
Appears in Collections:
- Li Ka Shing Faculty of Medicine: Journal/Magazine Articles

Title	Can transcriptome size be estimated from SAGE catalogs?
Authors	Stern, MD; Anisimov, SV; Boheler, KR
Issue Date	2003
Citation	Bioinformatics, 2003, v. 19 n. 4, p. 443-448. DOI: http://dx.doi.org/10.1093/bioinformatics/btg018
Abstract	Motivation: We sought to determine whether SAGE (Serial Analysis of Gene Expression) can be used to estimate the number of unique transcripts in a transcriptome. A simple estimator that corrects for sequencing and sampling errors was applied to a SAGE library (137 832 tags) obtained from mouse embryonic stem cells, and also to Monte Carlo simulated libraries generated using assumed distributions of 'true' expression levels consistent with the data. Results: When the corrected data themselves were taken as the underlying model of 'ground truth', the estimator converged to the 'true' value (53 535) only after counting 300 000 simulated tags, more than twice the number in the experiment. The SAGE data could also be well fit by a Monte Carlo model based on a truncated inverse-square distribution of expression levels, with 130 000 'true' transcripts and > 10^6 samples needed for convergence. We conclude that the size of a transcriptome is ill-determined from SAGE libraries of even moderately large size. In order to obtain a valid estimate, one must sample a number of tags inversely proportional to the lowest abundance level, which is not known a priori. This constrains the design of SAGE experiments intended to determine biological complexity.
Persistent Identifier	http://hdl.handle.net/10722/195164
ISSN	1367-4803
2021 Impact Factor	6.931
2020 SCImago Journal Rankings	3.599
ISI Accession Number ID	WOS:000181410700001
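
The abstract's central point, that the number of tags needed grows inversely with the lowest abundance level, can be illustrated with a small Monte Carlo sketch. This is not the authors' estimator and applies no sequencing-error correction; it simply counts the distinct transcripts observed when tags are drawn from a truncated inverse-square distribution of expression levels. The transcript count (2000), the truncation bounds (1 to 1000), the sample sizes, and all function names are assumptions chosen for illustration, far smaller than the values reported in the abstract.

```python
import random


def inverse_square_levels(n_transcripts, x_min, x_max, rng):
    """Draw expression levels with density proportional to 1/x^2 on [x_min, x_max].

    Uses inverse-CDF sampling: for u uniform in [0, 1),
    x = 1 / (1/x_min - u * (1/x_min - 1/x_max)).
    """
    a, b = 1.0 / x_min, 1.0 / x_max
    return [1.0 / (a - rng.random() * (a - b)) for _ in range(n_transcripts)]


def distinct_transcripts_seen(levels, n_tags, rng):
    """Draw n_tags tags with probability proportional to expression level
    and return how many distinct transcripts were observed at least once."""
    drawn = rng.choices(range(len(levels)), weights=levels, k=n_tags)
    return len(set(drawn))


# Illustrative (assumed) parameters, not those of the paper.
N_TRUE = 2000
rng = random.Random(0)
levels = inverse_square_levels(N_TRUE, 1.0, 1000.0, rng)

# Relative abundance of the rarest transcript; its reciprocal sets the
# scale of the sample size needed before rare transcripts are seen at all.
lowest_fraction = min(levels) / sum(levels)

sample_sizes = [1_000, 10_000, 50_000, 200_000]
counts = [distinct_transcripts_seen(levels, n, rng) for n in sample_sizes]

print(f"true transcripts: {N_TRUE}, 1 / lowest abundance: {1 / lowest_fraction:.0f}")
for n, c in zip(sample_sizes, counts):
    print(f"{n:>8} tags -> {c} distinct transcripts observed")
```

In this toy setting the observed count approaches the true number only once the sample size is several times the reciprocal of the lowest relative abundance, a quantity that would not be known in advance for a real library.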

DC Field	Value	Language
dc.contributor.author	Stern, MD	-
dc.contributor.author	Anisimov, SV	-
dc.contributor.author	Boheler, KR	-
dc.date.accessioned	2014-02-25T01:40:15Z	-
dc.date.available	2014-02-25T01:40:15Z	-
dc.date.issued	2003	-
dc.identifier.citation	Bioinformatics, 2003, v. 19 n. 4, p. 443-448	-
dc.identifier.issn	1367-4803	-
dc.identifier.uri	http://hdl.handle.net/10722/195164	-
dc.description.abstract	Motivation: We sought to determine whether SAGE (Serial Analysis of Gene Expression) can be used to estimate the number of unique transcripts in a transcriptome. A simple estimator that corrects for sequencing and sampling errors was applied to a SAGE library (137 832 tags) obtained from mouse embryonic stem cells, and also to Monte Carlo simulated libraries generated using assumed distributions of 'true' expression levels consistent with the data. Results: When the corrected data themselves were taken as the underlying model of 'ground truth', the estimator converged to the 'true' value (53 535) only after counting 300 000 simulated tags, more than twice the number in the experiment. The SAGE data could also be well fit by a Monte Carlo model based on a truncated inverse-square distribution of expression levels, with 130 000 'true' transcripts and > 10^6 samples needed for convergence. We conclude that the size of a transcriptome is ill-determined from SAGE libraries of even moderately large size. In order to obtain a valid estimate, one must sample a number of tags inversely proportional to the lowest abundance level, which is not known a priori. This constrains the design of SAGE experiments intended to determine biological complexity.	-
dc.language	eng	-
dc.relation.ispartof	Bioinformatics	-
dc.title	Can transcriptome size be estimated from SAGE catalogs?	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1093/bioinformatics/btg018	-
dc.identifier.pmid	12611798	-
dc.identifier.scopus	eid_2-s2.0-0037340843	-
dc.identifier.volume	19	-
dc.identifier.issue	4	-
dc.identifier.spage	443	-
dc.identifier.epage	448	-
dc.identifier.isi	WOS:000181410700001	-
dc.identifier.issnl	1367-4803	-