Targeting a complex transcriptome

The construction of the mouse full-length cDNA encyclopedia

Piero Carninci, Kazunori Waki, Toshiyuki Shiraki, Hideaki Konno, Kazuhiro Shibata, Masayoshi Itoh, Katsunori Aizawa, Takahiro Arakawa, Yoshiyuki Ishii, Daisuke Sasaki, Hidemasa Bono, Shinji Kondo, Yuichi Sugahara, Rintaro Saito, Naoki Osato, Shiro Fukuda, Kenjiro Sato, Akira Watahiki, Tomoko Hirozane-Kishikawa, Mari Nakamura, Yuko Shibata, Ayako Yasunishi, Noriko Kikuchi, Atsushi Yoshiki, Moriaki Kusakabe, Stefano Gustincich, Kirk Beisel, William Pavan, Vassilis Aidinis, Akira Nakagawara, William A. Held, Hiroo Iwata, Tomohiro Kono, Hiromitsu Nakauchi, Paul Lyons, Christine Wells, David A. Hume, Michela Fagiolini, Takao K. Hensch, Michelle Brinkmeier, Sally Camper, Junji Hirota, Peter Mombaerts, Masami Muramatsu, Yasushi Okazaki, Jun Kawai, Yoshihide Hayashizaki

Research output: Contribution to journal (Review article)

144 Citations (Scopus)

Abstract

We report the construction of the mouse full-length cDNA encyclopedia, the most extensive view of a complex transcriptome, on the basis of preparing and sequencing 246 libraries. Before cloning, cDNAs were enriched in full-length by Cap-Trapper, and in most cases, aggressively subtracted/normalized. We have produced 1,442,236 successful 3′-end sequences clustered into 171,144 groups, from which 60,770 clones were fully sequenced cDNAs annotated in the FANTOM-2 annotation. We have also produced 547,149 5′-end reads, which clustered into 124,258 groups. Altogether, these cDNAs were further grouped in 70,000 transcriptional units (TU), which represent the best coverage of a transcriptome so far. By monitoring the extent of normalization/subtraction, we define the tentative equivalent coverage (TEC), which was estimated to be equivalent to >12,000,000 ESTs derived from standard libraries. High coverage explains discrepancies between the very large numbers of clusters (and TUs) of this project, which also include non-protein-coding RNAs, and the lower gene number estimation of genome annotations. Altogether, 5′-end clusters identify regions that are potential promoters for 8637 known genes and 5′-end clusters suggest the presence of almost 63,000 transcriptional starting points. An estimate of the frequency of polyadenylation signals suggests that at least half of the singletons in the EST set represent real mRNAs. Clones accounting for about half of the predicted TUs await further sequencing. The continued high-discovery rate suggests that the task of transcriptome discovery is not yet complete.

Original language: English
Pages (from-to): 1273-1289
Number of pages: 17
Journal: Genome Research
Volume: 13
Issue number: 6 B
DOI: https://doi.org/10.1101/gr.1119703
State: Published - Jun 1 2003

Fingerprint

Encyclopedias
Transcriptome
Complementary DNA
Expressed Sequence Tags
Libraries
Clone Cells
Untranslated RNA
Polyadenylation
Genetic Promoter Regions
Genes
Organism Cloning
Genome
Messenger RNA

All Science Journal Classification (ASJC) codes

  • Genetics

Cite this

Carninci, P., Waki, K., Shiraki, T., Konno, H., Shibata, K., Itoh, M., ... Hayashizaki, Y. (2003). Targeting a complex transcriptome: The construction of the mouse full-length cDNA encyclopedia. Genome Research, 13(6 B), 1273-1289. https://doi.org/10.1101/gr.1119703

@article{2a2c257eb419442ebdb5bc0376a6f094,
title = "Targeting a complex transcriptome: The construction of the mouse full-length cDNA encyclopedia",
abstract = "We report the construction of the mouse full-length cDNA encyclopedia, the most extensive view of a complex transcriptome, on the basis of preparing and sequencing 246 libraries. Before cloning, cDNAs were enriched in full-length by Cap-Trapper, and in most cases, aggressively subtracted/normalized. We have produced 1,442,236 successful 3′-end sequences clustered into 171,144 groups, from which 60,770 clones were fully sequenced cDNAs annotated in the FANTOM-2 annotation. We have also produced 547,149 5′-end reads, which clustered into 124,258 groups. Altogether, these cDNAs were further grouped in 70,000 transcriptional units (TU), which represent the best coverage of a transcriptome so far. By monitoring the extent of normalization/subtraction, we define the tentative equivalent coverage (TEC), which was estimated to be equivalent to >12,000,000 ESTs derived from standard libraries. High coverage explains discrepancies between the very large numbers of clusters (and TUs) of this project, which also include non-protein-coding RNAs, and the lower gene number estimation of genome annotations. Altogether, 5′-end clusters identify regions that are potential promoters for 8637 known genes and 5′-end clusters suggest the presence of almost 63,000 transcriptional starting points. An estimate of the frequency of polyadenylation signals suggests that at least half of the singletons in the EST set represent real mRNAs. Clones accounting for about half of the predicted TUs await further sequencing. The continued high-discovery rate suggests that the task of transcriptome discovery is not yet complete.",
author = "Piero Carninci and Kazunori Waki and Toshiyuki Shiraki and Hideaki Konno and Kazuhiro Shibata and Masayoshi Itoh and Katsunori Aizawa and Takahiro Arakawa and Yoshiyuki Ishii and Daisuke Sasaki and Hidemasa Bono and Shinji Kondo and Yuichi Sugahara and Rintaro Saito and Naoki Osato and Shiro Fukuda and Kenjiro Sato and Akira Watahiki and Tomoko Hirozane-Kishikawa and Mari Nakamura and Yuko Shibata and Ayako Yasunishi and Noriko Kikuchi and Atsushi Yoshiki and Moriaki Kusakabe and Stefano Gustincich and Kirk Beisel and William Pavan and Vassilis Aidinis and Akira Nakagawara and Held, {William A.} and Hiroo Iwata and Tomohiro Kono and Hiromitsu Nakauchi and Paul Lyons and Christine Wells and Hume, {David A.} and Michela Fagiolini and Hensch, {Takao K.} and Michelle Brinkmeier and Sally Camper and Junji Hirota and Peter Mombaerts and Masami Muramatsu and Yasushi Okazaki and Jun Kawai and Yoshihide Hayashizaki",
year = "2003",
month = "6",
day = "1",
doi = "10.1101/gr.1119703",
language = "English",
volume = "13",
pages = "1273--1289",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "6 B",

}
