Toward high-throughput genotyping: Dynamic and automatic software for manipulating large-scale genotype data using fluorescently labeled dinucleotide markers

Jin Long Li, Hongyi Deng, Dong Bing Lai, Fuhua Xu, Jian Chen, Guimin Gao, Robert R. Recker, Hong Wen Deng

Research output: Contribution to journal (Article)

58 Citations (Scopus)

Abstract

To efficiently manipulate large amounts of genotype data generated with fluorescently labeled dinucleotide markers, we developed a Microsoft Access database management system, named GenoDB. GenoDB offers several advantages. First, it accommodates the dynamic nature of the accumulations of genotype data during the genotyping process; some data need to be confirmed or replaced by repeat lab procedures. By using GenoDB, the raw genotype data can be imported easily and continuously and incorporated into the database during the genotyping process that may continue over an extended period of time in large projects. Second, almost all of the procedures are automatic, including autocomparison of the raw data read by different technicians from the same gel, autoadjustment among the allele fragment-size data from cross-runs or cross-platforms, autobinning of alleles, and autocompilation of genotype data for suitable programs to perform inheritance check in pedigrees. Third, GenoDB provides functions to track electrophoresis gel files to locate gel or sample sources for any resultant genotype data, which is extremely helpful for double-checking consistency of raw and final data and for directing repeat experiments. In addition, the user-friendly graphic interface of GenoDB renders processing of large amounts of data much less labor-intensive. Furthermore, GenoDB has built-in mechanisms to detect some genotyping errors and to assess the quality of genotype data that then are summarized in the statistic reports automatically generated by GenoDB. The GenoDB can easily handle >500,000 genotype data entries, a number more than sufficient for typical whole-genome linkage studies. The modules and programs we developed for the GenoDB can be extended to other database platforms, such as Microsoft SQL server, if the capability to handle still greater quantities of genotype data simultaneously is desired.
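The abstract names autobinning of alleles but does not say how GenoDB performs it. As an illustration only, the sketch below shows one common way to bin dinucleotide-marker fragment sizes: snap each measured size to the nearest point on a 2-bp ladder and flag sizes that fall too far from any bin. The function name `bin_alleles`, the `anchor` parameter, and the 0.5-bp tolerance are assumptions made for this example, not details taken from the paper.

```python
from typing import List, Optional


def bin_alleles(
    sizes: List[float],
    anchor: int,
    repeat: int = 2,
    tolerance: float = 0.5,
) -> List[Optional[int]]:
    """Snap measured fragment sizes (bp) to a ladder of allele bins.

    Hypothetical illustration, not GenoDB's algorithm.
    anchor    -- size of a known reference allele that defines the ladder
    repeat    -- repeat unit length (2 for dinucleotide markers)
    tolerance -- maximum allowed distance from a bin center, in bp
    """
    binned: List[Optional[int]] = []
    for size in sizes:
        # Number of repeat units between this fragment and the anchor allele.
        steps = round((size - anchor) / repeat)
        center = anchor + steps * repeat
        if abs(size - center) <= tolerance:
            binned.append(center)
        else:
            # Ambiguous call: leave unbinned so it can be reviewed or rerun.
            binned.append(None)
    return binned
```

For example, `bin_alleles([150.2, 151.9, 153.1, 156.3], anchor=150)` returns `[150, 152, None, 156]`: the 153.1-bp fragment lies 0.9 bp from the nearest bin, so it is left uncalled rather than forced into 152 or 154.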

Original language: English
Pages (from-to): 1304-1314
Number of pages: 11
Journal: Genome Research
Volume: 11
Issue number: 7
DOI: 10.1101/gr.159701
State: Published - 2001

Fingerprint

Software
Genotype
Gels
Database Management Systems
Alleles
Databases
Pedigree
Electrophoresis
Genome

All Science Journal Classification (ASJC) codes

  • Genetics

Cite this

Toward high-throughput genotyping: Dynamic and automatic software for manipulating large-scale genotype data using fluorescently labeled dinucleotide markers. / Li, Jin Long; Deng, Hongyi; Lai, Dong Bing; Xu, Fuhua; Chen, Jian; Gao, Guimin; Recker, Robert R.; Deng, Hong Wen.

In: Genome Research, Vol. 11, No. 7, 2001, p. 1304-1314.

Li, Jin Long; Deng, Hongyi; Lai, Dong Bing; Xu, Fuhua; Chen, Jian; Gao, Guimin; Recker, Robert R.; Deng, Hong Wen. / Toward high-throughput genotyping: Dynamic and automatic software for manipulating large-scale genotype data using fluorescently labeled dinucleotide markers. In: Genome Research. 2001; Vol. 11, No. 7. pp. 1304-1314.
@article{2b0e87cbd5ea439d8d2fc928c09ea82e,
title = "Toward high-throughput genotyping: Dynamic and automatic software for manipulating large-scale genotype data using fluorescently labeled dinucleotide markers",
abstract = "To efficiently manipulate large amounts of genotype data generated with fluorescently labeled dinucleotide markers, we developed a Microsoft Access database management system, named GenoDB. GenoDB offers several advantages. First, it accommodates the dynamic nature of the accumulations of genotype data during the genotyping process; some data need to be confirmed or replaced by repeat lab procedures. By using GenoDB, the raw genotype data can be imported easily and continuously and incorporated into the database during the genotyping process that may continue over an extended period of time in large projects. Second, almost all of the procedures are automatic, including autocomparison of the raw data read by different technicians from the same gel, autoadjustment among the allele fragment-size data from cross-runs or cross-platforms, autobinning of alleles, and autocompilation of genotype data for suitable programs to perform inheritance check in pedigrees. Third, GenoDB provides functions to track electrophoresis gel files to locate gel or sample sources for any resultant genotype data, which is extremely helpful for double-checking consistency of raw and final data and for directing repeat experiments. In addition, the user-friendly graphic interface of GenoDB renders processing of large amounts of data much less labor-intensive. Furthermore, GenoDB has built-in mechanisms to detect some genotyping errors and to assess the quality of genotype data that then are summarized in the statistic reports automatically generated by GenoDB. The GenoDB can easily handle >500,000 genotype data entries, a number more than sufficient for typical whole-genome linkage studies. The modules and programs we developed for the GenoDB can be extended to other database platforms, such as Microsoft SQL server, if the capability to handle still greater quantities of genotype data simultaneously is desired.",
author = "Li, Jin Long and Deng, Hongyi and Lai, Dong Bing and Xu, Fuhua and Chen, Jian and Gao, Guimin and Recker, Robert R. and Deng, Hong Wen",
year = "2001",
doi = "10.1101/gr.159701",
language = "English",
volume = "11",
pages = "1304--1314",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "7",
}

TY - JOUR

T1 - Toward high-throughput genotyping

T2 - Dynamic and automatic software for manipulating large-scale genotype data using fluorescently labeled dinucleotide markers

AU - Li, Jin Long

AU - Deng, Hongyi

AU - Lai, Dong Bing

AU - Xu, Fuhua

AU - Chen, Jian

AU - Gao, Guimin

AU - Recker, Robert R.

AU - Deng, Hong Wen

PY - 2001

Y1 - 2001

AB - To efficiently manipulate large amounts of genotype data generated with fluorescently labeled dinucleotide markers, we developed a Microsoft Access database management system, named GenoDB. GenoDB offers several advantages. First, it accommodates the dynamic nature of the accumulations of genotype data during the genotyping process; some data need to be confirmed or replaced by repeat lab procedures. By using GenoDB, the raw genotype data can be imported easily and continuously and incorporated into the database during the genotyping process that may continue over an extended period of time in large projects. Second, almost all of the procedures are automatic, including autocomparison of the raw data read by different technicians from the same gel, autoadjustment among the allele fragment-size data from cross-runs or cross-platforms, autobinning of alleles, and autocompilation of genotype data for suitable programs to perform inheritance check in pedigrees. Third, GenoDB provides functions to track electrophoresis gel files to locate gel or sample sources for any resultant genotype data, which is extremely helpful for double-checking consistency of raw and final data and for directing repeat experiments. In addition, the user-friendly graphic interface of GenoDB renders processing of large amounts of data much less labor-intensive. Furthermore, GenoDB has built-in mechanisms to detect some genotyping errors and to assess the quality of genotype data that then are summarized in the statistic reports automatically generated by GenoDB. The GenoDB can easily handle >500,000 genotype data entries, a number more than sufficient for typical whole-genome linkage studies. The modules and programs we developed for the GenoDB can be extended to other database platforms, such as Microsoft SQL server, if the capability to handle still greater quantities of genotype data simultaneously is desired.

UR - http://www.scopus.com/inward/record.url?scp=0034925238&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0034925238&partnerID=8YFLogxK

U2 - 10.1101/gr.159701

DO - 10.1101/gr.159701

M3 - Article

C2 - 11435414

AN - SCOPUS:0034925238

VL - 11

SP - 1304

EP - 1314

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 7

ER -