PavyParsonsPauleEtAl2006

Reference

Pavy, N., Parsons, L.S., Paule, C., MacKay, J., Bousquet, J. (2006) Automated SNP detection from a large collection of white spruce expressed sequences: contributing factors and approaches for the categorization of SNPs. BMC Genomics, 7:174.

Abstract

Background: High-throughput genotyping technologies represent a highly efficient way to accelerate genetic mapping and enable association studies. As a first step toward this goal, we aimed to develop a resource of candidate Single Nucleotide Polymorphisms (SNP) in white spruce (Picea glauca [Moench] Voss), a softwood tree of major economic importance. Results: A white spruce SNP resource encompassing 12,264 SNPs was constructed from a set of 6,459 contigs derived from Expressed Sequence Tags (EST) and by using the bayesian-based statistical software PolyBayes. Several parameters influencing the SNP prediction were analysed including the a priori expected polymorphism, the probability score (P-SNP), and the contig depth and length. SNP detection in 3' and 5' reads from the same clones revealed a level of inconsistency between overlapping sequences as low as 1%. A subset of 245 predicted SNPs were verified through the independent resequencing of genomic DNA of a genotype also used to prepare cDNA libraries. The validation rate reached a maximum of 85% for SNPs predicted with either P-SNP = 0.95 or >= 0.99. A total of 9,310 SNPs were detected by using P-SNP = 0.95 as a criterion. The SNPs were distributed among 3,590 contigs encompassing an array of broad functional categories, with an overall frequency of 1 SNP per 700 nucleotide sites. Experimental and statistical approaches were used to evaluate the proportion of paralogous SNPs, with estimates in the range of 8 to 12%. The 3,789 coding SNPs identified through coding region annotation and ORF prediction, were distributed into 39% nonsynonymous and 61% synonymous substitutions. Overall, there were 0.9 SNP per 1,000 nonsynonymous sites and 5.2 SNPs per 1,000 synonymous sites, for a genome-wide nonsynonymous to synonymous substitution rate ratio (Ka/Ks) of 0.17. 
Conclusion: We integrated the SNP data in the ForestTreeDB database along with functional annotations to provide a tool facilitating the choice of candidate genes for mapping purposes or association studies.
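
The Ka/Ks value quoted in the abstract is consistent with the two per-site rates it reports. A minimal sketch of that arithmetic, assuming the ratio is simply the nonsynonymous rate divided by the synonymous rate (the variable names are illustrative and do not come from the paper):

```python
# SNP rates as reported in the abstract, in SNPs per 1,000 sites.
nonsyn_per_1000 = 0.9   # nonsynonymous sites
syn_per_1000 = 5.2      # synonymous sites

# Assumption: Ka/Ks is approximated as the plain ratio of the two rates.
ka_ks = nonsyn_per_1000 / syn_per_1000

print(round(ka_ks, 2))
```

Rounded to two decimals, the result matches the 0.17 given in the abstract.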

EndNote format

You can import this reference into EndNote.

BibTeX-CSV format

You can import this reference in BibTeX-CSV format.

BibTeX format

You can copy the BibTeX entry for this reference below, or import it directly into software such as JabRef.

@ARTICLE { PavyParsonsPauleEtAl2006,
    AUTHOR = { Pavy, N. and Parsons, L.S. and Paule, C. and MacKay, J. and Bousquet, J. },
    TITLE = { Automated SNP detection from a large collection of white spruce expressed sequences: contributing factors and approaches for the categorization of SNPs },
    JOURNAL = { BMC Genomics },
    YEAR = { 2006 },
    VOLUME = { 7 },
    PAGES = { 174 },
    NOTE = { Times Cited: 0 Article English Cited References Count: 37 083ab },
    ABSTRACT = { Background: High-throughput genotyping technologies represent a highly efficient way to accelerate genetic mapping and enable association studies. As a first step toward this goal, we aimed to develop a resource of candidate Single Nucleotide Polymorphisms (SNP) in white spruce (Picea glauca [Moench] Voss), a softwood tree of major economic importance. Results: A white spruce SNP resource encompassing 12,264 SNPs was constructed from a set of 6,459 contigs derived from Expressed Sequence Tags (EST) and by using the bayesian-based statistical software PolyBayes. Several parameters influencing the SNP prediction were analysed including the a priori expected polymorphism, the probability score (P-SNP), and the contig depth and length. SNP detection in 3' and 5' reads from the same clones revealed a level of inconsistency between overlapping sequences as low as 1%. A subset of 245 predicted SNPs were verified through the independent resequencing of genomic DNA of a genotype also used to prepare cDNA libraries. The validation rate reached a maximum of 85% for SNPs predicted with either P-SNP = 0.95 or >= 0.99. A total of 9,310 SNPs were detected by using P-SNP = 0.95 as a criterion. The SNPs were distributed among 3,590 contigs encompassing an array of broad functional categories, with an overall frequency of 1 SNP per 700 nucleotide sites. Experimental and statistical approaches were used to evaluate the proportion of paralogous SNPs, with estimates in the range of 8 to 12%. The 3,789 coding SNPs identified through coding region annotation and ORF prediction, were distributed into 39% nonsynonymous and 61% synonymous substitutions. Overall, there were 0.9 SNP per 1,000 nonsynonymous sites and 5.2 SNPs per 1,000 synonymous sites, for a genome-wide nonsynonymous to synonymous substitution rate ratio (Ka/Ks) of 0.17. 
Conclusion: We integrated the SNP data in the ForestTreeDB database along with functional annotations to provide a tool facilitating the choice of candidate genes for mapping purposes or association studies. },
    KEYWORDS = { single-nucleotide polymorphisms map-based cloning arabidopsis-thaliana loblolly-pine linkage map genetics discovery genes identification substitution },
    OWNER = { brugerolles },
    TIMESTAMP = { 2007.12.05 },
}
