PrunierLemaconBastienEtAl2019

Reference

Prunier, J., Lemaçon, A., Bastien, A., Jafarikia, M., Porth, I., Robert, C., Droit, A. (2019) LD-annot: A Bioinformatics Tool to Automatically Provide Candidate SNPs With Annotations for Genetically Linked Genes. Frontiers in Genetics, 10. (Scopus)

Abstract

A multitude of model and non-model species studies have now taken full advantage of powerful high-throughput genotyping advances such as SNP arrays and genotyping-by-sequencing (GBS) technology to investigate the genetic basis of trait variation. However, due to incomplete genome coverage by these technologies, the identified SNPs are likely in linkage disequilibrium (LD) with the causal polymorphisms, rather than being causal themselves. In addition, researchers could benefit from annotations for the identified candidate SNPs and, simultaneously, for all neighboring genes in genetic linkage. In such cases, LD extent estimation surrounding the candidate SNPs is required to determine the regions encompassing genes of interest. We describe here an automated pipeline, “LD-annot,” designed to delineate specific regions of interest for a given experiment and candidate polymorphisms on the basis of LD extent and, furthermore, to provide annotations for all genes within such regions. LD-annot uses standard file formats, bioinformatics tools, and languages to provide identifiers, coordinates, and annotations for genes in genetic linkage with each candidate polymorphism. Although the focus lies upon SNP arrays and GBS data as they are being routinely deployed, this pipeline can be applied to a variety of datasets as long as genotypic data are available for a high number of polymorphisms and formatted into a VCF file. A checkpoint procedure in the pipeline allows several threshold values for linkage to be tested without rerunning the entire pipeline, thus saving the user computational time and resources. We applied this new pipeline to four different sample sets: two breeding-population GBS datasets, one within-pedigree SNP set coming from whole genome sequencing (WGS), and a very large multi-variety SNP dataset obtained from WGS, representing variable sample sizes and numbers of polymorphisms. LD-annot ran within minutes, even when very high numbers of polymorphisms were investigated, and thus will efficiently assist research efforts aimed at identifying biologically meaningful genetic polymorphisms underlying phenotypic variation. The LD-annot tool is available under a GPL license from https://github.com/ArnaudDroitLab/LD-annot. Copyright © 2019 Prunier, Lemaçon, Bastien, Jafarikia, Porth, Robert and Droit.
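
The idea of delineating a region around a candidate SNP from LD extent, then listing the genes inside it, can be illustrated with a minimal Python sketch. This is not LD-annot's code and does not reproduce its method: the function names, the use of squared genotype correlation (r²) as the LD measure, the rule of extending the region through contiguous linked SNPs, and the toy data are all assumptions made for illustration.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic site: correlation undefined, treat as unlinked
    return (sxy * sxy) / (sxx * syy)


def ld_region(positions, genotypes, candidate_idx, threshold):
    """Extend left and right from the candidate SNP while neighbors stay linked.

    Assumed rule (hypothetical): a neighbor is linked when its r² with the
    candidate is at least `threshold`; extension stops at the first unlinked SNP.
    """
    cand = genotypes[candidate_idx]
    left = candidate_idx
    while left > 0 and r_squared(cand, genotypes[left - 1]) >= threshold:
        left -= 1
    right = candidate_idx
    while right < len(positions) - 1 and r_squared(cand, genotypes[right + 1]) >= threshold:
        right += 1
    return positions[left], positions[right]


def genes_in_region(genes, start, end):
    """Names of genes whose [gene_start, gene_end] interval overlaps [start, end]."""
    return [name for name, g_start, g_end in genes if g_start <= end and g_end >= start]


# Toy data (invented): five SNPs on one chromosome, six individuals each.
positions = [100, 250, 400, 900, 1500]
genotypes = [
    [1, 1, 1, 0, 2, 1],  # weakly correlated with the candidate
    [0, 1, 2, 1, 0, 2],  # identical to the candidate (r² = 1)
    [0, 1, 2, 1, 0, 2],  # candidate SNP at position 400
    [0, 1, 2, 1, 1, 2],  # r² with the candidate is about 0.79
    [1, 0, 1, 2, 1, 1],  # uncorrelated with the candidate
]
genes = [("geneA", 50, 200), ("geneB", 300, 450), ("geneC", 800, 1000), ("geneD", 1400, 1600)]

# Trying two thresholds on the same data, in the spirit of the checkpoint idea.
strict = ld_region(positions, genotypes, candidate_idx=2, threshold=0.8)  # (250, 400)
loose = ld_region(positions, genotypes, candidate_idx=2, threshold=0.7)   # (250, 900)
strict_genes = genes_in_region(genes, *strict)  # ['geneB']
loose_genes = genes_in_region(genes, *loose)    # ['geneB', 'geneC']
```

Lowering the threshold widens the region and pulls in an additional gene, which is why being able to try several thresholds cheaply is useful. In a real analysis the genotypes would come from a VCF file and the gene coordinates from a genome annotation file.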

BibTeX Format

You can copy the BibTeX entry for this reference below, or import it directly into software such as JabRef.

@ARTICLE { PrunierLemaconBastienEtAl2019,
    AUTHOR = { Prunier, J. and Lemaçon, A. and Bastien, A. and Jafarikia, M. and Porth, I. and Robert, C. and Droit, A. },
    TITLE = { LD-annot: A Bioinformatics Tool to Automatically Provide Candidate SNPs With Annotations for Genetically Linked Genes },
    JOURNAL = { Frontiers in Genetics },
    YEAR = { 2019 },
    VOLUME = { 10 },
    NOTE = { cited By 0 },
    ABSTRACT = { A multitude of model and non-model species studies have now taken full advantage of powerful high-throughput genotyping advances such as SNP arrays and genotyping-by-sequencing (GBS) technology to investigate the genetic basis of trait variation. However, due to incomplete genome coverage by these technologies, the identified SNPs are likely in linkage disequilibrium (LD) with the causal polymorphisms, rather than being causal themselves. In addition, researchers could benefit from annotations for the identified candidate SNPs and, simultaneously, for all neighboring genes in genetic linkage. In such cases, LD extent estimation surrounding the candidate SNPs is required to determine the regions encompassing genes of interest. We describe here an automated pipeline, “LD-annot,” designed to delineate specific regions of interest for a given experiment and candidate polymorphisms on the basis of LD extent and, furthermore, to provide annotations for all genes within such regions. LD-annot uses standard file formats, bioinformatics tools, and languages to provide identifiers, coordinates, and annotations for genes in genetic linkage with each candidate polymorphism. Although the focus lies upon SNP arrays and GBS data as they are being routinely deployed, this pipeline can be applied to a variety of datasets as long as genotypic data are available for a high number of polymorphisms and formatted into a VCF file. A checkpoint procedure in the pipeline allows several threshold values for linkage to be tested without rerunning the entire pipeline, thus saving the user computational time and resources. We applied this new pipeline to four different sample sets: two breeding-population GBS datasets, one within-pedigree SNP set coming from whole genome sequencing (WGS), and a very large multi-variety SNP dataset obtained from WGS, representing variable sample sizes and numbers of polymorphisms. LD-annot ran within minutes, even when very high numbers of polymorphisms were investigated, and thus will efficiently assist research efforts aimed at identifying biologically meaningful genetic polymorphisms underlying phenotypic variation. The LD-annot tool is available under a GPL license from https://github.com/ArnaudDroitLab/LD-annot. Copyright © 2019 Prunier, Lemaçon, Bastien, Jafarikia, Porth, Robert and Droit. },
    AFFILIATION = { Genomics Center, Centre Hospitalier Universitaire de Québec–Université Laval Research Center, Quebec, QC, Canada; Forestry Research Centre, Forestry Department, Université Laval, Quebec, QC, Canada; Faculty of Agricultural and Food Science, Université Laval, Quebec, QC, Canada; Canadian Centre for Swine Improvement, Ottawa, ON, Canada; Department of Animal Biosciences, University of Guelph, Guelph, ON, Canada },
    ART_NUMBER = { 1192 },
    AUTHOR_KEYWORDS = { bioinformatics tool; candidate SNP; linkage disequilibrium; SNP annotation; SNP chip analyses; variant call format (VCF) },
    DOCUMENT_TYPE = { Article },
    DOI = { 10.3389/fgene.2019.01192 },
    SOURCE = { Scopus },
    URL = { https://www.scopus.com/inward/record.uri?eid=2-s2.0-85076977120&doi=10.3389%2ffgene.2019.01192&partnerID=40&md5=f43af6cb63474531ac1eb545d62a6fcc },
}
