Preparation and Curation of Multiyear, Multilocation, Multitrait Datasets
Genome-wide association studies (GWAS) are a powerful approach to dissect genotype-phenotype associations and identify causative regions. However, this power depends heavily on the accuracy of the phenotypic data. To obtain accurate phenotypic values, phenotyping should be conducted in multienvironment trials (METs). To avoid technical errors, sufficient time must be spent exploring, understanding, curating, and adjusting the phenotypic data from each trial before combining the trials with an appropriate linear mixed model (LMM). The LMM is chosen to minimize, as far as possible, any effect that could lead to misestimation of the phenotypic values. This chapter explains a series of important steps for exploring and analyzing data from METs used to characterize an association panel. Two datasets are used to illustrate two different scenarios.