Abstract
An important approach to reducing missing heritability and enhancing the success of genome-wide association studies (GWAS) for complex diseases is the identification of traits that are highly heritable and homogeneous in their etiology. Many approaches have been proposed to define such traits based on either cluster analysis or pedigree-based heritable component analysis. None of the existing methods, however, exploits the dense genome-wide genotypic data that are now readily available from GWAS, and exome and whole-genome sequencing will make even more data available in the future. Moreover, because a phenotype can vary with respect to a covariate, such as age or race, the fixed effect due to the covariates may lead to a spuriously elevated estimate of heritability. Existing heritable component analysis methods have not considered covariate effects. We propose an optimization approach to identify composite traits with high heritability as a function of multiple phenotypic variables, where heritability is estimated from genome-wide single nucleotide polymorphisms (SNPs). Our approach can model the covariate effects within the heritability analysis. The proposed optimization problem can be efficiently solved by a sequential quadratic programming algorithm. A case study demonstrates the effectiveness of the proposed approach for finding composite traits with high SNP-based heritability.
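To make the idea of a composite trait with maximal heritability concrete, the following is a minimal sketch, not the method of this paper. It assumes that a genetic covariance matrix G and a phenotypic covariance matrix P of the phenotypic variables are already known, so that the heritability of a weighted combination w is the ratio (w'Gw)/(w'Pw). In the approach summarized above, heritability is instead estimated from genome-wide SNPs with covariate adjustment, which this sketch does not attempt. The function name max_heritability_weights and the toy matrices are hypothetical, and SciPy's SLSQP solver stands in for a sequential quadratic programming algorithm.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.optimize import minimize


def max_heritability_weights(G, P, w0=None):
    """Maximize h2(w) = (w'Gw) / (w'Pw) subject to w'w = 1 with SLSQP.

    G and P are ASSUMED known here; they are not estimated from SNP data.
    """
    k = G.shape[0]
    if w0 is None:
        w0 = np.ones(k) / np.sqrt(k)  # generic unit-norm starting point

    def neg_h2(w):
        # Negative heritability ratio, because `minimize` minimizes.
        return -(w @ G @ w) / (w @ P @ w)

    def neg_h2_grad(w):
        num = w @ G @ w
        den = w @ P @ w
        return -(2.0 * (G @ w) * den - 2.0 * (P @ w) * num) / den**2

    # The ratio is scale invariant, so fix the scale of w with a unit-norm constraint.
    unit_norm = {"type": "eq", "fun": lambda w: w @ w - 1.0, "jac": lambda w: 2.0 * w}
    res = minimize(neg_h2, w0, jac=neg_h2_grad, constraints=[unit_norm], method="SLSQP")
    return res.x, -res.fun


# Toy (made-up) covariance matrices for three phenotypic variables.
G = np.array([[0.60, 0.20, 0.10],
              [0.20, 0.30, 0.05],
              [0.10, 0.05, 0.20]])   # hypothetical genetic covariance
E = np.diag([0.40, 0.70, 0.80])      # hypothetical residual covariance
P = G + E                            # phenotypic covariance

w, h2 = max_heritability_weights(G, P)

# Cross-check: for this simplified ratio, the optimum equals the largest
# generalized eigenvalue of the pair (G, P).
top_eig = eigh(G, P, eigvals_only=True)[-1]
print("weights:", w)
print("heritability of composite trait:", h2)
print("largest generalized eigenvalue:", top_eig)
```

In this simplified setting a generalized eigendecomposition already gives the answer, so the iterative solver is shown only to illustrate the constrained-optimization formulation. An iterative method becomes relevant when the heritability estimate is not a simple ratio of two known quadratic forms.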