Linear regression models for GWAS can be written
as follows:
$$Y \sim W\alpha + X_s\beta_s + g + e \qquad (1)$$
$$g \sim N(0, \sigma^2_A \psi) \qquad (2)$$
$$e \sim N(0, \sigma^2_e I) \qquad (3)$$
where $Y$ is a vector of phenotype values, one per individual,
$W$ is a matrix of covariates including an intercept term,
$\alpha$ is the corresponding vector of effect sizes, $X_s$ is a
vector of genotype values for all individuals at SNP $s$,
$\beta_s$ is the corresponding fixed effect size of genetic
variant $s$ (also known as the SNP effect size), $g$ is a random
effect that captures the polygenic effect of other SNPs, $e$ is
a random effect of residual errors, $\sigma^2_A$ measures the
additive genetic variation of the phenotype, $\psi$ is the
standard genetic relationship matrix, $\sigma^2_e$ measures
residual variance and $I$ is an identity matrix. In logistic
regression models, a logit link function is used for binomially
distributed case-control phenotypes, so that the model describes
the log-odds of the outcome.
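
To make the linear model concrete, the following is a minimal sketch in Python (NumPy only) that simulates a phenotype from equations (1) to (3) and then estimates the SNP effect size by generalized least squares. It relies on the fact that the two random effects together imply a phenotype covariance of $\sigma^2_A \psi + \sigma^2_e I$. Several things here are assumptions made for illustration and are not taken from the text: the function name `simulate_and_fit`, all parameter values, the construction of the genetic relationship matrix from standardized genotypes, and above all the treatment of the variance components as known. GWAS software typically estimates them from the data (for example by restricted maximum likelihood), which this sketch does not attempt.

```python
import numpy as np


def simulate_and_fit(n=200, m=500, beta_s=0.5, sigma2_a=0.6, sigma2_e=0.4, seed=0):
    """Simulate Y = W a + X_s b_s + g + e and estimate b_s by GLS.

    All names and default values are hypothetical, chosen for illustration.
    The variance components are treated as known, which is an assumption.
    """
    rng = np.random.default_rng(seed)

    # Background genotypes (allele counts 0/1/2) used to build the
    # genetic relationship matrix (GRM).
    freqs = rng.uniform(0.1, 0.9, size=m)
    G = rng.binomial(2, freqs, size=(n, m)).astype(float)
    keep = G.std(axis=0) > 0  # drop monomorphic SNPs before standardizing
    G = G[:, keep]
    Z = (G - G.mean(axis=0)) / G.std(axis=0)
    m_eff = Z.shape[1]
    psi = Z @ Z.T / m_eff  # one common way to form a GRM (assumed here)

    # Covariates W: intercept plus one continuous covariate.
    W = np.column_stack([np.ones(n), rng.standard_normal(n)])
    alpha = np.array([1.0, 0.3])

    # Genotypes at the tested SNP s.
    x_s = rng.binomial(2, 0.3, size=n).astype(float)

    # g ~ N(0, sigma2_a * psi): Z u / sqrt(m_eff) has covariance psi
    # when u ~ N(0, I).
    u = rng.standard_normal(m_eff)
    g = np.sqrt(sigma2_a) * (Z @ u) / np.sqrt(m_eff)

    # e ~ N(0, sigma2_e * I).
    e = np.sqrt(sigma2_e) * rng.standard_normal(n)

    y = W @ alpha + x_s * beta_s + g + e

    # Marginal covariance of Y implied by the two random effects.
    V = sigma2_a * psi + sigma2_e * np.eye(n)

    # GLS estimate of the fixed effects (alpha, beta_s).
    D = np.column_stack([W, x_s])
    Vinv_D = np.linalg.solve(V, D)
    info = D.T @ Vinv_D                      # D' V^{-1} D
    coef = np.linalg.solve(info, Vinv_D.T @ y)
    se = np.sqrt(np.diag(np.linalg.inv(info)))

    # The last coefficient corresponds to the tested SNP.
    return coef[-1], se[-1]


beta_hat, se = simulate_and_fit()
print(f"estimated SNP effect: {beta_hat:.3f} (SE {se:.3f})")
```

Repeating the final solve for each SNP in turn, with $W$ and the covariance held fixed, gives the usual one-SNP-at-a-time scan. A case-control analysis would instead replace the Gaussian likelihood with a logistic one, which is outside the scope of this sketch.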