Please use this identifier to cite or link to this item: http://hdl.handle.net/1946/30492
The objective of this thesis is to compare different methods for performing genome-wide association studies (GWAS) on censored data. The response variable in censored data is the time until either an event or censoring occurs. Such outcomes are very common in medicine. Traditionally, survival models such as the Cox model are used to analyze censored data. However, such models are rarely used in GWAS.
The data in this study are simulated from a Cox model in which the baseline hazard follows a Weibull distribution. Three main parameters were varied in the simulation: the censoring rate, the frequency of each genetic variant, and the hazard ratio of the corresponding variant. Every dataset consists of survival times, background variables, and 1,000 genetic variants, of which only a small fraction affect survival.
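A minimal sketch of such a simulation for a single variant, using inverse-transform sampling: with baseline cumulative hazard H0(t) = scale * t^shape, solving S(t | g) = U for t gives the event time. The function name, the parameter values, the additive Binomial(2, MAF) genotype coding, and the exponential censoring mechanism are illustrative assumptions, not the settings of the thesis, and background variables are left out for brevity.

```python
import math
import random


def simulate_cox_weibull(n, maf, hazard_ratio, shape=1.5, scale=0.1,
                         censor_rate=0.05, seed=0):
    """Simulate (times, events, genotypes) under a Cox model, Weibull baseline.

    Baseline cumulative hazard: H0(t) = scale * t**shape.
    Genotype g in {0, 1, 2} is drawn as Binomial(2, maf) (assumed coding).
    Censoring times are exponential with rate censor_rate (assumed mechanism).
    """
    rng = random.Random(seed)
    beta = math.log(hazard_ratio)  # log hazard ratio per allele copy
    times, events, genotypes = [], [], []
    for _ in range(n):
        g = int(rng.random() < maf) + int(rng.random() < maf)
        # Inverse transform: S(t | g) = exp(-H0(t) * exp(beta * g)) = u
        u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is finite
        t_event = (-math.log(u) / (scale * math.exp(beta * g))) ** (1.0 / shape)
        t_censor = rng.expovariate(censor_rate)
        times.append(min(t_event, t_censor))
        events.append(1 if t_event <= t_censor else 0)
        genotypes.append(g)
    return times, events, genotypes


times, events, genotypes = simulate_cox_weibull(n=500, maf=0.3, hazard_ratio=1.5)
```

Raising `censor_rate` increases the share of censored observations, which is one simple way to vary the censoring rate across simulated datasets.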
One-at-a-time testing methods were applied for different models, and both power and false discovery rate were estimated. Residuals from a Cox model with only non-genetic variables were used as quantitative traits. In addition, the LASSO method was used in combination with the Cox model. The results for Cox-LASSO, Cox, martingale residual, deviance residual, rank normal transformed martingale residual, rank transformed deviance residual, logistic regression and Poisson regression were, on average, (95%, 87%, 85%, 84%, 81%, 80%, 35%, 24%) for power and (85%, 9.5%, 7.5%, 6.5%, 5.9%, 5.8%, 8.2%, 0.1%) for false discovery rate, respectively. The loss in power is only 2%-4% when using residuals instead of the Cox model, while the false discovery rate decreases by 2%.
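The residuals used as quantitative traits can be sketched as follows. This is a simplified illustration, not the thesis implementation: it assumes a null model with no covariates at all, in which case the estimated cumulative hazard reduces to the Nelson-Aalen estimator. With background variables, the cumulative hazard would instead come from a fitted Cox model. The martingale residual is the event indicator minus the cumulative hazard at the subject's own time, and the deviance residual is sign(M) * sqrt(-2 * (M + delta * log(delta - M))). All function names are hypothetical.

```python
import math


def nelson_aalen_at_own_times(times, events):
    """Nelson-Aalen cumulative hazard evaluated at each subject's own time."""
    cum_hazard = {}
    total = 0.0
    for s in sorted(set(times)):
        at_risk = sum(1 for t in times if t >= s)
        n_events = sum(e for t, e in zip(times, events) if t == s)
        total += n_events / at_risk
        cum_hazard[s] = total
    return [cum_hazard[t] for t in times]


def martingale_residuals(times, events):
    """M_i = delta_i - H(t_i), here with no covariates in the null model."""
    hazards = nelson_aalen_at_own_times(times, events)
    return [e - h for e, h in zip(events, hazards)]


def deviance_residuals(mart, events):
    """d_i = sign(M_i) * sqrt(-2 * (M_i + delta_i * log(delta_i - M_i)))."""
    out = []
    for m, e in zip(mart, events):
        # The log term only contributes for observed events (delta_i = 1).
        log_term = math.log(e - m) if e == 1 else 0.0
        inner = max(-2.0 * (m + e * log_term), 0.0)  # guard against round-off
        out.append(math.copysign(math.sqrt(inner), m))
    return out


toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 0, 1, 1, 0]
toy_mart = martingale_residuals(toy_times, toy_events)
toy_dev = deviance_residuals(toy_mart, toy_events)
```

Each residual vector can then be regressed on one genetic variant at a time with ordinary quantitative-trait GWAS software.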
Rank normal transformed residuals can be used as derived quantitative traits in GWAS, and they are competitive with the Cox model in every respect. This method does not require complex software implementation, as it uses already available methods that perform GWAS on quantitative traits. Some of these methods are very sophisticated and could provide even better results. When censoring is heavy, using residuals from survival models instead of logistic regression in GWAS offers a large advantage. In this way, power can be increased at little added computational cost.
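A rank normal transformation can be sketched in a few lines: each residual is replaced by the standard normal quantile of its rank. The function name and the plotting position (rank - 0.5) / n are assumptions chosen for illustration, and the offset used in the thesis may differ. Ties receive their average rank.

```python
from statistics import NormalDist


def rank_normal_transform(values, offset=0.5):
    """Map values to standard normal quantiles of their (average) ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]


z = rank_normal_transform([0.8, -0.45, 0.55, 0.05, -0.95])
```

The transformed values keep the ordering of the residuals but follow an approximately normal distribution, which matches the assumptions of standard linear-regression GWAS tools.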