Please use this identifier to cite or link to this item: http://hdl.handle.net/1946/7483
Single Nucleotide Polymorphisms (SNPs) are the most common form of variations in the human genome. Collection of SNP variants on a single chromosome copy are called haplotypes. Humans are diploid organisms, implying that they possess two nearly identical copies of each chromosome, and therefore the haplotypes come in pairs. Conflated (mixed) data from the two haplotypes is called a genotype of an individual. Minimum parsimony haplotyping (MPH) is an abstraction of haplotype finding problem arising in genetics, which tries to find the minimum set of haplotypes needed to explain a given set of genotypes. The MPH problem is known to be NP-Hard, meaning that finding computationally efficient solutions is unlikely in the general case. Here, we give novel efficient algorithms for sub-instances of the problem. In addition, a practical heuristic for MPH are implemented, solving problem instances for the general case. Experiments are done on real genotype data from the HapMap project  and heuristics developed from these experiments are used to speed up the implementation. These improvements result in a algorithm that solves MPH several times faster than previously described methods.
|MSc_Steinunn-Maria-Stefansdottir.pdf||433.67 kB||Open||Complete Text||View/Open|