Posted by: sumrandee | November 29, 2009

Assignment 2: Haplotype analysis with Haploview

Haplotype analysis using Haploview

I. The Java Runtime Environment (JRE) v1.4 or later is required to run the Haploview program.

1. Download the Java Runtime Environment (JRE) from the site below and install it:

            http://java.sun.com

II. Haploview Downloads

1. Search for Haploview using Google.

2. Open the Haploview webpage.

3. Download the Haploview Windows installer (HapInstall.exe) from that page.

4. Install Haploview by double-clicking the installer file. The installer creates a Haploview folder in the Start Menu.

     - To run the program, click the “Haploview.jar” file in that folder.

5. The Haploview welcome page appears.

 

Question 1. What is the name of the Haploview format used in this analysis?

The format assigned for this haplotype analysis is the “HapMap Project data dump format”. Files in this format have several header lines beginning with “#”.
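As a rough illustration of what header lines beginning with “#” mean in practice, the Python sketch below separates the “#” lines from the data rows of a pasted text. The function name `split_header_and_rows` and the sample text are made up for this example, and the column layout is simplified; it is not the full HapMap specification.

```python
def split_header_and_rows(text):
    """Separate '#' header lines from whitespace-delimited data rows."""
    headers, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines left over from copy and paste
        if line.startswith("#"):
            headers.append(line)
        else:
            rows.append(line.split())
    return headers, rows


# Hypothetical sample, NOT the assignment data.
sample = """
# made-up header line 1
# made-up header line 2
rs0000001 A/G chrX 1000 + AA AG GG
rs0000002 C/T chrX 2000 + CC CT NN
"""

headers, rows = split_header_and_rows(sample)
print(len(headers), "header lines,", len(rows), "marker rows")
```

A quick check like this can confirm that the file saved in step 6 is still plain text with one marker per row.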

6. Prepare the file from the assignment as input for Haploview.

     6.1 Copy the SNP data from the assignment.

     6.2 Paste it into MS Word.

     6.3 Save the file as a ‘plain text file’ (file name: SNP data.txt).

     6.4 The result is a plain text file.

7. Open the Haploview program.

8. Choose the HapMap format and browse to the saved file ‘SNP data.txt’.

    Then click ‘OK’.

9. Set the HW p-value cutoff to 0.05, then click ‘Rescore Markers’.

    - The following screen appears:

Question 2. Please show the marker and individual quality control of the genotype data used in the analysis.

   As the screenshot above shows, after a file is loaded Haploview displays basic data quality checks for the markers.

The terms used are described as follows:

  • # is the marker number.
  • Name is the marker ID specified (only if an info file is loaded).
  • Position is the marker position specified (only if an info file is loaded).
  • ObsHET is the marker’s observed heterozygosity.
  • PredHET is the marker’s predicted heterozygosity (i.e. 2*MAF*(1-MAF)).
  • HWpval is the Hardy-Weinberg equilibrium p value, which is the probability that its deviation from H-W equilibrium could be explained by chance.
  • %Geno is the percentage of non-missing genotypes for this marker.
  • FamTrio is the number of fully genotyped family trios for this marker (0 for datasets with unrelated individuals).
  • MendErr is the number of observed Mendelian inheritance errors (0 for datasets with unrelated individuals).
  • MAF is the minor allele frequency (using founders only) for this marker.
  • Alleles are the major and minor alleles for this marker.
  • Rating is checked if the marker passes all the tests and unchecked if it fails one or more tests (highlighted in red).
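
To make the columns above concrete, here is a small Python sketch that computes ObsHET, PredHET, MAF, %Geno and a Hardy-Weinberg p-value from genotype counts for one marker. The function name `marker_qc` and the counts are hypothetical. The p-value here comes from a simple 1-degree-of-freedom chi-square test, which is an assumption made for illustration; Haploview's own calculation may differ (for example, an exact test), so the numbers need not match the program's output.

```python
import math


def marker_qc(n_aa, n_ab, n_bb, n_missing=0, hw_cutoff=0.05):
    """Summarise one biallelic marker from genotype counts (unrelated individuals)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no called genotypes")
    freq_a = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    maf = min(freq_a, 1 - freq_a)          # minor allele frequency
    obs_het = n_ab / n                     # observed heterozygosity
    pred_het = 2 * maf * (1 - maf)         # predicted heterozygosity
    # Genotype counts expected under Hardy-Weinberg equilibrium.
    expected = (n * freq_a ** 2,
                2 * n * freq_a * (1 - freq_a),
                n * (1 - freq_a) ** 2)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    hw_p = math.erfc(math.sqrt(chi2 / 2))
    return {
        "ObsHET": obs_het,
        "PredHET": pred_het,
        "HWpval": hw_p,
        "%Geno": 100 * n / (n + n_missing),
        "MAF": maf,
        "passes_hw": hw_p >= hw_cutoff,
    }


# Hypothetical counts: far too few heterozygotes for the allele frequency.
print(marker_qc(40, 20, 40))
```

With the 0.05 cutoff set in step 9, a marker like the hypothetical one above would fail the Hardy-Weinberg check, while counts of 25, 50 and 25 would pass.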

10. Click ‘LD Plot’ on the menu bar to show the LD map.

    Question 3. Please show the LD map, then explain what you learn from it.

    • Haploview calculates several pairwise measures of LD, which it uses to create a graphical representation, as shown in the screenshot above.
    • Haploview offers a number of different color schemes to represent the LD relationship.
    • It generates haplotypes and their population frequencies. The haplotype display shows lines indicating the transition from one block to the next, with line thickness corresponding to frequency.
    • The display also presents Hedrick's multiallelic D', which represents the degree of LD between 2 blocks, treating each haplotype within a block as an allele of that region.

    The LD maps above use the ‘Standard (D'/LOD)’ color scheme.

    Where:

  • D' is the value of D prime between the two loci.
  • LOD is the log of the likelihood odds ratio, a measure of confidence in the value of D'.

    Question 4. How many haplotype blocks are in this region of Chromosome X, and how should they be interpreted?

    There are 3 haplotype blocks in this region of Chromosome X. The values describing the relationship between the markers of each block are shown in the white box.

    • The two most common pairwise measures of LD are D' and r².
    • D' is defined to be 1 in the absence of obligate recombination, declining only due to recombination or recurrent mutation.
    • r² is the squared correlation coefficient between the two SNPs. Thus, r² is 1 when two SNPs arose on the same branch of the genealogy and remain undisrupted by recombination, but is less than 1 when the SNPs arose on different branches, or when an initially strong correlation has been disrupted by crossing over.
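
The difference between the two measures is easier to see with numbers. The Python sketch below computes D, D' and r² from the four two-locus haplotype frequencies using the standard textbook definitions. The function name `ld_measures` and all frequencies are hypothetical and are not taken from the assignment data; Haploview estimates haplotype frequencies from genotypes before computing these values, which this sketch does not attempt.

```python
def ld_measures(f11, f12, f21, f22):
    """Return (D, D', r2) from two-locus haplotype frequencies.

    f11: allele 1 at both loci, f12: allele 1 at locus A with allele 2 at
    locus B, f21: allele 2 at A with allele 1 at B, f22: allele 2 at both.
    """
    total = f11 + f12 + f21 + f22
    if total <= 0:
        raise ValueError("frequencies must sum to a positive number")
    f11, f12, f21, f22 = (f / total for f in (f11, f12, f21, f22))
    p = f11 + f12          # frequency of allele 1 at locus A
    q = f11 + f21          # frequency of allele 1 at locus B
    denom = p * (1 - p) * q * (1 - q)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = f11 - p * q        # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p * (1 - q), (1 - p) * q)
    else:
        d_max = min(p * q, (1 - p) * (1 - q))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / denom
    return d, d_prime, r2


# Only three of the four possible haplotypes are present: D' is 1, r2 is below 1.
print(ld_measures(0.5, 0.0, 0.2, 0.3))
# Only two haplotypes are present: both D' and r2 are 1.
print(ld_measures(0.7, 0.0, 0.0, 0.3))
```

The first call shows the situation described above: when one of the four haplotypes is absent, D' stays at 1 even though the two SNPs are not perfectly correlated.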

     

    • Block 1 comprises marker number 8 (rs908005) and marker number 9 (rs979484).

    • Block 2 comprises markers 13-17.

      For instance, the figure in the white box shows only the correlation between markers 13 and 17 of the haplotype block.

    • Block 3 comprises markers 24-29.

    For instance, the figure in the white box shows only the correlation between markers 24 and 27 of the haplotype block.

    Where:

  • D' is the value of D prime between the two loci.
  • LOD is the log of the likelihood odds ratio, a measure of confidence in the value of D'.
  • r² is the squared correlation coefficient between the two loci.

    Question 5. Can you find the tagging SNPs in each haplotype block, and explain what tagging SNPs are?

    A tag SNP is a representative single nucleotide polymorphism (SNP) in a region of the genome with high linkage disequilibrium (LD).

    •   To find the tagging SNPs in each haplotype block:

    - In the Display menu, choose ‘Show tags in blocks’.

    There are 3 haplotype blocks, each containing 2 tagging SNPs:

    •       Block 1 has 2 tagging SNPs: markers 8 and 9.

                        - The frequency of GA was 33.3%.

                        - The frequency of AT was 64.4%.

                        - The frequency of GT was 2.2%.

    •       Block 2 has 2 tagging SNPs: markers 13 and 15.

                        - The frequency of TT was 48.9%.

                        - The frequency of GT was 27.8%.

                        - The frequency of TG was 22.3%.

    •       Block 3 has 2 tagging SNPs: markers 24 and 27.

                        - The frequency of CA was 73.3%.

                        - The frequency of GG was 25.6%.

                        - The frequency of CG was 1.1%.
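
One way to think about tagging within a block is as the smallest set of markers whose alleles still tell all the block's haplotypes apart. The Python sketch below does this by brute force. It illustrates the idea only and is not necessarily how Haploview chooses its tags; the function name `minimal_tag_sets` and the four-marker haplotypes are hypothetical, while the Block 1 haplotypes (GA, AT, GT) are the ones listed above.

```python
from itertools import combinations


def minimal_tag_sets(haplotypes):
    """Return the smallest sets of marker positions that distinguish all haplotypes."""
    n_markers = len(haplotypes[0])
    if any(len(h) != n_markers for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")
    n_distinct = len(set(haplotypes))
    for size in range(1, n_markers + 1):
        hits = []
        for positions in combinations(range(n_markers), size):
            # Alleles seen at the chosen positions, one tuple per haplotype.
            patterns = {tuple(h[i] for i in positions) for h in haplotypes}
            if len(patterns) == n_distinct:
                hits.append(positions)
        if hits:
            return hits
    return []


# Block 1 (markers 8 and 9): neither marker alone separates three haplotypes.
print(minimal_tag_sets(["GA", "AT", "GT"]))

# Hypothetical four-marker block: several pairs of markers work equally well.
print(minimal_tag_sets(["ACGT", "ATGA", "GTAA"]))
```

For Block 1 the only answer is both positions, which agrees with markers 8 and 9 both being shown as tags.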

