Posted by: sumrandee | January 10, 2010

Structural bioinformatics practice

Assignment #4

Because the function of a protein is a direct consequence of its 3-D structure (shape), the following logical link was established:

Sequence >> Structure >> Function

It is now a central concept of bioinformatics devoted to molecular biology. As a consequence, an increasing proportion of the bioinformatics pie is now devoted to the development of tools to navigate between sequences and 3-D structures. (This specialized area is called structural bioinformatics.)

Please use the following unknown sequence (not shown) to explain this concept.

I. Finding the Open Reading Frame

1. Copy the DNA sequence of the unknown into Notepad.

The unknown sequence contains more than just the gene (coding sequence).
It also contains the human promoter, the poly-A tail, and the cap region.

2. To find the open reading frames, go to http://www.ncbi.nlm.nih.gov/projects/gorf/

3. Paste the DNA sequence of the unknown into the query box and then click “orfFind”.


4. We now get 6 open reading frames.

 

5. Choose the longest open reading frame (Frame +1), which is most likely the
correct frame. The window shows Frame +1 with the DNA sequence starting
with ATG (methionine). The ORF spans positions 142-5733
(a total of 5,592 bases) and encodes 1,863 amino acids.
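The choice of the longest ORF in step 5 can be sketched in Python. This is only a minimal illustration and not how ORF Finder itself is implemented: the names `reverse_complement` and `longest_orf` are hypothetical, only ATG starts and the three standard stop codons are considered, and positions are 1-based and include the stop codon, matching the 142-5733 convention above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def longest_orf(dna):
    """Scan all six frames; return (frame, start, end, n_amino_acids).

    start/end are 1-based positions on the scanned strand and include
    the stop codon. Frames +1..+3 are the given strand, -1..-3 its
    reverse complement. Returns None if no complete ORF is found.
    """
    dna = dna.upper()
    best = None
    for sign, strand in ((1, dna), (-1, reverse_complement(dna))):
        for offset in range(3):
            start = None  # 0-based index of the first ATG since the last stop
            for i in range(offset, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if best is None or length > best[2] - best[1] + 1:
                        best = (sign * (offset + 1), start + 1, i + 3,
                                length // 3 - 1)
                    start = None
    return best

# The arithmetic reported in step 5: positions 142-5733 inclusive.
n_bases = 5733 - 142 + 1      # 5592 bases
n_codons = n_bases // 3       # 1864 codons, the last one being the stop
print(n_bases, n_codons - 1)  # 5592 1863
```

The last lines show why 5,592 bases correspond to 1,863 amino acids: the final codon is the stop codon and is not translated.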


6. Check that the chosen frame is correct.
We want to know whether our unknown protein sequence is new or already in Entrez, so we use blastp to compare the sequence against the pdb database.
  • Use blastp (search a protein database using a protein query).
  • Click “BLAST”.
  • Then click “View report”.

 

6.2 Frame +1 matches significantly (alignment score, E value = 0.0)
with ref|NP_009225.1| breast cancer 1, early onset isoform 1 [Homo sapiens].

 

  • Identical proteins for accession no. NP_009225.1 are shown.

7. To get the sequence of the open reading frame (ORF) for the selected Frame +1:

  • Using Notepad, delete the sequence before the ATG (positions 1-141) and positions 5,734-7,108, which are not coding sequence.

8. Save the edited sequence as “unknown-edited sequence”.

  • This is the open reading frame (ORF), with a length of 5,592 bases.
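The trimming in steps 7 and 8 can also be done programmatically instead of by hand in Notepad. The sketch below assumes the sequence is held as a plain Python string; `extract_region` is a hypothetical helper name, and the only point it illustrates is the conversion from 1-based, inclusive positions to a Python slice.

```python
def extract_region(dna, start, end):
    """Return dna[start..end] using 1-based, inclusive coordinates."""
    if not 1 <= start <= end <= len(dna):
        raise ValueError("coordinates fall outside the sequence")
    return dna[start - 1:end]

# Toy example: keep positions 3-11 of a short made-up sequence.
toy = "GGATGAAATAACC"
orf = extract_region(toy, 3, 11)
print(orf, len(orf))  # ATGAAATAA 9
```

For the unknown in this exercise, the equivalent call would keep positions 142 to 5733 of the 7,108-base sequence, which gives 5,592 bases.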

II. Protein Translation

9. Translate the unknown-edited sequence into an amino acid sequence using the Translate tool at http://www.expasy.org/tools/dna.html.

9.1 Amino acid sequence (1,863 residues).
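What the Translate tool does for a single frame can be sketched as follows. This is a minimal illustration using the standard genetic code, not the ExPASy implementation; the function name `translate` and the choice of 'X' for unrecognized codons are assumptions made for this example, and the input is a made-up toy ORF.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order, '*' marks a stop.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(orf):
    """Translate an ORF (starting at ATG) until the first stop codon."""
    orf = orf.upper().replace("U", "T")
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE.get(orf[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("ATGGATTTATCTGCTTAA"))  # MDLSA
```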

 

 
III. Predicting post-translational modification (PTM) from the protein sequence

10. As we know (from blastp), the 1,863-amino-acid sequence of the unknown is identical to the protein encoded by the human breast cancer 1 gene (BRCA1). Glycosylation is therefore assumed to occur as it does in human cells.

  • How to predict glycosylation is shown.

10.1 Asn-Xaa-Ser/Thr sequons in the sequence output below are highlighted in blue. Asparagines predicted to be N-glycosylated are highlighted in red.
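Listing the Asn-Xaa-Ser/Thr sequons (the ones highlighted in blue) is a simple pattern search, sketched below. It assumes the usual definition in which Xaa is any residue except proline. The helper name `find_sequons` is hypothetical, and the sketch does not reproduce the prediction step that decides which asparagines are actually glycosylated (the ones in red): a sequon is necessary but not sufficient.

```python
import re

# N, then any residue except proline, then S or T. The lookahead lets
# overlapping sequons (e.g. "NNSS") all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(protein):
    """Return (position, tripeptide) pairs; positions are 1-based."""
    protein = protein.upper()
    return [(m.start() + 1, protein[m.start():m.start() + 3])
            for m in SEQUON.finditer(protein)]

print(find_sequons("MNGTANPSQNVSK"))  # [(2, 'NGT'), (10, 'NVS')]
```

In the toy sequence, the asparagine at position 6 is skipped because it is followed by proline.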

                                           

11. Finding the subcellular localization of the protein

  • The protein is predicted to be a plasma membrane protein.

 
12. Predicting the presence and location of signal peptide cleavage sites in amino acid sequences

12.1 The result of program analysis

13. Prediction of transmembrane helices in proteins

 

13.1 The result of program analysis

 

Conclusion for the prediction of post-translational modification.

14. Prediction of protein secondary structure using Markov chains in the PSSFinder program.

http://linux1.softberry.com/berry.phtml?topic=pps&group=programs&subgroup=propt

15. CPHmodels 3.0 is a protein homology modeling server. The template recognition is based on profile-profile alignment guided by secondary structure and exposure prediction.

  • The result of program analysis.

 

Finding protein domains by comparison with reference proteins in the database

  • Go to the website http://swissmodel.expasy.org
  • Put the protein sequence of interest in the query box.

The result of the program analysis will appear as in the window below.

  • The screenshot shows the protein sequence with the significant alignments and domains.

    For example, BRCT domains and zinc finger, RING-type domains.

 

 

Searching for similarity between the unknown protein and the Protein Data Bank (PDB) database

  • To find conserved domains along the protein chain and structures.

1. Go to http://www.ncbi.nlm.nih.gov/Structure/cblast/cblast.cgi?

Algorithm used: blastp

  • Enter the query sequence in the query box.

  • Significant alignments were produced.

  • Then click on the first BLAST hit, which has a high alignment score and a low E value:

    pdb|1JNX|X  Chain X, Crystal Structure Of The Brct Repeat Regi…  468    1e-131 Related structures. (S stands for structure of protein.)

    - Chain X, Crystal Structure Of The Brct Repeat Region From The Breast Cancer Associated Protein, Brca1

Description: Structure Of The Brct Repeats Of Brca1 Bound To A Ctip Phosphopeptide.
Taxonomy:
Chain A: Homo sapiens
  • This window shows the 3D structure of the BRCT repeat region from the breast cancer associated protein, BRCA1.

  • Putative conserved domains have also been detected.

  •  List of conserved domains.

 

  • Details of some conserved domains, including structure and function, from the local query sequence are shown in the window below.

    For example, cd00162, RING, RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the ‘cross-brace’ motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H).
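The cross-brace motif quoted above can be transcribed directly into a regular expression, which gives a quick way to look for RING-finger candidates in a sequence. This is only a sketch: the helper name `find_ring_candidate` is hypothetical, the test sequence is synthetic rather than real BRCA1 data, and a plain pattern match is far cruder than the profile-based search that the conserved domain database performs.

```python
import re

# Direct transcription of the cross-brace motif quoted above;
# X(a-b) becomes .{a,b} and (N/C/H) becomes [NCH].
RING_MOTIF = re.compile(
    r"C.{2}C.{9,39}C.{1,3}H.{2,3}[NCH].{2}C.{4,48}C.{2}C"
)

def find_ring_candidate(protein):
    """Return (start, end) of the first motif match, 1-based, or None."""
    m = RING_MOTIF.search(protein.upper())
    return (m.start() + 1, m.end()) if m else None

# Synthetic sequence built to satisfy the pattern with minimal spacers.
synthetic = "CAAC" + "A" * 9 + "CAHAACAAC" + "AAAA" + "CAAC"
print(find_ring_candidate(synthetic))  # (1, 30)
```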

  • 3D view of the structure of the RING finger using the Cn3D 3-D structure viewer software.

 

