Commit 031e032

committed
Improving descriptions of alignment algorithms
1 parent c78ff25 commit 031e032

File tree

6 files changed

+51
-5
lines changed

structure/alignment.md

Lines changed: 51 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -44,20 +44,66 @@ The functionality to perform and visualize these alignments can of course be use
4444

4545
### Combinatorial Extension (CE)
4646

47-
The Combinatorial Extension (CE) algorithm was originally developed by [Shindyalov and Bourne in 1998](http://peds.oxfordjournals.org/content/11/9/739.short).
47+
The Combinatorial Extension (CE) algorithm was originally developed by
48+
[Shindyalov and Bourne in
49+
1998](http://peds.oxfordjournals.org/content/11/9/739.short).
50+
It works by identifying segments of the two proteins with similar local
51+
structure, and then combining those to try to align the most residues possible
52+
while keeping the overall RMSD of the superposition low.
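The RMSD mentioned above is the root-mean-square distance between paired atoms after superposition. The following is a minimal sketch of that measure only, not of CE itself. It assumes the two coordinate sets are already superposed and paired residue by residue; the class and method names (`RmsdSketch`, `rmsd`) are hypothetical and are not part of the BioJava API.

```java
final class RmsdSketch {

    /**
     * Root-mean-square deviation between two equally sized sets of 3D points.
     * Assumption: a[i] and b[i] are the paired (aligned) positions, and the
     * two sets have already been superposed.
     */
    static double rmsd(double[][] a, double[][] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("need two non-empty sets of equal size");
        }
        double sumSquared = 0.0;
        for (int i = 0; i < a.length; i++) {
            for (int k = 0; k < 3; k++) {
                double d = a[i][k] - b[i][k];
                sumSquared += d * d;
            }
        }
        return Math.sqrt(sumSquared / a.length);
    }

    public static void main(String[] args) {
        // Three paired positions, each displaced by one unit along z.
        double[][] a = {{0, 0, 0}, {1, 0, 0}, {2, 0, 0}};
        double[][] b = {{0, 0, 1}, {1, 0, 1}, {2, 0, 1}};
        System.out.println(rmsd(a, b));
    }
}
```

Adding a poorly fitting residue pair to the two arrays raises the returned value, which is the trade-off between alignment length and RMSD described above.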
53+
54+
CE is a rigid-body alignment algorithm, which means that the structures being
55+
compared are kept fixed during superposition. In some cases it may be desirable
56+
to break large proteins up into domains prior to aligning them (by manually
57+
inputting a subrange, using the [SCOP or CATH databases](externaldb.md), or by
58+
decomposing the protein automatically using the [Protein Domain
59+
Parser](http://www.biojava.org/docs/api/org/biojava/bio/structure/domain/LocalProteinDomainParser.html)
60+
algorithm).
4861

4962
### Combinatorial Extension with Circular Permutation (CE-CP)
5063

51-
This is a new variation of CE that can detect circular permutations in proteins. @sbliven & @andreasprlic , unpublished
64+
CE and FATCAT both assume that aligned residues occur in the same order in both
65+
proteins (i.e. they are both *sequence-order dependent* algorithms). In proteins
66+
related by a circular permutation, the N-terminal part of one protein is related
67+
to the C-terminal part of the other, and vice versa. CE-CP allows circularly
68+
permuted proteins to be compared. For more information on circular
69+
permutations, see the
70+
[Wikipedia](http://en.wikipedia.org/wiki/Circular_permutation_in_proteins) or
71+
[Molecule of the
72+
Month](http://www.pdb.org/pdb/101/motm.do?momID=124&evtc=Suggest&evta=Moleculeof%20the%20Month&evtl=TopBar)
73+
articles.
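As a toy illustration of the relationship described above, the sketch below treats each protein as a string of residue labels and checks whether one is a circular permutation of the other by searching for it in a doubled copy of the first. This is only an analogy on exact strings under simplifying assumptions (equal length, identical residues); it is not the CE-CP implementation, and the names `CircularPermutationSketch` and `permutationPoint` are hypothetical.

```java
final class CircularPermutationSketch {

    /**
     * Returns the offset k such that b equals a rotated left by k positions,
     * or -1 if b is not a circular permutation of a.
     * Assumption: sequences are compared exactly, which real structure
     * alignment does not require.
     */
    static int permutationPoint(String a, String b) {
        if (a.length() != b.length()) {
            return -1;
        }
        // Every rotation of a occurs as a substring of a + a.
        return (a + a).indexOf(b);
    }

    public static void main(String[] args) {
        String first = "ABCDEFGH";
        String second = "EFGHABCD"; // N-terminal half of 'first' is now C-terminal
        System.out.println(permutationPoint(first, second));
    }
}
```

A sequence-order dependent method comparing `first` and `second` directly could match only one of the two halves in order, which is why an approach that tolerates the permutation is needed.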
74+
75+
76+
For proteins without a circular permutation, CE-CP results look very similar to
77+
CE results (with perhaps some minor differences and a slightly longer
78+
calculation time). If a circular permutation is found, the two halves of the
79+
proteins will be shown in different colors:
80+
81+
![Concanavalin A (yellow and orange) aligned with Pea Lectin (blue and cyan)](img/3cna.A_2pel.A_cecp.png)
82+
83+
CE-CP was developed by Spencer E. Bliven, Philip E. Bourne, and Andreas Prlić.
5284

5385
### FATCAT - rigid
5486

55-
This is a Java implementation of the original FATCAT algorithm. [Yuzhen Ye & Adam Godzik in 2003] (http://bioinformatics.oxfordjournals.org/content/19/suppl_2/ii246.abstract)
87+
This is a Java implementation of the original FATCAT algorithm by [Yuzhen Ye
88+
& Adam Godzik in
89+
2003](http://bioinformatics.oxfordjournals.org/content/19/suppl_2/ii246.abstract).
90+
It performs similarly to CE for most proteins. The 'rigid' flavor uses a
91+
rigid-body superposition and only considers alignments with matching sequence
92+
order.
5693

5794
### FATCAT - flexible
5895

59-
Just as FATCAT - rigid, a Java implementation of the original FATCAT algorithm.
96+
FATCAT-flexible introduces 'twists' between different parts of the proteins
97+
which are superimposed independently. This is ideal for proteins which undergo
98+
large conformational shifts, where a global superposition cannot capture the
99+
underlying similarity between domains. For instance, the structures of
100+
calmodulin with and without calcium bound can be much better aligned with
101+
FATCAT-flexible than with one of the rigid alignment algorithms. The downside of
102+
this is that it can lead to additional false positives in unrelated structures.
103+
104+
![(Left) Rigid and (Right) flexible alignments of
105+
calmodulin](img/1cfd_1cll_fatcat.png)
60106

61107
## Acknowledgements
62108

63-
Thanks to P. Bourne, Yuzhen Ye and A. Godzik for granting permission to freely use and redistribute their algorithms.
109+
Thanks to P. Bourne, Yuzhen Ye and A. Godzik for granting permission to freely use and redistribute their algorithms.

structure/img/1cfd_1cll_fatcat.png

64.4 KB

structure/img/1cfd_1cll_fatcat.xcf

78.7 KB
Binary file not shown.
375 KB

structure/img/1cfd_1cll_rigid.png

364 KB
453 KB
