Commit 051135a

Added database search to structure alignment chapter
1 parent f07ce29 commit 051135a

3 files changed

+44
-5
lines changed

structure/alignment.md

Lines changed: 44 additions & 5 deletions
@@ -12,8 +12,8 @@ For more info see the Wikipedia article on [protein structure alignment](http://
1212
## Alignment Algorithms supported by BioJava
1313

1414
BioJava comes with a number of algorithms for aligning structures. The following
15-
five options are displayed by default in the user interface, although others can
16-
be accessed programmatically using the methods in
15+
five options are displayed by default in the graphical user interface (GUI),
16+
although others can be accessed programmatically using the methods in
1717
[StructureAlignmentFactory](http://www.biojava.org/docs/api/org/biojava/bio/structure/align/StructureAlignmentFactory.html).
1818

1919
1. Combinatorial Extension (CE)
@@ -153,6 +153,7 @@ Additional methods can be added by implementing the
153153
interface.
154154

155155

156+
156157
## Creating alignments programmatically
157158

158159
The various structure alignment algorithms in BioJava implement the
@@ -161,16 +162,17 @@ The various structure alignment algorithms in BioJava implement the
161162
alignment and print some information about it.
162163

163164
```java
165+
// Fetch CA atoms for the structures to be aligned
164166
String name1 = "3cna.A";
165167
String name2 = "2pel";
166-
167168
AtomCache cache = new AtomCache();
168-
169169
Atom[] ca1 = cache.getAtoms(name1);
170170
Atom[] ca2 = cache.getAtoms(name2);
171171

172+
// Get StructureAlignment instance
172173
StructureAlignment algorithm = StructureAlignmentFactory.getAlgorithm(CeCPMain.algorithmName);
173174

175+
// Perform the alignment
174176
AFPChain afpChain = algorithm.align(ca1,ca2);
175177

176178
// Print text output
@@ -180,13 +182,50 @@ System.out.println(afpChain.toCE(ca1,ca2));
180182
To display the alignment using Jmol, use:
181183

182184
```java
183-
// Or StructureAlignmentDisplay.display(afpChain, ca1, ca2);
184185
GuiWrapper.display(afpChain, ca1, ca2);
186+
// Or StructureAlignmentDisplay.display(afpChain, ca1, ca2);
185187
```
186188

187189
Note that both calls require the structure-gui package and the Jmol
188190
binary in the classpath at runtime.
189191

192+
## Command-line tools
193+
194+
## PDB-wide database searches
195+
196+
The Alignment GUI also provides functionality for PDB-wide structural searches.
197+
This systematically compares a structure against a non-redundant set of all
198+
other structures in the PDB at either a chain or a domain level. Representatives
199+
are selected using the RCSB's clustering of proteins with 40% sequence identity,
200+
as described
201+
[here](http://www.rcsb.org/pdb/static.do?p=general_information/cluster/structureAll.jsp).
202+
Domains are selected using either SCOP (when available) or the
203+
ProteinDomainParser algorithm.
204+
205+
![Database Search GUI](img/database_search.png)
206+
207+
To perform a database search, select the 'Database Search' tab, then choose a
208+
query structure by PDB ID, by SCOP domain ID, or from a custom file. The
209+
output directory will be used to store results. These consist of individual
210+
alignments in compressed XML format, as well as a tab-delimited file of
211+
similarity scores and statistics. The statistics are displayed in an interactive
212+
results table, which allows the alignments to be sorted. The 'Align' column
213+
allows individual alignments to be visualized with the alignment GUI.
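
The sorting done by the results table can also be reproduced outside the GUI on the tab-delimited score file. The following is a minimal sketch, not BioJava's own code: the class `ResultsSorter`, the assumption that comment or header lines start with `#`, the choice of score column, and the sample rows in `main` are all hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ResultsSorter {

    /**
     * Splits tab-delimited lines into fields and sorts the rows by the
     * numeric value in the given column, highest value first.
     * Lines that are empty or start with '#' are skipped (an assumption
     * about the file format, not a documented guarantee).
     */
    public static List<String[]> sortByColumn(List<String> lines, int column) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            rows.add(line.split("\t"));
        }
        rows.sort(Comparator
                .comparingDouble((String[] r) -> Double.parseDouble(r[column]))
                .reversed());
        return rows;
    }

    public static void main(String[] args) {
        // Made-up placeholder rows; a real results file will have different
        // columns and values.
        List<String> lines = Arrays.asList(
                "# name\tscore",
                "hitA\t1.5",
                "hitB\t3.0",
                "hitC\t2.0");
        for (String[] row : sortByColumn(lines, 1)) {
            System.out.println(String.join("\t", row));
        }
    }
}
```

To use it on a real results file, read the lines with `java.nio.file.Files.readAllLines` and pass the index of whichever column holds the score of interest.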
214+
215+
![Database Search Results](img/database_search_results.png)
216+
217+
Be aware that this process can be very time-consuming. Before
218+
starting a manual search, it is worth considering whether a pre-computed result
219+
may be available online, for instance for
220+
[FATCAT-rigid](http://www.rcsb.org/pdb/static.do?p=general_information/cluster/structureAll.jsp)
221+
or [DALI](http://ekhidna.biocenter.helsinki.fi/dali/start). For custom files or
222+
specific domains, a few optimizations can reduce the time for a database search.
223+
Downloading PDB files on demand is a considerable bottleneck. This can be avoided by
224+
downloading all PDB files from the [FTP
225+
server](ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/) and setting
226+
the `PDB_DIR` environment variable to the local copy. This sped up the search from
227+
about 30 hours to less than 4 hours.
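
Before starting a long search, it can be useful to check that the local mirror is actually picked up. The sketch below uses only the Java standard library and assumes the mirror keeps the FTP server's "divided" layout, where an entry such as `3cna` is stored as `cn/pdb3cna.ent.gz`. The class `PdbMirrorPath`, the fallback to the temporary directory, and the exact layout BioJava expects under `PDB_DIR` are assumptions that may differ between BioJava versions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

public class PdbMirrorPath {

    /**
     * Returns the mirror root for the given PDB_DIR value. Falling back to
     * the system temp directory is a choice made for this sketch only.
     */
    public static Path mirrorRoot(String pdbDirValue) {
        if (pdbDirValue == null || pdbDirValue.isEmpty()) {
            return Paths.get(System.getProperty("java.io.tmpdir"));
        }
        return Paths.get(pdbDirValue);
    }

    /**
     * Builds the path of a PDB entry in the divided layout:
     * <root>/<middle two characters of the ID>/pdb<id>.ent.gz
     */
    public static Path dividedPath(Path root, String pdbId) {
        if (pdbId == null || pdbId.length() != 4) {
            throw new IllegalArgumentException("Expected a 4-character PDB ID");
        }
        String id = pdbId.toLowerCase(Locale.ROOT);
        String middle = id.substring(1, 3);
        return root.resolve(middle).resolve("pdb" + id + ".ent.gz");
    }

    public static void main(String[] args) {
        Path root = mirrorRoot(System.getenv("PDB_DIR"));
        // The two entries used in the alignment example of this chapter.
        for (String id : new String[] {"3cna", "2pel"}) {
            Path file = dividedPath(root, id);
            String status = Files.exists(file) ? "found" : "missing";
            System.out.println(id + " -> " + file + " (" + status + ")");
        }
    }
}
```

If entries are reported as missing even though the mirror was downloaded, the variable probably points one directory level too high or too low.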
228+
190229

191230
## Acknowledgements
192231

structure/img/database_search.png

19.6 KB
61.1 KB