@@ -12,8 +12,8 @@ For more info see the Wikipedia article on [protein structure alignment](http://
 ## Alignment Algorithms supported by BioJava
 
 BioJava comes with a number of algorithms for aligning structures. The following
-five options are displayed by default in the user interface, although others can
-be accessed programmatically using the methods in
+five options are displayed by default in the graphical user interface (GUI),
+although others can be accessed programmatically using the methods in
 [StructureAlignmentFactory](http://www.biojava.org/docs/api/org/biojava/bio/structure/align/StructureAlignmentFactory.html).
 
 1. Combinatorial Extension (CE)
@@ -153,6 +153,7 @@ Additional methods can be added by implementing the
 interface.
 
 
+
 ## Creating alignments programmatically
 
 The various structure alignment algorithms in BioJava implement the
@@ -161,16 +162,17 @@ The various structure alignment algorithms in BioJava implement the
 alignment and print some information about it.
 
 ```java
+// Fetch CA atoms for the structures to be aligned
 String name1 = "3cna.A";
 String name2 = "2pel";
-
 AtomCache cache = new AtomCache();
-
 Atom[] ca1 = cache.getAtoms(name1);
 Atom[] ca2 = cache.getAtoms(name2);
 
+// Get StructureAlignment instance
 StructureAlignment algorithm = StructureAlignmentFactory.getAlgorithm(CeCPMain.algorithmName);
 
+// Perform the alignment
 AFPChain afpChain = algorithm.align(ca1,ca2);
 
 // Print text output
@@ -180,13 +182,50 @@ System.out.println(afpChain.toCE(ca1,ca2));
 To display the alignment using jMol, use:
 
 ```java
-// Or StructureAlignmentDisplay.display(afpChain, ca1, ca2);
 GuiWrapper.display(afpChain, ca1, ca2);
+// Or StructureAlignmentDisplay.display(afpChain, ca1, ca2);
 ```
 
 Note that these require that you include the structure-gui package and the jmol
 binary in the classpath at runtime.
 
+## Command-line tools
+
+## PDB-wide database searches
+
+The Alignment GUI also provides functionality for PDB-wide structural searches.
+This systematically compares a structure against a non-redundant set of all
+other structures in the PDB at either a chain or a domain level. Representatives
+are selected using the RCSB's clustering of proteins with 40% sequence identity,
+as described
+[here](http://www.rcsb.org/pdb/static.do?p=general_information/cluster/structureAll.jsp).
+Domains are selected using either SCOP (when available) or the
+ProteinDomainParser algorithm.
+
+![Database Search GUI](img/database_search.png)
+
+To perform a database search, select the 'Database Search' tab, then choose a
+query structure by PDB ID, by SCOP domain ID, or from a custom file. The
+output directory will be used to store results. These consist of individual
+alignments in compressed XML format, as well as a tab-delimited file of
+similarity scores and statistics. The statistics are displayed in an interactive
+results table, which allows the alignments to be sorted. The 'Align' column
+allows individual alignments to be visualized with the alignment GUI.
+
+![Database Search Results](img/database_search_results.png)
+
+Be aware that this process can be very time-consuming. Before
+starting a manual search, it is worth considering whether a pre-computed result
+may be available online, for instance for
+[FATCAT-rigid](http://www.rcsb.org/pdb/static.do?p=general_information/cluster/structureAll.jsp)
+or [DALI](http://ekhidna.biocenter.helsinki.fi/dali/start). For custom files or
+specific domains, a few optimizations can reduce the time for a database search.
+Downloading PDB files is a considerable bottleneck. This can be avoided by
+downloading all PDB files in advance from the [FTP
+server](ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/) and setting
+the `PDB_DIR` environment variable. This operation sped up the search from
+about 30 hours to less than 4 hours.
+
 
 ## Acknowledgements
 
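A note on the `PDB_DIR` optimization described in the added "PDB-wide database searches" section: before starting a long search, it may help to confirm that the local mirror is actually where `PDB_DIR` points. The following is a minimal sketch using only the Java standard library. It assumes that `PDB_DIR` points directly at a copy of the wwPDB `divided/pdb` directory, in which each entry is stored as `<middle two characters of the ID>/pdb<id>.ent.gz`. The class `LocalPdbCheck` and its method `dividedPath` are hypothetical names used for illustration; they are not part of BioJava, and the layout BioJava itself expects under `PDB_DIR` may differ between versions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper, not part of BioJava.
class LocalPdbCheck {

    // Builds <root>/<middle two characters of the ID>/pdb<id>.ent.gz,
    // the layout used by the wwPDB "divided" directory.
    static Path dividedPath(Path root, String pdbId) {
        if (pdbId == null || pdbId.length() != 4) {
            throw new IllegalArgumentException("Expected a 4-character PDB ID, got: " + pdbId);
        }
        String id = pdbId.toLowerCase(Locale.ROOT);
        return root.resolve(id.substring(1, 3)).resolve("pdb" + id + ".ent.gz");
    }

    public static void main(String[] args) {
        // Assumption: PDB_DIR points at the root of the divided mirror.
        String pdbDir = System.getenv("PDB_DIR");
        if (pdbDir == null) {
            System.out.println("PDB_DIR is not set");
            return;
        }
        // The two entries used in the alignment example of this document.
        for (String id : new String[] {"3cna", "2pel"}) {
            Path file = dividedPath(Paths.get(pdbDir), id);
            System.out.println(file + (Files.exists(file) ? " found" : " missing"));
        }
    }
}
```

If both entries are reported as missing, the search would presumably fall back to downloading files on demand, which is the bottleneck the documentation warns about.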