<h2 id="bosc2008-abstract">BOSC2008 Abstract</h2>
<p>The abstract was submitted as it appears below. Please send an email to the
biojava-dev mailing list with any further changes to be made.</p>
<p><a href="http://shore.net/~heuermh/BOSC2008_Abstract.odt">BOSC2008_Abstract.odt</a></p>
<p><a href="http://shore.net/~heuermh/BOSC2008_Abstract.pdf">BOSC2008_Abstract.pdf</a></p>
<h3 id="general-information">General information</h3>
<p><strong>Paper Title:</strong> BioJava Project Update</p>
<p><strong>Student Paper?</strong> No</p>
<h3 id="authors-information">Author(s) Information</h3>
<h3 id="technical-areas">Technical Areas</h3>
<p>Bio* Open Source Project Updates</p>
<h3 id="content">Content</h3>
<p><strong>Keywords:</strong></p>
<p>OBF O|B|F open-bio biojava bioperl biosql sequence alphabet feature
annotation alignment protein structure phylogenetic trees</p>
<p><strong>Abstract:</strong></p>
<p>BioJava is a mature free and open-source project that provides a
framework for processing biological data. BioJava contains powerful
analysis and statistical routines, tools for parsing common file
formats, and packages for manipulating sequences and 3D structures.
BioJava is available freely under the terms of version 2.1 of the GNU
Lesser General Public License (LGPL) from <a href="http://biojava.org/">http://biojava.org/</a>. Here we
present the latest BioJava release (version 1.6, released on 13 Apr
2008) which provides improvements in the packages for phylogenetic
trees, processing PDB files, and genetic algorithms.</p>
<p><strong>Paper:</strong></p>
<p>BioJava Project Update</p>
<p>BioJava was conceived in 1999 by Thomas Down and Matthew Pocock as an
API to simplify bioinformatics software development using Java (Pocock
et al., 2000). It has since evolved into a fully featured
framework with modules for performing many common bioinformatics tasks.</p>
<p>As a free and open-source project, BioJava is developed by volunteers
coordinated by the Open Bioinformatics Foundation (O|B|F,
<a href="http://open-bio.org/">http://open-bio.org/</a>) and is one of several Bio* toolkits (Mangalam,
2002). Over the past eight years, the BioJava project has brought together
nearly fifty different code contributors, hundreds of mailing list
subscribers, and several wiki contributors. All code and related
documentation is distributed under version 2.1 of the GNU Lesser General
Public License (LGPL) (Free Software Foundation, Inc., 1999).
All wiki documentation is made available online under version 1.2 of the
GNU Free Documentation License (Free Software Foundation, Inc., 2000).</p>
<p>BioJava has been used in a number of real-world applications, including
Bioclipse (Spjuth et al., 2007), BioWeka (Gewehr et al., 2007),
Cytoscape (Shannon et al., 2003), and Taverna (Oinn et al., 2004), and
has been referenced in over fifty published studies. A list of these can
be found on the BioJava website.</p>
<p>The latest BioJava release (version 1.6, released on 13 Apr 2008) offers
more functionality and greater stability than previous official releases.
The phylogenomics package was improved and expanded by our 2007 Google
Summer of Code (GSOC’07) student Boh-Yun Lee. It now contains
fully-functional Nexus and Phylip parsers, and tools for building UPGMA
and Neighbour Joining trees, calculating Jukes-Cantor and Kimura
two-parameter distances, and performing maximum parsimony (MP) analysis.
The PDB file parser was improved by Jules Jacobsen to better handle
PDB header records. Andreas Dräger provided several patches
for improving the genetic algorithm packages. The version 1.6 release
also contains numerous bug fixes and documentation improvements.</p>
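<p>For readers unfamiliar with the two distance corrections named in the release notes, the following is a minimal sketch of the textbook Jukes-Cantor and Kimura two-parameter formulas. It is not taken from the BioJava code base: the class and method names (<code>DistanceSketch</code>, <code>pDistance</code>, <code>jukesCantor</code>, <code>kimuraTwoParameter</code>) are hypothetical, and the sketch assumes two pre-aligned, gap-free sequences of equal length.</p>

```java
/**
 * Illustrative only: a hypothetical helper, not part of the BioJava API.
 * Assumes pre-aligned, gap-free sequences of equal length.
 */
final class DistanceSketch {

    /** Proportion of aligned positions at which the two sequences differ. */
    static double pDistance(String a, String b) {
        if (a.length() != b.length() || a.isEmpty()) {
            throw new IllegalArgumentException("sequences must be aligned and non-empty");
        }
        int mismatches = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                mismatches++;
            }
        }
        return (double) mismatches / a.length();
    }

    /** Jukes-Cantor correction: d = -(3/4) * ln(1 - (4/3) * p). */
    static double jukesCantor(double p) {
        if (p >= 0.75) {
            // The correction is undefined once the sequences are saturated.
            return Double.POSITIVE_INFINITY;
        }
        return -0.75 * Math.log(1.0 - 4.0 * p / 3.0);
    }

    /**
     * Kimura two-parameter correction, where P is the proportion of
     * transitions and Q the proportion of transversions:
     * d = -(1/2) * ln((1 - 2P - Q) * sqrt(1 - 2Q)).
     */
    static double kimuraTwoParameter(double transitions, double transversions) {
        double a = 1.0 - 2.0 * transitions - transversions;
        double b = 1.0 - 2.0 * transversions;
        if (a <= 0.0 || b <= 0.0) {
            return Double.POSITIVE_INFINITY;
        }
        return -0.5 * Math.log(a * Math.sqrt(b));
    }

    public static void main(String[] args) {
        // Two made-up sequences that differ at one of ten positions.
        double p = pDistance("ACGTACGTAC", "ACGTACGTAA");
        System.out.println("p-distance   = " + p);
        System.out.println("Jukes-Cantor = " + jukesCantor(p));
    }
}
```

<p>Both corrections return a value slightly larger than the raw proportion of differences, because they account for multiple substitutions at the same site. A distance matrix built this way is the usual input to UPGMA or Neighbour Joining.</p>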
<p>The BioJava website is <a href="http://biojava.org/">http://biojava.org/</a>. The version 1.6 release
can be downloaded from <a href="http://biojava.org/wiki/BioJava:Download">http://biojava.org/wiki/BioJava:Download</a>.</p>
<p><strong>References</strong></p>
<p>Free Software Foundation, Inc. (1999) GNU Lesser General Public License,
version 2.1, <a href="http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html">http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html</a>,
accessed 10 May 2008.</p>
<p>Free Software Foundation, Inc. (2000) GNU Free Documentation License,
version 1.2, <a href="http://www.gnu.org/licenses/fdl-1.2.html">http://www.gnu.org/licenses/fdl-1.2.html</a>, accessed 10 May
2008.</p>
<p>Gewehr JE, Szugat M, Zimmer R. (2007) BioWeka – extending the Weka
framework for bioinformatics. Bioinformatics, 23(5), 651-653.</p>
<p>Mangalam H. (2002) The Bio* toolkits – a brief overview. Brief
Bioinform., 3, 296-302.</p>
<p>Oinn T, Addis M, Ferris J, Marvin D, Greenwood M, Carver T, Pocock MR,
Wipat A, Li P. (2004) Taverna: a tool for the composition and enactment
of bioinformatics workflows. Bioinformatics, 20, 3045–3054.</p>
<p>Pocock M, Down T, Hubbard T. (2000) BioJava: Open Source Components for
Bioinformatics. ACM SIGBIO Newsletter 20(2), 10-12.</p>
<p>Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N,
Schwikowski B, Ideker T. (2003) Cytoscape: a software environment for
integrated models of biomolecular interaction networks. Genome Research,
13(11), 2498-2504.</p>
<p>Spjuth O, Helmus T, Willighagen EL, Kuhn S, Eklund M, Wagener J,
Murray-Rust P, Steinbeck C, Wikberg JE. (2007) Bioclipse: an open source
workbench for chemo- and bioinformatics. BMC Bioinformatics, 8, 59.</p>
<h2 id="notes">Notes</h2>
<p><strong>Save for talk</strong></p>
<p>As a mature project, BioJava faces several challenges:</p>
<p>how one deals with a large established code base</p>
<p>what happens when committers move on, get married, have kids, etc.</p>
<p>how difficult it is to deprecate and remove existing code</p>
<p>the BioJava3 use case & refactoring/redesign criteria gathering process</p>
<p>evolutionary vs. revolutionary changes</p>
<p><a href="http://incubator.apache.org/learn/rules-for-revolutionaries.html">http://incubator.apache.org/learn/rules-for-revolutionaries.html</a></p>
<p>the “second system” problem</p>
<p><a href="http://www.joelonsoftware.com/articles/fog0000000069.html">http://www.joelonsoftware.com/articles/fog0000000069.html</a></p>
<p><strong>Version 1.6 release announcement to biojava-dev and biojava-l</strong></p>
<p>Date: Sun, 13 Apr 2008 19:02:41 +0100<br />
From: Andreas Prlic<br />
To: biojava-dev at biojava.org, biojava-l at biojava.org<br />
Subject: [Biojava-dev] biojava 1.6 released<br />
Biojava 1.6 has been released and is available from
<a href="http://biojava.org/wiki/BioJava:Download">http://biojava.org/wiki/BioJava:Download</a></p>
<p>Biojava 1.6 offers more functionality and stability over the previous
official releases. BioJava now depends on Java 1.5+. We highly recommend
you to upgrade as soon as possible.</p>
<p>In detail, the phylo package org.biojavax.bio.phylo was improved and
expanded by our GSOC’07 student Boh-Yun Lee. It now contains fully-
functional Nexus and Phylip parsers, and tools for calculating UPGMA and
Neighbour Joining, Jukes-Cantor and Kimura Two Parameter, and MP. It
uses JGraphT to represent parsed trees.</p>
<p>The PDB file parser was improved by Jules Jacobsen for better dealing
with PDB header records. Andreas Draeger provided several patches for
improving the Genetic Algorithm modules. Additionally this release
contains numerous bug fixes and documentation improvements.</p>
<p>Thanks to the entire biojava community for making this possible!</p>
<p>Happy Biojava-ing,</p>
<p>Andreas</p>
<p>From
<a href="http://www.ohloh.net/projects/6798">http://www.ohloh.net/projects/6798</a></p>
<p><strong>As of 08 May 2008</strong></p>
<p>181,197 lines of code in “biojava-live/trunk”</p>
<p>Estimated effort using the <a href="http://en.wikipedia.org/wiki/COCOMO">COCOMO</a>
model: 47 person-years</p>
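<p>The 47 person-year figure can be roughly reproduced from the line count. The sketch below assumes the basic COCOMO model in "organic" mode (effort in person-months = 2.4 × KLOC<sup>1.05</sup>). These notes do not say which COCOMO variant or coefficients the statistics site used, so treat both as assumptions. The class name <code>CocomoSketch</code> is hypothetical.</p>

```java
/**
 * Illustrative only: basic COCOMO effort estimate, organic mode.
 * The coefficients 2.4 and 1.05 are assumed here; the source does not
 * state which COCOMO variant produced its figure.
 */
final class CocomoSketch {

    /** Effort in person-months: 2.4 * (KLOC ^ 1.05). */
    static double personMonths(int linesOfCode) {
        double kloc = linesOfCode / 1000.0;
        return 2.4 * Math.pow(kloc, 1.05);
    }

    /** Effort in person-years, taking 12 person-months per person-year. */
    static double personYears(int linesOfCode) {
        return personMonths(linesOfCode) / 12.0;
    }

    public static void main(String[] args) {
        // 181,197 lines of code in biojava-live/trunk as of 08 May 2008.
        System.out.printf("%.1f person-years%n", personYears(181197));
    }
}
```

<p>With 181,197 lines of code this comes out at roughly 47 person-years, which matches the reported estimate under these assumptions.</p>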
<p>48 contributors (committers with at least one commit to cvs and/or
subversion repository)</p>
<p>also compare with:</p>
<p>BioJava StatSVN:
<a href="http://www.spice-3d.org/statsvn/stats/">http://www.spice-3d.org/statsvn/stats/</a></p>
<p>top 10 authors:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>mrp 114161 (25.5%)
thomasd 82637 (18.5%)
holland 58798 (13.1%)
kdj 43546 (9.7%)
andreas 40727 (9.1%)
mark_s 36616 (8.2%)
dhuen 25610 (5.7%)
gcox 5954 (1.3%)
birney 4087 (0.9%)
draeger 3994 (0.9%)
</code></pre>
</div>
<p><strong>First commit</strong></p>
<p>This commit was generated by cvs2svn to compensate for changes in r2,
which included commits to RCS files with non-trunk default branches.</p>
<p>by birney on 2000-01-26 15:53 (over 8 years ago)</p>
<p>Interesting to find out what happened before this administrative commit,
as there were 6539 lines of code already.</p>
<p>StatSVN lists the files that were added in the first commit:</p>
<div class="highlighter-rouge"><pre class="highlight"><code> 4087 lines of code changed in:
* org/biojava/bio: BioError.java (new 92), BioException.java (new 88)
* org/biojava/bio/alignment: AbstractCursor.java (new), AbstractState.java (new 1), AbstractTrainer.java (new), Alignment.java (new 29), AmbiguityState.java (new), BaumWelchSampler.java (new), BaumWelchTrainer.java (new), Column.java (new), ComplementaryState.java (new), DNAState.java (new), DNAWeightMatrix.java (new), DP.java (new 10), DPCursor.java (new), DoubleAlphabet.java (new), EmissionState.java (new), FlatModel.java (new), IllegalTransitionException.java (new), MarkovModel.java (new), MarkovModelWrapper.java (new), MatrixCursor.java (new), ModelInState.java (new), ModelTrainer.java (new), SimpleAlignment.java (new 77), SimpleMarkovModel.java (new 3), SimpleModelInState.java (new), SimpleModelTrainer.java (new), SimpleState.java (new), SimpleStateLabeledSequence.java (new), SimpleStateTrainer.java (new), SimpleTransitionTrainer.java (new), SimpleWeightMatrix.java (new), SmallCursor.java (new), State.java (new), StateFactory.java (new), StateLabeledSequence.java (new), StateTrainer.java (new), StateWrapper.java (new), StoppingCriteria.java (new), SuffixTree.java (new 35), TrainerTransition.java (new), TrainingAlgorithm.java (new), Transition.java (new), TransitionTrainer.java (new), WMAsMM.java (new), WeightMatrix.java (new), WeightMatrixAnnotator.java (new 23), XmlMarkovModel.java (new)
* org/biojava/bio/gui: BarLogoPainter.java (new 85), DNAStyle.java (new 84), LogoPainter.java (new 45), PlainStyle.java (new 56), ResidueStyle.java (new), StateLogo.java (new 7), TextLogoPainter.java (new 209)
* org/biojava/bio/program: Meme.java (new 151)
* org/biojava/bio/seq: AbstractAlphabet.java (new), AllSymbolsAlphabet.java (new), Alphabet.java (new 4), Annotatable.java (new 1), Annotation.java (new 2), Annotator.java (new 5), CompoundLocation.java (new 2), Feature.java (new 66), FeatureFactory.java (new), FeatureFilter.java (new 81), FeatureHolder.java (new 34), FixedWidthParser.java (new), HashSequenceDB.java (new 1), IllegalResidueException.java (new), Location.java (new 7), NameParser.java (new), PointLocation.java (new 2), RangeLocation.java (new), Residue.java (new), ResidueList.java (new), ResidueParser.java (new), SeqException.java (new), Sequence.java (new 63), SequenceDB.java (new), SequenceFactory.java (new 28), SequenceIterator.java (new 33), SimpleAlphabet.java (new), SimpleAnnotation.java (new), SimpleFeature.java (new 14), SimpleFeatureFactory.java (new), SimpleFeatureHolder.java (new 65), SimpleResidue.java (new 1), SimpleResidueList.java (new 2), SimpleSequence.java (new 1), SimpleSequenceFactory.java (new), SymbolParser.java (new)
* org/biojava/bio/seq/io: DefaultDescriptionReader.java (new), EmblFormat.java (new 60), FastaDescriptionReader.java (new), FastaFormat.java (new 109), SequenceFormat.java (new 37), StreamReader.java (new 93), StreamWriter.java (new 50)
* org/biojava/bio/seq/tools: AlphabetManager.java (new 1), DNATools.java (new)
* org/biojava/stats/svm: LinearKernel.java (new 37), ListSumKernel.java (new 74), PolynomialKernel.java (new 89), RadialBaseKernel.java (new 67), SMORegressionTrainer.java (new 429), SMOTrainer.java (new 283), SVMKernel.java (new 39), SVMModel.java (new), SVMRegressionModel.java (new 170), SigmoidKernel.java (new 83), SparseVector.java (new 113), TrainingContext.java (new 31), TrainingEvent.java (new 39), TrainingListener.java (new 34)
* org/biojava/stats/svm/tools: ClassifierExample.java (new 387), Classify.java (new 80), SVM_Light.java (new 199), Train.java (new 90), TrainRegression.java (new 86)
* org/biojava/utils/xml: XMLDispatcher.java (new), XMLPeerBuilder.java (new), XMLPeerFactory.java (new)
</code></pre>
</div>
<p><strong>Latest commit</strong></p>
<p>started to develop a mmcif parser</p>
<p>by Andreas.Prlic (Using name ‘andreas’) on 2008-04-28 07:27 (11 days
ago)</p>
<p><strong>BioJava group on LinkedIn</strong></p>
<p>There is a BioJava group on LinkedIn:</p>
<p>Developers of the BioJava open-source bioinformatics project.</p>
<p>Not sure how to link to it though.</p>
<p>To join the BioJava LinkedIn group, you need to be a LinkedIn member.
You then need to find the group and ask to join. I then get notified and
asked to approve the request, which I will if your name sounds vaguely
familiar :) You don’t need to be a contributor, just a user or interested
party. –<a href="User:Mark" title="wikilink">Mark</a> 14:39, 22 May 2008 (UTC)</p>
<p><strong>Wiki edits for later</strong></p>
<p>Clarify reference to LGPL.</p>
<p>Update references to “open source” with “free and open source”. Link to
FOSS page on wikipedia?</p>
<p>DengueInfo link on BioJavaInside is broken.</p>
<p>BioJava in Anger is now on the wiki (so under FDL?) but has a separate
vague Copyright section, see
<a href="http://biojava.org/wiki/BioJava:CookBook#Copyright">http://biojava.org/wiki/BioJava:CookBook#Copyright</a></p>
<p>This copyright section is a direct copy from the old BioJava in Anger
page. This means the statement is outdated and can probably be
removed. –<a href="User:Mark" title="wikilink">Mark</a> 14:39, 22 May 2008 (UTC)</p>