Commit 784139b

dicknetherlands authored and andreasprlic committed
Change to wiki page
1 parent 209824b commit 784139b

2 files changed, +52 -3 lines changed

_wikis/BioJava3_Proposal.md

Lines changed: 38 additions & 2 deletions
@@ -8,8 +8,8 @@ Executive Summary
 It is suggested that development stop on the existing
 BioJava/BioJavaX/BioJava2 aggregation and start afresh as BioJava3.
 
-Reasoning
----------
+General reasoning
++-----------------
 
 - The existing code is disorganised, poorly commented, and hard to
 maintain due to the use of numerous different coding styles.
@@ -107,6 +107,42 @@ Action plan
 2. Tentative Singapore meeting to get the ball rolling on the final
 design and initial coding front.
 
+Previous work on the subject
++----------------------------
+
+1. Michael Heuer's
+[proposal](http://www3.shore.net/~heuermh/static-alphabet-generics.tar.gz)
+for static generic symbols/symbol lists.
+2. Matthew Pocock's [BioJava2
+proposals](http://www.derkholm.net/svn/repos/bjv2).
+
+Major problem areas
++-------------------
+
+1. The singleton symbol model is hard to use and understand. It needs
+simplification.
+2. Strand is specified on feature and not on location. This is not
+biologically logical.
+3. Sequence and Feature objects are tightly bound - Features can't
+exist without also loading and assigning the appropriate Sequence
+object. This slows things down and uses memory.
+4. In general, most operations require a Sequence object underlying
+whatever object you are manipulating. At the time BioJava was
+designed and written, this was fine as most biologists were
+interested in sequence manipulation. Now they have moved on and are
+more interested in sequence meta-data such as features or protein
+structures or microarray experiments or phylogenetics. To enforce
+having to load the sequence for every feature in a region of
+interest before doing even basic analysis is wasteful of resources,
+and illogical. BioJava needs to lose the Sequence-centric view of
+the world.
+5. Interfaces that have already been deprecated in the 1.5 release need
+removing entirely. Many of them are heavily used within the existing
+code base, e.g. Sequence. To remove them would require a rewrite of
+almost the entire codebase anyway, and also a rewrite of most client
+code (e.g. to use RichSequence as the default replacement for
+Sequence, which would no longer exist).
+
 Categories of Improvement
 -------------------------
 
_wikis/BioJava3_Proposal.mediawiki

Lines changed: 14 additions & 1 deletion
@@ -2,7 +2,7 @@
 
 It is suggested that development stop on the existing BioJava/BioJavaX/BioJava2 aggregation and start afresh as BioJava3.
 
-==Reasoning==
+==General reasoning==
 
 * The existing code is disorganised, poorly commented, and hard to maintain due to the use of numerous different coding styles.
 * Existing documentation is poor and it would be hard to try and write any given the lack of code comments.
@@ -41,6 +41,19 @@ It is suggested that development stop on the existing BioJava/BioJavaX/BioJava2
 # Please modify this page and the [[Talk:BioJava3_Proposal|Talk page]] as you see fit in order to flesh out details and/or make new points.
 # Tentative Singapore meeting to get the ball rolling on the final design and initial coding front.
 
+==Previous work on the subject==
+
+#Michael Heuer's [http://www3.shore.net/~heuermh/static-alphabet-generics.tar.gz proposal] for static generic symbols/symbol lists.
+#Matthew Pocock's [http://www.derkholm.net/svn/repos/bjv2 BioJava2 proposals].
+
+==Major problem areas==
+
+#The singleton symbol model is hard to use and understand. It needs simplification.
+#Strand is specified on feature and not on location. This is not biologically logical.
+#Sequence and Feature objects are tightly bound - Features can't exist without also loading and assigning the appropriate Sequence object. This slows things down and uses memory.
+#In general, most operations require a Sequence object underlying whatever object you are manipulating. At the time BioJava was designed and written, this was fine as most biologists were interested in sequence manipulation. Now they have moved on and are more interested in sequence meta-data such as features or protein structures or microarray experiments or phylogenetics. To enforce having to load the sequence for every feature in a region of interest before doing even basic analysis is wasteful of resources, and illogical. BioJava needs to lose the Sequence-centric view of the world.
+#Interfaces that have already been deprecated in the 1.5 release need removing entirely. Many of them are heavily used within the existing code base, e.g. Sequence. To remove them would require a rewrite of almost the entire codebase anyway, and also a rewrite of most client code (e.g. to use RichSequence as the default replacement for Sequence, which would no longer exist).
+
 ==Categories of Improvement==
 
 Initally suggested by Andreas this attempts to group the currently recognized ''issues'' surrounding Biojava. See also [[UsageAnalysis]].
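
Items 2 to 4 of the "Major problem areas" list added in this commit describe a coupling problem: strand sits on the feature rather than the location, and a feature cannot exist without a loaded Sequence. The sketch below shows one way a decoupled model might look. It is an illustration only: `FeatureSketch`, `Strand`, `Location`, `Feature` and `inRegion` are hypothetical names invented here, not part of the BioJava API or of the proposal itself, and the 1-based inclusive coordinates are an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FeatureSketch {

    /** Strand is carried by the location, not by the feature (hypothetical type). */
    public enum Strand { FORWARD, REVERSE, UNKNOWN }

    /** A 1-based, inclusive interval on a sequence that is referenced by name only. */
    public static final class Location {
        public final String sequenceId;
        public final int start;
        public final int end;
        public final Strand strand;

        public Location(String sequenceId, int start, int end, Strand strand) {
            if (start < 1 || end < start) {
                throw new IllegalArgumentException("invalid interval " + start + ".." + end);
            }
            this.sequenceId = sequenceId;
            this.start = start;
            this.end = end;
            this.strand = strand;
        }

        /** True when both locations lie on the same sequence and share a position; strand is ignored. */
        public boolean overlaps(Location other) {
            return sequenceId.equals(other.sequenceId)
                    && start <= other.end
                    && other.start <= end;
        }
    }

    /** A feature points at its sequence by identifier, so no residues have to be loaded. */
    public static final class Feature {
        public final String type;
        public final Location location;

        public Feature(String type, Location location) {
            this.type = type;
            this.location = location;
        }
    }

    /** Returns the features whose locations overlap the region, in input order. */
    public static List<Feature> inRegion(List<Feature> features, Location region) {
        List<Feature> hits = new ArrayList<>();
        for (Feature f : features) {
            if (f.location.overlaps(region)) {
                hits.add(f);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Feature> features = Arrays.asList(
                new Feature("gene", new Location("chr1", 100, 200, Strand.FORWARD)),
                new Feature("exon", new Location("chr1", 400, 500, Strand.REVERSE)),
                new Feature("gene", new Location("chr2", 150, 160, Strand.FORWARD)));
        Location region = new Location("chr1", 150, 450, Strand.UNKNOWN);
        for (Feature f : inRegion(features, region)) {
            System.out.println(f.type + " " + f.location.start + ".." + f.location.end
                    + " " + f.location.strand);
        }
    }
}
```

In this sketch a region query touches only feature metadata; the sequence itself would be fetched by `sequenceId` only if residues are actually needed.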
